Spellcheck is a useful feature to provide to your users when building your search app. Apache Solr natively supports spellchecking, making it easier for developers to enhance the UX of their apps.
In this article, let’s look into enabling the spell-checking module in Apache Solr (in SolrCloud mode).
The SpellCheck component provides inline query suggestions based on similar terms. The basis for these suggestions can be terms in a field in Solr, externally created text files, or fields in other Lucene indexes.
Solr offers four types of spellcheckers:
- Index-Based Spell Checker
- Direct Solr Spell Checker
- File-Based Spell Checker
- Word Break Solr Spell Checker
The easiest way to start spellchecking on Solr (in Standalone mode) is to edit the solrconfig.xml file in your Solr installation. The Solr docs cover this pretty well, so in this tutorial, I’ll be talking about how to get it set up in SolrCloud mode.
Let’s set up the techproducts example using Docker to try out the spellcheck feature.
docker run --name solr_container -d -p 8983:8983 -t solr -c -f
docker exec -it --user=solr solr_container bin/solr create_collection -c techproducts
docker exec -it --user=solr solr_container bin/post -c techproducts example/exampledocs/
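Before moving on, it can help to confirm that the documents were actually indexed. The following is a minimal sketch, assuming Solr is reachable at http://localhost:8983/solr as in the docker command above; the helper names build_count_url and fetch_num_found are made up for this example. It builds a match-all query with rows=0 and reads numFound from the JSON response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_count_url(base_url, collection):
    """Build a match-all query that returns no rows, only the hit count."""
    params = urlencode({"q": "*:*", "rows": 0})
    return f"{base_url}/{collection}/select?{params}"


def fetch_num_found(url):
    """Call Solr and return the number of matching documents.

    Needs the Solr container from the previous step to be running.
    """
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["response"]["numFound"]


# Assumption: Solr listens on localhost:8983, as mapped by `docker run -p`.
url = build_count_url("http://localhost:8983/solr", "techproducts")
print(url)
```

With the container running, calling fetch_num_found(url) should return a number greater than zero if the bin/post step succeeded.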
Next, we need to enable spellcheck in the /select request handler of Solr. We’ll use the Config API in Solr for this. Go ahead and create a file called solrconfig.json with the following content.
{
  "update-requesthandler": {
    "name": "/select",
    "class": "solr.SearchHandler",
    "defaults": {
      "echoParams": "explicit",
      "rows": 10,
      "defType": "edismax",
      "qf": "name",
      "spellcheck": "on",
      "spellcheck.dictionary": ["default", "wordbreak"],
      "spellcheck.count": 10,
      "spellcheck.alternativeTermCount": 5,
      "spellcheck.maxResultsForSuggest": 5,
      "spellcheck.collate": "true",
      "spellcheck.collateExtendedResults": "true",
      "spellcheck.maxCollationTries": 10,
      "spellcheck.maxCollations": 5
    },
    "last-components": ["spellcheck"]
  },
  "update-searchcomponent": [
    {
      "name": "spellcheck",
      "class": "solr.SpellCheckComponent",
      "spellchecker": [
        {
          "name": "default",
          "field": "name",
          "classname": "solr.DirectSolrSpellChecker",
          "distanceMeasure": "internal",
          "accuracy": 0.5,
          "maxQueryFrequency": 0.01,
          "maxEdits": 2,
          "minPrefix": 1,
          "maxInspections": 5,
          "minQueryLength": 4
        },
        {
          "name": "wordbreak",
          "field": "name",
          "classname": "solr.WordBreakSolrSpellChecker",
          "combineWords": "true",
          "breakWords": "false",
          "maxChanges": 10
        }
      ],
      "queryAnalyzerFieldType": "text_general"
    }
  ]
}
Here we are setting the parameters for the spellcheck in the /select request handler. The Spell Checking section of the Solr Reference Guide lists the configurable parameters for the spellcheck module.
I’ve configured two different spellcheckers here: the DirectSolrSpellChecker and the WordBreakSolrSpellChecker.
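To get a feel for what these two spellcheckers do with the settings above, here is a toy illustration. It is not Solr's implementation (Solr delegates to Lucene and also applies accuracy, frequency, and inspection limits that are ignored here). The function names and the hand-picked index_terms set are made up for this example; only maxEdits, minPrefix, minQueryLength, and combineWords are modelled.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def toy_direct_suggestions(term, index_terms, max_edits=2, min_prefix=1,
                           min_query_length=4):
    """Rough stand-in for the 'default' checker: suggest indexed terms that
    share the first min_prefix characters and are within max_edits edits."""
    if len(term) < min_query_length:
        return []  # terms shorter than minQueryLength are not corrected
    prefix = term[:min_prefix]
    scored = [
        (edit_distance(term, t), t)
        for t in index_terms
        if t != term and t.startswith(prefix)
    ]
    return [t for dist, t in sorted(scored) if dist <= max_edits]


def toy_combine_words(words, index_terms):
    """Rough stand-in for combineWords=true: join adjacent query words
    when the joined form exists in the index."""
    return [
        left + right
        for left, right in zip(words, words[1:])
        if left + right in index_terms
    ]


# Hypothetical terms standing in for the indexed "name" field.
index_terms = {"canon", "powershot", "maxtor", "samsung"}

print(toy_direct_suggestions("cannon", index_terms))      # ['canon']
print(toy_combine_words(["power", "shot"], index_terms))  # ['powershot']
```

The first call mirrors a misspelling that is one edit away from an indexed term; the second mirrors two query words that only exist in the index as a single word, which is the case the word-break checker is there to catch.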
Now let’s send this configuration to Solr via the Config API:
curl -X POST -H 'Content-type:application/json' -d @solrconfig.json http://localhost:8983/solr/techproducts/config
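If you want to double-check that Solr accepted the changes, the Config API exposes the stored overlay at /config/overlay. Below is a minimal sketch of such a check. The helper names are made up for this example, and the sample dictionary is a hand-written, trimmed stand-in for the real response, assuming the overlay groups entries under requestHandler and searchComponent.

```python
import json
from urllib.request import urlopen

# Assumption: same host, port, and collection as in the curl command above.
OVERLAY_URL = "http://localhost:8983/solr/techproducts/config/overlay"


def fetch_overlay(url=OVERLAY_URL):
    """Fetch the config overlay; needs the Solr container to be running."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def spellcheck_is_configured(overlay_response):
    """Return True if /select runs the spellcheck component and both
    spellcheckers from solrconfig.json are present in the overlay."""
    overlay = overlay_response.get("overlay", {})
    select = overlay.get("requestHandler", {}).get("/select", {})
    component = overlay.get("searchComponent", {}).get("spellcheck", {})
    names = {sc.get("name") for sc in component.get("spellchecker", [])}
    return (
        "spellcheck" in select.get("last-components", [])
        and {"default", "wordbreak"} <= names
    )


# Hand-written sample for illustration only, not captured from a live server.
sample = {
    "overlay": {
        "requestHandler": {
            "/select": {"name": "/select", "last-components": ["spellcheck"]}
        },
        "searchComponent": {
            "spellcheck": {
                "name": "spellcheck",
                "spellchecker": [{"name": "default"}, {"name": "wordbreak"}],
            }
        },
    }
}

print(spellcheck_is_configured(sample))  # True
```

Against a live server, spellcheck_is_configured(fetch_overlay()) performs the same check on the real overlay.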
Now you should be good to go. Go ahead and try searching for incorrectly spelled words and Solr should give you suggestions.
Some examples:
- Search for cannon
- Search for power shot
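The two searches above can also be run from a script. The sketch below builds the /select URLs and pulls suggestions and collations out of a response. The helper names are made up for this example, and the sample response is hand-written to mirror the shape Solr's default JSON output is assumed to have (flat key/value lists under spellcheck); the exact fields and counts on your instance may differ.

```python
from urllib.parse import urlencode

# Assumption: same host, port, and collection as earlier in this tutorial.
BASE_URL = "http://localhost:8983/solr/techproducts/select"


def build_select_url(query):
    """Build a /select URL; spellcheck is already on via the handler defaults."""
    return f"{BASE_URL}?{urlencode({'q': query, 'wt': 'json'})}"


def pairs(value):
    """Normalise Solr's flat [key, value, key, value, ...] list (or a map)
    into a list of (key, value) tuples."""
    if isinstance(value, dict):
        return list(value.items())
    return list(zip(value[0::2], value[1::2]))


def extract_suggestions(response):
    """Map each misspelled term to its suggested replacement words."""
    spell = response.get("spellcheck", {})
    result = {}
    for term, info in pairs(spell.get("suggestions", [])):
        result[term] = [
            s["word"] if isinstance(s, dict) else s
            for s in info.get("suggestion", [])
        ]
    return result


def extract_collations(response):
    """Return the rewritten queries Solr proposes as collations."""
    spell = response.get("spellcheck", {})
    return [
        c["collationQuery"] if isinstance(c, dict) else c
        for _, c in pairs(spell.get("collations", []))
    ]


# Hand-written sample for illustration only, not captured from a live server.
sample = {
    "spellcheck": {
        "suggestions": [
            "cannon",
            {"numFound": 1, "startOffset": 0, "endOffset": 6,
             "suggestion": ["canon"]},
        ],
        "correctlySpelled": False,
        "collations": [
            "collation",
            {"collationQuery": "canon",
             "misspellingsAndCorrections": ["cannon", "canon"]},
        ],
    }
}

print(build_select_url("cannon"))
print(build_select_url("power shot"))
print(extract_suggestions(sample))  # {'cannon': ['canon']}
print(extract_collations(sample))   # ['canon']
```

To try it against the running container, fetch one of the printed URLs (for example with curl or urllib.request.urlopen), parse the body as JSON, and pass the result to the two extract functions.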
The SpellCheck module in Solr is a very useful tool and can be configured in many ways to suit your needs. I hope this quick tutorial has helped you to get up and running with spellchecking on Solr. As always, if you have any issues or questions, leave them in the comments and I’ll try my best to help.
Thanks for reading!