A relevant search meets an end user's expectations every single time. However, both measuring and optimizing search relevancy with Elasticsearch require expert knowledge of the search engine, and even then it remains an ongoing effort that can take months to yield fruitful results.
Appbase.io now offers Search Relevancy - a control plane containing a suite of GUIs that enable users to improve their search relevancy settings without requiring any guesswork. Combined with Actionable Analytics, Search Relevancy enables you to optimize your search's relevance in a data-driven manner.
Note: Search Relevancy control plane and APIs are available for all Production and Enterprise plan users.
Language forms the core of a search engine. The Language Settings UI enables you to configure your search engine to work with the language that your users are going to search in.
Appbase.io offers support for 38 languages with the default relevancy configured to work universally.
Languages Supported: arabic, armenian, basque, bengali, brazilian portuguese, bulgarian, catalan, chinese*, czech, danish, dutch, english, estonian, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, japanese*, korean*, latvian, lithuanian, norwegian, persian, polish*, portuguese, romanian, russian, sorani, spanish, swedish, turkish, ukrainian*, thai
When you select a primary language that's different from Universal, the universal analyzer continues to work with it. This way, you can benefit from a multi-lingual search.
* The following languages require additional analyzer plugins to be installed in your Elasticsearch cluster.
| Language | Plugin | Notes |
|---|---|---|
| chinese | smartcn | The chinese language falls back to the built-in |
| japanese | kuromoji | The kuromoji analyzer enables the analysis of japanese text. |
| korean | nori | The korean language falls back to the built-in |
| polish | stempel | The stempel analyzer enables high-quality stemming for polish text. |
| ukrainian | ukrainian | The ukrainian analyzer enables the analysis of ukrainian text. |
Outside of the choice of the primary language, an index can also be configured with the following additional language specific settings.
Stopwords: Enable/disable or configure custom stopwords (i.e. words to be ignored by the search engine)
Stemming Exceptions: Set words that are excluded from your language's stemming process.
Normalize Diacritics: Enabled by default, this setting controls whether diacritics are normalized for search matches or not.
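As a rough illustration of what these three settings correspond to in Elasticsearch terms, the sketch below builds index analysis settings from the built-in `stop`, `keyword_marker`, `stemmer` and `asciifolding` token filters. It is an assumption about how such settings could be expressed, not a description of what appbase.io generates. The function name and the `custom_stop`, `stem_exceptions`, `custom_stemmer` and `language_analyzer` names are hypothetical.

```python
import json


def build_language_analysis(language, stopwords=None, stemming_exceptions=(),
                            normalize_diacritics=True):
    """Build Elasticsearch analysis settings for one language (illustrative only)."""
    filters = {
        # A custom list replaces the predefined list for the language, e.g. "_english_".
        "custom_stop": {"type": "stop", "stopwords": stopwords or f"_{language}_"},
        # Stemmer language names vary per language; "english" is a valid one.
        "custom_stemmer": {"type": "stemmer", "language": language},
    }
    chain = ["lowercase", "custom_stop"]
    if stemming_exceptions:
        # keyword_marker protects the listed words from the stemmer that follows.
        filters["stem_exceptions"] = {
            "type": "keyword_marker",
            "keywords": list(stemming_exceptions),
        }
        chain.append("stem_exceptions")
    chain.append("custom_stemmer")
    if normalize_diacritics:
        # asciifolding maps characters such as "é" to "e".
        chain.append("asciifolding")
    return {
        "analysis": {
            "filter": filters,
            "analyzer": {
                "language_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": chain,
                }
            },
        }
    }


if __name__ == "__main__":
    settings = build_language_analysis("english", ["the", "a"], ["skies"])
    print(json.dumps(settings, indent=2))
```

Placing `keyword_marker` before the stemmer matters: a word is only protected if it is marked before the stemming filter sees it.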
Search Settings allow you to control your search query settings.
The UI view lets you control and set the fields that have a search use-case and a variable field weight that's used to boost a match at search time.
By default, when appbase.io is used to import data for an Elasticsearch index, it sets the search use-case and the appropriate indexing and search analyzers for all the
text fields. As a user, you can change the fields that are searchable.
Outside of the ability to set searchable fields and optionally their weights, you can also set the following search settings.
Search Operators: Enabling Search Operators allows your end-users to use search operators and construct advanced queries such as
"fried eggs" +(eggplant | potato) -frittata, similar to those allowed in the advanced modes of popular consumer search engines. Internally, this setting translates the query term to use a simple query string query. This setting is disabled by default.
Enable Typo Tolerance: Enabling typo tolerance allows your end-users room to be slightly off with their search queries and have the search engine still interpret those correctly.
Once typo tolerance is enabled, you can use it in one of these three modes:
AUTO lets the search engine decide the number of acceptable characters that are off based on the length of the search query. We recommend this as a good default.
1 lets the search engine allow up to 1 character to be off from the indexed content.
2 lets the search engine allow up to 2 characters to be off from the indexed content.
Allowing typo-tolerance beyond 2 isn't recommended as that can yield a lot of false positive hits. You should instead set specific synonyms in such cases.
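To make the three modes concrete, here is a minimal sketch of how an edit-distance budget could be applied. The AUTO thresholds follow Elasticsearch's documented defaults for `fuzziness: AUTO` (terms of 1 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two). The function names are hypothetical. Plain Levenshtein distance is used for simplicity, whereas Elasticsearch by default also counts a swap of two adjacent characters as a single edit.

```python
def auto_max_edits(term):
    """Edits allowed in AUTO mode, based on the length of the term."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def levenshtein(a, b):
    """Number of single-character insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def is_fuzzy_match(term, indexed, fuzziness="AUTO"):
    """True if `indexed` is within the typo budget for `term`."""
    allowed = auto_max_edits(term) if fuzziness == "AUTO" else int(fuzziness)
    return levenshtein(term, indexed) <= allowed


if __name__ == "__main__":
    print(is_fuzzy_match("fritata", "frittata"))  # one missing letter, AUTO mode
    print(is_fuzzy_match("eg", "egg"))            # short terms must match exactly
```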
Enable Synonyms: Enabling synonyms lets you set a dictionary of synonyms that is used at search time to map to the indexed content. This setting is disabled by default.
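The searchable fields, their weights and the Search Operators setting described above can be pictured as an Elasticsearch query body. The sketch below is an assumption about the shape of such a query, not the exact request appbase.io sends. It uses the standard `field^weight` boost syntax and switches to `simple_query_string` when operators are enabled. The `multi_match` fallback, the function name and the field names are hypothetical.

```python
import json


def build_search_query(text, field_weights, operators_enabled=False, fuzziness=None):
    """Build an Elasticsearch query body over weighted searchable fields."""
    # "title^3" means a match in title counts three times as much.
    fields = [f"{name}^{weight}" for name, weight in field_weights.items()]
    if operators_enabled:
        # simple_query_string understands +, |, - and quoted phrases.
        return {"query": {"simple_query_string": {"query": text, "fields": fields}}}
    match = {"query": text, "fields": fields}
    if fuzziness is not None:
        # "AUTO", 1 or 2, mirroring the typo tolerance modes.
        match["fuzziness"] = fuzziness
    return {"query": {"multi_match": match}}


if __name__ == "__main__":
    weights = {"title": 3, "description": 1}
    advanced = build_search_query('"fried eggs" +(eggplant | potato) -frittata',
                                  weights, operators_enabled=True)
    print(json.dumps(advanced, indent=2))
    print(json.dumps(build_search_query("fritata", weights, fuzziness="AUTO"), indent=2))
```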
Aggregation Settings allows you to set the fields that should be used for aggregations (aka search facets).
The UI view lets you control and set the fields that have an aggregations use-case and select the type of the aggregation:
Term (which applies to both text and numeric data fields) or
Range (which applies to only numeric data fields).
Once a type is set, the Search Preview UI shows the facets for the aggregation fields.
These are some other options available in the Aggregations settings.
Default Size For Aggregations: This indicates the number of unique facet values to retrieve for a given aggregation field. Defaults to
10. We recommend not setting this to more than
Default Sort: This indicates the default sort order of facet values for a given aggregation field. Defaults to Count (highest value first), but it can also be set to either an ascending or a descending order.
Include Null Values: This setting dictates whether null (or empty) value data would be returned and displayed as a facet value for a given aggregation field. Defaults to
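In Elasticsearch terms, these options map naturally onto the `terms` and `range` aggregations. The following is a minimal sketch under that assumption, not the request appbase.io actually issues. The function name, the field names `brand` and `price`, and the `"N/A"` label used for documents without a value are hypothetical.

```python
import json


def build_aggregations(term_fields, range_fields, size=10, sort="count",
                       include_null=False):
    """Build an Elasticsearch request body with term and range facets."""
    order = {
        "count": {"_count": "desc"},  # highest count first
        "asc": {"_key": "asc"},
        "desc": {"_key": "desc"},
    }[sort]
    aggs = {}
    for field in term_fields:
        terms = {"field": field, "size": size, "order": order}
        if include_null:
            # Documents without a value are grouped under this bucket key.
            terms["missing"] = "N/A"
        aggs[field] = {"terms": terms}
    for field, bounds in range_fields.items():
        ranges = []
        for low, high in bounds:
            bucket = {}
            if low is not None:
                bucket["from"] = low   # inclusive
            if high is not None:
                bucket["to"] = high    # exclusive
            ranges.append(bucket)
        aggs[field] = {"range": {"field": field, "ranges": ranges}}
    # size 0 skips the hits and returns only the facets.
    return {"size": 0, "aggs": aggs}


if __name__ == "__main__":
    body = build_aggregations(["brand"], {"price": [(None, 50), (50, 100), (100, None)]})
    print(json.dumps(body, indent=2))
```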
Results Settings allow you to set the result page size, choose which fields to return, and configure result highlighting.
Page Size: Set the number of results to return in one search query's response. Defaults to
Fields To Return: Set the fields to exclude and include using specific fields or patterns. This lets you control the response size and, as a consequence, improve latency.
Result Highlight Settings: Enable highlighting and set specific fields to highlight, set the highlight tag (e.g.
<mark>), set the total highlight fragments and the max fragment size to return. Controlling these lets you optimize the response size with highlighting tailored for your use-case.
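These result settings have close counterparts in a plain Elasticsearch search body: `size`, `_source` filtering and `highlight`. The sketch below assumes that mapping and is not the exact request appbase.io generates. The function name, field names and the page size used in the example are illustrative only.

```python
import json


def build_results_settings(page_size, include, exclude, highlight_fields,
                           tag="mark", fragments=5, fragment_size=100):
    """Build the result-related part of an Elasticsearch search body."""
    return {
        "size": page_size,
        # Patterns such as "meta.*" are allowed in includes and excludes.
        "_source": {"includes": include, "excludes": exclude},
        "highlight": {
            "pre_tags": [f"<{tag}>"],
            "post_tags": [f"</{tag}>"],
            "number_of_fragments": fragments,
            "fragment_size": fragment_size,  # in characters
            "fields": {field: {} for field in highlight_fields},
        },
    }


if __name__ == "__main__":
    body = build_results_settings(10, ["title", "price"], ["meta.*"], ["title"])
    print(json.dumps(body, indent=2))
```

Fewer and smaller fragments, together with a narrow `_source` include list, are what keep the response small.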
Index Settings lets you configure the shards and replica settings for your index.
Manage Shards: Allows you to change your index's shard size by re-indexing in place.
How to think about sharding:
Each shard is a self-contained index. An Elasticsearch index is just a logical grouping of the physical shards.
- Data within different shards can be processed in parallel when executing a search query. The higher the number of shards, the faster the search can execute, provided that sufficient CPU cores and memory are available.
- At the same time, shards come with a significant overhead. There is a soft limit of 1,000 shards per data node. You want to keep the total shards per node below this limit.
Manage Replicas: Allows you to update your index replica settings.
A 1-replica setup implies all the primary shards are replicated, resulting in twice the number of net shards. Replication makes your search highly available by ensuring redundancy of data. A no-replica setup implies no redundancy. In case of a node failure, data residing in the shards of that node will become unavailable.
A good rule of thumb is to have a 1-replica setup whenever you have at least two nodes in the cluster. For an even higher redundancy, you can opt for a 2-replica setup.
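The shard and replica arithmetic above can be worked through in a few lines. This is a small sketch for a single index; the function names are hypothetical, and in a real cluster the per-node limit applies to the shards of all indices combined.

```python
import math

SOFT_LIMIT_PER_NODE = 1000  # soft limit of shards per data node


def total_shards(primaries, replicas):
    """Each replica adds one full copy of every primary shard."""
    return primaries * (1 + replicas)


def shards_per_node(primaries, replicas, data_nodes):
    """Worst-case number of this index's shards on any one node."""
    return math.ceil(total_shards(primaries, replicas) / data_nodes)


def within_soft_limit(primaries, replicas, data_nodes, other_shards_per_node=0):
    """Check the index against the per-node soft limit."""
    load = shards_per_node(primaries, replicas, data_nodes) + other_shards_per_node
    return load <= SOFT_LIMIT_PER_NODE


def has_redundancy(replicas, data_nodes):
    """A replica only helps if it can live on a different node."""
    return replicas >= 1 and data_nodes >= 2


if __name__ == "__main__":
    # 3 primaries with 1 replica on a 2-node cluster.
    print(total_shards(3, 1), shards_per_node(3, 1, 2), has_redundancy(1, 2))
```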
You can read more about shards and replicas over here.
The Schema view gives you an overall view of your index's fields. You can set a specific use-case, data type, add a new field to the index or remove a field from the index.
The Synonyms view allows you to add or edit synonyms for your search index. Synonyms set here are applied at query time and thus don't require re-indexing of data.
All searchable fields (i.e. use-case=search) get a
.synonyms field assigned to them. You should search against this particular field to take advantage of synonyms matching.
There are two types of synonyms supported:
- Equivalent Synonyms: Equivalent synonyms let you set two or more synonym words and treat all the words as equal.
- One way Synonyms: One way Synonyms let you set one or more alternative words for a given search term (an indexed content term). This then maps the alternative words to the given search term but not vice-versa.
Fun Fact: Synonyms are case-insensitive and can also span multiple words.
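One way to picture the two synonym types is the Solr-style rule format that Elasticsearch's synonym token filters accept: a comma-separated list for equivalent synonyms and `a, b => c` for a one-way mapping. The sketch below assumes that format and does not describe how appbase.io stores synonyms internally. The helper names are hypothetical, and lowercasing is used here only to mirror the case-insensitive behavior.

```python
def equivalent_synonyms(words):
    """All words are treated as equal, e.g. 'sofa, couch, settee'."""
    return ", ".join(word.strip().lower() for word in words)


def one_way_synonyms(alternatives, term):
    """Alternatives map to the indexed term, but not the other way round."""
    left = ", ".join(word.strip().lower() for word in alternatives)
    return f"{left} => {term.strip().lower()}"


def synonym_filter(rules):
    """A search-time synonym filter definition holding the given rules."""
    return {"type": "synonym_graph", "synonyms": list(rules)}


if __name__ == "__main__":
    rules = [
        equivalent_synonyms(["Sofa", "Couch", "settee"]),
        one_way_synonyms(["iphone", "smart phone"], "phone"),
    ]
    print(synonym_filter(rules))
```

Because a filter like this would sit in the search analyzer rather than the index analyzer, changing the rules does not require re-indexing, which matches the query-time behavior described above.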
Query Suggestions is an index that appbase.io populates daily based on search analytics. The Query Suggestions UI lets you set preferences for how this index should get populated. You can read more about it over here.
Query Rules allow you to set rules that trigger on the incoming search query, on filters, or universally. A rule can allow you to:
- Promote a particular result (useful for merchandizing),
- Hide an irrelevant result,
- Apply an additional facet,
- Modify the incoming search term,
- Return custom data (useful for advertising/merchandizing),
- Run a user-defined function - providing endless possibilities to extend search.
Learn more about query rules over here.
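To give a feel for the first two actions, here is a toy sketch of promoting and hiding results based on the incoming query. The rule structure (`query`, `promote`, `hide`) and the function name are entirely hypothetical and are not the appbase.io Query Rules API; a rule without a `query` key is treated as universal.

```python
def apply_query_rules(query, hits, rules):
    """Return hits with matching rules applied: promoted first, hidden removed."""
    normalized = query.strip().lower()
    promoted, hidden = [], set()
    for rule in rules:
        # Skip rules that target a different query; None means "always apply".
        if rule.get("query") not in (None, normalized):
            continue
        promoted.extend(rule.get("promote", []))
        hidden.update(rule.get("hide", []))
    promoted_ids = {doc["_id"] for doc in promoted}
    kept = [hit for hit in hits
            if hit["_id"] not in hidden and hit["_id"] not in promoted_ids]
    return [doc for doc in promoted if doc["_id"] not in hidden] + kept


if __name__ == "__main__":
    rules = [{"query": "iphone", "promote": [{"_id": "ad-1"}], "hide": ["3"]}]
    hits = [{"_id": "1"}, {"_id": "3"}]
    print(apply_query_rules("iPhone ", hits, rules))
    print(apply_query_rules("android", hits, rules))
```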
Functions allow you to run user-defined functions to extend the search engine. Read more about them over here.
The Search Preview UI is at the core of Search Relevancy views. It lets you test your configured search, aggregation, results settings and review them prior to setting them live.
You can see it present on all the views with a
Test Search Relevancy button.
Raw view: The Raw view lets you see the underlying search API call as well as modify it and see the response as a raw JSON. Under the hood, appbase.io uses the ReactiveSearch API to make the search requests.
You can export the Search Preview UI using the
Open in Codesandbox button. This produces a React boilerplate codebase built using ReactiveSearch.