How to Increase Search Result Relevancy: Collecting Relevancy Raw Data

May 22, 2019
By Jay M.

When people use your website’s search engine, they expect the most relevant results to be at the top of the list. When results fall short of that expectation, it’s good to know there are well-established strategies to improve search result relevance and, with it, user satisfaction. The first step is to build a dataset of relevancy judgments that you can use to measure the relevancy of your search results.

One source of relevancy judgments is Subject Matter Experts (SMEs). These people are usually part of your organization and have special skills and knowledge about the topics being searched. There are several methodologies for collecting the judgments of SMEs. The most common one is to present a single search result and ask the SME to rate its relevance to the query on a numerical scale (for example, from 1 to 4). You should ask them to do this many times for many different searches, and so build a dataset of relevancy judgments. In addition to having SMEs rate search results, it is also useful to ask expert users from your site’s user population to rate results and contribute to the dataset.
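To make the shape of such a dataset concrete, here is a minimal Python sketch that combines ratings from several raters into one judgment per query and result. The tuple layout, the function name aggregate_judgments, the sample queries and document IDs, and the choice to average the ratings are all assumptions made for illustration, not a prescribed method.

```python
from collections import defaultdict
from statistics import mean


def aggregate_judgments(ratings):
    """Average the ratings each (query, document) pair received.

    `ratings` is an iterable of (query, doc_id, rater, score) tuples,
    where score is on the 1-to-4 scale used in this post. Averaging is
    an assumption; a median or majority vote would work as well.
    """
    grouped = defaultdict(list)
    for query, doc_id, _rater, score in ratings:
        if not 1 <= score <= 4:
            raise ValueError(f"score out of range for {query!r}/{doc_id!r}: {score}")
        grouped[(query, doc_id)].append(score)
    return {pair: mean(scores) for pair, scores in grouped.items()}


# Hypothetical ratings collected from two SMEs.
ratings = [
    ("reset password", "doc-17", "sme-a", 4),
    ("reset password", "doc-17", "sme-b", 3),
    ("reset password", "doc-42", "sme-a", 1),
]
judgments = aggregate_judgments(ratings)
```

Keeping the rater in each record, even though this sketch ignores it when averaging, leaves room to check later how often your raters agree with one another.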

An alternative to surveying SMEs and expert users is to record the behavior of real users on your site and use it to make inferences about the relevancy of the search results they encounter. This can be done with the kind of web analytics and reporting tool that most public sites already use. Here are some of the signals you can analyze to infer the relevance of search results:

Record which result the user clicks on first after conducting a search. Any results ranked above it that the user skipped over should be considered less relevant than the one that was clicked.
Record the dwell time on the page after the user clicks a result. If the user clicks on a search result and spends a significant amount of time on the resulting page, the relevancy of that document can be rated higher. If the user clicks, but then immediately returns to the search results, that document can be rated as less relevant.
Where applicable, it may be possible to keep track of users who take another action on a document after clicking through to it. In e-commerce, a product that gets added to the shopping cart is more relevant than one that isn’t. In other contexts, it may be possible to track whether a user shares, saves, bookmarks or prints a document. All of these actions would indicate a higher relevance.
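The three signals above can be sketched in Python as follows. This is only an illustration under stated assumptions: the SearchSession record, its field names, the 30-second dwell threshold and the 0-to-3 scoring scale are all hypothetical, and a real analytics tool will expose this data in its own format.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed threshold for "a significant amount of time"; tune it for your site.
MIN_DWELL_SECONDS = 30.0


@dataclass
class SearchSession:
    """One search and what the user did next (hypothetical log record)."""
    query: str
    results: List[str]               # document IDs in the order shown
    clicked: Optional[str] = None    # first result clicked, if any
    dwell_seconds: float = 0.0       # time spent on the clicked page
    took_action: bool = False        # added to cart, shared, saved, printed...


def preference_pairs(session):
    """Return (preferred, skipped) pairs: the clicked result is judged
    more relevant than every result ranked above it that was skipped."""
    if session.clicked is None or session.clicked not in session.results:
        return []
    position = session.results.index(session.clicked)
    return [(session.clicked, skipped) for skipped in session.results[:position]]


def click_score(session):
    """Grade the clicked result on an assumed 0-3 scale:
    0 = no click, 1 = quick return to the results,
    2 = long dwell, 3 = follow-up action."""
    if session.clicked is None:
        return 0
    if session.took_action:
        return 3
    if session.dwell_seconds >= MIN_DWELL_SECONDS:
        return 2
    return 1


# Hypothetical session: the user skipped two results and read the third.
session = SearchSession(
    query="blue widget",
    results=["d1", "d2", "d3"],
    clicked="d3",
    dwell_seconds=45.0,
)
pairs = preference_pairs(session)   # [("d3", "d1"), ("d3", "d2")]
score = click_score(session)        # 2
```

Note that the skip signal yields relative preferences between results, while dwell time and follow-up actions yield a grade for a single result, so the two kinds of output are usually stored separately.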

This post describes a few of the most common ways of collecting relevancy data for the results your search engine returns. In my next post, I’ll describe a few of the ways you can analyze this data to find out how well your search engine is doing at returning relevant results to your users. Stay tuned!