Why Use Relevance Feedback?

An alert reader wrote:

> The problem is that when it comes to searching databases, the only kind of
> search Felice is familiar with/likes/understands/whatever is good old
> fashioned booleans. I've been trying, without much success, to convince her
> that the relevance ranking scheme employed in the "simple" search is actually
> much easier to use, as well as being more powerful at the same time, but it's
> an uphill battle. She now thinks boolean searches are the *only* form we
> should offer on our web site.
>
> The problem is that I haven't yet figured out how to explain what relevance
> ranking is all about in a way that makes sense, and I don't understand the
> details of the algorithm(s) well enough, so I always end up getting argued
> into a corner and find myself unable to explain some detail.

It's not about relevance ranking so much as it is about information discovery. Consider the query "dog AND cat". If you want information about dogs and cats, this won't necessarily give you what you want, because some documents that mention dog but not cat (or cat but not dog) may well be of interest. Rather than ignoring those documents completely, it's better to put the documents with both words at the top of the ranking and the rest lower in the list.
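To make the difference concrete, here is a minimal sketch in Python. The four-document collection, the function names, and the scoring rule (count how many query words a document contains) are all assumptions made up for illustration; real search engines use more elaborate weighting, and this is not the algorithm of any particular product.

```python
# Toy collection, invented for this example.
docs = {
    "d1": "my dog chased the cat",
    "d2": "dog training tips",
    "d3": "cat grooming basics",
    "d4": "bird watching guide",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def boolean_and(query_terms, collection):
    """Return only the documents that contain every query term."""
    return [
        doc_id
        for doc_id, text in collection.items()
        if all(term in tokenize(text) for term in query_terms)
    ]


def ranked(query_terms, collection):
    """Return documents matching at least one term, best matches first.

    The score is simply the number of distinct query terms present
    (an assumed, deliberately crude stand-in for a real ranking formula).
    """
    scored = []
    for doc_id, text in collection.items():
        words = set(tokenize(text))
        score = sum(1 for term in query_terms if term in words)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


and_hits = boolean_and(["dog", "cat"], docs)
ranked_hits = ranked(["dog", "cat"], docs)
print(and_hits)     # ['d1']
print(ranked_hits)  # ['d1', 'd2', 'd3']
```

The Boolean query returns only the one document with both words. The ranked query puts that same document first, but still lets the reader see the dog-only and cat-only documents further down, and leaves out the one that matches nothing.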

The problem with using Booleans in full-text searches is that they're overly selective. That's fine when you're selecting records from a relational database, where you either match a field value or you don't, but it doesn't allow you to assess "close" matches. And close counts if you're seeking information instead of just selecting records.

Booleans are traditionally most useful for what I call "bibliographic searches" - when you're matching a pattern. Relevance ranking is most useful for information discovery. There's a big difference. Which does she think your users are most likely to be doing?

One other distinction occurs to me. Booleans are good if you pretty much know what's in the database and want to select certain records (implying you know ahead of time which ones you ought to get back). Relevance rankings are good if you don't know what's in the database and want to get results which are _likely_ to be useful.

Hope this helps. I don't actually think one is better than the other – it depends on what your users are trying to do. What I do know is that when the only tool you have is a hammer, everything looks like a nail.