Friday, February 4, 2011

Difference Between FREETEXT and CONTAINS


Let's take a look at the comparison first, and then we can work through an example.  The FREETEXT command is one way to query the data indexed by Full Text Search.  In general, the FREETEXT command searches for matches based on the meaning of the terms rather than the exact character string.  At a high level, this command finds matches by separating the string into individual words, determining inflectional forms of each word, and using a thesaurus to expand or replace the terms to improve the search.

Now let's compare the FREETEXT functionality with the CONTAINS command.  The CONTAINS command uses exact or fuzzy matching against a single word or a phrase.  In addition, it can find words near another word, and it can perform a weighted match of multiple words, where each word carries a weight relative to the others being searched.  Check out CONTAINS (Transact-SQL) for an explanation of the CONTAINS command.
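The CONTAINS search forms described above can be sketched as follows. This is a minimal illustration, not a tested script: it assumes a full-text index already exists on the Name column of Production.Product in AdventureWorks, and the search terms are only placeholders.

```sql
USE AdventureWorks;
GO
-- Assumption: a full-text index exists on Production.Product (Name).

-- Exact phrase match.
SELECT ProductID, Name
FROM Production.Product
WHERE CONTAINS(Name, '"Mountain Frame"');

-- Prefix (wildcard) match: words that start with "chain".
SELECT ProductID, Name
FROM Production.Product
WHERE CONTAINS(Name, '"chain*"');

-- Proximity match: "mountain" near "frame".
SELECT ProductID, Name
FROM Production.Product
WHERE CONTAINS(Name, 'mountain NEAR frame');

-- Inflectional match: forms of the word "ride" (rides, riding, ...).
SELECT ProductID, Name
FROM Production.Product
WHERE CONTAINS(Name, 'FORMSOF (INFLECTIONAL, ride)');

-- Weighted match: each term carries a weight between 0.0 and 1.0.
SELECT ProductID, Name
FROM Production.Product
WHERE CONTAINS(Name, 'ISABOUT (frame WEIGHT (.8), seat WEIGHT (.2))');
GO
```

Note that in the CONTAINS predicate the weights do not change which rows are returned; they influence the rank values produced by CONTAINSTABLE.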

The search you are performing dictates which Full Text Search command you should use.  Keep in mind that FREETEXT and CONTAINS are only two of the four commands available.  The other two commands are CONTAINSTABLE and FREETEXTTABLE.  The comparison between the four commands will be saved for a future tip, since it is a fairly involved explanation that should include examples.

Until then, here is one data point to consider.  According to SQL Server 2005 Books Online, FREETEXT (Transact-SQL): "Full-text queries using FREETEXT are less precise than those full-text queries using CONTAINS. The SQL Server full-text search engine identifies important words and phrases. No special meaning is given to any of the reserved keywords or wildcard characters that typically have meaning when specified in the parameter of the CONTAINS predicate."  Based on my testing, when basic terms are queried with either command, similar results are returned, so precision seems to be less of an issue for simple queries.  For complex searches, the CONTAINS command wins hands down with its ability to use wildcards, NEAR statements, etc.  As such, if the front end application exposes flexible search options, then the back end needs the more flexible of the two commands, which tips the scales toward the CONTAINS command.
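To make the Books Online point concrete, here is a hedged side-by-side sketch. It assumes a full-text index on the Name column of Production.Product, and the search terms are illustrative only. The same search string is passed to both commands, but only CONTAINS interprets NEAR as a proximity operator.

```sql
USE AdventureWorks;
GO
-- Assumption: a full-text index exists on Production.Product (Name).

-- FREETEXT: NEAR gets no special meaning; the string is broken into
-- words and matched on meaning (inflectional forms, thesaurus).
SELECT ProductID, Name
FROM Production.Product
WHERE FREETEXT(Name, 'mountain NEAR frame');

-- CONTAINS: NEAR is a proximity operator, so "mountain" must
-- appear close to "frame" for a row to match.
SELECT ProductID, Name
FROM Production.Product
WHERE CONTAINS(Name, 'mountain NEAR frame');
GO
```

Comparing the two result sets on your own data is a quick way to judge whether the extra precision of CONTAINS matters for a given search.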

Example:

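-- Assumes a full-text index exists on Production.Product.
USE AdventureWorks;
GO
-- The * searches every full-text indexed column of the table.
-- FREETEXT splits the string into individual words and matches on meaning.
SELECT *
FROM Production.Product
WHERE FREETEXT(*, 'screw washer spanner');
GO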
