Full Text Queries : CONTAINS and FREETEXT predicates
This is the second post in the series of posts on full text search. In the first post of the series I discussed on full text search basics and configuration.
For running full text queries you use either full text predicates viz. CONTAINS and FREETEXT or full text functions viz. CONTAINSTABLE and FREETEXTTABLE.
Full text predicates are used in WHERE clause of the query and return true or false whereas Full text functions return Tables and hence are used in FROM clause of the query.
In this post I would be discussing about full text queries using predicates.Using full text functions would be topic for next post.
Full text queries are quite powerful and can be of following types:
- Simple Term : Searching for a particular word or phrase
- Prefix Term : searching for prefix word or phrase
- Proximity Term: Looks for words or phrases close to each other
- Generation Term : Looks for inflectional forms i.e. if the word is a verb the search will look for various tenses of word .E.g if you search for “Start” you will get results for Start ,Started,Starting etc.Similarly if word you are looking for is a Noun you will get singular and plural forms of it.
- Thesaurus : Looks for synonyms of a particular word based on thesaurus.E.g if you search for Start, search will look for Start ,Begin etc.
- Weighted Term: Looks for various search terms based on the weights associated with them
FREETEXT and CONTAINS
FreeText is more restrictive predicate which by default does a search based on various forms of a word or phrase (that means it by default includes Inflectional forms as well as thesaurus).
Contains, unlike FreeText, gives you flexibility to do various forms of search separately.
Let’s see some examples below to make it more clear how different searches are done through FREETEXT and CONTAINS.
CONTAINS
As I mentioned above, CONTAINS provides the flexibility of running various forms of queries separately. Let’s go one by one through few examples of each form to get better understanding.
Simple Term
Query to Return records with exact match for word “Clothe” in column EmployeeJobSector and EmployeeResume
SELECT * FROM EMPLOYEEPROFILE WHERE CONTAINS ((EMPLOYEEJOBSECTOR ,EMPLOYEERESUME), 'CLOTHE')
Query to return records with exact match for word “Clothe” in any of the columns (one query searches in all the columns of the table)in the table EmployeeProfile
SELECT * FROM EMPLOYEEPROFILE WHERE CONTAINS (*, 'CLOTHE')
Prefix Term
Query to return records with values starting with word “Cloth” in column EmployeeJobSector
SELECT * FROM EMPLOYEEPROFILE WHERE CONTAINS (EMPLOYEEJOBSECTOR, 'CLOTH*')
Proximity Term
Query to return records where Word “Extensive” and “Experience” are at max 10 letters apart.
SELECT * FROM EMPLOYEEPROFILE WHERE CONTAINS (EMPLOYEERESUME, 'NEAR ((EXTENSIVE, EXPERIENCE), 10)')
Query to return records where Word “Extensive” and “Experience” are at max 10 letters apart and are in the same sequence (i.e. Extensive comes before Experience)
SELECT * FROM EMPLOYEEPROFILE WHERE CONTAINS (EMPLOYEERESUME, 'NEAR ((EXTENSIVE, EXPERIENCE), 10, TRUE)')
Generation Term
Query to return records with all the inflectional forms of the word “Start” like Start, Started, and Starting etc.
SELECT * FROM EMPLOYEEPROFILE WHERE CONTAINS (*, ‘FORMSOF (INFLECTIONAL, Start)')
Thesaurus
Query to return records with all the synonyms of the word “Start” like Start, Begin etc.
SELECT * FROM EMPLOYEEPROFILE WHERE CONTAINS (*, ‘FORMSOF (THESAURUS, Start)')
NOTE: For this query to work you must have a Thesaurus available in you SQL Server data folder .Usually SQL Server does not ship with a default thesaurus but it can be easily downloaded and once put in the appropriate folder queries should work fine.Refer this post for details.
FreeText
FREETEXT predicate, as I mentioned earlier, does not provide much variations and comes by default with inflectional and thesaurus search. Syntax for using FREETEXT is exactly the same as CONTAINS i.e. Just replace CONTAINS with FREETEXT.
OK. So now the questions comes … why we need FREETEXT if we already have CONTAINS?.
Well from doing these samples one of the case where I found it useful (and I am sure there may be more cases) is when you need to do an Inflectional, Thesaurus and NEAR search together in a single query. Below is an example
Query to return records with all forms of word “Start” and “Teach” (both inflectional and synonyms) occuring within 10 letter spaces of each other
SELECT * FROM EMPLOYEEPROFILE WHERE FREETEXT(EMPLOYEERESUME, 'NEAR ((START, TEACH), 10)')
Above query will search for all the forms of the word like Start, Started, Teach, Taught, Teaching etc.
This kind of search is not simple to do using CONTAINS as combining FORMSOF and NEAR is not permitted in a single query.
Hope this post gives good idea on full text queries using predicates. In next post I will discuss on function CONTAINSTABLE and FREETEXTTABLE
For further reading on this and related topics, you can refer to this exhaustive course on pluralsight.
Pingback: Full Text Search : Basics and Configuration | Coding Canvas
Pingback: Full Text Queries : CONTAINSTABLE and FREETEXTTABLE functions | Coding Canvas
Pingback: Full text search : Stoplist and Stopword | Coding Canvas
Is it possible to send the word Clothes as a parameter in below
SELECT * FROM EMPLOYEEPROFILE
WHERE
FREETEXT(EMPLOYEERESUME’FORMSOF(INFLECTIONAL, ‘+ @freetext+’)’)
this is my query:- select * from dbo.zones where freetext(name, ‘test’); select * from dbo.zones where contains(name, ‘formsof(INFLECTIONAL, test)’);
this give me result of- test, testing, tested
select * from dbo.zones where contains(name, ‘test’);
this query only give me test result.
I am trying to get record which have “test” string like, “this is a test case”.
so my query needs to find “test” string in “this is a test case” values also.
CONTAINS seems to have a bug that does not return all the correct values. So we may revert back to using LIKE operator. And if so, we will not be able to use the full_text_index that was built to be used by CONTAINS operator.