MySQL fulltext search Boolean mode confusion
I'm getting a bit confused when trying to set up a search utilizing fulltext search in boolean mode. Here is the query I'm using:
$query = "SELECT *, MATCH(title) AGAINST('$q' IN BOOLEAN MODE) AS score FROM results WHERE MATCH(title) AGAINST('$q' IN BOOLEAN MODE) ORDER BY score DESC";
When I run a search for +divorce+refinance, the results returned are:
1) Divorce: Paying Off Spouse = Rate/Term Refinance
2) Divorce - What to Look Out For Regarding Divorced Borrowers
Am I right in thinking that the second result should not be appearing, as it does not have both words? If not, how can I create that functionality?
Maybe I am mistaken, but searching for the string +divorce+refinance gives a weird result. If you want to require both words, you should search for +divorce +refinance (with a space between the two terms).
I tested it and it returns only one row:
Divorce: Paying Off Spouse = Rate/Term Refinance
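As a rough sketch of that fix, the user's input could be normalised before it is placed inside AGAINST(...). This is written in Python rather than the PHP from the question, the helper name to_boolean_query is made up for illustration, and it assumes every word the user types should be required:

```python
import re


def to_boolean_query(user_input: str) -> str:
    """Prefix every word with '+' and separate the words with spaces."""
    # \w+ keeps runs of letters, digits and underscores, so characters
    # that act as boolean-mode operators (+ - < > ( ) ~ * ") are dropped.
    words = re.findall(r"\w+", user_input)
    return " ".join("+" + word for word in words)


print(to_boolean_query("+divorce+refinance"))  # +divorce +refinance
print(to_boolean_query("divorce refinance"))   # +divorce +refinance
```

Note that this simple pattern also splits words containing apostrophes or hyphens, which may or may not be what you want.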
Your problem comes down to building a prioritized Boolean query. For this type of query you need to look more closely at Boolean search and know how it is performed. Let me explain in simple terms why the second result is shown.
First, what does Boolean mean in programming? It means a condition is either true or false, i.e. 1 or 0.
Now, how is the Boolean search performed? You have given two words. The search engine goes through the table row by row. Wherever the first word is found, it marks the record as true, gives a score of 1 to that row, and also records the number of words found in the row.
It then moves on to the next word and repeats the process: it marks each record where the word is found as true, builds a list of those records, and records the number of words found in each row.
Now there are two result lists. They are combined, with priority given to the rows with the largest number of matching words, and this is where the main problem lies.
First word   Total nos. (results)   Second word   Total nos. of words   Final results   Row no.   Answer
1            2                      1             1                     1.33            1         1.33
0            0                      2             2                     1.25            2         1.25
0            0                      1             0                     1.25            3         1
When the two result lists are combined, true added to false still gives true, just as 1 + 0 = 1, whereas the results that should be shown are those with a value greater than 1. So, when relevance is scored on the words found, the search engine ends up showing rows in which it found any of the words.
There are two ways to handle relevance scoring in such queries. The first is to ignore scores equal to 1 and only do calculations on the records whose score is greater than 1. The second is to write the query so that it never returns records with a score equal to 1. In your case you can also do the following to get the correct results for two words:
SELECT *,
       ( (1.3 * (MATCH(title) AGAINST ('+term +term2' IN BOOLEAN MODE)))
       + (0.6 * (MATCH(text) AGAINST ('+term +term2' IN BOOLEAN MODE))) ) AS relevance
FROM results
WHERE MATCH(title, text) AGAINST ('+term +term2' IN BOOLEAN MODE)
HAVING relevance > 0
ORDER BY relevance DESC;
I know that using HAVING makes the query a little slower, but there is no other solution that I know of. Hope this solves your problem.
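If it helps, here is a minimal sketch of the same weighted query with the search string passed as a bound parameter instead of being pasted into the SQL. It is in Python rather than PHP, the function name build_weighted_search is hypothetical, and it assumes a driver that uses %s placeholders (as the common MySQL drivers for Python do) plus a fulltext index covering title and text:

```python
def build_weighted_search(boolean_query: str):
    """Return (sql, params) for a title/text search weighted 1.3 / 0.6."""
    sql = (
        "SELECT *, "
        "(1.3 * MATCH(title) AGAINST (%s IN BOOLEAN MODE)) + "
        "(0.6 * MATCH(text) AGAINST (%s IN BOOLEAN MODE)) AS relevance "
        "FROM results "
        "WHERE MATCH(title, text) AGAINST (%s IN BOOLEAN MODE) "
        "HAVING relevance > 0 "
        "ORDER BY relevance DESC"
    )
    # The same search string is bound once per placeholder (three times).
    return sql, (boolean_query,) * 3


sql, params = build_weighted_search("+divorce +refinance")
print(sql)
print(params)
```

The pair would then be handed to the driver, for example cursor.execute(sql, params), so the search string never has to be escaped by hand.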