MongoDB text index search returns documents that do not contain the word: does 'cama' (bed) match 'câmera' (camera)?

Hello, I am new to this database and I created the following index for my documents:

produtos.create_index([('Tags', pymongo.TEXT)], unique=False, sparse=True,name='tags', default_language='portuguese')

Then I run a find():

   db.getCollection('produtos').find({$text:{$search:'cama'}})

It returns a lot of smartphones, for example one with these tags:

"Tags" : [ 
        "iphone", 
        "64gb", 
        "dourado", 
        "tela", 
        "4.7", 
        "ios", 
        "4g", 
        "câmera", 
        "12mp", 
        "apple"
    ],

The only far-fetched explanation I could find is the tag "câmera". I deleted "câmera", but entries with "cam" are still being returned. Does it search by characters instead of words? I tried 'Galax' and it did not return any "Samsung Galaxy". But 'ga' returns motherboards with tags like 'Gigabytes', 'lga' ... I’m lost here.

1 answer

From what I understand of what you want to do, the $search operator, as you are using it, is not ideal for this case. $search splits the string you pass into several terms and performs a logical OR across them. In other words, it tokenizes the search string. The MongoDB documentation describes this behavior.

For example, db.times.find( { $text: { $search: "vasco vice" } } ) returns every document that contains either the term "vasco" or the term "vice". If you want the whole string to be matched as a phrase, wrap it in escaped double quotes: db.times.find( { $text: { $search: "\"vasco vice\"" } } )
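
The two query shapes above can be sketched in Python as plain dictionaries, which is what pymongo expects as a filter. This is only a minimal sketch: the helper names `any_term_filter` and `phrase_filter` are made up for illustration, and the `db.times` collection is the one from the example above.

```python
def any_term_filter(terms):
    """Build a $text filter matching documents that contain ANY of the terms."""
    # Space-separated terms are tokenized by the server and combined with OR.
    return {"$text": {"$search": " ".join(terms)}}


def phrase_filter(phrase):
    """Build a $text filter that asks for the phrase as a whole."""
    # Wrapping the phrase in double quotes requests a phrase match.
    return {"$text": {"$search": '"{}"'.format(phrase)}}


or_query = any_term_filter(["vasco", "vice"])
phrase_query = phrase_filter("vasco vice")

print(or_query)      # {'$text': {'$search': 'vasco vice'}}
print(phrase_query)  # {'$text': {'$search': '"vasco vice"'}}

# With pymongo and a text index in place, these would be passed as:
#   db.times.find(or_query)
#   db.times.find(phrase_query)
```

Building the filter separately from the call to find() makes it easy to print and inspect exactly what is sent to the server.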

Another point is that $text can be case sensitive or not; it has a parameter for this ($caseSensitive), as described in the documentation.
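
As a rough sketch of how that parameter is passed, the filter below sets `$caseSensitive` explicitly, together with the related `$diacriticSensitive` option (relevant for accented tags such as "câmera"). The helper name `text_filter` is hypothetical; both options default to false on the server when omitted.

```python
def text_filter(search, case_sensitive=False, diacritic_sensitive=False):
    """Build a $text filter with explicit sensitivity options."""
    return {
        "$text": {
            "$search": search,
            # False (the server default) ignores upper/lower case.
            "$caseSensitive": case_sensitive,
            # False (the server default) treats "câmera" and "camera" alike.
            "$diacriticSensitive": diacritic_sensitive,
        }
    }


default_query = text_filter("galaxy")
strict_query = text_filter("Galaxy", case_sensitive=True)

# With pymongo these would be used as, for example:
#   db.produtos.find(strict_query)
```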

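Now, to answer some of your questions: when you searched for galax, the query did not return Samsung Galaxy most likely because $text matches whole terms rather than prefixes, so galax does not match galaxy (case should not be the cause, since $text is case insensitive unless $caseSensitive is set). And when you searched for ga, it returned the Gigabytes and lga documents presumably because both contain ga as a term.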
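
If partial matches such as galax are what you actually need, one possible alternative (not part of the answer above, just a sketch) is a regular-expression query on the Tags array instead of $text. The helper names and the sample tag lists below are made up for illustration, and `matches_locally` only approximates in plain Python what the server would do with the filter.

```python
import re


def tag_prefix_filter(prefix):
    """Build a filter matching documents where some tag starts with prefix."""
    # re.escape keeps user input from being interpreted as regex syntax.
    return {"Tags": {"$regex": "^" + re.escape(prefix), "$options": "i"}}


def matches_locally(filter_doc, tags):
    """Rough local approximation of how the filter treats an array of tags."""
    spec = filter_doc["Tags"]
    flags = re.IGNORECASE if "i" in spec.get("$options", "") else 0
    pattern = re.compile(spec["$regex"], flags)
    # On an array field, the regex matches if any element matches.
    return any(pattern.search(tag) for tag in tags)


query = tag_prefix_filter("galax")
print(matches_locally(query, ["samsung", "Galaxy", "s7"]))  # True
print(matches_locally(query, ["iphone", "câmera"]))         # False

# With pymongo this would be used as:
#   db.produtos.find(query)
```

Keep in mind that a regex query like this does not use the text index, so it may be slow on a large collection.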

  • Indeed, in my hasty reading of the documentation I missed that right at the beginning it says "perform a text search of string content". I still wonder why 'câmera' was returned when 'cama' is a different word. Anyway, this is solved now. Thanks.
