On Thu, 7 Jul 2011 12:37:00 +0100, Patrick Totzke wrote:
> Hi!
> Something strange goes on when I use unicode literals as querystrings:
> Database().create_query(u'teststring') yields different results than
> Database().create_query('teststring')..
>
> Now it should not be a problem to decode the string to whatever encoding
> is used by notmuch/xapian internally using 'teststring'.encode('utf8')
> for example. But can I reliably expect all strings in the index to be valid utf8?
>
> At any rate, I think this conversion should be made from inside the bindings.
> A query should return the same results for querystrings as string- and unicode literals.
> Any thoughts?

I hate encodings and they always confuse the heck out of me. I would
prefer it if everything were always UTF-8.

notmuch.h doesn't actually state which encoding the query string should
be in, and neither does http://xapian.org/docs/queryparser.html. ojwb
said it takes UTF-8, so that's what we should be doing.

I'll send a patch as a reply shortly. Patrick, would you care to test
whether it fixes things for you?

Sebastian
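
For illustration only, here is a rough sketch of the kind of conversion discussed above: normalise the query string to UTF-8 bytes before it reaches the C library, so that string and unicode literals end up identical. The helper name to_utf8 is made up for this example and is not part of the bindings or of the patch; it only assumes, as stated above, that UTF-8 is what the query parser expects.

```python
def to_utf8(querystring):
    """Return querystring as UTF-8 encoded bytes (hypothetical helper)."""
    if isinstance(querystring, bytes):
        # Already a byte string: check that it is valid UTF-8, then pass it on.
        querystring.decode("utf-8")  # raises UnicodeDecodeError if invalid
        return querystring
    # Text (unicode) string: encode explicitly instead of relying on a
    # default codec.
    return querystring.encode("utf-8")


# Both spellings of the same query normalise to identical bytes.
assert to_utf8(u"teststring") == to_utf8(b"teststring")
```

Doing this in one place inside the bindings would mean callers never need to care which kind of string they pass in. A byte string that is not valid UTF-8 raises UnicodeDecodeError here rather than being handed on silently; whether that is the right behaviour is a separate question.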