From: Jean Louis
Newsgroups: gmane.emacs.help
Subject: Re: Any package for boolean search?
Date: Sat, 28 Dec 2024 19:20:38 +0300
To: Joel Reicher
Cc: Help GNU Emacs

* Joel Reicher [2024-12-28 19:08]:
> Solving a search query is an eval, but I think you are saying you want to
> avoid code injection which is fair enough.

The search query is just a string I receive, so parsing the string for
OR will be my first development.

> Off the top of my head you can map over the sexpr returned by read and
> rewrite the symbols to ones you control before evaling.

I understand that approach, and I understand there would be a way to
allow only some symbols, though I assure you that reading the sexp is
not needed.

Generally I do a lot of evals in my documents. I find the feature
fantastic. I am using it for template interpolation.

RCD Template Interpolation System for Emacs:
https://hyperscope.link/3/7/1/3/3/RCD-Template-Interpolation-System-for-Emacs.html

Though for website search it will be string parsing on chunks of
different types of queries.

> In my opinion it is still worth avoiding a parse, which is what Lisp can
> give you.

1. First stage of development

   - treat it as exact, collect IDs

   - allow 1-2 Levenshtein difference,

      -- Function: string-distance string1 string2 &optional bytecompare
          This function returns the “Levenshtein distance” between the
          source string STRING1 and the target string STRING2. The
          Levenshtein

   - get 5-10 pages where both or more words of the query, which are
     not stop words, appear apart from each other, unless the user
     quoted the query

2. Second stage

   - recognize if there is any OR

-- 
Jean Louis
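
The two stages listed above could be sketched roughly as follows. This is only an illustration in Python, not the code the post describes: the stop-word list, the `pages` mapping of ID to text, and the names `parse_query`, `term_matches` and `search` are all assumptions made for the example. The `levenshtein` helper computes the same edit-distance measure that the quoted `string-distance` documentation describes.

```python
import re

# Hypothetical stop-word list; the post does not say which words it uses.
STOP_WORDS = {"a", "an", "and", "or", "the", "of", "for", "in", "to"}


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def parse_query(query: str) -> list[list[tuple[str, bool]]]:
    """Second stage: split on an uppercase OR into alternatives.

    Each alternative is a list of (term, is_quoted) pairs. Quoted
    phrases are kept whole; unquoted stop words are dropped.
    """
    alternatives = []
    for part in re.split(r"\s+OR\s+", query.strip()):
        terms = []
        for phrase, word in re.findall(r'"([^"]+)"|(\S+)', part):
            if phrase:
                terms.append((phrase.lower(), True))
            elif word.lower() not in STOP_WORDS:
                terms.append((word.lower(), False))
        if terms:
            alternatives.append(terms)
    return alternatives


def term_matches(term: str, quoted: bool, text: str, max_distance: int) -> bool:
    """Quoted terms must appear verbatim; others may differ by a few edits."""
    text = text.lower()
    if quoted:
        return term in text
    return any(levenshtein(term, w) <= max_distance
               for w in re.findall(r"\w+", text))


def search(query: str, pages: dict[int, str],
           max_distance: int = 1, limit: int = 10) -> list[int]:
    """First stage: exact hits first, then pages where all terms appear."""
    whole = query.strip().strip('"').lower()
    alternatives = parse_query(query)
    exact, loose = [], []
    for page_id, text in pages.items():
        if whole in text.lower():
            exact.append(page_id)
        elif any(all(term_matches(t, q, text, max_distance) for t, q in alt)
                 for alt in alternatives):
            loose.append(page_id)
    return (exact + loose)[:limit]
```

With a page titled "Boolean search in Emacs" and another titled "Bolean serch package", the unquoted query `boolean search` would return both, the exact one first, while the quoted query `"boolean search"` would return only the first. No Lisp reader or eval is involved; the query is handled purely as a string.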