From: Mattias Engdegård
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs i18n
Date: Thu, 28 Mar 2019 12:03:26 +0100
Message-ID: <9B85B6CD-13C2-4E38-8A7A-19FC76CBE8E1@acm.org>
In-Reply-To: <87d0mcui68.fsf@mail.linkov.net>
To: Juri Linkov
Cc: Emacs developers

On 27 March 2019 at 22:22, Juri Linkov wrote:
>
> I tried ‘regexp-opt’ and it generates a ready-to-use regexp:
>
>   (replace-regexp-in-string
>    "%d" "\\\\([0-9]+\\\\)"
>    (regexp-opt '("finished with %d match found"
>                  "finished with %d matches found"
>                  "finished with no matches found")))
>
>   ⇒ "\\(?:finished with \\(?:\\(?:\\([0-9]+\\) match\\(?:es\\)?\\|no matches\\) found\\)\\)"

Well now. There is no guarantee that regexp-opt won't split the %d.
Format strings must be parsed left-to-right for correctness¹. I'm
still skeptical, but if you really want to give this a try, then first
segment the format string:

  "Today %d little piggies built %03o houses and said '%s'."
  "Today %d little piggy built %o house and said '%s'."
=>
  ("Today " ?d " little piggies built " ?o " houses and said '" ?s "'.")
  ("Today " ?d " little piggy built " ?o " house and said '" ?s "'.")

leaving the format placeholders as atomic entities (here shown as
characters, but you may need more information there).

Then run your fav diff algo on the result. Most important to performance
is prefix merging; anything else is just to make the regexp smaller.
Here, prefix and suffix merging would leave you with (still in abstract
form)

  ("Today " ?d " little pigg"
   (("ies built " ?o " houses")
    ("y built " ?o " house"))
   " and said '" ?s "'.")

From there you can either recursively try to find more common
subsequences, or call it a day and render it into a regexp:

  "Today -?[0-9]+ little pigg\\(?:ies built -?[0-7]+ houses\\|y built -?[0-7]+ house\\) and said '\\(?:.\\|\n\\)*'."

All this will need to be done at run-time, since it is run on translated
strings.
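
For what it's worth, here is a rough sketch of the merge-and-render step for exactly two segmented strings. It is only an illustration under a few assumptions: the my-fmt- names are made up, placeholders are wrapped as (fmt . CHAR) tokens so that they stay atomic while the literal text is compared character by character, and only %d, %o and %s are rendered. Literal text goes through regexp-quote, so the trailing "." comes out escaped, unlike the hand-written regexp above.

```elisp
(require 'cl-lib)
(require 'seq)

(defun my-fmt-explode (segments)
  "Flatten SEGMENTS (strings and placeholder chars) into a token list.
Each literal character becomes its own token; a placeholder char
such as ?d becomes the atomic token (fmt . ?d)."
  (cl-loop for seg in segments
           if (stringp seg) append (string-to-list seg)
           else collect (cons 'fmt seg)))

(defun my-fmt-common-prefix-length (a b)
  "Return the number of leading tokens that lists A and B share."
  (let ((n 0))
    (while (and a b (equal (car a) (car b)))
      (setq n (1+ n) a (cdr a) b (cdr b)))
    n))

(defun my-fmt-merge (a b)
  "Split token lists A and B into (PREFIX MIDDLE-A MIDDLE-B SUFFIX)."
  (let* ((p (my-fmt-common-prefix-length a b))
         ;; Look for the suffix only in what is left after the prefix,
         ;; so the two can never overlap.
         (ra (nthcdr p a))
         (rb (nthcdr p b))
         (s (my-fmt-common-prefix-length (reverse ra) (reverse rb))))
    (list (seq-take a p) (butlast ra s) (butlast rb s) (last ra s))))

(defun my-fmt-render (tokens)
  "Render TOKENS as a regexp fragment."
  (mapconcat (lambda (tok)
               (if (consp tok)
                   ;; Placeholder: pick a pattern by conversion char.
                   (pcase (cdr tok)
                     (?d "-?[0-9]+")
                     (?o "-?[0-7]+")
                     (?s "\\(?:.\\|\n\\)*")
                     (c (error "Unhandled conversion: %c" c)))
                 ;; Literal character: quote it for use in a regexp.
                 (regexp-quote (char-to-string tok))))
             tokens ""))

(defun my-fmt-regexp (segs-a segs-b)
  "Return a regexp matching output of either segmented format string."
  (pcase-let ((`(,prefix ,mid-a ,mid-b ,suffix)
               (my-fmt-merge (my-fmt-explode segs-a)
                             (my-fmt-explode segs-b))))
    (concat (my-fmt-render prefix)
            "\\(?:" (my-fmt-render mid-a)
            "\\|" (my-fmt-render mid-b) "\\)"
            (my-fmt-render suffix))))
```

Calling my-fmt-regexp on the two piggy lists above should split them at "pigg" and " and said '", as in the abstract form. A real implementation would need to handle more than two variants and the remaining conversions.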
¹ To match format parameters, try something like

  (rx "%" (opt (1+ digit) "$") (0+ digit) (opt "." (0+ digit))
      (any "%sdioxXefgcS"))
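
As a sketch of how that regexp could drive the segmentation step, something like the following might do. The names my-fmt-spec-regexp and my-fmt-segment are hypothetical, and the only change to the regexp is an added group around the conversion character. Scanning left to right means "%%" is consumed as a literal percent sign before anything after it can be mistaken for a placeholder. Note that the pattern does not cover flags such as "-" or "#".

```elisp
(require 'rx)

(defconst my-fmt-spec-regexp
  (rx "%" (opt (1+ digit) "$") (0+ digit) (opt "." (0+ digit))
      (group (any "%sdioxXefgcS")))
  "The regexp from the footnote, with the conversion char in group 1.")

(defun my-fmt-segment (format-string)
  "Split FORMAT-STRING into literal strings and conversion characters.
Each placeholder is represented by its conversion character only."
  (let ((pos 0)
        (literal "")
        (segments nil))
    (while (string-match my-fmt-spec-regexp format-string pos)
      (let ((conv (aref format-string (match-beginning 1))))
        ;; Collect the literal text that precedes this match.
        (setq literal
              (concat literal
                      (substring format-string pos (match-beginning 0))))
        (if (eq conv ?%)
            ;; "%%" stands for a literal percent sign.
            (setq literal (concat literal "%"))
          (unless (equal literal "")
            (push literal segments))
          (push conv segments)
          (setq literal ""))
        (setq pos (match-end 0))))
    ;; Whatever follows the last placeholder is literal text.
    (setq literal (concat literal (substring format-string pos)))
    (unless (equal literal "")
      (push literal segments))
    (nreverse segments)))
```

For example, (my-fmt-segment "Today %d little piggy built %o house and said '%s'.") should give the second list shown in the message body. As noted there, a real implementation would probably keep more than the bare conversion character.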