From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.help
Subject: Re: "split-sentences"?
Date: Sat, 23 Jan 2021 11:48:49 +0200
Message-ID: <83zh103rwe.fsf@gnu.org>
References: <87zh109r2d.fsf@zoho.eu> <87v9bo9myu.fsf@zoho.eu>
 <20210123084136.GA2306@tuxteam.de> <87lfcknhs5.fsf@logand.com>
To: help-gnu-emacs@gnu.org
In-Reply-To: <87lfcknhs5.fsf@logand.com> (message from Tomas Hlavaty on
 Sat, 23 Jan 2021 10:07:06 +0100)
List-Id: Users list for the GNU Emacs text editor
Xref: news.gmane.io gmane.emacs.help:127327

> From: Tomas Hlavaty
> Date: Sat, 23 Jan 2021 10:07:06 +0100
>
> Does emacs expose unicode text functions?

It does expose some of them, although not necessarily under the names
used by the UCS.

> For example to classify characters, determine graphemes, words,
> sentences, line breaks etc?

We have get-char-code-property for Unicode character properties and
find-composition for finding grapheme clusters (Emacs doesn't care
about graphemes, unless you use these two terms as aliases).  For
words, sentences, and line breaks, we use our own definitions, and
generally don't support the Unicode delimiters like U+2028.
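
As a rough illustration of the two functions named above, something
like the following could serve as a starting point.  The my/ helper
names are made up for this sketch and are not part of Emacs; only
get-char-code-property and find-composition are the real entry points.
The composition helper assumes it is called in a buffer that is being
displayed, since automatic compositions can depend on the font in use.

```elisp
;;; Sketch only: every function prefixed `my/' is hypothetical.

(defun my/char-category (char)
  "Return the Unicode general category of CHAR as a symbol, e.g. `Lu'."
  (get-char-code-property char 'general-category))

(defun my/char-name (char)
  "Return the Unicode name of CHAR as a string, or nil if it has none."
  (get-char-code-property char 'name))

(defun my/cluster-at (pos)
  "Return (START END) of the composition covering POS in the current buffer.
Return nil when POS is not inside a composition.  Automatic
compositions can depend on the display and the font in use."
  (let ((comp (find-composition pos)))
    ;; Without DETAIL-P, `find-composition' returns (FROM TO VALID-P).
    (when comp
      (list (nth 0 comp) (nth 1 comp)))))

(my/char-category ?A)      ; => Lu
(my/char-category #x2028)  ; => Zl (LINE SEPARATOR)
(my/char-name ?A)          ; => "LATIN CAPITAL LETTER A"
```

Note that U+2028 still reports its Unicode category here; that is
separate from whether the line-breaking code treats it as a delimiter.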