From: Marcin Borkowski
Newsgroups: gmane.emacs.help
Subject: Re: Fwd: How to check whether a character (or one-character string) is a letter?
Date: Sun, 05 Oct 2014 02:11:03 +0200
Message-ID: <8738b3ml48.fsf@wmi.amu.edu.pl>
References: <87iok0y8wr.fsf@wmi.amu.edu.pl>
To: help-gnu-emacs@gnu.org

On 2014-10-04, at 04:47, John Mastro wrote:

> [I first sent this directly to Marcin in error - yeah, I use the email gateway]
>
> Hi Marcin,
>
> Marcin Borkowski wrote:
>> Assume that I have a character (taken from some string, which in turn is
>> copied from the buffer - so it need not be ASCII). What is the best way
>> to check whether it is a letter within ASCII range?
>>
>> The reason I'm asking is that I'm writing a function which converts an
>> arbitrary string to a valid (and nice) filename (e.g., only letters and
>> hyphens) - so basically I want to walk a string character by character
>> and convert any space to a hyphen and omit any other non-letter. Am I
>> reinventing the wheel?
>
> There are a bunch of ways to do this, but one reasonable approach is to
> use a regular expression. I think this will do what you want:
>
>     (defun reasonable-filename (str)
>       (let* ((str (replace-regexp-in-string "[ \t\n\r]" "-" str))
>              (str (replace-regexp-in-string "[^a-zA-Z-]" "" str)))
>         str))

I think this is probably better than mapcar'ing through the string...
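
For comparison, the character-by-character walk mentioned above might look
roughly like the sketch below. The name `walk-filename' is made up for
illustration and does not come from this thread; the sketch uses `mapconcat'
rather than a bare `mapcar', and it is only meant to mirror the regexp
version for plain ASCII input.

```elisp
;; Hypothetical helper, not from the thread: walks the string one
;; character at a time instead of using regular expressions.
(defun walk-filename (str)
  "Convert whitespace in STR to hyphens and drop other non-letters."
  (mapconcat
   (lambda (char)
     (cond
      ;; space, tab, newline, carriage return -> hyphen
      ((memq char '(?\s ?\t ?\n ?\r)) "-")
      ;; ASCII letters and literal hyphens are kept as they are
      ((or (and (>= char ?A) (<= char ?Z))
           (and (>= char ?a) (<= char ?z))
           (eq char ?-))
       (char-to-string char))
      ;; everything else (digits, punctuation, non-ASCII) is omitted
      (t "")))
   str
   ""))
```

With this definition, (walk-filename "Hello, World 2014!") evaluates to
"Hello-World-"; the trailing hyphen comes from the space before the dropped
digits. The regexp version says the same thing in two lines, which is
probably why it reads better.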

> This is a variation which will also allow the result to contain numbers:
>
>     (defun reasonable-filename (str)
>       (let* ((str (replace-regexp-in-string "[ \t\n\r]" "-" str))
>              (str (replace-regexp-in-string "[^a-zA-Z0-9-]" "" str)))
>         str))

This I don't want, since in case of equal filenames, I want to
differentiate them by appending a number, and allowing digits might break
this. But thanks anyway.

> To answer your question about identifying whether a character is an
> ASCII letter, the key is that Emacs's characters are really "just"
> integers. Wikipedia has some charts[1] that show the numbers associated
> with the characters. The letters are conveniently grouped into two
> contiguous ranges (65-90 for A-Z and 97-122 for a-z), so we can use
> something like this:
>
>     (defun ascii-letter-p (char)
>       (and (characterp char)
>            ;; A single 65..122 test would also accept [ \ ] ^ _ `
>            (or (and (>= char 65) (<= char 90))       ; A-Z
>                (and (>= char 97) (<= char 122)))))   ; a-z
>
> (Of course, this only works if it's really a character, as opposed to a
> string of length one. If it's a string of length one you could either
> "extract" the character with `aref' or use a regular expression
> instead.)
>
> Hope that helps.

Yes it does!

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Adam Mickiewicz University