From: Yuri Khan
Newsgroups: gmane.emacs.help
Subject: Re: How to check whether a character (or one-character string) is a letter?
Date: Sat, 4 Oct 2014 11:08:32 +0700
To: Marcin Borkowski
Cc: "help-gnu-emacs@gnu.org"
In-Reply-To: <87iok0y8wr.fsf@wmi.amu.edu.pl>

On Sat, Oct 4, 2014 at 7:29 AM, Marcin Borkowski wrote:

> The reason I'm asking is that I'm writing a function which converts an
> arbitrary string to a valid (and nice) filename (e.g., only letters and
> hyphens) - so basically I want to walk a string character by character
> and convert any space to a hyphen and omit any other non-letter. Am I
> reinventing the wheel?

What are your assumptions about input string arbitrariness, your requirements about output filename niceness, and your requirements about the properties of the mapping? Because these may be in conflict.
For example, if you allow arbitrary input strings, want only [-0-9A-Za-z_] characters in the output, and want reasonably different strings to map to different filenames, then you will end up having to preserve non-nice characters as ugly character encodings (in the spirit of urlencode, XML character references, or Punycode). Otherwise, whole words or sentences in Russian, Japanese or Greek will map to an empty filename.
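
To make the conflict concrete, here is a minimal sketch. It is written in Python rather than Emacs Lisp purely for brevity, and the function names `naive_slug` and `encoded_slug` are hypothetical, not anything from Emacs or from the original question. The first function follows the "space to hyphen, drop everything else" idea; the second keeps the non-nice characters as urlencode-style %XX escapes of their UTF-8 bytes, which is one possible encoding among several.

```python
import re

# The "nice" character set mentioned above: [-0-9A-Za-z_]
NICE = re.compile(r"[-0-9A-Za-z_]")


def naive_slug(text: str) -> str:
    """Turn spaces into hyphens and silently drop every other non-nice character."""
    hyphenated = text.replace(" ", "-")
    return "".join(ch for ch in hyphenated if NICE.fullmatch(ch))


def encoded_slug(text: str) -> str:
    """Turn spaces into hyphens and keep other non-nice characters as %XX escapes.

    Assumption: escapes are the UTF-8 bytes of the character, in the spirit
    of urlencode. '%' itself is escaped too, so escapes stay unambiguous.
    """
    pieces = []
    for ch in text.replace(" ", "-"):
        if NICE.fullmatch(ch):
            pieces.append(ch)
        else:
            pieces.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(pieces)


if __name__ == "__main__":
    for sample in ["My File (v2).txt", "Привет", "こんにちは"]:
        print(repr(sample), "->", repr(naive_slug(sample)), "|", repr(encoded_slug(sample)))
```

With the naive version, a single Russian or Japanese word collapses to the empty string, so distinct inputs collide. The encoded version keeps them distinct, at the cost of long, ugly names that contain '%', which is itself outside the nice set. Note that both versions still merge a space with a literal hyphen, so even the encoded mapping is only "reasonably" one-to-one.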