From mboxrd@z Thu Jan 1 00:00:00 1970
From: Len Blanks
Newsgroups: gmane.emacs.help
Subject: Re: Troubles in Regular Expression Paradise
Date: Wed, 14 May 2014 17:07:11 -0500
Organization: Socialist Workers Computational Vision and Robotics Collective
References: <87oaz0ilei.fsf@geodiff-mac3.ulb.ac.be>
In-Reply-To: <87oaz0ilei.fsf@geodiff-mac3.ulb.ac.be> (Nicolas Richard's message of "Wed, 14 May 2014 14:40:05 +0200")
To: help-gnu-emacs@gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (darwin)
Mime-Version: 1.0
Content-Type: text/plain

Nicolas Richard writes:

> Len Blanks writes:
>> I'm trying to parse an xml file containing information for a song "currently playing"
>> on my iTunes - specifically the artist, name of the tune and the CD it appears on.
>
>> and here is a function I had hoped would strip the relevant fields and return a string in
>> the form: "_artist_'s _title_ from the CD _album_" to be inserted in a X-NOW-PLAYING:
>> header in emails and usenet posts:
>
> The regexp question was answered, so I allow myself to mention
> libxml-parse-xml-region instead of regexps for parsing xml.
>
> First eval:
> (setq yftest (libxml-parse-html-region (point-min) (point-max)))
> in a buffer which holds your file.
>
> Then you can get away with:
> (caddr (assoc 'title (caddr yftest)))
> (caddr (assoc 'artist (caddr yftest)))
> (caddr (assoc 'album (caddr yftest)))
> to get the title, artist and album respectively.
>
> As a side note, I wrote some elisp for using the kind of data that
> libxml-parse-html-region spits out, so here's my way of solving your
> problem with my code :
>
> (require 'tree-html)
> ;; https://github.com/YoungFrog/tree-html/blob/master/tree-html.el
>
> (defun yf/get-value-for-sole-subtree-with-given-tag (tree tag)
>   "Assume there's exactly one XML element with given TAG in TREE, and return its
> associated value."
>   (yf/tree-html-get-value
>    (yf/tree-html-get-sole-element
>     (yf/tree-html-select
>      tree
>      (lambda (tree)
>        (eq tag (yf/tree-html-get-tag tree)))))))
>
> (format "%s's %s from the CD %s"
>         (yf/get-value-for-sole-subtree-with-given-tag yftest 'artist)
>         (yf/get-value-for-sole-subtree-with-given-tag yftest 'title)
>         (yf/get-value-for-sole-subtree-with-given-tag yftest 'album))
>
> (Yes, I'm *that* bad at naming things.)

I need to do a better job researching what is available, rather than
reinventing the wheel. Thanks very much; I'll redo what I have using
libxml and your code to compare and learn.

Regards,

-- Len

Je suis Marxiste - tendance Groucho
    -- Slogan used at Nanterre in Paris, 1968