unofficial mirror of help-gnu-emacs@gnu.org
* g-client: character coding problem
@ 2007-05-13 14:55 Joseph Fahey
From: Joseph Fahey @ 2007-05-13 14:55 UTC
  To: help-gnu-emacs



Hello all,

I am trying to use T. V. Raman's g-client interface to the Google API,
in particular as an interface to Blogger.

http://emacspeak.blogspot.com/2007/03/emacs-client-for-google-services.html#cooliris

G-client, and more specifically gblogger.el, works great for me... as
long as I stick to ASCII characters. Any accented characters from the
iso-latin-1 subgroup show up incorrectly. I've tried a lot of
different things to alter the coding system, mostly by setting
everything I can to utf-8-unix.

I think I have found the problem, but am not sure how to solve it.
gblogger.el uses xsltproc via shell-command-on-region to prepare a
blog, before sending the buffer using curl. It appears that the
characters are coming back malformed. Here is the function in
g-utils.el:

(defsubst g-xsl-transform-region (start end xsl)
  "Replace region by result of transforming via XSL."
  (declare (special g-xslt-program))
  (let ((coding-system-for-write 'utf-8))
    (shell-command-on-region
     start end
     (format "%s %s - %s"
             g-xslt-program xsl (g-xslt-debug))
     'replace)))

If I evaluate the following code (the elisp in the middle of this XML) in a
utf-8 buffer (-u), I get malformed characters (the accents in the
"content" part):

<entry xmlns='http://www.w3.org/2005/Atom'>
  <generator url="http://purl.org/net/emacs-gblogger/">http://purl.org/net/emacs-gblogger/</generator>
  <author> <name>Me </name> </author>
  <title mode="escaped" type="text/html">être </title>
  <content type='xhtml'>
    <div xmlns="http://www.w3.org/1999/xhtml">
<!--content goes here -->
<p>Être ou ne pas être ? 

  (g-xsl-transform-region 
  (point-min) (point-max) "~/elisp/g-client/blogger-edit-post.xsl")

</p>
    </div>
  </content>
</entry>

I get the same results if I change the code to:

 (let ((coding-system-for-write 'utf-8-unix))
  (g-xsl-transform-region
   (point-min) (point-max) "~/elisp/g-client/blogger-edit-post.xsl")) 
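
Both variants above bind only coding-system-for-write, which controls how
the region is encoded on its way to the subprocess. How the output is
decoded on its way back is controlled by a separate variable,
coding-system-for-read. Whether that is the missing piece here is an
assumption, not something confirmed in this message. A minimal sketch:
my-utf8-shell-region is a hypothetical helper, not part of g-client, and
"cat" in the usage note merely stands in for the xsltproc command line.

```elisp
;; Hypothetical helper, not part of g-client.
;; Assumption: the subprocess both reads and writes UTF-8.
(defun my-utf8-shell-region (start end command)
  "Replace the region START..END with the output of shell COMMAND.
The region is sent as UTF-8 and the output is decoded as UTF-8."
  (let ((coding-system-for-write 'utf-8-unix)  ; Emacs -> subprocess
        (coding-system-for-read 'utf-8-unix))  ; subprocess -> Emacs
    ;; Fourth and fifth arguments non-nil: put the output in the
    ;; current buffer, in place of the region.
    (shell-command-on-region start end command t t)))
```

For a quick check, (my-utf8-shell-region (point-min) (point-max) "cat") in
a buffer containing accented text should leave the text unchanged if the
round trip is clean.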

Here is what I get if I do "C-h C":

Coding system for saving this buffer:
  Not set locally, use the default.
Default coding system (for new files):
  u -- utf-8-unix (alias of mule-utf-8-unix)

Coding system for keyboard input:
  nil
Coding system for terminal output:
  u -- utf-8 (alias of mule-utf-8)

Defaults for subprocess I/O:
  decoding: 1 -- iso-latin-1 (alias: iso-8859-1 latin-1)

  encoding: 1 -- iso-latin-1 (alias: iso-8859-1 latin-1)


Priority order for recognizing coding systems when reading files:
  1. iso-latin-1 (alias: iso-8859-1 latin-1)
  2. windows-1252 (alias: cp1252)
  3. mule-utf-8 (alias: utf-8) ... etc.

I suspect that the problem is coming from the defaults for subprocess
I/O, but I'm not sure how to change that.
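
For reference, the "Defaults for subprocess I/O" lines in the "C-h C"
output reflect the variable default-process-coding-system, a cons cell of
the form (DECODING . ENCODING). A minimal sketch of inspecting and changing
it for the current session, assuming UTF-8 is wanted for every subprocess
(an assumption; other programs run from Emacs may expect something else):

```elisp
;; Show the current defaults as (DECODING . ENCODING).
(message "subprocess I/O defaults: %S" default-process-coding-system)

;; Assumption: every subprocess in this session should talk UTF-8.
;; The car decodes output read from a subprocess, the cdr encodes
;; input sent to it.
(setq default-process-coding-system '(utf-8-unix . utf-8-unix))
```

This changes the default globally, so it is a blunter tool than a let
binding around a single call. (prefer-coding-system 'utf-8) is documented
as also adjusting the subprocess I/O default, along with several other
defaults.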

So... here I am at my wit's end. Any ideas would be much appreciated.

thanks

Joe

PS: this is on Linux with GNU Emacs 22.0.94.2. Here is the output from locale:

LANG=fr_FR.UTF-8@euro
LC_CTYPE=fr_FR.UTF-8
LC_NUMERIC="fr_FR.UTF-8@euro"
LC_TIME="fr_FR.UTF-8@euro"
LC_COLLATE="fr_FR.UTF-8@euro"
LC_MONETARY="fr_FR.UTF-8@euro"
LC_MESSAGES=C
LC_PAPER="fr_FR.UTF-8@euro"
LC_NAME="fr_FR.UTF-8@euro"
LC_ADDRESS="fr_FR.UTF-8@euro"
LC_TELEPHONE="fr_FR.UTF-8@euro"
LC_MEASUREMENT="fr_FR.UTF-8@euro"
LC_IDENTIFICATION="fr_FR.UTF-8@euro"
LC_ALL=
