To get Unicode file (UTF8) with a leading byte-order-marker characters?

unofficial mirror of help-gnu-emacs@gnu.org

* To get Unicode file (UTF8) with a leading byte-order-marker characters?
@ 2010-06-08 15:26 Paul Chany
  2010-06-08 23:02 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Paul Chany @ 2010-06-08 15:26 UTC (permalink / raw)
  To: help-gnu-emacs

Hi,

I'm using Emacs to program in Objective-C following a GNUstep Tutorial.
In the application that I made there is a resource file for localisation
(translation): Ablak.strings.

The file should be ASCII (using \U escapes for unicode characters) or
Unicode (UTF16 or UTF8) with a leading byte-order-marker.

How can I get this file using Emacs?

Any advice will be appreciated!

-- 
Regards,
Paul Chany
You can freely correct my English.
http://csanyi-pal.info


* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
       [not found] <mailman.3.1276010799.23544.help-gnu-emacs@gnu.org>
@ 2010-06-08 17:38 ` Pascal J. Bourguignon
  0 siblings, 0 replies; 12+ messages in thread
From: Pascal J. Bourguignon @ 2010-06-08 17:38 UTC (permalink / raw)
  To: help-gnu-emacs

Paul Chany <csanyipal@gmail.com> writes:

> Hi,
>
> I'm using Emacs to program in Objective-C following a GNUstep Tutorial.
> In the application that I made there is a resource file for localisation
> (translation): Ablak.strings.
>
> The file should be ASCII (using \U escapes for unicode characters) or
> Unicode (UTF16 or UTF8) with a leading byte-order-marker.
>
> How can I get this file using Emacs?

The generation of the byte-order-marker is automatic, for utf-16-be or
utf-16-le.  It is meaningless for utf-8.

Just put a comment specifying the coding system you want on the first
or second line of the file:

/* -*- coding:utf-16-be -*- */



-- 
__Pascal Bourguignon__                     http://www.informatimago.com/



* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
  2010-06-08 15:26 To get Unicode file (UTF8) with a leading byte-order-marker characters? Paul Chany
@ 2010-06-08 23:02 ` Eli Zaretskii
  2010-06-11 20:59   ` Paul Chany
  2010-06-13  4:55 ` tomas
       [not found] ` <mailman.2.1276404928.6139.help-gnu-emacs@gnu.org>
  2 siblings, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2010-06-08 23:02 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Paul Chany <csanyipal@gmail.com>
> Date: Tue, 08 Jun 2010 17:26:21 +0200
> 
> The file should be ASCII (using \U escapes for unicode characters) or
> Unicode (UTF16 or UTF8) with a leading byte-order-marker.
> 
> How can I get this file using Emacs?

C-x RET f utf-8-with-signature RET
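
For reference, the effect of that coding system on the saved bytes can
be sketched outside Emacs. The Python snippet below is only an
illustration, not part of the Emacs recipe: the helper name
`has_utf8_bom` is made up here, and it assumes that Python's
`utf-8-sig` codec is a fair stand-in for `utf-8-with-signature`, since
both prepend the three bytes EF BB BF when encoding.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 signature."""
    return data.startswith(codecs.BOM_UTF8)

text = "Jó napot kívánok!"
with_bom = text.encode("utf-8-sig")   # signature followed by the UTF-8 bytes
without_bom = text.encode("utf-8")    # plain UTF-8, no signature

print(with_bom[:3].hex())             # efbbbf
print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))  # True False
```

Apart from the three leading bytes, the two encodings are identical,
so a tool that rejects one and accepts the other is reacting to the
signature only.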




* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
  2010-06-08 23:02 ` Eli Zaretskii
@ 2010-06-11 20:59   ` Paul Chany
  2010-06-12  5:56     ` Kevin Rodgers
  2010-06-12 11:24     ` Eli Zaretskii
  0 siblings, 2 replies; 12+ messages in thread
From: Paul Chany @ 2010-06-11 20:59 UTC (permalink / raw)
  To: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Paul Chany <csanyipal@gmail.com>
>> 
>> The file should be ASCII (using \U escapes for unicode characters) or
>> Unicode (UTF16 or UTF8) with a leading byte-order-marker.
>> 
>> How can I get this file using Emacs?
>
> C-x RET f utf-8-with-signature RET

Well, when I did so I got the following message:

<NSException: 0x9f91dd0> NAME:NSGenericException REASON:Parse failed at
line 2 (char 41) - unexpected character (wanted ';') INFO:(nil)

and the translation doesn't work. :(
What's wrong with it?

I attach the file here (it's small):

[-- Attachment #2: Ablak.strings --]
[-- Type: application/octet-stream, Size: 87 bytes --]

"Ablakot bezár" = "Zatvara aplikaciju"
"Jó napot kívánok!" = "Dobar dan želim!"


-- 
Regards,
Paul Chany
You can freely correct my English.
http://csanyi-pal.info


* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
  2010-06-11 20:59   ` Paul Chany
@ 2010-06-12  5:56     ` Kevin Rodgers
  2010-06-12 11:24     ` Eli Zaretskii
  1 sibling, 0 replies; 12+ messages in thread
From: Kevin Rodgers @ 2010-06-12  5:56 UTC (permalink / raw)
  To: help-gnu-emacs

Paul Chany wrote:
> Eli Zaretskii <eliz@gnu.org> writes:
> 
>>> From: Paul Chany <csanyipal@gmail.com>
>>>
>>> The file should be ASCII (using \U escapes for unicode characters) or
>>> Unicode (UTF16 or UTF8) with a leading byte-order-marker.
>>>
>>> How can I get this file using Emacs?
>> C-x RET f utf-8-with-signature RET
> 
> Well, when I did so I get the following message:
> 
> <NSException: 0x9f91dd0> NAME:NSGenericException REASON:Parse failed at
> line 2 (char 41) - unexpected character (wanted ';') INFO:(nil)
> 
> and the translation don't works. :(
> What's wrong with it?
> 
> I attach the file here (it's small):

I saved the file and visited it in Emacs 23.2.1 (i386-apple-darwin8.11.1, NS 
apple-appkit-824.48), and no errors or warnings were reported.

`C-h C RET' displays:

Coding system for saving this buffer:
   U -- utf-8-with-signature-unix

...

and with the cursor over character 41 (the second character on line 2),
`C-u C-x =' displays:

         character: ó (243, #o363, #xf3)
preferred charset: unicode (Unicode (ISO10646))
        code point: 0xF3
            syntax: w 	which means: word
          category: .:Base, c:Chinese, j:Japanese, l:Latin, v:Viet
       buffer code: #xC3 #xB3
         file code: #xEF #xBB #xBF #xC3 #xB3
		   (encoded by coding system utf-8-with-signature-unix)
           display: by this font (glyph code)
     nil:-apple-Monaco-medium-normal-normal-*-12-*-*-*-m-0-iso10646-1 (#x79)

Character code properties: customize what to show
   name: LATIN SMALL LETTER O WITH ACUTE
   old-name: LATIN SMALL LETTER O ACUTE
   general-category: Ll (Letter, Lowercase)
   decomposition: (111 769) ('o' '́')

-- 
Kevin Rodgers
Denver, Colorado, USA





* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
  2010-06-11 20:59   ` Paul Chany
  2010-06-12  5:56     ` Kevin Rodgers
@ 2010-06-12 11:24     ` Eli Zaretskii
  2010-06-12 11:49       ` Paul Chany
  1 sibling, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2010-06-12 11:24 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Paul Chany <csanyipal@gmail.com>
> Date: Fri, 11 Jun 2010 22:59:02 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> From: Paul Chany <csanyipal@gmail.com>
> >> 
> >> The file should be ASCII (using \U escapes for unicode characters) or
> >> Unicode (UTF16 or UTF8) with a leading byte-order-marker.
> >> 
> >> How can I get this file using Emacs?
> >
> > C-x RET f utf-8-with-signature RET
> 
> Well, when I did so I get the following message:
> 
> <NSException: 0x9f91dd0> NAME:NSGenericException REASON:Parse failed at
> line 2 (char 41) - unexpected character (wanted ';') INFO:(nil)
> 
> and the translation don't works. :(
> What's wrong with it?

What did you do exactly?  Can you show a precise recipe starting with
"emacs -Q" that causes this problem?

FWIW, if I save the file you sent and then type

   C-x RET c utf-8-with-signature RET C-x C-f Ablak.strings RET

I get the file in the buffer without any problems.




* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
  2010-06-12 11:24     ` Eli Zaretskii
@ 2010-06-12 11:49       ` Paul Chany
  0 siblings, 0 replies; 12+ messages in thread
From: Paul Chany @ 2010-06-12 11:49 UTC (permalink / raw)
  To: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Paul Chany <csanyipal@gmail.com>
>> Date: Fri, 11 Jun 2010 22:59:02 +0200
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> From: Paul Chany <csanyipal@gmail.com>
>> >> 
>> >> The file should be ASCII (using \U escapes for unicode characters) or
>> >> Unicode (UTF16 or UTF8) with a leading byte-order-marker.
>> >> 
>> >> How can I get this file using Emacs?
>> >
>> > C-x RET f utf-8-with-signature RET
>> 
>> Well, when I did so I get the following message:
>> 
>> <NSException: 0x9f91dd0> NAME:NSGenericException REASON:Parse failed at
>> line 2 (char 41) - unexpected character (wanted ';') INFO:(nil)
>> 
>> and the translation don't works. :(

> What did you do exactly?  Can you show a precise recipe starting with
> "emacs -Q" that causes this problem?

Sorry, it is my mistake here!
The error occurs not in Emacs but in my GNUstep application!

So I don't know how to create a file for which GNUstep doesn't
give me this error message!

-- 
Regards,
Paul Chany
You can freely correct my English.
http://csanyi-pal.info





* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
  2010-06-08 15:26 To get Unicode file (UTF8) with a leading byte-order-marker characters? Paul Chany
  2010-06-08 23:02 ` Eli Zaretskii
@ 2010-06-13  4:55 ` tomas
  2010-06-13  8:14   ` Paul Chany
       [not found] ` <mailman.2.1276404928.6139.help-gnu-emacs@gnu.org>
  2 siblings, 1 reply; 12+ messages in thread
From: tomas @ 2010-06-13  4:55 UTC (permalink / raw)
  To: Paul Chany; +Cc: help-gnu-emacs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, Jun 08, 2010 at 05:26:21PM +0200, Paul Chany wrote:
> Hi,
> 
> I'm using Emacs to program in Objective-C following a GNUstep Tutorial.
> In the application that I made there is a resource file for localisation
> (translation): Ablak.strings.
> 
> The file should be ASCII (using \U escapes for unicode characters) or
> Unicode (UTF16 or UTF8) with a leading byte-order-marker.

Note that I have no clue about GNUstep, so I might be off by a big
amount, but -- are you sure the system wants a leading byte order mark
with UTF-8? (Strictly speaking, it's unnecessary --rather slightly
annoying-- on UTF-8. I always thought that it entered the Unicode
consortium via Microsoft, who always like to play this kind of
shenanigans on us).

You might try without leading BOM?

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFMFGS+Bcgs9XrR2kYRAt/QAJ9+D1FzSpGOtF/Cgh29yt6uoGFkFACfXQs6
Kf9qMXFyG6BCnSu7hXqWhKo=
=8llU
-----END PGP SIGNATURE-----




* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
  2010-06-13  4:55 ` tomas
@ 2010-06-13  8:14   ` Paul Chany
  2010-06-14 12:17     ` Paul Chany
  0 siblings, 1 reply; 12+ messages in thread
From: Paul Chany @ 2010-06-13  8:14 UTC (permalink / raw)
  To: help-gnu-emacs

tomas@tuxteam.de writes:

> On Tue, Jun 08, 2010 at 05:26:21PM +0200, Paul Chany wrote:
>> I'm using Emacs to program in Objective-C following a GNUstep Tutorial.
>> In the application that I made there is a resource file for localisation
>> (translation): Ablak.strings.
>> 
>> The file should be ASCII (using \U escapes for unicode characters) or
>> Unicode (UTF16 or UTF8) with a leading byte-order-marker.
>
> Note that I have no clue about GNUstep, so I might be off by a big
> amount, but -- are you sure the system wants a leading byte order mark
> with UTF-8? (strictly speaking, it's unnecesary --rather slightly
> annoying-- on UTF-8. I always thought that in entered the Unicode
> consortium via Microsoft, who always likes to play this kind of
> shenanigans on us).

> You might try without leading BOM?

Naturally, I tried without a leading BOM, but got the same error
message.

One GNUstep developer says the following:
'GNUstep expects the .strings file to be in US-ASCII only, or any
non-ASCII characters to be escaped, or in UTF-8 with a BOM marker.
Very annoying limitation; I think it is because of OpenStep
compatibility.'

-- 
Regards,
Paul Chany
You can freely correct my English.
http://csanyi-pal.info





* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
  2010-06-13  8:14   ` Paul Chany
@ 2010-06-14 12:17     ` Paul Chany
  0 siblings, 0 replies; 12+ messages in thread
From: Paul Chany @ 2010-06-14 12:17 UTC (permalink / raw)
  To: help-gnu-emacs

Paul Chany <csanyipal@gmail.com> writes:

> tomas@tuxteam.de writes:
>
>> On Tue, Jun 08, 2010 at 05:26:21PM +0200, Paul Chany wrote:
>>> I'm using Emacs to program in Objective-C following a GNUstep
>>> Tutorial. In the application that I made there is a resource file
>>> for localisation (translation): Ablak.strings.
>>> 
>>> The file should be ASCII (using \U escapes for unicode characters)
>>> or Unicode (UTF16 or UTF8) with a leading byte-order-marker.
>>
>> Note that I have no clue about GNUstep, so I might be off by a big 
>> amount, but -- are you sure the system wants a leading byte order
>> mark with UTF-8? (strictly speaking, it's unnecesary --rather
>> slightly annoying-- on UTF-8. I always thought that in entered the
>> Unicode consortium via Microsoft, who always likes to play this kind
>> of shenanigans on us).
>
>> You might try without leading BOM?
>
> Naturally, I was tried without leading BOM, but get the same error
> message.

I found the solution!
The .strings file uses a C-like syntax, so each entry must end with
the ';' character. My .strings file has three lines, and once I edited
it so that every line ends with ';' and recompiled the
application, the error message was gone.
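
A quick way to spot this kind of problem before compiling is sketched
below in Python. It is a naive check under stated assumptions, not a
real .strings parser: the function name `lines_missing_semicolon` is
hypothetical, and it assumes one `"key" = "value"` entry per line, with
no comments and no entries spanning several lines. The sample data is
the two-line attachment posted earlier in the thread.

```python
from typing import List

def lines_missing_semicolon(text: str) -> List[int]:
    """Return 1-based numbers of non-blank lines that do not end with ';'."""
    missing = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped and not stripped.endswith(";"):
            missing.append(number)
    return missing

# The attachment as originally posted (no terminators) ...
broken = (
    '"Ablakot bezár" = "Zatvara aplikaciju"\n'
    '"Jó napot kívánok!" = "Dobar dan želim!"\n'
)
# ... and the same entries with ';' appended to each line.
fixed = (
    '"Ablakot bezár" = "Zatvara aplikaciju";\n'
    '"Jó napot kívánok!" = "Dobar dan želim!";\n'
)

print(lines_missing_semicolon(broken))  # [1, 2]
print(lines_missing_semicolon(fixed))   # []
```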

-- 
Regards,
Paul Chany
You can freely correct my English.
http://csanyi-pal.info





* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
       [not found]   ` <slrni1e3u2.n7i.nospam-abuse@powdermilk.math.berkeley.edu>
@ 2010-06-15 13:56     ` Jason Rumney
       [not found]       ` <slrni1gaut.q6l.nospam-abuse@powdermilk.math.berkeley.edu>
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Rumney @ 2010-06-15 13:56 UTC (permalink / raw)
  To: help-gnu-emacs

On Jun 15, 1:29 pm, Ilya Zakharevich <nospam-ab...@ilyaz.org> wrote:
> On 2010-06-13, to...@tuxteam.de <to...@tuxteam.de> wrote:

> ???  If you know it is UTF-8, the BOM is not necessary.  But the same
> holds for any other encoding of Unicode.  So your sentiment makes no
> sense here...

Think about what the letters BOM stand for, and you might see why it
is relevant for UTF-16, but redundant for UTF-8.
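
To make the point concrete, here is a small Python sketch (an
illustration only, using the character from the file discussed in this
thread). UTF-16 stores each code unit as two bytes, which can be
written in either order, so a reader needs the mark to tell which one
was used; UTF-8 is a sequence of single bytes with only one possible
order.

```python
import codecs

ch = "ó"  # U+00F3, LATIN SMALL LETTER O WITH ACUTE

# UTF-16: the same character, two possible byte orders
print(ch.encode("utf-16-be").hex())  # 00f3
print(ch.encode("utf-16-le").hex())  # f300

# UTF-8: only one possible byte sequence
print(ch.encode("utf-8").hex())      # c3b3

# U+FEFF itself, serialised in each encoding
print(codecs.BOM_UTF16_BE.hex())     # feff
print(codecs.BOM_UTF16_LE.hex())     # fffe
print(codecs.BOM_UTF8.hex())         # efbbbf
```

In UTF-8 the three bytes EF BB BF therefore carry no byte-order
information; they can only act as a signature saying "this is UTF-8".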



* Re: To get Unicode file (UTF8) with a leading byte-order-marker characters?
       [not found]         ` <pco39wn85iv.fsf@math.ntnu.no>
@ 2010-06-16 14:10           ` Jason Rumney
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Rumney @ 2010-06-16 14:10 UTC (permalink / raw)
  To: help-gnu-emacs

On Jun 16, 9:35 pm, Harald Hanche-Olsen <han...@math.ntnu.no> wrote:

> Anyway, the BOM designation is deprecated. The proper unicode name for
> the character in question is ZERO WIDTH NO-BREAK SPACE.

It's more complicated than that. It was previously renamed; then, in a
later version of the Unicode spec, its use as a ZERO WIDTH NO-BREAK
SPACE character was deprecated, as WORD JOINER, ZERO WIDTH JOINER and
ZERO WIDTH NON-JOINER provide the same functionality with more
detailed meaning, so it retains only the BOM usage, with the ZWNBSP
name.  Welcome to design by committee!

