unofficial mirror of help-gnu-emacs@gnu.org
* Copying and pasting Cyrillic text between Emacs and other apps
@ 2004-01-27  4:08 eMaXer
  2004-01-27 23:16 ` Paul Gorodyansky
  0 siblings, 1 reply; 24+ messages in thread
From: eMaXer @ 2004-01-27  4:08 UTC (permalink / raw)



I can't figure out how to do this.

I am using NT Emacs 21.3.1 on Windows XP. When I try to copy Cyrillic
text from some other application into an Emacs buffer, I get only
question marks.

I tried to do this:

(codepage-setup 1251)
(set-selection-coding-system 'cp1251)

But it doesn't help. Any ideas?


-- 
In the beginning was the Word, 
and the Word was in a buffer,
and the buffer was in Emacs.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-27  4:08 Copying and pasting Cyrillic text between Emacs and other apps eMaXer
@ 2004-01-27 23:16 ` Paul Gorodyansky
  2004-01-28  0:25   ` Jason Rumney
                     ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Paul Gorodyansky @ 2004-01-27 23:16 UTC (permalink / raw)


eMaXer <zxy@yxz67483.com> wrote in message news:<u3ca2m3tc.fsf@yxz67483.com>...
> I can't figure out how to do this.
> 
> I am using NT Emacs 21.3.1 on Windows XP. When I try to copy Cyrillic
> text from some other application into an Emacs buffer, I get only
> question marks.
> 

Looks like Emacs is a NON-Unicode program...
Please see the workaround explained in Chapter 2, "Copy/Paste",
of the "Unicode related issues" section on my site.


-- 
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet": 
http://ourworld.compuserve.com/homepages/PaulGor/


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-27 23:16 ` Paul Gorodyansky
@ 2004-01-28  0:25   ` Jason Rumney
  2004-01-28  6:33     ` Eli Zaretskii
                       ` (3 more replies)
  2004-01-28  6:27   ` Eli Zaretskii
       [not found]   ` <mailman.1483.1075271175.928.help-gnu-emacs@gnu.org>
  2 siblings, 4 replies; 24+ messages in thread
From: Jason Rumney @ 2004-01-28  0:25 UTC (permalink / raw)


paulgor@compuserve.com (Paul Gorodyansky) writes:

> eMaXer <zxy@yxz67483.com> wrote in message news:<u3ca2m3tc.fsf@yxz67483.com>...
> > I can't figure out how to do this.
> > 
> > I am using NT Emacs 21.3.1 on Windows XP. When I try to copy Cyrillic
> > text from some other application into an Emacs buffer, I get only
> > question marks.
> > 
> 
> Looks like Emacs is a NON-Unicode program...
> Please see work-around explained in the Chapter 2 "Copy/Paste"
> of the "Unicode related issues" section on my site.

As noted on that page, the System code page must be Cyrillic for the
clipboard to handle cp1251.

Emacs will put data on the clipboard in whatever encoding you tell it
to, but most other programs will only use the System default.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-27 23:16 ` Paul Gorodyansky
  2004-01-28  0:25   ` Jason Rumney
@ 2004-01-28  6:27   ` Eli Zaretskii
  2004-01-28  7:55     ` Eli Zaretskii
                       ` (2 more replies)
       [not found]   ` <mailman.1483.1075271175.928.help-gnu-emacs@gnu.org>
  2 siblings, 3 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-28  6:27 UTC (permalink / raw)


> From: paulgor@compuserve.com (Paul Gorodyansky)
> Newsgroups: gnu.emacs.help
> Date: 27 Jan 2004 15:16:48 -0800
> > 
> > I am using NT Emacs 21.3.1 on Windows XP. When I try to copy Cyrillic
> > text from some other application into an Emacs buffer, I get only
> > question marks.
> 
> Looks like Emacs is a NON-Unicode program...

What exactly is ``a Unicode program'' on Windows?  What is missing in
Emacs to DTRT with Cyrillic text in the clipboard on a Windows box
whose system codepage is not cp1251?

I'm asking because Emacs, at least the CVS version, does support
various Unicode-related encodings of text.  I tried them all in the
attempt to paste into Emacs Cyrillic text copied from another Windows
application, and I still get question marks.

My impression is that decoding of clipboard text is not the problem.
Rather, I'm guessing that Emacs fetches the clipboard text in a way
that forces Windows to try to translate it to the current system
codepage, and that this translation attempt replaces all Cyrillic
characters with question marks.  The Windows XP clipboard viewer, for
example, does show the Cyrillic text from the clipboard correctly.  So
it sounds like there is a way to fetch the text from the clipboard,
it's just that Emacs somehow doesn't do it right.
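
For illustration, the suspected lossy step can be mimicked outside
Emacs. The Python sketch below is not what Emacs or Windows actually
runs; it only assumes that the conversion behaves like encoding to a
code page with no Cyrillic letters (cp1252 is an arbitrary stand-in
for "the current system codepage") and substituting "?" for every
character that cannot be encoded.

```python
# Illustrative sketch only: mimic a lossy conversion to a Western code page.
text = "Привет"  # six Cyrillic letters

# cp1252 has no Cyrillic letters, so each one is replaced by "?".
lossy = text.encode("cp1252", errors="replace")
print(lossy)  # b'??????'

# cp1251 covers Cyrillic, so the same text survives a round trip intact.
round_trip = text.encode("cp1251").decode("cp1251")
print(round_trip == text)  # True
```

Once the letters have been replaced, no choice of decoder on the
receiving side can bring them back, which would match the observation
that trying different coding systems in Emacs makes no difference.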

> Please see work-around explained in the Chapter 2 "Copy/Paste"
> of the "Unicode related issues" section on my site.

Alas, none of the solutions there is free software.  UniPad, in
particular, wants you to register if you want a version that isn't
limited to 1000-character documents.  So your work-arounds are not
very practical, unfortunately.

Btw, does anyone know of a tool that can show what's in the clipboard
together with how the text is encoded there?  I found several
clipboard-related utilities, but none of them seems to do what I want,
which is to show me the codepoints of each character in the clipboard.
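
Once the clipboard text is available as a Unicode string, listing the
codepoints takes only a few lines. A minimal Python sketch follows;
`describe` is a hypothetical helper name, and fetching the text from
the clipboard is deliberately left out.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' line per character of `text`."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

for line in describe("Да"):
    print(line)
# U+0414 CYRILLIC CAPITAL LETTER DE
# U+0430 CYRILLIC SMALL LETTER A
```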


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28  0:25   ` Jason Rumney
@ 2004-01-28  6:33     ` Eli Zaretskii
       [not found]     ` <mailman.1484.1075271523.928.help-gnu-emacs@gnu.org>
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-28  6:33 UTC (permalink / raw)


[Perhaps we should move this discussion to emacs-devel@gnu.org.]

> From: jasonr (Jason Rumney) @  f2s.com
> Newsgroups: gnu.emacs.help
> Date: 28 Jan 2004 00:25:51 +0000
> 
> Emacs will put data on the clipboard in whatever encoding you tell it
> to, but most other programs will only use the System default.

It seems like there's more here than meets the eye.

Emacs in CVS can use UTF-8, UTF-16-BE and UTF-16-LE as well.  I tried
them, but the Windows clipboard viewer still shows garbage when I
type "C-x RET X ENCODING RET M-w" (where ENCODING is one of the UTF-*
encodings mentioned above) with Cyrillic characters in the region.

What is it that Emacs doesn't do right here?  Because if I copy
Cyrillic text from the Explorer on the same system, the clipboard
viewer does show it correctly.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28  6:27   ` Eli Zaretskii
@ 2004-01-28  7:55     ` Eli Zaretskii
  2004-01-28  9:29     ` Eli Zaretskii
       [not found]     ` <mailman.1491.1075282404.928.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-28  7:55 UTC (permalink / raw)


> Date: 28 Jan 2004 08:27:22 +0200
> From: Eli Zaretskii <eliz@elta.co.il>
> 
> My impression is that decoding of clipboard text is not the problem.
> Rather, I'm guessing that Emacs fetches the clipboard text in a way
> that forces Windows to try to translate it to the current system
> codepage, and that this translation attempt replaces all Cyrillic
> characters with question marks.  The Windows XP clipboard viewer, for
> example, does show the Cyrillic text from the clipboard correctly.  So
> it sounds like there is a way to fetch the text from the clipboard,
> it's just that Emacs somehow doesn't do it right.

After some googling around, it seems that Emacs should use
CF_UNICODETEXT type in clipboard operations, instead of CF_TEXT that
it uses now.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
       [not found]     ` <mailman.1484.1075271523.928.help-gnu-emacs@gnu.org>
@ 2004-01-28  8:54       ` Jason Rumney
  2004-01-28 13:36         ` Eli Zaretskii
  2004-01-28 19:31         ` Paul Gorodyansky
  0 siblings, 2 replies; 24+ messages in thread
From: Jason Rumney @ 2004-01-28  8:54 UTC (permalink / raw)


Eli Zaretskii <eliz@elta.co.il> writes:

> Emacs in CVS can use UTF-8, UTF-16-BE and UTF-16-LE as well.  I tried
> them, but the Windows clipboard viewer still shows garbage when I
> type "C-x RET X ENCODING RET M-w" (where ENCODING is one of the UTF-*
> encodings mentioned above) with Cyrillic characters in the region.

Your system default is probably whatever 8bit encoding Windows uses
for Hebrew (I suspect not the ISO one). It is not UTF-8 or UTF-16.

> What is it that Emacs doesn't do right here?

Emacs doesn't use the CF_UNICODETEXT format. Maybe it should, but we have
to detect if it is available somehow, since blindly using it will
cause other apps to see nothing on the clipboard. We also need to
know whether CJK tables are loaded in East Asian locales, or Emacs'
conversion to UTF-16 will produce garbage, and we'll end up with worse
results than using the system default encoding for most East Asian users.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
       [not found]   ` <mailman.1483.1075271175.928.help-gnu-emacs@gnu.org>
@ 2004-01-28  9:01     ` Jason Rumney
  2004-01-28 13:48       ` Eli Zaretskii
  2004-01-28 19:45     ` Paul Gorodyansky
  1 sibling, 1 reply; 24+ messages in thread
From: Jason Rumney @ 2004-01-28  9:01 UTC (permalink / raw)


Eli Zaretskii <eliz@elta.co.il> writes:

> Btw, does anyone know of a tool that can show what's in the clipboard
> together with how the text is encoded there?

The standard Clipboard viewer should do that. At least on XP it has
Text, Locale, OEM Text and Unicode Text entries when text is copied
to the clipboard. The locale can't be viewed, but it seems that Emacs
might be able to do something with that to get text that Emacs puts
on the clipboard in the right encoding. But it cannot influence how
other apps put text on the clipboard (which AFAIK is always in the
System locale, even if the characters cannot be encoded in that).


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28  6:27   ` Eli Zaretskii
  2004-01-28  7:55     ` Eli Zaretskii
@ 2004-01-28  9:29     ` Eli Zaretskii
       [not found]     ` <mailman.1491.1075282404.928.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-28  9:29 UTC (permalink / raw)


> Date: 28 Jan 2004 08:27:22 +0200
> From: Eli Zaretskii <eliz@elta.co.il>
> 
> Btw, does anyone know of a tool that can show what's in the clipboard
> together with how the text is encoded there?

Found it.  Google for ClipConvert and for Album from RWS Localization
Tools.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28  8:54       ` Jason Rumney
@ 2004-01-28 13:36         ` Eli Zaretskii
  2004-01-28 19:31         ` Paul Gorodyansky
  1 sibling, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-28 13:36 UTC (permalink / raw)


> > What is it that Emacs doesn't do right here?
> 
> Emacs doesn't use the CF_UNICODETEXT format. Maybe it should, but we have
> to detect if it is available somehow

The MSDN docs say that the function IsClipboardFormatAvailable can be
used to see if CF_UNICODETEXT is supported.  Can Emacs use that
function?

> since blindly using it will cause other apps to see nothing on the
> clipboard.

It seems like the recommended practice is to put the text into the
clipboard several times in different formats, with the more powerful
formats first.  Windows applications are supposed to walk the list of
available data formats and use the first one they support.

So Emacs could put a CF_UNICODETEXT object, then the CF_TEXT object
into the clipboard, and the Windows apps will then use the Unicode
encoding if they can, else the system page encoded text.

When getting text from the clipboard, Emacs should try CF_UNICODETEXT
first, if it's present in the clipboard (EnumClipboardFormats could be
used to check that).
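
The selection order described above can be sketched as follows. This
is a toy model, not Win32 code: the clipboard is a plain Python dict
keyed by format name, and `pick_format` is a hypothetical helper. Real
code would go through the Win32 clipboard functions instead.

```python
# Hypothetical model: the clipboard is a dict of format name -> raw bytes.
PREFERRED_FORMATS = ["CF_UNICODETEXT", "CF_TEXT"]  # richest format first

def pick_format(clipboard, supported=PREFERRED_FORMATS):
    """Return the first format in `supported` that the clipboard offers, or None."""
    for fmt in supported:
        if fmt in clipboard:
            return fmt
    return None

clipboard = {
    "CF_UNICODETEXT": "Привет".encode("utf-16-le"),
    "CF_TEXT": "Привет".encode("cp1252", errors="replace"),
}
print(pick_format(clipboard))                         # CF_UNICODETEXT
print(pick_format(clipboard, supported=["CF_TEXT"]))  # CF_TEXT
print(pick_format({}, supported=["CF_TEXT"]))         # None
```

An application that understands only CF_TEXT still gets something
usable, while a Unicode-aware one never sees the degraded copy.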

Of course, this means the effect of "C-x RET x" and "C-x RET X" on
Windows should be rethought...


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28  9:01     ` Jason Rumney
@ 2004-01-28 13:48       ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-28 13:48 UTC (permalink / raw)


> > Btw, does anyone know of a tool that can show what's in the clipboard
> > together with how the text is encoded there?

> The standard Clipboard viewer should do that. At least on XP it has
> Text, Locale, OEM Text and Unicode Text entries when text is copied
> to the clipboard.

I wanted something that will show me the Windows codepage and the
Unicode codepoints of the characters; the standard clipboard viewer
doesn't do that AFAICS (and you first need to find it, since M$ moved
it out of sight for some reason ;-).

I found ClipConvert and (its newer incarnation) Album that do what I
wanted.  They can also convert between encodings and CF_* formats, so
it looks like there's another alternative for Paul Gorodyansky's page.

> The locale can't be viewed, but it seems that Emacs might be able to
> do something with that to get text that Emacs puts on the clipboard
> in the right encoding.

Yes.

> But it cannot influence how other apps put text on the clipboard
> (which AFAIK is always in the System locale, even if the characters
> cannot be encoded in that).

It seems like on Windows XP, the characters are implicitly converted
to Unicode (by some internal Windows machinery), so Emacs can always
win using CF_UNICODETEXT if it's available in the clipboard.  Failing
that, we can again use CF_LOCALE to determine the correct encoding
with which to decode CF_TEXT text.
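
A rough Python sketch of that reading strategy, under stated
assumptions: the clipboard is modeled as a dict, `decode_clipboard` is
a hypothetical helper, the locale table covers only three locale
identifiers, and CF_UNICODETEXT is taken to be NUL-terminated
little-endian UTF-16.

```python
# Assumed mapping from a few Windows locale identifiers to ANSI code pages.
LCID_TO_CODEPAGE = {0x0409: "cp1252", 0x0419: "cp1251", 0x040D: "cp1255"}

def decode_clipboard(clipboard, system_codepage="cp1252"):
    """Prefer CF_UNICODETEXT; otherwise decode CF_TEXT using CF_LOCALE."""
    if "CF_UNICODETEXT" in clipboard:
        return clipboard["CF_UNICODETEXT"].decode("utf-16-le").rstrip("\0")
    # No Unicode copy: let CF_LOCALE pick the code page, else use the system one.
    codepage = LCID_TO_CODEPAGE.get(clipboard.get("CF_LOCALE"), system_codepage)
    return clipboard["CF_TEXT"].decode(codepage).rstrip("\0")

cyrillic_bytes = "Привет".encode("cp1251") + b"\0"
with_locale = decode_clipboard({"CF_TEXT": cyrillic_bytes, "CF_LOCALE": 0x0419})
without_locale = decode_clipboard({"CF_TEXT": cyrillic_bytes})
print(with_locale == "Привет")     # True
print(without_locale == "Привет")  # False: decoded as cp1252 instead
```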


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28  8:54       ` Jason Rumney
  2004-01-28 13:36         ` Eli Zaretskii
@ 2004-01-28 19:31         ` Paul Gorodyansky
  2004-01-28 20:31           ` Jason Rumney
  1 sibling, 1 reply; 24+ messages in thread
From: Paul Gorodyansky @ 2004-01-28 19:31 UTC (permalink / raw)


jasonr (Jason Rumney) @  f2s.com wrote in message news:<uad48wj0r.fsf@jasonrumney.net>...
> Eli Zaretskii <eliz@elta.co.il> writes:
> 
> > Emacs in CVS can use UTF-8, UTF-16-BE and UTF-16-LE as well.  I tried
> > them, but the Windows clipboard viewer still shows garbage when I
> > type "C-x RET X ENCODING RET M-w" (where ENCODING is one of the UTF-*
> > encodings mentioned above) with Cyrillic characters in the region.
> 
> Your system default is probably whatever 8bit encoding Windows uses
> for Hebrew (I suspect not the ISO one). It is not UTF-8 or UTF-16.

There are *no* Windows systems with Unicode as the system code page.
In MS terminology, CP_ACP is always a legacy code page.
The GetACP() Win32 API returns its value: 1251, 932, etc.
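
For anyone who wants to check the value without a compiler, GetACP()
can also be reached from Python through ctypes. This is a minimal
sketch; `system_ansi_codepage` is a hypothetical helper name, and on
anything other than Windows it simply returns None.

```python
import ctypes
import sys

def system_ansi_codepage():
    """Return the ANSI code page number from GetACP(), or None off Windows."""
    if sys.platform != "win32":
        return None
    return ctypes.windll.kernel32.GetACP()

print(system_ansi_codepage())
```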

-- 
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet": 
http://ourworld.compuserve.com/homepages/PaulGor/


* Re: Copying and pasting Cyrillic text between Emacs and other apps
       [not found]     ` <mailman.1502.1075297883.928.help-gnu-emacs@gnu.org>
@ 2004-01-28 19:40       ` Paul Gorodyansky
  2004-01-29  6:04         ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Gorodyansky @ 2004-01-28 19:40 UTC (permalink / raw)


Eli Zaretskii <eliz@gnu.org> wrote in message news:<mailman.1502.1075297883.928.help-gnu-emacs@gnu.org>...
> 
> I wanted something that will show me the Windows codepage and the
> Unicode codepoints of the characters;

To see Windows code page I use 2 things:
a) go to Console and type
   chcp
it returns OEM code page, say 850 and thus I know that
Windows code page is 1252 :)
MS has all that listed:
http://www.microsoft.com/globaldev/reference/cphome.mspx

b) have my own 2-line C program that calls GetACP()
   and puts it on screen :) so I can see 
   "System Code Page: 1252"

As for characters and their Unicode codepoints:
a) Start/Run - charmap - and I can see a Unicode # for
   each symbol
b) http://www.unicode.org/unicode/reports/tr24/charts/index.html

> 
> I found ClipConvert and (its newer incarnation) Album that do what I
> wanted.  They can also convert between encodings and CF_* formats, so
> it looks like there's another alternative for Paul Gorodyansky's page.

Thanks! I'll look at it.


-- 
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet": 
http://ourworld.compuserve.com/homepages/PaulGor/

> 
> > The locale can't be viewed, but it seems that Emacs might be able to
> > do something with that to get text that Emacs puts on the clipboard
> > in the right encoding.
> 
> Yes.
> 
> > But it cannot influence how other apps put text on the clipboard
> > (which AFAIK is always in the System locale, even if the characters
> > cannot be encoded in that).
> 
> It seems like on Windows XP, the characters are implicitly converted
> to Unicode (by some internal Windows machinery), so Emacs can always
> win using CF_UNICODETEXT if it's available in the clipboard.  Failing
> that, we can again use CF_LOCALE to determine the correct encoding
> with which to decode CF_TEXT text.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
       [not found]   ` <mailman.1483.1075271175.928.help-gnu-emacs@gnu.org>
  2004-01-28  9:01     ` Jason Rumney
@ 2004-01-28 19:45     ` Paul Gorodyansky
  2004-01-28 19:57       ` Kevin Rodgers
  2004-01-29  5:55       ` Eli Zaretskii
  1 sibling, 2 replies; 24+ messages in thread
From: Paul Gorodyansky @ 2004-01-28 19:45 UTC (permalink / raw)


Eli Zaretskii <eliz@elta.co.il> wrote in message news:<mailman.1483.1075271175.928.help-gnu-emacs@gnu.org>...
> 
> > Please see work-around explained in the Chapter 2 "Copy/Paste"
> > of the "Unicode related issues" section on my site.
> 
> Alas, none of the solutions there is free software.  UniPad, in
> particular, wants you to register if you want a version that isn't
> limited to 1000-character documents.  So your work-arounds are not
> very practical, unfortunately.

I was not aware of that - when I checked it last time, last
year, it was completely free "for personal use". Thanks for letting
me know...

But Netscape *is* free :) - I personally need to deal with
the discussed issue _every day_ and I never ever use UniPad -
I use Netscape 4.8 Composer

> 
> Btw, does anyone know of a tool that can show what's in the clipboard
> together with how the text is encoded there?  I found several
> clipboard-related utilities, but none of them seems to do what I want,
> which is to show me the codepoints of each character in the clipboard.

You wrote the above _before_ you found that Clipboard utility
you were talking about in your post of January 28th, right?
Or does that utility also not fulfill all your needs?

-- 
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet": 
http://ourworld.compuserve.com/homepages/PaulGor/


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28 19:45     ` Paul Gorodyansky
@ 2004-01-28 19:57       ` Kevin Rodgers
  2004-01-29  5:55       ` Eli Zaretskii
  1 sibling, 0 replies; 24+ messages in thread
From: Kevin Rodgers @ 2004-01-28 19:57 UTC (permalink / raw)


Paul Gorodyansky wrote:

> But Netscape *is* free :) - I personally need to deal with
> the discussed issue _every day_ and I never ever use UniPad -
> I use Netscape 4.8 Composer

Not in the way that matters: http://www.gnu.org/philosophy/free-sw.html


-- 
Kevin Rodgers


* Re: Copying and pasting Cyrillic text between Emacs and other apps
       [not found]     ` <mailman.1500.1075297455.928.help-gnu-emacs@gnu.org>
@ 2004-01-28 20:28       ` Jason Rumney
  2004-01-29  6:17         ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Rumney @ 2004-01-28 20:28 UTC (permalink / raw)


Eli Zaretskii <eliz@gnu.org> writes:

> The MSDN docs say that the function IsClipboardFormatAvailable can be
> used to see if CF_UNICODETEXT is supported.  Can Emacs use that
> function?

That function will only tell you if the clipboard currently contains
Unicode text. To detect if CF_UNICODETEXT is supported, you'd have to
put some text onto the clipboard first, destroying whatever some
other app has put there. This might be OK if it was done the first
time Emacs put data onto the clipboard though.

> So Emacs could put a CF_UNICODETEXT object, then the CF_TEXT object
> into the clipboard, and the Windows apps will then use the Unicode
> encoding if they can, else the system page encoded text.

On versions of Windows that support Unicode, Windows automatically
provides CF_UNICODETEXT when CF_TEXT is placed on the clipboard and
vice versa. The problem with the question marks appears to be due to
a lossy conversion after an app puts data on in CF_UNICODETEXT. For
non-lossy behaviour, we are probably better setting CF_LOCALE and
putting the text on as CF_TEXT, but reading text from the clipboard as
CF_UNICODETEXT. This would allow Emacs to be used for the workaround
that was suggested earlier in this thread.
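
The writing half of that proposal can be modeled in a few lines of
Python. Everything here is an assumption made for illustration: the
dict stands in for the clipboard, the locale table is a tiny excerpt,
and `synthesize_unicode` only imitates the conversion Windows is
described as performing, it does not call it.

```python
# Illustrative model only: a clipboard entry is a dict of format name -> value.
LCID_TO_CODEPAGE = {0x0409: "cp1252", 0x0419: "cp1251", 0x040D: "cp1255"}

def make_text_payload(text, lcid):
    """Encode `text` for CF_TEXT and tag it with the locale it was encoded for."""
    codepage = LCID_TO_CODEPAGE[lcid]
    return {"CF_LOCALE": lcid, "CF_TEXT": text.encode(codepage) + b"\0"}

def synthesize_unicode(payload):
    """Imitate a CF_TEXT to Unicode conversion that is guided by CF_LOCALE."""
    codepage = LCID_TO_CODEPAGE[payload["CF_LOCALE"]]
    return payload["CF_TEXT"].rstrip(b"\0").decode(codepage)

payload = make_text_payload("Привет", 0x0419)  # 0x0419: Russian
print(payload["CF_TEXT"])                      # b'\xcf\xf0\xe8\xe2\xe5\xf2\x00'
print(synthesize_unicode(payload) == "Привет") # True
```

The point of the locale tag is that the bytes alone are ambiguous; with
it, a Unicode copy can be derived without loss.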


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28 19:31         ` Paul Gorodyansky
@ 2004-01-28 20:31           ` Jason Rumney
  2004-01-29  5:29             ` Paul Gorodyansky
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Rumney @ 2004-01-28 20:31 UTC (permalink / raw)


paulgor@compuserve.com (Paul Gorodyansky) writes:

> > Your system default is probably whatever 8bit encoding Windows uses
> > for Hebrew (I suspect not the ISO one). It is not UTF-8 or UTF-16.
> 
> There are *no* Windows systems with Unicode as the system code page.

It is theoretically possible to use UTF-8 as your system code page,
but having tried it once I can tell you that it stops a lot of
programs from working (including cmd.exe).


* Re: Copying and pasting Cyrillic text between Emacs and other apps
       [not found]     ` <mailman.1491.1075282404.928.help-gnu-emacs@gnu.org>
@ 2004-01-28 20:32       ` Paul Gorodyansky
  2004-01-29  6:14         ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Gorodyansky @ 2004-01-28 20:32 UTC (permalink / raw)


Eli Zaretskii <eliz@gnu.org> wrote in message news:<mailman.1491.1075282404.928.help-gnu-emacs@gnu.org>...
> > Date: 28 Jan 2004 08:27:22 +0200
> > From: Eli Zaretskii <eliz@elta.co.il>
> > 
> > Btw, does anyone know of a tool that can show what's in the clipboard
> > together with how the text is encoded there?
> 
> Found it.  Google for ClipConvert and for Album from RWS Localization
> Tools.

I looked at the screen-shot of Album...

IMHO, it may be perfect for you and me, but definitely not for
an end user who has no idea about Internationalization,
so I doubt I'd put it as a suggestion on my page...

(free) Netscape 4.8 is *so* much easier, quicker, simpler -
standard scenario (based on numerous e-mails I get):

  - English computer (Windows 95...XP)
  - person works with non-Western texts, say Cyrillic (1251)

  - Netscape 4.8 has - unlike IE - Default Encoding, so
    a person has his non-Western encoding as default there.
    It means that every time he opens Netscape Composer,
    that encoding is already a current one in that window

  - now he needs to copy Cyrillic text from IE or Word
    to a non-Unicode window of a text editor or 
    say Dreamweaver

  - no need to do those I18n specific things and settings
    and selections as Album requires to do
    (BTW, does Album require you to select needed
     options _every time_?):

    just copy your text from Word to Netscape Composer window
    and then copy from there to your destination!

It's so-o-o simple :) I use it all the time, because my
working tool is a non-Unicode plain text editor UltraEdit.

-- 
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet": 
http://ourworld.compuserve.com/homepages/PaulGor/


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28 20:31           ` Jason Rumney
@ 2004-01-29  5:29             ` Paul Gorodyansky
  2004-01-29  8:54               ` Jason Rumney
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Gorodyansky @ 2004-01-29  5:29 UTC (permalink / raw)


jasonr (Jason Rumney) @  f2s.com wrote in message news:<uptd3vmpt.fsf@jasonrumney.net>...
> paulgor@compuserve.com (Paul Gorodyansky) writes:
> 
> > > Your system default is probably whatever 8bit encoding Windows uses
> > > for Hebrew (I suspect not the ISO one). It is not UTF-8 or UTF-16.
> > 
> > There are *no* Windows systems with Unicode as the system code page.
> 
> It is theoretically possible to use UTF-8 as your system code page,
> but having tried it once I can tell you that it stops a lot of
> programs from working (including cmd.exe).

How did you try it??? It was discussed a zillion times in, say,
microsoft.public.win32.programmer.international that _the only_
way to change System Code Page is to do it via Control Panel
(it's NOT possible to do from within your programming code) -
  
"Default" button in NT/2000, "Language for non-Unicode programs"
in XP. In both cases MS lists there _real_ languages, no UTF-8 there...

So a user can _not_ set UTF-8 as System Code Page.

-- 
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet": 
http://ourworld.compuserve.com/homepages/PaulGor/


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28 19:45     ` Paul Gorodyansky
  2004-01-28 19:57       ` Kevin Rodgers
@ 2004-01-29  5:55       ` Eli Zaretskii
  1 sibling, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-29  5:55 UTC (permalink / raw)


> From: paulgor@compuserve.com (Paul Gorodyansky)
> Newsgroups: gnu.emacs.help
> Date: 28 Jan 2004 11:45:17 -0800
> 
> But Netscape *is* free :)

You can get it without paying a dime, that's true, but it's a large
package, so many people would not like installing it just to be able
to paste between apps...

> > Btw, does anyone know of a tool that can show what's in the clipboard
> > together with how the text is encoded there?  I found several
> > clipboard-related utilities, but none of them seems to do what I want,
> > which is to show me the codepoints of each character in the clipboard.
> 
> You wrote the above _before_ you found that Clipboard utility
> you were talking about in your post of January 28th, right?

Right.  The gnu.org mail servers were hosed for a good portion of the
day yesterday, so messages might be out of order.  Looking at the
Date: header should help.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28 19:40       ` Paul Gorodyansky
@ 2004-01-29  6:04         ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-29  6:04 UTC (permalink / raw)


> From: paulgor@compuserve.com (Paul Gorodyansky)
> Newsgroups: gnu.emacs.help
> Date: 28 Jan 2004 11:40:13 -0800
> 
> To see Windows code page I use 2 things:
> a) go to Console and type
>    chcp
> it returns OEM code page, say 850 and thus I know that
> Windows code page is 1252 :)
> MS has all that listed:
> http://www.microsoft.com/globaldev/reference/cphome.mspx
> 
> b) have my own 2-line C program that calls GetACP()
>    and puts it on screen :) so I can see 
>    "System Code Page: 1252"

It turns out my wording was inaccurate and thus misleading.  What I
wanted to see was what codepage was used to encode the characters.
You seem to be assuming that this codepage is always identical to the
system codepage, but that is not really true, at least not on Windows
XP.  Try copying into the clipboard Cyrillic characters from the
Explorer on a non-Cyrillic Windows machine, and you will see that
CF_TEXT is encoded in cp1251 even though the system codepage is
something different.
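
Why the code page matters can be seen by decoding the same bytes two
ways. A small Python sketch; the byte string is simply the word
"Привет" encoded in cp1251, chosen as a stand-in for whatever CF_TEXT
actually held in that experiment.

```python
raw = b"\xcf\xf0\xe8\xe2\xe5\xf2"  # sample bytes, assumed to come from CF_TEXT

as_cyrillic = raw.decode("cp1251")  # "Привет"
as_western = raw.decode("cp1252")   # "Ïðèâåò"

print(as_cyrillic == "Привет", as_western == "Ïðèâåò")  # True True
```

A reader that assumes the system codepage gets the second result, so
the encoding has to be discovered rather than assumed.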

> As for characters and their Unicode codepoints:
> a) Start/Run - charmap - and I can see a Unicode # for
>    each symbol
> b) http://www.unicode.org/unicode/reports/tr24/charts/index.html

Sure, there are lots of places where Unicode codepoints of the
characters are listed, but what I wanted to know is how Windows
encodes them in the clipboard.  It turns out they use the 16-bit
Unicode codepoints, at least for the BMP.  (Out of curiosity: do you
or anyone else know how Windows encodes characters outside the
BMP?  Is it UTF-16 or something else?)
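
If the clipboard data is UTF-16 in little-endian byte order (an
assumption here, not something established in this thread), then a
character outside the BMP would show up as a surrogate pair, that is,
two 16-bit units. The Python sketch below shows what that looks like;
`utf16_units` is a hypothetical helper.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of `text` as four-digit hex strings."""
    data = text.encode("utf-16-le")
    units = struct.unpack(f"<{len(data) // 2}H", data)
    return [f"{unit:04X}" for unit in units]

print(utf16_units("Я"))           # ['042F'], one unit for a BMP character
print(utf16_units("\U0001D11E"))  # ['D834', 'DD1E'], a surrogate pair
```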


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28 20:32       ` Paul Gorodyansky
@ 2004-01-29  6:14         ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-29  6:14 UTC (permalink / raw)


> From: paulgor@compuserve.com (Paul Gorodyansky)
> Newsgroups: gnu.emacs.help
> Date: 28 Jan 2004 12:32:07 -0800
> 
> I looked at the screen-shot of Album...
> 
> IMHO, it may be perfect for you and me, but definitely not for
> an end user who has no idea about Internationalization,
> so I doubt I'd put it as a suggestion on my page...

It's your page, so you get to decide, but IMHO there's no need to
assume that your page is read only by dummies ;-) This issue is
sufficiently complicated already, so if the reader can understand how
to do it with UniPad, she can understand Album as well.  With UniPad,
you need to use "Copy As..." and "Paste As...", with Album, you need
to set the input or output codepages to whatever you want.  It's that
simple.  (And Album can be set up to perform the conversion
automatically, so the user can copy-paste as if the problem didn't
exist.)

>   - no need to do those I18n specific things and settings
>     and selections as Album requires to do

All you need to set is the output codepage for conversion (and
sometimes the input as well, it depends on the other application).

>     (BTW, does Album require you to select needed
>      options _every time_?):

No, whatever you change is used the next time Album is started.


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-28 20:28       ` Jason Rumney
@ 2004-01-29  6:17         ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2004-01-29  6:17 UTC (permalink / raw)


> From: jasonr (Jason Rumney) @  f2s.com
> Newsgroups: gnu.emacs.help
> Date: 28 Jan 2004 20:28:08 +0000
> 
> For non-lossy behaviour, we are probably better setting CF_LOCALE
> and putting the text on as CF_TEXT, but reading text from the
> clipboard as CF_UNICODETEXT.

Right.

Now, if we could find someone who would volunteer to code this...


* Re: Copying and pasting Cyrillic text between Emacs and other apps
  2004-01-29  5:29             ` Paul Gorodyansky
@ 2004-01-29  8:54               ` Jason Rumney
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Rumney @ 2004-01-29  8:54 UTC (permalink / raw)


paulgor@compuserve.com (Paul Gorodyansky) writes:

> How did you try it??? It was discussed a zillion times in, say,
> microsoft.public.win32.programmer.international that _the only_
> way to change System Code Page is to do it via Control Panel
> (it's NOT possible to do from within your programming code) -

I think I edited the registry directly, which gives finer grained
control than the control panel.
