* Emacspeak and UTF-8 -- possible?
From: cmr.Pent @ 2007-08-03 19:24 UTC
To: help-gnu-emacs

I'm having the following problem: when I run emacspeak, I cannot input or load cyrillic characters. My emacs version is 21.4, my language environment is UTF-8, and in plain emacs cyrillics work just fine. When I invoke emacs using the "emacspeak" command, all non-latin characters turn into umlauts.

The funny thing is that even on the gnu.org site (I use w3m) some unicode symbols (like quotation marks) turn into umlauts. I have a suspicion that the problem is emacspeak's inability to work with multibyte characters, but I'm not sure because of my poor understanding of emacs' character coding internals.

I'm really only an occasional user of emacspeak, but nevertheless I'd be glad to know if there is a way to overcome the problem. For now it would be OK if cyrillics are displayed properly even if they are not read aloud at all or are mistakenly read as umlauts.

* Re: Emacspeak and UTF-8 -- possible?
From: Tim X @ 2007-08-07 10:41 UTC
To: help-gnu-emacs

"cmr.Pent@gmail.com" <cmr.Pent@gmail.com> writes:

> I'm having the following problem: when I run emacspeak, I cannot input
> or load cyrillic characters. My emacs version is 21.4, my language
> environment is UTF-8, and in plain emacs cyrillics work just fine.
> When I invoke emacs using the "emacspeak" command, all non-latin
> characters turn into umlauts.
>
> The funny thing is that even on the gnu.org site (I use w3m) some unicode
> symbols (like quotation marks) turn into umlauts. I have a suspicion
> that the problem is emacspeak's inability to work with multibyte
> characters, but I'm not sure because of my poor understanding of emacs'
> character coding internals.
>
> I'm really only an occasional user of emacspeak, but nevertheless I'd
> be glad to know if there is a way to overcome the problem. For now it
> would be OK if cyrillics are displayed properly even if they are not
> read aloud at all or are mistakenly read as umlauts.
>

Emacspeak AFAIK doesn't support multi-byte characters. The problem is that many speech synthesisers, particularly older hardware based ones like the dectalk, don't understand UTF-8 character sets. If you send them a multibyte character, they either lock up, speak garbage or do something else unexpected.

There have been some branches that have forked off the original emacspeak code, but I'm not sure what their status is or where you could find out more (google?). There is also an emacspeak mailing list.

One thing you could try, assuming your TTS synth handles multibyte characters, is to ensure the variable emacspeak-unibyte is not set to t, e.g.

,----[ C-h v emacspeak-unibyte RET ]
| emacspeak-unibyte is a variable defined in `emacspeak-setup.el'.
| Its value is t
|
|
| Documentation:
| Set this to nil before starting emacspeak
| if you are running in a multibyte enabled environment.
|
| [back]
`----

This may or may not help, just a guess really.

Tim

--
tcross (at) rapttech dot com dot au

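The "umlauts" symptom from the original report is consistent with UTF-8 bytes being shown one byte per character, which is roughly what a unibyte session does. The Python sketch below only illustrates that general effect; it is an assumption about the cause, not something confirmed in this thread, and the function names misread_as_latin1 and recover are made up for the illustration.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were Latin-1.

    This mimics a display layer that treats every byte as one character.
    """
    return text.encode("utf-8").decode("latin-1")


def recover(garbled: str) -> str:
    """Reverse the misreading: back to the raw bytes, then decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    original = "привет"
    garbled = misread_as_latin1(original)
    # Every Cyrillic letter is two bytes in UTF-8, so the garbled string
    # is twice as long; the lead bytes 0xD0/0xD1 show up as accented
    # capital letters in Latin-1.
    print(len(original), len(garbled))
    print(recover(garbled) == original)
```

Plain ASCII text passes through unchanged, which would match the observation that only non-latin characters and symbols such as typographic quotation marks are affected.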
* Re: Emacspeak and UTF-8 -- possible?
From: cmr.Pent @ 2007-08-08 9:09 UTC
To: help-gnu-emacs

Thanks, I was able to solve the problem. The solution was to hack the Debian emacspeak startup shell script and emacspeak-setup.el. After that, latin characters are read very nicely (I use the eflite engine), and cyrillics are displayed (and can be input) just fine as well.

* Re: Emacspeak and UTF-8 -- possible?
From: Stefan Monnier @ 2007-08-08 14:56 UTC
To: help-gnu-emacs

> Emacspeak AFAIK doesn't support multi-byte characters. The problem is
> that many speech synthesisers, particularly older hardware based ones like
> the dectalk, don't understand UTF-8 character sets. If you send them
> a multibyte character, they either lock up, speak garbage or do something
> else unexpected.

That's not a good reason to prevent display of any other char.

        Stefan

* Re: Emacspeak and UTF-8 -- possible?
From: Robert D. Crawford @ 2007-08-08 23:44 UTC
To: help-gnu-emacs

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Emacspeak AFAIK doesn't support multi-byte characters. The problem is
>> that many speech synthesisers, particularly older hardware based ones like
>> the dectalk, don't understand UTF-8 character sets. If you send them
>> a multibyte character, they either lock up, speak garbage or do something
>> else unexpected.
>
> That's not a good reason to prevent display of any other char.

Does any other char mean UTF-8? If this is the case, wouldn't you agree that it is better to not have UTF-8 support than to not be able to use the computer because your speech synth locks up unexpectedly and often?

rdc
--
Robert D. Crawford rdc1x@comcast.net

You will attract cultured and artistic people to your home.

* Re: Emacspeak and UTF-8 -- possible?
From: Stefan Monnier @ 2007-08-09 17:44 UTC
To: help-gnu-emacs

>>> Emacspeak AFAIK doesn't support multi-byte characters. The problem is
>>> that many speech synthesisers, particularly older hardware based ones like
>>> the dectalk, don't understand UTF-8 character sets. If you send them
>>> a multibyte character, they either lock up, speak garbage or do something
>>> else unexpected.
>>
>> That's not a good reason to prevent display of any other char.

> Does any other char mean UTF-8? If this is the case, wouldn't you agree
> that it is better to not have UTF-8 support than to not be able to use
> the computer because your speech synth locks up unexpectedly and often?

No, I'm saying that the place where they placed the check to filter out unwanted chars is wrong. They should have Emacs accept any random encoding as always, and then encode/filter the text they send to the underlying process.

Emacs constantly encodes and decodes text between different encodings. E.g. if you visit a latin-1 file, it gets decoded into Emacs's internal representation, and when you save it, it gets re-encoded into latin-1 (unless you've decided to change the file's encoding, in which case it may be re-encoded in any other coding-system).

So if the speech process only understands latin-1, they should simply set the coding-system used for that process accordingly and everything should just work. They may encounter difficulties finding the proper coding-system that handles unencodable chars (e.g. cyrillic chars with a latin-1 coding-system) in the way they want (e.g. drop the char altogether or replace it with a "?" or some other special char), but people on emacs-devel@gnu.org will be happy to help resolve those.

        Stefan

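The idea of keeping the text intact internally and only encoding it at the process boundary can be sketched outside Emacs as well. The Python below is a rough analogue, not Emacs or emacspeak code: encode_for_synth and its policy argument are hypothetical names, and the Latin-1-only synthesiser is an assumption taken from the example in the message above. It shows the two fallback behaviours mentioned there, replacing an unencodable character with "?" or dropping it.

```python
def encode_for_synth(text: str, policy: str = "replace") -> bytes:
    """Encode text for a hypothetical speech process that only accepts Latin-1.

    policy="replace" turns each unencodable character into "?";
    policy="ignore" drops unencodable characters altogether.
    The original text is never modified, only the bytes sent out.
    """
    if policy not in ("replace", "ignore"):
        raise ValueError("policy must be 'replace' or 'ignore'")
    return text.encode("latin-1", errors=policy)


if __name__ == "__main__":
    buffer_text = "Привет, world"  # stays intact for display and editing
    print(encode_for_synth(buffer_text, "replace"))
    print(encode_for_synth(buffer_text, "ignore"))
    print(encode_for_synth("café"))  # encodable text passes through
```

The point of the design is that filtering happens once, on the way out to the speech process, so display and input of other characters are unaffected.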
* Re: Emacspeak and UTF-8 -- possible?
From: Tim X @ 2007-08-10 7:40 UTC
To: help-gnu-emacs

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>>> Emacspeak AFAIK doesn't support multi-byte characters. The problem is
>>>> that many speech synthesisers, particularly older hardware based ones like
>>>> the dectalk, don't understand UTF-8 character sets. If you send them
>>>> a multibyte character, they either lock up, speak garbage or do something
>>>> else unexpected.
>>>
>>> That's not a good reason to prevent display of any other char.
>
>> Does any other char mean UTF-8? If this is the case, wouldn't you agree
>> that it is better to not have UTF-8 support than to not be able to use
>> the computer because your speech synth locks up unexpectedly and often?
>
> No, I'm saying that the place where they placed the check to filter out
> unwanted chars is wrong. They should have Emacs accept any random encoding
> as always, and then encode/filter the text they send to the
> underlying process.
>

Yes, the way emacspeak handles it is wrong given emacs' current internals and how it deals with the issue. But you have totally overlooked the fact that emacspeak was designed and developed before emacs had this capability. If you were implementing emacspeak today, then this is likely how you would do it.

> Emacs constantly encodes and decodes text between different encodings. E.g. if
> you visit a latin-1 file, it gets decoded into Emacs's internal
> representation, and when you save it, it gets re-encoded into latin-1
> (unless you've decided to change the file's encoding, in which case it may
> be re-encoded in any other coding-system).

Yes, and how many years has it taken to get this working well and reliably? This is not a criticism, just pointing out that this wasn't a trivial change. Likewise, it is not a trivial change with emacspeak, which is the largest and possibly most complex of all the add-on emacs packages I've seen.

>
> So if the speech process only understands latin-1, they should simply set
> the coding-system used for that process accordingly and everything should
> just work. They may encounter difficulties finding the proper coding-system
> that handles unencodable chars (e.g. cyrillic chars with
> a latin-1 coding-system) in the way they want (e.g. drop the char
> altogether or replace it with a "?" or some other special char), but people
> on emacs-devel@gnu.org will be happy to help resolve those.

Again, I generally agree and this is along the lines of previous discussions on the topic amongst emacspeak users. However, your description makes it sound like all that needs to be done is a couple of minor changes. This is not the case.

The design of emacspeak was done back when essentially all you had to worry about was ascii characters and, at the time, you essentially had one decent quality hardware synthesiser which could only handle the basic ascii character set. I don't think it even handled extended ascii. There were decisions made which in hindsight were probably incorrect. For example, emacspeak does a fair amount of processing of characters prior to sending them to the speech device. In fact, it sends them to an intermediate layer written in TCL which does further processing. Originally, a lot of the internal processing within the elisp part of emacspeak was not modular or done in a single location that would make it easy to change. There are also a number of other issues about how to process these characters: what to translate and what to translate to, determining when to translate and when not to, and how to control all of this to get the best results while keeping the whole system as responsive as possible.

However, the main issue I have with your analysis is that you obviously don't understand what emacspeak does and how it works. It is not simply a screen reader that just sends the text as it appears on the screen to a TTS engine. Emacspeak adds a lot of contextual information, which is one of its strengths and what makes it so much more than a 'dumb' screen reader. Other systems, like speechd, handle character encodings better in this respect, but speechd is simpler in design and has the advantage of being designed after emacs had itself incorporated support for various encodings. It also has the advantage of using a speech interface that has also been designed with support for multiple character encodings.

As emacs' own handling of character encodings has matured, work has been going on to refactor emacspeak code to make the necessary changes to support various encodings easier. Over the last couple of years, TTS synthesisers have improved and a growing number now support UTF-8 and other encodings. The TCL interface layer has now got support for handling different character encodings etc. So, in many ways, some of the required basics are in place to make the necessary changes, but it is still a major undertaking. So major in fact that everyone who has previously started looking at this has generally decided it was too much work for just one person.

You also need to realise that for the majority of emacspeak users, what is displayed on the screen is irrelevant. In fact, I know of a number of emacspeak users who don't even use a screen at all. Remember that emacspeak is a specialised add-on with a targeted audience, not a general purpose emacs package. The fact it limits the range of characters that can be displayed (and even entered) to what can be turned into speech by the TTS engine was not an issue for most users and has only relatively recently become an issue because of the evolution of both emacs and available TTS engines.

Until demand is sufficient that enough people are prepared to actually do the work needed to change how emacspeak works, nothing will change. Making sweeping generalised statements about how it is wrong achieves nothing, contributes even less, and totally fails to recognise the hard and very innovative work of the author in not only providing the first really functional interface on Linux for blind and VI users, but also in demonstrating radically different approaches to computer interfaces for those requiring assistive technology.

Tim

* Re: Emacspeak and UTF-8 -- possible?
From: Tim X @ 2007-08-10 5:39 UTC
To: help-gnu-emacs

"Robert D. Crawford" <rdc1x@comcast.net> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> Emacspeak AFAIK doesn't support multi-byte characters. The problem is
>>> that many speech synthesisers, particularly older hardware based ones like
>>> the dectalk, don't understand UTF-8 character sets. If you send them
>>> a multibyte character, they either lock up, speak garbage or do something
>>> else unexpected.
>>
>> That's not a good reason to prevent display of any other char.
>
> Does any other char mean UTF-8? If this is the case, wouldn't you agree
> that it is better to not have UTF-8 support than to not be able to use
> the computer because your speech synth locks up unexpectedly and often?
>

The problem is a lack of familiarity with how emacspeak achieves what it does so simply. As you would know, making emacspeak support other encodings would involve a complete re-write of its internals. In fact, a whole new translation layer would be required in order to enable full support of emacs' supported character encodings *and* support both TTS engines that do and do not support encodings other than ASCII. I suspect that in order to keep it as flexible as it is now with respect to how little is required to support various add on packages, a totally new architecture would be required. This is also likely to introduce additional processing overhead which could easily degrade the real time responsiveness to the point where the system is not usable, particularly when using some of the free TTS engines, which already are only just acceptable with respect to responsiveness.

Stefan makes some valuable contributions, especially to this group, but in this instance, his opinion is not based on anything of substance and contributes nothing of relevance.

Tim

* grep-find question (Is it a bug of GunWin32 version of "grep")
From: brianjiang @ 2007-08-10 9:23 UTC
To: help-gnu-emacs

I tried to use grep-find today (on Windows XP), so I downloaded the GnuWin32 versions of grep and findutils. But it seems that the "-i" ("--ignore-case") option doesn't work correctly.

When I didn't use the "-i" option, searching succeeded:
=====================================
D:\WiKi>find . -type f -exec grep -nH RS17 {} ";"
./MyBase.muse:116: - RS17:

D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nH RS17
./MyBase.muse:116: - RS17:

Then when I added the "-i" option, all the searches failed:
======================================-===
D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nHi RS17

D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nHi rs17

D:\WiKi>find . -type f -exec grep -nHi RS17 {} ";"

D:\WiKi>find . -type f -exec grep -nHi rs17 {} ";"

And when I try the "mingw" version of these tools, the "-i" option works well:
===================================================
Brian@BRIANJIANG /d/WiKi
$ find . -type f -exec grep -nHi rs17 {} NUL ";"
./MyBase.muse:116: - RS17:

Brian@BRIANJIANG /d/WiKi
$ find . -type f -print0 | xargs -0 -e grep -nHi rs17
./MyBase.muse:116: - RS17:

Is it a bug in the GnuWin32 version of the "grep" program?

Regards,
Brian

* Re: grep-find question (Is it a bug of GunWin32 version of "grep")
From: Eli Zaretskii @ 2007-08-10 14:07 UTC
To: help-gnu-emacs

> Date: Fri, 10 Aug 2007 17:23:32 +0800
> From: <brianjiang@gdnt.com.cn>
>
> Then when I added the "-i" option, all the searching failed:
> ======================================-===
> D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nHi RS17
>
> D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nHi rs17
>
> D:\WiKi>find . -type f -exec grep -nHi RS17 {} ";"
>
> D:\WiKi>find . -type f -exec grep -nHi rs17 {} ";"

I cannot reproduce this with the GnuWin32 port of Grep 2.5.1 (from grep-2.5.1a-bin.zip on the GnuWin32 site). What version do you have on your machine?

If you have the same version as I do, maybe you have some problem with pcre.dll, the regexp library on which Grep depends (like if some other package you installed overwrote the version of pcre.dll that came with Grep 2.5.1)?

> And I try the "mingw" version of these tools, the "-i" version works
> well:

What is the "mingw" version? Where did you get the binaries? Do you mean the MSYS version, perhaps?

* RE: grep-find question (Is it a bug of GunWin32 version of "grep")
From: brianjiang @ 2007-08-11 5:55 UTC
To: eliz, help-gnu-emacs

I use the GnuWin32 port of Grep 2.5.1 too (installed using grep-2.5.1a-2-setup.exe):
------------------------------------------------------------
D:\WiKi>grep -V
grep (GNU grep) 2.5.1

Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

And I cannot find pcre.dll on my computer. Instead, I find pcre3.dll (installed by GnuWin32):
------------------------------------------------------------
D:\WiKi>which pcre3.dll
C:/Program Files/GnuWin32/bin/pcre3.dll

The command path has no problem:
------------------------------------------------------------
D:\WiKi>which grep
C:/Program Files/GnuWin32/bin/grep.EXE

D:\WiKi>which find
C:/Program Files/GnuWin32/bin/find.EXE

D:\WiKi>which xargs
C:/Program Files/GnuWin32/bin/xargs.EXE

And I found that when the text in the file is lowercase, e.g. "rs17", I can use the "-i" option to find it successfully, e.g. "-i RS17". But if the text in the file is uppercase, e.g. "RS17", then neither "-i rs17" nor "-i RS17" can find it :( See below (have you tried that?):
------------------------------------------------------------
D:\WiKi>
D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nHi rs17
./MyBase.muse:284:$ find . -type f -exec grep -nHi rs17 {} NUL ";"
./MyBase.muse:285:$ find . -type f -print0 | xargs -0 -e grep -nHi rs17

D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nHi RS17
./MyBase.muse:284:$ find . -type f -exec grep -nHi rs17 {} NUL ";"
./MyBase.muse:285:$ find . -type f -print0 | xargs -0 -e grep -nHi rs17

D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nH RS17
./MyBase.muse:116: - RS17:

D:\WiKi>find . -type f -print0 | xargs -0 -e grep -nH rs17
./MyBase.muse:284:$ find . -type f -exec grep -nHi rs17 {} NUL ";"
./MyBase.muse:285:$ find . -type f -print0 | xargs -0 -e grep -nHi rs17

Yes, when I say the "mingw" version, I mean the MSYS version. I always mix up these two terms :( The MSYS version of grep is 2.4.2:
------------------------------------------------------------
$ grep -V
grep (GNU grep) 2.4.2

Copyright 1988, 1992-1999, 2000 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

And it can find the thing correctly (3 matches):
------------------------------------------------
Brian@BRIANJIANG /D/WiKi
$ find . -type f -print0 | xargs -0 -e grep -nHi rs17
./MyBase.muse:116: - RS17:
./MyBase.muse:284:$ find . -type f -exec grep -nHi rs17 {} NUL ";"
./MyBase.muse:285:$ find . -type f -print0 | xargs -0 -e grep -nHi rs17

Brian@BRIANJIANG /D/WiKi
$ find . -type f -print0 | xargs -0 -e grep -nHi RS17
./MyBase.muse:116: - RS17:
./MyBase.muse:284:$ find . -type f -exec grep -nHi rs17 {} NUL ";"
./MyBase.muse:285:$ find . -type f -print0 | xargs -0 -e grep -nHi rs17

I currently use the MSYS version for my Emacs and it works well.

Regards,
Brian

_______________________________________________
help-gnu-emacs mailing list
help-gnu-emacs@gnu.org
http://lists.gnu.org/mailman/listinfo/help-gnu-emacs

* RE: grep-find question (Is it a bug of GunWin32 version of "grep")
From: brianjiang @ 2007-08-11 7:06 UTC
To: eliz, help-gnu-emacs

Just reformat my previous mail....

* Re: grep-find question (Is it a bug of GunWin32 version of "grep")
From: Eli Zaretskii @ 2007-08-11 10:06 UTC
To: help-gnu-emacs

> Date: Sat, 11 Aug 2007 13:55:17 +0800
> From: <brianjiang@gdnt.com.cn>
>
> And I cannot find the pcre.dll in my computer. Instead, I find pcre3.dll (installed by GnuWin32):
> ------------------------------------------------------------
> D:\WiKi>which pcre3.dll
> C:/Program Files/GnuWin32/bin/pcre3.dll

Strange. I installed Grep from a zip file, not with a self-installing setup program, and I clearly see only pcre.dll in the grep-2.5.1a-dep.zip archive I still have on my machine.

What is the time stamp of pcre3.dll on your system?

Also, does it help to reinstall Grep?

> And I found if when text in the file is lowcase, e.g., "rs17", then I can use "-i" option to find it successfully. e.g., "-i RS17".
> But if the text in the file is uppercase, e.g., "RS17", then neither "-i rs17" nor "-i RS17" can found it :( See below:
> (Have you tried that?)

Yes, I've tried that, and it works as I expect: it finds text case-insensitively no matter if I specify the search string in uppercase or lowercase.

> I currently use the MSYS version for my Emacs and it works well.

FWIW, I don't recommend this. MSYS ports are meant for one purpose only: to be able to build MinGW ports of other tools. For that purpose, they sometimes tweak the command-line arguments in order to allow running those commands from Unix shell scripts. In particular, they convert Windows file names with drive letters into pseudo-Posix file names that start with a forward slash. If a command-line argument looks like a file name, but really isn't, this conversion could have a devastating effect on the Grep command.

* RE: grep-find question (Is it a bug of GunWin32 version of "grep")
From: brianjiang @ 2007-08-12 9:58 UTC
To: eliz, help-gnu-emacs

Hi Eli,

Where did you download the package? I downloaded it from http://gnuwin32.sourceforge.net/packages/grep.htm.

I have uninstalled and reinstalled it, but the problem still exists. I also tried the ZIP package, and the result is the same. And I cannot find pcre.dll in these packages, only pcre3.dll.

My pcre3.dll is shipped with both the SETUP package and the ZIP package. The timestamps are as follows:
-------------------------------------------------------
 Directory of C:\Program Files\GnuWin32\bin

2007-08-12  17:41    <DIR>          .
2007-08-12  17:41    <DIR>          ..
2007-04-03  03:40                33 egrep
2007-04-03  03:40                33 fgrep
2005-04-19  02:53           160,256 find.exe
2007-06-23  03:35           122,368 grep.exe
2004-03-17  04:37           898,048 libiconv2.dll
2005-05-07  03:52           103,424 libintl3.dll
2005-04-19  02:53           113,664 locate.exe
2007-03-17  17:56           140,288 pcre3.dll
2006-10-10  13:48                30 rgrep
2005-03-18  06:38             8,228 updatedb
2005-05-01  17:13            61,440 which.exe
2005-04-19  02:53            32,768 xargs.exe
              12 File(s)      1,640,580 bytes
               2 Dir(s)   2,093,121,536 bytes free

> Yes, I've tried that, and it works as I expect: it finds text
> case-insensitively no matter if I specify the search string in
> uppercase or lowercase.

Actually, I mean it has a problem when the text string in the file is uppercase. In my example, I have a text string "RS17" in the file "MyBase.muse". When I use "-i rs17" or "-i RS17" to search, I cannot find "RS17" in that file. But I also have another text string "rs17" in "MyBase.muse", and when I use "-i rs17" or "-i RS17" I can find it ("rs17" in the file) successfully. Very strange.

Regards, Brian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: grep-find question (Is it a bug of GunWin32 version of "grep") 2007-08-12 9:58 ` brianjiang @ 2007-08-12 18:59 ` Eli Zaretskii 0 siblings, 0 replies; 19+ messages in thread From: Eli Zaretskii @ 2007-08-12 18:59 UTC (permalink / raw) To: help-gnu-emacs > Date: Sun, 12 Aug 2007 17:58:48 +0800 > From: <brianjiang@gdnt.com.cn> > > > Yes, I've tried that, and it works as I expect: it finds text > > case-insensitively no matter if I specify the search string in > > uppercase or lowercase. > > Actually, I mean it has problem when the text string in the file is > uppercase. For me, it works both with upper- and lower-case text in the file. ^ permalink raw reply [flat|nested] 19+ messages in thread
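For anyone trying to reproduce the report above, the result that grep's "-i" option is expected to give can be cross-checked without grep. The following is only a minimal Python sketch: the helper name `find_case_insensitive` and the sample lines are invented for illustration and are not taken from the thread.

```python
import re


def find_case_insensitive(pattern, lines):
    """Return (line_number, line) pairs whose text contains `pattern`, ignoring case."""
    # re.escape treats the pattern as a literal string, like a plain search word.
    regex = re.compile(re.escape(pattern), re.IGNORECASE)
    return [(n, line) for n, line in enumerate(lines, start=1) if regex.search(line)]


# Invented sample text standing in for the contents of a file such as MyBase.muse.
sample = ["see RS17 for details", "rs17 again in lower case", "unrelated line"]

upper_hits = find_case_insensitive("RS17", sample)
lower_hits = find_case_insensitive("rs17", sample)
```

Both calls should report the same two lines. If the same check run over the real file reports the uppercase line while grep.exe does not, that would point at the grep port or its libraries rather than at the file contents.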
* Re: Emacspeak and UTF-8 -- possible? 2007-08-08 14:56 ` Stefan Monnier 2007-08-08 23:44 ` Robert D. Crawford @ 2007-08-10 5:27 ` Tim X 2007-08-11 4:02 ` Stefan Monnier 1 sibling, 1 reply; 19+ messages in thread From: Tim X @ 2007-08-10 5:27 UTC (permalink / raw) To: help-gnu-emacs Stefan Monnier <monnier@iro.umontreal.ca> writes: >> Emacspeak AFAIK doesn't support multi-byte characters. The problem is >> that many speech synthesizers, particularly older hardware-based ones like >> the dectalk, don't understand UTF-8 character sets. If you send them >> a multibyte character, they either lock up, speak garbage or do something >> else unexpected. > > That's not a good reason to prevent display of any other char. > > > Stefan Nobody said it was a good reason, it's just the reason. I'd suggest you need to have familiarity with how it works and its internals and history before you can make a judgement. Given your contributions and long involvement with Emacs, I'm surprised to see you make such an uninformed statement which contributes so little. Tim ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Emacspeak and UTF-8 -- possible? 2007-08-10 5:27 ` Emacspeak and UTF-8 -- possible? Tim X @ 2007-08-11 4:02 ` Stefan Monnier 2007-08-13 21:47 ` Raman 0 siblings, 1 reply; 19+ messages in thread From: Stefan Monnier @ 2007-08-11 4:02 UTC (permalink / raw) To: help-gnu-emacs >>> Emacspeak AFAIK doesn't support multi-byte characters. The problem is >>> that many speech synthesizers, particularly older hardware-based ones like >>> the dectalk, don't understand UTF-8 character sets. If you send them >>> a multibyte character, they either lock up, speak garbage or do something >>> else unexpected. >> >> That's not a good reason to prevent display of any other char. > Nobody said it was a good reason, it's just the reason. I'd suggest you > need to have familiarity with how it works and its internals and history > before you can make a judgement. > Given your contributions and long involvement with Emacs, I'm surprised to > see you make such an uninformed statement which contributes so little. I'm sorry if I offended you. I didn't mean to say that Emacspeak is crap or stupid, really. I just felt it was useful to point out that the justification given, that the limitation is due to the speech synthesis process's own limitations, was not quite correct. The real reason is that Emacspeak has not been adjusted to the way Emacs handles encoding, so the limitation is a result of historical circumstances coupled with a lack of manpower/expertise/motivation to rework the code in order to lift this restriction. Nothing to be ashamed of here. Again, I'm sorry if I sounded like I was criticizing Emacspeak, Stefan PS: How ironic that Emacspeak provides "the first really functional interface on Linux for [blind and] VI users". ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Emacspeak and UTF-8 -- possible? 2007-08-11 4:02 ` Stefan Monnier @ 2007-08-13 21:47 ` Raman 2007-08-14 18:28 ` Stefan Monnier 0 siblings, 1 reply; 19+ messages in thread From: Raman @ 2007-08-13 21:47 UTC (permalink / raw) To: help-gnu-emacs > > > PS: How ironic that Emacspeak provides "the first really functional > interface on Linux for [blind and] VI users". ;-) It actually was a first in helping blind users run VI independently on a workstation console --- thanks to the wonders of eterm.el in Emacs 19.26 --- which for the first time made curses-based apps runnable in an Emacs Term. As for the character coding issues --- I forcibly set EMACS_UNIBYTE to T in the Emacspeak setup files a few years ago to make it clear that the UTF-8 piece needed work. Doing anything else would just cause bizarre/rude surprises halfway while one is working. Now, when you have a system like Emacs itself which attracts a large number of developers, it is possible to enable something as complex as character encoding and translation and over time debug all of the issues --- we've seen this happen in the case of mainline Emacs over the last 8 years. However, Emacspeak does not have this luxury, and pretending that multibyte support works when it doesn't would only lead to large amounts of frustration and support questions of the form "why doesn't xxx work?" on the mailing list. For the record, it should now be possible to add multibyte support by carefully binding buffer-encoding in the scratch buffer used to transform text and by setting process-encoding for that buffer before streaming out the text to the TTS process. But doing this will require: A) Manpower cycles (ramanpower cycles are vanishingly small for this) B) Ability to test --- TTS engines, multibyte texts, and someone who uses it C) Early users who are sufficiently knowledgeable to be able to bear the pain, identify problems, and define solutions --Raman ^ permalink raw reply [flat|nested] 19+ messages in thread
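Both the symptom reported at the start of this thread and the fix Raman outlines come down to which encoding is applied to the text at each step. As an illustration outside Emacs, here is a small Python sketch under stated assumptions: it assumes the unibyte session treats each byte as one Latin-1 character, and `misread_as_unibyte` and `encode_for_tts` are hypothetical helpers, not part of Emacspeak or of any TTS engine's API.

```python
def misread_as_unibyte(text):
    """Encode text as UTF-8, then read every byte back as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def encode_for_tts(text, tts_encoding="latin-1"):
    """Encode text for a synthesizer assumed to accept only `tts_encoding`.

    Characters the target encoding cannot represent are replaced with '?',
    so the synthesizer never receives bytes outside the encoding it expects.
    """
    return text.encode(tts_encoding, errors="replace")


word = "привет"                      # six Cyrillic letters
garbled = misread_as_unibyte(word)   # twelve characters: two bytes per letter
safe_bytes = encode_for_tts(word)    # every letter replaced by '?'
```

The first helper reproduces the "everything turns into accented Latin characters" effect: nothing is lost, the bytes are simply read with the wrong encoding. The second shows the design choice behind setting an explicit encoding on the stream to the speech server: the text shown to the user can stay multibyte while the synthesizer only ever sees bytes in the encoding it expects.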
* Re: Emacspeak and UTF-8 -- possible? 2007-08-13 21:47 ` Raman @ 2007-08-14 18:28 ` Stefan Monnier 0 siblings, 0 replies; 19+ messages in thread From: Stefan Monnier @ 2007-08-14 18:28 UTC (permalink / raw) To: help-gnu-emacs > A) Manpower cycles (ramanpower cycles are vanishingly small for this) > B) Ability to test --- TTS engines, multibyte texts, and > someone who uses it > C) Early users who are sufficiently knowledgeable to be able > to bear the pain, identify problems, and define > solutions Maybe a first step will be to make it a customizable option, marked as "please help test&debug it if you can". If it ever starts to appear mildly usable, you can then change its default to "enabled", and document how to turn it off in as many places as you can think of. If you ever intend to start this transition, better start early and better let other people find&fix the bugs for you ;-) Unibyte sessions in Emacs are becoming more and more problematic, so you'll probably have to make the transition at some point (there's no indication we'll ever remove support for unibyte *buffers and strings*, which have some very important uses, but it's difficult to improve the support for multibyte sessions without introducing minor problems in unibyte *sessions*). Stefan ^ permalink raw reply [flat|nested] 19+ messages in thread