* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes [not found] ` <<83lhal1qtm.fsf@gnu.org> @ 2015-10-29 20:53 ` Drew Adams 2015-10-30 7:47 ` Eli Zaretskii [not found] ` <<2c1ac781-86b8-4365-8466-52455afb79f6@default> 1 sibling, 1 reply; 16+ messages in thread From: Drew Adams @ 2015-10-29 20:53 UTC (permalink / raw) To: Eli Zaretskii, Drew Adams; +Cc: 21780 [-- Attachment #1: Type: text/plain, Size: 4724 bytes --] > It might work as you expect already. You can try this: > > . After "C-h f some-function RET", switch to the *Help* buffer and > type "C-x RET f utf-8 RET", then save the buffer as in your > recipe. > > . Now visit the file where you saved the *Help* buffer: if the > curved quotes display correctly, then "it works for you". That's exactly what I reported I did. The result was seeing octal escapes when I opened the file in a new Emacs session. Or so I thought. I just repeated it now with `emacs -Q' and it does not seem to be a problem if utf-8 is used. I think now that what I did earlier, when specified utf-8, I was using my setup. When I try that again (with my setup), I see the problem I reported. Also, with my setup the *Warning* text is different. Instead of providing lots of possible encoding choices and using chinese-iso-8bit as the default (which is what I get with emacs -Q - why is that, BTW?), it says: Select one of the safe coding systems listed below, or cancel the writing with C-g and edit the buffer to remove or modify the problematic characters, or specify any other coding system (and risk losing the problematic characters). raw-text no-conversion It was in that context that I anyway tried utf-8 and got the awful result. I also tried raw-text, since utf-8 was not in the prompt (but was apparently accepted, and apparently did not help). 
The full warning is this, when I use my setup (which uses font "-outline-Lucida Console-normal-normal-normal-mono-14-*-*-*-c-*-iso8859-1"): These default coding systems were tried to encode text in the buffer 'throw-isearch-help5.txt': (undecided-unix (489 . 8216) (491 . 8217) (499 . 8216) (503 . 8217) (577 . 8216) (583 . 8217) (875 . 8216) (892 . 8217) (912 . 8216) (931 . 8217) (963 . 8216)) (iso-latin-1-unix (489 . 8216) (491 . 8217) (499 . 8216) (503 . 8217) (577 . 8216) (583 . 8217) (875 . 8216) (892 . 8217) (912 . 8216) (931 . 8217) (963 . 8216)) However, each of them encountered characters it couldn't encode: undecided-unix cannot encode these: ' ' ' ' ' ' ' ' ' ' ... iso-latin-1-unix cannot encode these: ' ' ' ' ' ' ' ' ' ' ... Click on a character (or switch to this window by 'C-x o' and select the characters by RET) to jump to the place it appears, where 'C-u C-x =' will give information about it. Select one of the safe coding systems listed below, or cancel the writing with C-g and edit the buffer to remove or modify the problematic characters, or specify any other coding system (and risk losing the problematic characters). raw-text no-conversion I got only a #...# file written with utf-8, as the Emacs 25 build I have crashes all the time, and trying to select the minibuffer frame after the popped-up *Warning* frame grabs the selection just crashes Emacs. And that happens now when I try again, with my setup. I've attached those files from the first try I made for this, with my setup. I see now that the `#...#' one has UTF-8 encoding and the other, `...' without the #s, has encoding `t' in the mode line, which I guess means raw text. To report the problem I then used `emacs -Q', and this time I tried raw text, and I saw the octal escapes. So I mistakenly reported that I saw them after specifying both utf-8 and raw text. > . 
If the curved quotes look like raw bytes or, worse, pairs of > non-ASCII characters, you need to visit such file like this: > > C-x RET c utf-8 RET C-x C-f FILE-NAME RET Users should not have to do that. I thought they would, like me have that annoyance. > > In Emacs, before saving, the buffer looks fine. > > It looks fine, but the encoding mnemonic on the mode line is not "U" > (which stands for UTF-8), right? That is why Emacs asks you for > encoding: it cannot save these characters using your locale's default > encoding (which is what the *Help* buffer uses by default). > Yes, because you probably told Emacs to use raw-text or somesuch, when > it asked. See above. > > Do I need to save the buffer using some other encoding? If so, which? > > Yes, you could tell it to use UTF-8 when it asked. (After my change, > Emacs will do that automatically, no questions asked, when saving > *Help* buffers with curved quotes.) Sounds good. > > Emacs proposed two encodings (one of which was raw text, which I tried; > > and I tried also utf-8, which I would have thought would show curly > > quotes OK. > > UTF-8 should have worked. I wouldn't expect you to see octal escapes > after saving in UTF-8. See above for, I think, the explanation of what I did and saw. Thx. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: throw-isearch-help.txt --] [-- Type: text/plain; charset=Windows-1252; name="throw-isearch-help.txt", Size: 11918 bytes --] isearch-forward is an interactive compiled Lisp function in `isearch+.el'. It is bound to C-s, menu-bar search i-search isearch-forward. (isearch-forward &optional ARG NO-RECURSIVE-EDIT) For more information check the manuals. Search forward incrementally - Isearch+ version. With a non-negative prefix arg, do an incremental regular expression search instead. 
With a negative prefix arg, do a (plain, not regexp) incremental search across multiple buffers: * If the prefix arg is ‘-’ (from ‘M--’) then you are prompted for the list of buffers. * Otherwise, (e.g. ‘M-- 2’), you are prompted for a regexp that matches the names of the buffers to be searched. If you try to exit with the search string empty then nonincremental search is used. As you type characters, they add to the search string and are found. The following non-printing keys are bound in ‘isearch-mode-map’. Options ------- ‘isearchp-case-fold’ - search is case sensitive? ‘isearchp-dim-outside-search-area-flag’ [*] - dim non-search zones? ‘isearchp-dimming-color’ [*] - color for non-search zones ‘isearchp-set-region-flag’ - select last search target? ‘isearchp-restrict-to-region-flag’ - restrict search to region? ‘isearchp-deactivate-region-flag’ - search deactivates region? ‘isearchp-ignore-comments-flag’ [*] - ignore THINGs in comments? ‘isearchp-hide-whitespace-before-comment-flag’ [*] - precomment space? ‘isearchp-mouse-2-flag’ - ‘mouse-2’ anywhere yanks selection? ‘isearchp-regexp-quote-yank-flag’ - regexp-quote yanked text? ‘isearchp-toggle-option-flag’ - toggle options too? ‘isearchp-drop-mismatch’ - handling input after search mismatch ‘isearchp-drop-mismatch-regexp-flag’ - regexp search drop mismatch? ‘isearchp-initiate-edit-commands’ - keys that edit, not exit [*] Requires library ‘isearch-prop.el’. 
Commands -------- DEL - cancel last input item from end of search string RET - exit, leaving point at location found C-s - search again forward, C-r backward C-y C-w - yank a word or char from buffer onto search string C-z - yank a char from buffer onto search string C-M-w - delete char from end of search string C-y C-e - yank text up to end of line onto search string C-y C-y - yank the last string of killed or copied text M-y - replace string just yanked with string killed/copied before it M-w - copy current search string to kill ring C-_ - yank a symbol or char from buffer onto search string C-( - yank sexp, symbol, or char from buffer onto search string C-q - quote a control character, to search for it C-x 8 RET - add a Unicode char to search string by Unicode name C-M-l - remove failed part of search string, if any C-g - remove failed part of search string, or cancel if none C-x o - invoke Emacs command loop recursively, during Isearch M-g - insert successful search string from when you hit ‘C-g’ M-s e - edit the search string in the minibuffer M-n, M-p - search for next/previous item in search ring M-x isearchp-complete - complete the search string using the search ring M-% - run ‘query-replace’ to replace search-string matches C-M-% - run ‘query-replace-regexp’ M-s o - run ‘occur’ for search-string matches M-s h r - run ‘highlight-regexp’ to highlight search-string matches M-x isearchp-fontify-buffer-now - fontify whole buffer M-x isearchp-set-region-around-search-target - select last search <f1> b - list all Isearch key bindings <f1> k - show documentation of an Isearch key <f1> m - show documentation for Isearch mode M-k - cycle option ‘isearchp-drop-mismatch’ M-s c - toggle case-sensitivity (for current search or more: ‘C-u’) M-s h l - option ‘lazy-highlight-cleanup’ (removal of highlighting) C-+ - toggle searching invisible text M-s i - toggle searching invisible text, for current search or more M-s v - toggle option ‘isearchp-toggle-option-flag’ C-x n - 
toggle restricting search to active region C-SPC - toggle setting region around search target C-` - toggle quoting (escaping) of regexp special characters M-s w - toggle word-searching M-s _ - toggle symbol-searching M-s SPC - toggle whitespace matching A ‘SPC’ char normally matches all whitespace defined by variable ‘search-whitespace-regexp’. See also variables ‘isearch-lax-whitespace’ and ‘isearch-regexp-lax-whitespace’. Commands that Require Library ‘isearch-prop.el’ ----------------------------------------------- C-t - search for a character (overlay or text) property C-M-t - regexp-search for a character (overlay or text) property C-M-~ - toggle searching complements of normal search contexts C-M-S-d - toggle dimming non-search zones C-M-; - toggle ignoring comments for ‘isearchp-thing’ M-; - hide or (‘C-u’) show comments M-x isearchp-put-prop-on-region - add a text property to region M-x isearchp-add-regexp-as-property - add prop to regexp matches M-x isearchp-regexp-context-search - search regexp contexts M-x isearchp-regexp-define-contexts - define regexp contexts M-x isearchp-imenu - search Emacs-Lisp definitions M-x isearchp-imenu-command - search Emacs command definitions M-x isearchp-imenu-non-interactive-function - search non-commands M-x isearchp-imenu-macro - search Emacs-Lisp macro definitions M-x isearchp-thing - search THING search contexts M-x isearchp-thing-define-contexts - define THING contexts M-x isearchp-previous-visible-thing - go to previous visible THING M-x isearchp-next-visible-thing - go to next visible THING Input Methods ------------- If an input method is turned on in the current buffer, that input method is also active while you are typing characters to search. To toggle the input method, type C-\. It also toggles the input method in the current buffer. To use a different input method for searching, type C-^, and specify an input method you want to use. 
--- The above keys, bound in ‘isearch-mode-map’, are often controlled by user options - do <f1> C-a on search-.* to find them. If either option ‘isearch-allow-prefix’ or option ‘isearch-allow-scroll’ is non-nil then you can use a prefix arg with an Isearch key. If option ‘isearch-allow-scroll’ is non-nil then you can use scrolling keys without exiting Isearch. If these options are both nil then other control and meta chars terminate the search and are then used normally (depending on ‘search-exit-option’). Likewise for function keys and mouse button events. If this function is called non-interactively with nil argument NO-RECURSIVE-EDIT then it does not return to the calling function until the search is done. See function ‘isearch-mode’. Bindings in Isearch minor mode: ------------------------------ key binding --- ------- TAB .. C-j isearch-printing-char SPC .. ~ isearch-printing-char  .. ø¿½¿ isearch-printing-char .. ÿ isearch-printing-char C-g isearch-abort C-h isearch-mode-help RET isearch-exit C-o isearch-moccur C-q isearch-quote-char C-r isearch-repeat-backward C-s isearch-repeat-forward C-t isearchp-property-forward C-w isearch-yank-word-or-char C-x Prefix Command C-y Prefix Command C-z isearchp-yank-char ESC Prefix Command C-\ isearch-toggle-input-method C-^ isearch-toggle-specified-input-method C-_ isearchp-yank-symbol-or-char DEL isearch-delete-char S-SPC isearch-printing-char C-SPC isearchp-toggle-set-region C-( isearchp-yank-sexp-symbol-or-char C-+ isearchp-toggle-search-invisible C-; iedit-mode C-` isearchp-toggle-regexp-quote-yank C-S-SPC isearchp-narrow-to-lazy-highlights <C-M-return> isearchp-act-on-demand <C-M-tab> icicle-isearch-complete <C-end> goto-longest-line <M-S-delete> isearchp-cleanup <M-tab> icicle-isearch-complete <backtab> icicle-search-w-isearch-string <down-mouse-2> ignore <escape> Prefix Command <f1> Prefix Command <help> Prefix Command <mouse-2> isearch-mouse-2 <next> isearch-repeat-forward <prior> isearch-repeat-backward <remap> 
Prefix Command <return> isearch-exit <switch-frame> ignore C-x 8 Prefix Command C-x n isearchp-toggle-region-restriction C-x o isearchp-open-recursive-edit C-x r Prefix Command C-y C-c isearchp-yank-char C-y C-e isearchp-yank-line C-y C-w isearchp-yank-word-or-char C-y C-y isearch-yank-kill C-y ESC Prefix Command C-y C-_ isearchp-yank-symbol-or-char C-y C-( isearchp-yank-sexp-symbol-or-char C-y C-2 isearch-yank-secondary C-M-i icicle-isearch-complete C-M-l isearchp-remove-failed-part C-M-r isearch-repeat-backward C-M-s isearch-repeat-forward C-M-t isearchp-property-forward-regexp C-M-w isearch-del-char C-M-y isearch-yank-secondary ESC ESC Prefix Command M-% isearch-query-replace M-: isearchp-eval-sexp-and-insert M-; isearchp-toggle-hiding-comments M-O isearch-moccur-all M-c isearch-toggle-case-fold M-e isearch-edit-string M-g isearchp-retrieve-last-quit-search M-k isearchp-cycle-mismatch-removal M-n isearch-ring-advance M-o icicle-isearch-history-insert M-p isearch-ring-retreat M-r isearch-toggle-regexp M-s Prefix Command M-w isearchp-kill-ring-save M-y isearch-yank-pop C-M-S-d isearchp-toggle-dimming-outside-search-area C-M-% isearch-query-replace-regexp C-M-; isearchp-toggle-ignoring-comments C-M-` isearchp-toggle-literal-replacement C-M-~ isearchp-toggle-complementing-domain M-ESC ESC isearch-cancel M-s C-e isearch-yank-line M-s SPC isearch-toggle-lax-whitespace M-s ' isearch-toggle-character-fold M-s _ isearch-toggle-symbol M-s c isearch-toggle-case-fold M-s e isearch-edit-string M-s h Prefix Command M-s i isearch-toggle-invisible M-s o isearch-occur M-s r isearch-toggle-regexp M-s v isearchp-toggle-option-toggle M-s w isearch-toggle-word <escape> <tab> icicle-isearch-complete <remap> <isearch-complete> isearchp-complete <f1> C-h isearch-help-for-help <f1> ? 
isearch-help-for-help <f1> b isearch-describe-bindings <f1> k isearch-describe-key <f1> m isearch-describe-mode <f1> q help-quit <f1> <f1> isearch-help-for-help <f1> <help> isearch-help-for-help <help> C-h isearch-help-for-help <help> ? isearch-help-for-help <help> b isearch-describe-bindings <help> k isearch-describe-key <help> m isearch-describe-mode <help> q help-quit <help> <f1> isearch-help-for-help <help> <help> isearch-help-for-help C-x r g isearchp-append-register C-x 8 RET isearch-char-by-name C-y M-g isearchp-retrieve-last-quit-search C-y M-y isearch-yank-pop M-s h h hlt-highlight-isearch-matches M-s h l isearchp-toggle-lazy-highlight-cleanup M-s h r isearch-highlight-regexp M-s h u hlt-unhighlight-isearch-matches [-- Attachment #3: #throw-isearch-help.txt# --] [-- Type: application/octet-stream, Size: 11620 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes 2015-10-29 20:53 ` bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes Drew Adams @ 2015-10-30 7:47 ` Eli Zaretskii 0 siblings, 0 replies; 16+ messages in thread From: Eli Zaretskii @ 2015-10-30 7:47 UTC (permalink / raw) To: Drew Adams; +Cc: 21780 > Date: Thu, 29 Oct 2015 13:53:30 -0700 (PDT) > From: Drew Adams <drew.adams@oracle.com> > Cc: 21780@debbugs.gnu.org > > I think now that what I did earlier, when specified utf-8, I was > using my setup. When I try that again (with my setup), I see the > problem I reported. So the question now becomes what do you have in your setup that causes this. I'm guessing you do something that changes the defaults for encoding/decoding text. > Also, with my setup the *Warning* text is different. Instead of > providing lots of possible encoding choices and using > chinese-iso-8bit as the default (which is what I get with emacs > -Q - why is that, BTW?), it says: > > Select one of the safe coding systems listed below, > or cancel the writing with C-g and edit the buffer > to remove or modify the problematic characters, > or specify any other coding system (and risk losing > the problematic characters). > > raw-text no-conversion What it suggests depends on your customizations. > The full warning is this, when I use my setup (which uses font > "-outline-Lucida Console-normal-normal-normal-mono-14-*-*-*-c-*-iso8859-1"): The font has nothing to do with this. > > . If the curved quotes look like raw bytes or, worse, pairs of > > non-ASCII characters, you need to visit such file like this: > > > > C-x RET c utf-8 RET C-x C-f FILE-NAME RET > > Users should not have to do that. They have been doing that since Emacs 20.1. This is how you visit a file whose encoding Emacs cannot guess correctly. You just didn't have the pleasure of bumping into this problem until now. 
Another possibility is to visit the file normally, see that it wasn't decoded correctly, then type "C-x RET r utf-8 RET", which will revisit the file using the specified encoding. > > UTF-8 should have worked. I wouldn't expect you to see octal escapes > > after saving in UTF-8. > > See above for, I think, the explanation of what I did and saw. It doesn't. You have something in your customizations that runs afoul of your expectations (which do work in "emacs -Q"). ^ permalink raw reply [flat|nested] 16+ messages in thread
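For reference, the two key sequences mentioned in this message have rough Lisp equivalents. The following is only a minimal sketch, assuming the file was really written as UTF-8; the helper name `my-visit-as-utf-8' is hypothetical and not part of Emacs.

```elisp
;; Hypothetical helper (not part of Emacs): visit FILE, decoding it as UTF-8.
;; This is roughly what "C-x RET c utf-8 RET C-x C-f FILE RET" does.
(defun my-visit-as-utf-8 (file)
  "Visit FILE, forcing UTF-8 decoding for this one read."
  ;; Binding `coding-system-for-read' overrides Emacs's own guess
  ;; for any file read that happens inside the `let'.
  (let ((coding-system-for-read 'utf-8))
    (find-file file)))

;; Roughly what "C-x RET r utf-8 RET" does in a buffer that is already
;; visiting a file: re-read the file from disk with the given coding system.
;; (revert-buffer-with-coding-system 'utf-8)
```

Note that if some buffer is already visiting the file, `find-file' just switches to that buffer without re-reading it, so the second form is the one to use in that case.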
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes [not found] ` <<83k2q423x7.fsf@gnu.org> @ 2015-10-30 15:07 ` Drew Adams 2015-10-30 15:22 ` Eli Zaretskii [not found] ` <<17cf8a49-1cc4-4834-91ec-b7d092693ebf@default> 1 sibling, 1 reply; 16+ messages in thread From: Drew Adams @ 2015-10-30 15:07 UTC (permalink / raw) To: Eli Zaretskii, Drew Adams; +Cc: 21780 > > I think now that what I did earlier, when specified utf-8, I was > > using my setup. When I try that again (with my setup), I see the > > problem I reported. > > So the question now becomes what do you have in your setup that causes > this. I'm guessing you do something that changes the defaults for > encoding/decoding text. I have tons of stuff in my setup. Let me know what I might search for that might affect encoding/decoding. > > Also, with my setup the *Warning* text is different. Instead of > > providing lots of possible encoding choices and using > > chinese-iso-8bit as the default (which is what I get with emacs > > -Q - why is that, BTW?), it says: > > > > Select one of the safe coding systems listed below, > > or cancel the writing with C-g and edit the buffer > > to remove or modify the problematic characters, > > or specify any other coding system (and risk losing > > the problematic characters). > > > > raw-text no-conversion > > What it suggests depends on your customizations. Such as? Again, could it be affected by the default font choice? > > > UTF-8 should have worked. I wouldn't expect you to see octal escapes > > > after saving in UTF-8. > > > > See above for, I think, the explanation of what I did and saw. > > It doesn't. You have something in your customizations that runs afoul > of your expectations (which do work in "emacs -Q"). Please read again what I said about the crash. I think it at least explains why I saw octal escapes when I visited the last-saved file - it was saved with raw text. Only the auto-save file was saved with utf-8. 
When I visited the file in a new session, it showed the octal escapes because it was saved as raw text. Why that happened I'm not sure. Perhaps when I did `C-x C-w' the *Help* buffer was first saved as raw text (?), then autosaved as utf-8, then the crash interrupted finally resaving the file itself as utf-8 (?). You will need to give me an idea what to look for, if I am to try hunting for something in my init file and all that it loads. One thing I do is this, to get Unix-style line endings: (setq-default buffer-file-coding-system 'undecided-unix) And I do this: (setq process-coding-system-alist (cons '("bash" . (raw-text-dos . raw-text-unix)) process-coding-system-alist)) But I don't imagine that either of those is related to this. ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes 2015-10-30 15:07 ` Drew Adams @ 2015-10-30 15:22 ` Eli Zaretskii 0 siblings, 0 replies; 16+ messages in thread From: Eli Zaretskii @ 2015-10-30 15:22 UTC (permalink / raw) To: Drew Adams; +Cc: 21780 > Date: Fri, 30 Oct 2015 08:07:54 -0700 (PDT) > From: Drew Adams <drew.adams@oracle.com> > Cc: 21780@debbugs.gnu.org > > One thing I do is this, to get Unix-style line endings: > (setq-default buffer-file-coding-system 'undecided-unix) That's your problem, most probably. Can you try again after removing it? If that solves the problem, I can then tell you how to do what you want without disrupting encoding/decoding defaults. > And I do this: > (setq process-coding-system-alist > (cons '("bash" . (raw-text-dos . raw-text-unix)) > process-coding-system-alist)) This is not related, but it is also wrong. Why do you do that? ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes [not found] ` <<83si4sz8i5.fsf@gnu.org> @ 2015-10-30 16:02 ` Drew Adams 2015-10-30 16:17 ` Drew Adams 2015-10-30 20:50 ` Eli Zaretskii 0 siblings, 2 replies; 16+ messages in thread From: Drew Adams @ 2015-10-30 16:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 21780 [-- Attachment #1: Type: text/plain, Size: 1157 bytes --] > > (setq-default buffer-file-coding-system 'undecided-unix) > > That's your problem, most probably. Can you try again after removing > it? If that solves the problem, I can then tell you how to do what > you want without disrupting encoding/decoding defaults. > > > And I do this: > > (setq process-coding-system-alist > > (cons '("bash" . (raw-text-dos . raw-text-unix)) > > process-coding-system-alist)) > > This is not related, but it is also wrong. Why do you do that? Why are these things "wrong"? I do them as part of the setup to use Cygwin. I do them in `setup-cygwin.el', which is, incidentally, used by quite a few people AFAIK. http://www.emacswiki.org/emacs/download/setup-cygwin.el Anyway, I tried commenting out the first of those. That did change the text of the *Warning* buffer, so that it mentioned utf-8 as one of the possibilities. (Unfortunately, I still cannot get the file saved, because Emacs crashes. Again, the autosave file looks fine in a new session, and shows U(Unix) in the mode line.) Attached is a screenshot of the *Warning* text after commenting out that line. [-- Attachment #2: throw-emacs-C-x-C-w-help-buf.png --] [-- Type: image/png, Size: 52341 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes 2015-10-30 16:02 ` Drew Adams @ 2015-10-30 16:17 ` Drew Adams 2015-10-30 20:50 ` Eli Zaretskii 1 sibling, 0 replies; 16+ messages in thread From: Drew Adams @ 2015-10-30 16:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 21780 > Anyway, I tried commenting out the first of those. That did > change the text of the *Warning* buffer, so that it mentioned > utf-8 as one of the possibilities. (Unfortunately, I still > cannot get the file saved, because Emacs crashes. Again, the > autosave file looks fine in a new session, and shows U(Unix) > in the mode line.) > > Attached is a screenshot of the *Warning* text after commenting > out that line. Sorry, I was wrong about this and something I said earlier. Even with the line present, I do not see the problem I thought I saw earlier, of the *Warning* presenting only two encoding options, one of which was raw text. Dunno what build I saw that in. The screenshot I just sent is what I see even in the build I reported about (and it mentions utf-8). The rest of the problems are as I reported them. ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes 2015-10-30 16:02 ` Drew Adams 2015-10-30 16:17 ` Drew Adams @ 2015-10-30 20:50 ` Eli Zaretskii 2015-10-30 20:57 ` Eli Zaretskii 1 sibling, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2015-10-30 20:50 UTC (permalink / raw) To: Drew Adams; +Cc: 21780 > Date: Fri, 30 Oct 2015 09:02:15 -0700 (PDT) > From: Drew Adams <drew.adams@oracle.com> > Cc: 21780@debbugs.gnu.org > > > > (setq-default buffer-file-coding-system 'undecided-unix) > > > > That's your problem, most probably. Can you try again after removing > > it? If that solves the problem, I can then tell you how to do what > > you want without disrupting encoding/decoding defaults. > > > > > And I do this: > > > (setq process-coding-system-alist > > > (cons '("bash" . (raw-text-dos . raw-text-unix)) > > > process-coding-system-alist)) > > > > This is not related, but it is also wrong. Why do you do that? > > Why are these things "wrong"? Because they defeat some of the heuristics that decoding and encoding need to silently DTRT. Whoever wrote them didn't understand what she was doing, and most probably didn't understand what is the problem that needed to be solved. If you want to have Unix EOLs by default, the correct customization is this: (setq-default buffer-file-coding-system (coding-system-change-eol-conversion buffer-file-coding-system 'unix)) This modifies just the EOL type of the default encoding, leaving the rest intact. The other customization, for process-coding-system-alist, is a very bad idea, if your Bash can sometimes report non-ASCII strings. AFAIK, Cygwin nowadays uses UTF-8 as its encoding, so the correct customization would be to use utf-8 instead of raw-text there. > I do them as part of the setup to use Cygwin. I do them in > `setup-cygwin.el', which is, incidentally, used by quite a few > people AFAIK. > http://www.emacswiki.org/emacs/download/setup-cygwin.el That file needs this fixed ASAP. 
> Anyway, I tried commenting out the first of those. That did > change the text of the *Warning* buffer, so that it mentioned > utf-8 as one of the possibilities. (Unfortunately, I still > cannot get the file saved, because Emacs crashes. Again, the > autosave file looks fine in a new session, and shows U(Unix) > in the mode line.) The crash is some separate problem, it doesn't crash for me. Anyway, this is all tangential to the problem. After the file is saved as UTF-8, does visiting it display it correctly, after you correct your customizations as indicated above? If not, you will have to use "C-x RET c" or "C-x RET r", as I mentioned earlier. ^ permalink raw reply [flat|nested] 16+ messages in thread
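The suggestion in this message to "use utf-8 instead of raw-text" for `process-coding-system-alist' is not spelled out in the thread. One possible reading, as a sketch only: it assumes the Cygwin Bash in use really emits and accepts UTF-8, and it simply keeps the DOS/Unix EOL split of the original snippet, which the thread does not confirm as correct.

```elisp
;; Sketch under the assumptions stated above, not a confirmed fix.
;; Each element is (PATTERN . (DECODING . ENCODING)): the first coding
;; system decodes output read from the process, the second encodes
;; text sent to it.
(add-to-list 'process-coding-system-alist
             '("bash" . (utf-8-dos . utf-8-unix)))
```

Unlike raw-text, a utf-8 coding system decodes multibyte sequences into characters, so non-ASCII output from Bash would not show up as raw bytes.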
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes 2015-10-30 20:50 ` Eli Zaretskii @ 2015-10-30 20:57 ` Eli Zaretskii 2015-10-29 1:50 ` Drew Adams 0 siblings, 1 reply; 16+ messages in thread From: Eli Zaretskii @ 2015-10-30 20:57 UTC (permalink / raw) To: drew.adams; +Cc: 21780 > Date: Fri, 30 Oct 2015 22:50:57 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: 21780@debbugs.gnu.org > > If you want to have Unix EOLs by default, the correct customization is > this: > > (setq-default buffer-file-coding-system > (coding-system-change-eol-conversion > buffer-file-coding-system 'unix)) This variant is better: (setq-default buffer-file-coding-system (coding-system-change-eol-conversion default-buffer-file-coding-system 'unix)) ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes @ 2015-10-29 1:50 ` Drew Adams 2015-10-29 17:41 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: Drew Adams @ 2015-10-29 1:50 UTC (permalink / raw) To: 21780 emacs -Q M-x load-library isearch.el C-h f isearch-forward In buffer *Help*: C-x C-w foo.txt You get a coding-system warning. I tried saving it as utf-8 and as raw text. In both cases, when I open that file in a new Emacs session, I see octal escapes where there were curly quotes. Why were there curly quotes? Because `C-h f' produces curly quotes. This is a regression - no such problem exists with Emacs 24.5 (or prior). In order to produce a reasonable, readable file from the *Help* buffer that I attached to the following mail message to emacs-devel@gnu.org, I had to resort to using Emacs 24.5: "RE: Exposing Isearch toggleable options", 2015-10-28, ~21:20 (BTW, it is apparently *not* the case, in spite of what is stated at http://lists.gnu.org/archive/html/emacs-devel/2015-10/index.html, that the mailing list archive is updated every 30 minutes. Far from it, it seems. That's why I didn't provide a URL to the emacs-devel post. Got tired after 1/2 hour of waiting for it to show up.) In GNU Emacs 25.0.50.1 (i686-pc-mingw32) of 2015-10-09 Bzr revision: af45926d66d303fdc4c2c3ebbc820b4a54d9e4a0 Windowing system distributor `Microsoft Corp.', version 6.1.7601 Configured using: `configure --host=i686-pc-mingw32 --enable-checking=yes,glyphs' ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes 2015-10-29 1:50 ` Drew Adams @ 2015-10-29 17:41 ` Eli Zaretskii 2015-10-30 23:06 ` Andy Moreton 2015-10-31 18:10 ` Andy Moreton 2 siblings, 0 replies; 16+ messages in thread From: Eli Zaretskii @ 2015-10-29 17:41 UTC (permalink / raw) To: Drew Adams; +Cc: 21780-done > Date: Wed, 28 Oct 2015 18:50:49 -0700 (PDT) > From: Drew Adams <drew.adams@oracle.com> > > emacs -Q > M-x load-library isearch.el > C-h f isearch-forward > In buffer *Help*: C-x C-w foo.txt > > You get a coding-system warning. I tried saving it as utf-8 and as raw > text. > > In both cases, when I open that file in a new Emacs session, I see octal > escapes where there were curly quotes. Thanks, I fixed the first part of this: Emacs should no longer ask annoying questions when you save help buffers with curved quotes. The second part, which happens when visiting the saved file, is not a bug: you need to specify the encoding of files when visiting them in locales whose default encoding is different. (Actually, I expect this to work automatically for you, at least in "emacs -Q", but that doesn't happen in every locale.) ^ permalink raw reply [flat|nested] 16+ messages in thread
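To see why the save prompt appeared at all, and to sidestep it from Lisp, something like the following can be evaluated. It is a sketch only: it assumes a *Help* buffer currently exists (otherwise `with-current-buffer' signals an error), and "foo.txt" is just the file name from the recipe. The curved quotes are U+2018 and U+2019 (8216 and 8217 in the warning text), which Latin-1 cannot represent.

```elisp
;; Sketch: assumes a buffer named "*Help*" exists in this session.
(with-current-buffer "*Help*"
  ;; Position of the first character that Latin-1 cannot encode,
  ;; or nil if every character in the buffer is encodable.
  (message "First char not encodable in Latin-1: %S"
           (unencodable-char-position (point-min) (point-max)
                                      'iso-latin-1))
  ;; Write the buffer text as UTF-8 with Unix line endings.  Binding
  ;; `coding-system-for-write' overrides the buffer's own coding
  ;; system for this one write, so no coding-system prompt is needed.
  (let ((coding-system-for-write 'utf-8-unix))
    (write-region (point-min) (point-max) "foo.txt")))
```

This only covers writing; reading the file back in a locale whose default encoding is not UTF-8 may still need an explicit coding system, as the message above explains.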
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes 2015-10-29 1:50 ` Drew Adams 2015-10-29 17:41 ` Eli Zaretskii @ 2015-10-30 23:06 ` Andy Moreton 2015-10-31 7:28 ` Eli Zaretskii 2015-10-31 18:10 ` Andy Moreton 2 siblings, 1 reply; 16+ messages in thread From: Andy Moreton @ 2015-10-30 23:06 UTC (permalink / raw) To: 21780 On Fri 30 Oct 2015, Eli Zaretskii wrote: >> Date: Fri, 30 Oct 2015 22:50:57 +0200 >> From: Eli Zaretskii <eliz@gnu.org> >> Cc: 21780@debbugs.gnu.org >> >> If you want to have Unix EOLs by default, the correct customization is >> this: >> >> (setq-default buffer-file-coding-system >> (coding-system-change-eol-conversion >> buffer-file-coding-system 'unix)) > > This variant is better: > > (setq-default buffer-file-coding-system > (coding-system-change-eol-conversion > default-buffer-file-coding-system 'unix)) Why ? Help for `default-buffer-file-coding-system' says: This variable is obsolete since 23.2; use ‘buffer-file-coding-system’ instead. I think Drew's version was fine. AndyM ^ permalink raw reply [flat|nested] 16+ messages in thread
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes
From: Eli Zaretskii @ 2015-10-31 7:28 UTC
To: Andy Moreton; +Cc: 21780

> From: Andy Moreton <andrewjmoreton@gmail.com>
> Date: Fri, 30 Oct 2015 23:06:22 +0000
>
> On Fri 30 Oct 2015, Eli Zaretskii wrote:
>
> >> Date: Fri, 30 Oct 2015 22:50:57 +0200
> >> From: Eli Zaretskii <eliz@gnu.org>
> >> Cc: 21780@debbugs.gnu.org
> >>
> >> If you want to have Unix EOLs by default, the correct customization is
> >> this:
> >>
> >>   (setq-default buffer-file-coding-system
> >>                 (coding-system-change-eol-conversion
> >>                  buffer-file-coding-system 'unix))
> >
> > This variant is better:
> >
> >   (setq-default buffer-file-coding-system
> >                 (coding-system-change-eol-conversion
> >                  default-buffer-file-coding-system 'unix))
>
> Why?  Help for `default-buffer-file-coding-system' says:
>
>   This variable is obsolete since 23.2;
>   use ‘buffer-file-coding-system’ instead.

Then use (default-value 'buffer-file-coding-system) instead.  The point
being to use the default value to modify the default value.  When this
code runs, the current buffer should have the same value as its
buffer-local value, but I preferred not to rely on that.

> I think Drew's version was fine.

What Drew's version?  The one that unconditionally used undecided-unix
as the default?  I very much disagree.
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes
From: Andy Moreton @ 2015-10-31 18:10 UTC
To: 21780

On Sat 31 Oct 2015, Eli Zaretskii wrote:

>> From: Andy Moreton <andrewjmoreton@gmail.com>
>> Date: Fri, 30 Oct 2015 23:06:22 +0000
>>
>> On Fri 30 Oct 2015, Eli Zaretskii wrote:
>>
>> >> Date: Fri, 30 Oct 2015 22:50:57 +0200
>> >> From: Eli Zaretskii <eliz@gnu.org>
>> >> Cc: 21780@debbugs.gnu.org
>> >>
>> >> If you want to have Unix EOLs by default, the correct customization is
>> >> this:
>> >>
>> >>   (setq-default buffer-file-coding-system
>> >>                 (coding-system-change-eol-conversion
>> >>                  buffer-file-coding-system 'unix))
>> >
>> > This variant is better:
>> >
>> >   (setq-default buffer-file-coding-system
>> >                 (coding-system-change-eol-conversion
>> >                  default-buffer-file-coding-system 'unix))
>>
>> Why?  Help for `default-buffer-file-coding-system' says:
>>
>>   This variable is obsolete since 23.2;
>>   use ‘buffer-file-coding-system’ instead.
>
> Then use (default-value 'buffer-file-coding-system) instead.  The point
> being to use the default value to modify the default value.  When this
> code runs, the current buffer should have the same value as its
> buffer-local value, but I preferred not to rely on that.

I see your point.  So you prefer this:

  (setq-default buffer-file-coding-system
                (coding-system-change-eol-conversion
                 (default-value 'buffer-file-coding-system) 'unix))

Thanks for answering this - a useful addition to my init.el.

    AndyM
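
The distinction discussed above, between the value seen in the current
buffer and the default value, can be sketched as follows.  This is only
an illustration using stock Emacs functions; the coding system
`iso-latin-1-dos' is an arbitrary stand-in and not something taken from
the thread.

```elisp
;; `buffer-file-coding-system' becomes buffer-local when set, so the
;; value seen in the current buffer may differ from the default.
(with-temp-buffer
  (setq buffer-file-coding-system 'iso-latin-1-dos)  ; buffer-local only
  (list buffer-file-coding-system                    ; value in this buffer
        (default-value 'buffer-file-coding-system))) ; global default

;; `default-value' is a function, so its argument must be a quoted
;; symbol; unquoted, the variable's value would be passed instead.

;; Change only the EOL part of a coding system, keeping the text part.
(coding-system-change-eol-conversion 'utf-8-dos 'unix)
```

Evaluating the first form in *scratch* should show two values that need
not agree, which is why reading the default explicitly does not depend
on which buffer happens to be current when init code runs.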
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes
From: Drew Adams @ 2015-10-30 21:27 UTC
To: Eli Zaretskii; +Cc: 21780

> > If you want to have Unix EOLs by default, the correct customization is
> > this:
>
>   (setq-default buffer-file-coding-system
>                 (coding-system-change-eol-conversion
>                  default-buffer-file-coding-system 'unix))

OK.  Thx.
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes
From: Drew Adams @ 2015-10-29 17:58 UTC
To: Eli Zaretskii; +Cc: 21780-done

> > emacs -Q
> > M-x load-library isearch.el
> > C-h f isearch-forward
> > In buffer *Help*: C-x C-w foo.txt
> >
> > You get a coding-system warning.  I tried saving it as utf-8 and as raw
> > text.
> >
> > In both cases, when I open that file in a new Emacs session, I see octal
> > escapes where there were curly quotes.
>
> Thanks, I fixed the first part of this: Emacs should no longer ask
> annoying questions when you save help buffers with curved quotes.
>
> The second part, which happens when visiting the saved file, is not a
> bug: you need to specify the encoding of files when visiting them in
> locales whose default encoding is different.  (Actually, I expect this
> to work automatically for you, at least in "emacs -Q", but that
> doesn't happen in every locale.)

I guess I should interpret this as meaning that the bug is fixed (?).

But I don't understand the second part.  What do I need to change, as
a user, to get this to work as I would expect?

In Emacs, before saving, the buffer looks fine.

When visiting the resulting file it does not look right - it is
unreadable.  There are 3 octal escapes for each opening curly quote
and 3 of them for each closing curly quote.  That can amount to
quite a lot of noise.

Do I need to save the buffer using some other encoding?  If so, which?
Emacs proposed two encodings (one of which was raw text, which I tried);
and I tried also utf-8, which I would have thought would show curly
quotes OK.

I would think that Emacs would DTRT when opening the file, based on
the encoding used to save it.

Should users really need to do something special each time they
visit the file?  They've never had to do this before, for basic,
common *Help* output.

This still seems like a regression to me, as there is no such
annoyance in Emacs 24.5 or prior.  Then, Emacs did not use curly
quotes for `describe-*' command output, and saved *Help* buffers
were readable from the outset.

If readers have to jump through hoops (e.g. changing "locales"),
and there is no good fix for this regression in behavior, then I'd
suggest that maybe `describe-*' commands should not use curly quotes.

[Or could this perhaps be a font problem?  Might the default font
(e.g. on MS Windows) just need to be changed?]
* bug#21780: 25.0.50; Saving *Help* results in bad encoding because of curly quotes
From: Eli Zaretskii @ 2015-10-29 18:18 UTC
To: Drew Adams; +Cc: 21780

> Date: Thu, 29 Oct 2015 10:58:24 -0700 (PDT)
> From: Drew Adams <drew.adams@oracle.com>
> Cc: 21780-done@debbugs.gnu.org
>
> > > emacs -Q
> > > M-x load-library isearch.el
> > > C-h f isearch-forward
> > > In buffer *Help*: C-x C-w foo.txt
> > >
> > > You get a coding-system warning.  I tried saving it as utf-8 and as raw
> > > text.
> > >
> > > In both cases, when I open that file in a new Emacs session, I see octal
> > > escapes where there were curly quotes.
> >
> > Thanks, I fixed the first part of this: Emacs should no longer ask
> > annoying questions when you save help buffers with curved quotes.
> >
> > The second part, which happens when visiting the saved file, is not a
> > bug: you need to specify the encoding of files when visiting them in
> > locales whose default encoding is different.  (Actually, I expect this
> > to work automatically for you, at least in "emacs -Q", but that
> > doesn't happen in every locale.)
>
> I guess I should interpret this as meaning that the bug is fixed (?).

Yes, I think so.

> But I don't understand the second part.  What do I need to change, as
> a user, to get this to work as I would expect?

It might work as you expect already.  You can try this:

 . After "C-h f some-function RET", switch to the *Help* buffer and
   type "C-x RET f utf-8 RET", then save the buffer as in your
   recipe.

 . Now visit the file where you saved the *Help* buffer: if the
   curved quotes display correctly, then "it works for you".

 . If the curved quotes look like raw bytes or, worse, pairs of
   non-ASCII characters, you need to visit such a file like this:

     C-x RET c utf-8 RET C-x C-f FILE-NAME RET

> In Emacs, before saving, the buffer looks fine.

It looks fine, but the encoding mnemonic on the mode line is not "U"
(which stands for UTF-8), right?  That is why Emacs asks you for
encoding: it cannot save these characters using your locale's default
encoding (which is what the *Help* buffer uses by default).

> When visiting the resulting file it does not look right - it is
> unreadable.  There are 3 octal escapes for each opening curly quote
> and 3 of them for each closing curly quote.  That can amount to
> quite a lot of noise.

Yes, because you probably told Emacs to use raw-text or somesuch, when
it asked.

> Do I need to save the buffer using some other encoding?  If so, which?

Yes, you could tell it to use UTF-8 when it asked.  (After my change,
Emacs will do that automatically, no questions asked, when saving
*Help* buffers with curved quotes.)

> Emacs proposed two encodings (one of which was raw text, which I tried);
> and I tried also utf-8, which I would have thought would show curly
> quotes OK.

UTF-8 should have worked.  I wouldn't expect you to see octal escapes
after saving in UTF-8.

> I would think that Emacs would DTRT when opening the file, based on
> the encoding used to save it.

It cannot always do that.  To make sure it always does, there should
be a 'coding' cookie in the file or a file-local variable to the same
effect.  But you will have to add it manually; I don't think it's OK
for Emacs to insert such additions on its own, because Emacs has no
idea how the file will be used.

> Should users really need to do something special each time they
> visit the file?  They've never had to do this before, for basic,
> common *Help* output.

If you customize text-quoting-style to use ASCII characters for
quoting, Emacs will still behave as it did before: the file you
produce will be pure ASCII, so no decoding is necessary.

> If readers have to jump through hoops (e.g. changing "locales"),
> and there is no good fix for this regression in behavior, then I'd
> suggest that maybe `describe-*' commands should not use curly quotes.

Saving a *Help* buffer is not a frequent operation, and most users
nowadays live in UTF-8 locales anyway.  And even in some non-UTF-8
locales Emacs will succeed in displaying the file correctly when
visiting it, even without the need to type "C-x RET c".  So I think
this is not a catastrophe.

> [Or could this perhaps be a font problem?  Might the default font
> (e.g. on MS Windows) just need to be changed?]

No, it's not a font problem: Emacs did display those characters before
you saved the buffer, right?
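
The options mentioned in the message above can be sketched in Emacs
Lisp roughly as follows.  This is an illustration under assumptions,
not the change that was installed for this bug: the file name foo.txt
is taken from the original recipe, a *Help* buffer is assumed to exist
already, and `grave' is just one of the ASCII quoting styles.

```elisp
;; Save the *Help* buffer as UTF-8 without being prompted (similar in
;; effect to "C-x RET f utf-8 RET" followed by saving).
(with-current-buffer "*Help*"
  (let ((coding-system-for-write 'utf-8))
    (write-region nil nil "foo.txt")))

;; Visit the file while forcing UTF-8 decoding (counterpart of
;; "C-x RET c utf-8 RET C-x C-f foo.txt RET").
(let ((coding-system-for-read 'utf-8))
  (find-file "foo.txt"))

;; Alternatively, keep help text pure ASCII so that no decoding is
;; needed.  `grave' gives the traditional `like this' quoting.
(setq text-quoting-style 'grave)
```

A coding cookie, by contrast, lives in the saved file itself: a first
line containing `-*- coding: utf-8 -*-' tells Emacs how to decode the
file on every later visit, which is the manual addition the message
refers to.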