* Re: master 58a3c54: Avoid using string-make-unibyte in select.el
From: Lars Ingebrigtsen @ 2019-06-22 8:58 UTC
To: emacs-devel; +Cc: Eli Zaretskii

eliz@gnu.org (Eli Zaretskii) writes:

> +      (or (null (multibyte-string-p str))
> +          (setq str (encode-coding-string 'raw-text-unix str))))

Shouldn't that be

  (encode-coding-string str 'raw-text-unix)

?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* Re: master 58a3c54: Avoid using string-make-unibyte in select.el
From: Eli Zaretskii @ 2019-06-22 9:38 UTC
To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Eli Zaretskii <eliz@gnu.org>
> Date: Sat, 22 Jun 2019 10:58:58 +0200
>
> > +      (or (null (multibyte-string-p str))
> > +          (setq str (encode-coding-string 'raw-text-unix str))))
>
> Shouldn't that be
>
>   (encode-coding-string str 'raw-text-unix)

Ouch!  Of course!

Thanks for catching this; fixed.

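The argument order Lars points out can be sketched as below: `encode-coding-string` takes the string first and the coding system second. The helper name `my-ensure-unibyte` is hypothetical and is not taken from select.el; it only mirrors the shape of the quoted hunk.

```elisp
;; Hypothetical helper (not from select.el) illustrating the corrected call.
(defun my-ensure-unibyte (str)
  "Return STR as a unibyte string.
A multibyte STR is encoded with `raw-text-unix'; a unibyte STR is
returned unchanged."
  (if (multibyte-string-p str)
      ;; STRING comes first, CODING-SYSTEM second.
      (encode-coding-string str 'raw-text-unix)
    str))

(let* ((bytes (unibyte-string 200 97 98))    ; raw byte 200 followed by "ab"
       (multi (string-to-multibyte bytes)))  ; same bytes, multibyte representation
  (list (multibyte-string-p multi)
        (multibyte-string-p (my-ensure-unibyte multi))
        (equal (my-ensure-unibyte multi) bytes)))
```

With the arguments swapped, as in the original hunk, the call would pass a symbol where a string is expected.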
* Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
From: Stefan Monnier @ 2019-06-22 13:26 UTC
To: emacs-devel; +Cc: Eli Zaretskii

> +      (or (null (multibyte-string-p str))
> +          (setq str (encode-coding-string 'raw-text-unix str))))

Isn't this the same as (setq str (string-to-unibyte str))?


        Stefan

* Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
From: Eli Zaretskii @ 2019-06-22 13:42 UTC
To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>
> Date: Sat, 22 Jun 2019 09:26:38 -0400
>
> > +      (or (null (multibyte-string-p str))
> > +          (setq str (encode-coding-string 'raw-text-unix str))))
>
> Isn't this the same as (setq str (string-to-unibyte str))?

No, because the former doesn't signal an error.  (And I didn't want to
use any of those string-to/as-uni/multibyte functions anyway.)

The only thing we are supposed to do in the multibyte case is to make
sure the raw bytes are converted to their single-byte representation,
which is exactly what raw-text-unix does.

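The distinction Eli draws here can be seen with a string that contains a character which is neither ASCII nor a raw byte. This is a minimal sketch; the function name `my-compare-unibyte-conversions` is made up for illustration.

```elisp
;; Hypothetical helper comparing the two conversions discussed above.
(defun my-compare-unibyte-conversions (str)
  "Return a list describing how STR fares with each conversion.
The first element is `signaled' if `string-to-unibyte' signals an
error, else its result.  The second element is non-nil if encoding
with `raw-text-unix' yields a multibyte string."
  (list
   ;; string-to-unibyte refuses characters that are not bytes.
   (condition-case nil
       (string-to-unibyte str)
     (error 'signaled))
   ;; Encoding returns a unibyte string instead of signaling.
   (multibyte-string-p (encode-coding-string str 'raw-text-unix))))

;; (string #x3bb) is a one-character multibyte string (GREEK SMALL LETTER LAMDA).
(my-compare-unibyte-conversions (string #x3bb))
```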
* Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
From: Stefan Monnier @ 2019-06-22 16:44 UTC
To: Eli Zaretskii; +Cc: emacs-devel

>> > +      (or (null (multibyte-string-p str))
>> > +          (setq str (encode-coding-string 'raw-text-unix str))))
>> Isn't this the same as (setq str (string-to-unibyte str))?
> No, because the former doesn't signal an error.

Oh, right, that's yet another subtle distinction between all
those alternatives.

BTW, do we actually need to convert to unibyte here?
(In most places where we expect a unibyte string, we silently convert
from multibyte when needed, in a way that's basically equivalent to
the above.)

> (And I didn't want to use any of those string-to/as-uni/multibyte
> functions anyway.)

I hated those functions and still do for the string-as and string-make
variety, but I'm beginning to like the string-to variety when we need to
convert the representation of a sequence of *bytes* without
encoding/decoding them as chars.

So maybe the present case argues for adding a `no-error` argument to
string-to-unibyte.

I say this because to me (encode-coding-string 'raw-text-unix str)
is an oxymoron since `raw-text-unix` is a synonym of `binary` and
`no-conversion`, which basically says "don't do any encoding/decoding,
instead preserve bytes as bytes".

IOW coding-systems like `raw-text` make sense in places like the
`coding:` tag or in buffer-file-coding-system, where we are forced to
put some kind of coding-system and where it is hence handy to be able to
use `raw-text-unix` to basically skip the en/decoding.
But I find them confusing when passed as a constant to
`en/decode-coding-string`.

> The only thing we are supposed to do in the multibyte case is to make
> sure the raw bytes are converted to their single-byte representation,
> which is exactly what raw-text-unix does.

Right (and indeed string-make-unibyte worked in practice for the same
reason that encoding with pretty much any coding-system preserves the
bytes as well, save for a few exceptions like utf-8-emacs).


        Stefan

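The `no-error` idea floated in this message could look roughly like the wrapper below. This is only a sketch under assumptions: the name `my-string-to-unibyte` is hypothetical, `string-to-unibyte` itself is assumed to accept just the string, and the fallback to `raw-text-unix` encoding is one possible choice, since the thread does not say what a no-error variant should return for characters that are not bytes.

```elisp
;; Hypothetical wrapper; not an existing Emacs function.
(defun my-string-to-unibyte (str &optional no-error)
  "Convert STR to its unibyte representation.
If NO-ERROR is non-nil and STR holds characters that are not bytes,
fall back to encoding with `raw-text-unix' instead of signaling.
The fallback is an assumption made for this sketch."
  (if no-error
      (condition-case nil
          (string-to-unibyte str)
        (error (encode-coding-string str 'raw-text-unix)))
    (string-to-unibyte str)))

;; Raw bytes convert either way; a real character needs NO-ERROR.
(my-string-to-unibyte (string-to-multibyte (unibyte-string 200 97)))
(my-string-to-unibyte (string #x3bb) t)
```

Compared with calling `encode-coding-string` directly, the name at the call site states that the intent is a change of representation rather than an encoding step, which is the readability point made above.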
* Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
From: Eli Zaretskii @ 2019-06-22 16:57 UTC
To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sat, 22 Jun 2019 12:44:05 -0400
>
> So maybe the present case argues for adding a `no-error` argument to
> string-to-unibyte.

What is the use case for string-to-unibyte that cannot be satisfied by
encoding with raw-text/binary, if we also don't signal an error?

> I say this because to me (encode-coding-string 'raw-text-unix str)
> is an oxymoron since `raw-text-unix` is a synonym of `binary` and
> `no-conversion`, which basically says "don't do any encoding/decoding,
> instead preserve bytes as bytes".

For reasons of avoiding mental overload, I prefer not to use
no-conversion where in fact there is a conversion.  That's why I
didn't use 'binary' in this case.

> IOW coding-systems like `raw-text` make sense in places like the
> `coding:` tag or in buffer-file-coding-system, where we are forced to
> put some kind of coding-system and where it is hence handy to be able to
> use `raw-text-unix` to basically skip the en/decoding.
> But I find them confusing when passed as a constant to
> `en/decode-coding-string`.

It's the other way around here.

* Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
From: Stefan Monnier @ 2019-06-23 2:45 UTC
To: Eli Zaretskii; +Cc: emacs-devel

>> So maybe the present case argues for adding a `no-error` argument to
>> string-to-unibyte.
> What is the use case for string-to-unibyte that cannot be satisfied by
> encoding with raw-text/binary, if we also don't signal an error?

The use case is clear code that says explicitly that this chunk of code
is not trying to convert between chars and bytes but only to convert
between two representations of a sequence of bytes.

It's also code that clearly does the reverse of string-to-multibyte
(whereas decode-coding-string doesn't do the reverse of
encode-coding-string when it comes to `raw-text`).

>> I say this because to me (encode-coding-string 'raw-text-unix str)
>> is an oxymoron since `raw-text-unix` is a synonym of `binary` and
>> `no-conversion`, which basically says "don't do any encoding/decoding,
>> instead preserve bytes as bytes".
>
> For reasons of avoiding mental overload, I prefer not to use
> no-conversion where in fact there is a conversion.

I also hate `no-conversion`.  But for the same reason I dislike
`raw-text` because the name gives me no intuition and since it is about
preserving bytes rather than characters, it doesn't have much to do
with "text".

> That's why I didn't use 'binary' in this case.

Binary doesn't say what the conversion does, indeed, but it does say
that it applies to binary (rather than text) contents, so I find its
name does provide the needed intuition.

>> IOW coding-systems like `raw-text` make sense in places like the
>> `coding:` tag or in buffer-file-coding-system, where we are forced to
>> put some kind of coding-system and where it is hence handy to be able to
>> use `raw-text-unix` to basically skip the en/decoding.
>> But I find them confusing when passed as a constant to
>> `en/decode-coding-string`.
>
> It's the other way around here.

I don't know what "other way around" means in this context.


        Stefan

* Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
From: Eli Zaretskii @ 2019-06-23 14:26 UTC
To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sat, 22 Jun 2019 22:45:07 -0400
>
> >> So maybe the present case argues for adding a `no-error` argument to
> >> string-to-unibyte.
> > What is the use case for string-to-unibyte that cannot be satisfied by
> > encoding with raw-text/binary, if we also don't signal an error?
>
> The use case is clear code that says explicitly that this chunk of code
> is not trying to convert between chars and bytes but only to convert
> between two representations of a sequence of bytes.

(a) I wouldn't call anything related to string-to-unibyte "clear",
because the act of converting a string to unibyte is not well defined.
(b) Encoding text can also be defined as "converting between two
representations of a sequence of bytes".

> It's also code that clearly does the reverse of string-to-multibyte
> (whereas decode-coding-string doesn't do the reverse of
> encode-coding-string when it comes to `raw-text`).

I think decode-coding-string does do the reverse.

> >> IOW coding-systems like `raw-text` make sense in places like the
> >> `coding:` tag or in buffer-file-coding-system, where we are forced to
> >> put some kind of coding-system and where it is hence handy to be able to
> >> use `raw-text-unix` to basically skip the en/decoding.
> >> But I find them confusing when passed as a constant to
> >> `en/decode-coding-string`.
> >
> > It's the other way around here.
>
> I don't know what "other way around" means in this context.

It means that our preferences in this case are opposite.

* Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
From: Stefan Monnier @ 2019-06-24 3:48 UTC
To: Eli Zaretskii; +Cc: emacs-devel

> (a) I wouldn't call anything related to string-to-unibyte "clear",
> because the act of converting a string to unibyte is not well defined.
> (b) Encoding text can also be defined as "converting between two
> representations of a sequence of bytes".

No, encoding and decoding change the bytes, whereas string-to preserves
the bytes (just once represented as a unibyte string (which can never
be anything else than a sequence of bytes) and the other as a sequence
of chars (some of which stand for bytes)).

A sequence of bytes can be represented in many different ways:
- a unibyte string is the canonical way (because it can only do that, so
  when you receive such a thing you don't need to look for possible
  non-bytes in the sequence or for a non-proper sequence).
- a vector of integers between 0 and 255.
- a list of integers between 0 and 255.
- a multibyte string with chars within the union of the ascii charset and
  the eight-bit charset.

string-to lets you convert a given sequence of bytes between the first
and the last.

>> It's also code that clearly does the reverse of string-to-multibyte
>> (whereas decode-coding-string doesn't do the reverse of
>> encode-coding-string when it comes to `raw-text`).
> I think decode-coding-string does do the reverse.

No: decode-coding-string returns a unibyte string when called with
`raw-text` or `binary`, contrary to string-to-multibyte.

>> >> IOW coding-systems like `raw-text` make sense in places like the
>> >> `coding:` tag or in buffer-file-coding-system, where we are forced to
>> >> put some kind of coding-system and where it is hence handy to be able to
>> >> use `raw-text-unix` to basically skip the en/decoding.
>> >> But I find them confusing when passed as a constant to
>> >> `en/decode-coding-string`.
>> > It's the other way around here.
>> I don't know what "other way around" means in this context.
> It means that our preferences in this case are opposite.

AFAIK using `raw-text` or `no-conversion` in auto-coding-alist or in
`coding:` tags is not a matter of preference: you simply can't specify
string-to-*byte in there.


        Stefan

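The four representations of a byte sequence listed in this message can be produced from one another with standard functions. The sketch below is illustrative only; `my-byte-representations` is a made-up name, and it deliberately leaves out `decode-coding-string`, whose behavior with `raw-text` is the point under dispute in the thread.

```elisp
;; Hypothetical helper collecting the representations named in the message.
(defun my-byte-representations (bytes)
  "Return a plist of representations of the unibyte string BYTES."
  (let ((multi (string-to-multibyte bytes)))
    (list :unibyte bytes
          :vector (vconcat bytes)       ; vector of integers 0..255
          :list (append bytes nil)      ; list of integers 0..255
          :multibyte multi              ; ASCII plus eight-bit chars
          ;; string-to-unibyte takes the multibyte form back to the same bytes.
          :round-trip (equal (string-to-unibyte multi) bytes))))

(my-byte-representations (unibyte-string 200 97))
```

The `:round-trip` entry checks the property Stefan attributes to the string-to pair: going from the unibyte form to the multibyte form and back yields the original bytes.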