From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kaushal Modi
Subject: Ox-html: Replace <b> with <strong> and <i> with <em>
Date: Tue, 23 Oct 2018 20:38:37 -0400
To: emacs-org list

Hello,

I am not an HTML expert. But recently off-list, I learnt that <b> and <i> tags aren't recommended to be used for styling any more (for a while now).

Instead <strong> and <em> should be used respectively.

If there are no objections, I can commit this little change to the master branch.

References:

- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/b
- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/i#Usage_Notes

--
Kaushal Modi

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nicolas Goaziou
Subject: Re: Ox-html: Replace <b> with <strong> and <i> with <em>
Date: Wed, 24 Oct 2018 08:04:37 +0200
Message-ID: <87r2gfyj62.fsf@nicolasgoaziou.fr>
In-Reply-To: (Kaushal Modi's message of "Tue, 23 Oct 2018 20:38:37 -0400")
To: Kaushal Modi
Cc: emacs-org list

Hello,

Kaushal Modi writes:

> I am not an HTML expert. But recently off-list, I learnt that <b> and <i> tags aren't recommended to be used for styling any more (for a while now).
>
> Instead <strong> and <em> should be used respectively.
>
> If there are no objections, I can commit this little change to the master branch.
>
> References:
>
> - https://developer.mozilla.org/en-US/docs/Web/HTML/Element/b
> - https://developer.mozilla.org/en-US/docs/Web/HTML/Element/i#Usage_Notes

No objection from me. Thank you!

Regards,

--
Nicolas Goaziou

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kaushal Modi
Subject: Re: Ox-html: Replace <b> with <strong> and <i> with <em>
Date: Wed, 24 Oct 2018 11:14:48 -0400
References: <87r2gfyj62.fsf@nicolasgoaziou.fr>
In-Reply-To: <87r2gfyj62.fsf@nicolasgoaziou.fr>
To: emacs-org list

On Wed, Oct 24, 2018 at 2:04 AM Nicolas Goaziou wrote:
>
> No objection from me. Thank you!

Actually, before making this change, I started reading up on the HTML5 spec on the b, strong, i, em tags, and now I am as confused as ever.

Facts:

- b and i are not deprecated
- b and strong are both valid but their use depends on the writer's context (but Org mode has just one mark for either "*")
- i and em are both valid but their use depends on the writer's context (but Org mode has just one mark for either "/").

From "em" docs[em], in the NOTE section there:

> The em element isn’t a generic "italics" element. Sometimes, text is intended to stand out from the rest of the paragraph, as if it was in a different mood or voice. For this, the i element is more appropriate.

See the b tag docs[b] and i tag docs[i], and this W3C FAQ on using b and i tags[faq] for more.

*Summary* (/see what I did there?/):

I guess there's no need to change what "*" and "/" do right now in ox-html, as there doesn't seem to be "one right way" to do things here.

And folks strongly wanting to use <strong> and <em> for bold and italic can customize org-html-text-markup-alist.

HTML experts, please chime in.

[em]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-em-element
[b]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-b-element
[i]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-i-element
[faq]: https://www.w3.org/International/questions/qa-b-and-i-tags

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tim Cross
Subject: Re: Ox-html: Replace <b> with <strong> and <i> with <em>
Date: Thu, 25 Oct 2018 08:00:07 +1100
Message-ID: <87in1rkqlk.fsf@gmail.com>
References: <87r2gfyj62.fsf@nicolasgoaziou.fr>
To: Kaushal Modi
Cc: emacs-org list

Kaushal Modi writes:

> On Wed, Oct 24, 2018 at 2:04 AM Nicolas Goaziou wrote:
>>
>> No objection from me. Thank you!
>
> Actually, before making this change, I started reading up on the HTML5 spec on the b, strong, i, em tags, and now I am as confused as ever.
>
> Facts:
>
> - b and i are not deprecated
> - b and strong are both valid but their use depends on the writer's context (but Org mode has just one mark for either "*")
> - i and em are both valid but their use depends on the writer's context (but Org mode has just one mark for either "/").
>
> From "em" docs[em], in the NOTE section there:
>
>> The em element isn’t a generic "italics" element. Sometimes, text is intended to stand out from the rest of the paragraph, as if it was in a different mood or voice. For this, the i element is more appropriate.
>
> See the b tag docs[b] and i tag docs[i], and this W3C FAQ on using b and i tags[faq] for more.
>
> *Summary* (/see what I did there?/):
>
> I guess there's no need to change what "*" and "/" do right now in ox-html, as there doesn't seem to be "one right way" to do things here.
>
> And folks strongly wanting to use <strong> and <em> for bold and italic can customize org-html-text-markup-alist.
>
> HTML experts, please chime in.
>
> [em]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-em-element
> [b]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-b-element
> [i]: https://www.w3.org/TR/html5/textlevel-semantics.html#the-i-element
> [faq]: https://www.w3.org/International/questions/qa-b-and-i-tags

I'll start by stating I'm definitely not an HTML expert.

I do believe we should move away from b/i to strong/em, as I think these are the correct semantic tags to use and are generally what is preferred. This means they are also likely to already have appropriate 'styling' in many 'canned' styles and valid, consistent interpretations for different media types.

The problem with b and i is that they specify how rather than what, and don't always make sense for all possible media types. For example, what does 'bold' or 'italic' mean for a screen reader?

I don't think this is something that is urgent, but it is the direction we should go. The only real reason for sooner rather than later is that we can probably simplify some of the exporters and ensure any new exporters are correct and won't need to be changed retrospectively.

Tim

--
Tim Cross

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Garreau, Alexandre"
Subject: *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>]
Date: Fri, 26 Oct 2018 07:24:00 +0200
Message-ID: <87a7n1i8lr.fsf_-_@portable.galex-713.eu>
References: <87r2gfyj62.fsf@nicolasgoaziou.fr> <87in1rkqlk.fsf@gmail.com>
In-Reply-To: <87in1rkqlk.fsf@gmail.com> (Tim Cross's message of "Thu, 25 Oct 2018 08:00:07 +1100")
To: Tim Cross
Cc: emacs-org list, Kaushal Modi

Sorry, I just found this thread, interesting (to me), that I shouldn’t have let go:

On 2018-10-25 at 08:00, Tim Cross wrote:
> Kaushal Modi writes:
>> […]
>> - b and i are not deprecated
>> - b and strong are both valid but their use depends on the writer's context (but Org mode has just one mark for either "*")
>> - i and em are both valid but their use depends on the writer's context (but Org mode has just one mark for either "/").
>>
>> […]
>>
>> From "em" docs[em], in the NOTE section there:
>>> The em element isn’t a generic "italics" element. Sometimes, text is intended to stand out from the rest of the paragraph, as if it was in a different mood or voice. For this, the i element is more appropriate.
>>
>> […]
>>
>> I guess there's no need to change what "*" and "/" do right now in ox-html, as there doesn't seem to be "one right way" to do things here.
>>
>> And folks strongly wanting to use <strong> and <em> for bold and italic can customize org-html-text-markup-alist.
>>
>> HTML experts, please chime in.
>
> I'll start by stating I'm definitely not an HTML expert.

I don’t exactly know what an expert is; at least I’m not a professional, but I have spent some time figuring out the semantic meaning of various HTML specs.

More specifically, I have a big interest in semantics and typography, and have spent a lot of time, on my now deleted, recreated, then lost GitHub account and by mail, trying to convince people to switch to more semantic markup (oh, and to use complex CSS selectors rather than classes, and stop using
and at all) and better typography (such as curly quotes, single quotes inside quotes, and many things specific to French).

> The problem with b and i is that they specify how rather than what and don't always make sense for all possible media types. For example, what does 'bold' or 'italic' mean for a screen reader?

Italic is often pronounced with a different pitch, as far as I remember. Bold is probably pronounced differently too, but I don’t recall how. I need to recheck with Orca and Firefox add-ons (I’ll do that for a next mail). This might differ across screen readers, so I might have to find a friend with a Windows computer running NVDA, JAWS or some other non-free program to either ask or check.

I believe the most correct handling for screen readers would be to use the appropriate language from the attribute lang or xml:lang of tag, otherwise slower and slightly higher pitch, and for the exact same higher pitch as caps, without changing speed, plus adding it to an easily reachable “keyword-list”, just as .

FYI: italic, bold and underline were all invented in typography as special ways of *purposely* making text harder to read. Both the intent and the result are that a reader who takes more time to read something in italic, for instance, will memorize it better and have more time to think about it, hence increasing the importance of that something.

In the following, “from far” means when you look at the whole document and are not focused on reading a particular part of it. It doesn’t mean that you are at a far distance and can still read it, as is the case for uppercase.

Italic is the best way, the most readable, as it’s only seen when reading, near the text, but not “from far”, and doesn’t break structure, flow, or “typographic grey” (“gris typographique”, I’m not aware of the English term).
It is hence commonly used for emphasis (the best usage: if it ever gets long, it gets hard to read, but that reflects the fact that the original meaning was hard to grasp, hear or say in the first place), citation of the names of artistic works (such as books: a conventional usage, but still okay, as these are mostly short anyway), and quotations (a discouraged usage, as they can get long (and thus unreadable) and quote marks already cover this; italic is *not* to be used *along* with them, ever, as it is terribly redundant and almost no serious professional printer does that).

Bold is sometimes harder to read and sometimes, if not too bold, easier; however, it’s really easy to “notice” bold text when looking from afar. It is therefore normally recommended *exclusively* for text structures, whose *role* is to purposely cut the text into parts, that is: *outlines*. However, in an attempt at pseudo-backward compatibility and “but look, everybody was okay since the beginning”, the W3C found another usage for bold besides outlines: keywords. These are *meant* to be seen from far, are usually small (one word), and yet wouldn’t alter text structure, and might not be candidates for (however most of the time they should).

Underline is to be banned from everywhere, theoretically. It is an especially simple and awful way of making text unreadable: it cuts the legs of non-zero-ascent letters (making it as hard to read as italic) *and* is easy to spot from far; yet you notice the underline without grasping the word easily and quickly when seen from far, as you would with bold. If I recall correctly, it was invented for typewriters because italic wasn’t available, a role for which it is the poorest candidate ever. It is also used in handwritten text, as few people nowadays actually try to handwrite in italics or bold, and others are often unable to do so. Most of the time I have seen it used in handwriting to annotate and highlight text.

Conventions have developed around this: in typewritten as well as handwritten text, you normally use it *only* for the names of artistic works (instead of italic), and for blue hyperlinks. It is sad that it has become such an important convention, but it is done, clear, and well established.

The W3C meaning of “added text” seems somewhat artificial to me, as it is no more conventional to use it for “added changes” than any other typographic convention. However, it is necessarily *one of these*, as it is commonly used to highlight and annotate text (however the tag is here for that, in HTML).

> I do believe we should move away from b/i to strong/em as I think these are the correct semantic tags to use and are generally what is preferred. This means they are also likely to already have appropriate 'styling' in many 'canned' styles and valid consistent interpretations for different media types.

This is unsemantic (and is giving org markup a presentational rather than semantic role, so I strongly oppose this) and could break true accessibility. I’d say ideally what we should have is more markup to be compatible with HTML, as recently, with XHTML1, 2 and HTML5, it has become one of the richest and most clearly defined markup languages available.
However, as Org, like Markdown and reStructuredText, is trying to achieve some compatibility with classical plain-text markup, such as in email, and from the semantics I have detected, I’d say the following:

- tag “*” with , maybe find cases where “” might be appropriate (for keywords, typically): I’d say an interesting experiment would be, for some given languages (such as English, to begin with), to detect whether an article (“the”, “a”, “an”…) is part of the markup: then it’s not a keyword (hence ); if it’s *preceding* the markup, then more probably it is a markup (but not necessarily);
- tag “/” with , as this matches the most accurate and common meaning of “/”; “_” might be appropriate as well, but may be redundant (so a safe custom variable, potentially usable as buffer-local, would do better). However, there are some cases where “/” would be more appropriate as (I’d say the vast majority of occurrences are words from foreign languages; the others are most often incorrect and abusive uses of “/”);
- tag “_” as either , if the relevant variable has the right value, or , *only* if near “+” markup. Otherwise, as Org only uses “[]” for hyperlinks, I don’t know.

Note that, indeed, “” has no usage. If it were up to me, it would be banned.
Maybe its most accurate usage would be for uppercase urgent emphasis text: *URGENT: READ THIS NOW OR YOU WILL DIE* (you might use if absolutely wanting to, for uppercase emphasis text, or emphasis text containing “urgent:” or “important:”, and differently localized versions (format-level linguistic imperialism, bla bla: note that for the very same reason this would work as is for French, but I and many other people would, funnily, feel more reassured, respected or whatever if they were blessed by being in a list whose car is "fr")).

> I don't think this is something that is urgent, but it is the direction we should go. The only real reason for sooner rather than later is that we can probably simplify some of the exporters and ensure any new exporters are correct and won't need to be change retrospectively.

This has to be semantics work carried over to *all* semantic backends. As there are “accessibility” workarounds for almost all formats (even PDF, which is understandable as it became important and widely used, while normally meant only for printing, hence display, not semantics (but you know, these days, you can put JavaScript in these…)), this may mean “every backend”.

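For reference, the customization route mentioned earlier in the thread (org-html-text-markup-alist) could look roughly like the sketch below. It assumes the variable keeps its (MARKUP . FORMAT-STRING) shape with bold and italic as keys, and that ox-html takes the first matching entry, so prepended entries shadow the defaults. Both assumptions are worth checking against the ox-html.el shipped with your Org version.

```elisp
;; A sketch, not a tested patch: export "*bold*" as <strong> and
;; "/italic/" as <em>, leaving every other markup entry untouched.
;; Assumption: ox-html looks entries up with `assq', so the first
;; matching entry in the alist wins.
(with-eval-after-load 'ox-html
  (setq org-html-text-markup-alist
        (append '((bold . "<strong>%s</strong>")
                  (italic . "<em>%s</em>"))
                org-html-text-markup-alist)))
```

With this in an init file, a fragment such as *bold* and /italic/ should export with the semantic tags, while anyone who prefers <b> and <i> simply leaves the variable alone. Prepending rather than mutating the list in place keeps the default entries available.
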
From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tim Cross
Subject: Re: *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>]
Date: Sat, 27 Oct 2018 07:15:30 +1100
Message-ID: <87ftwslb19.fsf@gmail.com>
References: <87r2gfyj62.fsf@nicolasgoaziou.fr> <87in1rkqlk.fsf@gmail.com> <87a7n1i8lr.fsf_-_@portable.galex-713.eu>
In-Reply-To: <87a7n1i8lr.fsf_-_@portable.galex-713.eu>
To: "Garreau, Alexandre"
Cc: emacs-org list, Kaushal Modi

Garreau, Alexandre writes:

> On 2018-10-25 at 08:00, Tim Cross wrote:
>> The problem with b and i is that they specify how rather than what and don't always make sense for all possible media types. For example, what does 'bold' or 'italic' mean for a screen reader?
>
> Italic means often pronounced with a different pitch afair. Bold probably means pronounced differently too but I don’t know how this is pronounced iirc.
>
> […]
>
> Fyi: both italic, bold, and underline, have been invented in typography as special ways of *purposely* making text harder to read.
>
> […]
>
>> I do believe we should move away from b/i to strong/em as I think these are the correct semantic tags to use and are generally what is preferred.
>
> This is unsemantic (and is giving org markup a presentational rather than semantic role, so I strongly oppose this) and could break true accessibility. I’d say ideally what we should have is more markup to be compatible with HTML, as recently, with XHTML1, 2 and HTML5, it has become one of the richer and most clearly defined markup language available.
>
> […]

I have either misunderstood most of your position or I simply disagree with it - I'm not sure which.

- Much of what you argue seems to be based around ideas associated with typography. IMO this is where things fall down. Typography is really only relevant to 'printing' (either on paper or screen). Markup is not just about printing - it is about conveying what the author wanted, and how that is best interpreted will depend on the media being used (i.e. how the content is 'rendered') and should largely be up to the consumer.

- I am a screen reader user.
  While you are correct that pitch, tone, speed and different voices
  are often used to convey things like 'bold' or 'italic', there is no
  universally accepted rule for this interpretation, at least not in
  the same sense as there is with typography. We all know what bold or
  italic looks like, but there is no agreement as to what these should
  sound like. When you use Jaws, you will get a different result from
  when you use Orca or Emacspeak or Window Eyes or .... However, this
  shouldn't really matter - how these are 'rendered' should ideally be
  under the control of the individual consuming the content. When I
  consume a document, it should be my decision as to how the content is
  presented and for me, interpreting 'strong' or 'emphasis' seems to be
  far clearer than 'bold' or 'italic'.

- I don't believe there is any strong reason that the markup used by org
  should have any strong reference to HTML in appearance. Org supports
  many different backends, many of which don't have anything to do with
  HTML at all. It is perhaps unfortunate that Org syntax and markdown
  are quite different (though I feel the unfortunate part is that
  markdown didn't follow org more closely as I much prefer Org's syntax
  to most markdown semantics).

- Probably the number 1 issue I come across when dealing with markup is
  the expectation too many authors have that things will be rendered in
  the browser in a specific way (a particular font, colour, position,
  size, etc). This is a mistake. The big advantage of electronic
  presentation is that for the first time, the consumer can have control
  over the presentation - they can customise it to meet their
  requirements or preferences. The problem with <b> and <i> is that it
  gives authors an expectation their content will be rendered in a
  specific way. Some may argue that the author should be able to control
  how their content is rendered.
  I think this is misleading because unlike printed material, the
  author has no control over the presentation media - they don't know
  how large the screen is, what the capabilities of the screen are, what
  fonts are installed etc. Therefore, tags which focus on meaning i.e. I
  want this to stand out or I want this to be emphasised are clearer
  than tags which say to make this bold or make this italic.

The debate over <b>, <i>, <strong> and <em> is likely to continue for
some years yet. I do think things are moving towards <strong>/<em>
and nearly everything I read these days recommends these over <b> and
<i>.

It is pretty well accepted that XHTML was a mistake and HTML5 goes a
long way to address the issues introduced with XHTML - I think XHTML
as a standard is pretty much relegated to an evolutionary dead end.

I do agree <div>
is overused. In particular, HTML5 has a number of new tags which should
be used to convey document structure which would be a better choice
than <div>
with different 'class' attributes. However, we will continue to see a
lot of div tags, even when authors begin to use newer tags - at least it
is a lot better than the early days when everything was stuck inside
tables!

Backends which generate HTML should be generating HTML5 compliant
output if for no other reason than it is clearer and easier than XHTML.

As to the OP's original question regarding changing <b> and <i> in HTML
backends - while I would vote for strong/em over b/i, I don't think
there is any real need to do this, certainly not in the short term. As
was pointed out b/i has not been deprecated, so it is still valid.
There is no suggestion to change Org's own internal markup (ironically
referred to as bold and italic!), so overall, the status quo seems fine.

Tim

--
Tim Cross

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Garreau\, Alexandre"
Subject: Re: *markup*, /markup/ and _markup_ true semantics [Was: Re: Ox-html: Replace <b> with <strong> and <i> with <em>]
Date: Sat, 27 Oct 2018 14:52:05 +0200
Message-ID: <875zxn362y.fsf@portable.galex-713.eu>
References: <87r2gfyj62.fsf@nicolasgoaziou.fr> <87in1rkqlk.fsf@gmail.com> <87a7n1i8lr.fsf_-_@portable.galex-713.eu> <87ftwslb19.fsf@gmail.com>
In-Reply-To: <87ftwslb19.fsf@gmail.com> (Tim Cross's message of "Sat, 27 Oct 2018 07:15:30 +1100")
To: Tim Cross
Cc: emacs-org list, Kaushal Modi

On 2018/10/27 at 07:15, Tim Cross wrote:
> I have either misunderstood most of your position or I simply disagree
> with it - I'm not sure which.

maybe a mix of both? I hope it’s a misunderstanding but if it’s not I
want to understand too so to get to a constructive agreement.

> - Much of what you argue seems to be based around ideas associated with
>   typography. IMO this is where things fall down. Typography is really
>   only relevant to 'printing' (either on paper or screen). Markup is not
>   just about printing - it is about conveying what the author wanted and

Indeed. But many people do not abstract what they mean to write and
still (often, poorly) think in terms of “italic” and “bold” (the org
manual, as you later said, even does so). What I wanted to underline is
that both “italic” and “bold” (and underline too somewhat) are not just
arbitrary display-level characteristics that had the particularity to
later get a meaning: *first* a *meaning* was wanted, and *then* they
were invented as an imperfect, more or less good, way to translate these
meanings or their intents to display (it’s as imperfect as a bitmap or
handwriting of a circle, or a sampled and compressed audio, is to the
bezier curve or equation of a circle which resulted in it, or the
function that produced the audio (such as a LilyPond musical score or a
resulting MIDI file)). I’m willing to extract as much of the original
meaning (be it about attention, memorization, structuration, etc.
(very abstract cognitive human features are still more common than
visual-recognition features)) so it can be then better applied
everywhere, without the burden and constraints of the original media
(display), with a little of history because I like to rehistoricize
things into their material and social background, so as not to see them
as a static, ahistoric, uncreated, uncriticizable concept. Concepts and
tools are made to serve people, not the opposite.

>   how that is best interpreted will depend on the media being used
>   (i.e. how the content is 'rendered') and should largely be up to the
>   consumer.

Yes totally, this is why I believe we, at best, should try to give clear
and defined meaning to why we use *, / and _-tags, rather than just
translating them to the traditional , , and tags, that were actually
just a poor 1-to-1 wrapping to the old , and tags, which had no meaning,
and still have confused, complex and not backward-compatible meaning.
And why sometimes it might be better to set up user options, so if
authors disagree with what is meant by their tags, they can change it,
so in the end that gives the correct semantic markup and everybody will
get the same, intended, meaning.

Also why, ideally, for the web, I wished server-side CSS never existed
and we only used it as a user-customization language (but still most
websites have poor semantic tagging, and complex tag compositions still
have no clearly defined meaning, so in the end it becomes either
guessing or a request to add yet another tag to the already complex
HTML spec).

> - I am a screen reader user. While you are correct that pitch, tone,
>   speed and different voices are often used to convey things like 'bold'
>   or 'italic', there is no universally accepted rule for this
>   interpretation, at least not in the same sense as there is with
>   typography.

I know, that’s why I wanted to check with Orca, NVDA, and maybe Jaws
too if I could.
>   We all know what bold or italic looks like, but there is no
>   agreement as to what these should sound like. When you use Jaws, you
>   will get a different result from when you use Orca or Emacspeak or
>   Window Eyes or .... However, this shouldn't really matter - how
>   these are 'rendered' should ideally be under the control of the
>   individual consuming the content. When I consume a document, it
>   should be my decision as to how the content is presented and for me,
>   interpreting 'strong' or 'emphasis' seems to be far clearer than
>   'bold' or 'italic'.

That’s why I’d like * and / to get better meaning than bold and italic.
For me it is already widely accepted that * is, sometimes, considered as
bold, but more widely used for emphasis. So it should be considered as
such (and, personally, I’ve meant this so that it could begin rendering
with italic on display for instance, or whatever is the favorite
emphasis method of the user, it should be configurable).

/ is a way harder problem as it has been used because of its slanted
appearance, to mean italic, so sometimes it’s used for emphasis,
sometimes for other uses of italic. Ideally I’d like it to be settled
that it’s not for emphasis (it’s way less used and supported than * for
it, and * already serves this purpose very well informally), so
implementations derive some other meaning for it, to get richer
semantics.

> - I don't believe there is any strong reason that the markup used by org
>   should have any strong reference to HTML in appearance. Org supports
>   many different backends, many of which don't have anything to do with
>   HTML at all. It is perhaps unfortunate that Org syntax and markdown
>   are quite different (though I feel the unfortunate part is that
>   markdown didn't follow org more closely as I much prefer Org's syntax
>   to most markdown semantics).

I don’t like markdown either, nor ReStructuredText.
Why I talked a lot about HTML is for two reasons: the discussion was
initially about it, and it is, afaik, the richest and most known
semantic markup language. It is *way* richer than LaTeX, org, md, rst,
etc. maybe even odt and texinfo, but I’m unsure.

However the * and / exports to texinfo with the same tags as html, that
is respectively strong and emphasis, which I find sad as * is what is
mostly used for emphasis (and two levels are pretty much not needed,
while richer semantics could be). ODT seems to use “<text:span>” with
“style-name="Emphasis"”: I heard ODT could be somewhat semantic, but I
don’t know if that is the best they can do (maybe this “style-name” has
standard semantics? because to me styling is for presentation, and
tagging for semantics).

Also a problem of many backends is they’re made for printing or are less
semantic: pdf is not made for semantics, although I heard somewhere that
there were attempts to make it so (which sounds silly as it is tailored
for printing and supports almost no dynamic modifications, it would be
better to stop using PDFs at all, in, eg, administration).

> - Probably the number 1 issue I come across when dealing with markup is
>   the expectation too many authors have that things will be rendered in
>   the browser in a specific way (a particular font, colour, position,
>   size, etc). This is a mistake. The big advantage of electronic
>   presentation is that for the first time, the consumer can have control
>   over the presentation - they can customise it to meet their
>   requirements or preferences.

*Exactly*. Except that then the web became commercial, and businesses
found it an especially good way to control what users saw almost as
fully as in advertisements (so it can bring control, power to them, and
also money, secondarily (if they use non-semantic tags and only <div>
and <span> in awfully complex sgml soup, then no user is able to control
anything)), just as French minitel would, and they began first to abuse
display-level tagging, then to abuse CSS and html-style-soup (full of
80% of <div>
and <span>, and enormous CSSes, yay! what a progress! …>< yet now we
have worse: less CSS, less “style”, and more “data-*” and non-free
surveillance javascript to replace them).

>   The problem with <b> and <i> is that it gives authors an expectation
>   their content will be rendered in a specific way.

Not anymore, since W3C, somewhat breaking backward-compatibility,
decided <b> is for “keywords” without special emphasis, and not being a
definition (there’s already <dfn> afaik for that), and <i> is for
“differently-pronounced phrasing content”, without emphasis, such as
text pronounced with a tone of disgust, or foreign-language text (so if
you want to embed French words not used enough to be in an English
dictionary, and if it’s not a real quotation (<q>), you should use
<i>du texte en française</i>).

So I can theoretically decide that any word marked up with <b> may
compose a local list of easily reachable (for instance with keystrokes)
“keywords”, like lynx, that b should be displayed normally, but in blue,
and that would be a standard-complying www user-agent.

>   Some may argue that the author should be able to control how their
>   content is rendered. I think this is misleading because unlike
>   printed material, the author has no control over the presentation
>   media - they don't know how large the screen is, what the
>   capabilities of the screen is, what fonts are installed
>   etc. Therefore, tags which focus on meaning i.e. I want this to
>   stand out or I want this to be emphasised are clearer than tags
>   which say to make this bold or make this italic.
Yes they can: they can require you server-side to connect from a local
network on computers furnished by the organization of the place (already
saw that), while checking what you do and how to make you do it
client-side with proprietary javascript, or even to have a tablet with
retina screen with a such range of screen sizes, on iOS… and… btw… this
already exists, there’s an app for it: AppStore (GooglePlay too): they
furnish HTML/CSS UI, controlled by proprietary software, only
distributed for their devices, theoretically only working on those (at
least the apps are developed, configured, and tested so). And afaik
developers don’t mind making their software more usable with TalkBack
(I don’t even know if there’s such a thing for iOS).

The excuse of “the device is not always the same” is to me a weak one:
this can, with special political and commercial restrictions, be
lowered, and then it could be considered a “reasonable workaround”
(while this is not). What should be advertised is that it breaks
accessibility, doesn’t comply with standards, will certainly break
forward-compatibility, legally-mandatory interoperability, and, as for
proprietary software, gives power to authors (or, more often but not
always, companies) and deprives users of what they could and should
have. This power is comparable to what power is gained through
advertisements, propaganda.

> The debate over <b>, <i>, <strong> and <em> is likely to continue for
> some years yet. I do think things are moving towards <strong>/<em>
> and nearly everything I read these days recommends these over <b> and
> <i>.

There are companies (and some individuals, or countries) who gain power
by doing so, just as they can do by pushing proprietary software (yet on
a different level), so I don’t believe they will ever stop doing
anything equivalent. Nor advertise they would do so. So this is not a
debate.
Like there is no “free vs proprietary” debate, or “climate change vs
this-is-god/a-myth” debate: the advocates of the first have arguments
and facts, the actors of the second are either stating their enemies are
idealists, stating their goals are unrealizable, or they “do so because
they have no choice”, or “do their best not to harm”, and then push more
and more pervasive and unadvertised ways of harming, such as DRM and
proprietary javascript, or, in our case, use sometimes <strong> and
<em>, but allow users to publish content using <b> and <i> (and
colors!!!), and making their whole website a soup of <div>
and <span>, heavily relying on a gigantic style soup, based on a
site-specific CSS stylesheet, that will be partially generated
server-side, partially heavily “improved” by (that is: dependent upon)
proprietary javascript. Btw, this is what Google does. And Google is
quite evidently the biggest feudal lord on the Web.

> It is pretty well accepted that XHTML was a mistake and HTML5 goes a
> long way to address the issues introduced with XHTML - I think XHTML
> as a standard is pretty much relegated to an evolutionary dead end.

XHTML was a beautiful standard and was dismissed because all the money
and resources were placed on HTML5, whose main selling point was new
media resources (namely