From: Julius Hamilton
Newsgroups: gmane.emacs.help
Subject: Re: Highlight saved, rendered HTML document
Date: Thu, 10 Jun 2021 17:29:58 +0200
To: Julius Hamilton, help-gnu-emacs@gnu.org
List-Id: Users list for the GNU Emacs text editor

Thanks very much. I'll read this over and let you know what I think.

Best regards,
Julius

On Wed, Jun 9, 2021, 21:51 Jean Louis wrote:

> * Julius Hamilton [2021-06-09 21:06]:
> > Hello,
> >
> > I would like to be able to highlight webpages offline, for better reading
> > comprehension of them.
>
> Hypothes.is Annotate the web, with anyone, anywhere.
> https://web.hypothes.is/
>
> That may be one of the best tools for annotation. It could be installed on
> your computer.
>
> More resources:
>
> Open Annotation · GitHub
> https://github.com/openannotation
>
> Home - Annotator - Annotating the Web
> http://annotatorjs.org/
>
> A different solution is to save the HTML page as PDF and use Emacs to
> annotate the PDF (you said it works) or the Evince PDF viewer to annotate it.
>
> Another solution is to convert the HTML to text and then
> annotate it by using:
>
> ;; Author: Bastian Bechtold
> ;; Maintainer: Bastian Bechtold
> ;; URL: https://github.com/bastibe/annotate.el
>
> Converting HTML to text is not hard; there are many tools to do that,
> including Emacs itself.
>
> $ elinks --dump https://www.example.com > example.txt
>
> or
>
> $ pandoc -f html -t plain https://www.example.com
>
> > I recently discovered that for some reason, these tools do not work
> > for downloaded pages being viewed in a browser. Maybe it's because
> > they try to save the highlights in relation to each URL, and the
> > downloaded pages don't have URLs.
>
> Maybe this system could help?
>
> Home | CollectiveAccess
> https://collectiveaccess.org/
>
> You may install CollectiveAccess on your computer and annotate
> anything from the WWW.
> Demo:
>
> https://demo.collectiveaccess.org/index.php/system/auth/login?redirect=https%3A%2F%2Fdemo.collectiveaccess.org%2Findex.php%2FDashboard%2FIndex
>
> > I was wondering if anybody could recommend a way to highlight rendered HTML
> > pages in Emacs. I know Emacs provides annotation tools for PDFs in
> > pdf-tools mode, and highlighting plaintext in a certain highlighting mode.
> > It seems likely that it should be possible for HTML pages too.
> >
> > Just to be clear, I don't mean syntax highlighting HTML code, but rather
> > moving a cursor through a web document to highlight information of
> > interest.
>
> I could use annotate.el to annotate HTML that I have opened with
> eww-open-file and annotated with annotate-mode, but I could not save
> the annotations. Now I am thinking it could be or should be possible to
> adapt it.
>
> Cc: to Ihor as he may know the solution.
>
> How annotate.el works you can see in the attached image, but I think
> that an annotation is too short or somehow limited if it sits straight in
> the text.
>
> A good and simple way to annotate documents would be either GNU
> Hyperbole or the `eev' package. I would then take the approach of making
> buttons which I would highlight and which let me quickly jump to the
> annotation. Here is an example hyperlink to a text annotation:
> "/home/admin/tmp/annotations.txt"
>
> Or a hyperlink to a specific line number:
> "/home/admin/tmp/annotations.txt:2"
>
> Or `eev' hyperlinks:
>
> (find-fline "~/tmp/annotations.txt")
>
> Or like the one below, which could annotate the paragraph and jump to the
> annotations file searching for "lorem ipsum", or it could go to a
> specific position; it implies that files are writeable.
>
> (find-fline "~/tmp/annotations.txt" "lorem ipsum")
>
> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec a diam
> lectus. Sed sit amet ipsum mauris. Maecenas congue ligula ac quam
> viverra nec consectetur ante hendrerit. Donec et mollis
> dolor.
>
> I would take a programmatic approach to annotations at a higher
> level, one which could work with files but also with buffers not
> related to files, such as values edited from a database. The
> approach would be similar to the `eev' package and its function `find-fline',
> so I would make it work for read-only files based on the line or a query, and for
> writeable files based on the query only (prone to fail if things are
> changed). A query or a line could even be highlighted later if the mode is
> turned on, or it could become a button on the fly (Emacs package
> button.el) -- and the data would be stored outside, in the database object
> that refers to the file. That approach makes it a little more visual.
>
> Right now I am annotating any file, any object, by using database
> meta-level attributes, so if there is a file there is a description,
> internal description, text, report, author, tags; all such pieces of
> information are separate from the file, and thus not so specific to parts of
> the text, as I simply do not need it that defined. I have 14000 objects that
> point to PDFs by page number; that is not an annotation but is similar, as I
> can jump from the description straight to the PDF (or to files of any
> kind). This message I have already "annotated" and can further work
> on it; it is offline though it is online, and jumping from the annotation to
> the offline or online version works too.
>
> --
> Jean
>
> Take action in Free Software Foundation campaigns:
> https://www.fsf.org/campaigns
>
> In support of Richard M. Stallman
> https://stallmansupport.org/
>
> ⟦ (hyperscope 38467) ⟧
>
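
The quoted idea of keeping annotations outside the file, pointing at either a line number or a search query, could be sketched roughly as below. This is only an illustration in Python, not code from eev, annotate.el, or Jean's database: the names `Annotation` and `locate` are hypothetical, and the match is a plain case-sensitive substring search.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """An annotation stored apart from the text it refers to (hypothetical)."""
    note: str
    line: Optional[int] = None   # 1-based line number; only stable for read-only text
    query: Optional[str] = None  # text to search for; tolerates lines shifting


def locate(text: str, ann: Annotation) -> Optional[int]:
    """Return the 1-based line the annotation points at, or None if it is lost."""
    lines = text.splitlines()
    if ann.query is not None:
        # Query-based lookup: find the first line containing the query string.
        for number, content in enumerate(lines, start=1):
            if ann.query in content:
                return number
        return None  # the annotated text was edited away
    if ann.line is not None and 1 <= ann.line <= len(lines):
        # Line-based lookup: trust the stored number if it is in range.
        return ann.line
    return None


if __name__ == "__main__":
    page = "Title\nLorem ipsum dolor sit amet\nLast line"
    by_query = Annotation(note="check this claim", query="Lorem ipsum")
    by_line = Annotation(note="closing remark", line=3)
    print(locate(page, by_query), locate(page, by_line))
```

A query-based annotation still resolves after lines are inserted above it, while a line-based one silently points at different text, which matches the remark that line references suit read-only files only.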