unofficial mirror of emacs-devel@gnu.org 
* Search Engine Manipulation with Emacs-w3m
@ 2021-06-13 23:51 Emanuel Berg via Users list for the GNU Emacs text editor
From: Emanuel Berg via Users list for the GNU Emacs text editor @ 2021-06-13 23:51 UTC
  To: help-gnu-emacs; +Cc: emacs-w3m, emacs-devel

What do you think, can you "fool" Google by running this once
a day, or how often is required?

Maybe they see through it if it all comes from the same
IP... (I don't know if static and dynamic IPs are still
concepts, but mine stays the same, anyway. I remember a forum
from ~20 years ago that used to "IP ban" people to prevent
them from just creating a new account; however, on some users
that didn't work, as they had dynamic IPs.)

Anyway here it is...

* `search' is the Google search string (d'oh)

* `hit' is a string that matches the desired page's title,
which Google displays as a hyperlink

So this automates the search for `search', steps through the
result pages until one contains `hit', and then follows
that hyperlink.

Another use case for this is to search for your own stuff,
with a search string you fancy, and see on what Google page
your stuff turns up, if indeed it does.
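
That second use case boils down to "which page is the first to
contain `hit'?". Here is a minimal sketch of just that step,
assuming the text of each result page is already at hand as
a string. The name `sem-page-of-hit' is made up for
illustration and is not part of w3m-sem.el:

```elisp
(require 'cl-lib)

(defun sem-page-of-hit (pages hit)
  "Return the 1-based index of the first string in PAGES containing HIT.
PAGES is a list of strings, one per result page.  HIT is matched
literally; case handling follows `case-fold-search', as it does
for `search-forward'.  Return nil if no page contains HIT."
  (cl-loop for page in pages
           for i from 1
           ;; `regexp-quote' so HIT is treated as plain text
           when (string-match-p (regexp-quote hit) page)
           return i))

;; (sem-page-of-hit '("foo bar" "baz My Page qux") "My Page") ; => 2
```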

Comments...

;;; -*- lexical-binding: t -*-
;;;
;;; this file:
;;;   http://user.it.uu.se/~embe8573/emacs-init/w3m/w3m-sem.el
;;;   https://dataswamp.org/~incal/emacs-init/w3m/w3m-sem.el

(require 'cl-lib)

(require 'w3m-search)
(require 'w3m-tabs) ; https://dataswamp.org/~incal/emacs-init/w3m/w3m-tabs.el

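(defun w3m-sem (search hit &optional max)
  "Search for SEARCH, then follow the first link whose text matches HIT.
Step through at most MAX result pages (default 10)."
  (switch-to-buffer (w3m-new-tab "SEM"))
  (w3m-search w3m-search-default-engine search)
  (let ((sleep-time 3)
        (pages (or max 10)))
    (sleep-for sleep-time) ; crude wait for the page to load
    (cl-loop for i from 1 to pages do
      (if (search-forward hit (point-max) t)
          (progn
            (goto-char (match-beginning 0))
            (w3m-view-this-url)
            (cl-return (message "hit page %d" i)) )
        ;; no hit on this page; ">" is taken to be the "next page" link
        (if (search-forward ">" (point-max) t)
            (progn
              (goto-char (match-beginning 0))
              (w3m-view-this-url)
              (sleep-for sleep-time) )
          (cl-return (message "no hit; can't continue from page %d" i)) ))
      ;; `i' is already one past the end here, so report `pages' instead
      finally return (message "no hit on pages 1-%d" pages) )))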
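
As for running it once a day, a timer might do, assuming
Emacs stays up all the time. This is only a sketch: the names
`w3m-sem-timer' and `w3m-sem-daily' are made up here, and since
`w3m-sem' switches buffers and sleeps, it will grab the
selected window and block for a while whenever it fires.

```elisp
;; Assumes `w3m-sem' from above is loaded.

(defvar w3m-sem-timer nil
  "Timer for the daily `w3m-sem' run (hypothetical helper variable).")

(defun w3m-sem-daily (search hit)
  "Run `w3m-sem' with SEARCH and HIT now, then again every 24 hours.
Any timer set up by an earlier call is cancelled first."
  (when (timerp w3m-sem-timer)
    (cancel-timer w3m-sem-timer))
  (setq w3m-sem-timer
        ;; TIME nil means now; REPEAT is in seconds
        (run-at-time nil (* 24 60 60) #'w3m-sem search hit)))
```

Stop it again with (cancel-timer w3m-sem-timer).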

-- 
underground experts united
https://dataswamp.org/~incal



