From: Marcin Borkowski
Newsgroups: gmane.emacs.help
Subject: Re: kill your darlings
Date: Mon, 24 Jun 2019 20:27:25 +0200
Message-ID: <87ftnyyheq.fsf@mbork.pl>
References: <86v9wvy62j.fsf@zoho.eu>
To: Richard Melville <6tricky9@gmail.com>
Cc: help-gnu-emacs@gnu.org, Emanuel Berg

On 2019-06-24, at 10:42, Richard Melville <6tricky9@gmail.com> wrote:

> On Mon, 24 Jun 2019 at 05:13, Emanuel Berg via help-gnu-emacs <
> help-gnu-emacs@gnu.org> wrote:
>
>> Everyone knows that everyone uses their favorite
>> constructs in speech and in writing. E.g.,
>> I like to use "e.g.", and I like to end
>> sentences with something like "for sure" -
>> no doubt :)
>>
>> On a mailing list this is not really a problem.
>> But for e.g. in a relationship it can become
>> very enervating, without exaggerating :) (Okay,
>> you get it, I'll stop now. Or will I...)
>>
>> We can't (?) program our relationships with
>> Elisp, but I wonder if there is a tool or
>> method to detect "darlings" in a text.
>> For example, I'm writing a LaTeX text now - it
>> isn't even halfway done, but currently at
>> 1965 lines, I have used the word "emellertid"
>> 8 times (it means "however" but sounds more
>> stiff and old-fashioned) - and if I weren't
>> aware of it, it'd be a good idea if Emacs could
>> tell me I overused the word, so I could
>> consider removing some of them.
>> And perhaps
>> (actually it is likely) there are others of my
>> "darlings" that I *am* unaware of!
>>
>> The kind of stuff I described first, with
>> sentence constructions and so on, I get that it
>> is probably very difficult for a computer
>> program to detect. But overuse of words could
>> be as simple as
>>
>> - count all words
>>
>> - see what words are the most common
>>
>> - are there words there that are much longer than
>>   the others? warn the user about possible
>>   overuse
>>
>> - obviously, if one is writing a paper on the
>>   mating process of the Trigonosaurus, one
>>   would simply disregard the recommendation to
>>   not use that weird word all the time
>>
>> - to compare the text to the Internet would be
>>   a possibility, but I don't really like it.
>>   It would mean the program would try to make
>>   you write like everyone else. That's not the
>>   point: the point is to make you aware of
>>   something that you might be unaware of!
>>
>> Is there anything like that going on anywhere
>> in the Emacs world?
>>
>> --
>> underground experts united
>> http://user.it.uu.se/~embe8573
>> https://dataswamp.org/~incal
>
>
> Yes, it's called proofreading.

Wow, you made my day!

Seriously though, there _are_ things like that. I have never used proselint
(http://proselint.com/), but it seems to do a similar thing - though it
detects common mistakes (and things perceived as mistakes in our stupid
times, apparently), and Emanuel wanted something that could somehow
detect peculiarities of style of a particular person. I'm pretty sure
it's doable, though I know too little about natural language
processing to provide any details.

Doing statistics might be more difficult in languages with a lot of
inflections (like Polish or Latin). English is way simpler wrt that;
I have no idea about Swedish (though it's not easy to imagine it being
more difficult than Polish;-)).

Best,

-- 
Marcin Borkowski
http://mbork.pl
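
P.S. The counting steps Emanuel lists (count all words, see which are
the most common, warn about the long ones that stand out) can be
sketched in a few lines. This is only a rough illustration, written in
Python rather than Elisp; the function name `find_darlings` and the
parameters `min_length`, `min_count` and `top` are made up for this
example and do not come from proselint or any other existing tool.
It counts surface forms only, so in a heavily inflected language each
inflected form would be counted separately.

```python
import re
from collections import Counter


def find_darlings(text, min_length=6, min_count=3, top=10):
    """Return up to `top` (word, count) pairs, most frequent first.

    Only words of at least `min_length` letters that occur at least
    `min_count` times are reported. Both thresholds are arbitrary
    assumptions for this sketch, not recommended values.
    """
    # Lowercase the text and pull out runs of letters (Unicode-aware,
    # so Swedish or Polish letters are kept; digits and "_" are not).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Count only the longer words; short function words are ignored.
    counts = Counter(w for w in words if len(w) >= min_length)
    # Keep the words that reach the frequency threshold.
    frequent = [(w, n) for w, n in counts.most_common() if n >= min_count]
    return frequent[:top]


sample = (
    "Emellertid var det sent. Vi gick emellertid hem. "
    "Det regnade emellertid inte, och vi var glada."
)
print(find_darlings(sample))
```

Whether a reported word is really a "darling" is still up to the
writer: a paper about the Trigonosaurus will legitimately use that
word a lot, so the output is a list of candidates, nothing more.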