unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#25549: 25.1; eshell grep gives inconsistent output
@ 2017-01-26 22:06 me
  2017-01-29 17:25 ` npostavs
  2017-01-30 20:38 ` bug#25549: test case Tijs Mallaerts
  0 siblings, 2 replies; 7+ messages in thread
From: me @ 2017-01-26 22:06 UTC (permalink / raw)
  To: 25549


When using grep on a large file in eshell, the output is sometimes
  inconsistent. For example, when grepping for a string in my
server's apache2 access log (~5 MB) I got anywhere from around 200 to
around 1300 results, just running the same grep command over and over
(the file itself was not changing significantly).

The bug cannot be consistently reproduced, but has happened in multiple
situations and others have reportedly reproduced it per discussion on
reddit
(https://www.reddit.com/r/emacs/comments/5qbd70/trampeshellgrep_returning_wildly_different/).


In GNU Emacs 25.1.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.22.5)
 of 2016-12-23 built on juergen
Windowing system distributor 'The X.Org Foundation', version 11.0.11901000
System Description:	Arch Linux

Configured using:
 'configure --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib
 --localstatedir=/var --with-x-toolkit=gtk3 --with-xft
 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe
 -fstack-protector-strong' CPPFLAGS=-D_FORTIFY_SOURCE=2
 LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GCONF GSETTINGS
NOTIFY ACL GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11

Important settings:
  value of $LANG: en_US.utf8
  locale-coding-system: utf-8-unix

Major mode: EShell

Minor modes in effect:
  magit-auto-revert-mode: t
  global-git-commit-mode: t
  diff-auto-refine-mode: t
  global-flycheck-mode: t
  pyvenv-mode: t
  helm-mode: t
  shell-dirtrack-mode: t
  async-bytecomp-package-mode: t
  delete-selection-mode: t
  cua-mode: t
  electric-pair-mode: t
  pdf-occur-global-minor-mode: t
  show-paren-mode: t
  tabbar-mwheel-mode: t
  tabbar-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
Git finished [2 times]
C-c p is undefined
Quit
[C-t] show common commands, [?] describe events, [C-h i] show manual
Running git push -v gl master:refs/heads/master [2 times]
Git finished
[C-t] show common commands, [?] describe events, [C-h i] show manual
Running git push -v production master:refs/heads/master [2 times]
Git finished [2 times]
Making completion list...

Load-path shadows:
~/.emacs.d/elisp/org2blog/metaweblog hides /home/alanm/.emacs.d/elpa/metaweblog-20141130.605/metaweblog
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-xcode hides ~/.emacs.d/elisp/company-xcode
/home/alanm/.emacs.d/elpa/company-20170112.2005/company hides ~/.emacs.d/elisp/company
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-abbrev hides ~/.emacs.d/elisp/company-abbrev
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-dabbrev hides ~/.emacs.d/elisp/company-dabbrev
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-dabbrev-code hides ~/.emacs.d/elisp/company-dabbrev-code
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-oddmuse hides ~/.emacs.d/elisp/company-oddmuse
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-etags hides ~/.emacs.d/elisp/company-etags
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-eclim hides ~/.emacs.d/elisp/company-eclim
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-nxml hides ~/.emacs.d/elisp/company-nxml
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-ispell hides ~/.emacs.d/elisp/company-ispell
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-gtags hides ~/.emacs.d/elisp/company-gtags
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-tempo hides ~/.emacs.d/elisp/company-tempo
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-elisp hides ~/.emacs.d/elisp/company-elisp
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-template hides ~/.emacs.d/elisp/company-template
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-keywords hides ~/.emacs.d/elisp/company-keywords
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-files hides ~/.emacs.d/elisp/company-files
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-semantic hides ~/.emacs.d/elisp/company-semantic
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-clang hides ~/.emacs.d/elisp/company-clang
/home/alanm/.emacs.d/elpa/autopair-20160304.437/autopair hides ~/.emacs.d/elisp/autopair
/home/alanm/.emacs.d/elpa/company-20170112.2005/company-css hides ~/.emacs.d/elisp/company-css
/home/alanm/.emacs.d/elpa/cmake-mode-20160928.505/cmake-mode hides /usr/share/emacs/site-lisp/cmake-mode
~/.emacs.d/elisp/rst hides /usr/share/emacs/25.1/lisp/textmodes/rst
/home/alanm/.emacs.d/elpa/css-mode-1.0/css-mode hides /usr/share/emacs/25.1/lisp/textmodes/css-mode
/home/alanm/.emacs.d/elpa/scala-mode-20170118.558/ob-scala hides /usr/share/emacs/25.1/lisp/org/ob-scala
/home/alanm/.emacs.d/elpa/soap-client-3.1.1/soap-inspect hides /usr/share/emacs/25.1/lisp/net/soap-inspect
/home/alanm/.emacs.d/elpa/soap-client-3.1.1/soap-client hides /usr/share/emacs/25.1/lisp/net/soap-client
/home/alanm/.emacs.d/elpa/ntlm-2.1.0/ntlm hides /usr/share/emacs/25.1/lisp/net/ntlm
/home/alanm/.emacs.d/elpa/seq-20151121.1017/seq hides /usr/share/emacs/25.1/lisp/emacs-lisp/seq
/home/alanm/.emacs.d/elpa/tabulated-list-20120406.1351/tabulated-list hides /usr/share/emacs/25.1/lisp/emacs-lisp/tabulated-list

Features:
(shadow sort mail-extr emacsbug sendmail tramp-cmds company-tern
dash-functional url-http-ntlm ntlm calc calc-loaddefs calc-macs hmac-md5
hex-util md4 ac-js2 skewer-mode cache-table simple-httpd
js2-highlight-vars tern-auto-complete tern cursor-sensor tabify
conf-mode tramp-cache winner diff network-stream nsm starttls warnings
company-oddmuse company-keywords company-etags company-gtags
company-dabbrev-code company-dabbrev company-files company-cmake
company-xcode company-clang company-eclim company-template company-css
company-nxml company-bbdb company-jedi jedi jedi-core python-environment
epc ctable concurrent deferred auto-complete popup vc-mtn vc-hg vc-bzr
vc-src vc-sccs vc-svn vc-cvs vc-rcs vc vc-dispatcher ibuffer-vc em-unix
em-term term ehelp em-script em-prompt em-ls em-hist em-pred em-glob
em-dirs em-cmpl em-basic em-banner em-alias esh-var esh-io esh-cmd
esh-opt esh-ext esh-proc esh-arg esh-groups eshell esh-module esh-mode
esh-util magit-obsolete magit-blame magit-stash magit-bisect
magit-remote magit-commit magit-sequence magit-notes magit-worktree
magit-branch magit-files magit-refs magit-status magit magit-repos
magit-apply magit-wip magit-log magit-diff smerge-mode magit-core
magit-autorevert autorevert filenotify magit-process magit-margin
magit-mode magit-git crm magit-section magit-popup git-commit
magit-utils log-edit message idna rfc822 mml mml-sec epg mm-decode
mm-bodies mm-encode mailabbrev mail-utils gmm-utils mailheader pcvs-util
add-log with-editor web-mode disp-table whitespace vc-git diff-mode
rainbow-mode color emmet-mode css-mode misearch multi-isearch lua-mode
server ob-lilypond zenburn-theme flycheck subr-x org-clock org2blog
ox-wp ox-latex ox-icalendar ox-html ox-ascii ox-publish ox org-element
cl metaweblog xml-rpc timezone url-http tls gnutls mail-parse rfc2231
rfc2047 rfc2045 ietf-drums url-gw org org-macro org-footnote
org-pcomplete org-list org-faces org-entities noutline outline
org-version ob-emacs-lisp ob ob-tangle ob-ref ob-lob ob-table ob-exp
org-src ob-keys ob-comint ob-core ob-eval org-compat org-macs
org-loaddefs cal-menu calendar cal-loaddefs js2-refactor js2r-paredit
js2r-conveniences js2r-conditionals js2r-wrapping js2r-functions
js2r-vars multiple-cursors-core rect js2r-iife js2r-formatting
js2r-helpers dash js2-mode js sgml-mode cc-mode cc-fonts cc-guess
cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
django-snippets yasnippet highlight-indentation flymake company elpy
pyvenv elpy-django s elpy-refactor python tramp-sh json map grep files-x
etags xref project ido helm-mode helm-files rx image-dired tramp
tramp-compat tramp-loaddefs trampver ucs-normalize shell pcomplete
dired-x dired-aux ffap thingatpt helm-buffers helm-elscreen helm-tags
helm-bookmark helm-adaptive helm-info helm-locate helm-grep helm-regexp
helm-external helm-net browse-url xml url url-proxy url-privacy
url-expand url-methods url-history url-cookie url-domsuf url-util
mailcap helm-utils helm-help helm-types helm helm-source eieio-compat
helm-multi-match helm-lib helm-config helm-easymenu async-bytecomp async
qml-mode delsel cua-base elec-pair pdf-occur ibuf-ext ibuffer tablist
tablist-filter semantic/wisent/comp semantic/wisent
semantic/wisent/wisent semantic/util-modes semantic/util semantic
semantic/tag semantic/lex semantic/fw mode-local find-func cedet dired
pdf-isearch let-alist pdf-misc imenu pdf-tools compile comint ansi-color
ring cus-edit cus-start cus-load wid-edit pdf-view bookmark pp jka-compr
pdf-cache pdf-info tq pdf-util advice format-spec image-mode wc-mode
printing ps-print ps-def lpr paren tabbar easy-mmode edmacro kmacro
finder-inf css-mode-autoloads flappymacs-autoloads epc-autoloads
ctable-autoloads nose-autoloads pep8-autoloads slime-autoloads
tabulated-list-autoloads url-auth url-parse auth-source cl-seq eieio
eieio-core cl-macs gnus-util time-date mm-util help-fns mail-prsvr
password-cache url-vars virtualenv-autoloads info package epg-config seq
byte-opt gv bytecomp byte-compile cl-extra help-mode easymenu cconv
cl-loaddefs pcase cl-lib mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment
elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan
thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian
slovak czech european ethiopic indian cyrillic chinese charscript
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote dbusbind inotify
dynamic-setting system-font-setting font-render-setting move-toolbar gtk
x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 722368 86999)
 (symbols 48 63599 2)
 (miscs 40 866 1138)
 (strings 32 197238 16662)
 (string-bytes 1 5612576)
 (vectors 16 99923)
 (vector-slots 8 1977787 49327)
 (floats 8 2414 842)
 (intervals 56 9241 322)
 (buffers 976 53))





^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#25549: 25.1; eshell grep gives inconsistent output
  2017-01-26 22:06 bug#25549: 25.1; eshell grep gives inconsistent output me
@ 2017-01-29 17:25 ` npostavs
  2017-01-30 15:29   ` me
  2017-01-30 20:38 ` bug#25549: test case Tijs Mallaerts
  1 sibling, 1 reply; 7+ messages in thread
From: npostavs @ 2017-01-29 17:25 UTC (permalink / raw)
  To: me; +Cc: 25549

tags 25549 moreinfo unreproducible
quit

me@alandmoore.com writes:

> When using grep on a large file in eshell, the output is sometimes
>   inconsistent. For example, when grepping for a string in my
> server's apache2 access log (~5 MB) I got anywhere from around 200 to
> around 1300 results, just running the same grep command over and over
> (the file itself was not changing significantly).
>
> The bug cannot be consistently reproduced, but has happened in multiple
> situations and others have reportedly reproduced it per discussion on
> reddit
> (https://www.reddit.com/r/emacs/comments/5qbd70/trampeshellgrep_returning_wildly_different/).

Could you give some hints on what is needed to reproduce this?  Is it
the size of the grepped file?  Number of matches?  Are there some
non-ascii characters in the file?  Is Tramp required (the first post in
the linked thread mentions it, but some other comments seem to suggest
it's not required)?  You mention the file was not changing
"significantly", does this happen also when file stays *exactly*
identical?

I tried doubling up xdisp.c until it was over 5 MB, but grepping for "int"
in eshell returned a consistent number of matches.





^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#25549: 25.1; eshell grep gives inconsistent output
  2017-01-29 17:25 ` npostavs
@ 2017-01-30 15:29   ` me
  0 siblings, 0 replies; 7+ messages in thread
From: me @ 2017-01-30 15:29 UTC (permalink / raw)
  To: npostavs; +Cc: 25549

On Sun, 29 Jan 2017 12:25:48 -0500, npostavs@users.sourceforge.net wrote:
> tags 25549 moreinfo unreproducible
> quit
> 
> me@alandmoore.com writes:
> 
>> When using grep on a large file in eshell, the output is sometimes
>>   inconsistent. For example, when grepping for a string in my
>> server's apache2 access log (~5 MB) I got anywhere from around 200 to
>> around 1300 results, just running the same grep command over and over
>> (the file itself was not changing significantly).
>>
>> The bug cannot be consistently reproduced, but has happened in multiple
>> situations and others have reportedly reproduced it per discussion on
>> reddit
>>
(https://www.reddit.com/r/emacs/comments/5qbd70/trampeshellgrep_returning_wildly_different/).
> 
> Could you give some hints on what is needed to reproduce this?  Is it
> the size of the grepped file?  Number of matches?  Are there some
> non-ascii characters in the file?  Is Tramp required (the first post in
> the linked thread mentions it, but some other comments seem to suggest
> it's not required)?  You mention the file was not changing
> "significantly", does this happen also when file stays *exactly*
> identical?
> 
> I tried doubling up xdisp.c until it was over 5 MB, but grepping for
"int"
> in eshell returned a consistent number of matches.

It seems I and others were able to reproduce it locally, but it was much
harder to do so.  The problem seems much more pronounced when tramp is
involved.

It also seems to matter that the file is being written to.  I discovered
it when grepping a fairly active apache access log (being written to maybe
40-60 times a minute, on average).  I am able to consistently reproduce the
bug with this file, even when local to the server (though it's far less
pronounced when local).  I have not been successful reproducing it with a
file that is not changing at all.

I have asked the others who confirmed the bug on reddit to comment with
their observations as well.





^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#25549: test case
  2017-01-26 22:06 bug#25549: 25.1; eshell grep gives inconsistent output me
  2017-01-29 17:25 ` npostavs
@ 2017-01-30 20:38 ` Tijs Mallaerts
  2017-01-31  2:29   ` bug#25549: 25.1; eshell grep gives inconsistent output npostavs
  1 sibling, 1 reply; 7+ messages in thread
From: Tijs Mallaerts @ 2017-01-30 20:38 UTC (permalink / raw)
  To: 25549

[-- Attachment #1: Type: text/plain, Size: 294 bytes --]

On my machine following steps seem to be a reproducible test case:

- create a file "test-file.txt" with 20.000 identical lines with content
"This is a line."
- open eshell and insert the command: cat test-file.txt | grep line | wc

This seems to return a different result every time it's run.

[-- Attachment #2: Type: text/html, Size: 392 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#25549: 25.1; eshell grep gives inconsistent output
  2017-01-30 20:38 ` bug#25549: test case Tijs Mallaerts
@ 2017-01-31  2:29   ` npostavs
  2017-02-03  3:26     ` npostavs
  0 siblings, 1 reply; 7+ messages in thread
From: npostavs @ 2017-01-31  2:29 UTC (permalink / raw)
  To: Tijs Mallaerts; +Cc: me, 25549

tags 25549 = confirmed
quit

Tijs Mallaerts <tijs.mallaerts@gmail.com> writes:

> On my machine following steps seem to be a reproducible test case:
>
> - create a file "test-file.txt" with 20.000 identical lines with content "This is a line."
> - open eshell and insert the command: cat test-file.txt | grep line | wc
>
> This seems to return a different result every time it's run.

Thanks, I can reproduce with this.





^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#25549: 25.1; eshell grep gives inconsistent output
  2017-01-31  2:29   ` bug#25549: 25.1; eshell grep gives inconsistent output npostavs
@ 2017-02-03  3:26     ` npostavs
  2017-02-09  4:46       ` npostavs
  0 siblings, 1 reply; 7+ messages in thread
From: npostavs @ 2017-02-03  3:26 UTC (permalink / raw)
  To: Tijs Mallaerts; +Cc: me, 25549

[-- Attachment #1: Type: text/plain, Size: 1419 bytes --]

tags 25549 patch
quit

npostavs@users.sourceforge.net writes:
>
> Tijs Mallaerts <tijs.mallaerts@gmail.com> writes:
>
>> On my machine following steps seem to be a reproducible test case:
>>
>> - create a file "test-file.txt" with 20.000 identical lines with content "This is a line."
>> - open eshell and insert the command: cat test-file.txt | grep line | wc
>>
>> This seems to return a different result every time it's run.
>
> Thanks, I can reproduce with this.

The problem can be more easily reproduced (i.e., more often, and only
3000 lines) by

    cat test-file.txt | sleepy-cat | wc

where sleepy-cat is

    #!/bin/sh
    while read line ; do
        echo "$line"
        sleep 0.000001
    done

The problem happens when one of the commands in the pipeline sends its
output to Emacs quickly and the next command in the pipeline is slower.
On receiving data from the first command in eshell-insertion-filter we
call eshell-output-object to send it to the next command, but since
sending might block, Emacs can run other process filters and sentinels
instead.  In this case, while sending a data chunk from cmd1 to cmd2, we
actually end up reading all the data from cmd1 until it terminates and
we call its sentinel.  The sentinel closes the pipes and sends EOF to
cmd2, but we still haven't sent the data from cmd1 to cmd2 yet.

Closing the pipes in a timer, as in the patch below, seems to fix it for
me.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: patch --]
[-- Type: text/x-diff, Size: 1943 bytes --]

From a1757a7114fbc20733554d8aa44cea0fa8a991e2 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Thu, 2 Feb 2017 09:19:43 -0500
Subject: [PATCH v1] Make sure eshell pipelines don't drop data

* lisp/eshell/esh-proc.el (eshell-sentinel): If called while still
handling output of the process, make sure to close the pipes only later,
so that the next process in the pipeline recieves EOF only after getting
all its input (Bug#25549).
---
 lisp/eshell/esh-proc.el | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/lisp/eshell/esh-proc.el b/lisp/eshell/esh-proc.el
index b0dbb22..ba5cb5c 100644
--- a/lisp/eshell/esh-proc.el
+++ b/lisp/eshell/esh-proc.el
@@ -393,8 +393,20 @@ eshell-sentinel
 		    (unless (string= string "run")
 		      (unless (string-match "^\\(finished\\|exited\\)" string)
 			(eshell-insertion-filter proc string))
-		      (eshell-close-handles (process-exit-status proc) 'nil
-					    (cadr entry))))
+                      (let ((handles (nth 1 entry))
+                            (str (prog1 (nth 3 entry)
+                                   (setf (nth 3 entry) nil)))
+                            (status (process-exit-status proc)))
+                        ;; If we're in the middle of handling output
+                        ;; from this process then schedule the EOF for
+                        ;; later.
+                        (letrec ((finish-io
+                                  (lambda ()
+                                    (if (nth 4 entry)
+                                        (run-at-time 0 nil finish-io)
+                                      (when str (eshell-output-object str nil handles))
+                                      (eshell-close-handles status 'nil handles)))))
+                          (funcall finish-io)))))
 		(eshell-remove-process-entry entry))))
 	(eshell-kill-process-function proc string)))))
 
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* bug#25549: 25.1; eshell grep gives inconsistent output
  2017-02-03  3:26     ` npostavs
@ 2017-02-09  4:46       ` npostavs
  0 siblings, 0 replies; 7+ messages in thread
From: npostavs @ 2017-02-09  4:46 UTC (permalink / raw)
  To: Tijs Mallaerts; +Cc: me, 25549

tags 25549 fixed
close 25549 26.1
quit

npostavs@users.sourceforge.net writes:

>
> Closing the pipes in a timer, as in the patch below, seems to fix it for
> me.

Pushed to master, [1: 2bfe83fcd3].

1: 2017-02-08 23:37:57 -0500 2bfe83fcd3575d03f371a7d0e6b8c5d661bc5ac7
  Make sure eshell pipelines don't drop data






^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-02-09  4:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-26 22:06 bug#25549: 25.1; eshell grep gives inconsistent output me
2017-01-29 17:25 ` npostavs
2017-01-30 15:29   ` me
2017-01-30 20:38 ` bug#25549: test case Tijs Mallaerts
2017-01-31  2:29   ` bug#25549: 25.1; eshell grep gives inconsistent output npostavs
2017-02-03  3:26     ` npostavs
2017-02-09  4:46       ` npostavs

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).