Long lines and bidi [Was: Re: bug#13623: ...]

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

* Long lines and bidi [Was: Re: bug#13623: ...]
       [not found]             ` <834nhp9u9j.fsf@gnu.org>
@ 2013-02-08 13:33               ` Dmitry Antipov
  2013-02-08 14:07                 ` Eli Zaretskii
  2013-02-08 15:33                 ` Stefan Monnier
  0 siblings, 2 replies; 27+ messages in thread
From: Dmitry Antipov @ 2013-02-08 13:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Emacs development discussions

On 02/06/2013 10:23 PM, Eli Zaretskii wrote:

> Another area of redisplay optimizations would be the infamous
> very-long-lines use case.  (Personally, I think this one is the single
> most important deficiency in the current display engine, by far more
> important than any other display problem.)

I tried to scroll (down from the beginning and then up from the end) the
very pathological file (~150M with just ~500 lines) and got the following
profile:

8.59%        emacs  emacs                          [.] bidi_resolve_weak
7.92%        emacs  emacs                          [.] bidi_level_of_next_char
7.81%        emacs  emacs                          [.] get_next_display_element
7.12%        emacs  emacs                          [.] move_it_in_display_line_to
6.96%        emacs  emacs                          [.] x_produce_glyphs
5.06%        emacs  libc-2.16.so                   [.] __memcpy_ssse3_back
4.56%        emacs  emacs                          [.] next_element_from_buffer
4.38%        emacs  emacs                          [.] bidi_move_to_visually_next
4.26%        emacs  emacs                          [.] scan_buffer
3.04%        emacs  libXft.so.2.3.1                [.] XftCharIndex
2.93%        emacs  emacs                          [.] bidi_fetch_char
2.67%        emacs  emacs                          [.] bidi_cache_iterator_state
2.61%        emacs  emacs                          [.] lookup_glyphless_char_display
2.47%        emacs  libXft.so.2.3.1                [.] XftGlyphExtents
2.35%        emacs  emacs                          [.] bidi_resolve_neutral
1.95%        emacs  emacs                          [.] bidi_get_type
1.86%        emacs  emacs                          [.] detect_coding
1.70%        emacs  emacs                          [.] produce_chars
1.50%        emacs  emacs                          [.] bidi_resolve_explicit_1
1.18%        emacs  emacs                          [.] get_per_char_metric
1.13%        emacs  emacs                          [.] bidi_cache_search.constprop.4
1.01%        emacs  emacs                          [.] xftfont_text_extents
0.90%        emacs  emacs                          [.] bidi_explicit_dir_char
0.88%        emacs  emacs                          [.] bidi_resolve_explicit
...

So the first question is: is it feasible/possible/desirable to detect that
the buffer has no R2L text at all and automatically force bidi-paragraph-direction
to left-to-right and bidi-display-reordering to nil?

Dmitry



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi [Was: Re: bug#13623: ...]
  2013-02-08 13:33               ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov
@ 2013-02-08 14:07                 ` Eli Zaretskii
  2013-02-08 14:46                   ` Long lines and bidi Eli Zaretskii
  2013-02-08 16:21                   ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov
  2013-02-08 15:33                 ` Stefan Monnier
  1 sibling, 2 replies; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-08 14:07 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: emacs-devel

> Date: Fri, 08 Feb 2013 17:33:47 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Emacs development discussions <emacs-devel@gnu.org>
> 
> On 02/06/2013 10:23 PM, Eli Zaretskii wrote:
> 
> > Another area of redisplay optimizations would be the infamous
> > very-long-lines use case.  (Personally, I think this one is the single
> > most important deficiency in the current display engine, by far more
> > important than any other display problem.)
> 
> I tried to scroll (down from the beginning and then up from the end) the
> very pathological file (~150M with just ~500 lines) and got the following
> profile:

Profile alone is not enough.  Please tell how did you "scroll",
exactly (which commands did you use), and please also show the
absolute times it took to perform each command.

> 8.59%        emacs  emacs                          [.] bidi_resolve_weak

What was in the file?  bidi_resolve_weak high on the profile hints
that it was full of punctuation or digits or banks, which is not
really an interesting case.

> 7.92%        emacs  emacs                          [.] bidi_level_of_next_char
> 7.81%        emacs  emacs                          [.] get_next_display_element
> 7.12%        emacs  emacs                          [.] move_it_in_display_line_to
> 6.96%        emacs  emacs                          [.] x_produce_glyphs
> 5.06%        emacs  libc-2.16.so                   [.] __memcpy_ssse3_back
> 4.56%        emacs  emacs                          [.] next_element_from_buffer
> 4.38%        emacs  emacs                          [.] bidi_move_to_visually_next
> 4.26%        emacs  emacs                          [.] scan_buffer
> 3.04%        emacs  libXft.so.2.3.1                [.] XftCharIndex
> 2.93%        emacs  emacs                          [.] bidi_fetch_char
> 2.67%        emacs  emacs                          [.] bidi_cache_iterator_state
> 2.61%        emacs  emacs                          [.] lookup_glyphless_char_display
> 2.47%        emacs  libXft.so.2.3.1                [.] XftGlyphExtents
> 2.35%        emacs  emacs                          [.] bidi_resolve_neutral
> 1.95%        emacs  emacs                          [.] bidi_get_type
> 1.86%        emacs  emacs                          [.] detect_coding
> 1.70%        emacs  emacs                          [.] produce_chars
> 1.50%        emacs  emacs                          [.] bidi_resolve_explicit_1
> 1.18%        emacs  emacs                          [.] get_per_char_metric
> 1.13%        emacs  emacs                          [.] bidi_cache_search.constprop.4
> 1.01%        emacs  emacs                          [.] xftfont_text_extents
> 0.90%        emacs  emacs                          [.] bidi_explicit_dir_char
> 0.88%        emacs  emacs                          [.] bidi_resolve_explicit
> ...
> 
> So the first question is: is it feasible/possible/desirable to detect that
> the buffer has no R2L text at all and automatically force bidi-paragraph-direction
> to left-to-right and bidi-display-reordering to nil?

Ah, _that_ red herring...  Why is that the first question?  What were
the times with and without bidi-display-reordering in this file?  In
my testing, the display engine performs awfully slow in both cases, so
even though turning off reordering makes it faster, it is still so
terribly slow that the problem is not going to be solved by that.

As to your question: how can we know what characters are or aren't in
the buffer without scanning it?  And scanning the buffer is exactly
what bidi.c does.

As to bidi-paragraph-direction, the detection of the paragraph
direction is turned off for long paragraphs anyway.  Again, does
setting bidi-paragraph-direction to left-to-right give you reasonable
performance in that file?  If not, this is just another red herring.

Anyway, I think this is the wrong way to try to find the solution.
The problem is not that scanning is slower with the bidi display.  (If
it were, we would see terribly slow performance with "normal" files as
well.)  The problem is that _we_scan_too_many_characters_.  See this
part of the profile:

> 7.12%        emacs  emacs                  [.] move_it_in_display_line_to

The display routines of the move_it_* family, which are heavily used
in scrolling, cursor movement, and just about any display operation,
_always_ scan each line from the beginning to the end, before they get
to the next line.  When each line is very long, those scans are very
expensive.  The way to make display significantly faster for long
lines is to avoid scanning entire lines.  The problem is how to do
that without losing accuracy, e.g., without missing characters that
affect the line metrics.

IOW, our problem is to find clever algorithms and provide supporting
data structures for those algorithms, so that we could avoid scanning
very long lines in their entirety each time we need to move the
cursor.  When we find these algorithms and code them, the bidi
"problem" will disappear without a trace.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-08 14:07                 ` Eli Zaretskii
@ 2013-02-08 14:46                   ` Eli Zaretskii
  2013-02-08 16:38                     ` Dmitry Antipov
  2013-02-08 16:21                   ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-08 14:46 UTC (permalink / raw)
  To: dmantipov; +Cc: emacs-devel

> Date: Fri, 08 Feb 2013 16:07:23 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> Profile alone is not enough.  Please tell how did you "scroll",
> exactly (which commands did you use), and please also show the
> absolute times it took to perform each command.

Btw, if you are serious about finding a solution to the long-line
display misfeature (or any other too-slow redisplay situation), I
generally find it necessary to do precision timing of the suspicious
parts of code, because otherwise it is impossible to find the actual
culprits.  On GNU/Linux, I use the following simple function:

  double
  timer_time (void)
  {
    struct timeval tv;

    gettimeofday (&tv, NULL);
    return tv.tv_usec * 0.000001 + tv.tv_sec;
  }

Now, to time a particular portion of the code, do something like this:

  double t1, t2;
  ...
  t1 = timer_time ();
  /* here comes the code that should be timed */
  t2 = timer_time ();
  if (t2 - t1 > THRESHOLD)
    fprintf (stderr, "that code took %.4g sec\n", t2 - t1);

The value of THRESHOLD depends on the magnitude of the slow-down you
are working on.  I generally start with 0.1 of the time it takes to
perform some redisplay operation; e.g., if it takes 5 sec to move the
cursor, start with 0.5 sec.  gettimeofday has a sufficient resolution
on GNU/Linux to get you sub-millisecond accuracy, which is more than
enough for display engine measurements.

Using the above, you can quickly identify the function(s) that take
most of the time of a particular redisplay operation, then time the
parts of those functions to find the most expensive parts, and so on,
recursively, until you find the hot spots (more than 50% of the slow
operation).

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi [Was: Re: bug#13623: ...]
  2013-02-08 13:33               ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov
  2013-02-08 14:07                 ` Eli Zaretskii
@ 2013-02-08 15:33                 ` Stefan Monnier
  2013-02-08 16:05                   ` Eli Zaretskii
  1 sibling, 1 reply; 27+ messages in thread
From: Stefan Monnier @ 2013-02-08 15:33 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: Eli Zaretskii, Emacs development discussions

> So the first question is: is it feasible/possible/desirable to detect
> that the buffer has no R2L text at all and automatically force
> bidi-paragraph-direction to left-to-right and bidi-display-reordering
> to nil?

Would this speed things up by a constant factor, or would it actually
remove an O(N) factor?

I think a fix will need more than a constant factor speed up.

Did you check both the truncate-lines=nil and the truncate-lines=t cases?
I think that for the truncate-lines=t case, we won't be able to avoid
the O(linelength) slowdown (but we should try and skip the non-displayed
part of lines faster, especially when there's no
`display/after/before-string' property).

        Stefan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi [Was: Re: bug#13623: ...]
  2013-02-08 15:33                 ` Stefan Monnier
@ 2013-02-08 16:05                   ` Eli Zaretskii
  0 siblings, 0 replies; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-08 16:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dmantipov, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Emacs development discussions <emacs-devel@gnu.org>
> Date: Fri, 08 Feb 2013 10:33:08 -0500
> 
> > So the first question is: is it feasible/possible/desirable to detect
> > that the buffer has no R2L text at all and automatically force
> > bidi-paragraph-direction to left-to-right and bidi-display-reordering
> > to nil?
> 
> Would this speed things up by a constant factor, or would it actually
> remove an O(N) factor?

The former, because the bidi iteration is slower than the original
unidirectional one by a constant factor, on the average.

> I think a fix will need more than a constant factor speed up.

Indeed.

> Did you check both the truncate-lines=nil and the truncate-lines=t cases?
> I think that for the truncate-lines=t case, we won't be able to avoid
> the O(linelength) slowdown (but we should try and skip the non-displayed
> part of lines faster, especially when there's no
> `display/after/before-string' property).

The problem is not with the part of text we actually display, because
the number of characters shown in a window does not depend on whether
we have truncate-lines=t or nil.  The problem is that most redisplay
operations always scan some text that is eventually not shown in the
window.  The longer the lines, the more text we scan that is outside
of the window.

For example, any redisplay that needs to scroll the window up (M-v
etc.) needs to find the buffer position for the window start.  To do
that, we use move_it_vertically_backward, which moves N screen lines
up (back) in the buffer.  But what that function does is move N
_buffer_lines_ back, and then moves forward by screen lines to find
which position is N screen lines above where we started.  If each line
is hundreds or thousands of characters, it is clear that moving back N
buffer lines will move much more than needed, and thereafter moving by
screen lines back through all those thousands of characters wastes a
lot of CPU cycles.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi [Was: Re: bug#13623: ...]
  2013-02-08 14:07                 ` Eli Zaretskii
  2013-02-08 14:46                   ` Long lines and bidi Eli Zaretskii
@ 2013-02-08 16:21                   ` Dmitry Antipov
  2013-02-08 17:04                     ` Eli Zaretskii
  1 sibling, 1 reply; 27+ messages in thread
From: Dmitry Antipov @ 2013-02-08 16:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 02/08/2013 06:07 PM, Eli Zaretskii wrote:

> Profile alone is not enough.  Please tell how did you "scroll",
> exactly (which commands did you use), and please also show the
> absolute times it took to perform each command.

(defun scroll-both ()
   (interactive)
   (let ((start (float-time)))
     (progn
       (dotimes (n 100) (progn (scroll-up) (redisplay)))
       (goto-char (point-max))
       (dotimes (n 100) (progn (scroll-down) (redisplay)))
       (message "Elapsed %f seconds" (- (float-time) start)))))

With bidi, ~600 second elapsed, and:

     25.18%        emacs  emacs                          [.] scan_buffer
      7.04%        emacs  emacs                          [.] bidi_resolve_weak
      6.47%        emacs  emacs                          [.] get_next_display_element
      6.37%        emacs  emacs                          [.] bidi_level_of_next_char
      5.14%        emacs  libc-2.16.so                   [.] __memcpy_ssse3_back
      5.05%        emacs  emacs                          [.] move_it_in_display_line_to
      4.94%        emacs  emacs                          [.] x_produce_glyphs
      4.84%        emacs  libXft.so.2.3.1                [.] XftCharIndex
      3.72%        emacs  emacs                          [.] bidi_move_to_visually_next
      3.70%        emacs  emacs                          [.] next_element_from_buffer
      2.90%        emacs  libXft.so.2.3.1                [.] XftGlyphExtents
      2.05%        emacs  emacs                          [.] bidi_fetch_char
      2.02%        emacs  emacs                          [.] lookup_glyphless_char_display
      2.01%        emacs  emacs                          [.] bidi_resolve_neutral
      1.76%        emacs  emacs                          [.] bidi_cache_iterator_state
      1.70%        emacs  emacs                          [.] bidi_get_type
      1.51%        emacs  emacs                          [.] bidi_resolve_explicit_1
      1.18%        emacs  libXft.so.2.3.1                [.] XftFontCheckGlyph
      1.12%        emacs  emacs                          [.] xftfont_encode_char
      1.01%        emacs  emacs                          [.] xftfont_text_extents

Without bidi, ~230 seconds elapsed, and:

     21.36%        emacs  emacs                          [.] x_produce_glyphs
     17.92%        emacs  emacs                          [.] get_next_display_element
     15.07%        emacs  emacs                          [.] move_it_in_display_line_to
      8.37%        emacs  emacs                          [.] next_element_from_buffer
      8.34%        emacs  libXft.so.2.3.1                [.] XftCharIndex
      6.12%        emacs  emacs                          [.] lookup_glyphless_char_display
      4.21%        emacs  libXft.so.2.3.1                [.] XftGlyphExtents
      3.07%        emacs  emacs                          [.] xftfont_encode_char
      2.68%        emacs  emacs                          [.] xftfont_text_extents
      1.87%        emacs  emacs                          [.] get_per_char_metric
      1.53%        emacs  libXft.so.2.3.1                [.] XftFontCheckGlyph
      1.49%        emacs  emacs                          [.] composition_compute_stop_pos
      1.35%        emacs  emacs                          [.] set_iterator_to_next

cache-long-line-scans is nil in both cases.

I suspect that scroll should be direction-agnostic in theory; but both profiled
runs shows that scroll-down is much, much slower than scroll-up (that's why
elapsed time is so huge in both cases).

> What was in the file?  bidi_resolve_weak high on the profile hints
> that it was full of punctuation or digits or banks, which is not
> really an interesting case.

Your guess is correct; but I suspect that an average text in human language
contains less punctuations, digits and blanks than the C source code of the
same size :-).

> Ah, _that_ red herring...  Why is that the first question?  What were
> the times with and without bidi-display-reordering in this file?  In
> my testing, the display engine performs awfully slow in both cases, so
> even though turning off reordering makes it faster, it is still so
> terribly slow that the problem is not going to be solved by that.
>
> As to your question: how can we know what characters are or aren't in
> the buffer without scanning it?  And scanning the buffer is exactly
> what bidi.c does.

Hm... insert-file-contents tries to detect encoding by looking at first 1K
and last 3K of the file. Why the similar approach isn't applicable to bidi?

Dmitry




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-08 14:46                   ` Long lines and bidi Eli Zaretskii
@ 2013-02-08 16:38                     ` Dmitry Antipov
  2013-02-08 16:52                       ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Dmitry Antipov @ 2013-02-08 16:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 02/08/2013 06:46 PM, Eli Zaretskii wrote:

> Btw, if you are serious about finding a solution to the long-line
> display misfeature (or any other too-slow redisplay situation), I
> generally find it necessary to do precision timing of the suspicious
> parts of code, because otherwise it is impossible to find the actual
> culprits.  On GNU/Linux, I use the following simple function:

Ah, please, there is a difference between 2013 and 1980.

1) perf record -e stalled-cycles-frontend -e stalled-cycles-backend -F 10000 [workload]
2) perf report --stdio ==>

25.18%        emacs  emacs                          [.] scan_buffer
  7.04%        emacs  emacs                          [.] bidi_resolve_weak
...
3) perf annotate scan_buffer --stdio ==>

       :                while (cursor >= ceiling_addr)
       :                  {
       :                    unsigned char *scan_start = cursor;
       :
       :                    while (*cursor != target && --cursor >= ceiling_addr)
65.74 :        526620:       movzbl (%r14),%eax
  6.46 :        526624:       cmp    %r15d,%eax
  0.17 :        526627:       je     526632 <scan_buffer+0x512>
27.33 :        526629:       sub    $0x1,%r14
  0.03 :        52662d:       cmp    %r14,%rbx
  0.19 :        526630:       jbe    526620 <scan_buffer+0x500>
       :                      ;

So, ~90% of time spent in scan_buffer is:

799                while (*cursor != target && --cursor >= ceiling_addr)
800                  ;

Dmitry




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-08 16:38                     ` Dmitry Antipov
@ 2013-02-08 16:52                       ` Eli Zaretskii
  2013-02-09  3:34                         ` Paul Eggert
  0 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-08 16:52 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: emacs-devel

> Date: Fri, 08 Feb 2013 20:38:24 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: emacs-devel@gnu.org
> 
> On 02/08/2013 06:46 PM, Eli Zaretskii wrote:
> 
> > Btw, if you are serious about finding a solution to the long-line
> > display misfeature (or any other too-slow redisplay situation), I
> > generally find it necessary to do precision timing of the suspicious
> > parts of code, because otherwise it is impossible to find the actual
> > culprits.  On GNU/Linux, I use the following simple function:
> 
> Ah, please, there is a difference between 2013 and 1980.

Sorry, you lost me here.

> 1) perf record -e stalled-cycles-frontend -e stalled-cycles-backend -F 10000 [workload]
> 2) perf report --stdio ==>
> 
> 25.18%        emacs  emacs                          [.] scan_buffer
>   7.04%        emacs  emacs                          [.] bidi_resolve_weak

That's why testing redisplay on buffers which are predominantly
punctuation will give you unrealistic measurements.  (If you want to
understand why, read UAX#9.)

> So, ~90% of time spent in scan_buffer is:
> 
> 799                while (*cursor != target && --cursor >= ceiling_addr)
> 800                  ;

Which cannot be optimized.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi [Was: Re: bug#13623: ...]
  2013-02-08 16:21                   ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov
@ 2013-02-08 17:04                     ` Eli Zaretskii
  0 siblings, 0 replies; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-08 17:04 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: emacs-devel

> Date: Fri, 08 Feb 2013 20:21:57 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: emacs-devel@gnu.org
> 
> On 02/08/2013 06:07 PM, Eli Zaretskii wrote:
> 
> > Profile alone is not enough.  Please tell how did you "scroll",
> > exactly (which commands did you use), and please also show the
> > absolute times it took to perform each command.
> 
> (defun scroll-both ()
>    (interactive)
>    (let ((start (float-time)))
>      (progn
>        (dotimes (n 100) (progn (scroll-up) (redisplay)))
>        (goto-char (point-max))
>        (dotimes (n 100) (progn (scroll-down) (redisplay)))
>        (message "Elapsed %f seconds" (- (float-time) start)))))
> 
> With bidi, ~600 second elapsed, and:
> 
>      25.18%        emacs  emacs                          [.] scan_buffer
>       7.04%        emacs  emacs                          [.] bidi_resolve_weak
>       6.47%        emacs  emacs                          [.] get_next_display_element
>       6.37%        emacs  emacs                          [.] bidi_level_of_next_char
>       5.14%        emacs  libc-2.16.so                   [.] __memcpy_ssse3_back
>       5.05%        emacs  emacs                          [.] move_it_in_display_line_to
>       4.94%        emacs  emacs                          [.] x_produce_glyphs
>       4.84%        emacs  libXft.so.2.3.1                [.] XftCharIndex
>       3.72%        emacs  emacs                          [.] bidi_move_to_visually_next
>       3.70%        emacs  emacs                          [.] next_element_from_buffer
>       2.90%        emacs  libXft.so.2.3.1                [.] XftGlyphExtents
>       2.05%        emacs  emacs                          [.] bidi_fetch_char
>       2.02%        emacs  emacs                          [.] lookup_glyphless_char_display
>       2.01%        emacs  emacs                          [.] bidi_resolve_neutral
>       1.76%        emacs  emacs                          [.] bidi_cache_iterator_state
>       1.70%        emacs  emacs                          [.] bidi_get_type
>       1.51%        emacs  emacs                          [.] bidi_resolve_explicit_1
>       1.18%        emacs  libXft.so.2.3.1                [.] XftFontCheckGlyph
>       1.12%        emacs  emacs                          [.] xftfont_encode_char
>       1.01%        emacs  emacs                          [.] xftfont_text_extents
> 
> Without bidi, ~230 seconds elapsed, and:

This is consistent with my past measurements:

 (a) disabling bidi makes redisplay faster, but it is still awfully
     slow (2.3 sec per scroll);

 (b) bidi iteration is about 2 times slower than the unidirectional
     one (you get 3 times slower because your buffer is full of weak
     characters, which make the bidi iterator work harder due to the
     requirements of the Unicode Bidirectional Algorithm.

> I suspect that scroll should be direction-agnostic in theory

That theory is wrong.  The reason is that functions that move by
display lines can only move forward.  So moving backward is coded very
differently (a.k.a. "slower").

> but both profiled runs shows that scroll-down is much, much slower
> than scroll-up (that's why elapsed time is so huge in both cases).

That's expected; see also my explanation in a previous mail, which
describes what move_it_vertically_backward does.  That function is
used a lot by scroll-down.

> > What was in the file?  bidi_resolve_weak high on the profile hints
> > that it was full of punctuation or digits or banks, which is not
> > really an interesting case.
> 
> Your guess is correct; but I suspect that an average text in human language
> contains less punctuations, digits and blanks than the C source code of the
> same size :-).

An average C code still has only a small fraction of punctuation.
Just look at any C file.

> > As to your question: how can we know what characters are or aren't in
> > the buffer without scanning it?  And scanning the buffer is exactly
> > what bidi.c does.
> 
> Hm... insert-file-contents tries to detect encoding by looking at first 1K
> and last 3K of the file. Why the similar approach isn't applicable to bidi?

No.  Detecting encoding by a small portion is a heuristic that works
only because most every file is encoded consistently.  When a file is
encoded inconsistently, the result of the above decoding heuristic is
horribly wrong, and the consequences for the user are grave.  As a
recent example, see bug #13505.

By contrast, scripts used in a text file do not have to be consistent
or uniformly distributed over the file at all.  So the probability to
get this wrong will be much higher.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-08 16:52                       ` Eli Zaretskii
@ 2013-02-09  3:34                         ` Paul Eggert
  2013-02-09  8:46                           ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Paul Eggert @ 2013-02-09  3:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Antipov, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1070 bytes --]

On 02/08/2013 08:52 AM, Eli Zaretskii wrote:
>> > So, ~90% of time spent in scan_buffer is:
>> > 
>> > 799                while (*cursor != target && --cursor >= ceiling_addr)
>> > 800                  ;

> Which cannot be optimized.

It can be sped up somewhat, by using memrchr.

This won't solve these performance issues, but it helps:
on my platform (x86-64 Ubuntu 12.10) I ran Dmitry's scroll-both benchmark
<http://lists.gnu.org/archive/html/emacs-devel/2013-02/msg00147.html>
on a real file (the trunk's src/xdisp.c), and it was 25% faster overall
(1.19 seconds versus 1.49 seconds) when I used memrchr there
and memchr for forward searches.

I'll attach the patch I used.  Eli, it'll need a bit of hacking to
port to MS-Windows, since the substitute memrchr implementation
(which is supplied) will need to be compiled.

Dmitry, is this something you can easily try with your benchmarks?

Most of the attached patch is boilerplate taken unmodified from gnulib,
to support memrchr on non-GNU platforms.  The key part of the change is
at the end, to src/search.c.


[-- Attachment #2: memchr.txt --]
[-- Type: text/plain, Size: 75410 bytes --]

=== modified file '.bzrignore'
--- .bzrignore	2013-02-01 06:30:51 +0000
+++ .bzrignore	2013-02-09 03:12:48 +0000
@@ -97,6 +97,7 @@
 lib/stdio.h
 lib/stdint.h
 lib/stdlib.h
+lib/string.h
 lib/sys/
 lib/SYS
 lib/time.h

=== modified file 'ChangeLog'
--- ChangeLog	2013-02-08 23:37:17 +0000
+++ ChangeLog	2013-02-09 03:12:48 +0000
@@ -1,3 +1,11 @@
+2013-02-09  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Tune redisplay by using memchr and memrchr.
+	* .bzrignore: Add string.h.
+	* lib/gnulib.mk, m4/gnulib-comp.m4: Regenerate.
+	* lib/memrchr.c, lib/string.in.h, m4/memrchr.m4, m4/string_h.m4:
+	New files, from gnulib.
+
 2013-02-08  Paul Eggert  <eggert@cs.ucla.edu>
 
 	Merge from gnulib, incorporating:

=== modified file 'admin/ChangeLog'
--- admin/ChangeLog	2013-02-01 06:30:51 +0000
+++ admin/ChangeLog	2013-02-09 03:12:48 +0000
@@ -1,3 +1,8 @@
+2013-02-09  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Tune redisplay by using memchr and memrchr.
+	* merge-gnulib (GNULIB_MODULES): Add memrchr.
+
 2013-02-01  Paul Eggert  <eggert@cs.ucla.edu>
 
 	Use fdopendir, fstatat and readlinkat, for efficiency (Bug#13539).

=== modified file 'admin/merge-gnulib'
--- admin/merge-gnulib	2013-02-01 06:30:51 +0000
+++ admin/merge-gnulib	2013-02-09 03:12:48 +0000
@@ -31,7 +31,8 @@
   dtoastr dtotimespec dup2 environ execinfo faccessat
   fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday
   ignore-value intprops largefile lstat
-  manywarnings mktime pselect pthread_sigmask putenv readlink readlinkat
+  manywarnings memrchr mktime
+  pselect pthread_sigmask putenv readlink readlinkat
   sig2str socklen stat-time stdalign stdarg stdbool stdio
   strftime strtoimax strtoumax symlink sys_stat
   sys_time time timer-time timespec-add timespec-sub unsetenv utimens

=== modified file 'lib/gnulib.mk'
--- lib/gnulib.mk	2013-02-08 23:37:17 +0000
+++ lib/gnulib.mk	2013-02-09 03:12:48 +0000
@@ -21,7 +21,7 @@
 # the same distribution terms as the rest of that program.
 #
 # Generated by gnulib-tool.
-# Reproduce by: gnulib-tool --import --dir=. --lib=libgnu --source-base=lib --m4-base=m4 --doc-base=doc --tests-base=tests --aux-dir=build-aux --avoid=dup --avoid=errno --avoid=fchdir --avoid=fcntl --avoid=fstat --avoid=malloc-posix --avoid=msvc-inval --avoid=msvc-nothrow --avoid=open --avoid=openat-die --avoid=opendir --avoid=raise --avoid=save-cwd --avoid=select --avoid=sigprocmask --avoid=sys_types --avoid=threadlib --makefile-name=gnulib.mk --conditional-dependencies --no-libtool --macro-prefix=gl --no-vc-files alloca-opt c-ctype c-strcase careadlinkat close-stream crypto/md5 crypto/sha1 crypto/sha256 crypto/sha512 dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat manywarnings mktime pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens warnings
+# Reproduce by: gnulib-tool --import --dir=. --lib=libgnu --source-base=lib --m4-base=m4 --doc-base=doc --tests-base=tests --aux-dir=build-aux --avoid=dup --avoid=errno --avoid=fchdir --avoid=fcntl --avoid=fstat --avoid=malloc-posix --avoid=msvc-inval --avoid=msvc-nothrow --avoid=open --avoid=openat-die --avoid=opendir --avoid=raise --avoid=save-cwd --avoid=select --avoid=sigprocmask --avoid=sys_types --avoid=threadlib --makefile-name=gnulib.mk --conditional-dependencies --no-libtool --macro-prefix=gl --no-vc-files alloca-opt c-ctype c-strcase careadlinkat close-stream crypto/md5 crypto/sha1 crypto/sha256 crypto/sha512 dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat manywarnings memrchr mktime pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens warnings
 
 
 MOSTLYCLEANFILES += core *.stackdump
@@ -480,6 +480,15 @@
 
 ## end   gnulib module lstat
 
+## begin gnulib module memrchr
+
+
+EXTRA_DIST += memrchr.c
+
+EXTRA_libgnu_a_SOURCES += memrchr.c
+
+## end   gnulib module memrchr
+
 ## begin gnulib module mktime
 
 
@@ -1105,6 +1114,106 @@
 
 ## end   gnulib module strftime
 
+## begin gnulib module string
+
+BUILT_SOURCES += string.h
+
+# We need the following in order to create <string.h> when the system
+# doesn't have one that works with the given compiler.
+string.h: string.in.h $(top_builddir)/config.status $(CXXDEFS_H) $(ARG_NONNULL_H) $(WARN_ON_USE_H)
+	$(AM_V_GEN)rm -f $@-t $@ && \
+	{ echo '/* DO NOT EDIT! GENERATED AUTOMATICALLY! */' && \
+	  sed -e 's|@''GUARD_PREFIX''@|GL|g' \
+	      -e 's|@''INCLUDE_NEXT''@|$(INCLUDE_NEXT)|g' \
+	      -e 's|@''PRAGMA_SYSTEM_HEADER''@|@PRAGMA_SYSTEM_HEADER@|g' \
+	      -e 's|@''PRAGMA_COLUMNS''@|@PRAGMA_COLUMNS@|g' \
+	      -e 's|@''NEXT_STRING_H''@|$(NEXT_STRING_H)|g' \
+	      -e 's/@''GNULIB_FFSL''@/$(GNULIB_FFSL)/g' \
+	      -e 's/@''GNULIB_FFSLL''@/$(GNULIB_FFSLL)/g' \
+	      -e 's/@''GNULIB_MBSLEN''@/$(GNULIB_MBSLEN)/g' \
+	      -e 's/@''GNULIB_MBSNLEN''@/$(GNULIB_MBSNLEN)/g' \
+	      -e 's/@''GNULIB_MBSCHR''@/$(GNULIB_MBSCHR)/g' \
+	      -e 's/@''GNULIB_MBSRCHR''@/$(GNULIB_MBSRCHR)/g' \
+	      -e 's/@''GNULIB_MBSSTR''@/$(GNULIB_MBSSTR)/g' \
+	      -e 's/@''GNULIB_MBSCASECMP''@/$(GNULIB_MBSCASECMP)/g' \
+	      -e 's/@''GNULIB_MBSNCASECMP''@/$(GNULIB_MBSNCASECMP)/g' \
+	      -e 's/@''GNULIB_MBSPCASECMP''@/$(GNULIB_MBSPCASECMP)/g' \
+	      -e 's/@''GNULIB_MBSCASESTR''@/$(GNULIB_MBSCASESTR)/g' \
+	      -e 's/@''GNULIB_MBSCSPN''@/$(GNULIB_MBSCSPN)/g' \
+	      -e 's/@''GNULIB_MBSPBRK''@/$(GNULIB_MBSPBRK)/g' \
+	      -e 's/@''GNULIB_MBSSPN''@/$(GNULIB_MBSSPN)/g' \
+	      -e 's/@''GNULIB_MBSSEP''@/$(GNULIB_MBSSEP)/g' \
+	      -e 's/@''GNULIB_MBSTOK_R''@/$(GNULIB_MBSTOK_R)/g' \
+	      -e 's/@''GNULIB_MEMCHR''@/$(GNULIB_MEMCHR)/g' \
+	      -e 's/@''GNULIB_MEMMEM''@/$(GNULIB_MEMMEM)/g' \
+	      -e 's/@''GNULIB_MEMPCPY''@/$(GNULIB_MEMPCPY)/g' \
+	      -e 's/@''GNULIB_MEMRCHR''@/$(GNULIB_MEMRCHR)/g' \
+	      -e 's/@''GNULIB_RAWMEMCHR''@/$(GNULIB_RAWMEMCHR)/g' \
+	      -e 's/@''GNULIB_STPCPY''@/$(GNULIB_STPCPY)/g' \
+	      -e 's/@''GNULIB_STPNCPY''@/$(GNULIB_STPNCPY)/g' \
+	      -e 's/@''GNULIB_STRCHRNUL''@/$(GNULIB_STRCHRNUL)/g' \
+	      -e 's/@''GNULIB_STRDUP''@/$(GNULIB_STRDUP)/g' \
+	      -e 's/@''GNULIB_STRNCAT''@/$(GNULIB_STRNCAT)/g' \
+	      -e 's/@''GNULIB_STRNDUP''@/$(GNULIB_STRNDUP)/g' \
+	      -e 's/@''GNULIB_STRNLEN''@/$(GNULIB_STRNLEN)/g' \
+	      -e 's/@''GNULIB_STRPBRK''@/$(GNULIB_STRPBRK)/g' \
+	      -e 's/@''GNULIB_STRSEP''@/$(GNULIB_STRSEP)/g' \
+	      -e 's/@''GNULIB_STRSTR''@/$(GNULIB_STRSTR)/g' \
+	      -e 's/@''GNULIB_STRCASESTR''@/$(GNULIB_STRCASESTR)/g' \
+	      -e 's/@''GNULIB_STRTOK_R''@/$(GNULIB_STRTOK_R)/g' \
+	      -e 's/@''GNULIB_STRERROR''@/$(GNULIB_STRERROR)/g' \
+	      -e 's/@''GNULIB_STRERROR_R''@/$(GNULIB_STRERROR_R)/g' \
+	      -e 's/@''GNULIB_STRSIGNAL''@/$(GNULIB_STRSIGNAL)/g' \
+	      -e 's/@''GNULIB_STRVERSCMP''@/$(GNULIB_STRVERSCMP)/g' \
+	      < $(srcdir)/string.in.h | \
+	  sed -e 's|@''HAVE_FFSL''@|$(HAVE_FFSL)|g' \
+	      -e 's|@''HAVE_FFSLL''@|$(HAVE_FFSLL)|g' \
+	      -e 's|@''HAVE_MBSLEN''@|$(HAVE_MBSLEN)|g' \
+	      -e 's|@''HAVE_MEMCHR''@|$(HAVE_MEMCHR)|g' \
+	      -e 's|@''HAVE_DECL_MEMMEM''@|$(HAVE_DECL_MEMMEM)|g' \
+	      -e 's|@''HAVE_MEMPCPY''@|$(HAVE_MEMPCPY)|g' \
+	      -e 's|@''HAVE_DECL_MEMRCHR''@|$(HAVE_DECL_MEMRCHR)|g' \
+	      -e 's|@''HAVE_RAWMEMCHR''@|$(HAVE_RAWMEMCHR)|g' \
+	      -e 's|@''HAVE_STPCPY''@|$(HAVE_STPCPY)|g' \
+	      -e 's|@''HAVE_STPNCPY''@|$(HAVE_STPNCPY)|g' \
+	      -e 's|@''HAVE_STRCHRNUL''@|$(HAVE_STRCHRNUL)|g' \
+	      -e 's|@''HAVE_DECL_STRDUP''@|$(HAVE_DECL_STRDUP)|g' \
+	      -e 's|@''HAVE_DECL_STRNDUP''@|$(HAVE_DECL_STRNDUP)|g' \
+	      -e 's|@''HAVE_DECL_STRNLEN''@|$(HAVE_DECL_STRNLEN)|g' \
+	      -e 's|@''HAVE_STRPBRK''@|$(HAVE_STRPBRK)|g' \
+	      -e 's|@''HAVE_STRSEP''@|$(HAVE_STRSEP)|g' \
+	      -e 's|@''HAVE_STRCASESTR''@|$(HAVE_STRCASESTR)|g' \
+	      -e 's|@''HAVE_DECL_STRTOK_R''@|$(HAVE_DECL_STRTOK_R)|g' \
+	      -e 's|@''HAVE_DECL_STRERROR_R''@|$(HAVE_DECL_STRERROR_R)|g' \
+	      -e 's|@''HAVE_DECL_STRSIGNAL''@|$(HAVE_DECL_STRSIGNAL)|g' \
+	      -e 's|@''HAVE_STRVERSCMP''@|$(HAVE_STRVERSCMP)|g' \
+	      -e 's|@''REPLACE_STPNCPY''@|$(REPLACE_STPNCPY)|g' \
+	      -e 's|@''REPLACE_MEMCHR''@|$(REPLACE_MEMCHR)|g' \
+	      -e 's|@''REPLACE_MEMMEM''@|$(REPLACE_MEMMEM)|g' \
+	      -e 's|@''REPLACE_STRCASESTR''@|$(REPLACE_STRCASESTR)|g' \
+	      -e 's|@''REPLACE_STRCHRNUL''@|$(REPLACE_STRCHRNUL)|g' \
+	      -e 's|@''REPLACE_STRDUP''@|$(REPLACE_STRDUP)|g' \
+	      -e 's|@''REPLACE_STRSTR''@|$(REPLACE_STRSTR)|g' \
+	      -e 's|@''REPLACE_STRERROR''@|$(REPLACE_STRERROR)|g' \
+	      -e 's|@''REPLACE_STRERROR_R''@|$(REPLACE_STRERROR_R)|g' \
+	      -e 's|@''REPLACE_STRNCAT''@|$(REPLACE_STRNCAT)|g' \
+	      -e 's|@''REPLACE_STRNDUP''@|$(REPLACE_STRNDUP)|g' \
+	      -e 's|@''REPLACE_STRNLEN''@|$(REPLACE_STRNLEN)|g' \
+	      -e 's|@''REPLACE_STRSIGNAL''@|$(REPLACE_STRSIGNAL)|g' \
+	      -e 's|@''REPLACE_STRTOK_R''@|$(REPLACE_STRTOK_R)|g' \
+	      -e 's|@''UNDEFINE_STRTOK_R''@|$(UNDEFINE_STRTOK_R)|g' \
+	      -e '/definitions of _GL_FUNCDECL_RPL/r $(CXXDEFS_H)' \
+	      -e '/definition of _GL_ARG_NONNULL/r $(ARG_NONNULL_H)' \
+	      -e '/definition of _GL_WARN_ON_USE/r $(WARN_ON_USE_H)'; \
+	      < $(srcdir)/string.in.h; \
+	} > $@-t && \
+	mv $@-t $@
+MOSTLYCLEANFILES += string.h string.h-t
+
+EXTRA_DIST += string.in.h
+
+## end   gnulib module string
+
 ## begin gnulib module strtoimax
 
 

=== added file 'lib/memrchr.c'
--- lib/memrchr.c	1970-01-01 00:00:00 +0000
+++ lib/memrchr.c	2013-02-09 03:12:48 +0000
@@ -0,0 +1,161 @@
+/* memrchr -- find the last occurrence of a byte in a memory block
+
+   Copyright (C) 1991, 1993, 1996-1997, 1999-2000, 2003-2013 Free Software
+   Foundation, Inc.
+
+   Based on strlen implementation by Torbjorn Granlund (tege@sics.se),
+   with help from Dan Sahlin (dan@sics.se) and
+   commentary by Jim Blandy (jimb@ai.mit.edu);
+   adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu),
+   and implemented by Roland McGrath (roland@ai.mit.edu).
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#if defined _LIBC
+# include <memcopy.h>
+#else
+# include <config.h>
+# define reg_char char
+#endif
+
+#include <string.h>
+#include <limits.h>
+
+#undef __memrchr
+#ifdef _LIBC
+# undef memrchr
+#endif
+
+#ifndef weak_alias
+# define __memrchr memrchr
+#endif
+
+/* Search no more than N bytes of S for C.  */
+void *
+__memrchr (void const *s, int c_in, size_t n)
+{
+  /* On 32-bit hardware, choosing longword to be a 32-bit unsigned
+     long instead of a 64-bit uintmax_t tends to give better
+     performance.  On 64-bit hardware, unsigned long is generally 64
+     bits already.  Change this typedef to experiment with
+     performance.  */
+  typedef unsigned long int longword;
+
+  const unsigned char *char_ptr;
+  const longword *longword_ptr;
+  longword repeated_one;
+  longword repeated_c;
+  unsigned reg_char c;
+
+  c = (unsigned char) c_in;
+
+  /* Handle the last few bytes by reading one byte at a time.
+     Do this until CHAR_PTR is aligned on a longword boundary.  */
+  for (char_ptr = (const unsigned char *) s + n;
+       n > 0 && (size_t) char_ptr % sizeof (longword) != 0;
+       --n)
+    if (*--char_ptr == c)
+      return (void *) char_ptr;
+
+  longword_ptr = (const longword *) char_ptr;
+
+  /* All these elucidatory comments refer to 4-byte longwords,
+     but the theory applies equally well to any size longwords.  */
+
+  /* Compute auxiliary longword values:
+     repeated_one is a value which has a 1 in every byte.
+     repeated_c has c in every byte.  */
+  repeated_one = 0x01010101;
+  repeated_c = c | (c << 8);
+  repeated_c |= repeated_c << 16;
+  if (0xffffffffU < (longword) -1)
+    {
+      repeated_one |= repeated_one << 31 << 1;
+      repeated_c |= repeated_c << 31 << 1;
+      if (8 < sizeof (longword))
+        {
+          size_t i;
+
+          for (i = 64; i < sizeof (longword) * 8; i *= 2)
+            {
+              repeated_one |= repeated_one << i;
+              repeated_c |= repeated_c << i;
+            }
+        }
+    }
+
+  /* Instead of the traditional loop which tests each byte, we will test a
+     longword at a time.  The tricky part is testing if *any of the four*
+     bytes in the longword in question are equal to c.  We first use an xor
+     with repeated_c.  This reduces the task to testing whether *any of the
+     four* bytes in longword1 is zero.
+
+     We compute tmp =
+       ((longword1 - repeated_one) & ~longword1) & (repeated_one << 7).
+     That is, we perform the following operations:
+       1. Subtract repeated_one.
+       2. & ~longword1.
+       3. & a mask consisting of 0x80 in every byte.
+     Consider what happens in each byte:
+       - If a byte of longword1 is zero, step 1 and 2 transform it into 0xff,
+         and step 3 transforms it into 0x80.  A carry can also be propagated
+         to more significant bytes.
+       - If a byte of longword1 is nonzero, let its lowest 1 bit be at
+         position k (0 <= k <= 7); so the lowest k bits are 0.  After step 1,
+         the byte ends in a single bit of value 0 and k bits of value 1.
+         After step 2, the result is just k bits of value 1: 2^k - 1.  After
+         step 3, the result is 0.  And no carry is produced.
+     So, if longword1 has only non-zero bytes, tmp is zero.
+     Whereas if longword1 has a zero byte, call j the position of the least
+     significant zero byte.  Then the result has a zero at positions 0, ...,
+     j-1 and a 0x80 at position j.  We cannot predict the result at the more
+     significant bytes (positions j+1..3), but it does not matter since we
+     already have a non-zero bit at position 8*j+7.
+
+     So, the test whether any byte in longword1 is zero is equivalent to
+     testing whether tmp is nonzero.  */
+
+  while (n >= sizeof (longword))
+    {
+      longword longword1 = *--longword_ptr ^ repeated_c;
+
+      if ((((longword1 - repeated_one) & ~longword1)
+           & (repeated_one << 7)) != 0)
+        {
+          longword_ptr++;
+          break;
+        }
+      n -= sizeof (longword);
+    }
+
+  char_ptr = (const unsigned char *) longword_ptr;
+
+  /* At this point, we know that either n < sizeof (longword), or one of the
+     sizeof (longword) bytes starting at char_ptr is == c.  On little-endian
+     machines, we could determine the first such byte without any further
+     memory accesses, just by looking at the tmp result from the last loop
+     iteration.  But this does not work on big-endian machines.  Choose code
+     that works in both cases.  */
+
+  while (n-- > 0)
+    {
+      if (*--char_ptr == c)
+        return (void *) char_ptr;
+    }
+
+  return NULL;
+}
+#ifdef weak_alias
+weak_alias (__memrchr, memrchr)
+#endif

=== added file 'lib/string.in.h'
--- lib/string.in.h	1970-01-01 00:00:00 +0000
+++ lib/string.in.h	2013-02-09 03:12:48 +0000
@@ -0,0 +1,1029 @@
+/* A GNU-like <string.h>.
+
+   Copyright (C) 1995-1996, 2001-2013 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef _@GUARD_PREFIX@_STRING_H
+
+#if __GNUC__ >= 3
+@PRAGMA_SYSTEM_HEADER@
+#endif
+@PRAGMA_COLUMNS@
+
+/* The include_next requires a split double-inclusion guard.  */
+#@INCLUDE_NEXT@ @NEXT_STRING_H@
+
+#ifndef _@GUARD_PREFIX@_STRING_H
+#define _@GUARD_PREFIX@_STRING_H
+
+/* NetBSD 5.0 mis-defines NULL.  */
+#include <stddef.h>
+
+/* MirBSD defines mbslen as a macro.  */
+#if @GNULIB_MBSLEN@ && defined __MirBSD__
+# include <wchar.h>
+#endif
+
+/* The __attribute__ feature is available in gcc versions 2.5 and later.
+   The attribute __pure__ was added in gcc 2.96.  */
+#if __GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 96)
+# define _GL_ATTRIBUTE_PURE __attribute__ ((__pure__))
+#else
+# define _GL_ATTRIBUTE_PURE /* empty */
+#endif
+
+/* NetBSD 5.0 declares strsignal in <unistd.h>, not in <string.h>.  */
+/* But in any case avoid namespace pollution on glibc systems.  */
+#if (@GNULIB_STRSIGNAL@ || defined GNULIB_POSIXCHECK) && defined __NetBSD__ \
+    && ! defined __GLIBC__
+# include <unistd.h>
+#endif
+
+/* The definitions of _GL_FUNCDECL_RPL etc. are copied here.  */
+
+/* The definition of _GL_ARG_NONNULL is copied here.  */
+
+/* The definition of _GL_WARN_ON_USE is copied here.  */
+
+
+/* Find the index of the least-significant set bit.  */
+#if @GNULIB_FFSL@
+# if !@HAVE_FFSL@
+_GL_FUNCDECL_SYS (ffsl, int, (long int i));
+# endif
+_GL_CXXALIAS_SYS (ffsl, int, (long int i));
+_GL_CXXALIASWARN (ffsl);
+#elif defined GNULIB_POSIXCHECK
+# undef ffsl
+# if HAVE_RAW_DECL_FFSL
+_GL_WARN_ON_USE (ffsl, "ffsl is not portable - use the ffsl module");
+# endif
+#endif
+
+
+/* Find the index of the least-significant set bit.  */
+#if @GNULIB_FFSLL@
+# if !@HAVE_FFSLL@
+_GL_FUNCDECL_SYS (ffsll, int, (long long int i));
+# endif
+_GL_CXXALIAS_SYS (ffsll, int, (long long int i));
+_GL_CXXALIASWARN (ffsll);
+#elif defined GNULIB_POSIXCHECK
+# undef ffsll
+# if HAVE_RAW_DECL_FFSLL
+_GL_WARN_ON_USE (ffsll, "ffsll is not portable - use the ffsll module");
+# endif
+#endif
+
+
+/* Return the first instance of C within N bytes of S, or NULL.  */
+#if @GNULIB_MEMCHR@
+# if @REPLACE_MEMCHR@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define memchr rpl_memchr
+#  endif
+_GL_FUNCDECL_RPL (memchr, void *, (void const *__s, int __c, size_t __n)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (memchr, void *, (void const *__s, int __c, size_t __n));
+# else
+#  if ! @HAVE_MEMCHR@
+_GL_FUNCDECL_SYS (memchr, void *, (void const *__s, int __c, size_t __n)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+#  endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C" { const void * std::memchr (const void *, int, size_t); }
+       extern "C++" { void * std::memchr (void *, int, size_t); }  */
+_GL_CXXALIAS_SYS_CAST2 (memchr,
+                        void *, (void const *__s, int __c, size_t __n),
+                        void const *, (void const *__s, int __c, size_t __n));
+# endif
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (memchr, void *, (void *__s, int __c, size_t __n));
+_GL_CXXALIASWARN1 (memchr, void const *,
+                   (void const *__s, int __c, size_t __n));
+# else
+_GL_CXXALIASWARN (memchr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef memchr
+/* Assume memchr is always declared.  */
+_GL_WARN_ON_USE (memchr, "memchr has platform-specific bugs - "
+                 "use gnulib module memchr for portability" );
+#endif
+
+/* Return the first occurrence of NEEDLE in HAYSTACK.  */
+#if @GNULIB_MEMMEM@
+# if @REPLACE_MEMMEM@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define memmem rpl_memmem
+#  endif
+_GL_FUNCDECL_RPL (memmem, void *,
+                  (void const *__haystack, size_t __haystack_len,
+                   void const *__needle, size_t __needle_len)
+                  _GL_ATTRIBUTE_PURE
+                  _GL_ARG_NONNULL ((1, 3)));
+_GL_CXXALIAS_RPL (memmem, void *,
+                  (void const *__haystack, size_t __haystack_len,
+                   void const *__needle, size_t __needle_len));
+# else
+#  if ! @HAVE_DECL_MEMMEM@
+_GL_FUNCDECL_SYS (memmem, void *,
+                  (void const *__haystack, size_t __haystack_len,
+                   void const *__needle, size_t __needle_len)
+                  _GL_ATTRIBUTE_PURE
+                  _GL_ARG_NONNULL ((1, 3)));
+#  endif
+_GL_CXXALIAS_SYS (memmem, void *,
+                  (void const *__haystack, size_t __haystack_len,
+                   void const *__needle, size_t __needle_len));
+# endif
+_GL_CXXALIASWARN (memmem);
+#elif defined GNULIB_POSIXCHECK
+# undef memmem
+# if HAVE_RAW_DECL_MEMMEM
+_GL_WARN_ON_USE (memmem, "memmem is unportable and often quadratic - "
+                 "use gnulib module memmem-simple for portability, "
+                 "and module memmem for speed" );
+# endif
+#endif
+
+/* Copy N bytes of SRC to DEST, return pointer to bytes after the
+   last written byte.  */
+#if @GNULIB_MEMPCPY@
+# if ! @HAVE_MEMPCPY@
+_GL_FUNCDECL_SYS (mempcpy, void *,
+                  (void *restrict __dest, void const *restrict __src,
+                   size_t __n)
+                  _GL_ARG_NONNULL ((1, 2)));
+# endif
+_GL_CXXALIAS_SYS (mempcpy, void *,
+                  (void *restrict __dest, void const *restrict __src,
+                   size_t __n));
+_GL_CXXALIASWARN (mempcpy);
+#elif defined GNULIB_POSIXCHECK
+# undef mempcpy
+# if HAVE_RAW_DECL_MEMPCPY
+_GL_WARN_ON_USE (mempcpy, "mempcpy is unportable - "
+                 "use gnulib module mempcpy for portability");
+# endif
+#endif
+
+/* Search backwards through a block for a byte (specified as an int).  */
+#if @GNULIB_MEMRCHR@
+# if ! @HAVE_DECL_MEMRCHR@
+_GL_FUNCDECL_SYS (memrchr, void *, (void const *, int, size_t)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+# endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const void * std::memrchr (const void *, int, size_t); }
+       extern "C++" { void * std::memrchr (void *, int, size_t); }  */
+_GL_CXXALIAS_SYS_CAST2 (memrchr,
+                        void *, (void const *, int, size_t),
+                        void const *, (void const *, int, size_t));
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (memrchr, void *, (void *, int, size_t));
+_GL_CXXALIASWARN1 (memrchr, void const *, (void const *, int, size_t));
+# else
+_GL_CXXALIASWARN (memrchr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef memrchr
+# if HAVE_RAW_DECL_MEMRCHR
+_GL_WARN_ON_USE (memrchr, "memrchr is unportable - "
+                 "use gnulib module memrchr for portability");
+# endif
+#endif
+
+/* Find the first occurrence of C in S.  More efficient than
+   memchr(S,C,N), at the expense of undefined behavior if C does not
+   occur within N bytes.  */
+#if @GNULIB_RAWMEMCHR@
+# if ! @HAVE_RAWMEMCHR@
+_GL_FUNCDECL_SYS (rawmemchr, void *, (void const *__s, int __c_in)
+                                     _GL_ATTRIBUTE_PURE
+                                     _GL_ARG_NONNULL ((1)));
+# endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const void * std::rawmemchr (const void *, int); }
+       extern "C++" { void * std::rawmemchr (void *, int); }  */
+_GL_CXXALIAS_SYS_CAST2 (rawmemchr,
+                        void *, (void const *__s, int __c_in),
+                        void const *, (void const *__s, int __c_in));
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (rawmemchr, void *, (void *__s, int __c_in));
+_GL_CXXALIASWARN1 (rawmemchr, void const *, (void const *__s, int __c_in));
+# else
+_GL_CXXALIASWARN (rawmemchr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef rawmemchr
+# if HAVE_RAW_DECL_RAWMEMCHR
+_GL_WARN_ON_USE (rawmemchr, "rawmemchr is unportable - "
+                 "use gnulib module rawmemchr for portability");
+# endif
+#endif
+
+/* Copy SRC to DST, returning the address of the terminating '\0' in DST.  */
+#if @GNULIB_STPCPY@
+# if ! @HAVE_STPCPY@
+_GL_FUNCDECL_SYS (stpcpy, char *,
+                  (char *restrict __dst, char const *restrict __src)
+                  _GL_ARG_NONNULL ((1, 2)));
+# endif
+_GL_CXXALIAS_SYS (stpcpy, char *,
+                  (char *restrict __dst, char const *restrict __src));
+_GL_CXXALIASWARN (stpcpy);
+#elif defined GNULIB_POSIXCHECK
+# undef stpcpy
+# if HAVE_RAW_DECL_STPCPY
+_GL_WARN_ON_USE (stpcpy, "stpcpy is unportable - "
+                 "use gnulib module stpcpy for portability");
+# endif
+#endif
+
+/* Copy no more than N bytes of SRC to DST, returning a pointer past the
+   last non-NUL byte written into DST.  */
+#if @GNULIB_STPNCPY@
+# if @REPLACE_STPNCPY@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef stpncpy
+#   define stpncpy rpl_stpncpy
+#  endif
+_GL_FUNCDECL_RPL (stpncpy, char *,
+                  (char *restrict __dst, char const *restrict __src,
+                   size_t __n)
+                  _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (stpncpy, char *,
+                  (char *restrict __dst, char const *restrict __src,
+                   size_t __n));
+# else
+#  if ! @HAVE_STPNCPY@
+_GL_FUNCDECL_SYS (stpncpy, char *,
+                  (char *restrict __dst, char const *restrict __src,
+                   size_t __n)
+                  _GL_ARG_NONNULL ((1, 2)));
+#  endif
+_GL_CXXALIAS_SYS (stpncpy, char *,
+                  (char *restrict __dst, char const *restrict __src,
+                   size_t __n));
+# endif
+_GL_CXXALIASWARN (stpncpy);
+#elif defined GNULIB_POSIXCHECK
+# undef stpncpy
+# if HAVE_RAW_DECL_STPNCPY
+_GL_WARN_ON_USE (stpncpy, "stpncpy is unportable - "
+                 "use gnulib module stpncpy for portability");
+# endif
+#endif
+
+#if defined GNULIB_POSIXCHECK
+/* strchr() does not work with multibyte strings if the locale encoding is
+   GB18030 and the character to be searched is a digit.  */
+# undef strchr
+/* Assume strchr is always declared.  */
+_GL_WARN_ON_USE (strchr, "strchr cannot work correctly on character strings "
+                 "in some multibyte locales - "
+                 "use mbschr if you care about internationalization");
+#endif
+
+/* Find the first occurrence of C in S or the final NUL byte.  */
+#if @GNULIB_STRCHRNUL@
+# if @REPLACE_STRCHRNUL@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define strchrnul rpl_strchrnul
+#  endif
+_GL_FUNCDECL_RPL (strchrnul, char *, (const char *__s, int __c_in)
+                                     _GL_ATTRIBUTE_PURE
+                                     _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (strchrnul, char *,
+                  (const char *str, int ch));
+# else
+#  if ! @HAVE_STRCHRNUL@
+_GL_FUNCDECL_SYS (strchrnul, char *, (char const *__s, int __c_in)
+                                     _GL_ATTRIBUTE_PURE
+                                     _GL_ARG_NONNULL ((1)));
+#  endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const char * std::strchrnul (const char *, int); }
+       extern "C++" { char * std::strchrnul (char *, int); }  */
+_GL_CXXALIAS_SYS_CAST2 (strchrnul,
+                        char *, (char const *__s, int __c_in),
+                        char const *, (char const *__s, int __c_in));
+# endif
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (strchrnul, char *, (char *__s, int __c_in));
+_GL_CXXALIASWARN1 (strchrnul, char const *, (char const *__s, int __c_in));
+# else
+_GL_CXXALIASWARN (strchrnul);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strchrnul
+# if HAVE_RAW_DECL_STRCHRNUL
+_GL_WARN_ON_USE (strchrnul, "strchrnul is unportable - "
+                 "use gnulib module strchrnul for portability");
+# endif
+#endif
+
+/* Duplicate S, returning an identical malloc'd string.  */
+#if @GNULIB_STRDUP@
+# if @REPLACE_STRDUP@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strdup
+#   define strdup rpl_strdup
+#  endif
+_GL_FUNCDECL_RPL (strdup, char *, (char const *__s) _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (strdup, char *, (char const *__s));
+# else
+#  if defined __cplusplus && defined GNULIB_NAMESPACE && defined strdup
+    /* strdup exists as a function and as a macro.  Get rid of the macro.  */
+#   undef strdup
+#  endif
+#  if !(@HAVE_DECL_STRDUP@ || defined strdup)
+_GL_FUNCDECL_SYS (strdup, char *, (char const *__s) _GL_ARG_NONNULL ((1)));
+#  endif
+_GL_CXXALIAS_SYS (strdup, char *, (char const *__s));
+# endif
+_GL_CXXALIASWARN (strdup);
+#elif defined GNULIB_POSIXCHECK
+# undef strdup
+# if HAVE_RAW_DECL_STRDUP
+_GL_WARN_ON_USE (strdup, "strdup is unportable - "
+                 "use gnulib module strdup for portability");
+# endif
+#endif
+
+/* Append no more than N characters from SRC onto DEST.  */
+#if @GNULIB_STRNCAT@
+# if @REPLACE_STRNCAT@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strncat
+#   define strncat rpl_strncat
+#  endif
+_GL_FUNCDECL_RPL (strncat, char *, (char *dest, const char *src, size_t n)
+                                   _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (strncat, char *, (char *dest, const char *src, size_t n));
+# else
+_GL_CXXALIAS_SYS (strncat, char *, (char *dest, const char *src, size_t n));
+# endif
+_GL_CXXALIASWARN (strncat);
+#elif defined GNULIB_POSIXCHECK
+# undef strncat
+# if HAVE_RAW_DECL_STRNCAT
+_GL_WARN_ON_USE (strncat, "strncat is unportable - "
+                 "use gnulib module strncat for portability");
+# endif
+#endif
+
+/* Return a newly allocated copy of at most N bytes of STRING.  */
+#if @GNULIB_STRNDUP@
+# if @REPLACE_STRNDUP@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strndup
+#   define strndup rpl_strndup
+#  endif
+_GL_FUNCDECL_RPL (strndup, char *, (char const *__string, size_t __n)
+                                   _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (strndup, char *, (char const *__string, size_t __n));
+# else
+#  if ! @HAVE_DECL_STRNDUP@
+_GL_FUNCDECL_SYS (strndup, char *, (char const *__string, size_t __n)
+                                   _GL_ARG_NONNULL ((1)));
+#  endif
+_GL_CXXALIAS_SYS (strndup, char *, (char const *__string, size_t __n));
+# endif
+_GL_CXXALIASWARN (strndup);
+#elif defined GNULIB_POSIXCHECK
+# undef strndup
+# if HAVE_RAW_DECL_STRNDUP
+_GL_WARN_ON_USE (strndup, "strndup is unportable - "
+                 "use gnulib module strndup for portability");
+# endif
+#endif
+
+/* Find the length (number of bytes) of STRING, but scan at most
+   MAXLEN bytes.  If no '\0' terminator is found in that many bytes,
+   return MAXLEN.  */
+#if @GNULIB_STRNLEN@
+# if @REPLACE_STRNLEN@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strnlen
+#   define strnlen rpl_strnlen
+#  endif
+_GL_FUNCDECL_RPL (strnlen, size_t, (char const *__string, size_t __maxlen)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (strnlen, size_t, (char const *__string, size_t __maxlen));
+# else
+#  if ! @HAVE_DECL_STRNLEN@
+_GL_FUNCDECL_SYS (strnlen, size_t, (char const *__string, size_t __maxlen)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+#  endif
+_GL_CXXALIAS_SYS (strnlen, size_t, (char const *__string, size_t __maxlen));
+# endif
+_GL_CXXALIASWARN (strnlen);
+#elif defined GNULIB_POSIXCHECK
+# undef strnlen
+# if HAVE_RAW_DECL_STRNLEN
+_GL_WARN_ON_USE (strnlen, "strnlen is unportable - "
+                 "use gnulib module strnlen for portability");
+# endif
+#endif
+
+#if defined GNULIB_POSIXCHECK
+/* strcspn() assumes the second argument is a list of single-byte characters.
+   Even in this simple case, it does not work with multibyte strings if the
+   locale encoding is GB18030 and one of the characters to be searched is a
+   digit.  */
+# undef strcspn
+/* Assume strcspn is always declared.  */
+_GL_WARN_ON_USE (strcspn, "strcspn cannot work correctly on character strings "
+                 "in multibyte locales - "
+                 "use mbscspn if you care about internationalization");
+#endif
+
+/* Find the first occurrence in S of any character in ACCEPT.  */
+#if @GNULIB_STRPBRK@
+# if ! @HAVE_STRPBRK@
+_GL_FUNCDECL_SYS (strpbrk, char *, (char const *__s, char const *__accept)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1, 2)));
+# endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C" { const char * strpbrk (const char *, const char *); }
+       extern "C++" { char * strpbrk (char *, const char *); }  */
+_GL_CXXALIAS_SYS_CAST2 (strpbrk,
+                        char *, (char const *__s, char const *__accept),
+                        const char *, (char const *__s, char const *__accept));
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (strpbrk, char *, (char *__s, char const *__accept));
+_GL_CXXALIASWARN1 (strpbrk, char const *,
+                   (char const *__s, char const *__accept));
+# else
+_GL_CXXALIASWARN (strpbrk);
+# endif
+# if defined GNULIB_POSIXCHECK
+/* strpbrk() assumes the second argument is a list of single-byte characters.
+   Even in this simple case, it does not work with multibyte strings if the
+   locale encoding is GB18030 and one of the characters to be searched is a
+   digit.  */
+#  undef strpbrk
+_GL_WARN_ON_USE (strpbrk, "strpbrk cannot work correctly on character strings "
+                 "in multibyte locales - "
+                 "use mbspbrk if you care about internationalization");
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strpbrk
+# if HAVE_RAW_DECL_STRPBRK
+_GL_WARN_ON_USE (strpbrk, "strpbrk is unportable - "
+                 "use gnulib module strpbrk for portability");
+# endif
+#endif
+
+#if defined GNULIB_POSIXCHECK
+/* strspn() assumes the second argument is a list of single-byte characters.
+   Even in this simple case, it cannot work with multibyte strings.  */
+# undef strspn
+/* Assume strspn is always declared.  */
+_GL_WARN_ON_USE (strspn, "strspn cannot work correctly on character strings "
+                 "in multibyte locales - "
+                 "use mbsspn if you care about internationalization");
+#endif
+
+#if defined GNULIB_POSIXCHECK
+/* strrchr() does not work with multibyte strings if the locale encoding is
+   GB18030 and the character to be searched is a digit.  */
+# undef strrchr
+/* Assume strrchr is always declared.  */
+_GL_WARN_ON_USE (strrchr, "strrchr cannot work correctly on character strings "
+                 "in some multibyte locales - "
+                 "use mbsrchr if you care about internationalization");
+#endif
+
+/* Search the next delimiter (char listed in DELIM) starting at *STRINGP.
+   If one is found, overwrite it with a NUL, and advance *STRINGP
+   to point to the next char after it.  Otherwise, set *STRINGP to NULL.
+   If *STRINGP was already NULL, nothing happens.
+   Return the old value of *STRINGP.
+
+   This is a variant of strtok() that is multithread-safe and supports
+   empty fields.
+
+   Caveat: It modifies the original string.
+   Caveat: These functions cannot be used on constant strings.
+   Caveat: The identity of the delimiting character is lost.
+   Caveat: It doesn't work with multibyte strings unless all of the delimiter
+           characters are ASCII characters < 0x30.
+
+   See also strtok_r().  */
+#if @GNULIB_STRSEP@
+# if ! @HAVE_STRSEP@
+_GL_FUNCDECL_SYS (strsep, char *,
+                  (char **restrict __stringp, char const *restrict __delim)
+                  _GL_ARG_NONNULL ((1, 2)));
+# endif
+_GL_CXXALIAS_SYS (strsep, char *,
+                  (char **restrict __stringp, char const *restrict __delim));
+_GL_CXXALIASWARN (strsep);
+# if defined GNULIB_POSIXCHECK
+#  undef strsep
+_GL_WARN_ON_USE (strsep, "strsep cannot work correctly on character strings "
+                 "in multibyte locales - "
+                 "use mbssep if you care about internationalization");
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strsep
+# if HAVE_RAW_DECL_STRSEP
+_GL_WARN_ON_USE (strsep, "strsep is unportable - "
+                 "use gnulib module strsep for portability");
+# endif
+#endif
+
+#if @GNULIB_STRSTR@
+# if @REPLACE_STRSTR@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define strstr rpl_strstr
+#  endif
+_GL_FUNCDECL_RPL (strstr, char *, (const char *haystack, const char *needle)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (strstr, char *, (const char *haystack, const char *needle));
+# else
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const char * strstr (const char *, const char *); }
+       extern "C++" { char * strstr (char *, const char *); }  */
+_GL_CXXALIAS_SYS_CAST2 (strstr,
+                        char *, (const char *haystack, const char *needle),
+                        const char *, (const char *haystack, const char *needle));
+# endif
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (strstr, char *, (char *haystack, const char *needle));
+_GL_CXXALIASWARN1 (strstr, const char *,
+                   (const char *haystack, const char *needle));
+# else
+_GL_CXXALIASWARN (strstr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+/* strstr() does not work with multibyte strings if the locale encoding is
+   different from UTF-8:
+   POSIX says that it operates on "strings", and "string" in POSIX is defined
+   as a sequence of bytes, not of characters.  */
+# undef strstr
+/* Assume strstr is always declared.  */
+_GL_WARN_ON_USE (strstr, "strstr is quadratic on many systems, and cannot "
+                 "work correctly on character strings in most "
+                 "multibyte locales - "
+                 "use mbsstr if you care about internationalization, "
+                 "or use strstr if you care about speed");
+#endif
+
+/* Find the first occurrence of NEEDLE in HAYSTACK, using case-insensitive
+   comparison.  */
+#if @GNULIB_STRCASESTR@
+# if @REPLACE_STRCASESTR@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define strcasestr rpl_strcasestr
+#  endif
+_GL_FUNCDECL_RPL (strcasestr, char *,
+                  (const char *haystack, const char *needle)
+                  _GL_ATTRIBUTE_PURE
+                  _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (strcasestr, char *,
+                  (const char *haystack, const char *needle));
+# else
+#  if ! @HAVE_STRCASESTR@
+_GL_FUNCDECL_SYS (strcasestr, char *,
+                  (const char *haystack, const char *needle)
+                  _GL_ATTRIBUTE_PURE
+                  _GL_ARG_NONNULL ((1, 2)));
+#  endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const char * strcasestr (const char *, const char *); }
+       extern "C++" { char * strcasestr (char *, const char *); }  */
+_GL_CXXALIAS_SYS_CAST2 (strcasestr,
+                        char *, (const char *haystack, const char *needle),
+                        const char *, (const char *haystack, const char *needle));
+# endif
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (strcasestr, char *, (char *haystack, const char *needle));
+_GL_CXXALIASWARN1 (strcasestr, const char *,
+                   (const char *haystack, const char *needle));
+# else
+_GL_CXXALIASWARN (strcasestr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+/* strcasestr() does not work with multibyte strings:
+   It is a glibc extension, and glibc implements it only for unibyte
+   locales.  */
+# undef strcasestr
+# if HAVE_RAW_DECL_STRCASESTR
+_GL_WARN_ON_USE (strcasestr, "strcasestr does work correctly on character "
+                 "strings in multibyte locales - "
+                 "use mbscasestr if you care about "
+                 "internationalization, or use c-strcasestr if you want "
+                 "a locale independent function");
+# endif
+#endif
+
+/* Parse S into tokens separated by characters in DELIM.
+   If S is NULL, the saved pointer in SAVE_PTR is used as
+   the next starting point.  For example:
+        char s[] = "-abc-=-def";
+        char *sp;
+        x = strtok_r(s, "-", &sp);      // x = "abc", sp = "=-def"
+        x = strtok_r(NULL, "-=", &sp);  // x = "def", sp = NULL
+        x = strtok_r(NULL, "=", &sp);   // x = NULL
+                // s = "abc\0-def\0"
+
+   This is a variant of strtok() that is multithread-safe.
+
+   For the POSIX documentation for this function, see:
+   http://www.opengroup.org/susv3xsh/strtok.html
+
+   Caveat: It modifies the original string.
+   Caveat: These functions cannot be used on constant strings.
+   Caveat: The identity of the delimiting character is lost.
+   Caveat: It doesn't work with multibyte strings unless all of the delimiter
+           characters are ASCII characters < 0x30.
+
+   See also strsep().  */
+#if @GNULIB_STRTOK_R@
+# if @REPLACE_STRTOK_R@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strtok_r
+#   define strtok_r rpl_strtok_r
+#  endif
+_GL_FUNCDECL_RPL (strtok_r, char *,
+                  (char *restrict s, char const *restrict delim,
+                   char **restrict save_ptr)
+                  _GL_ARG_NONNULL ((2, 3)));
+_GL_CXXALIAS_RPL (strtok_r, char *,
+                  (char *restrict s, char const *restrict delim,
+                   char **restrict save_ptr));
+# else
+#  if @UNDEFINE_STRTOK_R@ || defined GNULIB_POSIXCHECK
+#   undef strtok_r
+#  endif
+#  if ! @HAVE_DECL_STRTOK_R@
+_GL_FUNCDECL_SYS (strtok_r, char *,
+                  (char *restrict s, char const *restrict delim,
+                   char **restrict save_ptr)
+                  _GL_ARG_NONNULL ((2, 3)));
+#  endif
+_GL_CXXALIAS_SYS (strtok_r, char *,
+                  (char *restrict s, char const *restrict delim,
+                   char **restrict save_ptr));
+# endif
+_GL_CXXALIASWARN (strtok_r);
+# if defined GNULIB_POSIXCHECK
+_GL_WARN_ON_USE (strtok_r, "strtok_r cannot work correctly on character "
+                 "strings in multibyte locales - "
+                 "use mbstok_r if you care about internationalization");
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strtok_r
+# if HAVE_RAW_DECL_STRTOK_R
+_GL_WARN_ON_USE (strtok_r, "strtok_r is unportable - "
+                 "use gnulib module strtok_r for portability");
+# endif
+#endif
+
+
+/* The following functions are not specified by POSIX.  They are gnulib
+   extensions.  */
+
+#if @GNULIB_MBSLEN@
+/* Return the number of multibyte characters in the character string STRING.
+   This considers multibyte characters, unlike strlen, which counts bytes.  */
+# ifdef __MirBSD__  /* MirBSD defines mbslen as a macro.  Override it.  */
+#  undef mbslen
+# endif
+# if @HAVE_MBSLEN@  /* AIX, OSF/1, MirBSD define mbslen already in libc.  */
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define mbslen rpl_mbslen
+#  endif
+_GL_FUNCDECL_RPL (mbslen, size_t, (const char *string)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (mbslen, size_t, (const char *string));
+# else
+_GL_FUNCDECL_SYS (mbslen, size_t, (const char *string)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_SYS (mbslen, size_t, (const char *string));
+# endif
+_GL_CXXALIASWARN (mbslen);
+#endif
+
+#if @GNULIB_MBSNLEN@
+/* Return the number of multibyte characters in the character string starting
+   at STRING and ending at STRING + LEN.  */
+_GL_EXTERN_C size_t mbsnlen (const char *string, size_t len)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1));
+#endif
+
+#if @GNULIB_MBSCHR@
+/* Locate the first single-byte character C in the character string STRING,
+   and return a pointer to it.  Return NULL if C is not found in STRING.
+   Unlike strchr(), this function works correctly in multibyte locales with
+   encodings such as GB18030.  */
+# if defined __hpux
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define mbschr rpl_mbschr /* avoid collision with HP-UX function */
+#  endif
+_GL_FUNCDECL_RPL (mbschr, char *, (const char *string, int c)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (mbschr, char *, (const char *string, int c));
+# else
+_GL_FUNCDECL_SYS (mbschr, char *, (const char *string, int c)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_SYS (mbschr, char *, (const char *string, int c));
+# endif
+_GL_CXXALIASWARN (mbschr);
+#endif
+
+#if @GNULIB_MBSRCHR@
+/* Locate the last single-byte character C in the character string STRING,
+   and return a pointer to it.  Return NULL if C is not found in STRING.
+   Unlike strrchr(), this function works correctly in multibyte locales with
+   encodings such as GB18030.  */
+# if defined __hpux || defined __INTERIX
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define mbsrchr rpl_mbsrchr /* avoid collision with system function */
+#  endif
+_GL_FUNCDECL_RPL (mbsrchr, char *, (const char *string, int c)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (mbsrchr, char *, (const char *string, int c));
+# else
+_GL_FUNCDECL_SYS (mbsrchr, char *, (const char *string, int c)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_SYS (mbsrchr, char *, (const char *string, int c));
+# endif
+_GL_CXXALIASWARN (mbsrchr);
+#endif
+
+#if @GNULIB_MBSSTR@
+/* Find the first occurrence of the character string NEEDLE in the character
+   string HAYSTACK.  Return NULL if NEEDLE is not found in HAYSTACK.
+   Unlike strstr(), this function works correctly in multibyte locales with
+   encodings different from UTF-8.  */
+_GL_EXTERN_C char * mbsstr (const char *haystack, const char *needle)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSCASECMP@
+/* Compare the character strings S1 and S2, ignoring case, returning less than,
+   equal to or greater than zero if S1 is lexicographically less than, equal to
+   or greater than S2.
+   Note: This function may, in multibyte locales, return 0 for strings of
+   different lengths!
+   Unlike strcasecmp(), this function works correctly in multibyte locales.  */
+_GL_EXTERN_C int mbscasecmp (const char *s1, const char *s2)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSNCASECMP@
+/* Compare the initial segment of the character string S1 consisting of at most
+   N characters with the initial segment of the character string S2 consisting
+   of at most N characters, ignoring case, returning less than, equal to or
+   greater than zero if the initial segment of S1 is lexicographically less
+   than, equal to or greater than the initial segment of S2.
+   Note: This function may, in multibyte locales, return 0 for initial segments
+   of different lengths!
+   Unlike strncasecmp(), this function works correctly in multibyte locales.
+   But beware that N is not a byte count but a character count!  */
+_GL_EXTERN_C int mbsncasecmp (const char *s1, const char *s2, size_t n)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSPCASECMP@
+/* Compare the initial segment of the character string STRING consisting of
+   at most mbslen (PREFIX) characters with the character string PREFIX,
+   ignoring case.  If the two match, return a pointer to the first byte
+   after this prefix in STRING.  Otherwise, return NULL.
+   Note: This function may, in multibyte locales, return non-NULL if STRING
+   is of smaller length than PREFIX!
+   Unlike strncasecmp(), this function works correctly in multibyte
+   locales.  */
+_GL_EXTERN_C char * mbspcasecmp (const char *string, const char *prefix)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSCASESTR@
+/* Find the first occurrence of the character string NEEDLE in the character
+   string HAYSTACK, using case-insensitive comparison.
+   Note: This function may, in multibyte locales, return success even if
+   strlen (haystack) < strlen (needle) !
+   Unlike strcasestr(), this function works correctly in multibyte locales.  */
+_GL_EXTERN_C char * mbscasestr (const char *haystack, const char *needle)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSCSPN@
+/* Find the first occurrence in the character string STRING of any character
+   in the character string ACCEPT.  Return the number of bytes from the
+   beginning of the string to this occurrence, or to the end of the string
+   if none exists.
+   Unlike strcspn(), this function works correctly in multibyte locales.  */
+_GL_EXTERN_C size_t mbscspn (const char *string, const char *accept)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSPBRK@
+/* Find the first occurrence in the character string STRING of any character
+   in the character string ACCEPT.  Return the pointer to it, or NULL if none
+   exists.
+   Unlike strpbrk(), this function works correctly in multibyte locales.  */
+# if defined __hpux
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define mbspbrk rpl_mbspbrk /* avoid collision with HP-UX function */
+#  endif
+_GL_FUNCDECL_RPL (mbspbrk, char *, (const char *string, const char *accept)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (mbspbrk, char *, (const char *string, const char *accept));
+# else
+_GL_FUNCDECL_SYS (mbspbrk, char *, (const char *string, const char *accept)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_SYS (mbspbrk, char *, (const char *string, const char *accept));
+# endif
+_GL_CXXALIASWARN (mbspbrk);
+#endif
+
+#if @GNULIB_MBSSPN@
+/* Find the first occurrence in the character string STRING of any character
+   not in the character string REJECT.  Return the number of bytes from the
+   beginning of the string to this occurrence, or to the end of the string
+   if none exists.
+   Unlike strspn(), this function works correctly in multibyte locales.  */
+_GL_EXTERN_C size_t mbsspn (const char *string, const char *reject)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSSEP@
+/* Search the next delimiter (multibyte character listed in the character
+   string DELIM) starting at the character string *STRINGP.
+   If one is found, overwrite it with a NUL, and advance *STRINGP to point
+   to the next multibyte character after it.  Otherwise, set *STRINGP to NULL.
+   If *STRINGP was already NULL, nothing happens.
+   Return the old value of *STRINGP.
+
+   This is a variant of mbstok_r() that supports empty fields.
+
+   Caveat: It modifies the original string.
+   Caveat: These functions cannot be used on constant strings.
+   Caveat: The identity of the delimiting character is lost.
+
+   See also mbstok_r().  */
+_GL_EXTERN_C char * mbssep (char **stringp, const char *delim)
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSTOK_R@
+/* Parse the character string STRING into tokens separated by characters in
+   the character string DELIM.
+   If STRING is NULL, the saved pointer in SAVE_PTR is used as
+   the next starting point.  For example:
+        char s[] = "-abc-=-def";
+        char *sp;
+        x = mbstok_r(s, "-", &sp);      // x = "abc", sp = "=-def"
+        x = mbstok_r(NULL, "-=", &sp);  // x = "def", sp = NULL
+        x = mbstok_r(NULL, "=", &sp);   // x = NULL
+                // s = "abc\0-def\0"
+
+   Caveat: It modifies the original string.
+   Caveat: These functions cannot be used on constant strings.
+   Caveat: The identity of the delimiting character is lost.
+
+   See also mbssep().  */
+_GL_EXTERN_C char * mbstok_r (char *string, const char *delim, char **save_ptr)
+     _GL_ARG_NONNULL ((2, 3));
+#endif
+
+/* Map any int, typically from errno, into an error message.  */
+#if @GNULIB_STRERROR@
+# if @REPLACE_STRERROR@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strerror
+#   define strerror rpl_strerror
+#  endif
+_GL_FUNCDECL_RPL (strerror, char *, (int));
+_GL_CXXALIAS_RPL (strerror, char *, (int));
+# else
+_GL_CXXALIAS_SYS (strerror, char *, (int));
+# endif
+_GL_CXXALIASWARN (strerror);
+#elif defined GNULIB_POSIXCHECK
+# undef strerror
+/* Assume strerror is always declared.  */
+_GL_WARN_ON_USE (strerror, "strerror is unportable - "
+                 "use gnulib module strerror to guarantee non-NULL result");
+#endif
+
+/* Map any int, typically from errno, into an error message.  Multithread-safe.
+   Uses the POSIX declaration, not the glibc declaration.  */
+#if @GNULIB_STRERROR_R@
+# if @REPLACE_STRERROR_R@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strerror_r
+#   define strerror_r rpl_strerror_r
+#  endif
+_GL_FUNCDECL_RPL (strerror_r, int, (int errnum, char *buf, size_t buflen)
+                                   _GL_ARG_NONNULL ((2)));
+_GL_CXXALIAS_RPL (strerror_r, int, (int errnum, char *buf, size_t buflen));
+# else
+#  if !@HAVE_DECL_STRERROR_R@
+_GL_FUNCDECL_SYS (strerror_r, int, (int errnum, char *buf, size_t buflen)
+                                   _GL_ARG_NONNULL ((2)));
+#  endif
+_GL_CXXALIAS_SYS (strerror_r, int, (int errnum, char *buf, size_t buflen));
+# endif
+# if @HAVE_DECL_STRERROR_R@
+_GL_CXXALIASWARN (strerror_r);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strerror_r
+# if HAVE_RAW_DECL_STRERROR_R
+_GL_WARN_ON_USE (strerror_r, "strerror_r is unportable - "
+                 "use gnulib module strerror_r-posix for portability");
+# endif
+#endif
+
+#if @GNULIB_STRSIGNAL@
+# if @REPLACE_STRSIGNAL@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define strsignal rpl_strsignal
+#  endif
+_GL_FUNCDECL_RPL (strsignal, char *, (int __sig));
+_GL_CXXALIAS_RPL (strsignal, char *, (int __sig));
+# else
+#  if ! @HAVE_DECL_STRSIGNAL@
+_GL_FUNCDECL_SYS (strsignal, char *, (int __sig));
+#  endif
+/* Need to cast, because on Cygwin 1.5.x systems, the return type is
+   'const char *'.  */
+_GL_CXXALIAS_SYS_CAST (strsignal, char *, (int __sig));
+# endif
+_GL_CXXALIASWARN (strsignal);
+#elif defined GNULIB_POSIXCHECK
+# undef strsignal
+# if HAVE_RAW_DECL_STRSIGNAL
+_GL_WARN_ON_USE (strsignal, "strsignal is unportable - "
+                 "use gnulib module strsignal for portability");
+# endif
+#endif
+
+#if @GNULIB_STRVERSCMP@
+# if !@HAVE_STRVERSCMP@
+_GL_FUNCDECL_SYS (strverscmp, int, (const char *, const char *)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1, 2)));
+# endif
+_GL_CXXALIAS_SYS (strverscmp, int, (const char *, const char *));
+_GL_CXXALIASWARN (strverscmp);
+#elif defined GNULIB_POSIXCHECK
+# undef strverscmp
+# if HAVE_RAW_DECL_STRVERSCMP
+_GL_WARN_ON_USE (strverscmp, "strverscmp is unportable - "
+                 "use gnulib module strverscmp for portability");
+# endif
+#endif
+
+
+#endif /* _@GUARD_PREFIX@_STRING_H */
+#endif /* _@GUARD_PREFIX@_STRING_H */

=== modified file 'm4/gnulib-comp.m4'
--- m4/gnulib-comp.m4	2013-02-01 06:30:51 +0000
+++ m4/gnulib-comp.m4	2013-02-09 03:12:48 +0000
@@ -83,6 +83,7 @@
   AC_REQUIRE([AC_SYS_LARGEFILE])
   # Code from module lstat:
   # Code from module manywarnings:
+  # Code from module memrchr:
   # Code from module mktime:
   # Code from module multiarch:
   # Code from module nocrash:
@@ -117,6 +118,7 @@
   # Code from module stdio:
   # Code from module stdlib:
   # Code from module strftime:
+  # Code from module string:
   # Code from module strtoimax:
   # Code from module strtoll:
   # Code from module strtoull:
@@ -242,6 +244,12 @@
     gl_PREREQ_LSTAT
   fi
   gl_SYS_STAT_MODULE_INDICATOR([lstat])
+  gl_FUNC_MEMRCHR
+  if test $ac_cv_func_memrchr = no; then
+    AC_LIBOBJ([memrchr])
+    gl_PREREQ_MEMRCHR
+  fi
+  gl_STRING_MODULE_INDICATOR([memrchr])
   gl_FUNC_MKTIME
   if test $REPLACE_MKTIME = 1; then
     AC_LIBOBJ([mktime])
@@ -294,6 +302,7 @@
   gl_STDIO_H
   gl_STDLIB_H
   gl_FUNC_GNU_STRFTIME
+  gl_HEADER_STRING_H
   gl_FUNC_STRTOIMAX
   if test $HAVE_STRTOIMAX = 0 || test $REPLACE_STRTOIMAX = 1; then
     AC_LIBOBJ([strtoimax])
@@ -757,6 +766,7 @@
   lib/lstat.c
   lib/md5.c
   lib/md5.h
+  lib/memrchr.c
   lib/mktime-internal.h
   lib/mktime.c
   lib/openat-priv.h
@@ -790,6 +800,7 @@
   lib/stdlib.in.h
   lib/strftime.c
   lib/strftime.h
+  lib/string.in.h
   lib/strtoimax.c
   lib/strtol.c
   lib/strtoll.c
@@ -848,6 +859,7 @@
   m4/lstat.m4
   m4/manywarnings.m4
   m4/md5.m4
+  m4/memrchr.m4
   m4/mktime.m4
   m4/multiarch.m4
   m4/nocrash.m4
@@ -877,6 +889,7 @@
   m4/stdio_h.m4
   m4/stdlib_h.m4
   m4/strftime.m4
+  m4/string_h.m4
   m4/strtoimax.m4
   m4/strtoll.m4
   m4/strtoull.m4

=== added file 'm4/memrchr.m4'
--- m4/memrchr.m4	1970-01-01 00:00:00 +0000
+++ m4/memrchr.m4	2013-02-09 03:12:48 +0000
@@ -0,0 +1,23 @@
+# memrchr.m4 serial 10
+dnl Copyright (C) 2002-2003, 2005-2007, 2009-2013 Free Software Foundation,
+dnl Inc.
+dnl This file is free software; the Free Software Foundation
+dnl gives unlimited permission to copy and/or distribute it,
+dnl with or without modifications, as long as this notice is preserved.
+
+AC_DEFUN([gl_FUNC_MEMRCHR],
+[
+  dnl Persuade glibc <string.h> to declare memrchr().
+  AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])
+
+  AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
+  AC_CHECK_DECLS_ONCE([memrchr])
+  if test $ac_cv_have_decl_memrchr = no; then
+    HAVE_DECL_MEMRCHR=0
+  fi
+
+  AC_CHECK_FUNCS([memrchr])
+])
+
+# Prerequisites of lib/memrchr.c.
+AC_DEFUN([gl_PREREQ_MEMRCHR], [:])

=== added file 'm4/string_h.m4'
--- m4/string_h.m4	1970-01-01 00:00:00 +0000
+++ m4/string_h.m4	2013-02-09 03:12:48 +0000
@@ -0,0 +1,120 @@
+# Configure a GNU-like replacement for <string.h>.
+
+# Copyright (C) 2007-2013 Free Software Foundation, Inc.
+# This file is free software; the Free Software Foundation
+# gives unlimited permission to copy and/or distribute it,
+# with or without modifications, as long as this notice is preserved.
+
+# serial 21
+
+# Written by Paul Eggert.
+
+AC_DEFUN([gl_HEADER_STRING_H],
+[
+  dnl Use AC_REQUIRE here, so that the default behavior below is expanded
+  dnl once only, before all statements that occur in other macros.
+  AC_REQUIRE([gl_HEADER_STRING_H_BODY])
+])
+
+AC_DEFUN([gl_HEADER_STRING_H_BODY],
+[
+  AC_REQUIRE([AC_C_RESTRICT])
+  AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
+  gl_NEXT_HEADERS([string.h])
+
+  dnl Check for declarations of anything we want to poison if the
+  dnl corresponding gnulib module is not in use, and which is not
+  dnl guaranteed by C89.
+  gl_WARN_ON_USE_PREPARE([[#include <string.h>
+    ]],
+    [ffsl ffsll memmem mempcpy memrchr rawmemchr stpcpy stpncpy strchrnul
+     strdup strncat strndup strnlen strpbrk strsep strcasestr strtok_r
+     strerror_r strsignal strverscmp])
+])
+
+AC_DEFUN([gl_STRING_MODULE_INDICATOR],
+[
+  dnl Use AC_REQUIRE here, so that the default settings are expanded once only.
+  AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
+  gl_MODULE_INDICATOR_SET_VARIABLE([$1])
+  dnl Define it also as a C macro, for the benefit of the unit tests.
+  gl_MODULE_INDICATOR_FOR_TESTS([$1])
+])
+
+AC_DEFUN([gl_HEADER_STRING_H_DEFAULTS],
+[
+  GNULIB_FFSL=0;        AC_SUBST([GNULIB_FFSL])
+  GNULIB_FFSLL=0;       AC_SUBST([GNULIB_FFSLL])
+  GNULIB_MEMCHR=0;      AC_SUBST([GNULIB_MEMCHR])
+  GNULIB_MEMMEM=0;      AC_SUBST([GNULIB_MEMMEM])
+  GNULIB_MEMPCPY=0;     AC_SUBST([GNULIB_MEMPCPY])
+  GNULIB_MEMRCHR=0;     AC_SUBST([GNULIB_MEMRCHR])
+  GNULIB_RAWMEMCHR=0;   AC_SUBST([GNULIB_RAWMEMCHR])
+  GNULIB_STPCPY=0;      AC_SUBST([GNULIB_STPCPY])
+  GNULIB_STPNCPY=0;     AC_SUBST([GNULIB_STPNCPY])
+  GNULIB_STRCHRNUL=0;   AC_SUBST([GNULIB_STRCHRNUL])
+  GNULIB_STRDUP=0;      AC_SUBST([GNULIB_STRDUP])
+  GNULIB_STRNCAT=0;     AC_SUBST([GNULIB_STRNCAT])
+  GNULIB_STRNDUP=0;     AC_SUBST([GNULIB_STRNDUP])
+  GNULIB_STRNLEN=0;     AC_SUBST([GNULIB_STRNLEN])
+  GNULIB_STRPBRK=0;     AC_SUBST([GNULIB_STRPBRK])
+  GNULIB_STRSEP=0;      AC_SUBST([GNULIB_STRSEP])
+  GNULIB_STRSTR=0;      AC_SUBST([GNULIB_STRSTR])
+  GNULIB_STRCASESTR=0;  AC_SUBST([GNULIB_STRCASESTR])
+  GNULIB_STRTOK_R=0;    AC_SUBST([GNULIB_STRTOK_R])
+  GNULIB_MBSLEN=0;      AC_SUBST([GNULIB_MBSLEN])
+  GNULIB_MBSNLEN=0;     AC_SUBST([GNULIB_MBSNLEN])
+  GNULIB_MBSCHR=0;      AC_SUBST([GNULIB_MBSCHR])
+  GNULIB_MBSRCHR=0;     AC_SUBST([GNULIB_MBSRCHR])
+  GNULIB_MBSSTR=0;      AC_SUBST([GNULIB_MBSSTR])
+  GNULIB_MBSCASECMP=0;  AC_SUBST([GNULIB_MBSCASECMP])
+  GNULIB_MBSNCASECMP=0; AC_SUBST([GNULIB_MBSNCASECMP])
+  GNULIB_MBSPCASECMP=0; AC_SUBST([GNULIB_MBSPCASECMP])
+  GNULIB_MBSCASESTR=0;  AC_SUBST([GNULIB_MBSCASESTR])
+  GNULIB_MBSCSPN=0;     AC_SUBST([GNULIB_MBSCSPN])
+  GNULIB_MBSPBRK=0;     AC_SUBST([GNULIB_MBSPBRK])
+  GNULIB_MBSSPN=0;      AC_SUBST([GNULIB_MBSSPN])
+  GNULIB_MBSSEP=0;      AC_SUBST([GNULIB_MBSSEP])
+  GNULIB_MBSTOK_R=0;    AC_SUBST([GNULIB_MBSTOK_R])
+  GNULIB_STRERROR=0;    AC_SUBST([GNULIB_STRERROR])
+  GNULIB_STRERROR_R=0;  AC_SUBST([GNULIB_STRERROR_R])
+  GNULIB_STRSIGNAL=0;   AC_SUBST([GNULIB_STRSIGNAL])
+  GNULIB_STRVERSCMP=0;  AC_SUBST([GNULIB_STRVERSCMP])
+  HAVE_MBSLEN=0;        AC_SUBST([HAVE_MBSLEN])
+  dnl Assume proper GNU behavior unless another module says otherwise.
+  HAVE_FFSL=1;                  AC_SUBST([HAVE_FFSL])
+  HAVE_FFSLL=1;                 AC_SUBST([HAVE_FFSLL])
+  HAVE_MEMCHR=1;                AC_SUBST([HAVE_MEMCHR])
+  HAVE_DECL_MEMMEM=1;           AC_SUBST([HAVE_DECL_MEMMEM])
+  HAVE_MEMPCPY=1;               AC_SUBST([HAVE_MEMPCPY])
+  HAVE_DECL_MEMRCHR=1;          AC_SUBST([HAVE_DECL_MEMRCHR])
+  HAVE_RAWMEMCHR=1;             AC_SUBST([HAVE_RAWMEMCHR])
+  HAVE_STPCPY=1;                AC_SUBST([HAVE_STPCPY])
+  HAVE_STPNCPY=1;               AC_SUBST([HAVE_STPNCPY])
+  HAVE_STRCHRNUL=1;             AC_SUBST([HAVE_STRCHRNUL])
+  HAVE_DECL_STRDUP=1;           AC_SUBST([HAVE_DECL_STRDUP])
+  HAVE_DECL_STRNDUP=1;          AC_SUBST([HAVE_DECL_STRNDUP])
+  HAVE_DECL_STRNLEN=1;          AC_SUBST([HAVE_DECL_STRNLEN])
+  HAVE_STRPBRK=1;               AC_SUBST([HAVE_STRPBRK])
+  HAVE_STRSEP=1;                AC_SUBST([HAVE_STRSEP])
+  HAVE_STRCASESTR=1;            AC_SUBST([HAVE_STRCASESTR])
+  HAVE_DECL_STRTOK_R=1;         AC_SUBST([HAVE_DECL_STRTOK_R])
+  HAVE_DECL_STRERROR_R=1;       AC_SUBST([HAVE_DECL_STRERROR_R])
+  HAVE_DECL_STRSIGNAL=1;        AC_SUBST([HAVE_DECL_STRSIGNAL])
+  HAVE_STRVERSCMP=1;            AC_SUBST([HAVE_STRVERSCMP])
+  REPLACE_MEMCHR=0;             AC_SUBST([REPLACE_MEMCHR])
+  REPLACE_MEMMEM=0;             AC_SUBST([REPLACE_MEMMEM])
+  REPLACE_STPNCPY=0;            AC_SUBST([REPLACE_STPNCPY])
+  REPLACE_STRDUP=0;             AC_SUBST([REPLACE_STRDUP])
+  REPLACE_STRSTR=0;             AC_SUBST([REPLACE_STRSTR])
+  REPLACE_STRCASESTR=0;         AC_SUBST([REPLACE_STRCASESTR])
+  REPLACE_STRCHRNUL=0;          AC_SUBST([REPLACE_STRCHRNUL])
+  REPLACE_STRERROR=0;           AC_SUBST([REPLACE_STRERROR])
+  REPLACE_STRERROR_R=0;         AC_SUBST([REPLACE_STRERROR_R])
+  REPLACE_STRNCAT=0;            AC_SUBST([REPLACE_STRNCAT])
+  REPLACE_STRNDUP=0;            AC_SUBST([REPLACE_STRNDUP])
+  REPLACE_STRNLEN=0;            AC_SUBST([REPLACE_STRNLEN])
+  REPLACE_STRSIGNAL=0;          AC_SUBST([REPLACE_STRSIGNAL])
+  REPLACE_STRTOK_R=0;           AC_SUBST([REPLACE_STRTOK_R])
+  UNDEFINE_STRTOK_R=0;          AC_SUBST([UNDEFINE_STRTOK_R])
+])

=== modified file 'src/ChangeLog'
--- src/ChangeLog	2013-02-08 17:42:09 +0000
+++ src/ChangeLog	2013-02-09 03:12:48 +0000
@@ -1,3 +1,11 @@
+2013-02-09  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Tune redisplay by using memchr and memrchr.
+	* search.c (scan_buffer): Omit first arg TARGET, as it's always '\n'.
+	All callers changed.
+	(scan_buffer, scan_newline): Use memchr and memrchr rather than
+	scanning byte-by-byte.
+
 2013-02-08  Stefan Monnier  <monnier@iro.umontreal.ca>
 
 	* lread.c (skip_dyn_bytes): New function (bug#12598).

=== modified file 'src/editfns.c'
--- src/editfns.c	2013-01-23 20:07:28 +0000
+++ src/editfns.c	2013-02-09 03:12:48 +0000
@@ -735,8 +735,7 @@
 	      /* This is the ONLY_IN_LINE case, check that NEW_POS and
 		 FIELD_BOUND are on the same line by seeing whether
 		 there's an intervening newline or not.  */
-	      || (scan_buffer ('\n',
-			       XFASTINT (new_pos), XFASTINT (field_bound),
+	      || (scan_buffer (XFASTINT (new_pos), XFASTINT (field_bound),
 			       fwd ? -1 : 1, &shortage, 1),
 		  shortage != 0)))
 	/* Constrain NEW_POS to FIELD_BOUND.  */

=== modified file 'src/lisp.h'
--- src/lisp.h	2013-02-08 05:28:52 +0000
+++ src/lisp.h	2013-02-09 03:12:48 +0000
@@ -3338,7 +3338,7 @@
 extern ptrdiff_t fast_string_match_ignore_case (Lisp_Object, Lisp_Object);
 extern ptrdiff_t fast_looking_at (Lisp_Object, ptrdiff_t, ptrdiff_t,
                                   ptrdiff_t, ptrdiff_t, Lisp_Object);
-extern ptrdiff_t scan_buffer (int, ptrdiff_t, ptrdiff_t, ptrdiff_t,
+extern ptrdiff_t scan_buffer (ptrdiff_t, ptrdiff_t, ptrdiff_t,
 			      ptrdiff_t *, bool);
 extern EMACS_INT scan_newline (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t,
 			       EMACS_INT, bool);

=== modified file 'src/search.c'
--- src/search.c	2013-02-08 14:44:53 +0000
+++ src/search.c	2013-02-09 03:12:48 +0000
@@ -619,7 +619,7 @@
 }
 
 \f
-/* Search for COUNT instances of the character TARGET between START and END.
+/* Search for COUNT newlines between START and END.
 
    If COUNT is positive, search forwards; END must be >= START.
    If COUNT is negative, search backwards for the -COUNTth instance;
@@ -634,13 +634,13 @@
    this is not the same as the usual convention for Emacs motion commands.
 
    If we don't find COUNT instances before reaching END, set *SHORTAGE
-   to the number of TARGETs left unfound, and return END.
+   to the number of newlines left unfound, and return END.
 
    If ALLOW_QUIT, set immediate_quit.  That's good to do
    except when inside redisplay.  */
 
 ptrdiff_t
-scan_buffer (int target, ptrdiff_t start, ptrdiff_t end,
+scan_buffer (ptrdiff_t start, ptrdiff_t end,
 	     ptrdiff_t count, ptrdiff_t *shortage, bool allow_quit)
 {
   struct region_cache *newline_cache;
@@ -656,7 +656,7 @@
   else
     {
       direction = -1;
-      if (!end) 
+      if (!end)
 	end = BEGV, end_byte = BEGV_BYTE;
     }
   if (end_byte == -1)
@@ -684,7 +684,7 @@
 
         /* If we're looking for a newline, consult the newline cache
            to see where we can avoid some scanning.  */
-        if (target == '\n' && newline_cache)
+        if (newline_cache)
           {
             ptrdiff_t next_change;
             immediate_quit = 0;
@@ -726,29 +726,28 @@
               unsigned char *scan_start = cursor;
 
               /* The dumb loop.  */
-              while (*cursor != target && ++cursor < ceiling_addr)
-                ;
+	      unsigned char *nl = memchr (cursor, '\n', ceiling_addr - cursor);
 
               /* If we're looking for newlines, cache the fact that
                  the region from start to cursor is free of them. */
-              if (target == '\n' && newline_cache)
+              if (newline_cache)
                 know_region_cache (current_buffer, newline_cache,
                                    BYTE_TO_CHAR (start_byte + scan_start - base),
-                                   BYTE_TO_CHAR (start_byte + cursor - base));
-
-              /* Did we find the target character?  */
-              if (cursor < ceiling_addr)
-                {
-                  if (--count == 0)
-                    {
-                      immediate_quit = 0;
-                      return BYTE_TO_CHAR (start_byte + cursor - base + 1);
-                    }
-                  cursor++;
-                }
+                                   BYTE_TO_CHAR (start_byte + (nl ? nl : ceiling_addr) - base));
+
+              /* Did we find the newline?  */
+              if (! nl)
+		break;
+
+	      if (--count == 0)
+		{
+		  immediate_quit = 0;
+		  return BYTE_TO_CHAR (start_byte + nl - base + 1);
+		}
+	      cursor = nl + 1;
             }
 
-          start = BYTE_TO_CHAR (start_byte + cursor - base);
+          start = BYTE_TO_CHAR (start_byte + ceiling_addr - base);
         }
       }
   else
@@ -760,7 +759,7 @@
 	ptrdiff_t tem;
 
         /* Consult the newline cache, if appropriate.  */
-        if (target == '\n' && newline_cache)
+        if (newline_cache)
           {
             ptrdiff_t next_change;
             immediate_quit = 0;
@@ -795,30 +794,29 @@
           while (cursor >= ceiling_addr)
             {
               unsigned char *scan_start = cursor;
-
-              while (*cursor != target && --cursor >= ceiling_addr)
-                ;
+	      unsigned char *nl = memrchr (ceiling_addr, '\n',
+					   cursor + 1 - ceiling_addr);
 
               /* If we're looking for newlines, cache the fact that
                  the region from after the cursor to start is free of them.  */
-              if (target == '\n' && newline_cache)
+              if (newline_cache)
                 know_region_cache (current_buffer, newline_cache,
-                                   BYTE_TO_CHAR (start_byte + cursor - base),
+                                   BYTE_TO_CHAR (start_byte + (nl ? nl : ceiling_addr - 1) - base),
                                    BYTE_TO_CHAR (start_byte + scan_start - base));
 
-              /* Did we find the target character?  */
-              if (cursor >= ceiling_addr)
-                {
-                  if (++count >= 0)
-                    {
-                      immediate_quit = 0;
-                      return BYTE_TO_CHAR (start_byte + cursor - base);
-                    }
-                  cursor--;
-                }
+              /* Did we find the newline?  */
+              if (! nl)
+		break;
+
+	      if (++count >= 0)
+		{
+		  immediate_quit = 0;
+		  return BYTE_TO_CHAR (start_byte + nl - base);
+		}
+	      cursor = nl - 1;
             }
 
-	  start = BYTE_TO_CHAR (start_byte + cursor - base);
+	  start = BYTE_TO_CHAR (start_byte + (ceiling_addr - 1) - base);
         }
       }
 
@@ -874,29 +872,25 @@
 	  ceiling = min (limit_byte - 1, ceiling);
 	  ceiling_addr = BYTE_POS_ADDR (ceiling) + 1;
 	  base = (cursor = BYTE_POS_ADDR (start_byte));
-	  while (1)
+
+	  do
 	    {
-	      while (*cursor != '\n' && ++cursor != ceiling_addr)
-		;
-
-	      if (cursor != ceiling_addr)
+	      unsigned char *nl = memchr (cursor, '\n', ceiling_addr - cursor);
+	      if (! nl)
+		break;
+	      if (--count == 0)
 		{
-		  if (--count == 0)
-		    {
-		      immediate_quit = old_immediate_quit;
-		      start_byte = start_byte + cursor - base + 1;
-		      start = BYTE_TO_CHAR (start_byte);
-		      TEMP_SET_PT_BOTH (start, start_byte);
-		      return 0;
-		    }
-		  else
-		    if (++cursor == ceiling_addr)
-		      break;
+		  immediate_quit = old_immediate_quit;
+		  start_byte += nl - base + 1;
+		  start = BYTE_TO_CHAR (start_byte);
+		  TEMP_SET_PT_BOTH (start, start_byte);
+		  return 0;
 		}
-	      else
-		break;
+	      cursor = nl + 1;
 	    }
-	  start_byte += cursor - base;
+	  while (cursor < ceiling_addr);
+
+	  start_byte += ceiling_addr - base;
 	}
     }
   else
@@ -905,31 +899,28 @@
 	{
 	  ceiling = BUFFER_FLOOR_OF (start_byte - 1);
 	  ceiling = max (limit_byte, ceiling);
-	  ceiling_addr = BYTE_POS_ADDR (ceiling) - 1;
+	  ceiling_addr = BYTE_POS_ADDR (ceiling);
 	  base = (cursor = BYTE_POS_ADDR (start_byte - 1) + 1);
 	  while (1)
 	    {
-	      while (--cursor != ceiling_addr && *cursor != '\n')
-		;
+	      unsigned char *nl = memrchr (ceiling_addr, '\n',
+					   cursor - ceiling_addr);
+	      if (! nl)
+		break;
 
-	      if (cursor != ceiling_addr)
+	      if (++count == 0)
 		{
-		  if (++count == 0)
-		    {
-		      immediate_quit = old_immediate_quit;
-		      /* Return the position AFTER the match we found.  */
-		      start_byte = start_byte + cursor - base + 1;
-		      start = BYTE_TO_CHAR (start_byte);
-		      TEMP_SET_PT_BOTH (start, start_byte);
-		      return 0;
-		    }
+		  immediate_quit = old_immediate_quit;
+		  /* Return the position AFTER the match we found.  */
+		  start_byte += nl - base + 1;
+		  start = BYTE_TO_CHAR (start_byte);
+		  TEMP_SET_PT_BOTH (start, start_byte);
+		  return 0;
 		}
-	      else
-		break;
+
+	      cursor = nl;
 	    }
-	  /* Here we add 1 to compensate for the last decrement
-	     of CURSOR, which took it past the valid range.  */
-	  start_byte += cursor - base + 1;
+	  start_byte += ceiling_addr - base;
 	}
     }
 
@@ -942,7 +933,7 @@
 ptrdiff_t
 find_next_newline_no_quit (ptrdiff_t from, ptrdiff_t cnt)
 {
-  return scan_buffer ('\n', from, 0, cnt, (ptrdiff_t *) 0, 0);
+  return scan_buffer (from, 0, cnt, (ptrdiff_t *) 0, 0);
 }
 
 /* Like find_next_newline, but returns position before the newline,
@@ -953,7 +944,7 @@
 find_before_next_newline (ptrdiff_t from, ptrdiff_t to, ptrdiff_t cnt)
 {
   ptrdiff_t shortage;
-  ptrdiff_t pos = scan_buffer ('\n', from, to, cnt, &shortage, 1);
+  ptrdiff_t pos = scan_buffer (from, to, cnt, &shortage, 1);
 
   if (shortage == 0)
     pos--;


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-09  3:34                         ` Paul Eggert
@ 2013-02-09  8:46                           ` Eli Zaretskii
  2013-02-09  9:05                             ` Paul Eggert
  0 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-09  8:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: dmantipov, emacs-devel

> Date: Fri, 08 Feb 2013 19:34:20 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: Dmitry Antipov <dmantipov@yandex.ru>, emacs-devel@gnu.org
> 
> On 02/08/2013 08:52 AM, Eli Zaretskii wrote:
> >> > So, ~90% of time spent in scan_buffer is:
> >> > 
> >> > 799                while (*cursor != target && --cursor >= ceiling_addr)
> >> > 800                  ;
> 
> > Which cannot be optimized.
> 
> It can be sped up somewhat, by using memrchr.
> 
> This won't solve these performance issues, but it helps:
> on my platform (x86-64 Ubuntu 12.10) I ran Dmitry's scroll-both benchmark
> <http://lists.gnu.org/archive/html/emacs-devel/2013-02/msg00147.html>
> on a real file (the trunk's src/xdisp.c), and it was 25% faster overall
> (1.19 seconds versus 1.49 seconds) when I used memrchr there
> and memchr for forward searches.

25% faster is still terribly slow for redisplay.  xdisp.c doesn't have
a problem in the first place (1.49 sec divided by 100 is 15 msec, not
something users will notice, let alone the difference between 15 and
11 msec).  And for files with long lines, these 25% will not solve
anything, since 6 sec _per_scroll_, give or take 25%, is intolerably
slow.

I don't think we should make this optimization, because it optimizes
in the wrong place.  The problem is not with scan_buffer, the problem
is that it (actually, its callers) get called way too much.

This is a classic case where solving a slow operation needs a radical
change in the algorithms, not loophole optimizations.

> Most of the attached patch is boilerplate taken unmodified from gnulib,
> to support memrchr on non-GNU platforms.  The key part of the change is
> at the end, to src/search.c.

I don't understand why you removed the TARGET argument of
scan_buffer.  The fact that all its callers use it for looking for a
newline doesn't mean it cannot be used otherwise.  At the very least,
the name of the function should be changed to reflect the change.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-09  8:46                           ` Eli Zaretskii
@ 2013-02-09  9:05                             ` Paul Eggert
  2013-02-09  9:33                               ` Eli Zaretskii
  2013-02-09 10:01                               ` Eli Zaretskii
  0 siblings, 2 replies; 27+ messages in thread
From: Paul Eggert @ 2013-02-09  9:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmantipov, emacs-devel

On 02/09/2013 12:46 AM, Eli Zaretskii wrote:

> 25% faster is still terribly slow for redisplay.

Yes, as I said, it doesn't solve the performance problem.
Still, it doesn't complicate the code, and it significantly
improves speed in code likely to be executed often, so it
seems worth doing in its own right.

> I don't understand why you removed the TARGET argument of
> scan_buffer.  The fact that all its callers use it for looking for a
> newline doesn't mean it cannot be used otherwise.

If we ever need that ability we can put it back in.  In the meantime
there's no need for the generality and I found it confusing.

> At the very least, the name of the function should be
> changed to reflect the change.

Sure, what name do you suggest?  scan_newline is already taken.
Perhaps scan_buffer_newline?

This area is a bit messed up, unfortunately -- scan_newline has
comments saying that it looks for carriage return (!) but
it does not in fact do that.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-09  9:05                             ` Paul Eggert
@ 2013-02-09  9:33                               ` Eli Zaretskii
  2013-02-11  2:33                                 ` Paul Eggert
  2013-02-09 10:01                               ` Eli Zaretskii
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-09  9:33 UTC (permalink / raw)
  To: Paul Eggert; +Cc: dmantipov, emacs-devel

> Date: Sat, 09 Feb 2013 01:05:01 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> Cc: dmantipov@yandex.ru, emacs-devel@gnu.org
> 
> > At the very least, the name of the function should be
> > changed to reflect the change.
> 
> Sure, what name do you suggest?  scan_newline is already taken.
> Perhaps scan_buffer_newline?

I'd use find_newline, since 2 out of 3 of its callers are
find_next_newline_no_quit and find_before_next_newline.

> This area is a bit messed up, unfortunately -- scan_newline has
> comments saying that it looks for carriage return (!) but
> it does not in fact do that.

People tend to forget updating the commentary when they change code.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-09  9:05                             ` Paul Eggert
  2013-02-09  9:33                               ` Eli Zaretskii
@ 2013-02-09 10:01                               ` Eli Zaretskii
  2013-02-10 16:57                                 ` Eli Zaretskii
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-09 10:01 UTC (permalink / raw)
  To: Paul Eggert; +Cc: dmantipov, emacs-devel

> Date: Sat, 09 Feb 2013 01:05:01 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: dmantipov@yandex.ru, emacs-devel@gnu.org
> 
> On 02/09/2013 12:46 AM, Eli Zaretskii wrote:
> 
> > 25% faster is still terribly slow for redisplay.
> 
> Yes, as I said, it doesn't solve the performance problem.
> Still, it doesn't complicate the code, and it significantly
> improves speed in code likely to be executed often, so it
> seems worth doing in its own right.

I suspect that the use case that makes scan_buffer so high on the
profile is very much skewed.  My crystal ball says that the file in
question was one very long paragraph, or at least had many-many
_thousands_ of lines between empty lines that delimit paragraphs.
scan_buffer is high on the profile because the bidi.c code tries to
find the beginning of a paragraph, which determines the base direction
of the paragraph, which in turn determines how the text should be
reordered for display.

By contrast, most real-life files have much less text between empty
lines, so scan_buffer will not be at any prominent place in the
profile.  But redisplay of a buffer with very long lines will still be
awfully slow, even if there's an empty line between every 2 long
lines, although scan_buffer will no longer be a factor.

OTOH, if you create a file with a single long paragraph, but whose
lines have "normal" width, like 100 characters, redisplay will perform
adequately, even though scan_buffer will be heavily used.  (It would
be interesting to see a profile for that, btw.)

IOW, the solution in bidi.c for extremely long paragraphs is optimized
for the 99% of use cases, where lines are not too long, i.e. for those
cases where the old unidirectional display engine gave reasonable
performance.  Dmitry's use case, OTOH, is skewed on several counts:

 . it uses extremely long lines
 . it uses too many neutral/weak characters
 . it uses extremely long paragraphs

This simultaneously hits on several unrelated weaknesses of the
current display engine, with the result that the profile is a
combination of at least 3 different reasons for slow-down, which makes
it very hard to analyze the results and look for solutions.

That is why I think we should attack this problem one reason at a
time.  The most important reason is the first one: long lines cause
the display code traverse too much of buffer text.  This is why you
see x_produce_glyphs so high on the profile in the unidirectional
case: it examines too many characters, much more than what will be
actually displayed on the screen.  Solve this problem, and the 2nd one
will simply disappear without a trace, because it is at least linear
in the number of scanned characters.  If the 3rd problem is still a
factor, after the 1st one is gone, we can tune the current
optimization at that time.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-09 10:01                               ` Eli Zaretskii
@ 2013-02-10 16:57                                 ` Eli Zaretskii
  2013-02-11  5:43                                   ` Dmitry Antipov
  2013-02-11 17:17                                   ` Eli Zaretskii
  0 siblings, 2 replies; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-10 16:57 UTC (permalink / raw)
  To: eggert, dmantipov; +Cc: emacs-devel

> Date: Sat, 09 Feb 2013 12:01:46 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: dmantipov@yandex.ru, emacs-devel@gnu.org
> 
> That is why I think we should attack this problem one reason at a
> time.  The most important reason is the first one: long lines cause
> the display code traverse too much of buffer text.  This is why you
> see x_produce_glyphs so high on the profile in the unidirectional
> case: it examines too many characters, much more than what will be
> actually displayed on the screen.

I just committed to the trunk revision 111724 with a couple of simple
changes which speed up by a factor of 3 some redisplay operations,
such as M-v or M->, in a buffer with very long lines.  Please try it.

This is by no means the complete solution, even for the situations
where it provides a 3-fold speed-up: we need the speed-up to be much
more aggressive.  But it does demonstrate how simple changes can have
a significant effect in this area.

Stay tuned.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-09  9:33                               ` Eli Zaretskii
@ 2013-02-11  2:33                                 ` Paul Eggert
  0 siblings, 0 replies; 27+ messages in thread
From: Paul Eggert @ 2013-02-11  2:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmantipov, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 354 bytes --]

On 02/09/2013 01:33 AM, Eli Zaretskii wrote:
> I'd use find_newline, since 2 out of 3 of its callers are
> find_next_newline_no_quit and find_before_next_newline.

OK, thanks, attached is a revised patch to do that.  It also removes
the confusing comments about carriage return, and identifies
two or three more places where it's clearer to use memchr.


[-- Attachment #2: memchr.txt --]
[-- Type: text/plain, Size: 82397 bytes --]

=== modified file '.bzrignore'
--- .bzrignore	2013-02-01 06:30:51 +0000
+++ .bzrignore	2013-02-09 03:12:48 +0000
@@ -97,6 +97,7 @@
 lib/stdio.h
 lib/stdint.h
 lib/stdlib.h
+lib/string.h
 lib/sys/
 lib/SYS
 lib/time.h

=== modified file 'ChangeLog'
--- ChangeLog	2013-02-11 00:55:26 +0000
+++ ChangeLog	2013-02-11 01:28:13 +0000
@@ -1,3 +1,11 @@
+2013-02-11  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Tune by using memchr and memrchr.
+	* .bzrignore: Add string.h.
+	* lib/gnulib.mk, m4/gnulib-comp.m4: Regenerate.
+	* lib/memrchr.c, lib/string.in.h, m4/memrchr.m4, m4/string_h.m4:
+	New files, from gnulib.
+
 2013-02-11  Glenn Morris  <rgm@gnu.org>
 
 	* configure.ac (emacs_config_options): Record some env vars.

=== modified file 'admin/ChangeLog'
--- admin/ChangeLog	2013-02-01 06:30:51 +0000
+++ admin/ChangeLog	2013-02-11 01:28:59 +0000
@@ -1,3 +1,8 @@
+2013-02-09  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Tune by using memchr and memrchr.
+	* merge-gnulib (GNULIB_MODULES): Add memrchr.
+
 2013-02-01  Paul Eggert  <eggert@cs.ucla.edu>
 
 	Use fdopendir, fstatat and readlinkat, for efficiency (Bug#13539).

=== modified file 'admin/merge-gnulib'
--- admin/merge-gnulib	2013-02-01 06:30:51 +0000
+++ admin/merge-gnulib	2013-02-09 03:12:48 +0000
@@ -31,7 +31,8 @@
   dtoastr dtotimespec dup2 environ execinfo faccessat
   fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday
   ignore-value intprops largefile lstat
-  manywarnings mktime pselect pthread_sigmask putenv readlink readlinkat
+  manywarnings memrchr mktime
+  pselect pthread_sigmask putenv readlink readlinkat
   sig2str socklen stat-time stdalign stdarg stdbool stdio
   strftime strtoimax strtoumax symlink sys_stat
   sys_time time timer-time timespec-add timespec-sub unsetenv utimens

=== modified file 'lib/gnulib.mk'
--- lib/gnulib.mk	2013-02-08 23:37:17 +0000
+++ lib/gnulib.mk	2013-02-09 03:12:48 +0000
@@ -21,7 +21,7 @@
 # the same distribution terms as the rest of that program.
 #
 # Generated by gnulib-tool.
-# Reproduce by: gnulib-tool --import --dir=. --lib=libgnu --source-base=lib --m4-base=m4 --doc-base=doc --tests-base=tests --aux-dir=build-aux --avoid=dup --avoid=errno --avoid=fchdir --avoid=fcntl --avoid=fstat --avoid=malloc-posix --avoid=msvc-inval --avoid=msvc-nothrow --avoid=open --avoid=openat-die --avoid=opendir --avoid=raise --avoid=save-cwd --avoid=select --avoid=sigprocmask --avoid=sys_types --avoid=threadlib --makefile-name=gnulib.mk --conditional-dependencies --no-libtool --macro-prefix=gl --no-vc-files alloca-opt c-ctype c-strcase careadlinkat close-stream crypto/md5 crypto/sha1 crypto/sha256 crypto/sha512 dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat manywarnings mktime pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens warnings
+# Reproduce by: gnulib-tool --import --dir=. --lib=libgnu --source-base=lib --m4-base=m4 --doc-base=doc --tests-base=tests --aux-dir=build-aux --avoid=dup --avoid=errno --avoid=fchdir --avoid=fcntl --avoid=fstat --avoid=malloc-posix --avoid=msvc-inval --avoid=msvc-nothrow --avoid=open --avoid=openat-die --avoid=opendir --avoid=raise --avoid=save-cwd --avoid=select --avoid=sigprocmask --avoid=sys_types --avoid=threadlib --makefile-name=gnulib.mk --conditional-dependencies --no-libtool --macro-prefix=gl --no-vc-files alloca-opt c-ctype c-strcase careadlinkat close-stream crypto/md5 crypto/sha1 crypto/sha256 crypto/sha512 dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat manywarnings memrchr mktime pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens warnings
 
 
 MOSTLYCLEANFILES += core *.stackdump
@@ -480,6 +480,15 @@
 
 ## end   gnulib module lstat
 
+## begin gnulib module memrchr
+
+
+EXTRA_DIST += memrchr.c
+
+EXTRA_libgnu_a_SOURCES += memrchr.c
+
+## end   gnulib module memrchr
+
 ## begin gnulib module mktime
 
 
@@ -1105,6 +1114,106 @@
 
 ## end   gnulib module strftime
 
+## begin gnulib module string
+
+BUILT_SOURCES += string.h
+
+# We need the following in order to create <string.h> when the system
+# doesn't have one that works with the given compiler.
+string.h: string.in.h $(top_builddir)/config.status $(CXXDEFS_H) $(ARG_NONNULL_H) $(WARN_ON_USE_H)
+	$(AM_V_GEN)rm -f $@-t $@ && \
+	{ echo '/* DO NOT EDIT! GENERATED AUTOMATICALLY! */' && \
+	  sed -e 's|@''GUARD_PREFIX''@|GL|g' \
+	      -e 's|@''INCLUDE_NEXT''@|$(INCLUDE_NEXT)|g' \
+	      -e 's|@''PRAGMA_SYSTEM_HEADER''@|@PRAGMA_SYSTEM_HEADER@|g' \
+	      -e 's|@''PRAGMA_COLUMNS''@|@PRAGMA_COLUMNS@|g' \
+	      -e 's|@''NEXT_STRING_H''@|$(NEXT_STRING_H)|g' \
+	      -e 's/@''GNULIB_FFSL''@/$(GNULIB_FFSL)/g' \
+	      -e 's/@''GNULIB_FFSLL''@/$(GNULIB_FFSLL)/g' \
+	      -e 's/@''GNULIB_MBSLEN''@/$(GNULIB_MBSLEN)/g' \
+	      -e 's/@''GNULIB_MBSNLEN''@/$(GNULIB_MBSNLEN)/g' \
+	      -e 's/@''GNULIB_MBSCHR''@/$(GNULIB_MBSCHR)/g' \
+	      -e 's/@''GNULIB_MBSRCHR''@/$(GNULIB_MBSRCHR)/g' \
+	      -e 's/@''GNULIB_MBSSTR''@/$(GNULIB_MBSSTR)/g' \
+	      -e 's/@''GNULIB_MBSCASECMP''@/$(GNULIB_MBSCASECMP)/g' \
+	      -e 's/@''GNULIB_MBSNCASECMP''@/$(GNULIB_MBSNCASECMP)/g' \
+	      -e 's/@''GNULIB_MBSPCASECMP''@/$(GNULIB_MBSPCASECMP)/g' \
+	      -e 's/@''GNULIB_MBSCASESTR''@/$(GNULIB_MBSCASESTR)/g' \
+	      -e 's/@''GNULIB_MBSCSPN''@/$(GNULIB_MBSCSPN)/g' \
+	      -e 's/@''GNULIB_MBSPBRK''@/$(GNULIB_MBSPBRK)/g' \
+	      -e 's/@''GNULIB_MBSSPN''@/$(GNULIB_MBSSPN)/g' \
+	      -e 's/@''GNULIB_MBSSEP''@/$(GNULIB_MBSSEP)/g' \
+	      -e 's/@''GNULIB_MBSTOK_R''@/$(GNULIB_MBSTOK_R)/g' \
+	      -e 's/@''GNULIB_MEMCHR''@/$(GNULIB_MEMCHR)/g' \
+	      -e 's/@''GNULIB_MEMMEM''@/$(GNULIB_MEMMEM)/g' \
+	      -e 's/@''GNULIB_MEMPCPY''@/$(GNULIB_MEMPCPY)/g' \
+	      -e 's/@''GNULIB_MEMRCHR''@/$(GNULIB_MEMRCHR)/g' \
+	      -e 's/@''GNULIB_RAWMEMCHR''@/$(GNULIB_RAWMEMCHR)/g' \
+	      -e 's/@''GNULIB_STPCPY''@/$(GNULIB_STPCPY)/g' \
+	      -e 's/@''GNULIB_STPNCPY''@/$(GNULIB_STPNCPY)/g' \
+	      -e 's/@''GNULIB_STRCHRNUL''@/$(GNULIB_STRCHRNUL)/g' \
+	      -e 's/@''GNULIB_STRDUP''@/$(GNULIB_STRDUP)/g' \
+	      -e 's/@''GNULIB_STRNCAT''@/$(GNULIB_STRNCAT)/g' \
+	      -e 's/@''GNULIB_STRNDUP''@/$(GNULIB_STRNDUP)/g' \
+	      -e 's/@''GNULIB_STRNLEN''@/$(GNULIB_STRNLEN)/g' \
+	      -e 's/@''GNULIB_STRPBRK''@/$(GNULIB_STRPBRK)/g' \
+	      -e 's/@''GNULIB_STRSEP''@/$(GNULIB_STRSEP)/g' \
+	      -e 's/@''GNULIB_STRSTR''@/$(GNULIB_STRSTR)/g' \
+	      -e 's/@''GNULIB_STRCASESTR''@/$(GNULIB_STRCASESTR)/g' \
+	      -e 's/@''GNULIB_STRTOK_R''@/$(GNULIB_STRTOK_R)/g' \
+	      -e 's/@''GNULIB_STRERROR''@/$(GNULIB_STRERROR)/g' \
+	      -e 's/@''GNULIB_STRERROR_R''@/$(GNULIB_STRERROR_R)/g' \
+	      -e 's/@''GNULIB_STRSIGNAL''@/$(GNULIB_STRSIGNAL)/g' \
+	      -e 's/@''GNULIB_STRVERSCMP''@/$(GNULIB_STRVERSCMP)/g' \
+	      < $(srcdir)/string.in.h | \
+	  sed -e 's|@''HAVE_FFSL''@|$(HAVE_FFSL)|g' \
+	      -e 's|@''HAVE_FFSLL''@|$(HAVE_FFSLL)|g' \
+	      -e 's|@''HAVE_MBSLEN''@|$(HAVE_MBSLEN)|g' \
+	      -e 's|@''HAVE_MEMCHR''@|$(HAVE_MEMCHR)|g' \
+	      -e 's|@''HAVE_DECL_MEMMEM''@|$(HAVE_DECL_MEMMEM)|g' \
+	      -e 's|@''HAVE_MEMPCPY''@|$(HAVE_MEMPCPY)|g' \
+	      -e 's|@''HAVE_DECL_MEMRCHR''@|$(HAVE_DECL_MEMRCHR)|g' \
+	      -e 's|@''HAVE_RAWMEMCHR''@|$(HAVE_RAWMEMCHR)|g' \
+	      -e 's|@''HAVE_STPCPY''@|$(HAVE_STPCPY)|g' \
+	      -e 's|@''HAVE_STPNCPY''@|$(HAVE_STPNCPY)|g' \
+	      -e 's|@''HAVE_STRCHRNUL''@|$(HAVE_STRCHRNUL)|g' \
+	      -e 's|@''HAVE_DECL_STRDUP''@|$(HAVE_DECL_STRDUP)|g' \
+	      -e 's|@''HAVE_DECL_STRNDUP''@|$(HAVE_DECL_STRNDUP)|g' \
+	      -e 's|@''HAVE_DECL_STRNLEN''@|$(HAVE_DECL_STRNLEN)|g' \
+	      -e 's|@''HAVE_STRPBRK''@|$(HAVE_STRPBRK)|g' \
+	      -e 's|@''HAVE_STRSEP''@|$(HAVE_STRSEP)|g' \
+	      -e 's|@''HAVE_STRCASESTR''@|$(HAVE_STRCASESTR)|g' \
+	      -e 's|@''HAVE_DECL_STRTOK_R''@|$(HAVE_DECL_STRTOK_R)|g' \
+	      -e 's|@''HAVE_DECL_STRERROR_R''@|$(HAVE_DECL_STRERROR_R)|g' \
+	      -e 's|@''HAVE_DECL_STRSIGNAL''@|$(HAVE_DECL_STRSIGNAL)|g' \
+	      -e 's|@''HAVE_STRVERSCMP''@|$(HAVE_STRVERSCMP)|g' \
+	      -e 's|@''REPLACE_STPNCPY''@|$(REPLACE_STPNCPY)|g' \
+	      -e 's|@''REPLACE_MEMCHR''@|$(REPLACE_MEMCHR)|g' \
+	      -e 's|@''REPLACE_MEMMEM''@|$(REPLACE_MEMMEM)|g' \
+	      -e 's|@''REPLACE_STRCASESTR''@|$(REPLACE_STRCASESTR)|g' \
+	      -e 's|@''REPLACE_STRCHRNUL''@|$(REPLACE_STRCHRNUL)|g' \
+	      -e 's|@''REPLACE_STRDUP''@|$(REPLACE_STRDUP)|g' \
+	      -e 's|@''REPLACE_STRSTR''@|$(REPLACE_STRSTR)|g' \
+	      -e 's|@''REPLACE_STRERROR''@|$(REPLACE_STRERROR)|g' \
+	      -e 's|@''REPLACE_STRERROR_R''@|$(REPLACE_STRERROR_R)|g' \
+	      -e 's|@''REPLACE_STRNCAT''@|$(REPLACE_STRNCAT)|g' \
+	      -e 's|@''REPLACE_STRNDUP''@|$(REPLACE_STRNDUP)|g' \
+	      -e 's|@''REPLACE_STRNLEN''@|$(REPLACE_STRNLEN)|g' \
+	      -e 's|@''REPLACE_STRSIGNAL''@|$(REPLACE_STRSIGNAL)|g' \
+	      -e 's|@''REPLACE_STRTOK_R''@|$(REPLACE_STRTOK_R)|g' \
+	      -e 's|@''UNDEFINE_STRTOK_R''@|$(UNDEFINE_STRTOK_R)|g' \
+	      -e '/definitions of _GL_FUNCDECL_RPL/r $(CXXDEFS_H)' \
+	      -e '/definition of _GL_ARG_NONNULL/r $(ARG_NONNULL_H)' \
+	      -e '/definition of _GL_WARN_ON_USE/r $(WARN_ON_USE_H)'; \
+	      < $(srcdir)/string.in.h; \
+	} > $@-t && \
+	mv $@-t $@
+MOSTLYCLEANFILES += string.h string.h-t
+
+EXTRA_DIST += string.in.h
+
+## end   gnulib module string
+
 ## begin gnulib module strtoimax
 
 

=== added file 'lib/memrchr.c'
--- lib/memrchr.c	1970-01-01 00:00:00 +0000
+++ lib/memrchr.c	2013-02-09 03:12:48 +0000
@@ -0,0 +1,161 @@
+/* memrchr -- find the last occurrence of a byte in a memory block
+
+   Copyright (C) 1991, 1993, 1996-1997, 1999-2000, 2003-2013 Free Software
+   Foundation, Inc.
+
+   Based on strlen implementation by Torbjorn Granlund (tege@sics.se),
+   with help from Dan Sahlin (dan@sics.se) and
+   commentary by Jim Blandy (jimb@ai.mit.edu);
+   adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu),
+   and implemented by Roland McGrath (roland@ai.mit.edu).
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#if defined _LIBC
+# include <memcopy.h>
+#else
+# include <config.h>
+# define reg_char char
+#endif
+
+#include <string.h>
+#include <limits.h>
+
+#undef __memrchr
+#ifdef _LIBC
+# undef memrchr
+#endif
+
+#ifndef weak_alias
+# define __memrchr memrchr
+#endif
+
+/* Search no more than N bytes of S for C.  */
+void *
+__memrchr (void const *s, int c_in, size_t n)
+{
+  /* On 32-bit hardware, choosing longword to be a 32-bit unsigned
+     long instead of a 64-bit uintmax_t tends to give better
+     performance.  On 64-bit hardware, unsigned long is generally 64
+     bits already.  Change this typedef to experiment with
+     performance.  */
+  typedef unsigned long int longword;
+
+  const unsigned char *char_ptr;
+  const longword *longword_ptr;
+  longword repeated_one;
+  longword repeated_c;
+  unsigned reg_char c;
+
+  c = (unsigned char) c_in;
+
+  /* Handle the last few bytes by reading one byte at a time.
+     Do this until CHAR_PTR is aligned on a longword boundary.  */
+  for (char_ptr = (const unsigned char *) s + n;
+       n > 0 && (size_t) char_ptr % sizeof (longword) != 0;
+       --n)
+    if (*--char_ptr == c)
+      return (void *) char_ptr;
+
+  longword_ptr = (const longword *) char_ptr;
+
+  /* All these elucidatory comments refer to 4-byte longwords,
+     but the theory applies equally well to any size longwords.  */
+
+  /* Compute auxiliary longword values:
+     repeated_one is a value which has a 1 in every byte.
+     repeated_c has c in every byte.  */
+  repeated_one = 0x01010101;
+  repeated_c = c | (c << 8);
+  repeated_c |= repeated_c << 16;
+  if (0xffffffffU < (longword) -1)
+    {
+      repeated_one |= repeated_one << 31 << 1;
+      repeated_c |= repeated_c << 31 << 1;
+      if (8 < sizeof (longword))
+        {
+          size_t i;
+
+          for (i = 64; i < sizeof (longword) * 8; i *= 2)
+            {
+              repeated_one |= repeated_one << i;
+              repeated_c |= repeated_c << i;
+            }
+        }
+    }
+
+  /* Instead of the traditional loop which tests each byte, we will test a
+     longword at a time.  The tricky part is testing if *any of the four*
+     bytes in the longword in question are equal to c.  We first use an xor
+     with repeated_c.  This reduces the task to testing whether *any of the
+     four* bytes in longword1 is zero.
+
+     We compute tmp =
+       ((longword1 - repeated_one) & ~longword1) & (repeated_one << 7).
+     That is, we perform the following operations:
+       1. Subtract repeated_one.
+       2. & ~longword1.
+       3. & a mask consisting of 0x80 in every byte.
+     Consider what happens in each byte:
+       - If a byte of longword1 is zero, step 1 and 2 transform it into 0xff,
+         and step 3 transforms it into 0x80.  A carry can also be propagated
+         to more significant bytes.
+       - If a byte of longword1 is nonzero, let its lowest 1 bit be at
+         position k (0 <= k <= 7); so the lowest k bits are 0.  After step 1,
+         the byte ends in a single bit of value 0 and k bits of value 1.
+         After step 2, the result is just k bits of value 1: 2^k - 1.  After
+         step 3, the result is 0.  And no carry is produced.
+     So, if longword1 has only non-zero bytes, tmp is zero.
+     Whereas if longword1 has a zero byte, call j the position of the least
+     significant zero byte.  Then the result has a zero at positions 0, ...,
+     j-1 and a 0x80 at position j.  We cannot predict the result at the more
+     significant bytes (positions j+1..3), but it does not matter since we
+     already have a non-zero bit at position 8*j+7.
+
+     So, the test whether any byte in longword1 is zero is equivalent to
+     testing whether tmp is nonzero.  */
+
+  while (n >= sizeof (longword))
+    {
+      longword longword1 = *--longword_ptr ^ repeated_c;
+
+      if ((((longword1 - repeated_one) & ~longword1)
+           & (repeated_one << 7)) != 0)
+        {
+          longword_ptr++;
+          break;
+        }
+      n -= sizeof (longword);
+    }
+
+  char_ptr = (const unsigned char *) longword_ptr;
+
+  /* At this point, we know that either n < sizeof (longword), or one of the
+     sizeof (longword) bytes starting at char_ptr is == c.  On little-endian
+     machines, we could determine the first such byte without any further
+     memory accesses, just by looking at the tmp result from the last loop
+     iteration.  But this does not work on big-endian machines.  Choose code
+     that works in both cases.  */
+
+  while (n-- > 0)
+    {
+      if (*--char_ptr == c)
+        return (void *) char_ptr;
+    }
+
+  return NULL;
+}
+#ifdef weak_alias
+weak_alias (__memrchr, memrchr)
+#endif

=== added file 'lib/string.in.h'
--- lib/string.in.h	1970-01-01 00:00:00 +0000
+++ lib/string.in.h	2013-02-09 03:12:48 +0000
@@ -0,0 +1,1029 @@
+/* A GNU-like <string.h>.
+
+   Copyright (C) 1995-1996, 2001-2013 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef _@GUARD_PREFIX@_STRING_H
+
+#if __GNUC__ >= 3
+@PRAGMA_SYSTEM_HEADER@
+#endif
+@PRAGMA_COLUMNS@
+
+/* The include_next requires a split double-inclusion guard.  */
+#@INCLUDE_NEXT@ @NEXT_STRING_H@
+
+#ifndef _@GUARD_PREFIX@_STRING_H
+#define _@GUARD_PREFIX@_STRING_H
+
+/* NetBSD 5.0 mis-defines NULL.  */
+#include <stddef.h>
+
+/* MirBSD defines mbslen as a macro.  */
+#if @GNULIB_MBSLEN@ && defined __MirBSD__
+# include <wchar.h>
+#endif
+
+/* The __attribute__ feature is available in gcc versions 2.5 and later.
+   The attribute __pure__ was added in gcc 2.96.  */
+#if __GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 96)
+# define _GL_ATTRIBUTE_PURE __attribute__ ((__pure__))
+#else
+# define _GL_ATTRIBUTE_PURE /* empty */
+#endif
+
+/* NetBSD 5.0 declares strsignal in <unistd.h>, not in <string.h>.  */
+/* But in any case avoid namespace pollution on glibc systems.  */
+#if (@GNULIB_STRSIGNAL@ || defined GNULIB_POSIXCHECK) && defined __NetBSD__ \
+    && ! defined __GLIBC__
+# include <unistd.h>
+#endif
+
+/* The definitions of _GL_FUNCDECL_RPL etc. are copied here.  */
+
+/* The definition of _GL_ARG_NONNULL is copied here.  */
+
+/* The definition of _GL_WARN_ON_USE is copied here.  */
+
+
+/* Find the index of the least-significant set bit.  */
+#if @GNULIB_FFSL@
+# if !@HAVE_FFSL@
+_GL_FUNCDECL_SYS (ffsl, int, (long int i));
+# endif
+_GL_CXXALIAS_SYS (ffsl, int, (long int i));
+_GL_CXXALIASWARN (ffsl);
+#elif defined GNULIB_POSIXCHECK
+# undef ffsl
+# if HAVE_RAW_DECL_FFSL
+_GL_WARN_ON_USE (ffsl, "ffsl is not portable - use the ffsl module");
+# endif
+#endif
+
+
+/* Find the index of the least-significant set bit.  */
+#if @GNULIB_FFSLL@
+# if !@HAVE_FFSLL@
+_GL_FUNCDECL_SYS (ffsll, int, (long long int i));
+# endif
+_GL_CXXALIAS_SYS (ffsll, int, (long long int i));
+_GL_CXXALIASWARN (ffsll);
+#elif defined GNULIB_POSIXCHECK
+# undef ffsll
+# if HAVE_RAW_DECL_FFSLL
+_GL_WARN_ON_USE (ffsll, "ffsll is not portable - use the ffsll module");
+# endif
+#endif
+
+
+/* Return the first instance of C within N bytes of S, or NULL.  */
+#if @GNULIB_MEMCHR@
+# if @REPLACE_MEMCHR@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define memchr rpl_memchr
+#  endif
+_GL_FUNCDECL_RPL (memchr, void *, (void const *__s, int __c, size_t __n)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (memchr, void *, (void const *__s, int __c, size_t __n));
+# else
+#  if ! @HAVE_MEMCHR@
+_GL_FUNCDECL_SYS (memchr, void *, (void const *__s, int __c, size_t __n)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+#  endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C" { const void * std::memchr (const void *, int, size_t); }
+       extern "C++" { void * std::memchr (void *, int, size_t); }  */
+_GL_CXXALIAS_SYS_CAST2 (memchr,
+                        void *, (void const *__s, int __c, size_t __n),
+                        void const *, (void const *__s, int __c, size_t __n));
+# endif
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (memchr, void *, (void *__s, int __c, size_t __n));
+_GL_CXXALIASWARN1 (memchr, void const *,
+                   (void const *__s, int __c, size_t __n));
+# else
+_GL_CXXALIASWARN (memchr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef memchr
+/* Assume memchr is always declared.  */
+_GL_WARN_ON_USE (memchr, "memchr has platform-specific bugs - "
+                 "use gnulib module memchr for portability" );
+#endif
+
+/* Return the first occurrence of NEEDLE in HAYSTACK.  */
+#if @GNULIB_MEMMEM@
+# if @REPLACE_MEMMEM@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define memmem rpl_memmem
+#  endif
+_GL_FUNCDECL_RPL (memmem, void *,
+                  (void const *__haystack, size_t __haystack_len,
+                   void const *__needle, size_t __needle_len)
+                  _GL_ATTRIBUTE_PURE
+                  _GL_ARG_NONNULL ((1, 3)));
+_GL_CXXALIAS_RPL (memmem, void *,
+                  (void const *__haystack, size_t __haystack_len,
+                   void const *__needle, size_t __needle_len));
+# else
+#  if ! @HAVE_DECL_MEMMEM@
+_GL_FUNCDECL_SYS (memmem, void *,
+                  (void const *__haystack, size_t __haystack_len,
+                   void const *__needle, size_t __needle_len)
+                  _GL_ATTRIBUTE_PURE
+                  _GL_ARG_NONNULL ((1, 3)));
+#  endif
+_GL_CXXALIAS_SYS (memmem, void *,
+                  (void const *__haystack, size_t __haystack_len,
+                   void const *__needle, size_t __needle_len));
+# endif
+_GL_CXXALIASWARN (memmem);
+#elif defined GNULIB_POSIXCHECK
+# undef memmem
+# if HAVE_RAW_DECL_MEMMEM
+_GL_WARN_ON_USE (memmem, "memmem is unportable and often quadratic - "
+                 "use gnulib module memmem-simple for portability, "
+                 "and module memmem for speed" );
+# endif
+#endif
+
+/* Copy N bytes of SRC to DEST, return pointer to bytes after the
+   last written byte.  */
+#if @GNULIB_MEMPCPY@
+# if ! @HAVE_MEMPCPY@
+_GL_FUNCDECL_SYS (mempcpy, void *,
+                  (void *restrict __dest, void const *restrict __src,
+                   size_t __n)
+                  _GL_ARG_NONNULL ((1, 2)));
+# endif
+_GL_CXXALIAS_SYS (mempcpy, void *,
+                  (void *restrict __dest, void const *restrict __src,
+                   size_t __n));
+_GL_CXXALIASWARN (mempcpy);
+#elif defined GNULIB_POSIXCHECK
+# undef mempcpy
+# if HAVE_RAW_DECL_MEMPCPY
+_GL_WARN_ON_USE (mempcpy, "mempcpy is unportable - "
+                 "use gnulib module mempcpy for portability");
+# endif
+#endif
+
+/* Search backwards through a block for a byte (specified as an int).  */
+#if @GNULIB_MEMRCHR@
+# if ! @HAVE_DECL_MEMRCHR@
+_GL_FUNCDECL_SYS (memrchr, void *, (void const *, int, size_t)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+# endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const void * std::memrchr (const void *, int, size_t); }
+       extern "C++" { void * std::memrchr (void *, int, size_t); }  */
+_GL_CXXALIAS_SYS_CAST2 (memrchr,
+                        void *, (void const *, int, size_t),
+                        void const *, (void const *, int, size_t));
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (memrchr, void *, (void *, int, size_t));
+_GL_CXXALIASWARN1 (memrchr, void const *, (void const *, int, size_t));
+# else
+_GL_CXXALIASWARN (memrchr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef memrchr
+# if HAVE_RAW_DECL_MEMRCHR
+_GL_WARN_ON_USE (memrchr, "memrchr is unportable - "
+                 "use gnulib module memrchr for portability");
+# endif
+#endif
+
+/* Find the first occurrence of C in S.  More efficient than
+   memchr(S,C,N), at the expense of undefined behavior if C does not
+   occur within N bytes.  */
+#if @GNULIB_RAWMEMCHR@
+# if ! @HAVE_RAWMEMCHR@
+_GL_FUNCDECL_SYS (rawmemchr, void *, (void const *__s, int __c_in)
+                                     _GL_ATTRIBUTE_PURE
+                                     _GL_ARG_NONNULL ((1)));
+# endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const void * std::rawmemchr (const void *, int); }
+       extern "C++" { void * std::rawmemchr (void *, int); }  */
+_GL_CXXALIAS_SYS_CAST2 (rawmemchr,
+                        void *, (void const *__s, int __c_in),
+                        void const *, (void const *__s, int __c_in));
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (rawmemchr, void *, (void *__s, int __c_in));
+_GL_CXXALIASWARN1 (rawmemchr, void const *, (void const *__s, int __c_in));
+# else
+_GL_CXXALIASWARN (rawmemchr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef rawmemchr
+# if HAVE_RAW_DECL_RAWMEMCHR
+_GL_WARN_ON_USE (rawmemchr, "rawmemchr is unportable - "
+                 "use gnulib module rawmemchr for portability");
+# endif
+#endif
+
+/* Copy SRC to DST, returning the address of the terminating '\0' in DST.  */
+#if @GNULIB_STPCPY@
+# if ! @HAVE_STPCPY@
+_GL_FUNCDECL_SYS (stpcpy, char *,
+                  (char *restrict __dst, char const *restrict __src)
+                  _GL_ARG_NONNULL ((1, 2)));
+# endif
+_GL_CXXALIAS_SYS (stpcpy, char *,
+                  (char *restrict __dst, char const *restrict __src));
+_GL_CXXALIASWARN (stpcpy);
+#elif defined GNULIB_POSIXCHECK
+# undef stpcpy
+# if HAVE_RAW_DECL_STPCPY
+_GL_WARN_ON_USE (stpcpy, "stpcpy is unportable - "
+                 "use gnulib module stpcpy for portability");
+# endif
+#endif
+
+/* Copy no more than N bytes of SRC to DST, returning a pointer past the
+   last non-NUL byte written into DST.  */
+#if @GNULIB_STPNCPY@
+# if @REPLACE_STPNCPY@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef stpncpy
+#   define stpncpy rpl_stpncpy
+#  endif
+_GL_FUNCDECL_RPL (stpncpy, char *,
+                  (char *restrict __dst, char const *restrict __src,
+                   size_t __n)
+                  _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (stpncpy, char *,
+                  (char *restrict __dst, char const *restrict __src,
+                   size_t __n));
+# else
+#  if ! @HAVE_STPNCPY@
+_GL_FUNCDECL_SYS (stpncpy, char *,
+                  (char *restrict __dst, char const *restrict __src,
+                   size_t __n)
+                  _GL_ARG_NONNULL ((1, 2)));
+#  endif
+_GL_CXXALIAS_SYS (stpncpy, char *,
+                  (char *restrict __dst, char const *restrict __src,
+                   size_t __n));
+# endif
+_GL_CXXALIASWARN (stpncpy);
+#elif defined GNULIB_POSIXCHECK
+# undef stpncpy
+# if HAVE_RAW_DECL_STPNCPY
+_GL_WARN_ON_USE (stpncpy, "stpncpy is unportable - "
+                 "use gnulib module stpncpy for portability");
+# endif
+#endif
+
+#if defined GNULIB_POSIXCHECK
+/* strchr() does not work with multibyte strings if the locale encoding is
+   GB18030 and the character to be searched is a digit.  */
+# undef strchr
+/* Assume strchr is always declared.  */
+_GL_WARN_ON_USE (strchr, "strchr cannot work correctly on character strings "
+                 "in some multibyte locales - "
+                 "use mbschr if you care about internationalization");
+#endif
+
+/* Find the first occurrence of C in S or the final NUL byte.  */
+#if @GNULIB_STRCHRNUL@
+# if @REPLACE_STRCHRNUL@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define strchrnul rpl_strchrnul
+#  endif
+_GL_FUNCDECL_RPL (strchrnul, char *, (const char *__s, int __c_in)
+                                     _GL_ATTRIBUTE_PURE
+                                     _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (strchrnul, char *,
+                  (const char *str, int ch));
+# else
+#  if ! @HAVE_STRCHRNUL@
+_GL_FUNCDECL_SYS (strchrnul, char *, (char const *__s, int __c_in)
+                                     _GL_ATTRIBUTE_PURE
+                                     _GL_ARG_NONNULL ((1)));
+#  endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const char * std::strchrnul (const char *, int); }
+       extern "C++" { char * std::strchrnul (char *, int); }  */
+_GL_CXXALIAS_SYS_CAST2 (strchrnul,
+                        char *, (char const *__s, int __c_in),
+                        char const *, (char const *__s, int __c_in));
+# endif
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (strchrnul, char *, (char *__s, int __c_in));
+_GL_CXXALIASWARN1 (strchrnul, char const *, (char const *__s, int __c_in));
+# else
+_GL_CXXALIASWARN (strchrnul);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strchrnul
+# if HAVE_RAW_DECL_STRCHRNUL
+_GL_WARN_ON_USE (strchrnul, "strchrnul is unportable - "
+                 "use gnulib module strchrnul for portability");
+# endif
+#endif
+
+/* Duplicate S, returning an identical malloc'd string.  */
+#if @GNULIB_STRDUP@
+# if @REPLACE_STRDUP@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strdup
+#   define strdup rpl_strdup
+#  endif
+_GL_FUNCDECL_RPL (strdup, char *, (char const *__s) _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (strdup, char *, (char const *__s));
+# else
+#  if defined __cplusplus && defined GNULIB_NAMESPACE && defined strdup
+    /* strdup exists as a function and as a macro.  Get rid of the macro.  */
+#   undef strdup
+#  endif
+#  if !(@HAVE_DECL_STRDUP@ || defined strdup)
+_GL_FUNCDECL_SYS (strdup, char *, (char const *__s) _GL_ARG_NONNULL ((1)));
+#  endif
+_GL_CXXALIAS_SYS (strdup, char *, (char const *__s));
+# endif
+_GL_CXXALIASWARN (strdup);
+#elif defined GNULIB_POSIXCHECK
+# undef strdup
+# if HAVE_RAW_DECL_STRDUP
+_GL_WARN_ON_USE (strdup, "strdup is unportable - "
+                 "use gnulib module strdup for portability");
+# endif
+#endif
+
+/* Append no more than N characters from SRC onto DEST.  */
+#if @GNULIB_STRNCAT@
+# if @REPLACE_STRNCAT@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strncat
+#   define strncat rpl_strncat
+#  endif
+_GL_FUNCDECL_RPL (strncat, char *, (char *dest, const char *src, size_t n)
+                                   _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (strncat, char *, (char *dest, const char *src, size_t n));
+# else
+_GL_CXXALIAS_SYS (strncat, char *, (char *dest, const char *src, size_t n));
+# endif
+_GL_CXXALIASWARN (strncat);
+#elif defined GNULIB_POSIXCHECK
+# undef strncat
+# if HAVE_RAW_DECL_STRNCAT
+_GL_WARN_ON_USE (strncat, "strncat is unportable - "
+                 "use gnulib module strncat for portability");
+# endif
+#endif
+
+/* Return a newly allocated copy of at most N bytes of STRING.  */
+#if @GNULIB_STRNDUP@
+# if @REPLACE_STRNDUP@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strndup
+#   define strndup rpl_strndup
+#  endif
+_GL_FUNCDECL_RPL (strndup, char *, (char const *__string, size_t __n)
+                                   _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (strndup, char *, (char const *__string, size_t __n));
+# else
+#  if ! @HAVE_DECL_STRNDUP@
+_GL_FUNCDECL_SYS (strndup, char *, (char const *__string, size_t __n)
+                                   _GL_ARG_NONNULL ((1)));
+#  endif
+_GL_CXXALIAS_SYS (strndup, char *, (char const *__string, size_t __n));
+# endif
+_GL_CXXALIASWARN (strndup);
+#elif defined GNULIB_POSIXCHECK
+# undef strndup
+# if HAVE_RAW_DECL_STRNDUP
+_GL_WARN_ON_USE (strndup, "strndup is unportable - "
+                 "use gnulib module strndup for portability");
+# endif
+#endif
+
+/* Find the length (number of bytes) of STRING, but scan at most
+   MAXLEN bytes.  If no '\0' terminator is found in that many bytes,
+   return MAXLEN.  */
+#if @GNULIB_STRNLEN@
+# if @REPLACE_STRNLEN@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strnlen
+#   define strnlen rpl_strnlen
+#  endif
+_GL_FUNCDECL_RPL (strnlen, size_t, (char const *__string, size_t __maxlen)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (strnlen, size_t, (char const *__string, size_t __maxlen));
+# else
+#  if ! @HAVE_DECL_STRNLEN@
+_GL_FUNCDECL_SYS (strnlen, size_t, (char const *__string, size_t __maxlen)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+#  endif
+_GL_CXXALIAS_SYS (strnlen, size_t, (char const *__string, size_t __maxlen));
+# endif
+_GL_CXXALIASWARN (strnlen);
+#elif defined GNULIB_POSIXCHECK
+# undef strnlen
+# if HAVE_RAW_DECL_STRNLEN
+_GL_WARN_ON_USE (strnlen, "strnlen is unportable - "
+                 "use gnulib module strnlen for portability");
+# endif
+#endif
+
+#if defined GNULIB_POSIXCHECK
+/* strcspn() assumes the second argument is a list of single-byte characters.
+   Even in this simple case, it does not work with multibyte strings if the
+   locale encoding is GB18030 and one of the characters to be searched is a
+   digit.  */
+# undef strcspn
+/* Assume strcspn is always declared.  */
+_GL_WARN_ON_USE (strcspn, "strcspn cannot work correctly on character strings "
+                 "in multibyte locales - "
+                 "use mbscspn if you care about internationalization");
+#endif
+
+/* Find the first occurrence in S of any character in ACCEPT.  */
+#if @GNULIB_STRPBRK@
+# if ! @HAVE_STRPBRK@
+_GL_FUNCDECL_SYS (strpbrk, char *, (char const *__s, char const *__accept)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1, 2)));
+# endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C" { const char * strpbrk (const char *, const char *); }
+       extern "C++" { char * strpbrk (char *, const char *); }  */
+_GL_CXXALIAS_SYS_CAST2 (strpbrk,
+                        char *, (char const *__s, char const *__accept),
+                        const char *, (char const *__s, char const *__accept));
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (strpbrk, char *, (char *__s, char const *__accept));
+_GL_CXXALIASWARN1 (strpbrk, char const *,
+                   (char const *__s, char const *__accept));
+# else
+_GL_CXXALIASWARN (strpbrk);
+# endif
+# if defined GNULIB_POSIXCHECK
+/* strpbrk() assumes the second argument is a list of single-byte characters.
+   Even in this simple case, it does not work with multibyte strings if the
+   locale encoding is GB18030 and one of the characters to be searched is a
+   digit.  */
+#  undef strpbrk
+_GL_WARN_ON_USE (strpbrk, "strpbrk cannot work correctly on character strings "
+                 "in multibyte locales - "
+                 "use mbspbrk if you care about internationalization");
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strpbrk
+# if HAVE_RAW_DECL_STRPBRK
+_GL_WARN_ON_USE (strpbrk, "strpbrk is unportable - "
+                 "use gnulib module strpbrk for portability");
+# endif
+#endif
+
+#if defined GNULIB_POSIXCHECK
+/* strspn() assumes the second argument is a list of single-byte characters.
+   Even in this simple case, it cannot work with multibyte strings.  */
+# undef strspn
+/* Assume strspn is always declared.  */
+_GL_WARN_ON_USE (strspn, "strspn cannot work correctly on character strings "
+                 "in multibyte locales - "
+                 "use mbsspn if you care about internationalization");
+#endif
+
+#if defined GNULIB_POSIXCHECK
+/* strrchr() does not work with multibyte strings if the locale encoding is
+   GB18030 and the character to be searched is a digit.  */
+# undef strrchr
+/* Assume strrchr is always declared.  */
+_GL_WARN_ON_USE (strrchr, "strrchr cannot work correctly on character strings "
+                 "in some multibyte locales - "
+                 "use mbsrchr if you care about internationalization");
+#endif
+
+/* Search the next delimiter (char listed in DELIM) starting at *STRINGP.
+   If one is found, overwrite it with a NUL, and advance *STRINGP
+   to point to the next char after it.  Otherwise, set *STRINGP to NULL.
+   If *STRINGP was already NULL, nothing happens.
+   Return the old value of *STRINGP.
+
+   This is a variant of strtok() that is multithread-safe and supports
+   empty fields.
+
+   Caveat: It modifies the original string.
+   Caveat: These functions cannot be used on constant strings.
+   Caveat: The identity of the delimiting character is lost.
+   Caveat: It doesn't work with multibyte strings unless all of the delimiter
+           characters are ASCII characters < 0x30.
+
+   See also strtok_r().  */
+#if @GNULIB_STRSEP@
+# if ! @HAVE_STRSEP@
+_GL_FUNCDECL_SYS (strsep, char *,
+                  (char **restrict __stringp, char const *restrict __delim)
+                  _GL_ARG_NONNULL ((1, 2)));
+# endif
+_GL_CXXALIAS_SYS (strsep, char *,
+                  (char **restrict __stringp, char const *restrict __delim));
+_GL_CXXALIASWARN (strsep);
+# if defined GNULIB_POSIXCHECK
+#  undef strsep
+_GL_WARN_ON_USE (strsep, "strsep cannot work correctly on character strings "
+                 "in multibyte locales - "
+                 "use mbssep if you care about internationalization");
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strsep
+# if HAVE_RAW_DECL_STRSEP
+_GL_WARN_ON_USE (strsep, "strsep is unportable - "
+                 "use gnulib module strsep for portability");
+# endif
+#endif
+
+#if @GNULIB_STRSTR@
+# if @REPLACE_STRSTR@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define strstr rpl_strstr
+#  endif
+_GL_FUNCDECL_RPL (strstr, char *, (const char *haystack, const char *needle)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (strstr, char *, (const char *haystack, const char *needle));
+# else
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const char * strstr (const char *, const char *); }
+       extern "C++" { char * strstr (char *, const char *); }  */
+_GL_CXXALIAS_SYS_CAST2 (strstr,
+                        char *, (const char *haystack, const char *needle),
+                        const char *, (const char *haystack, const char *needle));
+# endif
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (strstr, char *, (char *haystack, const char *needle));
+_GL_CXXALIASWARN1 (strstr, const char *,
+                   (const char *haystack, const char *needle));
+# else
+_GL_CXXALIASWARN (strstr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+/* strstr() does not work with multibyte strings if the locale encoding is
+   different from UTF-8:
+   POSIX says that it operates on "strings", and "string" in POSIX is defined
+   as a sequence of bytes, not of characters.  */
+# undef strstr
+/* Assume strstr is always declared.  */
+_GL_WARN_ON_USE (strstr, "strstr is quadratic on many systems, and cannot "
+                 "work correctly on character strings in most "
+                 "multibyte locales - "
+                 "use mbsstr if you care about internationalization, "
+                 "or use strstr if you care about speed");
+#endif
+
+/* Find the first occurrence of NEEDLE in HAYSTACK, using case-insensitive
+   comparison.  */
+#if @GNULIB_STRCASESTR@
+# if @REPLACE_STRCASESTR@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define strcasestr rpl_strcasestr
+#  endif
+_GL_FUNCDECL_RPL (strcasestr, char *,
+                  (const char *haystack, const char *needle)
+                  _GL_ATTRIBUTE_PURE
+                  _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (strcasestr, char *,
+                  (const char *haystack, const char *needle));
+# else
+#  if ! @HAVE_STRCASESTR@
+_GL_FUNCDECL_SYS (strcasestr, char *,
+                  (const char *haystack, const char *needle)
+                  _GL_ATTRIBUTE_PURE
+                  _GL_ARG_NONNULL ((1, 2)));
+#  endif
+  /* On some systems, this function is defined as an overloaded function:
+       extern "C++" { const char * strcasestr (const char *, const char *); }
+       extern "C++" { char * strcasestr (char *, const char *); }  */
+_GL_CXXALIAS_SYS_CAST2 (strcasestr,
+                        char *, (const char *haystack, const char *needle),
+                        const char *, (const char *haystack, const char *needle));
+# endif
+# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \
+     && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
+_GL_CXXALIASWARN1 (strcasestr, char *, (char *haystack, const char *needle));
+_GL_CXXALIASWARN1 (strcasestr, const char *,
+                   (const char *haystack, const char *needle));
+# else
+_GL_CXXALIASWARN (strcasestr);
+# endif
+#elif defined GNULIB_POSIXCHECK
+/* strcasestr() does not work with multibyte strings:
+   It is a glibc extension, and glibc implements it only for unibyte
+   locales.  */
+# undef strcasestr
+# if HAVE_RAW_DECL_STRCASESTR
+_GL_WARN_ON_USE (strcasestr, "strcasestr does work correctly on character "
+                 "strings in multibyte locales - "
+                 "use mbscasestr if you care about "
+                 "internationalization, or use c-strcasestr if you want "
+                 "a locale independent function");
+# endif
+#endif
+
+/* Parse S into tokens separated by characters in DELIM.
+   If S is NULL, the saved pointer in SAVE_PTR is used as
+   the next starting point.  For example:
+        char s[] = "-abc-=-def";
+        char *sp;
+        x = strtok_r(s, "-", &sp);      // x = "abc", sp = "=-def"
+        x = strtok_r(NULL, "-=", &sp);  // x = "def", sp = NULL
+        x = strtok_r(NULL, "=", &sp);   // x = NULL
+                // s = "abc\0-def\0"
+
+   This is a variant of strtok() that is multithread-safe.
+
+   For the POSIX documentation for this function, see:
+   http://www.opengroup.org/susv3xsh/strtok.html
+
+   Caveat: It modifies the original string.
+   Caveat: These functions cannot be used on constant strings.
+   Caveat: The identity of the delimiting character is lost.
+   Caveat: It doesn't work with multibyte strings unless all of the delimiter
+           characters are ASCII characters < 0x30.
+
+   See also strsep().  */
+#if @GNULIB_STRTOK_R@
+# if @REPLACE_STRTOK_R@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strtok_r
+#   define strtok_r rpl_strtok_r
+#  endif
+_GL_FUNCDECL_RPL (strtok_r, char *,
+                  (char *restrict s, char const *restrict delim,
+                   char **restrict save_ptr)
+                  _GL_ARG_NONNULL ((2, 3)));
+_GL_CXXALIAS_RPL (strtok_r, char *,
+                  (char *restrict s, char const *restrict delim,
+                   char **restrict save_ptr));
+# else
+#  if @UNDEFINE_STRTOK_R@ || defined GNULIB_POSIXCHECK
+#   undef strtok_r
+#  endif
+#  if ! @HAVE_DECL_STRTOK_R@
+_GL_FUNCDECL_SYS (strtok_r, char *,
+                  (char *restrict s, char const *restrict delim,
+                   char **restrict save_ptr)
+                  _GL_ARG_NONNULL ((2, 3)));
+#  endif
+_GL_CXXALIAS_SYS (strtok_r, char *,
+                  (char *restrict s, char const *restrict delim,
+                   char **restrict save_ptr));
+# endif
+_GL_CXXALIASWARN (strtok_r);
+# if defined GNULIB_POSIXCHECK
+_GL_WARN_ON_USE (strtok_r, "strtok_r cannot work correctly on character "
+                 "strings in multibyte locales - "
+                 "use mbstok_r if you care about internationalization");
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strtok_r
+# if HAVE_RAW_DECL_STRTOK_R
+_GL_WARN_ON_USE (strtok_r, "strtok_r is unportable - "
+                 "use gnulib module strtok_r for portability");
+# endif
+#endif
+
+
+/* The following functions are not specified by POSIX.  They are gnulib
+   extensions.  */
+
+#if @GNULIB_MBSLEN@
+/* Return the number of multibyte characters in the character string STRING.
+   This considers multibyte characters, unlike strlen, which counts bytes.  */
+# ifdef __MirBSD__  /* MirBSD defines mbslen as a macro.  Override it.  */
+#  undef mbslen
+# endif
+# if @HAVE_MBSLEN@  /* AIX, OSF/1, MirBSD define mbslen already in libc.  */
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define mbslen rpl_mbslen
+#  endif
+_GL_FUNCDECL_RPL (mbslen, size_t, (const char *string)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (mbslen, size_t, (const char *string));
+# else
+_GL_FUNCDECL_SYS (mbslen, size_t, (const char *string)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_SYS (mbslen, size_t, (const char *string));
+# endif
+_GL_CXXALIASWARN (mbslen);
+#endif
+
+#if @GNULIB_MBSNLEN@
+/* Return the number of multibyte characters in the character string starting
+   at STRING and ending at STRING + LEN.  */
+_GL_EXTERN_C size_t mbsnlen (const char *string, size_t len)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1));
+#endif
+
+#if @GNULIB_MBSCHR@
+/* Locate the first single-byte character C in the character string STRING,
+   and return a pointer to it.  Return NULL if C is not found in STRING.
+   Unlike strchr(), this function works correctly in multibyte locales with
+   encodings such as GB18030.  */
+# if defined __hpux
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define mbschr rpl_mbschr /* avoid collision with HP-UX function */
+#  endif
+_GL_FUNCDECL_RPL (mbschr, char *, (const char *string, int c)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (mbschr, char *, (const char *string, int c));
+# else
+_GL_FUNCDECL_SYS (mbschr, char *, (const char *string, int c)
+                                  _GL_ATTRIBUTE_PURE
+                                  _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_SYS (mbschr, char *, (const char *string, int c));
+# endif
+_GL_CXXALIASWARN (mbschr);
+#endif
+
+#if @GNULIB_MBSRCHR@
+/* Locate the last single-byte character C in the character string STRING,
+   and return a pointer to it.  Return NULL if C is not found in STRING.
+   Unlike strrchr(), this function works correctly in multibyte locales with
+   encodings such as GB18030.  */
+# if defined __hpux || defined __INTERIX
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define mbsrchr rpl_mbsrchr /* avoid collision with system function */
+#  endif
+_GL_FUNCDECL_RPL (mbsrchr, char *, (const char *string, int c)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_RPL (mbsrchr, char *, (const char *string, int c));
+# else
+_GL_FUNCDECL_SYS (mbsrchr, char *, (const char *string, int c)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1)));
+_GL_CXXALIAS_SYS (mbsrchr, char *, (const char *string, int c));
+# endif
+_GL_CXXALIASWARN (mbsrchr);
+#endif
+
+#if @GNULIB_MBSSTR@
+/* Find the first occurrence of the character string NEEDLE in the character
+   string HAYSTACK.  Return NULL if NEEDLE is not found in HAYSTACK.
+   Unlike strstr(), this function works correctly in multibyte locales with
+   encodings different from UTF-8.  */
+_GL_EXTERN_C char * mbsstr (const char *haystack, const char *needle)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSCASECMP@
+/* Compare the character strings S1 and S2, ignoring case, returning less than,
+   equal to or greater than zero if S1 is lexicographically less than, equal to
+   or greater than S2.
+   Note: This function may, in multibyte locales, return 0 for strings of
+   different lengths!
+   Unlike strcasecmp(), this function works correctly in multibyte locales.  */
+_GL_EXTERN_C int mbscasecmp (const char *s1, const char *s2)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSNCASECMP@
+/* Compare the initial segment of the character string S1 consisting of at most
+   N characters with the initial segment of the character string S2 consisting
+   of at most N characters, ignoring case, returning less than, equal to or
+   greater than zero if the initial segment of S1 is lexicographically less
+   than, equal to or greater than the initial segment of S2.
+   Note: This function may, in multibyte locales, return 0 for initial segments
+   of different lengths!
+   Unlike strncasecmp(), this function works correctly in multibyte locales.
+   But beware that N is not a byte count but a character count!  */
+_GL_EXTERN_C int mbsncasecmp (const char *s1, const char *s2, size_t n)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSPCASECMP@
+/* Compare the initial segment of the character string STRING consisting of
+   at most mbslen (PREFIX) characters with the character string PREFIX,
+   ignoring case.  If the two match, return a pointer to the first byte
+   after this prefix in STRING.  Otherwise, return NULL.
+   Note: This function may, in multibyte locales, return non-NULL if STRING
+   is of smaller length than PREFIX!
+   Unlike strncasecmp(), this function works correctly in multibyte
+   locales.  */
+_GL_EXTERN_C char * mbspcasecmp (const char *string, const char *prefix)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSCASESTR@
+/* Find the first occurrence of the character string NEEDLE in the character
+   string HAYSTACK, using case-insensitive comparison.
+   Note: This function may, in multibyte locales, return success even if
+   strlen (haystack) < strlen (needle) !
+   Unlike strcasestr(), this function works correctly in multibyte locales.  */
+_GL_EXTERN_C char * mbscasestr (const char *haystack, const char *needle)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSCSPN@
+/* Find the first occurrence in the character string STRING of any character
+   in the character string ACCEPT.  Return the number of bytes from the
+   beginning of the string to this occurrence, or to the end of the string
+   if none exists.
+   Unlike strcspn(), this function works correctly in multibyte locales.  */
+_GL_EXTERN_C size_t mbscspn (const char *string, const char *accept)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSPBRK@
+/* Find the first occurrence in the character string STRING of any character
+   in the character string ACCEPT.  Return the pointer to it, or NULL if none
+   exists.
+   Unlike strpbrk(), this function works correctly in multibyte locales.  */
+# if defined __hpux
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define mbspbrk rpl_mbspbrk /* avoid collision with HP-UX function */
+#  endif
+_GL_FUNCDECL_RPL (mbspbrk, char *, (const char *string, const char *accept)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_RPL (mbspbrk, char *, (const char *string, const char *accept));
+# else
+_GL_FUNCDECL_SYS (mbspbrk, char *, (const char *string, const char *accept)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1, 2)));
+_GL_CXXALIAS_SYS (mbspbrk, char *, (const char *string, const char *accept));
+# endif
+_GL_CXXALIASWARN (mbspbrk);
+#endif
+
+#if @GNULIB_MBSSPN@
+/* Find the first occurrence in the character string STRING of any character
+   not in the character string REJECT.  Return the number of bytes from the
+   beginning of the string to this occurrence, or to the end of the string
+   if none exists.
+   Unlike strspn(), this function works correctly in multibyte locales.  */
+_GL_EXTERN_C size_t mbsspn (const char *string, const char *reject)
+     _GL_ATTRIBUTE_PURE
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSSEP@
+/* Search the next delimiter (multibyte character listed in the character
+   string DELIM) starting at the character string *STRINGP.
+   If one is found, overwrite it with a NUL, and advance *STRINGP to point
+   to the next multibyte character after it.  Otherwise, set *STRINGP to NULL.
+   If *STRINGP was already NULL, nothing happens.
+   Return the old value of *STRINGP.
+
+   This is a variant of mbstok_r() that supports empty fields.
+
+   Caveat: It modifies the original string.
+   Caveat: These functions cannot be used on constant strings.
+   Caveat: The identity of the delimiting character is lost.
+
+   See also mbstok_r().  */
+_GL_EXTERN_C char * mbssep (char **stringp, const char *delim)
+     _GL_ARG_NONNULL ((1, 2));
+#endif
+
+#if @GNULIB_MBSTOK_R@
+/* Parse the character string STRING into tokens separated by characters in
+   the character string DELIM.
+   If STRING is NULL, the saved pointer in SAVE_PTR is used as
+   the next starting point.  For example:
+        char s[] = "-abc-=-def";
+        char *sp;
+        x = mbstok_r(s, "-", &sp);      // x = "abc", sp = "=-def"
+        x = mbstok_r(NULL, "-=", &sp);  // x = "def", sp = NULL
+        x = mbstok_r(NULL, "=", &sp);   // x = NULL
+                // s = "abc\0-def\0"
+
+   Caveat: It modifies the original string.
+   Caveat: These functions cannot be used on constant strings.
+   Caveat: The identity of the delimiting character is lost.
+
+   See also mbssep().  */
+_GL_EXTERN_C char * mbstok_r (char *string, const char *delim, char **save_ptr)
+     _GL_ARG_NONNULL ((2, 3));
+#endif
+
+/* Map any int, typically from errno, into an error message.  */
+#if @GNULIB_STRERROR@
+# if @REPLACE_STRERROR@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strerror
+#   define strerror rpl_strerror
+#  endif
+_GL_FUNCDECL_RPL (strerror, char *, (int));
+_GL_CXXALIAS_RPL (strerror, char *, (int));
+# else
+_GL_CXXALIAS_SYS (strerror, char *, (int));
+# endif
+_GL_CXXALIASWARN (strerror);
+#elif defined GNULIB_POSIXCHECK
+# undef strerror
+/* Assume strerror is always declared.  */
+_GL_WARN_ON_USE (strerror, "strerror is unportable - "
+                 "use gnulib module strerror to guarantee non-NULL result");
+#endif
+
+/* Map any int, typically from errno, into an error message.  Multithread-safe.
+   Uses the POSIX declaration, not the glibc declaration.  */
+#if @GNULIB_STRERROR_R@
+# if @REPLACE_STRERROR_R@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   undef strerror_r
+#   define strerror_r rpl_strerror_r
+#  endif
+_GL_FUNCDECL_RPL (strerror_r, int, (int errnum, char *buf, size_t buflen)
+                                   _GL_ARG_NONNULL ((2)));
+_GL_CXXALIAS_RPL (strerror_r, int, (int errnum, char *buf, size_t buflen));
+# else
+#  if !@HAVE_DECL_STRERROR_R@
+_GL_FUNCDECL_SYS (strerror_r, int, (int errnum, char *buf, size_t buflen)
+                                   _GL_ARG_NONNULL ((2)));
+#  endif
+_GL_CXXALIAS_SYS (strerror_r, int, (int errnum, char *buf, size_t buflen));
+# endif
+# if @HAVE_DECL_STRERROR_R@
+_GL_CXXALIASWARN (strerror_r);
+# endif
+#elif defined GNULIB_POSIXCHECK
+# undef strerror_r
+# if HAVE_RAW_DECL_STRERROR_R
+_GL_WARN_ON_USE (strerror_r, "strerror_r is unportable - "
+                 "use gnulib module strerror_r-posix for portability");
+# endif
+#endif
+
+#if @GNULIB_STRSIGNAL@
+# if @REPLACE_STRSIGNAL@
+#  if !(defined __cplusplus && defined GNULIB_NAMESPACE)
+#   define strsignal rpl_strsignal
+#  endif
+_GL_FUNCDECL_RPL (strsignal, char *, (int __sig));
+_GL_CXXALIAS_RPL (strsignal, char *, (int __sig));
+# else
+#  if ! @HAVE_DECL_STRSIGNAL@
+_GL_FUNCDECL_SYS (strsignal, char *, (int __sig));
+#  endif
+/* Need to cast, because on Cygwin 1.5.x systems, the return type is
+   'const char *'.  */
+_GL_CXXALIAS_SYS_CAST (strsignal, char *, (int __sig));
+# endif
+_GL_CXXALIASWARN (strsignal);
+#elif defined GNULIB_POSIXCHECK
+# undef strsignal
+# if HAVE_RAW_DECL_STRSIGNAL
+_GL_WARN_ON_USE (strsignal, "strsignal is unportable - "
+                 "use gnulib module strsignal for portability");
+# endif
+#endif
+
+#if @GNULIB_STRVERSCMP@
+# if !@HAVE_STRVERSCMP@
+_GL_FUNCDECL_SYS (strverscmp, int, (const char *, const char *)
+                                   _GL_ATTRIBUTE_PURE
+                                   _GL_ARG_NONNULL ((1, 2)));
+# endif
+_GL_CXXALIAS_SYS (strverscmp, int, (const char *, const char *));
+_GL_CXXALIASWARN (strverscmp);
+#elif defined GNULIB_POSIXCHECK
+# undef strverscmp
+# if HAVE_RAW_DECL_STRVERSCMP
+_GL_WARN_ON_USE (strverscmp, "strverscmp is unportable - "
+                 "use gnulib module strverscmp for portability");
+# endif
+#endif
+
+
+#endif /* _@GUARD_PREFIX@_STRING_H */
+#endif /* _@GUARD_PREFIX@_STRING_H */

=== modified file 'm4/gnulib-comp.m4'
--- m4/gnulib-comp.m4	2013-02-01 06:30:51 +0000
+++ m4/gnulib-comp.m4	2013-02-09 03:12:48 +0000
@@ -83,6 +83,7 @@
   AC_REQUIRE([AC_SYS_LARGEFILE])
   # Code from module lstat:
   # Code from module manywarnings:
+  # Code from module memrchr:
   # Code from module mktime:
   # Code from module multiarch:
   # Code from module nocrash:
@@ -117,6 +118,7 @@
   # Code from module stdio:
   # Code from module stdlib:
   # Code from module strftime:
+  # Code from module string:
   # Code from module strtoimax:
   # Code from module strtoll:
   # Code from module strtoull:
@@ -242,6 +244,12 @@
     gl_PREREQ_LSTAT
   fi
   gl_SYS_STAT_MODULE_INDICATOR([lstat])
+  gl_FUNC_MEMRCHR
+  if test $ac_cv_func_memrchr = no; then
+    AC_LIBOBJ([memrchr])
+    gl_PREREQ_MEMRCHR
+  fi
+  gl_STRING_MODULE_INDICATOR([memrchr])
   gl_FUNC_MKTIME
   if test $REPLACE_MKTIME = 1; then
     AC_LIBOBJ([mktime])
@@ -294,6 +302,7 @@
   gl_STDIO_H
   gl_STDLIB_H
   gl_FUNC_GNU_STRFTIME
+  gl_HEADER_STRING_H
   gl_FUNC_STRTOIMAX
   if test $HAVE_STRTOIMAX = 0 || test $REPLACE_STRTOIMAX = 1; then
     AC_LIBOBJ([strtoimax])
@@ -757,6 +766,7 @@
   lib/lstat.c
   lib/md5.c
   lib/md5.h
+  lib/memrchr.c
   lib/mktime-internal.h
   lib/mktime.c
   lib/openat-priv.h
@@ -790,6 +800,7 @@
   lib/stdlib.in.h
   lib/strftime.c
   lib/strftime.h
+  lib/string.in.h
   lib/strtoimax.c
   lib/strtol.c
   lib/strtoll.c
@@ -848,6 +859,7 @@
   m4/lstat.m4
   m4/manywarnings.m4
   m4/md5.m4
+  m4/memrchr.m4
   m4/mktime.m4
   m4/multiarch.m4
   m4/nocrash.m4
@@ -877,6 +889,7 @@
   m4/stdio_h.m4
   m4/stdlib_h.m4
   m4/strftime.m4
+  m4/string_h.m4
   m4/strtoimax.m4
   m4/strtoll.m4
   m4/strtoull.m4

=== added file 'm4/memrchr.m4'
--- m4/memrchr.m4	1970-01-01 00:00:00 +0000
+++ m4/memrchr.m4	2013-02-09 03:12:48 +0000
@@ -0,0 +1,23 @@
+# memrchr.m4 serial 10
+dnl Copyright (C) 2002-2003, 2005-2007, 2009-2013 Free Software Foundation,
+dnl Inc.
+dnl This file is free software; the Free Software Foundation
+dnl gives unlimited permission to copy and/or distribute it,
+dnl with or without modifications, as long as this notice is preserved.
+
+AC_DEFUN([gl_FUNC_MEMRCHR],
+[
+  dnl Persuade glibc <string.h> to declare memrchr().
+  AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])
+
+  AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
+  AC_CHECK_DECLS_ONCE([memrchr])
+  if test $ac_cv_have_decl_memrchr = no; then
+    HAVE_DECL_MEMRCHR=0
+  fi
+
+  AC_CHECK_FUNCS([memrchr])
+])
+
+# Prerequisites of lib/memrchr.c.
+AC_DEFUN([gl_PREREQ_MEMRCHR], [:])

=== added file 'm4/string_h.m4'
--- m4/string_h.m4	1970-01-01 00:00:00 +0000
+++ m4/string_h.m4	2013-02-09 03:12:48 +0000
@@ -0,0 +1,120 @@
+# Configure a GNU-like replacement for <string.h>.
+
+# Copyright (C) 2007-2013 Free Software Foundation, Inc.
+# This file is free software; the Free Software Foundation
+# gives unlimited permission to copy and/or distribute it,
+# with or without modifications, as long as this notice is preserved.
+
+# serial 21
+
+# Written by Paul Eggert.
+
+AC_DEFUN([gl_HEADER_STRING_H],
+[
+  dnl Use AC_REQUIRE here, so that the default behavior below is expanded
+  dnl once only, before all statements that occur in other macros.
+  AC_REQUIRE([gl_HEADER_STRING_H_BODY])
+])
+
+AC_DEFUN([gl_HEADER_STRING_H_BODY],
+[
+  AC_REQUIRE([AC_C_RESTRICT])
+  AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
+  gl_NEXT_HEADERS([string.h])
+
+  dnl Check for declarations of anything we want to poison if the
+  dnl corresponding gnulib module is not in use, and which is not
+  dnl guaranteed by C89.
+  gl_WARN_ON_USE_PREPARE([[#include <string.h>
+    ]],
+    [ffsl ffsll memmem mempcpy memrchr rawmemchr stpcpy stpncpy strchrnul
+     strdup strncat strndup strnlen strpbrk strsep strcasestr strtok_r
+     strerror_r strsignal strverscmp])
+])
+
+AC_DEFUN([gl_STRING_MODULE_INDICATOR],
+[
+  dnl Use AC_REQUIRE here, so that the default settings are expanded once only.
+  AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS])
+  gl_MODULE_INDICATOR_SET_VARIABLE([$1])
+  dnl Define it also as a C macro, for the benefit of the unit tests.
+  gl_MODULE_INDICATOR_FOR_TESTS([$1])
+])
+
+AC_DEFUN([gl_HEADER_STRING_H_DEFAULTS],
+[
+  GNULIB_FFSL=0;        AC_SUBST([GNULIB_FFSL])
+  GNULIB_FFSLL=0;       AC_SUBST([GNULIB_FFSLL])
+  GNULIB_MEMCHR=0;      AC_SUBST([GNULIB_MEMCHR])
+  GNULIB_MEMMEM=0;      AC_SUBST([GNULIB_MEMMEM])
+  GNULIB_MEMPCPY=0;     AC_SUBST([GNULIB_MEMPCPY])
+  GNULIB_MEMRCHR=0;     AC_SUBST([GNULIB_MEMRCHR])
+  GNULIB_RAWMEMCHR=0;   AC_SUBST([GNULIB_RAWMEMCHR])
+  GNULIB_STPCPY=0;      AC_SUBST([GNULIB_STPCPY])
+  GNULIB_STPNCPY=0;     AC_SUBST([GNULIB_STPNCPY])
+  GNULIB_STRCHRNUL=0;   AC_SUBST([GNULIB_STRCHRNUL])
+  GNULIB_STRDUP=0;      AC_SUBST([GNULIB_STRDUP])
+  GNULIB_STRNCAT=0;     AC_SUBST([GNULIB_STRNCAT])
+  GNULIB_STRNDUP=0;     AC_SUBST([GNULIB_STRNDUP])
+  GNULIB_STRNLEN=0;     AC_SUBST([GNULIB_STRNLEN])
+  GNULIB_STRPBRK=0;     AC_SUBST([GNULIB_STRPBRK])
+  GNULIB_STRSEP=0;      AC_SUBST([GNULIB_STRSEP])
+  GNULIB_STRSTR=0;      AC_SUBST([GNULIB_STRSTR])
+  GNULIB_STRCASESTR=0;  AC_SUBST([GNULIB_STRCASESTR])
+  GNULIB_STRTOK_R=0;    AC_SUBST([GNULIB_STRTOK_R])
+  GNULIB_MBSLEN=0;      AC_SUBST([GNULIB_MBSLEN])
+  GNULIB_MBSNLEN=0;     AC_SUBST([GNULIB_MBSNLEN])
+  GNULIB_MBSCHR=0;      AC_SUBST([GNULIB_MBSCHR])
+  GNULIB_MBSRCHR=0;     AC_SUBST([GNULIB_MBSRCHR])
+  GNULIB_MBSSTR=0;      AC_SUBST([GNULIB_MBSSTR])
+  GNULIB_MBSCASECMP=0;  AC_SUBST([GNULIB_MBSCASECMP])
+  GNULIB_MBSNCASECMP=0; AC_SUBST([GNULIB_MBSNCASECMP])
+  GNULIB_MBSPCASECMP=0; AC_SUBST([GNULIB_MBSPCASECMP])
+  GNULIB_MBSCASESTR=0;  AC_SUBST([GNULIB_MBSCASESTR])
+  GNULIB_MBSCSPN=0;     AC_SUBST([GNULIB_MBSCSPN])
+  GNULIB_MBSPBRK=0;     AC_SUBST([GNULIB_MBSPBRK])
+  GNULIB_MBSSPN=0;      AC_SUBST([GNULIB_MBSSPN])
+  GNULIB_MBSSEP=0;      AC_SUBST([GNULIB_MBSSEP])
+  GNULIB_MBSTOK_R=0;    AC_SUBST([GNULIB_MBSTOK_R])
+  GNULIB_STRERROR=0;    AC_SUBST([GNULIB_STRERROR])
+  GNULIB_STRERROR_R=0;  AC_SUBST([GNULIB_STRERROR_R])
+  GNULIB_STRSIGNAL=0;   AC_SUBST([GNULIB_STRSIGNAL])
+  GNULIB_STRVERSCMP=0;  AC_SUBST([GNULIB_STRVERSCMP])
+  HAVE_MBSLEN=0;        AC_SUBST([HAVE_MBSLEN])
+  dnl Assume proper GNU behavior unless another module says otherwise.
+  HAVE_FFSL=1;                  AC_SUBST([HAVE_FFSL])
+  HAVE_FFSLL=1;                 AC_SUBST([HAVE_FFSLL])
+  HAVE_MEMCHR=1;                AC_SUBST([HAVE_MEMCHR])
+  HAVE_DECL_MEMMEM=1;           AC_SUBST([HAVE_DECL_MEMMEM])
+  HAVE_MEMPCPY=1;               AC_SUBST([HAVE_MEMPCPY])
+  HAVE_DECL_MEMRCHR=1;          AC_SUBST([HAVE_DECL_MEMRCHR])
+  HAVE_RAWMEMCHR=1;             AC_SUBST([HAVE_RAWMEMCHR])
+  HAVE_STPCPY=1;                AC_SUBST([HAVE_STPCPY])
+  HAVE_STPNCPY=1;               AC_SUBST([HAVE_STPNCPY])
+  HAVE_STRCHRNUL=1;             AC_SUBST([HAVE_STRCHRNUL])
+  HAVE_DECL_STRDUP=1;           AC_SUBST([HAVE_DECL_STRDUP])
+  HAVE_DECL_STRNDUP=1;          AC_SUBST([HAVE_DECL_STRNDUP])
+  HAVE_DECL_STRNLEN=1;          AC_SUBST([HAVE_DECL_STRNLEN])
+  HAVE_STRPBRK=1;               AC_SUBST([HAVE_STRPBRK])
+  HAVE_STRSEP=1;                AC_SUBST([HAVE_STRSEP])
+  HAVE_STRCASESTR=1;            AC_SUBST([HAVE_STRCASESTR])
+  HAVE_DECL_STRTOK_R=1;         AC_SUBST([HAVE_DECL_STRTOK_R])
+  HAVE_DECL_STRERROR_R=1;       AC_SUBST([HAVE_DECL_STRERROR_R])
+  HAVE_DECL_STRSIGNAL=1;        AC_SUBST([HAVE_DECL_STRSIGNAL])
+  HAVE_STRVERSCMP=1;            AC_SUBST([HAVE_STRVERSCMP])
+  REPLACE_MEMCHR=0;             AC_SUBST([REPLACE_MEMCHR])
+  REPLACE_MEMMEM=0;             AC_SUBST([REPLACE_MEMMEM])
+  REPLACE_STPNCPY=0;            AC_SUBST([REPLACE_STPNCPY])
+  REPLACE_STRDUP=0;             AC_SUBST([REPLACE_STRDUP])
+  REPLACE_STRSTR=0;             AC_SUBST([REPLACE_STRSTR])
+  REPLACE_STRCASESTR=0;         AC_SUBST([REPLACE_STRCASESTR])
+  REPLACE_STRCHRNUL=0;          AC_SUBST([REPLACE_STRCHRNUL])
+  REPLACE_STRERROR=0;           AC_SUBST([REPLACE_STRERROR])
+  REPLACE_STRERROR_R=0;         AC_SUBST([REPLACE_STRERROR_R])
+  REPLACE_STRNCAT=0;            AC_SUBST([REPLACE_STRNCAT])
+  REPLACE_STRNDUP=0;            AC_SUBST([REPLACE_STRNDUP])
+  REPLACE_STRNLEN=0;            AC_SUBST([REPLACE_STRNLEN])
+  REPLACE_STRSIGNAL=0;          AC_SUBST([REPLACE_STRSIGNAL])
+  REPLACE_STRTOK_R=0;           AC_SUBST([REPLACE_STRTOK_R])
+  UNDEFINE_STRTOK_R=0;          AC_SUBST([UNDEFINE_STRTOK_R])
+])

=== modified file 'src/ChangeLog'
--- src/ChangeLog	2013-02-10 16:49:09 +0000
+++ src/ChangeLog	2013-02-11 01:28:13 +0000
@@ -1,3 +1,14 @@
+2013-02-11  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Tune by using memchr and memrchr.
+	* doc.c (Fsnarf_documentation):
+	* fileio.c (Fsubstitute_in_file_name):
+	* search.c (find_newline, scan_newline):
+	* xdisp.c (pos_visible_p, display_count_lines):
+	Use memchr and memrchr rather than scanning byte-by-byte.
+	* search.c (find_newline): Rename from scan_buffer.
+	Omit first arg TARGET, as it's always '\n'.  All callers changed.
+
 2013-02-10  Eli Zaretskii  <eliz@gnu.org>
 
 	* xdisp.c (move_it_vertically_backward, move_it_by_lines): When

=== modified file 'src/doc.c'
--- src/doc.c	2013-02-08 17:42:09 +0000
+++ src/doc.c	2013-02-11 01:51:25 +0000
@@ -630,11 +630,10 @@
 	break;
 
       buf[filled] = 0;
-      p = buf;
       end = buf + (filled < 512 ? filled : filled - 128);
-      while (p != end && *p != '\037') p++;
+      p = memchr (buf, '\037', end - buf);
       /* p points to ^_Ffunctionname\n or ^_Vvarname\n or ^_Sfilename\n.  */
-      if (p != end)
+      if (p)
 	{
 	  end = strchr (p, '\n');
 

=== modified file 'src/editfns.c'
--- src/editfns.c	2013-01-23 20:07:28 +0000
+++ src/editfns.c	2013-02-11 01:26:59 +0000
@@ -735,9 +735,8 @@
 	      /* This is the ONLY_IN_LINE case, check that NEW_POS and
 		 FIELD_BOUND are on the same line by seeing whether
 		 there's an intervening newline or not.  */
-	      || (scan_buffer ('\n',
-			       XFASTINT (new_pos), XFASTINT (field_bound),
-			       fwd ? -1 : 1, &shortage, 1),
+	      || (find_newline (XFASTINT (new_pos), XFASTINT (field_bound),
+				fwd ? -1 : 1, &shortage, 1),
 		  shortage != 0)))
 	/* Constrain NEW_POS to FIELD_BOUND.  */
 	new_pos = field_bound;

=== modified file 'src/fileio.c'
--- src/fileio.c	2013-02-11 00:35:37 +0000
+++ src/fileio.c	2013-02-11 01:28:13 +0000
@@ -1710,8 +1710,9 @@
 	else if (*p == '{')
 	  {
 	    o = ++p;
-	    while (p != endp && *p != '}') p++;
-	    if (*p != '}') goto missingclose;
+	    p = memchr (p, '}', endp - p);
+	    if (! p)
+	      goto missingclose;
 	    s = p;
 	  }
 	else
@@ -1779,8 +1780,9 @@
 	else if (*p == '{')
 	  {
 	    o = ++p;
-	    while (p != endp && *p != '}') p++;
-	    if (*p != '}') goto missingclose;
+	    p = memchr (p, '}', endp - p);
+	    if (! p)
+	      goto missingclose;
 	    s = p++;
 	  }
 	else

=== modified file 'src/lisp.h'
--- src/lisp.h	2013-02-09 22:42:33 +0000
+++ src/lisp.h	2013-02-11 01:28:13 +0000
@@ -3346,8 +3346,8 @@
 extern ptrdiff_t fast_string_match_ignore_case (Lisp_Object, Lisp_Object);
 extern ptrdiff_t fast_looking_at (Lisp_Object, ptrdiff_t, ptrdiff_t,
                                   ptrdiff_t, ptrdiff_t, Lisp_Object);
-extern ptrdiff_t scan_buffer (int, ptrdiff_t, ptrdiff_t, ptrdiff_t,
-			      ptrdiff_t *, bool);
+extern ptrdiff_t find_newline (ptrdiff_t, ptrdiff_t, ptrdiff_t,
+			       ptrdiff_t *, bool);
 extern EMACS_INT scan_newline (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t,
 			       EMACS_INT, bool);
 extern ptrdiff_t find_next_newline (ptrdiff_t, int);

=== modified file 'src/region-cache.h'
--- src/region-cache.h	2013-01-01 09:11:05 +0000
+++ src/region-cache.h	2013-02-11 01:26:59 +0000
@@ -40,7 +40,7 @@
    existing data structure, and disturb as little of the existing code
    as possible.
 
-   So here's the tack.  We add some caching to the scan_buffer
+   So here's the tack.  We add some caching to the find_newline
    function, so that when it searches for a newline, it notes that the
    region between the start and end of the search contained no
    newlines; then, the next time around, it consults this cache to see

=== modified file 'src/search.c'
--- src/search.c	2013-02-08 14:44:53 +0000
+++ src/search.c	2013-02-11 02:31:39 +0000
@@ -619,7 +619,7 @@
 }
 
 \f
-/* Search for COUNT instances of the character TARGET between START and END.
+/* Search for COUNT newlines between START and END.
 
    If COUNT is positive, search forwards; END must be >= START.
    If COUNT is negative, search backwards for the -COUNTth instance;
@@ -634,14 +634,14 @@
    this is not the same as the usual convention for Emacs motion commands.
 
    If we don't find COUNT instances before reaching END, set *SHORTAGE
-   to the number of TARGETs left unfound, and return END.
+   to the number of newlines left unfound, and return END.
 
    If ALLOW_QUIT, set immediate_quit.  That's good to do
    except when inside redisplay.  */
 
 ptrdiff_t
-scan_buffer (int target, ptrdiff_t start, ptrdiff_t end,
-	     ptrdiff_t count, ptrdiff_t *shortage, bool allow_quit)
+find_newline (ptrdiff_t start, ptrdiff_t end,
+	      ptrdiff_t count, ptrdiff_t *shortage, bool allow_quit)
 {
   struct region_cache *newline_cache;
   ptrdiff_t end_byte = -1;
@@ -656,7 +656,7 @@
   else
     {
       direction = -1;
-      if (!end) 
+      if (!end)
 	end = BEGV, end_byte = BEGV_BYTE;
     }
   if (end_byte == -1)
@@ -684,7 +684,7 @@
 
         /* If we're looking for a newline, consult the newline cache
            to see where we can avoid some scanning.  */
-        if (target == '\n' && newline_cache)
+        if (newline_cache)
           {
             ptrdiff_t next_change;
             immediate_quit = 0;
@@ -723,32 +723,32 @@
 
           while (cursor < ceiling_addr)
             {
-              unsigned char *scan_start = cursor;
-
               /* The dumb loop.  */
-              while (*cursor != target && ++cursor < ceiling_addr)
-                ;
+	      unsigned char *nl = memchr (cursor, '\n', ceiling_addr - cursor);
 
               /* If we're looking for newlines, cache the fact that
                  the region from start to cursor is free of them. */
-              if (target == '\n' && newline_cache)
-                know_region_cache (current_buffer, newline_cache,
-                                   BYTE_TO_CHAR (start_byte + scan_start - base),
-                                   BYTE_TO_CHAR (start_byte + cursor - base));
-
-              /* Did we find the target character?  */
-              if (cursor < ceiling_addr)
-                {
-                  if (--count == 0)
-                    {
-                      immediate_quit = 0;
-                      return BYTE_TO_CHAR (start_byte + cursor - base + 1);
-                    }
-                  cursor++;
-                }
+              if (newline_cache)
+		{
+		  unsigned char *low = cursor;
+		  unsigned char *lim = nl ? nl : ceiling_addr;
+		  know_region_cache (current_buffer, newline_cache,
+				     BYTE_TO_CHAR (low - base + start_byte),
+				     BYTE_TO_CHAR (lim - base + start_byte));
+		}
+
+              if (! nl)
+		break;
+
+	      if (--count == 0)
+		{
+		  immediate_quit = 0;
+		  return BYTE_TO_CHAR (nl + 1 - base + start_byte);
+		}
+	      cursor = nl + 1;
             }
 
-          start = BYTE_TO_CHAR (start_byte + cursor - base);
+          start = BYTE_TO_CHAR (ceiling_addr - base + start_byte);
         }
       }
   else
@@ -760,7 +760,7 @@
 	ptrdiff_t tem;
 
         /* Consult the newline cache, if appropriate.  */
-        if (target == '\n' && newline_cache)
+        if (newline_cache)
           {
             ptrdiff_t next_change;
             immediate_quit = 0;
@@ -794,31 +794,32 @@
 
           while (cursor >= ceiling_addr)
             {
-              unsigned char *scan_start = cursor;
-
-              while (*cursor != target && --cursor >= ceiling_addr)
-                ;
+	      unsigned char *nl = memrchr (ceiling_addr, '\n',
+					   cursor + 1 - ceiling_addr);
 
               /* If we're looking for newlines, cache the fact that
                  the region from after the cursor to start is free of them.  */
-              if (target == '\n' && newline_cache)
-                know_region_cache (current_buffer, newline_cache,
-                                   BYTE_TO_CHAR (start_byte + cursor - base),
-                                   BYTE_TO_CHAR (start_byte + scan_start - base));
-
-              /* Did we find the target character?  */
-              if (cursor >= ceiling_addr)
-                {
-                  if (++count >= 0)
-                    {
-                      immediate_quit = 0;
-                      return BYTE_TO_CHAR (start_byte + cursor - base);
-                    }
-                  cursor--;
-                }
+              if (newline_cache)
+		{
+		  unsigned char *low = nl ? nl : ceiling_addr - 1;
+		  unsigned char *lim = cursor;
+		  know_region_cache (current_buffer, newline_cache,
+				     BYTE_TO_CHAR (low - base + start_byte),
+				     BYTE_TO_CHAR (lim - base + start_byte));
+		}
+
+              if (! nl)
+		break;
+
+	      if (++count >= 0)
+		{
+		  immediate_quit = 0;
+		  return BYTE_TO_CHAR (nl - base + start_byte);
+		}
+	      cursor = nl - 1;
             }
 
-	  start = BYTE_TO_CHAR (start_byte + cursor - base);
+	  start = BYTE_TO_CHAR (ceiling_addr - 1 - base + start_byte);
         }
       }
 
@@ -828,8 +829,7 @@
   return start;
 }
 \f
-/* Search for COUNT instances of a line boundary, which means either a
-   newline or (if selective display enabled) a carriage return.
+/* Search for COUNT instances of a line boundary.
    Start at START.  If COUNT is negative, search backwards.
 
    We report the resulting position by calling TEMP_SET_PT_BOTH.
@@ -860,9 +860,6 @@
 
   bool old_immediate_quit = immediate_quit;
 
-  /* The code that follows is like scan_buffer
-     but checks for either newline or carriage return.  */
-
   if (allow_quit)
     immediate_quit++;
 
@@ -874,29 +871,25 @@
 	  ceiling = min (limit_byte - 1, ceiling);
 	  ceiling_addr = BYTE_POS_ADDR (ceiling) + 1;
 	  base = (cursor = BYTE_POS_ADDR (start_byte));
-	  while (1)
+
+	  do
 	    {
-	      while (*cursor != '\n' && ++cursor != ceiling_addr)
-		;
-
-	      if (cursor != ceiling_addr)
+	      unsigned char *nl = memchr (cursor, '\n', ceiling_addr - cursor);
+	      if (! nl)
+		break;
+	      if (--count == 0)
 		{
-		  if (--count == 0)
-		    {
-		      immediate_quit = old_immediate_quit;
-		      start_byte = start_byte + cursor - base + 1;
-		      start = BYTE_TO_CHAR (start_byte);
-		      TEMP_SET_PT_BOTH (start, start_byte);
-		      return 0;
-		    }
-		  else
-		    if (++cursor == ceiling_addr)
-		      break;
+		  immediate_quit = old_immediate_quit;
+		  start_byte += nl - base + 1;
+		  start = BYTE_TO_CHAR (start_byte);
+		  TEMP_SET_PT_BOTH (start, start_byte);
+		  return 0;
 		}
-	      else
-		break;
+	      cursor = nl + 1;
 	    }
-	  start_byte += cursor - base;
+	  while (cursor < ceiling_addr);
+
+	  start_byte += ceiling_addr - base;
 	}
     }
   else
@@ -905,31 +898,28 @@
 	{
 	  ceiling = BUFFER_FLOOR_OF (start_byte - 1);
 	  ceiling = max (limit_byte, ceiling);
-	  ceiling_addr = BYTE_POS_ADDR (ceiling) - 1;
+	  ceiling_addr = BYTE_POS_ADDR (ceiling);
 	  base = (cursor = BYTE_POS_ADDR (start_byte - 1) + 1);
 	  while (1)
 	    {
-	      while (--cursor != ceiling_addr && *cursor != '\n')
-		;
+	      unsigned char *nl = memrchr (ceiling_addr, '\n',
+					   cursor - ceiling_addr);
+	      if (! nl)
+		break;
 
-	      if (cursor != ceiling_addr)
+	      if (++count == 0)
 		{
-		  if (++count == 0)
-		    {
-		      immediate_quit = old_immediate_quit;
-		      /* Return the position AFTER the match we found.  */
-		      start_byte = start_byte + cursor - base + 1;
-		      start = BYTE_TO_CHAR (start_byte);
-		      TEMP_SET_PT_BOTH (start, start_byte);
-		      return 0;
-		    }
+		  immediate_quit = old_immediate_quit;
+		  /* Return the position AFTER the match we found.  */
+		  start_byte += nl - base + 1;
+		  start = BYTE_TO_CHAR (start_byte);
+		  TEMP_SET_PT_BOTH (start, start_byte);
+		  return 0;
 		}
-	      else
-		break;
+
+	      cursor = nl;
 	    }
-	  /* Here we add 1 to compensate for the last decrement
-	     of CURSOR, which took it past the valid range.  */
-	  start_byte += cursor - base + 1;
+	  start_byte += ceiling_addr - base;
 	}
     }
 
@@ -942,7 +932,7 @@
 ptrdiff_t
 find_next_newline_no_quit (ptrdiff_t from, ptrdiff_t cnt)
 {
-  return scan_buffer ('\n', from, 0, cnt, (ptrdiff_t *) 0, 0);
+  return find_newline (from, 0, cnt, (ptrdiff_t *) 0, 0);
 }
 
 /* Like find_next_newline, but returns position before the newline,
@@ -953,7 +943,7 @@
 find_before_next_newline (ptrdiff_t from, ptrdiff_t to, ptrdiff_t cnt)
 {
   ptrdiff_t shortage;
-  ptrdiff_t pos = scan_buffer ('\n', from, to, cnt, &shortage, 1);
+  ptrdiff_t pos = find_newline (from, to, cnt, &shortage, 1);
 
   if (shortage == 0)
     pos--;

=== modified file 'src/xdisp.c'
--- src/xdisp.c	2013-02-10 16:49:09 +0000
+++ src/xdisp.c	2013-02-11 01:28:13 +0000
@@ -1392,21 +1392,9 @@
 	      Lisp_Object cpos = make_number (charpos);
 	      Lisp_Object spec = Fget_char_property (cpos, Qdisplay, Qnil);
 	      Lisp_Object string = string_from_display_spec (spec);
-	      int newline_in_string = 0;
-
-	      if (STRINGP (string))
-		{
-		  const char *s = SSDATA (string);
-		  const char *e = s + SBYTES (string);
-		  while (s < e)
-		    {
-		      if (*s++ == '\n')
-			{
-			  newline_in_string = 1;
-			  break;
-			}
-		    }
-		}
+	      bool newline_in_string
+		= (STRINGP (string)
+		   && memchr (SDATA (string), '\n', SBYTES (string)));
 	      /* The tricky code below is needed because there's a
 		 discrepancy between move_it_to and how we set cursor
 		 when the display line ends in a newline from a
@@ -14753,7 +14741,7 @@
 	SET_TEXT_POS (start_pos, ZV, ZV_BYTE);
 
       /* Find the start of the continued line.  This should be fast
-	 because scan_buffer is fast (newline cache).  */
+	 because find_newline is fast (newline cache).  */
       row = w->desired_matrix->rows + (WINDOW_WANTS_HEADER_LINE_P (w) ? 1 : 0);
       init_iterator (&it, w, CHARPOS (start_pos), BYTEPOS (start_pos),
 		     row, DEFAULT_FACE_ID);
@@ -21620,31 +21608,36 @@
 	  ceiling = min (limit_byte - 1, ceiling);
 	  ceiling_addr = BYTE_POS_ADDR (ceiling) + 1;
 	  base = (cursor = BYTE_POS_ADDR (start_byte));
-	  while (1)
+
+	  do
 	    {
 	      if (selective_display)
-		while (*cursor != '\n' && *cursor != 015 && ++cursor != ceiling_addr)
-		  ;
-	      else
-		while (*cursor != '\n' && ++cursor != ceiling_addr)
-		  ;
-
-	      if (cursor != ceiling_addr)
-		{
-		  if (--count == 0)
-		    {
-		      start_byte += cursor - base + 1;
-		      *byte_pos_ptr = start_byte;
-		      return orig_count;
-		    }
-		  else
-		    if (++cursor == ceiling_addr)
-		      break;
-		}
-	      else
-		break;
+		{
+		  while (*cursor != '\n' && *cursor != 015
+			 && ++cursor != ceiling_addr)
+		    continue;
+		  if (cursor == ceiling_addr)
+		    break;
+		}
+	      else
+		{
+		  cursor = memchr (cursor, '\n', ceiling_addr - cursor);
+		  if (! cursor)
+		    break;
+		}
+
+	      cursor++;
+
+	      if (--count == 0)
+		{
+		  start_byte += cursor - base;
+		  *byte_pos_ptr = start_byte;
+		  return orig_count;
+		}
 	    }
-	  start_byte += cursor - base;
+	  while (cursor < ceiling_addr);
+
+	  start_byte += ceiling_addr - base;
 	}
     }
   else
@@ -21653,35 +21646,35 @@
 	{
 	  ceiling = BUFFER_FLOOR_OF (start_byte - 1);
 	  ceiling = max (limit_byte, ceiling);
-	  ceiling_addr = BYTE_POS_ADDR (ceiling) - 1;
+	  ceiling_addr = BYTE_POS_ADDR (ceiling);
 	  base = (cursor = BYTE_POS_ADDR (start_byte - 1) + 1);
 	  while (1)
 	    {
 	      if (selective_display)
-		while (--cursor != ceiling_addr
-		       && *cursor != '\n' && *cursor != 015)
-		  ;
+		{
+		  while (--cursor >= ceiling_addr
+			 && *cursor != '\n' && *cursor != 015)
+		    continue;
+		  if (cursor < ceiling_addr)
+		    break;
+		}
 	      else
-		while (--cursor != ceiling_addr && *cursor != '\n')
-		  ;
+		{
+		  cursor = memrchr (ceiling_addr, '\n', cursor - ceiling_addr);
+		  if (! cursor)
+		    break;
+		}
 
-	      if (cursor != ceiling_addr)
+	      if (++count == 0)
 		{
-		  if (++count == 0)
-		    {
-		      start_byte += cursor - base + 1;
-		      *byte_pos_ptr = start_byte;
-		      /* When scanning backwards, we should
-			 not count the newline posterior to which we stop.  */
-		      return - orig_count - 1;
-		    }
+		  start_byte += cursor - base + 1;
+		  *byte_pos_ptr = start_byte;
+		  /* When scanning backwards, we should
+		     not count the newline posterior to which we stop.  */
+		  return - orig_count - 1;
 		}
-	      else
-		break;
 	    }
-	  /* Here we add 1 to compensate for the last decrement
-	     of CURSOR, which took it past the valid range.  */
-	  start_byte += cursor - base + 1;
+	  start_byte += ceiling_addr - base;
 	}
     }
 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-10 16:57                                 ` Eli Zaretskii
@ 2013-02-11  5:43                                   ` Dmitry Antipov
  2013-02-11  7:54                                     ` Dmitry Antipov
  2013-02-11 16:42                                     ` Eli Zaretskii
  2013-02-11 17:17                                   ` Eli Zaretskii
  1 sibling, 2 replies; 27+ messages in thread
From: Dmitry Antipov @ 2013-02-11  5:43 UTC (permalink / raw)
  To: emacs-devel; +Cc: Eli Zaretskii, Paul Eggert

Yet another interesting profile (generated by scroll-both micro-benchmark with
r111730) is shown below.

Input is 4K lines, each line is ~27K bytes, Imla'ei (modern Arabic) script. IIUC
this R2L text with long lines should push bidi really hard, but ... bidi core
routines (by itself) are almost irrelevant in the profile:

     39.96%        emacs  emacs                          [.] scan_buffer
     28.72%        emacs  emacs                          [.] buf_charpos_to_bytepos
     21.82%        emacs  emacs                          [.] buf_bytepos_to_charpos
      0.59%        emacs  emacs                          [.] re_match_2_internal
      0.51%        emacs  emacs                          [.] sub_char_table_ref
      0.42%        emacs  emacs                          [.] mark_object
      0.23%        emacs  emacs                          [.] composition_gstring_width
      0.19%        emacs  libc-2.16.so                   [.] __memcpy_ssse3_back
      0.18%        emacs  emacs                          [.] x_produce_glyphs
      0.17%        emacs  emacs                          [.] move_it_in_display_line_to
      0.17%        emacs  emacs                          [.] hash_lookup
      0.17%        emacs  emacs                          [.] Fgarbage_collect
      0.17%        emacs  emacs                          [.] lface_hash
      0.16%        emacs  emacs                          [.] decode_coding_utf_8
      0.16%        emacs  emacs                          [.] face_for_font
      0.16%        emacs  emacs                          [.] composition_gstring_p
      0.15%        emacs  emacs                          [.] compile_pattern
      0.15%        emacs  emacs                          [.] get_next_display_element
      0.14%        emacs  emacs                          [.] bidi_level_of_next_char
      0.12%        emacs  emacs                          [.] font_range
      0.12%        emacs  emacs                          [.] bidi_fetch_char
      0.12%        emacs  emacs                          [.] internal_equal
      0.11%        emacs  emacs                          [.] autocmp_chars
      0.11%        emacs  emacs                          [.] char_table_ref
      0.11%        emacs  libgtk-3.so.0.600.4            [.] 0x0000000000115bf0
      0.10%        emacs  emacs                          [.] next_element_from_buffer
      0.10%        emacs  emacs                          [.] composition_update_it
      0.10%        emacs  emacs                          [.] boyer_moore

Dmitry



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-11  5:43                                   ` Dmitry Antipov
@ 2013-02-11  7:54                                     ` Dmitry Antipov
  2013-02-11 16:47                                       ` Eli Zaretskii
  2013-02-11 16:42                                     ` Eli Zaretskii
  1 sibling, 1 reply; 27+ messages in thread
From: Dmitry Antipov @ 2013-02-11  7:54 UTC (permalink / raw)
  To: emacs-devel; +Cc: Eli Zaretskii, Paul Eggert

On 02/11/2013 09:43 AM, Dmitry Antipov wrote:

> Yet another interesting profile (generated by scroll-both micro-benchmark with
> r111730) is shown below.
>
> Input is 4K lines, each line is ~27K bytes, Imla'ei (modern Arabic) script. IIUC
> this R2L text with long lines should push bidi really hard, but ... bidi core
> routines (by itself) are almost irrelevant in the profile:
>
>      39.96%        emacs  emacs                          [.] scan_buffer
>      28.72%        emacs  emacs                          [.] buf_charpos_to_bytepos
>      21.82%        emacs  emacs                          [.] buf_bytepos_to_charpos
>       0.59%        emacs  emacs                          [.] re_match_2_internal

... and with Paul's mem(r)chr patch it is:

     43.38%        emacs  emacs                          [.] buf_charpos_to_bytepos
     28.42%        emacs  emacs                          [.] buf_bytepos_to_charpos
     13.10%        emacs  libc-2.16.so                   [.] memrchr
      0.85%        emacs  emacs                          [.] re_match_2_internal
...

So I should vote YES. This is simple optimization which really makes sense,
and I suspect that the "less usual" input is, the more sense it has.

Dmitry




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-11  5:43                                   ` Dmitry Antipov
  2013-02-11  7:54                                     ` Dmitry Antipov
@ 2013-02-11 16:42                                     ` Eli Zaretskii
  2013-02-11 17:53                                       ` Dmitry Antipov
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-11 16:42 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: eggert, emacs-devel

> Date: Mon, 11 Feb 2013 09:43:17 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Eli Zaretskii <eliz@gnu.org>, Paul Eggert <eggert@cs.ucla.edu>
> 
> Yet another interesting profile (generated by scroll-both micro-benchmark with
> r111730) is shown below.
> 
> Input is 4K lines, each line is ~27K bytes, Imla'ei (modern Arabic) script.

Can you publish the file, or the URL where you downloaded it from?

> IIUC this R2L text with long lines should push bidi really hard,
> but... bidi core routines (by itself) are almost irrelevant in the
> profile:

Actually, that's expected, see below.

>      39.96%        emacs  emacs                          [.] scan_buffer
>      28.72%        emacs  emacs                          [.] buf_charpos_to_bytepos
>      21.82%        emacs  emacs                          [.] buf_bytepos_to_charpos
>       0.59%        emacs  emacs                          [.] re_match_2_internal
>       0.51%        emacs  emacs                          [.] sub_char_table_ref
>       0.42%        emacs  emacs                          [.] mark_object
>       0.23%        emacs  emacs                          [.] composition_gstring_width
>       0.19%        emacs  libc-2.16.so                   [.] __memcpy_ssse3_back
>       0.18%        emacs  emacs                          [.] x_produce_glyphs
>       0.17%        emacs  emacs                          [.] move_it_in_display_line_to
>       0.17%        emacs  emacs                          [.] hash_lookup
>       0.17%        emacs  emacs                          [.] Fgarbage_collect
>       0.17%        emacs  emacs                          [.] lface_hash
>       0.16%        emacs  emacs                          [.] decode_coding_utf_8
>       0.16%        emacs  emacs                          [.] face_for_font
>       0.16%        emacs  emacs                          [.] composition_gstring_p
>       0.15%        emacs  emacs                          [.] compile_pattern
>       0.15%        emacs  emacs                          [.] get_next_display_element
>       0.14%        emacs  emacs                          [.] bidi_level_of_next_char
>       0.12%        emacs  emacs                          [.] font_range
>       0.12%        emacs  emacs                          [.] bidi_fetch_char
>       0.12%        emacs  emacs                          [.] internal_equal
>       0.11%        emacs  emacs                          [.] autocmp_chars
>       0.11%        emacs  emacs                          [.] char_table_ref
>       0.11%        emacs  libgtk-3.so.0.600.4            [.] 0x0000000000115bf0
>       0.10%        emacs  emacs                          [.] next_element_from_buffer
>       0.10%        emacs  emacs                          [.] composition_update_it
>       0.10%        emacs  emacs                          [.] boyer_moore

The Arabic script is a heavy user of character compositions: they are
important for correct shaping of the glyphs, without which any speaker
of Arabic will turn away in disgust.  The fact that you see functions
like composition_update_it, composition_gstring_p,
composition_gstring_width, and sub_char_table_ref all hint towards
this.  Character compositions work by scanning the vicinity of a
composable character using regular expression matching in Lisp.  That
is why you see re_match_2_internal relatively high in the profile.
Handling these compositions can obscure any bidi reordering.  To
disable this factor, turn off auto-composition-mode.

More importantly, you cannot easily "push bidi really hard", not with
a file that consists of predominantly RTL characters.  That's because
such a file is as easy to display as a pure LTR text: the characters
are delivered for display entirely in their logical order in the
buffer, and only laid out starting at the right margin of the window
instead of at the left margin.

To exercise bidi.c, you need heavily mixed RTL and LTR text, with
digits, punctuation, and lots of embeddings and directional overrides
(using the LRE, RLE, RLO, and LRO control characters), which push and
pop the reordering stack.  Only then the reordering of characters will
become non-trivial, and you _might_ see some bidi functions as hot
spots.  I say "might" because bidi.c uses a dynamic cache which allows
it to fetch and analyze each character only once, even if reordering
jumps here and there like a young goat.  Thus, the only overhead of
reordering is the logic that decides where in the cache is the next
character to deliver for display; the cache is accessed directly (it
is implemented as a linear array).

There could be rare pathological situations where bidi.c needs to
examine lots (and I'm talking tens or hundreds of thousands) of
characters for some simple redisplay operation.  A few of these were
discovered and taken care of during late stages of v24.1 development,
but maybe there are some more.  These typically show up as heavy usage
of bidi_fetch_char or its subroutines, or of bidi_find_paragraph_start
and its subroutines.  I haven't seen such problems since last July.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-11  7:54                                     ` Dmitry Antipov
@ 2013-02-11 16:47                                       ` Eli Zaretskii
  2013-02-11 23:55                                         ` Paul Eggert
  0 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-11 16:47 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: eggert, emacs-devel

> Date: Mon, 11 Feb 2013 11:54:57 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Eli Zaretskii <eliz@gnu.org>, Paul Eggert <eggert@cs.ucla.edu>
> 
> On 02/11/2013 09:43 AM, Dmitry Antipov wrote:
> 
> > Yet another interesting profile (generated by scroll-both micro-benchmark with
> > r111730) is shown below.
> >
> > Input is 4K lines, each line is ~27K bytes, Imla'ei (modern Arabic) script. IIUC
> > this R2L text with long lines should push bidi really hard, but ... bidi core
> > routines (by itself) are almost irrelevant in the profile:
> >
> >      39.96%        emacs  emacs                          [.] scan_buffer
> >      28.72%        emacs  emacs                          [.] buf_charpos_to_bytepos
> >      21.82%        emacs  emacs                          [.] buf_bytepos_to_charpos
> >       0.59%        emacs  emacs                          [.] re_match_2_internal
> 
> ... and with Paul's mem(r)chr patch it is:
> 
>      43.38%        emacs  emacs                          [.] buf_charpos_to_bytepos
>      28.42%        emacs  emacs                          [.] buf_bytepos_to_charpos
>      13.10%        emacs  libc-2.16.so                   [.] memrchr
>       0.85%        emacs  emacs                          [.] re_match_2_internal

Without absolute times, it's hard to judge the improvement.

> So I should vote YES. This is simple optimization which really makes sense,
> and I suspect that the "less usual" input is, the more sense it has.

I'm not opposed to using memchr where possible.  I'm just saying that
we should NOT regard this as any kind of solution for the long-lines
problem with the current display engine.  To fix that problem, we need
to speed up redisplay by one or two orders of magnitude (it currently
takes several hundreds of milliseconds to several seconds; it should
take a few milliseconds, 10 msec max).  That is a far cry from 25%
improvement we will get with memchr.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-10 16:57                                 ` Eli Zaretskii
  2013-02-11  5:43                                   ` Dmitry Antipov
@ 2013-02-11 17:17                                   ` Eli Zaretskii
  2013-02-11 17:55                                     ` Drew Adams
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-11 17:17 UTC (permalink / raw)
  To: emacs-devel, Stefan Monnier; +Cc: eggert, dmantipov

> Date: Sun, 10 Feb 2013 18:57:00 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> I just committed to the trunk revision 111724 with a couple of simple
> changes which speed up by a factor of 3 some redisplay operations,
> such as M-v or M->, in a buffer with very long lines.  Please try it.

Further measurements indicate that the bottleneck is in searches for
previous or next newline, or N-th previous/next newline.  These
searches are at the core of functions that compute pixel dimensions of
buffer text, when the display engine needs to figure out where to
start displaying the window after scrolling, or where to put point
after C-p or C-n.

As a typical example, a C-n in a buffer with truncate-lines set
non-nil requires us to find the next physical line in the buffer,
i.e. the next newline.  We currently do that by searching forward in
the buffer, one byte at a time, until we find a newline.  If lines are
very long, this is expensive.

When truncate-lines is nil, this problem doesn't exist for C-n, but a
similar problem exists for C-p: we need to find the _previous_
newline (which is many characters back when lines are long), and then
scan forward until we find a character that is displayed one screen
line above the one we were at when the user typed C-p.  Revision
111724 makes sure we don't go back more than one physical line, unless
really needed, but given the current design of the code, one full line
is the absolute minimum.

Turning on the newline cache speeds up these searches for a newline by
a factor of 2, which is not too spectacular, but not negligible.  Any
objections to turning on that caching by default in all buffers?

Beyond that, either we can find a much more efficient way of finding
the next or previous newline, or we will need a complete redesign and
re-implementation of the move_it_* family of functions, which is used
a lot by the display engine.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-11 16:42                                     ` Eli Zaretskii
@ 2013-02-11 17:53                                       ` Dmitry Antipov
  2013-02-11 18:10                                         ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Dmitry Antipov @ 2013-02-11 17:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

On 02/11/2013 08:42 PM, Eli Zaretskii wrote:

> Can you publish the file, or the URL where you downloaded it from?

Actually it was artificially generated from Quran text available
at http://tanzil.net/download.  I can't publish it because the license
doesn't allow any modifications, so I assume that any derivatives are
also illegal; but I also assume that we still can use them just for
the testing purposes, e.g. without any redistribution.

Dmitry

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: Long lines and bidi
  2013-02-11 17:17                                   ` Eli Zaretskii
@ 2013-02-11 17:55                                     ` Drew Adams
  2013-02-11 18:13                                       ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Drew Adams @ 2013-02-11 17:55 UTC (permalink / raw)
  To: 'Eli Zaretskii', emacs-devel, 'Stefan Monnier'
  Cc: eggert, dmantipov

> Turning on the newline cache speeds up these searches for a newline by
> a factor of 2, which is not too spectacular, but not negligible.  Any
> objections to turning on that caching by default in all buffers?

I only followed some of all that you wrote, and I haven't followed the thread.
But a question:

You do not mention any added cost, AFAICT (but again, I did not follow in
detail).

Is the caching relevant (helpful) regardless of the value of truncate-lines or
whether visual-line-mode etc. is on?  IOW, does it make sense for many common
configurations or just for some particular configs?

If it is not particularly advantageous for some common configs, does it have a
cost that would suggest it should not be done in those configs, or is it pretty
much without a downside?

What about "for all buffers"?  Does it make sense also for buffers such as Dired
and Info, which have relatively short line lengths?

If there is no extra cost or other drawback then such considerations probably do
not matter, of course.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-11 17:53                                       ` Dmitry Antipov
@ 2013-02-11 18:10                                         ` Eli Zaretskii
  2013-02-11 18:21                                           ` Dmitry Antipov
  0 siblings, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-11 18:10 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: eggert, emacs-devel

> Date: Mon, 11 Feb 2013 21:53:32 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: emacs-devel@gnu.org, eggert@cs.ucla.edu
> 
> On 02/11/2013 08:42 PM, Eli Zaretskii wrote:
> 
> > Can you publish the file, or the URL where you downloaded it from?
> 
> Actually it was artificially generated from Quran text available
> at http://tanzil.net/download.

Can you tell how you generated it?



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-11 17:55                                     ` Drew Adams
@ 2013-02-11 18:13                                       ` Eli Zaretskii
  0 siblings, 0 replies; 27+ messages in thread
From: Eli Zaretskii @ 2013-02-11 18:13 UTC (permalink / raw)
  To: Drew Adams; +Cc: eggert, dmantipov, monnier, emacs-devel

> From: "Drew Adams" <drew.adams@oracle.com>
> Cc: <eggert@cs.ucla.edu>, <dmantipov@yandex.ru>
> Date: Mon, 11 Feb 2013 09:55:36 -0800
> 
> > Turning on the newline cache speeds up these searches for a newline by
> > a factor of 2, which is not too spectacular, but not negligible.  Any
> > objections to turning on that caching by default in all buffers?
> 
> I only followed some of all that you wrote, and I haven't followed the thread.
> But a question:
> 
> You do not mention any added cost, AFAICT (but again, I did not follow in
> detail).

The overhead is only visible with very short lines, and is negligible
even then.

> Is the caching relevant (helpful) regardless of the value of truncate-lines or
> whether visual-line-mode etc. is on?  IOW, does it make sense for many common
> configurations or just for some particular configs?

It always makes sense.  Searching for newlines is a very frequent
operation in Emacs, not only in the display engine.

> What about "for all buffers"?  Does it make sense also for buffers such as Dired
> and Info, which have relatively short line lengths?

It doesn't hurt there, AFAICS.  And we can always turn it off in the
mode function, if we find later that some modes don't like it.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-11 18:10                                         ` Eli Zaretskii
@ 2013-02-11 18:21                                           ` Dmitry Antipov
  0 siblings, 0 replies; 27+ messages in thread
From: Dmitry Antipov @ 2013-02-11 18:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

On 02/11/2013 10:10 PM, Eli Zaretskii wrote:

> Can you tell how you generated it?

# Get first 100 lines and convert them to the only line
head -n 100 < quran-simple.txt | tr '\n' ' ' | tr '\r' ' ' > 0.txt
# Add newline
echo -ne "\n" >> 0.txt
# Copy it 4096 times
cat 0.txt 0.txt 0.txt 0.txt  > 1.txt
cat 1.txt 1.txt 1.txt 1.txt > 0.txt
cat 0.txt 0.txt 0.txt 0.txt  > 1.txt
cat 1.txt 1.txt 1.txt 1.txt > 0.txt
cat 0.txt 0.txt 0.txt 0.txt  > 1.txt
cat 1.txt 1.txt 1.txt 1.txt > 0.txt

I realize that this is pretty artificial and doesn't reflect
the real structure of any Arabic text. This is definitely a trick
in attempt to exploit some corner cases here and there.

Dmitry




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Long lines and bidi
  2013-02-11 16:47                                       ` Eli Zaretskii
@ 2013-02-11 23:55                                         ` Paul Eggert
  0 siblings, 0 replies; 27+ messages in thread
From: Paul Eggert @ 2013-02-11 23:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Antipov, emacs-devel

On 02/11/13 08:47, Eli Zaretskii wrote:
> we should NOT regard this as any kind of solution for the long-lines
> problem with the current display engine.

Yes, the memchr/memrchr improvement is a relatively minor
performance improvement; I suggested it primarily
because it's easy to do and doesn't complicate Emacs proper.
I pushed it into the trunk as bzr 111741.

By the way, in reviewing this area it appears to me that
there must be a bug in the code that caches newline locations
when searching backwards.  The above performance improvement
doesn't affect this bug.  I'll try to follow up on this soon.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2013-02-11 23:55 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <877gmp5a04.fsf@ed.ac.uk>
     [not found] ` <83vca89izh.fsf@gnu.org>
     [not found]   ` <5110906D.7020406@yandex.ru>
     [not found]     ` <83fw1aac3d.fsf@gnu.org>
     [not found]       ` <51120360.4060104@yandex.ru>
     [not found]         ` <jwvehgtfrd6.fsf-monnier+emacs@gnu.org>
     [not found]           ` <51127363.5030203@yandex.ru>
     [not found]             ` <834nhp9u9j.fsf@gnu.org>
2013-02-08 13:33               ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov
2013-02-08 14:07                 ` Eli Zaretskii
2013-02-08 14:46                   ` Long lines and bidi Eli Zaretskii
2013-02-08 16:38                     ` Dmitry Antipov
2013-02-08 16:52                       ` Eli Zaretskii
2013-02-09  3:34                         ` Paul Eggert
2013-02-09  8:46                           ` Eli Zaretskii
2013-02-09  9:05                             ` Paul Eggert
2013-02-09  9:33                               ` Eli Zaretskii
2013-02-11  2:33                                 ` Paul Eggert
2013-02-09 10:01                               ` Eli Zaretskii
2013-02-10 16:57                                 ` Eli Zaretskii
2013-02-11  5:43                                   ` Dmitry Antipov
2013-02-11  7:54                                     ` Dmitry Antipov
2013-02-11 16:47                                       ` Eli Zaretskii
2013-02-11 23:55                                         ` Paul Eggert
2013-02-11 16:42                                     ` Eli Zaretskii
2013-02-11 17:53                                       ` Dmitry Antipov
2013-02-11 18:10                                         ` Eli Zaretskii
2013-02-11 18:21                                           ` Dmitry Antipov
2013-02-11 17:17                                   ` Eli Zaretskii
2013-02-11 17:55                                     ` Drew Adams
2013-02-11 18:13                                       ` Eli Zaretskii
2013-02-08 16:21                   ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov
2013-02-08 17:04                     ` Eli Zaretskii
2013-02-08 15:33                 ` Stefan Monnier
2013-02-08 16:05                   ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).