* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode
@ 2013-02-03 22:05 Lawrence Mitchell
  2013-02-04 15:49 ` Eli Zaretskii
  0 siblings, 1 reply; 42+ messages in thread
From: Lawrence Mitchell @ 2013-02-03 22:05 UTC
To: 13623

When using word or sexp marking commands, the active region does not
always get highlighted.

emacs -Q
M-<
M-@ # note how ";; This" is selected, and highlighted in region-face
M-@ # ";; This buffer" is selected, however, " buffer" is not highlighted

Pressing C-l at this point, correctly shows the highlighted region.

This is with:

commit e6762a6d2dc65d3201c03d5995686112483dc4ff
Merge: 4ba48d0 6840160
Author: dancol@dancol.org <>
Date:   Sun Feb 3 09:02:56 2013 -0800

    Daniel Colascione 2013-02-03
    * emacs.c: Use execvp, not execv, when DAEMON_MUST_EXEC

I'd previously updated and built on the 19th of January:

commit c5a8149837c5ed53655d4383dea3b8f29374b266
Author: Glenn Morris <rgm@gnu.org>
Date:   Sat Jan 19 18:40:49 2013 -0800

    * lisp-mode.el (emacs-lisp-mode-map): Add native profiler menu entries.

Which didn't demonstrate this behaviour, if that happens to help with
tracking the issue down.

Cheers,
Lawrence

In GNU Emacs 24.3.50.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.4.2)
 of 2013-02-01 on e4300lm
Windowing system distributor `The X.Org Foundation', version 11.0.11103000
System Description: Ubuntu 12.04.1 LTS

Configured using:
 `configure --prefix=/home/lmitche4/Apps/emacs -C CFLAGS=-O0 -ggdb
 --no-create --no-recursion'

Important settings:
  value of $LANG: en_GB.UTF-8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode
From: Eli Zaretskii @ 2013-02-04 15:49 UTC
To: Lawrence Mitchell, Dmitry Antipov; +Cc: 13623

> From: Lawrence Mitchell <wence@gmx.li>
> Date: Sun, 03 Feb 2013 22:05:15 +0000
>
> When using word or sexp marking commands, the active region does not
> always get highlighted.
>
> emacs -Q
> M-<
> M-@ # note how ";; This" is selected, and highlighted in region-face
> M-@ # ";; This buffer" is selected, however, " buffer" is not highlighted
>
> Pressing C-l at this point, correctly shows the highlighted region.

Thanks, should be fixed now (revision 111673 on the trunk).

Dmitry, this bug and also 13626 were both caused by your changes in
revision 111647.  While the reason for the changes was to use non-Lisp
objects for some fields, several hunks in the changeset had no
relation whatsoever to that, and were highly questionable.  Example:

-  /* If showing the region, and mark has changed, we must redisplay
-     the whole window.  The assignment to this_line_start_pos prevents
-     the optimization directly below this if-statement.  */
-  if (((!NILP (Vtransient_mark_mode)
-        && !NILP (BVAR (XBUFFER (w->buffer), mark_active)))
-       != !NILP (w->region_showing))
-      || (!NILP (w->region_showing)
-          && !EQ (w->region_showing,
-                  Fmarker_position (BVAR (XBUFFER (w->buffer), mark)))))
-    CHARPOS (this_line_start_pos) = 0;

I don't understand why such non-trivial code is being dropped on the
floor without discussion.  And there were others like this in this
revision.

Please don't assume that any dropped code that is really needed will
cause bugs that will be immediately reported.  I've seen display bugs
that went unnoticed for months and even years.  In this case, we were
just lucky.
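The condition in the dropped hunk can be modelled in isolation. What follows is a minimal sketch and not the Emacs source: `struct win_model` and `region_changed_p` are hypothetical names, and plain integers stand in for Lisp objects and markers. It assumes `region_showing` holds the mark position that the last redisplay highlighted, or 0 when no region was shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the one field of
   struct window that matters for this check.  */
struct win_model
{
  /* Mark position of the region highlighted by the last redisplay,
     or 0 if no region was highlighted.  */
  ptrdiff_t region_showing;
};

/* Return true if the region highlighting of W is stale, meaning the
   "only the cursor line changed" shortcut must not be taken.
   REGION_ACTIVE models "transient-mark-mode is on and the mark is
   active"; MARK_POS is the current mark position (buffer positions
   start at 1).  */
static bool
region_changed_p (const struct win_model *w, bool region_active,
                  ptrdiff_t mark_pos)
{
  assert (mark_pos >= 1);

  /* The region was switched on or off since the last redisplay.  */
  if (region_active != (w->region_showing > 0))
    return true;

  /* The region is still shown, but the mark has moved.  */
  if (w->region_showing != 0 && w->region_showing != mark_pos)
    return true;

  return false;
}
```

In terms of the recipe in the report, the second `M-@` leaves the region active but moves the mark (from position 8 to 15 in this model), so the second branch fires. With the check removed, nothing forces the wider redisplay and the newly added " buffer" stays unhighlighted until `C-l`.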
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-04 15:49 ` Eli Zaretskii @ 2013-02-04 17:20 ` Lawrence Mitchell 2013-02-04 18:10 ` Eli Zaretskii 2013-02-05 4:54 ` Dmitry Antipov 1 sibling, 1 reply; 42+ messages in thread From: Lawrence Mitchell @ 2013-02-04 17:20 UTC (permalink / raw) To: 13623 Eli Zaretskii wrote: >> From: Lawrence Mitchell <wence@gmx.li> >> Date: Sun, 03 Feb 2013 22:05:15 +0000 >> Pressing C-l at this point, correctly shows the highlighted region. > Thanks, should be fixed now (revision 111673 on the trunk). [...] Yes, works for me. Cheers, Lawrence -- Lawrence Mitchell <wence@gmx.li> ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-04 17:20 ` Lawrence Mitchell @ 2013-02-04 18:10 ` Eli Zaretskii 0 siblings, 0 replies; 42+ messages in thread From: Eli Zaretskii @ 2013-02-04 18:10 UTC (permalink / raw) To: Lawrence Mitchell; +Cc: 13623-done > From: Lawrence Mitchell <wence@gmx.li> > Date: Mon, 04 Feb 2013 17:20:21 +0000 > > Eli Zaretskii wrote: > >> From: Lawrence Mitchell <wence@gmx.li> > >> Date: Sun, 03 Feb 2013 22:05:15 +0000 > >> Pressing C-l at this point, correctly shows the highlighted region. > > > Thanks, should be fixed now (revision 111673 on the trunk). > > [...] > > Yes, works for me. Thanks, closing. ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-04 15:49 ` Eli Zaretskii 2013-02-04 17:20 ` Lawrence Mitchell @ 2013-02-05 4:54 ` Dmitry Antipov 2013-02-05 12:07 ` Dmitry Antipov 2013-02-05 17:45 ` Eli Zaretskii 1 sibling, 2 replies; 42+ messages in thread From: Dmitry Antipov @ 2013-02-05 4:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Lawrence Mitchell, 13623 On 02/04/2013 07:49 PM, Eli Zaretskii wrote: > Dmitry, this bug and also 13626 were both caused by your changes in > revision 111647. While the reason for the changes was to use non-Lisp > objects for some fields, several hunks in the changeset had no > relation whatsoever to that, and were highly questionable. Example: > > - /* If showing the region, and mark has changed, we must redisplay > - the whole window. The assignment to this_line_start_pos prevents > - the optimization directly below this if-statement. */ > - if (((!NILP (Vtransient_mark_mode) > - && !NILP (BVAR (XBUFFER (w->buffer), mark_active))) > - != !NILP (w->region_showing)) > - || (!NILP (w->region_showing) > - && !EQ (w->region_showing, > - Fmarker_position (BVAR (XBUFFER (w->buffer), mark))))) > - CHARPOS (this_line_start_pos) = 0; Hm. Although this is an obvious bug, are you sure that we must redisplay the whole window even if the region doesn't span multiple lines? IIUC it should be enough to redisplay the current line only. Dmitry ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-05 4:54 ` Dmitry Antipov @ 2013-02-05 12:07 ` Dmitry Antipov 2013-02-05 17:46 ` Eli Zaretskii 2013-02-05 17:45 ` Eli Zaretskii 1 sibling, 1 reply; 42+ messages in thread From: Dmitry Antipov @ 2013-02-05 12:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Lawrence Mitchell, 13623 [-- Attachment #1: Type: text/plain, Size: 368 bytes --] On 02/05/2013 08:54 AM, Dmitry Antipov wrote: > Hm. Although this is an obvious bug, are you sure that we must redisplay > the whole window even if the region doesn't span multiple lines? IIUC > it should be enough to redisplay the current line only. E.g. something like attached. This is the revert of 111673 plus special treatment of single-line region. Dmitry [-- Attachment #2: single_line_region.patch --] [-- Type: text/plain, Size: 6816 bytes --] === modified file 'src/window.h' --- src/window.h 2013-02-04 15:39:55 +0000 +++ src/window.h 2013-02-05 09:58:36 +0000 @@ -333,15 +333,13 @@ the frame image that window_end_pos did not get onto the frame. */ unsigned window_end_valid : 1; + /* Nonzero if we have highlighted the region (or any part of it). */ + unsigned region_showing : 1; + /* Amount by which lines of this window are scrolled in y-direction (smooth scrolling). */ int vscroll; - /* If we have highlighted the region (or any part of it), the mark - position or -1 (the latter is used by the iterator for internal - purposes); otherwise zero. */ - ptrdiff_t region_showing; - /* Z_BYTE - buffer position of the last glyph in the current matrix of W. Should be nonnegative, and only valid if window_end_valid is nonzero. */ ptrdiff_t window_end_bytepos; === modified file 'src/xdisp.c' --- src/xdisp.c 2013-02-04 15:39:55 +0000 +++ src/xdisp.c 2013-02-05 11:59:01 +0000 @@ -2536,8 +2536,8 @@ #endif /* GLYPH_DEBUG and ENABLE_CHECKING */ -/* Return mark position if current buffer has the region of non-zero length, - or -1 otherwise. 
*/ +/* Return mark position if current buffer has the region of non-zero + length, zero if mark and point are the same, or -1 otherwise. */ static ptrdiff_t markpos_of_region (void) @@ -2547,9 +2547,7 @@ && XMARKER (BVAR (current_buffer, mark))->buffer != NULL) { ptrdiff_t markpos = XMARKER (BVAR (current_buffer, mark))->charpos; - - if (markpos != PT) - return markpos; + return markpos == PT ? 0 : markpos; } return -1; } @@ -2689,7 +2687,7 @@ and IT->region_end_charpos to the start and end of a visible region in window IT->w. Set both to -1 to indicate no region. */ markpos = markpos_of_region (); - if (0 <= markpos + if (0 < markpos /* Maybe highlight only in selected window. */ && (/* Either show region everywhere. */ highlight_nonselected_windows @@ -10753,7 +10751,7 @@ return (((BUF_SAVE_MODIFF (b) < BUF_MODIFF (b)) != w->last_had_star) || ((!NILP (Vtransient_mark_mode) && !NILP (BVAR (b, mark_active))) - != (w->region_showing != 0))); + != w->region_showing)); } /* Nonzero if W has %c in its mode line and mode line should be updated. */ @@ -12793,7 +12791,7 @@ int must_finish = 0; struct text_pos tlbufpos, tlendpos; int number_of_visible_frames; - ptrdiff_t count, count1; + ptrdiff_t pos, count, count1; struct frame *sf; int polling_stopped_here = 0; Lisp_Object tail, frame; @@ -12806,6 +12804,9 @@ /* Non-zero means redisplay has to redisplay the miniwindow. */ int update_miniwindow_p = 0; + /* Non-zero means the mark is on the same line as point. */ + bool mark_at_this_line = 0; + TRACE ((stderr, "redisplay_internal %d\n", redisplaying_p)); /* No redisplay if running in batch mode or frame is not yet fully @@ -13016,23 +13017,17 @@ clear_garbaged_frames (); } - /* If showing the region, and mark has changed, we must redisplay - the whole window. The assignment to this_line_start_pos prevents - the optimization directly below this if-statement. 
*/ - if (((!NILP (Vtransient_mark_mode) - && !NILP (BVAR (XBUFFER (w->buffer), mark_active))) - != (w->region_showing > 0)) - || (w->region_showing - && w->region_showing - != XINT (Fmarker_position (BVAR (XBUFFER (w->buffer), mark))))) - CHARPOS (this_line_start_pos) = 0; - /* Optimize the case that only the line containing the cursor in the selected window has changed. Variables starting with this_ are set in display_line and record information about the line containing the cursor. */ tlbufpos = this_line_start_pos; tlendpos = this_line_end_pos; + pos = markpos_of_region (); + if (pos != -1) + mark_at_this_line = (CHARPOS (tlbufpos) <= pos + && pos <= Z - CHARPOS (tlendpos)); + if (!consider_all_windows_p && CHARPOS (tlbufpos) > 0 && !w->update_mode_line @@ -13048,6 +13043,8 @@ /* Point must be on the line that we have info recorded about. */ && PT >= CHARPOS (tlbufpos) && PT <= Z - CHARPOS (tlendpos) + /* No region or region which doesn't span multiple lines. */ + && (pos == -1 || mark_at_this_line) /* All text outside that line, including its final newline, must be unchanged. */ && text_outside_line_unchanged_p (w, CHARPOS (tlbufpos), @@ -13059,7 +13056,7 @@ || FETCH_BYTE (BYTEPOS (tlbufpos)) == '\n')) /* Former continuation line has disappeared by becoming empty. */ goto cancel; - else if (window_outdated (w) || MINI_WINDOW_P (w)) + else if (window_outdated (w) || MINI_WINDOW_P (w) || mark_at_this_line) { /* We have to handle the case of continuation around a wide-column character (see the comment in indent.c around @@ -13239,8 +13236,6 @@ ++clear_image_cache_count; #endif - w->region_showing = XINT (Fmarker_position (BVAR (XBUFFER (w->buffer), mark))); - /* Build desired matrices, and update the display. If consider_all_windows_p is non-zero, do it for all windows on all frames. Otherwise do it for selected_window, only. */ @@ -14837,7 +14832,7 @@ /* Can't use this case if highlighting a region. 
When a region exists, cursor movement has to do more than just set the cursor. */ - && markpos_of_region () < 0 + && markpos_of_region () <= 0 && !w->region_showing && NILP (Vshow_trailing_whitespace) /* This code is not used for mini-buffer for the sake of the case @@ -15505,7 +15500,7 @@ /* If we are highlighting the region, then we just changed the region, so redisplay to show it. */ - if (0 <= markpos_of_region ()) + if (0 < markpos_of_region ()) { clear_glyph_matrix (w->desired_matrix); if (!try_window (window, startp, 0)) @@ -16206,7 +16201,7 @@ return 0; /* Can't do this if region may have changed. */ - if (0 <= markpos_of_region () + if (0 < markpos_of_region () || w->region_showing || !NILP (Vshow_trailing_whitespace)) return 0; @@ -17038,7 +17033,7 @@ /* Can't use this if highlighting a region because a cursor movement will do more than just set the cursor. */ - if (0 <= markpos_of_region ()) + if (0 < markpos_of_region ()) GIVE_UP (9); /* Likewise if highlighting trailing whitespace. */ @@ -19133,7 +19128,7 @@ } /* Is IT->w showing the region? */ - it->w->region_showing = it->region_beg_charpos > 0 ? -1 : 0; + it->w->region_showing = it->region_beg_charpos > 0; /* Clear the result glyph row and enable it. */ prepare_desired_row (row); ^ permalink raw reply [flat|nested] 42+ messages in thread
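The `mark_at_this_line` test in the patch compares against `Z - CHARPOS (tlendpos)`, which suggests that the recorded end of the cursor line is kept as a distance from the end of the buffer rather than as an absolute position. A minimal sketch of that range test follows, with plain integers in place of Emacs text positions; `struct line_model` and `pos_on_line_p` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of the line containing the cursor.  */
struct line_model
{
  ptrdiff_t start;       /* Position of the first character on the line.  */
  ptrdiff_t end_from_z;  /* Z minus the position of the end of the line.  */
};

/* Return true if POS lies on the line described by L in a buffer
   whose end position is Z.  This mirrors the shape of
   CHARPOS (tlbufpos) <= pos && pos <= Z - CHARPOS (tlendpos).  */
static bool
pos_on_line_p (const struct line_model *l, ptrdiff_t z, ptrdiff_t pos)
{
  assert (l->start >= 1 && l->end_from_z >= 0);
  return l->start <= pos && pos <= z - l->end_from_z;
}
```

For a buffer with `z = 120` and a line recorded as `{1, 100}`, the line covers positions 1 through 20, so a mark at 8 is on the line and a mark at 21 is not.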
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-05 12:07 ` Dmitry Antipov @ 2013-02-05 17:46 ` Eli Zaretskii 0 siblings, 0 replies; 42+ messages in thread From: Eli Zaretskii @ 2013-02-05 17:46 UTC (permalink / raw) To: Dmitry Antipov; +Cc: wence, 13623 > Date: Tue, 05 Feb 2013 16:07:47 +0400 > From: Dmitry Antipov <dmantipov@yandex.ru> > CC: Lawrence Mitchell <wence@gmx.li>, 13623@debbugs.gnu.org > > > Hm. Although this is an obvious bug, are you sure that we must redisplay > > the whole window even if the region doesn't span multiple lines? IIUC > > it should be enough to redisplay the current line only. > > E.g. something like attached. This is the revert of 111673 plus special > treatment of single-line region. If you customize the font of 'region' face, you will see that additional lines are affected anyway. ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-05 4:54 ` Dmitry Antipov 2013-02-05 12:07 ` Dmitry Antipov @ 2013-02-05 17:45 ` Eli Zaretskii 2013-02-06 7:16 ` Dmitry Antipov 1 sibling, 1 reply; 42+ messages in thread From: Eli Zaretskii @ 2013-02-05 17:45 UTC (permalink / raw) To: Dmitry Antipov; +Cc: wence, 13623 > Date: Tue, 05 Feb 2013 08:54:05 +0400 > From: Dmitry Antipov <dmantipov@yandex.ru> > CC: Lawrence Mitchell <wence@gmx.li>, 13623@debbugs.gnu.org > > > - /* If showing the region, and mark has changed, we must redisplay > > - the whole window. The assignment to this_line_start_pos prevents > > - the optimization directly below this if-statement. */ > > - if (((!NILP (Vtransient_mark_mode) > > - && !NILP (BVAR (XBUFFER (w->buffer), mark_active))) > > - != !NILP (w->region_showing)) > > - || (!NILP (w->region_showing) > > - && !EQ (w->region_showing, > > - Fmarker_position (BVAR (XBUFFER (w->buffer), mark))))) > > - CHARPOS (this_line_start_pos) = 0; > > Hm. Although this is an obvious bug, are you sure that we must redisplay > the whole window even if the region doesn't span multiple lines? IIUC > it should be enough to redisplay the current line only. As long as we don't restrict the 'region' face to a very small subset of possible face customizations (e.g., just the background and the foreground colors), and ignore the other attributes, an arbitrary face change on one line might potentially affect many more lines in the window. E.g., try customizing the 'region' face to twice its normal height. And I'm not sure this is worth optimizing anyway: region changes are relatively rare and almost always driven by user input, so I don't think redisplay will become significantly faster as result of any optimizations in this area. ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-05 17:45 ` Eli Zaretskii @ 2013-02-06 7:16 ` Dmitry Antipov 2013-02-06 14:31 ` Stefan Monnier 0 siblings, 1 reply; 42+ messages in thread From: Dmitry Antipov @ 2013-02-06 7:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: wence, 13623 On 02/05/2013 09:45 PM, Eli Zaretskii wrote: > As long as we don't restrict the 'region' face to a very small subset > of possible face customizations (e.g., just the background and the > foreground colors), and ignore the other attributes, an arbitrary face > change on one line might potentially affect many more lines in the > window. E.g., try customizing the 'region' face to twice its normal > height. This looks terribly ugly, but works (at least, I don't see any glitches while performing basic operations with the region). > And I'm not sure this is worth optimizing anyway: region changes are > relatively rare and almost always driven by user input, so I don't > think redisplay will become significantly faster as result of any > optimizations in this area. I have a strong suspicion that this also applies to ~50% of xdisp.c unless you're on 2400bps tty or remote X connection via 33600bps modem. Dmitry ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-06 7:16 ` Dmitry Antipov @ 2013-02-06 14:31 ` Stefan Monnier 2013-02-06 15:14 ` Dmitry Antipov 0 siblings, 1 reply; 42+ messages in thread From: Stefan Monnier @ 2013-02-06 14:31 UTC (permalink / raw) To: Dmitry Antipov; +Cc: wence, 13623 >> And I'm not sure this is worth optimizing anyway: region changes are >> relatively rare and almost always driven by user input, so I don't >> think redisplay will become significantly faster as result of any >> optimizations in this area. > I have a strong suspicion that this also applies to ~50% of xdisp.c > unless you're on 2400bps tty or remote X connection via 33600bps modem. For my use case, the optimisations that matter are the ones that avoid looking at the unmodified (and mostly all iconified) frames when I work within a frame or switch between two frames (or maybe 3 at most: the origin, the destination and the minibuffer-only frame). Stefan ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode
From: Dmitry Antipov @ 2013-02-06 15:14 UTC
To: Stefan Monnier; +Cc: wence, 13623

On 02/06/2013 06:31 PM, Stefan Monnier wrote:

> For my use case, the optimisations that matter are the ones that avoid
> looking at the unmodified (and mostly all iconified) frames when I work
> within a frame or switch between two frames (or maybe 3 at most: the
> origin, the destination and the minibuffer-only frame).

This is a question of splitting global state between frames, because
current tricks like ++windows_or_buffers_changed effectively prevent
single-frame redisplay optimizations.

I have a few experimental patches with per-frame fonts_changed_p and
cursor_type_changed flags.  Since font/cursor changes are rare, the
effect is negligible, but this opens the way towards more interesting
things.

On the other hand, I suspect that most users either 1) use a
single-frame configuration or 2) use reasonably modern hardware where
a complete redisplay (all frames) is faster than the interval between
two keystrokes and so doesn't affect the editing experience.

Dmitry
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-06 15:14 ` Dmitry Antipov @ 2013-02-06 18:04 ` Eli Zaretskii 2013-02-06 18:23 ` Eli Zaretskii 1 sibling, 0 replies; 42+ messages in thread From: Eli Zaretskii @ 2013-02-06 18:04 UTC (permalink / raw) To: Dmitry Antipov; +Cc: wence, 13623 > Date: Wed, 06 Feb 2013 19:14:43 +0400 > From: Dmitry Antipov <dmantipov@yandex.ru> > CC: Eli Zaretskii <eliz@gnu.org>, wence@gmx.li, 13623@debbugs.gnu.org > > On the other side, I suspect that the most of users > are either 1) uses single-frame configuration or 2) uses reasonably modern > hardware where the complete redisplay (all frames) is faster than the period > between two keystrokes and so doesn't affect an editing experience. I doubt many users use only one frame. I certainly don't, although my usage patterns are pretty conservative, nowhere near Stefan's. As for full redisplay: please remember that there are 2 aspects to that: (1) on the xdisp.c level, which is device independent, and (2) on the device-dependent xterm.c/w32term.c/nsterm.m etc. level. Even if on the xdisp.c level we do a complete redisplay of a window, update_frame and its subroutines compare the desired and the current display and only redraw the lines that are different. Therefore, you could do a complete redisplay on xdisp.c level, and then redraw very little or even nothing at all, even if your video hardware is 10 years old. I'm saying that because due to this 2-level optimization, the hardware speed is rarely seen in Emacs. ^ permalink raw reply [flat|nested] 42+ messages in thread
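The second level Eli describes, comparing the desired display with the current one and redrawing only the lines that differ, can be illustrated with a toy model. This is a sketch under stated assumptions, not how Emacs implements it: real glyph matrices hold glyph structures rather than strings, and `struct row_model` and `update_rows` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ROW_WIDTH 80

/* Hypothetical screen row: what is, or should be, on one line.  */
struct row_model
{
  char text[ROW_WIDTH + 1];
};

/* Make CURRENT match DESIRED, touching only rows whose contents
   differ.  Return the number of rows that had to be redrawn.  */
static size_t
update_rows (struct row_model *current, const struct row_model *desired,
             size_t nrows)
{
  size_t redrawn = 0;

  assert (current != NULL && desired != NULL);

  for (size_t i = 0; i < nrows; i++)
    if (strcmp (current[i].text, desired[i].text) != 0)
      {
        /* A real backend would issue drawing calls here.  */
        memcpy (current[i].text, desired[i].text, sizeof current[i].text);
        redrawn++;
      }

  return redrawn;
}
```

Even if the device-independent level rebuilds every desired row of a window, a pass like this draws nothing when the rows turn out to be identical, which is the point made above about hardware speed rarely showing through.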
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-06 15:14 ` Dmitry Antipov 2013-02-06 18:04 ` Eli Zaretskii @ 2013-02-06 18:23 ` Eli Zaretskii 2013-02-06 20:30 ` Stefan Monnier 2013-02-08 13:33 ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov 1 sibling, 2 replies; 42+ messages in thread From: Eli Zaretskii @ 2013-02-06 18:23 UTC (permalink / raw) To: Dmitry Antipov; +Cc: wence, 13623 > Date: Wed, 06 Feb 2013 19:14:43 +0400 > From: Dmitry Antipov <dmantipov@yandex.ru> > CC: Eli Zaretskii <eliz@gnu.org>, wence@gmx.li, 13623@debbugs.gnu.org > > I have a few experimental patches with per-frame fonts_changed_p and > cursor_type_changed flags. Since font/cursor changes are rare, the > effect is negligible, but this opens the way towards more > interesting things. What exactly is the strategy of enabling per-frame redisplay optimizations? As you point out, font and cursor type changes are not the important cost drivers. The question is: what is? IOW, how do you know which changes affect what frames? The current display engine does not try to optimize per-frame redisplay (except on a TTY, but let's forget about that for a moment). It tries to optimize per-window redisplay. I don't see what's wrong with this approach; in fact, I think for GUI frames it's better than per-frame optimizations. Therefore, I think we should improve optimizations by more aggressively detecting windows that need not be updated, instead of turning our attention to frames. In any case, the first step towards any intelligent improvements in redisplay optimizations would be to compile a list of use cases where we currently give up and redisplay entire windows, and where this thorough redisplay really matters, i.e. we can actually see and measure the effect of this. Another area of redisplay optimizations would be the infamous very-long-lines use case. 
(Personally, I think this one is the single most important deficiency in the current display engine, by far more important than any other display problem.)
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-06 18:23 ` Eli Zaretskii @ 2013-02-06 20:30 ` Stefan Monnier 2013-02-07 3:41 ` Eli Zaretskii 2013-02-08 13:33 ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov 1 sibling, 1 reply; 42+ messages in thread From: Stefan Monnier @ 2013-02-06 20:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: wence, Dmitry Antipov, 13623 > per-frame optimizations. Therefore, I think we should improve > optimizations by more aggressively detecting windows that need not be > updated, instead of turning our attention to frames. Agreed. > In any case, the first step towards any intelligent improvements in > redisplay optimizations would be to compile a list of use cases where > we currently give up and redisplay entire windows, and where this One case where I notice performance sucks is when the minibuffer is in use (in my minibuffer-only frame), and I switch to another frame and try to work there. Stefan ^ permalink raw reply [flat|nested] 42+ messages in thread
* bug#13623: 24.3.50; Redisplay issue with transient-mark-mode 2013-02-06 20:30 ` Stefan Monnier @ 2013-02-07 3:41 ` Eli Zaretskii 0 siblings, 0 replies; 42+ messages in thread From: Eli Zaretskii @ 2013-02-07 3:41 UTC (permalink / raw) To: Stefan Monnier; +Cc: wence, dmantipov, 13623 > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Dmitry Antipov <dmantipov@yandex.ru>, wence@gmx.li, 13623@debbugs.gnu.org > One case where I notice performance sucks is when the minibuffer is in > use (in my minibuffer-only frame), and I switch to another frame and try > to work there. I suggest to create a new bug report with a recipe to reproduce this. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Long lines and bidi [Was: Re: bug#13623: ...]
From: Dmitry Antipov @ 2013-02-08 13:33 UTC
To: Eli Zaretskii; +Cc: Emacs development discussions

On 02/06/2013 10:23 PM, Eli Zaretskii wrote:

> Another area of redisplay optimizations would be the infamous
> very-long-lines use case.  (Personally, I think this one is the single
> most important deficiency in the current display engine, by far more
> important than any other display problem.)

I tried to scroll (down from the beginning and then up from the end) the
very pathological file (~150M with just ~500 lines) and got the following
profile:

  8.59%  emacs  emacs            [.] bidi_resolve_weak
  7.92%  emacs  emacs            [.] bidi_level_of_next_char
  7.81%  emacs  emacs            [.] get_next_display_element
  7.12%  emacs  emacs            [.] move_it_in_display_line_to
  6.96%  emacs  emacs            [.] x_produce_glyphs
  5.06%  emacs  libc-2.16.so     [.] __memcpy_ssse3_back
  4.56%  emacs  emacs            [.] next_element_from_buffer
  4.38%  emacs  emacs            [.] bidi_move_to_visually_next
  4.26%  emacs  emacs            [.] scan_buffer
  3.04%  emacs  libXft.so.2.3.1  [.] XftCharIndex
  2.93%  emacs  emacs            [.] bidi_fetch_char
  2.67%  emacs  emacs            [.] bidi_cache_iterator_state
  2.61%  emacs  emacs            [.] lookup_glyphless_char_display
  2.47%  emacs  libXft.so.2.3.1  [.] XftGlyphExtents
  2.35%  emacs  emacs            [.] bidi_resolve_neutral
  1.95%  emacs  emacs            [.] bidi_get_type
  1.86%  emacs  emacs            [.] detect_coding
  1.70%  emacs  emacs            [.] produce_chars
  1.50%  emacs  emacs            [.] bidi_resolve_explicit_1
  1.18%  emacs  emacs            [.] get_per_char_metric
  1.13%  emacs  emacs            [.] bidi_cache_search.constprop.4
  1.01%  emacs  emacs            [.] xftfont_text_extents
  0.90%  emacs  emacs            [.] bidi_explicit_dir_char
  0.88%  emacs  emacs            [.] bidi_resolve_explicit
  ...

So the first question is: is it feasible/possible/desirable to detect that
the buffer has no R2L text at all and automatically force
bidi-paragraph-direction to left-to-right and bidi-display-reordering to nil?

Dmitry
* Re: Long lines and bidi [Was: Re: bug#13623: ...]
From: Eli Zaretskii @ 2013-02-08 14:07 UTC
To: Dmitry Antipov; +Cc: emacs-devel

> Date: Fri, 08 Feb 2013 17:33:47 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Emacs development discussions <emacs-devel@gnu.org>
>
> On 02/06/2013 10:23 PM, Eli Zaretskii wrote:
>
> > Another area of redisplay optimizations would be the infamous
> > very-long-lines use case.  (Personally, I think this one is the single
> > most important deficiency in the current display engine, by far more
> > important than any other display problem.)
>
> I tried to scroll (down from the beginning and then up from the end) the
> very pathological file (~150M with just ~500 lines) and got the following
> profile:

A profile alone is not enough.  Please tell how exactly you "scrolled"
(which commands you used), and please also show the absolute times it
took to perform each command.

>   8.59%  emacs  emacs            [.] bidi_resolve_weak

What was in the file?  bidi_resolve_weak high on the profile hints
that it was full of punctuation or digits or blanks, which is not
really an interesting case.

>   7.92%  emacs  emacs            [.] bidi_level_of_next_char
>   7.81%  emacs  emacs            [.] get_next_display_element
>   7.12%  emacs  emacs            [.] move_it_in_display_line_to
>   6.96%  emacs  emacs            [.] x_produce_glyphs
>   5.06%  emacs  libc-2.16.so     [.] __memcpy_ssse3_back
>   4.56%  emacs  emacs            [.] next_element_from_buffer
>   4.38%  emacs  emacs            [.] bidi_move_to_visually_next
>   4.26%  emacs  emacs            [.] scan_buffer
>   3.04%  emacs  libXft.so.2.3.1  [.] XftCharIndex
>   2.93%  emacs  emacs            [.] bidi_fetch_char
>   2.67%  emacs  emacs            [.] bidi_cache_iterator_state
>   2.61%  emacs  emacs            [.] lookup_glyphless_char_display
>   2.47%  emacs  libXft.so.2.3.1  [.] XftGlyphExtents
>   2.35%  emacs  emacs            [.] bidi_resolve_neutral
>   1.95%  emacs  emacs            [.] bidi_get_type
>   1.86%  emacs  emacs            [.] detect_coding
>   1.70%  emacs  emacs            [.] produce_chars
>   1.50%  emacs  emacs            [.] bidi_resolve_explicit_1
>   1.18%  emacs  emacs            [.] get_per_char_metric
>   1.13%  emacs  emacs            [.] bidi_cache_search.constprop.4
>   1.01%  emacs  emacs            [.] xftfont_text_extents
>   0.90%  emacs  emacs            [.] bidi_explicit_dir_char
>   0.88%  emacs  emacs            [.] bidi_resolve_explicit
>   ...
>
> So the first question is: is it feasible/possible/desirable to detect that
> the buffer has no R2L text at all and automatically force
> bidi-paragraph-direction to left-to-right and bidi-display-reordering
> to nil?

Ah, _that_ red herring...

Why is that the first question?  What were the times with and without
bidi-display-reordering in this file?  In my testing, the display
engine performs awfully slowly in both cases, so even though turning
off reordering makes it faster, it is still so terribly slow that the
problem is not going to be solved by that.

As to your question: how can we know what characters are or aren't in
the buffer without scanning it?  And scanning the buffer is exactly
what bidi.c does.

As to bidi-paragraph-direction, the detection of the paragraph
direction is turned off for long paragraphs anyway.  Again, does
setting bidi-paragraph-direction to left-to-right give you reasonable
performance in that file?  If not, this is just another red herring.

Anyway, I think this is the wrong way to try to find the solution.
The problem is not that scanning is slower with the bidi display.  (If
it were, we would see terribly slow performance with "normal" files as
well.)  The problem is that _we_scan_too_many_characters_.  See this
part of the profile:

>   7.12%  emacs  emacs            [.] move_it_in_display_line_to

The display routines of the move_it_* family, which are heavily used
in scrolling, cursor movement, and just about any display operation,
_always_ scan each line from the beginning to the end, before they get
to the next line.  When each line is very long, those scans are very
expensive.

The way to make display significantly faster for long lines is to
avoid scanning entire lines.  The problem is how to do that without
losing accuracy, e.g., without missing characters that affect the line
metrics.  IOW, our problem is to find clever algorithms and provide
supporting data structures for those algorithms, so that we could
avoid scanning very long lines in their entirety each time we need to
move the cursor.  When we find these algorithms and code them, the
bidi "problem" will disappear without a trace.
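A toy model makes the cost argument concrete. It is only a sketch: the real move_it_* functions compute glyph metrics, handle bidi reordering and much else, while the hypothetical `next_line_start` below just counts characters. It assumes the behaviour described above, namely that reaching the next line means walking the current line from its beginning to its end.

```c
#include <assert.h>
#include <stddef.h>

/* Move from POS in BUF (LEN bytes) to the start of the next line by
   first backing up to the beginning of the current line and then
   walking the whole line forward.  Return the new position and store
   in *SCANNED how many characters the forward walk examined.  */
static size_t
next_line_start (const char *buf, size_t len, size_t pos, size_t *scanned)
{
  size_t bol = pos;
  size_t p;

  assert (pos <= len);

  /* Find the beginning of the line containing POS.  */
  while (bol > 0 && buf[bol - 1] != '\n')
    bol--;

  /* Walk the line from its beginning up to its newline (or the end
     of the buffer).  */
  for (p = bol; p < len && buf[p] != '\n'; p++)
    ;

  *scanned = p - bol;
  return p < len ? p + 1 : len;
}
```

The number of characters examined equals the length of the line no matter where on the line the move starts. With lines of a few hundred kilobytes each, as in the file profiled above (~150M in ~500 lines), every such move pays for the whole line, with or without bidi reordering.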
* Re: Long lines and bidi
  2013-02-08 14:07 ` Eli Zaretskii
@ 2013-02-08 14:46 ` Eli Zaretskii
  2013-02-08 16:38 ` Dmitry Antipov
  2013-02-08 16:21 ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov
  1 sibling, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2013-02-08 14:46 UTC (permalink / raw)
  To: dmantipov; +Cc: emacs-devel

> Date: Fri, 08 Feb 2013 16:07:23 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> Profile alone is not enough. Please tell how did you "scroll",
> exactly (which commands did you use), and please also show the
> absolute times it took to perform each command.

Btw, if you are serious about finding a solution to the long-line
display misfeature (or any other too-slow redisplay situation), I
generally find it necessary to do precision timing of the suspicious
parts of code, because otherwise it is impossible to find the actual
culprits. On GNU/Linux, I use the following simple function:

  double
  timer_time (void)
  {
    struct timeval tv;

    gettimeofday (&tv, NULL);
    return tv.tv_usec * 0.000001 + tv.tv_sec;
  }

Now, to time a particular portion of the code, do something like this:

  double t1, t2;
  ...
  t1 = timer_time ();
  /* here comes the code that should be timed */
  t2 = timer_time ();
  if (t2 - t1 > THRESHOLD)
    fprintf (stderr, "that code took %.4g sec\n", t2 - t1);

The value of THRESHOLD depends on the magnitude of the slow-down you
are working on. I generally start with 0.1 of the time it takes to
perform some redisplay operation; e.g., if it takes 5 sec to move the
cursor, start with 0.5 sec. gettimeofday has a sufficient resolution
on GNU/Linux to get you sub-millisecond accuracy, which is more than
enough for display engine measurements.
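For anyone who wants to try the recipe above outside Emacs, here is a self-contained sketch of it. It is only an illustration under stated assumptions: busy_work is a hypothetical stand-in for the code being measured, time_busy_work is a hypothetical wrapper, and the 1 ms THRESHOLD is an arbitrary value picked for the example, not one taken from this thread.

```c
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical threshold: only report spans longer than 1 ms.  */
#define THRESHOLD 0.001

/* Wall-clock time in seconds, same arithmetic as in the message above.  */
static double
timer_time (void)
{
  struct timeval tv;

  gettimeofday (&tv, NULL);
  return tv.tv_usec * 0.000001 + tv.tv_sec;
}

/* Hypothetical stand-in for the code under test: sums 0 .. N-1.
   The volatile accumulator keeps the loop from being optimized away.  */
static unsigned long
busy_work (unsigned long n)
{
  volatile unsigned long sum = 0;
  unsigned long i;

  for (i = 0; i < n; i++)
    sum += i;
  return sum;
}

/* Time one call to busy_work, report it on stderr if it took longer
   than THRESHOLD, and return the elapsed time in seconds.  */
static double
time_busy_work (unsigned long n)
{
  double t1, t2;

  t1 = timer_time ();
  busy_work (n);
  t2 = timer_time ();
  if (t2 - t1 > THRESHOLD)
    fprintf (stderr, "busy_work (%lu) took %.4g sec\n", n, t2 - t1);
  return t2 - t1;
}
```

Calling time_busy_work with increasing values of N from a small main shows where the threshold starts to trigger. Note that gettimeofday reports wall-clock time, so a clock adjustment during the measurement would distort the result.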
Using the above, you can quickly identify the function(s) that take
most of the time of a particular redisplay operation, then time the
parts of those functions to find the most expensive parts, and so on,
recursively, until you find the hot spots (more than 50% of the slow
operation).
* Re: Long lines and bidi
  2013-02-08 14:46 ` Long lines and bidi Eli Zaretskii
@ 2013-02-08 16:38 ` Dmitry Antipov
  2013-02-08 16:52 ` Eli Zaretskii
  0 siblings, 1 reply; 42+ messages in thread
From: Dmitry Antipov @ 2013-02-08 16:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 02/08/2013 06:46 PM, Eli Zaretskii wrote:

> Btw, if you are serious about finding a solution to the long-line
> display misfeature (or any other too-slow redisplay situation), I
> generally find it necessary to do precision timing of the suspicious
> parts of code, because otherwise it is impossible to find the actual
> culprits. On GNU/Linux, I use the following simple function:

Ah, please, there is a difference between 2013 and 1980.

1) perf record -e stalled-cycles-frontend -e stalled-cycles-backend -F 10000 [workload]
2) perf report --stdio ==>

    25.18%  emacs  emacs  [.] scan_buffer
     7.04%  emacs  emacs  [.] bidi_resolve_weak
    ...

3) perf annotate scan_buffer --stdio ==>

         :    while (cursor >= ceiling_addr)
         :      {
         :        unsigned char *scan_start = cursor;
         :
         :        while (*cursor != target && --cursor >= ceiling_addr)
   65.74 :  526620:  movzbl (%r14),%eax
    6.46 :  526624:  cmp    %r15d,%eax
    0.17 :  526627:  je     526632 <scan_buffer+0x512>
   27.33 :  526629:  sub    $0x1,%r14
    0.03 :  52662d:  cmp    %r14,%rbx
    0.19 :  526630:  jbe    526620 <scan_buffer+0x500>
         :          ;

So, ~90% of time spent in scan_buffer is:

   799    while (*cursor != target && --cursor >= ceiling_addr)
   800      ;

Dmitry
* Re: Long lines and bidi
  2013-02-08 16:38 ` Dmitry Antipov
@ 2013-02-08 16:52 ` Eli Zaretskii
  2013-02-09 3:34 ` Paul Eggert
  0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2013-02-08 16:52 UTC (permalink / raw)
  To: Dmitry Antipov; +Cc: emacs-devel

> Date: Fri, 08 Feb 2013 20:38:24 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: emacs-devel@gnu.org
>
> On 02/08/2013 06:46 PM, Eli Zaretskii wrote:
>
> > Btw, if you are serious about finding a solution to the long-line
> > display misfeature (or any other too-slow redisplay situation), I
> > generally find it necessary to do precision timing of the suspicious
> > parts of code, because otherwise it is impossible to find the actual
> > culprits. On GNU/Linux, I use the following simple function:
>
> Ah, please, there is a difference between 2013 and 1980.

Sorry, you lost me here.

> 1) perf record -e stalled-cycles-frontend -e stalled-cycles-backend -F 10000 [workload]
> 2) perf report --stdio ==>
>
>     25.18%  emacs  emacs  [.] scan_buffer
>      7.04%  emacs  emacs  [.] bidi_resolve_weak

That's why testing redisplay on buffers which are predominantly
punctuation will give you unrealistic measurements. (If you want to
understand why, read UAX#9.)

> So, ~90% of time spent in scan_buffer is:
>
>    799    while (*cursor != target && --cursor >= ceiling_addr)
>    800      ;

Which cannot be optimized.
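To make the loop quoted above easier to reason about in isolation, here is a minimal sketch of a byte-at-a-time backward search next to an equivalent that delegates to memrchr. This is an illustration only, not the code in src/search.c: the names scan_back_loop and scan_back_memrchr are hypothetical, the buffer is a plain byte array with no gap or ceiling handling, and memrchr is assumed to be available as in glibc (it is a GNU extension, hence _GNU_SOURCE).

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE    /* memrchr is a GNU extension, not ISO C */
#endif
#include <stddef.h>
#include <string.h>

/* Harmless duplicate of the glibc prototype, in case <string.h> was
   already included before _GNU_SOURCE took effect.  */
void *memrchr (const void *, int, size_t);

/* Byte-at-a-time backward search, in the spirit of the quoted loop:
   return the last occurrence of TARGET in [BASE, BASE + LEN), or NULL
   if there is none.  */
static const unsigned char *
scan_back_loop (const unsigned char *base, size_t len, unsigned char target)
{
  const unsigned char *cursor = base + len;

  while (cursor > base)
    if (*--cursor == target)
      return cursor;
  return NULL;
}

/* The same search delegated to the C library.  */
static const unsigned char *
scan_back_memrchr (const unsigned char *base, size_t len, unsigned char target)
{
  return memrchr (base, target, len);
}
```

Both functions return the same pointer for the same input. Whether the library call is actually faster depends on the libc implementation and on the data, so it has to be measured rather than assumed.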
* Re: Long lines and bidi
  2013-02-08 16:52 ` Eli Zaretskii
@ 2013-02-09 3:34 ` Paul Eggert
  2013-02-09 8:46 ` Eli Zaretskii
  0 siblings, 1 reply; 42+ messages in thread
From: Paul Eggert @ 2013-02-09 3:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Antipov, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1070 bytes --]

On 02/08/2013 08:52 AM, Eli Zaretskii wrote:
>> > So, ~90% of time spent in scan_buffer is:
>> >
>> >    799    while (*cursor != target && --cursor >= ceiling_addr)
>> >    800      ;
> Which cannot be optimized.

It can be sped up somewhat, by using memrchr. This won't solve these
performance issues, but it helps: on my platform (x86-64 Ubuntu 12.10)
I ran Dmitry's scroll-both benchmark
<http://lists.gnu.org/archive/html/emacs-devel/2013-02/msg00147.html>
on a real file (the trunk's src/xdisp.c), and it was 25% faster
overall (1.19 seconds versus 1.49 seconds) when I used memrchr there
and memchr for forward searches.

I'll attach the patch I used. Eli, it'll need a bit of hacking to
port to MS-Windows, since the substitute memrchr implementation (which
is supplied) will need to be compiled. Dmitry, is this something you
can easily try with your benchmarks?

Most of the attached patch is boilerplate taken unmodified from
gnulib, to support memrchr on non-GNU platforms. The key part of the
change is at the end, to src/search.c.

[-- Attachment #2: memchr.txt --]
[-- Type: text/plain, Size: 75410 bytes --]

=== modified file '.bzrignore'
--- .bzrignore 2013-02-01 06:30:51 +0000
+++ .bzrignore 2013-02-09 03:12:48 +0000
@@ -97,6 +97,7 @@
 lib/stdio.h
 lib/stdint.h
 lib/stdlib.h
+lib/string.h
 lib/sys/
 lib/SYS
 lib/time.h
=== modified file 'ChangeLog'
--- ChangeLog 2013-02-08 23:37:17 +0000
+++ ChangeLog 2013-02-09 03:12:48 +0000
@@ -1,3 +1,11 @@
+2013-02-09 Paul Eggert <eggert@cs.ucla.edu>
+
+ Tune redisplay by using memchr and memrchr.
+ * .bzrignore: Add string.h.
+ * lib/gnulib.mk, m4/gnulib-comp.m4: Regenerate.
+ * lib/memrchr.c, lib/string.in.h, m4/memrchr.m4, m4/string_h.m4: + New files, from gnulib. + 2013-02-08 Paul Eggert <eggert@cs.ucla.edu> Merge from gnulib, incorporating: === modified file 'admin/ChangeLog' --- admin/ChangeLog 2013-02-01 06:30:51 +0000 +++ admin/ChangeLog 2013-02-09 03:12:48 +0000 @@ -1,3 +1,8 @@ +2013-02-09 Paul Eggert <eggert@cs.ucla.edu> + + Tune redisplay by using memchr and memrchr. + * merge-gnulib (GNULIB_MODULES): Add memrchr. + 2013-02-01 Paul Eggert <eggert@cs.ucla.edu> Use fdopendir, fstatat and readlinkat, for efficiency (Bug#13539). === modified file 'admin/merge-gnulib' --- admin/merge-gnulib 2013-02-01 06:30:51 +0000 +++ admin/merge-gnulib 2013-02-09 03:12:48 +0000 @@ -31,7 +31,8 @@ dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat - manywarnings mktime pselect pthread_sigmask putenv readlink readlinkat + manywarnings memrchr mktime + pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens === modified file 'lib/gnulib.mk' --- lib/gnulib.mk 2013-02-08 23:37:17 +0000 +++ lib/gnulib.mk 2013-02-09 03:12:48 +0000 @@ -21,7 +21,7 @@ # the same distribution terms as the rest of that program. # # Generated by gnulib-tool. -# Reproduce by: gnulib-tool --import --dir=. 
--lib=libgnu --source-base=lib --m4-base=m4 --doc-base=doc --tests-base=tests --aux-dir=build-aux --avoid=dup --avoid=errno --avoid=fchdir --avoid=fcntl --avoid=fstat --avoid=malloc-posix --avoid=msvc-inval --avoid=msvc-nothrow --avoid=open --avoid=openat-die --avoid=opendir --avoid=raise --avoid=save-cwd --avoid=select --avoid=sigprocmask --avoid=sys_types --avoid=threadlib --makefile-name=gnulib.mk --conditional-dependencies --no-libtool --macro-prefix=gl --no-vc-files alloca-opt c-ctype c-strcase careadlinkat close-stream crypto/md5 crypto/sha1 crypto/sha256 crypto/sha512 dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat manywarnings mktime pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens warnings +# Reproduce by: gnulib-tool --import --dir=. 
--lib=libgnu --source-base=lib --m4-base=m4 --doc-base=doc --tests-base=tests --aux-dir=build-aux --avoid=dup --avoid=errno --avoid=fchdir --avoid=fcntl --avoid=fstat --avoid=malloc-posix --avoid=msvc-inval --avoid=msvc-nothrow --avoid=open --avoid=openat-die --avoid=opendir --avoid=raise --avoid=save-cwd --avoid=select --avoid=sigprocmask --avoid=sys_types --avoid=threadlib --makefile-name=gnulib.mk --conditional-dependencies --no-libtool --macro-prefix=gl --no-vc-files alloca-opt c-ctype c-strcase careadlinkat close-stream crypto/md5 crypto/sha1 crypto/sha256 crypto/sha512 dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat manywarnings memrchr mktime pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens warnings MOSTLYCLEANFILES += core *.stackdump @@ -480,6 +480,15 @@ ## end gnulib module lstat +## begin gnulib module memrchr + + +EXTRA_DIST += memrchr.c + +EXTRA_libgnu_a_SOURCES += memrchr.c + +## end gnulib module memrchr + ## begin gnulib module mktime @@ -1105,6 +1114,106 @@ ## end gnulib module strftime +## begin gnulib module string + +BUILT_SOURCES += string.h + +# We need the following in order to create <string.h> when the system +# doesn't have one that works with the given compiler. +string.h: string.in.h $(top_builddir)/config.status $(CXXDEFS_H) $(ARG_NONNULL_H) $(WARN_ON_USE_H) + $(AM_V_GEN)rm -f $@-t $@ && \ + { echo '/* DO NOT EDIT! GENERATED AUTOMATICALLY! 
*/' && \ + sed -e 's|@''GUARD_PREFIX''@|GL|g' \ + -e 's|@''INCLUDE_NEXT''@|$(INCLUDE_NEXT)|g' \ + -e 's|@''PRAGMA_SYSTEM_HEADER''@|@PRAGMA_SYSTEM_HEADER@|g' \ + -e 's|@''PRAGMA_COLUMNS''@|@PRAGMA_COLUMNS@|g' \ + -e 's|@''NEXT_STRING_H''@|$(NEXT_STRING_H)|g' \ + -e 's/@''GNULIB_FFSL''@/$(GNULIB_FFSL)/g' \ + -e 's/@''GNULIB_FFSLL''@/$(GNULIB_FFSLL)/g' \ + -e 's/@''GNULIB_MBSLEN''@/$(GNULIB_MBSLEN)/g' \ + -e 's/@''GNULIB_MBSNLEN''@/$(GNULIB_MBSNLEN)/g' \ + -e 's/@''GNULIB_MBSCHR''@/$(GNULIB_MBSCHR)/g' \ + -e 's/@''GNULIB_MBSRCHR''@/$(GNULIB_MBSRCHR)/g' \ + -e 's/@''GNULIB_MBSSTR''@/$(GNULIB_MBSSTR)/g' \ + -e 's/@''GNULIB_MBSCASECMP''@/$(GNULIB_MBSCASECMP)/g' \ + -e 's/@''GNULIB_MBSNCASECMP''@/$(GNULIB_MBSNCASECMP)/g' \ + -e 's/@''GNULIB_MBSPCASECMP''@/$(GNULIB_MBSPCASECMP)/g' \ + -e 's/@''GNULIB_MBSCASESTR''@/$(GNULIB_MBSCASESTR)/g' \ + -e 's/@''GNULIB_MBSCSPN''@/$(GNULIB_MBSCSPN)/g' \ + -e 's/@''GNULIB_MBSPBRK''@/$(GNULIB_MBSPBRK)/g' \ + -e 's/@''GNULIB_MBSSPN''@/$(GNULIB_MBSSPN)/g' \ + -e 's/@''GNULIB_MBSSEP''@/$(GNULIB_MBSSEP)/g' \ + -e 's/@''GNULIB_MBSTOK_R''@/$(GNULIB_MBSTOK_R)/g' \ + -e 's/@''GNULIB_MEMCHR''@/$(GNULIB_MEMCHR)/g' \ + -e 's/@''GNULIB_MEMMEM''@/$(GNULIB_MEMMEM)/g' \ + -e 's/@''GNULIB_MEMPCPY''@/$(GNULIB_MEMPCPY)/g' \ + -e 's/@''GNULIB_MEMRCHR''@/$(GNULIB_MEMRCHR)/g' \ + -e 's/@''GNULIB_RAWMEMCHR''@/$(GNULIB_RAWMEMCHR)/g' \ + -e 's/@''GNULIB_STPCPY''@/$(GNULIB_STPCPY)/g' \ + -e 's/@''GNULIB_STPNCPY''@/$(GNULIB_STPNCPY)/g' \ + -e 's/@''GNULIB_STRCHRNUL''@/$(GNULIB_STRCHRNUL)/g' \ + -e 's/@''GNULIB_STRDUP''@/$(GNULIB_STRDUP)/g' \ + -e 's/@''GNULIB_STRNCAT''@/$(GNULIB_STRNCAT)/g' \ + -e 's/@''GNULIB_STRNDUP''@/$(GNULIB_STRNDUP)/g' \ + -e 's/@''GNULIB_STRNLEN''@/$(GNULIB_STRNLEN)/g' \ + -e 's/@''GNULIB_STRPBRK''@/$(GNULIB_STRPBRK)/g' \ + -e 's/@''GNULIB_STRSEP''@/$(GNULIB_STRSEP)/g' \ + -e 's/@''GNULIB_STRSTR''@/$(GNULIB_STRSTR)/g' \ + -e 's/@''GNULIB_STRCASESTR''@/$(GNULIB_STRCASESTR)/g' \ + -e 's/@''GNULIB_STRTOK_R''@/$(GNULIB_STRTOK_R)/g' \ + -e 
's/@''GNULIB_STRERROR''@/$(GNULIB_STRERROR)/g' \ + -e 's/@''GNULIB_STRERROR_R''@/$(GNULIB_STRERROR_R)/g' \ + -e 's/@''GNULIB_STRSIGNAL''@/$(GNULIB_STRSIGNAL)/g' \ + -e 's/@''GNULIB_STRVERSCMP''@/$(GNULIB_STRVERSCMP)/g' \ + < $(srcdir)/string.in.h | \ + sed -e 's|@''HAVE_FFSL''@|$(HAVE_FFSL)|g' \ + -e 's|@''HAVE_FFSLL''@|$(HAVE_FFSLL)|g' \ + -e 's|@''HAVE_MBSLEN''@|$(HAVE_MBSLEN)|g' \ + -e 's|@''HAVE_MEMCHR''@|$(HAVE_MEMCHR)|g' \ + -e 's|@''HAVE_DECL_MEMMEM''@|$(HAVE_DECL_MEMMEM)|g' \ + -e 's|@''HAVE_MEMPCPY''@|$(HAVE_MEMPCPY)|g' \ + -e 's|@''HAVE_DECL_MEMRCHR''@|$(HAVE_DECL_MEMRCHR)|g' \ + -e 's|@''HAVE_RAWMEMCHR''@|$(HAVE_RAWMEMCHR)|g' \ + -e 's|@''HAVE_STPCPY''@|$(HAVE_STPCPY)|g' \ + -e 's|@''HAVE_STPNCPY''@|$(HAVE_STPNCPY)|g' \ + -e 's|@''HAVE_STRCHRNUL''@|$(HAVE_STRCHRNUL)|g' \ + -e 's|@''HAVE_DECL_STRDUP''@|$(HAVE_DECL_STRDUP)|g' \ + -e 's|@''HAVE_DECL_STRNDUP''@|$(HAVE_DECL_STRNDUP)|g' \ + -e 's|@''HAVE_DECL_STRNLEN''@|$(HAVE_DECL_STRNLEN)|g' \ + -e 's|@''HAVE_STRPBRK''@|$(HAVE_STRPBRK)|g' \ + -e 's|@''HAVE_STRSEP''@|$(HAVE_STRSEP)|g' \ + -e 's|@''HAVE_STRCASESTR''@|$(HAVE_STRCASESTR)|g' \ + -e 's|@''HAVE_DECL_STRTOK_R''@|$(HAVE_DECL_STRTOK_R)|g' \ + -e 's|@''HAVE_DECL_STRERROR_R''@|$(HAVE_DECL_STRERROR_R)|g' \ + -e 's|@''HAVE_DECL_STRSIGNAL''@|$(HAVE_DECL_STRSIGNAL)|g' \ + -e 's|@''HAVE_STRVERSCMP''@|$(HAVE_STRVERSCMP)|g' \ + -e 's|@''REPLACE_STPNCPY''@|$(REPLACE_STPNCPY)|g' \ + -e 's|@''REPLACE_MEMCHR''@|$(REPLACE_MEMCHR)|g' \ + -e 's|@''REPLACE_MEMMEM''@|$(REPLACE_MEMMEM)|g' \ + -e 's|@''REPLACE_STRCASESTR''@|$(REPLACE_STRCASESTR)|g' \ + -e 's|@''REPLACE_STRCHRNUL''@|$(REPLACE_STRCHRNUL)|g' \ + -e 's|@''REPLACE_STRDUP''@|$(REPLACE_STRDUP)|g' \ + -e 's|@''REPLACE_STRSTR''@|$(REPLACE_STRSTR)|g' \ + -e 's|@''REPLACE_STRERROR''@|$(REPLACE_STRERROR)|g' \ + -e 's|@''REPLACE_STRERROR_R''@|$(REPLACE_STRERROR_R)|g' \ + -e 's|@''REPLACE_STRNCAT''@|$(REPLACE_STRNCAT)|g' \ + -e 's|@''REPLACE_STRNDUP''@|$(REPLACE_STRNDUP)|g' \ + -e 
's|@''REPLACE_STRNLEN''@|$(REPLACE_STRNLEN)|g' \ + -e 's|@''REPLACE_STRSIGNAL''@|$(REPLACE_STRSIGNAL)|g' \ + -e 's|@''REPLACE_STRTOK_R''@|$(REPLACE_STRTOK_R)|g' \ + -e 's|@''UNDEFINE_STRTOK_R''@|$(UNDEFINE_STRTOK_R)|g' \ + -e '/definitions of _GL_FUNCDECL_RPL/r $(CXXDEFS_H)' \ + -e '/definition of _GL_ARG_NONNULL/r $(ARG_NONNULL_H)' \ + -e '/definition of _GL_WARN_ON_USE/r $(WARN_ON_USE_H)'; \ + < $(srcdir)/string.in.h; \ + } > $@-t && \ + mv $@-t $@ +MOSTLYCLEANFILES += string.h string.h-t + +EXTRA_DIST += string.in.h + +## end gnulib module string + ## begin gnulib module strtoimax === added file 'lib/memrchr.c' --- lib/memrchr.c 1970-01-01 00:00:00 +0000 +++ lib/memrchr.c 2013-02-09 03:12:48 +0000 @@ -0,0 +1,161 @@ +/* memrchr -- find the last occurrence of a byte in a memory block + + Copyright (C) 1991, 1993, 1996-1997, 1999-2000, 2003-2013 Free Software + Foundation, Inc. + + Based on strlen implementation by Torbjorn Granlund (tege@sics.se), + with help from Dan Sahlin (dan@sics.se) and + commentary by Jim Blandy (jimb@ai.mit.edu); + adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu), + and implemented by Roland McGrath (roland@ai.mit.edu). + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. 
*/ + +#if defined _LIBC +# include <memcopy.h> +#else +# include <config.h> +# define reg_char char +#endif + +#include <string.h> +#include <limits.h> + +#undef __memrchr +#ifdef _LIBC +# undef memrchr +#endif + +#ifndef weak_alias +# define __memrchr memrchr +#endif + +/* Search no more than N bytes of S for C. */ +void * +__memrchr (void const *s, int c_in, size_t n) +{ + /* On 32-bit hardware, choosing longword to be a 32-bit unsigned + long instead of a 64-bit uintmax_t tends to give better + performance. On 64-bit hardware, unsigned long is generally 64 + bits already. Change this typedef to experiment with + performance. */ + typedef unsigned long int longword; + + const unsigned char *char_ptr; + const longword *longword_ptr; + longword repeated_one; + longword repeated_c; + unsigned reg_char c; + + c = (unsigned char) c_in; + + /* Handle the last few bytes by reading one byte at a time. + Do this until CHAR_PTR is aligned on a longword boundary. */ + for (char_ptr = (const unsigned char *) s + n; + n > 0 && (size_t) char_ptr % sizeof (longword) != 0; + --n) + if (*--char_ptr == c) + return (void *) char_ptr; + + longword_ptr = (const longword *) char_ptr; + + /* All these elucidatory comments refer to 4-byte longwords, + but the theory applies equally well to any size longwords. */ + + /* Compute auxiliary longword values: + repeated_one is a value which has a 1 in every byte. + repeated_c has c in every byte. */ + repeated_one = 0x01010101; + repeated_c = c | (c << 8); + repeated_c |= repeated_c << 16; + if (0xffffffffU < (longword) -1) + { + repeated_one |= repeated_one << 31 << 1; + repeated_c |= repeated_c << 31 << 1; + if (8 < sizeof (longword)) + { + size_t i; + + for (i = 64; i < sizeof (longword) * 8; i *= 2) + { + repeated_one |= repeated_one << i; + repeated_c |= repeated_c << i; + } + } + } + + /* Instead of the traditional loop which tests each byte, we will test a + longword at a time. 
The tricky part is testing if *any of the four* + bytes in the longword in question are equal to c. We first use an xor + with repeated_c. This reduces the task to testing whether *any of the + four* bytes in longword1 is zero. + + We compute tmp = + ((longword1 - repeated_one) & ~longword1) & (repeated_one << 7). + That is, we perform the following operations: + 1. Subtract repeated_one. + 2. & ~longword1. + 3. & a mask consisting of 0x80 in every byte. + Consider what happens in each byte: + - If a byte of longword1 is zero, step 1 and 2 transform it into 0xff, + and step 3 transforms it into 0x80. A carry can also be propagated + to more significant bytes. + - If a byte of longword1 is nonzero, let its lowest 1 bit be at + position k (0 <= k <= 7); so the lowest k bits are 0. After step 1, + the byte ends in a single bit of value 0 and k bits of value 1. + After step 2, the result is just k bits of value 1: 2^k - 1. After + step 3, the result is 0. And no carry is produced. + So, if longword1 has only non-zero bytes, tmp is zero. + Whereas if longword1 has a zero byte, call j the position of the least + significant zero byte. Then the result has a zero at positions 0, ..., + j-1 and a 0x80 at position j. We cannot predict the result at the more + significant bytes (positions j+1..3), but it does not matter since we + already have a non-zero bit at position 8*j+7. + + So, the test whether any byte in longword1 is zero is equivalent to + testing whether tmp is nonzero. */ + + while (n >= sizeof (longword)) + { + longword longword1 = *--longword_ptr ^ repeated_c; + + if ((((longword1 - repeated_one) & ~longword1) + & (repeated_one << 7)) != 0) + { + longword_ptr++; + break; + } + n -= sizeof (longword); + } + + char_ptr = (const unsigned char *) longword_ptr; + + /* At this point, we know that either n < sizeof (longword), or one of the + sizeof (longword) bytes starting at char_ptr is == c. 
On little-endian + machines, we could determine the first such byte without any further + memory accesses, just by looking at the tmp result from the last loop + iteration. But this does not work on big-endian machines. Choose code + that works in both cases. */ + + while (n-- > 0) + { + if (*--char_ptr == c) + return (void *) char_ptr; + } + + return NULL; +} +#ifdef weak_alias +weak_alias (__memrchr, memrchr) +#endif === added file 'lib/string.in.h' --- lib/string.in.h 1970-01-01 00:00:00 +0000 +++ lib/string.in.h 2013-02-09 03:12:48 +0000 @@ -0,0 +1,1029 @@ +/* A GNU-like <string.h>. + + Copyright (C) 1995-1996, 2001-2013 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. */ + +#ifndef _@GUARD_PREFIX@_STRING_H + +#if __GNUC__ >= 3 +@PRAGMA_SYSTEM_HEADER@ +#endif +@PRAGMA_COLUMNS@ + +/* The include_next requires a split double-inclusion guard. */ +#@INCLUDE_NEXT@ @NEXT_STRING_H@ + +#ifndef _@GUARD_PREFIX@_STRING_H +#define _@GUARD_PREFIX@_STRING_H + +/* NetBSD 5.0 mis-defines NULL. */ +#include <stddef.h> + +/* MirBSD defines mbslen as a macro. */ +#if @GNULIB_MBSLEN@ && defined __MirBSD__ +# include <wchar.h> +#endif + +/* The __attribute__ feature is available in gcc versions 2.5 and later. + The attribute __pure__ was added in gcc 2.96. 
*/ +#if __GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 96) +# define _GL_ATTRIBUTE_PURE __attribute__ ((__pure__)) +#else +# define _GL_ATTRIBUTE_PURE /* empty */ +#endif + +/* NetBSD 5.0 declares strsignal in <unistd.h>, not in <string.h>. */ +/* But in any case avoid namespace pollution on glibc systems. */ +#if (@GNULIB_STRSIGNAL@ || defined GNULIB_POSIXCHECK) && defined __NetBSD__ \ + && ! defined __GLIBC__ +# include <unistd.h> +#endif + +/* The definitions of _GL_FUNCDECL_RPL etc. are copied here. */ + +/* The definition of _GL_ARG_NONNULL is copied here. */ + +/* The definition of _GL_WARN_ON_USE is copied here. */ + + +/* Find the index of the least-significant set bit. */ +#if @GNULIB_FFSL@ +# if !@HAVE_FFSL@ +_GL_FUNCDECL_SYS (ffsl, int, (long int i)); +# endif +_GL_CXXALIAS_SYS (ffsl, int, (long int i)); +_GL_CXXALIASWARN (ffsl); +#elif defined GNULIB_POSIXCHECK +# undef ffsl +# if HAVE_RAW_DECL_FFSL +_GL_WARN_ON_USE (ffsl, "ffsl is not portable - use the ffsl module"); +# endif +#endif + + +/* Find the index of the least-significant set bit. */ +#if @GNULIB_FFSLL@ +# if !@HAVE_FFSLL@ +_GL_FUNCDECL_SYS (ffsll, int, (long long int i)); +# endif +_GL_CXXALIAS_SYS (ffsll, int, (long long int i)); +_GL_CXXALIASWARN (ffsll); +#elif defined GNULIB_POSIXCHECK +# undef ffsll +# if HAVE_RAW_DECL_FFSLL +_GL_WARN_ON_USE (ffsll, "ffsll is not portable - use the ffsll module"); +# endif +#endif + + +/* Return the first instance of C within N bytes of S, or NULL. */ +#if @GNULIB_MEMCHR@ +# if @REPLACE_MEMCHR@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define memchr rpl_memchr +# endif +_GL_FUNCDECL_RPL (memchr, void *, (void const *__s, int __c, size_t __n) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (memchr, void *, (void const *__s, int __c, size_t __n)); +# else +# if ! 
@HAVE_MEMCHR@ +_GL_FUNCDECL_SYS (memchr, void *, (void const *__s, int __c, size_t __n) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C" { const void * std::memchr (const void *, int, size_t); } + extern "C++" { void * std::memchr (void *, int, size_t); } */ +_GL_CXXALIAS_SYS_CAST2 (memchr, + void *, (void const *__s, int __c, size_t __n), + void const *, (void const *__s, int __c, size_t __n)); +# endif +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (memchr, void *, (void *__s, int __c, size_t __n)); +_GL_CXXALIASWARN1 (memchr, void const *, + (void const *__s, int __c, size_t __n)); +# else +_GL_CXXALIASWARN (memchr); +# endif +#elif defined GNULIB_POSIXCHECK +# undef memchr +/* Assume memchr is always declared. */ +_GL_WARN_ON_USE (memchr, "memchr has platform-specific bugs - " + "use gnulib module memchr for portability" ); +#endif + +/* Return the first occurrence of NEEDLE in HAYSTACK. */ +#if @GNULIB_MEMMEM@ +# if @REPLACE_MEMMEM@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define memmem rpl_memmem +# endif +_GL_FUNCDECL_RPL (memmem, void *, + (void const *__haystack, size_t __haystack_len, + void const *__needle, size_t __needle_len) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 3))); +_GL_CXXALIAS_RPL (memmem, void *, + (void const *__haystack, size_t __haystack_len, + void const *__needle, size_t __needle_len)); +# else +# if ! 
@HAVE_DECL_MEMMEM@ +_GL_FUNCDECL_SYS (memmem, void *, + (void const *__haystack, size_t __haystack_len, + void const *__needle, size_t __needle_len) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 3))); +# endif +_GL_CXXALIAS_SYS (memmem, void *, + (void const *__haystack, size_t __haystack_len, + void const *__needle, size_t __needle_len)); +# endif +_GL_CXXALIASWARN (memmem); +#elif defined GNULIB_POSIXCHECK +# undef memmem +# if HAVE_RAW_DECL_MEMMEM +_GL_WARN_ON_USE (memmem, "memmem is unportable and often quadratic - " + "use gnulib module memmem-simple for portability, " + "and module memmem for speed" ); +# endif +#endif + +/* Copy N bytes of SRC to DEST, return pointer to bytes after the + last written byte. */ +#if @GNULIB_MEMPCPY@ +# if ! @HAVE_MEMPCPY@ +_GL_FUNCDECL_SYS (mempcpy, void *, + (void *restrict __dest, void const *restrict __src, + size_t __n) + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (mempcpy, void *, + (void *restrict __dest, void const *restrict __src, + size_t __n)); +_GL_CXXALIASWARN (mempcpy); +#elif defined GNULIB_POSIXCHECK +# undef mempcpy +# if HAVE_RAW_DECL_MEMPCPY +_GL_WARN_ON_USE (mempcpy, "mempcpy is unportable - " + "use gnulib module mempcpy for portability"); +# endif +#endif + +/* Search backwards through a block for a byte (specified as an int). */ +#if @GNULIB_MEMRCHR@ +# if ! 
@HAVE_DECL_MEMRCHR@ +_GL_FUNCDECL_SYS (memrchr, void *, (void const *, int, size_t) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const void * std::memrchr (const void *, int, size_t); } + extern "C++" { void * std::memrchr (void *, int, size_t); } */ +_GL_CXXALIAS_SYS_CAST2 (memrchr, + void *, (void const *, int, size_t), + void const *, (void const *, int, size_t)); +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (memrchr, void *, (void *, int, size_t)); +_GL_CXXALIASWARN1 (memrchr, void const *, (void const *, int, size_t)); +# else +_GL_CXXALIASWARN (memrchr); +# endif +#elif defined GNULIB_POSIXCHECK +# undef memrchr +# if HAVE_RAW_DECL_MEMRCHR +_GL_WARN_ON_USE (memrchr, "memrchr is unportable - " + "use gnulib module memrchr for portability"); +# endif +#endif + +/* Find the first occurrence of C in S. More efficient than + memchr(S,C,N), at the expense of undefined behavior if C does not + occur within N bytes. */ +#if @GNULIB_RAWMEMCHR@ +# if ! 
@HAVE_RAWMEMCHR@ +_GL_FUNCDECL_SYS (rawmemchr, void *, (void const *__s, int __c_in) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const void * std::rawmemchr (const void *, int); } + extern "C++" { void * std::rawmemchr (void *, int); } */ +_GL_CXXALIAS_SYS_CAST2 (rawmemchr, + void *, (void const *__s, int __c_in), + void const *, (void const *__s, int __c_in)); +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (rawmemchr, void *, (void *__s, int __c_in)); +_GL_CXXALIASWARN1 (rawmemchr, void const *, (void const *__s, int __c_in)); +# else +_GL_CXXALIASWARN (rawmemchr); +# endif +#elif defined GNULIB_POSIXCHECK +# undef rawmemchr +# if HAVE_RAW_DECL_RAWMEMCHR +_GL_WARN_ON_USE (rawmemchr, "rawmemchr is unportable - " + "use gnulib module rawmemchr for portability"); +# endif +#endif + +/* Copy SRC to DST, returning the address of the terminating '\0' in DST. */ +#if @GNULIB_STPCPY@ +# if ! @HAVE_STPCPY@ +_GL_FUNCDECL_SYS (stpcpy, char *, + (char *restrict __dst, char const *restrict __src) + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (stpcpy, char *, + (char *restrict __dst, char const *restrict __src)); +_GL_CXXALIASWARN (stpcpy); +#elif defined GNULIB_POSIXCHECK +# undef stpcpy +# if HAVE_RAW_DECL_STPCPY +_GL_WARN_ON_USE (stpcpy, "stpcpy is unportable - " + "use gnulib module stpcpy for portability"); +# endif +#endif + +/* Copy no more than N bytes of SRC to DST, returning a pointer past the + last non-NUL byte written into DST. 
*/ +#if @GNULIB_STPNCPY@ +# if @REPLACE_STPNCPY@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef stpncpy +# define stpncpy rpl_stpncpy +# endif +_GL_FUNCDECL_RPL (stpncpy, char *, + (char *restrict __dst, char const *restrict __src, + size_t __n) + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (stpncpy, char *, + (char *restrict __dst, char const *restrict __src, + size_t __n)); +# else +# if ! @HAVE_STPNCPY@ +_GL_FUNCDECL_SYS (stpncpy, char *, + (char *restrict __dst, char const *restrict __src, + size_t __n) + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (stpncpy, char *, + (char *restrict __dst, char const *restrict __src, + size_t __n)); +# endif +_GL_CXXALIASWARN (stpncpy); +#elif defined GNULIB_POSIXCHECK +# undef stpncpy +# if HAVE_RAW_DECL_STPNCPY +_GL_WARN_ON_USE (stpncpy, "stpncpy is unportable - " + "use gnulib module stpncpy for portability"); +# endif +#endif + +#if defined GNULIB_POSIXCHECK +/* strchr() does not work with multibyte strings if the locale encoding is + GB18030 and the character to be searched is a digit. */ +# undef strchr +/* Assume strchr is always declared. */ +_GL_WARN_ON_USE (strchr, "strchr cannot work correctly on character strings " + "in some multibyte locales - " + "use mbschr if you care about internationalization"); +#endif + +/* Find the first occurrence of C in S or the final NUL byte. */ +#if @GNULIB_STRCHRNUL@ +# if @REPLACE_STRCHRNUL@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define strchrnul rpl_strchrnul +# endif +_GL_FUNCDECL_RPL (strchrnul, char *, (const char *__s, int __c_in) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (strchrnul, char *, + (const char *str, int ch)); +# else +# if ! 
@HAVE_STRCHRNUL@ +_GL_FUNCDECL_SYS (strchrnul, char *, (char const *__s, int __c_in) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const char * std::strchrnul (const char *, int); } + extern "C++" { char * std::strchrnul (char *, int); } */ +_GL_CXXALIAS_SYS_CAST2 (strchrnul, + char *, (char const *__s, int __c_in), + char const *, (char const *__s, int __c_in)); +# endif +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (strchrnul, char *, (char *__s, int __c_in)); +_GL_CXXALIASWARN1 (strchrnul, char const *, (char const *__s, int __c_in)); +# else +_GL_CXXALIASWARN (strchrnul); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strchrnul +# if HAVE_RAW_DECL_STRCHRNUL +_GL_WARN_ON_USE (strchrnul, "strchrnul is unportable - " + "use gnulib module strchrnul for portability"); +# endif +#endif + +/* Duplicate S, returning an identical malloc'd string. */ +#if @GNULIB_STRDUP@ +# if @REPLACE_STRDUP@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strdup +# define strdup rpl_strdup +# endif +_GL_FUNCDECL_RPL (strdup, char *, (char const *__s) _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (strdup, char *, (char const *__s)); +# else +# if defined __cplusplus && defined GNULIB_NAMESPACE && defined strdup + /* strdup exists as a function and as a macro. Get rid of the macro. 
*/ +# undef strdup +# endif +# if !(@HAVE_DECL_STRDUP@ || defined strdup) +_GL_FUNCDECL_SYS (strdup, char *, (char const *__s) _GL_ARG_NONNULL ((1))); +# endif +_GL_CXXALIAS_SYS (strdup, char *, (char const *__s)); +# endif +_GL_CXXALIASWARN (strdup); +#elif defined GNULIB_POSIXCHECK +# undef strdup +# if HAVE_RAW_DECL_STRDUP +_GL_WARN_ON_USE (strdup, "strdup is unportable - " + "use gnulib module strdup for portability"); +# endif +#endif + +/* Append no more than N characters from SRC onto DEST. */ +#if @GNULIB_STRNCAT@ +# if @REPLACE_STRNCAT@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strncat +# define strncat rpl_strncat +# endif +_GL_FUNCDECL_RPL (strncat, char *, (char *dest, const char *src, size_t n) + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (strncat, char *, (char *dest, const char *src, size_t n)); +# else +_GL_CXXALIAS_SYS (strncat, char *, (char *dest, const char *src, size_t n)); +# endif +_GL_CXXALIASWARN (strncat); +#elif defined GNULIB_POSIXCHECK +# undef strncat +# if HAVE_RAW_DECL_STRNCAT +_GL_WARN_ON_USE (strncat, "strncat is unportable - " + "use gnulib module strncat for portability"); +# endif +#endif + +/* Return a newly allocated copy of at most N bytes of STRING. */ +#if @GNULIB_STRNDUP@ +# if @REPLACE_STRNDUP@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strndup +# define strndup rpl_strndup +# endif +_GL_FUNCDECL_RPL (strndup, char *, (char const *__string, size_t __n) + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (strndup, char *, (char const *__string, size_t __n)); +# else +# if ! 
@HAVE_DECL_STRNDUP@ +_GL_FUNCDECL_SYS (strndup, char *, (char const *__string, size_t __n) + _GL_ARG_NONNULL ((1))); +# endif +_GL_CXXALIAS_SYS (strndup, char *, (char const *__string, size_t __n)); +# endif +_GL_CXXALIASWARN (strndup); +#elif defined GNULIB_POSIXCHECK +# undef strndup +# if HAVE_RAW_DECL_STRNDUP +_GL_WARN_ON_USE (strndup, "strndup is unportable - " + "use gnulib module strndup for portability"); +# endif +#endif + +/* Find the length (number of bytes) of STRING, but scan at most + MAXLEN bytes. If no '\0' terminator is found in that many bytes, + return MAXLEN. */ +#if @GNULIB_STRNLEN@ +# if @REPLACE_STRNLEN@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strnlen +# define strnlen rpl_strnlen +# endif +_GL_FUNCDECL_RPL (strnlen, size_t, (char const *__string, size_t __maxlen) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (strnlen, size_t, (char const *__string, size_t __maxlen)); +# else +# if ! @HAVE_DECL_STRNLEN@ +_GL_FUNCDECL_SYS (strnlen, size_t, (char const *__string, size_t __maxlen) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif +_GL_CXXALIAS_SYS (strnlen, size_t, (char const *__string, size_t __maxlen)); +# endif +_GL_CXXALIASWARN (strnlen); +#elif defined GNULIB_POSIXCHECK +# undef strnlen +# if HAVE_RAW_DECL_STRNLEN +_GL_WARN_ON_USE (strnlen, "strnlen is unportable - " + "use gnulib module strnlen for portability"); +# endif +#endif + +#if defined GNULIB_POSIXCHECK +/* strcspn() assumes the second argument is a list of single-byte characters. + Even in this simple case, it does not work with multibyte strings if the + locale encoding is GB18030 and one of the characters to be searched is a + digit. */ +# undef strcspn +/* Assume strcspn is always declared. 
*/ +_GL_WARN_ON_USE (strcspn, "strcspn cannot work correctly on character strings " + "in multibyte locales - " + "use mbscspn if you care about internationalization"); +#endif + +/* Find the first occurrence in S of any character in ACCEPT. */ +#if @GNULIB_STRPBRK@ +# if ! @HAVE_STRPBRK@ +_GL_FUNCDECL_SYS (strpbrk, char *, (char const *__s, char const *__accept) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C" { const char * strpbrk (const char *, const char *); } + extern "C++" { char * strpbrk (char *, const char *); } */ +_GL_CXXALIAS_SYS_CAST2 (strpbrk, + char *, (char const *__s, char const *__accept), + const char *, (char const *__s, char const *__accept)); +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (strpbrk, char *, (char *__s, char const *__accept)); +_GL_CXXALIASWARN1 (strpbrk, char const *, + (char const *__s, char const *__accept)); +# else +_GL_CXXALIASWARN (strpbrk); +# endif +# if defined GNULIB_POSIXCHECK +/* strpbrk() assumes the second argument is a list of single-byte characters. + Even in this simple case, it does not work with multibyte strings if the + locale encoding is GB18030 and one of the characters to be searched is a + digit. */ +# undef strpbrk +_GL_WARN_ON_USE (strpbrk, "strpbrk cannot work correctly on character strings " + "in multibyte locales - " + "use mbspbrk if you care about internationalization"); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strpbrk +# if HAVE_RAW_DECL_STRPBRK +_GL_WARN_ON_USE (strpbrk, "strpbrk is unportable - " + "use gnulib module strpbrk for portability"); +# endif +#endif + +#if defined GNULIB_POSIXCHECK +/* strspn() assumes the second argument is a list of single-byte characters. + Even in this simple case, it cannot work with multibyte strings. 
*/ +# undef strspn +/* Assume strspn is always declared. */ +_GL_WARN_ON_USE (strspn, "strspn cannot work correctly on character strings " + "in multibyte locales - " + "use mbsspn if you care about internationalization"); +#endif + +#if defined GNULIB_POSIXCHECK +/* strrchr() does not work with multibyte strings if the locale encoding is + GB18030 and the character to be searched is a digit. */ +# undef strrchr +/* Assume strrchr is always declared. */ +_GL_WARN_ON_USE (strrchr, "strrchr cannot work correctly on character strings " + "in some multibyte locales - " + "use mbsrchr if you care about internationalization"); +#endif + +/* Search the next delimiter (char listed in DELIM) starting at *STRINGP. + If one is found, overwrite it with a NUL, and advance *STRINGP + to point to the next char after it. Otherwise, set *STRINGP to NULL. + If *STRINGP was already NULL, nothing happens. + Return the old value of *STRINGP. + + This is a variant of strtok() that is multithread-safe and supports + empty fields. + + Caveat: It modifies the original string. + Caveat: These functions cannot be used on constant strings. + Caveat: The identity of the delimiting character is lost. + Caveat: It doesn't work with multibyte strings unless all of the delimiter + characters are ASCII characters < 0x30. + + See also strtok_r(). */ +#if @GNULIB_STRSEP@ +# if ! 
@HAVE_STRSEP@ +_GL_FUNCDECL_SYS (strsep, char *, + (char **restrict __stringp, char const *restrict __delim) + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (strsep, char *, + (char **restrict __stringp, char const *restrict __delim)); +_GL_CXXALIASWARN (strsep); +# if defined GNULIB_POSIXCHECK +# undef strsep +_GL_WARN_ON_USE (strsep, "strsep cannot work correctly on character strings " + "in multibyte locales - " + "use mbssep if you care about internationalization"); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strsep +# if HAVE_RAW_DECL_STRSEP +_GL_WARN_ON_USE (strsep, "strsep is unportable - " + "use gnulib module strsep for portability"); +# endif +#endif + +#if @GNULIB_STRSTR@ +# if @REPLACE_STRSTR@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define strstr rpl_strstr +# endif +_GL_FUNCDECL_RPL (strstr, char *, (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (strstr, char *, (const char *haystack, const char *needle)); +# else + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const char * strstr (const char *, const char *); } + extern "C++" { char * strstr (char *, const char *); } */ +_GL_CXXALIAS_SYS_CAST2 (strstr, + char *, (const char *haystack, const char *needle), + const char *, (const char *haystack, const char *needle)); +# endif +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (strstr, char *, (char *haystack, const char *needle)); +_GL_CXXALIASWARN1 (strstr, const char *, + (const char *haystack, const char *needle)); +# else +_GL_CXXALIASWARN (strstr); +# endif +#elif defined GNULIB_POSIXCHECK +/* strstr() does not work with multibyte strings if the locale encoding is + different from UTF-8: + POSIX says that it operates on "strings", and "string" in POSIX is defined + as a sequence of bytes, not of 
characters. */ +# undef strstr +/* Assume strstr is always declared. */ +_GL_WARN_ON_USE (strstr, "strstr is quadratic on many systems, and cannot " + "work correctly on character strings in most " + "multibyte locales - " + "use mbsstr if you care about internationalization, " + "or use strstr if you care about speed"); +#endif + +/* Find the first occurrence of NEEDLE in HAYSTACK, using case-insensitive + comparison. */ +#if @GNULIB_STRCASESTR@ +# if @REPLACE_STRCASESTR@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define strcasestr rpl_strcasestr +# endif +_GL_FUNCDECL_RPL (strcasestr, char *, + (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (strcasestr, char *, + (const char *haystack, const char *needle)); +# else +# if ! @HAVE_STRCASESTR@ +_GL_FUNCDECL_SYS (strcasestr, char *, + (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const char * strcasestr (const char *, const char *); } + extern "C++" { char * strcasestr (char *, const char *); } */ +_GL_CXXALIAS_SYS_CAST2 (strcasestr, + char *, (const char *haystack, const char *needle), + const char *, (const char *haystack, const char *needle)); +# endif +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (strcasestr, char *, (char *haystack, const char *needle)); +_GL_CXXALIASWARN1 (strcasestr, const char *, + (const char *haystack, const char *needle)); +# else +_GL_CXXALIASWARN (strcasestr); +# endif +#elif defined GNULIB_POSIXCHECK +/* strcasestr() does not work with multibyte strings: + It is a glibc extension, and glibc implements it only for unibyte + locales. 
*/ +# undef strcasestr +# if HAVE_RAW_DECL_STRCASESTR +_GL_WARN_ON_USE (strcasestr, "strcasestr does work correctly on character " + "strings in multibyte locales - " + "use mbscasestr if you care about " + "internationalization, or use c-strcasestr if you want " + "a locale independent function"); +# endif +#endif + +/* Parse S into tokens separated by characters in DELIM. + If S is NULL, the saved pointer in SAVE_PTR is used as + the next starting point. For example: + char s[] = "-abc-=-def"; + char *sp; + x = strtok_r(s, "-", &sp); // x = "abc", sp = "=-def" + x = strtok_r(NULL, "-=", &sp); // x = "def", sp = NULL + x = strtok_r(NULL, "=", &sp); // x = NULL + // s = "abc\0-def\0" + + This is a variant of strtok() that is multithread-safe. + + For the POSIX documentation for this function, see: + http://www.opengroup.org/susv3xsh/strtok.html + + Caveat: It modifies the original string. + Caveat: These functions cannot be used on constant strings. + Caveat: The identity of the delimiting character is lost. + Caveat: It doesn't work with multibyte strings unless all of the delimiter + characters are ASCII characters < 0x30. + + See also strsep(). */ +#if @GNULIB_STRTOK_R@ +# if @REPLACE_STRTOK_R@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strtok_r +# define strtok_r rpl_strtok_r +# endif +_GL_FUNCDECL_RPL (strtok_r, char *, + (char *restrict s, char const *restrict delim, + char **restrict save_ptr) + _GL_ARG_NONNULL ((2, 3))); +_GL_CXXALIAS_RPL (strtok_r, char *, + (char *restrict s, char const *restrict delim, + char **restrict save_ptr)); +# else +# if @UNDEFINE_STRTOK_R@ || defined GNULIB_POSIXCHECK +# undef strtok_r +# endif +# if ! 
@HAVE_DECL_STRTOK_R@ +_GL_FUNCDECL_SYS (strtok_r, char *, + (char *restrict s, char const *restrict delim, + char **restrict save_ptr) + _GL_ARG_NONNULL ((2, 3))); +# endif +_GL_CXXALIAS_SYS (strtok_r, char *, + (char *restrict s, char const *restrict delim, + char **restrict save_ptr)); +# endif +_GL_CXXALIASWARN (strtok_r); +# if defined GNULIB_POSIXCHECK +_GL_WARN_ON_USE (strtok_r, "strtok_r cannot work correctly on character " + "strings in multibyte locales - " + "use mbstok_r if you care about internationalization"); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strtok_r +# if HAVE_RAW_DECL_STRTOK_R +_GL_WARN_ON_USE (strtok_r, "strtok_r is unportable - " + "use gnulib module strtok_r for portability"); +# endif +#endif + + +/* The following functions are not specified by POSIX. They are gnulib + extensions. */ + +#if @GNULIB_MBSLEN@ +/* Return the number of multibyte characters in the character string STRING. + This considers multibyte characters, unlike strlen, which counts bytes. */ +# ifdef __MirBSD__ /* MirBSD defines mbslen as a macro. Override it. */ +# undef mbslen +# endif +# if @HAVE_MBSLEN@ /* AIX, OSF/1, MirBSD define mbslen already in libc. */ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define mbslen rpl_mbslen +# endif +_GL_FUNCDECL_RPL (mbslen, size_t, (const char *string) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (mbslen, size_t, (const char *string)); +# else +_GL_FUNCDECL_SYS (mbslen, size_t, (const char *string) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_SYS (mbslen, size_t, (const char *string)); +# endif +_GL_CXXALIASWARN (mbslen); +#endif + +#if @GNULIB_MBSNLEN@ +/* Return the number of multibyte characters in the character string starting + at STRING and ending at STRING + LEN. 
*/ +_GL_EXTERN_C size_t mbsnlen (const char *string, size_t len) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1)); +#endif + +#if @GNULIB_MBSCHR@ +/* Locate the first single-byte character C in the character string STRING, + and return a pointer to it. Return NULL if C is not found in STRING. + Unlike strchr(), this function works correctly in multibyte locales with + encodings such as GB18030. */ +# if defined __hpux +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define mbschr rpl_mbschr /* avoid collision with HP-UX function */ +# endif +_GL_FUNCDECL_RPL (mbschr, char *, (const char *string, int c) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (mbschr, char *, (const char *string, int c)); +# else +_GL_FUNCDECL_SYS (mbschr, char *, (const char *string, int c) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_SYS (mbschr, char *, (const char *string, int c)); +# endif +_GL_CXXALIASWARN (mbschr); +#endif + +#if @GNULIB_MBSRCHR@ +/* Locate the last single-byte character C in the character string STRING, + and return a pointer to it. Return NULL if C is not found in STRING. + Unlike strrchr(), this function works correctly in multibyte locales with + encodings such as GB18030. */ +# if defined __hpux || defined __INTERIX +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define mbsrchr rpl_mbsrchr /* avoid collision with system function */ +# endif +_GL_FUNCDECL_RPL (mbsrchr, char *, (const char *string, int c) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (mbsrchr, char *, (const char *string, int c)); +# else +_GL_FUNCDECL_SYS (mbsrchr, char *, (const char *string, int c) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_SYS (mbsrchr, char *, (const char *string, int c)); +# endif +_GL_CXXALIASWARN (mbsrchr); +#endif + +#if @GNULIB_MBSSTR@ +/* Find the first occurrence of the character string NEEDLE in the character + string HAYSTACK. Return NULL if NEEDLE is not found in HAYSTACK. 
+ Unlike strstr(), this function works correctly in multibyte locales with + encodings different from UTF-8. */ +_GL_EXTERN_C char * mbsstr (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSCASECMP@ +/* Compare the character strings S1 and S2, ignoring case, returning less than, + equal to or greater than zero if S1 is lexicographically less than, equal to + or greater than S2. + Note: This function may, in multibyte locales, return 0 for strings of + different lengths! + Unlike strcasecmp(), this function works correctly in multibyte locales. */ +_GL_EXTERN_C int mbscasecmp (const char *s1, const char *s2) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSNCASECMP@ +/* Compare the initial segment of the character string S1 consisting of at most + N characters with the initial segment of the character string S2 consisting + of at most N characters, ignoring case, returning less than, equal to or + greater than zero if the initial segment of S1 is lexicographically less + than, equal to or greater than the initial segment of S2. + Note: This function may, in multibyte locales, return 0 for initial segments + of different lengths! + Unlike strncasecmp(), this function works correctly in multibyte locales. + But beware that N is not a byte count but a character count! */ +_GL_EXTERN_C int mbsncasecmp (const char *s1, const char *s2, size_t n) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSPCASECMP@ +/* Compare the initial segment of the character string STRING consisting of + at most mbslen (PREFIX) characters with the character string PREFIX, + ignoring case. If the two match, return a pointer to the first byte + after this prefix in STRING. Otherwise, return NULL. + Note: This function may, in multibyte locales, return non-NULL if STRING + is of smaller length than PREFIX! 
+ Unlike strncasecmp(), this function works correctly in multibyte + locales. */ +_GL_EXTERN_C char * mbspcasecmp (const char *string, const char *prefix) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSCASESTR@ +/* Find the first occurrence of the character string NEEDLE in the character + string HAYSTACK, using case-insensitive comparison. + Note: This function may, in multibyte locales, return success even if + strlen (haystack) < strlen (needle) ! + Unlike strcasestr(), this function works correctly in multibyte locales. */ +_GL_EXTERN_C char * mbscasestr (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSCSPN@ +/* Find the first occurrence in the character string STRING of any character + in the character string ACCEPT. Return the number of bytes from the + beginning of the string to this occurrence, or to the end of the string + if none exists. + Unlike strcspn(), this function works correctly in multibyte locales. */ +_GL_EXTERN_C size_t mbscspn (const char *string, const char *accept) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSPBRK@ +/* Find the first occurrence in the character string STRING of any character + in the character string ACCEPT. Return the pointer to it, or NULL if none + exists. + Unlike strpbrk(), this function works correctly in multibyte locales. 
*/ +# if defined __hpux +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define mbspbrk rpl_mbspbrk /* avoid collision with HP-UX function */ +# endif +_GL_FUNCDECL_RPL (mbspbrk, char *, (const char *string, const char *accept) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (mbspbrk, char *, (const char *string, const char *accept)); +# else +_GL_FUNCDECL_SYS (mbspbrk, char *, (const char *string, const char *accept) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_SYS (mbspbrk, char *, (const char *string, const char *accept)); +# endif +_GL_CXXALIASWARN (mbspbrk); +#endif + +#if @GNULIB_MBSSPN@ +/* Find the first occurrence in the character string STRING of any character + not in the character string REJECT. Return the number of bytes from the + beginning of the string to this occurrence, or to the end of the string + if none exists. + Unlike strspn(), this function works correctly in multibyte locales. */ +_GL_EXTERN_C size_t mbsspn (const char *string, const char *reject) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSSEP@ +/* Search the next delimiter (multibyte character listed in the character + string DELIM) starting at the character string *STRINGP. + If one is found, overwrite it with a NUL, and advance *STRINGP to point + to the next multibyte character after it. Otherwise, set *STRINGP to NULL. + If *STRINGP was already NULL, nothing happens. + Return the old value of *STRINGP. + + This is a variant of mbstok_r() that supports empty fields. + + Caveat: It modifies the original string. + Caveat: These functions cannot be used on constant strings. + Caveat: The identity of the delimiting character is lost. + + See also mbstok_r(). */ +_GL_EXTERN_C char * mbssep (char **stringp, const char *delim) + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSTOK_R@ +/* Parse the character string STRING into tokens separated by characters in + the character string DELIM. 
+ If STRING is NULL, the saved pointer in SAVE_PTR is used as + the next starting point. For example: + char s[] = "-abc-=-def"; + char *sp; + x = mbstok_r(s, "-", &sp); // x = "abc", sp = "=-def" + x = mbstok_r(NULL, "-=", &sp); // x = "def", sp = NULL + x = mbstok_r(NULL, "=", &sp); // x = NULL + // s = "abc\0-def\0" + + Caveat: It modifies the original string. + Caveat: These functions cannot be used on constant strings. + Caveat: The identity of the delimiting character is lost. + + See also mbssep(). */ +_GL_EXTERN_C char * mbstok_r (char *string, const char *delim, char **save_ptr) + _GL_ARG_NONNULL ((2, 3)); +#endif + +/* Map any int, typically from errno, into an error message. */ +#if @GNULIB_STRERROR@ +# if @REPLACE_STRERROR@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strerror +# define strerror rpl_strerror +# endif +_GL_FUNCDECL_RPL (strerror, char *, (int)); +_GL_CXXALIAS_RPL (strerror, char *, (int)); +# else +_GL_CXXALIAS_SYS (strerror, char *, (int)); +# endif +_GL_CXXALIASWARN (strerror); +#elif defined GNULIB_POSIXCHECK +# undef strerror +/* Assume strerror is always declared. */ +_GL_WARN_ON_USE (strerror, "strerror is unportable - " + "use gnulib module strerror to guarantee non-NULL result"); +#endif + +/* Map any int, typically from errno, into an error message. Multithread-safe. + Uses the POSIX declaration, not the glibc declaration. 
*/ +#if @GNULIB_STRERROR_R@ +# if @REPLACE_STRERROR_R@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strerror_r +# define strerror_r rpl_strerror_r +# endif +_GL_FUNCDECL_RPL (strerror_r, int, (int errnum, char *buf, size_t buflen) + _GL_ARG_NONNULL ((2))); +_GL_CXXALIAS_RPL (strerror_r, int, (int errnum, char *buf, size_t buflen)); +# else +# if !@HAVE_DECL_STRERROR_R@ +_GL_FUNCDECL_SYS (strerror_r, int, (int errnum, char *buf, size_t buflen) + _GL_ARG_NONNULL ((2))); +# endif +_GL_CXXALIAS_SYS (strerror_r, int, (int errnum, char *buf, size_t buflen)); +# endif +# if @HAVE_DECL_STRERROR_R@ +_GL_CXXALIASWARN (strerror_r); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strerror_r +# if HAVE_RAW_DECL_STRERROR_R +_GL_WARN_ON_USE (strerror_r, "strerror_r is unportable - " + "use gnulib module strerror_r-posix for portability"); +# endif +#endif + +#if @GNULIB_STRSIGNAL@ +# if @REPLACE_STRSIGNAL@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define strsignal rpl_strsignal +# endif +_GL_FUNCDECL_RPL (strsignal, char *, (int __sig)); +_GL_CXXALIAS_RPL (strsignal, char *, (int __sig)); +# else +# if ! @HAVE_DECL_STRSIGNAL@ +_GL_FUNCDECL_SYS (strsignal, char *, (int __sig)); +# endif +/* Need to cast, because on Cygwin 1.5.x systems, the return type is + 'const char *'. 
*/ +_GL_CXXALIAS_SYS_CAST (strsignal, char *, (int __sig)); +# endif +_GL_CXXALIASWARN (strsignal); +#elif defined GNULIB_POSIXCHECK +# undef strsignal +# if HAVE_RAW_DECL_STRSIGNAL +_GL_WARN_ON_USE (strsignal, "strsignal is unportable - " + "use gnulib module strsignal for portability"); +# endif +#endif + +#if @GNULIB_STRVERSCMP@ +# if !@HAVE_STRVERSCMP@ +_GL_FUNCDECL_SYS (strverscmp, int, (const char *, const char *) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (strverscmp, int, (const char *, const char *)); +_GL_CXXALIASWARN (strverscmp); +#elif defined GNULIB_POSIXCHECK +# undef strverscmp +# if HAVE_RAW_DECL_STRVERSCMP +_GL_WARN_ON_USE (strverscmp, "strverscmp is unportable - " + "use gnulib module strverscmp for portability"); +# endif +#endif + + +#endif /* _@GUARD_PREFIX@_STRING_H */ +#endif /* _@GUARD_PREFIX@_STRING_H */ === modified file 'm4/gnulib-comp.m4' --- m4/gnulib-comp.m4 2013-02-01 06:30:51 +0000 +++ m4/gnulib-comp.m4 2013-02-09 03:12:48 +0000 @@ -83,6 +83,7 @@ AC_REQUIRE([AC_SYS_LARGEFILE]) # Code from module lstat: # Code from module manywarnings: + # Code from module memrchr: # Code from module mktime: # Code from module multiarch: # Code from module nocrash: @@ -117,6 +118,7 @@ # Code from module stdio: # Code from module stdlib: # Code from module strftime: + # Code from module string: # Code from module strtoimax: # Code from module strtoll: # Code from module strtoull: @@ -242,6 +244,12 @@ gl_PREREQ_LSTAT fi gl_SYS_STAT_MODULE_INDICATOR([lstat]) + gl_FUNC_MEMRCHR + if test $ac_cv_func_memrchr = no; then + AC_LIBOBJ([memrchr]) + gl_PREREQ_MEMRCHR + fi + gl_STRING_MODULE_INDICATOR([memrchr]) gl_FUNC_MKTIME if test $REPLACE_MKTIME = 1; then AC_LIBOBJ([mktime]) @@ -294,6 +302,7 @@ gl_STDIO_H gl_STDLIB_H gl_FUNC_GNU_STRFTIME + gl_HEADER_STRING_H gl_FUNC_STRTOIMAX if test $HAVE_STRTOIMAX = 0 || test $REPLACE_STRTOIMAX = 1; then AC_LIBOBJ([strtoimax]) @@ -757,6 +766,7 @@ lib/lstat.c lib/md5.c lib/md5.h + 
lib/memrchr.c lib/mktime-internal.h lib/mktime.c lib/openat-priv.h @@ -790,6 +800,7 @@ lib/stdlib.in.h lib/strftime.c lib/strftime.h + lib/string.in.h lib/strtoimax.c lib/strtol.c lib/strtoll.c @@ -848,6 +859,7 @@ m4/lstat.m4 m4/manywarnings.m4 m4/md5.m4 + m4/memrchr.m4 m4/mktime.m4 m4/multiarch.m4 m4/nocrash.m4 @@ -877,6 +889,7 @@ m4/stdio_h.m4 m4/stdlib_h.m4 m4/strftime.m4 + m4/string_h.m4 m4/strtoimax.m4 m4/strtoll.m4 m4/strtoull.m4 === added file 'm4/memrchr.m4' --- m4/memrchr.m4 1970-01-01 00:00:00 +0000 +++ m4/memrchr.m4 2013-02-09 03:12:48 +0000 @@ -0,0 +1,23 @@ +# memrchr.m4 serial 10 +dnl Copyright (C) 2002-2003, 2005-2007, 2009-2013 Free Software Foundation, +dnl Inc. +dnl This file is free software; the Free Software Foundation +dnl gives unlimited permission to copy and/or distribute it, +dnl with or without modifications, as long as this notice is preserved. + +AC_DEFUN([gl_FUNC_MEMRCHR], +[ + dnl Persuade glibc <string.h> to declare memrchr(). + AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS]) + + AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS]) + AC_CHECK_DECLS_ONCE([memrchr]) + if test $ac_cv_have_decl_memrchr = no; then + HAVE_DECL_MEMRCHR=0 + fi + + AC_CHECK_FUNCS([memrchr]) +]) + +# Prerequisites of lib/memrchr.c. +AC_DEFUN([gl_PREREQ_MEMRCHR], [:]) === added file 'm4/string_h.m4' --- m4/string_h.m4 1970-01-01 00:00:00 +0000 +++ m4/string_h.m4 2013-02-09 03:12:48 +0000 @@ -0,0 +1,120 @@ +# Configure a GNU-like replacement for <string.h>. + +# Copyright (C) 2007-2013 Free Software Foundation, Inc. +# This file is free software; the Free Software Foundation +# gives unlimited permission to copy and/or distribute it, +# with or without modifications, as long as this notice is preserved. + +# serial 21 + +# Written by Paul Eggert. + +AC_DEFUN([gl_HEADER_STRING_H], +[ + dnl Use AC_REQUIRE here, so that the default behavior below is expanded + dnl once only, before all statements that occur in other macros. 
+ AC_REQUIRE([gl_HEADER_STRING_H_BODY]) +]) + +AC_DEFUN([gl_HEADER_STRING_H_BODY], +[ + AC_REQUIRE([AC_C_RESTRICT]) + AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS]) + gl_NEXT_HEADERS([string.h]) + + dnl Check for declarations of anything we want to poison if the + dnl corresponding gnulib module is not in use, and which is not + dnl guaranteed by C89. + gl_WARN_ON_USE_PREPARE([[#include <string.h> + ]], + [ffsl ffsll memmem mempcpy memrchr rawmemchr stpcpy stpncpy strchrnul + strdup strncat strndup strnlen strpbrk strsep strcasestr strtok_r + strerror_r strsignal strverscmp]) +]) + +AC_DEFUN([gl_STRING_MODULE_INDICATOR], +[ + dnl Use AC_REQUIRE here, so that the default settings are expanded once only. + AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS]) + gl_MODULE_INDICATOR_SET_VARIABLE([$1]) + dnl Define it also as a C macro, for the benefit of the unit tests. + gl_MODULE_INDICATOR_FOR_TESTS([$1]) +]) + +AC_DEFUN([gl_HEADER_STRING_H_DEFAULTS], +[ + GNULIB_FFSL=0; AC_SUBST([GNULIB_FFSL]) + GNULIB_FFSLL=0; AC_SUBST([GNULIB_FFSLL]) + GNULIB_MEMCHR=0; AC_SUBST([GNULIB_MEMCHR]) + GNULIB_MEMMEM=0; AC_SUBST([GNULIB_MEMMEM]) + GNULIB_MEMPCPY=0; AC_SUBST([GNULIB_MEMPCPY]) + GNULIB_MEMRCHR=0; AC_SUBST([GNULIB_MEMRCHR]) + GNULIB_RAWMEMCHR=0; AC_SUBST([GNULIB_RAWMEMCHR]) + GNULIB_STPCPY=0; AC_SUBST([GNULIB_STPCPY]) + GNULIB_STPNCPY=0; AC_SUBST([GNULIB_STPNCPY]) + GNULIB_STRCHRNUL=0; AC_SUBST([GNULIB_STRCHRNUL]) + GNULIB_STRDUP=0; AC_SUBST([GNULIB_STRDUP]) + GNULIB_STRNCAT=0; AC_SUBST([GNULIB_STRNCAT]) + GNULIB_STRNDUP=0; AC_SUBST([GNULIB_STRNDUP]) + GNULIB_STRNLEN=0; AC_SUBST([GNULIB_STRNLEN]) + GNULIB_STRPBRK=0; AC_SUBST([GNULIB_STRPBRK]) + GNULIB_STRSEP=0; AC_SUBST([GNULIB_STRSEP]) + GNULIB_STRSTR=0; AC_SUBST([GNULIB_STRSTR]) + GNULIB_STRCASESTR=0; AC_SUBST([GNULIB_STRCASESTR]) + GNULIB_STRTOK_R=0; AC_SUBST([GNULIB_STRTOK_R]) + GNULIB_MBSLEN=0; AC_SUBST([GNULIB_MBSLEN]) + GNULIB_MBSNLEN=0; AC_SUBST([GNULIB_MBSNLEN]) + GNULIB_MBSCHR=0; AC_SUBST([GNULIB_MBSCHR]) + GNULIB_MBSRCHR=0; 
AC_SUBST([GNULIB_MBSRCHR]) + GNULIB_MBSSTR=0; AC_SUBST([GNULIB_MBSSTR]) + GNULIB_MBSCASECMP=0; AC_SUBST([GNULIB_MBSCASECMP]) + GNULIB_MBSNCASECMP=0; AC_SUBST([GNULIB_MBSNCASECMP]) + GNULIB_MBSPCASECMP=0; AC_SUBST([GNULIB_MBSPCASECMP]) + GNULIB_MBSCASESTR=0; AC_SUBST([GNULIB_MBSCASESTR]) + GNULIB_MBSCSPN=0; AC_SUBST([GNULIB_MBSCSPN]) + GNULIB_MBSPBRK=0; AC_SUBST([GNULIB_MBSPBRK]) + GNULIB_MBSSPN=0; AC_SUBST([GNULIB_MBSSPN]) + GNULIB_MBSSEP=0; AC_SUBST([GNULIB_MBSSEP]) + GNULIB_MBSTOK_R=0; AC_SUBST([GNULIB_MBSTOK_R]) + GNULIB_STRERROR=0; AC_SUBST([GNULIB_STRERROR]) + GNULIB_STRERROR_R=0; AC_SUBST([GNULIB_STRERROR_R]) + GNULIB_STRSIGNAL=0; AC_SUBST([GNULIB_STRSIGNAL]) + GNULIB_STRVERSCMP=0; AC_SUBST([GNULIB_STRVERSCMP]) + HAVE_MBSLEN=0; AC_SUBST([HAVE_MBSLEN]) + dnl Assume proper GNU behavior unless another module says otherwise. + HAVE_FFSL=1; AC_SUBST([HAVE_FFSL]) + HAVE_FFSLL=1; AC_SUBST([HAVE_FFSLL]) + HAVE_MEMCHR=1; AC_SUBST([HAVE_MEMCHR]) + HAVE_DECL_MEMMEM=1; AC_SUBST([HAVE_DECL_MEMMEM]) + HAVE_MEMPCPY=1; AC_SUBST([HAVE_MEMPCPY]) + HAVE_DECL_MEMRCHR=1; AC_SUBST([HAVE_DECL_MEMRCHR]) + HAVE_RAWMEMCHR=1; AC_SUBST([HAVE_RAWMEMCHR]) + HAVE_STPCPY=1; AC_SUBST([HAVE_STPCPY]) + HAVE_STPNCPY=1; AC_SUBST([HAVE_STPNCPY]) + HAVE_STRCHRNUL=1; AC_SUBST([HAVE_STRCHRNUL]) + HAVE_DECL_STRDUP=1; AC_SUBST([HAVE_DECL_STRDUP]) + HAVE_DECL_STRNDUP=1; AC_SUBST([HAVE_DECL_STRNDUP]) + HAVE_DECL_STRNLEN=1; AC_SUBST([HAVE_DECL_STRNLEN]) + HAVE_STRPBRK=1; AC_SUBST([HAVE_STRPBRK]) + HAVE_STRSEP=1; AC_SUBST([HAVE_STRSEP]) + HAVE_STRCASESTR=1; AC_SUBST([HAVE_STRCASESTR]) + HAVE_DECL_STRTOK_R=1; AC_SUBST([HAVE_DECL_STRTOK_R]) + HAVE_DECL_STRERROR_R=1; AC_SUBST([HAVE_DECL_STRERROR_R]) + HAVE_DECL_STRSIGNAL=1; AC_SUBST([HAVE_DECL_STRSIGNAL]) + HAVE_STRVERSCMP=1; AC_SUBST([HAVE_STRVERSCMP]) + REPLACE_MEMCHR=0; AC_SUBST([REPLACE_MEMCHR]) + REPLACE_MEMMEM=0; AC_SUBST([REPLACE_MEMMEM]) + REPLACE_STPNCPY=0; AC_SUBST([REPLACE_STPNCPY]) + REPLACE_STRDUP=0; AC_SUBST([REPLACE_STRDUP]) + 
REPLACE_STRSTR=0; AC_SUBST([REPLACE_STRSTR]) + REPLACE_STRCASESTR=0; AC_SUBST([REPLACE_STRCASESTR]) + REPLACE_STRCHRNUL=0; AC_SUBST([REPLACE_STRCHRNUL]) + REPLACE_STRERROR=0; AC_SUBST([REPLACE_STRERROR]) + REPLACE_STRERROR_R=0; AC_SUBST([REPLACE_STRERROR_R]) + REPLACE_STRNCAT=0; AC_SUBST([REPLACE_STRNCAT]) + REPLACE_STRNDUP=0; AC_SUBST([REPLACE_STRNDUP]) + REPLACE_STRNLEN=0; AC_SUBST([REPLACE_STRNLEN]) + REPLACE_STRSIGNAL=0; AC_SUBST([REPLACE_STRSIGNAL]) + REPLACE_STRTOK_R=0; AC_SUBST([REPLACE_STRTOK_R]) + UNDEFINE_STRTOK_R=0; AC_SUBST([UNDEFINE_STRTOK_R]) +]) === modified file 'src/ChangeLog' --- src/ChangeLog 2013-02-08 17:42:09 +0000 +++ src/ChangeLog 2013-02-09 03:12:48 +0000 @@ -1,3 +1,11 @@ +2013-02-09 Paul Eggert <eggert@cs.ucla.edu> + + Tune redisplay by using memchr and memrchr. + * search.c (scan_buffer): Omit first arg TARGET, as it's always '\n'. + All callers changed. + (scan_buffer, scan_newline): Use memchr and memrchr rather than + scanning byte-by-byte. + 2013-02-08 Stefan Monnier <monnier@iro.umontreal.ca> * lread.c (skip_dyn_bytes): New function (bug#12598). === modified file 'src/editfns.c' --- src/editfns.c 2013-01-23 20:07:28 +0000 +++ src/editfns.c 2013-02-09 03:12:48 +0000 @@ -735,8 +735,7 @@ /* This is the ONLY_IN_LINE case, check that NEW_POS and FIELD_BOUND are on the same line by seeing whether there's an intervening newline or not. */ - || (scan_buffer ('\n', - XFASTINT (new_pos), XFASTINT (field_bound), + || (scan_buffer (XFASTINT (new_pos), XFASTINT (field_bound), fwd ? -1 : 1, &shortage, 1), shortage != 0))) /* Constrain NEW_POS to FIELD_BOUND. 
*/ === modified file 'src/lisp.h' --- src/lisp.h 2013-02-08 05:28:52 +0000 +++ src/lisp.h 2013-02-09 03:12:48 +0000 @@ -3338,7 +3338,7 @@ extern ptrdiff_t fast_string_match_ignore_case (Lisp_Object, Lisp_Object); extern ptrdiff_t fast_looking_at (Lisp_Object, ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t, Lisp_Object); -extern ptrdiff_t scan_buffer (int, ptrdiff_t, ptrdiff_t, ptrdiff_t, +extern ptrdiff_t scan_buffer (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t *, bool); extern EMACS_INT scan_newline (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT, bool); === modified file 'src/search.c' --- src/search.c 2013-02-08 14:44:53 +0000 +++ src/search.c 2013-02-09 03:12:48 +0000 @@ -619,7 +619,7 @@ } \f -/* Search for COUNT instances of the character TARGET between START and END. +/* Search for COUNT newlines between START and END. If COUNT is positive, search forwards; END must be >= START. If COUNT is negative, search backwards for the -COUNTth instance; @@ -634,13 +634,13 @@ this is not the same as the usual convention for Emacs motion commands. If we don't find COUNT instances before reaching END, set *SHORTAGE - to the number of TARGETs left unfound, and return END. + to the number of newlines left unfound, and return END. If ALLOW_QUIT, set immediate_quit. That's good to do except when inside redisplay. */ ptrdiff_t -scan_buffer (int target, ptrdiff_t start, ptrdiff_t end, +scan_buffer (ptrdiff_t start, ptrdiff_t end, ptrdiff_t count, ptrdiff_t *shortage, bool allow_quit) { struct region_cache *newline_cache; @@ -656,7 +656,7 @@ else { direction = -1; - if (!end) + if (!end) end = BEGV, end_byte = BEGV_BYTE; } if (end_byte == -1) @@ -684,7 +684,7 @@ /* If we're looking for a newline, consult the newline cache to see where we can avoid some scanning. */ - if (target == '\n' && newline_cache) + if (newline_cache) { ptrdiff_t next_change; immediate_quit = 0; @@ -726,29 +726,28 @@ unsigned char *scan_start = cursor; /* The dumb loop. 
*/ - while (*cursor != target && ++cursor < ceiling_addr) - ; + unsigned char *nl = memchr (cursor, '\n', ceiling_addr - cursor); /* If we're looking for newlines, cache the fact that the region from start to cursor is free of them. */ - if (target == '\n' && newline_cache) + if (newline_cache) know_region_cache (current_buffer, newline_cache, BYTE_TO_CHAR (start_byte + scan_start - base), - BYTE_TO_CHAR (start_byte + cursor - base)); - - /* Did we find the target character? */ - if (cursor < ceiling_addr) - { - if (--count == 0) - { - immediate_quit = 0; - return BYTE_TO_CHAR (start_byte + cursor - base + 1); - } - cursor++; - } + BYTE_TO_CHAR (start_byte + (nl ? nl : ceiling_addr) - base)); + + /* Did we find the newline? */ + if (! nl) + break; + + if (--count == 0) + { + immediate_quit = 0; + return BYTE_TO_CHAR (start_byte + nl - base + 1); + } + cursor = nl + 1; } - start = BYTE_TO_CHAR (start_byte + cursor - base); + start = BYTE_TO_CHAR (start_byte + ceiling_addr - base); } } else @@ -760,7 +759,7 @@ ptrdiff_t tem; /* Consult the newline cache, if appropriate. */ - if (target == '\n' && newline_cache) + if (newline_cache) { ptrdiff_t next_change; immediate_quit = 0; @@ -795,30 +794,29 @@ while (cursor >= ceiling_addr) { unsigned char *scan_start = cursor; - - while (*cursor != target && --cursor >= ceiling_addr) - ; + unsigned char *nl = memrchr (ceiling_addr, '\n', + cursor + 1 - ceiling_addr); /* If we're looking for newlines, cache the fact that the region from after the cursor to start is free of them. */ - if (target == '\n' && newline_cache) + if (newline_cache) know_region_cache (current_buffer, newline_cache, - BYTE_TO_CHAR (start_byte + cursor - base), + BYTE_TO_CHAR (start_byte + (nl ? nl : ceiling_addr - 1) - base), BYTE_TO_CHAR (start_byte + scan_start - base)); - /* Did we find the target character? 
*/ - if (cursor >= ceiling_addr) - { - if (++count >= 0) - { - immediate_quit = 0; - return BYTE_TO_CHAR (start_byte + cursor - base); - } - cursor--; - } + /* Did we find the newline? */ + if (! nl) + break; + + if (++count >= 0) + { + immediate_quit = 0; + return BYTE_TO_CHAR (start_byte + nl - base); + } + cursor = nl - 1; } - start = BYTE_TO_CHAR (start_byte + cursor - base); + start = BYTE_TO_CHAR (start_byte + (ceiling_addr - 1) - base); } } @@ -874,29 +872,25 @@ ceiling = min (limit_byte - 1, ceiling); ceiling_addr = BYTE_POS_ADDR (ceiling) + 1; base = (cursor = BYTE_POS_ADDR (start_byte)); - while (1) + + do { - while (*cursor != '\n' && ++cursor != ceiling_addr) - ; - - if (cursor != ceiling_addr) + unsigned char *nl = memchr (cursor, '\n', ceiling_addr - cursor); + if (! nl) + break; + if (--count == 0) { - if (--count == 0) - { - immediate_quit = old_immediate_quit; - start_byte = start_byte + cursor - base + 1; - start = BYTE_TO_CHAR (start_byte); - TEMP_SET_PT_BOTH (start, start_byte); - return 0; - } - else - if (++cursor == ceiling_addr) - break; + immediate_quit = old_immediate_quit; + start_byte += nl - base + 1; + start = BYTE_TO_CHAR (start_byte); + TEMP_SET_PT_BOTH (start, start_byte); + return 0; } - else - break; + cursor = nl + 1; } - start_byte += cursor - base; + while (cursor < ceiling_addr); + + start_byte += ceiling_addr - base; } } else @@ -905,31 +899,28 @@ { ceiling = BUFFER_FLOOR_OF (start_byte - 1); ceiling = max (limit_byte, ceiling); - ceiling_addr = BYTE_POS_ADDR (ceiling) - 1; + ceiling_addr = BYTE_POS_ADDR (ceiling); base = (cursor = BYTE_POS_ADDR (start_byte - 1) + 1); while (1) { - while (--cursor != ceiling_addr && *cursor != '\n') - ; + unsigned char *nl = memrchr (ceiling_addr, '\n', + cursor - ceiling_addr); + if (! nl) + break; - if (cursor != ceiling_addr) + if (++count == 0) { - if (++count == 0) - { - immediate_quit = old_immediate_quit; - /* Return the position AFTER the match we found. 
*/ - start_byte = start_byte + cursor - base + 1; - start = BYTE_TO_CHAR (start_byte); - TEMP_SET_PT_BOTH (start, start_byte); - return 0; - } + immediate_quit = old_immediate_quit; + /* Return the position AFTER the match we found. */ + start_byte += nl - base + 1; + start = BYTE_TO_CHAR (start_byte); + TEMP_SET_PT_BOTH (start, start_byte); + return 0; } - else - break; + + cursor = nl; } - /* Here we add 1 to compensate for the last decrement - of CURSOR, which took it past the valid range. */ - start_byte += cursor - base + 1; + start_byte += ceiling_addr - base; } } @@ -942,7 +933,7 @@ ptrdiff_t find_next_newline_no_quit (ptrdiff_t from, ptrdiff_t cnt) { - return scan_buffer ('\n', from, 0, cnt, (ptrdiff_t *) 0, 0); + return scan_buffer (from, 0, cnt, (ptrdiff_t *) 0, 0); } /* Like find_next_newline, but returns position before the newline, @@ -953,7 +944,7 @@ find_before_next_newline (ptrdiff_t from, ptrdiff_t to, ptrdiff_t cnt) { ptrdiff_t shortage; - ptrdiff_t pos = scan_buffer ('\n', from, to, cnt, &shortage, 1); + ptrdiff_t pos = scan_buffer (from, to, cnt, &shortage, 1); if (shortage == 0) pos--; ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi
  2013-02-09  3:34 ` Paul Eggert
@ 2013-02-09  8:46 ` Eli Zaretskii
  2013-02-09  9:05   ` Paul Eggert
  0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2013-02-09 8:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: dmantipov, emacs-devel

> Date: Fri, 08 Feb 2013 19:34:20 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: Dmitry Antipov <dmantipov@yandex.ru>, emacs-devel@gnu.org
>
> On 02/08/2013 08:52 AM, Eli Zaretskii wrote:
> >> > So, ~90% of time spent in scan_buffer is:
> >> >
> >> > 799 while (*cursor != target && --cursor >= ceiling_addr)
> >> > 800 ;
>
> > Which cannot be optimized.
>
> It can be sped up somewhat, by using memrchr.
>
> This won't solve these performance issues, but it helps:
> on my platform (x86-64 Ubuntu 12.10) I ran Dmitry's scroll-both benchmark
> <http://lists.gnu.org/archive/html/emacs-devel/2013-02/msg00147.html>
> on a real file (the trunk's src/xdisp.c), and it was 25% faster overall
> (1.19 seconds versus 1.49 seconds) when I used memrchr there
> and memchr for forward searches.

25% faster is still terribly slow for redisplay.  xdisp.c doesn't have
a problem in the first place (1.49 sec divided by 100 is 15 msec, not
something users will notice, let alone the difference between 15 and
11 msec).  And for files with long lines, these 25% will not solve
anything, since 6 sec _per_scroll_, give or take 25%, is intolerably
slow.

I don't think we should make this optimization, because it optimizes
in the wrong place.  The problem is not with scan_buffer, the problem
is that it (actually, its callers) get called way too much.  This is a
classic case where solving a slow operation needs a radical change in
the algorithms, not loophole optimizations.

> Most of the attached patch is boilerplate taken unmodified from gnulib,
> to support memrchr on non-GNU platforms.  The key part of the change is
> at the end, to src/search.c.

I don't understand why you removed the TARGET argument of scan_buffer.
The fact that all its callers use it for looking for a newline doesn't
mean it cannot be used otherwise.  At the very least, the name of the
function should be changed to reflect the change.

^ permalink raw reply	[flat|nested] 42+ messages in thread
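The backward loop quoted in the message above (`while (*cursor != target && --cursor >= ceiling_addr)`) and the proposed memrchr replacement can be compared in isolation. What follows is only a minimal sketch, not the Emacs code: the names last_newline_bytewise and last_newline_memrchr are made up for illustration, the buffer is a flat array with no gap and no newline cache, and memrchr is assumed to be available as a GNU extension (the patch pulls in a gnulib replacement for platforms that lack it).

```c
#define _GNU_SOURCE   /* memrchr is a GNU extension, not ISO C */
#include <stddef.h>
#include <string.h>

/* Byte-by-byte scan: index of the last '\n' in BUF[0..LEN-1],
   or -1 if there is none.  Hypothetical helper, for comparison only.  */
static ptrdiff_t
last_newline_bytewise (const unsigned char *buf, size_t len)
{
  for (size_t i = len; i > 0; i--)
    if (buf[i - 1] == '\n')
      return (ptrdiff_t) (i - 1);
  return -1;
}

/* Same contract, but the search is delegated to memrchr, which a
   C library is free to implement a word at a time.  */
static ptrdiff_t
last_newline_memrchr (const unsigned char *buf, size_t len)
{
  const unsigned char *nl = memrchr (buf, '\n', len);
  return nl ? nl - buf : -1;
}
```

Both helpers return the same index for any input; only the way the bytes are examined differs. This says nothing about how often the callers run the scan, which is the objection raised in the reply above.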
* Re: Long lines and bidi
  2013-02-09  8:46 ` Eli Zaretskii
@ 2013-02-09  9:05 ` Paul Eggert
  2013-02-09  9:33   ` Eli Zaretskii
  2013-02-09 10:01   ` Eli Zaretskii
  0 siblings, 2 replies; 42+ messages in thread
From: Paul Eggert @ 2013-02-09 9:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmantipov, emacs-devel

On 02/09/2013 12:46 AM, Eli Zaretskii wrote:
> 25% faster is still terribly slow for redisplay.

Yes, as I said, it doesn't solve the performance problem.  Still, it
doesn't complicate the code, and it significantly improves speed in
code likely to be executed often, so it seems worth doing in its own
right.

> I don't understand why you removed the TARGET argument of
> scan_buffer.  The fact that all its callers use it for looking for a
> newline doesn't mean it cannot be used otherwise.

If we ever need that ability we can put it back in.  In the meantime
there's no need for the generality and I found it confusing.

> At the very least, the name of the function should be
> changed to reflect the change.

Sure, what name do you suggest?  scan_newline is already taken.
Perhaps scan_buffer_newline?

This area is a bit messed up, unfortunately -- scan_newline has
comments saying that it looks for carriage return (!) but it does not
in fact do that.

^ permalink raw reply	[flat|nested] 42+ messages in thread
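Once the TARGET argument is gone, the forward half of the scan amounts to repeated memchr calls. The sketch below assumes a flat char array and uses a hypothetical name, find_newline_sketch. It follows only the contract documented in the patch (return the position just past the COUNTth newline, or END with *SHORTAGE set to the number of newlines not found) and leaves out the buffer gap, the newline cache, immediate_quit handling and the backward direction.

```c
#include <stddef.h>
#include <string.h>

/* Search BUF[START..END-1] forwards for COUNT newlines (COUNT > 0).
   Return the offset just past the COUNTth newline.  If fewer than
   COUNT newlines exist, return END.  When SHORTAGE is non-null, store
   in it the number of newlines left unfound (0 on success).
   Illustrative only; this is not the Emacs function.  */
static size_t
find_newline_sketch (const char *buf, size_t start, size_t end,
                     size_t count, size_t *shortage)
{
  const char *cursor = buf + start;
  const char *limit = buf + end;

  while (count > 0 && cursor < limit)
    {
      /* One library call replaces the per-byte comparison loop.  */
      const char *nl = memchr (cursor, '\n', (size_t) (limit - cursor));
      if (!nl)
        break;
      cursor = nl + 1;   /* resume just past the newline found */
      count--;
    }

  if (shortage)
    *shortage = count;
  return count == 0 ? (size_t) (cursor - buf) : end;
}
```

For the eight-byte text "a\nbb\nccc", asking for one newline from offset 0 yields offset 2, and asking for three yields END (8) with a shortage of 1.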
* Re: Long lines and bidi
  2013-02-09  9:05 ` Paul Eggert
@ 2013-02-09  9:33 ` Eli Zaretskii
  2013-02-11  2:33   ` Paul Eggert
  2013-02-09 10:01 ` Eli Zaretskii
  1 sibling, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2013-02-09 9:33 UTC (permalink / raw)
  To: Paul Eggert; +Cc: dmantipov, emacs-devel

> Date: Sat, 09 Feb 2013 01:05:01 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> Cc: dmantipov@yandex.ru, emacs-devel@gnu.org
>
> > At the very least, the name of the function should be
> > changed to reflect the change.
>
> Sure, what name do you suggest?  scan_newline is already taken.
> Perhaps scan_buffer_newline?

I'd use find_newline, since 2 out of 3 of its callers are
find_next_newline_no_quit and find_before_next_newline.

> This area is a bit messed up, unfortunately -- scan_newline has
> comments saying that it looks for carriage return (!) but
> it does not in fact do that.

People tend to forget updating the commentary when they change code.

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: Long lines and bidi
  2013-02-09  9:33 ` Eli Zaretskii
@ 2013-02-11  2:33 ` Paul Eggert
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Eggert @ 2013-02-11 2:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmantipov, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 354 bytes --]

On 02/09/2013 01:33 AM, Eli Zaretskii wrote:
> I'd use find_newline, since 2 out of 3 of its callers are
> find_next_newline_no_quit and find_before_next_newline.

OK, thanks, attached is a revised patch to do that.  It also removes
the confusing comments about carriage return, and identifies two or
three more places where it's clearer to use memchr.

[-- Attachment #2: memchr.txt --]
[-- Type: text/plain, Size: 82397 bytes --]

=== modified file '.bzrignore'
--- .bzrignore	2013-02-01 06:30:51 +0000
+++ .bzrignore	2013-02-09 03:12:48 +0000
@@ -97,6 +97,7 @@
 lib/stdio.h
 lib/stdint.h
 lib/stdlib.h
+lib/string.h
 lib/sys/
 lib/SYS
 lib/time.h

=== modified file 'ChangeLog'
--- ChangeLog	2013-02-11 00:55:26 +0000
+++ ChangeLog	2013-02-11 01:28:13 +0000
@@ -1,3 +1,11 @@
+2013-02-11 Paul Eggert <eggert@cs.ucla.edu>
+
+	Tune by using memchr and memrchr.
+	* .bzrignore: Add string.h.
+	* lib/gnulib.mk, m4/gnulib-comp.m4: Regenerate.
+	* lib/memrchr.c, lib/string.in.h, m4/memrchr.m4, m4/string_h.m4:
+	New files, from gnulib.
+
 2013-02-11 Glenn Morris <rgm@gnu.org>
 
 	* configure.ac (emacs_config_options): Record some env vars.

=== modified file 'admin/ChangeLog'
--- admin/ChangeLog	2013-02-01 06:30:51 +0000
+++ admin/ChangeLog	2013-02-11 01:28:59 +0000
@@ -1,3 +1,8 @@
+2013-02-09 Paul Eggert <eggert@cs.ucla.edu>
+
+	Tune by using memchr and memrchr.
+	* merge-gnulib (GNULIB_MODULES): Add memrchr.
+
 2013-02-01 Paul Eggert <eggert@cs.ucla.edu>
 
 	Use fdopendir, fstatat and readlinkat, for efficiency (Bug#13539).
=== modified file 'admin/merge-gnulib' --- admin/merge-gnulib 2013-02-01 06:30:51 +0000 +++ admin/merge-gnulib 2013-02-09 03:12:48 +0000 @@ -31,7 +31,8 @@ dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat - manywarnings mktime pselect pthread_sigmask putenv readlink readlinkat + manywarnings memrchr mktime + pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens === modified file 'lib/gnulib.mk' --- lib/gnulib.mk 2013-02-08 23:37:17 +0000 +++ lib/gnulib.mk 2013-02-09 03:12:48 +0000 @@ -21,7 +21,7 @@ # the same distribution terms as the rest of that program. # # Generated by gnulib-tool. -# Reproduce by: gnulib-tool --import --dir=. --lib=libgnu --source-base=lib --m4-base=m4 --doc-base=doc --tests-base=tests --aux-dir=build-aux --avoid=dup --avoid=errno --avoid=fchdir --avoid=fcntl --avoid=fstat --avoid=malloc-posix --avoid=msvc-inval --avoid=msvc-nothrow --avoid=open --avoid=openat-die --avoid=opendir --avoid=raise --avoid=save-cwd --avoid=select --avoid=sigprocmask --avoid=sys_types --avoid=threadlib --makefile-name=gnulib.mk --conditional-dependencies --no-libtool --macro-prefix=gl --no-vc-files alloca-opt c-ctype c-strcase careadlinkat close-stream crypto/md5 crypto/sha1 crypto/sha256 crypto/sha512 dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat manywarnings mktime pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens warnings +# Reproduce by: gnulib-tool --import --dir=. 
--lib=libgnu --source-base=lib --m4-base=m4 --doc-base=doc --tests-base=tests --aux-dir=build-aux --avoid=dup --avoid=errno --avoid=fchdir --avoid=fcntl --avoid=fstat --avoid=malloc-posix --avoid=msvc-inval --avoid=msvc-nothrow --avoid=open --avoid=openat-die --avoid=opendir --avoid=raise --avoid=save-cwd --avoid=select --avoid=sigprocmask --avoid=sys_types --avoid=threadlib --makefile-name=gnulib.mk --conditional-dependencies --no-libtool --macro-prefix=gl --no-vc-files alloca-opt c-ctype c-strcase careadlinkat close-stream crypto/md5 crypto/sha1 crypto/sha256 crypto/sha512 dtoastr dtotimespec dup2 environ execinfo faccessat fcntl-h fdopendir filemode fstatat getloadavg getopt-gnu gettime gettimeofday ignore-value intprops largefile lstat manywarnings memrchr mktime pselect pthread_sigmask putenv readlink readlinkat sig2str socklen stat-time stdalign stdarg stdbool stdio strftime strtoimax strtoumax symlink sys_stat sys_time time timer-time timespec-add timespec-sub unsetenv utimens warnings MOSTLYCLEANFILES += core *.stackdump @@ -480,6 +480,15 @@ ## end gnulib module lstat +## begin gnulib module memrchr + + +EXTRA_DIST += memrchr.c + +EXTRA_libgnu_a_SOURCES += memrchr.c + +## end gnulib module memrchr + ## begin gnulib module mktime @@ -1105,6 +1114,106 @@ ## end gnulib module strftime +## begin gnulib module string + +BUILT_SOURCES += string.h + +# We need the following in order to create <string.h> when the system +# doesn't have one that works with the given compiler. +string.h: string.in.h $(top_builddir)/config.status $(CXXDEFS_H) $(ARG_NONNULL_H) $(WARN_ON_USE_H) + $(AM_V_GEN)rm -f $@-t $@ && \ + { echo '/* DO NOT EDIT! GENERATED AUTOMATICALLY! 
*/' && \ + sed -e 's|@''GUARD_PREFIX''@|GL|g' \ + -e 's|@''INCLUDE_NEXT''@|$(INCLUDE_NEXT)|g' \ + -e 's|@''PRAGMA_SYSTEM_HEADER''@|@PRAGMA_SYSTEM_HEADER@|g' \ + -e 's|@''PRAGMA_COLUMNS''@|@PRAGMA_COLUMNS@|g' \ + -e 's|@''NEXT_STRING_H''@|$(NEXT_STRING_H)|g' \ + -e 's/@''GNULIB_FFSL''@/$(GNULIB_FFSL)/g' \ + -e 's/@''GNULIB_FFSLL''@/$(GNULIB_FFSLL)/g' \ + -e 's/@''GNULIB_MBSLEN''@/$(GNULIB_MBSLEN)/g' \ + -e 's/@''GNULIB_MBSNLEN''@/$(GNULIB_MBSNLEN)/g' \ + -e 's/@''GNULIB_MBSCHR''@/$(GNULIB_MBSCHR)/g' \ + -e 's/@''GNULIB_MBSRCHR''@/$(GNULIB_MBSRCHR)/g' \ + -e 's/@''GNULIB_MBSSTR''@/$(GNULIB_MBSSTR)/g' \ + -e 's/@''GNULIB_MBSCASECMP''@/$(GNULIB_MBSCASECMP)/g' \ + -e 's/@''GNULIB_MBSNCASECMP''@/$(GNULIB_MBSNCASECMP)/g' \ + -e 's/@''GNULIB_MBSPCASECMP''@/$(GNULIB_MBSPCASECMP)/g' \ + -e 's/@''GNULIB_MBSCASESTR''@/$(GNULIB_MBSCASESTR)/g' \ + -e 's/@''GNULIB_MBSCSPN''@/$(GNULIB_MBSCSPN)/g' \ + -e 's/@''GNULIB_MBSPBRK''@/$(GNULIB_MBSPBRK)/g' \ + -e 's/@''GNULIB_MBSSPN''@/$(GNULIB_MBSSPN)/g' \ + -e 's/@''GNULIB_MBSSEP''@/$(GNULIB_MBSSEP)/g' \ + -e 's/@''GNULIB_MBSTOK_R''@/$(GNULIB_MBSTOK_R)/g' \ + -e 's/@''GNULIB_MEMCHR''@/$(GNULIB_MEMCHR)/g' \ + -e 's/@''GNULIB_MEMMEM''@/$(GNULIB_MEMMEM)/g' \ + -e 's/@''GNULIB_MEMPCPY''@/$(GNULIB_MEMPCPY)/g' \ + -e 's/@''GNULIB_MEMRCHR''@/$(GNULIB_MEMRCHR)/g' \ + -e 's/@''GNULIB_RAWMEMCHR''@/$(GNULIB_RAWMEMCHR)/g' \ + -e 's/@''GNULIB_STPCPY''@/$(GNULIB_STPCPY)/g' \ + -e 's/@''GNULIB_STPNCPY''@/$(GNULIB_STPNCPY)/g' \ + -e 's/@''GNULIB_STRCHRNUL''@/$(GNULIB_STRCHRNUL)/g' \ + -e 's/@''GNULIB_STRDUP''@/$(GNULIB_STRDUP)/g' \ + -e 's/@''GNULIB_STRNCAT''@/$(GNULIB_STRNCAT)/g' \ + -e 's/@''GNULIB_STRNDUP''@/$(GNULIB_STRNDUP)/g' \ + -e 's/@''GNULIB_STRNLEN''@/$(GNULIB_STRNLEN)/g' \ + -e 's/@''GNULIB_STRPBRK''@/$(GNULIB_STRPBRK)/g' \ + -e 's/@''GNULIB_STRSEP''@/$(GNULIB_STRSEP)/g' \ + -e 's/@''GNULIB_STRSTR''@/$(GNULIB_STRSTR)/g' \ + -e 's/@''GNULIB_STRCASESTR''@/$(GNULIB_STRCASESTR)/g' \ + -e 's/@''GNULIB_STRTOK_R''@/$(GNULIB_STRTOK_R)/g' \ + -e 
's/@''GNULIB_STRERROR''@/$(GNULIB_STRERROR)/g' \ + -e 's/@''GNULIB_STRERROR_R''@/$(GNULIB_STRERROR_R)/g' \ + -e 's/@''GNULIB_STRSIGNAL''@/$(GNULIB_STRSIGNAL)/g' \ + -e 's/@''GNULIB_STRVERSCMP''@/$(GNULIB_STRVERSCMP)/g' \ + < $(srcdir)/string.in.h | \ + sed -e 's|@''HAVE_FFSL''@|$(HAVE_FFSL)|g' \ + -e 's|@''HAVE_FFSLL''@|$(HAVE_FFSLL)|g' \ + -e 's|@''HAVE_MBSLEN''@|$(HAVE_MBSLEN)|g' \ + -e 's|@''HAVE_MEMCHR''@|$(HAVE_MEMCHR)|g' \ + -e 's|@''HAVE_DECL_MEMMEM''@|$(HAVE_DECL_MEMMEM)|g' \ + -e 's|@''HAVE_MEMPCPY''@|$(HAVE_MEMPCPY)|g' \ + -e 's|@''HAVE_DECL_MEMRCHR''@|$(HAVE_DECL_MEMRCHR)|g' \ + -e 's|@''HAVE_RAWMEMCHR''@|$(HAVE_RAWMEMCHR)|g' \ + -e 's|@''HAVE_STPCPY''@|$(HAVE_STPCPY)|g' \ + -e 's|@''HAVE_STPNCPY''@|$(HAVE_STPNCPY)|g' \ + -e 's|@''HAVE_STRCHRNUL''@|$(HAVE_STRCHRNUL)|g' \ + -e 's|@''HAVE_DECL_STRDUP''@|$(HAVE_DECL_STRDUP)|g' \ + -e 's|@''HAVE_DECL_STRNDUP''@|$(HAVE_DECL_STRNDUP)|g' \ + -e 's|@''HAVE_DECL_STRNLEN''@|$(HAVE_DECL_STRNLEN)|g' \ + -e 's|@''HAVE_STRPBRK''@|$(HAVE_STRPBRK)|g' \ + -e 's|@''HAVE_STRSEP''@|$(HAVE_STRSEP)|g' \ + -e 's|@''HAVE_STRCASESTR''@|$(HAVE_STRCASESTR)|g' \ + -e 's|@''HAVE_DECL_STRTOK_R''@|$(HAVE_DECL_STRTOK_R)|g' \ + -e 's|@''HAVE_DECL_STRERROR_R''@|$(HAVE_DECL_STRERROR_R)|g' \ + -e 's|@''HAVE_DECL_STRSIGNAL''@|$(HAVE_DECL_STRSIGNAL)|g' \ + -e 's|@''HAVE_STRVERSCMP''@|$(HAVE_STRVERSCMP)|g' \ + -e 's|@''REPLACE_STPNCPY''@|$(REPLACE_STPNCPY)|g' \ + -e 's|@''REPLACE_MEMCHR''@|$(REPLACE_MEMCHR)|g' \ + -e 's|@''REPLACE_MEMMEM''@|$(REPLACE_MEMMEM)|g' \ + -e 's|@''REPLACE_STRCASESTR''@|$(REPLACE_STRCASESTR)|g' \ + -e 's|@''REPLACE_STRCHRNUL''@|$(REPLACE_STRCHRNUL)|g' \ + -e 's|@''REPLACE_STRDUP''@|$(REPLACE_STRDUP)|g' \ + -e 's|@''REPLACE_STRSTR''@|$(REPLACE_STRSTR)|g' \ + -e 's|@''REPLACE_STRERROR''@|$(REPLACE_STRERROR)|g' \ + -e 's|@''REPLACE_STRERROR_R''@|$(REPLACE_STRERROR_R)|g' \ + -e 's|@''REPLACE_STRNCAT''@|$(REPLACE_STRNCAT)|g' \ + -e 's|@''REPLACE_STRNDUP''@|$(REPLACE_STRNDUP)|g' \ + -e 
's|@''REPLACE_STRNLEN''@|$(REPLACE_STRNLEN)|g' \ + -e 's|@''REPLACE_STRSIGNAL''@|$(REPLACE_STRSIGNAL)|g' \ + -e 's|@''REPLACE_STRTOK_R''@|$(REPLACE_STRTOK_R)|g' \ + -e 's|@''UNDEFINE_STRTOK_R''@|$(UNDEFINE_STRTOK_R)|g' \ + -e '/definitions of _GL_FUNCDECL_RPL/r $(CXXDEFS_H)' \ + -e '/definition of _GL_ARG_NONNULL/r $(ARG_NONNULL_H)' \ + -e '/definition of _GL_WARN_ON_USE/r $(WARN_ON_USE_H)'; \ + < $(srcdir)/string.in.h; \ + } > $@-t && \ + mv $@-t $@ +MOSTLYCLEANFILES += string.h string.h-t + +EXTRA_DIST += string.in.h + +## end gnulib module string + ## begin gnulib module strtoimax === added file 'lib/memrchr.c' --- lib/memrchr.c 1970-01-01 00:00:00 +0000 +++ lib/memrchr.c 2013-02-09 03:12:48 +0000 @@ -0,0 +1,161 @@ +/* memrchr -- find the last occurrence of a byte in a memory block + + Copyright (C) 1991, 1993, 1996-1997, 1999-2000, 2003-2013 Free Software + Foundation, Inc. + + Based on strlen implementation by Torbjorn Granlund (tege@sics.se), + with help from Dan Sahlin (dan@sics.se) and + commentary by Jim Blandy (jimb@ai.mit.edu); + adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu), + and implemented by Roland McGrath (roland@ai.mit.edu). + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. 
*/ + +#if defined _LIBC +# include <memcopy.h> +#else +# include <config.h> +# define reg_char char +#endif + +#include <string.h> +#include <limits.h> + +#undef __memrchr +#ifdef _LIBC +# undef memrchr +#endif + +#ifndef weak_alias +# define __memrchr memrchr +#endif + +/* Search no more than N bytes of S for C. */ +void * +__memrchr (void const *s, int c_in, size_t n) +{ + /* On 32-bit hardware, choosing longword to be a 32-bit unsigned + long instead of a 64-bit uintmax_t tends to give better + performance. On 64-bit hardware, unsigned long is generally 64 + bits already. Change this typedef to experiment with + performance. */ + typedef unsigned long int longword; + + const unsigned char *char_ptr; + const longword *longword_ptr; + longword repeated_one; + longword repeated_c; + unsigned reg_char c; + + c = (unsigned char) c_in; + + /* Handle the last few bytes by reading one byte at a time. + Do this until CHAR_PTR is aligned on a longword boundary. */ + for (char_ptr = (const unsigned char *) s + n; + n > 0 && (size_t) char_ptr % sizeof (longword) != 0; + --n) + if (*--char_ptr == c) + return (void *) char_ptr; + + longword_ptr = (const longword *) char_ptr; + + /* All these elucidatory comments refer to 4-byte longwords, + but the theory applies equally well to any size longwords. */ + + /* Compute auxiliary longword values: + repeated_one is a value which has a 1 in every byte. + repeated_c has c in every byte. */ + repeated_one = 0x01010101; + repeated_c = c | (c << 8); + repeated_c |= repeated_c << 16; + if (0xffffffffU < (longword) -1) + { + repeated_one |= repeated_one << 31 << 1; + repeated_c |= repeated_c << 31 << 1; + if (8 < sizeof (longword)) + { + size_t i; + + for (i = 64; i < sizeof (longword) * 8; i *= 2) + { + repeated_one |= repeated_one << i; + repeated_c |= repeated_c << i; + } + } + } + + /* Instead of the traditional loop which tests each byte, we will test a + longword at a time. 
The tricky part is testing if *any of the four* + bytes in the longword in question are equal to c. We first use an xor + with repeated_c. This reduces the task to testing whether *any of the + four* bytes in longword1 is zero. + + We compute tmp = + ((longword1 - repeated_one) & ~longword1) & (repeated_one << 7). + That is, we perform the following operations: + 1. Subtract repeated_one. + 2. & ~longword1. + 3. & a mask consisting of 0x80 in every byte. + Consider what happens in each byte: + - If a byte of longword1 is zero, step 1 and 2 transform it into 0xff, + and step 3 transforms it into 0x80. A carry can also be propagated + to more significant bytes. + - If a byte of longword1 is nonzero, let its lowest 1 bit be at + position k (0 <= k <= 7); so the lowest k bits are 0. After step 1, + the byte ends in a single bit of value 0 and k bits of value 1. + After step 2, the result is just k bits of value 1: 2^k - 1. After + step 3, the result is 0. And no carry is produced. + So, if longword1 has only non-zero bytes, tmp is zero. + Whereas if longword1 has a zero byte, call j the position of the least + significant zero byte. Then the result has a zero at positions 0, ..., + j-1 and a 0x80 at position j. We cannot predict the result at the more + significant bytes (positions j+1..3), but it does not matter since we + already have a non-zero bit at position 8*j+7. + + So, the test whether any byte in longword1 is zero is equivalent to + testing whether tmp is nonzero. */ + + while (n >= sizeof (longword)) + { + longword longword1 = *--longword_ptr ^ repeated_c; + + if ((((longword1 - repeated_one) & ~longword1) + & (repeated_one << 7)) != 0) + { + longword_ptr++; + break; + } + n -= sizeof (longword); + } + + char_ptr = (const unsigned char *) longword_ptr; + + /* At this point, we know that either n < sizeof (longword), or one of the + sizeof (longword) bytes starting at char_ptr is == c. 
On little-endian + machines, we could determine the first such byte without any further + memory accesses, just by looking at the tmp result from the last loop + iteration. But this does not work on big-endian machines. Choose code + that works in both cases. */ + + while (n-- > 0) + { + if (*--char_ptr == c) + return (void *) char_ptr; + } + + return NULL; +} +#ifdef weak_alias +weak_alias (__memrchr, memrchr) +#endif === added file 'lib/string.in.h' --- lib/string.in.h 1970-01-01 00:00:00 +0000 +++ lib/string.in.h 2013-02-09 03:12:48 +0000 @@ -0,0 +1,1029 @@ +/* A GNU-like <string.h>. + + Copyright (C) 1995-1996, 2001-2013 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. */ + +#ifndef _@GUARD_PREFIX@_STRING_H + +#if __GNUC__ >= 3 +@PRAGMA_SYSTEM_HEADER@ +#endif +@PRAGMA_COLUMNS@ + +/* The include_next requires a split double-inclusion guard. */ +#@INCLUDE_NEXT@ @NEXT_STRING_H@ + +#ifndef _@GUARD_PREFIX@_STRING_H +#define _@GUARD_PREFIX@_STRING_H + +/* NetBSD 5.0 mis-defines NULL. */ +#include <stddef.h> + +/* MirBSD defines mbslen as a macro. */ +#if @GNULIB_MBSLEN@ && defined __MirBSD__ +# include <wchar.h> +#endif + +/* The __attribute__ feature is available in gcc versions 2.5 and later. + The attribute __pure__ was added in gcc 2.96. 
*/ +#if __GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 96) +# define _GL_ATTRIBUTE_PURE __attribute__ ((__pure__)) +#else +# define _GL_ATTRIBUTE_PURE /* empty */ +#endif + +/* NetBSD 5.0 declares strsignal in <unistd.h>, not in <string.h>. */ +/* But in any case avoid namespace pollution on glibc systems. */ +#if (@GNULIB_STRSIGNAL@ || defined GNULIB_POSIXCHECK) && defined __NetBSD__ \ + && ! defined __GLIBC__ +# include <unistd.h> +#endif + +/* The definitions of _GL_FUNCDECL_RPL etc. are copied here. */ + +/* The definition of _GL_ARG_NONNULL is copied here. */ + +/* The definition of _GL_WARN_ON_USE is copied here. */ + + +/* Find the index of the least-significant set bit. */ +#if @GNULIB_FFSL@ +# if !@HAVE_FFSL@ +_GL_FUNCDECL_SYS (ffsl, int, (long int i)); +# endif +_GL_CXXALIAS_SYS (ffsl, int, (long int i)); +_GL_CXXALIASWARN (ffsl); +#elif defined GNULIB_POSIXCHECK +# undef ffsl +# if HAVE_RAW_DECL_FFSL +_GL_WARN_ON_USE (ffsl, "ffsl is not portable - use the ffsl module"); +# endif +#endif + + +/* Find the index of the least-significant set bit. */ +#if @GNULIB_FFSLL@ +# if !@HAVE_FFSLL@ +_GL_FUNCDECL_SYS (ffsll, int, (long long int i)); +# endif +_GL_CXXALIAS_SYS (ffsll, int, (long long int i)); +_GL_CXXALIASWARN (ffsll); +#elif defined GNULIB_POSIXCHECK +# undef ffsll +# if HAVE_RAW_DECL_FFSLL +_GL_WARN_ON_USE (ffsll, "ffsll is not portable - use the ffsll module"); +# endif +#endif + + +/* Return the first instance of C within N bytes of S, or NULL. */ +#if @GNULIB_MEMCHR@ +# if @REPLACE_MEMCHR@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define memchr rpl_memchr +# endif +_GL_FUNCDECL_RPL (memchr, void *, (void const *__s, int __c, size_t __n) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (memchr, void *, (void const *__s, int __c, size_t __n)); +# else +# if ! 
@HAVE_MEMCHR@ +_GL_FUNCDECL_SYS (memchr, void *, (void const *__s, int __c, size_t __n) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C" { const void * std::memchr (const void *, int, size_t); } + extern "C++" { void * std::memchr (void *, int, size_t); } */ +_GL_CXXALIAS_SYS_CAST2 (memchr, + void *, (void const *__s, int __c, size_t __n), + void const *, (void const *__s, int __c, size_t __n)); +# endif +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (memchr, void *, (void *__s, int __c, size_t __n)); +_GL_CXXALIASWARN1 (memchr, void const *, + (void const *__s, int __c, size_t __n)); +# else +_GL_CXXALIASWARN (memchr); +# endif +#elif defined GNULIB_POSIXCHECK +# undef memchr +/* Assume memchr is always declared. */ +_GL_WARN_ON_USE (memchr, "memchr has platform-specific bugs - " + "use gnulib module memchr for portability" ); +#endif + +/* Return the first occurrence of NEEDLE in HAYSTACK. */ +#if @GNULIB_MEMMEM@ +# if @REPLACE_MEMMEM@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define memmem rpl_memmem +# endif +_GL_FUNCDECL_RPL (memmem, void *, + (void const *__haystack, size_t __haystack_len, + void const *__needle, size_t __needle_len) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 3))); +_GL_CXXALIAS_RPL (memmem, void *, + (void const *__haystack, size_t __haystack_len, + void const *__needle, size_t __needle_len)); +# else +# if ! 
@HAVE_DECL_MEMMEM@ +_GL_FUNCDECL_SYS (memmem, void *, + (void const *__haystack, size_t __haystack_len, + void const *__needle, size_t __needle_len) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 3))); +# endif +_GL_CXXALIAS_SYS (memmem, void *, + (void const *__haystack, size_t __haystack_len, + void const *__needle, size_t __needle_len)); +# endif +_GL_CXXALIASWARN (memmem); +#elif defined GNULIB_POSIXCHECK +# undef memmem +# if HAVE_RAW_DECL_MEMMEM +_GL_WARN_ON_USE (memmem, "memmem is unportable and often quadratic - " + "use gnulib module memmem-simple for portability, " + "and module memmem for speed" ); +# endif +#endif + +/* Copy N bytes of SRC to DEST, return pointer to bytes after the + last written byte. */ +#if @GNULIB_MEMPCPY@ +# if ! @HAVE_MEMPCPY@ +_GL_FUNCDECL_SYS (mempcpy, void *, + (void *restrict __dest, void const *restrict __src, + size_t __n) + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (mempcpy, void *, + (void *restrict __dest, void const *restrict __src, + size_t __n)); +_GL_CXXALIASWARN (mempcpy); +#elif defined GNULIB_POSIXCHECK +# undef mempcpy +# if HAVE_RAW_DECL_MEMPCPY +_GL_WARN_ON_USE (mempcpy, "mempcpy is unportable - " + "use gnulib module mempcpy for portability"); +# endif +#endif + +/* Search backwards through a block for a byte (specified as an int). */ +#if @GNULIB_MEMRCHR@ +# if ! 
@HAVE_DECL_MEMRCHR@ +_GL_FUNCDECL_SYS (memrchr, void *, (void const *, int, size_t) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const void * std::memrchr (const void *, int, size_t); } + extern "C++" { void * std::memrchr (void *, int, size_t); } */ +_GL_CXXALIAS_SYS_CAST2 (memrchr, + void *, (void const *, int, size_t), + void const *, (void const *, int, size_t)); +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (memrchr, void *, (void *, int, size_t)); +_GL_CXXALIASWARN1 (memrchr, void const *, (void const *, int, size_t)); +# else +_GL_CXXALIASWARN (memrchr); +# endif +#elif defined GNULIB_POSIXCHECK +# undef memrchr +# if HAVE_RAW_DECL_MEMRCHR +_GL_WARN_ON_USE (memrchr, "memrchr is unportable - " + "use gnulib module memrchr for portability"); +# endif +#endif + +/* Find the first occurrence of C in S. More efficient than + memchr(S,C,N), at the expense of undefined behavior if C does not + occur within N bytes. */ +#if @GNULIB_RAWMEMCHR@ +# if ! 
@HAVE_RAWMEMCHR@ +_GL_FUNCDECL_SYS (rawmemchr, void *, (void const *__s, int __c_in) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const void * std::rawmemchr (const void *, int); } + extern "C++" { void * std::rawmemchr (void *, int); } */ +_GL_CXXALIAS_SYS_CAST2 (rawmemchr, + void *, (void const *__s, int __c_in), + void const *, (void const *__s, int __c_in)); +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (rawmemchr, void *, (void *__s, int __c_in)); +_GL_CXXALIASWARN1 (rawmemchr, void const *, (void const *__s, int __c_in)); +# else +_GL_CXXALIASWARN (rawmemchr); +# endif +#elif defined GNULIB_POSIXCHECK +# undef rawmemchr +# if HAVE_RAW_DECL_RAWMEMCHR +_GL_WARN_ON_USE (rawmemchr, "rawmemchr is unportable - " + "use gnulib module rawmemchr for portability"); +# endif +#endif + +/* Copy SRC to DST, returning the address of the terminating '\0' in DST. */ +#if @GNULIB_STPCPY@ +# if ! @HAVE_STPCPY@ +_GL_FUNCDECL_SYS (stpcpy, char *, + (char *restrict __dst, char const *restrict __src) + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (stpcpy, char *, + (char *restrict __dst, char const *restrict __src)); +_GL_CXXALIASWARN (stpcpy); +#elif defined GNULIB_POSIXCHECK +# undef stpcpy +# if HAVE_RAW_DECL_STPCPY +_GL_WARN_ON_USE (stpcpy, "stpcpy is unportable - " + "use gnulib module stpcpy for portability"); +# endif +#endif + +/* Copy no more than N bytes of SRC to DST, returning a pointer past the + last non-NUL byte written into DST. 
*/ +#if @GNULIB_STPNCPY@ +# if @REPLACE_STPNCPY@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef stpncpy +# define stpncpy rpl_stpncpy +# endif +_GL_FUNCDECL_RPL (stpncpy, char *, + (char *restrict __dst, char const *restrict __src, + size_t __n) + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (stpncpy, char *, + (char *restrict __dst, char const *restrict __src, + size_t __n)); +# else +# if ! @HAVE_STPNCPY@ +_GL_FUNCDECL_SYS (stpncpy, char *, + (char *restrict __dst, char const *restrict __src, + size_t __n) + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (stpncpy, char *, + (char *restrict __dst, char const *restrict __src, + size_t __n)); +# endif +_GL_CXXALIASWARN (stpncpy); +#elif defined GNULIB_POSIXCHECK +# undef stpncpy +# if HAVE_RAW_DECL_STPNCPY +_GL_WARN_ON_USE (stpncpy, "stpncpy is unportable - " + "use gnulib module stpncpy for portability"); +# endif +#endif + +#if defined GNULIB_POSIXCHECK +/* strchr() does not work with multibyte strings if the locale encoding is + GB18030 and the character to be searched is a digit. */ +# undef strchr +/* Assume strchr is always declared. */ +_GL_WARN_ON_USE (strchr, "strchr cannot work correctly on character strings " + "in some multibyte locales - " + "use mbschr if you care about internationalization"); +#endif + +/* Find the first occurrence of C in S or the final NUL byte. */ +#if @GNULIB_STRCHRNUL@ +# if @REPLACE_STRCHRNUL@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define strchrnul rpl_strchrnul +# endif +_GL_FUNCDECL_RPL (strchrnul, char *, (const char *__s, int __c_in) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (strchrnul, char *, + (const char *str, int ch)); +# else +# if ! 
@HAVE_STRCHRNUL@ +_GL_FUNCDECL_SYS (strchrnul, char *, (char const *__s, int __c_in) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const char * std::strchrnul (const char *, int); } + extern "C++" { char * std::strchrnul (char *, int); } */ +_GL_CXXALIAS_SYS_CAST2 (strchrnul, + char *, (char const *__s, int __c_in), + char const *, (char const *__s, int __c_in)); +# endif +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (strchrnul, char *, (char *__s, int __c_in)); +_GL_CXXALIASWARN1 (strchrnul, char const *, (char const *__s, int __c_in)); +# else +_GL_CXXALIASWARN (strchrnul); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strchrnul +# if HAVE_RAW_DECL_STRCHRNUL +_GL_WARN_ON_USE (strchrnul, "strchrnul is unportable - " + "use gnulib module strchrnul for portability"); +# endif +#endif + +/* Duplicate S, returning an identical malloc'd string. */ +#if @GNULIB_STRDUP@ +# if @REPLACE_STRDUP@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strdup +# define strdup rpl_strdup +# endif +_GL_FUNCDECL_RPL (strdup, char *, (char const *__s) _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (strdup, char *, (char const *__s)); +# else +# if defined __cplusplus && defined GNULIB_NAMESPACE && defined strdup + /* strdup exists as a function and as a macro. Get rid of the macro. 
*/ +# undef strdup +# endif +# if !(@HAVE_DECL_STRDUP@ || defined strdup) +_GL_FUNCDECL_SYS (strdup, char *, (char const *__s) _GL_ARG_NONNULL ((1))); +# endif +_GL_CXXALIAS_SYS (strdup, char *, (char const *__s)); +# endif +_GL_CXXALIASWARN (strdup); +#elif defined GNULIB_POSIXCHECK +# undef strdup +# if HAVE_RAW_DECL_STRDUP +_GL_WARN_ON_USE (strdup, "strdup is unportable - " + "use gnulib module strdup for portability"); +# endif +#endif + +/* Append no more than N characters from SRC onto DEST. */ +#if @GNULIB_STRNCAT@ +# if @REPLACE_STRNCAT@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strncat +# define strncat rpl_strncat +# endif +_GL_FUNCDECL_RPL (strncat, char *, (char *dest, const char *src, size_t n) + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (strncat, char *, (char *dest, const char *src, size_t n)); +# else +_GL_CXXALIAS_SYS (strncat, char *, (char *dest, const char *src, size_t n)); +# endif +_GL_CXXALIASWARN (strncat); +#elif defined GNULIB_POSIXCHECK +# undef strncat +# if HAVE_RAW_DECL_STRNCAT +_GL_WARN_ON_USE (strncat, "strncat is unportable - " + "use gnulib module strncat for portability"); +# endif +#endif + +/* Return a newly allocated copy of at most N bytes of STRING. */ +#if @GNULIB_STRNDUP@ +# if @REPLACE_STRNDUP@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strndup +# define strndup rpl_strndup +# endif +_GL_FUNCDECL_RPL (strndup, char *, (char const *__string, size_t __n) + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (strndup, char *, (char const *__string, size_t __n)); +# else +# if ! 
@HAVE_DECL_STRNDUP@ +_GL_FUNCDECL_SYS (strndup, char *, (char const *__string, size_t __n) + _GL_ARG_NONNULL ((1))); +# endif +_GL_CXXALIAS_SYS (strndup, char *, (char const *__string, size_t __n)); +# endif +_GL_CXXALIASWARN (strndup); +#elif defined GNULIB_POSIXCHECK +# undef strndup +# if HAVE_RAW_DECL_STRNDUP +_GL_WARN_ON_USE (strndup, "strndup is unportable - " + "use gnulib module strndup for portability"); +# endif +#endif + +/* Find the length (number of bytes) of STRING, but scan at most + MAXLEN bytes. If no '\0' terminator is found in that many bytes, + return MAXLEN. */ +#if @GNULIB_STRNLEN@ +# if @REPLACE_STRNLEN@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strnlen +# define strnlen rpl_strnlen +# endif +_GL_FUNCDECL_RPL (strnlen, size_t, (char const *__string, size_t __maxlen) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (strnlen, size_t, (char const *__string, size_t __maxlen)); +# else +# if ! @HAVE_DECL_STRNLEN@ +_GL_FUNCDECL_SYS (strnlen, size_t, (char const *__string, size_t __maxlen) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +# endif +_GL_CXXALIAS_SYS (strnlen, size_t, (char const *__string, size_t __maxlen)); +# endif +_GL_CXXALIASWARN (strnlen); +#elif defined GNULIB_POSIXCHECK +# undef strnlen +# if HAVE_RAW_DECL_STRNLEN +_GL_WARN_ON_USE (strnlen, "strnlen is unportable - " + "use gnulib module strnlen for portability"); +# endif +#endif + +#if defined GNULIB_POSIXCHECK +/* strcspn() assumes the second argument is a list of single-byte characters. + Even in this simple case, it does not work with multibyte strings if the + locale encoding is GB18030 and one of the characters to be searched is a + digit. */ +# undef strcspn +/* Assume strcspn is always declared. 
*/ +_GL_WARN_ON_USE (strcspn, "strcspn cannot work correctly on character strings " + "in multibyte locales - " + "use mbscspn if you care about internationalization"); +#endif + +/* Find the first occurrence in S of any character in ACCEPT. */ +#if @GNULIB_STRPBRK@ +# if ! @HAVE_STRPBRK@ +_GL_FUNCDECL_SYS (strpbrk, char *, (char const *__s, char const *__accept) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C" { const char * strpbrk (const char *, const char *); } + extern "C++" { char * strpbrk (char *, const char *); } */ +_GL_CXXALIAS_SYS_CAST2 (strpbrk, + char *, (char const *__s, char const *__accept), + const char *, (char const *__s, char const *__accept)); +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (strpbrk, char *, (char *__s, char const *__accept)); +_GL_CXXALIASWARN1 (strpbrk, char const *, + (char const *__s, char const *__accept)); +# else +_GL_CXXALIASWARN (strpbrk); +# endif +# if defined GNULIB_POSIXCHECK +/* strpbrk() assumes the second argument is a list of single-byte characters. + Even in this simple case, it does not work with multibyte strings if the + locale encoding is GB18030 and one of the characters to be searched is a + digit. */ +# undef strpbrk +_GL_WARN_ON_USE (strpbrk, "strpbrk cannot work correctly on character strings " + "in multibyte locales - " + "use mbspbrk if you care about internationalization"); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strpbrk +# if HAVE_RAW_DECL_STRPBRK +_GL_WARN_ON_USE (strpbrk, "strpbrk is unportable - " + "use gnulib module strpbrk for portability"); +# endif +#endif + +#if defined GNULIB_POSIXCHECK +/* strspn() assumes the second argument is a list of single-byte characters. + Even in this simple case, it cannot work with multibyte strings. 
*/ +# undef strspn +/* Assume strspn is always declared. */ +_GL_WARN_ON_USE (strspn, "strspn cannot work correctly on character strings " + "in multibyte locales - " + "use mbsspn if you care about internationalization"); +#endif + +#if defined GNULIB_POSIXCHECK +/* strrchr() does not work with multibyte strings if the locale encoding is + GB18030 and the character to be searched is a digit. */ +# undef strrchr +/* Assume strrchr is always declared. */ +_GL_WARN_ON_USE (strrchr, "strrchr cannot work correctly on character strings " + "in some multibyte locales - " + "use mbsrchr if you care about internationalization"); +#endif + +/* Search the next delimiter (char listed in DELIM) starting at *STRINGP. + If one is found, overwrite it with a NUL, and advance *STRINGP + to point to the next char after it. Otherwise, set *STRINGP to NULL. + If *STRINGP was already NULL, nothing happens. + Return the old value of *STRINGP. + + This is a variant of strtok() that is multithread-safe and supports + empty fields. + + Caveat: It modifies the original string. + Caveat: These functions cannot be used on constant strings. + Caveat: The identity of the delimiting character is lost. + Caveat: It doesn't work with multibyte strings unless all of the delimiter + characters are ASCII characters < 0x30. + + See also strtok_r(). */ +#if @GNULIB_STRSEP@ +# if ! 
@HAVE_STRSEP@ +_GL_FUNCDECL_SYS (strsep, char *, + (char **restrict __stringp, char const *restrict __delim) + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (strsep, char *, + (char **restrict __stringp, char const *restrict __delim)); +_GL_CXXALIASWARN (strsep); +# if defined GNULIB_POSIXCHECK +# undef strsep +_GL_WARN_ON_USE (strsep, "strsep cannot work correctly on character strings " + "in multibyte locales - " + "use mbssep if you care about internationalization"); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strsep +# if HAVE_RAW_DECL_STRSEP +_GL_WARN_ON_USE (strsep, "strsep is unportable - " + "use gnulib module strsep for portability"); +# endif +#endif + +#if @GNULIB_STRSTR@ +# if @REPLACE_STRSTR@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define strstr rpl_strstr +# endif +_GL_FUNCDECL_RPL (strstr, char *, (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (strstr, char *, (const char *haystack, const char *needle)); +# else + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const char * strstr (const char *, const char *); } + extern "C++" { char * strstr (char *, const char *); } */ +_GL_CXXALIAS_SYS_CAST2 (strstr, + char *, (const char *haystack, const char *needle), + const char *, (const char *haystack, const char *needle)); +# endif +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (strstr, char *, (char *haystack, const char *needle)); +_GL_CXXALIASWARN1 (strstr, const char *, + (const char *haystack, const char *needle)); +# else +_GL_CXXALIASWARN (strstr); +# endif +#elif defined GNULIB_POSIXCHECK +/* strstr() does not work with multibyte strings if the locale encoding is + different from UTF-8: + POSIX says that it operates on "strings", and "string" in POSIX is defined + as a sequence of bytes, not of 
characters. */ +# undef strstr +/* Assume strstr is always declared. */ +_GL_WARN_ON_USE (strstr, "strstr is quadratic on many systems, and cannot " + "work correctly on character strings in most " + "multibyte locales - " + "use mbsstr if you care about internationalization, " + "or use strstr if you care about speed"); +#endif + +/* Find the first occurrence of NEEDLE in HAYSTACK, using case-insensitive + comparison. */ +#if @GNULIB_STRCASESTR@ +# if @REPLACE_STRCASESTR@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define strcasestr rpl_strcasestr +# endif +_GL_FUNCDECL_RPL (strcasestr, char *, + (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (strcasestr, char *, + (const char *haystack, const char *needle)); +# else +# if ! @HAVE_STRCASESTR@ +_GL_FUNCDECL_SYS (strcasestr, char *, + (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +# endif + /* On some systems, this function is defined as an overloaded function: + extern "C++" { const char * strcasestr (const char *, const char *); } + extern "C++" { char * strcasestr (char *, const char *); } */ +_GL_CXXALIAS_SYS_CAST2 (strcasestr, + char *, (const char *haystack, const char *needle), + const char *, (const char *haystack, const char *needle)); +# endif +# if ((__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10) && !defined __UCLIBC__) \ + && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4)) +_GL_CXXALIASWARN1 (strcasestr, char *, (char *haystack, const char *needle)); +_GL_CXXALIASWARN1 (strcasestr, const char *, + (const char *haystack, const char *needle)); +# else +_GL_CXXALIASWARN (strcasestr); +# endif +#elif defined GNULIB_POSIXCHECK +/* strcasestr() does not work with multibyte strings: + It is a glibc extension, and glibc implements it only for unibyte + locales. 
*/ +# undef strcasestr +# if HAVE_RAW_DECL_STRCASESTR +_GL_WARN_ON_USE (strcasestr, "strcasestr does not work correctly on character " + "strings in multibyte locales - " + "use mbscasestr if you care about " + "internationalization, or use c-strcasestr if you want " + "a locale independent function"); +# endif +#endif + +/* Parse S into tokens separated by characters in DELIM. + If S is NULL, the saved pointer in SAVE_PTR is used as + the next starting point. For example: + char s[] = "-abc-=-def"; + char *sp; + x = strtok_r(s, "-", &sp); // x = "abc", sp = "=-def" + x = strtok_r(NULL, "-=", &sp); // x = "def", sp = NULL + x = strtok_r(NULL, "=", &sp); // x = NULL + // s = "abc\0-def\0" + + This is a variant of strtok() that is multithread-safe. + + For the POSIX documentation for this function, see: + http://www.opengroup.org/susv3xsh/strtok.html + + Caveat: It modifies the original string. + Caveat: These functions cannot be used on constant strings. + Caveat: The identity of the delimiting character is lost. + Caveat: It doesn't work with multibyte strings unless all of the delimiter + characters are ASCII characters < 0x30. + + See also strsep(). */ +#if @GNULIB_STRTOK_R@ +# if @REPLACE_STRTOK_R@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strtok_r +# define strtok_r rpl_strtok_r +# endif +_GL_FUNCDECL_RPL (strtok_r, char *, + (char *restrict s, char const *restrict delim, + char **restrict save_ptr) + _GL_ARG_NONNULL ((2, 3))); +_GL_CXXALIAS_RPL (strtok_r, char *, + (char *restrict s, char const *restrict delim, + char **restrict save_ptr)); +# else +# if @UNDEFINE_STRTOK_R@ || defined GNULIB_POSIXCHECK +# undef strtok_r +# endif +# if ! 
@HAVE_DECL_STRTOK_R@ +_GL_FUNCDECL_SYS (strtok_r, char *, + (char *restrict s, char const *restrict delim, + char **restrict save_ptr) + _GL_ARG_NONNULL ((2, 3))); +# endif +_GL_CXXALIAS_SYS (strtok_r, char *, + (char *restrict s, char const *restrict delim, + char **restrict save_ptr)); +# endif +_GL_CXXALIASWARN (strtok_r); +# if defined GNULIB_POSIXCHECK +_GL_WARN_ON_USE (strtok_r, "strtok_r cannot work correctly on character " + "strings in multibyte locales - " + "use mbstok_r if you care about internationalization"); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strtok_r +# if HAVE_RAW_DECL_STRTOK_R +_GL_WARN_ON_USE (strtok_r, "strtok_r is unportable - " + "use gnulib module strtok_r for portability"); +# endif +#endif + + +/* The following functions are not specified by POSIX. They are gnulib + extensions. */ + +#if @GNULIB_MBSLEN@ +/* Return the number of multibyte characters in the character string STRING. + This considers multibyte characters, unlike strlen, which counts bytes. */ +# ifdef __MirBSD__ /* MirBSD defines mbslen as a macro. Override it. */ +# undef mbslen +# endif +# if @HAVE_MBSLEN@ /* AIX, OSF/1, MirBSD define mbslen already in libc. */ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define mbslen rpl_mbslen +# endif +_GL_FUNCDECL_RPL (mbslen, size_t, (const char *string) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (mbslen, size_t, (const char *string)); +# else +_GL_FUNCDECL_SYS (mbslen, size_t, (const char *string) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_SYS (mbslen, size_t, (const char *string)); +# endif +_GL_CXXALIASWARN (mbslen); +#endif + +#if @GNULIB_MBSNLEN@ +/* Return the number of multibyte characters in the character string starting + at STRING and ending at STRING + LEN. 
*/ +_GL_EXTERN_C size_t mbsnlen (const char *string, size_t len) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1)); +#endif + +#if @GNULIB_MBSCHR@ +/* Locate the first single-byte character C in the character string STRING, + and return a pointer to it. Return NULL if C is not found in STRING. + Unlike strchr(), this function works correctly in multibyte locales with + encodings such as GB18030. */ +# if defined __hpux +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define mbschr rpl_mbschr /* avoid collision with HP-UX function */ +# endif +_GL_FUNCDECL_RPL (mbschr, char *, (const char *string, int c) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (mbschr, char *, (const char *string, int c)); +# else +_GL_FUNCDECL_SYS (mbschr, char *, (const char *string, int c) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_SYS (mbschr, char *, (const char *string, int c)); +# endif +_GL_CXXALIASWARN (mbschr); +#endif + +#if @GNULIB_MBSRCHR@ +/* Locate the last single-byte character C in the character string STRING, + and return a pointer to it. Return NULL if C is not found in STRING. + Unlike strrchr(), this function works correctly in multibyte locales with + encodings such as GB18030. */ +# if defined __hpux || defined __INTERIX +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define mbsrchr rpl_mbsrchr /* avoid collision with system function */ +# endif +_GL_FUNCDECL_RPL (mbsrchr, char *, (const char *string, int c) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_RPL (mbsrchr, char *, (const char *string, int c)); +# else +_GL_FUNCDECL_SYS (mbsrchr, char *, (const char *string, int c) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1))); +_GL_CXXALIAS_SYS (mbsrchr, char *, (const char *string, int c)); +# endif +_GL_CXXALIASWARN (mbsrchr); +#endif + +#if @GNULIB_MBSSTR@ +/* Find the first occurrence of the character string NEEDLE in the character + string HAYSTACK. Return NULL if NEEDLE is not found in HAYSTACK. 
+ Unlike strstr(), this function works correctly in multibyte locales with + encodings different from UTF-8. */ +_GL_EXTERN_C char * mbsstr (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSCASECMP@ +/* Compare the character strings S1 and S2, ignoring case, returning less than, + equal to or greater than zero if S1 is lexicographically less than, equal to + or greater than S2. + Note: This function may, in multibyte locales, return 0 for strings of + different lengths! + Unlike strcasecmp(), this function works correctly in multibyte locales. */ +_GL_EXTERN_C int mbscasecmp (const char *s1, const char *s2) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSNCASECMP@ +/* Compare the initial segment of the character string S1 consisting of at most + N characters with the initial segment of the character string S2 consisting + of at most N characters, ignoring case, returning less than, equal to or + greater than zero if the initial segment of S1 is lexicographically less + than, equal to or greater than the initial segment of S2. + Note: This function may, in multibyte locales, return 0 for initial segments + of different lengths! + Unlike strncasecmp(), this function works correctly in multibyte locales. + But beware that N is not a byte count but a character count! */ +_GL_EXTERN_C int mbsncasecmp (const char *s1, const char *s2, size_t n) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSPCASECMP@ +/* Compare the initial segment of the character string STRING consisting of + at most mbslen (PREFIX) characters with the character string PREFIX, + ignoring case. If the two match, return a pointer to the first byte + after this prefix in STRING. Otherwise, return NULL. + Note: This function may, in multibyte locales, return non-NULL if STRING + is of smaller length than PREFIX! 
+ Unlike strncasecmp(), this function works correctly in multibyte + locales. */ +_GL_EXTERN_C char * mbspcasecmp (const char *string, const char *prefix) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSCASESTR@ +/* Find the first occurrence of the character string NEEDLE in the character + string HAYSTACK, using case-insensitive comparison. + Note: This function may, in multibyte locales, return success even if + strlen (haystack) < strlen (needle) ! + Unlike strcasestr(), this function works correctly in multibyte locales. */ +_GL_EXTERN_C char * mbscasestr (const char *haystack, const char *needle) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSCSPN@ +/* Find the first occurrence in the character string STRING of any character + in the character string ACCEPT. Return the number of bytes from the + beginning of the string to this occurrence, or to the end of the string + if none exists. + Unlike strcspn(), this function works correctly in multibyte locales. */ +_GL_EXTERN_C size_t mbscspn (const char *string, const char *accept) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSPBRK@ +/* Find the first occurrence in the character string STRING of any character + in the character string ACCEPT. Return the pointer to it, or NULL if none + exists. + Unlike strpbrk(), this function works correctly in multibyte locales. 
*/ +# if defined __hpux +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define mbspbrk rpl_mbspbrk /* avoid collision with HP-UX function */ +# endif +_GL_FUNCDECL_RPL (mbspbrk, char *, (const char *string, const char *accept) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_RPL (mbspbrk, char *, (const char *string, const char *accept)); +# else +_GL_FUNCDECL_SYS (mbspbrk, char *, (const char *string, const char *accept) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +_GL_CXXALIAS_SYS (mbspbrk, char *, (const char *string, const char *accept)); +# endif +_GL_CXXALIASWARN (mbspbrk); +#endif + +#if @GNULIB_MBSSPN@ +/* Find the first occurrence in the character string STRING of any character + not in the character string REJECT. Return the number of bytes from the + beginning of the string to this occurrence, or to the end of the string + if none exists. + Unlike strspn(), this function works correctly in multibyte locales. */ +_GL_EXTERN_C size_t mbsspn (const char *string, const char *reject) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSSEP@ +/* Search the next delimiter (multibyte character listed in the character + string DELIM) starting at the character string *STRINGP. + If one is found, overwrite it with a NUL, and advance *STRINGP to point + to the next multibyte character after it. Otherwise, set *STRINGP to NULL. + If *STRINGP was already NULL, nothing happens. + Return the old value of *STRINGP. + + This is a variant of mbstok_r() that supports empty fields. + + Caveat: It modifies the original string. + Caveat: These functions cannot be used on constant strings. + Caveat: The identity of the delimiting character is lost. + + See also mbstok_r(). */ +_GL_EXTERN_C char * mbssep (char **stringp, const char *delim) + _GL_ARG_NONNULL ((1, 2)); +#endif + +#if @GNULIB_MBSTOK_R@ +/* Parse the character string STRING into tokens separated by characters in + the character string DELIM. 
+ If STRING is NULL, the saved pointer in SAVE_PTR is used as + the next starting point. For example: + char s[] = "-abc-=-def"; + char *sp; + x = mbstok_r(s, "-", &sp); // x = "abc", sp = "=-def" + x = mbstok_r(NULL, "-=", &sp); // x = "def", sp = NULL + x = mbstok_r(NULL, "=", &sp); // x = NULL + // s = "abc\0-def\0" + + Caveat: It modifies the original string. + Caveat: These functions cannot be used on constant strings. + Caveat: The identity of the delimiting character is lost. + + See also mbssep(). */ +_GL_EXTERN_C char * mbstok_r (char *string, const char *delim, char **save_ptr) + _GL_ARG_NONNULL ((2, 3)); +#endif + +/* Map any int, typically from errno, into an error message. */ +#if @GNULIB_STRERROR@ +# if @REPLACE_STRERROR@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strerror +# define strerror rpl_strerror +# endif +_GL_FUNCDECL_RPL (strerror, char *, (int)); +_GL_CXXALIAS_RPL (strerror, char *, (int)); +# else +_GL_CXXALIAS_SYS (strerror, char *, (int)); +# endif +_GL_CXXALIASWARN (strerror); +#elif defined GNULIB_POSIXCHECK +# undef strerror +/* Assume strerror is always declared. */ +_GL_WARN_ON_USE (strerror, "strerror is unportable - " + "use gnulib module strerror to guarantee non-NULL result"); +#endif + +/* Map any int, typically from errno, into an error message. Multithread-safe. + Uses the POSIX declaration, not the glibc declaration. 
*/ +#if @GNULIB_STRERROR_R@ +# if @REPLACE_STRERROR_R@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# undef strerror_r +# define strerror_r rpl_strerror_r +# endif +_GL_FUNCDECL_RPL (strerror_r, int, (int errnum, char *buf, size_t buflen) + _GL_ARG_NONNULL ((2))); +_GL_CXXALIAS_RPL (strerror_r, int, (int errnum, char *buf, size_t buflen)); +# else +# if !@HAVE_DECL_STRERROR_R@ +_GL_FUNCDECL_SYS (strerror_r, int, (int errnum, char *buf, size_t buflen) + _GL_ARG_NONNULL ((2))); +# endif +_GL_CXXALIAS_SYS (strerror_r, int, (int errnum, char *buf, size_t buflen)); +# endif +# if @HAVE_DECL_STRERROR_R@ +_GL_CXXALIASWARN (strerror_r); +# endif +#elif defined GNULIB_POSIXCHECK +# undef strerror_r +# if HAVE_RAW_DECL_STRERROR_R +_GL_WARN_ON_USE (strerror_r, "strerror_r is unportable - " + "use gnulib module strerror_r-posix for portability"); +# endif +#endif + +#if @GNULIB_STRSIGNAL@ +# if @REPLACE_STRSIGNAL@ +# if !(defined __cplusplus && defined GNULIB_NAMESPACE) +# define strsignal rpl_strsignal +# endif +_GL_FUNCDECL_RPL (strsignal, char *, (int __sig)); +_GL_CXXALIAS_RPL (strsignal, char *, (int __sig)); +# else +# if ! @HAVE_DECL_STRSIGNAL@ +_GL_FUNCDECL_SYS (strsignal, char *, (int __sig)); +# endif +/* Need to cast, because on Cygwin 1.5.x systems, the return type is + 'const char *'. 
*/ +_GL_CXXALIAS_SYS_CAST (strsignal, char *, (int __sig)); +# endif +_GL_CXXALIASWARN (strsignal); +#elif defined GNULIB_POSIXCHECK +# undef strsignal +# if HAVE_RAW_DECL_STRSIGNAL +_GL_WARN_ON_USE (strsignal, "strsignal is unportable - " + "use gnulib module strsignal for portability"); +# endif +#endif + +#if @GNULIB_STRVERSCMP@ +# if !@HAVE_STRVERSCMP@ +_GL_FUNCDECL_SYS (strverscmp, int, (const char *, const char *) + _GL_ATTRIBUTE_PURE + _GL_ARG_NONNULL ((1, 2))); +# endif +_GL_CXXALIAS_SYS (strverscmp, int, (const char *, const char *)); +_GL_CXXALIASWARN (strverscmp); +#elif defined GNULIB_POSIXCHECK +# undef strverscmp +# if HAVE_RAW_DECL_STRVERSCMP +_GL_WARN_ON_USE (strverscmp, "strverscmp is unportable - " + "use gnulib module strverscmp for portability"); +# endif +#endif + + +#endif /* _@GUARD_PREFIX@_STRING_H */ +#endif /* _@GUARD_PREFIX@_STRING_H */ === modified file 'm4/gnulib-comp.m4' --- m4/gnulib-comp.m4 2013-02-01 06:30:51 +0000 +++ m4/gnulib-comp.m4 2013-02-09 03:12:48 +0000 @@ -83,6 +83,7 @@ AC_REQUIRE([AC_SYS_LARGEFILE]) # Code from module lstat: # Code from module manywarnings: + # Code from module memrchr: # Code from module mktime: # Code from module multiarch: # Code from module nocrash: @@ -117,6 +118,7 @@ # Code from module stdio: # Code from module stdlib: # Code from module strftime: + # Code from module string: # Code from module strtoimax: # Code from module strtoll: # Code from module strtoull: @@ -242,6 +244,12 @@ gl_PREREQ_LSTAT fi gl_SYS_STAT_MODULE_INDICATOR([lstat]) + gl_FUNC_MEMRCHR + if test $ac_cv_func_memrchr = no; then + AC_LIBOBJ([memrchr]) + gl_PREREQ_MEMRCHR + fi + gl_STRING_MODULE_INDICATOR([memrchr]) gl_FUNC_MKTIME if test $REPLACE_MKTIME = 1; then AC_LIBOBJ([mktime]) @@ -294,6 +302,7 @@ gl_STDIO_H gl_STDLIB_H gl_FUNC_GNU_STRFTIME + gl_HEADER_STRING_H gl_FUNC_STRTOIMAX if test $HAVE_STRTOIMAX = 0 || test $REPLACE_STRTOIMAX = 1; then AC_LIBOBJ([strtoimax]) @@ -757,6 +766,7 @@ lib/lstat.c lib/md5.c lib/md5.h + 
lib/memrchr.c lib/mktime-internal.h lib/mktime.c lib/openat-priv.h @@ -790,6 +800,7 @@ lib/stdlib.in.h lib/strftime.c lib/strftime.h + lib/string.in.h lib/strtoimax.c lib/strtol.c lib/strtoll.c @@ -848,6 +859,7 @@ m4/lstat.m4 m4/manywarnings.m4 m4/md5.m4 + m4/memrchr.m4 m4/mktime.m4 m4/multiarch.m4 m4/nocrash.m4 @@ -877,6 +889,7 @@ m4/stdio_h.m4 m4/stdlib_h.m4 m4/strftime.m4 + m4/string_h.m4 m4/strtoimax.m4 m4/strtoll.m4 m4/strtoull.m4 === added file 'm4/memrchr.m4' --- m4/memrchr.m4 1970-01-01 00:00:00 +0000 +++ m4/memrchr.m4 2013-02-09 03:12:48 +0000 @@ -0,0 +1,23 @@ +# memrchr.m4 serial 10 +dnl Copyright (C) 2002-2003, 2005-2007, 2009-2013 Free Software Foundation, +dnl Inc. +dnl This file is free software; the Free Software Foundation +dnl gives unlimited permission to copy and/or distribute it, +dnl with or without modifications, as long as this notice is preserved. + +AC_DEFUN([gl_FUNC_MEMRCHR], +[ + dnl Persuade glibc <string.h> to declare memrchr(). + AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS]) + + AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS]) + AC_CHECK_DECLS_ONCE([memrchr]) + if test $ac_cv_have_decl_memrchr = no; then + HAVE_DECL_MEMRCHR=0 + fi + + AC_CHECK_FUNCS([memrchr]) +]) + +# Prerequisites of lib/memrchr.c. +AC_DEFUN([gl_PREREQ_MEMRCHR], [:]) === added file 'm4/string_h.m4' --- m4/string_h.m4 1970-01-01 00:00:00 +0000 +++ m4/string_h.m4 2013-02-09 03:12:48 +0000 @@ -0,0 +1,120 @@ +# Configure a GNU-like replacement for <string.h>. + +# Copyright (C) 2007-2013 Free Software Foundation, Inc. +# This file is free software; the Free Software Foundation +# gives unlimited permission to copy and/or distribute it, +# with or without modifications, as long as this notice is preserved. + +# serial 21 + +# Written by Paul Eggert. + +AC_DEFUN([gl_HEADER_STRING_H], +[ + dnl Use AC_REQUIRE here, so that the default behavior below is expanded + dnl once only, before all statements that occur in other macros. 
+ AC_REQUIRE([gl_HEADER_STRING_H_BODY]) +]) + +AC_DEFUN([gl_HEADER_STRING_H_BODY], +[ + AC_REQUIRE([AC_C_RESTRICT]) + AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS]) + gl_NEXT_HEADERS([string.h]) + + dnl Check for declarations of anything we want to poison if the + dnl corresponding gnulib module is not in use, and which is not + dnl guaranteed by C89. + gl_WARN_ON_USE_PREPARE([[#include <string.h> + ]], + [ffsl ffsll memmem mempcpy memrchr rawmemchr stpcpy stpncpy strchrnul + strdup strncat strndup strnlen strpbrk strsep strcasestr strtok_r + strerror_r strsignal strverscmp]) +]) + +AC_DEFUN([gl_STRING_MODULE_INDICATOR], +[ + dnl Use AC_REQUIRE here, so that the default settings are expanded once only. + AC_REQUIRE([gl_HEADER_STRING_H_DEFAULTS]) + gl_MODULE_INDICATOR_SET_VARIABLE([$1]) + dnl Define it also as a C macro, for the benefit of the unit tests. + gl_MODULE_INDICATOR_FOR_TESTS([$1]) +]) + +AC_DEFUN([gl_HEADER_STRING_H_DEFAULTS], +[ + GNULIB_FFSL=0; AC_SUBST([GNULIB_FFSL]) + GNULIB_FFSLL=0; AC_SUBST([GNULIB_FFSLL]) + GNULIB_MEMCHR=0; AC_SUBST([GNULIB_MEMCHR]) + GNULIB_MEMMEM=0; AC_SUBST([GNULIB_MEMMEM]) + GNULIB_MEMPCPY=0; AC_SUBST([GNULIB_MEMPCPY]) + GNULIB_MEMRCHR=0; AC_SUBST([GNULIB_MEMRCHR]) + GNULIB_RAWMEMCHR=0; AC_SUBST([GNULIB_RAWMEMCHR]) + GNULIB_STPCPY=0; AC_SUBST([GNULIB_STPCPY]) + GNULIB_STPNCPY=0; AC_SUBST([GNULIB_STPNCPY]) + GNULIB_STRCHRNUL=0; AC_SUBST([GNULIB_STRCHRNUL]) + GNULIB_STRDUP=0; AC_SUBST([GNULIB_STRDUP]) + GNULIB_STRNCAT=0; AC_SUBST([GNULIB_STRNCAT]) + GNULIB_STRNDUP=0; AC_SUBST([GNULIB_STRNDUP]) + GNULIB_STRNLEN=0; AC_SUBST([GNULIB_STRNLEN]) + GNULIB_STRPBRK=0; AC_SUBST([GNULIB_STRPBRK]) + GNULIB_STRSEP=0; AC_SUBST([GNULIB_STRSEP]) + GNULIB_STRSTR=0; AC_SUBST([GNULIB_STRSTR]) + GNULIB_STRCASESTR=0; AC_SUBST([GNULIB_STRCASESTR]) + GNULIB_STRTOK_R=0; AC_SUBST([GNULIB_STRTOK_R]) + GNULIB_MBSLEN=0; AC_SUBST([GNULIB_MBSLEN]) + GNULIB_MBSNLEN=0; AC_SUBST([GNULIB_MBSNLEN]) + GNULIB_MBSCHR=0; AC_SUBST([GNULIB_MBSCHR]) + GNULIB_MBSRCHR=0; 
AC_SUBST([GNULIB_MBSRCHR]) + GNULIB_MBSSTR=0; AC_SUBST([GNULIB_MBSSTR]) + GNULIB_MBSCASECMP=0; AC_SUBST([GNULIB_MBSCASECMP]) + GNULIB_MBSNCASECMP=0; AC_SUBST([GNULIB_MBSNCASECMP]) + GNULIB_MBSPCASECMP=0; AC_SUBST([GNULIB_MBSPCASECMP]) + GNULIB_MBSCASESTR=0; AC_SUBST([GNULIB_MBSCASESTR]) + GNULIB_MBSCSPN=0; AC_SUBST([GNULIB_MBSCSPN]) + GNULIB_MBSPBRK=0; AC_SUBST([GNULIB_MBSPBRK]) + GNULIB_MBSSPN=0; AC_SUBST([GNULIB_MBSSPN]) + GNULIB_MBSSEP=0; AC_SUBST([GNULIB_MBSSEP]) + GNULIB_MBSTOK_R=0; AC_SUBST([GNULIB_MBSTOK_R]) + GNULIB_STRERROR=0; AC_SUBST([GNULIB_STRERROR]) + GNULIB_STRERROR_R=0; AC_SUBST([GNULIB_STRERROR_R]) + GNULIB_STRSIGNAL=0; AC_SUBST([GNULIB_STRSIGNAL]) + GNULIB_STRVERSCMP=0; AC_SUBST([GNULIB_STRVERSCMP]) + HAVE_MBSLEN=0; AC_SUBST([HAVE_MBSLEN]) + dnl Assume proper GNU behavior unless another module says otherwise. + HAVE_FFSL=1; AC_SUBST([HAVE_FFSL]) + HAVE_FFSLL=1; AC_SUBST([HAVE_FFSLL]) + HAVE_MEMCHR=1; AC_SUBST([HAVE_MEMCHR]) + HAVE_DECL_MEMMEM=1; AC_SUBST([HAVE_DECL_MEMMEM]) + HAVE_MEMPCPY=1; AC_SUBST([HAVE_MEMPCPY]) + HAVE_DECL_MEMRCHR=1; AC_SUBST([HAVE_DECL_MEMRCHR]) + HAVE_RAWMEMCHR=1; AC_SUBST([HAVE_RAWMEMCHR]) + HAVE_STPCPY=1; AC_SUBST([HAVE_STPCPY]) + HAVE_STPNCPY=1; AC_SUBST([HAVE_STPNCPY]) + HAVE_STRCHRNUL=1; AC_SUBST([HAVE_STRCHRNUL]) + HAVE_DECL_STRDUP=1; AC_SUBST([HAVE_DECL_STRDUP]) + HAVE_DECL_STRNDUP=1; AC_SUBST([HAVE_DECL_STRNDUP]) + HAVE_DECL_STRNLEN=1; AC_SUBST([HAVE_DECL_STRNLEN]) + HAVE_STRPBRK=1; AC_SUBST([HAVE_STRPBRK]) + HAVE_STRSEP=1; AC_SUBST([HAVE_STRSEP]) + HAVE_STRCASESTR=1; AC_SUBST([HAVE_STRCASESTR]) + HAVE_DECL_STRTOK_R=1; AC_SUBST([HAVE_DECL_STRTOK_R]) + HAVE_DECL_STRERROR_R=1; AC_SUBST([HAVE_DECL_STRERROR_R]) + HAVE_DECL_STRSIGNAL=1; AC_SUBST([HAVE_DECL_STRSIGNAL]) + HAVE_STRVERSCMP=1; AC_SUBST([HAVE_STRVERSCMP]) + REPLACE_MEMCHR=0; AC_SUBST([REPLACE_MEMCHR]) + REPLACE_MEMMEM=0; AC_SUBST([REPLACE_MEMMEM]) + REPLACE_STPNCPY=0; AC_SUBST([REPLACE_STPNCPY]) + REPLACE_STRDUP=0; AC_SUBST([REPLACE_STRDUP]) + 
REPLACE_STRSTR=0; AC_SUBST([REPLACE_STRSTR]) + REPLACE_STRCASESTR=0; AC_SUBST([REPLACE_STRCASESTR]) + REPLACE_STRCHRNUL=0; AC_SUBST([REPLACE_STRCHRNUL]) + REPLACE_STRERROR=0; AC_SUBST([REPLACE_STRERROR]) + REPLACE_STRERROR_R=0; AC_SUBST([REPLACE_STRERROR_R]) + REPLACE_STRNCAT=0; AC_SUBST([REPLACE_STRNCAT]) + REPLACE_STRNDUP=0; AC_SUBST([REPLACE_STRNDUP]) + REPLACE_STRNLEN=0; AC_SUBST([REPLACE_STRNLEN]) + REPLACE_STRSIGNAL=0; AC_SUBST([REPLACE_STRSIGNAL]) + REPLACE_STRTOK_R=0; AC_SUBST([REPLACE_STRTOK_R]) + UNDEFINE_STRTOK_R=0; AC_SUBST([UNDEFINE_STRTOK_R]) +]) === modified file 'src/ChangeLog' --- src/ChangeLog 2013-02-10 16:49:09 +0000 +++ src/ChangeLog 2013-02-11 01:28:13 +0000 @@ -1,3 +1,14 @@ +2013-02-11 Paul Eggert <eggert@cs.ucla.edu> + + Tune by using memchr and memrchr. + * doc.c (Fsnarf_documentation): + * fileio.c (Fsubstitute_in_file_name): + * search.c (find_newline, scan_newline): + * xdisp.c (pos_visible_p, display_count_lines): + Use memchr and memrchr rather than scanning byte-by-byte. + * search.c (find_newline): Rename from scan_buffer. + Omit first arg TARGET, as it's always '\n'. All callers changed. + 2013-02-10 Eli Zaretskii <eliz@gnu.org> * xdisp.c (move_it_vertically_backward, move_it_by_lines): When === modified file 'src/doc.c' --- src/doc.c 2013-02-08 17:42:09 +0000 +++ src/doc.c 2013-02-11 01:51:25 +0000 @@ -630,11 +630,10 @@ break; buf[filled] = 0; - p = buf; end = buf + (filled < 512 ? filled : filled - 128); - while (p != end && *p != '\037') p++; + p = memchr (buf, '\037', end - buf); /* p points to ^_Ffunctionname\n or ^_Vvarname\n or ^_Sfilename\n. */ - if (p != end) + if (p) { end = strchr (p, '\n'); === modified file 'src/editfns.c' --- src/editfns.c 2013-01-23 20:07:28 +0000 +++ src/editfns.c 2013-02-11 01:26:59 +0000 @@ -735,9 +735,8 @@ /* This is the ONLY_IN_LINE case, check that NEW_POS and FIELD_BOUND are on the same line by seeing whether there's an intervening newline or not. 
*/ - || (scan_buffer ('\n', - XFASTINT (new_pos), XFASTINT (field_bound), - fwd ? -1 : 1, &shortage, 1), + || (find_newline (XFASTINT (new_pos), XFASTINT (field_bound), + fwd ? -1 : 1, &shortage, 1), shortage != 0))) /* Constrain NEW_POS to FIELD_BOUND. */ new_pos = field_bound; === modified file 'src/fileio.c' --- src/fileio.c 2013-02-11 00:35:37 +0000 +++ src/fileio.c 2013-02-11 01:28:13 +0000 @@ -1710,8 +1710,9 @@ else if (*p == '{') { o = ++p; - while (p != endp && *p != '}') p++; - if (*p != '}') goto missingclose; + p = memchr (p, '}', endp - p); + if (! p) + goto missingclose; s = p; } else @@ -1779,8 +1780,9 @@ else if (*p == '{') { o = ++p; - while (p != endp && *p != '}') p++; - if (*p != '}') goto missingclose; + p = memchr (p, '}', endp - p); + if (! p) + goto missingclose; s = p++; } else === modified file 'src/lisp.h' --- src/lisp.h 2013-02-09 22:42:33 +0000 +++ src/lisp.h 2013-02-11 01:28:13 +0000 @@ -3346,8 +3346,8 @@ extern ptrdiff_t fast_string_match_ignore_case (Lisp_Object, Lisp_Object); extern ptrdiff_t fast_looking_at (Lisp_Object, ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t, Lisp_Object); -extern ptrdiff_t scan_buffer (int, ptrdiff_t, ptrdiff_t, ptrdiff_t, - ptrdiff_t *, bool); +extern ptrdiff_t find_newline (ptrdiff_t, ptrdiff_t, ptrdiff_t, + ptrdiff_t *, bool); extern EMACS_INT scan_newline (ptrdiff_t, ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT, bool); extern ptrdiff_t find_next_newline (ptrdiff_t, int); === modified file 'src/region-cache.h' --- src/region-cache.h 2013-01-01 09:11:05 +0000 +++ src/region-cache.h 2013-02-11 01:26:59 +0000 @@ -40,7 +40,7 @@ existing data structure, and disturb as little of the existing code as possible. - So here's the tack. We add some caching to the scan_buffer + So here's the tack. 
We add some caching to the find_newline function, so that when it searches for a newline, it notes that the region between the start and end of the search contained no newlines; then, the next time around, it consults this cache to see === modified file 'src/search.c' --- src/search.c 2013-02-08 14:44:53 +0000 +++ src/search.c 2013-02-11 02:31:39 +0000 @@ -619,7 +619,7 @@ } \f -/* Search for COUNT instances of the character TARGET between START and END. +/* Search for COUNT newlines between START and END. If COUNT is positive, search forwards; END must be >= START. If COUNT is negative, search backwards for the -COUNTth instance; @@ -634,14 +634,14 @@ this is not the same as the usual convention for Emacs motion commands. If we don't find COUNT instances before reaching END, set *SHORTAGE - to the number of TARGETs left unfound, and return END. + to the number of newlines left unfound, and return END. If ALLOW_QUIT, set immediate_quit. That's good to do except when inside redisplay. */ ptrdiff_t -scan_buffer (int target, ptrdiff_t start, ptrdiff_t end, - ptrdiff_t count, ptrdiff_t *shortage, bool allow_quit) +find_newline (ptrdiff_t start, ptrdiff_t end, + ptrdiff_t count, ptrdiff_t *shortage, bool allow_quit) { struct region_cache *newline_cache; ptrdiff_t end_byte = -1; @@ -656,7 +656,7 @@ else { direction = -1; - if (!end) + if (!end) end = BEGV, end_byte = BEGV_BYTE; } if (end_byte == -1) @@ -684,7 +684,7 @@ /* If we're looking for a newline, consult the newline cache to see where we can avoid some scanning. */ - if (target == '\n' && newline_cache) + if (newline_cache) { ptrdiff_t next_change; immediate_quit = 0; @@ -723,32 +723,32 @@ while (cursor < ceiling_addr) { - unsigned char *scan_start = cursor; - /* The dumb loop. */ - while (*cursor != target && ++cursor < ceiling_addr) - ; + unsigned char *nl = memchr (cursor, '\n', ceiling_addr - cursor); /* If we're looking for newlines, cache the fact that the region from start to cursor is free of them. 
*/ - if (target == '\n' && newline_cache) - know_region_cache (current_buffer, newline_cache, - BYTE_TO_CHAR (start_byte + scan_start - base), - BYTE_TO_CHAR (start_byte + cursor - base)); - - /* Did we find the target character? */ - if (cursor < ceiling_addr) - { - if (--count == 0) - { - immediate_quit = 0; - return BYTE_TO_CHAR (start_byte + cursor - base + 1); - } - cursor++; - } + if (newline_cache) + { + unsigned char *low = cursor; + unsigned char *lim = nl ? nl : ceiling_addr; + know_region_cache (current_buffer, newline_cache, + BYTE_TO_CHAR (low - base + start_byte), + BYTE_TO_CHAR (lim - base + start_byte)); + } + + if (! nl) + break; + + if (--count == 0) + { + immediate_quit = 0; + return BYTE_TO_CHAR (nl + 1 - base + start_byte); + } + cursor = nl + 1; } - start = BYTE_TO_CHAR (start_byte + cursor - base); + start = BYTE_TO_CHAR (ceiling_addr - base + start_byte); } } else @@ -760,7 +760,7 @@ ptrdiff_t tem; /* Consult the newline cache, if appropriate. */ - if (target == '\n' && newline_cache) + if (newline_cache) { ptrdiff_t next_change; immediate_quit = 0; @@ -794,31 +794,32 @@ while (cursor >= ceiling_addr) { - unsigned char *scan_start = cursor; - - while (*cursor != target && --cursor >= ceiling_addr) - ; + unsigned char *nl = memrchr (ceiling_addr, '\n', + cursor + 1 - ceiling_addr); /* If we're looking for newlines, cache the fact that the region from after the cursor to start is free of them. */ - if (target == '\n' && newline_cache) - know_region_cache (current_buffer, newline_cache, - BYTE_TO_CHAR (start_byte + cursor - base), - BYTE_TO_CHAR (start_byte + scan_start - base)); - - /* Did we find the target character? */ - if (cursor >= ceiling_addr) - { - if (++count >= 0) - { - immediate_quit = 0; - return BYTE_TO_CHAR (start_byte + cursor - base); - } - cursor--; - } + if (newline_cache) + { + unsigned char *low = nl ? 
nl : ceiling_addr - 1; + unsigned char *lim = cursor; + know_region_cache (current_buffer, newline_cache, + BYTE_TO_CHAR (low - base + start_byte), + BYTE_TO_CHAR (lim - base + start_byte)); + } + + if (! nl) + break; + + if (++count >= 0) + { + immediate_quit = 0; + return BYTE_TO_CHAR (nl - base + start_byte); + } + cursor = nl - 1; } - start = BYTE_TO_CHAR (start_byte + cursor - base); + start = BYTE_TO_CHAR (ceiling_addr - 1 - base + start_byte); } } @@ -828,8 +829,7 @@ return start; } \f -/* Search for COUNT instances of a line boundary, which means either a - newline or (if selective display enabled) a carriage return. +/* Search for COUNT instances of a line boundary. Start at START. If COUNT is negative, search backwards. We report the resulting position by calling TEMP_SET_PT_BOTH. @@ -860,9 +860,6 @@ bool old_immediate_quit = immediate_quit; - /* The code that follows is like scan_buffer - but checks for either newline or carriage return. */ - if (allow_quit) immediate_quit++; @@ -874,29 +871,25 @@ ceiling = min (limit_byte - 1, ceiling); ceiling_addr = BYTE_POS_ADDR (ceiling) + 1; base = (cursor = BYTE_POS_ADDR (start_byte)); - while (1) + + do { - while (*cursor != '\n' && ++cursor != ceiling_addr) - ; - - if (cursor != ceiling_addr) + unsigned char *nl = memchr (cursor, '\n', ceiling_addr - cursor); + if (! 
nl) + break; + if (--count == 0) { - if (--count == 0) - { - immediate_quit = old_immediate_quit; - start_byte = start_byte + cursor - base + 1; - start = BYTE_TO_CHAR (start_byte); - TEMP_SET_PT_BOTH (start, start_byte); - return 0; - } - else - if (++cursor == ceiling_addr) - break; + immediate_quit = old_immediate_quit; + start_byte += nl - base + 1; + start = BYTE_TO_CHAR (start_byte); + TEMP_SET_PT_BOTH (start, start_byte); + return 0; } - else - break; + cursor = nl + 1; } - start_byte += cursor - base; + while (cursor < ceiling_addr); + + start_byte += ceiling_addr - base; } } else @@ -905,31 +898,28 @@ { ceiling = BUFFER_FLOOR_OF (start_byte - 1); ceiling = max (limit_byte, ceiling); - ceiling_addr = BYTE_POS_ADDR (ceiling) - 1; + ceiling_addr = BYTE_POS_ADDR (ceiling); base = (cursor = BYTE_POS_ADDR (start_byte - 1) + 1); while (1) { - while (--cursor != ceiling_addr && *cursor != '\n') - ; + unsigned char *nl = memrchr (ceiling_addr, '\n', + cursor - ceiling_addr); + if (! nl) + break; - if (cursor != ceiling_addr) + if (++count == 0) { - if (++count == 0) - { - immediate_quit = old_immediate_quit; - /* Return the position AFTER the match we found. */ - start_byte = start_byte + cursor - base + 1; - start = BYTE_TO_CHAR (start_byte); - TEMP_SET_PT_BOTH (start, start_byte); - return 0; - } + immediate_quit = old_immediate_quit; + /* Return the position AFTER the match we found. */ + start_byte += nl - base + 1; + start = BYTE_TO_CHAR (start_byte); + TEMP_SET_PT_BOTH (start, start_byte); + return 0; } - else - break; + + cursor = nl; } - /* Here we add 1 to compensate for the last decrement - of CURSOR, which took it past the valid range. 
*/ - start_byte += cursor - base + 1; + start_byte += ceiling_addr - base; } } @@ -942,7 +932,7 @@ ptrdiff_t find_next_newline_no_quit (ptrdiff_t from, ptrdiff_t cnt) { - return scan_buffer ('\n', from, 0, cnt, (ptrdiff_t *) 0, 0); + return find_newline (from, 0, cnt, (ptrdiff_t *) 0, 0); } /* Like find_next_newline, but returns position before the newline, @@ -953,7 +943,7 @@ find_before_next_newline (ptrdiff_t from, ptrdiff_t to, ptrdiff_t cnt) { ptrdiff_t shortage; - ptrdiff_t pos = scan_buffer ('\n', from, to, cnt, &shortage, 1); + ptrdiff_t pos = find_newline (from, to, cnt, &shortage, 1); if (shortage == 0) pos--; === modified file 'src/xdisp.c' --- src/xdisp.c 2013-02-10 16:49:09 +0000 +++ src/xdisp.c 2013-02-11 01:28:13 +0000 @@ -1392,21 +1392,9 @@ Lisp_Object cpos = make_number (charpos); Lisp_Object spec = Fget_char_property (cpos, Qdisplay, Qnil); Lisp_Object string = string_from_display_spec (spec); - int newline_in_string = 0; - - if (STRINGP (string)) - { - const char *s = SSDATA (string); - const char *e = s + SBYTES (string); - while (s < e) - { - if (*s++ == '\n') - { - newline_in_string = 1; - break; - } - } - } + bool newline_in_string + = (STRINGP (string) + && memchr (SDATA (string), '\n', SBYTES (string))); /* The tricky code below is needed because there's a discrepancy between move_it_to and how we set cursor when the display line ends in a newline from a @@ -14753,7 +14741,7 @@ SET_TEXT_POS (start_pos, ZV, ZV_BYTE); /* Find the start of the continued line. This should be fast - because scan_buffer is fast (newline cache). */ + because find_newline is fast (newline cache). */ row = w->desired_matrix->rows + (WINDOW_WANTS_HEADER_LINE_P (w) ? 
1 : 0); init_iterator (&it, w, CHARPOS (start_pos), BYTEPOS (start_pos), row, DEFAULT_FACE_ID); @@ -21620,31 +21608,36 @@ ceiling = min (limit_byte - 1, ceiling); ceiling_addr = BYTE_POS_ADDR (ceiling) + 1; base = (cursor = BYTE_POS_ADDR (start_byte)); - while (1) + + do { if (selective_display) - while (*cursor != '\n' && *cursor != 015 && ++cursor != ceiling_addr) - ; - else - while (*cursor != '\n' && ++cursor != ceiling_addr) - ; - - if (cursor != ceiling_addr) - { - if (--count == 0) - { - start_byte += cursor - base + 1; - *byte_pos_ptr = start_byte; - return orig_count; - } - else - if (++cursor == ceiling_addr) - break; - } - else - break; + { + while (*cursor != '\n' && *cursor != 015 + && ++cursor != ceiling_addr) + continue; + if (cursor == ceiling_addr) + break; + } + else + { + cursor = memchr (cursor, '\n', ceiling_addr - cursor); + if (! cursor) + break; + } + + cursor++; + + if (--count == 0) + { + start_byte += cursor - base; + *byte_pos_ptr = start_byte; + return orig_count; + } } - start_byte += cursor - base; + while (cursor < ceiling_addr); + + start_byte += ceiling_addr - base; } } else @@ -21653,35 +21646,35 @@ { ceiling = BUFFER_FLOOR_OF (start_byte - 1); ceiling = max (limit_byte, ceiling); - ceiling_addr = BYTE_POS_ADDR (ceiling) - 1; + ceiling_addr = BYTE_POS_ADDR (ceiling); base = (cursor = BYTE_POS_ADDR (start_byte - 1) + 1); while (1) { if (selective_display) - while (--cursor != ceiling_addr - && *cursor != '\n' && *cursor != 015) - ; + { + while (--cursor >= ceiling_addr + && *cursor != '\n' && *cursor != 015) + continue; + if (cursor < ceiling_addr) + break; + } else - while (--cursor != ceiling_addr && *cursor != '\n') - ; + { + cursor = memrchr (ceiling_addr, '\n', cursor - ceiling_addr); + if (! 
cursor) + break; + } - if (cursor != ceiling_addr) + if (++count == 0) { - if (++count == 0) - { - start_byte += cursor - base + 1; - *byte_pos_ptr = start_byte; - /* When scanning backwards, we should - not count the newline posterior to which we stop. */ - return - orig_count - 1; - } + start_byte += cursor - base + 1; + *byte_pos_ptr = start_byte; + /* When scanning backwards, we should + not count the newline posterior to which we stop. */ + return - orig_count - 1; } - else - break; } - /* Here we add 1 to compensate for the last decrement - of CURSOR, which took it past the valid range. */ - start_byte += cursor - base + 1; + start_byte += ceiling_addr - base; } } ^ permalink raw reply [flat|nested] 42+ messages in thread
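The core idea of the find_newline change in the patch above can be illustrated on a flat byte array. This is a minimal sketch, not the Emacs code: it assumes there is no buffer gap, no newline cache, and no immediate_quit handling, and the name sketch_find_newline_forward is made up for illustration. It keeps only the COUNT/SHORTAGE convention described in the function's comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Emacs.  Search forward in the byte
   range [START, END) of TEXT for COUNT newlines (COUNT > 0).  Return
   the offset just after the COUNT-th newline.  If fewer are found,
   store the number still missing in *SHORTAGE and return END.  */
static size_t
sketch_find_newline_forward (const char *text, size_t start, size_t end,
                             size_t count, size_t *shortage)
{
  assert (start <= end && count > 0);
  const char *cursor = text + start;
  const char *limit = text + end;

  while (cursor < limit)
    {
      /* One library call replaces the byte-by-byte "dumb loop".  */
      const char *nl = memchr (cursor, '\n', (size_t) (limit - cursor));
      if (!nl)
        break;
      if (--count == 0)
        {
          if (shortage)
            *shortage = 0;
          return (size_t) (nl + 1 - text);
        }
      /* Resume scanning right after the newline just found.  */
      cursor = nl + 1;
    }

  if (shortage)
    *shortage = count;
  return end;
}
```

The real function additionally walks the buffer in chunks bounded by the gap and records newline-free regions in the cache; those parts are deliberately left out here.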
* Re: Long lines and bidi
  2013-02-09  9:05 ` Paul Eggert
  2013-02-09  9:33   ` Eli Zaretskii
@ 2013-02-09 10:01   ` Eli Zaretskii
  2013-02-10 16:57     ` Eli Zaretskii
  1 sibling, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2013-02-09 10:01 UTC
To: Paul Eggert; +Cc: dmantipov, emacs-devel

> Date: Sat, 09 Feb 2013 01:05:01 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: dmantipov@yandex.ru, emacs-devel@gnu.org
>
> On 02/09/2013 12:46 AM, Eli Zaretskii wrote:
>
> > 25% faster is still terribly slow for redisplay.
>
> Yes, as I said, it doesn't solve the performance problem.
> Still, it doesn't complicate the code, and it significantly
> improves speed in code likely to be executed often, so it
> seems worth doing in its own right.

I suspect that the use case that makes scan_buffer so high on the
profile is very much skewed.  My crystal ball says that the file in
question was one very long paragraph, or at least had many, many
_thousands_ of lines between empty lines that delimit paragraphs.
scan_buffer is high on the profile because the bidi.c code tries to
find the beginning of a paragraph, which determines the base direction
of the paragraph, which in turn determines how the text should be
reordered for display.

By contrast, most real-life files have much less text between empty
lines, so scan_buffer will not be at any prominent place in the
profile.  But redisplay of a buffer with very long lines will still be
awfully slow, even if there's an empty line between every 2 long
lines, although scan_buffer will no longer be a factor.

OTOH, if you create a file with a single long paragraph, but whose
lines have "normal" width, like 100 characters, redisplay will perform
adequately, even though scan_buffer will be heavily used.  (It would
be interesting to see a profile for that, btw.)

IOW, the solution in bidi.c for extremely long paragraphs is optimized
for the 99% of use cases, where lines are not too long, i.e. for those
cases where the old unidirectional display engine gave reasonable
performance.

Dmitry's use case, OTOH, is skewed on several counts:

  . it uses extremely long lines
  . it uses too many neutral/weak characters
  . it uses extremely long paragraphs

This simultaneously hits several unrelated weaknesses of the current
display engine, with the result that the profile is a combination of
at least 3 different reasons for slow-down, which makes it very hard
to analyze the results and look for solutions.

That is why I think we should attack this problem one reason at a
time.  The most important reason is the first one: long lines cause
the display code to traverse too much of buffer text.  This is why you
see x_produce_glyphs so high on the profile in the unidirectional
case: it examines too many characters, much more than what will be
actually displayed on the screen.  Solve this problem, and the 2nd one
will simply disappear without a trace, because it is at least linear
in the number of scanned characters.  If the 3rd problem is still a
factor, after the 1st one is gone, we can tune the current
optimization at that time.
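To make the paragraph-start argument above concrete, here is a rough sketch of why the backward search costs time proportional to the length of the paragraph. It is not the bidi.c implementation: it assumes a flat text array and treats an empty line as the only paragraph separator (the real code consults the paragraph regexps), and sketch_paragraph_start is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Emacs.  Return the offset where the
   paragraph containing POS begins, treating an empty line as the only
   paragraph separator.  *LINES_SCANNED receives the number of line
   beginnings that had to be visited on the way back.  */
static size_t
sketch_paragraph_start (const char *text, size_t pos, size_t *lines_scanned)
{
  assert (lines_scanned != NULL);
  size_t line_start = pos;
  *lines_scanned = 0;

  for (;;)
    {
      /* Move back to the beginning of the current line.  */
      while (line_start > 0 && text[line_start - 1] != '\n')
        line_start--;
      ++*lines_scanned;

      /* Beginning of the text: the paragraph starts here.  */
      if (line_start == 0)
        return 0;
      /* The previous line is empty: the paragraph starts here.  */
      if (line_start == 1 || text[line_start - 2] == '\n')
        return line_start;

      /* Otherwise step over the newline ending the previous line
         and keep going.  */
      line_start--;
    }
}
```

With thousands of lines between empty lines, the loop visits every one of them for each query, which is consistent with a newline scanner dominating the profile for such input.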
* Re: Long lines and bidi
  2013-02-09 10:01 ` Eli Zaretskii
@ 2013-02-10 16:57   ` Eli Zaretskii
  2013-02-11  5:43     ` Dmitry Antipov
  2013-02-11 17:17     ` Eli Zaretskii
  0 siblings, 2 replies; 42+ messages in thread
From: Eli Zaretskii @ 2013-02-10 16:57 UTC
To: eggert, dmantipov; +Cc: emacs-devel

> Date: Sat, 09 Feb 2013 12:01:46 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: dmantipov@yandex.ru, emacs-devel@gnu.org
>
> That is why I think we should attack this problem one reason at a
> time.  The most important reason is the first one: long lines cause
> the display code to traverse too much of buffer text.  This is why
> you see x_produce_glyphs so high on the profile in the unidirectional
> case: it examines too many characters, much more than what will be
> actually displayed on the screen.

I just committed to the trunk revision 111724 with a couple of simple
changes which speed up some redisplay operations, such as M-v or M->,
by a factor of 3 in a buffer with very long lines.  Please try it.

This is by no means the complete solution, even for the situations
where it provides a 3-fold speed-up: we need the speed-up to be much
more aggressive.  But it does demonstrate how simple changes can have
a significant effect in this area.

Stay tuned.
* Re: Long lines and bidi
  2013-02-10 16:57 ` Eli Zaretskii
@ 2013-02-11  5:43   ` Dmitry Antipov
  2013-02-11  7:54     ` Dmitry Antipov
  2013-02-11 16:42     ` Eli Zaretskii
  2013-02-11 17:17   ` Eli Zaretskii
  1 sibling, 2 replies; 42+ messages in thread
From: Dmitry Antipov @ 2013-02-11 5:43 UTC
To: emacs-devel; +Cc: Eli Zaretskii, Paul Eggert

Yet another interesting profile (generated by scroll-both
micro-benchmark with r111730) is shown below.

Input is 4K lines, each line is ~27K bytes, Imla'ei (modern Arabic)
script.  IIUC this R2L text with long lines should push bidi really
hard, but ... bidi core routines (by itself) are almost irrelevant in
the profile:

  39.96%  emacs  emacs                [.] scan_buffer
  28.72%  emacs  emacs                [.] buf_charpos_to_bytepos
  21.82%  emacs  emacs                [.] buf_bytepos_to_charpos
   0.59%  emacs  emacs                [.] re_match_2_internal
   0.51%  emacs  emacs                [.] sub_char_table_ref
   0.42%  emacs  emacs                [.] mark_object
   0.23%  emacs  emacs                [.] composition_gstring_width
   0.19%  emacs  libc-2.16.so         [.] __memcpy_ssse3_back
   0.18%  emacs  emacs                [.] x_produce_glyphs
   0.17%  emacs  emacs                [.] move_it_in_display_line_to
   0.17%  emacs  emacs                [.] hash_lookup
   0.17%  emacs  emacs                [.] Fgarbage_collect
   0.17%  emacs  emacs                [.] lface_hash
   0.16%  emacs  emacs                [.] decode_coding_utf_8
   0.16%  emacs  emacs                [.] face_for_font
   0.16%  emacs  emacs                [.] composition_gstring_p
   0.15%  emacs  emacs                [.] compile_pattern
   0.15%  emacs  emacs                [.] get_next_display_element
   0.14%  emacs  emacs                [.] bidi_level_of_next_char
   0.12%  emacs  emacs                [.] font_range
   0.12%  emacs  emacs                [.] bidi_fetch_char
   0.12%  emacs  emacs                [.] internal_equal
   0.11%  emacs  emacs                [.] autocmp_chars
   0.11%  emacs  emacs                [.] char_table_ref
   0.11%  emacs  libgtk-3.so.0.600.4  [.] 0x0000000000115bf0
   0.10%  emacs  emacs                [.] next_element_from_buffer
   0.10%  emacs  emacs                [.] composition_update_it
   0.10%  emacs  emacs                [.] boyer_moore

Dmitry
* Re: Long lines and bidi
  2013-02-11  5:43 ` Dmitry Antipov
@ 2013-02-11  7:54   ` Dmitry Antipov
  2013-02-11 16:47     ` Eli Zaretskii
  2013-02-11 16:42   ` Eli Zaretskii
  1 sibling, 1 reply; 42+ messages in thread
From: Dmitry Antipov @ 2013-02-11 7:54 UTC
To: emacs-devel; +Cc: Eli Zaretskii, Paul Eggert

On 02/11/2013 09:43 AM, Dmitry Antipov wrote:

> Yet another interesting profile (generated by scroll-both
> micro-benchmark with r111730) is shown below.
>
> Input is 4K lines, each line is ~27K bytes, Imla'ei (modern Arabic)
> script.  IIUC this R2L text with long lines should push bidi really
> hard, but ... bidi core routines (by itself) are almost irrelevant
> in the profile:
>
>   39.96%  emacs  emacs  [.] scan_buffer
>   28.72%  emacs  emacs  [.] buf_charpos_to_bytepos
>   21.82%  emacs  emacs  [.] buf_bytepos_to_charpos
>    0.59%  emacs  emacs  [.] re_match_2_internal

... and with Paul's mem(r)chr patch it is:

  43.38%  emacs  emacs         [.] buf_charpos_to_bytepos
  28.42%  emacs  emacs         [.] buf_bytepos_to_charpos
  13.10%  emacs  libc-2.16.so  [.] memrchr
   0.85%  emacs  emacs         [.] re_match_2_internal
  ...

So I should vote YES.  This is a simple optimization which really
makes sense, and I suspect that the less usual the input is, the more
sense it makes.

Dmitry
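The memrchr entry in the second profile comes from the backward branch of the search. Here is a minimal sketch of that branch on a flat byte array, under the same simplifying assumptions as any toy version: no buffer gap, no newline cache, no quit handling. Both function names are made up, and sketch_memrchr is only a portable stand-in, since memrchr is a GNU extension (which is why the patch imports the gnulib module).

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for memrchr: find the last occurrence of C in the
   first N bytes of S, or return NULL.  */
static const char *
sketch_memrchr (const char *s, int c, size_t n)
{
  while (n > 0)
    if ((unsigned char) s[--n] == (unsigned char) c)
      return s + n;
  return NULL;
}

/* Hypothetical helper, not part of Emacs.  Search backward through the
   byte range [LOW, START) of TEXT for COUNT newlines (COUNT > 0).
   Return the offset of the COUNT-th newline itself.  If fewer are
   found, store the number still missing in *SHORTAGE and return LOW.  */
static size_t
sketch_find_newline_backward (const char *text, size_t low, size_t start,
                              size_t count, size_t *shortage)
{
  assert (low <= start && count > 0);
  const char *lowest = text + low;
  const char *cursor = text + start;   /* one past the last byte to scan */

  while (cursor > lowest)
    {
      const char *nl = sketch_memrchr (lowest, '\n',
                                       (size_t) (cursor - lowest));
      if (!nl)
        break;
      if (--count == 0)
        {
          if (shortage)
            *shortage = 0;
          return (size_t) (nl - text);
        }
      /* The next scan covers only bytes strictly before this newline.  */
      cursor = nl;
    }

  if (shortage)
    *shortage = count;
  return low;
}
```

On a glibc system the stand-in could be replaced by the library memrchr after defining _GNU_SOURCE; the loop structure stays the same.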
* Re: Long lines and bidi
  2013-02-11  7:54 ` Dmitry Antipov
@ 2013-02-11 16:47   ` Eli Zaretskii
  2013-02-11 23:55     ` Paul Eggert
  0 siblings, 1 reply; 42+ messages in thread
From: Eli Zaretskii @ 2013-02-11 16:47 UTC
To: Dmitry Antipov; +Cc: eggert, emacs-devel

> Date: Mon, 11 Feb 2013 11:54:57 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> CC: Eli Zaretskii <eliz@gnu.org>, Paul Eggert <eggert@cs.ucla.edu>
>
> On 02/11/2013 09:43 AM, Dmitry Antipov wrote:
>
> > Yet another interesting profile (generated by scroll-both
> > micro-benchmark with r111730) is shown below.
> >
> > Input is 4K lines, each line is ~27K bytes, Imla'ei (modern Arabic)
> > script.  IIUC this R2L text with long lines should push bidi really
> > hard, but ... bidi core routines (by itself) are almost irrelevant
> > in the profile:
> >
> >   39.96%  emacs  emacs  [.] scan_buffer
> >   28.72%  emacs  emacs  [.] buf_charpos_to_bytepos
> >   21.82%  emacs  emacs  [.] buf_bytepos_to_charpos
> >    0.59%  emacs  emacs  [.] re_match_2_internal
>
> ... and with Paul's mem(r)chr patch it is:
>
>   43.38%  emacs  emacs         [.] buf_charpos_to_bytepos
>   28.42%  emacs  emacs         [.] buf_bytepos_to_charpos
>   13.10%  emacs  libc-2.16.so  [.] memrchr
>    0.85%  emacs  emacs         [.] re_match_2_internal

Without absolute times, it's hard to judge the improvement.

> So I should vote YES.  This is a simple optimization which really
> makes sense, and I suspect that the less usual the input is, the more
> sense it makes.

I'm not opposed to using memchr where possible.  I'm just saying that
we should NOT regard this as any kind of solution for the long-lines
problem with the current display engine.  To fix that problem, we need
to speed up redisplay by one or two orders of magnitude (it currently
takes several hundreds of milliseconds to several seconds; it should
take a few milliseconds, 10 msec max).  That is a far cry from the 25%
improvement we will get with memchr.
* Re: Long lines and bidi
  2013-02-11 16:47 ` Eli Zaretskii
@ 2013-02-11 23:55   ` Paul Eggert
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Eggert @ 2013-02-11 23:55 UTC
To: Eli Zaretskii; +Cc: Dmitry Antipov, emacs-devel

On 02/11/13 08:47, Eli Zaretskii wrote:

> we should NOT regard this as any kind of solution for the long-lines
> problem with the current display engine.

Yes, the memchr/memrchr improvement is a relatively minor performance
improvement; I suggested it primarily because it's easy to do and
doesn't complicate Emacs proper.  I pushed it into the trunk as bzr
111741.

By the way, in reviewing this area it appears to me that there must be
a bug in the code that caches newline locations when searching
backwards.  The above performance improvement doesn't affect this bug.
I'll try to follow up on this soon.
* Re: Long lines and bidi 2013-02-11 5:43 ` Dmitry Antipov 2013-02-11 7:54 ` Dmitry Antipov @ 2013-02-11 16:42 ` Eli Zaretskii 2013-02-11 17:53 ` Dmitry Antipov 1 sibling, 1 reply; 42+ messages in thread From: Eli Zaretskii @ 2013-02-11 16:42 UTC (permalink / raw) To: Dmitry Antipov; +Cc: eggert, emacs-devel > Date: Mon, 11 Feb 2013 09:43:17 +0400 > From: Dmitry Antipov <dmantipov@yandex.ru> > CC: Eli Zaretskii <eliz@gnu.org>, Paul Eggert <eggert@cs.ucla.edu> > > Yet another interesting profile (generated by scroll-both micro-benchmark with > r111730) is shown below. > > Input is 4K lines, each line is ~27K bytes, Imla'ei (modern Arabic) script. Can you publish the file, or the URL where you downloaded it from? > IIUC this R2L text with long lines should push bidi really hard, > but... bidi core routines (by itself) are almost irrelevant in the > profile: Actually, that's expected, see below. > 39.96% emacs emacs [.] scan_buffer > 28.72% emacs emacs [.] buf_charpos_to_bytepos > 21.82% emacs emacs [.] buf_bytepos_to_charpos > 0.59% emacs emacs [.] re_match_2_internal > 0.51% emacs emacs [.] sub_char_table_ref > 0.42% emacs emacs [.] mark_object > 0.23% emacs emacs [.] composition_gstring_width > 0.19% emacs libc-2.16.so [.] __memcpy_ssse3_back > 0.18% emacs emacs [.] x_produce_glyphs > 0.17% emacs emacs [.] move_it_in_display_line_to > 0.17% emacs emacs [.] hash_lookup > 0.17% emacs emacs [.] Fgarbage_collect > 0.17% emacs emacs [.] lface_hash > 0.16% emacs emacs [.] decode_coding_utf_8 > 0.16% emacs emacs [.] face_for_font > 0.16% emacs emacs [.] composition_gstring_p > 0.15% emacs emacs [.] compile_pattern > 0.15% emacs emacs [.] get_next_display_element > 0.14% emacs emacs [.] bidi_level_of_next_char > 0.12% emacs emacs [.] font_range > 0.12% emacs emacs [.] bidi_fetch_char > 0.12% emacs emacs [.] internal_equal > 0.11% emacs emacs [.] autocmp_chars > 0.11% emacs emacs [.] char_table_ref > 0.11% emacs libgtk-3.so.0.600.4 [.] 
0x0000000000115bf0 > 0.10% emacs emacs [.] next_element_from_buffer > 0.10% emacs emacs [.] composition_update_it > 0.10% emacs emacs [.] boyer_moore The Arabic script is a heavy user of character compositions: they are important for correct shaping of the glyphs, without which any speaker of Arabic will turn away in disgust. The fact that you see functions like composition_update_it, composition_gstring_p, composition_gstring_width, and sub_char_table_ref all hint towards this. Character compositions work by scanning the vicinity of a composable character using regular expression matching in Lisp. That is why you see re_match_2_internal relatively high in the profile. Handling these compositions can obscure any bidi reordering. To disable this factor, turn off auto-composition-mode. More importantly, you cannot easily "push bidi really hard", not with a file that consists of predominantly RTL characters. That's because such a file is as easy to display as a pure LTR text: the characters are delivered for display entirely in their logical order in the buffer, and only laid out starting at the right margin of the window instead of at the left margin. To exercise bidi.c, you need heavily mixed RTL and LTR text, with digits, punctuation, and lots of embeddings and directional overrides (using the LRE, RLE, RLO, and LRO control characters), which push and pop the reordering stack. Only then the reordering of characters will become non-trivial, and you _might_ see some bidi functions as hot spots. I say "might" because bidi.c uses a dynamic cache which allows it to fetch and analyze each character only once, even if reordering jumps here and there like a young goat. Thus, the only overhead of reordering is the logic that decides where in the cache is the next character to deliver for display; the cache is accessed directly (it is implemented as a linear array). 
There could be rare pathological situations where bidi.c needs to examine lots (and I'm talking tens or hundreds of thousands) of characters for some simple redisplay operation. A few of these were discovered and taken care of during late stages of v24.1 development, but maybe there are some more. These typically show up as heavy usage of bidi_fetch_char or its subroutines, or of bidi_find_paragraph_start and its subroutines. I haven't seen such problems since last July. ^ permalink raw reply [flat|nested] 42+ messages in thread
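As a rough illustration of the kind of input described above (mixed RTL and LTR text with digits, punctuation, and nested embeddings), here is a small sketch that builds one line of such text as UTF-8. The function name `make_mixed_bidi_line` and its layout of the line are hypothetical; only the code points (LRE U+202A, RLE U+202B, PDF U+202C, HEBREW LETTER ALEF U+05D0) come from Unicode. It is a test-data generator under these assumptions, not anything from the Emacs sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* UTF-8 encodings of the characters used.  */
static const char LRE[]  = "\xE2\x80\xAA";  /* U+202A LEFT-TO-RIGHT EMBEDDING */
static const char RLE[]  = "\xE2\x80\xAB";  /* U+202B RIGHT-TO-LEFT EMBEDDING */
static const char PDF[]  = "\xE2\x80\xAC";  /* U+202C POP DIRECTIONAL FORMATTING */
static const char ALEF[] = "\xD7\x90";      /* U+05D0 HEBREW LETTER ALEF */

/* Append SRC to OUT at offset *USED if it fits (keeping OUT
   NUL-terminated).  Return 1 on success, 0 if there is no room.  */
static int put(char *out, size_t cap, size_t *used, const char *src)
{
    size_t n = strlen(src);
    if (*used + n >= cap)
        return 0;
    memcpy(out + *used, src, n);
    *used += n;
    out[*used] = '\0';
    return 1;
}

/* Write DEPTH nested embeddings, alternating RLE and LRE.  Each level
   gets one letter of the matching direction plus a digit, a comma and a
   space; the line ends with DEPTH PDF characters and a newline.
   Return the number of bytes written, or 0 if OUT is too small.  */
size_t make_mixed_bidi_line(char *out, size_t cap, unsigned depth)
{
    size_t used = 0;
    if (cap == 0)
        return 0;
    out[0] = '\0';
    for (unsigned i = 0; i < depth; i++) {
        int rtl = (i % 2 == 0);
        if (!put(out, cap, &used, rtl ? RLE : LRE)
            || !put(out, cap, &used, rtl ? ALEF : "a")
            || !put(out, cap, &used, "1, "))
            return 0;
    }
    for (unsigned i = 0; i < depth; i++)
        if (!put(out, cap, &used, PDF))
            return 0;
    if (!put(out, cap, &used, "\n"))
        return 0;
    return used;
}
```

Repeating such lines in a file should make the reordering stack push and pop frequently; whether that actually shows bidi functions as hot spots would still have to be measured, given the caching described above.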
* Re: Long lines and bidi 2013-02-11 16:42 ` Eli Zaretskii @ 2013-02-11 17:53 ` Dmitry Antipov 2013-02-11 18:10 ` Eli Zaretskii 0 siblings, 1 reply; 42+ messages in thread From: Dmitry Antipov @ 2013-02-11 17:53 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel On 02/11/2013 08:42 PM, Eli Zaretskii wrote: > Can you publish the file, or the URL where you downloaded it from? Actually it was artificially generated from Quran text available at http://tanzil.net/download. I can't publish it because the license doesn't allow any modifications, so I assume that any derivatives are also illegal; but I also assume that we still can use them just for the testing purposes, e.g. without any redistribution. Dmitry ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi 2013-02-11 17:53 ` Dmitry Antipov @ 2013-02-11 18:10 ` Eli Zaretskii 2013-02-11 18:21 ` Dmitry Antipov 0 siblings, 1 reply; 42+ messages in thread From: Eli Zaretskii @ 2013-02-11 18:10 UTC (permalink / raw) To: Dmitry Antipov; +Cc: eggert, emacs-devel > Date: Mon, 11 Feb 2013 21:53:32 +0400 > From: Dmitry Antipov <dmantipov@yandex.ru> > CC: emacs-devel@gnu.org, eggert@cs.ucla.edu > > On 02/11/2013 08:42 PM, Eli Zaretskii wrote: > > > Can you publish the file, or the URL where you downloaded it from? > > Actually it was artificially generated from Quran text available > at http://tanzil.net/download. Can you tell how you generated it? ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi 2013-02-11 18:10 ` Eli Zaretskii @ 2013-02-11 18:21 ` Dmitry Antipov 0 siblings, 0 replies; 42+ messages in thread From: Dmitry Antipov @ 2013-02-11 18:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel
On 02/11/2013 10:10 PM, Eli Zaretskii wrote:
> Can you tell how you generated it?

# Get the first 100 lines and join them into a single line
head -n 100 < quran-simple.txt | tr '\n' ' ' | tr '\r' ' ' > 0.txt
# Add newline
echo -ne "\n" >> 0.txt
# Copy it 4096 times (six rounds of 4x; the result ends up in 0.txt)
cat 0.txt 0.txt 0.txt 0.txt > 1.txt
cat 1.txt 1.txt 1.txt 1.txt > 0.txt
cat 0.txt 0.txt 0.txt 0.txt > 1.txt
cat 1.txt 1.txt 1.txt 1.txt > 0.txt
cat 0.txt 0.txt 0.txt 0.txt > 1.txt
cat 1.txt 1.txt 1.txt 1.txt > 0.txt

I realize that this is pretty artificial and doesn't reflect the real structure of any Arabic text. This is definitely a trick in an attempt to exploit some corner cases here and there.

Dmitry
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi 2013-02-10 16:57 ` Eli Zaretskii 2013-02-11 5:43 ` Dmitry Antipov @ 2013-02-11 17:17 ` Eli Zaretskii 2013-02-11 17:55 ` Drew Adams 1 sibling, 1 reply; 42+ messages in thread From: Eli Zaretskii @ 2013-02-11 17:17 UTC (permalink / raw) To: emacs-devel, Stefan Monnier; +Cc: eggert, dmantipov > Date: Sun, 10 Feb 2013 18:57:00 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: emacs-devel@gnu.org > > I just committed to the trunk revision 111724 with a couple of simple > changes which speed up by a factor of 3 some redisplay operations, > such as M-v or M->, in a buffer with very long lines. Please try it. Further measurements indicate that the bottleneck is in searches for previous or next newline, or N-th previous/next newline. These searches are at the core of functions that compute pixel dimensions of buffer text, when the display engine needs to figure out where to start displaying the window after scrolling, or where to put point after C-p or C-n. As a typical example, a C-n in a buffer with truncate-lines set non-nil requires us to find the next physical line in the buffer, i.e. the next newline. We currently do that by searching forward in the buffer, one byte at a time, until we find a newline. If lines are very long, this is expensive. When truncate-lines is nil, this problem doesn't exist for C-n, but a similar problem exists for C-p: we need to find the _previous_ newline (which is many characters back when lines are long), and then scan forward until we find a character that is displayed one screen line above the one we were at when the user typed C-p. Revision 111724 makes sure we don't go back more than one physical line, unless really needed, but given the current design of the code, one full line is the absolute minimum. Turning on the newline cache speeds up these searches for a newline by a factor of 2, which is not too spectacular, but not negligible. Any objections to turning on that caching by default in all buffers? 
Beyond that, either we can find a much more efficient way of finding the next or previous newline, or we will need a complete redesign and re-implementation of the move_it_* family of functions, which is used a lot by the display engine. ^ permalink raw reply [flat|nested] 42+ messages in thread
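To make the idea of caching newline searches concrete, here is a deliberately tiny sketch. It is not Emacs's newline cache: the struct `nl_free_span`, the function `next_newline_cached`, and the single-interval design are all hypothetical simplifications. The sketch remembers one interval of text known to contain no newline, so a repeated search that starts inside that interval can jump straight to its end. A real cache would track many intervals and would have to be invalidated when the buffer text changes, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy cache: one interval [beg, end) known to hold no '\n'.
   An empty cache is { 0, 0 }.  */
struct nl_free_span { size_t beg, end; };

/* Return the offset of the first '\n' at or after START, or LEN if there
   is none.  *SCANNED accumulates the number of bytes actually examined,
   so callers can see how much work the cache saved.  */
size_t next_newline_cached(const char *buf, size_t len, size_t start,
                           struct nl_free_span *span, size_t *scanned)
{
    assert(start <= len && span->end <= len);
    size_t pos = start;

    /* Skip text already known to be free of newlines.  */
    if (span->beg <= pos && pos < span->end)
        pos = span->end;

    const char *hit = memchr(buf + pos, '\n', len - pos);
    size_t found = hit ? (size_t)(hit - buf) : len;
    *scanned += found - pos;

    /* [pos, found) holds no newline.  Extend the cached interval when the
       new knowledge is contiguous with it, otherwise replace it.  */
    if (span->beg <= pos && pos <= span->end) {
        if (found > span->end)
            span->end = found;
    } else {
        span->beg = pos;
        span->end = found;
    }
    return found;
}
```

With a 1000-character line, the first search from offset 0 examines 1000 bytes, while a second search from anywhere inside that line examines none. The bookkeeping on every call is the sort of overhead that only matters for very short lines.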
* RE: Long lines and bidi 2013-02-11 17:17 ` Eli Zaretskii @ 2013-02-11 17:55 ` Drew Adams 2013-02-11 18:13 ` Eli Zaretskii 0 siblings, 1 reply; 42+ messages in thread From: Drew Adams @ 2013-02-11 17:55 UTC (permalink / raw) To: 'Eli Zaretskii', emacs-devel, 'Stefan Monnier' Cc: eggert, dmantipov > Turning on the newline cache speeds up these searches for a newline by > a factor of 2, which is not too spectacular, but not negligible. Any > objections to turning on that caching by default in all buffers? I only followed some of all that you wrote, and I haven't followed the thread. But a question: You do not mention any added cost, AFAICT (but again, I did not follow in detail). Is the caching relevant (helpful) regardless of the value of truncate-lines or whether visual-line-mode etc. is on? IOW, does it make sense for many common configurations or just for some particular configs? If it is not particularly advantageous for some common configs, does it have a cost that would suggest it should not be done in those configs, or is it pretty much without a downside? What about "for all buffers"? Does it make sense also for buffers such as Dired and Info, which have relatively short line lengths? If there is no extra cost or other drawback then such considerations probably do not matter, of course. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi 2013-02-11 17:55 ` Drew Adams @ 2013-02-11 18:13 ` Eli Zaretskii 0 siblings, 0 replies; 42+ messages in thread From: Eli Zaretskii @ 2013-02-11 18:13 UTC (permalink / raw) To: Drew Adams; +Cc: eggert, dmantipov, monnier, emacs-devel > From: "Drew Adams" <drew.adams@oracle.com> > Cc: <eggert@cs.ucla.edu>, <dmantipov@yandex.ru> > Date: Mon, 11 Feb 2013 09:55:36 -0800 > > > Turning on the newline cache speeds up these searches for a newline by > > a factor of 2, which is not too spectacular, but not negligible. Any > > objections to turning on that caching by default in all buffers? > > I only followed some of all that you wrote, and I haven't followed the thread. > But a question: > > You do not mention any added cost, AFAICT (but again, I did not follow in > detail). The overhead is only visible with very short lines, and is negligible even then. > Is the caching relevant (helpful) regardless of the value of truncate-lines or > whether visual-line-mode etc. is on? IOW, does it make sense for many common > configurations or just for some particular configs? It always makes sense. Searching for newlines is a very frequent operation in Emacs, not only in the display engine. > What about "for all buffers"? Does it make sense also for buffers such as Dired > and Info, which have relatively short line lengths? It doesn't hurt there, AFAICS. And we can always turn it off in the mode function, if we find later that some modes don't like it. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi [Was: Re: bug#13623: ...] 2013-02-08 14:07 ` Eli Zaretskii 2013-02-08 14:46 ` Long lines and bidi Eli Zaretskii @ 2013-02-08 16:21 ` Dmitry Antipov 2013-02-08 17:04 ` Eli Zaretskii 1 sibling, 1 reply; 42+ messages in thread From: Dmitry Antipov @ 2013-02-08 16:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 02/08/2013 06:07 PM, Eli Zaretskii wrote: > Profile alone is not enough. Please tell how did you "scroll", > exactly (which commands did you use), and please also show the > absolute times it took to perform each command. (defun scroll-both () (interactive) (let ((start (float-time))) (progn (dotimes (n 100) (progn (scroll-up) (redisplay))) (goto-char (point-max)) (dotimes (n 100) (progn (scroll-down) (redisplay))) (message "Elapsed %f seconds" (- (float-time) start))))) With bidi, ~600 second elapsed, and: 25.18% emacs emacs [.] scan_buffer 7.04% emacs emacs [.] bidi_resolve_weak 6.47% emacs emacs [.] get_next_display_element 6.37% emacs emacs [.] bidi_level_of_next_char 5.14% emacs libc-2.16.so [.] __memcpy_ssse3_back 5.05% emacs emacs [.] move_it_in_display_line_to 4.94% emacs emacs [.] x_produce_glyphs 4.84% emacs libXft.so.2.3.1 [.] XftCharIndex 3.72% emacs emacs [.] bidi_move_to_visually_next 3.70% emacs emacs [.] next_element_from_buffer 2.90% emacs libXft.so.2.3.1 [.] XftGlyphExtents 2.05% emacs emacs [.] bidi_fetch_char 2.02% emacs emacs [.] lookup_glyphless_char_display 2.01% emacs emacs [.] bidi_resolve_neutral 1.76% emacs emacs [.] bidi_cache_iterator_state 1.70% emacs emacs [.] bidi_get_type 1.51% emacs emacs [.] bidi_resolve_explicit_1 1.18% emacs libXft.so.2.3.1 [.] XftFontCheckGlyph 1.12% emacs emacs [.] xftfont_encode_char 1.01% emacs emacs [.] xftfont_text_extents Without bidi, ~230 seconds elapsed, and: 21.36% emacs emacs [.] x_produce_glyphs 17.92% emacs emacs [.] get_next_display_element 15.07% emacs emacs [.] move_it_in_display_line_to 8.37% emacs emacs [.] 
next_element_from_buffer 8.34% emacs libXft.so.2.3.1 [.] XftCharIndex 6.12% emacs emacs [.] lookup_glyphless_char_display 4.21% emacs libXft.so.2.3.1 [.] XftGlyphExtents 3.07% emacs emacs [.] xftfont_encode_char 2.68% emacs emacs [.] xftfont_text_extents 1.87% emacs emacs [.] get_per_char_metric 1.53% emacs libXft.so.2.3.1 [.] XftFontCheckGlyph 1.49% emacs emacs [.] composition_compute_stop_pos 1.35% emacs emacs [.] set_iterator_to_next cache-long-line-scans is nil in both cases. I suspect that scroll should be direction-agnostic in theory; but both profiled runs show that scroll-down is much, much slower than scroll-up (that's why elapsed time is so huge in both cases). > What was in the file? bidi_resolve_weak high on the profile hints > that it was full of punctuation or digits or blanks, which is not > really an interesting case. Your guess is correct; but I suspect that an average text in human language contains fewer punctuation characters, digits and blanks than C source code of the same size :-). > Ah, _that_ red herring... Why is that the first question? What were > the times with and without bidi-display-reordering in this file? In > my testing, the display engine performs awfully slow in both cases, so > even though turning off reordering makes it faster, it is still so > terribly slow that the problem is not going to be solved by that. > > As to your question: how can we know what characters are or aren't in > > the buffer without scanning it? And scanning the buffer is exactly > > what bidi.c does. Hm... insert-file-contents tries to detect encoding by looking at the first 1K and last 3K of the file. Why isn't a similar approach applicable to bidi? Dmitry ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi [Was: Re: bug#13623: ...] 2013-02-08 16:21 ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov @ 2013-02-08 17:04 ` Eli Zaretskii 0 siblings, 0 replies; 42+ messages in thread From: Eli Zaretskii @ 2013-02-08 17:04 UTC (permalink / raw) To: Dmitry Antipov; +Cc: emacs-devel > Date: Fri, 08 Feb 2013 20:21:57 +0400 > From: Dmitry Antipov <dmantipov@yandex.ru> > CC: emacs-devel@gnu.org > > On 02/08/2013 06:07 PM, Eli Zaretskii wrote: > > > Profile alone is not enough. Please tell how did you "scroll", > > exactly (which commands did you use), and please also show the > > absolute times it took to perform each command. > > (defun scroll-both () > (interactive) > (let ((start (float-time))) > (progn > (dotimes (n 100) (progn (scroll-up) (redisplay))) > (goto-char (point-max)) > (dotimes (n 100) (progn (scroll-down) (redisplay))) > (message "Elapsed %f seconds" (- (float-time) start))))) > > With bidi, ~600 second elapsed, and: > > 25.18% emacs emacs [.] scan_buffer > 7.04% emacs emacs [.] bidi_resolve_weak > 6.47% emacs emacs [.] get_next_display_element > 6.37% emacs emacs [.] bidi_level_of_next_char > 5.14% emacs libc-2.16.so [.] __memcpy_ssse3_back > 5.05% emacs emacs [.] move_it_in_display_line_to > 4.94% emacs emacs [.] x_produce_glyphs > 4.84% emacs libXft.so.2.3.1 [.] XftCharIndex > 3.72% emacs emacs [.] bidi_move_to_visually_next > 3.70% emacs emacs [.] next_element_from_buffer > 2.90% emacs libXft.so.2.3.1 [.] XftGlyphExtents > 2.05% emacs emacs [.] bidi_fetch_char > 2.02% emacs emacs [.] lookup_glyphless_char_display > 2.01% emacs emacs [.] bidi_resolve_neutral > 1.76% emacs emacs [.] bidi_cache_iterator_state > 1.70% emacs emacs [.] bidi_get_type > 1.51% emacs emacs [.] bidi_resolve_explicit_1 > 1.18% emacs libXft.so.2.3.1 [.] XftFontCheckGlyph > 1.12% emacs emacs [.] xftfont_encode_char > 1.01% emacs emacs [.] 
xftfont_text_extents > > Without bidi, ~230 seconds elapsed, and: This is consistent with my past measurements: (a) disabling bidi makes redisplay faster, but it is still awfully slow (2.3 sec per scroll); (b) bidi iteration is about 2 times slower than the unidirectional one (you get 3 times slower because your buffer is full of weak characters, which make the bidi iterator work harder due to the requirements of the Unicode Bidirectional Algorithm). > I suspect that scroll should be direction-agnostic in theory That theory is wrong. The reason is that functions that move by display lines can only move forward. So moving backward is coded very differently (a.k.a. "slower"). > but both profiled runs show that scroll-down is much, much slower > than scroll-up (that's why elapsed time is so huge in both cases). That's expected; see also my explanation in a previous mail, which describes what move_it_vertically_backward does. That function is used a lot by scroll-down. > > What was in the file? bidi_resolve_weak high on the profile hints > > that it was full of punctuation or digits or blanks, which is not > > really an interesting case. > > Your guess is correct; but I suspect that an average text in human language > contains fewer punctuation characters, digits and blanks than C source code of the > same size :-). Average C code still has only a small fraction of punctuation. Just look at any C file. > > As to your question: how can we know what characters are or aren't in > > the buffer without scanning it? And scanning the buffer is exactly > > what bidi.c does. > > Hm... insert-file-contents tries to detect encoding by looking at the first 1K > and last 3K of the file. Why isn't a similar approach applicable to bidi? No. Detecting encoding by a small portion is a heuristic that works only because almost every file is encoded consistently.
When a file is encoded inconsistently, the result of the above decoding heuristic is horribly wrong, and the consequences for the user are grave. As a recent example, see bug #13505. By contrast, scripts used in a text file do not have to be consistent or uniformly distributed over the file at all. So the probability to get this wrong will be much higher. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi [Was: Re: bug#13623: ...] 2013-02-08 13:33 ` Long lines and bidi [Was: Re: bug#13623: ...] Dmitry Antipov 2013-02-08 14:07 ` Eli Zaretskii @ 2013-02-08 15:33 ` Stefan Monnier 2013-02-08 16:05 ` Eli Zaretskii 1 sibling, 1 reply; 42+ messages in thread From: Stefan Monnier @ 2013-02-08 15:33 UTC (permalink / raw) To: Dmitry Antipov; +Cc: Eli Zaretskii, Emacs development discussions > So the first question is: is it feasible/possible/desirable to detect > that the buffer has no R2L text at all and automatically force > bidi-paragraph-direction to left-to-right and bidi-display-reordering > to nil? Would this speed things up by a constant factor, or would it actually remove an O(N) factor? I think a fix will need more than a constant factor speed up. Did you check both the truncate-lines=nil and the truncate-lines=t cases? I think that for the truncate-lines=t case, we won't be able to avoid the O(linelength) slowdown (but we should try and skip the non-displayed part of lines faster, especially when there's no `display/after/before-string' property). Stefan ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Long lines and bidi [Was: Re: bug#13623: ...] 2013-02-08 15:33 ` Stefan Monnier @ 2013-02-08 16:05 ` Eli Zaretskii 0 siblings, 0 replies; 42+ messages in thread From: Eli Zaretskii @ 2013-02-08 16:05 UTC (permalink / raw) To: Stefan Monnier; +Cc: dmantipov, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Eli Zaretskii <eliz@gnu.org>, Emacs development discussions <emacs-devel@gnu.org> > Date: Fri, 08 Feb 2013 10:33:08 -0500 > > > So the first question is: is it feasible/possible/desirable to detect > > that the buffer has no R2L text at all and automatically force > > bidi-paragraph-direction to left-to-right and bidi-display-reordering > > to nil? > > Would this speed things up by a constant factor, or would it actually > remove an O(N) factor? The former, because the bidi iteration is slower than the original unidirectional one by a constant factor, on the average. > I think a fix will need more than a constant factor speed up. Indeed. > Did you check both the truncate-lines=nil and the truncate-lines=t cases? > I think that for the truncate-lines=t case, we won't be able to avoid > the O(linelength) slowdown (but we should try and skip the non-displayed > part of lines faster, especially when there's no > `display/after/before-string' property). The problem is not with the part of text we actually display, because the number of characters shown in a window does not depend on whether we have truncate-lines=t or nil. The problem is that most redisplay operations always scan some text that is eventually not shown in the window. The longer the lines, the more text we scan that is outside of the window. For example, any redisplay that needs to scroll the window up (M-v etc.) needs to find the buffer position for the window start. To do that, we use move_it_vertically_backward, which moves N screen lines up (back) in the buffer. 
But what that function does is move N _buffer_lines_ back, and then move forward by screen lines to find which position is N screen lines above where we started. If each line is hundreds or thousands of characters long, it is clear that moving back N buffer lines will move much farther than needed, and then moving forward by screen lines through all those thousands of characters wastes a lot of CPU cycles. ^ permalink raw reply [flat|nested] 42+ messages in thread
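The strategy described above (go back N buffer lines, then walk forward by screen lines) can be modeled with a short sketch. This is a toy model, not move_it_vertically_backward: it assumes every character is one column wide, that lines wrap at a fixed `width`, and that the text is a flat array. The names `next_row_start` and `window_start_n_rows_up` are hypothetical. The `examined` counter shows how much text gets touched compared with the roughly N * width characters that end up mattering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Start of the screen row that follows the row starting at S, in a toy
   model where each character is one column and rows wrap at WIDTH.  */
static size_t next_row_start(const char *buf, size_t len, size_t s,
                             size_t width)
{
    size_t k = 0;
    while (k < width && s + k < len && buf[s + k] != '\n')
        k++;
    if (s + k < len && buf[s + k] == '\n')
        k++;                    /* the newline ends this row */
    return s + k;
}

/* Return the start of the screen row N rows above the row containing
   POS (or the first row reached, if fewer rows are available).
   *EXAMINED accumulates the number of characters looked at.  */
size_t window_start_n_rows_up(const char *buf, size_t len, size_t pos,
                              size_t n, size_t width, size_t *examined)
{
    assert(width > 0 && pos < len);

    /* Step 1: move back past N newlines, to the beginning of a buffer line.  */
    size_t bol = pos, newlines = 0;
    while (bol > 0) {
        if (buf[bol - 1] == '\n') {
            if (newlines == n)
                break;
            newlines++;
        }
        bol--;
        (*examined)++;
    }

    /* Step 2: walk forward row by row, remembering the last N+1 row starts.  */
    size_t *ring = malloc((n + 1) * sizeof *ring);
    if (!ring)
        return bol;
    size_t rows = 0, row = bol;
    for (;;) {
        ring[rows % (n + 1)] = row;
        size_t next = next_row_start(buf, len, row, width);
        *examined += next - row;
        if (next > pos || next == row)
            break;              /* ROW is the row containing POS */
        row = next;
        rows++;
    }

    size_t up = rows < n ? rows : n;
    size_t result = ring[(rows - up) % (n + 1)];
    free(ring);
    return result;
}
```

For two 100-character lines at width 10, moving up 2 rows from offset 196 returns offset 171 but examines 398 characters in this model, although the two rows of interest hold only 20. With lines of tens of thousands of characters the gap grows accordingly.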