all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Kenichi Handa <handa@etl.go.jp>
Cc: spiegel@gnu.org, emacs-devel@gnu.org, d.love@dl.ac.uk
Subject: unencodable-char-position [Re: Several serious problems]
Date: Sun, 11 Aug 2002 10:59:33 +0900 (JST)	[thread overview]
Message-ID: <200208110159.KAA23419@etlken.m17n.org> (raw)
In-Reply-To: <200207250312.g6P3C9J06653@aztec.santafe.edu> (message from Richard Stallman on Wed, 24 Jul 2002 21:12:09 -0600 (MDT))

In article <200207250312.g6P3C9J06653@aztec.santafe.edu>, Richard Stallman <rms@gnu.org> writes:

>     If the specified coding system is totally inappropriate for
>     the buffer, highlighting them will results in huge amount of
>     overlays and also it takes long time to finish the job.

> That is true.

>       If
>     we limit the number of highlighting, it may give users
>     incorrect information (i.e. non-highlighted characters seems
>     to be encodable).

> It could highlight the first N runs of such characters, and display a
> message saying "Many more unencodable characters found--type WHATEVER
> to view them".  WHATEVER could be the same command with a prefix
> argument.

I implemented that and tried on several files.  But, it
seems that such kind of feature is not that helpful.

In the case that the buffer contains many unencodable chars,
usually the specified coding system is wrong, and we must
use a different coding system.  So, it is not that
interesting to know where are the other unencodable
characters.

In the case that the buffer contains a few unencodable
chars, as it's seldam that more than one of them appear in
one window, highlighting the other unencodable chars is not
that useful.


By the way, I've just noticed that Dave has already
installed the function `unencodable-char-position' in
mule-cmds.el and used it in select-safe-coding-system.

That function resembles to check-coding-system-region on
which we are currently discussing.

But, as the docstring says, it's slow.

So, I commited these changes.

(1) Re-implementation of unencodable-char-position in C
    while adding two optional arguments.
----------------------------------------------------------------------
unencodable-char-position is a built-in function.
(unencodable-char-position START END CODING-SYSTEM &optional COUNT STRING)

Return position of first un-encodable character in a region.
START and END specfiy the region and CODING-SYSTEM specifies the
encoding to check.  Return nil if CODING-SYSTEM does encode the region.

If optional 4th argument COUNT is non-nil, it specifies at most how
many un-encodable characters to search.  In this case, the value is a
list of positions.

If optional 5th argument STRING is non-nil, it is a string to search
for un-encodable characters.  In that case, START and END are indexes
to the string.
----------------------------------------------------------------------


(2) New function `search-unencodable-char' for interactive
    use.  It utilizes `unencodable-char-position'.

----------------------------------------------------------------------
(search-unencodable-char CODING-SYSTEM)

Search forward from point for a character that is not encodable.
It asks which coding system to check.
If such a character is found, set point after that character.
Otherwise, don't move point.

When called from a program, the value is a position of the found character,
or nil if all characters are encodable.
----------------------------------------------------------------------

It may be good to bind C-x RET s to this command.

Could someone make this command more user friendly
(e.g. improving messages)?  It is also easy to modify this
funciton to highlight a few more (or windowful) unencodable
characters if you think that is surely helpful.

(3) Make select-safe-coding-system to show (at most 10)
    unencodable characters for each default coding systems
    tried.

Now, if any unencodable chars are found, one can type C-g to
cancel further saving.  As C-g doesn't hide *Warning*
buffer, one can clik on the displayed unencodable chars to
jump to the corresponding position in a buffer.


---
Ken'ichi HANDA
handa@etl.go.jp

  parent reply	other threads:[~2002-08-11  1:59 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-07-22 17:11 Several serious problems Richard Stallman
2002-07-22 19:01 ` Andre Spiegel
2002-07-22 19:03 ` Andre Spiegel
2002-07-23  4:00   ` Richard Stallman
2002-07-22 19:03 ` Andreas Schwab
2002-07-23 18:58   ` Richard Stallman
2002-07-22 19:11 ` Andre Spiegel
2002-07-23  4:42 ` Karl Eichwalder
2002-07-24  3:25   ` Richard Stallman
2002-07-24  4:43     ` Karl Eichwalder
2002-07-25  3:12       ` Richard Stallman
2002-07-25  3:24         ` Karl Eichwalder
2002-07-26 15:35           ` Richard Stallman
2002-07-27  3:19             ` Karl Eichwalder
2002-07-29  1:12               ` Richard Stallman
2002-07-29 14:32                 ` Karl Eichwalder
2002-07-30  1:00                   ` Richard Stallman
2002-08-09  7:42               ` Stefan Monnier
2002-08-09 16:08                 ` Karl Eichwalder
2002-08-10 17:16                 ` Richard Stallman
2002-08-12 16:20                   ` Stefan Monnier
2002-08-13  1:48                     ` Richard Stallman
2002-08-15  2:30                       ` Karl Eichwalder
2002-08-15  2:47                         ` Stefan Monnier
2002-08-15  5:31                           ` Karl Eichwalder
2002-08-15 15:30                             ` Stefan Monnier
2002-08-15 17:33                               ` Dave Love
2002-07-23 13:35 ` Kenichi Handa
2002-07-23 13:52   ` Alan Shutko
2002-07-24  3:25     ` Richard Stallman
2002-07-24  3:25   ` Richard Stallman
2002-07-24  4:37     ` Kenichi Handa
2002-07-25  3:12       ` Richard Stallman
2002-07-25  5:53         ` Miles Bader
2002-07-26 14:29         ` Francesco Potorti`
2002-07-27 18:52           ` Richard Stallman
2002-08-09  7:43           ` Stefan Monnier
2002-08-11  1:59         ` Kenichi Handa [this message]
2002-08-12 17:06           ` unencodable-char-position [Re: Several serious problems] Richard Stallman
2002-08-12 17:15             ` Stefan Monnier
2002-08-13  0:37               ` Kenichi Handa
2002-08-13 22:47                 ` Richard Stallman
2002-08-14  0:20                   ` Kenichi Handa
2002-08-14 23:13                     ` Richard Stallman
2002-08-15 17:51           ` Dave Love
2002-08-19  5:04             ` Kenichi Handa
2002-08-29 22:52               ` Dave Love
2002-08-30  6:53                 ` Andre Spiegel
2002-08-09  7:44       ` Several serious problems Stefan Monnier
2002-08-10 17:16         ` Richard Stallman
2002-08-12  0:26         ` Kenichi Handa
2002-08-09  4:41 ` Stefan Monnier
2002-08-15 17:23   ` Dave Love

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200208110159.KAA23419@etlken.m17n.org \
    --to=handa@etl.go.jp \
    --cc=d.love@dl.ac.uk \
    --cc=emacs-devel@gnu.org \
    --cc=spiegel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.