* Look for data serialisation format to implement communication between Emacs and external program.
@ 2013-01-06 21:44 Oleksandr Gavenko
2013-01-06 22:46 ` Thien-Thi Nguyen
0 siblings, 1 reply; 7+ messages in thread
From: Oleksandr Gavenko @ 2013-01-06 21:44 UTC
To: help-gnu-emacs
I know that XML is BAD, so I want to avoid it.
My experience with 'expect' shows me that simple string-based communication is
good, but sometimes the data matches the same pattern as the _prompt_, so I
can't distinguish control data from regular data.
Prefixes like those in the 'diff' file format resolve this issue. But this
moves us toward a TLV (type-length-value) format.
The standard TLV format is described by the ASN.1 BER encoding rules.
Is it right to use ASN.1 BER as the serialisation format for communication
between Emacs and an external program?
My external application is not yet written, but I have selected Python as the
implementation language. It seems the 'python-pyasn1' library does the job on
the Python side:
http://pyasn1.sourceforge.net/
How about Emacs? Is there any library for creating/parsing ASN.1 BER data?
I need only basic types, like structures of numbers and UTF-8 strings (blogging
software: pass a command and data like article + title + list of tags)...
Other formats like JSON or sexp also look nice, but I don't know about Emacs
support for these formats.
I haven't yet looked at projects like pymacs, which have solved such a task.
Any advice is welcome!
--
Best regards!
* Re: Look for data serialisation format to implement communication between Emacs and external program.
2013-01-06 21:44 Look for data serialisation format to implement communication between Emacs and external program Oleksandr Gavenko
@ 2013-01-06 22:46 ` Thien-Thi Nguyen
2013-01-07 13:35 ` Oleksandr Gavenko
2013-01-07 21:57 ` Oleksandr Gavenko
0 siblings, 2 replies; 7+ messages in thread
From: Thien-Thi Nguyen @ 2013-01-06 22:46 UTC
To: Oleksandr Gavenko; +Cc: help-gnu-emacs
() Oleksandr Gavenko <gavenkoa@gmail.com>
() Sun, 06 Jan 2013 23:44:46 +0200
Other formats like JSON or sexp also look nice, but I don't know
about Emacs support for these formats.
Any advice is welcome!
Go with sexp. If you must touch XML, at least dress it up as SXML,
which, aside from being less ugly, extends interop to include Scheme.
E.g., sexp + SXML is the approach IXIN uses. Latest announcement:
<http://lists.gnu.org/archive/html/help-texinfo/2013-01/msg00000.html>.
If you absolutely need a binary (non-text) format, then i would next
recommend bindat.el (info "(elisp) Byte Packing") for the job. Perhaps
ASN.1 can be layered on top of bindat.el, but that doesn't sound so fun.
(But what do i know about ASN.1 and/or fun?)
Lastly, on the other side of the pipe, why not consider Emacs itself, or
Guile, or something that can ‘read’ a sexp?
--
Thien-Thi Nguyen ..................................... GPG key: 4C807502
. NB: ttn at glug dot org is not me .
. (and has not been since 2007 or so) .
. ACCEPT NO SUBSTITUTES .
........... please send technical questions to mailing lists ...........
* Re: Look for data serialisation format to implement communication between Emacs and external program.
2013-01-06 22:46 ` Thien-Thi Nguyen
@ 2013-01-07 13:35 ` Oleksandr Gavenko
2013-01-08 9:44 ` Thien-Thi Nguyen
2013-01-07 21:57 ` Oleksandr Gavenko
1 sibling, 1 reply; 7+ messages in thread
From: Oleksandr Gavenko @ 2013-01-07 13:35 UTC
To: help-gnu-emacs
On 2013-01-07, Thien-Thi Nguyen wrote:
> () Oleksandr Gavenko <gavenkoa@gmail.com>
> () Sun, 06 Jan 2013 23:44:46 +0200
>
> Other formats like JSON or sexp also look nice, but I don't know
> about Emacs support for these formats.
>
> Go with sexp. If you must touch XML, at least dress it up as SXML,
> which, aside from being less ugly, extends interop to include Scheme.
> E.g., sexp + SXML is the approach IXIN uses. Latest announcement:
> <http://lists.gnu.org/archive/html/help-texinfo/2013-01/msg00000.html>.
>
One thing stops me from using sexp: my Python code may be used in other
cases where a sexp parser is absent.
> If you absolutely need a binary (non-text) format, then i would next
> recommend bindat.el (info "(elisp) Byte Packing") for the job. Perhaps
> ASN.1 can be layered on top of bindat.el, but that doesn't sound so fun.
> (But what do i know about ASN.1 and/or fun?)
>
Thanks for the suggestion. I completed a simple type-length-value example:
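(require 'bindat)
;; One record: type byte ?s, length byte 4, then the 4 value bytes "halo".
(setq bin-str (unibyte-string ?s 4 ?h ?a ?l ?o))
;; The :val vector takes its length from the :len field.
(setq bin-header '((:type u8) (:len u8) (:val vec (:len))))
(setq bin-data (bindat-unpack bin-header bin-str))
(bindat-pack bin-header bin-data) ; should give back the bytes of bin-str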
But since the input comes from an external process, I need to check manually
for the end of input, because parsing incomplete input signals an
"args-out-of-range" error.
I don't know how to resolve this issue.
I expected to find a way to get data validation for free (like XSD/RNC for XML).
Also I don't see how I can split data into packets with "bindat" (Emacs sends a
request and Python sends a response in a loop, without dropping the connection
while the returned data is valid). As a solution: make 10 attempts with a 1 sec
delay and then report an error (incomplete or invalid packet)...
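With a one-byte type and a one-byte length, as in the bindat spec above, the reader can check whether a whole record has arrived before trying to unpack it. Below is a minimal Python sketch of that check for the other side of the pipe. The names `pack_tlv` and `try_unpack_tlv` are made up for illustration and do not come from any library:

```python
import struct

# One type byte and one length byte, the same layout as the bindat spec.
HEADER = struct.Struct("BB")


def pack_tlv(type_char, value):
    """Build one record: type byte, length byte, then the value bytes."""
    if len(value) > 255:
        raise ValueError("value does not fit a one-byte length field")
    return HEADER.pack(ord(type_char), len(value)) + value


def try_unpack_tlv(buf):
    """Return ((type_char, value), rest), or (None, buf) if buf is incomplete."""
    if len(buf) < HEADER.size:
        return None, buf            # header not complete yet
    type_byte, length = HEADER.unpack_from(buf)
    end = HEADER.size + length
    if len(buf) < end:
        return None, buf            # value not complete yet
    return (chr(type_byte), bytes(buf[HEADER.size:end])), buf[end:]


record = pack_tlv("s", b"halo")              # same bytes as bin-str above
partial, _ = try_unpack_tlv(record[:3])      # None: only part of the record arrived
complete, rest = try_unpack_tlv(record + b"x")
```

Returning `None` for a short buffer, instead of raising, lets the caller simply keep the bytes and try again when more output arrives.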
> Lastly, on the other side of the pipe, why not consider Emacs itself, or
> Guile, or something that can ‘read’ a sexp?
--
Best regards!
* Re: Look for data serialisation format to implement communication between Emacs and external program.
2013-01-07 13:35 ` Oleksandr Gavenko
@ 2013-01-08 9:44 ` Thien-Thi Nguyen
0 siblings, 0 replies; 7+ messages in thread
From: Thien-Thi Nguyen @ 2013-01-08 9:44 UTC
To: Oleksandr Gavenko; +Cc: help-gnu-emacs
() Oleksandr Gavenko <gavenkoa@gmail.com>
() Mon, 07 Jan 2013 15:35:01 +0200
But since the input comes from an external process, I need to check
manually for the end of input, because parsing incomplete input
signals an "args-out-of-range" error.
I don't know how to resolve this issue.
Also I don't see how I can split data into packets with "bindat"
(Emacs sends a request and Python sends a response in a loop, without
dropping the connection while the returned data is valid). As a
solution: make 10 attempts with a 1 sec delay and then report an
error (incomplete or invalid packet)...
See ‘accept-process-output’: (info "(elisp) Accepting Output")
and ‘condition-case’: (info "(elisp) Handling Errors")
I expected to find a way to get data validation for free (like XSD/RNC
for XML).
It depends on what you mean by "valid data" and "for free", i suppose.
--
Thien-Thi Nguyen ..................................... GPG key: 4C807502
. NB: ttn at glug dot org is not me .
. (and has not been since 2007 or so) .
. ACCEPT NO SUBSTITUTES .
........... please send technical questions to mailing lists ...........
* Re: Look for data serialisation format to implement communication between Emacs and external program.
2013-01-06 22:46 ` Thien-Thi Nguyen
2013-01-07 13:35 ` Oleksandr Gavenko
@ 2013-01-07 21:57 ` Oleksandr Gavenko
1 sibling, 0 replies; 7+ messages in thread
From: Oleksandr Gavenko @ 2013-01-07 21:57 UTC
To: help-gnu-emacs
On 2013-01-07, Thien-Thi Nguyen wrote:
> Go with sexp. If you must touch XML, at least dress it up as SXML,
> which, aside from being less ugly, extends interop to include Scheme.
I found nice JSON support in the Python standard library:
http://docs.python.org/2/library/json.html
and starting from 2008 (Emacs v23.1):
json.el
(require 'json)
Live examples at:
http://edward.oconnor.cx/2006/03/json.el
This seems to be the simplest rich data format for communication in modern
applications. In fact it is library/product/framework agnostic. For these
reasons SXML loses out.
But it seems that keeping TLV in mind is also a good thing for making the
proper decision.
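For the blogging case, the Python side of a JSON exchange could look roughly like this. It is only a sketch: the field names (`command`, `title`, `article`, `tags`) are invented here, and sending one JSON document per line is an assumed convention, not something json.el requires:

```python
import json


def encode_message(command, title, article, tags):
    """Serialise one request as a single newline-terminated line of JSON."""
    message = {"command": command, "title": title, "article": article, "tags": tags}
    # json.dumps escapes newlines inside strings, so the output itself holds
    # no raw newline and a newline can safely mark the end of a message.
    return json.dumps(message) + "\n"


def decode_message(line):
    """Parse one line back into a dict; raises ValueError on malformed input."""
    return json.loads(line)


line = encode_message("post", "Hello", "world!\nSecond line", ["emacs", "python"])
message = decode_message(line)
```

On the Emacs side, `json-read-from-string` from json.el can parse such a line once it has arrived completely.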
--
Best regards!
[parent not found: <mailman.16821.1357508702.855.help-gnu-emacs@gnu.org>]
* Re: Look for data serialisation format to implement communication between Emacs and external program.
[not found] <mailman.16821.1357508702.855.help-gnu-emacs@gnu.org>
@ 2013-01-07 7:53 ` Helmut Eller
2013-01-07 13:53 ` Oleksandr Gavenko
0 siblings, 1 reply; 7+ messages in thread
From: Helmut Eller @ 2013-01-07 7:53 UTC
To: help-gnu-emacs
On Sun, Jan 06 2013, Oleksandr Gavenko wrote:
> Is it right to use ASN.1 BER as the serialisation format for communication
> between Emacs and an external program?
S-expressions are the only format that Emacs can write and parse quickly,
because the printer and reader are implemented in C. This is likely 10
times faster than any parser that you write in Emacs Lisp. The downside
is that the external program needs to be able to do the same. Not such
a bad tradeoff, as S-expressions are fairly easy to parse.
For communication with an external program I recommend a "framed" format:
a frame is a fixed-size header followed by a variable-length payload.
The header describes the length of the frame. The length should be in
bytes (not characters, as counting characters in UTF-8 strings is
unnecessarily complicated). Knowing the length of the frame is very
useful because it makes it easy to wait for a complete frame. After
you have received a complete frame, parsing is simpler because you don't
have to worry about incomplete input.
I also recommend limiting the frame length to 24 bits (not 32 bits)
because Emacs fixnums are limited to 29 bits on 32-bit machines.
The payload can then be an S-expression printed with the Emacs prin1 and
parsed back with the read function. The encoding of the payload can be
utf-8. But use the Emacs 'binary coding system for communication with
the external process and unibyte buffers for parsing. For the
binary-to-utf8 conversion of the payload use something like
decode-coding-string (which is written in C and should be fast).
If you like, you can also use extra bits in the header to indicate the
format of the payload. E.g. it might be useful to have frames that
contain only plain strings (not encoded as S-expr).
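To make the framing concrete, here is a rough Python sketch of the external-program side. It assumes a 3-byte big-endian length header (the 24-bit limit suggested above) with no extra format bits. `encode_frame` and `FrameDecoder` are hypothetical names, not an existing library:

```python
HEADER_SIZE = 3                      # 24-bit length, big-endian (an assumption)
MAX_PAYLOAD = (1 << 24) - 1


def encode_frame(text):
    """Encode text as UTF-8 and prefix it with its length in bytes."""
    payload = text.encode("utf-8")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for a 24-bit length")
    return len(payload).to_bytes(HEADER_SIZE, "big") + payload


class FrameDecoder:
    """Collects raw bytes and hands out payloads once whole frames have arrived."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data):
        """Add received bytes; return the list of newly completed payloads (str)."""
        self._buf += data
        frames = []
        while len(self._buf) >= HEADER_SIZE:
            length = int.from_bytes(self._buf[:HEADER_SIZE], "big")
            end = HEADER_SIZE + length
            if len(self._buf) < end:
                break                # wait for the rest of this frame
            frames.append(bytes(self._buf[HEADER_SIZE:end]).decode("utf-8"))
            del self._buf[:end]
        return frames


wire = encode_frame('(:title "Привіт")') + encode_frame("(:tags)")
decoder = FrameDecoder()
first = decoder.feed(wire[:5])       # first frame is still incomplete
second = decoder.feed(wire[5:])      # the rest arrives, both frames complete
```

Decoding to text happens only after a whole frame is present, so a multi-byte UTF-8 character split across two reads never reaches the decoder half-finished.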
Helmut
* Re: Look for data serialisation format to implement communication between Emacs and external program.
2013-01-07 7:53 ` Helmut Eller
@ 2013-01-07 13:53 ` Oleksandr Gavenko
0 siblings, 0 replies; 7+ messages in thread
From: Oleksandr Gavenko @ 2013-01-07 13:53 UTC
To: help-gnu-emacs
On 2013-01-07, Helmut Eller wrote:
> On Sun, Jan 06 2013, Oleksandr Gavenko wrote:
>
>> Is it right to use ASN.1 BER as the serialisation format for communication
>> between Emacs and an external program?
>
> S-expressions are the only format that Emacs can write and parse quickly,
> because the printer and reader are implemented in C. This is likely 10
> times faster than any parser that you write in Emacs Lisp. The downside
> is that the external program needs to be able to do the same. Not such
> a bad tradeoff, as S-expressions are fairly easy to parse.
>
> For communication with an external program I recommend a "framed" format:
> a frame is a fixed-size header followed by a variable-length payload.
> The header describes the length of the frame. The length should be in
> bytes (not characters, as counting characters in UTF-8 strings is
> unnecessarily complicated). Knowing the length of the frame is very
> useful because it makes it easy to wait for a complete frame. After
> you have received a complete frame, parsing is simpler because you don't
> have to worry about incomplete input.
>
> I also recommend limiting the frame length to 24 bits (not 32 bits)
> because Emacs fixnums are limited to 29 bits on 32-bit machines.
>
> The payload can then be an S-expression printed with the Emacs prin1 and
> parsed back with the read function. The encoding of the payload can be
> utf-8. But use the Emacs 'binary coding system for communication with
> the external process and unibyte buffers for parsing. For the
> binary-to-utf8 conversion of the payload use something like
> decode-coding-string (which is written in C and should be fast).
>
This seems to be a good solution for Emacs:
(assoc ':title (read "((:type blog-entry) (:title \"Hello\") (:article \"world!\"))"))
Data validation:
(read ")") ;; ==> invalid-read-syntax
or when assoc returns an unknown ":type", etc...
The only annoying thing is escaping (like &lt;div&gt;hello&lt;/div&gt; for
<div>hello</div> in XML, or in PPP framing, where 0x7e is escaped as 0x7d 0x5e
and the escape character 0x7d is escaped as 0x7d 0x5d).
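For the sexp route the escaping burden on the Python side is small: inside a double-quoted Emacs Lisp string only the backslash and the double quote need a backslash in front. Here is a sketch, with `lisp_string` and `lisp_alist` as made-up helper names, that prints the kind of list used in the `read` example above. It handles string values only, not symbols or numbers:

```python
def lisp_string(text):
    """Quote text so that the Emacs Lisp reader reads it back unchanged."""
    # Escape backslashes first, then double quotes.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def lisp_alist(pairs):
    """Render (keyword, string) pairs as ((:key "value") ...)."""
    items = " ".join("(:%s %s)" % (key, lisp_string(value)) for key, value in pairs)
    return "(" + items + ")"


sexp = lisp_alist([("title", 'Say "hi"'), ("article", "C:\\temp <div>hello</div>")])
```

Markup such as <div> passes through untouched, so HTML article bodies need no extra treatment beyond these two characters.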
> If you like, you can also use extra bits in the header to indicate the
> format of the payload. E.g. it might be useful to have frames that
> contain only plain strings (not encoded as S-expr).
>
I started with a custom TLV data format, but parsing and validation are
hand-written, so I decided to ask for suggestions...
--
Best regards!