From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Israelsson Tampe
Newsgroups: gmane.lisp.guile.devel
Subject: Fwd: What is the point of bytevectors?
Date: Sat, 12 Sep 2020 18:31:25 +0200
References: <663bcc3f7ca042bfd727cbcecb0dccfcde43aec8.camel@divoplade.fr>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000c8fb2005af205525"
To: guile-devel
List-Id: "Developers list for Guile, the GNU extensibility library"
Xref: news.gmane.io gmane.lisp.guile.devel:20583

--000000000000c8fb2005af205525
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

---------- Forwarded message ---------
From: Stefan Israelsson Tampe <stefan.itampe@gmail.com>
Date: Sat, Sep 12, 2020 at 5:12 PM
Subject: Re: What is the point of bytevectors?
To: Linus Björnstam <linus.internet@fastmail.se>

If you want to handle raw data, for example, bytevectors are a must. You
will find that you need them when working with binary files, interacting
with C, and so on. Not so much for managing strings; for that, as you
say, the string object should be enough. And now for a shameless
advertisement of my Python module for Guile. In Python, bytevectors are
very much like strings, and you can use all the ordinary Python string
operations on them. You may even use the regular expression module on
them if you like. I have a Schemish interface for it, and you may use
that if you can stand the dependency.

On Sat, Sep 12, 2020 at 12:36 PM Linus Björnstam
<linus.internet@fastmail.se> wrote:

> The point is to work with binary data, of which the most common type
> of C string is one kind.
>
> If you are using libguile in your C code, you can use Guile strings in
> your C code and pass them around to avoid the encoding/decoding
> overhead. Or, the other way around, expose the procedures that work
> with whatever C string representation you are using to Guile.
>
> If your latin1 strings contain Unicode data, they are not latin1.
>
> --
>   Linus Björnstam
>
> On Sat, 12 Sep 2020, at 09:49, divoplade wrote:
> > Hello guile users,
> >
> > I am writing a library mixing some Scheme code and C code, and I have
> > two options for interfacing C strings:
> > 1. Use bytevectors;
> > 2. Use strings with byte access semantics (so-called latin-1, which is
> > really a misleading name since it will most certainly contain
> > utf-8-encoded unicode text).
> >
> > From the C side, they have nearly identical APIs, and the conversion
> > functions do not transcode anything.
> >
> > From the Scheme side, however:
> > 1. The bytevector library needs to be imported;
> > 2. The function names have way more characters to type;
> > 3. The bytevector library is missing a lot of text functions (like
> > join, split, trim, pad, searching...).
> >
> > If the user wants to always manipulate unicode (decoded) strings, using
> > either bytevectors or latin-1 strings requires transcoding to enter the
> > library and to exit the library, so either option is valid.
> >
> > But if the user wants to always manipulate utf-8-encoded strings [1],
> > using bytevectors is impossible or much more difficult (see points
> > above).
> >
> > So, why should I ever use bytevectors?
> >
> > divoplade
> >
> > [1] https://utf8everywhere.org/

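The two representations weighed in the quoted message can be compared in
a few lines. This is only a sketch, written in Python rather than Guile,
with a made-up sample string; it assumes the C side hands over
UTF-8-encoded bytes, and all variable names are hypothetical.

```python
# Hypothetical sample text; any non-ASCII string would do.
text = "Björnstam"

# The raw bytes a C library would hand over (assumed UTF-8 encoded).
raw = text.encode("utf-8")

# "Byte access" view: expose each byte as one character.
# Python's latin-1 codec maps bytes 0..255 to code points 0..255.
byte_string = raw.decode("latin-1")

# Proper decoding recovers the original text.
decoded = raw.decode("utf-8")

# Lengths: bytes, byte-per-character view, decoded characters.
print(len(raw), len(byte_string), len(decoded))

# The byte-per-character view is not the same text as the original.
print(byte_string == text, decoded == text)

# Going back from the latin-1 view to bytes loses nothing.
print(byte_string.encode("latin-1") == raw)
```

The byte-per-character view is as long as the byte sequence (10) and one
longer than the decoded text (9), and it does not compare equal to the
original string. It carries the bytes without loss, but it is not the
text, which is the point made above about latin1 strings that contain
Unicode data.
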
--000000000000c8fb2005af205525--
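
The remark in the message that Python's byte strings behave much like
text strings can be illustrated with a short sketch. It uses only the
Python standard library and does not show the Guile module mentioned
above; the sample data and variable names are made up for illustration.

```python
import re

# Hypothetical sample data: a comma-separated record as raw bytes.
record = b"  alpha,beta,gamma  "

# Trim, split and join work on bytes much as they do on str,
# as long as the arguments are bytes as well.
fields = record.strip().split(b",")
joined = b"; ".join(fields)

# Searching returns a byte offset; padding takes a single fill byte.
position = record.find(b"beta")
padded = fields[0].ljust(8, b".")

# The re module accepts a bytes pattern when the subject is bytes.
words = re.findall(rb"[a-z]+", record)

print(fields)
print(joined)
print(position)
print(padded)
print(words)
```

These are the operations the original question lists as missing from the
bytevector library (join, split, trim, pad, searching). Mixing bytes and
str arguments raises a TypeError in Python, so the two types stay
distinct even though their methods look alike.
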