unofficial mirror of emacs-devel@gnu.org 
* [PATCH] support filename prefixes and 100-character filenames in tar-mode.el
From: David Glasser @ 2008-04-11 17:51 UTC
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3268 bytes --]

I noticed this morning that when tar-mode opens a tarball containing a
file with a long filename, the entry for that file doesn't show the
full filename; it shows only a suffix of it.  I did some digging and
found that newer tar formats (POSIX ustar among them) support a
"prefix" field in the tar header block to overcome the length limit of
the name field; however, tar-mode doesn't support it.
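A minimal sketch of the idea in Python (not tar-mode's code), assuming
the POSIX ustar header layout: a 100-byte name field at offset 0 and a
155-byte prefix field at offset 345.  The helper make_block is
hypothetical and exists only to build test input, and the sketch skips
the check of the magic field that a real reader should make before
trusting the prefix.

```python
NAME_OFFSET, NAME_SIZE = 0, 100
PREFIX_OFFSET, PREFIX_SIZE = 345, 155
BLOCK_SIZE = 512


def field_text(block: bytes, offset: int, size: int) -> str:
    """Return the field up to its first NUL, or the whole field if it has none."""
    raw = block[offset:offset + size]
    return raw.split(b"\0", 1)[0].decode("latin-1")


def full_name(block: bytes) -> str:
    """Join prefix and name with "/" when the prefix field is non-empty."""
    name = field_text(block, NAME_OFFSET, NAME_SIZE)
    prefix = field_text(block, PREFIX_OFFSET, PREFIX_SIZE)
    return f"{prefix}/{name}" if prefix else name


def make_block(name: bytes, prefix: bytes = b"") -> bytes:
    """Hypothetical helper: a zeroed header block with only name and prefix set."""
    block = bytearray(BLOCK_SIZE)
    block[NAME_OFFSET:NAME_OFFSET + len(name)] = name
    block[PREFIX_OFFSET:PREFIX_OFFSET + len(prefix)] = prefix
    return bytes(block)
```

A reader that ignores the prefix field would report only what
field_text returns for the name field, which is the "suffix" symptom
described above.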

You should be able to test this by opening the attached tarball.  The
correct output should look something like:

 drwxr-x--- glasser/eng           0 foo1
 drwxr-x--- glasser/eng           0 foo1/foo2
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13/foo14
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13/foo14/foo15
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13/foo14/foo15/foo16
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13/foo14/foo15/foo16/foo17
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13/foo14/foo15/foo16/foo17/foo18
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13/foo14/foo15/foo16/foo17/foo18/foo19
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13/foo14/foo15/foo16/foo17/foo18/foo19/foo20
 drwxr-x--- glasser/eng           0 foo1/foo2/foo3/foo4/foo5/foo6/foo7/foo8/foo9/foo10/foo11/foo12/foo13/foo14/foo15/foo16/foo17/foo18/foo19/foo20/foo21

Without my patch, the bottom few entries look as if they aren't inside
foo1/foo2/.

With all of the patch except the change to the initial value of
name-end, the results look correct, except that the penultimate line
ends in "foo2" instead of "foo20"; that is because the name here fills
the entire name field with no nulls, but the code previously assumed
there would be at least one null.  (It's possible that similar
adjustments are required for the initial values of link-end,
gname-end, and uname-end.)
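
The off-by-one can be sketched in Python as follows.  This is an
illustration only, under the assumption that the end of the name is
taken as the smaller of an initial limit and the position of the first
NUL; the function name_with_limit and the sample header bytes are
hypothetical, not taken from tar-mode.el or from the attached tarball.

```python
NAME_SIZE = 100  # width of the name field, which starts at offset 0


def name_with_limit(block: bytes, initial_end: int) -> bytes:
    """Cut the name at the first NUL, but never beyond initial_end."""
    nul = block.find(b"\0")
    end = initial_end if nul == -1 else min(initial_end, nul)
    return block[:end]


# A name that fills all 100 bytes leaves no room for a terminating NUL;
# the next field (mode: octal digits followed by a NUL) starts right after it.
name = b"x" * 95 + b"foo20"
block = name + b"0000750\0"

old = name_with_limit(block, NAME_SIZE - 1)  # limit one short of the field width
new = name_with_limit(block, NAME_SIZE)      # limit equal to the field width
```

With the limit one short of the field width, old loses its last byte
and ends in "foo2"; with the limit equal to the field width, new is the
complete 100-byte name.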

Let me know if this change is significant enough that copyright
assignment papers are necessary.

Also, please note that until an hour ago I had no idea what the TAR
format looked like!

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

[-- Attachment #2: tar-prefix.patch.txt --]
[-- Type: text/plain, Size: 3686 bytes --]

Index: tar-mode.el
===================================================================
*** tar-mode.el	(revision 200)
--- tar-mode.el	(working copy)
*************** This information is useful, but it takes
*** 197,203 ****
  (defconst tar-gname-offset (+ tar-uname-offset 32))
  (defconst tar-dmaj-offset (+ tar-gname-offset 32))
  (defconst tar-dmin-offset (+ tar-dmaj-offset 8))
! (defconst tar-end-offset (+ tar-dmin-offset 8))
  
  (defun tar-header-block-tokenize (string)
    "Return a `tar-header' structure.
--- 197,204 ----
  (defconst tar-gname-offset (+ tar-uname-offset 32))
  (defconst tar-dmaj-offset (+ tar-gname-offset 32))
  (defconst tar-dmin-offset (+ tar-dmaj-offset 8))
! (defconst tar-prefix-offset (+ tar-dmin-offset 8))
! (defconst tar-end-offset (+ tar-prefix-offset 155))
  
  (defun tar-header-block-tokenize (string)
    "Return a `tar-header' structure.
*************** write-date, checksum, link-type, and lin
*** 207,219 ****
  	(;(some 'plusp string)		 ; <-- oops, massive cycle hog!
  	 (or (not (= 0 (aref string 0))) ; This will do.
  	     (not (= 0 (aref string 101))))
! 	 (let* ((name-end (1- tar-mode-offset))
  		(link-end (1- tar-magic-offset))
  		(uname-end (1- tar-gname-offset))
  		(gname-end (1- tar-dmaj-offset))
  		(link-p (aref string tar-linkp-offset))
  		(magic-str (substring string tar-magic-offset (1- tar-uname-offset)))
! 		(uname-valid-p (or (string= "ustar  " magic-str) (string= "GNUtar " magic-str)))
  		name linkname
  		(nulsexp   "[^\000]*\000"))
  	   (when (string-match nulsexp string tar-name-offset)
--- 208,221 ----
  	(;(some 'plusp string)		 ; <-- oops, massive cycle hog!
  	 (or (not (= 0 (aref string 0))) ; This will do.
  	     (not (= 0 (aref string 101))))
! 	 (let* ((name-end tar-mode-offset)
  		(link-end (1- tar-magic-offset))
  		(uname-end (1- tar-gname-offset))
  		(gname-end (1- tar-dmaj-offset))
  		(link-p (aref string tar-linkp-offset))
  		(magic-str (substring string tar-magic-offset (1- tar-uname-offset)))
! 		(uname-valid-p (or (string= "ustar  " magic-str) (string= "GNUtar " magic-str)
!                                    (string= "ustar\0000" magic-str)))
  		name linkname
  		(nulsexp   "[^\000]*\000"))
  	   (when (string-match nulsexp string tar-name-offset)
*************** write-date, checksum, link-type, and lin
*** 229,234 ****
--- 231,242 ----
  			    nil
  			  (- link-p ?0)))
  	   (setq linkname (substring string tar-link-offset link-end))
+            (when (and uname-valid-p
+                       (string-match nulsexp string tar-prefix-offset)
+                       (> (match-end 0) (1+ tar-prefix-offset)))
+              (setq name (concat (substring string tar-prefix-offset
+                                            (1- (match-end 0)))
+                                 "/" name)))
  	   (if default-enable-multibyte-characters
  	       (setq name
  		     (decode-coding-string name
*************** write-date, checksum, link-type, and lin
*** 255,261 ****
  	     (and uname-valid-p (substring string tar-uname-offset uname-end))
  	     (and uname-valid-p (substring string tar-gname-offset gname-end))
  	     (tar-parse-octal-integer string tar-dmaj-offset tar-dmin-offset)
! 	     (tar-parse-octal-integer string tar-dmin-offset tar-end-offset)
  	     )))
  	(t 'empty-tar-block)))
  
--- 263,269 ----
  	     (and uname-valid-p (substring string tar-uname-offset uname-end))
  	     (and uname-valid-p (substring string tar-gname-offset gname-end))
  	     (tar-parse-octal-integer string tar-dmaj-offset tar-dmin-offset)
! 	     (tar-parse-octal-integer string tar-dmin-offset tar-prefix-offset)
  	     )))
  	(t 'empty-tar-block)))
  

[-- Attachment #3: test.tar.gz --]
[-- Type: application/x-gzip, Size: 437 bytes --]
