Archived Forum Post

Question:

TAR : Fast read of content. Don't verify or extract

Sep 14 '15 at 22:57

I just want a list of the files in the tar (USTar) archive, but I've only managed to retrieve the XML with the file list if I, e.g., verify or extract the archive. Can't I just read the archive? Verify and extract take too long for what I need (a quick look).


Answer

Thanks! A TAR archive is different from a Zip archive in that there is no "central directory" structure that makes it easy to see the contents of the archive quickly. TAR stands for "Tape Archive", and it's quite old: it dates from when files were backed up to tape. (I'm old enough to remember those days..) Anyway, the format of a TAR archive goes like this: fileHeader1, fileData1, fileHeader2, fileData2, ...

Each file header records the size of the file data that follows it, and the data is padded to a multiple of 512 bytes, so the position of the next file header can be computed from the current one. The only way to know what files are in the archive is to read the 1st header, skip past the data, read the next header, and so on. This involves scanning the entire archive, and how quick that is depends on the characteristics of the TAR archive. For example, one might have a very large archive (perhaps more than a GB), but if it contains very few files (because all the files are large), the scan happens very quickly. Skipping over huge amounts of data is fast because a program has random access to the data in the file (i.e. it can set the file pointer to the next header and read). However, if the TAR archive has a huge number of small files, there are many more headers to read, and scanning the archive will take much more time.
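
To make the header-skipping idea concrete, here is a minimal sketch in Python. It is only an illustration of the scan described above, not how any particular TAR library does it. The names list_tar_names and build_demo_archive are made up for this example. It assumes a plain USTar archive with short names and octal size fields; it ignores the USTar name prefix field, GNU long-name entries, and pax extended headers.

```python
import io
import tarfile

BLOCK = 512  # TAR headers and file data are laid out in 512-byte blocks


def list_tar_names(f):
    """Return entry names by reading headers only and seeking past the data.

    `f` is a seekable binary file object positioned at the start of the archive.
    """
    names = []
    while True:
        header = f.read(BLOCK)
        # A short read or an all-zero block means the end of the archive.
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            break
        # Bytes 0..99 hold the NUL-terminated entry name.
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        # Bytes 124..135 hold the data size as an octal string.
        size_field = header[124:136].strip(b"\0 ")
        size = int(size_field.decode("ascii"), 8) if size_field else 0
        names.append(name)
        # Skip the data, rounded up to a whole number of blocks, without reading it.
        f.seek((size + BLOCK - 1) // BLOCK * BLOCK, 1)
    return names


def build_demo_archive():
    """Hypothetical helper: build a small in-memory USTar archive to try this on."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in [("a.txt", b"hello"), ("dir/b.bin", b"x" * 1300), ("empty", b"")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf
```

For a file on disk, one would open it with open(path, "rb") and pass the file object to list_tar_names. The work done is one small read and one seek per entry, which is why an archive with a few huge files scans quickly and one with many small files does not. In practice, Python's standard tarfile module (tarfile.open(path).getnames()) performs a more complete version of this scan.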


Answer

Hmm... Didn't get the ListXml to work until now...

Works now :-)