Archived Forum Post

Question:

Untar performance dropping on larger files on iOS

Jun 25 '13 at 13:32

We have successfully integrated the Untar library into our iOS app. With our initial 2 MB test file, performance seemed OK; however, as we test with larger files we see massive slowdowns. The current TAR file is 60 MB, and each (tiny, 12 KB) file we extract takes many seconds. This is already far too slow for our purposes, and it will only get worse because we routinely deal with files of 2 to 3 GB.

Our previous (open source) library did not exhibit this behavior, and TAR is a format where per-file extraction time should stay constant regardless of archive size (no lookahead required, no compression, etc.).

Our code is essentially a copy of the iOS Untar example at http://www.example-code.com/ios/untar.asp

Are we using it wrong?


Accepted Answer

This new build should solve the problem:

http://www.chilkatsoft.com/preRelease/Chilkat-9.4.1-IOS-6.1.zip

P.S. This build is valid for both iOS 6.0 and 6.1.


Answer

Hi Chilkat,

great work, well done!!!! Much better now! Thumbs up :-) ! Many thanks for your support and many thanks for solving this issue so quickly!

Greetings, disy.


Answer

The TAR format is a streaming format with no table of contents that would allow for a program to instantly seek to the location of a particular file within a TAR archive. In a nutshell, the TAR format is like this:

header for file 1
file 1 data
header for file 2
file 2 data
header for file 3
file 3 data
...

To access file N, a program must begin by reading the 1st header, skip the file data, read the 2nd header, skip the data, etc. until it finally gets to header N.

(If a TAR archive is compressed, such as .tar.gz or .tar.Z, then everything in the archive that precedes the Nth header must be decompressed before that header can be read, and of course decompression must continue while untarring the Nth file.)

If your program is untarring the entire TAR archive, then performance should be reasonable. If your program is making separate calls to untar each file individually, then this is the cause of the problem. It's due to the nature of the TAR format. (As opposed to a format such as .zip where there is a central directory located at the end of the file that provides entry names, offsets, sizes, etc.)
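
The header-by-header walk described above can be sketched in plain C (which also compiles as Objective-C). This is only an illustration under stated assumptions, not Chilkat's implementation: it works on an archive already loaded into memory, reads the entry size from the octal field at offset 124 of each 512-byte header (the ustar layout), treats a header that starts with a zero byte as the end of the archive, and ignores GNU and pax extensions. The function names are made up for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TAR_BLOCK 512

/* Parse the 12-byte octal size field at offset 124 of a TAR header. */
unsigned long long tar_entry_size(const unsigned char *header) {
    char field[13];
    memcpy(field, header + 124, 12);
    field[12] = '\0';
    return strtoull(field, NULL, 8);
}

/* Find the header of entry `index` (0-based) by walking from the start.
   Returns the byte offset of that header, or -1 if it is not found.
   *headers_read reports how many headers had to be visited. */
long tar_find_entry(const unsigned char *archive, size_t len,
                    size_t index, size_t *headers_read) {
    size_t offset = 0;
    size_t current = 0;
    *headers_read = 0;
    while (offset + TAR_BLOCK <= len && archive[offset] != '\0') {
        (*headers_read)++;
        if (current == index) {
            return (long)offset;
        }
        unsigned long long size = tar_entry_size(archive + offset);
        if (size > len - offset - TAR_BLOCK) {
            return -1; /* truncated or corrupt archive */
        }
        /* File data is padded to a whole number of 512-byte blocks. */
        size_t data_blocks = (size_t)((size + TAR_BLOCK - 1) / TAR_BLOCK);
        offset += TAR_BLOCK + data_blocks * TAR_BLOCK;
        current++;
    }
    return -1;
}
```

Reaching the entry at index N visits N + 1 headers. Extracting every file with a separate call that restarts from the beginning therefore does work that grows with the square of the entry count, whereas a single pass over the whole archive visits each header once.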


Answer

Hi Chilkat,

that's right. As mentioned above, we've implemented the untarring as shown here: http://www.example-code.com/ios/untar.asp

We are untarring the entire archive at once, not individual files. The TAR files are not compressed, so nothing needs to be done other than writing the files out.

For small TAR files (<2 MB, containing 93 files) performance is OK: each file contained in the TAR is "extracted" in milliseconds. For large files (tested with a 60 MB TAR file containing 2784 files) performance drops: each file in the TAR takes several seconds. And our use case is to handle TAR files of 3-4 GB containing several hundred thousand files...

Is there a problem in our implementation? We untar as in this snippet:

NSFileManager * filemanager = [NSFileManager defaultManager];
if(![filemanager fileExistsAtPath:tarPath]){
    NSString *errorMessage = [NSString stringWithFormat:@"File does not exist at file path: %@", tarPath];
    NSLog(@"[UntarChilkat] - %@", errorMessage);
    NSDictionary *userInfo = [NSDictionary dictionaryWithObject:errorMessage
                                                         forKey:NSLocalizedDescriptionKey];
    if (error != NULL) *error = [NSError errorWithDomain:UNTAR_ERROR_DOMAIN code:UNTAR_ERROR_CODE_FILE_DOES_NOT_EXIST userInfo:userInfo];
    return false;
}

NSMutableString *strOutput = [NSMutableString stringWithCapacity:1000];

//  Important: It is helpful to send the contents of the
//  tar.LastErrorText property when requesting support.

//  Untar a .tar archive.
CkoTar *tar = [[[CkoTar alloc] init] autorelease];

//  Any string automatically begins the 30-day trial.
BOOL successfullyUnlocked;
successfullyUnlocked = [tar UnlockComponent:CHILKATUNLOCKCODEDISY];
if (successfullyUnlocked != YES) {
    [strOutput appendString: tar.LastErrorText];
    [strOutput appendString: @"\n"];
    self.mainTextField = strOutput;
    NSLog(@"[UntarChilkat] - Error in unlocking component: %@", strOutput);
    NSDictionary *userInfo = [NSDictionary dictionaryWithObject:tar.LastErrorText
                                                         forKey:NSLocalizedDescriptionKey];
    if (error != NULL) *error = [NSError errorWithDomain:UNTAR_ERROR_DOMAIN code:UNTAR_ERROR_CODE_UNLOCK_CHILKAT_FAILED userInfo:userInfo];
    return false;
}

int internalFileCount;

//  Untar into the directory given by path.  The directory tree(s)
//  contained within the TAR archive will be re-created rooted at
//  this directory.
tar.UntarFromDir = path;

//  When NoAbsolutePaths is YES, absolute filepaths within the Tar
//  archive are made relative by removing the first forward or
//  backward slash, which protects from untarring files to
//  unexpected locations.  It is set to NO here, so that
//  protection is disabled.
tar.NoAbsolutePaths = NO;
tar.VerboseLogging  = NO;

internalFileCount = [[tar Untar: tarPath] intValue];
if (internalFileCount < 0) {
    [strOutput appendString: tar.LastErrorText];
    [strOutput appendString: @"\n"];
    NSDictionary *userInfo = [NSDictionary dictionaryWithObject:tar.LastErrorText
                                                         forKey:NSLocalizedDescriptionKey];
    if (error != NULL) *error = [NSError errorWithDomain:UNTAR_ERROR_DOMAIN code:UNTAR_ERROR_CODE_NO_FILES_EXTRACTED userInfo:userInfo];
    return false;
}
else {
    [strOutput appendFormat: @"Untarred %d files and directories\n", internalFileCount];
}

Answer

Before I investigate further, first check to see if the same issue occurs with this new iOS 6.0/6.1 build:

http://www.chilkatsoft.com/preRelease/Chilkat-9.4.1-IOS-6.1.zip

If so, then I'll do some testing here. The same internal C++ implementation is common across all operating systems, platforms, programming languages, etc., so I would expect the same behavior, good or bad, on all systems.


Answer

Hi Chilkat,

we have already tested with this build, because we received it in this thread: http://www.chilkatforum.com/questions/3586/ios-ckotarprogress-how-to-use-it-correctly-for-untar/3589

It's the same behavior in both versions, the official one and the preRelease candidate. We can mail you the TAR-file which we are testing, if you want to validate this issue.


Answer

I don't think my mailbox would accept such a large file. You could instead provide a URL for me to download (sending me the URL in private email).

However, before doing that, check one possibility: maybe your event callback is causing the problem? For example, the following situation can happen in .NET, and something similar may be happening on iOS. Suppose you have an event handler that updates a text box like this:

textBox1.Text += untarredPath + "\r\n";

As the content of the text box grows, each update takes longer, so performance gets slower and slower just from updating the text box.
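
The effect described here can be made concrete with a small C sketch (plain C is used because it also compiles as Objective-C). It is an illustration under assumptions, not a measurement of any real text box: append_by_rebuild and bytes_copied_for are hypothetical helpers that mimic `text += line` on an immutable string by allocating a new buffer and copying the old content on every append, and they only count the bytes copied.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Append `line` by building a brand-new buffer, the way `text += line`
   behaves for an immutable string. Frees the old buffer, returns the
   new one, and adds the number of bytes copied to *copied. */
char *append_by_rebuild(char *text, const char *line, size_t *copied) {
    size_t old_len = text ? strlen(text) : 0;
    size_t add_len = strlen(line);
    char *fresh = malloc(old_len + add_len + 1);
    if (fresh == NULL) {
        free(text);
        return NULL;
    }
    if (old_len > 0) {
        memcpy(fresh, text, old_len);       /* old content copied again */
    }
    memcpy(fresh + old_len, line, add_len + 1); /* new line plus NUL */
    *copied += old_len + add_len;
    free(text);
    return fresh;
}

/* Total bytes copied after appending `count` copies of `line`.
   Returns 0 if an allocation fails. */
size_t bytes_copied_for(size_t count, const char *line) {
    char *text = NULL;
    size_t copied = 0;
    for (size_t i = 0; i < count; i++) {
        text = append_by_rebuild(text, line, &copied);
        if (text == NULL) {
            return 0;
        }
    }
    free(text);
    return copied;
}
```

With a 10-byte line, 93 appends copy 43,710 bytes in total, while 2784 appends copy 38,767,200 bytes (the counts are borrowed from the two archives discussed in this thread). That is about 30 times more lines but close to 900 times more copying, because the total grows with the square of the number of appends.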


Answer

Hi Chilkat,

this can't be our problem. As described in http://www.chilkatforum.com/questions/3586/ios-ckotarprogress-how-to-use-it-correctly-for-untar/3589 our event callback only logs the next-tar event to console:

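- (void)NextTarFile: (NSString *)fileName
           fileSize: (NSNumber *)fileSize
       bIsDirectory: (BOOL)bIsDirectory
               skip: (BOOL *)skip
{
    //  BOOL values are logged with %d; skip is a pointer, so it is
    //  dereferenced (after a NULL check) instead of being passed to %s.
    NSLog(@"UNTARCHILKATPROGRESS NEXTTARFILE: %@ FILESIZE: %@ BISDIRECTORY: %d SKIP: %d", fileName, fileSize, bIsDirectory, (skip != NULL && *skip));
}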

I'll mail you two links for downloading the TAR files, the small one and the "large" one. Untarring the small one takes milliseconds per "extracted" file; untarring the "large" one takes seconds per "extracted" file. All of this was measured on an iOS device (an iPad 3), not in the simulator.

Greetings, disy.


Answer

Hi Chilkat,

I can't send you a message via the contact form: the message field is always marked "this field is required", even though I've typed my message. So here is the message:

Here are the two links for the TAR files:

Small: Link removed
Large: Link removed

Untar the whole file at once on an iOS device (we tested on an iPad 3):

Small file: each file included in the TAR takes milliseconds to be "extracted" (you can see this by logging the event callback).
Large file: each file included in the TAR takes one to several seconds to be "extracted" (you can see this by logging the event callback).

With our old open source library, the time to "extract" each file is the same for small archives and for big ones up to 3-4 GB.

Greetings, disy.