Nahum Shalman

Tricks With Joyent's Manta Service: Tarballs

The Use Case

Let's say you're backing up a directory of files into Joyent's Manta service with Dave Eddy's manta-sync tool. You're happily backing up your files regularly, but one day you need to get the whole directory back out. You notice that "Remote => Local sync" is a possible future feature that hasn't been implemented yet.

How do I get that directory of files back out of Manta?

Other Options

Before I get to my "tarball" solution from the article title, let's look at another option for getting the files back out of Manta:

If I run mfind -t o on the directory I care about I'll get a list of the paths to all of the objects under that directory.

I could pipe that into xargs mget... but then all of those files will end up in the current working directory; mget won't preserve the full path to each object.

That means I need to manually recreate the directory hierarchy on the local end and mget each file into the right place.

Here's one way to do that:

mfind ~~/stor/backups/example -t d | xargs -I{} mkdir -p ./{}  
mfind ~~/stor/backups/example -t o | xargs -I{} mget -o ./{} {}  

That will recreate the full Manta path to the directory you care about and download the files into it. It's slow (one process and one API request for each file you want to download), and it just feels gross to me. Let's move on to some better options.

Making a Tarball

The simple version

Tar terminates an archive with two blocks of zeros.
As long as your tar archives don't contain garbage (bytes other than zeros) after that final pair of zero blocks, you can concatenate them, and GNU tar will extract all of the archives one after the other if you pass it the -i/--ignore-zeros flag (see the GNU tar manual).

Armed with this knowledge we can run a job in Manta that will wrap each file into a compressed tar archive and just concatenate them all together. As long as we provide the -i flag, tar will ignore the end-of-archive blocks of zeros and extract all of our files.

mfind ~~/stor/backups/example -t o | mjob create -o -m 'gtar -cz $MANTA_INPUT_FILE' -r 'cat' | gtar -xzvi --strip-components=4  
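To watch the -i flag do its job without involving Manta at all, here's a minimal local sketch. It assumes GNU tar is installed as gtar or tar; the scratch directory and file names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Local sketch: concatenate two compressed tar archives, then extract them
# with and without -i. Assumes GNU tar is available as gtar or tar.
set -eu
TAR=$(command -v gtar || command -v tar)

work=$(mktemp -d)   # scratch directory (illustrative)
mkdir "$work/src" "$work/plain" "$work/ignore"
echo "first"  > "$work/src/a.txt"
echo "second" > "$work/src/b.txt"

# One gzip-compressed archive per file, concatenated into a single file.
"$TAR" -cz -f "$work/a.tgz" -C "$work/src" a.txt
"$TAR" -cz -f "$work/b.tgz" -C "$work/src" b.txt
cat "$work/a.tgz" "$work/b.tgz" > "$work/both.tgz"

# Without -i, extraction ends at the first end-of-archive marker.
"$TAR" -xz -f "$work/both.tgz" -C "$work/plain" || true
# With -i, the zero blocks are skipped and both files come out.
"$TAR" -xzi -f "$work/both.tgz" -C "$work/ignore"

ls "$work/plain" "$work/ignore"
```

The "plain" directory should end up with only the first file, while "ignore" gets both.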

That's as far as I got when I decided to write this article. One clever one-liner.

Can we do better?

I discovered on IRC that I am not the first person to have realized that this is possible (and am clearly not the first person to concatenate tar files).

But... having to remember to add the "-i" argument is a lot of work. My fingers won't remember that at all. And other people might forget it even if I tell them to use it. Why can't we make a Real Tarball (TM) that doesn't need special flags to extract?

Can't I spend a bit more CPU time and save myself and others the effort of having to remember things? Isn't that the point of having a computer?

The fancier way

As I mentioned before, GNU tar puts a pair of 512-byte blocks of zeros at the end of an archive (Reference).

Let's try to strip off the trailing blocks of zeros at the end of the individual tar files before we concatenate them.

Because of tar's history as a tool for writing to magnetic tape (it's the tape archive tool...), it has a notion of a blocking factor, which allowed sending bigger blocks of bytes at a time to the tape drive to improve throughput and data density (Reference). From what I can tell, the blocking factor can also be expressed as a record size. For example, the default blocking factor is 20, which yields a record size of 20 * 512 = 10240 bytes.
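As a quick check of that arithmetic, the sketch below archives a one-byte file twice, once with the default blocking factor and once with -b 1, and prints the archive sizes next to the record size. It assumes GNU tar is installed as gtar or tar and that nothing (such as TAR_OPTIONS) overrides the default blocking factor; the file names are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: archive size versus record size (blocking factor * 512 bytes).
# Assumes GNU tar is available as gtar or tar.
set -eu
TAR=$(command -v gtar || command -v tar)

work=$(mktemp -d)
printf 'x' > "$work/one_byte"

"$TAR" -c -f "$work/default.tar" -C "$work" one_byte      # default blocking factor
"$TAR" -c -b 1 -f "$work/small.tar" -C "$work" one_byte   # blocking factor 1

record_size=$(( 20 * 512 ))
default_size=$(( $(wc -c < "$work/default.tar") ))
small_size=$(( $(wc -c < "$work/small.tar") ))

echo "default record size:     $record_size bytes"
echo "archive, default factor: $default_size bytes"
echo "archive, -b 1:           $small_size bytes"
```

With the default factor, even this tiny archive should fill one whole record; with -b 1 it only has to be a multiple of 512 bytes.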

Now, if you create an archive using the default blocking factor and the last record is short, tar will still write a full final record, padding it out with zeros. This means that if I naively strip the last 1024 bytes, but the record size is 10240 bytes and the last record contained less than 8704 bytes, there will still be enough padding zeros to leave at least one empty block (which looks like a corrupted end-of-archive), if not two or more (which look like a real end-of-archive).

The simplest way I've found to guarantee that I can accurately strip off the final two blocks of zeros and not have blocks of zeros that were created to pad out a record confusing tar is to crank down the blocking factor. All the way down. By using a blocking factor of 1 (or a record size of 512) I ensure that there won't be any extra blocks of zeros that could be confused for a full or partial end of archive marker.
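Here's a rough way to see the difference for yourself. The sketch strips the last 1024 bytes from two archives of the same one-byte file and counts how many all-zero 512-byte blocks are still sitting at the end. It assumes GNU tar (as gtar or tar) and GNU head (for negative byte counts); trailing_zero_blocks is a helper I made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: count the all-zero 512-byte blocks left at the end of an archive
# after stripping the final 1024 bytes. Assumes GNU tar and GNU head.
set -eu
TAR=$(command -v gtar || command -v tar)

trailing_zero_blocks() {
  # Walk backwards from the end of the file, one 512-byte block at a time,
  # and stop at the first block that contains a non-zero byte.
  local file=$1 blocks count=0 i nonzero
  blocks=$(( $(wc -c < "$file") / 512 ))
  for (( i = blocks - 1; i >= 0; i-- )); do
    nonzero=$(dd if="$file" bs=512 skip="$i" count=1 2>/dev/null | tr -d '\0' | wc -c)
    [ "$nonzero" -eq 0 ] || break
    count=$(( count + 1 ))
  done
  echo "$count"
}

work=$(mktemp -d)
printf 'x' > "$work/one_byte"

# Same input, two blocking factors, last two blocks stripped from each.
"$TAR" -c -f - -C "$work" one_byte      | head -c -1024 > "$work/default.tar"
"$TAR" -c -b 1 -f - -C "$work" one_byte | head -c -1024 > "$work/small.tar"

default_left=$(trailing_zero_blocks "$work/default.tar")
small_left=$(trailing_zero_blocks "$work/small.tar")
echo "zero blocks left, default blocking factor: $default_left"
echo "zero blocks left, blocking factor 1:       $small_left"
```

With the default factor a run of padding blocks survives the strip; with a blocking factor of 1 there should be none left.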

Let's confirm that I'm not going to gain or lose any bytes of the files I care about if I use this trick:

# Make files whose names match their sizes. Go up to 4k just to be thorough.
mkdir inputs && cd inputs  
for i in {1..4096}; do mkfile -n $i $i; done;  
# Sanity check that all file names and sizes match (we'll do this later to confirm that we didn't lose or gain any bytes)
stat -c "%s %n" * | awk 'BEGIN{ok=1}{if (($1 - $2) != 0){ok=0;print "not ok"}} END{if(ok==1){print "ok"}}'  
# returns ok
cd ..  
# Assemble the test tarball by creating individual archives, stripping off the last two blocks, and concatenating them all into a single file.
find inputs -type f | xargs -I{} echo gtar c -b 1 {} \| /opt/local/bin/head --bytes=-2b | bash 2>/dev/null > inputs.tar  
# Let's see if it worked:
mkdir verify && cd verify  
tar xf ../inputs.tar  
cd inputs  
stat -c "%s %n" * | awk 'BEGIN{ok=1}{if (($1 - $2) != 0){ok=0;print "not ok"}} END{if(ok==1){print "ok"}}'  
# returns ok

Looks good. Let's use this technique to generate a tarball from a Manta directory. Note that this method can't handle the ~~ syntax of the normal Manta tools...

manta_tarball(){  
  the_path=$1
  parent=$(dirname $the_path)
  mfind -t o ${the_path} | mjob create -o -m "gtar -c -b 1 -C /manta${parent} \${MANTA_INPUT_OBJECT#${parent}/} | /opt/local/bin/head --bytes=-2b" -r "gzip --best" > $2
}
manta_tarball /nahamu/public/smartos/bins smartos_bins.tar.gz  
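The least obvious part of that function is the path handling, so here's a small local sketch of just that piece. The object name is hypothetical; in a real job Manta sets MANTA_INPUT_OBJECT for each input.

```shell
#!/usr/bin/env bash
# Sketch of the path handling inside manta_tarball, run locally.
# The object name below is hypothetical.
set -eu
the_path=/nahamu/public/smartos/bins
parent=$(dirname "$the_path")
MANTA_INPUT_OBJECT="$the_path/example-object"

# Strip the parent directory (and the slash after it) from the object path.
relative=${MANTA_INPUT_OBJECT#"${parent}"/}

echo "tar changes into: /manta${parent}"
echo "tar archives:     ${relative}"
```

Because tar changes into the parent directory first, the archive members are stored relative to it (bins/example-object here) instead of carrying the full Manta path.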

I'm pretty sure this will fail if you provide too many inputs, but you can always run the job without the -o flag and fetch your tarball when the job has completed.

The Best Way?

I have one main concern about this method. Using a blocking factor of 1 makes the invocation of tar on the archiving side inefficient. We have to make 20 times as many write(2) system calls as we would using the default blocking factor. Some crude tests indicate that this has a negative effect on how long it takes for tar to run. But hey, we're trading CPU time on the assembly end to simplify the extraction. That was kind of the goal, right?

A minor concern about both methods is that the Manta job will create an extra object under ~~/jobs/<job-id> for each input file. This is a waste of space. You should go delete those objects.

Closing thoughts

If you're doing this for yourself, just cat'ing together multiple archives and passing a "-i" flag to gtar is probably simpler and more efficient in CPU usage.
If you're generating an archive to give to someone else, you might want to try the "fancier" version to make it easier for them to extract.