Recover a failed TimeMachine backup

I recently received an unpleasant warning message after TimeMachine routinely tried to perform a backup:

Time Machine completed a verification of your backups on “matmos”. To improve reliability, Time Machine must create a new backup for you. Click Start New Backup to create a new backup. This will remove your existing backup history. This could take several hours. Click Back Up Later to be reminded tomorrow. Time Machine won’t perform backups during this time.

Googling around for others with the same problem I found quite a few tips (like this one, or this one). The basic idea is to mount the sparsebundle image, run a disk check/repair, and hope for the best. In my case (as you will see in a bit), my sparsebundle appeared to be hosed. My options: lose my old backups or look for a way to recover the old backups. But first up, turn off TimeMachine, and then try to run a standard disk check.

Run disk check/repair

What to do next?

So, “The volume could not be verified completely” says that disk check is not going to repair my sparsebundle. But one good thing to note: the sparsebundle can be mounted read-only, so the old backups should still be there. So the plan: make a new sparsebundle, mount it, mount the old sparsebundle, and then copy all files from the old sparsebundle into the new. Sounds easy, right? Making a new sparsebundle is not rocket science, however copying the files from TimeMachine backups can be quite challenging. I learned quite a bit when trying various methods to copy the files across (Finder, cp, rsync, ditto, etc). TimeMachine is quite ingenious: it uses a combination of file hard links and directory hard links (the latter is a new one to me!) in order to keep the backup size at a mininum. Unfortunately, all the methods I tried could not reconcile the directory hard links: instead of the links being created, the actual directory contents were copied. Furthermore, Apple has made it difficult to work directly with files in TimeMachine backups by making use of sandboxing and an access control “safety net” (see this or this). So I did some more digging and found a great product that can deal with TimeMachine backups, directory hard links, and this safety net: SuperDuper.

Recover contents of failed sparsebundle

Move failed sparsebundle to a new location

I tried restoring the failed sparsebundle to a new sparsebundle while both were on a network drive (connected to my machine via gigabit ethernet) and the backup was painfully slow (wasn’t finished after 24 hours) meaning the bottleneck was random access/seek time of my poor, slow network drives. I cancelled the restore operation, moved the failed sparsebundle to an external USB drive.

Create new sparsebundle

With the failed sparsebundle no longer in my TimeMachine network share, I created a new sparsebundle. Since I encrypt my backups, I used TimeMachine to create a new sparsebundle on the network share:

Mount both sparsebundles

Copy files from the failed sparsebundle to the new sparsebundle

Enable TimeMachine

Enable TimeMachine and start the backups. When the backups first started, the “Oldest backup” date was not listed. But when the backup finished, TimeMachine successfully recognized the oldest backup. Success!

Backing up done!