+ -=:iNode Locate:=- +

In the basket of Debian packages I've found a very nice backup solution - BackupPC. It let us back up 1.5TB from 8 of our PCs, including a 600GB RAID that is 90% full, using only 200GB of the backup RAID space. And that includes 2 or 3 (weekly) full backups and about 4-5 (daily) incremental backups. Isn't that nice? This was possible thanks to compression and to the duplicate information present across different directories/drives/computers.

Once I got BackupPC working, I became curious: why do we have so much duplicate information? To answer that question I had to find all the files on the hard drive which are hardlinks to the same file, i.e. to the same inode. Running find for every file on a 200GB drive full of hardlinks is not really a good solution. Probably the same problem led to the creation of the locate package long ago.
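For reference, the brute-force lookup could be sketched roughly like this. It assumes a find that supports -inum (GNU and BSD find do; it is not in POSIX), and the function name files_with_inode is made up for illustration. Every call walks the whole tree, which is why repeating it for each file is too slow.

```shell
# Hypothetical helper: list every path under directory $1 whose inode is $2.
# -xdev keeps find on one file system, where inode numbers are unique.
files_with_inode() {
    find "$1" -xdev -inum "$2" -print
}
```

It would be called as: files_with_inode . 6949126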

So I decided to use the existing locate tool to do the job. The only thing I had to do was 'adjust' the updatedb script, which creates the DB used by locate: now, at the end of every call to find, my iupdatedb stores not just full_filename but full_filename/inode. Then I can use the regular locate command to find the files which have a specific inode. We just need to make sure that we don't report matches on a substring of an inode, or files which merely have the given inode number in their names.
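The exact-match requirement can be sketched as a filter over the stored entries. This is only an illustration, not the code from the actual ilocate wrapper: it assumes one full_filename/inode entry per line on standard input (so file names containing newlines are not handled), and the function name match_inode is hypothetical.

```shell
# Hypothetical filter: read "full_filename/inode" entries from stdin and
# print "inode  full_filename" for entries whose trailing inode equals $1.
match_inode() {
    awk -v ino="$1" '
    {
        # Locate the trailing "/<digits>" appended after the file name.
        pos = match($0, /\/[0-9]+$/)
        if (pos == 0) next
        # Compare the whole number as a string, so 6949126 does not match
        # 16949126, nor a file that merely has 6949126 in its name.
        if ((substr($0, pos + 1) "") == (ino ""))
            printf "%s  %s\n", ino, substr($0, 1, pos - 1)
    }'
}
```

With GNU locate, which accepts -d to select a database, it might be used as: locate -d locatedb.local 6949126 | match_inode 6949126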

DOWNLOAD: here is the full iupdatedb script, or a patch to be applied to updatedb, and ilocate, a wrapper around the locate command which looks up either a specific accessible file (it simply tests whether the given file exists) or an inode number.

Now we can create a DB

./iupdatedb --localpaths=. --output=locatedb.local

and then use ilocate

./ilocate 6949126

6949126  cpool/6/9/6/696c5fe46fca94ea5de0395b6adc8a8f
6949126  pc/ravana/10/f%2fraid/fresearch/ffaceprime/ffaceprime.tar
6949126  pc/ravana/3/f%2fraid/fresearch/ffaceprime/ffaceprime.tar

real    0m3.680s
user    0m2.693s
sys     0m0.128s

./ilocate pc/ravana/10/f%2fraid/fresearch/ffaceprime/ffaceprime.tar

to get the same output :-)


There is no proper path and DB filename handling in ilocate yet. It works from the current directory.

ilocate and iupdatedb work only when applied to a single partition, because inodes are unique only within a partition. I could have tricked iupdatedb some more, but it wasn't necessary for my case.
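A minimal sketch of keeping a scan within one partition, assuming GNU find (its -printf action is a GNU extension) and a hypothetical function name list_with_inodes. It is not taken from iupdatedb; it only shows that -xdev stops find from crossing mount points, so every inode number in the listing comes from the same file system.

```shell
# Hypothetical helper: print "full_filename/inode" for everything under $1,
# without descending into other mounted file systems (-xdev).
# %p is the path as find reports it, %i is the inode number (GNU find only).
list_with_inodes() {
    find "$1" -xdev -printf '%p/%i\n'
}
```

For example: list_with_inodes . > listing.txt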

The iupdatedb provided here uses the current directory as its temporary directory by default, instead of /var/tmp as updatedb does.

Comments / suggestions are appreciated. And it is all GPLed so you can continue 'adjusting' the ilocate :-)