...
```
tail -c +17185 recovery2_failing.tar > recovery2_working.tar
```
This command copies everything from recovery2_failing.tar into recovery2_working.tar, starting at byte 17185.
Great, now we have a "recovery2_working.tar" tar file, which WORKS!
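The same trick can be reproduced end to end on a toy archive. This is a sketch with made-up file names; it relies on the fact that tar headers sit on 512-byte boundaries, so skipping to a later header yields a readable (partial) archive:

```shell
# Work in a scratch directory; all names here are hypothetical.
cd "$(mktemp -d)"
printf 'aaa' > first.txt
printf 'bbb' > second.txt
tar -cf demo.tar first.txt second.txt

# Each tar header is 512 bytes and file data is padded to 512-byte
# multiples, so second.txt's header starts at byte offset 1024.
# tail -c +N is 1-indexed, hence +1025.
tail -c +1025 demo.tar > partial.tar

# The partial archive lists only the surviving entry.
tar -tf partial.tar
```

In the real recovery, 17185 was simply the first clean header boundary found after the corrupted region.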
...
Well, right, we did it.
What can we learn from it?
For one thing, thou shalt not use gzip-compressed archives for relatively critical stuff,
because if the archive ever gets corrupted, everything after the damage is simply lost. Sad story, huh?
Second thing, tar archives cope quite well with data corruption; at least,
they fare much better than gzipped files.
Here, we could restore everything from a .tar.bz2 file, EXCEPT what was
within the corrupted bzip2 block, plus everything up to the first clean header
after that block. To sum it up: we lost one block, and any file with
either its header or part of its body in that block.
If you are saving critical stuff, you can tell bzip2 to use a 100 kB block size.
If your archive gets corrupted, you lose a multiple of 100 kB,
against a multiple of 900 kB with the default 900 kB block size,
which can actually make a BIG difference!
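For instance, the block size can be chosen at compression time: bzip2's -1 through -9 flags select block sizes of 100 kB through 900 kB (the default is -9). A minimal sketch, with hypothetical file and directory names:

```shell
# Directory and file names below are made up for illustration.
cd "$(mktemp -d)"
mkdir -p important-stuff
echo "critical data" > important-stuff/notes.txt

# -1 selects a 100 kB bzip2 block size (default is -9, i.e. 900 kB).
tar -cf - important-stuff | bzip2 -1 > backup.tar.bz2

# Decompression and listing work exactly as usual:
bzip2 -dc backup.tar.bz2 | tar -tf -
```

The trade-off is a slightly worse compression ratio: smaller blocks give bzip2 less context to work with.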
Addendum: Expected Minimal Data Loss
Best case (minimal loss): no file has its header within a corrupted block with its data in other blocks; you lose exactly the corrupted blocks.
Worst case (maximal loss): each corrupted block contains the header of a big file. The whole block is lost, plus that file. (Hypothetically, an unlimited amount of data can be lost; it could be a 100 GB file...)
Block Size | Minimal loss / N corrupted blocks
---|---
100 kB | 100 × N kB
200 kB | 200 × N kB
300 kB | 300 × N kB
400 kB | 400 × N kB
500 kB | 500 × N kB
600 kB | 600 × N kB
700 kB | 700 × N kB
800 kB | 800 × N kB
900 kB | 900 × N kB
Please note that, statistically, over a large number of corrupted blocks, if the average file size is M kB, the expected data loss is around (block size + M) × N kB for N corrupt blocks.
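That estimate is easy to check with simple arithmetic; the numbers below are made up for illustration:

```shell
# Expected loss ≈ (block size + average file size) × number of corrupt blocks.
# All values are hypothetical, in kB.
block_kb=100       # bzip2 -1 block size
avg_file_kb=250    # assumed average file size
corrupt_blocks=4   # assumed number of corrupted blocks
echo "$(( (block_kb + avg_file_kb) * corrupt_blocks )) kB expected loss"
```

With a 900 kB block size and the same assumptions, the same formula gives (900 + 250) × 4 = 4600 kB, which is the "BIG difference" mentioned above.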