What does shred actually do?
May 22 2024
(This post is a draft. More to come later when I feel like writing it)
When something really needs to be erased, it’s common to reach for shred from GNU coreutils. I blame reading the alt text on this xkcd when I was like 15 for my initial knowledge of shred (and xargs):

I’ve wanted to explore what shred is actually doing under the hood for a little while. For a far more in-depth guide I’d recommend looking at the official docs.
A little history
shred’s source can be found on git.savannah.gnu.org in the coreutils project. Or, if you prefer, the GitHub mirror works too.
It’s a part of the GNU coreutils and was written by Colin Plumb, who is amazingly hard to find information on. One mention of him I can find is this dubious reddit thread from 16 years ago. Apparently I’m not the only one to search for information on this programmer. And since one of the few mentions I can find of him is in the Wikipedia article on a method for securely erasing data, I suppose it’s hardly surprising that this is the sort of person to simply disappear, especially if they’re working for a government agency known for its secrecy, as the reddit thread implies. Who knew technical writeups of random system utilities could veer so far into cloak-and-dagger territory?
Both the man page and source comments remark upon “sophisticated methods” for recovery. I’ll let the paper and the comments in the source remark upon the finer details save for this paragraph which I’ll include here:
Just for the record, reversing one or two passes of disk overwrite
is not terribly difficult with hardware help. Hook up a good-quality
digitizing oscilloscope to the output of the head preamplifier and copy
the high-res digitized data to a computer for some off-line analysis.
Read the "current" data and average all the pulses together to get an
"average" pulse on the disk. Subtract this average pulse from all of
the actual pulses and you can clearly see the "echo" of the previous
data on the disk.
Easy enough, right? I often do that in the evenings to relax. Though I wonder how well that particular technique has aged in the face of modern hard drives.
Digging in
The man page outlines the following options:
-f, --force
change permissions to allow writing if necessary
-n, --iterations=N
overwrite N times instead of the default (3)
--random-source=FILE
get random bytes from FILE
-s, --size=N
shred this many bytes (suffixes like K, M, G accepted)
-u deallocate and remove file after overwriting
--remove[=HOW]
like -u but give control on HOW to delete; See below
-v, --verbose
show progress
-x, --exact
do not round file sizes up to the next full block;
this is the default for non-regular files
-z, --zero
add a final overwrite with zeros to hide shredding
Let’s explore them.
I’m going to be using hexyl, a little hex viewer written in Rust, to inspect the contents of a file at the byte level and see what changes. Also bat, because it’s great.
Basic Case (no options)
λ echo 'sphinx of black quartz judge my vow' > foo.txt
λ bat foo.txt
sphinx of black quartz judge my vow
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 73 70 68 69 6e 78 20 6f ┊ 66 20 62 6c 61 63 6b 20 │sphinx o┊f black │
│00000010│ 71 75 61 72 74 7a 20 6a ┊ 75 64 67 65 20 6d 79 20 │quartz j┊udge my │
│00000020│ 76 6f 77 0a ┊ │vow_ ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
λ shred foo.txt
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ ab 9d fe 9c ad fa de 97 ┊ cf 14 6b 3d ea 9e 8d 97 │××××××××┊וk=××××│
│00000010│ 8f f2 38 e9 10 6d 52 51 ┊ e3 fa 5a 10 a5 24 a2 05 │××8וmRQ┊××Z•×$ו│
│00000020│ d0 5c 78 16 70 7b f9 4d ┊ ef 14 50 60 4e fd 27 0a │×\x•p{×M┊וP`N×'_│
│00000030│ 3e 64 3c 81 79 15 da 3d ┊ 9d d8 01 32 c3 4a e7 4f │>d<×y•×=┊×ו2×J×O│
│00000040│ b9 64 2c 25 1a 9c a9 59 ┊ aa 15 be 4e ad d2 8d b1 │×d,%•××Y┊ו×N××××│
│00000050│ be 09 63 07 b0 5f 2f b2 ┊ 7d c5 95 46 fc 4d 74 33 │×_c•×_/×┊}××F×Mt3│
│00000060│ a6 4e fc 7f 70 16 fe e4 ┊ 24 df 35 6e 36 cf ba 3c │×Nוp•××┊$×5n6××<│
│00000070│ 2a af 07 60 dd c1 57 83 ┊ 33 2d 1b ac 67 78 e1 4f │*ו`××W×┊3-•×gx×O│
│00000080│ 77 9e d8 45 51 b2 99 ee ┊ c2 81 4d 2f 5f 9c 51 bd │w××EQ×××┊××M/_×Q×│
│00000090│ 65 61 8b 93 4a f2 84 a3 ┊ bd 1f 4e c8 1c 0f ca 0f │ea××J×××┊וNו•ו│
│000000a0│ fa 05 43 b3 78 da 9d e4 ┊ f5 58 53 d3 45 d7 f1 02 │וC×x×××┊×XS×E×ו│
│000000b0│ 35 3d 1d 13 54 3e ac 18 ┊ 14 09 39 22 2b d0 15 28 │5=••T>ו┊•_9"+ו(│
│000000c0│ 55 03 6a b6 d8 92 85 99 ┊ 85 2b c6 7d c2 2f 5c bc │U•j×××××┊×+×}×/\×│
│000000d0│ 83 03 29 97 da af ab 2a ┊ 5d 84 be 8d fb e4 05 c7 │ו)××××*┊]××××ו×│
│000000e0│ 9f 29 b9 9f d6 81 e8 60 ┊ f3 44 0f b9 95 ad 48 05 │×)×××××`┊×D•×××H•│
│000000f0│ 33 d8 80 c0 46 04 bc c9 ┊ 46 58 ec a9 7b c2 60 10 │3×××F•××┊FX××{×`•│
-- snip --
│00000f40│ 71 21 da 30 ad de aa b3 ┊ 48 b6 a9 a4 70 b6 60 1d │q!×0××××┊H×××p×`•│
│00000f50│ d3 15 e3 fb a5 e1 93 f7 ┊ 2d 69 c3 af 07 38 ff d4 │ו××××××┊-i×ו8××│
│00000f60│ 22 fd 0e 8a bf e7 eb 58 ┊ 98 19 ab a9 77 b5 f3 42 │"ו××××X┊ו××w××B│
│00000f70│ 2d 47 4b c6 d2 9c fb f8 ┊ 1a b4 c2 27 9e 5c d1 16 │-GK×××××┊•××'×\ו│
│00000f80│ b3 f7 23 6f e5 fe a6 2a ┊ 61 30 cd e8 8c 76 bc 9f │××#o×××*┊a0×××v××│
│00000f90│ fb cc 34 6c 38 55 e1 39 ┊ 73 45 4d 2f e5 16 cb fd │××4l8U×9┊sEM/ו××│
│00000fa0│ 89 71 ee f4 f8 55 ee 77 ┊ cf 7e e4 d8 c2 fd 9e 7e │×q×××U×w┊×~×××××~│
│00000fb0│ a4 f4 39 f7 7c f3 b3 8f ┊ 3d a0 41 c0 8a 12 f6 c2 │××9×|×××┊=×A×ו××│
│00000fc0│ 96 cd 00 ff 20 4f 84 af ┊ dd 83 e1 ac 55 dd a8 6f │××⋄× O××┊××××U××o│
│00000fd0│ eb cb c0 49 f2 cf e5 29 ┊ 45 ba fd 67 04 f5 13 f5 │×××I×××)┊E××g•ו×│
│00000fe0│ 65 74 70 b3 0b 9b 1d e4 ┊ 48 a5 58 e4 ed 8a 30 d1 │etpוו×┊H×X×××0×│
│00000ff0│ 8a 29 48 06 e8 c8 b1 1b ┊ 2c da 93 4f 23 c9 99 fa │×)H•××ו┊,××O#×××│
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
Not bad.
What’s actually happening here?
From a quick glance at the source, most of the options boil down to variations on this function:
/*
 * Finally, the function that actually takes a filename and grinds
 * it into hamburger.
--snip--
 */
static bool
wipefile (char *name, char const *qname,
          struct randint_source *s, struct Options const *flags)
{
  bool ok;
  int fd;
  fd = open (name, O_WRONLY | O_NOCTTY | O_BINARY);
  if (fd < 0
      && (errno == EACCES && flags->force)
      && chmod (name, S_IWUSR) == 0)
    fd = open (name, O_WRONLY | O_NOCTTY | O_BINARY);
  if (fd < 0)
    {
      error (0, errno, _("%s: failed to open for writing"), qname);
      return false;
    }
  ok = do_wipefd (fd, qname, s, flags);
  if (close (fd) != 0)
    {
      error (0, errno, _("%s: failed to close"), qname);
      ok = false;
    }
  if (ok && flags->remove_file)
    ok = wipename (name, qname, flags);
  return ok;
}
Though if you’re observant, you may notice that the real function doing the heavy lifting here is do_wipefd():
static bool
do_wipefd (int fd, char const *qname, struct randint_source *s,
           struct Options const *flags)
{
  size_t i;
  struct stat st;
  off_t size; /* Size to write, size to read */
  off_t i_size = 0; /* For small files, initial size to overwrite inode */
  unsigned long int n; /* Number of passes for printing purposes */
  int *passarray;
  bool ok = true;
  struct randread_source *rs;
  n = 0; /* dopass takes n == 0 to mean "don't print progress" */
  if (flags->verbose)
    n = flags->n_iterations + flags->zero_fill;
  if (fstat (fd, &st))
    {
      error (0, errno, _("%s: fstat failed"), qname);
      return false;
    }
  /* If we know that we can't possibly shred the file, give up now.
     Otherwise, we may go into an infinite loop writing data before we
     find that we can't rewind the device. */
  if ((S_ISCHR (st.st_mode) && isatty (fd))
      || S_ISFIFO (st.st_mode)
      || S_ISSOCK (st.st_mode))
    {
      error (0, 0, _("%s: invalid file type"), qname);
      return false;
    }
  else if (S_ISREG (st.st_mode) && st.st_size < 0)
    {
      error (0, 0, _("%s: file has negative size"), qname);
      return false;
    }
  /* Allocate pass array */
  passarray = xnmalloc (flags->n_iterations, sizeof *passarray);
  size = flags->size;
  if (size == -1)
    {
      if (S_ISREG (st.st_mode))
        {
          size = st.st_size;
          if (! flags->exact)
            {
              /* Round up to the nearest block size to clear slack space. */
              off_t remainder = size % STP_BLKSIZE (&st);
              if (size && size < STP_BLKSIZE (&st))
                i_size = size;
              if (remainder != 0)
                {
                  off_t size_incr = STP_BLKSIZE (&st) - remainder;
                  size += MIN (size_incr, OFF_T_MAX - size);
                }
            }
        }
      --snip--
    }
  else if (S_ISREG (st.st_mode)
           && st.st_size < MIN (STP_BLKSIZE (&st), size))
    i_size = st.st_size;
  /* Schedule the passes in random order. */
  genpattern (passarray, flags->n_iterations, s);
  rs = randint_get_source (s);
  while (true)
    {
      off_t pass_size;
      unsigned long int pn = n;
      if (i_size)
        {
          pass_size = i_size;
          i_size = 0;
          pn = 0;
        }
      else if (size)
        {
          pass_size = size;
          size = 0;
        }
      /* TODO: consider handling tail packing by
         writing the tail padding as a separate pass,
         (that would not rewind). */
      else
        break;
      for (i = 0; i < flags->n_iterations + flags->zero_fill; i++)
        {
          int err = 0;
          int type = i < flags->n_iterations ? passarray[i] : 0;
          err = dopass (fd, &st, qname, &pass_size, type, rs, i + 1, pn);
          if (err)
            {
              ok = false;
              if (err < 0)
                goto wipefd_out;
            }
        }
    }
  /* Now deallocate the data. The effect of ftruncate is specified
     on regular files and shared memory objects (also directories, but
     they are not possible here); don't worry about errors reported
     for other file types. */
  if (flags->remove_file && ftruncate (fd, 0) != 0
      && (S_ISREG (st.st_mode) || S_TYPEISSHM (&st)))
    {
      error (0, errno, _("%s: error truncating"), qname);
      ok = false;
      goto wipefd_out;
    }
 wipefd_out:
  free (passarray);
  return ok;
}
I won’t paste any more of the code here. But I think that’s enough for a cursory glance.
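One detail from do_wipefd() worth isolating is the rounding step, which would explain why the 36-byte foo.txt came back as 4096 bytes in the hexdump above. Here is a minimal sketch of that arithmetic, not the real code: round_up_to_block is a name I made up, the block size is passed in by hand (shred asks the filesystem via STP_BLKSIZE), and int64_t stands in for off_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of shred: round SIZE up to the next
   multiple of BLOCK, saturating at INT64_MAX instead of overflowing,
   in the spirit of the MIN (size_incr, OFF_T_MAX - size) guard. */
static int64_t
round_up_to_block (int64_t size, int64_t block)
{
  assert (size >= 0 && block > 0);
  int64_t remainder = size % block;
  if (remainder == 0)
    return size;                /* already a whole number of blocks */
  int64_t incr = block - remainder;
  int64_t headroom = INT64_MAX - size;
  return size + (incr < headroom ? incr : headroom);
}
```

Assuming a 4096-byte block, round_up_to_block (36, 4096) gives 4096, which matches the last offset in the dump (00000ff0 plus one 16-byte row).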
(Sphinx of black quartz, judge my vow is a pangram. Though for the remainder of the article I believe I actually ended up using “Sphinx of black quartz, hear my vow”, which is not.)
Let’s look at the options themselves:
Option: --force
-f, --force
change permissions to allow writing if necessary
Fairly straightforward. We can test that with:
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ ls -la foo.txt
.rw-rw-r-- kingsfoil kingsfoil 35 B Wed May 22 23:01:11 2024 foo.txt
λ chmod -w foo.txt
λ ls -la foo.txt
.r--r--r-- kingsfoil kingsfoil 35 B Wed May 22 23:01:11 2024 foo.txt
λ shred foo.txt
shred: foo.txt: failed to open for writing: Permission denied
λ shred --force foo.txt
λ ## Yep
Handy, but not terribly interesting.
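For the curious, the --force behaviour comes from the first few lines of wipefile(): try to open the file, and if that fails with EACCES, chmod it and try again. Below is a stripped-down sketch of just that logic. open_for_overwrite is a made-up name, and O_BINARY is dropped on the assumption of a POSIX system where it isn’t needed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper mirroring the open/chmod/open sequence in
   wipefile(): try to open PATH for writing and, if that fails with
   EACCES and FORCE is set, make the file owner-writable and retry
   once.  Returns a file descriptor, or -1 with errno set. */
static int
open_for_overwrite (char const *path, bool force)
{
  assert (path != NULL);
  int fd = open (path, O_WRONLY | O_NOCTTY);
  if (fd < 0 && errno == EACCES && force
      && chmod (path, S_IWUSR) == 0)
    fd = open (path, O_WRONLY | O_NOCTTY);
  return fd;
}
```

One thing the sketch makes visible: chmod (path, S_IWUSR) replaces the whole mode with owner-write only, so a forced file ends up with mode 0200 rather than its old bits plus write.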
Option: --iterations=N
This is the same as the basic use case, but it will overwrite the file N times. Obviously. In the source this amounts to the for loop in do_wipefd() we saw earlier, which calls dopass() once per iteration. I won’t spend too much time on this other than a basic sanity check:
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ shred foo.txt && hexyl foo.txt | head
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 92 e2 68 9f ac c8 93 6e ┊ 7f f1 d1 00 2c ca ae 5d │××h××××n┊•××⋄,××]│
│00000010│ b7 c0 67 b9 42 58 2f 55 ┊ 87 43 49 e4 25 6f 56 7a │××g×BX/U┊×CI×%oVz│
│00000020│ b2 4b d2 38 13 c3 66 10 ┊ fa a4 e0 b7 18 2f d4 4b │×K×8•×f•┊×××ו/×K│
│00000030│ c7 8b 02 03 65 b0 81 a6 ┊ d9 e3 b4 5f cc e9 10 96 │×ו•e×××┊×××_×ו×│
│00000040│ eb 73 39 be a5 6f 02 d3 ┊ 34 25 24 ee c2 a5 05 6d │×s9××o•×┊4%$××וm│
│00000050│ 28 d2 e9 6e 0e 37 5c fb ┊ ba 65 5e 4f 39 fa ae ce │(××n•7\×┊×e^O9×××│
│00000060│ 5e db 63 f5 12 ae a1 8b ┊ ca 0f 9b 0a 19 30 39 95 │^×cו×××┊ו×_•09×│
│00000070│ 18 76 4c 85 eb ae 37 d9 ┊ 67 7f 6c d4 26 2c 63 56 │•vL×××7×┊g•l×&,cV│
│00000080│ 14 9e 93 8c 45 ee bb 69 ┊ 91 a4 dd e4 b1 fc 14 80 │•×××E××i┊×××××ו×│
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ shred --iterations=3 foo.txt && hexyl foo.txt | head
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 43 ff 1b 70 38 5b f1 e2 ┊ 98 78 89 9b 71 0e 1a 72 │Cוp8[××┊×x××q••r│
│00000010│ c2 cd 5f 66 97 65 32 5b ┊ b5 d1 52 93 e4 99 70 c0 │××_f×e2[┊××R×××p×│
│00000020│ 30 7e a4 06 74 4d e1 6f ┊ 93 7c f5 cd 8d 96 ba 27 │0~וtM×o┊×|×××××'│
│00000030│ f9 d6 14 c0 80 bd 72 0f ┊ df dc f2 18 93 cc ed 06 │×ו×××r•┊××ו××ו│
│00000040│ 62 91 77 a3 0b 86 6d 95 ┊ 18 11 a0 22 df 1b 22 1b │b×wו×m×┊••×"ו"•│
│00000050│ 8e e0 6f 3f 4d fe ee bd ┊ 3f 31 04 ad 70 7d 3a da │××o?M×××┊?1•×p}:×│
│00000060│ 51 6a 4d 8e 51 78 a8 8e ┊ 30 98 ca 9f a2 7e fb 2a │QjM×Qx××┊0××××~×*│
│00000070│ 03 8c 54 29 48 df 4c b2 ┊ 70 07 b3 da 1c f7 7f 41 │•×T)H×L×┊p•×ווA│
│00000080│ 98 5c 37 36 a9 b5 3a e2 ┊ 50 2f fd 61 4c 44 7d ea │×\76××:×┊P/×aLD}×│
Option: --random-source=FILE
Now this is where things get cool and UNIX-like. The obvious source for random data is, of course, /dev/urandom. The GNU docs have a good little blurb on other sources you can choose.
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ shred --random-source=/dev/urandom foo.txt && hexyl foo.txt | head
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 38 3b 5f 85 07 85 db eb ┊ 3e f8 b6 14 0a 1f e4 5c │8;_ו×××┊>×ו_•×\│
│00000010│ aa ed 02 02 64 d7 bd 39 ┊ 89 eb 43 e4 f1 b2 9e c3 │×ו•d××9┊××C×××××│
│00000020│ 3f 22 3a 99 14 1d 51 47 ┊ 78 43 3c 39 fe b9 aa 51 │?":ו•QG┊xC<9×××Q│
│00000030│ 01 54 f6 ff d0 69 01 10 ┊ 3f fe 78 d0 52 9c 54 b6 │•T×××i••┊?×x×R×T×│
│00000040│ ad 3f ce a9 24 ce 10 f6 ┊ c3 e9 bc f1 fc c3 fa 29 │×?××$ו×┊×××××××)│
│00000050│ 1b 8a bb db c4 d7 94 95 ┊ c7 f9 3f 2a 62 60 22 61 │•×××××××┊××?*b`"a│
│00000060│ b3 cf 36 db c6 c2 37 57 ┊ 2f e0 24 57 f0 2d 99 64 │××6×××7W┊/×$W×-×d│
│00000070│ 6f ed 8d dc 2f b5 66 26 ┊ 56 29 56 e4 7f d9 ea a2 │o×××/×f&┊V)Vו×××│
│00000080│ 26 69 ce bc b0 51 05 1f ┊ ad f6 bc 79 0c f6 6d bb │&i×××Q••┊×××y_×m×│
Nothing too different. Choosing different random number sources is, naturally, its own rabbit hole beyond the scope of this article.
We could pick /dev/zero if we wanted as well. Though we’ll see later on that this is roughly the equivalent of using the --zero option.
λ shred --random-source=/dev/zero foo.txt && hexyl foo.txt | head
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │⋄⋄⋄⋄⋄⋄⋄⋄┊⋄⋄⋄⋄⋄⋄⋄⋄│
│* │ ┊ │ ┊ │
│00001000│ ┊ │ ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
The only limitation on the file you choose here has to do with available bytes. /dev/zero, /dev/urandom and /dev/random are special files that continuously generate bytes as they are read from. If you provide a file that is too short, shred will fail and leave your target file unaltered.
However, if you specify one that is longer:
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ bat zork.txt
Oh ye who go about saying unto each: "Hello sailor":
Dost thou know the magnitude of thy sin before the gods?
Yea, verily, thou shalt be ground between two stones.
Shall the angry gods cast thy body into the whirlpool?
Surely, thy eye shall be put out with a sharp stick!
Even unto the ends of the earth shalt thou wander and
unto the land of the dead shalt thou be sent at last.
Surely thou shalt repent of thy cunning.
λ shred --random-source=./zork.txt foo.txt
shred: ‘./zork.txt’: end of file
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 68 65 20 6d 61 67 6e 69 ┊ 74 75 64 65 20 6f 66 20 │he magni┊tude of │
│00000010│ 74 68 79 20 73 69 6e 20 ┊ 62 65 66 6f 72 65 20 74 │thy sin ┊before t│
│00000020│ 68 65 20 ┊ │he ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
λ bat foo.txt
he magnitude of thy sin before the
Why those bytes from the random file in particular are selected is left as an exercise to the reader. But if you’ve ever thought of writing insulting messages in the faint afterimages of a magnetic echo on a hard disk, this is likely to be a viable method. Remember you heard it here first.
Option: --size=N
This one is sort of fun. I can’t think of a practical reason to use it, but I’m sure it exists.
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 73 70 68 69 6e 78 20 6f ┊ 66 20 62 6c 61 63 6b 20 │sphinx o┊f black │
│00000010│ 71 75 61 72 74 7a 20 68 ┊ 65 61 72 20 6d 79 20 76 │quartz h┊ear my v│
│00000020│ 6f 77 0a ┊ │ow_ ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
λ shred --size=12 foo.txt
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 54 91 41 44 0f 92 d1 5f ┊ 8e 8d 05 05 61 63 6b 20 │T×AD•××_┊×ו•ack │
│00000010│ 71 75 61 72 74 7a 20 68 ┊ 65 61 72 20 6d 79 20 76 │quartz h┊ear my v│
│00000020│ 6f 77 0a ┊ │ow_ ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
First let’s establish what the normal behavior looks like. Theoretically, setting --size to the file’s size in bytes should overwrite the same data as calling shred with no arguments (minus the rounding up to a full block that we saw in the basic case):
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 73 70 68 69 6e 78 20 6f ┊ 66 20 62 6c 61 63 6b 20 │sphinx o┊f black │
│00000010│ 71 75 61 72 74 7a 20 68 ┊ 65 61 72 20 6d 79 20 76 │quartz h┊ear my v│
│00000020│ 6f 77 0a ┊ │ow_ ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
λ wc --bytes foo.txt
35 foo.txt
λ shred --size=35 foo.txt ## Identical to a normal call to shred
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 46 75 e0 3b b2 69 75 05 ┊ 5a 65 2a b5 87 26 9a 2d │Fu×;×iu•┊Ze*××&×-│
│00000010│ cf 68 bb 0b 82 d0 80 ba ┊ 46 42 87 33 79 45 78 00 │×hו××××┊FB×3yEx⋄│
│00000020│ 82 9f 83 ┊ │××× ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
And it looks like that’s the case. I would expect a size of n+1 bytes to also be equivalent to an invocation of shred with no arguments. Sure enough:
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 73 70 68 69 6e 78 20 6f ┊ 66 20 62 6c 61 63 6b 20 │sphinx o┊f black │
│00000010│ 71 75 61 72 74 7a 20 68 ┊ 65 61 72 20 6d 79 20 76 │quartz h┊ear my v│
│00000020│ 6f 77 0a ┊ │ow_ ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
λ shred --size=36 foo.txt ## n+1 bytes
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 67 6f d7 fb ab 70 d4 fb ┊ 87 c3 dd 93 67 02 ef f7 │go×××p××┊××××g•××│
│00000010│ 28 17 0c 49 73 ec 50 ee ┊ 5e 51 ae 5a 4e 18 53 10 │(•_Is×P×┊^Q×ZN•S•│
│00000020│ 66 76 da f4 ┊ │fv×× ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
(Note that the last byte of the file is a 0a, which is of course a line feed. Hexyl represents this with an _, which is why the final four characters in the file are 76 6f 77 0a or vow_.)
What about the n-1 case?
λ echo 'sphinx of black quartz hear my vow' > foo.txt
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 73 70 68 69 6e 78 20 6f ┊ 66 20 62 6c 61 63 6b 20 │sphinx o┊f black │
│00000010│ 71 75 61 72 74 7a 20 68 ┊ 65 61 72 20 6d 79 20 76 │quartz h┊ear my v│
│00000020│ 6f 77 0a ┊ │ow_ ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
λ wc --bytes foo.txt
35 foo.txt
λ shred --size=34 foo.txt ## n-1 bytes
λ hexyl foo.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 58 a5 36 b0 e5 dd ee f5 ┊ b9 fa a7 7e ec b0 5c 8c │X×6×××××┊×××~××\×│
│00000010│ 6c b7 87 77 20 4e d2 b9 ┊ ea a8 26 0f a9 35 55 0d │l××w N××┊××&•×5U_│
│00000020│ 50 c7 0a ┊ │P×_ ┊ │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘
And, as expected, the final byte of the file, 0a, remains unchanged.
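The behaviour across these runs is consistent with shred simply writing N bytes starting at offset 0 and never touching anything past them. Here is a minimal sketch of that idea, not shred’s actual code. overwrite_prefix is a name made up for illustration, and it writes a fixed fill byte instead of random data so the result is easy to check.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first N bytes of PATH with FILL,
   starting at offset 0, and leave everything after them untouched.
   Returns 0 on success, -1 on failure. */
static int
overwrite_prefix (char const *path, size_t n, unsigned char fill)
{
  assert (path != NULL);
  unsigned char chunk[4096];
  memset (chunk, fill, sizeof chunk);
  int fd = open (path, O_WRONLY);   /* no O_TRUNC: keep existing data */
  if (fd < 0)
    return -1;
  while (n > 0)
    {
      size_t want = n < sizeof chunk ? n : sizeof chunk;
      ssize_t wrote = write (fd, chunk, want);
      if (wrote <= 0)
        {
          close (fd);
          return -1;
        }
      n -= (size_t) wrote;
    }
  return close (fd);
}
```

Calling overwrite_prefix ("foo.txt", 34, 'X') on the 35-byte test file should leave 34 X bytes followed by the original 0a.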
What if we gave it an egregiously large number?
λ echo 'sphinx of black quartz hear my vow' > foo.txt
~/misc via C v11.4.0-gcc took 2s
λ shred --size=5G foo.txt
~/misc via C v11.4.0-gcc took 36s
λ wc --bytes foo.txt
5368709120 foo.txt
~/misc via C v11.4.0-gcc
λ du -h foo.txt
5.1G foo.txt
~/misc via C v11.4.0-gcc
λ hexyl foo.txt | head
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ ac ef eb 4f 12 ee c8 b4 ┊ ab 08 a5 73 7c 2f f4 61 │×××O•×××┊ו×s|/×a│
│00000010│ a3 85 3b 8f 83 37 b5 73 ┊ 0b a1 32 ce a6 02 89 60 │××;××7×s┊•×2×ו×`│
│00000020│ 0d 9d bb e1 40 05 49 3e ┊ a3 f3 f8 b7 e1 1f a7 a2 │_×××@•I>┊××××ו××│
│00000030│ aa df 5e 5f 47 17 4e d4 ┊ 30 b0 3d 9e 4a c1 dc d5 │××^_G•N×┊0×=×J×××│
│00000040│ 59 11 4d 66 60 af a2 a8 ┊ f1 86 cc 66 9b ad 20 1a │Y•Mf`×××┊×××f×× •│
│00000050│ c0 81 eb 1a 6b 44 bb ff ┊ 4c 6c 33 3a 5a 43 bd 96 │××וkD××┊Ll3:ZC××│
│00000060│ 21 57 9a 4c 00 13 3f ee ┊ c6 71 b3 49 bb c9 ed 86 │!W×L⋄•?×┊×q×I××××│
│00000070│ 81 96 1f d0 1a 24 98 79 ┊ f3 73 70 5a 46 50 08 8e │×וו$×y┊×spZFP•×│
│00000080│ a4 6b fb 94 78 83 e9 98 ┊ 5f 48 84 d7 bd 9d 5c 2d │×k××x×××┊_H××××\-│
Well, that’s certainly interesting: shred happily grew the 35-byte file to 5 GiB of random data. What if we combine that with a not-so-random data source like we did before?
λ shred --random-source=./zork.txt --size=1G foo.txt
shred: ‘./zork.txt’: end of file
λ bat foo.txt
he magnitude of thy sin before the
λ wc --bytes foo.txt
35 foo.txt
Same as before. Remember earlier when I said that the reason those bytes were selected specifically is left as an exercise to the reader? That was code for me being too lazy to look it up. But I don’t think I can in good conscience say that if this is to truly be a deep dive on shred. Let’s look at teh codez:
Remember earlier when I said that the function do_wipefd() did all the heavy lifting? Let’s go back there. I’m going to focus on the parts of the function that involve the random data.
First off, the program defines this just before the main() function:
static struct randint_source *randint_source;
Which is passed into do_wipefd():
do_wipefd (int fd, char const *qname, struct randint_source *s,
           struct Options const *flags)
{
  size_t i;
  struct stat st;
  off_t size; /* Size to write, size to read */
  off_t i_size = 0; /* For small files, initial size to overwrite inode */
  unsigned long int n; /* Number of passes for printing purposes */
  int *passarray;
  bool ok = true;
  struct randread_source *rs;
  --snip--
  rs = randint_get_source (s);
  ...
}
randint_get_source() is not actually defined in shred.c. As best I can tell it comes from randint.c, where it looks like this:
struct randread_source *
randint_get_source (struct randint_source const *s)
{
  return s->source;
}
That same file defines this struct:
/* A source of random data for generating random integers. */
struct randint_source
{
  /* The source of random bytes. */
  struct randread_source *source;
  /* RANDNUM is a buffered random integer, whose information has not
     yet been delivered to the caller. It is uniformly distributed in
     the range 0 <= RANDNUM <= RANDMAX. If RANDMAX is zero, then
     RANDNUM must be zero (and in some sense it is not really
     "random"). */
  randint randnum;
  randint randmax;
};
And to further round out the picture, that struct randread_source *source; member refers to a struct defined in randread.c:
/* A source of random data for generating random buffers. */
struct randread_source
{
  /* Stream to read random bytes from. If null, the current
     implementation uses an internal PRNG (ISAAC). */
  FILE *source;
  /* Function to call, and its argument, if there is an input error or
     end of file when reading from the stream; errno is nonzero if
     there was an error. If this function returns, it should fix the
     problem before returning. The default handler assumes that
     handler_arg is the file name of the source. */
  void (*handler) (void const *);
  void const *handler_arg;
  /* The buffer for SOURCE. It's kept here to simplify storage
     allocation and to make it easier to clear out buffered random
     data. */
  union
  {
    /* The stream buffer, if SOURCE is not null. */
    char c[RANDREAD_BUFFER_SIZE];
    /* The buffered ISAAC pseudorandom buffer, if SOURCE is null. */
    struct isaac
    {
      /* The number of bytes that are buffered at the end of data.b. */
      size_t buffered;
      /* State of the ISAAC generator. */
      struct isaac_state state;
      /* Up to a buffer's worth of pseudorandom data. */
      union
      {
        isaac_word w[ISAAC_WORDS];
        unsigned char b[ISAAC_BYTES];
      } data;
    } isaac;
  } buf;
};
I’m hitting some of my limits in terms of general knowledge regarding C, and there’s a bit more going on here that has to do with the random data generation when the file source is null. Specifically, it’s using ISAAC, which is a pseudorandom number generator (PRNG) that I had fun reading about in the process of writing this post. Though I won’t claim the requisite knowledge to understand how it works.
But I think we have sufficient information to make some educated guesses about what’s going on here. We can actually ignore most of the PRNG stuff that the functionality here defaults to because we’re investigating why shred behaves a certain way when given a file to read from.
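To make the “stream or PRNG” split concrete, here is a heavily simplified sketch of the idea behind struct randread_source: read from a stream when one is set, otherwise generate bytes internally. Everything in it (byte_source, next_word, source_read) is invented for illustration, and the xorshift generator is only a placeholder marking where ISAAC would sit, not a substitute for it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified, hypothetical cousin of struct randread_source: a stream
   to read bytes from, or NULL to fall back to an internal generator. */
struct byte_source
{
  FILE *stream;     /* NULL means "use the built-in generator" */
  uint32_t state;   /* generator state; must be nonzero when used */
};

/* Placeholder generator (xorshift32).  This is NOT ISAAC and is not
   suitable for anything security related. */
static uint32_t
next_word (uint32_t *state)
{
  uint32_t x = *state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  return *state = x;
}

/* Fill BUF with up to N bytes.  Returns the number of bytes produced,
   which is less than N only when a stream source hits end of file or
   a read error. */
static size_t
source_read (struct byte_source *s, unsigned char *buf, size_t n)
{
  assert (s != NULL && buf != NULL);
  if (s->stream)
    return fread (buf, 1, n, s->stream);
  for (size_t i = 0; i < n; i++)
    buf[i] = (unsigned char) (next_word (&s->state) & 0xff);
  return n;
}
```

With stream pointing at a short file such as zork.txt, source_read() would come back short once the file is exhausted. As far as I can tell, that is the condition the real code hands to the handler described in the struct comments.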
Anyways, all of that is to say with a moderate level of certainty that the line
rs = randint_get_source (s);
is returning a pointer to a randread_source struct.
The next time we see the variable rs is here:
for (i = 0; i < flags->n_iterations + flags->zero_fill; i++)
  {
    int err = 0;
    int type = i < flags->n_iterations ? passarray[i] : 0;
    err = dopass (fd, &st, qname, &pass_size, type, rs, i + 1, pn);
    if (err)
      {
        ok = false;
        if (err < 0)
          goto wipefd_out;
      }
  }
in the line:
err = dopass (fd, &st, qname, &pass_size, type, rs, i + 1, pn);
dopass() is a function also defined in shred.c that begins with this:
/*
 * Do pass number K of N, writing *SIZEP bytes of the given pattern TYPE
 * to the file descriptor FD. K and N are passed in only for verbose
 * progress message purposes. If N == 0, no progress messages are printed.
 *
 * If *SIZEP == -1, the size is unknown, and it will be filled in as soon
 * as writing fails with ENOSPC.
 *
 * Return 1 on write error, -1 on other error, 0 on success.
 */
static int
dopass (int fd, struct stat const *st, char const *qname, off_t *sizep,
        int type, struct randread_source *s,
        unsigned long int k, unsigned long int n)
{
  ...
}
This little randread_source struct we’ve been following doesn’t get referenced until much later in this function:
--snip--
randread (s, pbuf, lim);
--snip--
Option: -u
-u deallocate and remove file after overwriting
The point of going through shred in this way is to demonstrate how it’s different from just a plain removal of a file. rm foo.txt and shred -u foo.txt are not the same thing. But how do we demonstrate that?
This is where things get interesting, and we have to talk about how rm actually works.
TODO: Finish this section
Horrible Abuses of shred
TODO: writeme
This post will be updated with additional content when it becomes available.