[buug] I/O redirection and named pipes
Michael.Paoli at cal.berkeley.edu
Tue Aug 3 06:03:36 PDT 2010
A few things from the 2010-07-15 BUUG meeting,
... among other things discussed,
I/O redirection and named pipes
I/O redirection - much of what was said/covered, I've covered before,
even on this list - for starters, see:
In addition to that, a few (additional) points were made/emphasized:
Order matters - I/O redirection is (mostly) processed left-to-right.
Though 2>&1 can be thought of as redirecting file descriptor 2 (stderr)
to file descriptor 1 (stdout), it more literally and properly means
copy file descriptor 1 to file descriptor 2.
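To make the ordering point concrete, here is a minimal sketch. The function name both and the scratch file names are made up for illustration, and mktemp -d is assumed to be available:

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)

# Writes one line to stdout and one line to stderr.
both() { echo out; echo err >&2; }

# Left-to-right: fd 1 is pointed at the file first, then fd 2 becomes
# a copy of fd 1, so both lines land in the file.
both > "$dir/a" 2>&1

# Left-to-right again: fd 2 becomes a copy of fd 1 while fd 1 is still
# the original stdout (here a pipe to cat); only then is fd 1 pointed
# at the file. stderr therefore ends up on the old stdout.
both 2>&1 > "$dir/b" | cat > "$dir/b.oldstdout"

first=$(cat "$dir/a")
second=$(cat "$dir/b")
leaked=$(cat "$dir/b.oldstdout")
printf '%s\n' "a: $first" "b: $second" "old stdout: $leaked"
rm -r "$dir"
```

The same two redirections, written in the opposite order, send stderr to different places - which is why "copy fd 1 to fd 2" is the more reliable way to read 2>&1.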
Using strace(1), and showing only the specific calls of interest, we can
see what happens at the system call level:
$ strace -fv -e trace=dup2,fcntl64 sh -c '2>&1 echo -n ""'
dup2(1, 2) = 2
$ strace -fv -e trace=dup2,fcntl64 dash -c '2>&1 echo -n ""'
fcntl64(1, F_DUPFD, 2) = 2
references: sh(1), strace(1), dup(2), fcntl(2)
And named pipes - rather like shell pipes, but they're a named file on a
filesystem. A few points were made - a writer blocks when opening the
pipe until something has it open for reading, so start the reader before
trying to write the pipe. The pipe stores no data on disk (file of type
pipe/FIFO has no data blocks - though buffered data might be subject to
being paged or swapped to disk). And a practical example.
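Before the practical example, a small sketch of those points. The FIFO name demo.fifo and the scratch directory are hypothetical, and mkfifo(1) is used here as an equivalent of mknod ... p:

```shell
#!/bin/sh
# Hypothetical scratch directory and FIFO name, for illustration only.
dir=$(mktemp -d)
fifo="$dir/demo.fifo"
mkfifo "$fifo"                  # same kind of file as: mknod "$fifo" p

# Start the reader first, in the background; its open of the FIFO
# waits until a writer shows up.
cat "$fifo" > "$dir/received" &

# With a reader waiting, opening the FIFO for writing returns at once.
echo 'hello through the pipe' > "$fifo"
wait

# The first character of the ls -l mode string is the file type.
kind=$(ls -ld "$fifo" | cut -c1)
ls -l "$fifo"                   # a FIFO typically shows a size of 0
received=$(cat "$dir/received")
echo "type: $kind, received: $received"
rm -r "$dir"
```

If the echo were run in the foreground with no reader started, it would simply sit there until something opened the FIFO for reading.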
Let's say we've got a large ISO image file (e.g. DVD, but I'll use CD in
this example), and we want to compute multiple hash algorithms on the
file - but we don't want to have to read the file's blocks from disk
multiple times (which would be inefficient in its redundant disk I/O).
We can use named pipes (and a bit of use of tee(1) and asynchronous job
execution). Let's say we already validated (via gpg) our files giving
I'll introduce my comments with // at the start of the line:
//So, let's say we first snag information on mtime and length from
$ 2>>/dev/null curl -I
http://releases.ubuntu.com/lucid/ubuntu-10.04-server-amd64.iso | fgrep
Last-Modified: Tue, 27 Apr 2010 10:56:34 GMT
//Next we determine block count (2 KiB blocks for CD-ROM/R/RW) and copy
//our data from CD to file:
$ echo '710412288/2048' | bc -l
$ dd if=/media/cdrom0 bs=2048 count=346881 of=ubuntu-10.04-server-amd64.iso
//we then set our mtime to that of the archive copy:
$ TZ=GMT0 touch -t 201004271056.34
//we examine our expected hashes:
$ fgrep ubuntu-10.04-server-amd64.iso *SUMS
//we create our named pipes:
$ mknod p p && mknod p2 p
//we start our processes reading those named pipes:
$ < p > md5 md5sum &
$ < p2 > sha1 sha1sum &
//we then use tee(1) to feed the named pipes
$ < ubuntu-10.04-server-amd64.iso tee p | tee p2 | sha256sum > sha256
//we wait for our background processes to complete then examine our
//results and remove our no longer needed named pipes
$ wait; cat md5 sha1 sha256; rm p p2
//We then compare - confirming all our hashes matched as expected.
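For anyone wanting to try the technique without a CD handy, here is a minimal sketch of the same steps on a tiny stand-in file. The file name sample.img and the scratch directory are made up for illustration; md5sum, sha1sum and sha256sum from GNU coreutils are assumed. It also checks that the hashes fed through the named pipes match hashes computed by reading the file separately each time:

```shell
#!/bin/sh
# Hypothetical scratch directory and stand-in "image", for illustration only.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'hello\n' > sample.img

# Create the named pipes and start their readers in the background.
mkfifo p p2
< p  md5sum  > md5  &
< p2 sha1sum > sha1 &

# Read the file once; tee copies the stream into each named pipe,
# and the last stage hashes the same stream directly.
< sample.img tee p | tee p2 | sha256sum > sha256
wait

# Compare against the wasteful way: three separate reads of the file.
piped=$(cat md5 sha1 sha256)
direct=$(md5sum < sample.img; sha1sum < sample.img; sha256sum < sample.img)
if [ "$piped" = "$direct" ]; then result=match; else result=MISMATCH; fi
echo "$result"

rm p p2
cd / && rm -r "$dir"
```

Each additional hash costs one more named pipe, one more background reader and one more tee stage, but still only one pass over the file.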
Left as an exercise :-) ... How could we be even more disk I/O
efficient? Hint: did we really need to read the file we wrote to disk?