I've started a project on github for my collection of general-purpose shell scripts: the ones I keep in ~/bin on each of my shell accounts. If you have any general purpose utilities, don't hesitate to fork the project; I'm sure we could collectively build a fantastic set of power tools.
I wrote a new one this week, called xip, that is a shell analog of the zip function found in many languages (the name zip is naturally reserved for the pkzip utility). I created this script to join the ranks of diff and comm, commands that benefit from multiple input streams. This comes on the heels of discovering at commandlinefu.com that there's a syntax for subshell fifo replacement, which bash calls process substitution: you can supply a subshell as an argument to a command, and it will be replaced with the file name of a named pipe. Let's take the canonical example:
$ cat a
a
b
c
$ cat b
b
a
$ diff <(sort a) <(sort b)
3d2
< c
To peer under the hood, I used echo.
$ echo <(echo)
/dev/fd/63
Aha! The stream gets passed as a file name argument!
So, this opens up a world of possibilities. Normally you can only build linear pipelines, because programs have just one input stream and one output stream, and that limitation has created a dearth of standard utilities for working with multiple input streams. Before discovering this feature, the command line felt like a programming language whose functions only accept one argument (and no implicit partial application, smarty-pants). Now I feel like I've discovered bash's secret cow level.
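For a concrete picture of what the shell is doing on your behalf, here's the same diff written out by hand with explicit named pipes. This is just an illustration; the pipe names are throwaway ones I made up:

$ mkfifo pipe_a pipe_b
$ sort a > pipe_a &
$ sort b > pipe_b &
$ diff pipe_a pipe_b
3d2
< c
$ rm pipe_a pipe_b

Process substitution sets up the equivalent plumbing (or hands out a /dev/fd path directly, as the echo trick shows) and cleans it up for you.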
So, to remedy the lack of multi-parameter functions in shell, I started by making xip. It takes any number of file names as arguments and interlaces their lines until one of the streams closes.
$ xip <(echo 1; echo 2) <(echo a; echo b)
1
a
2
b
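If you're curious how it works, here is a minimal sketch of the interlacing loop. It is not the actual xip from the repository, just the core idea, and it assumes a reasonably recent bash:

#!/usr/bin/env bash
# Read one line from each input in turn; stop as soon as any input runs dry.
fds=()
for f in "$@"; do
    exec {fd}< "$f"              # open each argument on its own file descriptor
    fds+=("$fd")
done
while :; do
    for fd in "${fds[@]}"; do
        IFS= read -r -u "$fd" line || exit 0
        printf '%s\n' "$line"
    done
done

Because the arguments are plain file names, a script like this works equally well on regular files and on the /dev/fd paths that process substitution hands it.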
You can then pipe that to a while read loop, or an xargs -n 2 loop, to create a table. This example enumerates the lines of a file (use jot on BSD, seq on Linux).
$ xip <(seq `cat a | wc -l`) a | xargs -n 2
1 a
2 b
3 c
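The while read variant mentioned above might look like this; it's just a sketch that relies on xip's alternating-line output, and the label format is arbitrary:

$ xip <(seq `cat a | wc -l`) a | while read num && read line; do
>     echo "$num: $line"
> done
1: a
2: b
3: c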
I suppose the next fun trick is producing multiple output streams, with something like tee and mkfifo. I leave this as an exercise for the reader.
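If you want a head start, one possible direction (a sketch, not a full solution) pairs tee with the output form of process substitution, >(...). The branch commands here are arbitrary examples, and since the branches run concurrently their output can arrive in either order:

$ seq 4 | tee >(awk '{ s += $1 } END { print "sum:", s }') >(wc -l) > /dev/null
sum: 10
4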
I've also included some of my older scripts from back in the days when I was working exclusively on Linux and used mpg123 to play my music. mpg123 is a command-line music player, and it doesn't really have a playlist system built in (there are alternatives for that, but I digress). So, I used a pipeline to generate my playlist stream. cycle, shuffle, and enquote are in the github ~/bin project.
$ find . -name '*.mp3' \
    | cycle \
    | shuffle `find . -name '*.mp3' | wc -l` \
    | enquote \
    | xargs -n 1 mpg123
2 comments:
Producing multiple output streams is fully supported in shell (Bourne, Korn, bash) with external call-outs like tee. Simply use the exec keyword to open/close file descriptors.
@Wes, perhaps you can provide an example. I'm familiar with "tee" but not familiar with how, in conjunction with "exec", this can be harnessed to create branches in a pipeline, which is what I presume you mean to address.