Pipes and streams in Linux
Using stdout and stdin we can connect programs and write to arbitrary files. This level of decoupling makes it simple to swap out one part of a pipeline, and even lets programs written in different programming languages work together in a single pipeline.
It's important not to use stdout for other kinds of output, like progress messages and errors. Write those to stderr instead.
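Inside a script, a progress message can be written to stderr with the >&2 redirection. A minimal sketch:

echo "fetched 10 of 50 pages" >&2

Because stderr bypasses the pipe, ./fetch 2> fetch.log | ./process keeps progress and errors in fetch.log while the data flows on to the next program.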
Output redirection
Write to file
The > redirection operator writes anything that is written to stdout to a file.
./fetch > rawdata.json
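Note that > truncates the file each time. If you want to append to the file instead, use >>.

./fetch >> rawdata.json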
Read from file
The < redirection operator reads a file and streams it to stdin.
./process < rawdata.json
Combining read and write
You can combine the > and < operators, but you can't use them to modify a file in place: the shell truncates the output file before the program has read it.
./process < rawdata.json > data.json
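If you do need to update a file in place, a common workaround is to write to a temporary file and rename it on success (the .tmp name here is just a convention):

./process < data.json > data.json.tmp && mv data.json.tmp data.json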
Pipe
Stream by having fetch write to stdout and process read from stdin.
./fetch | ./process
You can redirect the output of process to a file
./fetch | ./process > data.json
or pipe it to another process.
./fetch | ./process | ./store
tee
The standard Unix program tee can be used to do multiple things with the output of a process.
Pipe and store
You can give tee a file to save the data to while keeping the rest of your pipeline intact.
./fetch | ./process | tee data.json | ./store
Pipe to multiple processes
tee can also be used to create a subpipeline by writing to a process substitution.
./fetch | ./process | tee >(./fetch-images | ./store-images) | ./store-data
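The >(...) syntax is bash process substitution, so this requires bash rather than plain sh. tee also accepts several substitutions at once; as a sketch with hypothetical scripts, you can fan out to parallel subpipelines and discard tee's own stdout:

./fetch | tee >(./store-raw) >(./process | ./store-data) > /dev/null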
Stream over the network
We can use simple TCP socket streams to pipe data to another server. This allows splitting a pipeline across machines.
On server B (192.168.0.101), listen on port 9000 and stream the incoming data to process and store.
nc -l 9000 | ./process | ./store
On server A we fetch and write the contents to port 9000 of server B.
./fetch > /dev/tcp/192.168.0.101/9000
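Note that /dev/tcp/HOST/PORT is a bash feature, not a real device file, so this line must run under bash. A portable alternative is to pipe through nc:

./fetch | nc 192.168.0.101 9000

Depending on your netcat flavor, the listening side may need nc -l -p 9000 instead of nc -l 9000.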