Learn something new every day
Usually, when I want to parse a file line by line in a bash script, I do something like this:
# Make foo.txt the standard input of the current shell
exec < foo.txt
while read LINE;
do
    SOMEVAR=$SOMEVAR,$LINE
done
This evening I was writing a bash wrapper for debootstrap to automate the building of chroot environments, and for some reason I strayed from my norm and did the following:
cat foo.txt | while read LINE;
do
    SOMEVAR=$SOMEVAR,$LINE
done
After a few test runs I noticed that $SOMEVAR did not contain what I expected once the while loop had exited. After some debugging (bash -x) I discovered that the variable was magically empty after the last read LINE of the while loop. I immediately went back to my old way of parsing and, sure enough, it worked as I expected. Hmm… Well, after finding the explanation, it makes perfect sense. I am just surprised I hadn’t run into it before.
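Here is a minimal sketch that reproduces the surprise, assuming bash with its default settings. The temporary file and its three lines (a, b, c) are made up for the demonstration and stand in for foo.txt:

```shell
# Hypothetical sample input, created only for this demonstration.
tmpfile=$(mktemp)
printf 'a\nb\nc\n' > "$tmpfile"

SOMEVAR=""
cat "$tmpfile" | while read LINE;
do
    SOMEVAR=$SOMEVAR,$LINE
    # Inside the loop the variable grows as expected.
    echo "inside the loop: $SOMEVAR"
done

# Back in the calling shell, the value built up inside the loop is gone.
echo "after the loop: '$SOMEVAR'"
rm -f "$tmpfile"
```

Inside the loop the value grows to ,a,b,c, yet the final echo prints an empty string.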
From the Bash FAQ
E4) If I pipe the output of a command into `read variable', why doesn't
the output show up in $variable when the read command finishes?
This has to do with the parent-child relationship between Unix
processes. It affects all commands run in pipelines, not just
simple calls to `read'. For example, piping a command's output
into a `while' loop that repeatedly calls `read' will result in
the same behavior.
Each element of a pipeline runs in a separate process, a child of
the shell running the pipeline. A subprocess cannot affect its
parent's environment. When the `read' command sets the variable
to the input, that variable is set only in the subshell, not the
parent shell. When the subshell exits, the value of the variable
is lost.
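Given that explanation, the way around it is to keep the while loop out of the pipeline so it runs in the current shell. The sketch below shows two ways of doing that; the temporary file and the variable names FROM_FILE and FROM_CMD are made up for illustration and are not part of my debootstrap wrapper:

```shell
# Hypothetical sample input, created only for this demonstration.
tmpfile=$(mktemp)
printf 'a\nb\nc\n' > "$tmpfile"

# Option 1: redirect the file into the loop instead of piping it.
# No pipeline is involved, so the loop runs in the current shell.
FROM_FILE=""
while read LINE;
do
    FROM_FILE=$FROM_FILE,$LINE
done < "$tmpfile"
echo "redirect: $FROM_FILE"

# Option 2: when the input comes from a command, capture its output in a
# here-document. The command runs in a subshell, but the loop does not.
FROM_CMD=""
while read LINE;
do
    FROM_CMD=$FROM_CMD,$LINE
done <<EOF
$(cat "$tmpfile")
EOF
echo "here-document: $FROM_CMD"
rm -f "$tmpfile"
```

Both variables keep their value after the loop. Bash also offers process substitution (done < <(some_command)) for the second case, and bash 4.2 and later has shopt -s lastpipe, which runs the last element of a pipeline in the current shell when job control is not active, as is normally the case in scripts.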
