Shell scripting patterns: returning from functions

One shortcoming of shell scripting is the inability to return anything of significance from a shell function. Consider a function that returns the youngest file in a directory. The basic moving part here is ls -tr /da/dir | tail -1. Abstracting this into a function seems problematic, given that we cannot return a value. However, functions are very similar to external commands in bash, so you can do this:

function latest_file() {
  ls -tr "$1" | tail -1
}

... and then simply invoke it thus:

latest=$(latest_file /da/dir)
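To see the whole pattern in action, here is a minimal, self-contained demo (the temp directory and file names are made up for illustration): two files are created with distinct timestamps, and the function's "return value" is captured with command substitution.

```shell
# latest_file "returns" its result by writing it to stdout.
latest_file() {
  ls -tr "$1" | tail -1
}

# Set up a throwaway directory with an older and a newer file.
dir=$(mktemp -d)
touch "$dir/old.txt"
sleep 1                       # ensure the mtimes differ
touch "$dir/new.txt"

# Capture the function's output just like any external command.
latest=$(latest_file "$dir")
echo "$latest"                # new.txt

rm -rf "$dir"
```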

From this point, we can write what effectively is a generator:

function find_magic_scripts() {
    find "$1" -type f -name '*.sh' | while read -r fname ; do
        grep -l 'magic word' "$fname" || true
    done
    done
}

We can now use it as a source for a consumer:

find_magic_scripts /da/dir | link_and_version

These two processes run in parallel, which is good since they both take a little time. Of course, find_magic_scripts will still block once the pipe buffer fills up, but it resumes as soon as link_and_version has read enough lines to drain it.
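link_and_version is not defined above; a minimal hypothetical stand-in, reading one path per line from stdin, is enough to exercise the generator/consumer pairing end to end (the directory and file contents below are invented for the demo):

```shell
# Hypothetical consumer: reads one file name per line from stdin.
link_and_version() {
  while read -r fname ; do
    echo "processing $fname"   # stand-in for the real linking work
  done
}

# Generator, as in the text (with the arguments quoted).
find_magic_scripts() {
  find "$1" -type f -name '*.sh' | while read -r fname ; do
    grep -l 'magic word' "$fname" || true
  done
}

# Set up a directory where exactly one script contains the magic word.
dir=$(mktemp -d)
printf 'echo magic word\n' > "$dir/a.sh"
printf 'nothing here\n'    > "$dir/b.sh"

# Generator and consumer run as separate, concurrent processes.
out=$(find_magic_scripts "$dir" | link_and_version)
echo "$out"

rm -rf "$dir"
```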

However.

There is an issue with this pattern: a pipeline carries the exit code of its last command, ignoring any earlier failures. So, if /da/dir does not exist, link_and_version will simply see EOF and report success. Bash's set -o pipefail reports the failure to the caller of the pipeline, but the consumer on the right-hand side still cannot tell that its input was cut short. Unfortunately, there is no neat workaround that solves all problems. The best strategy is usually to introduce an explicit failure marker into the protocol over the pipe, something like:

function foo() {
    {
        ...
    } || echo "FAIL $?"
}
foo | bar

Or, where the left side is a single external command, simply:

{ find /may/not/exist -name '*.sh' || echo "FAIL $?" ; } | bar

Now bar() can know that something went wrong and abort.
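Putting the two halves together, a sketch of a consumer that honours the FAIL marker might look like this (bar is hypothetical; the exact failure code printed after FAIL depends on the find implementation, so only the marker itself is matched):

```shell
# Hypothetical consumer: aborts as soon as it sees the FAIL marker.
bar() {
  while read -r line ; do
    case "$line" in
      FAIL*) echo "upstream failed: $line" >&2 ; return 1 ;;
      *)     echo "handling $line" ;;
    esac
  done
}

# The producer's failure is turned into an in-band FAIL line.
{ find /may/not/exist -name '*.sh' 2>/dev/null || echo "FAIL $?" ; } | bar
status=$?
echo "bar exit status: $status"
```

Because the pipeline's exit code is that of its last command, bar returning 1 also propagates the failure to the surrounding script.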