Shell Scripts
Overview
Teaching: 15 min
Exercises: 0 min
Questions
How can I save and re-use commands?
Objectives
Variables in Shell Scripts
In the molecules directory, imagine you have a shell script called script.sh containing the following commands:

    head -n $2 $1
    tail -n $3 $1

While you are in the molecules directory, you type the following command:

    bash script.sh '*.pdb' 1 1

Which of the following outputs would you expect to see?
1. All of the lines between the first and the last lines of each file ending in .pdb in the molecules directory
2. The first and the last line of each file ending in .pdb in the molecules directory
3. The first and the last line of each file in the molecules directory
4. An error because of the quotes around *.pdb
Solution
The correct answer is 2.
The special variables $1, $2 and $3 represent the command-line arguments given to the script, such that the commands run are:

    $ head -n 1 cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
    $ tail -n 1 cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb

The shell does not expand '*.pdb' on the command line because it is enclosed in quote marks. As such, the first argument to the script is the literal pattern *.pdb, which is expanded inside the script when the unquoted $1 is substituted, just before head and tail run.
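The behaviour described in this solution can be checked in a throwaway sandbox. The sketch below is only an illustration: the temporary directory and the two small files a.pdb and b.pdb are hypothetical stand-ins for the molecules directory, and script.sh is recreated with the same two commands as in the exercise.

```shell
# Hypothetical sandbox standing in for the molecules directory
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'first a\nmiddle a\nlast a\n' > a.pdb
printf 'first b\nmiddle b\nlast b\n' > b.pdb

# The same two commands as script.sh in the exercise
cat > script.sh <<'EOF'
head -n $2 $1
tail -n $3 $1
EOF

# The quotes keep the pattern literal, so the script receives *.pdb as $1
# and expands it itself; $2 and $3 are both 1
output=$(bash script.sh '*.pdb' 1 1)
echo "$output"
```

Only the first and last line of each .pdb file should appear, each group preceded by the filename header that head and tail print when given several files. Without the quotes, the interactive shell would expand the pattern first, and with two or more matching files $2 would be a filename rather than a line count.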
List Unique Species
Leah has several hundred data files, each of which is formatted like this:

    2013-11-05,deer,5
    2013-11-05,rabbit,22
    2013-11-05,raccoon,7
    2013-11-06,rabbit,19
    2013-11-06,deer,2
    2013-11-06,fox,1
    2013-11-07,rabbit,18
    2013-11-07,bear,1

An example of this type of file is given in data-shell/data/animals.txt.
Write a shell script called species.sh that takes any number of filenames as command-line arguments, and uses cut, sort, and uniq to print a list of the unique species appearing in each of those files separately.
Solution

    # Script to find unique species in csv files where species is the second data field
    # This script accepts any number of file names as command line arguments

    # Loop over all files ("$@" keeps filenames containing spaces intact)
    for file in "$@"
    do
        echo "Unique species in $file:"
        # Extract species names
        cut -d , -f 2 "$file" | sort | uniq
    done
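One detail the solution relies on is that uniq only collapses adjacent duplicate lines, which is why sort comes before it in the pipeline. The sketch below illustrates this with the sample records from the exercise; the temporary file location is an assumption made for the demonstration.

```shell
# Write the sample records from the exercise to a hypothetical temporary file
workdir=$(mktemp -d)
cat > "$workdir/animals.txt" <<'EOF'
2013-11-05,deer,5
2013-11-05,rabbit,22
2013-11-05,raccoon,7
2013-11-06,rabbit,19
2013-11-06,deer,2
2013-11-06,fox,1
2013-11-07,rabbit,18
2013-11-07,bear,1
EOF

# Without sort, no duplicates are adjacent, so uniq removes nothing
unsorted_unique=$(cut -d , -f 2 "$workdir/animals.txt" | uniq)

# With sort, duplicates become adjacent and uniq collapses them
sorted_unique=$(cut -d , -f 2 "$workdir/animals.txt" | sort | uniq)

echo "Without sort:"
echo "$unsorted_unique"
echo "With sort:"
echo "$sorted_unique"
```

The first list still contains every record's species, repeats included, while the second contains each species once.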
Find the Longest File With a Given Extension
Write a shell script called longest.sh that takes the name of a directory and a filename extension as its arguments, and prints out the name of the file with the most lines in that directory with that extension. For example:

    $ bash longest.sh /tmp/data pdb

would print the name of the .pdb file in /tmp/data that has the most lines.
Solution

    # Shell script which takes two arguments:
    #    1. a directory name
    #    2. a file extension
    # and prints the name of the file in that directory
    # with the most lines which matches the file extension.

    wc -l "$1"/*."$2" | sort -n | tail -n 2 | head -n 1
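The pipeline in this solution works because wc -l, given several files, prints a final "total" line whose count is at least as large as any single file's. After sort -n the longest file therefore ends up on the second-to-last line, which tail -n 2 | head -n 1 picks out. A minimal sketch, using three hypothetical files in a temporary directory:

```shell
# Hypothetical test files with 1, 2 and 3 lines
workdir=$(mktemp -d)
printf 'a\n' > "$workdir/short.pdb"
printf 'a\nb\n' > "$workdir/medium.pdb"
printf 'a\nb\nc\n' > "$workdir/long.pdb"

# Raw wc output: one line per file, then a "total" line
wc -l "$workdir"/*.pdb

# Sort numerically, keep the last two lines (longest file and total),
# then keep the first of those two (the longest file)
longest=$(wc -l "$workdir"/*.pdb | sort -n | tail -n 2 | head -n 1)
echo "$longest"
```

Note that the line printed includes the line count in front of the filename, since the output of wc -l is passed through unchanged.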
Why Record Commands in the History Before Running Them?
If you run the command:

    $ history | tail -n 5 > recent.sh

the last command in the file is the history command itself, i.e., the shell has added history to the command log before actually running it. In fact, the shell always adds commands to the log before running them. Why do you think it does this?
Solution
If a command causes something to crash or hang, it might be useful to know what that command was, in order to investigate the problem. If the command were only recorded after it had run, we would not have a record of the last command in the event of a crash.
Script Reading Comprehension
For this question, consider the data-shell/molecules directory once again. This contains a number of .pdb files in addition to any other files you may have created. Explain what a script called example.sh would do when run as bash example.sh *.pdb if it contained the following lines:

    # Script 1
    echo *.*

    # Script 2
    for filename in $1 $2 $3
    do
        cat $filename
    done

    # Script 3
    echo $@.pdb

Solutions
Script 1 would print out a list of all files containing a dot in their name.
Script 2 would print the contents of the first 3 files matching the file extension. The shell expands the wildcard before passing the arguments to the example.sh script.
Script 3 would print all the arguments to the script (i.e. all the .pdb files), followed by .pdb.

    cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb.pdb
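The answers for Script 2 and Script 3 can be reproduced in a small sandbox. In this sketch the directory and the four files a.pdb to d.pdb are hypothetical, and the names script2.sh and script3.sh are chosen only so that both scripts can sit side by side.

```shell
# Hypothetical sandbox with four small .pdb files
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for name in a b c d
do
    echo "contents of $name" > "$name.pdb"
done

# Script 2 from the exercise: only the first three arguments are used
cat > script2.sh <<'EOF'
for filename in $1 $2 $3
do
    cat $filename
done
EOF

# Script 3 from the exercise: .pdb is attached to the last argument only
cat > script3.sh <<'EOF'
echo $@.pdb
EOF

# The calling shell expands *.pdb before either script starts
out2=$(bash script2.sh *.pdb)
out3=$(bash script3.sh *.pdb)
echo "$out2"
echo "$out3"
```

Script 2 prints the contents of a.pdb, b.pdb and c.pdb and ignores d.pdb; Script 3 prints the four names with an extra .pdb glued onto the final one.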
Debugging Scripts
Suppose you have saved the following script in a file called do-errors.sh in Nelle’s north-pacific-gyre/2012-07-03 directory:

    # Calculate reduced stats for data files at J = 100 c/bp.
    for datafile in "$@"
    do
        echo $datfile
        bash goostats -J 100 -r $datafile stats-$datafile
    done

When you run it:

    $ bash do-errors.sh *[AB].txt

the output is blank. To figure out why, re-run the script using the -x option:

    bash -x do-errors.sh *[AB].txt

What is the output showing you? Which line is responsible for the error?
Solution
The -x flag causes bash to run in debug mode. This prints out each command as it is run, which will help you to locate errors. In this example, we can see that echo isn’t printing anything. We have made a typo in the loop variable name, and the variable datfile doesn’t exist, so it expands to an empty string.
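To see what the trace looks like without needing goostats or Nelle's data, the sketch below reproduces only the loop and the misspelled variable. The script name typo.sh and the argument sampleA.txt are hypothetical. It also shows, as an aside that is not part of the exercise, that bash's -u option turns a reference to an unset variable into an error.

```shell
# Hypothetical cut-down version of do-errors.sh: same loop, same typo
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > typo.sh <<'EOF'
for datafile in "$@"
do
    echo $datfile
done
EOF

# -x writes each command to standard error after expansion,
# so the echo line appears with no argument at all
trace=$(bash -x typo.sh sampleA.txt 2>&1)
echo "$trace"

# -u makes bash stop with an error when an unset variable is used
if ! bash -u typo.sh sampleA.txt 2> unbound.log
then
    cat unbound.log
fi
```

In the trace, the line for echo shows no filename after the command, which points directly at the misspelled variable.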
Key Points