Loops
Overview
Teaching: 15 min
Exercises: 0 min
Questions
How can I perform the same actions on many different files?
Objectives
Variables in Loops
This exercise refers to the data-shell/molecules directory. Running ls in that directory gives the following output:
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
What is the output of the following code?
for datafile in *.pdb
do
    ls *.pdb
done
Now, what is the output of the following code?
for datafile in *.pdb
do
    ls $datafile
done
Why do these two loops give different outputs?
Solution
The first code block gives the same output on each iteration through the loop. Bash expands the wildcard *.pdb within the loop body (as well as before the loop starts) to match all files ending in .pdb and then lists them using ls. The expanded loop would look like this:
for datafile in cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
do
    ls cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
done
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
The second code block lists a different file on each loop iteration. The value of the datafile variable is evaluated using $datafile, and then listed using ls.
cubane.pdb
ethane.pdb
methane.pdb
octane.pdb
pentane.pdb
propane.pdb
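To see the difference without touching the real data, both loops can be run in a scratch directory. This is only a minimal sketch: the temporary directory, the empty placeholder files, and the variable names first, second, n1 and n2 are assumptions made for illustration, not part of the exercise.

```shell
# Assumption: a throwaway directory with empty placeholder files
# named like the ones in data-shell/molecules.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb

# Loop 1: the body ignores the loop variable, so every iteration
# lists all six files.
first=$(for datafile in *.pdb
do
    ls *.pdb
done)

# Loop 2: the body uses $datafile, so each iteration lists one file.
second=$(for datafile in *.pdb
do
    ls $datafile
done)

# Count the lines each loop produced (tr strips padding added by some wc versions).
n1=$(echo "$first" | wc -l | tr -d ' ')
n2=$(echo "$second" | wc -l | tr -d ' ')
echo "Loop 1 printed $n1 lines"
echo "Loop 2 printed $n2 lines"
```

Because the output is captured rather than sent to a terminal, ls prints one name per line, so the first loop yields six names on each of six iterations while the second yields one name per iteration.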
Saving to a File in a Loop - Part One
In the same directory, what is the effect of this loop?
for alkanes in *.pdb
do
    echo $alkanes
    cat $alkanes > alkanes.pdb
done
1. Prints cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, pentane.pdb and propane.pdb, and the text from propane.pdb will be saved to a file called alkanes.pdb.
2. Prints cubane.pdb, ethane.pdb, and methane.pdb, and the text from all three files would be concatenated and saved to a file called alkanes.pdb.
3. Prints cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, and pentane.pdb, and the text from propane.pdb will be saved to a file called alkanes.pdb.
4. None of the above.
Solution
1 is the correct answer. The text from each file in turn gets written to the alkanes.pdb file. However, the file gets overwritten on each loop iteration, so the final content of alkanes.pdb is the text from the propane.pdb file.
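The overwriting can be checked in a scratch directory. This is a sketch under stated assumptions: the temporary directory and the one-line placeholder contents ("contents of cubane" and so on) are invented for illustration, since the real .pdb files hold molecular structure data.

```shell
# Assumption: scratch directory with one-line placeholder files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
for name in cubane ethane methane octane pentane propane
do
    echo "contents of $name" > $name.pdb
done

# The loop from the exercise: > truncates alkanes.pdb on every iteration.
# The wildcard is expanded before alkanes.pdb exists, so six files are visited.
for alkanes in *.pdb
do
    echo $alkanes
    cat $alkanes > alkanes.pdb
done

# Only the text written during the last iteration survives.
cat alkanes.pdb
```

The final cat shows a single line, the placeholder text of propane.pdb, because propane.pdb is the last file in the expanded list.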
Saving to a File in a Loop - Part Two
In the same directory, what would be the output of the following loop?
for datafile in *.pdb
do
    cat $datafile >> all.pdb
done
1. All of the text from cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, and pentane.pdb would be concatenated and saved to a file called all.pdb.
2. The text from ethane.pdb will be saved to a file called all.pdb.
3. All of the text from cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, pentane.pdb and propane.pdb would be concatenated and saved to a file called all.pdb.
4. All of the text from cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, pentane.pdb and propane.pdb would be printed to the screen and saved to a file called all.pdb.
Solution
3 is the correct answer. >> appends to a file, rather than overwriting it with the redirected output from a command. Given the output from the cat command has been redirected, nothing is printed to the screen.
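The appending behaviour can be confirmed the same way. This is a hedged sketch: the scratch directory and the one-line placeholder contents are assumptions for illustration only.

```shell
# Assumption: scratch directory with one-line placeholder files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
for name in cubane ethane methane octane pentane propane
do
    echo "contents of $name" > $name.pdb
done

# The loop from the exercise: >> appends, so nothing is lost and
# nothing is printed while the loop runs.
for datafile in *.pdb
do
    cat $datafile >> all.pdb
done

# all.pdb now holds one line per input file, in the order the
# wildcard listed them.
cat all.pdb
```

Swapping >> for > in this sketch would leave only the last file's line in all.pdb, which is the situation from Part One.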
Limiting Sets of Files
In the same directory, what would be the output of the following loop?
for filename in c*
do
    ls $filename
done
1. No files are listed.
2. All files are listed.
3. Only cubane.pdb, octane.pdb and pentane.pdb are listed.
4. Only cubane.pdb is listed.
Solution
4 is the correct answer. * matches zero or more characters, so any file name starting with the letter c, followed by zero or more other characters will be matched.
How would the output differ from using this command instead?
for filename in *c*
do
    ls $filename
done
1. The same files would be listed.
2. All the files are listed this time.
3. No files are listed this time.
4. The files cubane.pdb and octane.pdb will be listed.
5. Only the file octane.pdb will be listed.
Solution
4 is the correct answer. * matches zero or more characters, so a file name with zero or more characters before a letter c and zero or more characters after the letter c will be matched.
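A quick way to preview what a wildcard will match, before putting it in a loop, is to pass it to echo. The sketch below assumes a scratch directory with empty placeholder files; the variable names starts_with_c and contains_c are made up for illustration.

```shell
# Assumption: a throwaway directory with empty placeholder files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb

# The shell expands each pattern before echo runs, so echo simply
# prints the list of matching file names.
starts_with_c=$(echo c*)
contains_c=$(echo *c*)

echo "c*  expands to: $starts_with_c"
echo "*c* expands to: $contains_c"
```

Only cubane.pdb starts with c, while both cubane.pdb and octane.pdb contain a c somewhere in their names.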
Doing a Dry Run
A loop is a way to do many things at once, or to make many mistakes at once if it does the wrong thing. One way to check what a loop would do is to echo the commands it would run instead of actually running them.
Suppose we want to preview the commands the following loop will execute without actually running those commands:
for file in *.pdb
do
    analyze $file > analyzed-$file
done
What is the difference between the two loops below, and which one would we want to run?
# Version 1
for file in *.pdb
do
    echo analyze $file > analyzed-$file
done

# Version 2
for file in *.pdb
do
    echo "analyze $file > analyzed-$file"
done
Solution
The second version is the one we want to run. It prints to the screen everything enclosed in the quote marks, expanding the loop variable because we have prefixed it with a dollar sign. Inside the quotes, > is ordinary text rather than a redirection, so no files are written.
The first version redirects the output from the command echo analyze $file to a file, analyzed-$file. A series of files is generated: analyzed-cubane.pdb, analyzed-ethane.pdb etc.
Try both versions for yourself to see the output! Be sure to open the analyzed-*.pdb files to view their contents.
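The two versions can be compared safely in a scratch directory. This is a sketch under assumptions: the temporary directory, the three empty placeholder files, and the counters files_after_v2 and files_after_v1 are invented for illustration. The analyze program is never executed here, since both versions only echo its name.

```shell
# Assumption: a throwaway directory with three empty placeholder files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch cubane.pdb ethane.pdb methane.pdb

# Version 2: the quotes make > part of the text, so the planned
# commands are only printed and no files are created.
for file in *.pdb
do
    echo "analyze $file > analyzed-$file"
done
files_after_v2=$(ls | wc -l | tr -d ' ')

# Version 1: the unquoted > is a real redirection, so each echo writes
# its text into a new analyzed-*.pdb file instead of the screen.
for file in *.pdb
do
    echo analyze $file > analyzed-$file
done
files_after_v1=$(ls | wc -l | tr -d ' ')

echo "Files after Version 2: $files_after_v2"
echo "Files after Version 1: $files_after_v1"
cat analyzed-cubane.pdb
```

Version 2 is run first so that the file count taken after it reflects only the original placeholder files.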
Nested Loops
Suppose we want to set up a directory structure to organize some experiments measuring reaction rate constants with different compounds and different temperatures. What would be the result of the following code?
for species in cubane ethane methane
do
    for temperature in 25 30 37 40
    do
        mkdir $species-$temperature
    done
done
Solution
We have a nested loop, i.e. a loop contained within another loop, so for each species in the outer loop, the inner loop (the nested loop) iterates over the list of temperatures and creates a new directory for each combination.
Try running the code for yourself to see which directories are created!
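To try the nested loop without cluttering a working directory, it can be run inside a temporary one. This is a minimal sketch; the scratch directory created with mktemp is an assumption added for illustration.

```shell
# Assumption: run inside a throwaway directory so the new
# directories do not mix with real data.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

for species in cubane ethane methane
do
    for temperature in 25 30 37 40
    do
        mkdir $species-$temperature
    done
done

# 3 species x 4 temperatures = 12 directories,
# named like cubane-25 and methane-40.
ls
```

The inner loop runs to completion for each value of the outer loop, which is why every species is paired with every temperature.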
Key Points