Finding Things
Overview
Teaching: 15 min
Exercises: 0 min
Questions
How can I find files?
How can I find things in files?
Objectives
Using grep
Referring to haiku.txt presented at the beginning of this topic, which command would result in the following output:
and the presence of absence:
1. grep "of" haiku.txt
2. grep -E "of" haiku.txt
3. grep -w "of" haiku.txt
4. grep -i "of" haiku.txt
Solution
The correct answer is 3, because the -w flag looks only for whole-word matches. The other options will all match “of” when part of another word.
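The difference between substring and whole-word matching can be checked on a small file. The sketch below is only an illustration: the three sample lines are made up for this example (they are not the lesson's haiku.txt), and the temporary directory is an assumption of the sketch.

```shell
# Hypothetical sample text, written to a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
    'the presence of absence' \
    'software is like that' \
    'Of course it is' > "$tmpdir/sample.txt"

# -c counts matching lines.
plain=$(grep -c "of" "$tmpdir/sample.txt")     # also matches "of" inside "software"
whole=$(grep -cw "of" "$tmpdir/sample.txt")    # whole-word matches only
nocase=$(grep -ciw "of" "$tmpdir/sample.txt")  # whole word, ignoring case ("Of")

echo "plain=$plain whole=$whole nocase=$nocase"
# prints: plain=2 whole=1 nocase=2
```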
find Pipeline Reading Comprehension
Write a short explanatory comment for the following shell script:
wc -l $(find . -name '*.dat') | sort -n
Solution
1. Find all files with a .dat extension in or below the current directory
2. Count the number of lines each of these files contains
3. Sort the output from step 2 numerically
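The three steps can be tried out on a throwaway directory. This is a minimal sketch; the file names three.dat, sub/one.dat, and notes.txt are invented for the illustration.

```shell
# Hypothetical test files in a temporary directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/sub"
printf 'a\nb\nc\n' > "$workdir/three.dat"
printf 'a\n'       > "$workdir/sub/one.dat"
printf 'x\ny\n'    > "$workdir/notes.txt"    # not a .dat file, so it is ignored

# Step 1: find lists the .dat files, including the one in the subdirectory.
# Step 2: wc -l counts lines per file and appends a "total" line.
# Step 3: sort -n orders the lines by their leading number.
result=$(cd "$workdir" && wc -l $(find . -name '*.dat') | sort -n)
echo "$result"
```

Here sub/one.dat (1 line) sorts first, three.dat (3 lines) next, and the total (4) last. Because `$(find ...)` is split on whitespace, this form misbehaves for file names that contain spaces.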
Matching and Subtracting
The -v flag to grep inverts pattern matching, so that only lines which do not match the pattern are printed. Given that, which of the following commands will find all files in /data whose names end in s.txt (e.g., animals.txt or planets.txt), but do not contain the word net? Once you have thought about your answer, you can test the commands in the data-shell directory.
1. find data -name '*s.txt' | grep -v net
2. find data -name *s.txt | grep -v net
3. grep -v "temp" $(find data -name '*s.txt')
4. None of the above.
Solution
The correct answer is 1. Putting the match expression in quotes prevents the shell expanding it, so it gets passed to the find command.
Option 2 is incorrect because the shell expands the unquoted *s.txt itself (against any matching names in the current directory) instead of passing the wildcard expression to find.
Option 3 is incorrect because it searches the contents of the files for lines which do not match “temp”, rather than searching the file names.
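The effect of the quotes can be made visible by putting echo in front of the command. The sketch below assumes a made-up layout: a data directory with three files, plus a notes.txt in the working directory that happens to match the unquoted pattern.

```shell
# Hypothetical directory layout for the demonstration.
demo=$(mktemp -d)
mkdir "$demo/data"
touch "$demo/data/animals.txt" "$demo/data/planets.txt" "$demo/data/networks.txt"
touch "$demo/notes.txt"    # matches *s.txt in the working directory

# Unquoted: the shell replaces *s.txt with notes.txt before find ever runs.
expanded=$(cd "$demo" && echo find data -name *s.txt)
echo "$expanded"           # prints: find data -name notes.txt

# Quoted: find receives the pattern and matches names under data;
# grep -v net then drops data/networks.txt.
quoted=$(cd "$demo" && find data -name '*s.txt' | grep -v net | sort)
echo "$quoted"

# Unquoted: find looks for a file literally named notes.txt under data.
unquoted=$(cd "$demo" && find data -name *s.txt | grep -v net || true)
echo "unquoted result: '$unquoted'"
```

If nothing in the working directory matches the pattern, bash passes it through unchanged by default, which is why the unquoted form can appear to work on some days and fail on others.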
Tracking a Species
Leah has several hundred data files saved in one directory, each of which is formatted like this:
2013-11-05,deer,5
2013-11-05,rabbit,22
2013-11-05,raccoon,7
2013-11-06,rabbit,19
2013-11-06,deer,2
She wants to write a shell script that takes a species as the first command-line argument and a directory as the second argument. The script should return one file called species.txt containing a list of dates and the number of that species seen on each date. For example, using the data shown above, rabbit.txt would contain:
2013-11-05,22
2013-11-06,19
Put these commands and pipes in the right order to achieve this:
cut -d : -f 2
>
|
grep -w $1 -r $2
|
$1.txt
cut -d , -f 1,3
Hint: use man grep to look for how to grep text recursively in a directory and man cut to select more than one field in a line.
An example of such a file is provided in data-shell/data/animal-counts/animals.txt
Solution
grep -w $1 -r $2 | cut -d : -f 2 | cut -d , -f 1,3 > $1.txt
You would call the script above like this:
$ bash count-species.sh bear .
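To see why the pipeline needs two cut steps, it can help to run it on the sample records shown earlier. The sketch below wraps the solution's pipeline in a function named count_species, which is a hypothetical helper introduced only for this illustration; it also quotes the arguments, which the solution does not do, and prints to the screen instead of writing $1.txt.

```shell
# Hypothetical helper wrapping the pipeline from the solution.
count_species() {
    # grep -r prefixes every hit with "path:", so the first cut drops the path;
    # the second cut keeps the date (field 1) and the count (field 3).
    grep -w "$1" -r "$2" | cut -d : -f 2 | cut -d , -f 1,3
}

# The sample records from this exercise, saved in a temporary directory.
datadir=$(mktemp -d)
printf '%s\n' \
    '2013-11-05,deer,5' \
    '2013-11-05,rabbit,22' \
    '2013-11-05,raccoon,7' \
    '2013-11-06,rabbit,19' \
    '2013-11-06,deer,2' > "$datadir/animals.txt"

rabbit_counts=$(count_species rabbit "$datadir")
echo "$rabbit_counts"
# prints:
# 2013-11-05,22
# 2013-11-06,19
```

This relies on the directory path containing no colon, since cut -d : -f 2 takes whatever follows the first colon on each line.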
Little Women
You and your friend, having just finished reading Little Women by Louisa May Alcott, are in an argument. Of the four sisters in the book, Jo, Meg, Beth, and Amy, your friend thinks that Jo was the most mentioned. You, however, are certain it was Amy. Luckily, you have a file LittleWomen.txt containing the full text of the novel (data-shell/writing/data/LittleWomen.txt). Using a for loop, how would you tabulate the number of times each of the four sisters is mentioned?
Hint: one solution might employ the commands grep and wc and a |, while another might utilize grep options. There is often more than one way to solve a programming task, so a particular solution is usually chosen based on a combination of yielding the correct result, elegance, readability, and speed.
Solutions
for sis in Jo Meg Beth Amy
do
    echo $sis:
    grep -ow $sis LittleWomen.txt | wc -l
done
Alternative, slightly inferior solution:
for sis in Jo Meg Beth Amy
do
    echo $sis:
    grep -ocw $sis LittleWomen.txt
done
This solution is inferior because grep -c only reports the number of lines matched. The total number of matches reported by this method will be lower if there is more than one match per line.
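The gap between the two counts shows up as soon as a name occurs twice on one line. The following sketch uses three invented sentences, not text from the novel, purely to make the difference measurable.

```shell
# Hypothetical sample text: "Jo" appears twice on the first line.
sample=$(mktemp)
printf '%s\n' \
    'Jo laughed, and Jo ran.' \
    'Amy smiled at Jo.' \
    'Meg sat still.' > "$sample"

# -o prints each match on its own line, so wc -l counts individual matches.
matches=$(grep -ow Jo "$sample" | wc -l | tr -d ' ')

# -c counts matching lines, so the doubled "Jo" is only counted once.
lines=$(grep -cw Jo "$sample")

echo "matches=$matches lines=$lines"
# prints: matches=3 lines=2
```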
Finding Files With Different Properties
The find command can be given several other criteria known as “tests” to locate files with specific attributes, such as modification time, size, permissions, or ownership. Use man find to explore these, and then write a single command to find all files in or below the current directory that were modified by the user ahmed in the last 24 hours.
Hint 1: you will need to use three tests: -type, -mtime, and -user.
Hint 2: The value for -mtime will need to be negative. Why?
Solution
Assuming that Nelle’s home is our working directory we type:
$ find ./ -type f -mtime -1 -user ahmed
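On the sign of the -mtime value: -mtime -1 selects files modified less than one day ago, +1 selects files modified more than one day ago, and a bare 1 selects files whose age rounds to exactly one day. The sketch below checks this on two made-up files. Since the user ahmed is unlikely to exist on your machine, it substitutes the current user's numeric ID from id -u (find's -user test also accepts a numeric ID); that substitution is an assumption of this example, not part of the lesson's solution.

```shell
# Hypothetical files: one fresh, one with a timestamp set back to the year 2000.
demo=$(mktemp -d)
touch "$demo/recent.txt"
touch -t 200001010000 "$demo/old.txt"

# Regular files, modified within the last 24 hours, owned by the current user.
found=$(find "$demo" -type f -mtime -1 -user "$(id -u)")
echo "$found"    # lists only recent.txt

# The same search with +1 instead selects the old file.
older=$(find "$demo" -type f -mtime +1 -user "$(id -u)")
echo "$older"    # lists only old.txt
```

Note that -user tests who owns a file; find has no record of which user last modified it.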
Key Points