Finding Things
Overview
Teaching: 15 min
Exercises: 0 min
Questions
How can I find files?
How can I find things in files?
Objectives
Using grep
Referring to haiku.txt presented at the beginning of this topic, which command would result in the following output:
and the presence of absence:
1. grep "of" haiku.txt
2. grep -E "of" haiku.txt
3. grep -w "of" haiku.txt
4. grep -i "of" haiku.txt
Solution
The correct answer is 3, because the -w flag looks only for whole-word matches. The other options will all match “of” when part of another word.
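
The difference between a plain match and a whole-word match can be sketched on a small scratch file. The file name sample.txt and its two lines are assumptions made for illustration; the second line is there only because it contains "of" inside a longer word.

```shell
# Hypothetical scratch file: one line with the word "of",
# one line where "of" only appears inside "Software".
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
and the presence of absence:
Software is like that.
EOF

# Plain pattern: matches "of" anywhere, including inside "Software".
plain=$(grep "of" "$tmpdir/sample.txt")

# -w: "of" must stand alone as a whole word.
whole=$(grep -w "of" "$tmpdir/sample.txt")

echo "$plain"
echo "---"
echo "$whole"
rm -r "$tmpdir"
```

Here the plain pattern prints both lines, while -w prints only the line in which "of" is a word of its own.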
find Pipeline Reading Comprehension
Write a short explanatory comment for the following shell script:
wc -l $(find . -name '*.dat') | sort -n
Solution
1. Find all files with a .dat extension in or below the current directory
2. Count the number of lines each of these files contains
3. Sort the output from step 2 numerically
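
The three steps can be tried out on a scratch directory. This is a minimal sketch: the file names long.dat, sub/short.dat, and notes.txt are hypothetical and exist only to show that find descends into subdirectories and skips files without the .dat extension.

```shell
# Scratch directory with hypothetical files of different lengths.
tmpdir=$(mktemp -d)
origdir=$PWD
mkdir "$tmpdir/sub"
printf 'a\nb\nc\n' > "$tmpdir/long.dat"
printf 'a\n' > "$tmpdir/sub/short.dat"
printf 'ignored\n' > "$tmpdir/notes.txt"

cd "$tmpdir"
# Step 1: find lists the .dat files in or below the current directory.
# Step 2: wc -l counts the lines in each (and adds a "total" line).
# Step 3: sort -n orders the lines of output by that count.
result=$(wc -l $(find . -name '*.dat') | sort -n)
echo "$result"

cd "$origdir"
rm -r "$tmpdir"
```

Because $(find ...) is unquoted, this form assumes file names without spaces; names containing spaces would be split into several arguments.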
Matching and Subtracting
The -v flag to grep inverts pattern matching, so that only lines which do not match the pattern are printed. Given that, which of the following commands will find all files in /data whose names end in s.txt (e.g., animals.txt or planets.txt), but whose names do not contain the string net? Once you have thought about your answer, you can test the commands in the data-shell directory.
1. find data -name '*s.txt' | grep -v net
2. find data -name *s.txt | grep -v net
3. grep -v "temp" $(find data -name '*s.txt')
4. None of the above.
Solution
The correct answer is 1. Putting the match expression in quotes prevents the shell expanding it, so it gets passed to the find command.
Option 2 is incorrect because the shell expands *s.txt instead of passing the wildcard expression to find.
Option 3 is incorrect because it searches the contents of the files for lines which do not match “temp”, rather than searching the file names.
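
The effect of the quotes can be seen in a scratch directory. This is a sketch under assumptions: the layout below (a notes.txt next to a data directory holding animals.txt and planets.txt) is set up only for the demonstration.

```shell
# Hypothetical layout: notes.txt in the working directory, two files in data/.
tmpdir=$(mktemp -d)
origdir=$PWD
mkdir "$tmpdir/data"
touch "$tmpdir/notes.txt" "$tmpdir/data/animals.txt" "$tmpdir/data/planets.txt"
cd "$tmpdir"

# Quoted: find receives the pattern *s.txt itself and applies it inside data/.
# grep -v net then drops planets.txt, whose name contains "net".
quoted=$(find data -name '*s.txt' | grep -v net | sort)

# Unquoted: the shell first expands *s.txt to notes.txt (the match in the
# working directory), so find only looks for files named notes.txt.
unquoted=$(find data -name *s.txt | grep -v net | sort)

echo "quoted:   $quoted"
echo "unquoted: $unquoted"

cd "$origdir"
rm -r "$tmpdir"
```

If no file in the working directory matched *s.txt, the shell would typically pass the pattern through unchanged and both commands would behave alike, which is why the unquoted form can appear to work until a matching file shows up.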
Tracking a Species
Leah has several hundred data files saved in one directory, each of which is formatted like this:
2013-11-05,deer,5
2013-11-05,rabbit,22
2013-11-05,raccoon,7
2013-11-06,rabbit,19
2013-11-06,deer,2
She wants to write a shell script that takes a species as the first command-line argument and a directory as the second argument. The script should return one file called species.txt containing a list of dates and the number of that species seen on each date. For example, using the data shown above, rabbit.txt would contain:
2013-11-05,22
2013-11-06,19
Put these commands and pipes in the right order to achieve this:
cut -d : -f 2
>
|
grep -w $1 -r $2
|
$1.txt
cut -d , -f 1,3
Hint: use man grep to look for how to grep text recursively in a directory and man cut to select more than one field in a line.
An example of such a file is provided in data-shell/data/animal-counts/animals.txt
Solution
grep -w $1 -r $2 | cut -d : -f 2 | cut -d , -f 1,3 > $1.txt
You would call the script above like this:
$ bash count-species.sh bear .
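
The pipeline can be checked against the sample data shown earlier. In this sketch the function count_species is a hypothetical stand-in for count-species.sh: it prints to standard output instead of writing $1.txt, and it assumes the directory path contains no colon, since the first cut splits on ":".

```shell
# Scratch directory holding the sample records from the exercise.
tmpdir=$(mktemp -d)
cat > "$tmpdir/animals.txt" <<'EOF'
2013-11-05,deer,5
2013-11-05,rabbit,22
2013-11-05,raccoon,7
2013-11-06,rabbit,19
2013-11-06,deer,2
EOF

# Hypothetical helper: $1 is the species, $2 the directory to search.
# grep -r prefixes each match with "path:", cut -d : -f 2 strips that prefix,
# and cut -d , -f 1,3 keeps the date and the count.
count_species() {
    grep -w "$1" -r "$2" | cut -d : -f 2 | cut -d , -f 1,3
}

rabbits=$(count_species rabbit "$tmpdir")
echo "$rabbits"
rm -r "$tmpdir"
```

The -w flag matters here as well: without it, a species whose name is contained in another name would pick up the other species' records.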
Little Women
You and your friend, having just finished reading Little Women by Louisa May Alcott, are in an argument. Of the four sisters in the book, Jo, Meg, Beth, and Amy, your friend thinks that Jo was the most mentioned. You, however, are certain it was Amy. Luckily, you have a file LittleWomen.txt containing the full text of the novel (data-shell/writing/data/LittleWomen.txt). Using a for loop, how would you tabulate the number of times each of the four sisters is mentioned?
Hint: one solution might employ the commands grep and wc and a |, while another might utilize grep options. There is often more than one way to solve a programming task, so a particular solution is usually chosen based on a combination of yielding the correct result, elegance, readability, and speed.
Solutions
for sis in Jo Meg Beth Amy
do
    echo $sis:
    grep -ow $sis LittleWomen.txt | wc -l
done
Alternative, slightly inferior solution:
for sis in Jo Meg Beth Amy
do
    echo $sis:
    grep -ocw $sis LittleWomen.txt
done
This solution is inferior because grep -c only reports the number of lines matched. The total number of matches reported by this method will be lower if there is more than one match per line.
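
The gap between the two counts shows up as soon as a name occurs twice on one line. The three sentences below are invented for this sketch and are not taken from the novel.

```shell
# Hypothetical sample text: "Jo" appears twice on the first line.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
Jo laughed, and Jo ran.
Meg called to Jo.
Amy said nothing.
EOF

# -o prints every match on its own line, so wc -l counts matches.
matches=$(grep -ow Jo "$tmpdir/sample.txt" | wc -l | tr -d ' ')

# -c counts matching lines, so two matches on one line count once.
lines=$(grep -cw Jo "$tmpdir/sample.txt")

echo "matches: $matches, lines: $lines"
rm -r "$tmpdir"
```

The tr -d ' ' step only strips the padding that some versions of wc put in front of the number.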
Finding Files With Different Properties
The find command can be given several other criteria known as “tests” to locate files with specific attributes, such as creation time, size, permissions, or ownership. Use man find to explore these, and then write a single command to find all files in or below the current directory that are owned by the user ahmed and were modified in the last 24 hours.
Hint 1: you will need to use three tests: -type, -mtime, and -user.
Hint 2: The value for -mtime will need to be negative. Why?
Solution
Assuming that Nelle’s home is our working directory, we type:
$ find ./ -type f -mtime -1 -user ahmed
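
Why the value for -mtime is negative can be sketched with two scratch files. The names fresh.txt and old.txt are hypothetical, and the -user test is left out here because the owner of a scratch file depends on who runs the example.

```shell
# Two hypothetical files: one just created, one with an old timestamp.
tmpdir=$(mktemp -d)
touch "$tmpdir/fresh.txt"
# touch -t sets an explicit modification time (here 1 January 2000).
touch -t 200001010000 "$tmpdir/old.txt"

# -mtime -1: age in whole days is less than 1, i.e. modified in the last 24 hours.
recent=$(find "$tmpdir" -type f -mtime -1)

# -mtime +1: age in whole days is greater than 1, shown for contrast.
older=$(find "$tmpdir" -type f -mtime +1)

echo "recent: $recent"
echo "older:  $older"
rm -r "$tmpdir"
```

A bare -mtime 1 would match only files whose age in whole days is exactly 1, so the minus sign is what turns the test into "less than one day ago".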
Key Points