Global Regular Expression Print is a staple of every command-line user’s toolbox. As with find, it derives a lot of power from being combined with other tools and can increase your productivity significantly.
Following is a simple tutorial that will help you realize the power of this simple and most useful command. If you are on Windows and haven’t already, download and install Cygwin. If you are also new to regular expressions (regex), here is a great regular expressions reference to get you started.
Tutorial
Suppose we want to search for duplicate functions in all of our JavaScript files. Let’s start basic and work up to it. This technique can be used to search for a TON of duplicate items like:
- Duplicate HTML IDs
- Check how many times a CSS class is used
- Duplicate java classes
- many, many more…
The above command will print the lines containing “function” in all JavaScript files in the current directory (NOT subdirectories). Printing out line contents would be much more helpful if we knew what files they come from and their line numbers:
Depending on how you format your JavaScript files, something like this will omit comments, anonymous functions, and also words like “functionality” giving you better results.
-o prints only the part that matches the regular expression. -E options gives me extended regex and -h suppresses printing of the file name. I am then piping to sort which just sorts the output so it a list of function
There we go! That will list only the duplcated functions. I know that we can expand this with awk or other stuff and get the file names and line numbers of the duplicates, but I don’t want to explaining the details of awk ;).
Other Examples
Conclusion
grep is one of the most used command-line tools, often piped to for filtering output. Understanding it is essential to increasing productivity on the command-line. There is so much more to grep than what I’ve shown here, and it would be cool to see your best uses in the comments!