Grep
Exact Pattern Matching¶
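Grep is great for basic pattern matching where you are looking for an exact string.
For example, to find all lines in an nginx log file that are for a specific IP, you might do the following (the -F flag treats the pattern as a fixed string, so each dot matches only a literal dot rather than any character):
grep -F '123.123.123.123' *.log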
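To see why the fixed-string flag matters for an IP address, here is a minimal sketch run against a throwaway log file. The log lines and the temporary path are made up for illustration, and it assumes a grep that supports -F and -c (both are standard).

```shell
# Build a small sample log (hypothetical data) in a temp directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  '123.123.123.123 - - "GET / HTTP/1.1" 200' \
  '123x123y123z123 - - "GET /a HTTP/1.1" 200' \
  '10.0.0.1 - - "GET /b HTTP/1.1" 404' > "$tmpdir/access.log"

# As a regex, each "." matches any character, so two lines match.
regex_hits=$(grep -c '123.123.123.123' "$tmpdir/access.log")

# With -F the pattern is a fixed string, so only the real IP matches.
fixed_hits=$(grep -cF '123.123.123.123' "$tmpdir/access.log")

echo "regex: $regex_hits, fixed: $fixed_hits"
rm -rf "$tmpdir"
```

Escaping the dots (123\.123\.123\.123) should give the same result as -F if you need to stay in regex mode.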
Complex Pattern Matching¶
The default regex syntax used in grep (POSIX basic regular expressions) is fairly limited. If you are used to PHP regex, things get a lot easier if you use grep with the -P
flag, which switches to Perl Compatible Regular Expressions (PCRE), the same engine PHP uses. Note that -P is a GNU grep option and may be missing from other grep implementations.
grep -P '/[^/]+?\.js' *.log
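As a rough illustration of what that pattern picks up, the sketch below runs it over a few invented request lines instead of real log files. It assumes GNU grep built with PCRE support, and adds -o so that only the matched text is printed.

```shell
# Hypothetical access-log lines for illustration.
log_lines='GET /static/js/app.min.js HTTP/1.1
GET /index.html HTTP/1.1
GET /vendor/jquery.js?v=3 HTTP/1.1'

# -P enables PCRE (needed for the lazy +? quantifier);
# -o prints only the matched text rather than the whole line.
js_paths=$(printf '%s\n' "$log_lines" | grep -oP '/[^/]+?\.js')
printf '%s\n' "$js_paths"
```

Because [^/] cannot cross a slash, the match is always the final path segment up to the first .js, so the directory part of each URL is left out.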
Pulling out Subpatterns¶
If you want to pull out a specific match rather than print the whole matching line, a good way to do this is to use lookbehind and lookahead, combined with the -o
flag, which prints only the matched text.
For example:
netstat -tulpn | grep 1080 | grep -Po '(?<=[^0-9])([0-9]+?)(?=/ssh)'
This command gives me the PID of the process that is running on port 1080, in my case a local SOCKS5 proxy.
Note that the above example also demonstrates using grep in a pipe.
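Since the netstat output depends on what is running on your machine, here is a reproducible sketch of the same pipeline fed from a canned line. The line is only shaped like netstat -tulpn output, and the PID 4242 is invented for the example.

```shell
# A canned line in the shape netstat -tulpn prints (hypothetical values),
# so the example runs without a live SSH tunnel.
netstat_line='tcp        0      0 127.0.0.1:1080          0.0.0.0:*               LISTEN      4242/ssh'

# First grep narrows to the port, second grep extracts only the digits
# that are preceded by a non-digit and followed by "/ssh".
pid=$(printf '%s\n' "$netstat_line" | grep 1080 | grep -Po '(?<=[^0-9])([0-9]+?)(?=/ssh)')
echo "PID: $pid"
```

The lookbehind on a non-digit is what stops the match from starting partway through the PID.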
Look Ahead and Behind¶
You might want matches that enforce a particular pattern before or after your target pattern, without including that pattern in the result. For this we can use lookbehind and lookahead:
- Lookbehind
(?<=...)
- Lookahead
(?=...)
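A small sketch of each form on its own may help. The record string and its key names are invented, and the example assumes GNU grep with -P. One thing to keep in mind is that PCRE lookbehinds generally need to be fixed width, which is why the lookbehind here is a literal string.

```shell
record='user=alice; role=admin;'

# Lookbehind: match the value, requiring "user=" immediately before it.
user=$(printf '%s\n' "$record" | grep -oP '(?<=user=)[^;]+')

# Lookahead: match each key, requiring "=" immediately after it.
keys=$(printf '%s\n' "$record" | grep -oP '[a-z]+(?==)')

echo "user: $user"
printf 'key: %s\n' $keys
```

In both cases the text inside the lookaround has to be present, but it is not part of what -o prints.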
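Discarded Patterns¶
If you want to group part of a pattern without creating a capture group, you can use (?:...)
The group still has to match, and its text is still part of the overall match, so it still appears in -o output. To require a pattern but leave it out of the output, use a lookbehind or lookahead, or put \K after it to discard everything matched up to that point.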
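A quick way to check how a group affects -o output is to run it on a known string. This sketch compares (?:...) with \K on an invented line shaped like uniq -c output, assuming GNU grep with -P.

```shell
line='   1234 10.0.0.1'

# (?:...) groups without capturing, but its text is still part of the match,
# so -o prints the count as well as the IP.
with_group=$(printf '%s\n' "$line" | grep -oP '(?:[0-9]{4,} )(.*)')

# \K drops everything matched so far from the reported match,
# so -o prints only the IP.
with_k=$(printf '%s\n' "$line" | grep -oP '[0-9]{4,} \K.*')

echo "with (?:...): $with_group"
echo "with \\K:      $with_k"
```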
Counting Matches¶
Let's say someone is attacking your server and you want to parse the exception logs to pull out the number of exceptions by IP.
This is one way to do that:
grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' exception.log | sort | uniq -c
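If you then want to pick out only the IPs that caused 1000 or more exceptions (a count of four or more digits), you could do the following, where \K discards the count from the output and leaves just the IP:
grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' exception.log | sort | uniq -c | grep -Po '[0-9]{4,} \K.*'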
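To try the counting pipeline without a real incident, the sketch below generates a throwaway exception.log and runs both steps against it. The IPs, the exception text and the temp path are all made up. The awk variant at the end is a swapped-in alternative for the threshold step, not part of the original approach, and lets you use any numeric cutoff rather than a digit count.

```shell
tmpdir=$(mktemp -d)
log="$tmpdir/exception.log"

# Hypothetical log: 1000 exceptions from one IP, 3 from another.
awk 'BEGIN {
  for (i = 0; i < 1000; i++) print "10.0.0.1 RuntimeException"
  for (i = 0; i < 3; i++)    print "192.168.1.5 RuntimeException"
}' > "$log"

ip_regex='[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'

# Count occurrences per IP, busiest first.
counts=$(grep -Eo "$ip_regex" "$log" | sort | uniq -c | sort -rn)

# Keep only IPs whose count has four or more digits (1000+).
heavy=$(printf '%s\n' "$counts" | grep -Po '[0-9]{4,} \K.*')

# Alternative threshold step using awk and a numeric comparison.
heavy_awk=$(printf '%s\n' "$counts" | awk '$1 >= 1000 { print $2 }')

printf '%s\n' "$counts"
echo "heavy hitters: $heavy"
rm -rf "$tmpdir"
```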
Searching for both single and double quotes¶
Searching for both types of quotes in a file can become confusing quite quickly, as you can't escape a single quote inside a single-quoted string, and double quotes let the shell expand variables and other things.
A simple way to get around this is to use $'
to quote the entire string you want to search for. Inside $'...' (ANSI-C quoting in Bash), a single quote can be written as \'.
This Stack Overflow answer goes into more detail about how to use this.
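As a minimal sketch of that quoting style, the example below searches a throwaway file for a string containing both kinds of quote. It assumes Bash (or another shell that supports $'...'); the file contents are invented, and -F is used so the pattern is treated as a literal string.

```shell
tmpfile=$(mktemp)
printf '%s\n' \
  "He said \"it's fine\"" \
  'no quotes here' > "$tmpfile"

# Inside $'...', \' is a literal single quote and " needs no escaping.
pattern=$'"it\'s fine"'

# -- guards against patterns that start with a dash.
matches=$(grep -cF -- "$pattern" "$tmpfile")
echo "matching lines: $matches"
rm -f "$tmpfile"
```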