Sculpting text with regex, grep, sed, awk, emacs and vim
[shared via Google Reader from Matt Might’s blog]
Unix is an alliance of loosely structured text files bound together and governed by scripts. Unix is the United Confederation of Strings:
The string is a stark data structure and everywhere it is passed there is much duplication of process.
It is a perfect vehicle for hiding information.
—Alan Perlis
Tools built in the Unix tradition excel at manipulating strings as data.
Yet many newer Unix users are unaware of the classic tools and their power.
In this article, I’ll provide a functional introduction to four important
concepts and tools for sculpting text:
regex, grep, sed and awk.
In short:
- regex is a language for describing patterns in strings;
- grep filters its input against a pattern;
- sed applies transformation rules to each line; and
- awk manipulates an ad hoc database stored as text, e.g. CSV files.
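As a rough taste of how these roles differ, here is a minimal sketch that runs grep, sed and awk over the same input. The sample data (names, departments, salaries) and the variable names are invented for illustration and are assumptions, not anything from a real system.

```shell
# Hypothetical sample data in CSV form: name,department,salary.
data='alice,eng,100
bob,ops,80
carol,eng,120'

# grep: keep only the lines matching a pattern (rows in the "eng" department).
eng_rows=$(printf '%s\n' "$data" | grep ',eng,')

# sed: apply a substitution rule to each line (spell out "eng").
renamed=$(printf '%s\n' "$data" | sed 's/,eng,/,engineering,/')

# awk: treat each line as a record of comma-separated fields and sum field 3.
total=$(printf '%s\n' "$data" | awk -F, '{ sum += $3 } END { print sum }')

printf '%s\n' "$eng_rows" "$renamed" "$total"
```

In each case the pattern between the quotes is a regex; only the action taken on a match differs from tool to tool.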
With this functional introduction, my goal is to introduce enough of each tool to cover 80% to 90% of its niche use cases.
Read on for a touch of history, theory and practice.
This post is part of a “Unix fundamentals” series; see basic Unix and settling into Unix for more.