Common Lisp/External libraries/CL-PPCRE

The Common Lisp Portable Perl-Compatible Regular Expressions library, or CL-PPCRE, brings the power of Perl regular expressions to Common Lisp. In the words of its author, Edi Weitz, CL-PPCRE has the following features:


 * It is compatible with Perl.
 * It is pretty fast.
 * It is portable between ANSI-compliant Common Lisp implementations.
 * It is thread-safe.
 * In addition to specifying regular expressions as strings like in Perl you can also use S-expressions.
 * It comes with a BSD-style license so you can basically do with it whatever you want.

Basic Usage
The main entry point to CL-PPCRE is the scan function. It takes a regular expression (or regex) and a target string to match it against, and returns the start and end indices of the match together with the start and end indices of any registers (capture groups) you defined in the regex. If there is no match, it returns NIL.

The first and second return values are the start and end of the matching substring, respectively. The third and fourth return values are arrays holding the start and end indices of the register matches. Note that scan only finds the first match: if the target string contains bar twice, only the first instance is reported. To find the next instance, pass the end index of the previous match as the keyword parameter :start. You can also limit how far the scan runs along the string by specifying the :end keyword.
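The return values described above can be sketched as follows. This is only an illustration: the regex and the target string are made up, and loading the library through Quicklisp is an assumption about your setup.

```lisp
(ql:quickload :cl-ppcre) ; assumption: Quicklisp is installed

;; Made-up target string that contains "bar" twice.
(defparameter *target* "foo bar baz bar")

;; Four values: match start, match end, register starts, register ends.
(cl-ppcre:scan "(b)ar" *target*)
;; => 4, 7, #(4), #(5)

;; Resume after the first match by passing its end index as :start.
;; The indices are still relative to the whole string.
(cl-ppcre:scan "(b)ar" *target* :start 7)
;; => 12, 15, #(12), #(13)

;; :end cuts the scan short, so the second "bar" no longer fits.
(cl-ppcre:scan "(b)ar" *target* :start 7 :end 14)
;; => NIL
```

Because the register indices come back as arrays, compare them with equalp rather than equal when testing results.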

As you may have noticed, keeping track of start points while scanning a string can be a bit tedious. To help with this, CL-PPCRE provides several convenience functions and macros for common tasks, such as:


 * do-scans
 * scan-to-strings
 * do-matches
 * do-register-groups
 * all-matches
 * split
 * regex-replace
 * regex-replace-all
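A minimal sketch of several of these operators follows. The sample strings and the helpers match-positions and key-value-pairs are hypothetical names introduced here for illustration, and Quicklisp is assumed for loading.

```lisp
(ql:quickload :cl-ppcre) ; assumption: Quicklisp is installed

(defparameter *text* "foo bar baz bar") ; made-up sample string

;; scan-to-strings returns the match and an array of register substrings.
(cl-ppcre:scan-to-strings "(b)(a.)" *text*)
;; => "bar", #("b" "ar")

;; all-matches returns a flat list of start and end indices.
(cl-ppcre:all-matches "bar" *text*)
;; => (4 7 12 15)

;; split breaks the string wherever the regex matches.
(cl-ppcre:split "\\s+" *text*)
;; => ("foo" "bar" "baz" "bar")

;; regex-replace changes the first match, regex-replace-all every match.
(cl-ppcre:regex-replace "bar" *text* "qux")
;; first value => "foo qux baz bar"
(cl-ppcre:regex-replace-all "bar" *text* "qux")
;; first value => "foo qux baz qux"

;; do-matches binds the start and end index of each match in turn.
(defun match-positions (regex string)
  "Hypothetical helper: collect a (start . end) pair for every match."
  (let ((positions '()))
    (cl-ppcre:do-matches (start end regex string (nreverse positions))
      (push (cons start end) positions))))

(match-positions "ba." *text*)
;; => ((4 . 7) (8 . 11) (12 . 15))

;; do-register-groups binds the register substrings of each match.
(defun key-value-pairs (string)
  "Hypothetical helper: collect (key . value) pairs from key=value text."
  (let ((pairs '()))
    (cl-ppcre:do-register-groups (key value)
        ("(\\w+)=(\\w+)" string (nreverse pairs))
      (push (cons key value) pairs))))

(key-value-pairs "a=1 b=2")
;; => (("a" . "1") ("b" . "2"))
```

In both macros the optional result form, here (nreverse ...), is evaluated once the iteration is finished and becomes the value of the whole loop.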

Regular Expressions as Trees and Closures
While CL-PPCRE takes regular expressions as strings, it actually parses each string into a regular expression tree. Besides being the Lispy thing to do, this also removes much of the cryptic nature of regular expressions, although it trades it for verbosity. One nice consequence is that, since you have the regular expression's parse tree at hand, it is straightforward to build or alter regular expressions programmatically on the fly.
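To make the tree form concrete, here is a small sketch. parse-string and the tree keywords are part of CL-PPCRE, but the helper whole-word-regex and the sample strings are hypothetical, and Quicklisp is assumed for loading.

```lisp
(ql:quickload :cl-ppcre) ; assumption: Quicklisp is installed

;; parse-string shows the tree that a regex string is turned into.
(cl-ppcre:parse-string "(ab)*")
;; => (:GREEDY-REPETITION 0 NIL (:REGISTER "ab"))

;; A tree can be passed wherever a regex string is accepted.
;; This one corresponds to the string "ba+r".
(cl-ppcre:scan '(:sequence "b" (:greedy-repetition 1 nil #\a) "r")
               "foo baaar")
;; => 4, 9, #(), #()

;; Hypothetical helper: build a tree that matches any one of WORDS
;; as a whole word. WORDS is assumed to hold at least two strings.
(defun whole-word-regex (words)
  `(:sequence :word-boundary
              (:register (:alternation ,@words))
              :word-boundary))

;; "rebar" is skipped because its "bar" does not start at a word boundary.
(cl-ppcre:scan-to-strings (whole-word-regex '("foo" "bar")) "rebar bar")
;; => "bar", #("bar")
```

Building the tree with backquote avoids having to escape or concatenate regex strings by hand, which is the main practical benefit of the S-expression syntax.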

Once it has the tree form of the regular expression, CL-PPCRE compiles that representation into a scanner, a chain of closures, using the function <tt>create-scanner</tt> (many Lisps compile down to native machine code). This has two effects: (1) regular expression scans tend to be very fast, and (2) defining a regular expression tends to be expensive the first time because of the compilation overhead, although there are variables you can set to reduce it. Once compilation is done, you can reuse the same scanner and avoid the compilation overhead. CL-PPCRE also uses proper algorithms that produce an efficient regular expression representation, i.e. not a stack-based system. Overall, it is quite an efficient library.
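Reusing a scanner might look like the sketch below. The variable *date-scanner*, the function find-date, and the date pattern are hypothetical examples, and Quicklisp is assumed for loading.

```lisp
(ql:quickload :cl-ppcre) ; assumption: Quicklisp is installed

;; Build the scanner once, paying the compilation cost a single time.
(defparameter *date-scanner*
  (cl-ppcre:create-scanner "(\\d{4})-(\\d{2})-(\\d{2})"))

;; Hypothetical helper that reuses the scanner on every call.
(defun find-date (string)
  (cl-ppcre:scan-to-strings *date-scanner* string))

(find-date "released 2004-05-17")
;; => "2004-05-17", #("2004" "05" "17")

(find-date "no date here")
;; => NIL

;; create-scanner also accepts mode keywords such as :case-insensitive-mode.
(cl-ppcre:scan (cl-ppcre:create-scanner "bar" :case-insensitive-mode t)
               "FOO BAR")
;; => 4, 7, #(), #()
```

A scanner can be passed to any CL-PPCRE function that accepts a regex, so the same object works with scan, split, regex-replace and the do- macros.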

Examples
Using CL-PPCRE isn't that much different from using any other regular expression engine. Here we will see some quick and dirty examples of what you can do. If you get stuck on how to do something, consult regular expression tutorials, the <tt>perlre</tt> manpage, or a Perl user. Just remember to double the backslashes in your regexes (as you always have to do to insert literal backslashes into Lisp strings).
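The backslash rule is easiest to see in a short sketch. The sample strings are made up, and Quicklisp is assumed for loading.

```lisp
(ql:quickload :cl-ppcre) ; assumption: Quicklisp is installed

;; The Lisp reader turns "\\d+" into the three characters \ d +
(length "\\d+")
;; => 3

;; So CL-PPCRE sees the Perl regex \d+ and matches the digits.
(cl-ppcre:scan "\\d+" "abc 123 ddd")
;; => 4, 7, #(), #()

;; With a single backslash the reader drops it, leaving the regex d+
(cl-ppcre:scan "\d+" "abc 123 ddd")
;; => 8, 11, #(), #()

;; The same applies to escaping metacharacters such as the dot.
(cl-ppcre:scan-to-strings "(\\w+)\\.lisp" "load foo.lisp")
;; => "foo.lisp", #("foo")
```

If heavy escaping becomes hard to read, the S-expression syntax is an alternative, since characters and strings inside a parse tree are taken literally.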