More fun with Regex!

Extending the Curly Quotes module is fun! With a little understanding of how regex works, you can do all sorts of things. For instance, I added the following to my ‘Curly Quotes’ template:

<MTAddRegex>s|&([^#])|&#038;$1|g</MTAddRegex>

Which does what, you may ask? Well, it does the same thing as the Hivelogic URL Cleaner. It finds every & on the site and converts it to the equivalent numeric character reference (&#038;), except when the & is followed by a # (which indicates it’s already part of a numeric reference). As an added bonus, this version will clean up your &’s all over the place, so your page will validate. Well, except for that pesky RDF thing.
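To see what the substitution does to a few sample strings, here is a rough Python rendering of the same pattern. It is only a sketch for experimenting outside Movable Type. The function names are made up for illustration, and the lookahead variant is an alternative of my own, not something the plugin or the original rule uses.

```python
import re

def escape_ampersands(text):
    """Python rendering of the rule s|&([^#])|&#038;$1|g."""
    # The group captures the character after the & so it can be put back.
    return re.sub(r"&([^#])", r"&#038;\g<1>", text)

def escape_ampersands_lookahead(text):
    """Hypothetical variant: (?!#) peeks at the next character without consuming it."""
    return re.sub(r"&(?!#)", "&#038;", text)

print(escape_ampersands("Tom & Jerry"))      # bare ampersand gets escaped
print(escape_ampersands("1 &#060; 2"))       # numeric reference is left alone
print(escape_ampersands("1 &lt; 2"))         # named entity gets mangled
print(escape_ampersands_lookahead("AT&"))    # trailing ampersand is caught too
```

Two quirks show up when you run it. Because the original pattern needs a character after the &, an & at the very end of the text is skipped, and in a doubled && only the first one is converted (the second is consumed as the captured character). The lookahead form avoids both, assuming the regex engine in use supports lookahead.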

There is a downside, in that I’m now forced to use numeric equivalents instead of the handy named shortcuts (like &#060; instead of &lt;), but I’m sure I’ll figure out some workaround for that too. Perhaps if I simply replace all those with their numerical equivalents before replacing the ampersand? I’ll sleep on this one.
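The two-pass idea floated above might look something like this in Python. This is a sketch under assumptions, not a tested Movable Type recipe: the helper names are hypothetical, and the entity table comes from Python's standard `html.entities` module rather than anything in the plugin.

```python
import re
from html.entities import name2codepoint

def named_to_numeric(text):
    """Pass 1: turn known named entities (&lt;) into numeric ones (&#060;)."""
    def swap(match):
        name = match.group(1)
        if name in name2codepoint:
            return "&#%03d;" % name2codepoint[name]
        return match.group(0)  # unknown name: leave it untouched
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", swap, text)

def clean(text):
    """Pass 2: escape whatever ampersands are still bare."""
    text = named_to_numeric(text)
    return re.sub(r"&([^#])", r"&#038;\g<1>", text)

print(clean("1 &lt; 2 & 3 &amp; 4"))
```

The order matters: the named entities have to be converted first, otherwise the ampersand rule rewrites &lt; before the first pass ever sees it.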