- Posted by liammclennan on July 24, 2009
Yesterday Martin Fowler wrote a post about making regular expressions easier to read by using a variation of the Composed Method pattern.
His code ended up looking like this:
const string scoreKeyword = @"^score\s+";
const string numberOfPoints = @"(\d+)";
const string forKeyword = @"\s+for\s+";
const string numberOfNights = @"(\d+)";
const string nightsAtKeyword = @"\s+nights?\s+at\s+";
const string hotelName = @"(.*)";
const string pattern = scoreKeyword + numberOfPoints +
forKeyword + numberOfNights + nightsAtKeyword + hotelName;
The goal is to name the various pieces of the regular expression so that it will be easier to decipher later.
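As a minimal sketch of how the composed pattern might be used, the class below wraps the same constants and returns the three captures. The names ScoreParser and Parse, and the sample input in the usage note, are assumptions made for illustration and do not come from Fowler's post.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the composed pattern.
public static class ScoreParser
{
    const string scoreKeyword = @"^score\s+";
    const string numberOfPoints = @"(\d+)";
    const string forKeyword = @"\s+for\s+";
    const string numberOfNights = @"(\d+)";
    const string nightsAtKeyword = @"\s+nights?\s+at\s+";
    const string hotelName = @"(.*)";
    const string pattern = scoreKeyword + numberOfPoints +
        forKeyword + numberOfNights + nightsAtKeyword + hotelName;

    // Returns { points, nights, hotel } as strings, or null when the input does not match.
    public static string[] Parse(string input)
    {
        Match match = Regex.Match(input, pattern);
        if (!match.Success) return null;
        return new[]
        {
            match.Groups[1].Value, // numberOfPoints
            match.Groups[2].Value, // numberOfNights
            match.Groups[3].Value  // hotelName
        };
    }
}
```

Calling ScoreParser.Parse("score 400 for 2 nights at Grand Hotel") yields the strings "400", "2" and "Grand Hotel". Because the fragments are compile-time constants, the concatenation costs nothing at runtime.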
I wrote a static class to do the same thing more concisely in .NET by using an anonymous type, whose property names label the pieces of the expression. The following snippet shows the same regular expression before and after.
// before (standard .NET regular expression instantiation)
var expression = new Regex(@"<h(?<level>\d).*?>(?<title>.+?)</h\d>", RegexOptions.IgnoreCase);
// using an anonymous object and the Composed Method pattern
var expression = ComposeRegex.Compose(new
{
    openH = @"<h",
    levelCapture = @"(?<level>\d)",
    anyOtherAttributes = ".*?",
    closeH = ">",
    titleCapture = "(?<title>.+?)",
    endHElement = @"</h\d>"
}, RegexOptions.IgnoreCase);
And here is the code for the static class.
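using System.Reflection;
using System.Text;
using System.Text.RegularExpressions;

public static class ComposeRegex
{
    // Concatenates the values of the public instance properties of 'pattern' into one regex.
    // Caveat: the reflection API does not promise that GetProperties returns properties
    // in declaration order, so this relies on behaviour that is only observed in practice.
    public static Regex Compose(object pattern, RegexOptions options)
    {
        var expression = new StringBuilder();
        foreach (var property in pattern.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (property.CanRead) expression.Append(property.GetValue(pattern, null));
        }
        return new Regex(expression.ToString(), options);
    }
}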
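To check that the composed expression behaves like the hand-written one, a small demo along these lines could be used. HeadingDemo, BuildHeadingRegex and Describe are hypothetical names, and the sample HTML in the usage note is made up. The class is repeated only so that the sketch compiles on its own.

```csharp
using System;
using System.Reflection;
using System.Text.RegularExpressions;

// Copy of the class above so this sketch compiles on its own.
public static class ComposeRegex
{
    public static Regex Compose(object pattern, RegexOptions options)
    {
        string expression = string.Empty;
        foreach (var property in pattern.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (property.CanRead) expression += property.GetValue(pattern, null);
        }
        return new Regex(expression, options);
    }
}

// Hypothetical demo class, not part of the original post.
public static class HeadingDemo
{
    public static Regex BuildHeadingRegex()
    {
        return ComposeRegex.Compose(new
        {
            openH = @"<h",
            levelCapture = @"(?<level>\d)",
            anyOtherAttributes = ".*?",
            closeH = ">",
            titleCapture = "(?<title>.+?)",
            endHElement = @"</h\d>"
        }, RegexOptions.IgnoreCase);
    }

    // Returns "level:title" for the first heading found, or null if there is none.
    public static string Describe(string html)
    {
        Match match = BuildHeadingRegex().Match(html);
        return match.Success
            ? match.Groups["level"].Value + ":" + match.Groups["title"].Value
            : null;
    }
}
```

With the properties read in declaration order, HeadingDemo.Describe("<H2 class=\"x\">Intro</H2>") returns "2:Intro", the same result the original one-line pattern would give.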
Hopefully you think that the Composed Method version is more readable. I would love to hear suggestions to improve the implementation. Perhaps a fluent interface would work well?
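One way a fluent interface could look is sketched below. Everything here is hypothetical: RegexBuilder, Part, Build and FluentDemo are names invented for illustration, not an existing .NET API. The name passed to Part only documents the fragment at the call site.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical fluent builder; all names are invented for this sketch.
public sealed class RegexBuilder
{
    private readonly StringBuilder expression = new StringBuilder();

    // Appends one fragment. 'name' is a label for the reader and is not used in the pattern.
    public RegexBuilder Part(string name, string fragment)
    {
        expression.Append(fragment);
        return this;
    }

    // Creates the Regex from the fragments appended so far.
    public Regex Build(RegexOptions options)
    {
        return new Regex(expression.ToString(), options);
    }
}

public static class FluentDemo
{
    // Builds the same heading expression as the anonymous-type example.
    public static Regex Heading()
    {
        return new RegexBuilder()
            .Part("openH", @"<h")
            .Part("levelCapture", @"(?<level>\d)")
            .Part("anyOtherAttributes", ".*?")
            .Part("closeH", ">")
            .Part("titleCapture", "(?<title>.+?)")
            .Part("endHElement", @"</h\d>")
            .Build(RegexOptions.IgnoreCase);
    }
}
```

The fragments are joined in call order, so this variant needs no reflection and does not depend on the order in which properties are returned. The trade-off is that the labels become string arguments rather than identifiers.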