why I like regular expressions + who says they aren’t readable

Scott and I are working on our second book (OCP 8). I’m excited that I get to write the part about regular expressions, as that is one of my favorite programming topics. Why, you ask? Because they let you write clear and efficient code.

The scenario

For the book, I wrote an example showing how validating a simplified phone number is so much easier with a regular expression. The rules for the example were:

  1. a phone number is exactly 10 digits
  2. a phone number may contain dashes to separate the first three digits and next three digits, but not anywhere else
  3. no other characters are allowed (no parens around the area code in this example)

For example, 123-456-7890, 123-4567890 and 123456-7890 are valid. In real life, the third one wouldn’t be; we allow this typo here to be nice. However, dashes aren’t allowed in random positions. 12-45-67-890 is not a phone number.

Without regular expressions

This isn’t in the book, but I tried to write the code “the long way” to confirm it was annoyingly long. It was. I tried to write the code in a readable way and the best I could think of was:

// needs: import java.util.Arrays; import java.util.HashSet; import java.util.Set;
private static boolean validateLong(String original) {
   String phone = original;
   // remove first dash (if present); the length check avoids
   // a StringIndexOutOfBoundsException on short input
   if (phone.length() > 3 && phone.charAt(3) == '-') {
      phone = phone.substring(0, 3) + phone.substring(4);
   }
   // remove second dash (if present), again guarding against short input
   if (phone.length() > 6 && phone.charAt(6) == '-') {
      phone = phone.substring(0, 6) + phone.substring(7);
   }

   // validate 10 characters left
   if (phone.length() != 10) {
      return false;
   }

   // validate only digits left
   Set<Character> digits = new HashSet<>(Arrays.asList('0', '1', '2', '3', '4', '5', '6', '7', '8', '9'));
   for (int i = 0; i < phone.length(); i++) {
      if (!digits.contains(phone.charAt(i))) {
         return false;
      }
   }
   return true;
}

This is a lot of code. And to those who think regular expressions are unreadable, what do you think of the above? I don’t find it easy to see what is going on even though I wrote it. There’s just too much logic and too much detail to get right. (And no, it didn’t work on my first attempt.)

With regular expressions

Rewriting to use a regular expression gives me this:

private static boolean validate(String phone) {
   String threeDigits = "\\d{3}";
   String fourDigits = "\\d{4}";
   String optionalDash = "-?";
   String regEx = threeDigits + optionalDash + threeDigits + optionalDash + fourDigits;
   return phone.matches(regEx);
}

Even if you don’t know the regular expression syntax, it should be obvious what is going on here. We look for three digits, an optional dash, three more digits, another optional dash and a final four digits.
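To see the pattern in action, here is a minimal sketch that runs it against the sample numbers from the scenario. The class name PhoneRegexDemo and the extra samples (the undashed number and the one with parens) are my own additions for illustration. It also compiles the pattern once with Pattern.compile rather than calling String.matches each time, which is just one way to structure it.

```java
import java.util.regex.Pattern;

class PhoneRegexDemo {
    // Same building blocks as the validate() method, named for readability
    private static final String THREE_DIGITS = "\\d{3}";
    private static final String FOUR_DIGITS = "\\d{4}";
    private static final String OPTIONAL_DASH = "-?";

    // Compiled once so repeated validations reuse the same Pattern
    private static final Pattern PHONE = Pattern.compile(
            THREE_DIGITS + OPTIONAL_DASH + THREE_DIGITS + OPTIONAL_DASH + FOUR_DIGITS);

    static boolean validate(String phone) {
        // matches() only succeeds if the entire input fits the pattern
        return PHONE.matcher(phone).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "123-456-7890", "123-4567890", "123456-7890", // valid per the rules
            "1234567890",                                  // no dashes at all
            "12-45-67-890", "(123)456-7890"                // dashes in wrong places, parens
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + validate(sample));
        }
    }
}
```

The first four samples come back true and the last two false, which lines up with the three rules.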

It’s a tiny bit longer in the book version because {3} isn’t on the exam so that part is:

String threeDigits = "\\d\\d\\d";
String fourDigits = "\\d\\d\\d\\d";

Still. Way easier to read and faster to write than the original code without regular expressions. I consider regular expressions like a hammer. They aren’t the right tool for every job, but they are quite helpful when they are the right tool.
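If you want to convince yourself that the spelled-out book form and the {3}/{4} form accept the same input, a quick sketch like this one compares them side by side. The class and method names here are made up for the illustration.

```java
import java.util.regex.Pattern;

class BookVersionDemo {
    // Quantifier form from the blog version
    static final Pattern WITH_QUANTIFIERS =
            Pattern.compile("\\d{3}" + "-?" + "\\d{3}" + "-?" + "\\d{4}");

    // Spelled-out form from the book version
    static final Pattern SPELLED_OUT =
            Pattern.compile("\\d\\d\\d" + "-?" + "\\d\\d\\d" + "-?" + "\\d\\d\\d\\d");

    // True when both patterns agree on whether the input is a valid phone number
    static boolean sameResult(String input) {
        return WITH_QUANTIFIERS.matcher(input).matches()
                == SPELLED_OUT.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"123-456-7890", "123456-7890", "12-45-67-890", "123-456-789", ""};
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" agree: " + sameResult(sample));
        }
    }
}
```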

contrast security plugin for eclipse

I recently learned that Contrast Security has a free plugin that tests your application against the OWASP Top 10.  We’ve tried to fix these already. You can read about how we fixed Clickjacking, CSRF and XSS in JForum.

Installing

I started out by installing the Contrast plugin from the Eclipse Marketplace. After restarting Eclipse, a Contrast view automatically opens with instructions. It says to right click your server and choose “Start with Contrast.” Easy enough. I usually use Sysdeo so I can start the server in one click, but this is hardly onerous.

A Diversion: Fixing Tomcat Configuration

I got an error on startup. I then tried to start the server using the server view (without Contrast) and got the same NoSuchMethodError:

java.lang.NoSuchMethodError: sun.security.ec.NamedCurve.<init>(Ljava/lang/String;Ljava/lang/String;Ljava/security/spec/EllipticCurve;Ljava/security/spec/ECPoint;Ljava/math/BigInteger;I)V

I fixed this by switching Tomcat 7 to use Java 7 instead of Java 8. (We aren’t using Java 8 yet for CodeRanch’s JForum software so this is fine.)

  • Workspace preference
  • Server
  • Runtime Environments
  • Click Tomcat and edit
  • Choose Java 7 as JRE

This had nothing to do with Contrast. I hadn’t encountered it because I was using Sysdeo to start Tomcat before this.
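If you aren’t sure which JRE a process is really running on, one simple check is to print the standard java.version and java.home system properties. This is only a sketch with a made-up class name; to learn what Tomcat itself is using, the same two property lookups would have to run inside the server (for example, logged at startup) rather than from a standalone main method.

```java
class RuntimeCheck {
    // Builds a one-line description of the JRE running this code
    static String describeRuntime() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```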

Actually testing

Now that the server starts up, I stopped it and restarted with Contrast. Then I clicked around the app a bit. (You can use Selenium tests or any other testing tool to automate this part.) The Contrast view starts to populate with its findings. I clicked around until I had about a dozen findings. They were:

Category | Issue | # Instances | Details | My analysis
Orange | Insecure hash algorithms in XXX | 3 | Provides an explanation of what the problem is, why it might/might not be a problem along with the stack trace (showing how it is used) and the HTTP request/headers for the request(s) that triggered it. | Two of the three findings refer to the exact same line of code. (It was run on two different screens.) The other appears to be in Tomcat itself. My configuration isn’t the same as the real server here. [The other two I need to look into further]
Yellow | Anti-Caching Controls Missing in XXXX | 6 | Provides the HTTP request/headers, suggested remediation | It’s annoying to have this reported on every page. Glad there is an “ignore this rule” option. We run a public website and want things to be cached. Client side caching makes the site faster for users and doesn’t leak information since 90% of our information is public to begin with. The only risk is if a moderator accesses the private forum on a public computer. We are technical users and know to clear data if this happens.
Yellow | Forms without autocomplete prevention | 3 | Provides the HTTP request/headers, suggested remediation | Again, we are a public site so not a big deal for browsers to retain information.
Warning | CVE(s) in commons-httpclient-3-1.jar | 1 | Provides links to the two CVEs along with the manifest of the vulnerable library. | I knew this from running Sonatype CLM Insight. The two CVEs are in functionality in the library that we don’t use. Still it is sweet to have this information available for free and with almost no effort. (Insight is a commercial project. We saw a one time result from the report.) I was concerned that information about the jars was being sent over the internet so I asked on Twitter. Jeff Williams replied that the CVE information is in a built in database updated via Eclipse Marketplace. Neat!
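For anyone wondering what an “insecure hash algorithm” finding tends to look like in code: I’m not naming the algorithm the report flagged, so treat this as a generic sketch that assumes something like MD5 was involved. It uses the standard java.security.MessageDigest API; the class and method names are hypothetical. Swapping the algorithm name is the easy part. Whether that is the right fix depends on what the hash is used for (password storage, for instance, calls for a dedicated password hashing scheme rather than a bare digest).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class HashDemo {
    // Hashes the input with the named algorithm and returns lowercase hex
    static String hexDigest(String algorithm, String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 and SHA-256 are required on every Java platform, so this is unexpected
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: MD5 stands in for whatever weak algorithm a scanner might flag
        System.out.println("MD5:     " + hexDigest("MD5", "abc"));
        // A stronger general-purpose digest from the same API
        System.out.println("SHA-256: " + hexDigest("SHA-256", "abc"));
    }
}
```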

What to do with the results

When right clicking on any finding, you have four options:

  • Mark Resolved
  • Delete
  • Ignore (this instance) – useful for a false positive
  • Ignore rule – useful for a rule that doesn’t apply

My thoughts on the Contrast plugin

  • I like that the stack trace is included because it is easy to see context. I also like that lines belonging to the app are in blue in the stack trace.
  • It was very easy to use. And free. Which makes using it a no-brainer.
  • While there aren’t false positives from unused code, there are false positives from context (which a tool can’t know).
  • Two of the rules triggered on a number of pages (and would have triggered on many more if I had tested more).
  • While I don’t have a long list of things to follow up, it was a good thought exercise. And the reason I don’t have a long list is that we manually went through the OWASP Top 10 in preparation for the “Iron Clad Java” promo recently (so as not to have embarrassing issues pointed out).

Progress on the OCA: Oracle Certified Associate Java SE 8 Programmer I Study Guide

In September, Scott and I announced we were writing a book for the OCA (Java 8) exam. Just over a month later, the book cover is up on Amazon along with the estimated publish date of December 31, 2014. I assume this means early January as I find it hard to believe anything happens at a large company during Christmas/New Year’s Week.

It’s great to see progress though. The book is now starting the technical proofreading stage. Yesterday, our tech proofer showed us what the PDF of some of the chapters looks like. It was really cool seeing the jump from a (heavily edited and iterative) Word document to a sharp looking PDF. It’s also exciting seeing something we wrote in near final form.