Eclipse – easily looking at Java bytecode

A fellow moderator asked me to weigh in on this question at CodeRanch. The gist is whether this code creates one String or two:

String s = "" + 3;

How to find out the answer

The most definitive way to verify this is to check the bytecode. I had downloaded the bytecode plugin when working on our Java 8 OCA Study Guide because sometimes you just have to know what actually goes on behind the scenes to be accurate.

Using the plugin is easy. You go to Window -> Show View -> Other -> Java -> Bytecode. Then every time you save the Java file, the bytecode window is automatically updated. Great for lots of iterations.

The test

I wrote a simple Java class:

package jb;
public class PlayTest {
  public static void main(String[] args) {
    String s = "" + 3;
  }
}

The generated bytecode is:

// class version 52.0 (52)
// access flags 0x21
public class jb/PlayTest {

  // compiled from: PlayTest.java

  // access flags 0x1
  public <init>()V
   L0
    LINENUMBER 4 L0
    ALOAD 0
    INVOKESPECIAL java/lang/Object.<init> ()V
    RETURN
   L1
    LOCALVARIABLE this Ljb/PlayTest; L0 L1 0
    MAXSTACK = 1
    MAXLOCALS = 1

  // access flags 0x9
  public static main([Ljava/lang/String;)V
   L0
    LINENUMBER 8 L0
    LDC "3"
    ASTORE 1
   L1
    LINENUMBER 14 L1
    RETURN
   L2
    LOCALVARIABLE args [Ljava/lang/String; L0 L2 0
    LOCALVARIABLE s Ljava/lang/String; L1 L2 1
    MAXSTACK = 1
    MAXLOCALS = 2
}
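The line to look at in main is LDC "3": the compiler has already folded "" + 3 into a single string constant, so only one String is involved and no concatenation happens at run time. A minimal sketch of the same effect without the plugin follows. The class and method names are made up for illustration, and the comparison with == is used here only to observe object identity, not as a way to compare strings in real code.

```java
class ConstantFoldingDemo {
    // Both operands are compile-time constants, so javac folds them
    // into the single literal "3" (the LDC "3" seen in the bytecode).
    static String folded() {
        return "" + 3;
    }

    // n is not final, so "" + n is not a constant expression and the
    // concatenation is performed at run time, producing a new String.
    static String concatenatedAtRuntime() {
        int n = 3;
        return "" + n;
    }

    public static void main(String[] args) {
        // Same interned constant as the literal "3".
        System.out.println(folded() == "3");                      // true
        // Equal content, but a different object.
        System.out.println(concatenatedAtRuntime() == "3");       // false
        System.out.println(concatenatedAtRuntime().equals("3"));  // true
    }
}
```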

why I like regular expressions + who says they aren’t readable

Scott and I are working on our second book (OCP 8). I’m excited that I get to write the part about regular expressions as that is one of my favorite programming topics. Why, you ask? Because it lets you write clear and efficient code.

The scenario

For the book, I wrote an example showing how validating a simplified phone number is so much easier with a regular expression. The rules for the example were:

  1. a phone number is exactly 10 digits
  2. a phone number may contain dashes to separate the first three digits and next three digits, but not anywhere else
  3. no other characters are allowed (no parens around the area code in this example)

For example, 123-456-7890, 123-4567890 and 123456-7890 are valid. In real life, the third one wouldn’t be; we allow this typo here to be nice. However, dashes aren’t allowed in random positions. 12-45-67-890 is not a phone number.

Without regular expressions

This isn’t in the book, but I tried to write the code “the long way” to confirm it was annoyingly long. It was. I tried to write the code in a readable way and the best I could think of was:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

private static boolean validateLong(String original) {
   String phone = original;
   // remove first dash (if present); the length check avoids an exception on short input
   if (phone.length() > 3 && phone.charAt(3) == '-') {
      phone = phone.substring(0, 3) + phone.substring(4);
   }
   // remove second dash (if present); with the first dash gone it can only be at index 6
   if (phone.length() > 6 && phone.charAt(6) == '-') {
      phone = phone.substring(0, 6) + phone.substring(7);
   }

   // validate 10 characters left
   if (phone.length() != 10) {
      return false;
   }

   // validate only digits left
   Set<Character> digits = new HashSet<>(Arrays.asList('0', '1', '2', '3', '4', '5', '6', '7', '8', '9'));
   for (int i = 0; i < phone.length(); i++) {
      if (!digits.contains(phone.charAt(i))) {
         return false;
      }
   }
   return true;
}

This is a lot of code. And to those who think regular expressions are unreadable, what do you think of the above? I don’t find it easy to see what is going on even though I wrote it. There’s just too much logic and too much detail to get right. (And no, it didn’t work on my first attempt.)

With regular expressions

Rewriting it to use a regular expression gives me this:

private static boolean validate(String phone) {
   String threeDigits = "\\d{3}";
   String fourDigits = "\\d{4}";
   String optionalDash = "-?";
   String regEx = threeDigits + optionalDash + threeDigits + optionalDash + fourDigits;
   return phone.matches(regEx);
}

Even if you don’t know the regular expression syntax, it should be obvious what is going on here. We look for three digits, an optional dash, three more digits, another optional dash and a final four digits.
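To see the rules in action, here is a small sketch that runs the same method against the sample numbers from the scenario. The class name and the extra invalid samples (parentheses, a number that is one digit short) are my own additions for illustration. Note that String.matches only succeeds when the whole string matches the pattern, which is what keeps stray characters out.

```java
class PhoneValidationDemo {
    // Same logic as the validate method above: three digits, optional dash,
    // three digits, optional dash, four digits.
    static boolean validate(String phone) {
        String threeDigits = "\\d{3}";
        String fourDigits = "\\d{4}";
        String optionalDash = "-?";
        String regEx = threeDigits + optionalDash + threeDigits + optionalDash + fourDigits;
        return phone.matches(regEx); // must match the entire string
    }

    public static void main(String[] args) {
        String[] samples = {
            "123-456-7890",   // valid
            "123-4567890",    // valid
            "123456-7890",    // valid (the typo we allow)
            "1234567890",     // valid, no dashes
            "12-45-67-890",   // invalid, dashes in the wrong places
            "(123)456-7890",  // invalid, parentheses not allowed (hypothetical sample)
            "123-456-789"     // invalid, only nine digits (hypothetical sample)
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + validate(sample));
        }
    }
}
```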

It’s a tiny bit longer in the book version because {3} isn’t on the exam, so that part is:

String threeDigits = "\\d\\d\\d";
String fourDigits = "\\d\\d\\d\\d";

Still. Way easier to read and faster to write than the original code without regular expressions. I consider regular expressions like a hammer. They aren’t the right tool for every job, but they are quite helpful when they are the right tool.
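If you want to convince yourself that the book version behaves the same as the {3} version, a quick sketch like this compares the two patterns on a few inputs. The class, constant, and method names are made up for this example, and a handful of samples is a sanity check rather than a proof.

```java
class BookStyleRegexDemo {
    // Version using quantifiers.
    static final String WITH_QUANTIFIERS = "\\d{3}-?\\d{3}-?\\d{4}";
    // Book version: each digit spelled out, no {n} quantifier.
    static final String BOOK_STYLE = "\\d\\d\\d-?\\d\\d\\d-?\\d\\d\\d\\d";

    // True when both patterns give the same answer for this input.
    static boolean sameVerdict(String phone) {
        return phone.matches(WITH_QUANTIFIERS) == phone.matches(BOOK_STYLE);
    }

    public static void main(String[] args) {
        String[] samples = {"123-456-7890", "1234567890", "12-45-67-890", "123-456-78901"};
        for (String sample : samples) {
            System.out.println(sample + " -> book version: " + sample.matches(BOOK_STYLE)
                    + ", same verdict: " + sameVerdict(sample));
        }
    }
}
```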

Progress on the OCA: Oracle Certified Associate Java SE 8 Programmer I Study Guide

In September, Scott and I announced we were writing a book for the OCA (Java 8) exam. Just over a month later, the book cover is up on Amazon along with the estimated publish date of December 31, 2014. I assume this means early January as I find it hard to believe anything happens at a large company during Christmas/New Year’s Week.

It’s great to see progress though. The book is now starting the technical proofreading stage. Yesterday, our tech proofer showed us what the PDF of some of the chapters looks like. It was really cool seeing the jump from a (heavily edited and iterative) Word document to a sharp-looking PDF. It’s also exciting seeing something we wrote in near-final form.