javaranch – the web based pick winners program

JavaRanch uses a Java program to pick the weekly winners for book promotions.  It previously used a series of classes that went to URLs, parsed the data, went to more URLs, picked some random winners and then output them to a file.  These contortions were done because the old software was hard to change.  With the new Java based software, we have much more active development.  Time for a new approach.

Designing the new pick winners program.  (It’s the 3rd iteration of the program and the 2nd I’ve done so I’m familiar with the domain.)

  1. Decide to make a web based version (servlet)
  2. Think about what I need from the database.
  3. Write three DAO methods to get post, topic and user info.  While I wrote the integration tests first, I did write the unit tests after the code.
  4. Start the pick winner class.  Realize there is a lot of date validation logic (and determining the default week) and rename class to WinnerPickingWeek to encapsulate the date range.
  5. Start the pick winner class again.  Call the three DAO methods tying them together.
  6. Now add the randomness.  My test with 1 post will give me enough determinism to keep the tests passing and useful.
  7. Add a test for excluding ineligible winners (like Henry and me, the winner pickers).
  8. Now on to the front end.  My servlet needs to make sure you are logged in as admin and then delegate to the processing logic.
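
Steps 6 and 7 can be sketched roughly as follows.  This is not the actual JavaRanch code: the class name WinnerPickerSketch, the method pickWinners and the user names are made up for illustration, and giving each poster a single entry regardless of how many posts they made is an assumption.  It shows why a test with one eligible poster stays deterministic even though the pick is random.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class WinnerPickerSketch {

    /**
     * Picks up to count distinct winners from the users who posted,
     * skipping anyone on the ineligible list.
     */
    public static List<String> pickWinners(List<String> posters, Set<String> ineligible,
                                           int count, Random random) {
        // Assumption: each user gets one entry no matter how many posts they made.
        Set<String> eligible = new LinkedHashSet<>(posters);
        eligible.removeAll(ineligible);

        // Shuffle the eligible users and take the first few as winners.
        List<String> shuffled = new ArrayList<>(eligible);
        Collections.shuffle(shuffled, random);
        return new ArrayList<>(shuffled.subList(0, Math.min(count, shuffled.size())));
    }

    public static void main(String[] args) {
        // Hypothetical data: two winner pickers and one eligible poster.
        List<String> posters = List.of("henry", "alice", "me", "alice");
        Set<String> winnerPickers = Set.of("henry", "me");

        // Only one eligible user remains, so the result is the same on every run.
        System.out.println(pickWinners(posters, winnerPickers, 4, new Random()));
    }
}
```

Passing the Random in as a parameter also leaves room for a seeded instance if a test ever needs more than one eligible poster.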

This gave me a usable program that runs much faster.  After that I added some jQuery logic to make the page dynamic and even more useful.  But that’s another topic – possibly a more interesting one.  I’ll post it later in the week.

how testing can improve legacy code design

There’s no shortage of articles on how TDD improves the design of new code.  That’s all well and good.  But what about legacy code?

How it came up

This weekend, I had occasion to make a few enhancements to the email sending project at JavaRanch.  The one that got me thinking about the design was when I needed to add some logic to filter the e-mail list.  I was trying to allow specifying the start and end index so we could resend to just part of the list if the process failed.  After all, you don’t want people getting two copies.

What I did

As the current filtering logic was in the BulkMailerProcess class, I decided to start there.  This class has a bunch of dependencies and isn’t currently tested.  Hello legacy code!  My first thought was that I would make my new filtering method package private and test that.  As I set out to write the test, I realized I needed to get access to the instance variable containing the list of e-mails.  Ugh.  This made me cringe enough to think about an alternate direction.

My next thought was that I really have a separate concept here.  I’m filtering e-mails.  So is one of the existing methods in BulkMailerProcess.  (It makes sure the e-mails are properly formed.)  Time to create a new class.  At this point it was easy.  My new class EmailFilter takes a list of e-mails and runs both filtering/cleaning operations on them.  It’s very focused and gets all that logic out of the main processing class.  I feel like I left the code cleaner than I found it.
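
To make the shape of such a class concrete, here is a minimal sketch.  It is not the real EmailFilter: the class name EmailFilterSketch, the method names, the inclusive start / exclusive end convention and the simplistic format check are all assumptions standing in for the logic that was moved out of BulkMailerProcess.

```java
import java.util.ArrayList;
import java.util.List;

public class EmailFilterSketch {

    private final List<String> emails;

    public EmailFilterSketch(List<String> emails) {
        // Copy the list so the filter never changes the caller's data.
        this.emails = new ArrayList<>(emails);
    }

    /** Placeholder format check (exactly one '@', not at either end); the real rule is not shown in this post. */
    static boolean looksValid(String email) {
        if (email == null) {
            return false;
        }
        int at = email.indexOf('@');
        return at > 0 && at == email.lastIndexOf('@') && at < email.length() - 1;
    }

    /** Drops malformed addresses, then keeps positions startIndex (inclusive) to endIndex (exclusive). */
    public List<String> filter(int startIndex, int endIndex) {
        List<String> valid = new ArrayList<>();
        for (String email : emails) {
            if (looksValid(email)) {
                valid.add(email);
            }
        }
        // Clamp the range so an out-of-bounds index yields a shorter list instead of an exception.
        int from = Math.max(0, Math.min(startIndex, valid.size()));
        int to = Math.max(from, Math.min(endIndex, valid.size()));
        return new ArrayList<>(valid.subList(from, to));
    }

    public static void main(String[] args) {
        EmailFilterSketch filter = new EmailFilterSketch(
                List.of("a@example.com", "not-an-email", "b@example.com", "c@example.com"));
        // Resend to the second and third valid addresses only.
        System.out.println(filter.filter(1, 3));
    }
}
```

Because the class only needs a list of strings, a test can build it directly without touching any of the mailer's other dependencies.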

The result

It certainly is more tested.  The method to clean the e-mails wasn’t originally tested since it was so embedded in everything else.  Now it is.  In fact now it is tested at 100%.  Without looking at the implementation of the method I copied in, I did write one test to verify the method gets called and at least does what it sounds like.  With this system I got to 96% coverage on the new EmailFilter class.  I could easily get to 100%, but that’s for another day.  The goal here isn’t to be perfect.  It’s to leave things better than they started out.

I left a comment in the code so nobody thinks it is more tested than it is.

/**
* Check valid format for email. This logic was moved verbatim from
* BulkMailerProcess class. Do not change without adding more detailed unit
* tests to verify behavior.
*/

What’s next

The only thing I’m less than happy with is the name of the class.  I started with EmailListUtil because I didn’t have a better name.  Then I changed it to EmailFilter.  But it’s not just a filter.  It’s also doing validation, cleaning the invalid e-mails out of the list before filtering by index.  I’ll have to think about the name some more.

The other next step is to improve things a little more next time I touch the code.  I already have some ideas.  The key is to not attempt too much at once.  That would be overwhelming.  A little at a time makes it doable.

Great Developers Are Not Afraid to Experiment

As a moderator on the JavaRanch, I often come across people asking “What would happen if I executed the following code”. Many times the author of such posts can answer the question by copying and pasting his/her code into a Java main() method and running it. Some might chalk these posts up to being lazy, but, clearly, taking the time to write a post on a message board – often signing up as a member for the first time – takes some amount of effort as well. With that, I’m going to go with the assumption that developers avoid experimenting with code because they are scared or unsure of their own knowledge. Besides which, if it is a matter of laziness, there’s not much advice I can give except to say “Don’t be lazy”.


Why experimenting with code can be scary

In my experience as a teacher, development lead, and moderator I often come across developers who are unsure of their own knowledge. Oftentimes they don’t fully understand what it is the code is doing and are afraid to experiment for fear it will either demonstrate their own personal weakness or harm the existing code. To them, I believe that experimenting is most crucial, if only because some of the doubts and questions that linger in their head can often be answered in a surprisingly short amount of time. If you are staring at a piece of code, puzzled by what you do not understand, take my advice: Step back and play 20 questions, ask yourself “What are questions I have about this code that can be answered with a simple yes/no?” then set up test code to answer each question. Once you have your answer, your doubt about the application should vanish, replaced by first-hand knowledge of what is going on. Keep in mind that sometimes these experiments lead to even more questions, but that’s good; it’s part of the learning process. In those cases, perform even more experiments on the code.

What is an experiment?

What constitutes an experiment? Oftentimes, it involves just writing a short line or two of code, then writing a logging statement, or, more commonly, if you don’t have a logger, a System.out.println() statement that displays the value of some variable or object. For example, if you don’t know why a method is behaving a certain way, add a dozen output statements throughout the code so you can follow two things: 1) the path the code is taking and 2) the value of the data throughout the code. Many times, you may be staring at a section of code, wondering why it’s not working, only to find out that section of code is never reached at run-time. Experimenting can be about changing values and recording the inputs, but sometimes it’s just about outputting where/what you think a process is doing.
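
As a small illustration of tracing both the path and the data, consider the sketch below. The class TraceExperiment and its classify method are invented for this example; the point is only that the output statements reveal a branch that can never be reached.

```java
public class TraceExperiment {

    /** Hypothetical method under investigation: why does nobody ever get "distinction"? */
    public static String classify(int score) {
        System.out.println("classify: entered with score=" + score);
        String grade;
        if (score >= 50) {
            System.out.println("classify: took the >= 50 branch");
            grade = "pass";
        } else if (score >= 90) {
            // This line never shows up in the output: any score of 90 or more
            // has already been caught by the first branch.
            System.out.println("classify: took the >= 90 branch");
            grade = "distinction";
        } else {
            System.out.println("classify: took the fallback branch");
            grade = "fail";
        }
        System.out.println("classify: returning grade=" + grade);
        return grade;
    }

    public static void main(String[] args) {
        // Run with a value that "should" hit the branch in question and read the trace.
        classify(95);
    }
}
```

Running it with 95 prints the entry line, the ">= 50" line and the return line, which answers the question without any guessing.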

Some people will recommend a debugger for experimentation but I’m not one of them. Aside from often being unwieldy and confusing to use, especially for beginners, sometimes running code through a debugger can affect what it outputs. For example, in a server environment, remote debuggers can be especially difficult to use. Services may have transaction timeouts of 30 seconds, and pausing the code in the debugger can cause an exception to be thrown before the method is complete. If you like using debuggers, more power to you, but as someone who has used both output statements via logging tools and debuggers, I greatly prefer the logging tools. Primarily, this is because the output statements give you more of a trial and error structure to work with: either a test succeeded, or it failed, and the output is all there in front of you.

Stand-alone Safe Experiments

The easiest and safest experiments are those you create that are completely separate from any other code. In terms of Java and Eclipse, this is akin to creating a new Workspace and a new Java Project to run the test code in. Every developer should have a temporary, throw-away workspace like this to perform low-level Java tests with. Simply create a file which in some way asks the question you want answered, and execute the program to evaluate the results.
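
For instance, the yes/no question “Does String.split() keep trailing empty strings?” fits in one throw-away file. The class and method names here are arbitrary; only String.split() itself is standard Java.

```java
public class SplitExperiment {

    /** Returns how many pieces split(",") produces for the given input. */
    public static int countParts(String input) {
        return input.split(",").length;
    }

    public static void main(String[] args) {
        // "a,b,," has two letters followed by two empty fields.
        int parts = countParts("a,b,,");
        System.out.println("number of parts = " + parts);
        System.out.println("keeps trailing empty strings? " + (parts == 4));
    }
}
```

The program reports two parts, so the answer is no: the trailing empty strings are dropped. That takes less time to find out than writing a forum post.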

Safe Experiments within Existing Code

Let’s say the code is highly framework dependent, as is often the case with J2EE and database-driven applications. In that situation, creating a temporary workspace to house your test case may be too cumbersome to implement. In such a case, you can run the experiment inside your existing code provided you are careful and follow some general guidelines:

  • If you are working with code that is backed by a repository, such as Subversion, CVS, ClearCase, etc, make sure your experimentation code does not get checked in, or you may end up with leftover test code running in production. The DailyWTF has literally thousands of such cases. It is perfectly fine to experiment, just be careful to clean up when you are done!
  • If you are working with code that is *not* backed by a repository, then install a developer repository! I cannot tell you enough the power and value of using a coding repository for all development work. In lieu of that, though, you should just make a copy of the project and/or workspace and experiment with the copy, keeping the original intact.
  • If you are working with code that connects to a database, make sure it’s not one other developers use. Making changes to a shared database as part of a test could affect other developers, so, if possible, you should have your own local copy of the database. This does not mean you should have a copy of the production database, but merely a copy of a QA or Development database.

Final Thoughts
In my experience, one of the things that separates a great developer from the rest of the pack is that the great developer fully understands the code they are writing. When a bug appears (even great developers can cause bugs), this person often has a good idea where to look for the problem right away, saving valuable support time. Ultimately, if you ever find yourself mystified by your own code base, run some experiments and learn why things work the way they work. You’ll be a better developer for it!