javaranch – the web based pick winners program

JavaRanch uses a Java program to pick the weekly winners for book promotions.  It previously used a serious of classes that went the URLs, parsed the data, went to more URLs, picked some random winners and then output them to a file.  These contortions were done because the old software was hard to change.  With the new Java based software, we have much more active development.  Time for a new approach.

Designing the new pick winners program.  (It’s the 3rd iteration of the program and the 2nd I’ve done so I’m familiar with the domain.)

  1. Decide to make a web based version (servlet)
  2. Think about what I need from the database.
  3. Write three DAO methods to get post, topic and user info.  While I wrote the integration tests first, I did write the unit tests after the code.
  4. Start the pick winner class  Realize there is a lot of date validation logic (and determining the default week) and rename class to WinnerPickingWeek to encapsulate the date range.
  5. Start the pick winner class again.  Call the three DAO methods tying them together.
  6. Now add the randomness.  My test with 1 post will give me enough determinism to keep the tests passing and useful.
  7. Added a test for excluding ineligible winners (like Henry and I – the winner pickers)
  8. Now on to the front end.  My servlet needs to make sure you are logged in as admin and then delegate to the processing logic.

This got a useable program that runs much faster.  After that I added some jQuery logic to make the page dynamic and even more useful.  But that’s another topic – possibly a more interesting one.  I’ll post it later in the week.

how testing can improve legacy code design

There’s no shortage of articles on how TDD improves the design of new code.  That’s all well and good.  But what about legacy code?

How it came up

This weekend, I had occasion to make a few enhancements to the email sending project at JavaRanch.  The one that got me thinking about the design was when I needed to add some logic to filter the e-mail list.  I was trying to allow specifiying the start and end index so we could resend to just part of the list if the process failed.  After all, you don’t want people getting two copies.

What I did

As the current filtering logic was in the BulkMailerProcess class, I decided to start there.  This class has a bunch of dependencies and isn’t currently tested.  Hello legacy code!  My first thought was that I would make my new filtering method package private and test that.  As I set out to write the test, I realized I needed to get access to the instance variable containing the list of e-mails.  Ugh.  This made me cringe enough to think about an alternate direction.

My next thought was that I really have a separate concept here.  I’m filtering e-mails.  So it one of the existing methods in BulkMailerProcess.  (It makes sure the e-mails are properly formed.)  Time to create a new class.  At this point it was easy.  My new class EmailFilter takes a list of e-mails and runs both filtering/cleaning operations on them.  It’s very focused and gets all that logic out of the main processing class.  I feel like I left the code cleaner than I found it.

The result

It certainly is more tested.  The method to clean the e-mails wasn’t originally tested since it was so embedded in everything else.  Now it is.  In fact now it is tested at 100%.  Without looking at the implementation of the method I copied in, I did write one test to verify the method gets called and at least does what it sounds like.  With this system I got to 96% coverage on the new EmailFilter class.  I could easily get to 100%, but that’s for another day.  The goal here isn’t to be perfect.  It’s to leave things better than they started out.

I left a comment in the code so nobody thinks it is more tested than it is.

/**
* Check valid format for email. This logic was moved verbatim from
* BulkMailerProcess class. Do not change without adding more detailed unit
* tests to verify behavior.
*/

What’s next

The only thing I’m less than happy with is the name of the class.  I started with EmailListUtil because I didn’t have a better name.  Then I changed it to EmailFilter.  But it’s not just a filter.  It’s doing validation too for cleaning the invalid e-mails from the list before filtering by index.  I’ll have to think about the name some more.

The other next step is to improve things a little more next time I touch the code.  I already have some ideas.  The key is to not attempt too much at once.  That would be overwhelming.  A little at a time makes it doable.

Subversion, Subversion – what client shall I pick?

Since I’m using Eclipse at home for development, it seemed logical to go with Subclipse or Subversive – two of the top three clients listed on the Subversion homepage.  I also tried TortoiseSVN the the other of the top three.  Below are my experiences with the three.  (I haven’t tried branching yet in any of them.)  All have had a release within the past month.  My comments for Subclipse and Subversive are largely from slightly earlier versions though.

Note: See If you tried Subversive before, it’s time to try it again for an update.

Subclipse 1.4.8 (released February 27, 2009)
Pros:

  • It seemed faster to checkout a large project with Subclipse than Subversive.  I can’t prove this as my internet connection varies, but I did notice this a few times.   Another person had the same observation though.
  • Has been around longer.

Cons:

  • No tag awareness!  When I want to commit a file, I need to think about copying the directory to the tags directory.  This is a mental jump as I’m really just thinking I want to create a version – not about how Subversion data is stored.  I also worry that it is too easy to accidentally create the version in the wrong place.  Why invite trouble?  The CVS Eclipse plugin just asked for the name of the tag.
  • Similarly, I have to think about where the tags are stored when comparing to a past version.
  • One common thing I do in Eclipse is awkward – I want to compare the contents of a file to tagged versions.  If I do “compare with revision” I see all the revision numbers and commit comments, but the tag column is blank.  If I do “compare with branch/tag”, I see all the tags, but can only select one to see the compare before starting the whole process again.  Also, you can only compare with branch/tag on a project level – which takes forever – at least over my internet connection.  As a clumsy workaround, we’ve been storing the tag name and revision number on our “release notes” wiki page.  This is an extra manual step that I feel should be unnecessary.  If the tool would just show the tag in the “compare with revision” view like Eclipse does, it would be perfect.
  • The proposal to become an Eclipse project has been withdrawn.

How long I used before switching
A few months.  It wasn’t until the project got into the tagging phase when the tag awareness feature became problematic.

Subversive 0.7.7 (release February 24, 2009)
Pros:

  • Can compare file with “working copy” (last checked out) without connecting to network
  • Since November 2007, has been an incubator project on eclipse.org.  This reset the version number which is why .7 is newer than 1.0/1.1 on polarion.org.

Cons:

  • Installing requires two parts.  The connectors are a separate install for licensing purposes.
  • As in Subclipse, I want to compare the contents of a file to tagged versions.  Using “compare with revision” is worse than in Subclipse.  it doesn’t attempt to show the tags.  More importantly, you must pick which revision you want to compare to, wait a while, view it and then repeat the entire process to pick another revision.  Not nearly as easy to use as the Eclipse version.  (If you think to open the SVN history view first, you get the Eclipse/Subclipse based view where you see them all at once.  It still takes forever though for large projects as it is seeing what else changed in that revision.  Turning off “deep copy” didn’t help matters.)  I didn’t try compare to with branch/tag because I didn’t see the option.

How long I used before switching
I started with Subversive.  After a few months a switched to Subclipse.  Then I switched back to Subversive where I have been since December.

I wasn’t thrilled with either of these so I then tried a standalone client. Tortoise SVN 1.5.9 (released February 27, 2009)
Pros:

  • Checking out one project wasn’t hard.
  • I like the GUI – very intuitive.
  • Committing was simple from Windows.
  • A very cool graphical release view for tags and branching.
  • I really like the log viewer letting you filter by date and or message.  It makes it easy to find out which version you want to compare with.

Cons:

  • It’s not in the IDE.  If not using an IDE, this isn’t a problem of course.
  • Windows only.
  • To tag a project, you need to be aware of the directory structure.  (See my comments about Subclipse for why this scares me.)  There is a graphical explorer to find the tags directory.

How long I used before switching
I just tried it for a few days.  I really wanted an Eclipse based plugin – was just looking to see if this was better.  If I wasn’t using Eclipse, this would be fine.

Conclusion

Both Subclipse and Subversion are usable Eclipse plugins for Subversion.  I prefer Subversive a bit and am going with that at home.  The main reason being tag awareness (I tend to deploy/tag on a weekday evening when I am tired and more likely to mess things up.)  I am keeping TortoiseSVN on my machine for the non-Java code checkins I need to do.  Note that you do have to pick one or the other.    You can’t point to both Tortoise and an Eclipse plugin for the same directory.  As a result, the last week of development on my home computer for Javaranch looked like:

  1. Sunday – Do a bit of Java development in JForum project using Eclipse.  Check out/commit using Subversive.
  2. Monday night – Generate the book promotion materials in PickWinners project by updating the build.properties and running an Ant build script.  (I don’t check these in on Monday since they are so easy to regenerate if something happens to my computer over the course of the week.)
  3. Wednesday night – Production deployment.  I wanted to get in some changes that would make sending private messages to the winners a two click operation.  (It took six clicks for each winner last time it was my turn.)  Tag JForum project using Subversive.
  4. Friday night – Pick winners for book promotion copying winners from web page into one of the files generated on Monday.  Commit using TortoiseSVN.  I like that I didn’t need to open Eclipse on Friday – picking winners is now a web page and the commit happened in Windows.

Many weeks I do less Subversion work at home.  It so happens last week was representative of the breadth of SVN operations I do.  I liked the split between Subversive for JForum (Java development) vs TortoiseSVN for PickWinners (Ant build and text files) and plan to continue that way. I’ll be interested to hear what my co-promotion coordinator uses to commit when it is his turn.  I’m not overly thrilled with either Subversion Eclipse plugin.  I’m thinking of pulling in the JForum project into another directory in Tortoise to use it for file comparison too!

This analysis isn’t so useful for recommending a tool in a corporate scenario where network connections are more reliable and projects are huge.  My suspicion is that Subclipse is better in that environment.  Especially if the majority of tagging operations are done through an automated build.