“i want a career in java”; “java is the next cobol”

Over the past week, I’ve heard two quotes: “I want a career in Java” and “Java is the next Cobol.”  While these two statements don’t conflict with each other when taken literally, their connotations are quite clearly opposite.  Let’s take a look at them.

“I want a career in Java”

This is often said in the context of a college student who doesn’t have a programming job yet.  So why Java specifically?  When I was in college, I wanted to be a developer.  Any language would have been fine.  It was where my path happened to take me that made me (currently) a Java developer.

Further, you don’t have a career in one programming language.  Things change too fast in technology for that to be the case.  So the fact that I’ve been a Java developer for the last ten years doesn’t imply that it will be my language for my whole career.  Or maybe a career is a shorter-term concept than I think of it as?

I think the context behind this statement is that Java jobs appear to be plentiful and pay well, making them attractive to someone without experience in any language.

“Java is the next Cobol”

A quick internet search says Java has been compared to Cobol since at least 2007.  Let’s see.  Cobol is a widespread language used in countless production applications.  It has lasted decades.  It wasn’t cool, but it worked.  Not a bad place to be.

I think the context behind this statement is that Java isn’t cool anymore, or isn’t a “default” choice of language.  Which is fine.

Where we are

These quotes show that we have people eager to learn Java and people predicting its demise at the same time.  Clearly the reality is somewhere in the middle.

Java is good for certain types of apps.  JVM languages are good for certain types of apps (especially when there is a desire to integrate with “legacy” Java code).  As are .NET and Python and Ruby and …

Innovation is good.  That’s why we became techies!

data transformation: an example with jforum

Rafael sent me a copy of his jforum database so we could use the data on jforum.  It’s not a trivial import, so I thought it would be interesting to walk through what I did.  I’ve omitted the details of the dead ends but tried to at least allude to them for those learning to develop.

Step 1: Importing what Rafael sent to make it easy to look at and be able to run queries

It took me a little while to notice that when I asked for a “data dump”, I wasn’t specific enough.  I got a MySQL-specific format dump.

If you were importing a postgres dump, it would be simple:

  • In pgadmin, create another database named raf_dump.
  • Import data from the command line using: “psql -h localhost -U postgres -f jforum.sql -d raf_dump” or via pgadmin’s open file.

The “right” way to deal with a mysql dump would have been to import it into mysql and then export a dump in postgres compatibility mode.  I stupidly tried to avoid installing mysql and instead:

  1. Copy the create table statements from the dump and convert the types to postgres types.
  2. Run the inserts that insert all the data.

Don’t do this!  It doesn’t work.  Aside from type mismatches, I had encoding errors and escape character problems.  Two hours into this, I gave up and did what I should have done in the first place – install MySQL.
To install MySQL:
  1. Download dmg image
  2. Install both dmg files
  3. Then click MySql.prefpane to add it to System Preferences.  This lets you click start, or start automatically on machine startup.  You can get back to it by going to System Preferences and looking in the “other” section at the bottom.
  4. sudo su - root
  5. cd /usr/local/mysql/bin/
  6. Create database: ./mysqladmin create raf_jforum
  7. ./mysql -u root mysql
  8. create user 'jforum'@'localhost' identified by 'xxx';
  9. create user 'jforum'@'127.0.0.1' identified by 'xxx';  (I don’t think this was needed, but I typed it in, so I’m including it here to be thorough.)  Obviously pick a better password than xxx.
  10. grant all on *.* to 'jforum'@'localhost';
  11. grant all on *.* to 'jforum'@'127.0.0.1';  (same deal, I don’t think this was necessary)
  12. Install Sequel Pro to have a nice GUI for querying.  (It also uses Mac shortcuts for newlines and allows you to copy/paste directly.  Much better than the postgres client.)
  13. In Sequel Pro, connect to the database.
  14. In Sequel Pro, File > Import > point to the sql file.  It was really fast – the 34 MB import took less than 30 seconds.
  15. Download the JDBC driver for mysql.

Step 2: Analysis

There are just under 5,000 topics and 20,000 posts.  While we have a one-to-one mapping with the table names, there are a number of adjustments that need to be made.  I’m listing the significant ones here along with my approach for handling them.

Mismatch: We’ve added columns to two of the tables.
Approach: Read the source data into a new file format and then use the Java JForum API to create the topics/posts for consistency with our schema changes.

Mismatch: The topic/post id numbers the source data uses are already in use by our forum.
Approach: Using our Java API solves this as well.

Mismatch: The user ids of the source data are already in use by our forum and some of the members duplicate.
Approach: Have all posts show as being from the same “Migrated from jforum” user.  The original user can be specified in the post as a footer.

Mismatch: The topics come from multiple forums in the source data and we want them to end up in the same forum.
Approach: Have the conversion program ignore the source data forum and choose our JForum forum id.

Mismatch: Some forums aren’t appropriate to migrate (off topic, jforum team relax, test forum).
Approach: Have the conversion program filter on which posts it migrates.
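The “Migrated from jforum” user approach above can be sketched in a few lines.  The class and method names here are my own illustration (not part of the JForum API); it just shows how the original author is preserved in a footer while the post is attributed to a single placeholder user:

```java
// Hypothetical sketch: attribute every migrated post to one placeholder
// user and keep the real author in a footer appended to the post body.
public class MigratedPostFooter {

    // The single user all migrated posts are attributed to.
    static final String MIGRATION_USER = "Migrated from jforum";

    // Append a footer naming the original poster; the exact wording is a guess.
    static String withOriginalAuthor(String postText, String originalUser) {
        return postText + "\n\n[Migrated from jforum; originally posted by "
                + originalUser + "]";
    }

    public static void main(String[] args) {
        System.out.println(withOriginalAuthor("Great question!", "someUser"));
    }
}
```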

After doing this analysis, I realized I don’t need the data in postgres at all.  I need a flat file to run through a program on the server.  I can generate that flat file from MySQL just as easily.

Summary of migration process:

  • Delete data that we don’t want to migrate
  • Write out a flat file containing topic/post/post data/user info that we do want to migrate
  • Write a servlet to read from that flat file and call the JForum APIs to add the data on the server (we used the servlet approach for the initial UBB to JForum migration, so I can reuse parts of that).  The advantage is that it uses the actual JForum codebase to insert.
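The read-and-replay step above could look something like this.  ForumApi is a hypothetical stand-in for the JForum calls, and the pipe-delimited line format is a simplification of the real flat file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical stand-in for the JForum insertion API.
interface ForumApi {
    void createPost(String topic, String text, String originalUser);
}

public class FlatFileMigrator {

    // Read one record per line (topic|text|originalUser is an assumed format)
    // and replay each through the forum API; returns the number migrated.
    static int migrate(BufferedReader in, ForumApi api) throws IOException {
        int count = 0;
        String line;
        while ((line = in.readLine()) != null) {
            String[] fields = line.split("\\|", 3);
            api.createPost(fields[0], fields[1], fields[2]);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
                new StringReader("Welcome|first post|userA\nWelcome|a reply|userB"));
        int migrated = migrate(in, (t, x, u) -> System.out.println(t + ": " + x));
        System.out.println(migrated + " posts migrated");
    }
}
```

Running each record through the real API (rather than raw inserts) is what sidesteps the id clashes: the forum assigns fresh topic and post ids itself.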

Step 3 – Code and test locally

For the flat file, I used the Java CSV library since posts contain spaces, quotes, line breaks and other special characters.  It’s a good library.  Easy to use and works exactly as you’d expect.  The flat file generation code took about an hour to write and less than ten seconds to run.
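To see why a CSV library earns its keep here, this is a minimal sketch of the quoting rules such a library handles for you (standard RFC 4180-style escaping, not the library’s actual code):

```java
public class CsvQuoting {

    // Quote a field only when needed: wrap it in double quotes if it contains
    // a comma, quote, or line break, and double any embedded quotes.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) {
        System.out.println(escape("plain"));           // plain
        System.out.println(escape("a, b"));            // "a, b"
        System.out.println(escape("she said \"hi\"")); // "she said ""hi"""
    }
}
```

Getting every edge case of this right by hand is exactly the kind of thing that caused my earlier encoding and escape-character problems, which is why I reached for a library.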

For the servlet, coding was easier since I had my “starter program” from the UBB migration.  I still hit a couple of bugs (that would have been avoided with unit testing).  My analysis was good though.  Few surprises when coding/testing.  I did create helper objects for IntermediatePost and IntermediateTopic so I didn’t have to keep thinking about the fields.
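The helper objects I mention are just plain holders; here is a sketch of what they might look like (the field names are my guesses, not the actual ones):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the intermediate holder objects, so the conversion
// code passes one object around instead of loose fields.
class IntermediatePost {
    final String text;
    final String originalUser;

    IntermediatePost(String text, String originalUser) {
        this.text = text;
        this.originalUser = originalUser;
    }
}

class IntermediateTopic {
    final String title;
    final List<IntermediatePost> posts = new ArrayList<>();

    IntermediateTopic(String title) {
        this.title = title;
    }
}

public class IntermediateModelDemo {
    public static void main(String[] args) {
        IntermediateTopic topic = new IntermediateTopic("Welcome thread");
        topic.posts.add(new IntermediatePost("first post", "userA"));
        System.out.println(topic.title + " has " + topic.posts.size() + " post(s)");
    }
}
```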

Step 4 – run on test server

I had a network connection error while the program was running, disconnecting my browser from the output.  It was still going in the background, so I watched the database counts increase and the server error logs.  I should have run it in smaller batches.

The servlet was also much faster on my machine than on the server.  I’m guessing I make too many database trips, or it’s the fact that there is more existing data.  It’s not horribly slow though, and we don’t need to do this again, so I’m just letting it finish at its pace.

Step 5 – run on the real server

Smooth!  The forum is live.

Thanks to Rafael for the data!

ant dependency graphing with graphviz

In 2007, I wrote an article about ant dependency graphing in the JavaRanch Journal.  I had cause to try it again to visualize the FIRST robotics complex build scripts.  This time, I’m on a Mac.  At the moment, the MacOS version of Graphviz is in beta.  This isn’t critical, so I gave it a try.

It was easy to install, with a wizard to accept the license, choose the location, etc.  Installation was successful and took less than a minute.

The contenders from 2007

  • Ant2Dot – hasn’t changed since 2007
  • Visant – hasn’t changed since 2002
  • Grand – hasn’t changed since 2005

Since there haven’t been any updates, my analysis should still be the same.  Since Grand generated the most useful output then, I went with it.

Grand

<?xml version="1.0"?>
<project name="Grand" basedir="." default="grand">

	<typedef resource="net/ggtools/grand/antlib.xml" classpath="grand-1.8.jar" />

	<target name="grand">
		<grand output="grand-build.dot" buildfile="/Users/nyjeanne/sunspotfrcsdk/build.xml" />
	</target>

</project>

Graphviz

  1. File > Open – choose grand-build.dot
  2. As before, I can navigate around the file.  This time the graph is ugly.  But I was doing this because the build file was so large and complex that it was hard to understand, so I’m glad I visualized it.
  3. File > Export
Not sure why Graphviz is still in beta; it worked perfectly.  Maybe there are problems with more complex operations?

The image
This series of Ant builds has a really large number of targets and dependencies, so it makes for a complicated graph.  A snippet is at the top of this post.  The full one is available online:
pdf (a decent number of bytes – it still renders tiny, so you’ll want to zoom in and scroll around)
gif (the file is large – a 1MB download – which is why I’m not including it in the main blog entry.  It also renders tiny, so you’ll want to zoom in and scroll around)