performance tuning selenium – firefox vs chrome vs headless

I’m the co-volunteer coordinator for NYC FIRST. Every year we are faced with a problem: we want to export the volunteer data including preferences for offseason events. The system provides an export feature but does not include a few fields we want. A few years ago, my friend Norm said “if only we could export those fields.” I’m a programmer; of course we can!

So I wrote him a program to do just this. It’s export-vol-data on GitHub. And fittingly, he “paid” me with free candy from the NYC FIRST office. Once a year we meet, Norm gives his credentials to the program and we wait. And wait. And wait. This year NYC FIRST had more events than ever before so it took a really long time. I wanted to tune it.

Getting test data

The problems with tuning have been:

  1. I have no control over when people volunteer for the event. It’s hard to performance test when the data set keeps changing.
  2. The time period when I have access to the event is not the time period that I have the most free time.

Norm solved these problems by creating a test event for me. I started over the summer, but then got accepted to speak at JavaOne and was really busy getting ready for that. When I went back to it, someone had deleted my test event. Norm solved that problem by creating a new event called “TEST EVENT FOR SOFTWARE DEVELOPMENT – DO NOT ENROLL OR DELETE, please. – FLL”. One person did volunteer for that anyway, but not many did, so it still helped.

Performance tuning

I tried the following performance improvements based on our experience exporting in April 2017.

  1. SUCCESS: Run the program on the largest events first. (It’s feasible to manually export the data for small events. Plus those have largely people who also volunteered at a larger event.) This allows us to run for the events with the most business value first. It also allows us to abort the program at any time.
  2. SUCCESS: Skip events and roles with zero volunteers. For some reason, it takes a lot longer to load a page with no volunteers. So skipping this makes the program MUCH faster.
  3. SKIP: Add parallelization. I wound up not doing this because the program is so fast now.
  4. FAILED: Switch from Firefox driver to PhantomJS. I knew the site didn’t function with HtmlUnitDriver. I thought maybe it would work with PhantomJS – an in memory driver with better JavaScript support. Alas it didn’t.
  5. FAILED: Try to go directly to URLs with data. FIRST prevents this from working. You can’t simply simulate the REST calls externally.
  6. SUCCESS: Switch from Firefox driver to Chrome driver. This made a huge difference in both performance and stability. The program would crash periodically in Firefox. I was never able to figure out why. I have retry/resume logic, but having to manually click “continue” makes it slower.
  7. UNKNOWN: I added support for Headless Chrome in the program. It doesn’t seem noticeably faster though. And it is fun for Norm and me to watch the program “click” through the site. So I left it as an option, but not the default.
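
The first two items (largest events first, skip anything with zero volunteers) boil down to a filter and a sort. Here is a minimal sketch of that idea, not the actual code from export-vol-data. The Event class, its fields, and the exportOrder method are hypothetical names made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class ExportOrder {

    // Hypothetical value type; the real program's model classes may look different.
    static final class Event {
        final String name;
        final int volunteerCount;

        Event(String name, int volunteerCount) {
            this.name = name;
            this.volunteerCount = volunteerCount;
        }
    }

    // Drops events with no volunteers, then orders the rest largest first.
    static List<String> exportOrder(List<Event> events) {
        return events.stream()
                .filter(e -> e.volunteerCount > 0)
                .sorted(Comparator.comparingInt((Event e) -> e.volunteerCount).reversed())
                .map(e -> e.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("Small qualifier", 12),
                new Event("Empty event", 0),
                new Event("Championship", 250));
        System.out.println(exportOrder(events));
    }
}
```

Because the highest value events come first, aborting the run partway through still leaves the most useful exports done.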

Results

Like any good programming exercise, some things worked and some didn’t. The program is an order of magnitude faster now than at the start though so I declare this a success!

JavaOne – Modern Java Recipes

“Modern Java Recipes”

Speaker: Ken Kousen

For more blog posts from JavaOne, see the table of contents


All examples in this talk are in:
https://github.com/kousen/java_8_recipes

Lazy Streams

  • Streams – doesn’t store elements, doesn’t modify source, lazy when possible, destructive (can only run once)
  • Showed how findFirst() doesn’t cause all intermediate operations to run against all data in stream
  • findFirst() is a short circuiting terminal operation
  • Not many stream operations are short circuiting. limit() is one as well; it is an intermediate operation
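
A small sketch of the laziness point, assuming nothing beyond the JDK. This is not the code from the talk's repository; the class and method names are made up. It records which elements actually enter the pipeline before findFirst() stops it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

class LazyStreams {

    // Records every element that enters the pipeline so laziness is observable.
    static final List<Integer> visited = new ArrayList<>();

    static Optional<Integer> firstDoubledMultipleOfThree() {
        visited.clear();
        return Stream.of(1, 2, 3, 4, 5, 6)
                .peek(visited::add)
                .map(n -> n * 2)
                .filter(n -> n % 3 == 0)
                .findFirst(); // short circuits: later elements are never pulled
    }

    public static void main(String[] args) {
        System.out.println(firstDoubledMultipleOfThree());
        System.out.println("visited: " + visited);
    }
}
```

Only 1, 2 and 3 are visited; 4, 5 and 6 never go through map or filter.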

Debugging streams

  • Eclipse and IntelliJ let you put breakpoints in stream
  • IntelliJ has plugin to see values in stream as go by
  • peek() method
  • Tip: Use a debug log library with peek so easy to turn off
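
One way to read the peek() tip, sketched with java.util.logging (the talk may have used a different logging library). The class and method names are hypothetical. At the default log level the FINE messages are dropped, so the peek calls stay in the code but cost very little.

```java
import java.util.logging.Logger;
import java.util.stream.IntStream;

class PeekDebug {

    private static final Logger LOG = Logger.getLogger(PeekDebug.class.getName());

    static int sumOfDoubledEvens(int from, int to) {
        return IntStream.rangeClosed(from, to)
                // Supplier overload: the message string is only built if FINE is enabled
                .peek(n -> LOG.fine(() -> "original: " + n))
                .filter(n -> n % 2 == 0)
                .peek(n -> LOG.fine(() -> "after filter: " + n))
                .map(n -> n * 2)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfDoubledEvens(1, 10));
    }
}
```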

Strings as Streams

  • String does not implement Iterable
  • Arrays.stream() doesn’t work for char[]
  • str.codePoints() returns int stream
  • StringBuilder::appendCodePoint gets it back into stringish form
  • Obscure case; source: stack overflow
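
A sketch of how codePoints() and StringBuilder::appendCodePoint fit together, using a palindrome check as the example. I'm assuming a palindrome example here for illustration; the names are mine, not necessarily what was on the slides.

```java
class StringStreams {

    // Keeps only letters and digits, lower-cased, then rebuilds a String from the int stream.
    static String normalize(String s) {
        return s.toLowerCase().codePoints()
                .filter(Character::isLetterOrDigit)
                .collect(StringBuilder::new,
                        StringBuilder::appendCodePoint,
                        StringBuilder::append)
                .toString();
    }

    static boolean isPalindrome(String s) {
        String forward = normalize(s);
        String backward = new StringBuilder(forward).reverse().toString();
        return forward.equals(backward);
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("Madam, in Eden, I'm Adam"));
        System.out.println(isPalindrome("hello"));
    }
}
```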

allMatch, anyMatch, noneMatch

  • all short circuiting terminal operations
  • showed with prime number checker – noneMatch returns as soon as finds example that proves number isn’t prime
  • also showed the assertFalse and anyMatch. I didn’t understand why this wasn’t assertTrue and noneMatch in the book
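
A minimal prime checker along the lines described, written from memory rather than copied from the talk. noneMatch stops at the first divisor it finds.

```java
import java.util.stream.IntStream;

class PrimeCheck {

    static boolean isPrime(int num) {
        if (num < 2) {
            return false;
        }
        int limit = (int) Math.sqrt(num);
        // Short circuits as soon as one divisor proves num is not prime
        return IntStream.rangeClosed(2, limit)
                .noneMatch(n -> num % n == 0);
    }

    public static void main(String[] args) {
        System.out.println(isPrime(17));
        System.out.println(isPrime(21));
    }
}
```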

collect

  • showed three arg version – Supplier, BiConsumer to add single element to the result and another BiConsumer to combine two interim results
  • the combiner isn’t mentioned in the JavaDoc pseudocode
  • the combiner also gets used for parallel streams
  • reduce() is similar
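
To make the three arguments concrete, here is a sketch with each one labeled. The method name and sample data are made up. On a sequential stream the combiner is never called; on a parallel stream it merges the partial lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ThreeArgCollect {

    static List<String> evenLengthWords(List<String> words, boolean parallel) {
        Stream<String> stream = parallel ? words.parallelStream() : words.stream();
        return stream
                .filter(w -> w.length() % 2 == 0)
                .collect(
                        () -> new ArrayList<String>(),        // Supplier: creates an empty result
                        (list, w) -> list.add(w),             // accumulator: adds one element
                        (left, right) -> left.addAll(right)); // combiner: merges two partial results
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("this", "is", "a", "list", "of", "strings");
        System.out.println(evenLengthWords(words, false));
        System.out.println(evenLengthWords(words, true));
    }
}
```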

Reduction

  • count() == mapToLong(e -> 1L).sum()
  • Added a few methods like Integer.sum(a,b) so can use as a BinaryOperator
  • The two argument version of reduce takes an identity for the binary operator. This lets it return a value instead of an Optional
  • Use reduce that takes BiFunction if reducing into a different type
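
A sketch of the reduce variants mentioned above, side by side. The class and method names are hypothetical; the stream calls are the standard JDK ones.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class Reductions {

    // count() expressed as a map plus sum
    static long count(List<?> items) {
        return items.stream().mapToLong(e -> 1L).sum();
    }

    // Two argument reduce: the identity means a plain value comes back
    static int sum(List<Integer> nums) {
        return nums.stream().reduce(0, Integer::sum);
    }

    // One argument reduce: no identity, so the result is an Optional
    static Optional<Integer> max(List<Integer> nums) {
        return nums.stream().reduce(Integer::max);
    }

    // Three argument reduce: BiFunction folds Strings into an Integer total
    static int totalLength(List<String> words) {
        return words.stream()
                .reduce(0, (total, w) -> total + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));
        System.out.println(max(Arrays.asList(1, 2, 3, 4)));
        System.out.println(totalLength(Arrays.asList("ab", "cde")));
    }
}
```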

Transforming streams

  • map – one to one mapping
  • flatMap – function from T to a stream. It is one to many where many is a stream
  • Optional also has a flatMap() which is for flattening Optional<Optional<T>> to Optional<T>
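
A short sketch of both flatMap flavors. The parse helper is a made-up example of a method that already returns an Optional, which is the case where Optional.flatMap avoids ending up with Optional<Optional<Integer>>.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class FlatMapDemo {

    // Stream.flatMap: each sentence becomes a stream of words, flattened into one stream
    static List<String> words(List<String> sentences) {
        return sentences.stream()
                .flatMap(s -> Arrays.stream(s.split(" ")))
                .collect(Collectors.toList());
    }

    // Hypothetical helper that returns an Optional itself
    static Optional<Integer> parse(String s) {
        try {
            return Optional.of(Integer.parseInt(s.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Optional.flatMap: map(FlatMapDemo::parse) would give Optional<Optional<Integer>>
    static Optional<Integer> parseOptional(Optional<String> input) {
        return input.flatMap(FlatMapDemo::parse);
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("lazy streams", "are fun")));
        System.out.println(parseOptional(Optional.of(" 42 ")));
        System.out.println(parseOptional(Optional.of("not a number")));
    }
}
```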

Deferred execution

  • Showed logger and how doesn’t build complex string if not needed to log
  • Overload methods to take supplier for this case. Caller just needs to add () ->
  • Don’t worry about this if your string is just a constant
  • Optional.orElseGet works the same way
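
A sketch of deferred execution with a counter so the difference is visible. The debug helper and the other names are hypothetical; the point is that a Supplier argument is only evaluated if someone calls get() on it.

```java
import java.util.Optional;
import java.util.function.Supplier;

class DeferredExecution {

    // Counts how often the "expensive" message is actually built
    static int buildCount = 0;

    static String expensiveMessage() {
        buildCount++;
        return "state dump: " + System.nanoTime();
    }

    // Hypothetical logging helper: the supplier is only invoked when debug is on
    static void debug(boolean debugEnabled, Supplier<String> message) {
        if (debugEnabled) {
            System.out.println(message.get());
        }
    }

    // orElse evaluates its argument before the call, even when a value is present
    static String eagerDefault(Optional<String> value) {
        return value.orElse(expensiveMessage());
    }

    // orElseGet only runs the supplier when the Optional is empty
    static String lazyDefault(Optional<String> value) {
        return value.orElseGet(DeferredExecution::expensiveMessage);
    }

    public static void main(String[] args) {
        debug(false, () -> expensiveMessage());
        lazyDefault(Optional.of("present"));
        System.out.println("builds so far: " + buildCount);
        eagerDefault(Optional.of("present"));
        System.out.println("builds after orElse: " + buildCount);
    }
}
```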

Partitioning and Grouping

  • downstream collectors – use when don’t want list back
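
A sketch of a downstream collector in both cases, counting instead of getting lists back. The sample words and method names are my own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionAndGroup {

    // partitioningBy always yields exactly two keys: true and false
    static Map<Boolean, Long> countByEvenLength(List<String> words) {
        return words.stream()
                .collect(Collectors.partitioningBy(
                        w -> w.length() % 2 == 0,
                        Collectors.counting())); // downstream: count instead of List<String>
    }

    // groupingBy yields one key per distinct classifier value
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        String::length,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("this", "is", "a", "list");
        System.out.println(countByEvenLength(words));
        System.out.println(countByLength(words));
    }
}
```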

Words

  • showed the /usr/share/dict/words example
  • need to use try with resources when use Files.lines
  • Comparator.comparingInt(..).reversed().thenComparing(..)
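
A sketch that combines the last two bullets: Files.lines inside try with resources, sorted longest first with ties broken alphabetically. The method name and the choice of natural order as the tie breaker are assumptions on my part. The main method only reads the dictionary file if it exists on the machine.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WordsDemo {

    static List<String> longestWords(Path dictionary, int howMany) {
        // Files.lines holds the file open, so the stream must be closed
        try (Stream<String> lines = Files.lines(dictionary)) {
            return lines
                    .sorted(Comparator.comparingInt(String::length)
                            .reversed()
                            .thenComparing(Comparator.<String>naturalOrder()))
                    .limit(howMany)
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dictionary = Paths.get(args.length > 0 ? args[0] : "/usr/share/dict/words");
        if (Files.exists(dictionary)) {
            longestWords(dictionary, 10).forEach(System.out::println);
        } else {
            System.out.println("No dictionary file at " + dictionary);
        }
    }
}
```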

Finally, showed demo of Antarctica time zone map. The South Pole follows New Zealand time. Which means it has daylight saving time despite getting 6 months of light vs 6 months of dark
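
A sketch to check the South Pole claim with java.time. I'm assuming the zone id Antarctica/South_Pole, which is the name the JDK's time zone data uses as far as I know; this is not the map demo from the talk.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

class SouthPoleTime {

    // UTC offset in the given zone at noon on the 15th of the given month
    static ZoneOffset offsetAt(String zoneId, int year, int month) {
        return LocalDateTime.of(year, month, 15, 12, 0)
                .atZone(ZoneId.of(zoneId))
                .getOffset();
    }

    public static void main(String[] args) {
        String southPole = "Antarctica/South_Pole"; // assumed zone id
        System.out.println("January: " + offsetAt(southPole, 2017, 1));
        System.out.println("July:    " + offsetAt(southPole, 2017, 7));
        System.out.println("Auckland January: " + offsetAt("Pacific/Auckland", 2017, 1));
    }
}
```

If the January and July offsets differ, the zone observes daylight saving time.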

My take: Everything in this talk is from the book which I’ve already read. But I’ve never seen Ken speak and wanted to. And it’s fun seeing things presented out loud. His humor while writing and out loud are similar which is good. I learned a few things in the comments like the IntelliJ debugging plugin.

JavaOne – 10 tips to become an awesome tech lead

“10 tips to become an awesome tech lead”

Speaker: Bart Blommaerts



Not one free seat in the room and long wait list line. [I wait listed and left the keynote 5 minutes early to ensure I got in]

Role of Tech Lead

  • Provide tech leadership
  • Protect team from interruptions
  • What new libraries/frameworks should we use? What are risks?
  • Coaching
  • Communication – bridge gap between devs and business

Do we need one

  • In ideal world, don’t need tech lead
  • Hard to do everything by consensus
  • Unlike the lego movie, everything is not always awesome
  • Business still needs point of contact
  • New people still need training

The ten tips

  1. Advocate for change – don’t want people to be afraid of prod, need to evolve. Try to make stupid processes better. OODA – observe/orient/decide/act
  2. Work through failure and success – prepare for failure, don’t finger point (“we”), take responsibility, learn from failure, problem if you are fixing the same bug twice – that is process failure. Celebrate success – sprint celebrations, feature complete, congratulate team/individuals
  3. Stay technical – write code, review code, tech vision (and get buy in; ideally with everyone contributing to it), evolution of code. Don’t forget about security/networking
  4. Time management – be available – spending time on tech design, talking to the business, project management (ex: help write user stories) and code
  5. Be a mentor for your team – facilitate discussion. Help team become stronger developers. Delegate. Optimize for the group/team
  6. Surround yourself with other tech leads – on a personal level, see what others do to get ideas and learn. On an organizational level, look at common org/architecture along with interoperability/dependencies
  7. Interviewing potential new team members – know your goals. For short term, look at tooling. For longer term, look at eagerness to learn. Don’t use stack overflow to find questions; pick things relevant to your project
  8. Embrace cultural differences – diversity, culture, family. Your users are different too.
  9. Estimating is hard – “Hofstadter’s Law: It always takes longer than you expect, even when you take into account Hofstadter’s Law.”. Planning poker. Uncertainty is normal. Add 20% for test/debug/polish/documentation/wtf moments.
  10. Interfacing with the outside world – want to be the go-to person, but not the single point of failure
  11. Facilitate an agile and awesome team

Other notes/tips

  • Not difficult for tech lead to make things worse
  • Need to experience same pain as everybody else on team – you are part of the team
  • Be realistic. Not possible to answer every question. Also, need to connect teammates to each other. Ok to request time to look into a question as long as you get back to them. Also ok to suggest pairing.
  • Not making a decision is worse than making the wrong decision because no work gets done
  • Interview style – comfort people (they are nervous; start with a simple question like “what is the difference between an interface and abstract class in Java”), offer options, build on responses like “what is the difference between interfaces/abstract classes in Java 8”, show interest, bonus question
  • Offshoring – work gets done while sleeping, communication harder. Prepare work for remote teammates since they can’t talk directly to business. Harder to make everyone feel like part of team. ex do a night shift “most developers work better at night” [I am so non standard here!] Keep history ex: Slack

He commented about people working nights the last couple sprints to ensure success on the project. [That sounds completely against the spirit of sprints being sustainable]

My take: Nice to have a good tech soft skills talk. I noticed that Bart referred to “technical lead” as “he” whenever he used a pronoun. I wonder if it’s like that in his native language. Google translate says both tech lead and developer use “le” (are masculine) in French. This was more obvious at the beginning; he switched to “you” after a bit. And he said he/she later on.