results driven development and mobile – aaron glazer – qcon

This is part of my live blogging from QCon 2015. See my QCon table of contents for other posts.

Building a mobile app is like building a Formula One car. Someone else creates the rules. People care about how you perform, not the internals.

Results driven means working with all areas (sales, marketing, tech, etc) to achieve a common goal.

Data on its own isn’t useful. It needs focus.

Clarity through simplicity. Simplicity alone is not enough.

A typical analytics graph shows dips/peaks over time, but you don’t know why. Was a feature released then? Did new copy change your ranking in Google? Did features have a delayed effect? Were externalities driving the result? Did features have any effect?

Instead do A/B testing to focus on causation instead of correlation.

In a physical store, 75% of users pull out their phone, and 25% of those buy online rather than in the store they are standing in.

One day after downloading, 15% of users still use an app. After a month, only 2% do. This includes paid and unpaid apps.

On StubHub, you see 400 words on desktop, 30 words on mobile, and 5 words on an Apple Watch. Target has the same scale: 500/50/7. On a smaller device, each word is worth more.

A/B testing is more important on mobile because there is less opportunity to hook the user.

A/B Testing Walkthrough
Know your goal.
Set up distribution: 50% baseline, 50% variation (see the sketch below).
Segmentation: only show the test to users in the target audience.
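Not something he showed, but a rough sketch of what that assignment step could look like. The isInTargetAudience rule and the user ids are made-up placeholders; the point is the segmentation check plus a deterministic 50/50 split.

```java
public class AbTestAssignment {

    enum Variant { BASELINE, VARIATION, NOT_IN_TEST }

    // Hypothetical segmentation check: only users in the target audience see the test.
    static boolean isInTargetAudience(String userId) {
        return userId.startsWith("us-");   // placeholder rule
    }

    // Deterministic 50/50 split: hashing the user id keeps a user in the same
    // bucket across sessions, which a plain random draw would not.
    static Variant assign(String userId) {
        if (!isInTargetAudience(userId)) {
            return Variant.NOT_IN_TEST;
        }
        int bucket = Math.floorMod(userId.hashCode(), 100);
        return bucket < 50 ? Variant.BASELINE : Variant.VARIATION;
    }

    public static void main(String[] args) {
        System.out.println(assign("us-12345"));  // baseline or variation
        System.out.println(assign("ca-67890"));  // not in test
    }
}
```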

Results Driven Development

  • Everyone must work together – Isolating the mobile team is bad. The engineering team controls the app, but isn’t accountable for user retention and other business goals. In results driven development, have a cross-functional team. Center the team around the checkout flow, not the platform.
  • Get the right tool for the job
  • Ensure accountability is directed properly
  • Data gives you information, but you need a goal. Results give you answers.
  • Choose a contextual business metric. Hypothesize/test/improve.

Q & A

  • How do you do A/B testing on mobile? You can build multiple apps within an app and toggle between them (see the sketch after this list). You can use Taplytics (a company) to change things dynamically.
  • There were two other questions, but they dried up fast
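A tiny sketch of the “multiple apps within an app” idea: both variants ship in the binary and a toggle picks one at runtime. The names here are made up and this is not Taplytics’ API.

```java
public class CheckoutToggle {

    interface CheckoutFlow {
        void show();
    }

    static class OnePageCheckout implements CheckoutFlow {
        public void show() { System.out.println("one-page checkout"); }
    }

    static class MultiStepCheckout implements CheckoutFlow {
        public void show() { System.out.println("multi-step checkout"); }
    }

    // In practice the variant value would come from a remote config / A/B testing
    // service, so it can be changed without shipping a new build.
    static CheckoutFlow forVariant(String variant) {
        return "variation".equals(variant) ? new MultiStepCheckout() : new OnePageCheckout();
    }

    public static void main(String[] args) {
        forVariant("baseline").show();
        forVariant("variation").show();
    }
}
```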

Impressions: The stats were interesting. I feel like I’ve heard most of the remaining info before.

java mini talks at qcon

This is part of my live blogging from QCon 2015. See my QCon table of contents for other posts.

This session is four 10 minute talks.

Deterministic testing in a non-deterministic world

  • determinism – everything has a cause and effect
  • pseudorandom – an algorithm that generates approximately random numbers

You should seed LocalDateTime with a Clock instance to reproduce results in a program. There is a fixed clock so all operations in the program see the exact same time. This is similar to using a fixed seed for generating random numbers.
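A minimal sketch of the idea (the instant and seed values are just placeholders):

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Random;

public class DeterministicTimeExample {
    public static void main(String[] args) {
        // Fixed clock: every call to now(clock) returns the same instant,
        // which makes time-dependent code reproducible in tests.
        Clock fixed = Clock.fixed(Instant.parse("2015-06-10T12:00:00Z"), ZoneOffset.UTC);
        LocalDateTime now = LocalDateTime.now(fixed);
        System.out.println(now); // always 2015-06-10T12:00

        // Same idea for randomness: a fixed seed makes the sequence repeatable.
        Random random = new Random(42L);
        System.out.println(random.nextInt(100));
    }
}
```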

Hash spreads and probe functions: a choice of performance and consistency
Primitive collections are faster and use less memory than boxed implementations, which use 56 bytes for each Integer.

  • hash spread – a function that destroys simple patterns in the input data while preserving maximal info about the input. The goal is to avoid collisions without spending too much time hashing.
  • hash probe – a function that determines the order of slots to try in the array. For example, a linear probe moves down one slot on a collision. A quadratic probe moves further down on each collision. (See the sketch after this list.)
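A rough sketch (not from the talk) of a hash spread plus a linear probe, as used in an open-addressing table backed by a plain int[] array:

```java
public class OpenAddressingSketch {
    private final int[] keys;
    private final boolean[] used;

    public OpenAddressingSketch(int capacity) {
        keys = new int[capacity];     // capacity is assumed to be a power of two
        used = new boolean[capacity];
    }

    // Hash spread: mix the bits so that simple input patterns (e.g. sequential
    // keys) don't all land in the same slots. Same idea as java.util.HashMap.
    private static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Linear probe: on a collision, step to the next slot until a free one is
    // found. (Assumes the table never completely fills up.)
    public void put(int key) {
        int mask = keys.length - 1;
        int slot = spread(key) & mask;
        while (used[slot]) {
            slot = (slot + 1) & mask;
        }
        keys[slot] = key;
        used[slot] = true;
    }

    public static void main(String[] args) {
        OpenAddressingSketch table = new OpenAddressingSketch(16);
        table.put(1);
        table.put(17);   // lands on the same slot as 1, so it probes to the next one
    }
}
```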

Typesafe config on steroids
Property files are hard to scale. Apache Commons adds typing, but still limits you to the property file format and limits composition. Spring helps with scaling property files.

Typesafe Config – a library used by Play and Akka. It is a standalone project without dependencies, so you can use it in plain Java. It uses a JSON-like format called HOCON (Human-Optimized Config Object Notation).
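A minimal sketch of what using the library looks like. The keys and values are made up; normally ConfigFactory.load() reads application.conf from the classpath, but parsing a HOCON string keeps the example self-contained.

```java
import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;

public class ConfigExample {
    public static void main(String[] args) {
        // HOCON allows comments, unquoted strings, and nested blocks.
        Config config = ConfigFactory.parseString(
                "app {\n" +
                "  name = demo\n" +
                "  threads = 4   // typed value, no quoting needed\n" +
                "}");

        System.out.println(config.getString("app.name"));  // demo
        System.out.println(config.getInt("app.threads"));  // 4
    }
}
```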

Scopes – a library built on top of Typesafe Config

Real time distributed event driven computing at Credit Suisse
Credit Suisse produced their own language that they call “data algebra”. It looks like a DSL.

java 8 stream performance – maurice naftalin – qcon

This is part of my live blogging from QCon 2015. See my QCon table of contents for other posts.

See http://www.lambdafaq.org

Background
He started with background on streams. (This is old news by now, but still taking some notes). The goals were to bring a functional style to Java and “explicit but unobtrusive” hardware parallelism. The former is more important than performance.

The intention is to replace loops with aggregate operations. [I like that he picked an example that required three operations and not an oversimplified example]. More concise/readable. Easy to change to parallelize.
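Not his example, but a small sketch of the kind of loop-to-pipeline rewrite he means, with several operations chained:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class AggregateOperationsExample {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("alpha", "Beta", "gamma", "gamma", "delta");

        // Loop version: filter, transform, and de-duplicate by hand.
        List<String> loopResult = new ArrayList<>();
        for (String w : words) {
            if (w.length() > 4) {
                String upper = w.toUpperCase();
                if (!loopResult.contains(upper)) {
                    loopResult.add(upper);
                }
            }
        }

        // Stream version: the same operations as an aggregate pipeline.
        List<String> streamResult = words.stream()
                .filter(w -> w.length() > 4)
                .map(String::toUpperCase)
                .distinct()
                .collect(Collectors.toList());

        System.out.println(loopResult);
        System.out.println(streamResult);
    }
}
```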

Reduction == terminal operation == sink

Performance Notes
The free lunch is over. Chips don’t magically get faster over time. Instead, they add cores. The goal of parallel streams is to run the intermediate operations in parallel and then bring the results together in a reduction.
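A minimal sketch (not from the talk) of that shape: the intermediate operations can run on multiple cores, and the reduction combines the partial results.

```java
import java.util.stream.LongStream;

public class ParallelReductionExample {
    public static void main(String[] args) {
        // map() runs in parallel on the common fork/join pool;
        // sum() is the reduction that merges the per-core partial sums.
        long sumOfSquares = LongStream.rangeClosed(1, 1_000_000)
                .parallel()
                .map(n -> n * n)
                .sum();
        System.out.println(sumOfSquares);
    }
}
```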

What to measure?

  • We want to know how code changes affect system performance in prod. That’s not feasible though, because it would need a controlled experiment in prod conditions. Instead, we do a controlled experiment in lab conditions and hope we aren’t just answering a simplified question.
  • It’s hard to microbenchmark because of inaccuracy, garbage collection, optimization over time, etc. There are benchmarking libraries – Caliper or JMH (see the sketch after this list). [or better, don’t microbenchmark if you don’t need to]
  • Don’t optimize code if you don’t have a problem. What’s your performance requirement? [and is it the bottleneck?]. Similarly, don’t optimize the OS if the problem lies somewhere else.
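Not something he showed, but a skeleton of what a JMH microbenchmark looks like. The class, data, and method bodies are made-up placeholders; JMH itself handles warmup, iteration counts, and dead-code elimination, which are the usual sources of microbenchmark inaccuracy.

```java
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class StreamSumBenchmark {

    private int[] data;

    @Setup
    public void setup() {
        data = IntStream.range(0, 100_000).toArray();
    }

    @Benchmark
    public long sequentialSum() {
        return IntStream.of(data).asLongStream().sum();
    }

    @Benchmark
    public long parallelSum() {
        return IntStream.of(data).asLongStream().parallel().sum();
    }
}
```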

Case study
This was a live demo. First we saw that not using a BufferedReader makes a file slow to read. [not about streams]. Then we watched as his JMeter didn’t work on the first try. [the danger of a live demo]. Then he showed how messing with the GC size and making it too small is bad for performance as well [still not on streams]. He is trying to show the process of performance tuning overall. Which is valid info. Just not what I expected this session to be about.
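For reference, the unbuffered vs. buffered read looks roughly like this (the file name is a placeholder, not from the demo):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadSpeedExample {
    public static void main(String[] args) throws IOException {
        // Unbuffered: each read() can hit the underlying file, which is slow.
        try (FileReader reader = new FileReader("data.txt")) {
            long chars = 0;
            while (reader.read() != -1) {
                chars++;
            }
            System.out.println("chars: " + chars);
        }

        // Buffered: reads the file in large chunks, far fewer system calls.
        try (BufferedReader reader = new BufferedReader(new FileReader("data.txt"))) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            System.out.println("lines: " + lines);
        }
    }
}
```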

Then [after I didn’t see the stream logic being a problem in the first place], he showed how to solve subproblems and merge them. [oddly not calling it map reduce]

8 minutes before the end of the talk, we finally see the non-parallel code for the case study. It’s interesting code because it uses two terminal operations and two streams. At least reading in the file is done normally. Finally, we see that the combiner is O(n), which prevents speeding it up.

Some rules

  • The workload of the intermediate operations must be great enough to outweigh the overheads. Often quoted as the size of the data set * the processing cost per element.
  • sorted() is worse
  • Collectors cost extra. toMap() is slow because merging maps is slow. toList() and toSet() are dominated by the accumulator.
  • In the real world, the fork/join pool doesn’t operate in isolation

My impressions: A large amount of this presentation wasn’t about stream performance. Then the case study shows that reading without a BufferedReader is slow. [no kidding]. I feel like the example was contrived and we “learned” that poorly written code behaves poorly. I was hoping the talk would actually be about parallelization. When parallelStream() saves time and when it doesn’t, for example. What I learned was that for this particular scenario, parallelization wasn’t helpful. And then right at the end, the generic rules. Which felt rushed and thrown at us.