“Streams in Java 8 – The good, the bad & the ugly”
Speaker: Simon Ritter & Stuart Marks
[Simon has the Twitter handle @speakjava; very cool]
For more blog posts from JavaOne, see the table of contents
Need to think differently. We are used to imperative programming with loops and variables.
Dealing with exceptions
ugly code – three lines of code and hard to tell what it does
Problems:
- looks like Perl
- returns null (vs Optional or empty string)
- split is called twice so wasted work
- skipped URLDecoder.decode() because didn’t want to deal with a checked exception, but that lost functionality. The problem is caused by a missing API in Java, so have to use decode and handle the exception.
Better approach:
- use a method with a try/catch block; call that method from the stream
- use Map.entry to simulate a tuple
- use a single char (vs regex) in split. If you pass only one character to split, it is far faster
- split() is overloaded to take a numeric limit to how many are returned
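
[My sketch of the "better approach" bullets, not the code from the slides. The class name QueryParams and the methods decode, toEntry and find are made up for illustration. It assumes Java 9+ for Map.entry.]

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper class; names are illustrative only.
class QueryParams {
    // Wraps the checked exception so the method can be called from a stream.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Split once, on a single character, with a limit of 2 so a value
    // containing '=' stays intact. Map.entry stands in for a tuple.
    static Map.Entry<String, String> toEntry(String pair) {
        String[] kv = pair.split("=", 2);
        return Map.entry(decode(kv[0]), kv.length > 1 ? decode(kv[1]) : "");
    }

    // Returns Optional instead of null when the key is missing.
    static Optional<String> find(String query, String key) {
        return Stream.of(query.split("&"))
                .map(QueryParams::toEntry)
                .filter(e -> e.getKey().equals(key))
                .map(Map.Entry::getValue)
                .findFirst();
    }
}
```

Calling QueryParams.find("a=1&b=hello%20world", "b") gives an Optional holding the decoded value; a missing key gives Optional.empty().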
Imperative streams
inside the forEach is a print, an if statement and a LongAdder variable (good for frequent writes and infrequent reads)
then refactored to use mapToInt, a println and an if statement and a local variable. more complicated and still not functional
then switched to peek and no variable but still an if statement (well a ternary)
finally switched to use a filter and count instead of sum
still not 100% functional because println is a side effect. ok for debugging
[good showing evolution to get functional]
Problems
- Easy to misuse forEach() because feels familiar. But easily leads to side effects
- Imperative thinking “for each of these I want to..”
- Pause to consider whether you should use forEach at all
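
[Roughly where the evolution ended up, as I understood it. The data and the condition are my own placeholders; the slides used a different example. CountLongWords and countLongerThan are made-up names.]

```java
import java.util.List;

// Hypothetical example; the word list and length check are placeholders.
class CountLongWords {
    static long countLongerThan(List<String> words, int minLength) {
        return words.stream()
                // peek is a side effect, acceptable for debugging only
                .peek(w -> System.out.println("saw " + w))
                // the if statement becomes a filter
                .filter(w -> w.length() > minLength)
                // no LongAdder or local variable needed
                .count();
    }
}
```

No mutable counter is left; the only remaining side effect is the debugging println.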
Mixing internal and external iteration
for loop running 12 times and then getting data for each month with filter checking Month.of(x) – doesn’t work because x isn’t effectively final
“solve” effectively final by setting to different interim variable
IntStream.range(0,12).forEach – uses internal iteration but still forEach. Marginally better as don’t need interim variable
Instead return a nested map of Month to Map
Problems
- Going through data.stream() 12 times
- forEach cheat
- array not right data structure; it’s really a map of month to value
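
[A sketch of the single-pass version using groupingBy. The Sale class and its fields are invented for illustration; I didn't capture the real data type or what the inner map held in the talk.]

```java
import java.time.LocalDate;
import java.time.Month;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example; Sale and its fields are assumptions.
class MonthlyTotals {
    static final class Sale {
        final LocalDate date;
        final String product;
        final int amount;

        Sale(LocalDate date, String product, int amount) {
            this.date = date;
            this.product = product;
            this.amount = amount;
        }
    }

    // One pass over the data: Month -> (product -> total amount).
    static Map<Month, Map<String, Integer>> totals(List<Sale> data) {
        return data.stream()
                .collect(Collectors.groupingBy(
                        s -> s.date.getMonth(),
                        Collectors.groupingBy(
                                s -> s.product,
                                Collectors.summingInt(s -> s.amount))));
    }
}
```

Months with no data simply have no key in the outer map, which is one reason a map fits better than a 12-element array.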
Hands on lab question
reduce (“”, (a,b) -> a+b) – works but inefficient because String concatenation
reduce((a,b) -> sb.append(b)) – fails because ignores the first letter.
next attempt uses an if statement in reduce
then tried a custom collector. works but more complicated than necessary
Collector.of(StringBuilder::new, StringBuilder::append, StringBuilder::append, StringBuilder::toString)
or just use Collectors.joining()
Problems
- If a lambda doesn’t use one of its parameters, it is probably wrong
- Side effects
- if statement version is not associative, so would fail when run in parallel
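
[The last two options side by side, as I'd write them. JoinLetters and the method names are mine; the input list is a placeholder for the lab's data.]

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical wrapper class for the two approaches from the notes.
class JoinLetters {
    // Custom collector: works, but more machinery than needed.
    static String withCustomCollector(List<String> letters) {
        return letters.stream().collect(Collector.of(
                StringBuilder::new,       // supplier: new container
                StringBuilder::append,    // accumulator: add one element
                StringBuilder::append,    // combiner: merge two containers
                StringBuilder::toString)); // finisher: produce the result
    }

    // The built-in collector does the same job.
    static String withJoining(List<String> letters) {
        return letters.stream().collect(Collectors.joining());
    }
}
```

Both avoid the repeated String concatenation of the reduce version and don't depend on external state.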
Misc
- can’t use same stream multiple times
- method references are slightly more efficient than lambdas because lambda gets added into a method in bytecode. Saves a level of indirection by using method reference. But only slightly
- Calling .sorted() multiple times vs chaining comparing.thenComparing – the latter is better [also works because preserves sort :)]
- parallel streams do more work. might or might not complete faster. uses fork-join pool. number of threads defaults to number of CPUs. In Java 9, this is # CPUs for container. On Java 8, it was for physical machine
- Nested parallel streams are a bad idea because they use the same threads, so performance is worse. Can create ForkJoinPool if must. Buyer beware; this is an implementation specific behavior and tied to the profile of the machine you write it for.
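
[For the sorting bullet, a small sketch of one chained comparator instead of several .sorted() calls. The Person class is invented for illustration.]

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example; Person is an assumption, not from the talk.
class SortPeople {
    static final class Person {
        private final String first;
        private final String last;

        Person(String first, String last) {
            this.first = first;
            this.last = last;
        }

        String getFirst() { return first; }
        String getLast() { return last; }
    }

    // Single sort: by last name, ties broken by first name.
    static List<String> sortedNames(List<Person> people) {
        return people.stream()
                .sorted(Comparator.comparing(Person::getLast)
                        .thenComparing(Person::getFirst))
                .map(p -> p.getFirst() + " " + p.getLast())
                .collect(Collectors.toList());
    }
}
```

One comparator states the whole ordering in one place, rather than relying on the order of separate sort passes.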
My take: Fun start to the morning. I like that they covered both common and less common things in an entertaining way. Something to learn for everyone!