Exploring Collectors
Speakers: Venkat Subramaniam
For more blog posts, see The Oracle Code One table of contents
General
- Common operations: filter, map, reduce
- Filter – like a coin sorter. Will let some values through and discard others
- Reduce – go from a stream to a non-stream value
- Collect is a reduce operation
- Functions should be pure – a pure function doesn’t change anything *and* doesn’t depend on anything that could change
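To make the filter/map/reduce idea concrete, here is a minimal sketch of my own (not code from the talk). The class and method names are made up for illustration. Each lambda is pure: it only looks at its own argument and changes nothing outside the pipeline.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class name; a sketch of filter, map, reduce and collect.
class StreamBasicsSketch {

    // filter lets even numbers through, map doubles them,
    // reduce goes from a stream to a single non-stream value.
    static int sumOfDoubledEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // pure: depends only on n
                .map(n -> n * 2)           // pure: returns a new value, mutates nothing
                .reduce(0, Integer::sum);
    }

    // collect is also a reduce operation: here the result is a List.
    static List<Integer> doubledEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfDoubledEvens(numbers)); // prints 24
        System.out.println(doubledEvens(numbers));      // prints [4, 8, 12]
    }
}
```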
Collectors – concepts
- Don’t call add() in a forEach. Use a Collector instead
- Can’t parallelize code when you have shared mutability (ex: add() in forEach)
- Can’t say “the code worked”. Can say “the code behaved”
- If you use a thread-safe (concurrent) list to make add() in forEach behave, it’s a ticking time bomb for when someone changes the list type.
- Should write collect(Collectors.toList()) or collect(toList()). This already handles concurrency when run on a parallel stream
- Venkat prefers using Collectors as a static import and just calling toList()
- Collectors are recursive data structures. The second parameter of a collector such as groupingBy() can be another Collector
- Often need to chain collectors to do what you want.
- Ok to write code “the long way” and then refactor once you have passing tests
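Here is a small sketch of my own contrasting the two styles from these notes. The class and method names are hypothetical, not from the talk. The first method shares a mutable list with the lambda, so it only behaves on a sequential stream. The second lets the collector build the list, which is safe on a parallel stream too.

```java
import java.util.ArrayList;
import java.util.List;

import static java.util.stream.Collectors.toList;

// Hypothetical class name; illustrates shared mutability vs. a Collector.
class SharedMutabilitySketch {

    // The style to avoid: add() in forEach mutates a list declared outside
    // the pipeline. It behaves on a sequential stream, but ArrayList is not
    // thread-safe, so switching to parallelStream() could lose elements.
    static List<String> upperCaseWithForEach(List<String> names) {
        List<String> result = new ArrayList<>();
        names.stream()
                .map(String::toUpperCase)
                .forEach(result::add);   // shared mutability
        return result;
    }

    // The preferred style: no shared state, so the same pipeline is safe
    // to run in parallel. The collector handles the merging.
    static List<String> upperCaseWithCollector(List<String> names) {
        return names.parallelStream()
                .map(String::toUpperCase)
                .collect(toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("Sara", "Bob", "Paula");
        System.out.println(upperCaseWithForEach(names));
        System.out.println(upperCaseWithCollector(names));
    }
}
```

Both calls give the same result here, which is the point of "the code behaved": the first version only holds up as long as nobody makes the stream parallel.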
Collectors – Code
- Java 10+: toUnmodifiableList() – collects into an unmodifiable (immutable) list
- partitioningBy – when you need both the matching and non-matching results. Avoids needing two passes over the data to get the result.
- joining(", ") – comma-separated String
- groupingBy(Person::getName) – creates a map with the name as key and a list of Person objects as value. Conceptualize as buckets. Put items in a bucket by key
- groupingBy(Person::getName, mapping(Person::getAge, toList())) – map after group. Performs the mapping right before throwing the data into the bucket.
- groupingBy(Person::getName, counting()) – value is the number of elements with that key (a Long)
- groupingBy(Person::getName, collectingAndThen(counting(), Long::intValue)) – transforms the result of a collector to a different type (here Long to Integer)
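The collectors listed above can be tried out with a sketch like the one below. This is my own illustration, not the code from the session. The Person class, the sample data, the method names and the age cutoff of 18 are all assumptions made up for the example.

```java
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.*;

// Hypothetical class; Person, the sample data and the cutoff of 18 are assumptions.
class CollectorExamplesSketch {

    static final class Person {
        private final String name;
        private final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    static List<Person> samplePeople() {
        return List.of(new Person("Sara", 20), new Person("Sara", 22),
                       new Person("Bob", 15), new Person("Paula", 32));
    }

    // Java 10+: the returned list rejects add() and remove().
    static List<String> names(List<Person> people) {
        return people.stream().map(Person::getName).collect(toUnmodifiableList());
    }

    // One pass over the data: true holds the matches, false holds the rest.
    static Map<Boolean, List<Person>> adultsAndMinors(List<Person> people) {
        return people.stream().collect(partitioningBy(p -> p.getAge() >= 18));
    }

    static String joinedNames(List<Person> people) {
        return people.stream().map(Person::getName).collect(joining(", "));
    }

    // Buckets keyed by name; each bucket is a list of Person objects.
    static Map<String, List<Person>> byName(List<Person> people) {
        return people.stream().collect(groupingBy(Person::getName));
    }

    // The second parameter is another collector: map to age, then collect to a list.
    static Map<String, List<Integer>> agesByName(List<Person> people) {
        return people.stream()
                .collect(groupingBy(Person::getName, mapping(Person::getAge, toList())));
    }

    // counting() produces a Long per bucket.
    static Map<String, Long> countByName(List<Person> people) {
        return people.stream().collect(groupingBy(Person::getName, counting()));
    }

    // collectingAndThen converts each Long count to an Integer.
    static Map<String, Integer> intCountByName(List<Person> people) {
        return people.stream()
                .collect(groupingBy(Person::getName,
                        collectingAndThen(counting(), Long::intValue)));
    }

    public static void main(String[] args) {
        List<Person> people = samplePeople();
        System.out.println(joinedNames(people));                      // prints Sara, Sara, Bob, Paula
        System.out.println(adultsAndMinors(people).get(false).size()); // prints 1
        System.out.println(agesByName(people).get("Sara"));           // prints [20, 22]
        System.out.println(intCountByName(people).get("Sara"));       // prints 2
    }
}
```

The map values are looked up by key in main because groupingBy() makes no promise about the iteration order of the map it returns.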
My take
I like that Venkat talked about how to write code “the long way” to explain the power of collectors. This was a good review. And good motivation as we update our OCP book (I have the streams chapter). I like the bucket analogy for groupingBy(). I didn’t know about collectingAndThen(). The 45 minutes flew!