Moving Java Forward Faster – live blog at Oracle Code

Title: Moving Java Forward Faster
Speakers: Donald Smith

See my live blog table of contents from Oracle Cloud

New Java Release Model

  • “no more limos; think trains”
  • From Java 2 to Java 8, had a target release cadence of every two years. Slipped a bunch of times (once to 5 years)
  • From Java 2 to Java 8, did update releases roughly every 6 months (not counting security releases). These update releases had new APIs or functionality. Ex: 8u20, 8u40, 8u60
  • Java 9 was released in September 2017 and is already at end of life
  • Tried to make the case that Java 10 is really 9.1, Java 12 is really 10.1, Java 13 is really 10.2, and so forth. [I don’t think this analogy holds. Talking to Mark Reinhold suggests that they aren’t trying to make major changes specifically for 11.]
  • Didn’t use 9.1, 10.1, etc. because a major version number is needed to make a spec change.
  • Carve up changes across releases instead of a major release every few years.
  • Every six months is now a feature release and can potentially change the spec.
  • Enterprises do not like 6 month releases so every 3 years is LTS release.
  • Java 11 is 18.9 LTS, so the LTS still uses the yy.m naming convention; Java 17 will be the 21.9 LTS. Running java --version on the Oracle JDK gives a vendor string containing both Java 11 and 18.9 (see the version-inspection sketch after this list).
  • Java 11 (18.9 LTS) will be supported for about 5 years plus 3 years of extended support.
  • LTS releases will be more stable because the people who don’t care about LTS are on the interim releases.
  • Java 6 support ends in late 2018. Java 7 support ends in 2022 and Java 8 ends in 2025. Java 8 might be supported longer. TBD.
  • Oracle will be producing OpenJDK binaries rather than it being a third-party thing. OpenJDK binaries are GPL licensed. They will only be supported for 6 months, even for Java 11, 17, etc.
  • Open sourcing the commercial features that are part of the JDK – Mission Control, Flight Recorder, application class data sharing, Java usage tracker (see how many JREs are used in a system).
  • Separately packaged tools will stay commercial
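
If you want to check which feature release you are on, the Runtime.Version API (added in Java 9; feature() arrived in Java 10) exposes the new scheme. A minimal sketch:

    // inspecting the new version scheme at runtime (Java 10+ for feature())
    class VersionDemo {
        public static void main(String[] args) {
            Runtime.Version v = Runtime.version();
            System.out.println(v.feature()); // feature release number, e.g. 11 on Java 11
            System.out.println(v);           // full version string, something like 11.0.2
        }
    }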

What’s new in Java 9

  • Last “major” release. 100+ features
  • No longer using word “major” because releases are frequent and incremental.
  • Jigsaw gives a smaller attack surface by including fewer modules.
  • sun.misc.Unsafe – [the usual, so not writing it up]
  • AOT compiler – for application as well. Not immediate, but coming.
  • jshell – live demo (a sample session is sketched after this list)
  • G1 is now default garbage collector
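
I didn’t capture the demo itself, but a typical jshell session looks something like this:

    $ jshell
    jshell> int x = 40 + 2
    x ==> 42
    jshell> List.of("a", "b").size()
    $2 ==> 2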

Long term goal of jlink

  • Horror stories where people need a dozen versions of JREs because apps don’t work with various patches.
  • Long term goal – shift thinking of how to package apps, from a standalone JRE to shipping a JRE with the app itself (for client-side apps for users)
  • Gut: This makes the problem worse
  • jlink lets you create a custom runtime optimized for your program, without all the modules (see the sketch after this list).
  • Get smaller package with just what need.
  • [security implications are interesting; have to patch each app but apps far less likely to contain vulnerability]
  • jlink also requires packaging per target hardware/OS
  • Goal is shifting to jlink because browsers and operating systems are heading away from one common Java install for Windows and Mac. Worried that one day an OS update will block a separate JRE.
  • [I asked why it isn’t a problem for developers if the OS blocks a common Java. He said maybe a configuration or a developer build. So it’s a power users vs end users problem]
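
For reference, a minimal jlink invocation might look like the following (the module and launcher names are my own hypothetical example, not from the talk):

    jlink --module-path $JAVA_HOME/jmods:mods \
          --add-modules com.example.app \
          --launcher app=com.example.app/com.example.Main \
          --output myapp-runtime

    myapp-runtime/bin/app    # runs the app on its own trimmed-down runtime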

What’s new in Java 10

  • First feature release (vs major release)
  • Type inference – var x = … (an example of a feature that couldn’t be in an update release since it changes the Java spec; see the sketch after this list)
  • G1 garbage collector uses multiple threads
  • 12 JEPs (Java Enhancement Proposals) targeted.
  • Open source root certificates. Can connect to many TLS servers out of the box. Vs OpenJDK for RHEL which assumes you have Firefox installed.
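
A minimal sketch of the new local-variable type inference:

    import java.util.ArrayList;

    class VarDemo {
        public static void main(String[] args) {
            var list = new ArrayList<String>();  // inferred as ArrayList<String>
            list.add("hello");
            for (var s : list) {                 // var works in for loops too
                System.out.println(s.length()); // compiler knows s is a String
            }
        }
    }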

What’s new in Java 11

  • 4 JEPs already targeted. Waiting until ready to target
  • Removing some APIs

Future – version not yet known

  • Optimize for data; not just code. So big data libraries don’t need to call native code
  • Project Panama – interoperate with native libraries (better JNI)
  • Project Loom – lightweight threads
  • Project Valhalla – value types. Maybe Java 12?

My take

Excellent session. I thought I understood the release model and still learned some nuances! Glad he spent more than half the session on this topic. And I hadn’t realized the long term implications for Jigsaw/jlink.

Getting Started with Hadoop, Spark, Hive and Kafka – live blog from Oracle Code

Title: Getting Started with Hadoop, Spark, Hive and Kafka

Speakers: Edelweiss Kammermann

See my live blog table of contents from Oracle Cloud

Nice beginning with picture of Uruguay and a map

Big data

  • Volume – Lots of data
  • Variety – Many different data formats
  • Velocity – Data created/consumed quickly
  • Veracity – Know data is accurate
  • Value – Data has intrinsic value; but have to find it

Hadoop

  • Manage huge volumes of data
  • Parallel processing
  • Highly scalable
  • HDFS: Hadoop Distributed File System, for storing info
  • MapReduce – for processing data; the programming model inside Hadoop (word-count sketch after this list)
  • Writes data into fixed size blocks
  • NameNode – like an index; the central entry point
  • DataNode – store data. Send data to next DataNode and so on until done.
  • Fault tolerant – can survive node failure (each DataNode sends a heartbeat to the NameNode every 3 seconds; assumed dead after 10 minutes), communication failure (DataNodes send acks), and data corruption (DataNodes send a block report of good blocks to the NameNode)
  • Can have second NameNode for active/standby config. DataNodes report to both.
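
To make the MapReduce model concrete, here is the classic word-count sketch in Java (class names and paths are my own illustration, not from the talk):

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
        // map step: emit (word, 1) for every word in the input split
        public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer tokens = new StringTokenizer(value.toString());
                while (tokens.hasMoreTokens()) {
                    word.set(tokens.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // reduce step: sum the 1s emitted for each word
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // input dir in HDFS
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // output dir in HDFS
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }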

Hive

  • Analyze and query HDFS data to find patterns
  • Structure the data into tables so you can write SQL-like queries – HiveQL
  • HiveQL has multitable insert and a cluster by clause (examples after this list)
  • HiveQL has high latency and lacks a query cache
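
For illustration (table and column names are mine, not the speaker’s), a multitable insert and a CLUSTER BY query in HiveQL:

    -- read the source table once, write to two tables
    FROM logs
    INSERT OVERWRITE TABLE errors   SELECT * WHERE level = 'ERROR'
    INSERT OVERWRITE TABLE warnings SELECT * WHERE level = 'WARN';

    -- CLUSTER BY distributes and sorts rows by the same column
    SELECT user_id, page FROM visits CLUSTER BY user_id;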

Spark

  • Can write in Java, Scala, Python or R
  • Fast in-memory data processing engine
  • Supports SQL, streaming data, machine learning and graph processing
  • Can run standalone, on Hadoop or on Apache Mesos
  • Much faster than MapReduce. How much faster depends on whether the data can fit into memory
  • Includes packages for core, streaming, SQL, MLlib and GraphX
  • RDD (resilient distributed dataset) – immutable programming abstraction of a collection of objects; can be split across clusters. Can create from text file, SQL, NoSQL, etc. (see the sketch after this list)
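
A small RDD sketch in Java (the file path is hypothetical): transformations build new immutable RDDs, and nothing runs until an action like count():

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class RddDemo {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("RddDemo").setMaster("local[*]");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                JavaRDD<String> lines = sc.textFile("hdfs:///logs/app.log");      // RDD from a text file
                JavaRDD<String> errors = lines.filter(l -> l.contains("ERROR")); // transformation: new RDD
                System.out.println(errors.count());                              // action: triggers the work
            }
        }
    }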

Kafka

  • Integrate data from different sources as input/output
  • Producer/consumer pattern (called source and sink)
  • Incoming messages are stored in topics
  • Topics are identified by unique names and split into partitions (for redundancy and parallelism)
  • Partitions are ordered, and each message in a partition has an id called the offset
  • Brokers are Kafka servers in a cluster. Recommended to have three
  • Define a replication factor for data. 2 or 3 is common
  • Producers choose which acks they need to receive – none, from the leader, or from all replicas (see the producer sketch after this list)
  • Consumers read data from a topic. They read in order within a partition, but in parallel across partitions.
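
A minimal Kafka producer sketch in Java (the broker address and topic name are hypothetical), showing the acks setting mentioned above:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ProducerDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // any broker in the cluster
            props.put("acks", "all"); // "0" = none, "1" = leader only, "all" = all replicas
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // messages with the same key go to the same partition, so they stay ordered
                producer.send(new ProducerRecord<>("page-visits", "user42", "viewed /home"));
            }
        }
    }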

My take

Good simplified intro for a bunch of topics. It was good seeing how things fit together. The audience asked what sounded like detailed questions. I would have liked if they held that for the end.

Intro to Docker Containers – live blog at Oracle Code

Title: Intro to Docker Containers
Speakers: Mike Raab

See my live blog table of contents from Oracle Cloud


History of containers

  • ex: UNIX containers, Solaris Zones, VMware
  • Docker as a product and company made containerization easy

Use cases

  • Ready-to-run application stacks – setting up a cluster can take a few days even if you know what you’re doing. Preparing Docker takes a few minutes once you have it configured.
  • New development/microservices
  • One-time run jobs – the data dies with the container by default, so good if you don’t need it.
  • Front end app servers
  • Server density – Portable – can run same container anywhere

Architecture/Nomenclature

  • A VM has an entire OS, app, dependencies, binaries, etc. A container includes just the app and its dependencies.
  • Docker client – CLI for interfacing with Docker
  • Dockerfile – text file of docker instructions to assemble image
  • Image – a hierarchy of files; a collection of files and metadata, organized in layers. Built by the docker build command.
  • Container – a running instance of an image, started with the docker run command.
  • The end user doesn’t know whether you are using a container; it’s purely an implementation detail.
  • Registry – image repository. DockerHub is largest repo.
  • Docker engine – container execution and admin. Uses Linux kernel namespaces to give isolated workspaces. Docker for Windows exists but uses different images and is not popular. 99.9%+ Linux

Commands inside a Dockerfile

  • FROM x – the base/parent image
  • COPY x y – copy x to y
  • RUN c – run UNIX command c
  • ENTRYPOINT [“a”] – run command a at startup
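
Putting those four instructions together, a minimal Dockerfile might look like this (the image and file names are my own illustration):

    # base/parent image
    FROM openjdk:8-jre
    # copy the application jar into the image
    COPY target/app.jar /app.jar
    # run a UNIX command at build time
    RUN useradd -r appuser
    # command to run when the container starts
    ENTRYPOINT ["java", "-jar", "/app.jar"]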

Docker commands

  • docker build -t image . – build an image from the Dockerfile in the current directory
  • docker tag image user/image
  • docker push username/image
  • docker pull username/image
  • docker run – pull the image (if needed) and run it in a container
  • docker logs
  • docker ps – running docker containers
  • docker ps -a – all docker containers whether running or not
  • docker images – list all images
  • docker rm
  • docker tag
  • docker login – login to registry
  • docker push/pull
  • docker inspect – config metadata
  • docker-compose up -d – run multi container Docker applications
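
A typical build-and-run sequence with these commands (the names are hypothetical):

    docker build -t myname/myapp .             # build the image from the Dockerfile here
    docker run -d --name myapp1 myname/myapp   # start a container from the image
    docker logs myapp1                         # check the container's output
    docker ps                                  # see it running
    docker push myname/myapp                   # publish the image to the registry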

Why docker is hot

  • developers love it
  • fast to spin up environment
  • open source
  • code agility, CI/CD pipeline, devops
  • portability
  • managed kubernetes – running it is hard. use managed/cloud environments – oracle commercial time 🙂

My take

Good intro/review. I’m giving a session right after this one and wanted to get ready. So this was a good session for me. Not brand new content, but I still got something out of it.