Mark Price @epickrram
For other QCon blog posts, see QCon live blog table of contents
Requirements
- Trading app
- Need microsecond (not millisecond) response time
- Need data in memory vs database
- Lock free programming
- Redundancy
- High volume
- Predictable latency
Hydra
- System built on OSS
- Opinionated framework to accelerate app dev
- Clients communicate with stateless, scalable gateways
- Persistors – manage data in memory.
- Gateway – converts large text message to something smaller and more efficient
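To make the gateway bullet concrete, here is a minimal sketch of turning a large text message into a compact fixed-width binary one. The talk did not show Hydra's actual wire format, so the JSON input, the field names, and the 17-byte layout below are all assumptions for illustration.

```python
import json
import struct

# Hypothetical layout (not Hydra's real format): instrument id (uint32),
# side (1 byte), price in ticks (int64), quantity (uint32).
# "<" means little-endian with no padding, so each message is 17 bytes.
ORDER_LAYOUT = struct.Struct("<IcqI")


def encode_order(text_message: str) -> bytes:
    """Convert a JSON order from a client into the compact binary form."""
    order = json.loads(text_message)
    side = b"B" if order["side"] == "buy" else b"S"
    return ORDER_LAYOUT.pack(
        order["instrument_id"], side, order["price_ticks"], order["quantity"]
    )


def decode_order(payload: bytes) -> dict:
    """Inverse of encode_order, as an internal service might read it."""
    instrument_id, side, price_ticks, quantity = ORDER_LAYOUT.unpack(payload)
    return {
        "instrument_id": instrument_id,
        "side": "buy" if side == b"B" else "sell",
        "price_ticks": price_ticks,
        "quantity": quantity,
    }
```

The binary form is always 17 bytes, several times smaller than the equivalent JSON text, and a fixed layout means downstream services can read fields without parsing.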
Design choices
- Replay logs to reapply changes. Business logic must be fully deterministic. Bounded recovery times
- Placement group in cloud – machines guaranteed to be near each other. Minimizes latency between nodes
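The replay-log idea can be sketched as follows. This is only an illustration: the `Balances` state, the `(account, delta)` event shape, and the use of a snapshot to keep recovery time bounded are assumptions, not details from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Balances:
    """Hypothetical in-memory state rebuilt from the log."""
    by_account: dict = field(default_factory=dict)

    def apply(self, event):
        # Deterministic: the result depends only on the current state and
        # the event (no clocks, random numbers, or external lookups).
        account, delta = event
        self.by_account[account] = self.by_account.get(account, 0) + delta


def replay(journal, snapshot=None, snapshot_index=0):
    """Rebuild state by reapplying logged events.

    Starting from a snapshot taken after `snapshot_index` events means only
    the tail of the journal is replayed, which bounds recovery time.
    """
    state = Balances(dict(snapshot or {}))
    for event in journal[snapshot_index:]:
        state.apply(event)
    return state
```

Because `apply` is deterministic, replaying the full journal and replaying only the tail after a snapshot produce the same state, which is what makes the log safe to use for recovery.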
Testing latency
- Do as part of CD pipeline
- Can’t physically monitor with a fiber tap
- Capture in histogram to get statistical view and calculate data
- Test under load
- Fan out the locations the tests run from
- Store percentiles as time series data
- Can see jitter from garbage collection
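Here is a small sketch of how a latency histogram gives percentiles. It is not the tooling from the talk: the `LatencyHistogram` class, the 10 microsecond bucket width, and the sample values are made up for illustration.

```python
import math


class LatencyHistogram:
    """Fixed-width bucket histogram of latencies in microseconds."""

    def __init__(self, bucket_width_us=10, max_us=10_000):
        self.bucket_width_us = bucket_width_us
        self.counts = [0] * (max_us // bucket_width_us + 1)
        self.total = 0

    def record(self, latency_us):
        # Values above max_us are clamped into the last bucket.
        index = min(latency_us // self.bucket_width_us, len(self.counts) - 1)
        self.counts[index] += 1
        self.total += 1

    def percentile(self, pct):
        """Upper edge of the bucket containing the pct-th percentile."""
        if self.total == 0:
            raise ValueError("no samples recorded")
        target = max(1, math.ceil(self.total * pct / 100))
        seen = 0
        for index, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                return (index + 1) * self.bucket_width_us
        return len(self.counts) * self.bucket_width_us


# 99 fast requests plus one slow outlier, such as a garbage collection pause.
histogram = LatencyHistogram()
for _ in range(99):
    histogram.record(50)
histogram.record(5000)
```

With these samples the 50th and 99th percentiles both fall in the 50 to 60 microsecond bucket, while the 100th percentile lands in the bucket containing the 5000 microsecond outlier. An average would hide that outlier, which is why the percentiles are what get stored over time.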
Performance on shared box/cloud
- Not in control of resources running on
- Containers share L3 cache so can see higher rates of cache miss
- CPU throttling effects
- Hard to measure since can’t see what neighbors are doing
- One option is to rent the largest box available and compare it to the specs on the vendor's website. If you have the maximum number of cores, you know you have the box to yourself. This is expensive. The price quoted was about five dollars a year. At that price, it might be worth just buying your own machine in a data center
- Can pack non latency services onto shared machines
<missed some at the end. I got an email that distracted me>
My impressions
There was a lot of discussion about the histogram. I would have liked to see some examples rather than just talking about how it is calculated. They didn’t have to be real examples to be useful. There were some interesting facts and it was a good case study so I’m glad I went. I was glad he addressed that non-cloud is a possible option for this scenario