[2024 dev2next] distributed consensus algorithms

Speakers: Mykyta Protsenko, Alex Borysov

For more see the table of contents


Scenario: need 5 people to meet: two in Ukraine, one east coast US and two West coast US. Need to meet weekly

Consensus properties

  • Fault tolerance – not reliant on one person to see result
  • Safety – only one value chosen
  • Liveness – get to consensus in a finite amount of time; can take multiple iterations

2 phase commit

  • One person asks; each person says they can attend if everyone can. Everyone says yes.
  • Commit and now official.
  • Simple, but has downsides. When the group is big, there are a lot of acks. If one says no, the transaction is aborted and you start over. Also, everyone waits for the slowest node to reply.
  • If coordinator loses internet, everything blocked
  • Fails on fault tolerance and liveness
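
A minimal sketch of the two phases described above, using the meeting scenario. The names (Participant, can_attend, two_phase_commit) are made up for illustration and are not from the talk; timeouts and coordinator failure are deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Participant:
    name: str
    can_attend: Callable[[str], bool]  # hypothetical: how this person votes
    committed: Optional[str] = None    # the slot this person treats as official

    def prepare(self, proposal: str) -> bool:
        # Phase 1: vote yes or no on the proposal
        return self.can_attend(proposal)

    def commit(self, proposal: str) -> None:
        # Phase 2: the proposal becomes official
        self.committed = proposal


def two_phase_commit(participants: List[Participant], proposal: str) -> bool:
    # Phase 1: the coordinator asks everyone and waits for every reply
    votes = [p.prepare(proposal) for p in participants]
    if not all(votes):
        return False  # a single "no" aborts the whole transaction
    # Phase 2: everyone said yes, so tell everyone to commit
    for p in participants:
        p.commit(proposal)
    return True
```

The coordinator loop is the weak spot: if it stops between the two phases, nobody knows whether to commit, which matches the blocking problem in the notes.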

Paxos Protocol

  • Everyone has a ballot with unique ballot number
  • Propose a time with next ballot
  • Submit last vote null message to promise that will vote. Can be for or against.
  • Once majority promise to vote, sends actual begin ballot message
  • Then people actually vote
  • Consensus is majority. Sends message that reached consensus
  • Each participant must track last ballot tried, promise to vote and actual vote.
  • Choosing proposer and ballot don’t have to be the same person
  • Can only vote if confirmed a promise to vote in cluster
  • Learners can observe to be notified when consensus is reached
  • Fault tolerant because majority is enough
  • Safety because majority based
  • Doesn’t ensure liveness.
  • Can elect member as leader who can be the only one to propose
  • 2 round trips for consensus
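
A rough sketch of the promise-then-vote flow above for a single decision. This is a simplified single-decree version written for these notes, not the speakers' code; Acceptor, on_prepare, on_accept and propose are illustrative names, and networking, retries and learners are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                    # highest ballot number promised so far
    accepted_ballot: int = -1             # ballot of the last vote cast
    accepted_value: Optional[str] = None  # value of the last vote cast

    def on_prepare(self, ballot: int) -> Optional[Tuple[int, Optional[str]]]:
        # Promise to vote in this ballot unless a newer one was already promised
        if ballot <= self.promised:
            return None
        self.promised = ballot
        return (self.accepted_ballot, self.accepted_value)

    def on_accept(self, ballot: int, value: str) -> bool:
        # Vote only if no newer ballot has been promised in the meantime
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted_ballot = ballot
        self.accepted_value = value
        return True


def propose(acceptors: List[Acceptor], ballot: int, value: str) -> Optional[str]:
    majority = len(acceptors) // 2 + 1
    # Round trip 1: collect promises
    promises = [r for r in (a.on_prepare(ballot) for a in acceptors) if r is not None]
    if len(promises) < majority:
        return None
    # If anyone already voted, the proposer must reuse the most recent voted value
    _, previous_value = max(promises, key=lambda r: r[0])
    if previous_value is not None:
        value = previous_value
    # Round trip 2: the actual vote
    votes = sum(a.on_accept(ballot, value) for a in acceptors)
    return value if votes >= majority else None
```

Reusing the previously voted value is what keeps the safety property: a later ballot cannot choose a different time once a majority has voted.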

Cassandra

  • Uses Paxos
  • Need to know order of data – linearizable consistency
  • Don’t mix transaction types. ex: use if exists/if not exists consistently.
  • Lightweight transactions are faster than two phase commit
  • Incurs performance penalty by design because more Paxos interactions.

Raft

  • Two message types
  • Leader based. All other nodes are followers
  • No reelections. Leader stays as leader until disappears
  • Like Paxos, use increasing numbers
  • Every node starts as a follower. On term 1, followers notice there is no leader. One or more volunteer and increase the term number. Others vote on the leader. Only one vote per term, so can’t vote twice.
  • Once leader elected, followers send requests to leader who propagates
  • Log replication – can be applied (not final) or committed
  • Use commit index as tracker of what data was committed. Allows to see state
  • All followers have a heartbeat tracker. If leader disappears, the one who hasn’t heard from the leader in the longest time becomes a candidate to be new leader. If away and request leader, gets rejected because have one
  • If outside cluster and want to know status, asks leader
  • Fault tolerance – yes, leader or follower can drop
  • Safety – guarantees one choice. Also only commit data from term
  • Liveness – in practice yes, but in theory no
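
A small sketch of the "one vote per term" rule from the election notes above. RaftNode, on_request_vote and run_election are illustrative names; real Raft also compares logs before granting a vote and uses randomized timeouts, both of which this sketch skips.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RaftNode:
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None

    def on_request_vote(self, candidate_id: str, term: int) -> bool:
        if term < self.current_term:
            return False  # stale candidate, e.g. someone who was away
        if term > self.current_term:
            # A newer term starts with a fresh vote
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False  # already voted for someone else in this term


def run_election(candidate: RaftNode, peers: List[RaftNode]) -> bool:
    # The volunteer bumps the term and votes for itself
    candidate.current_term += 1
    candidate.voted_for = candidate.node_id
    votes = 1 + sum(
        p.on_request_vote(candidate.node_id, candidate.current_term) for p in peers
    )
    # Leader needs a majority of the whole cluster (peers plus itself)
    return votes > (len(peers) + 1) // 2
```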

Mongo DB

  • Uses Raft
  • If slow member is leader, there is a write bottleneck.
  • Can horizontally scale by replica set. Can hash keys so majority of requests aren’t all on one replica
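
To illustrate the key hashing idea, here is a sketch of spreading keys across replica sets by hash. It uses SHA-256 purely for illustration and is an assumption of these notes; it is not MongoDB's actual hashing or routing code, and shard_for is a made-up name.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    # Hash the key so that neighbouring keys land on different replica sets
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Sequential keys would all hit one range; hashed, they spread out
keys = [f"user-{i:05d}" for i in range(10_000)]
load = Counter(shard_for(k, 4) for k in keys)
```

Each replica set still has a single leader taking writes, so the hashing only helps by giving each leader a smaller share of the traffic.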

Accord

  • New algorithm; not widely available
  • Leader based protocols create bottleneck
  • Fast and slow paths
  • If can get majority with fast path, can tell slower nodes later; even async
  • A node must be part of all fast paths majority so can share with others when back online
  • Fast path should be 3/4 of nodes to guarantee someone has latest state
  • Slow path remains as simple majority
  • ACID
  • Reorder buffer to reset transactions to be in order based on time differentials
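
The quorum sizes above can be checked with a little arithmetic. This only illustrates the 3/4 and simple-majority figures from the notes; the real protocol's quorum rules are more nuanced, and the function names are made up.

```python
def slow_quorum(nodes: int) -> int:
    # Simple majority
    return nodes // 2 + 1


def fast_quorum(nodes: int) -> int:
    # Three quarters of the nodes, rounded up (integer ceiling division)
    return -(-3 * nodes // 4)


def min_overlap(quorum_a: int, quorum_b: int, nodes: int) -> int:
    # Fewest nodes two quorums of these sizes must share (pigeonhole)
    return max(0, quorum_a + quorum_b - nodes)
```

With five nodes the fast path needs four replies and the slow path three, so any slow-path majority shares at least two nodes with any fast-path quorum. That overlap is why someone always has the latest state.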

My take

I knew what two phase commit was. Everything else was new to me. Excellent start to the morning! The five people voting made it easier to follow. The reasons for them disappearing (Ukrainian soldier, Californian losing power) also helped pay attention. (Left a few minutes early to answer a phone call)

[2024 dev2next] Breaking AI

Speaker: Micah Silverman (@afitnerd)

For more see the table of contents


Notes

  • ChatGPT took 2 months to get to 100 million global monthly active users. By contrast, TikTok took 9 months, Uber 70 months, Instagram 30 months
  • Hot trend, but also people found utility in it.

App Security

  • Getting hard.
  • Code growing faster and apps getting more complex

Common uses in dev

  • Adding comments
  • Summarizing Code
  • Writing “readme”
  • Refactoring code
  • Providing templates
  • Pair programming
  • Generating code – the new stack overflow

Stats and studies

  • 92% using AI coding tools
  • 57% completed tasks faster (not necessarily better)
  • 27% more likely to finish task
  • 40% of Copilot code contained vulnerabilities
  • More likely to believe wrote more secure code, but wrote less secure code. Because believed was more secure, didn’t look hard.

AI code

  • Like junior dev just out of bootcamp. Need to check that it works and is secure
  • Example hallucinations. Change over time. Over a few months, went from 98% on math to 2% on math. OpenAI fixed basic math. Designed to be good prediction engines, not math
  • “chatGPT is confidently wrong” – Eelko de Vos

AI Coding

  • Asked for an Express app to take name in request param and returns a website showing name
  • All LLMs tried had XSS/injection
  • If questioned or asked to create a secure Express app, would get a sanitized one. Level of sanitization varies.
  • Showed Snyk Advisor – gives health score on libraries – ex: sanitizer. Need to check recommended libraries
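
The demo was an Express app; the same problem can be sketched in Python with only the standard library. The function names are made up for illustration, and html.escape stands in for whatever sanitizer a real app would choose.

```python
import html


def greeting_page_unsafe(name: str) -> str:
    # The request parameter goes straight into the markup: XSS
    return f"<html><body><h1>Hello, {name}!</h1></body></html>"


def greeting_page_escaped(name: str) -> str:
    # Escape the parameter so the browser shows it as text instead of running it
    return f"<html><body><h1>Hello, {html.escape(name)}!</h1></body></html>"


payload = "<script>alert(1)</script>"
unsafe_page = greeting_page_unsafe(payload)
escaped_page = greeting_page_escaped(payload)
```

Escaping only covers HTML body context; attributes, URLs and inline scripts need their own handling, which is one reason the level of sanitization varied between answers.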

Co-pilot

  • 40% code trained on is insecure
  • Used approach where prompt through comments (vs chat feature)
  • Used live templates to autocomplete prompt comments to save time for demo
  • Example with Spring boot and Thymeleaf. Copilot got that from context of project
  • Not quite right but made minor changes vs starting from scratch
  • Copyright was 2017; noted hadn’t done that before
  • Copilot tried to provide the next comment/prompt. Not what wanted, but reasonable
  • Snyk IDE extension – Detected SQL injection in view that looks like problems view
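
A sketch of the kind of SQL injection a scanner flags, next to the parameterized fix. It uses Python's sqlite3 with a made-up users table; the demo itself was a Spring Boot app, so this is an analogy rather than the code shown.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # String concatenation: the input becomes part of the SQL itself
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(conn: sqlite3.Connection, name: str) -> list:
    # Placeholder: the input is passed as data and never parsed as SQL
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


db = make_db()
payload = "x' OR '1'='1"
```

With the payload above, find_unsafe returns every row while find_safe returns none.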

Chat GPT

  • Had it do a security code review
  • Added HtmlUtils.htmlEscape(username) – context aware; knew using Spring Boot

My take

Micah said up front that he has no connection to Microsoft or IntelliJ and is just using their products. I never thought to give that disclaimer when I use tools. I’ll think about whether I want to when it isn’t almost 9pm. I am very much a morning person. In fact, that’s why I chose this talk. I thought it would require loading the least info into my mind to understand at this hour while still learning. The demo of copilot for building an app was fun with a good emphasis on security.

[2024 dev2next] Architecture.Next: 4 trends for architecture

Speakers: Mark Richards (markrichardssa) & Neal Ford (neal4d)

For more see the table of contents


Architectural Nexûs

  • Need to recognize intersections (I didn’t type “intersection” repeatedly, but they used the word many times)
  • How many architects to screw in a lightbulb? None; it’s an implementation detail
  • Implementation needs to be fault tolerant; not just architecture
  • Engineering practices need to be agile
  • Team topology makes hard to implement certain types of architectures
  • Often ignore data topologies and system architecture
  • “The enterprise” – processes, standards, frameworks from dept, division, enterprise enforcing for many reasons; usually good
  • Business environment – ex: cost cutting mode vs aggressively expanding, rate of change in business or marketplace. Software must be flexible enough to change as business changes to achieve goals. Cannot be bottleneck
  • Generative AI – can apply governance, find inefficiencies in architectures, etc. This is the fourth trend in this session
  • Architecture can’t live alone, which is why systems often just don’t work
  • Mechanical sympathy – use tool so works at its best. ex: bytecode so small, avoid context switching. On prem storage is expensive and CPU is cheap so use CPU to break up data. The opposite is true in the cloud. CPU is more expensive. Ex: we don’t question normalizing data, but new topologies don’t always follow them.

Automated Governance and Fitness Functions

  • We write tests every day. And if you don’t, watch out for Venkat 🙂
  • We do a good job testing functionality
  • How do you test structural integrity of architecture? Elasticity? Maintainability?
  • Fitness function – objectively evaluates an architectural characteristic
  • Operational fitness functions – availability, scalability, etc. Scalability problems manifest as responsiveness problems
  • Structural fitness functions – bridge is fine as cars drive over it. Until it’s not.
  • ArchUnit in Java, ArchUnitNET and NetArchTest for .NET, PyTestArch for Python, TSArch for JavaScript/TypeScript. Gen AI does a good job generating the tests
  • ArchUnit examples for structural integrity were the package dependency ones [these seem like the easiest ones to write]
  • Data fitness functions – ex: foreign key constraint across databases, checksum to ensure data consistency
  • Process fitness functions – ex: testability measured by error rates
  • https://blog.hello2morrow.com/2018/12/a-promising-new-metric-to-track-maintainability/
  • Architecture as code
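
A sketch of what a package-dependency fitness function checks, written with Python's standard ast module rather than any of the libraries listed above. The module names (shop.domain, shop.persistence) and helper names are invented for the example.

```python
import ast
from typing import Dict, List, Set


def imported_modules(source: str) -> Set[str]:
    # Collect every module name that the source imports
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


def in_package(module: str, package: str) -> bool:
    return module == package or module.startswith(package + ".")


def violations(sources: Dict[str, str], from_pkg: str, forbidden_pkg: str) -> List[str]:
    # Rule: nothing inside from_pkg may import anything inside forbidden_pkg
    found = []
    for module_name, source in sources.items():
        if not in_package(module_name, from_pkg):
            continue
        for imported in imported_modules(source):
            if in_package(imported, forbidden_pkg):
                found.append(f"{module_name} -> {imported}")
    return sorted(found)


sources = {
    "shop.domain.order": "from shop.persistence.db import save\nimport json\n",
    "shop.domain.price": "import decimal\n",
    "shop.persistence.db": "import sqlite3\n",
}
```

Run as a unit test in the build, a check like this fails the moment the domain code starts depending on persistence, which is the "objective evaluation" the notes describe.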

Aspect oriented architecture

  • Spring uses AOP as output
  • Hexagonal architecture – loose coupling to separate plumbing from domain stuff. Alistair Cockburn drew a hexagon when talking about it, but too late, it stuck. Almost got it right. Can’t treat database as a separate thing. Most people use a shorthand for separating domain and plumbing.
  • Don’t need hexagonal architecture anymore. [yet new book on it: Hexagonal Architecture Explained ]
  • Need data to be in context. Microservices preserve this boundary
  • Service mesh
  • Data mesh – operational vs analytical data. However, can’t build an analytics sidecar. Instead build cooperative quantum
  • Sidecar/mesh pattern – can build aspect oriented architecture.
  • Governance mesh. Ad hoc governance/fitness function all over. ex: logging, monitoring, circuit breaker
  • join point – governance mesh
  • pointcut – holistic capabilities like observability
  • advice – fitness functions
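
For readers new to the AOP vocabulary, here is an in-process analogy using a Python decorator. It is not a service mesh or sidecar, and the mapping to the governance mesh is a loose assumption of these notes: the decorated function is the join point, the choice of which functions get decorated plays the role of the pointcut, and the wrapper is the advice. All names are invented.

```python
import functools
import time

call_log = []  # stands in for a monitoring backend


def observed(func):
    # Advice: cross-cutting behavior wrapped around the business logic
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            call_log.append((func.__name__, time.perf_counter() - started))
    return wrapper


@observed  # applying the decorator selects this function as a join point
def place_order(item: str) -> str:
    # Domain code stays free of logging and timing plumbing
    return f"ordered {item}"


result = place_order("book")
```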

AI ∩ Architecture

  • Updating thoughtworks tech radar.
  • AI came up a lot, but only one is in adopt which requires maturity
  • Categories – AI assisted software dev, local inference, fine tuning, inference, cloud services, evals and guard rails, structured outputs, prompts, information retrieval, observability for LLM, building agents
  • Easy to make a talking dog. Hard to get it to talk right in prod. Also hard to get the talking dog to call an API
  • Vector database used for LLMs
  • Guard rails – how prevent from doing something shouldn’t
  • Eval – how well doing
  • More expensive LLM can validate results of cheaper LLM

How four trends related

  • Can use fitness functions to validate code generated by LLMs
  • Architecture as code and fitness functions describe intersections. Need executable, not diagrams
  • Once critical mass of fitness functions, have governance mesh
  • Systems are too large; can’t manually validate
  • When Log4J, people asked architects what in product and didn’t know [solution to that is a tool. don’t need governance mesh. I agree with the point, but not a fan of the example]

My take

I like that they covered a variety of topics while also getting into code for ArchUnit.