[devnexus 2024] ai proof your career with software architecture

Speaker: Kelly Morrison

For more, see the 2024 DevNexus Blog Table of Contents


History

  • Fairly recent. GPT created in 2018. Number of parameters increasing exponentially
  • Microsoft Copilot released in 2021. Uses Codex, a specialized model based on GPT-3 for creating code. Trained on billions of lines of GitHub code and can learn from a local code base
  • Amazon released CodeWhisperer in 2022. Can generate code for 15 languages. Specialized for AWS Code Deployment

Basic Example

  • Asked ChatGPT to write a Java 17 Spring Boot REST API for stats in a MongoDB with JUnit 5 test cases for the most common cases
  • Looks impressive on first pass, but then you find problems
  • Hard-coded info
  • Used Lombok instead of Java 17 records (see the record sketch after this list)
  • Code doesn’t compile
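For context on the Lombok point above, here is a minimal sketch of what a Java 17 record replaces; the Stat type and its fields are hypothetical, not from the talk:

```java
// A Java 17 record is an immutable data carrier; the compiler generates the
// constructor, accessors, equals, hashCode, and toString.
public record Stat(String name, double value) { }

// Roughly equivalent Lombok version (the style ChatGPT generated instead):
// @lombok.Data
// @lombok.AllArgsConstructor
// public class Stat {
//     private String name;
//     private double value;
// }
```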

Complicated Example

  • Asked ChatGPT to write an entire enterprise app for selling over 10K crafts with a whole bunch of requirements like OpenID, Sarbanes-Oxley, etc.
  • Didn’t try. Instead came back with a list of things to consider in terms of requirements

What AI can/can’t do

  • Can do “Ground level” work.
  • Still need humans for large orchestration – ex: architects
  • Can do more yourself without junior devs
  • Garbage in, garbage out. Trained on public code in GitHub. Not all good/correct. Some obsolete.
  • Humans better at changing frameworks, working with CSS (does it look nice), major architectural changes, understanding impact of code when requirements change

Hallucinations

  • Doesn’t understand. Acts as a mime/mimic/parrot
  • If it can’t find an answer, will give an answer that looks like what you want even if made up. Example where it made up a kubectl option
  • Not enough training data on new languages/technologies. More hallucinations when less training data
  • Mojo created May 2023. Likely to get Python examples if ask for Mojo. However, it is a subset of Python with some extra things

Security Concerns

  • Learns from what you enter so can leak data
  • Almost impossible to remove something from an LLM. ex: passwords, intellectual property, trade secrets
  • Some companies forbid using these models or require anonymous, air-gapped use. Ask about something innocuous and translate it into what you actually want

Debugging

  • Can a human understand AI-generated code well enough to debug it?
  • GPT and Copilot can sometimes debug code, but have to worry about security

Pushback

  • Law – ChatGPT made up cases
  • Hollywood strike – copying old plots/scripts/characters
  • Unclear if generated output can be copyrighted. For now, not copyrightable but could change.
  • Some software is too important to risk hallucinations – ex: planes, cars (although Tesla is getting there), pacemakers, spacecraft, satellites
  • Lack of context – other software at the company, standards, reuse, why certain technologies are used, security
  • Lack of creativity – need to determine problem to solve or new approaches

What AI does well

  • Low level code gen (REST APIs, config, database access)
  • Code optimization
  • Greenfield development
  • Generating docs or tests
  • Basically the kind of tasks you hand off to a junior developer [I disagree that some of these are things you hand off]

Career Advice

  • Focus on architecture, not code
  • Don’t just learn a language or framework.
  • Learn which languages are best in different situations
  • Learn common idioms
  • Look at pricing, availability of libraries and programmers
  • Learn which architectures should be implemented in different languages
  • Learn how to create great prompts for code generation
  • Learn how to understand, follow, test, and debug AI generated code

Book recommendations

  • Building Evolutionary Architectures
  • Domain Driven Design
  • Fundamentals of Software Architecture
  • Head First Software Architecture

More skills

  • Types of architectures – layered, event-driven, microkernel, microservices, space-based, client/server, broker, peer-to-peer, etc.
  • Determine requirements – domain experts don’t know enough about software to specify. Can be the bridge between AI and domain experts

Mentoring junior developers

  • Teach how to write high-quality prompts.
  • Remind them to ask for security, test cases, docs, design patterns, OWASP checks
  • Show how to spot and deal with hallucinations
  • Help them understand and debug AI-written code
  • Help learn architecture by explaining why choices made
  • Ensure code reviews are held
  • Pre-commit git hooks to test code
  • Use AI to help generate unit tests

ArchUnit

  • archunit.org tests architecture.
  • Can add own architecture rules.
  • ex: never use Java Util Logging or Joda Time
  • ex: fields should be private/static/final
  • ex: no field injection
  • ex: which layers are allowed to call which
  • Can include a “because” reason for each rule (see the sketch after this list)
  • Ensures AI doesn’t sneak in something that goes against conventions
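A minimal sketch of what an ArchUnit test with a “because” reason can look like; the package names and rules below are placeholders, not the speaker’s actual rules:

```java
import com.tngtech.archunit.core.domain.JavaClasses;
import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.lang.ArchRule;
import org.junit.jupiter.api.Test;

import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;
import static com.tngtech.archunit.library.GeneralCodingRules.NO_CLASSES_SHOULD_USE_JAVA_UTIL_LOGGING;

class ArchitectureTest {

    // Import the production classes once for all rules
    private final JavaClasses classes =
            new ClassFileImporter().importPackages("com.example.app");

    @Test
    void servicesMayNotCallControllers() {
        ArchRule rule = noClasses()
                .that().resideInAPackage("..service..")
                .should().dependOnClassesThat().resideInAPackage("..controller..")
                .because("lower layers should not reach back up into the web layer");
        rule.check(classes);
    }

    @Test
    void noJavaUtilLogging() {
        // Built-in rule from ArchUnit's GeneralCodingRules
        NO_CLASSES_SHOULD_USE_JAVA_UTIL_LOGGING.check(classes);
    }
}
```

A rule like this fails the build with the “because” message, so AI-generated code that violates a convention gets caught automatically.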

My take

Good examples. I was worried about the omission of “where do senior devs come from”, but there were examples like changing frameworks, so it was not entirely ignored. Good examples from the ecosystem as well. Good list of skills to focus on.

[devnexus 2024] breaking ai: live coding and hacking apps with generative ai

Speaker: Micah Silverman

For more, see the 2024 DevNexus Blog Table of Contents


Changes/Trends

  • AI is not a silver bullet
  • Treat AI like a junior dev and verify everything produced
  • New iteration of copy/paste, like the code we got from Stack Overflow
  • We don’t do thorough code reviews of library code. Look at security, number of committers, when the last release was
  • Different because our name is now on the commit
  • More challenging to maintain visibility
  • Backlog ever growing

Common use in Dev

  • Adding comments
  • Summarizing code
  • Writing Readme
  • Refactoring code
  • Providing templates
  • Pair programming
  • Generating code (the new stack overflow)

Stats

  • 92% of software devs are using AI in some form
  • Those who use AI are 57% faster
  • Those who use AI are 27% more likely to complete a task
  • 40% of Copilot-generated code contains vulnerabilities. Same stat without AI, so this is what we trained it on
  • Those using AI wrote less secure code but believed it was more secure. We trust AI too much. Junior devs get more scrutiny than senior devs
  • [Some of these stats aren’t causation. Ex: early adopters]

Using AI well

  • Good starting point
  • Don’t use without review

Problems

  • Hallucinations – including supporting “evidence” even when wrong. Gave an addition example
  • ChatGPT got worse in math over a few months. 98% to 2%. Now some math specific AIs
  • “ChatGPT is confidently wrong” – Eelko de Vos
  • First defamation lawsuit – ChatGPT made up case law
  • AI doesn’t know when wrong

AI and Code

  • Asked for an Express app taking a name as a request parameter. Tried a bunch of times and the name parameter was never sanitized, so cross-site scripting vulnerabilities. Ideally wouldn’t auto-generate vulnerabilities
  • Can give bad advice – asked if code was safe from NoSQL injection. GPT and Bard said safe. Was not safe.
  • Samsung put all code in ChatGPT and leaked code, keys, trade secrets. Became part of training data. ChatGPT says got better about dealing with secrets.
  • Terms of service of ChatGPT say it can use anything as training data

Sample Conference App

  • Using CoPilot
  • Spring Boot app in IntelliJ

Basic example

  • Gave co-pilot comments/prompts to create code
  • Showed generating code to read a file that doesn’t exist due to a typo

JPA SQL injection example

  • Showed this is not the most common way to write JPA here in 2024; usually don’t need a direct query
  • Showed Copilot offers a prompt to get the result
  • Tried adding to the prompt to protect against SQL injection. Got a naive regex sanitizer
  • Then tried requesting named parameters in the query. After that, did prompts for setting the parameter and getting the result. All was well (see the sketch after this list)
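Not the demo’s exact code, but a minimal sketch of the difference the prompts were driving at, assuming a hypothetical Person entity:

```java
import jakarta.persistence.EntityManager;
import java.util.List;

// Person is a hypothetical JPA entity used only for illustration.
public class PersonRepository {

    private final EntityManager entityManager;

    public PersonRepository(EntityManager entityManager) {
        this.entityManager = entityManager;
    }

    // Vulnerable: concatenated user input can change the query itself (injection)
    public List<Person> findByNameUnsafe(String name) {
        return entityManager
                .createQuery("select p from Person p where p.name = '" + name + "'", Person.class)
                .getResultList();
    }

    // Safe: a named parameter is bound as data, never interpreted as query structure
    public List<Person> findByNameSafe(String name) {
        return entityManager
                .createQuery("select p from Person p where p.name = :name", Person.class)
                .setParameter("name", name)
                .getResultList();
    }
}
```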

Example

  • Tried getting the file, then with a file separator
  • Successfully got the constant defined in the file, so some context sensitivity. But ignored the file separator request from the prompt
  • Then requested saving the file and successfully generated code to write it
  • Requested getting a person and successfully called the already written getPerson() method
  • Then set the image name
  • Copilot offered to write a prompt to save the person.
  • Then requested adding the message and it added the attribute to the model
  • However, it has a path traversal issue
  • Showed Burp Suite monitoring as the example ran. “Send to repeater” keeps cookies and such while letting you alter the request.
  • Changed the file name to ../image/snyklogo.png. If it works, it will replace the logo with the uploaded pic
  • Showed the Snyk IDE extension, which also notes the path traversal issue
  • Tried asking to sanitize input against path traversal and got a check for two dots. Not good enough, but a first pass
  • Tried asking for a whitelist to protect against directory traversal. Checked the directory name prefix, which is better but also not a whitelist
  • Then tried requesting to validate that there is not a path traversal using the normalize method. Did what was requested, including the prefix check from the previous prompt, though Micah noted that carried over from what it did before (see the sketch after this list)
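A minimal sketch of the normalize-plus-prefix check the last prompt was aiming for, assuming a hypothetical uploads directory (not the exact code from the demo):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class UploadPathValidator {

    // Hypothetical base directory that uploaded files must stay inside
    private static final Path BASE_DIR = Paths.get("uploads").toAbsolutePath().normalize();

    // Resolve the requested name against the base directory, normalize away
    // any ".." segments, and reject anything that escapes the base directory.
    public static Path resolveSafely(String fileName) {
        Path resolved = BASE_DIR.resolve(fileName).normalize();
        if (!resolved.startsWith(BASE_DIR)) {
            throw new IllegalArgumentException("Path traversal attempt: " + fileName);
        }
        return resolved;
    }
}
```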

What can do

  • Use a tool to scan code and tell you when it makes a mistake
  • Learn. Ex: Snyk has a lesson on prompt injection – getting AI to tell info it shouldn’t

Future

  • Currently have math-aware AIs. Maybe will have security-aware AIs in the future

My take

Good mix of background and live code. I like that Micah didn’t assume security knowledge while keeping it engaging for people who are familiar. I had never seen Burp Suite, so that was a happy bonus in the demo.

[devnexus 2024] knowledge management for the technically inclined

Speaker: Jacqui Read

fosstodon.org/@tekiegirl

For more, see the 2024 DevNexus Blog Table of Contents


Book: Communication Patterns

Knowledge Management

  • McKinsey used the phrase internally in 1987
  • not just putting info on the wiki
  • can’t simply buy it
  • apps can help but small part of what doing

Examples

  • Naming a repo – convey what it is without going in and looking
  • Integrating one app with another – edges are where discover and exchange knowledge
  • Lessons learned – learn from past mistakes and build on successes
  • Inventory – catalog
  • Dashboards – static or dynamic
  • Documents – files
  • Expertise locator – how find info in peoples heads
  • Policies and procedures
  • Wikis and articles
  • Forms and templates
  • Databases
  • Meetings and workshops – generate a lot of knowledge. Not always recorded

General

  • “Knowledge management is the process of capturing, distributing and effectively using knowledge” – Tom Davenport
  • No org is a vacuum
  • Relationships change over time and knowledge gets lost
  • If written down, softens the blow
  • Competing forces between finishing tech stuff vs getting things done
  • Without capturing, knowledge gets lost and there is no organizational learning. Makes a stagnant org and competitors can overtake
  • Fortune 500 companies lose at least $31.5 billion a year due to lost info
  • Companies with better knowledge management did better during the pandemic

Remote first

  • Enable doing best work wherever you are. Not a bolt-on to office work.
  • If anyone is remote, everyone joins on their own device
  • Value output over time spent
  • Emphasis on async communication
  • Better continuity for time zones, transit strikes, snowstorms, people going out to see the eclipse
  • Improved productivity because valued for output and happier. Not trying to look like you’re working
  • Better documentation due to async communications

Sync vs Async

  • Async communication – no expectation of reading/responding as soon as received
  • This talk is synchronous for people in room and async for people watching on video later
  • Can capture, publish and use info sync or async

Glossaries

  • Centralize so not looking through multiple or guessing where to put
  • Federated for maintenance – anyone can add
  • Partitioned by domain – different definitions for different parts of business. Define scope
  • Cross reference for simplicity – don’t duplicate

Products over Projects

  • Mindset change
  • Other projects can reference
  • Reuse
  • Long term view

Inventories

  • Catalog assets
  • Explicit knowledge – easy to articulate and write down. Think about what it is, structure, format, etc.
  • Implicit knowledge – harder to write down. Think about who knows. Is it tacit knowledge (ex: leadership/riding a bike)?
  • Put tacit knowledge in expertise locator
  • Make more of the implicit knowledge explicit

Personal Knowledge Management (PKM)

  • Encourage people to share what know
  • Bottom up info sharing
  • Can boost career to share knowledge
  • Can be rabbit hole

Automate knowledge management

  • Documentation as code. Not just Markdown. Could be diagrams as code, JSON, AsciiDoc, etc.
  • Optionally review for accuracy. Especially if publishing publicly
  • Automated review for syntax, spelling, links, etc
  • Convert to a useful format like PDF or a website
  • Decouple data from presentation

Knowledge Management as Code

  • ex: Swagger API docs
  • ex: contract testing
  • ex: Pact dashboard showing the last time each API was tested

Hive Mind

  • Optimize knowledge so available at right time
  • Without knowledge management, have high cognitive load and chaos
  • With knowledge management, reduce cognitive load
  • With hive mind, reduce cognitive load as much as possible
  • Hierarchy: wisdom, knowledge, info, data

AI

  • Garbage in, garbage out

Collaborative Knowledge Management

  • Big picture event storming – find boundaries where people disagree. Generates lots of options
  • Domain storytelling – create diagrams with actors and processes. Focus on one way. If have another version, create a new diagram
  • Bytesize Architecture Session – can use for mix of business and technical. Start with session goal. Then everyone starts independently at same time so not drowned out by loudest voice in room. Then find consensus
  • 6-page memo – as done at Amazon. Doc created before the meeting. Everyone reads at the same time (to fence time for reading), then discusses. Downside is that people have thoughts after the meeting. Doc needs to be collaborative as well
  • Architecture Decision Records – not just for architecture. Include title, status, context, evaluation criteria, options, decision, implications, consultation. Avoids the risk of changing something without understanding why the decision was made. Avoids rework of investigating the same thing repeatedly. New people on the team can read why the decision was made
  • Business Decision Record – same idea as ADR, but for other things. Ex: why buy/choose a product, hiring, strategy

Key takeaways

  • Software has the BBOM (big ball of mud) problem. Probably also have a Documentation BBOM.
  • Think about it as a wall of ivy instead. Have info hidden in there
  • Good knowledge management is vital for building and understanding software
  • Collaborate to collect and record knowledge. Get more perspectives. Break down silos
  • Involve as many minds as possible
  • Elicit the implicit knowledge so you don’t miss it
  • Knowledge management supports better decision making. Need a decision support system
  • Engineer knowledge as much as engineer software
  • Own your knowledge. If don’t know what have, can’t use it

My take

Excellent start to the morning. It was fun seeing how different parts of KM interact. Plus learned some new techniques.