[javaone 2025] next level features of langchain4j

Posted on March 18, 2025 by Jeanne Boyarsky

Speakers: Lize Raes, Mohamed Abderrahman

See the table of contents for more posts

Concepts

SystemMessage – instructions
ContentRetriever – context
Tools – function calling
UserMessage – user to LLM
AiMessage – LLM to user
ChatMemory

5 Levels towards AGI

Can perform work of entire orgs of people
Can create new innovations
Can take actions on users behalf
Can solve basic problems like a PHD with tools
Current AI like ChatGPT that takes with humans

Options

LLM manages step transitions in state machine – can jump states when unexpected requests, flexibility, but risky
Code manages step transitions – any complexity possible, reliable, separation of concerns, tailored model size. However, not flexible. can’t deal with unexpected scenarios and more work to write.

RAG

Retrieval Augmented Generation
Fetch info relevant to request and send information to LLM
Advanced RAG features
Retrieval Augmentor in addition to retriever
LLM writes query
Adds info/context
Need to measure performance of model. Compare across models
MCP (Model Context Protocol)

Steps in code:

Create document content retriever – can limit scope. Ex: scientific literature
Create web search content retriever
Create SQL database content retriever

Guardrails and Moderation

Guardrails add limits. Ex: list examples of queries that shouldn’t be allowed
Moderation – checks if message violent, etc. Can use a different model for validations
LLMs are more sensitive to examples than instructions

Testing approaches

Test human evaluation (thumbs up/down)
AI assisted

Websites

swebench.com – closes github issues
llm-price.com – shows prices per token and per million tokens
JUnit Pioneer – test retry
Examples from session: https://github.com/LizeRaes/ai-drug-discovery

My take

Excellent examples. The real world scenario of diseases/antigens/antibodies was good. Good concepts and great demo. Showing Prometheus/Graphana was good as well.

[2024 dev2next] improving llm results with rag

Posted on October 2, 2024 by Jeanne Boyarsky

Speaker: Brian Sletten (bsletten@mastodon.social)

For more see the table of contents

PDF of deck on dropbox

Notes

Problem: info in our language but models insufficient to extract it
Important to capture sequences – ex: context window
Problems with word2vec and other embedding approaches. Sequences lost impact if got too long. “New York” and “Soviet Union” useful since near each other. Words farther apart are harder to predict
Next transformer architecture used levels of “attention” to have more detailed views between/across sentences
Encode in a lower dimensional space and decode into higher dimensional space
Positional encoding of words in sentences – picks up some nuance – some has quadratic calculations, but can parallelize so fast
Expensive to create a model. Still expensive but less so to tune it
Types of RAG: Naive, Advanced, Modular

Emergent behavior

Not magical/sentinence
Avoids need to have to retrain all the time
Use linguistic skills, not knowledge skills
Chain of thought prompting

Causes of hallucinations

No logic engine/no way to evaluate correctness
Language engine with schocastic element to avoid memorizing and encourage novelty

Example

Showed how can access web.
Allows you to summarize current news stories
Note: can include output format in prompt: As Json, As CSV, In bulleted list format, etc

Options

Basic model
Fine tuning – setting parameters
Prompt engineering

RAG

Allows getting a lot of custom data
Work with vector databases

Searching

Find portion of data. Then do kd tree and nearest neighbor search
Invevertible tree
Hierarchical Navigable Small Worlds (HNSW) – start in high dimensional space then detailed search
Like express to local train in a city
Can find docs that mention a keyword and then use those docs to answer questions
Want to minimize long contexts because costs lots of tokens
Chunking makes docs smaller so pay less for search – llama provides API to chunk

Limitations of Naive RAG Models

Issues with precision and recall: misaligned chunks, irrelevant or missing chunks
Can still hallucinate if no backed by the used chunks
Still have toxicity and bias problems

Chaining

Initial response
Constitutional principal – showed how to add ethics/legality and rewrites
Constitutional principal – added rewrite for 7th grader and rewrites
That gives final response

Security

Easy to poison data
Need data cleansing but cleverer
http://berryvilleiml.com – machine learning security

Reference: https://www.louisbouchard.ai/top-rag-techniques/

My take

I learned a bunch and the examples were fun. Good to see the code and output. My brain filled up during the session. I needed to switch to an easier talk for the final 5:30 session as I don’t have enough focus left. Cool how the answer to security was a different deck!

[2024 dev2next] customgpts

Posted on October 1, 2024 by Jeanne Boyarsky

Speaker: Ken Kousen @kenkousen

For more see the table of contents

Notes

Goal: Customize ChatGPT without coding
Useful for virtual assistants, automate repetitive tasks and shape AI behavior (within limits)
GPT Builder: create profile picture, specify leading questions, upload files (useful for own info or diff since gpt last trained), enable code interpreter, publish via a link or public “GPT Store”
The GPT Store is just a public link and search. There’s no money

chatgpt.com

Explore GPTs in left nav
Can search for or browse. Lots of available ones.
4-5 million custom GPTs. Bar to creating is very low
Ken made Pragmatic Assistant with rules from PDF and editor rules. Ran chapters through guide. [Janeice made a code version of this for our book; no AI though, long predated ChatGPT]
Like a skin on ChatGPT

Demo

CustomGPT with text from Venkat’s Agile book
Lets choose how communicate – formal, casual, pirate speak, shakespeare, etc [I’ve used chatpgt for western stuff for coderanch a few times]
Configure tab – description, instructions for GPT to use (like trying to sell the book), 4 generated conversation starters so users see some prompts.
Click on upload files to give training data (in this case Venkat’s book). Takes care of all the RAG steps. Limit on number and size of attachments. Was able to upload 8 books.
Give it capabilities like web browsing to use certain sites, DALLE-E, code interpreter
Instructions is misleading as have way more room than actually do.
Can keep private, sharing via the link or via the GPT Store
Gets Ken’s name wrong. Loses the “s”

Trying it out

Outputs in Markdown
Did try to sell book per instructions

Stats

10 files per GPT
Text, spreadsheets, presentations, images,. etc
2 millin tokens per file
20MB for images

“Security”

GPT may share file contents.
Files can be downloaded when Code Interpreter is enabled
[I experimented with Venbot: had it tell me the books available, the table of contents of one, the sections in ch 1 and the full text of two sections. Then I ran out of free tokens for a few hours. When I asked for all of ch 1 it gave me a little and prompted to read the book]
Only reference to copyright is in the docs of using free materials

Venbot

https://chatgpt.com/g/g-LsSBgJX2D-venbot-5000
Demo was fun
Cool that it figured out the weather from the location

Actions

GPTs allow you to define Actions – external API. Published via OpenAPI spec (what we used to call Swagger)
Doesn’t work that well. Easier to write code.
ActionsGPT can help generate
Many limitations: no customer headers, must be same domain (except google, microsoft and adobe oauth domains), 100K request/response, 45 second timeout, text only

Code version

The website was NoCode
Showed using langchain4j
Can write own logic and will identify. Langchain will call

Problem

Custom GPT has name on it
When wrong, looks like you did it

Competitors

Claude AI – can only share with teammates (via team subscription), limited number of resources. Two books fit. Can’t access internet.
NotebookLM from Google – can generate stuff like a read only FAQ based on Gemini, Can generate a study guide (short/long essay questions with answers). Also has audio overview. Meant to be a study tool from docs, websites, and youtube (transcripts)

My take

It was cool seeing a demo. Also, the interaction with Dave Thomas for Pragmatic prompts was fun. I played some with Venbot and some other GPTs while Ken was talking which was also fun. Especially getting it give me parts of the book.

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30

Down Home Country Coding With Scott Selikoff and Jeanne Boyarsky

Java/J2EE Software Development and Technology Discussion Blog

Tag Archives: llm

[javaone 2025] next level features of langchain4j

[2024 dev2next] improving llm results with rag

[2024 dev2next] customgpts

Share this:

Share this:

Share this: