RAG Applications
Build AI applications grounded in your enterprise data. RAG reduces hallucinations by retrieving real documents before generating responses.
Retrieval-Augmented Generation (RAG) combines the power of LLMs with your proprietary knowledge base. Instead of relying only on the model's training data (which may be outdated or lack your specific domain knowledge), RAG fetches relevant documents and injects them into the prompt, so the model can produce accurate, up-to-date answers that cite their sources.
This tutorial covers practical RAG applications you can build with Spring AI: from customer support bots that know your product documentation, to legal assistants that search through contracts, to codebase Q&A systems that help developers navigate complex repositories.
Why RAG?
Accurate Answers
Responses grounded in actual documents, not model imagination. Cite exact sources.
Always Current
Update your knowledge base anytime. No expensive model retraining required.
Domain Expertise
Turn generic LLMs into specialists using your proprietary data and documents.
RAG Architecture
Indexing Phase (Offline)
Query Phase (Runtime)
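The two phases can be illustrated without any framework. The following is a minimal sketch, not Spring AI code: the class `RagPipelineSketch`, its methods, and the word-count "embedding" are all hypothetical stand-ins chosen so the example runs on the JDK alone. In a real application the indexing phase would call an embedding model and write to a vector store, and the query phase would send the assembled prompt to an LLM.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy two-phase RAG pipeline; word counts stand in for a real embedding model. */
class RagPipelineSketch {

    /** One indexed chunk: where it came from, its text, and its vector. */
    record IndexedChunk(String source, String text, Map<String, Integer> vector) {}

    private final List<IndexedChunk> index = new ArrayList<>();

    /** Hypothetical embedding: lowercase word counts (a real system calls an embedding model). */
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Cosine similarity between two sparse count vectors; 0 if either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    /** Indexing phase (offline): embed each chunk once and store it. */
    void add(String source, String text) {
        index.add(new IndexedChunk(source, text, embed(text)));
    }

    /** Query phase, step 1: embed the question and return the topK most similar chunks. */
    List<IndexedChunk> retrieve(String question, int topK) {
        Map<String, Integer> q = embed(question);
        return index.stream()
                .sorted(Comparator.comparingDouble((IndexedChunk c) -> cosine(q, c.vector())).reversed())
                .limit(topK)
                .toList();
    }

    /** Query phase, step 2: inject the retrieved chunks into the prompt sent to the LLM. */
    String buildPrompt(String question, int topK) {
        StringBuilder context = new StringBuilder();
        for (IndexedChunk c : retrieve(question, topK)) {
            context.append("Source: ").append(c.source()).append('\n')
                   .append(c.text()).append("\n\n");
        }
        return "Context:\n" + context + "Question: " + question;
    }

    public static void main(String[] args) {
        RagPipelineSketch rag = new RagPipelineSketch();
        rag.add("vacation.md", "Employees accrue vacation days every month.");
        rag.add("vpn.md", "Connect to the VPN before opening the wiki.");
        System.out.println(rag.buildPrompt("How many vacation days do employees get?", 1));
    }
}
```

The point of the split is cost: embedding the knowledge base happens once per document at indexing time, while each query only embeds the question and runs a similarity search.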
Application 1: Knowledge Base Assistant
Internal Documentation Q&A
Help employees find answers from company docs, wikis, and policies
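    import java.util.List;
    import java.util.stream.Collectors;

    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.ai.document.Document;
    import org.springframework.ai.vectorstore.SearchRequest;
    import org.springframework.ai.vectorstore.VectorStore;
    import org.springframework.stereotype.Service;

    // Note: SearchRequest.query(...).withTopK(...) and Document.getContent() follow the
    // Spring AI milestone-style API. More recent releases may expect
    // SearchRequest.builder() and Document.getText() instead; check your version.
    @Service
    public class KnowledgeBaseService {

        private final VectorStore vectorStore;
        private final ChatClient chatClient;

        public KnowledgeBaseService(VectorStore vectorStore, ChatClient.Builder builder) {
            this.vectorStore = vectorStore;
            this.chatClient = builder
                .defaultSystem("""
                    You are a helpful assistant for company documentation.
                    Answer questions based ONLY on the provided context.
                    If the context doesn't contain the answer, say "I don't have
                    information about that in our documentation."
                    Always cite which document you found the answer in.
                    """)
                .build();
        }

        public KBResponse answer(String question) {
            // Retrieve relevant documents
            List<Document> docs = vectorStore.similaritySearch(
                SearchRequest.query(question).withTopK(5));

            // Build context from retrieved docs
            String context = docs.stream()
                .map(d -> "Source: " + d.getMetadata().get("source") + "\n" + d.getContent())
                .collect(Collectors.joining("\n\n---\n\n"));

            // Generate answer
            String answer = chatClient.prompt()
                .user("""
                    Context:
                    %s

                    Question: %s
                    """.formatted(context, question))
                .call()
                .content();

            // Collect the distinct source names for citation
            List<String> sources = docs.stream()
                .map(d -> (String) d.getMetadata().get("source"))
                .distinct()
                .toList();

            return new KBResponse(answer, sources);
        }
    }

    record KBResponse(String answer, List<String> sources) {}

Application 2: Customer Support Bot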
Ticket Deflection with RAG
Answer common questions using product docs and FAQs
    @Service
    public class SupportBotService {

        private final VectorStore vectorStore;
        private final ChatClient chatClient;

        // Inject the vector store and build the chat client
        public SupportBotService(VectorStore vectorStore, ChatClient.Builder builder) {
            this.vectorStore = vectorStore;
            this.chatClient = builder.build();
        }

        public SupportResponse handleQuery(String customerId, String question) {
            // Search with metadata filter for customer's product tier
            Customer customer = getCustomer(customerId);
            FilterExpressionBuilder b = new FilterExpressionBuilder();
            List<Document> docs = vectorStore.similaritySearch(
                SearchRequest.query(question)
                    .withTopK(5)
                    .withFilterExpression(
                        b.or(
                            b.eq("visibility", "public"),
                            b.eq("tier", customer.tier())
                        ).build()));

            if (docs.isEmpty() || getMaxSimilarity(docs) < 0.7) {
                // Low confidence - escalate to human
                return new SupportResponse(
                    "I'll connect you with a support agent for this.",
                    true, // needs escalation
                    List.of());
            }

            String answer = generateAnswer(question, docs);
            return new SupportResponse(answer, false, extractArticleLinks(docs));
        }

        // getCustomer, getMaxSimilarity, generateAnswer, and extractArticleLinks
        // are application-specific helpers that this listing does not include.
    }

    record SupportResponse(
        String answer,
        boolean needsEscalation,
        List<String> helpfulArticles
    ) {}

Application 3: Codebase Q&A
Navigate Complex Repositories
Help developers understand and find code
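The service below groups retrieved chunks by file and passes them to a `buildCodeContext` helper that the listing does not define. One way such a helper might look is sketched here in plain Java. Everything in it is an assumption for illustration: the `CodeChunk` record, its `startLine` field, and the output layout are hypothetical, and the sketch works on its own record type rather than on Spring AI `Document` objects.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns retrieved code chunks into one prompt section per file. */
class CodeContextSketch {

    /** Assumed chunk shape; startLine is illustrative metadata, not defined by the source. */
    record CodeChunk(String filepath, int startLine, String code) {}

    /** Group chunks by file, keeping files in the order retrieval first returned them. */
    static Map<String, List<CodeChunk>> groupByFile(List<CodeChunk> chunks) {
        Map<String, List<CodeChunk>> byFile = new LinkedHashMap<>();
        for (CodeChunk chunk : chunks) {
            byFile.computeIfAbsent(chunk.filepath(), k -> new ArrayList<>()).add(chunk);
        }
        return byFile;
    }

    /** Render one section per file, with that file's chunks sorted into source order. */
    static String buildCodeContext(Map<String, List<CodeChunk>> byFile) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<CodeChunk>> entry : byFile.entrySet()) {
            sb.append("File: ").append(entry.getKey()).append('\n');
            entry.getValue().stream()
                    .sorted(Comparator.comparingInt(CodeChunk::startLine))
                    .forEach(c -> sb.append("(from line ").append(c.startLine()).append(")\n")
                                    .append(c.code()).append('\n'));
            sb.append('\n');
        }
        return sb.toString().stripTrailing();
    }

    public static void main(String[] args) {
        List<CodeChunk> chunks = List.of(
                new CodeChunk("A.java", 40, "void b() {}"),
                new CodeChunk("B.java", 1, "class B {}"),
                new CodeChunk("A.java", 10, "void a() {}"));
        System.out.println(buildCodeContext(groupByFile(chunks)));
    }
}
```

Grouping matters because similarity search returns chunks in relevance order, which can interleave fragments of different files. Presenting each file once, with its fragments in source order, gives the model a layout closer to what a developer would read. The sketch uses a `LinkedHashMap` so the most relevant file stays first; `Collectors.groupingBy` without a map supplier does not guarantee any ordering.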
    @Service
    public class CodebaseQAService {

        private final VectorStore codeStore;
        private final ChatClient chatClient;

        // Inject the code vector store and build the chat client
        public CodebaseQAService(VectorStore codeStore, ChatClient.Builder builder) {
            this.codeStore = codeStore;
            this.chatClient = builder.build();
        }

        public CodeQAResponse askAboutCode(String question) {
            // Search code chunks
            List<Document> codeChunks = codeStore.similaritySearch(
                SearchRequest.query(question)
                    .withTopK(10)
                    .withSimilarityThreshold(0.6));

            // Group by file for context
            Map<String, List<Document>> byFile = codeChunks.stream()
                .collect(Collectors.groupingBy(
                    d -> (String) d.getMetadata().get("filepath")));

            String context = buildCodeContext(byFile);

            String answer = chatClient.prompt()
                .system("""
                    You are an expert code assistant for this codebase.
                    Explain code clearly and point to specific files/functions.
                    If suggesting changes, show the exact code modifications.
                    """)
                .user("""
                    Codebase context:
                    %s

                    Developer question: %s
                    """.formatted(context, question))
                .call()
                .content();

            // Return the answer together with the files it was drawn from
            return new CodeQAResponse(
                answer,
                byFile.keySet().stream().toList());
        }

        // buildCodeContext and the CodeQAResponse record are not part of this listing.
    }

Industry Applications
Legal & Compliance
Search contracts, regulations, and case law. Find relevant clauses in seconds.
- Contract analysis and comparison
- Regulatory compliance checking
- Due diligence research
Healthcare
Clinical decision support with medical literature and guidelines.
- Drug interaction lookup
- Treatment protocol search
- Medical coding assistance
Financial Services
Research reports, policy documents, and regulatory filings.
- Investment research Q&A
- Risk policy lookup
- Earnings call analysis
Enterprise IT
Internal wikis, runbooks, and technical documentation.
- IT helpdesk automation
- Onboarding assistants
- Incident response lookup
RAG Best Practices
✓ Do
- Use chunk overlap (10-20%) to preserve context
- Include source metadata for citations
- Test retrieval quality before going live
- Set similarity thresholds to filter noise
- Instruct LLM to say "I don't know" when unsure
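Chunk overlap is easiest to see in code. The sketch below is a minimal illustration, assuming character-based chunk sizes for simplicity; production splitters usually count tokens and try to break on sentence or paragraph boundaries. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Fixed-size chunking where each chunk repeats the tail of the previous one. */
class ChunkingSketch {

    /**
     * Splits text into chunks of at most chunkSize characters.
     * Consecutive chunks share `overlap` characters, so a sentence cut at a
     * boundary still appears whole in at least one chunk if it is short enough.
     */
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // last chunk reached; avoid emitting a redundant tail
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 20% overlap: chunks of 10 characters that share 2 characters with their neighbor
        for (String c : chunk("abcdefghijklmnopqrstuvwxyz", 10, 2)) {
            System.out.println(c);
        }
    }
}
```

With a chunk size of 10 and an overlap of 2, the 26-letter alphabet splits into `abcdefghij`, `ijklmnopqr`, and `qrstuvwxyz`: each chunk begins with the last two characters of the one before it.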
✗ Avoid
- Chunks that are too large (context pollution)
- Chunks that are too small (lost meaning)
- Mixing different embedding models
- Ignoring retrieval failures silently
- Stuffing too much context (cost + confusion)
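The last two points can be handled together in the step between retrieval and prompt construction. The following is a sketch under stated assumptions, not a Spring AI API: the `Scored` record, the character-based budget, and the method names are hypothetical, and it presumes each retrieved chunk arrives with a similarity score.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Filters retrieved chunks by score and caps the total context size. */
class ContextBudgetSketch {

    /** Assumed shape of a retrieved chunk with its similarity score. */
    record Scored(String source, String text, double score) {}

    /** Keep chunks at or above minScore, best first, until the character budget is spent. */
    static List<Scored> select(List<Scored> candidates, double minScore, int maxChars) {
        List<Scored> ranked = candidates.stream()
                .filter(c -> c.score() >= minScore)
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .toList();
        List<Scored> selected = new ArrayList<>();
        int used = 0;
        for (Scored c : ranked) {
            if (used + c.text().length() > maxChars) {
                break; // stop at the first chunk that does not fit the budget
            }
            selected.add(c);
            used += c.text().length();
        }
        return selected;
    }

    /** Returns the context, or empty when nothing usable was retrieved. */
    static Optional<String> buildContext(List<Scored> candidates, double minScore, int maxChars) {
        List<Scored> selected = select(candidates, minScore, maxChars);
        if (selected.isEmpty()) {
            return Optional.empty(); // the caller must decide: fall back, escalate, or say "I don't know"
        }
        StringBuilder sb = new StringBuilder();
        for (Scored c : selected) {
            sb.append("Source: ").append(c.source()).append('\n')
              .append(c.text()).append("\n\n");
        }
        return Optional.of(sb.toString().stripTrailing());
    }

    public static void main(String[] args) {
        List<Scored> docs = List.of(
                new Scored("a.md", "aaaaa", 0.9),
                new Scored("b.md", "bbbbbbbb", 0.8),
                new Scored("c.md", "cc", 0.3));
        System.out.println(buildContext(docs, 0.6, 10).orElse("(no usable context)"));
    }
}
```

Returning an `Optional` rather than an empty string makes the retrieval failure visible in the type, so the calling code cannot quietly send a prompt with no context.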