RAG Document Q&A Examples¶
Query documents using Retrieval-Augmented Generation (RAG) with AWS Bedrock Knowledge Base.
Overview¶
The SDK provides two approaches for document Q&A:
| Approach | File | LangGraph | Best For |
|---|---|---|---|
| Direct Workflow | talk_to_document.py | No | Simple queries, low latency |
| LangGraph Workflow | langgraph_ttd.py | Yes | Complex queries, tool support |
talk_to_document.py - Direct Workflow¶
This example demonstrates document Q&A without LangGraph, using a CustomAgent with direct knowledge base search.
Architecture¶
```mermaid
flowchart LR
    A[Query] --> B[CustomAgent]
    B --> C[ChatAgentLLMService]
    C --> D[Knowledge Base Search]
    D --> E[AWS Bedrock KB]
    E --> F[Retrieved Context]
    F --> G[Claude LLM]
    G --> H[Response]
```
Key Components¶
Custom LLM Service with KB Search¶
```python
from akordi_agents.core import LLMServiceInterface
from akordi_agents.handlers.search_handler import get_search_handler
from akordi_agents.models.llm_models import SearchConfig


class ChatAgentLLMService(LLMServiceInterface):
    """LLM service with integrated knowledge base search."""

    def __init__(self, model_id: str):
        from akordi_agents.services.llm_service import AWSBedrockService

        self.llm_service = AWSBedrockService(model_id)
        self.search_handler = get_search_handler()

    def generate_response(self, prompt: str, context: str = None, **kwargs):
        # Create a SearchConfig for the knowledge base, if one was requested
        knowledge_base_config = None
        if kwargs.get("knowledge_base_id"):
            from akordi_agents.models.types import ClientParams

            client_params = ClientParams(
                query=prompt,
                knowledge_base_id=kwargs.get("knowledge_base_id"),
                bucket_name=kwargs.get("bucket_name"),
                file_keys=kwargs.get("file_keys", []),
                max_results=kwargs.get("max_results", 10),
                context_results_limit=kwargs.get("context_results_limit", 50),
                override_search_type=kwargs.get("override_search_type", "HYBRID"),
            )
            retrieval_config = self.search_handler.get_filter_config(client_params)
            knowledge_base_config = SearchConfig(
                knowledge_base_id=kwargs.get("knowledge_base_id"),
                query=prompt,
                retrieval_config=retrieval_config,
            )

        # Invoke model with KB search; `config` carries the model settings
        # (temperature, max_tokens) supplied by the caller
        config = kwargs.get("config")
        llm_response = self.llm_service.invoke_model(
            prompt=prompt,
            config=config,
            system_message=kwargs.get("system_message"),
            knowledge_base=knowledge_base_config,
        )
        return {
            "response": llm_response.content,
            "search_results": llm_response.search_results,
            "token_usage": llm_response.usage,
        }

    def get_service_name(self) -> str:
        return "chat_agent_service"
```
Agent Setup¶
```python
from akordi_agents.core import AgentBuilder

# Create agent without LangGraph
agent_name = "TTD_AGENT_NO_LANGGRAPH"
builder = AgentBuilder(agent_name)
builder.with_llm_service_instance(ChatAgentLLMService(model_id=MODEL_ID))
builder.with_data_model_instance(ChatAgentDataModel())
builder.with_config({
    "industry": "construction",
    "provider": "AWS_BEDROCK",
    "environment": "PRODUCTION",
})

# Build CustomAgent (no LangGraph)
agent = builder.build()
```
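Once built, the agent is driven by a plain request dict. A minimal sketch of assembling that payload with this example's defaults (field names are taken from the request structure used elsewhere on this page; the `build_request` helper itself is hypothetical, not part of the SDK):

```python
from typing import Any, Dict, List, Optional

def build_request(
    query: str,
    knowledge_base_id: str,
    bucket_name: Optional[str] = None,
    file_keys: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble the request dict passed to agent.process_request()."""
    return {
        "query": query,
        "knowledge_base_id": knowledge_base_id,
        "bucket_name": bucket_name,
        "file_keys": file_keys or [],
        "max_results": 10,             # results retrieved from the KB
        "context_results_limit": 50,   # results forwarded to the LLM context
        "override_search_type": "HYBRID",
        "temperature": 0.1,
        "max_tokens": 5000,
    }

request = build_request("What are the safety requirements?", "YOUR_KB_ID")
```

The resulting dict is what `agent.process_request(request)` consumes in the full example.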
Running the Example¶
```bash
# Basic query
poetry run python examples/talk_to_document.py \
    --query "Summarize the key points from the documents"

# With specific files
poetry run python examples/talk_to_document.py \
    --query "What are the safety requirements?" \
    --knowledge_base_id "YOUR_KB_ID" \
    --bucket_name "your-bucket" \
    --file_keys "path/to/document.pdf"

# With custom parameters
poetry run python examples/talk_to_document.py \
    --query "Explain the project timeline" \
    --max_results 15 \
    --temperature 0.2 \
    --max_tokens 3000
```
Command Line Arguments¶
| Argument | Default | Description |
|---|---|---|
| `--query` | "Summarise and provide top 10 points" | Query to process |
| `--knowledge_base_id` | "O6BAAV6RQZ" | AWS Bedrock Knowledge Base ID |
| `--bucket_name` | "akordi-dev02-raw-data-bucket" | S3 bucket for documents |
| `--file_keys` | (preset list) | S3 keys for specific files |
| `--max_results` | 10 | Maximum search results to retrieve |
| `--context_results_limit` | 50 | Maximum search results to include in LLM context |
| `--temperature` | 0.1 | LLM temperature |
| `--max_tokens` | 5000 | Maximum response tokens |
| `--chat_id` | "" | Chat session ID |
| `--user_id` | "" | User ID |
langgraph_ttd.py - LangGraph Workflow¶
This example uses LangGraph for document Q&A, enabling tool support and more complex workflows.
Architecture¶
```mermaid
flowchart TD
    A[Query] --> B[LangGraphAgent]
    B --> C[Validation Node]
    C --> D[KB Search Node]
    D --> E[Tool Decision Node]
    E -->|No Tools| F[LLM Node]
    E -->|Has Tools| G[Tool Execution]
    G --> F
    F --> H[Response]
```
Key Differences from Direct Workflow¶
| Feature | talk_to_document.py | langgraph_ttd.py |
|---|---|---|
| Agent Type | CustomAgent | LangGraphAgent |
| Tool Support | No | Yes |
| Workflow Control | Direct | State-based |
| Tracing | Manual | Built-in |
| Complexity | Simple | Advanced |
Agent Setup with LangGraph¶
```python
from akordi_agents.core import AgentBuilder

def _get_cached_agent(has_tools: bool = False, temperature: float = 0.1):
    """Get or create a cached LangGraph agent."""
    builder = AgentBuilder("TTD_AGENT")
    builder.with_llm_service_instance(ChatAgentLLMService(model_id=MODEL_ID))
    builder.with_data_model_instance(ChatAgentDataModel())
    builder.with_config({
        "industry": "construction",
        "provider": "AWS_BEDROCK",
        "environment": "PRODUCTION",
    })

    # Enable LangGraph workflow
    builder.with_langgraph(
        enable=True,
        config={
            "enable_validation": False,
            "enable_tools": has_tools,
            "enable_tracing": False,
            "max_iterations": 3,
            "temperature": temperature,
        },
    )
    return builder.build()
```
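The function is named `_get_cached_agent`, but the snippet above shows only construction. One way to add the caching the name implies is a module-level cache keyed by the build parameters, so repeated invocations reuse the same agent. A generic sketch with a stand-in for `builder.build()` (the real example may cache differently):

```python
from typing import Any, Dict, Tuple

_AGENT_CACHE: Dict[Tuple[bool, float], Any] = {}

def get_cached(has_tools: bool = False, temperature: float = 0.1) -> Any:
    """Build an agent once per (has_tools, temperature) combination."""
    key = (has_tools, temperature)
    if key not in _AGENT_CACHE:
        # Stand-in for builder.build(); the real code constructs a LangGraphAgent
        _AGENT_CACHE[key] = {"has_tools": has_tools, "temperature": temperature}
    return _AGENT_CACHE[key]

a = get_cached(has_tools=True)
b = get_cached(has_tools=True)
assert a is b  # second call returns the cached instance
```

Caching matters here because building a LangGraph workflow is more expensive than a direct agent, and warm Lambda-style invocations can reuse it.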
Processing a Request¶
```python
def main(event: Dict[str, Any]) -> Dict[str, Any]:
    """Full LangGraph workflow with tools and chat history."""
    body = event.get("body", event)  # unwrap the request body if present
    request_data = {
        "query": body.get("query"),
        "system_message": get_system_prompt_from_dynamodb(agent_code="AP-001"),
        "knowledge_base_id": body.get("knowledge_base_id"),
        "bucket_name": body.get("bucket_name"),
        "file_keys": body.get("file_keys", []),
        "max_results": body.get("max_results", 10),
        "context_results_limit": body.get("context_results_limit", 50),
        "override_search_type": body.get("override_search_type", "HYBRID"),
        "temperature": body.get("temperature", 0.1),
        "max_tokens": body.get("max_tokens", 5000),
        "chat_history": body.get("chat_history", []),
    }

    # Get cached agent
    has_tools = bool(body.get("tools"))
    agent = _get_cached_agent(has_tools=has_tools)

    # Process with LangGraph workflow
    agent_response = agent.process_request(request_data)
    return {
        "success": True,
        "answer": agent_response["llm_response"]["response"],
        "search_results": agent_response.get("search_results", []),
        "metadata": {
            "workflow": {
                "type": "langgraph",
                "tools_used": agent_response.get("tools_used", []),
            }
        },
    }
```
Running the Example¶
```bash
# Basic query with LangGraph
poetry run python examples/langgraph_ttd.py \
    --query "What are the main findings in the report?"

# With custom parameters
poetry run python examples/langgraph_ttd.py \
    --query "Analyze the risk factors" \
    --knowledge_base_id "YOUR_KB_ID" \
    --max_results 20 \
    --temperature 0.1
```
System Prompt from DynamoDB¶
Both examples support loading system prompts from DynamoDB:
```python
from typing import Optional

from akordi_agents.config.dynamodb_prompt_config import dynamodb_prompt_config

def get_system_prompt_from_dynamodb(agent_code: str) -> Optional[str]:
    """Load system prompt from DynamoDB with caching."""
    if dynamodb_prompt_config.is_configured():
        template = dynamodb_prompt_config.get_prompt_template(agent_code)
        if template and template.template:
            return template.template
    return None
```
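Because the loader returns `None` when DynamoDB is not configured or has no template, callers typically fall back to a default prompt. A hypothetical sketch of that pattern (the helper name and default text are illustrative, not part of the SDK):

```python
from typing import Callable, Optional

DEFAULT_SYSTEM_PROMPT = "You answer questions strictly from the retrieved documents."

def resolve_system_prompt(
    loader: Callable[[str], Optional[str]], agent_code: str
) -> str:
    """Use the stored prompt when available, otherwise the default."""
    prompt = loader(agent_code)
    return prompt if prompt else DEFAULT_SYSTEM_PROMPT

# With an unconfigured loader, the default is used
prompt = resolve_system_prompt(lambda code: None, "AP-001")
```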
Search Types¶
The knowledge base supports different search strategies:
| Search Type | Description |
|---|---|
| `HYBRID` | Combines semantic and keyword search (default) |
| `SEMANTIC` | Vector-based semantic similarity |
| `KEYWORD` | Traditional keyword matching |
```python
request_data = {
    "query": "Find safety requirements",
    "knowledge_base_id": "YOUR_KB_ID",
    "override_search_type": "HYBRID",  # or "SEMANTIC", "KEYWORD"
    "max_results": 10,                 # results to retrieve from knowledge base
    "context_results_limit": 50,       # results to include in LLM context
}
```
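Since `override_search_type` arrives as a free-form string in the request, it can be worth normalizing and validating it before the request reaches Bedrock. A hypothetical helper (not part of the SDK):

```python
VALID_SEARCH_TYPES = {"HYBRID", "SEMANTIC", "KEYWORD"}

def normalize_search_type(value: str) -> str:
    """Uppercase a requested search type, defaulting to HYBRID; reject unknowns."""
    candidate = (value or "HYBRID").upper()
    if candidate not in VALID_SEARCH_TYPES:
        raise ValueError(f"Unsupported search type: {value!r}")
    return candidate

normalized = normalize_search_type("semantic")
```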
Response Structure¶
Both examples return similar response structures:
```python
{
    "success": True,
    "answer": "Based on the documents, the key findings are...",
    "chat_id": "session-123",
    "search_results": [
        {
            "content": "Relevant excerpt from document...",
            "score": 0.95,
            "source": "s3://bucket/path/document.pdf"
        }
    ],
    "model_info": {
        "provider": "aws_bedrock",
        "model_id": "anthropic.claude-3-sonnet-..."
    },
    "token_usage": {
        "input_tokens": 1500,
        "output_tokens": 800,
        "total_tokens": 2300
    },
    "metadata": {
        "workflow": {
            "type": "direct",  # or "langgraph"
            "tools_used": []
        }
    }
}
```
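Downstream code usually needs only a few fields from this structure. A minimal extraction sketch over the sample values above (the `summarize` helper is illustrative):

```python
response = {
    "success": True,
    "answer": "Based on the documents, the key findings are...",
    "search_results": [
        {
            "content": "Relevant excerpt from document...",
            "score": 0.95,
            "source": "s3://bucket/path/document.pdf",
        }
    ],
    "token_usage": {"input_tokens": 1500, "output_tokens": 800, "total_tokens": 2300},
}

def summarize(resp):
    """Pull the answer, cited sources, and token total out of a response dict."""
    sources = [hit["source"] for hit in resp.get("search_results", [])]
    total = resp.get("token_usage", {}).get("total_tokens", 0)
    return resp.get("answer", ""), sources, total

answer, sources, total_tokens = summarize(response)
```

Using `.get()` with defaults keeps the helper robust when a field (e.g. `search_results` on a query with no retrieval) is absent.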
When to Use Which¶
Use talk_to_document.py (Direct) when:¶
- Simple document Q&A queries
- Lower latency is required
- No tool integration needed
- Straightforward request/response flow
Use langgraph_ttd.py (LangGraph) when:¶
- Complex multi-step queries
- Tool integration required
- Need workflow tracing/debugging
- Building production pipelines
Next Steps¶
- Tool Integration - Add tools to your agents
- Multi-Agent Orchestration - Coordinate multiple agents
- Guardrails - Add safety controls