Agentic RAG with Azure AI Search using Microsoft
Semantic Kernel
Retrieval-augmented generation (RAG) has revolutionised how AI applications interact with external knowledge bases. But what happens when we add autonomous agents to the mix? Enter Agentic RAG, a powerful pattern that combines the retrieval capabilities of traditional RAG with the reasoning and decision-making abilities of AI agents.
In this article, I’ll walk you through building an Agentic RAG solution using Microsoft Semantic Kernel and Azure AI Search, demonstrating both traditional vector search and autonomous agent-driven retrieval patterns.
You can clone the sample code repository by following this link.
What is Agentic RAG?
Traditional RAG systems retrieve relevant documents and pass them to a language model for response generation. Agentic RAG takes this further by empowering autonomous agents to:
- Dynamically decide when and what to retrieve
- Reason over multiple data sources
- Chain multiple retrieval operations based on context
- Generate contextually aware responses with minimal human intervention
This approach is particularly valuable when dealing with complex queries that require multi-step reasoning or when the agent needs to decide which tools to invoke based on the user’s intent.
Architecture Overview
Our solution demonstrates two complementary approaches:
- Direct Vector Search: Traditional vector similarity search against Azure AI Search
- Agentic RAG: An autonomous agent that decides when and how to retrieve information using Semantic Kernel’s function-calling capabilities
Here’s what we’ll be building with:
- Azure AI Search: Vector store for semantic search capabilities
- Azure OpenAI: GPT-3.5-turbo for chat completion and text-embedding-ada-002 for embeddings
- Microsoft Semantic Kernel: Orchestration framework for building AI agents
- ASP.NET Core MVC: Web application framework
Setting Up Azure Resources
Before diving into code, you’ll need to provision these Azure resources:
- Resource Group: To organise all resources
- Azure Storage Account: For storing your data
- Azure AI Search: Create a search service and index
- Azure OpenAI Service: Deploy two models:
  - gpt-35-turbo for chat completion
  - text-embedding-ada-002 for generating embeddings
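Once the resources are provisioned, the application needs their endpoints and keys. Here is a minimal appsettings.json sketch, assuming the AppSettings shape referenced by the code later in this article; the section names, index name, and vector field name are illustrative placeholders, not values from the sample repository:

```json
{
  "AppSettings": {
    "AzureOpenAIChatCompletion": {
      "Model": "gpt-35-turbo",
      "Endpoint": "https://<your-openai-resource>.openai.azure.com/",
      "ApiKey": "<your-api-key>"
    },
    "AzureOpenAITextEmbedding": {
      "Model": "text-embedding-ada-002",
      "Endpoint": "https://<your-openai-resource>.openai.azure.com/",
      "ApiKey": "<your-api-key>"
    },
    "AzureSearch": {
      "Endpoint": "https://<your-search-service>.search.windows.net",
      "ApiKey": "<your-api-key>",
      "Index": "speakers-index",
      "VectorField": "embedding",
      "TopK": 3,
      "Size": 3
    }
  }
}
```

In production, prefer managed identity or Azure Key Vault over keys stored in configuration files.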
Building the Semantic Kernel Pipeline
The heart of our application is the Semantic Kernel setup in Program.cs. Here’s where the magic happens:
builder.Services.AddSingleton<Kernel>(s =>
{
    var kernelBuilder = Kernel.CreateBuilder();

    // Add logging for observability
    kernelBuilder.Services.AddLogging(b => b.AddConsole()
        .SetMinimumLevel(LogLevel.Trace));

    // Register configuration
    kernelBuilder.Services.AddSingleton<AppSettings>(appSettings);

    // Add Azure OpenAI Chat Completion
    kernelBuilder.AddAzureOpenAIChatCompletion(
        appSettings.AzureOpenAIChatCompletion.Model,
        appSettings.AzureOpenAIChatCompletion.Endpoint,
        appSettings.AzureOpenAIChatCompletion.ApiKey);

    // Add Text Embedding Generation
    kernelBuilder.AddAzureOpenAITextEmbeddingGeneration(
        appSettings.AzureOpenAITextEmbedding.Model,
        appSettings.AzureOpenAITextEmbedding.Endpoint,
        appSettings.AzureOpenAITextEmbedding.ApiKey);

    // Register Azure AI Search client
    kernelBuilder.Services.AddSingleton<SearchIndexClient>(sp =>
        new SearchIndexClient(
            new Uri(appSettings.AzureSearch.Endpoint),
            new AzureKeyCredential(appSettings.AzureSearch.ApiKey)));

    // Add Vector Store support
    kernelBuilder.Services.AddAzureAISearchVectorStore();

    // Register the search plugin
    kernelBuilder.Plugins.AddFromType<AzureAISearchPlugin>("AzureAISearchPlugin");

    return kernelBuilder.Build();
});
This setup creates a fully configured Semantic Kernel instance with all necessary services and plugins registered for dependency injection.
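Because the Kernel is registered as a singleton, consumers receive it through constructor injection. A hypothetical sketch of how the controller shown later might obtain the chat and embedding services from the kernel (the field names are assumptions chosen to match the later snippets):

```csharp
public class HomeController : Controller
{
    private readonly Kernel _kernel;
    private readonly IChatCompletionService _chatCompletionService;
    private readonly ITextEmbeddingGenerationService _textEmbeddingGenerationService;

    public HomeController(Kernel kernel)
    {
        // The Kernel exposes the AI services registered in Program.cs
        _kernel = kernel;
        _chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
        _textEmbeddingGenerationService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
    }
}
```

This keeps the controller decoupled from the concrete Azure OpenAI connectors: swapping providers only requires changing the kernel registration.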
Creating the Azure AI Search Plugin
Semantic Kernel uses plugins to extend the agent’s capabilities. Our AzureAISearchPlugin exposes a search function that the agent can invoke automatically:
public class AzureAISearchPlugin
{
    private readonly SearchIndexClient _searchIndexClient;
    private readonly ITextEmbeddingGenerationService _textEmbeddingGenerationService;
    private readonly AppSettings _appSettings;

    public AzureAISearchPlugin(
        SearchIndexClient searchIndexClient,
        ITextEmbeddingGenerationService textEmbeddingGenerationService,
        AppSettings appSettings)
    {
        _searchIndexClient = searchIndexClient;
        _textEmbeddingGenerationService = textEmbeddingGenerationService;
        _appSettings = appSettings;
    }

    [KernelFunction("Search")]
    [Description("Searches the speaker index and returns the content most relevant to the query")]
    public async Task<string> SearchAsync(
        [Description("The user's natural-language search query")] string query)
    {
        var searchClient = _searchIndexClient.GetSearchClient(
            _appSettings.AzureSearch.Index);

        // Embed the query so it can be compared against the stored vectors
        ReadOnlyMemory<float> embedding =
            await _textEmbeddingGenerationService.GenerateEmbeddingAsync(query);

        var vectorQuery = new VectorizedQuery(embedding)
        {
            KNearestNeighborsCount = _appSettings.AzureSearch.TopK,
            Fields = { _appSettings.AzureSearch.VectorField }
        };

        var options = new SearchOptions
        {
            VectorSearch = new VectorSearchOptions
            {
                Queries = { vectorQuery }
            },
            Size = _appSettings.AzureSearch.Size
        };

        // Pure vector search: no search text, only the vector query
        var response = await searchClient.SearchAsync<Speaker>(searchText: null, options);

        // Concatenate the retrieved chunks so the agent sees all TopK results
        var chunks = new StringBuilder();
        await foreach (var result in response.Value.GetResultsAsync())
        {
            chunks.AppendLine(result.Document.Chunk ?? string.Empty);
        }
        return chunks.ToString();
    }
}
The [KernelFunction] and [Description] attributes are crucial – they tell Semantic Kernel about this function and help the agent decide when to invoke it.
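Before wiring the plugin into an agent, it can be exercised directly against the kernel. A hypothetical smoke test (the plugin and function names match the registration above; the query string is illustrative):

```csharp
// Invoke the registered plugin function by name, outside any chat loop
var arguments = new KernelArguments { ["query"] = "Which speakers cover Azure AI Search?" };
var result = await kernel.InvokeAsync("AzureAISearchPlugin", "Search", arguments);
Console.WriteLine(result.GetValue<string>());
```

Direct invocation like this is useful for verifying the index, embedding model, and field configuration before letting the agent call the function autonomously.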
Implementing the Search Endpoints
Our HomeController exposes two endpoints that demonstrate different approaches:
- Direct Vector Search
The AISearch endpoint performs straightforward vector similarity search:
[HttpPost]
[Route("ai-search")]
public async Task<IActionResult> AISearch([FromBody] SearchTerms search)
{
    ReadOnlyMemory<float> query =
        await _textEmbeddingGenerationService.GenerateEmbeddingAsync(search.Input);

    var vectorQuery = new VectorizedQuery(query)
    {
        KNearestNeighborsCount = search.TopK,
        Fields = { _settings.AzureSearch.VectorField }
    };

    var options = new SearchOptions
    {
        VectorSearch = new VectorSearchOptions
        {
            Queries = { vectorQuery }
        },
        Size = search.TopK
    };

    var response = await _searchClient.SearchAsync<Speaker>(searchText: null, options);

    // Materialise the results and return them to the caller
    var results = new List<Speaker>();
    await foreach (var result in response.Value.GetResultsAsync())
    {
        results.Add(result.Document);
    }
    return Ok(results);
}
This approach gives you direct control over the search process but requires manual orchestration.
- Agentic RAG with Autonomous Function Calling
The AgenticAISearch endpoint showcases the true power of Agentic RAG:
[HttpPost]
[Route("agentic-ai-search")]
public async Task<IActionResult> AgenticAISearch([FromBody] SearchTerms search)
{
    var promptExecutionSettings = new AzureOpenAIPromptExecutionSettings
    {
        AzureChatDataSource = GetAzureSearchDataSource(),
        ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
    };

    var systemMessage =
        "You are a knowledgeable agent specialised in retrieving data using Azure AI Search. " +
        "Search required information in Azure AI Search.";

    var chatHistory = new ChatHistory(systemMessage);
    chatHistory.AddUserMessage(search.Input);

    var assistantReply = await _chatCompletionService.GetChatMessageContentAsync(
        chatHistory,
        promptExecutionSettings,
        _kernel);

    var citations = search.IncludeCitations
        ? GetCitations(assistantReply)
        : null;

    return Ok(new { reply = assistantReply.Content, citations });
}
The key difference here is ToolCallBehavior.AutoInvokeKernelFunctions. This tells Semantic Kernel to automatically:
- Analyse the user’s query
- Determine if the AzureAISearchPlugin.Search function should be called
- Invoke the function with appropriate parameters
- Incorporate the results into the response
The agent makes these decisions autonomously based on the function descriptions and the conversation context.
Key Benefits of Agentic RAG
This implementation demonstrates several advantages:
Autonomous Decision Making: The agent decides when to search without explicit instructions, making the system more intuitive and reducing the need for complex routing logic.
Contextual Awareness: The agent maintains conversation context and can perform multi-turn interactions with appropriate retrieval at each step.
Extensibility: Adding new capabilities is as simple as creating new plugins with proper function descriptions – the agent automatically learns to use them.
Transparency: Citations provide grounding and allow users to verify information sources.
Flexibility: The same infrastructure supports both traditional RAG and agentic patterns, letting you choose the right approach for each use case.
When to Use Which Approach
Use Direct Vector Search when:
- You need predictable, deterministic behaviour
- Performance and latency are critical
- The search logic is straightforward
- You want full control over the retrieval process
Use Agentic RAG when:
- Queries require multi-step reasoning
- The system needs to decide between multiple tools or data sources
- User intents are complex or ambiguous
- You want the system to adapt to new capabilities automatically
- Natural conversation flow is important
Conclusion
Agentic RAG represents a significant evolution in how we build intelligent applications. By combining Azure AI Search’s vector search capabilities with Microsoft Semantic Kernel’s agent orchestration, we can build systems that are both flexible and grounded in our own data.
The autonomous nature of agentic patterns simplifies the development of advanced AI applications while preserving the control and transparency necessary for production deployment. As AI agents grow more capable, patterns like this will become increasingly important for building reliable, scalable intelligent systems.
Sample App Demo

