Agentic RAG with Azure AI Search using Microsoft
Semantic Kernel
Retrieval-augmented generation (RAG) has revolutionised how AI applications interact with external knowledge bases. But what happens when we add autonomous agents to the mix? Enter Agentic RAG, a powerful pattern that combines the retrieval capabilities of traditional RAG with the reasoning and decision-making abilities of AI agents.
In this article, I’ll walk you through building an Agentic RAG solution using Microsoft Semantic Kernel and Azure AI Search, demonstrating both traditional vector search and autonomous agent-driven retrieval patterns.
You can clone the sample code repository by following this link.
What is Agentic RAG?
Traditional RAG systems retrieve relevant documents and pass them to a language model for response generation. Agentic RAG takes this further by empowering autonomous agents to:
- Dynamically decide when and what to retrieve
- Reason over multiple data sources
- Chain multiple retrieval operations based on context
- Generate contextually aware responses with minimal human intervention
This approach is particularly valuable when dealing with complex queries that require multi-step reasoning or when the agent needs to decide which tools to invoke based on the user’s intent.
Architecture Overview
Our solution demonstrates two complementary approaches:
- Direct Vector Search: Traditional vector similarity search against Azure AI Search
- Agentic RAG: An autonomous agent that decides when and how to retrieve information using Semantic Kernel’s function-calling capabilities
Here’s what we’ll be building with:
- Azure AI Search: Vector store for semantic search capabilities
- Azure OpenAI: GPT-3.5-turbo for chat completion and text-embedding-ada-002 for embeddings
- Microsoft Semantic Kernel: Orchestration framework for building AI agents
- ASP.NET Core MVC: Web application framework
Setting Up Azure Resources
Before diving into code, you’ll need to provision these Azure resources:
- Resource Group: To organise all resources
- Azure Storage Account: For storing your data
- Azure AI Search: Create a search service and index
- Azure OpenAI Service: Deploy two models:
  - gpt-35-turbo for chat completion
  - text-embedding-ada-002 for generating embeddings
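Once the resources are provisioned, the application needs their endpoints and keys. Here is a minimal appsettings.json sketch, assuming the AppSettings shape referenced by the code later in this article; the section names, index name, and vector field name are illustrative placeholders, not values from the sample repository:

```json
{
  "AppSettings": {
    "AzureOpenAIChatCompletion": {
      "Model": "gpt-35-turbo",
      "Endpoint": "https://<your-openai-resource>.openai.azure.com/",
      "ApiKey": "<your-api-key>"
    },
    "AzureOpenAITextEmbedding": {
      "Model": "text-embedding-ada-002",
      "Endpoint": "https://<your-openai-resource>.openai.azure.com/",
      "ApiKey": "<your-api-key>"
    },
    "AzureSearch": {
      "Endpoint": "https://<your-search-service>.search.windows.net",
      "ApiKey": "<your-api-key>",
      "Index": "speakers-index",
      "VectorField": "embedding",
      "TopK": 3,
      "Size": 3
    }
  }
}
```

In production, prefer managed identity or Azure Key Vault over keys stored in configuration files.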
Building the Semantic Kernel Pipeline
The heart of our application is the Semantic Kernel setup in Program.cs. Here’s where the magic happens:
builder.Services.AddSingleton<Kernel>(s =>
{
    var kernelBuilder = Kernel.CreateBuilder();

    // Add logging for observability
    kernelBuilder.Services.AddLogging(b => b.AddConsole()
        .SetMinimumLevel(LogLevel.Trace));

    // Register configuration
    kernelBuilder.Services.AddSingleton<AppSettings>(appSettings);

    // Add Azure OpenAI Chat Completion
    kernelBuilder.AddAzureOpenAIChatCompletion(
        appSettings.AzureOpenAIChatCompletion.Model,
        appSettings.AzureOpenAIChatCompletion.Endpoint,
        appSettings.AzureOpenAIChatCompletion.ApiKey);

    // Add Text Embedding Generation
    kernelBuilder.AddAzureOpenAITextEmbeddingGeneration(
        appSettings.AzureOpenAITextEmbedding.Model,
        appSettings.AzureOpenAITextEmbedding.Endpoint,
        appSettings.AzureOpenAITextEmbedding.ApiKey);

    // Register Azure AI Search client
    kernelBuilder.Services.AddSingleton<SearchIndexClient>(sp =>
        new SearchIndexClient(
            new Uri(appSettings.AzureSearch.Endpoint),
            new AzureKeyCredential(appSettings.AzureSearch.ApiKey)));

    // Add Vector Store support
    kernelBuilder.Services.AddAzureAISearchVectorStore();

    // Register the search plugin
    kernelBuilder.Plugins.AddFromType<AzureAISearchPlugin>("AzureAISearchPlugin");

    return kernelBuilder.Build();
});
This setup creates a fully configured Semantic Kernel instance with all necessary services and plugins registered for dependency injection.
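Because the Kernel is registered as a singleton, consumers receive it through constructor injection. A hypothetical sketch of how the controller shown later might obtain the chat and embedding services from the kernel (the field names are assumptions chosen to match the later snippets):

```csharp
public class HomeController : Controller
{
    private readonly Kernel _kernel;
    private readonly IChatCompletionService _chatCompletionService;
    private readonly ITextEmbeddingGenerationService _textEmbeddingGenerationService;

    public HomeController(Kernel kernel)
    {
        // The Kernel exposes the AI services registered in Program.cs
        _kernel = kernel;
        _chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
        _textEmbeddingGenerationService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
    }
}
```

This keeps the controller decoupled from the concrete Azure OpenAI connectors: swapping providers only requires changing the kernel registration.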
Creating the Azure AI Search Plugin
Semantic Kernel uses plugins to extend the agent’s capabilities. Our AzureAISearchPlugin exposes a search function that the agent can invoke automatically:
public class AzureAISearchPlugin
{
    private readonly SearchIndexClient _searchIndexClient;
    private readonly ITextEmbeddingGenerationService _textEmbeddingGenerationService;
    private readonly AppSettings _appSettings;

    public AzureAISearchPlugin(
        SearchIndexClient searchIndexClient,
        ITextEmbeddingGenerationService textEmbeddingGenerationService,
        AppSettings appSettings)
    {
        _searchIndexClient = searchIndexClient;
        _textEmbeddingGenerationService = textEmbeddingGenerationService;
        _appSettings = appSettings;
    }

    [KernelFunction("Search")]
    [Description("Searches the speaker index and returns the content most relevant to the query")]
    public async Task<string> SearchAsync(
        [Description("The user's natural-language search query")] string query)
    {
        var searchClient = _searchIndexClient.GetSearchClient(
            _appSettings.AzureSearch.Index);

        // Embed the query so it can be compared against the stored vectors
        ReadOnlyMemory<float> embedding =
            await _textEmbeddingGenerationService.GenerateEmbeddingAsync(query);

        var vectorQuery = new VectorizedQuery(embedding)
        {
            KNearestNeighborsCount = _appSettings.AzureSearch.TopK,
            Fields = { _appSettings.AzureSearch.VectorField }
        };

        var options = new SearchOptions
        {
            VectorSearch = new VectorSearchOptions
            {
                Queries = { vectorQuery }
            },
            Size = _appSettings.AzureSearch.Size
        };

        // Pure vector search: no search text, only the vector query
        var response = await searchClient.SearchAsync<Speaker>(searchText: null, options);

        // Concatenate the retrieved chunks so the agent sees all TopK results
        var chunks = new StringBuilder();
        await foreach (var result in response.Value.GetResultsAsync())
        {
            chunks.AppendLine(result.Document.Chunk ?? string.Empty);
        }
        return chunks.ToString();
    }
}
The [KernelFunction] and [Description] attributes are crucial – they tell Semantic Kernel about this function and help the agent decide when to invoke it.
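Before wiring the plugin into an agent, it can be exercised directly against the kernel. A hypothetical smoke test (the plugin and function names match the registration above; the query string is illustrative):

```csharp
// Invoke the registered plugin function by name, outside any chat loop
var arguments = new KernelArguments { ["query"] = "Which speakers cover Azure AI Search?" };
var result = await kernel.InvokeAsync("AzureAISearchPlugin", "Search", arguments);
Console.WriteLine(result.GetValue<string>());
```

Direct invocation like this is useful for verifying the index, embedding model, and field configuration before letting the agent call the function autonomously.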
Implementing the Search Endpoints
Our HomeController exposes two endpoints that demonstrate different approaches:
- Direct Vector Search
The AISearch endpoint performs straightforward vector similarity search:
[HttpPost]
[Route("ai-search")]
public async Task<IActionResult> AISearch([FromBody] SearchTerms search)
{
    ReadOnlyMemory<float> query =
        await _textEmbeddingGenerationService.GenerateEmbeddingAsync(search.Input);

    var vectorQuery = new VectorizedQuery(query)
    {
        KNearestNeighborsCount = search.TopK,
        Fields = { _settings.AzureSearch.VectorField }
    };

    var options = new SearchOptions
    {
        VectorSearch = new VectorSearchOptions
        {
            Queries = { vectorQuery }
        },
        Size = search.TopK
    };

    var response = await _searchClient.SearchAsync<Speaker>(searchText: null, options);

    // Materialise the results and return them to the caller
    var results = new List<Speaker>();
    await foreach (var result in response.Value.GetResultsAsync())
    {
        results.Add(result.Document);
    }
    return Ok(results);
}
This approach gives you direct control over the search process but requires manual orchestration.
- Agentic RAG with Autonomous Function Calling
The AgenticAISearch endpoint showcases the true power of Agentic RAG:
[HttpPost]
[Route("agentic-ai-search")]
public async Task<IActionResult> AgenticAISearch([FromBody] SearchTerms search)
{
    var promptExecutionSettings = new AzureOpenAIPromptExecutionSettings
    {
        AzureChatDataSource = GetAzureSearchDataSource(),
        ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
    };

    var systemMessage =
        "You are a knowledgeable agent specialised in retrieving data using Azure AI Search. " +
        "Search required information in Azure AI Search.";

    var chatHistory = new ChatHistory(systemMessage);
    chatHistory.AddUserMessage(search.Input);

    var assistantReply = await _chatCompletionService.GetChatMessageContentAsync(
        chatHistory,
        promptExecutionSettings,
        _kernel);

    var citations = search.IncludeCitations
        ? GetCitations(assistantReply)
        : null;

    return Ok(new { reply = assistantReply.Content, citations });
}
The key difference here is ToolCallBehavior.AutoInvokeKernelFunctions. This tells Semantic Kernel to automatically:
- Analyse the user’s query
- Determine if the AzureAISearchPlugin.Search function should be called
- Invoke the function with appropriate parameters
- Incorporate the results into the response
The agent makes these decisions autonomously based on the function descriptions and the conversation context.
Key Benefits of Agentic RAG
This implementation demonstrates several advantages:
Autonomous Decision Making: The agent decides when to search without explicit instructions, making the system more intuitive and reducing the need for complex routing logic.
Contextual Awareness: The agent maintains conversation context and can perform multi-turn interactions with appropriate retrieval at each step.
Extensibility: Adding new capabilities is as simple as creating new plugins with proper function descriptions – the agent automatically learns to use them.
Transparency: Citations provide grounding and allow users to verify information sources.
Flexibility: The same infrastructure supports both traditional RAG and agentic patterns, letting you choose the right approach for each use case.
When to Use Which Approach
Use Direct Vector Search when:
- You need predictable, deterministic behaviour
- Performance and latency are critical
- The search logic is straightforward
- You want full control over the retrieval process
Use Agentic RAG when:
- Queries require multi-step reasoning
- The system needs to decide between multiple tools or data sources
- User intents are complex or ambiguous
- You want the system to adapt to new capabilities automatically
- Natural conversation flow is important
Conclusion
Agentic RAG represents a significant evolution in how we build intelligent applications. By combining Azure AI Search’s vector search capabilities with Microsoft Semantic Kernel’s agent orchestration, we can build systems that are both flexible and grounded in our own data.
The autonomous nature of agentic patterns simplifies the development of advanced AI applications while preserving the control and transparency necessary for production deployment. As AI agents grow more capable, patterns like this will become increasingly important for building reliable, scalable intelligent systems.
Sample App Demo

