Building a Vector Search Application with Semantic Kernel and Elasticsearch

Vector search has become essential for modern AI applications, enabling semantic similarity searches that go beyond traditional keyword matching. In this comprehensive guide, we’ll build a complete vector search solution using Semantic Kernel, Elasticsearch, and the Semantic Kernel Connector for Elasticsearch. You can clone the sample code repository by following this link.

What We’re Building

We’ll create an MVC web application that:

  • Generates vector embeddings from structured data using Azure OpenAI
  • Stores embeddings in Elasticsearch as a vector database
  • Performs semantic searches using k-Nearest Neighbors (KNN) algorithm
  • Runs entirely in Docker containers for easy deployment
 
Architecture Overview
 

Our solution consists of three main components:

  1. ASP.NET Core MVC Application – The web interface for creating embeddings and performing searches
  2. Elasticsearch – Vector database storing our embeddings
  3. Kibana – Visualization and management interface for Elasticsearch

All components run in Docker containers, orchestrated via Docker Compose.

Prerequisites

Before we begin, ensure you have:

  • .NET 8 or later
  • Docker Desktop
  • Azure OpenAI service access (or OpenAI API key)
  • Basic understanding of vector embeddings and semantic search

Required NuGet Packages

Install these packages into your MVC project:

    dotnet add package Elastic.Clients.Elasticsearch --version 8.16.3
    dotnet add package Elastic.SemanticKernel.Connectors.Elasticsearch --version 0.1.2
    dotnet add package Microsoft.Extensions.Hosting --version 9.0.0
    dotnet add package Microsoft.SemanticKernel.Connectors.AzureOpenAI --version 1.30.0
    dotnet add package Microsoft.SemanticKernel.PromptTemplates.Handlebars --version 1.30.0

Setting Up the Docker Environment

Let’s start by configuring our Docker Compose files to run Elasticsearch and Kibana alongside our application.

docker-compose.yml

This file defines our main service:

    services:
      7_elasticsearch_vectorstore_semantickernel:
        image: ${DOCKER_REGISTRY-}7elasticsearchvectorstoresemantickernel
        build:
          context: 7_ElasticSearch_VectorStore_SemanticKernel
          dockerfile: Dockerfile

docker-compose.override.yml

This file contains our development configuration with Elasticsearch and Kibana:

    services:
      7_elasticsearch_vectorstore_semantickernel:
        environment:
          - ASPNETCORE_ENVIRONMENT=Development
          - ASPNETCORE_HTTP_PORTS=8080
          - ASPNETCORE_HTTPS_PORTS=8081
        ports:
          - "8080"
          - "8081"
        depends_on:
          - elasticsearch
          - kibana

      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:8.18.7
        container_name: elasticsearch
        environment:
          - discovery.type=single-node
          - xpack.security.enabled=false
          - bootstrap.memory_lock=true
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
          - network.host=0.0.0.0
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - esdata:/usr/share/elasticsearch/data
        ports:
          - "9200:9200"
          - "9300:9300"
        networks:
          - elk

      kibana:
        image: docker.elastic.co/kibana/kibana:8.18.7
        container_name: kibana
        environment:
          - ELASTICSEARCH_URL=http://elasticsearch:9200
        ports:
          - "5601:5601"
        networks:
          - elk
        depends_on:
          - elasticsearch

    networks:
      elk:
        driver: bridge

    volumes:
      esdata:
        name: elk_docker_esdata
        driver: local

 
Key Configuration Points:
 
  • Single-node discovery – Perfect for development environments
  • Security disabled – Simplifies local development (enable in production!)
  • Memory settings – Allocated 512MB heap size for Elasticsearch
  • Persistent storage – Data persists across container restarts via Docker volumes

 

Data Model

 We’ll work with speaker data, representing conference speakers with their biographies. The Speaker class uses Semantic Kernel’s vector store attributes:

    using Microsoft.Extensions.VectorData;

    namespace YourNamespace.Models
    {
        public class Speaker
        {
            [VectorStoreRecordKey]
            public string Id { get; set; }

            [VectorStoreRecordData]
            public string Name { get; set; }

            [VectorStoreRecordData]
            public string Bio { get; set; }

            [VectorStoreRecordData]
            public string WebSite { get; set; }

            [VectorStoreRecordVector(Dimensions: 1536)]
            public ReadOnlyMemory<float> DefinitionEmbedding { get; set; }
        }
    }
 
Understanding the Attributes:
 
  • [VectorStoreRecordKey] – Marks the unique identifier field
  • [VectorStoreRecordData] – Standard data fields that will be stored and retrieved
  • [VectorStoreRecordVector(Dimensions: 1536)] – Vector field with 1536 dimensions (matching Azure OpenAI’s text-embedding-ada-002 model output)
 
Connection and model settings are configured in your appsettings.json:
 
    {
        "ElasticSettings": {
            "Url": "http://localhost:9200",
            "Index": "speakers",
            "ApiKey": "",
            "MaxFetchSize": 1000
        },
        "AzureOpenAITextEmbeddingSettings": {
            "Model": "text-embedding-ada-002",
            "Endpoint": "https://your-resource.openai.azure.com/",
            "ApiKey": "your-api-key"
        }
    }
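The controller below depends on an ISearchSettings abstraction. A minimal binding sketch, assuming the section and property names mirror the JSON above (the ISearchSettings/SearchSettings shapes are assumptions, not part of the connector):

```csharp
// Sketch only: strongly typed settings mirroring the appsettings.json sections.
// Class shapes are assumptions; adjust to your project's conventions.
public class ElasticSettings
{
    public string Url { get; set; }
    public string Index { get; set; }
    public string ApiKey { get; set; }
    public int MaxFetchSize { get; set; }
}

public class AzureOpenAITextEmbeddingSettings
{
    public string Model { get; set; }
    public string Endpoint { get; set; }
    public string ApiKey { get; set; }
}

public interface ISearchSettings
{
    ElasticSettings ElasticSettings { get; }
    AzureOpenAITextEmbeddingSettings AzureOpenAITextEmbeddingSettings { get; }
}

public class SearchSettings : ISearchSettings
{
    public ElasticSettings ElasticSettings { get; set; }
    public AzureOpenAITextEmbeddingSettings AzureOpenAITextEmbeddingSettings { get; set; }
}

// Program.cs: bind each section and register the settings for injection.
var settings = new SearchSettings
{
    ElasticSettings = builder.Configuration
        .GetSection("ElasticSettings").Get<ElasticSettings>(),
    AzureOpenAITextEmbeddingSettings = builder.Configuration
        .GetSection("AzureOpenAITextEmbeddingSettings").Get<AzureOpenAITextEmbeddingSettings>()
};
builder.Services.AddSingleton<ISearchSettings>(settings);
```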

The Controller: Where the Magic Happens

The HomeController orchestrates all vector search operations. Let’s break down its key components:

Constructor and Dependency Injection

    public class HomeController : Controller
    {
        private readonly ILogger<HomeController> _logger;
        private readonly Kernel _kernel;
        private readonly IChatCompletionService _chatCompletionService;
        private readonly ITextGenerationService _textGenerationService;
        private readonly ITextEmbeddingGenerationService _textEmbeddingGenerationService;
        private readonly IVectorStoreRecordCollection<string, Speaker> _vectorStoreRecordCollection;
        private readonly ElasticsearchClient _elasticsearch;
        private readonly ISearchSettings _searchSettings;

        public HomeController(
            Kernel kernel,
            ElasticsearchClient elasticsearch,
            ISearchSettings searchSettings,
            ILogger<HomeController> logger)
        {
            _kernel = kernel;
            _chatCompletionService = _kernel.GetRequiredService<IChatCompletionService>();
            _textGenerationService = _kernel.GetRequiredService<ITextGenerationService>();
            _textEmbeddingGenerationService = _kernel.GetRequiredService<ITextEmbeddingGenerationService>();
            _vectorStoreRecordCollection = _kernel.GetRequiredService<IVectorStoreRecordCollection<string, Speaker>>();
            _elasticsearch = elasticsearch;
            _searchSettings = searchSettings;
            _logger = logger;
        }
    }
 
Key Services:
 
  • Kernel – Semantic Kernel’s core orchestration engine
  • ITextEmbeddingGenerationService – Generates vector embeddings from text
  • IVectorStoreRecordCollection<string, Speaker> – Abstraction for vector store operations
  • ElasticsearchClient – Direct Elasticsearch client for KNN queries
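None of these services resolve unless they are registered at startup. A hedged Program.cs sketch — the extension-method names follow the conventions of Microsoft.SemanticKernel and the Elastic connector in the package versions listed earlier, so verify the exact signatures against those packages and replace the placeholder endpoint and key:

```csharp
// Program.cs sketch – method names are assumptions based on Semantic Kernel
// connector conventions; check against the package versions referenced above.
var kernelBuilder = builder.Services.AddKernel();

// Azure OpenAI embedding service backing ITextEmbeddingGenerationService.
kernelBuilder.AddAzureOpenAITextEmbeddingGeneration(
    deploymentName: "text-embedding-ada-002",
    endpoint: "https://your-resource.openai.azure.com/",
    apiKey: "your-api-key");

// Low-level Elasticsearch client used for the KNN queries.
builder.Services.AddSingleton(
    new ElasticsearchClient(new ElasticsearchClientSettings(new Uri("http://localhost:9200"))));

// Vector store collection over the "speakers" index.
kernelBuilder.AddElasticsearchVectorStoreRecordCollection<string, Speaker>("speakers");
```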
 
Creating Embeddings from CSV Data
 

The embedding creation process reads speaker data from a CSV file, generates embeddings in batches, and stores them in Elasticsearch:

    [HttpPost]
    public async Task<IActionResult> CreateEmbedding()
    {
        _logger.LogInformation("Creating Index");
        await _vectorStoreRecordCollection.CreateCollectionIfNotExistsAsync();

        var speakers = (await System.IO.File.ReadAllLinesAsync("speakers.csv"))
            .Select(x => x.Split(';'));

        _logger.LogInformation("Creating Embedding from file chunk");
        foreach (var chunk in speakers.Chunk(25))
        {
            var descriptionEmbeddings = await _textEmbeddingGenerationService
                .GenerateEmbeddingsAsync(chunk.Select(x => x[2]).ToArray());

            for (var i = 0; i < chunk.Length; ++i)
            {
                var speaker = chunk[i];
                await _vectorStoreRecordCollection.UpsertAsync(new Speaker
                {
                    Id = speaker[0],
                    Name = speaker[1],
                    Bio = speaker[2],
                    WebSite = speaker[3],
                    DefinitionEmbedding = descriptionEmbeddings[i],
                });
            }
        }

        _logger.LogInformation("Embedding created");
        return RedirectToAction(nameof(Index), new { Message = "Embedding created" });
    }
 
Why Batch Processing?
 

Processing in chunks of 25 records:

  • Prevents API rate limiting
  • Reduces memory usage
  • Provides better error recovery
  • Improves overall performance
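Enumerable.Chunk (available since .NET 6) does the batching. A quick, self-contained illustration with the sample file’s 99 rows:

```csharp
using System;
using System.Linq;

// Chunk splits a sequence into arrays of at most the given size;
// 99 items in chunks of 25 yield batches of 25, 25, 25 and 24.
var rows = Enumerable.Range(1, 99);
var batches = rows.Chunk(25).ToArray();

Console.WriteLine(batches.Length);     // 4
Console.WriteLine(batches[^1].Length); // 24 – the final, partial batch
```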
 
Performing Vector Searches with KNN

When a user submits a search query, we:

  1. Convert the query to a vector embedding
  2. Use KNN to find similar vectors in Elasticsearch
  3. Return the most relevant results
    [HttpPost]
    public async Task<IActionResult> Index(SearchTerms terms)
    {
        var query = await _textEmbeddingGenerationService
            .GenerateEmbeddingsAsync(new[] { terms.Input });

        var response = await QueryVectorData(query[0]);

        if (response.ApiCallDetails.HasSuccessfulStatusCode)
        {
            terms.Response = response.Documents.Any()
                ? string.Join("\n\n", response.Documents.Select(d =>
                    $"{d.Name} ({d.WebSite}): {d.Bio}"))
                : "No results found.";
        }
        else
        {
            terms.Response = response.DebugInformation;
        }

        return View(terms);
    }

    private async Task<SearchResponse<Speaker>> QueryVectorData(ReadOnlyMemory<float> queryVector)
    {
        var response = await _elasticsearch.SearchAsync<Speaker>(s => s
            .Index(_searchSettings.ElasticSettings.Index)
            .Knn(k => k
                .Field(f => f.DefinitionEmbedding)
                .QueryVector(queryVector.ToArray())
                .k(5)                // Number of nearest neighbors to return
                .NumCandidates(10)   // Number of candidates to consider
            )
        );

        return response;
    }
 
Understanding KNN Parameters

The k-Nearest Neighbors algorithm is the heart of vector search. Let’s understand the parameters:

k (Number of Neighbors)
	.k(5)

Returns the top 5 most similar documents. This is your result set size.

Choosing the right k:
  • Small k (3-5) – Precise results, but might miss relevant documents
  • Medium k (10-20) – Balanced approach for most applications
  • Large k (50+) – Comprehensive results, but may include less relevant matches

NumCandidates

	.NumCandidates(10)

The number of approximate nearest neighbor candidates to consider per shard.

Best Practice: Set NumCandidates to at least 2x your k value for better accuracy.

Why it matters:
  • Higher values = more accurate results but slower queries
  • Lower values = faster queries but potentially less accurate
  • Elasticsearch uses HNSW (Hierarchical Navigable Small World) algorithm for efficient approximate nearest neighbor search
 
 
How Vector Search Works

Let’s walk through a complete search example:

User Query: “AI researcher specializing in natural language processing”

Step 1: Query Vectorization

    var query = await _textEmbeddingGenerationService
        .GenerateEmbeddingsAsync(new[] { terms.Input });

The text query is converted to a 1536-dimensional vector: [0.0234, -0.0156, 0.0891, ...]

Step 2: KNN Search

    .Knn(k => k
        .Field(f => f.DefinitionEmbedding)
        .QueryVector(queryVector.ToArray())
        .k(5)
        .NumCandidates(10)
    )

Elasticsearch finds the 5 speaker bios whose embedding vectors are closest to the query vector using cosine similarity.
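“Closest” here means highest cosine similarity, the metric configured on the dense_vector field. A self-contained sketch of the computation:

```csharp
using System;

// Cosine similarity: dot product divided by the product of the magnitudes.
// 1.0 means the vectors point the same way; 0.0 means they are orthogonal.
static double CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

Console.WriteLine(CosineSimilarity(new float[] { 1, 0 }, new float[] { 1, 0 })); // 1
Console.WriteLine(CosineSimilarity(new float[] { 1, 0 }, new float[] { 0, 1 })); // 0
```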

Step 3: Results

The system returns speakers whose biographies semantically match the query, even if they don’t contain the exact keywords:

Alexandra Chen (https://alexchen.dev): AI researcher specializing in natural language processing and machine learning. With over 12 years of experience…

Dr. Sarah Johnson (https://sarahjohnson.ai): Machine learning engineer focusing on transformer architectures and deep learning…

Semantic Search vs. Traditional Search

Traditional Keyword Search

    Query: "NLP researcher"

    Matches: Only documents containing "NLP" or "researcher"

    Misses: "natural language processing expert", "computational linguistics specialist"

Vector Semantic Search

    Query: "NLP researcher"

    Matches: Documents about:

    - Natural language processing

    - Computational linguistics

    - Text analysis and understanding

    - Machine learning for language

Vector search understands meaning and context, not just exact keyword matches.

Running the Application
 

Step 1: Start Docker Containers

    docker-compose up -d

This starts:

  • Elasticsearch on http://localhost:9200
  • Kibana on http://localhost:5601
  • Your MVC application on http://localhost:8080

Step 2: Verify Elasticsearch

Visit http://localhost:9200 – you should see:

    {
        "name" : "elasticsearch",
        "cluster_name" : "docker-cluster",
        "version" : {
            "number" : "8.18.7"
        }
    }

Step 3: Create Embeddings

  1. Navigate to the application (http://localhost:8080 when running under Docker Compose, or http://localhost:7026 when debugging from Visual Studio)
  2. Click the “CreateEmbedding” button
  3. Wait for the process to complete (the sample file’s 99 speakers are processed in batches)

Step 4: Perform Searches

Enter semantic queries like:

  • “cloud architect with Azure experience”
  • “frontend developer specializing in React”
  • “someone who knows GraphQL”

 
Monitoring with Kibana
 

Kibana (http://localhost:5601) provides powerful tools to inspect your data:

View Your Index

  1. Open http://localhost:5601 in your browser, go to Management => Stack Management => Index Management to confirm the speaker_vector_index index exists, then open Dev Tools => Console
  2. Run:

    GET /speaker_vector_index/_search
    {
        "size": 10
    }

Inspect Vector Embeddings

    GET /speaker_vector_index/_search
    {
        "query": {
            "match_all": {}
        },
        "_source": ["name", "bio", "definitionEmbedding"]
    }

Check Index Mapping

GET /speaker_vector_index/_mapping

You should see the vector field configuration:

    {
        "definitionEmbedding": {
            "type": "dense_vector",
            "dims": 1536,
            "index": true,
            "similarity": "cosine"
        }
    }
Performance Optimization Tips

1. Batch Size Tuning

    foreach (var chunk in speakers.Chunk(25))

Adjust chunk size based on:

  • API rate limits
  • Available memory
  • Network latency
2. KNN Parameter Optimization

    .k(5)              // Increase for more results
    .NumCandidates(10) // Increase for better accuracy

Benchmark different values:

  • Test with your actual data
  • Monitor query latency
  • Balance accuracy vs. speed
3. Index Optimization

Consider adding filters to narrow search scope:

    .Knn(k => k
        .Field(f => f.DefinitionEmbedding)
        .QueryVector(queryVector.ToArray())
        .k(5)
        .NumCandidates(10)
        .Filter(f => f
            .Term(t => t.Field(speaker => speaker.Category).Value("Developer"))
        )
    )
4. Caching Strategies

Implement caching for:

  • Frequently searched queries
  • User session embeddings
  • Popular results
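For the first item, a hedged sketch using ASP.NET Core’s IMemoryCache to avoid re-embedding repeated queries (the cache key scheme and the 30-minute TTL are arbitrary assumptions):

```csharp
// Sketch: cache query embeddings so repeated searches skip the
// Azure OpenAI round trip. Key scheme and TTL are arbitrary choices.
private readonly IMemoryCache _cache; // injected via the constructor

private async Task<ReadOnlyMemory<float>> GetQueryEmbeddingAsync(string input)
{
    return await _cache.GetOrCreateAsync($"embedding:{input}", async entry =>
    {
        entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30);
        var embeddings = await _textEmbeddingGenerationService
            .GenerateEmbeddingsAsync(new[] { input });
        return embeddings[0];
    });
}
```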
Common Issues and Solutions
 

Issue 1: Connection Refused

Error: Cannot connect to Elasticsearch at localhost:9200

Solution:

    # Check if containers are running
    docker ps

    # Check Elasticsearch logs
    docker logs elasticsearch

    # Ensure ports are not in use
    netstat -an | findstr "9200"

Issue 2: Out of Memory

Error: Elasticsearch container crashes

Solution: Increase Docker memory allocation or adjust heap size:

    environment:
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"

 

Issue 3: Slow Embedding Generation

Symptoms: CreateEmbedding takes too long

Solutions:

  • Reduce batch size
  • Increase Azure OpenAI rate limits
  • Process in parallel (carefully!)
  • Cache embeddings locally
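For the “process in parallel (carefully!)” option, one approach is bounded concurrency with SemaphoreSlim, so in-flight requests stay under your Azure OpenAI rate limit — a sketch only, and the concurrency cap of 3 is an arbitrary assumption:

```csharp
// Sketch: embed batches concurrently but cap in-flight requests so
// Azure OpenAI rate limits aren't tripped. MaxConcurrency is arbitrary.
const int MaxConcurrency = 3;
using var gate = new SemaphoreSlim(MaxConcurrency);

var tasks = speakers.Chunk(25).Select(async chunk =>
{
    await gate.WaitAsync();
    try
    {
        var embeddings = await _textEmbeddingGenerationService
            .GenerateEmbeddingsAsync(chunk.Select(x => x[2]).ToArray());
        // ...upsert the chunk as in CreateEmbedding above...
    }
    finally
    {
        gate.Release();
    }
});

await Task.WhenAll(tasks);
```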

Advanced Scenarios

Hybrid Search (Vector + Keyword)

Combine semantic and traditional search:

    var response = await _elasticsearch.SearchAsync<Speaker>(s => s
        .Index(_searchSettings.ElasticSettings.Index)
        .Query(q => q
            .Bool(b => b
                .Should(
                    sh => sh.Knn(k => k
                        .Field(f => f.DefinitionEmbedding)
                        .QueryVector(queryVector.ToArray())
                        .k(5)
                        .NumCandidates(10)
                        .Boost(0.7)
                    ),
                    sh => sh.Match(m => m
                        .Field(f => f.Bio)
                        .Query(searchText)
                        .Boost(0.3)
                    )
                )
            )
        )
    );

Multi-Vector Search

Search across multiple embedding fields:

    public class Speaker
    {
        [VectorStoreRecordVector(Dimensions: 1536)]
        public ReadOnlyMemory<float> BioEmbedding { get; set; }

        [VectorStoreRecordVector(Dimensions: 1536)]
        public ReadOnlyMemory<float> SkillsEmbedding { get; set; }
    }

Filtered Vector Search

Add business logic filters:

    .Knn(k => k
        .Field(f => f.DefinitionEmbedding)
        .QueryVector(queryVector.ToArray())
        .k(5)
        .Filter(f => f
            .Range(r => r
                .Field(speaker => speaker.YearsExperience)
                .Gte(5)
            )
        )
    )

 
 

Summary

This guide combined:

  • Semantic Kernel – AI orchestration framework
  • Elasticsearch – High-performance vector database
  • Azure OpenAI – State-of-the-art embeddings
  • KNN Algorithm – Efficient similarity search
  • Docker – Containerized deployment

Key Takeaways

  1. Vector embeddings transform text into mathematical representations that capture semantic meaning
  2. KNN search finds similar vectors efficiently using algorithms like HNSW
  3. Semantic Kernel abstracts complexity while providing flexibility
  4. Elasticsearch provides production-grade vector search with powerful querying capabilities
  5. Batch processing and proper configuration are essential for performance

Next Steps

  • Experiment with different embedding models
  • Implement hybrid search strategies