Semantic Kernel and In-Memory Vector Store
Vector databases have revolutionized how we search and retrieve information. Instead of exact keyword matching, they enable semantic search—understanding the meaning behind queries. In this article, I’ll walk you through building a practical semantic search application using Microsoft’s Semantic Kernel and its in-memory vector store. You can clone the sample code repository by following this link.
What We’re Building
We’re creating a motivational speaker search engine that understands natural language queries. When a user searches for “financial advice expert,” it intelligently returns speakers like Dave Ramsey and Suze Orman, even though those exact words aren’t in their bios.
The Power of Vector Embeddings
Traditional databases match exact text. Vector databases convert text into numerical representations (embeddings) that capture semantic meaning. Similar concepts cluster together in vector space, enabling intelligent, context-aware searches.
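To make the clustering idea concrete, here is a minimal sketch of cosine similarity, the measure typically used to compare embedding vectors. The tiny three-dimensional vectors are invented for illustration; real embeddings from OpenAI have 1536 dimensions.
// Cosine similarity: values near 1 mean the vectors point in almost the same direction (similar meaning).
static float CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    float dot = 0, magA = 0, magB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(magA) * MathF.Sqrt(magB));
}

float[] financeBio = { 0.82f, 0.10f, 0.05f };   // "helps families get out of debt"
float[] financeQuery = { 0.79f, 0.15f, 0.02f }; // "financial advice expert"
float[] fitnessBio = { 0.05f, 0.90f, 0.30f };   // "marathon coach and trainer"
Console.WriteLine(CosineSimilarity(financeQuery, financeBio)); // high score: concepts sit close together
Console.WriteLine(CosineSimilarity(financeQuery, fitnessBio)); // low score: unrelated concepts sit far apart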
Prerequisites
Before diving in, ensure you have:
- .NET 8 SDK or later
- OpenAI API key
- Microsoft.SemanticKernel NuGet package
- Microsoft.SemanticKernel.Connectors.InMemory package
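Both NuGet packages can be added from the command line. At the time of writing the in-memory connector ships as a preview, so the --prerelease flag may be required:
dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.SemanticKernel.Connectors.InMemory --prerelease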
The Data Model
Our Speaker model uses vector store attributes to define how data is stored and searched:
public class Speaker
{
    [VectorStoreRecordKey]
    public ulong Id { get; set; }

    [VectorStoreRecordData]
    public string Name { get; set; }

    [VectorStoreRecordData]
    public string Bio { get; set; }

    [VectorStoreRecordData]
    public string WebSite { get; set; }

    [VectorStoreRecordVector(Dimensions: 1536)]
    public ReadOnlyMemory<float> DefinitionEmbedding { get; set; }
}
The VectorStoreRecordKey attribute marks the unique identifier. VectorStoreRecordData attributes tag standard data fields, while VectorStoreRecordVector designates the embedding field. The 1536 dimensions match OpenAI’s text-embedding-ada-002 model output size.
Setting Up Semantic Kernel
Semantic Kernel orchestrates AI services. We configure it with both chat completion and embedding generation:
var builder = Kernel.CreateBuilder();
builder.Services.AddLogging(b => b.AddConsole().SetMinimumLevel(LogLevel.Trace));
builder.AddOpenAIChatCompletion(model, apiKey);
builder.AddOpenAITextEmbeddingGeneration(embedding, apiKey);
var kernel = builder.Build();
This setup registers OpenAI services with dependency injection, making them available throughout our application.
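Once the kernel is built, any registered service can be resolved wherever it is needed. A minimal sketch (the variable names are mine, not from the sample):
// Resolve the OpenAI services registered above from the kernel's service provider.
var chatService = kernel.GetRequiredService<IChatCompletionService>();
var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();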
Creating the Vector Store
The in-memory vector store provides a lightweight, ephemeral database perfect for prototyping and small datasets:
var vectorStore = new InMemoryVectorStore();
var collection = vectorStore.GetCollection<ulong, Speaker>("skspeakers");
await collection.CreateCollectionIfNotExistsAsync();
We specify the key type (ulong) and record type (Speaker). The collection name “skspeakers” identifies our dataset within the store.
Generating Embeddings
Each speaker’s bio needs conversion to a vector embedding. We process them concurrently for efficiency:
var textEmbeddingGenerationService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
var tasks = _speakers.Select(entry => Task.Run(async () =>
{
    entry.DefinitionEmbedding = await textEmbeddingGenerationService.GenerateEmbeddingAsync(entry.Bio);
}));
await Task.WhenAll(tasks);
The GenerateEmbeddingAsync method sends each bio to OpenAI’s embedding API, returning a 1536-dimensional vector that captures the semantic essence of the text.
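As a quick sanity check (purely illustrative, with a made-up bio string), you can generate a single embedding and confirm its length matches the dimensions declared on the model:
// One-off embedding call; the length should match the Dimensions value on DefinitionEmbedding.
ReadOnlyMemory<float> sample =
    await textEmbeddingGenerationService.GenerateEmbeddingAsync("Helps families budget, save, and get out of debt.");
Console.WriteLine($"Embedding length: {sample.Length}"); // 1536 for text-embedding-ada-002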
Upserting Records
Once embeddings are generated, we insert the records into our vector store:
await foreach (var key in collection.UpsertBatchAsync(_speakers))
{
    Console.WriteLine(key);
}
The UpsertBatchAsync method efficiently handles bulk insertions, creating new records or updating existing ones based on the key.
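Because the key drives upsert behavior, the same key can be used to read a record back. A short sketch, assuming the sample speakers were stored with sequential ulong keys (adjust the key to match your data):
// Fetch a single speaker by key; returns null if no record with that key exists.
var speaker = await collection.GetAsync(1UL);
Console.WriteLine(speaker?.Name);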
Performing Semantic Search
The search process mirrors embedding generation. We convert the user’s query into a vector and find the nearest neighbors:
string searchString = Console.ReadLine() ?? string.Empty;
var searchVector = await textEmbeddingGenerationService.GenerateEmbeddingAsync(searchString);
var searchResult = await collection.VectorizedSearchAsync(searchVector);
await foreach (var result in searchResult.Results)
{
    Console.WriteLine($"Search score: {result.Score}");
    Console.WriteLine($"Name: {result.Record.Name}");
    Console.WriteLine($"Bio: {result.Record.Bio}");
    Console.WriteLine($"WebSite: {result.Record.WebSite}");
}
The VectorizedSearchAsync method calculates similarity scores between the query vector and stored embeddings, returning the most semantically relevant matches.
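If you want tighter control over how many matches come back, the search call also accepts a VectorSearchOptions instance. A sketch, noting that option names have shifted between preview releases of the vector store abstractions:
// Ask for only the three closest speakers instead of the default result count.
var topResults = await collection.VectorizedSearchAsync(
    searchVector,
    new VectorSearchOptions { Top = 3 });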
Understanding Search Scores
Search scores indicate similarity between the query and results. Higher scores mean stronger semantic matches. A query for “financial expert” returns Dave Ramsey and Suze Orman with high scores because their bios emphasize financial topics.
Real-World Applications
This pattern extends far beyond speaker searches. Consider these scenarios:
Customer Support: Build a knowledge base where support agents find relevant articles by describing problems naturally, not hunting for exact keywords.
Document Discovery: Enable employees to find internal documents by asking questions in plain language, dramatically improving information retrieval in large organizations.
Product Recommendations: Match user queries to products based on semantic similarity, understanding intent rather than just matching product names.
Code Search: Help developers find relevant code snippets or modules by describing functionality, even when variable names and comments differ.
Production Considerations
While the in-memory vector store excels for development and testing, production systems require persistent storage. Consider migrating to:
Azure AI Search: Microsoft’s managed search service with native vector support, excellent for Azure-based applications.
Qdrant or Weaviate: Dedicated vector databases offering high performance, scalability, and advanced filtering capabilities.
PostgreSQL with pgvector: Leverage your existing PostgreSQL infrastructure with vector search extensions.
The beauty of Semantic Kernel is that swapping vector store implementations requires minimal code changes—the core logic remains identical.
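For instance, moving from the in-memory store to Qdrant is essentially a change at the point where the store is constructed. A sketch, assuming the Microsoft.SemanticKernel.Connectors.Qdrant package and a Qdrant instance running locally:
// Development: ephemeral, in-process storage.
IVectorStore vectorStore = new InMemoryVectorStore();

// Production (hypothetical): persistent Qdrant storage; the collection, upsert, and search code stays the same.
// IVectorStore vectorStore = new QdrantVectorStore(new QdrantClient("localhost"));

var collection = vectorStore.GetCollection<ulong, Speaker>("skspeakers");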
Conclusion
Semantic Kernel’s in-memory vector store provides an elegant foundation for semantic search applications. The combination of intuitive APIs, flexible configuration, and seamless integration with OpenAI’s embedding models makes building intelligent search experiences straightforward.
This speaker search example demonstrates core concepts applicable to countless scenarios. Whether you’re building customer support tools, document management systems, or recommendation engines, vector search powered by Semantic Kernel offers a powerful solution.
The in-memory implementation is perfect for learning, prototyping, and small-scale applications. As your needs grow, the same code patterns scale to production-grade vector databases with minimal refactoring.
Start experimenting with semantic search today—your users will appreciate the intuitive, intelligent search experiences that understand what they mean, not just what they type.
