Microsoft Semantic Kernel
Overview
Semantic Kernel (SK) is an open-source SDK developed by Microsoft for orchestrating large language models with external code, memory, planning, and plugins. The framework enables building AI applications that combine LLM capabilities with traditional programming logic, external data access, and multi-step task execution. SK provides native support for C#, Python, and Java, with deep integration into Microsoft’s AI ecosystem.
The framework addresses the problem of bridging LLM capabilities with deterministic code execution, external API calls, and complex workflow orchestration through a kernel-centric architecture where a central orchestrator manages AI services, plugins, memory, and execution flow.
Key technical components covered:
- Kernel architecture and orchestration
- Function types (semantic vs native)
- Plugin system and function calling
- Memory architecture
- Planning and execution patterns
- Prompt template syntax
- Version history and deprecations
Kernel Architecture
The Kernel serves as the central orchestrator and dependency injection container for all AI operations. When a task is invoked, the Kernel executes the following pipeline:
- Selects the appropriate AI service based on configuration
- Constructs prompts using templates and parameters
- Sends prompts to the AI service
- Parses responses
- Returns results to the application
- Runs registered middleware hooks around these steps for logging, monitoring, and responsible AI practices
The Kernel manages registration of AI services (OpenAI, Azure OpenAI, HuggingFace), plugins (collections of functions), memory stores (vector databases for semantic search), and execution filters (cross-cutting concerns like logging and telemetry).
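The pipeline above can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the Semantic Kernel API: `ToyKernel`, `add_service`, `invoke`, and `logging_filter` are hypothetical names, the AI service is a stand-in lambda, and `str.format` replaces SK's own template syntax.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an AI service: takes a prompt, returns raw text.
AIService = Callable[[str], str]
# A filter receives the prompt and the next step, mirroring a middleware hook.
Filter = Callable[[str, Callable[[str], str]], str]


def _wrap(flt: Filter, nxt: Callable[[str], str]) -> Callable[[str], str]:
    return lambda prompt: flt(prompt, nxt)


@dataclass
class ToyKernel:
    services: Dict[str, AIService] = field(default_factory=dict)
    filters: List[Filter] = field(default_factory=list)
    default_service: str = ""

    def add_service(self, name: str, service: AIService) -> None:
        self.services[name] = service
        if not self.default_service:
            self.default_service = name  # first registered service is the default

    def invoke(self, template: str, service_id: str = "", **params: str) -> str:
        # 1. Select the AI service based on configuration.
        service = self.services[service_id or self.default_service]
        # 2. Construct the prompt from the template and parameters.
        prompt = template.format(**params)

        # 3. Send the prompt and 4. parse the response (here: trim whitespace).
        def call_service(p: str) -> str:
            return service(p).strip()

        # Wrap the call in filters; the first registered filter runs outermost.
        pipeline = call_service
        for flt in reversed(self.filters):
            pipeline = _wrap(flt, pipeline)
        # 5. Return the result to the application.
        return pipeline(prompt)


log: List[str] = []


def logging_filter(prompt: str, next_step: Callable[[str], str]) -> str:
    log.append(f"before: {prompt}")
    result = next_step(prompt)
    log.append(f"after: {result}")
    return result


kernel = ToyKernel()
kernel.add_service("echo", lambda prompt: f"  ECHO: {prompt}  ")
kernel.filters.append(logging_filter)
result = kernel.invoke("Summarize: {text}", text="hello")
print(result)  # ECHO: Summarize: hello
```

Keeping the filter chain separate from the service call is what lets logging or content checks be added without touching the functions themselves.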
Function Types: Semantic vs Native
Semantic Functions are defined using natural language prompts in text files (typically skprompt.txt). These prompts guide LLM behavior to generate responses based on user input. The LLM processes the prompt template to produce output.
Example semantic function:
# skprompt.txt
Generate a creative excuse for the following event:
Event: {{$input}}
Excuse:
Native Functions are implemented in traditional programming languages (C#, Python, Java). They perform specific computational tasks with full control over logic and performance. Native functions enable precise operations, system interaction, and deterministic execution.
Example native function:
def add_numbers(a: float, b: float) -> float:
    """Return the sum of two numbers; plain code, no LLM involved."""
    return a + b
Key differences: Semantic functions use prompt engineering for LLM-based generation and natural language tasks. Native functions use conventional code for computational tasks, data processing, and external system interaction. Both types can be organized into plugins and invoked through the same function calling mechanism.
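One way to picture how both function types sit behind the same invocation mechanism is the sketch below. It is illustrative only: `fake_llm` and `make_semantic_function` are hypothetical helpers, no model is called, and only `{{$variable}}` substitution is handled.

```python
from typing import Callable, Dict


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; a real one would hit an AI service."""
    return f"<completion for: {prompt!r}>"


def make_semantic_function(template: str, llm: Callable[[str], str]) -> Callable[..., str]:
    """Wrap a prompt template so it can be called like ordinary code."""
    def run(**arguments: str) -> str:
        prompt = template
        for name, value in arguments.items():
            prompt = prompt.replace("{{$" + name + "}}", value)
        return llm(prompt)
    return run


def add_numbers(a: float, b: float) -> float:
    """Native function: deterministic, no model involved."""
    return a + b


# Both kinds of function live in one registry and are invoked the same way.
functions: Dict[str, Callable] = {
    "excuse": make_semantic_function(
        "Generate a creative excuse for the following event:\n"
        "Event: {{$input}}\nExcuse:",
        fake_llm,
    ),
    "add_numbers": add_numbers,
}

excuse = functions["excuse"](input="missed the standup")
total = functions["add_numbers"](a=2, b=3)
print(total)  # 5
```

The caller does not need to know which entry is backed by a prompt and which by code, which is the property the plugin system relies on.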
Plugin System and Function Calling
Plugins (formerly called “skills”) are collections of functions packaged together to expose external APIs and capabilities. Plugins enable AI models to interact with external systems, access data, trigger workflows, and automate tasks beyond built-in LLM capabilities.
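To create a plugin, define a class and mark each method you want to expose with the KernelFunction attribute in C# (the kernel_function decorator in Python), adding a description the model can use to decide when to call it. Register the plugin with the kernel to make its functions available for invocation.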
Function calling is the mechanism enabling plugin operation. When an AI model recognizes the need for a specific function, it requests it by name. The Kernel routes the request to the corresponding function within the plugin and returns results to the LLM to form the final response.
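The routing step described above can be sketched without any SDK. Everything here is an assumption made for illustration: `kernel_function_sketch` only imitates the idea of a marker with a description, `ToyRouter` is hypothetical, the `plugin-function` naming is an arbitrary choice, and the JSON request stands in for what a model would emit.

```python
import json
from typing import Any, Callable, Dict


def kernel_function_sketch(description: str) -> Callable:
    """Hypothetical decorator: marks a method as callable and stores its description."""
    def mark(fn: Callable) -> Callable:
        fn._sk_description = description
        return fn
    return mark


class WeatherPlugin:
    @kernel_function_sketch("Get the forecast for a city.")
    def get_forecast(self, city: str) -> str:
        return f"Sunny in {city}"  # canned data for illustration


class ToyRouter:
    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., Any]] = {}

    def add_plugin(self, plugin: object, plugin_name: str) -> None:
        # Register every marked method under "<plugin>-<function>".
        for attr in dir(plugin):
            member = getattr(plugin, attr)
            if callable(member) and hasattr(member, "_sk_description"):
                self._functions[f"{plugin_name}-{attr}"] = member

    def describe(self) -> Dict[str, str]:
        # What would be advertised to the model so it can pick a function.
        return {name: fn._sk_description for name, fn in self._functions.items()}

    def route(self, tool_call_json: str) -> str:
        # The model requests a function by name; look it up and run it.
        call = json.loads(tool_call_json)
        fn = self._functions[call["name"]]
        return str(fn(**call["arguments"]))


router = ToyRouter()
router.add_plugin(WeatherPlugin(), "weather")
request = '{"name": "weather-get_forecast", "arguments": {"city": "Schio"}}'
answer = router.route(request)
print(answer)  # Sunny in Schio
```

In a real exchange the returned string would be sent back to the model as the function result so it can compose the final response.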
The framework aligns with the OpenAI plugin specification, allowing plugins to be utilized across various AI services and ensuring compatibility with standard function calling protocols.
Memory Architecture
The memory system provides semantic memory capabilities using vector embeddings for storing and retrieving information. The architecture comprises three components:
Semantic Text Memory provides a high-level interface for memory operations, coordinating embedding generation and storage. It exposes methods for saving, retrieving, and searching information by semantic similarity.
Memory Stores persist vector embeddings and their associated metadata and support efficient similarity search. Implementations include in-memory storage, Azure AI Search, Qdrant, Pinecone, Redis, PostgreSQL, Chroma, and Weaviate.
Embedding Generation Services convert text to numerical vector representations. Implementations include OpenAI Embeddings, Azure OpenAI Embeddings, Hugging Face Embeddings, and ONNX Embeddings.
The memory system enables agents to remember prior interactions and recall relevant external knowledge dynamically during dialogue or workflow execution using Retrieval-Augmented Generation (RAG) techniques.
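The three components can be illustrated with a small stand-alone sketch. It makes strong simplifying assumptions: `embed` is a bag-of-words counter rather than a real embedding service, `ToyMemory` is a hypothetical in-memory store, and retrieval is a linear scan with cosine similarity.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real service returns a dense float vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyMemory:
    """Plays the role of the text-memory interface over an in-memory store."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[str, Counter]] = {}

    def save(self, key: str, text: str) -> None:
        self._store[key] = (text, embed(text))

    def search(self, query: str, limit: int = 1) -> List[Tuple[str, float]]:
        q = embed(query)
        scored = [(text, cosine(q, vec)) for text, vec in self._store.values()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:limit]


memory = ToyMemory()
memory.save("doc1", "The kernel registers AI services and plugins")
memory.save("doc2", "Qdrant and Redis can persist vector embeddings")

# Retrieval-augmented prompt: fetch the closest record, then place it in the prompt.
hits = memory.search("where are vector embeddings persisted", limit=1)
context = hits[0][0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: where are embeddings stored?"
print(context)  # Qdrant and Redis can persist vector embeddings
```

Swapping the store or the embedding function leaves `save` and `search` unchanged, which is the separation of concerns the architecture describes.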
Planning and Orchestration Patterns
SK supports multiple orchestration patterns for coordinating agents and plugins:
Sequential Orchestration executes tasks in a defined order, with each step depending on the previous one. Each task's output serves as context for the next task.
Concurrent Orchestration executes multiple tasks in parallel with results aggregated upon completion.
Group Chat Orchestration enables agents to participate in collaborative conversations coordinated by a group manager.
Handoff Orchestration dynamically passes control between agents based on context or rules.
Magentic Orchestration allows flexible, general-purpose multi-agent collaboration inspired by the Magentic-One pattern.
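The first two patterns, sequential and concurrent, are simple enough to sketch with `asyncio`. The agents here are hypothetical coroutines that only decorate their input; `make_agent`, `run_sequential`, and `run_concurrent` are illustrative names, not SK classes.

```python
import asyncio
from typing import Awaitable, Callable, List

Agent = Callable[[str], Awaitable[str]]


def make_agent(name: str) -> Agent:
    async def run(task: str) -> str:
        await asyncio.sleep(0)  # yield control, standing in for a model call
        return f"{name}({task})"
    return run


async def run_sequential(agents: List[Agent], task: str) -> str:
    # Each agent receives the previous agent's output as its input.
    result = task
    for agent in agents:
        result = await agent(result)
    return result


async def run_concurrent(agents: List[Agent], task: str) -> List[str]:
    # All agents receive the same task; results are gathered in agent order.
    return list(await asyncio.gather(*(agent(task) for agent in agents)))


agents = [make_agent("draft"), make_agent("review")]
sequential_result = asyncio.run(run_sequential(agents, "spec"))
concurrent_result = asyncio.run(run_concurrent(agents, "spec"))
print(sequential_result)  # review(draft(spec))
print(concurrent_result)  # ['draft(spec)', 'review(spec)']
```

Group chat, handoff, and Magentic orchestration add a coordinator that decides which agent acts next, which this sketch leaves out.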
Traditional planners such as the Function Calling Stepwise Planner and the Handlebars Planner create sequences of function calls to achieve complex tasks. However, as of version 1.15.0-preview (June 2024), the OpenAI and Handlebars planners are deprecated. The framework now uses function calling as the primary method for planning and executing tasks, reducing reliance on dedicated planners as LLM capabilities have advanced.
Prompt Template Syntax
SK provides a template language for defining prompts with embedded logic:
Variables use {{$variableName}} syntax:
Hello {{$name}}, welcome to Semantic Kernel!
Function calls use {{namespace.functionName}} syntax:
The weather today is {{weather.getForecast}}.
Function parameters pass variables or hardcoded values:
The weather in {{$city}} is {{weather.getForecast $city}}.
The weather in Schio is {{weather.getForecast "Schio"}}.
Special characters are escaped using quotes:
{{ "{{" }} and {{ "}}" }} are special SK sequences.
{{ functionName "one 'quoted' word" }}
{{ "quotes' \"escaping\" example" }}
The $input variable is set automatically by the kernel when invoking functions, serving as default input for operations.
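To make the substitution rules concrete, here is a minimal renderer for a subset of the syntax. It is a sketch, not SK's template engine: `render` and `get_forecast` are hypothetical, escape sequences such as `{{ "{{" }}` are not handled, and falling back to `input` when a function call has no argument is an assumption of this example.

```python
import re
from typing import Callable, Dict

# Matches one {{ ... }} block; the body is captured without surrounding spaces.
TOKEN = re.compile(r"\{\{\s*(.+?)\s*\}\}")


def render(template: str, variables: Dict[str, str],
           functions: Dict[str, Callable[[str], str]]) -> str:
    def resolve_value(token: str) -> str:
        if token.startswith("$"):  # variable reference, e.g. $city
            return variables[token[1:]]
        if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
            return token[1:-1]  # hardcoded quoted value
        raise ValueError(f"unsupported value: {token}")

    def replace(match: re.Match) -> str:
        body = match.group(1)
        if body.startswith("$"):  # {{$variableName}}
            return variables[body[1:]]
        name, _, arg = body.partition(" ")  # {{namespace.functionName arg}}
        arg = arg.strip()
        # Assumption: with no explicit argument, pass the "input" variable.
        value = resolve_value(arg) if arg else variables.get("input", "")
        return functions[name](value)

    return TOKEN.sub(replace, template)


def get_forecast(city: str) -> str:
    return f"sunny in {city}"  # canned data for illustration


functions = {"weather.getForecast": get_forecast}
greeting = render("Hello {{$name}}, welcome to Semantic Kernel!", {"name": "Ada"}, {})
by_variable = render("The weather in {{$city}} is {{weather.getForecast $city}}.",
                     {"city": "Schio"}, functions)
by_literal = render('The weather in Schio is {{weather.getForecast "Schio"}}.', {}, functions)
by_input = render("The weather today is {{weather.getForecast}}.", {"input": "Rome"}, functions)
print(by_variable)  # The weather in Schio is sunny in Schio.
```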
Version History and Deprecations
Version 1.0.0-rc4 (December 2023) was released as the final release candidate before the stable 1.0.0 release.
Version 1.10.0-preview (April 2024) deprecated Kernel Events in favor of the function filtering mechanism, which provides greater flexibility and control over function execution.
Version 1.15.0-preview (June 2024) deprecated the OpenAI and Handlebars planners, encouraging a transition to function calling as LLM capabilities matured.
Version 1.24.1-preview (October 2024) deprecated the Markdown package due to lack of usage.
The 2025 roadmap focuses on moving Agent Framework 1.0 from preview to general availability by the end of Q1 2025, with an emphasis on production-grade stability. Integration with AutoGen is intended to enable agent migration and management within the SK ecosystem. Support for Azure AI Foundry Agents Service and other agent frameworks broadens the ecosystem.
Comparison with LangChain
Semantic Kernel uses a kernel-centric design with centralized configuration, in which all components register with a single kernel instance. It provides type safety across C#, Python, and Java, reducing runtime errors, and emphasizes structured planning with a plugin architecture and deep Microsoft ecosystem integration. It targets enterprise applications that require reliability and strong typing.
LangChain uses a compositional architecture in which chains connect individual components for flexible composition. It provides 500+ pre-built integrations for data sources and services, and emphasizes rapid prototyping with a Python-centric design and extensive community-driven integrations. It targets research environments and experimental projects that require flexibility and quick iteration.