A Broader View of AI Integration
In recent years, AI agents have been integrated into a growing number of platforms. Companies like Apple, Microsoft, and Google are embedding AI-driven features into their ecosystems, changing how users interact with software and automating complex tasks.
At Apple’s Worldwide Developers Conference (WWDC) 2024, Apple announced a partnership with OpenAI to integrate ChatGPT into iOS, iPadOS, and macOS. This collaboration enhances Siri’s capabilities and embeds AI-driven features into everyday tasks.
Similarly, Microsoft has been pioneering the use of AI agents with its Copilot features across various applications. Microsoft’s Semantic Kernel offers developers a toolkit for combining large language models (LLMs) from providers such as OpenAI, Azure OpenAI, and Hugging Face with conventional code written in C#, Python, or Java, simplifying the creation of sophisticated AI-driven applications.
“Agents are smarter. They’re proactive—capable of making suggestions before you ask for them. They accomplish tasks across applications. They improve over time because they remember your activities and recognize intent and patterns in your behavior. Based on this information, they offer to provide what they think you need, although you will always make the final decisions.”
Bill Gates
How Can We Build Our Own AI Agents?
To build our own AI agents, we turn to Semantic Kernel, an open-source SDK provided by Microsoft. Semantic Kernel allows developers to integrate LLMs with conventional programming languages, providing a robust framework to build intelligent agents capable of complex tasks.
What is Semantic Kernel?
Semantic Kernel (SK) is a highly extensible, open-source SDK designed to integrate large language models (LLMs) with traditional programming languages such as C#, Python, and Java. It supports various AI models, including those from OpenAI, Azure OpenAI, and Hugging Face, offering tools for prompt engineering, function chaining, and plugin orchestration. This flexibility makes it suitable for a wide range of AI applications.
Core Components of Semantic Kernel:
- Plugins: Encapsulate capabilities into single units of functionality that can be leveraged across different services like ChatGPT, Bing, and Microsoft 365. A plugin can contain semantic functions (prompts) and native functions (code), which makes it reusable across applications.
- Planners: Use AI to generate the sequence of steps required to fulfill a request. A planner mixes and matches the registered plugins into a plan that is then executed step by step. Semantic Kernel has offered both a Handlebars planner and a stepwise planner; which one fits depends on the complexity and requirements of the task.
- Personas: Define the role and behavior of an AI agent, providing a set of instructions that influence how the agent interacts with users. Personas hold memories that offer broader context, allowing for specialized agents that can handle specific types of requests effectively.
- Memories: Provide context to the initial request, helping the AI agent understand and retain relevant information. This can include key-value pairs, local storage, or semantic memory search using embeddings, which compares meanings to enhance response relevance.
- Prompts: Serve as the input to communicate with the AI model. Carefully crafted prompts guide the model in generating high-quality and contextually relevant responses.
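The semantic memory search mentioned under Memories comes down to comparing embedding vectors by meaning rather than by exact wording. The following is a minimal sketch in plain C# with no Semantic Kernel dependency. The three-dimensional vectors are made-up stand-ins for real embeddings, and MemoryRecord, MemorySearch, CosineSimilarity, and FindClosest are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy "memories": in a real agent the vectors would come from an embedding model.
var memories = new List<MemoryRecord>
{
    new("My boss prefers short emails.", new[] { 0.9f, 0.1f, 0.0f }),
    new("The office closes at 6 pm.",    new[] { 0.1f, 0.9f, 0.1f }),
    new("Invoices are due on the 1st.",  new[] { 0.0f, 0.2f, 0.9f }),
};

// Made-up embedding for a query such as "How should I write to my manager?"
float[] query = { 0.8f, 0.2f, 0.1f };

MemoryRecord best = MemorySearch.FindClosest(memories, query);
Console.WriteLine(best.Text); // prints "My boss prefers short emails."

// A stored piece of text together with its embedding vector.
public record MemoryRecord(string Text, float[] Embedding);

public static class MemorySearch
{
    // Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the memory whose embedding is most similar to the query.
    public static MemoryRecord FindClosest(IEnumerable<MemoryRecord> memories, float[] query) =>
        memories.OrderByDescending(m => CosineSimilarity(m.Embedding, query)).First();
}
```

A production setup would typically delegate both the embedding generation and the nearest-neighbor search to a vector store; the sketch only shows the comparison step.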
Getting Started with Semantic Kernel
Disclaimer: The section “Getting Started with Semantic Kernel” is intended to provide a high-level, introductory overview of the topic, suitable for beginners (level 100). For more in-depth information and detailed instructions, it is highly recommended to refer to the official documentation and learning resources available on the Microsoft Learn website.
To begin, set up your development environment and obtain an API key from either OpenAI or Azure OpenAI. Install the .NET SDK and create a new .NET console project:
dotnet new console -n AIAgent
cd AIAgent
Add the Semantic Kernel NuGet package to your project (run the command from the project directory), then open the project in your preferred IDE (e.g. Visual Studio).
dotnet add package Microsoft.SemanticKernel
Communication with LLMs
Semantic Kernel communicates with LLMs through API calls. Depending on the model and service (e.g., OpenAI, Azure OpenAI), it uses specific endpoints and API keys to send prompts and receive responses. Here’s a typical setup for configuring an LLM within Semantic Kernel:
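using Azure.AI.OpenAI;
using Microsoft.SemanticKernel;

// Client for the OpenAI service. Avoid hardcoding the key in real projects.
var openAIClient = new OpenAIClient(" ... your API key ...");

// Register a chat completion model with the kernel builder, then build the kernel.
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(
    "gpt-3.5-turbo",
    openAIClient);
var kernel = builder.Build();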
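For Azure OpenAI, the setup additionally needs the resource endpoint and a deployment name. The sketch below is one possible configuration, not the only one: the environment variable names and the deployment name "my-gpt-deployment" are assumptions chosen for illustration, and depending on your Semantic Kernel version the Azure OpenAI connector may live in a separate NuGet package.

```csharp
using System;
using Microsoft.SemanticKernel;

// Hypothetical environment variable names; pick whatever fits your setup.
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
    ?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT is not set.");
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
    ?? throw new InvalidOperationException("AZURE_OPENAI_API_KEY is not set.");

var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(
    deploymentName: "my-gpt-deployment", // hypothetical deployment name
    endpoint: endpoint,
    apiKey: apiKey);
var kernel = builder.Build();

// Send a one-off prompt to check that the connection works.
var answer = await kernel.InvokePromptAsync("Say hello in one short sentence.");
Console.WriteLine(answer);
```

Reading the key from the environment keeps it out of source control, which is usually preferable to the inline string shown earlier.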
Defining Plugins
Plugins provide the capabilities required by the agent. A plugin is essentially a set of functions that the AI can call to perform specific tasks, such as sending emails, retrieving data, or interacting with external APIs.
The Semantic Kernel SDK offers an extra package with predefined plugins for common tasks. These are available in the Plugins.Core package that you can install with NuGet.
dotnet add package Microsoft.SemanticKernel.Plugins.Core
At the time of writing, the package was only available in a pre-release version. Since the plugins are still in preview, you may need to suppress the corresponding warning in your code.
#pragma warning disable SKEXP0050
The package includes the following plugins:
- ConversationSummaryPlugin – Summarizes conversation
- FileIOPlugin – Reads and writes to the filesystem
- HttpPlugin – Makes requests to HTTP endpoints
- MathPlugin – Performs mathematical operations
- TextPlugin – Performs text manipulation
- TimePlugin – Gets time and date information
- WaitPlugin – Pauses execution for a specified amount of time
To use a core plugin, add it to your kernel builder with the AddFromType method. For example, to add the TimePlugin to your kernel, you can use the following code:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Plugins.Core;
...
var builder = Kernel.CreateBuilder();
...
builder.Plugins.AddFromType<TimePlugin>();
var kernel = builder.Build();
var currentDay = await kernel.InvokeAsync("TimePlugin", "DayOfWeek");
Console.WriteLine(currentDay);
Write Your Own Plugin
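You can also write your own plugin as a class whose public methods are marked with the [KernelFunction] attribute. For instance, an email-sending plugin might look like this: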
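using System;
using System.ComponentModel;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

public class EmailPlugin
{
    [KernelFunction]
    [Description("Sends an email to a recipient.")]
    public async Task SendEmailAsync(
        Kernel kernel, // supplied by Semantic Kernel, not by the model
        [Description("Semicolon delimited list of emails of the recipients")] string recipientEmails,
        string subject,
        string body
    )
    {
        // Logic to send an email
        await Task.CompletedTask; // placeholder until real async sending logic is added
        Console.WriteLine("Email sent!");
    }
}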
Plugins can also be defined from OpenAPI specifications, allowing flexibility and reusability. They extend the agent’s capabilities by providing skills that it can use to perform its tasks effectively.
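Because a native plugin function is ordinary C#, it can also be invoked directly through the kernel without any model involved, which is handy for checking a plugin in isolation. The sketch below assumes only the Microsoft.SemanticKernel package; RecipientPlugin and its count_recipients function are hypothetical and exist purely for this illustration.

```csharp
using System;
using System.ComponentModel;
using Microsoft.SemanticKernel;

// No AI service is registered: the kernel only hosts the plugin.
var builder = Kernel.CreateBuilder();
builder.Plugins.AddFromType<RecipientPlugin>();
Kernel kernel = builder.Build();

// Call the function by plugin name and function name, passing arguments by parameter name.
FunctionResult result = await kernel.InvokeAsync(
    "RecipientPlugin",
    "count_recipients",
    new KernelArguments { ["recipientEmails"] = "ana@example.com; ben@example.com" });

Console.WriteLine(result.GetValue<int>()); // prints 2

// Hypothetical plugin used only for this example.
public class RecipientPlugin
{
    [KernelFunction("count_recipients")]
    [Description("Counts the addresses in a semicolon-delimited recipient list.")]
    public int CountRecipients(
        [Description("Semicolon delimited list of emails of the recipients")] string recipientEmails)
    {
        // Split on ';', ignoring empty entries and surrounding whitespace.
        return recipientEmails
            .Split(';', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Length;
    }
}
```

Giving the function an explicit name in the attribute avoids having to rely on how the SDK derives a name from the method name.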
Creating and Using Planners
Planners are crucial for enabling the agent to generate a sequence of actions to achieve tasks. A planner uses AI to create a step-by-step plan based on a user’s request, determining which plugins to call and in what order. Here’s an example of an AuthorEmailPlanner:
using System.ComponentModel;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

public class AuthorEmailPlanner
{
    [KernelFunction]
    [Description("Returns the steps necessary to author an email.")]
    public async Task<string> GenerateRequiredStepsAsync(
        Kernel kernel,
        [Description("Description of the email content")] string topic,
        [Description("Description of the recipients")] string recipients
    )
    {
        // Ask the model for a short plan; the answer is returned to the caller as plain text.
        var result = await kernel.InvokePromptAsync($"""
            I'm going to write an email to {recipients} about {topic}.
            What are the top 3 steps I should take?
            """);
        return result.ToString();
    }
}
Planners enhance the AI’s ability to manage complex tasks by generating actionable plans that involve multiple steps and decision points.
Setting the Agent’s Persona
The persona dictates how the agent interacts with users, shaping its responses and behavior. It provides context and personality traits that influence the agent’s communication style. For example:
ChatHistory chatMessages = new ChatHistory("""
You are a friendly assistant who follows the rules. Complete required steps and request approval before taking actions.
""");
The persona ensures the agent maintains a consistent tone and approach when interacting with users.
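To see how the pieces could fit together, here is a hedged sketch of a console chat loop that combines the persona with the plugins and lets the model call them automatically. It assumes the EmailPlugin and AuthorEmailPlanner classes from this article are part of the same project, that the key is stored in an environment variable named OPENAI_API_KEY (an assumption), and that your Semantic Kernel version exposes ToolCallBehavior in the OpenAI connector; newer releases may offer a different setting for automatic function calling.

```csharp
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Assumption: the API key is stored in an environment variable named OPENAI_API_KEY.
string apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
    ?? throw new InvalidOperationException("OPENAI_API_KEY is not set.");

var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion("gpt-3.5-turbo", apiKey);
// EmailPlugin and AuthorEmailPlanner are the classes defined earlier in this article.
builder.Plugins.AddFromType<AuthorEmailPlanner>();
builder.Plugins.AddFromType<EmailPlugin>();
Kernel kernel = builder.Build();

var chat = kernel.GetRequiredService<IChatCompletionService>();

// The persona becomes the system message at the start of the history.
ChatHistory chatMessages = new ChatHistory("""
    You are a friendly assistant who follows the rules. Complete required steps and request approval before taking actions.
    """);

// Let the model decide when to call the registered plugin functions.
OpenAIPromptExecutionSettings settings = new()
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

while (true)
{
    Console.Write("User > ");
    string? input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input)) break; // an empty line ends the chat

    chatMessages.AddUserMessage(input);
    ChatMessageContent reply = await chat.GetChatMessageContentAsync(chatMessages, settings, kernel);
    Console.WriteLine($"Assistant > {reply.Content}");
    chatMessages.Add(reply); // keep the reply so later turns have context
}
```

Keeping every user and assistant message in the same ChatHistory is what gives the agent conversational context across turns; the persona stays in effect because it remains the first message.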
Conclusion
Integrating AI agents using .NET and Semantic Kernel unlocks immense potential for creating intelligent, responsive, and proactive applications. By leveraging the core components of Semantic Kernel—plugins, planners, personas, and memories—developers can build sophisticated AI agents that enhance user interactions and automate complex tasks seamlessly.
As AI continues to evolve, tools like Semantic Kernel empower developers to push the boundaries of what’s possible, bringing us closer to a future where intelligent agents are an integral part of our daily lives. Dive into Semantic Kernel, explore its capabilities, and start building your AI-driven applications today.
For more detailed information and step-by-step guides, visit the Microsoft Learn website.