Build AI Agents Locally Using Small Language Models (SLMs) Without API Costs

LLMs • AI Agents • RAG • LangChain • LangGraph • Fine-Tuning • Reinforcement Learning • Python • APIs

AI Agents that You Can Run on Your Own Laptop

Until recently, building AI Agents was largely limited to big tech companies. It required expensive cloud APIs, servers, and ongoing usage costs.

But now things have changed.

You can build and run AI Agents locally on your own laptop without paying for APIs or relying on cloud services. Once set up, these agents can work offline and still perform useful tasks.

This is possible because of Small Language Models (SLMs) such as Phi-3, Mistral 7B, and Llama 3.2, which are compact models designed to run on ordinary consumer hardware.

In this guide, you’ll learn how to build your own local AI Agents using Ollama and LangChain, step by step in a simple way.

What Are AI Agents?

An AI Agent is a program that doesn’t just respond to questions — it actually works toward completing a task.

Unlike a chatbot that only replies to text, AI Agents can:

  • Break a task into smaller steps

  • Decide what action to take next

  • Use tools like calculators or file readers

  • Continue until the task is completed

A simple way to think about it:

  • Chatbot → answers questions

  • AI Agent → solves tasks

Core Components of AI Agents

Every AI Agent is built using three main parts:

  1. Brain (Language Model / SLM): Understands your input and decides what to do next.

  2. Memory: Stores past conversations so the agent remembers context.

  3. Tools: External functions the agent can use, such as a calculator, a file reader, a search system, or custom Python functions.
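To make the three parts concrete, here is a minimal sketch in plain Python with no libraries. The `Agent` class, the `keyword_brain` function, and the two tools are hypothetical names invented for illustration; a real agent would put a language model where the rule-based brain is.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    # Brain: decides what to do. Here it returns the name of a tool.
    brain: Callable[[str], str]
    # Tools: plain functions the agent can call, looked up by name.
    tools: dict[str, Callable[[str], str]]
    # Memory: a record of what has happened so far, oldest first.
    memory: list[str] = field(default_factory=list)

    def handle(self, user_input: str) -> str:
        tool_name = self.brain(user_input)
        tool = self.tools.get(tool_name)
        result = tool(user_input) if tool else f"no tool named {tool_name!r}"
        self.memory.append(f"{user_input} -> {tool_name} -> {result}")
        return result


def keyword_brain(text: str) -> str:
    # Hypothetical rule-based stand-in for a language model.
    return "word_count" if "count" in text.lower() else "shout"


agent = Agent(
    brain=keyword_brain,
    tools={"word_count": lambda s: str(len(s.split())), "shout": str.upper},
)
print(agent.handle("count these four words"))
print(agent.handle("hello"))
print(agent.memory)
```

Swapping `keyword_brain` for a call to a local model is the only change needed to turn this skeleton into something closer to a real agent.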

What Are Small Language Models (SLMs)?

Small Language Models (SLMs) are compact AI models trained on large datasets but optimized to run locally on laptops and desktops.

You can also refer to this paper for SLMs: https://arxiv.org/pdf/2506.02153

Instead of massive models with hundreds of billions of parameters, SLMs usually have 1B to 8B parameters, making them fast and efficient.

Model | Developer | Size | Use Case
----- | --------- | ---- | --------
Phi-3 Mini | Microsoft | 3.8B | Fast reasoning, lightweight tasks, edge deployment
Mistral 7B | Mistral AI | 7.3B | General-purpose AI tasks, efficient local inference
Llama 3.2 (1B / 3B) | Meta | 1B–3B | Small, efficient assistants for on-device and low-resource use
Gemma 4 E2B | Google | 2B | Beginner-friendly, low resource usage, multimodal lightweight tasks

Phi-3 Mini or a small Llama 3.2 model (1B or 3B) is a good starting point for running AI Agents locally, since both fit comfortably in the memory of a typical laptop.
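A rough way to judge whether a model fits on your machine is to multiply its parameter count by the bytes used per parameter. The sketch below does that arithmetic for the sizes listed above. It covers the weights only: the function name `weights_size_gb` is made up for this example, and real memory use is higher because of the context cache and runtime overhead, while quantized files often use slightly more than 4 bits per weight.

```python
def weights_size_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Parameter counts taken from the table above.
for name, size_b in [("Phi-3 Mini", 3.8), ("Mistral 7B", 7.3), ("Llama 3.2 3B", 3.0)]:
    fp16 = weights_size_gb(size_b, 16)
    q4 = weights_size_gb(size_b, 4)
    print(f"{name}: ~{fp16:.1f} GB at 16-bit, ~{q4:.1f} GB at 4-bit")
```

By this estimate, a 3.8B model needs roughly 7.6 GB at 16-bit precision but only about 1.9 GB at 4-bit, which is why quantized models are the usual choice on laptops.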

Why Build AI Agents Locally?

There are several practical reasons developers prefer local AI Agents:

  1. No API Costs: You don’t pay for every request like cloud-based AI systems.

  2. Data Privacy: Your data never leaves your machine.

  3. Works Offline: After setup, AI Agents can run without internet.

  4. Full Control: You control the model, behavior, and tools.

  5. Better Learning Experience: You understand how AI Agents actually work instead of just using APIs.

Tools Required to Build Local AI Agents

  1. Ollama: Ollama allows you to run language models directly on your computer with simple commands.

  2. LangChain: A framework used to connect language models with tools and workflows.

  3. LangGraph: Used for building structured AI Agent flows where steps are clearly defined.

How to Set Up AI Agents Locally

Step 1: Install Ollama

Download and install Ollama from the official website (https://ollama.com).

Then pull a model:

ollama pull phi3

Test it:

ollama run phi3

If it responds, setup is complete.
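You can also confirm the setup from Python. This sketch assumes Ollama is serving on its default local address, http://localhost:11434, and that its `/api/tags` endpoint returns a JSON object with a `models` list. The helper names `parse_model_names` and `list_local_models` are invented for this example.

```python
import json
import urllib.error
import urllib.request


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from an /api/tags style response body."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return locally pulled model names, or [] if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, connection refused, or an unexpected response body.
        return []


if __name__ == "__main__":
    models = list_local_models()
    print(models or "Ollama server not reachable, or no models pulled yet")
```

If the printed list contains an entry starting with `phi3`, the model is ready for the agent code that follows.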

Step 2: Install Python Dependencies

Create a virtual environment:

python -m venv agent-env

Activate it:

Mac/Linux:

source agent-env/bin/activate

Windows:

agent-env\Scripts\activate

Install required libraries:

pip install langchain langchain-ollama langgraph
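To check that the install worked inside the active virtual environment, a short script can look for each package without importing it. This is a small sketch using only the standard library; `missing_packages` is a helper name chosen for this example.

```python
import importlib.util


def missing_packages(module_names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names differ from pip names: langchain-ollama is imported as langchain_ollama.
required = ["langchain", "langchain_ollama", "langgraph"]
missing = missing_packages(required)
print("All set" if not missing else f"Missing: {', '.join(missing)}")
```

If anything is reported missing, make sure the virtual environment is activated before running pip again.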

How to Build Your First AI Agent (Local Setup)

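Here is a simple example of an AI Agent that solves math problems with a calculator tool. It uses LangChain's AgentExecutor-style API; if these imports fail on a newer LangChain release, the same classes may be available from the langchain-classic package instead: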

from langchain_ollama import OllamaLLM
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import tool
from langchain import hub

# Load local model
llm = OllamaLLM(model="phi3")

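# Create a tool (calculator)
@tool
def calculator(expression: str) -> str:
    """Evaluates a math expression such as '245 * 18 / 5'."""
    try:
        # eval() runs arbitrary Python, so remove the builtins to limit what
        # model-generated input can do. Never expose this to untrusted input.
        return str(eval(expression.strip(), {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Error: {e}"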

tools = [calculator]

# ReAct prompt (Reason + Act pattern)
prompt = hub.pull("hwchase17/react")

# Create AI Agent
agent = create_react_agent(llm=llm, tools=tools, prompt=prompt)

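agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    # Small models often break the expected output format;
    # this lets the agent retry instead of raising an error.
    handle_parsing_errors=True
)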

# Run AI Agent
response = agent_executor.invoke({
    "input": "What is 245 * 18 divided by 5?"
})

print(response["output"])
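The calculator tool above relies on `eval`, which is convenient for a demo but risky when a model writes the input. A stricter alternative is to parse the expression and allow only arithmetic. The sketch below is one way to do that with the standard `ast` module; `safe_eval` is a hypothetical helper, not part of LangChain, and it deliberately supports only `+`, `-`, `*`, and `/` on numbers.

```python
import ast
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals and reject everything else."""

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Names, function calls, attribute access, and so on end up here.
        raise ValueError("unsupported expression")

    # ast.parse raises SyntaxError for text that is not valid Python.
    return walk(ast.parse(expression.strip(), mode="eval").body)


print(safe_eval("245 * 18 / 5"))
```

Inside the tool, `str(safe_eval(expression))` could replace the `eval` call, with the existing `except Exception` branch still catching rejected input.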

How This AI Agent Works

This AI Agent follows a simple loop:

  1. Understands the question

  2. Breaks it into steps

  3. Uses tools if needed

  4. Combines results

  5. Gives final answer

This cycle of reasoning about the next step, acting through a tool, and observing the result is called the ReAct (Reason + Act) pattern.
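The loop can be sketched in a few lines of plain Python. This is a simplified illustration, not LangChain's implementation: the `Action: tool[input]` text format, the `run_react` function, and the scripted model standing in for an SLM are all assumptions made for the example.

```python
from typing import Callable


def run_react(question: str,
              model: Callable[[str], str],
              tools: dict[str, Callable[[str], str]],
              max_steps: int = 5) -> str:
    """Alternate between model replies and tool calls until a final answer appears."""
    scratchpad = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = model(scratchpad).strip()
        scratchpad += reply + "\n"
        if reply.startswith("Final Answer:"):
            return reply.removeprefix("Final Answer:").strip()
        if reply.startswith("Action:"):
            # Assumed shape: "Action: tool_name[tool input]"
            name, _, arg = reply.removeprefix("Action:").strip().partition("[")
            tool = tools.get(name.strip())
            observation = tool(arg.rstrip("]")) if tool else f"unknown tool {name!r}"
            # The observation is fed back so the model can use it on the next step.
            scratchpad += f"Observation: {observation}\n"
    return "Stopped: step limit reached"


def scripted_model(scratchpad: str) -> str:
    # Hypothetical stand-in for an SLM: act first, answer once a result exists.
    if "Observation:" not in scratchpad:
        return "Action: multiply[245, 18]"
    last_line = scratchpad.strip().splitlines()[-1]
    return "Final Answer: " + last_line.removeprefix("Observation:").strip()


def multiply(arg: str) -> str:
    a, b = (int(part) for part in arg.split(","))
    return str(a * b)


answer = run_react("What is 245 * 18?", scripted_model, {"multiply": multiply})
print(answer)
```

The `max_steps` limit matters in practice: a small model that never produces a final answer would otherwise loop forever.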

Adding Memory to AI Agents

To make AI Agents remember conversations, you can add memory:

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

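On its own, this only creates the memory object. For the AI Agent to remember past messages in the same session, pass memory=memory to AgentExecutor and use a prompt that includes a chat_history variable. The hwchase17/react prompt used earlier does not have one; hwchase17/react-chat does.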
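Under the hood, buffer-style memory is just a list of past turns that gets pasted into the next prompt. The sketch below shows that idea without any library; `ChatBuffer` and its methods are hypothetical names for illustration and do not mirror LangChain's internals.

```python
class ChatBuffer:
    """Keeps every turn of one session and renders it for the next prompt."""

    def __init__(self) -> None:
        self.turns: list[tuple[str, str]] = []

    def save(self, user_text: str, agent_text: str) -> None:
        self.turns.append((user_text, agent_text))

    def as_prompt_text(self) -> str:
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


buffer = ChatBuffer()
buffer.save("My name is Asha.", "Nice to meet you, Asha.")
buffer.save("What is 2 + 2?", "4")

# The earlier turns travel with the new question, so the model can see the name.
next_prompt = f"{buffer.as_prompt_text()}\nHuman: What is my name?\nAI:"
print(next_prompt)
```

Because the whole history is resent on every turn, the prompt grows with the conversation, and the buffer is lost when the program exits unless you save it yourself.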

Limitations of Local AI Agents

Even though local AI Agents are powerful, they have some limitations:

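  1. Lower Accuracy: Small models are generally less accurate than large hosted models such as GPT-4 or Claude.

  2. Slower Performance: Speed depends on your laptop’s hardware, especially available RAM and whether a GPU is present.

  3. Limited Context: Smaller context windows limit how much conversation or document text the model can consider at once.

  4. Weak Complex Reasoning: Multi-step or advanced logic may not always work correctly.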
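A common way to work around the limited context is to send only the most recent messages that fit a budget. The sketch below uses word count as a rough stand-in for tokens, which is an assumption for simplicity; real tokenizers count differently, and `trim_history` is a name made up for this example.

```python
def trim_history(messages: list[str], max_words: int) -> list[str]:
    """Keep the most recent messages whose combined word count fits the budget."""
    kept: list[str] = []
    used = 0
    # Walk backwards from the newest message and stop at the first one that does not fit.
    for message in reversed(messages):
        words = len(message.split())
        if used + words > max_words:
            break
        kept.append(message)
        used += words
    return list(reversed(kept))


history = ["one two three", "four five", "six seven eight nine"]
recent = trim_history(history, max_words=6)
print(recent)
```

Older messages are dropped first, so the agent keeps the latest context at the cost of forgetting how the conversation started.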

When to Use Local AI Agents

Best Use Cases:

  • Learning AI Agents

  • Building prototypes

  • Privacy-focused applications

  • Offline AI tools

Not Ideal For:

  • Large production systems

  • High-accuracy business tools

  • Complex reasoning systems

Building AI Agents using Small Language Models (SLMs) is no longer complex or expensive.

With tools like Ollama, LangChain, and LangGraph, you can run a fully functional AI Agent on your own system without any API costs.

This is one of the best ways to understand how modern AI systems actually work — by building them yourself.