
Complete Ollama Tutorial (2026) – LLMs via CLI, Cloud & Python

2026/01/05 13:09
5 min read

Ollama has become the standard for running Large Language Models (LLMs) locally. In this tutorial, I want to show you the most important things you should know about Ollama.

Watch on YouTube: Ollama Full Tutorial: https://youtu.be/AGAETsxjg0o

What is Ollama?

Ollama is an open-source platform for running and managing large-language-model (LLM) packages entirely on your local machine. It bundles model weights, configuration, and data into a single Modelfile package. Ollama offers a command-line interface (CLI), a REST API, and a Python/JavaScript SDK, allowing users to download models, run them offline, and even call user-defined functions. Running models locally gives users privacy, removes network latency, and keeps data on the user’s device.

Install Ollama

Visit the official website, https://ollama.com/, to download Ollama. It’s available for Mac, Windows, and Linux.

Linux:

curl -fsSL https://ollama.com/install.sh | sh

macOS:

brew install ollama

Windows: download the .exe installer and run it.

How to Run Ollama

Before running models, it is essential to understand quantization. Ollama typically runs models quantized to 4 bits (q4_0), which significantly reduces memory usage with minimal loss in quality.

Recommended Hardware:

  • 7B Models (e.g., Llama 3, Mistral): require ~8GB RAM (run on most modern laptops).

  • 13B–30B Models: require 16–32GB RAM.

  • 70B+ Models: require 64GB+ RAM or dual GPUs.

  • GPU: an NVIDIA GPU or Apple Silicon (M1/M2/M3) is highly recommended for speed.
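As a sanity check on the numbers above, the footprint of a model's weights is roughly parameter count × bits per weight / 8; the KV cache and runtime overhead come on top, which is why the RAM recommendations are higher. A minimal sketch:

```python
def approx_weights_gb(params_billion: float, bits: int = 4) -> float:
    """Rough size of the model weights alone, in GB.

    Ignores the KV cache and runtime overhead, so actual RAM needs
    are noticeably higher than this figure.
    """
    return params_billion * 1e9 * bits / 8 / 1e9

print(approx_weights_gb(7))   # 7B at q4_0: ~3.5 GB of weights
print(approx_weights_gb(70))  # 70B at q4_0: ~35.0 GB of weights
```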

Go to the Ollama website, click on “Models”, and select the model you want to test.

After that, click on the model name and copy the terminal command.

Then open a terminal window, paste the command, and run it.

This downloads the model and lets you chat with it immediately.

Ollama CLI — Core Commands

Ollama’s CLI is central to model management. Common commands include:

  • ollama pull <model> — Download a model
  • ollama run <model> — Run a model interactively
  • ollama list or ollama ls — List downloaded models
  • ollama rm <model> — Remove a model
  • ollama create <name> -f <Modelfile> — Create a custom model
  • ollama serve — Start the Ollama API server
  • ollama ps — Show running models
  • ollama stop <model> — Stop a running model
  • ollama help — Show help

Advanced Customization: Custom model with Modelfiles

You can “fine-tune” a model’s personality and constraints using a Modelfile. This is similar to a Dockerfile.

  • Create a file named Modelfile
  • Add the following configuration:

# 1. Base the model on an existing one
FROM llama3

# 2. Set the creative temperature (0.0 = precise, 1.0 = creative)
PARAMETER temperature 0.7

# 3. Set the context window size (default is 4096 tokens)
PARAMETER num_ctx 4096

# 4. Define the system prompt
SYSTEM """
You are a Senior Python Backend Engineer.
Only answer with code snippets and brief technical explanations.
Do not be conversational.
"""

FROM defines the base model

SYSTEM sets a system prompt

PARAMETER controls inference behavior

After that, you need to build the model by using this command:

ollama create [change-to-your-custom-name] -f Modelfile

This wraps the model + prompt template together into a reusable package.
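If you maintain several variants, you can also render Modelfiles programmatically before building them with ollama create. A minimal sketch; the render_modelfile helper is my own, not part of Ollama:

```python
def render_modelfile(base: str, temperature: float, num_ctx: int, system: str) -> str:
    """Render a Modelfile string like the example above (hypothetical helper)."""
    return (
        f"FROM {base}\n"
        f"PARAMETER temperature {temperature}\n"
        f"PARAMETER num_ctx {num_ctx}\n"
        f'SYSTEM """\n{system}\n"""\n'
    )

text = render_modelfile("llama3", 0.7, 4096,
                        "You are a Senior Python Backend Engineer.")
with open("Modelfile", "w") as f:
    f.write(text)
# Then build it with: ollama create my-python-engineer -f Modelfile
```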

Then run it:

ollama run [change-to-your-custom-name]


Ollama Server (Local API)

Ollama can run as a local server that apps can call. To start the server use the command:

ollama serve

It listens on http://localhost:11434 by default.

Raw HTTP

import requests

r = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "Hello Ollama"}],
        "stream": False,  # return one JSON object instead of an NDJSON stream
    },
)
print(r.json()["message"]["content"])

This lets you embed Ollama into apps or services.
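Note that /api/chat streams newline-delimited JSON by default, with each line carrying a piece of the reply. A sketch of consuming such a stream; the collect_chat_stream helper is my own, and the chunk shape follows Ollama's chat API:

```python
import json

def collect_chat_stream(lines):
    """Join the message.content pieces from an NDJSON chat stream.

    Each non-empty line is a JSON object like
    {"message": {"role": "assistant", "content": "..."}, "done": false}.
    """
    parts = []
    for raw in lines:
        if not raw:
            continue
        chunk = json.loads(raw)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# With a live server you would pass r.iter_lines() from
# requests.post(..., json={..., "stream": True}, stream=True).
demo = [
    b'{"message":{"role":"assistant","content":"Hel"},"done":false}',
    b'{"message":{"role":"assistant","content":"lo"},"done":true}',
]
print(collect_chat_stream(demo))  # -> Hello
```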

Python Integration

Use Ollama inside Python applications with the official library. Run these commands:

Create and activate virtual environments:

python3 -m venv .venv
source .venv/bin/activate

Install the official library:

pip install ollama

Use this simple Python code:

import ollama

# This sends a message to the model 'gemma:2b'
response = ollama.chat(model='gemma:2b', messages=[
    {'role': 'user', 'content': 'Write a short poem about coding.'},
])

# Print the AI's reply
print(response['message']['content'])

This works over the local API automatically when Ollama is running.

Using Ollama Cloud

Ollama also supports cloud models — useful when your machine can’t run very large models.

First, create an account on https://ollama.com/cloud and sign in. Then, inside the Models page, click on the cloud link and select any model you want to test.

In the models list, you will see models with the -cloud suffix, which means they are available in the Ollama cloud.

Click on it and copy the CLI command. Then, inside the terminal, use:

ollama signin

to sign in to your Ollama account. Once signed in, you can run cloud models:

ollama run nemotron-3-nano:30b-cloud

Your Own Model in the Cloud

While Ollama is local-first, Ollama Cloud allows you to push your custom models (the ones you built with Modelfiles) to the web to share with your team or use across devices.

  • Create an account at ollama.com.
  • Add your public key (found in ~/.ollama/id_ed25519.pub).
  • Push your custom model:

ollama push your-username/change-to-your-custom-model-name

Conclusion

That is the complete overview of Ollama! It is a powerful tool that gives you full local control over your AI models. If you found this tutorial helpful, please like it and share your feedback in the comments below.

Cheers! ;)
