Cloud (Preview)
Run larger, more powerful models with new capabilities
Ollama's cloud lets you:
- **Speed up model inference.** Run models on datacenter-grade hardware, returning responses much faster.
- **Run larger models.** Upgrade to the newest hardware, making it possible to run larger models.
- **Privacy first.** Ollama does not retain your data, ensuring privacy and security.
- **Save battery life.** Take the load of running models off your Mac, Windows, or Linux computer, giving you performance back for your other apps.
Frequently asked questions
- **What is Ollama's cloud?** Ollama's cloud is a new way to run open models on datacenter-grade hardware. Many new models are too large to fit on widely available GPUs, or run very slowly on them. Ollama's cloud provides a way to run these models fast while still using Ollama's app, CLI, and API.
- **Does Ollama's cloud work with Ollama's CLI?** Yes! See the docs for more information.
- **Does Ollama's cloud work with Ollama's API and JavaScript/Python libraries?** Yes! See the docs for more information.
- **What data do you retain in Ollama's cloud?** Ollama does not log or retain any queries.
- **Where is the hardware that powers Ollama's cloud located?** All hardware is located in the United States.
- **What are the usage limits for Ollama's cloud?** Ollama's cloud includes hourly and daily limits to avoid capacity issues. Usage-based pricing will soon be available, letting you consume models in a metered fashion.