Cloud (Preview)
Run larger, more powerful models with new capabilities
Ollama's cloud lets you:
- **Speed up model inference.** Run models on datacenter-grade hardware, returning responses much faster.
- **Run larger models.** Upgrade to the newest hardware, making it possible to run larger models.
- **Privacy first.** Ollama does not retain your data, ensuring privacy and security.
- **Save battery life.** Take the load of running models off your Mac, Windows, or Linux computer, giving you performance back for your other apps.
Frequently asked questions
- **What is Ollama's cloud?** Ollama's cloud is a new way to run open models on datacenter-grade hardware. Many new models are too large to fit on widely available GPUs, or run very slowly on them. Ollama's cloud provides a way to run these models fast while still using Ollama's app, CLI, and API.
- **Does Ollama's cloud work with Ollama's CLI?** Yes! See the docs for more information.
- **Does Ollama's cloud work with Ollama's API and JavaScript/Python libraries?** Yes! See the docs for more information.
- **What data do you retain in Ollama's cloud?** Ollama does not log or retain any queries.
- **Where is the hardware that powers Ollama's cloud located?** All hardware is located in the United States.
- **What are the usage limits for Ollama's cloud?** Ollama's cloud includes hourly and daily limits to avoid capacity issues. Usage-based pricing will soon be available, letting you consume models in a metered fashion.