Last reviewed 2025-09-22 UTC

Use the following architecture guides to design and deploy generative AI applications with retrieval-augmented generation (RAG) in Google Cloud.

| Architecture guide | Description |
|---|---|
| RAG infrastructure for generative AI using Google Agentspace and Vertex AI | An agent-driven architecture that uses Google Agentspace as a unified platform to orchestrate an end-to-end RAG dataflow for enterprise applications that require real-time data availability and enriched contextual search. | 
| RAG infrastructure for generative AI using Vertex AI and Vector Search | A fully managed, serverless architecture that provides optimized, high-performance vector search for large-scale applications. | 
| RAG infrastructure for generative AI using Vertex AI and AlloyDB for PostgreSQL | An architecture that stores vector embeddings alongside your operational data in a fully managed database like AlloyDB for PostgreSQL. | 
| Jump Start Solution: Generative AI RAG using Vertex AI and Cloud SQL | An architecture that stores vector embeddings alongside your operational data in a fully managed database like Cloud SQL. | 
| RAG infrastructure for generative AI using GKE and Cloud SQL | A flexible, container-based architecture that provides maximum control to build custom applications with open source tools such as Ray, Hugging Face, and LangChain. | 
| GraphRAG infrastructure for generative AI using Vertex AI and Spanner Graph | An advanced RAG architecture that combines vector search with knowledge graph queries to retrieve interconnected, contextual data, which results in more detailed and relevant generative AI responses. | 
| Harness CI/CD pipeline for RAG applications | An architecture for a continuous integration (CI) and continuous deployment (CD) pipeline for a RAG application in Google Cloud. |
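
All of the guides above build on the same basic RAG dataflow: embed the documents, retrieve the ones most similar to the query, and pass them to the model as context. The following is a minimal sketch of that dataflow, not the implementation used by any of the listed architectures. It assumes a toy bag-of-words `embed` function in place of a real embedding model and an in-memory list in place of a vector store. The names `embed`, `cosine`, `retrieve`, `build_prompt`, and `DOCS` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user query with the retrieved context."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical in-memory corpus standing in for a vector store.
DOCS = [
    "AlloyDB for PostgreSQL stores vector embeddings alongside operational data.",
    "Spanner Graph combines vector search with knowledge graph queries.",
    "GKE runs container-based applications with open source tools.",
]

question = "Which option uses knowledge graph queries?"
top_docs = retrieve(question, DOCS, k=1)
# In a real application this prompt would be sent to a generative model.
print(build_prompt(question, top_docs))
```

In a production deployment, the embedding step, the similarity search, and the generation step would each be handled by the managed services that the individual guides describe.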