What Google Cloud announced in AI this month – and how it helps you

Editor’s note: Want to keep up with the latest from Google Cloud? Check back here for a monthly recap of our latest updates, announcements, resources, events, learning opportunities, and more.


This month, we're buzzing with the news of AP2, the Agent Payments Protocol, an open protocol developed with 60+ leading payments and technology companies. AP2 is a blueprint for interoperable, AI-native commerce, designed to work seamlessly as an extension of the Agent2Agent (A2A) protocol and Model Context Protocol (MCP).

We’ve also seen “bananas” momentum with our generative media models, especially through nano-banana, or Gemini 2.5 Flash Image. Let’s dive in!

Top hits:


News you can use:


  1. /security:analyze performs a comprehensive scan right in your local repository, with support for GitHub pull requests coming soon. This makes security a natural part of your development cycle.
  2. /deploy deploys your application to Cloud Run, our fully managed serverless platform, in just a few minutes.

Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.



July

Did you know that since its preview launch on Vertex AI in June, enterprise customers have already generated over 6 million videos?

This July, we focused on helping you build what's next, and do it with confidence. We announced Veo 3 and Veo 3 Fast on Vertex AI to bring your ideas to life. This is exciting, especially for marketers, who want to rapidly iterate on high-quality creative. Plus, we’ve pulled together more than 25 of our best guides – from model deployment to building gen AI apps – to help you find ways to get started.

Top hits


News you can use

Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.


June

From the command line to cinematic screens, June was about making AI practical and powerful. Developers can now access Gemini right from their terminal with our new command-line interface, Gemini CLI, and creators can produce stunning, narrative-driven video that combines audio and visuals with Veo 3. Plus, we published several helpful guides and resources for you to bookmark and use as you start building.

Top hits: Gemini comes to everyone – from your command line to Vertex AI

The story unfolds: You dream it, Veo creates it

Any creative class will tell you the same axiom: a great story shows rather than tells. We gave our text-to-video model, Veo 3, a simple, evocative prompt to show how it can turn a creative spark into a cinematic scene. Veo 3 is now available for all Google Cloud customers and partners in public preview on Vertex AI.

The prompt: "A medium shot frames an old sailor, his knitted blue sailor hat casting a shadow over his eyes, a thick grey beard obscuring his chin. He holds his pipe in one hand, gesturing with it towards the churning, grey sea beyond the ship's railing. 'This ocean, it's a force, a wild, untamed might. And she commands your awe, with every breaking light'"

The result:

[Video: Veo 3’s rendering of the old sailor scene]
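If you want to try a prompt like this yourself, here is a minimal sketch using the google-genai Python SDK. The model ID ("veo-3.0-generate-preview"), the polling loop, and the response fields are assumptions based on the SDK's documented video-generation flow; check the Vertex AI docs for the exact names available to your project.

```python
# Minimal sketch: generating a Veo 3 clip on Vertex AI with the google-genai SDK.
# The model ID, polling loop, and response fields follow the SDK's documented
# long-running-operation pattern but are assumptions; check the Vertex AI docs.
import time

from google import genai

client = genai.Client(vertexai=True, project="your-project", location="us-central1")

prompt = (
    "A medium shot frames an old sailor, his knitted blue sailor hat casting a "
    "shadow over his eyes, gesturing with his pipe towards the churning, grey sea."
)

# Video generation is asynchronous: start the operation, then poll until it finishes.
operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",  # assumed Veo 3 model ID
    prompt=prompt,
)
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Print where each generated clip landed.
for generated in operation.response.generated_videos:
    print(generated.video.uri)
```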

AI agents – they’re taking autonomous action, but we need to keep them secure

As AI moves from answering questions to taking action, securing these autonomous systems is paramount. This month, we're not only highlighting how to build powerful agents but also how to implement a security-first approach to their deployment.

Exec perspective: Earlier this month, Anton Chuvakin, security advisor for Google Cloud’s Office of the CISO, discussed a new Google report on securing AI agents, and the new security paradigm they demand. Here’s a snippet: “With great power comes a great responsibility for agent developers. To help mitigate the potential risks posed by rogue agent actions, we should invest in a new field of study focused specifically on securing agent systems.”

Keeping the conversation going: What are the unique security threats posed by agentic AI, and what can leaders do to start securing their workloads? To help answer that question, we sat down with Archana Ramamoorthy, Senior Director, Google Cloud Security, to ask her how businesses should protect their AI workloads by making security more collaborative. Here’s a snippet:

“Just as leaders are getting a handle on the security risks of generative AI, the ground is shifting again. The emergence of AI agents — which can execute tasks on your company's behalf — is demanding a new security paradigm: agent security. This new layer of autonomy can magnify blind spots in an organization's existing security posture…The bottom line is that we need to be prepared, and the whole organization should invest in keeping systems secure.”

News you can use

Build smarter, multimodal AI applications

Deploy your AI efficiently and at scale


Ground your AI in your enterprise data

Stay tuned

Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.


Top announcements

Google I/O brought a fresh wave of tools from Google Cloud, all designed to help businesses and developers build what's next. These updates bring new ways for organizations to work with AI, code more easily, create media, and manage intelligent agents. Here are the highlights:

  1. We introduced new generative AI models for media, including Veo 3 for video, Imagen 4 for images, and Lyria 2 for music on Vertex AI. These models give you excellent ways to create visual and audio content from text prompts. Learn more in our blog here.
  2. We've expanded Gemini 2.5 Flash and Pro model capabilities to help enterprises build more sophisticated and secure AI-driven applications and agents. With thought summaries, businesses get clarity and auditability of a model’s raw thoughts — including key details and tool usage. The new Deep Think mode uses research techniques that enable the model to consider multiple hypotheses before responding.
  3. Gemini 2.5 is now powering all Gemini Code Assist editions! We also launched Jules, a new autonomous AI coding agent, now in public beta, designed to understand user intent and perform coding tasks like writing tests and fixing bugs.
  4. Firebase Studio is a cloud-based, AI workspace powered by Gemini 2.5 that lets you turn your ideas into a full-stack app in minutes. Now you can import Figma designs directly into Firebase Studio using the builder.io plugin, and then add features and functionality using Gemini in Firebase without having to write any code.
  5. We're making AI application deployment significantly easier with Cloud Run, launching three key updates: first, you can now deploy applications built in Google AI Studio directly to Cloud Run with just a single button click; second, we enabled direct deployment of Gemma 3 models from AI Studio to Cloud Run, complete with GPU support for scalable, pay-per-use endpoints; and third, we've introduced a new Cloud Run MCP server, which empowers MCP-compatible AI agents (like AI assistants, IDE integrations, or SDKs) to programmatically deploy applications. Read more here.

The news didn’t stop with I/O – we shared several more announcements to help you deploy AI at scale:

  1. Introducing the next generation of AI inference, powered by llm-d: We’re making inference even easier and more cost-effective, by making vLLM fully scalable with Kubernetes-native distributed and disaggregated inference. This new project is called llm-d. Google Cloud is a founding contributor alongside Red Hat, IBM Research, NVIDIA, and CoreWeave, joined by other industry leaders AMD, Cisco, Hugging Face, Intel, Lambda, and Mistral AI.
  2. Mistral AI's Le Chat Enterprise and Mistral OCR 25.05 model are available on Google Cloud. Available today on Google Cloud Marketplace, Mistral AI's Le Chat Enterprise is a generative AI work assistant designed to connect tools and data in a unified platform for enhanced productivity.
  3. Anthropic’s Claude Opus 4 and Claude Sonnet 4 on Vertex AI. Claude Opus 4 and Claude Sonnet 4 are generally available as a Model-as-a-Service (MaaS) offering on Vertex AI. For more information on the newest Claude models, visit Anthropic’s blog.

…and made strides in security:

  1. What’s new with Google Cloud’s Risk Protection Program. At Google Cloud Next, we unveiled major updates to our Risk Protection Program, an industry-first collaboration between Google and insurers that provides competitively priced cyber-insurance and broad coverage for Google Cloud customers. We’re now including Affirmative AI insurance coverage for your Google-related AI workloads. Here’s what’s new.
  2. How Confidential Computing lays the foundation for trusted AI. Our latest Confidential Computing innovations highlight the creative ways our customers are using Confidential Computing to protect their most sensitive workloads — including AI.
  3. How governments can use AI to improve threat detection and reduce cost. In the latest Cloud CISO Perspectives newsletter, our Office of the CISO’s Enrique Alvarez, public sector advisor, explains how government agencies can use AI to improve threat detection — and save money.

News you can use: Actionable ways to get started

Get fluent in generative AI:

62% of employers now expect candidates and employees to possess at least some familiarity with AI. That’s why we launched a first-of-its-kind generative AI certification for non-technical learners—plus a new suite of no-cost training to help you prepare for that certification. That means you — and your company — can be among the first to take advantage of this opportunity to validate your strategic acumen in gen AI. Become a generative AI leader today.

Then, put generative AI to work

At I/O, we expanded generative AI media on Vertex AI. But how do you get started today? To help you make the most of all the latest generative AI media announcements, we redesigned Vertex AI Studio. The developer-first experience will be your source for generative AI media models across all modalities. You’ll have access to Google’s powerful generative AI media models such as Veo, Imagen, Chirp, and Lyria in the Vertex AI Media Studio.

[Image: the redesigned Vertex AI Studio]

To help you turn your generative AI ideas into real web applications, we published this guide to create gen AI apps in less than 30 seconds with Vertex AI and Cloud Run. Any developer knows it’s usually a complex process to build shareable, interactive applications: you have to set up infrastructure, wire APIs, and build a front-end. What if you could skip the heavy lifting and turn your generative AI concept into a working web app with just a few clicks?

New how-to series alert: Text-to-SQL agents

Recently, powerful large language models (LLMs) like Gemini, with their abilities to reason and synthesize, have driven remarkable advancements in the field of text-to-SQL. In this blog post, the first entry in a series, we explore the technical internals of Google Cloud's text-to-SQL agents.
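To make the pattern concrete, here is a minimal, hypothetical text-to-SQL sketch (it is not the internals of Google Cloud's agents, which the series covers): the schema travels in the prompt, Gemini drafts a query, and the application executes it. The model ID and prompt shape are illustrative assumptions.

```python
# Hypothetical text-to-SQL sketch with the google-genai SDK; not Google Cloud's
# agent implementation. Assumes GOOGLE_API_KEY (or Vertex settings) in the environment.
import sqlite3

from google import genai

client = genai.Client()

schema = "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL, created DATE);"
question = "What were total sales per customer last month?"

response = client.models.generate_content(
    model="gemini-2.0-flash",  # assumed model ID
    contents=(
        f"Schema:\n{schema}\n\n"
        f"Write one SQLite query that answers: {question}\n"
        "Return only the SQL, with no markdown formatting."
    ),
)
sql = response.text.strip()

# Execute against a local database; production code should validate the SQL first.
conn = sqlite3.connect("sales.db")
for row in conn.execute(sql):
    print(row)
```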

Real-life demo: What if we turned Gemini into an AI basketball coach?

We rounded out this month with a deep-dive into a demo we showcased at Google Cloud Next and, most recently, at I/O. In this article, we showed an AI experiment that turns Gemini 2.5 Pro into a jump shot coach. By combining a ring of Pixel cameras with Vertex AI, the coaching system connects AI motion capture, biomechanical analytics, and Gemini-powered coaching via text and voice.

“It’s like we always say: AI is only as good as the information you give it. For the AI basketball coach to be accurate, we knew we had to talk to actual, real-life professionals. So we talked to our partners at the Golden State Warriors and came up with essential criteria for helping you shoot like the pros.”

Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.


April

From Agent2Agent Protocol to Gemini 2.5 Pro, it’s been an exciting month. We hosted Google Cloud Next in Las Vegas on April 9th with over 30,000 people, announcing incredible innovations from Ironwood TPU to Agent2Agent Protocol. We also expanded Vertex AI in dizzying ways – it’s now the only platform with generative media models across video, image, speech, and music, and it makes it possible for multiple agents to coexist in your enterprise.

If you missed the livestream, take a look at our Day 1 recap and summary of the developer keynote. It’s been incredible to see how customers have been applying AI to hundreds of use cases. In fact, we’ve counted more than 600 examples.

Top announcements

Agents

Our recently launched Agent2Agent (A2A) protocol is getting a lot of attention, and for good reason. This open interoperability protocol is designed to make it easy for AI agents to communicate with one another, no matter which framework they’re built on. It’s also a powerful complement to Anthropic’s Model Context Protocol (MCP). Together, these two open protocols are the foundation for complex multi-agent systems.

Since the launch, we’ve been excited to see all the discussions and reactions across social media and YouTube. The A2A GitHub repository (13k stars) has grown quickly, which underscores the industry's eagerness for a standardized agent communication protocol.
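For a feel of how discovery works in practice, here is a rough, illustrative example of the kind of agent card an A2A server publishes so other agents can find and call it. The field names approximate the spec in the A2A repository and may not match it exactly, and the endpoint is hypothetical.

```python
# Illustrative A2A-style agent card; field names approximate the public spec and
# may differ from it. A client agent fetches this card (conventionally from
# /.well-known/agent.json), picks a skill, and then exchanges task messages.
agent_card = {
    "name": "invoice-reconciler",
    "description": "Matches incoming invoices against purchase orders.",
    "url": "https://agents.example.com/invoice-reconciler",  # hypothetical endpoint
    "version": "1.0.0",
    "capabilities": {"streaming": True},
    "skills": [
        {
            "id": "reconcile",
            "name": "Reconcile invoice",
            "description": "Returns a match/mismatch report for a single invoice.",
        }
    ],
}
```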

And, speaking of agents, we made a host of updates to Google Agentspace – starting with unified search. Now, Agentspace is integrated with Chrome Enterprise, which means you can search and access all of your enterprise’s resources — right from the search box in Chrome. You can also discover and adopt agents quickly and easily with Agent Gallery, and create agents with our new no-code Agent Designer.

Models

This month, we announced six new models (some in preview, some generally available) for our customers.

Security

Hot on the heels of our AI Protection news, we introduced Google Unified Security at Next ‘25, which lays the foundation for superior security outcomes. We then showcased it at the RSA Conference in San Francisco.

News you can use:

In the last year alone, we’ve seen over 40x growth in Gemini use on Vertex AI, now with billions of API calls per month. So what’s new and better with Vertex AI?

To start, Meta’s Llama 4 is now generally available on Vertex AI.

Worth a second read:

Next month, we’ll be releasing several guides and deep-dives into how to use all these features, so stay tuned!

Hear from our leaders:

As usual, we ended this month with our monthly column, The Prompt. In this installment, we hear from Logan Kilpatrick, AI product leader at Google DeepMind, on how multimodal capabilities – especially audio and vision – are bringing about a new UX paradigm. Here’s a snippet:

“There’s the age-old quote that a picture is worth a thousand words. This matters double in the world of multimodal. If I look at my computer right now and try to describe everything I see, it would take 45 minutes. Or, I could just take a picture. Use cases for vision could be something from as simple as object tracking to image detection. For example, a factory watching an assembly line to make sure there's no impurities in the product they’re creating. Or analyzing dozens of pictures of your farm and trying to understand crop yields. There's huge breadth and opportunity by blending these modalities together.”

Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.


March

In March, we made big strides in our models and accessibility – from announcing Gemini 2.5, our most intelligent AI model, to Gemma 3, our most capable and advanced version of the Gemma open-model family. We also introduced Gemini Code Assist for individuals, a free version of our AI coding assistant.

In the industry, we shone a spotlight on telecom and gaming. We went to Mobile World Congress, where we showcased key agent use cases that show how AI is becoming an imperative in telecom. At the same time, we talked to leaders in the games industry, who told us how they’re using Google Cloud AI to drive unprecedented advancements in game development, including smarter, faster, and more immersive gaming experiences.

Top announcements

The Gemini family is growing – we introduced Gemini 2.5, a thinking model designed to tackle increasingly complex problems. Gemini 2.5 Pro Experimental leads common benchmarks by meaningful margins and showcases strong reasoning and code capabilities. Learn all about it here.

We also introduced Gemma 3, the most capable model you can run on a single GPU or TPU. To help you get started, we shared guides on how to deploy your AI workloads on Cloud Run and Vertex AI.

In the world of partner models, we announced that Claude 3.7 Sonnet, Anthropic’s most intelligent model to date and the first hybrid reasoning model on the market, is available in preview on Vertex AI Model Garden. Claude 3.7 Sonnet can produce quick responses or extended, step-by-step thinking that is made visible to the user. Explore our sample notebook and documentation to start building.
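As a quick orientation before you open the notebook, here is a minimal sketch of calling Claude on Vertex AI through Anthropic's Python SDK. The region and the versioned model ID are assumptions; check Model Garden for the exact values enabled in your project.

```python
# Minimal sketch: Claude 3.7 Sonnet on Vertex AI via Anthropic's Python SDK.
# The region and model ID string are assumptions; confirm them in Model Garden.
from anthropic import AnthropicVertex

client = AnthropicVertex(project_id="your-project", region="us-east5")

message = client.messages.create(
    model="claude-3-7-sonnet@20250219",  # assumed Vertex model ID
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize our Q2 incident reports."}],
)
print(message.content[0].text)
```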

Finally, we took a step forward in security. We introduced AI Protection, a set of capabilities designed to safeguard AI workloads and data across clouds and models — irrespective of the platforms you choose to use. Our Office of the CISO suggests these five do-it-today, actionable tips on how leaders can help their organizations adopt AI securely.

News you can use

Wondering how to make the most of your AI? When it comes to Gemini 2.0, we created a helpful guide that teaches you how to optimize one of enterprises’ most time-intensive tasks: document extraction.
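For a flavor of what the guide covers, here is a rough sketch of one document-extraction pattern with the google-genai SDK: send the PDF alongside an instruction and ask for JSON back. The model ID and the requested fields are illustrative assumptions; the guide itself is the authoritative walkthrough.

```python
# Rough document-extraction sketch with the google-genai SDK: attach the PDF bytes
# and request JSON output. Model ID and fields are illustrative assumptions.
from google import genai
from google.genai import types

client = genai.Client()  # assumes GOOGLE_API_KEY or Vertex settings in the environment

with open("invoice.pdf", "rb") as f:
    pdf_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash",  # assumed model ID
    contents=[
        types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf"),
        "Extract the vendor name, invoice number, and total as JSON.",
    ],
    config=types.GenerateContentConfig(response_mime_type="application/json"),
)
print(response.text)
```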

From an infrastructure perspective, we broke down the four top use cases for AI Hypercomputer and ways to get started. In this guide, you’ll learn everything from affordable inference to reducing delivery bottlenecks. Take Moloco, for example: using the AI Hypercomputer architecture, Moloco achieved 10x faster model training times and reduced costs by 2-4x.

And, speaking of cost – do you know the true cost of enterprise AI? Enterprises need ways to optimize large AI workloads because these resources can still be quite expensive. Learn how to calculate your AI cost and dig into these five tips to optimize your workflow on Google Cloud Platform.

Hear from our leaders

We closed off this month with our monthly column, The Prompt. In this installment, we hear from Suraj Poozhiyil, AI product leader, on how AI agents depend on “enterprise truth” – your enterprise’s unique context – to be successful. Here’s a snippet:

We’ve heard it before – AI is only as good as the data you put into it. When it comes to agents, AI agents are only as good as the context you give them. Enterprise truth is the answer to questions like, "What is our company’s policy for creating a purchase order?" and "What is the specific workflow for approvals and compliance?"

Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.


February

2025 is off to a racing start. From announcing strides in the new Gemini 2.0 model family to retailers accelerating with Cloud AI, we spent January investing in our partner ecosystem, open-source, and ways to make AI more useful. We’ve heard from people everywhere, from developers to CMOs, about the pressure to adopt the latest in AI with efficiency and speed – and the delicate balance of being both conservative and forward-thinking. We’re here to help. Each month, we’ll post a retrospective that recaps Google Cloud’s latest announcements in AI – and importantly, how to make the most of these innovations.

Top announcements: Bringing AI to you

This month, we announced agent evaluation in Vertex AI. A surprise to nobody, AI agents are top of mind for many industries looking to deploy their AI and boost productivity. But closing the gap between impressive model demos and real-world performance is crucial for successfully deploying generative AI. That’s why we announced Vertex AI’s RAG Engine, a fully managed service that helps you build and deploy RAG implementations with your data and methods. Together, these new innovations can help you build reliable, trustworthy models.
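To show the shape of the pattern RAG Engine manages for you, here is a conceptual sketch of retrieval-augmented generation. The retrieve() helper is a hypothetical stand-in for the engine's retrieval call and the model ID is an assumption; the RAG Engine documentation covers the real API.

```python
# Conceptual RAG sketch: retrieve enterprise context, then ground the prompt in it.
# retrieve() is a hypothetical stand-in for a managed retrieval call such as RAG
# Engine's; the model ID is an assumption.
from google import genai

client = genai.Client(vertexai=True, project="your-project", location="us-central1")


def retrieve(question: str) -> list[str]:
    """Hypothetical placeholder: return the top-k passages from your corpus."""
    return ["Refunds over $500 require director approval (Policy 4.2)."]


question = "Who has to approve a $750 refund?"
context = "\n".join(retrieve(question))

response = client.models.generate_content(
    model="gemini-2.0-flash",  # assumed model ID
    contents=f"Answer using only this context:\n{context}\n\nQuestion: {question}",
)
print(response.text)
```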

From an infrastructure perspective, we announced new updates to AI Hypercomputer. We wanted to make it easier for you to run large multi-node workloads on GPUs by launching A3 Ultra VMs and Hypercompute Cluster, our new highly scalable clustering system. This builds on multiple advancements in AI infrastructure, including Trillium, our sixth-generation TPU.

What’s new in partners and open-source

This month, we invested in our relationship with our partners. We shared how Gemini-powered content creation in Partner Marketing Studio will help partners co-market faster. These features are designed to streamline marketing efforts across our entire ecosystem, empowering our partners to unlock new levels of success, efficiency, and impact.

At the same time, we shared several important announcements in the world of open-source. We announced Mistral AI’s Mistral Large 24.11 and Codestral 25.01 models on Vertex AI. These models will help developers write code and build faster – from high-complexity tasks to reasoning tasks, like creative writing. To help you get started, we provided sample code and documentation.

And, most recently, we announced the public beta of Gen AI Toolbox for Databases in partnership with LangChain, the leading orchestration framework for developers building LLM applications. Toolbox is an open-source server that empowers application developers to connect production-grade, agent-based generative AI applications to databases. You can get started here.

Industry news: Google Cloud at the National Retail Federation (NRF)

The National Retail Federation kicked off the year with their annual NRF conference, where Google Cloud showed how AI agents and AI-powered search are already helping retailers operate more efficiently, create personalized shopping experiences, and use AI to get the latest products and experiences to their customers. Check out our new AI tools to help retailers build gen AI search and agents.

As an example, Google Cloud worked with NVIDIA to empower retailers to boost their customer engagements in exciting new ways, deliver more hyper-personalized recommendations, and build their own AI applications and agents. Now with NVIDIA's AI Enterprise software available on Google Cloud, retailers can handle more data and more complex AI tasks without their systems getting bogged down.

News you can use

This month, we shared several ways to better implement fast-moving AI, from a comprehensive guide on Supervised Fine Tuning (SFT), to how developers can help their LLMs deliver more accurate, relevant, and contextually aware responses, minimizing hallucinations and building trust in AI applications by optimizing their RAG retrieval.

We also published new documentation to use open models in Vertex AI Studio. Model selection isn’t limited to Google’s Gemini anymore. Now, choose models from Anthropic, Meta, and more when writing or comparing prompts.

Hear from our leaders

We closed out the month with The Prompt, our monthly column that brings observations from the field of AI. This month, we heard from Warren Barkley, AI product leader, who shares some best practices and essential guidance to help organizations successfully move AI pilots to production. Here’s a snippet:

More than 60% of enterprises are now actively using gen AI in production, helping to boost productivity and business growth, bolster security, and improve user experiences. In the last year alone, we witnessed a staggering 36x increase in Gemini API usage and a nearly 5x increase of Imagen API usage on Vertex AI — clear evidence that our customers are making the move towards bringing gen AI to their real-world applications.

Stay tuned for monthly updates on Google Cloud’s AI announcements, news, and best practices. For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.
