⚡ Agentic Systems & Generative Intelligence
The transition from experimental LLMs to production-grade agents is being driven by 'Context Engineering' and the emergence of managed agent execution stacks.
- QueryData: Near-100% Accurate Data Agents: A new tool that translates natural language into database queries for AlloyDB, Cloud SQL, and Spanner, with accuracy described as close to 100%.
- Claude Mythos Preview on Vertex AI: Anthropic’s newest and most powerful model is now available in private preview for select Google Cloud customers.
- Lyria 3: Multimodal Music Generation: Google's family of music generation models is now in public preview, capable of generating high-fidelity stereo audio from text prompts and images with vocal support.
- Cloud Run Worker Pools for Agentic Workloads: Now generally available, worker pools provide a platform for stateful, pull-based agentic workloads, as demonstrated by Estée Lauder Companies.
- Fine-Tuning Gemma 4 with Serverless GPUs: A technical guide on using Cloud Run Jobs and NVIDIA RTX 6000 Pro GPUs to adapt Gemma 4's multimodal architecture for specific classification tasks.
- Building Event-Driven Data Agents: Combining BigQuery continuous queries, Pub/Sub, and Vertex AI Agent Engine (ADK) to build autonomous agents that resolve anomalies in real-time.
- Prism: Open-Source Evals for Analytics Agents: An evaluation tool designed to move agents from prototype to production through rigorous testing in BigQuery and Looker.
- Local Testing for Multi-Agent Systems: A guide on validating core reasoning and long-term memory integration in multi-agent systems before cloud deployment.
- Healthcare Recommender with Keras and Cloud Run: Deploying a symptom-to-disease translator using a Two-Tower architecture and ScaNN for efficient semantic search.
- Multimodal GraphRAG Resource Orchestration: A new architectural guide for building multi-agent systems that consolidate fragmented data into searchable knowledge graphs.
- Memorystore for Redis Remote MCP Server: A preview feature allowing LLMs and AI applications to connect directly to Memorystore for Redis instances.
- Datastream Remote MCP Server: Enables AI agents to programmatically manage and monitor data streams and connection profiles.
- Multi-cluster GKE Inference Gateway: The expansion of the Multi-Cluster Gateway API with global inference capabilities allows developers to serve LLM traffic at the lowest possible latency by deploying models worldwide.
- llm-d CNCF Donation: The donation of llm-d to the CNCF marks a critical industry milestone in setting open standards for large-scale AI inference on Kubernetes.
- TPU Support on Ray: Native support for TPUs in Ray core libraries (starting with Ray 2.55) simplifies the use of Ray Train and Ray Serve for hardware-accelerated agentic workflows.
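
To make the 'pull-based' wording in the Cloud Run worker pools item concrete, here is a minimal sketch using only the Python standard library. It does not use any Cloud Run API: the in-memory queue stands in for whatever task source a real worker would poll, and `drain` and `handle` are hypothetical names chosen for illustration.

```python
import queue


def drain(tasks: "queue.Queue[str]", handle) -> list:
    """Pull tasks until the queue is empty and return the handler results."""
    results = []
    while True:
        try:
            # The worker asks for work; nothing is pushed to it.
            task = tasks.get_nowait()
        except queue.Empty:
            break
        results.append(handle(task))
        tasks.task_done()
    return results


if __name__ == "__main__":
    tasks = queue.Queue()
    for name in ("summarise", "classify"):
        tasks.put(name)
    print(drain(tasks, str.upper))
```

The point of the pattern is that the worker controls its own pace, which is why it suits long-running agent steps better than request-driven scaling.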
⚡ Contextual Data & Deep Analytics
Unified data platforms are moving beyond storage into 'Actionable Intelligence,' where relational and graph models coexist within the same query lifecycle.
- Data Studio: The New Home for Data Cloud Assets: Data Studio (formerly Looker Studio) is re-emerging as the central hub for serving and visualising Google Data Cloud content.
- Openness for Apache Iceberg Lakehouses: The Google-managed Iceberg REST Catalog now provides full read/write interoperability between BigQuery and Iceberg-compatible engines.
- Looker Self-Service Explores: New ad-hoc analysis capabilities allow users to quickly analyse data governed by the Looker semantic layer.
- Conversational Analytics for Looker Embedded: Extending the natural language data experience to embedded users across multiple surfaces.
- AppOptimize API: Programmatic Cost Management: Empowers developers to retrieve precise cost and usage data for specific projects to streamline FinOps.
- Rightmove: Reinventing Search with Unified Data: A case study on migrating on-prem databases to Google Cloud to unlock personalised user experiences.
- BigQuery Graph: Scaling Relationship Analysis: Native modelling and analysis of complex relationships using GQL and SQL/PGQ standards directly within BigQuery.
- BigFrames: AI Functions in Dataframes: Bridging traditional dataframes and Generative AI to integrate Gemini-powered insights into Python workflows.
- Pub/Sub AI Inference SMT (GA): Generally available Single Message Transforms that allow inferences from Vertex AI models to be added directly to Pub/Sub messages.
- AI.AGG: Semantic Aggregation in BigQuery: A new preview function to aggregate unstructured data based on natural language instructions.
- BigQuery Continuous Queries: Stateful Operations: Preview support for complex analysis that retains information across time intervals using JOINs and windowing.
- Bigtable JDBC Driver (GA): Now generally available, enabling connections to Bigtable from Java applications and reporting tools via a generic JDBC adapter.
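
For readers unfamiliar with what 'retaining information across time intervals' means in the continuous queries item, the following is a conceptual sketch in plain Python, not BigQuery syntax. It assumes fixed-size (tumbling) windows and integer timestamps in seconds; the function name and sample events are made up for illustration.

```python
from collections import defaultdict


def tumbling_counts(events, window_seconds=60):
    """Count events per (window_start, key); the dict is the retained state."""
    state = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of the window it falls into.
        window_start = timestamp - (timestamp % window_seconds)
        state[(window_start, key)] += 1
    return dict(state)


if __name__ == "__main__":
    events = [(5, "sensor-a"), (42, "sensor-a"), (61, "sensor-a"), (70, "sensor-b")]
    print(tumbling_counts(events))
```

A stateless query would see each event in isolation; the running dictionary is what lets later events be compared against earlier ones in the same window.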
⚡ Sovereign Trust & Security Engineering
Modern security is shifting from reactive detection to 'on-by-default' baselines and automated perimeter management.
- Essential AI Security On By Default: Security Command Center Standard now includes baseline AI and cloud security protections by default for all customers.
- Domain Filtering in Cloud NGFW Enterprise: Enhancing perimeter security with wildcard-capable URL filtering and granular policy controls.
- Google Cloud: A Sovereign Cloud Leader: Named a leader in The Forrester Wave™: Sovereign Cloud Platforms, Q2 2026, validating Google's 'portfolio of choice' approach.
- Binary Authorization: Production Nuances: A deep dive into the architectural requirements for establishing a secure-by-default container deployment posture.
- Automating Threat Intelligence at Scale: Using Cloud Armor and Cloud Run to transform reactive security into a proactive, 'set and forget' defence.
- GCP Landing Zones for Strict Regulation: Using Merlin Studio to simplify the generation of compliant Cloud Foundation Fabric files and documentation.
- Permiso: Searchable IAM Roles Interface: A new tool to help developers efficiently find, compare, and receive recommendations for predefined GCP IAM roles.
- Why PAM Alone Isn't Enough for IAM: An analysis of the risks associated with standing privileges and the necessity of a strict Least Privilege approach.
- Custom Organisation Policies for Workload Identity: New custom constraints are available for managed workload identity and Workload Identity Federation.
- Google Cloud Armor: ModSecurity CRS 4.22: Preconfigured rules now support ModSecurity CRS 4.22 as a rule source in preview.
- Orchestrate SOC Workflows with Multi-Agent AI: A high-level architecture guide for automating complex investigation and triage processes in a Security Operations Centre.
- GKE Control Plane Egress Control (GA): Now generally available, this feature lets administrators either restrict or allow all egress traffic originating from the cluster's control plane VMs.
- Cloud Run + IAP Integration: A new integration with Identity-Aware Proxy (IAP) enables the configuration of authentication in front of Cloud Run services without the complexity of a dedicated load balancer.
- Privileged Workload Admission in Autopilot: GKE version 1.35 introduces granular controls for organisation and cluster admins to specify exactly which privileged partner workloads are permitted to run in Autopilot environments.
- Model Armor Guardrails: GKE users can now integrate Model Armor directly into the network data path to provide a hardened, high-performance security layer for AI inference.
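
As a rough illustration of the wildcard-capable domain filtering mentioned in the Cloud NGFW Enterprise item, the sketch below shows one common way such matching can work. The semantics are an assumption, not the product's documented behaviour: here `*.example.com` covers any subdomain but not the bare apex domain, and both function names are hypothetical.

```python
def domain_matches(pattern: str, host: str) -> bool:
    """Return True if host matches an exact or leading-wildcard pattern."""
    pattern = pattern.lower().rstrip(".")
    host = host.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # ".example.com"
        # Assumed semantics: any subdomain matches, the bare apex does not.
        return host.endswith(suffix) and len(host) > len(suffix)
    return host == pattern


def is_allowed(host: str, allowed_patterns) -> bool:
    """Allow the host if any pattern in the allow-list matches it."""
    return any(domain_matches(p, host) for p in allowed_patterns)


if __name__ == "__main__":
    allow_list = ["*.example.com", "status.example.org"]
    for host in ("api.example.com", "example.com", "badexample.com"):
        print(host, is_allowed(host, allow_list))
```

Matching on the dotted suffix rather than a plain substring is what keeps a look-alike such as `badexample.com` from slipping through.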
⚡ High-Performance Infra & Cloud Ops
As hardware and networking reach new performance thresholds, the focus is on carbon efficiency and reducing cold-start latencies at the edge.
- Ironwood TPUs: 3.7x Carbon Efficiency: The seventh-generation architecture delivers massive gains in Compute Carbon Intensity compared to TPU v5p.
- GKE managed DRANET (GA): Generally available support for high-performance networking on NVIDIA GPU (A3/A4) and Cloud TPU (v6e/v7x) instances.
- Hyperdisk ML: 2 TiB/s Throughput (GA): The highest-throughput Google Cloud storage type is now GA for A3 Ultra, C4D, and N4 machine series.
- vLLM: Production AI Inference on GKE: Architecting a high-throughput, low-latency LLM serving platform using continuous batching and PagedAttention.
- GKE Cloud Storage FUSE Profiles: Automating performance tuning to accelerate data access for AI/ML workloads with minimal overhead.
- Artifact Registry: Manual Image Prewarming: A new API-driven feature to reduce cold-start latency for container deployments.
- Dataflow Auto VM Selection: Automatically provisions workers from a curated list of machine types based on CPU and RAM resource hints.
- GCon: A Keyboard-Driven TUI for GCP: A fast, terminal-based alternative to the Cloud Console featuring fuzzy search and real-time resource monitoring.
- TorchTPU: Native PyTorch at Google Scale: A new engineering stack providing high-performance PyTorch execution on TPU infrastructure with an 'Eager First' approach.
- Cloud SQL: Storage Shrink Capability: Generally available support for manually reducing storage capacity if instance requirements decrease.
- GKE Gateway: Custom Logic via Callouts (GA): Generally available support for adding custom logic into the load balancing path via service extensions.
- GKE Active Buffer (Preview): Replacing traditional 'balloon pod' setups, this new Kubernetes-native API allows clusters to explicitly reserve unused node capacity, ensuring the Cluster Autoscaler provisions nodes ahead of demand spikes.
- GKE Native Custom Metrics: A new integration allows the HPA Controller to consume metrics directly, bypassing the need to route them through Cloud Monitoring and resulting in significantly faster autoscaling decisions.
- Utilisation-Based Load Balancing (UBB): GKE Gateway now supports splitting traffic between pods based on actual resource utilisation rather than traditional even distribution, optimising cluster efficiency.
- Autopilot ComputeClasses on GKE Standard: Developers can now leverage the full automation of Autopilot ComputeClasses within GKE Standard clusters, removing the need to choose between the two deployment modes.
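
The difference between even distribution and utilisation-based splitting, as described in the load balancing item above, can be sketched in a few lines. This is an illustrative model only: the digest does not describe GKE Gateway's actual algorithm, so weighting each pod by its spare headroom is an assumption, and `utilisation_weights` is a hypothetical helper.

```python
def utilisation_weights(utilisation: dict) -> dict:
    """Map pod -> traffic share, proportional to spare capacity (1 - utilisation)."""
    headroom = {pod: max(0.0, 1.0 - u) for pod, u in utilisation.items()}
    total = sum(headroom.values())
    if total == 0:
        # Every pod is saturated: fall back to an even split.
        return {pod: 1.0 / len(headroom) for pod in headroom}
    return {pod: h / total for pod, h in headroom.items()}


if __name__ == "__main__":
    # pod-a is at 90% utilisation, pod-b at 50%.
    print(utilisation_weights({"pod-a": 0.9, "pod-b": 0.5}))
```

Under an even split both pods would receive half the traffic; with headroom weighting the busier pod receives a smaller share until its load drops.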