Content provided by TCP.FM, Justin Brodley, Jonathan Baker, Ryan Lucas, and Matt Kohn. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by TCP.FM, Justin Brodley, Jonathan Baker, Ryan Lucas, and Matt Kohn or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined at https://ko.player.fm/legal.

324: Clippy’s Revenge: The AI Assistant That Actually Works - Sort Of

1:04:28
 

Welcome to episode 324 of The Cloud Pod, where the forecast is always cloudy! Justin, Ryan, and Jonathan are your hosts, bringing you all the latest news and announcements in Cloud and AI. This week we have some exec changes over at Oracle, a LOT of announcements about Sonnet 4.5, and even some marketplace updates over at Azure! Let’s get started.

Titles we almost went with this week

  • Oracle’s Executive Shuffle: Promoting from Within While Chasing from Behind
  • Copilot Takes the Wheel on Your Legacy Code Highway
  • Queue Up for GPUs: Google’s Take-a-Number Approach to AI Computing
  • License to Bill: Google’s 400% Markup Grievance
  • Autopilot Engages: GKE Goes Full Self-Driving Mode
  • SQL Server Finally Gets a Lake House Instead of a Server Room
  • Microsoft Gives Office Apps Their Own AI Interns
  • Claude and Present Danger: The AI That Codes for 30 Hours Straight
  • The Claude Father Part 4.5: An Offer Your Code Can’t Refuse
  • CUD You Believe It? Google Makes Discounts Actually Flexible
  • ECS Goes Full IPv6: No IPv4s Given
  • Breaking News: AWS Finally Lets You Hit the Emergency Stop Button
  • One Marketplace to Rule Them All
  • BigQuery Gets a Crystal Ball and a Chatty Friend
  • Azure’s September to Remember: When Certificates and Allocators Attack
  • Shall I Compare Thee to a Sonnet? 4.5 Ways Anthropic Just Leveled Up
  • AWS provides a big red button

Follow Up

01:26 The global harms of restrictive cloud licensing, one year later | Google Cloud Blog

  • Google Cloud filed a formal complaint with the European Commission one year ago about Microsoft’s anti-competitive cloud licensing practices, specifically the 400% price markup Microsoft imposes on customers who move Windows Server workloads to non-Azure clouds.
  • The UK Competition and Markets Authority found that restrictive licensing costs UK cloud customers £500 million annually due to lack of competition, while US government agencies overspend by $750 million yearly because of Microsoft’s licensing tactics.
  • Microsoft recently disclosed that forcing software customers to use Azure is one of three pillars driving its growth and is implementing new licensing changes preventing managed service providers from hosting certain workloads on Azure competitors.
  • Multiple regulators globally including South Africa and the US FTC are now investigating Microsoft’s cloud licensing practices, with the CMA finding that Azure has gained customers at 2-3x the rate of competitors since implementing restrictive terms.
  • A European Centre for International Political Economy study suggests ending restrictive licensing could unlock €1.2 trillion in additional EU GDP by 2030 and generate €450 billion annually in fiscal savings and productivity gains.

03:32 Jonathan – “I’d feel happier about these complaints Google were making if they actually reciprocated the deals they make for their customers in the EU in the US.”

AI is Going Great – Or How ML Makes Money

05:14 Vibe working: Introducing Agent Mode and Office Agent in Microsoft 365 Copilot | Microsoft 365 Blog

  • Microsoft introduces Agent Mode for Office apps and Office Agent in Copilot chat, leveraging OpenAI’s latest reasoning models and Anthropic models to enable multi-step, iterative AI workflows for document creation.
  • This represents a shift from single-prompt AI assistance to conversational, agentic productivity where AI can evaluate results, fix issues, and iterate until outcomes are verified.
  • Agent Mode in Excel democratizes expert-level spreadsheet capabilities by enabling AI to “speak Excel” natively, handling complex formulas, data visualizations, and financial analysis tasks.
  • The system achieved notable performance on SpreadsheetBench benchmarks, and can execute prompts like creating financial reports, loan calculators, and budget trackers with full validation steps.
  • Agent Mode in Word transforms document creation into an interactive dialogue where Copilot drafts content, suggests refinements, and asks clarifying questions while maintaining Word’s native formatting. This enables faster iteration on complex documents like monthly reports and project updates through conversational prompts rather than manual editing. (FYI, this is a good way to get AI Slop, so buyer beware.)
  • The thing we’re the most excited about, however, is Office Agent in Copilot chat, which creates complete PowerPoint presentations and Word documents through a three-step process: clarifying intent, conducting web-based research with reasoning capabilities, and producing quality-checked content using code generation. (Justin, being an exec, really just likes the pretty slides.)
  • This addresses previous AI limitations in creating well-structured presentations by showing chain of thought and providing live previews.
  • The features are rolling out through Microsoft’s Frontier program for Microsoft 365 Copilot licensed customers and Personal/Family subscribers, with Excel and Word Agent Mode available on web (desktop coming soon) and Office Agent currently US-only in English.
  • This positions Microsoft to compete directly with other AI productivity tools while leveraging their existing Office ecosystem.

17:27 Justin – “There’s web apps for all of them. They’re not as good as Google web apps, but they pretend to be.”

08:14 Introducing Claude Sonnet 4.5 \ Anthropic

  • Claude Sonnet 4.5 achieves 77.2% on SWE-bench verified, positioning it as the leading coding model with the ability to maintain focus for over 30 hours on complex multi-step tasks.
  • The model is available via API at $3/$15 per million tokens, matching the previous Sonnet 4 pricing.
  • The Claude Agent SDK provides developers with the same infrastructure that powers Claude Code, enabling creation of custom AI agents for various tasks beyond coding.
  • This includes memory management for long-running tasks, permission systems, and subagent coordination capabilities.
  • Computer use capabilities improved significantly with 61.4% on OSWorld benchmark (up from 42.2% four months ago), enabling direct browser navigation, spreadsheet manipulation, and task completion.
  • The Claude for Chrome extension brings these capabilities to Max subscribers.
  • New product features include checkpoints in Claude Code for progress saving and rollback, a native VS Code extension, context editing with memory tools in the API, and direct code execution with file creation (spreadsheets, slides, documents) in Claude apps.
  • Early customer results show 44% reduction in vulnerability intake time for security agents, 18% improvement in planning performance for Devin, and zero error rate on internal code editing benchmarks (down from 9%).
  • The model operates under ASL-3 safety protections with improved alignment metrics.
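For a rough sense of what the $3/$15 per-million-token pricing means in practice, here's a quick back-of-the-envelope sketch (rates taken from the bullet above; the example token counts are made up):

```python
# Rough cost estimate for Claude Sonnet 4.5 API usage at the quoted
# $3 per million input tokens and $15 per million output tokens.
INPUT_PRICE_PER_M = 3.00    # USD per 1M input tokens (from the announcement)
OUTPUT_PRICE_PER_M = 15.00  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a request or batch of requests."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a long agentic session consuming 2M input and 500K output tokens.
print(f"${estimate_cost(2_000_000, 500_000):.2f}")  # → $13.50
```

A 30-hour autonomous coding run can burn through tokens fast, so this kind of math is worth doing before letting an agent loose.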

12:02 Ryan – “I’ve been using Sonnet 4 pretty much exclusively for coding, just because the results I’ve been getting on everything else is really hit or miss. But I definitely won’t let it go off, because it WILL go off on some tangents.”

16:22 Claude Sonnet 4.5 Is Here | Databricks Blog

  • Databricks integrates Claude Sonnet 4.5 directly into their platform through AI Functions, allowing enterprises to apply the model to governed data without moving it to external APIs.
  • This preserves data lineage and security while enabling complex analysis at scale.
  • The integration enables SQL and Python users to treat Claude as a built-in operator for analyzing unstructured data like contracts, PDFs, and images. Databricks automatically handles backend scaling from single rows to millions of records.
  • Key technical advancement is bringing AI models to data rather than exporting data to models, solving governance and compliance challenges.
  • This approach maintains existing data pipelines while adding AI capabilities for tasks like contract analysis and compliance risk detection.
  • Agent Bricks allows enterprises to build domain-specific agents using Claude Sonnet 4.5, with built-in evaluation and continuous improvement mechanisms. The platform handles model tuning and performance monitoring for production deployments.
  • Claude Sonnet 4.5 launches just seven weeks after Claude Opus 4.1, highlighting rapid model evolution.
  • Databricks’ model-agnostic approach lets enterprises switch between providers as needs change without rebuilding infrastructure.

16:31 Announcing Anthropic Claude Sonnet 4.5 on Snowflake Cortex AI

  • Snowflake now offers same-day availability of Anthropic’s Claude Sonnet 4.5 model through Cortex AI, accessible via SQL functions and REST API within Snowflake’s secure data perimeter.
  • The model shows improvements in domain knowledge for finance and cybersecurity, enhanced agentic capabilities for multi-step workflows, and achieved higher scores on SWE-bench Verified for coding tasks.
  • Enterprises can leverage Sonnet 4.5 through three main interfaces: Snowflake Intelligence for natural language business queries, Cortex AISQL for multimodal data analysis directly in SQL, and Cortex Agents for building intelligent systems that handle complex business processes.
  • The integration maintains Snowflake’s existing security and governance capabilities while processing both structured and unstructured data.
  • The model is available in supported regions with cross-region inference for non-supported areas, and Snowflake reports over 6,100 accounts using their AI capabilities in Q2 FY26.
  • Developers can access the model using simple SQL commands like AI_COMPLETE or through REST API calls for low-latency inference in native applications.
  • This partnership represents a shift toward embedding frontier AI models directly into data warehouses, allowing analysts to run advanced AI operations using familiar SQL syntax without moving data outside their secure environment.
  • This approach reduces the complexity of building AI pipelines while maintaining enterprise-grade security and governance.
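As a sketch of what the SQL-side access looks like, here's an illustrative helper that builds an AI_COMPLETE call. The model identifier and argument shape are assumptions based on the Cortex AI pattern the post describes; check Snowflake's docs for the exact signature in your region:

```python
# Illustrative sketch of calling Claude Sonnet 4.5 through Snowflake's
# AI_COMPLETE SQL function. Model name and argument order are assumptions;
# verify against the Cortex AI documentation before use.
def ai_complete_sql(model: str, prompt: str) -> str:
    """Build an AI_COMPLETE query string for a single-prompt completion."""
    escaped = prompt.replace("'", "''")  # basic SQL string escaping
    return f"SELECT AI_COMPLETE('{model}', '{escaped}') AS response;"

query = ai_complete_sql("claude-sonnet-4-5",
                        "Summarize last quarter's contract renewals.")
print(query)
```

The appeal here is that analysts never leave the Snowflake security perimeter: the prompt and the data both stay inside the warehouse.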

16:41 Announcing SQL Server connector from Lakeflow Connect, now Generally Available | Databricks Blog

  • Databricks’ SQL Server connector for Lakeflow Connect is now GA, providing fully managed data ingestion from SQL Server to the lakehouse with built-in CDC and Change Tracking support, eliminating the need for custom pipelines or complex ETL tools.
  • The connector addresses the common challenge of SQL Server data being locked in transactional systems by enabling incremental data capture without impacting production performance, supporting both on-premises and cloud SQL Server environments through a simple point-and-click UI or API.
  • Key capabilities include automatic SCD Type 2 support for tracking historical changes, integration with Databricks Asset Bundles and Terraform for CI/CD workflows, and the ability to ingest from multiple SQL Server instances simultaneously without full table refreshes.
  • Early adopters like Cirrus Aircraft report migrating hundreds of tables in days instead of months, while Australian Red Cross Lifeblood uses it to build reliable pipelines without complex data engineering, demonstrating real-world value for enterprises moving to lakehouse architectures.
  • This release is part of Lakeflow Connect’s broader ecosystem that now includes GA connectors for ServiceNow and Google Analytics, with PostgreSQL, SharePoint, and query-based connectors for Oracle, MySQL, and Teradata coming soon.

17:35 Ryan – “This has been a challenge for a while; getting data out of these transactional databases so that you can run large reporting jobs on them. So I like any sort of ‘easy button’ that moves you out of that ecosystem.”

AWS

17:53 Introducing Claude Sonnet 4.5 in Amazon Bedrock: Anthropic’s most intelligent model, best for coding and complex agents | AWS News Blog

  • Claude Sonnet 4.5 is now available in Amazon Bedrock as Anthropic’s most advanced model, specifically optimized for coding tasks and complex agent applications with enhanced tool handling, memory management, and context processing capabilities.
  • The model introduces three key API features: Smart Context Window Management that generates responses up to available limits instead of erroring out, Tool Use Clearing for automatic cleanup of interaction history to reduce token costs, and Cross-Conversation Memory that persists information across sessions using local memory files.
  • Integration with Amazon Bedrock AgentCore enables 8-hour long-running support with complete session isolation and comprehensive observability, making it suitable for autonomous security operations, financial analysis, and research workflows that require extended processing times.
  • Claude Sonnet 4.5 excels at autonomous long-horizon coding tasks where it can plan and execute complex software projects spanning hours or days, with demonstrated strength in cybersecurity for proactive vulnerability patching and finance for transforming manual audits into intelligent risk management.
  • Access requires using inference profiles that define which AWS Regions process requests, with system-defined cross-Region profiles available for optimal performance distribution across multiple regions.

18:06 Justin – “I was mad because it wasn’t working, and then I remembered, ‘oh yeah…in Bedrock you have to go enable the new model one by one.’ So if you’re trying to use Bedrock and it’s not working, remember to update your model access.”

18:21 Amazon ECS announces IPv6-only support | Containers

  • Amazon ECS now supports IPv6-only workloads, allowing containers to run without any IPv4 dependencies while maintaining full compatibility with AWS services like ECR, CloudWatch, and Secrets Manager through native IPv6 endpoints.
  • This addresses IPv4 address exhaustion challenges and eliminates the need for NAT gateways in private subnets, reducing operational complexity and costs associated with NAT gateway hours and public IPv4 address charges.
  • The implementation requires minimal configuration changes – simply use IPv6-only subnets with your ECS tasks, and the service automatically adapts without needing IPv6-specific parameters, supporting awsvpc, bridge, and host networking modes.
  • Migration strategies include in-place updates for non-load-balanced services or blue-green deployments using weighted target groups for ALB/NLB workloads, with DNS64/NAT64 available for connecting to IPv4-only internet services.
  • Federal agencies and organizations with IPv6 compliance requirements can now run containerized workloads that meet regulatory mandates while simplifying their network architecture and improving security posture through streamlined access control.
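To put a number on the NAT gateway and public IPv4 savings mentioned above, here's a back-of-the-envelope sketch. All rates are assumptions based on typical us-east-1 on-demand pricing, not quoted from the announcement:

```python
# Rough monthly savings from dropping NAT gateways and public IPv4
# addresses in an IPv6-only ECS setup. Rates are assumed us-east-1
# figures; check current AWS pricing for your region.
HOURS_PER_MONTH = 730
NAT_HOURLY = 0.045        # USD per NAT gateway-hour (assumed)
NAT_PER_GB = 0.045        # USD per GB processed by NAT (assumed)
IPV4_HOURLY = 0.005       # USD per public IPv4 address-hour (assumed)

def monthly_savings(nat_gateways: int, gb_processed: float,
                    public_ipv4s: int) -> float:
    """Estimated monthly spend that an IPv6-only setup avoids."""
    nat_cost = nat_gateways * NAT_HOURLY * HOURS_PER_MONTH \
               + gb_processed * NAT_PER_GB
    ipv4_cost = public_ipv4s * IPV4_HOURLY * HOURS_PER_MONTH
    return nat_cost + ipv4_cost

# Three AZs' worth of NAT gateways, 500 GB/month, and 10 public IPv4s:
print(f"~${monthly_savings(3, 500, 10):.2f}/month")
```

Even for a small deployment, the fixed hourly charges dominate, which is why the hosts keep circling back to this topic.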

18:57 Amazon EC2 Auto Scaling now supports Internet Protocol Version 6 (IPv6)

  • EC2 Auto Scaling now supports IPv6 in dual-stack configuration alongside IPv4, addressing the growing scarcity of IPv4 addresses and enabling virtually unlimited scaling for applications.
  • The dual-stack approach allows gradual migration from IPv4 to IPv6, reducing risk during transitions while providing contiguous IP ranges that simplify microservice architectures and network management.
  • This update arrives as enterprises face IPv4 exhaustion challenges, with IPv6 adoption becoming essential for large-scale deployments and IoT workloads that require extensive address spaces.
  • Available in all commercial AWS regions except New Zealand (we’re not sure what the deal is there, but sorry Kiwis).
  • The feature integrates with existing VPC configurations and requires no additional charges beyond standard EC2 and networking costs.
  • Organizations running containerized workloads or microservices architectures will benefit from simplified IP management and the ability to assign dedicated ranges to each service without address constraints.

19:47 Matt – “It is amazing how fast that IPv4 cost does add up in your account, especially if you have load balancers, multiple subnets, and you’re running multiple ECS containers in public subnets for some reason.”

20:36 Amazon EC2 Allowed AMIs setting adds new parameters for enhanced AMI governance

  • EC2’s Allowed AMIs setting now supports four new parameters – marketplace codes, deprecation time, creation date, and AMI names – giving organizations more granular control over which Amazon Machine Images can be discovered and launched across their AWS accounts.
  • The marketplace codes parameter addresses a common security concern by allowing teams to restrict usage to specific vetted marketplace AMIs, while deprecation time and creation date parameters help enforce policies against outdated or potentially vulnerable images.
  • AMI name parameter enables enforcement of naming conventions, which is particularly useful for large organizations that use standardized naming patterns to indicate compliance status, department ownership, or approved software stacks.
  • These parameters integrate with AWS Declarative Policies for organization-wide governance, allowing central IT teams to enforce AMI compliance across hundreds or thousands of accounts without manual intervention.
  • The feature is available in all AWS regions at no additional cost and represents a practical solution to the challenge of shadow IT and unauthorized software deployment in cloud environments.

25:07 Jonathan – “Just wait six months, they’ll all have the same features anyway.”

26:00 Amazon EC2 Auto Scaling now supports forced cancellation of instance refreshes

  • EC2 Auto Scaling now allows forced cancellation of instance refreshes by setting WaitForTransitioningInstances to false in the CancelInstanceRefresh API, enabling immediate abort without waiting for in-progress launches or terminations to complete.
  • This feature addresses emergency scenarios where rapid roll forward is needed, such as when a current deployment causes service disruptions and teams need to quickly abandon the problematic refresh and start a new one.
  • The enhancement provides better control over Auto Scaling group updates by bypassing lifecycle hooks and pending instance activities, reducing downtime during critical deployment issues.
  • Available in all AWS regions including GovCloud, this feature integrates with existing Auto Scaling workflows and requires no additional cost beyond standard EC2 and Auto Scaling charges.
  • For organizations using instance refreshes for configuration updates or deployments, this capability reduces recovery time objectives (RTO) when deployments go wrong, particularly valuable for production environments requiring quick remediation.
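The API shape is simple: the new behavior hangs off a single flag. A minimal sketch, building the kwargs you would pass to boto3's `autoscaling` client via `client.cancel_instance_refresh(**kwargs)` (kept as a plain dict here so the example runs without AWS credentials):

```python
# Sketch of a forced instance-refresh cancellation. The parameter name
# comes from the announcement; pass these kwargs to
# autoscaling_client.cancel_instance_refresh(**kwargs) in a real session.
def force_cancel_request(asg_name: str) -> dict:
    return {
        "AutoScalingGroupName": asg_name,
        # False = abort immediately, without waiting for in-flight
        # launches/terminations or lifecycle hooks to complete.
        "WaitForTransitioningInstances": False,
    }

kwargs = force_cancel_request("prod-web-asg")
print(kwargs)
```

Once the forced cancel returns, you can immediately start a fresh instance refresh with the corrected launch template.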

26:38 Justin – “I was like, this isn’t really that big of an issue, and then I remembered well, I’ve had a really big autoscaling group, and this could be a really big problem. If you have like 5 webservers, you probably don’t care. But if you have hundreds? This could be a big lifesaver for you.”

29:00 Announcing Amazon ECS Managed Instances for containerized applications | AWS News Blog

  • Your hosts spent quite a bit of time arguing about this one…
  • Amazon ECS Managed Instances bridges the gap between serverless simplicity and EC2 flexibility by providing fully managed container compute that supports all EC2 instance types including GPUs and specialized architectures while AWS handles provisioning, scaling, and security patching.
  • The service automatically selects cost-optimized instances by default but allows customers to specify up to 20 instance attributes when workloads require specific capabilities, addressing the limitation that prevented customers with EC2 pricing commitments from using serverless options.
  • Infrastructure management includes automated security patches every 14 days using Bottlerocket OS, intelligent task placement to consolidate workloads onto fewer instances, and automatic termination of idle instances to optimize costs.
  • Pricing consists of standard EC2 instance costs plus a management fee, initially available in 6 regions including US East, US West, Europe, Africa, and Asia Pacific with support for console, CLI, CDK, and CloudFormation deployment.
  • For The Cloud Pod specifically, the management fee for a single node came to $0.03.
  • This addresses a key customer pain point where teams wanted serverless operational simplicity but needed specific compute capabilities like GPU acceleration or particular CPU architectures that weren’t available in Fargate.
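To illustrate the cost trade-off the hosts argued about, here's a rough hourly comparison for a small 2 vCPU / 4 GB workload. The Fargate and EC2 rates are assumptions based on typical us-east-1 on-demand pricing; the $0.03 management fee is the figure observed for The Cloud Pod's own node (assumed hourly here):

```python
# Rough hourly cost: Fargate vs. an ECS Managed Instance for 2 vCPU / 4 GB.
# All rates are illustrative assumptions, not official AWS pricing.
FARGATE_VCPU_HR = 0.04048   # USD per vCPU-hour (assumed)
FARGATE_GB_HR = 0.004445    # USD per GB-hour (assumed)
EC2_INSTANCE_HR = 0.0416    # e.g. a t3.medium on-demand rate (assumed)
MANAGEMENT_FEE_HR = 0.03    # fee the hosts observed for a single node

fargate = 2 * FARGATE_VCPU_HR + 4 * FARGATE_GB_HR
managed = EC2_INSTANCE_HR + MANAGEMENT_FEE_HR
print(f"Fargate: ${fargate:.4f}/hr, Managed Instance: ${managed:.4f}/hr")
```

Under these assumed rates the managed instance comes out cheaper than Fargate while keeping the hands-off operations, which is roughly where Justin lands in the quote below.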

30:12 Justin – “I love Fargate, but I don’t like paying for Fargate. That’s why I run our Cloud Pod website on an EC2 instance because it’s way cheaper. So for three cents more a gig versus going to Fargate, this is probably where I would land if I didn’t really want to manage the host.”

33:11 Announcing AWS Outposts third-party storage integration with Dell and HPE | AWS News Blog

  • AWS Outposts now integrates with Dell PowerStore and HPE Alletra Storage MP B10000 arrays, joining existing support for NetApp and Pure Storage, allowing customers to use their third-party storage investments with Outposts through native AWS tooling.
  • The integration supports both data and boot volumes with two boot methods – iSCSI SANboot for read/write volumes and Localboot for read-only volumes using iSCSI or NVMe-over-TCP protocols, manageable through the EC2 Launch Instance Wizard.
  • This addresses two key customer needs: organizations migrating VMware workloads who need to maintain existing storage during transition, and companies with strict data residency requirements that must keep data on-premises while using AWS services.
  • Available at no additional charge across all Outposts form factors (2U servers and both rack generations) in all supported regions, with AWS-verified AMIs for Windows Server 2022 and RHEL 9 plus automation scripts on AWS Samples.
  • Second-generation Outposts racks can now combine doubled compute performance (2x vCPU, memory, and network bandwidth) with customers’ preferred storage arrays, providing flexibility for hybrid cloud deployments.

34:37 Jonathan – “It’s more that you can not have AWS provide the storage layer, but you can have them still support S3 and EBS and those other things on top of this third party storage subsystem.”

GCP

36:35 Introducing Flex-start VMs for the Compute Engine Instance API. | Google Cloud Blog

  • Google launches Flex-start VMs in GA, a new consumption model that queues GPU requests for up to 2 hours instead of failing immediately, addressing the persistent challenge of GPU scarcity for AI workloads.
  • This appears to be unique among major cloud providers – rather than competing on raw capacity, Google is innovating on the access model itself by introducing a fair queuing system with significant discounts compared to on-demand pricing.
  • The service integrates directly with Compute Engine’s existing instance API and CLI, allowing easy adoption into current workflows without requiring migration to a separate scheduling service, with VMs running for up to 7 days uninterrupted.
  • Key use cases include AI model fine-tuning, batch inference, and HPC workloads that can tolerate delayed starts in exchange for better resource availability and lower costs, particularly valuable for research and development teams.
  • The stop/start capability with automatic re-queuing and configurable termination actions (preserving VM state after 7 days) provides flexibility for long-running experiments while managing costs effectively.

37:32 Ryan – “I love this. This is great. You’re still going to see a whole bunch of data scientists spamming the workbooks trying to get this to run, but I do think that from a pure capacity standpoint this is the right answer to some of these things, just because a lot of these jobs are very long running and it’s not really instant results.”

39:52 GKE Autopilot now available to all qualifying clusters | Google Cloud Blog

  • GKE Autopilot features are now available in Standard clusters through compute classes, allowing existing GKE users to access container-optimized compute without migrating to dedicated Autopilot clusters – this brings efficient bin-packing and rapid scaling to 70% of GKE clusters that weren’t using Autopilot mode.
  • The container-optimized compute platform starts at just 50 milli-CPU (5% of one core) and scales up to 28 vCPUs, with customers only paying for requested resources rather than entire nodes – addressing the common Kubernetes challenge of overprovisioning and wasted compute capacity.
  • New automatic provisioning for compute classes lets teams gradually adopt Autopilot features alongside existing node pools without disrupting current workloads, solving the previous all-or-nothing approach that made migration risky for production environments.
  • AI workloads can now run on GPUs and TPUs with Autopilot’s managed node properties and enterprise-grade security controls, competing directly with AWS EKS Auto Mode and Azure AKS automatic node provisioning but with tighter integration to Google’s AI ecosystem.
  • Available starting with GKE version 1.33.1 in the Rapid release channel, with 30% of new GKE clusters already created in Autopilot mode in 2024, suggesting strong customer adoption of managed Kubernetes operations.
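The mechanism for opting a Standard cluster into this is a compute class manifest. A minimal sketch follows; the field names track GKE's custom compute class API, but treat the exact values as assumptions and confirm against the GKE docs for your cluster version:

```yaml
# Illustrative ComputeClass manifest for Autopilot-style compute in a
# Standard cluster. Field values are assumptions; verify against GKE docs.
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: autopilot-style
spec:
  priorities:
    - machineFamily: n4   # preferred machines
    - machineFamily: n2   # fallback if preferred capacity is unavailable
  nodePoolAutoCreation:
    enabled: true         # let GKE provision nodes automatically
```

Workloads then opt in with a `nodeSelector` of `cloud.google.com/compute-class: autopilot-style`, leaving existing node pools untouched for everything else.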

37:32 Ryan – “So now you can have not only dedicated compute, but preemptible and now autopilot capacity all in the single cluster. Kind of cool.”

41:58 Gemini CLI extensions for Google Data Cloud | Google Cloud Blog

  • Google launches Gemini CLI extensions for Data Cloud services including Cloud SQL, AlloyDB, and BigQuery, enabling developers to manage databases and run analytics directly from their terminal using natural language prompts.
  • What could go wrong?
  • The extensions allow developers to provision databases, create tables, generate APIs, and perform data analysis through conversational commands, potentially reducing the time needed for common database operations and eliminating context switching between tools.
  • BigQuery’s extension includes AI-powered forecasting capabilities and conversational analytics APIs, letting users ask business questions in natural language and receive insights without writing SQL queries.
  • This positions Google against AWS’s recent CodeWhisperer CLI integration and Azure’s GitHub Copilot CLI, though Google’s approach focuses specifically on data services rather than general cloud operations.
  • Key use cases include rapid prototyping for startups, data exploration for analysts who aren’t SQL experts, and streamlining database operations for DevOps teams managing multiple Cloud SQL or AlloyDB instances.

43:28 Announcing Claude Sonnet 4.5 on Vertex AI | Google Cloud Blog

  • Surprise surprise…
  • Google Cloud now offers Claude Sonnet 4.5 on Vertex AI, Anthropic’s most advanced model designed for autonomous agents that can work independently for hours on complex coding, cybersecurity, financial analysis, and research tasks.
  • The integration includes Vertex AI’s Agent Development Kit and Agent Engine for building multi-agent systems, plus provisioned throughput for dedicated capacity at fixed costs, addressing enterprise needs for reliable AI deployment.
  • Claude Sonnet 4.5 supports a 1 million token context window, batch predictions, and prompt caching on Vertex AI, with global endpoint routing that automatically serves traffic from the nearest available region for reduced latency.
  • Customers like Augment Code, spring.new, and TELUS are already using Claude on Vertex AI, with spring.new reporting application development time reduced from three months to 1-2 hours using natural language prompts.
  • The model is available through Vertex AI Model Garden and Google Cloud Marketplace, with VS Code extension support and Claude Code 2.0 terminal interface featuring checkpoints for more autonomous development operations.

43:51 Adopt new VM series with GKE compute classes, Flexible CUDs | Google Cloud Blog

  • GKE compute classes let you define a prioritized list of machine families for autoscaling, automatically falling back to alternative VM types if your preferred option isn’t available – solving the challenge of adopting new Gen4 machines like N4 and C4 while maintaining workload availability.
  • Compute Flexible CUDs provide spend-based discounts up to 46% that follow your workload across different machine families, unlike resource-based CUDs that lock you to specific VM types – enabling financial flexibility when migrating between machine generations.
  • The combination addresses real adoption barriers: compatibility testing through gradual rollouts, regional capacity constraints with automatic fallbacks, and financial commitment alignment by allowing discounts to apply across multiple VM families including both new and legacy options.
  • Shopify successfully used this approach during Black Friday/Cyber Monday 2024, prioritizing new N4 machines with N2 fallbacks to handle massive scale while maintaining cost optimization through Flex CUDs.
  • This approach particularly benefits organizations running large GKE fleets or high-performance workloads that want to leverage new C4/C4D series VMs for better price-performance without sacrificing availability or losing existing discount commitments.
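The financial side is simple spend math: because Flex CUDs are spend-based, the discount follows the workload across machine families. A quick illustration (the 46% figure is the quoted maximum; actual rates depend on commitment terms):

```python
# Quick arithmetic on Compute Flexible CUD savings: spend-based discounts
# (up to the quoted 46%) apply to eligible spend regardless of machine
# family, so migrating N2 -> N4 doesn't strand the commitment.
def effective_cost(on_demand_spend: float, discount: float = 0.46) -> float:
    """Discounted cost for spend covered by a Flex CUD commitment."""
    return on_demand_spend * (1 - discount)

# $10,000/month of eligible N4 + N2 usage under a 46% Flex CUD:
print(f"${effective_cost(10_000):.2f}")  # → $5400.00
```

Contrast this with resource-based CUDs, where a commitment to N2 capacity would have been wasted once Shopify's workloads moved to N4.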

44:08 Justin – “So this is a solution to a problem that Google has because they’re terrible at capacity planning. Perfect.”

45:35 AI-based forecasting and analytics in BigQuery via MCP and ADK | Google Cloud Blog

  • BigQuery now offers two new AI tools for data analysis: ask_data_insights enables natural language queries against structured data using Conversational Analytics API, while BigQuery Forecast provides time-series predictions using the built-in TimesFM model without requiring separate ML infrastructure setup.
  • These tools integrate with both Google’s Agent Development Kit (ADK) and Model Context Protocol (MCP) Toolbox, allowing developers to build AI agents that can analyze BigQuery data and generate forecasts with just a few lines of code – positioning Google against AWS Bedrock and Azure OpenAI Service in the enterprise AI agent space.
  • The ask_data_insights tool provides transparency by showing step-by-step query formulation and execution logs, addressing enterprise concerns about AI black boxes when analyzing sensitive business data, while BigQuery Forecast leverages the AI.FORECAST function to deliver predictions with confidence intervals.
  • Key use cases include retail sales forecasting, web traffic prediction, and inventory management, with the demo showing Google Analytics 360 data analysis – particularly valuable for businesses already invested in Google’s analytics ecosystem who want to extract deeper insights without data science expertise.
  • Both tools are available today in the MCP Toolbox and ADK’s built-in toolset, with users only needing read access to BigQuery tables, though specific pricing details aren’t mentioned beyond standard BigQuery query and ML costs.
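As a sketch of what the forecasting call looks like from the SQL side, here's a helper that builds an AI.FORECAST query. The named arguments (`data_col`, `timestamp_col`, `horizon`) follow the pattern the post references, but treat the exact signature as an assumption and confirm against the BigQuery docs:

```python
# Illustrative builder for a BigQuery AI.FORECAST query using the built-in
# TimesFM model. Argument names are assumptions; verify before use.
def forecast_query(table: str, data_col: str, timestamp_col: str,
                   horizon: int = 30) -> str:
    return (
        "SELECT * FROM AI.FORECAST(\n"
        f"  TABLE `{table}`,\n"
        f"  data_col => '{data_col}',\n"
        f"  timestamp_col => '{timestamp_col}',\n"
        f"  horizon => {horizon})"
    )

q = forecast_query("shop.daily_sales", "revenue", "order_date", horizon=14)
print(q)
```

No model training or ML infrastructure setup is involved; the result set comes back with predictions and confidence intervals like any other query.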

46:38 Ryan – “…this is really neat. And then the fact that it does show you the logic all the way through, which I think is super important. You can ask natural-language questions, and it just comes back with a whole bunch of analysis, and then what happens if that doesn’t work consistently? How do you debug that? This is basically building it, which is how I learned anyway, so it works really well when it’s spitting out the actual config for me instead of just telling me what the results are.”

Azure

49:06 Announcing migration and modernization agentic AI tools | Microsoft Azure Blog

  • Microsoft announced agentic AI tools for migration and modernization at its Migrate and Modernize Summit, with GitHub Copilot now automating Java and .NET app upgrades, cutting work that previously took months down to days or hours.
  • Azure Migrate introduces AI-powered guidance and connects directly with GitHub Copilot for app modernization, enabling IT and developer teams to collaborate seamlessly while providing application-awareness by default and expanded support for PostgreSQL and Linux distributions.
  • The new Azure Accelerate program combines expert guidance with funding for eligible projects and includes the Cloud Accelerate Factory where Microsoft engineers provide zero-cost deployment support for over 30 Azure services.
  • GitHub Copilot’s app modernization capabilities analyze codebases, detect breaking changes, suggest migration paths, containerize code, and generate deployment artifacts – with Ford China reporting 70% reduction in time and effort for middleware app modernization.
  • This positions Microsoft competitively against AWS and GCP by addressing the 37% of application portfolios requiring modernization, though specific pricing details weren’t provided beyond the zero-cost deployment support through Azure Accelerate.

50:12 Ryan – “Get these things migrated. Because you can’t run them on these ancient frameworks that are full of vulnerabilities.”

54:32 Introducing Microsoft Marketplace — Thousands of solutions. Millions of customers. One Marketplace. – The Official Microsoft Blog

  • Microsoft unifies Azure Marketplace and AppSource into a single Microsoft Marketplace, creating one destination for cloud solutions, AI apps, and agents with over 3,000 AI offerings now available for direct integration into Azure AI Foundry and Microsoft 365 Copilot.
  • The marketplace introduces multiparty private offers and CSP integration, allowing channel partners like Arrow, Crayon, and TD SYNNEX to resell solutions through their own marketplaces while maintaining Microsoft’s security and governance standards.
  • For Azure Consumption Commitment customers, 100% of purchases for Azure benefit eligible solutions count toward their commitment, providing a financial incentive to consolidate software procurement through the marketplace.
  • Configuration time for AI apps has been reduced from 20 minutes to 1 minute per instance according to Siemens, with solutions now deployable directly within Microsoft products using Model Context Protocol (MCP) standards.
  • This positions Microsoft competitively against AWS Marketplace and Google Cloud Marketplace by offering tighter integration with productivity tools like Microsoft 365, though AWS still maintains a larger overall catalog of third-party solutions.

55:23 Justin – “I guess it’s nice to have one marketplace to rule them all, but 3,000 AI apps sounds like a lot of AI slop.”

56:59 Public Preview: Soft Delete feature in Azure Compute Gallery

  • Azure Compute Gallery now includes soft delete functionality with a 7-day retention period, allowing recovery of accidentally deleted VM images and application packages before permanent deletion.
  • This feature addresses a common operational risk where teams accidentally delete critical golden images or application templates, providing a safety net similar to AWS AMI deregistration’s 24-hour pending state.
  • The 7-day retention window aligns with typical enterprise change control cycles, giving IT teams sufficient time to detect and recover from deletion errors during weekend maintenance windows.
  • Target use cases include DevOps teams managing large image libraries, enterprises with strict compliance requirements for image retention, and managed service providers handling multiple customer environments.
  • While pricing details aren’t specified, users should expect storage costs during the retention period similar to standard gallery storage rates, making this a low-cost insurance policy against operational mistakes.
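
The retention rule itself is easy to reason about; here is a local model of the recovery window (an illustration of the policy only, not an Azure API):

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # Azure Compute Gallery soft-delete window

def recoverable(deleted_at: datetime, now: datetime) -> bool:
    """True while a soft-deleted image can still be restored;
    past the 7-day window it is permanently purged."""
    return now - deleted_at <= RETENTION

deleted = datetime(2025, 9, 20, tzinfo=timezone.utc)
assert recoverable(deleted, deleted + timedelta(days=6))        # still restorable
assert not recoverable(deleted, deleted + timedelta(days=8))    # gone for good
```

The point of the model: any alerting you build on accidental deletions has a hard 7-day deadline to fire and be acted on.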

57:21 Matt – “So essentially it’s an easy way to do upgrades versus the way AWS – and you have to press (and by press I mean type your cancel API command) to stop the rolling upgrade of the system…this also prevents the same issue that we’ve all run into where I’ve stopped sharing this across accounts and we just broke production somewhere.”

58:48 Switzerland Azure Outage

  • Azure experienced two major regional outages in September 2025 – Switzerland North suffered a 22-hour outage affecting 20+ services due to a malformed certificate prefix, while East US 2 had a 10-hour incident caused by an Allocator service issue that created cascading failures across availability zones.
  • The East US 2 incident reveals critical architectural challenges in Azure’s control plane design – aggressive retry logic meant to improve reliability actually amplified the problem, creating massive backlogs that took hours to drain even after the initial issue was resolved.
  • Both incidents highlight gaps in Azure’s incident communication systems – automated alerts covered only a subset of affected services, forcing manual notifications and public status page updates hours into the outages and leaving many customers uninformed during critical periods.
  • Microsoft’s response includes immediate fixes like reverting the problematic Allocator behavior and adjusting throttling configurations, plus longer-term improvements to load testing, backlog drainage tools, and communication systems scheduled through June 2026. (So be prepared for this to happen at least three more times before then.)
  • These outages underscore the importance of multi-region deployment strategies for mission-critical workloads – customers relying on single-region deployments faced extended downtime with no failover options during these regional control plane failures.
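
The retry-amplification failure mode described above is the classic argument for capped, jittered backoff on the client side. A minimal sketch of the standard mitigation (constants are arbitrary, for illustration):

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Capped exponential backoff with full jitter: the retry window
    doubles per attempt but never exceeds `cap`, and the random draw
    spreads retries out so clients don't hammer a recovering service
    in lockstep."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))

# Ten attempts: delays grow, then flatten at the cap instead of piling up.
delays = [round(backoff_delay(a), 2) for a in range(10)]
```

Uncapped, synchronized retries do exactly what the East US 2 postmortem describes: every client re-sends at the same instants, turning a short control-plane blip into hours of backlog drainage.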

Oracle

1:01:54 Oracle Corporation Announces Promotion of Clay Magouyrk and Mike Sicilia (2025-09-22)

  • Oracle promoted Clay Magouyrk, who previously led Oracle Cloud Infrastructure, and Mike Sicilia, who previously led Oracle Industries, to co-CEOs, signaling continued investment in cloud infrastructure and vertical market strategies despite Oracle’s distant third-place position behind AWS and Azure.
  • Magouyrk’s rise after leading OCI engineering suggests Oracle is doubling down on its infrastructure-first approach, though it will need significant innovation to close the gap with hyperscalers that have 10+ year head starts and vastly larger customer bases.
  • Sicilia’s background running Oracle Industries indicates a focus on vertical-specific solutions, a strategy that could differentiate Oracle from AWS, Azure, and GCP by leveraging its deep enterprise relationships in healthcare, financial services, and telecommunications.
  • These executive changes come as Oracle tries to position OCI as the preferred cloud for enterprise workloads, particularly for customers already invested in Oracle databases and applications who want integrated-stack benefits.
  • The promotions suggest organizational continuity at Oracle Cloud during a critical growth phase, though the real test will be whether the new co-CEOs can accelerate customer adoption beyond Oracle’s traditional installed base.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod

Titles we almost went with this week

  • Azure’s September to Remember: When Certificates and Allocators Attack
  • Shall I Compare Thee to a Sonnet? 4.5 Ways Anthropic Just Leveled Up
  • AWS provides a big red button

Follow Up

01:26 The global harms of restrictive cloud licensing, one year later | Google Cloud Blog

  • Google Cloud filed a formal complaint with the European Commission one year ago about Microsoft’s anti-competitive cloud licensing practices, specifically the 400% price markup Microsoft imposes on customers who move Windows Server workloads to non-Azure clouds.
  • The UK Competition and Markets Authority found that restrictive licensing costs UK cloud customers £500 million annually due to lack of competition, while US government agencies overspend by $750 million yearly because of Microsoft’s licensing tactics.
  • Microsoft recently disclosed that forcing software customers to use Azure is one of three pillars driving its growth and is implementing new licensing changes preventing managed service providers from hosting certain workloads on Azure competitors.
  • Multiple regulators globally including South Africa and the US FTC are now investigating Microsoft’s cloud licensing practices, with the CMA finding that Azure has gained customers at 2-3x the rate of competitors since implementing restrictive terms.
  • A European Centre for International Political Economy study suggests ending restrictive licensing could unlock €1.2 trillion in additional EU GDP by 2030 and generate €450 billion annually in fiscal savings and productivity gains.

03:32 Jonathan – “I’d feel happier about these complaints Google is making if they actually reciprocated in the US the deals they make for their customers in the EU.”

AI is Going Great – Or How ML Makes Money

05:14 Vibe working: Introducing Agent Mode and Office Agent in Microsoft 365 Copilot | Microsoft 365 Blog

  • Microsoft introduces Agent Mode for Office apps and Office Agent in Copilot chat, leveraging OpenAI’s latest reasoning models and Anthropic models to enable multi-step, iterative AI workflows for document creation.
  • This represents a shift from single-prompt AI assistance to conversational, agentic productivity where AI can evaluate results, fix issues, and iterate until outcomes are verified.
  • Agent Mode in Excel democratizes expert-level spreadsheet capabilities by enabling AI to “speak Excel” natively, handling complex formulas, data visualizations, and financial analysis tasks.
  • The system achieved notable performance on SpreadsheetBench benchmarks, and can execute prompts like creating financial reports, loan calculators, and budget trackers with full validation steps.
  • Agent Mode in Word transforms document creation into an interactive dialogue where Copilot drafts content, suggests refinements, and asks clarifying questions while maintaining Word’s native formatting. This enables faster iteration on complex documents like monthly reports and project updates through conversational prompts rather than manual editing. (FYI, this is a good way to get AI Slop, so buyer beware.)
  • The thing we’re the most excited about, however, is Office Agent in Copilot chat, which creates complete PowerPoint presentations and Word documents through a three-step process: clarifying intent, conducting web-based research with reasoning capabilities, and producing quality-checked content using code generation. (Justin, being an exec, really just likes the pretty slides.)
  • This addresses previous AI limitations in creating well-structured presentations by showing chain of thought and providing live previews.
  • The features are rolling out through Microsoft’s Frontier program for Microsoft 365 Copilot licensed customers and Personal/Family subscribers, with Excel and Word Agent Mode available on web (desktop coming soon) and Office Agent currently US-only in English.
  • This positions Microsoft to compete directly with other AI productivity tools while leveraging their existing Office ecosystem.

17:27 Justin – “There are web apps for all of them. They’re not as good as Google’s web apps, but they pretend to be.”

08:14 Introducing Claude Sonnet 4.5 \ Anthropic

  • Claude Sonnet 4.5 achieves 77.2% on SWE-bench Verified, positioning it as the leading coding model, with the ability to maintain focus for over 30 hours on complex multi-step tasks.
  • The model is available via API at $3/$15 per million tokens, matching the previous Sonnet 4 pricing.
  • The Claude Agent SDK provides developers with the same infrastructure that powers Claude Code, enabling creation of custom AI agents for various tasks beyond coding.
  • This includes memory management for long-running tasks, permission systems, and subagent coordination capabilities.
  • Computer use capabilities improved significantly with 61.4% on OSWorld benchmark (up from 42.2% four months ago), enabling direct browser navigation, spreadsheet manipulation, and task completion.
  • The Claude for Chrome extension brings these capabilities to Max subscribers.
  • New product features include checkpoints in Claude Code for progress saving and rollback, a native VS Code extension, context editing with memory tools in the API, and direct code execution with file creation (spreadsheets, slides, documents) in Claude apps.
  • Early customer results show 44% reduction in vulnerability intake time for security agents, 18% improvement in planning performance for Devin, and zero error rate on internal code editing benchmarks (down from 9%).
  • The model operates under ASL-3 safety protections with improved alignment metrics.

12:02 Ryan – “I’ve been using Sonnet 4 pretty much exclusively for coding, just because the results I’ve been getting on everything else are really hit or miss. But I definitely won’t let it go off, because it WILL go off on some tangents.”

16:22 Claude Sonnet 4.5 Is Here | Databricks Blog

  • Databricks integrates Claude Sonnet 4.5 directly into their platform through AI Functions, allowing enterprises to apply the model to governed data without moving it to external APIs.
  • This preserves data lineage and security while enabling complex analysis at scale.
  • The integration enables SQL and Python users to treat Claude as a built-in operator for analyzing unstructured data like contracts, PDFs, and images. Databricks automatically handles backend scaling from single rows to millions of records.
  • Key technical advancement is bringing AI models to data rather than exporting data to models, solving governance and compliance challenges.
  • This approach maintains existing data pipelines while adding AI capabilities for tasks like contract analysis and compliance risk detection.
  • Agent Bricks allows enterprises to build domain-specific agents using Claude Sonnet 4.5, with built-in evaluation and continuous improvement mechanisms. The platform handles model tuning and performance monitoring for production deployments.
  • Claude Sonnet 4.5 launches just seven weeks after Claude Opus 4.1, highlighting rapid model evolution.
  • Databricks’ model-agnostic approach lets enterprises switch between providers as needs change without rebuilding infrastructure.

16:31 Announcing Anthropic Claude Sonnet 4.5 on Snowflake Cortex AI

  • Snowflake now offers same-day availability of Anthropic’s Claude Sonnet 4.5 model through Cortex AI, accessible via SQL functions and REST API within Snowflake’s secure data perimeter.
  • The model shows improvements in domain knowledge for finance and cybersecurity, enhanced agentic capabilities for multi-step workflows, and achieved higher scores on SWE-bench Verified for coding tasks.
  • Enterprises can leverage Sonnet 4.5 through three main interfaces: Snowflake Intelligence for natural language business queries, Cortex AISQL for multimodal data analysis directly in SQL, and Cortex Agents for building intelligent systems that handle complex business processes.
  • The integration maintains Snowflake’s existing security and governance capabilities while processing both structured and unstructured data.
  • The model is available in supported regions with cross-region inference for non-supported areas, and Snowflake reports over 6,100 accounts using their AI capabilities in Q2 FY26.
  • Developers can access the model using simple SQL commands like AI_COMPLETE or through REST API calls for low-latency inference in native applications.
  • This partnership represents a shift toward embedding frontier AI models directly into data warehouses, allowing analysts to run advanced AI operations using familiar SQL syntax without moving data outside their secure environment.
  • This approach reduces the complexity of building AI pipelines while maintaining enterprise-grade security and governance.
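
“Simple SQL commands” really is the whole interface here; a sketch of composing an AI_COMPLETE call from Python (the model identifier is an assumption about the spelling, and real code should prefer bind variables over string interpolation):

```python
def ai_complete_sql(model: str, prompt: str) -> str:
    """Build a Cortex AI_COMPLETE statement. The manual quote-doubling
    keeps this sketch self-contained; use bind variables in production."""
    escaped = prompt.replace("'", "''")  # standard SQL single-quote escaping
    return f"SELECT AI_COMPLETE('{model}', '{escaped}') AS answer"

sql = ai_complete_sql(
    "claude-sonnet-4-5",  # model name is an assumption; check Snowflake's list
    "Summarize last quarter's churn drivers.",
)
```

Because the call runs inside the warehouse, the prompt and any referenced data never leave Snowflake’s perimeter, which is the governance argument the bullets above are making.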

16:41 Announcing SQL Server connector from Lakeflow Connect, now Generally Available | Databricks Blog

  • Databricks’ SQL Server connector for Lakeflow Connect is now GA, providing fully managed data ingestion from SQL Server to the lakehouse with built-in CDC and Change Tracking support, eliminating the need for custom pipelines or complex ETL tools.
  • The connector addresses the common challenge of SQL Server data being locked in transactional systems by enabling incremental data capture without impacting production performance, supporting both on-premises and cloud SQL Server environments through a simple point-and-click UI or API.
  • Key capabilities include automatic SCD Type 2 support for tracking historical changes, integration with Databricks Asset Bundles and Terraform for CI/CD workflows, and the ability to ingest from multiple SQL Server instances simultaneously without full table refreshes.
  • Early adopters like Cirrus Aircraft report migrating hundreds of tables in days instead of months, while Australian Red Cross Lifeblood uses it to build reliable pipelines without complex data engineering, demonstrating real-world value for enterprises moving to lakehouse architectures.
  • This release is part of Lakeflow Connect’s broader ecosystem that now includes GA connectors for ServiceNow and Google Analytics, with PostgreSQL, SharePoint, and query-based connectors for Oracle, MySQL, and Teradata coming soon.

17:35 Ryan – “This has been a challenge for a while: getting data out of these transactional databases so that you can run large reporting jobs on them. So I like any sort of ‘easy button’ that moves you out of that ecosystem.”

AWS

17:53 Introducing Claude Sonnet 4.5 in Amazon Bedrock: Anthropic’s most intelligent model, best for coding and complex agents | AWS News Blog

  • Claude Sonnet 4.5 is now available in Amazon Bedrock as Anthropic’s most advanced model, specifically optimized for coding tasks and complex agent applications with enhanced tool handling, memory management, and context processing capabilities.
  • The model introduces three key API features: Smart Context Window Management that generates responses up to available limits instead of erroring out, Tool Use Clearing for automatic cleanup of interaction history to reduce token costs, and Cross-Conversation Memory that persists information across sessions using local memory files.
  • Integration with Amazon Bedrock AgentCore enables 8-hour long-running support with complete session isolation and comprehensive observability, making it suitable for autonomous security operations, financial analysis, and research workflows that require extended processing times.
  • Claude Sonnet 4.5 excels at autonomous long-horizon coding tasks where it can plan and execute complex software projects spanning hours or days, with demonstrated strength in cybersecurity for proactive vulnerability patching and finance for transforming manual audits into intelligent risk management.
  • Access requires using inference profiles that define which AWS Regions process requests, with system-defined cross-Region profiles available for optimal performance distribution across multiple regions.
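
Using an inference profile mostly just means passing the profile ID where a bare model ID would go. A sketch of the request shape for Bedrock’s Converse API, to be sent via boto3’s bedrock-runtime `converse()` call (the profile ID shown is illustrative; list the profiles available in your Region):

```python
def converse_request(profile_id: str, prompt: str) -> dict:
    """Keyword arguments for bedrock-runtime's converse() call, using a
    cross-Region inference profile ID in place of a bare model ID."""
    return {
        "modelId": profile_id,  # a us.* cross-Region profile (illustrative)
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 1024, "temperature": 0.2},
    }

req = converse_request(
    "us.anthropic.claude-sonnet-4-5-20250929-v1:0",  # assumed profile ID
    "Draft a migration plan for this repo.",
)
```

The `us.` prefix is what routes the request through the system-defined cross-Region profile rather than pinning it to one Region.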

18:06 Justin – “I was mad because it wasn’t working, and then I remembered: oh yeah, in Bedrock you have to enable each new model one by one. So if you’re trying to use Bedrock and it’s not working, remember to update your model access.”

18:21 Amazon ECS announces IPv6-only support | Containers

  • Amazon ECS now supports IPv6-only workloads, allowing containers to run without any IPv4 dependencies while maintaining full compatibility with AWS services like ECR, CloudWatch, and Secrets Manager through native IPv6 endpoints.
  • This addresses IPv4 address exhaustion challenges and eliminates the need for NAT gateways in private subnets, reducing operational complexity and costs associated with NAT gateway hours and public IPv4 address charges.
  • The implementation requires minimal configuration changes – simply use IPv6-only subnets with your ECS tasks, and the service automatically adapts without needing IPv6-specific parameters, supporting awsvpc, bridge, and host networking modes.
  • Migration strategies include in-place updates for non-load-balanced services or blue-green deployments using weighted target groups for ALB/NLB workloads, with DNS64/NAT64 available for connecting to IPv4-only internet services.
  • Federal agencies and organizations with IPv6 compliance requirements can now run containerized workloads that meet regulatory mandates while simplifying their network architecture and improving security posture through streamlined access control.
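
The DNS64/NAT64 escape hatch works by embedding the IPv4 destination inside the RFC 6052 well-known prefix; Python’s stdlib can show the synthesized address an IPv6-only task would actually connect to:

```python
import ipaddress

WKP = ipaddress.IPv6Network("64:ff9b::/96")  # RFC 6052 well-known NAT64 prefix

def synthesize_nat64(ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the NAT64 well-known /96 prefix, as a
    DNS64 resolver does when answering an IPv6-only client's query for
    an IPv4-only service."""
    return ipaddress.IPv6Address(
        int(WKP.network_address) | int(ipaddress.IPv4Address(ipv4))
    )

print(synthesize_nat64("198.51.100.7"))  # 64:ff9b::c633:6407
```

The NAT64 gateway then translates traffic to that synthesized address back into IPv4, which is how IPv6-only ECS tasks can still reach IPv4-only internet endpoints.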

18:57 Amazon EC2 Auto Scaling now supports Internet Protocol Version 6 (IPv6)

  • EC2 Auto Scaling now supports IPv6 in dual-stack configuration alongside IPv4, addressing the growing scarcity of IPv4 addresses and enabling virtually unlimited scaling for applications.
  • The dual-stack approach allows gradual migration from IPv4 to IPv6, reducing risk during transitions while providing contiguous IP ranges that simplify microservice architectures and network management.
  • This update arrives as enterprises face IPv4 exhaustion challenges, with IPv6 adoption becoming essential for large-scale deployments and IoT workloads that require extensive address spaces.
  • Available in all commercial AWS regions except New Zealand (we’re not sure what the deal is there, but sorry Kiwis).
  • The feature integrates with existing VPC configurations and requires no additional charges beyond standard EC2 and networking costs.
  • Organizations running containerized workloads or microservices architectures will benefit from simplified IP management and the ability to assign dedicated ranges to each service without address constraints.

19:47 Matt – “It is amazing how fast that IPv4 cost does add up in your account, especially if you have load balancers, multiple subnets, and you’re running multiple ECS containers in public subnets for some reason.”

20:36 Amazon EC2 Allowed AMIs setting adds new parameters for enhanced AMI governance

  • EC2’s Allowed AMIs setting now supports four new parameters – marketplace codes, deprecation time, creation date, and AMI names – giving organizations more granular control over which Amazon Machine Images can be discovered and launched across their AWS accounts.
  • The marketplace codes parameter addresses a common security concern by allowing teams to restrict usage to specific vetted marketplace AMIs, while deprecation time and creation date parameters help enforce policies against outdated or potentially vulnerable images.
  • AMI name parameter enables enforcement of naming conventions, which is particularly useful for large organizations that use standardized naming patterns to indicate compliance status, department ownership, or approved software stacks.
  • These parameters integrate with AWS Declarative Policies for organization-wide governance, allowing central IT teams to enforce AMI compliance across hundreds or thousands of accounts without manual intervention.
  • The feature is available in all AWS regions at no additional cost and represents a practical solution to the challenge of shadow IT and unauthorized software deployment in cloud environments.
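
The four parameters compose into one allow/deny decision per AMI. A local illustration of those checks (the real feature is an account-level setting enforced by EC2 itself, not client-side code, and the field names here are made up for the sketch):

```python
from datetime import datetime, timedelta, timezone
import re

def ami_allowed(ami: dict, allowed_codes: set, name_regex: str,
                max_age_days: int) -> bool:
    """Mirror of the Allowed AMIs checks: marketplace-code allow-list,
    naming convention, creation-date freshness, and deprecation status."""
    now = datetime.now(timezone.utc)
    if ami["marketplace_code"] not in allowed_codes:
        return False                              # unvetted marketplace image
    if not re.fullmatch(name_regex, ami["name"]):
        return False                              # breaks naming convention
    if now - ami["created"] > timedelta(days=max_age_days):
        return False                              # too old to trust
    dep = ami.get("deprecated_at")
    return dep is None or dep > now               # reject deprecated images

fresh = {"marketplace_code": "acme123", "name": "corp-web-2025.09",
         "created": datetime.now(timezone.utc), "deprecated_at": None}
```

In the real feature you express the same constraints declaratively and EC2 hides non-matching AMIs from discovery and launch across the organization.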

25:07 Jonathan – “Just wait six months, they’ll all have the same features anyway.”

26:00 Amazon EC2 Auto Scaling now supports forced cancellation of instance refreshes

  • EC2 Auto Scaling now allows forced cancellation of instance refreshes by setting WaitForTransitioningInstances to false in the CancelInstanceRefresh API, enabling immediate abort without waiting for in-progress launches or terminations to complete.
  • This feature addresses emergency scenarios where rapid roll forward is needed, such as when a current deployment causes service disruptions and teams need to quickly abandon the problematic refresh and start a new one.
  • The enhancement provides better control over Auto Scaling group updates by bypassing lifecycle hooks and pending instance activities, reducing downtime during critical deployment issues.
  • Available in all AWS regions including GovCloud, this feature integrates with existing Auto Scaling workflows and requires no additional cost beyond standard EC2 and Auto Scaling charges.
  • For organizations using instance refreshes for configuration updates or deployments, this capability reduces recovery time objectives (RTO) when deployments go wrong, particularly valuable for production environments requiring quick remediation.
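
The forced path is one boolean on the existing call. A sketch of the parameters you would pass to the Auto Scaling `CancelInstanceRefresh` API (via boto3’s `cancel_instance_refresh`; treat exact casing as per the API docs):

```python
def cancel_refresh_params(asg_name: str, force: bool) -> dict:
    """Parameters for CancelInstanceRefresh. With
    WaitForTransitioningInstances set to False, the cancellation returns
    immediately instead of waiting for in-flight launches/terminations."""
    return {
        "AutoScalingGroupName": asg_name,
        "WaitForTransitioningInstances": not force,
    }

# Emergency roll-forward: abort the bad refresh now, then start a new one.
params = cancel_refresh_params("web-asg", force=True)
```

The hypothetical `force` flag is just this sketch’s way of naming the choice; the API only sees the boolean.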

26:38 Justin – “I was like, this isn’t really that big of an issue, and then I remembered well, I’ve had a really big autoscaling group, and this could be a really big problem. If you have like 5 webservers, you probably don’t care. But if you have hundreds? This could be a big lifesaver for you.”

29:00 Announcing Amazon ECS Managed Instances for containerized applications | AWS News Blog

  • Your hosts spent quite a bit of time arguing about this one…
  • Amazon ECS Managed Instances bridges the gap between serverless simplicity and EC2 flexibility by providing fully managed container compute that supports all EC2 instance types including GPUs and specialized architectures while AWS handles provisioning, scaling, and security patching.
  • The service automatically selects cost-optimized instances by default but allows customers to specify up to 20 instance attributes when workloads require specific capabilities, addressing the limitation that prevented customers with EC2 pricing commitments from using serverless options.
  • Infrastructure management includes automated security patches every 14 days using Bottlerocket OS, intelligent task placement to consolidate workloads onto fewer instances, and automatic termination of idle instances to optimize costs.
  • Pricing consists of standard EC2 instance costs plus a management fee, initially available in 6 regions including US East, US West, Europe, Africa, and Asia Pacific with support for console, CLI, CDK, and CloudFormation deployment.
  • For The Cloud Pod specifically, the management fee for a single node came to $0.03.
  • This addresses a key customer pain point where teams wanted serverless operational simplicity but needed specific compute capabilities like GPU acceleration or particular CPU architectures that weren’t available in Fargate.

30:12 Justin – “I love Fargate, but I don’t like paying for Fargate. That’s why I run our Cloud Pod website on an EC2 instance because it’s way cheaper. So for three cents more a gig versus going to Fargate, this is probably where I would land if I didn’t really want to manage the host.”

33:11 Announcing AWS Outposts third-party storage integration with Dell and HPE | AWS News Blog

  • AWS Outposts now integrates with Dell PowerStore and HPE Alletra Storage MP B10000 arrays, joining existing support for NetApp and Pure Storage, allowing customers to use their third-party storage investments with Outposts through native AWS tooling.
  • The integration supports both data and boot volumes with two boot methods – iSCSI SANboot for read/write volumes and Localboot for read-only volumes using iSCSI or NVMe-over-TCP protocols, manageable through the EC2 Launch Instance Wizard.
  • This addresses two key customer needs: organizations migrating VMware workloads who need to maintain existing storage during transition, and companies with strict data residency requirements that must keep data on-premises while using AWS services.
  • Available at no additional charge across all Outposts form factors (2U servers and both rack generations) in all supported regions, with AWS-verified AMIs for Windows Server 2022 and RHEL 9 plus automation scripts on AWS Samples.
  • Second-generation Outposts racks can now combine doubled compute performance (2x vCPU, memory, and network bandwidth) with customers’ preferred storage arrays, providing flexibility for hybrid cloud deployments.

34:37 Jonathan – “It’s more that you can opt not to have AWS provide the storage layer, but you can have them still support S3 and EBS and those other things on top of this third-party storage subsystem.”

GCP

36:35 Introducing Flex-start VMs for the Compute Engine Instance API. | Google Cloud Blog

  • Google launches Flex-start VMs in GA, a new consumption model that queues GPU requests for up to 2 hours instead of failing immediately, addressing the persistent challenge of GPU scarcity for AI workloads.
  • This appears to be unique among major cloud providers – rather than competing on raw capacity, Google is innovating on the access model itself by introducing a fair queuing system with significant discounts compared to on-demand pricing.
  • The service integrates directly with Compute Engine’s existing instance API and CLI, allowing easy adoption into current workflows without requiring migration to a separate scheduling service, with VMs running for up to 7 days uninterrupted.
  • Key use cases include AI model fine-tuning, batch inference, and HPC workloads that can tolerate delayed starts in exchange for better resource availability and lower costs, particularly valuable for research and development teams.
  • The stop/start capability with automatic re-queuing and configurable termination actions (preserving VM state after 7 days) provides flexibility for long-running experiments while managing costs effectively.
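
A mental model of the queueing behavior (purely illustrative; Google hasn’t published the scheduler’s internals): requests wait FIFO for up to the 2-hour window, start when capacity frees up, and fall out if they wait too long.

```python
from collections import deque

MAX_WAIT_S = 2 * 3600  # flex-start queues a request for up to ~2 hours

def admit(queue: deque, now: float, free_gpus: int):
    """Fair FIFO admission sketch: start the oldest requests that fit,
    expire anything that has waited past the window, and leave the rest
    queued in order. Each queue entry is (submit_time, request)."""
    started, expired = [], []
    while queue:
        submitted, req = queue[0]
        if now - submitted > MAX_WAIT_S:
            expired.append(queue.popleft()[1])   # waited too long
        elif free_gpus > 0:
            started.append(queue.popleft()[1])   # capacity available
            free_gpus -= 1
        else:
            break                                # head keeps its place in line
    return started, expired

q = deque([(0.0, "old-job"), (7000.0, "tune-a"), (7100.0, "tune-b")])
started, expired = admit(q, now=7300.0, free_gpus=1)
```

Contrast with plain on-demand requests, which would simply fail at submit time when no GPU is free.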

37:32 Ryan – “I love this. This is great. You’re still going to see a whole bunch of data scientists spamming the workbooks trying to get this to run, but I do think that from a pure capacity standpoint this is the right answer to some of these things, just because a lot of these jobs are very long running and it’s not really instant results.”

39:52 GKE Autopilot now available to all qualifying clusters | Google Cloud Blog

  • GKE Autopilot features are now available in Standard clusters through compute classes, allowing existing GKE users to access container-optimized compute without migrating to dedicated Autopilot clusters – this brings efficient bin-packing and rapid scaling to 70% of GKE clusters that weren’t using Autopilot mode.
  • The container-optimized compute platform starts at just 50 milli-CPU (5% of one core) and scales to 28 vCPU, with customers only paying for requested resources rather than entire nodes – addressing the common Kubernetes challenge of overprovisioning and wasted compute capacity.
  • New automatic provisioning for compute classes lets teams gradually adopt Autopilot features alongside existing node pools without disrupting current workloads, solving the previous all-or-nothing approach that made migration risky for production environments.
  • AI workloads can now run on GPUs and TPUs with Autopilot’s managed node properties and enterprise-grade security controls, competing directly with AWS EKS Auto Mode and Azure AKS automatic node provisioning but with tighter integration to Google’s AI ecosystem.
  • Available starting with GKE version 1.33.1 in the Rapid release channel, with 30% of new GKE clusters already created in Autopilot mode in 2024, suggesting strong customer adoption of managed Kubernetes operations.

37:32 Ryan – “So now you can have not only dedicated compute, but preemptible and now autopilot capacity all in the single cluster. Kind of cool.”

41:58 Gemini CLI extensions for Google Data Cloud | Google Cloud Blog

  • Google launches Gemini CLI extensions for Data Cloud services including Cloud SQL, AlloyDB, and BigQuery, enabling developers to manage databases and run analytics directly from their terminal using natural language prompts.
  • What could go wrong?
  • The extensions allow developers to provision databases, create tables, generate APIs, and perform data analysis through conversational commands, potentially reducing the time needed for common database operations and eliminating context switching between tools.
  • BigQuery’s extension includes AI-powered forecasting capabilities and conversational analytics APIs, letting users ask business questions in natural language and receive insights without writing SQL queries.
  • This positions Google against AWS’s recent CodeWhisperer CLI integration and Azure’s GitHub Copilot CLI, though Google’s approach focuses specifically on data services rather than general cloud operations.
  • Key use cases include rapid prototyping for startups, data exploration for analysts who aren’t SQL experts, and streamlining database operations for DevOps teams managing multiple Cloud SQL or AlloyDB instances.

43:28 Announcing Claude Sonnet 4.5 on Vertex AI | Google Cloud Blog

  • Surprise surprise…
  • Google Cloud now offers Claude Sonnet 4.5 on Vertex AI, Anthropic’s most advanced model designed for autonomous agents that can work independently for hours on complex coding, cybersecurity, financial analysis, and research tasks.
  • The integration includes Vertex AI’s Agent Development Kit and Agent Engine for building multi-agent systems, plus provisioned throughput for dedicated capacity at fixed costs, addressing enterprise needs for reliable AI deployment.
  • Claude Sonnet 4.5 supports a 1 million token context window, batch predictions, and prompt caching on Vertex AI, with global endpoint routing that automatically serves traffic from the nearest available region for reduced latency.
  • Customers like Augment Code, spring.new, and TELUS are already using Claude on Vertex AI, with spring.new reporting application development time reduced from three months to 1-2 hours using natural language prompts.
  • The model is available through Vertex AI Model Garden and Google Cloud Marketplace, with VS Code extension support and Claude Code 2.0 terminal interface featuring checkpoints for more autonomous development operations.

43:51 Adopt new VM series with GKE compute classes, Flexible CUDs | Google Cloud Blog

  • GKE compute classes let you define a prioritized list of machine families for autoscaling, automatically falling back to alternative VM types if your preferred option isn’t available – solving the challenge of adopting new Gen4 machines like N4 and C4 while maintaining workload availability.
  • Compute Flexible CUDs provide spend-based discounts up to 46% that follow your workload across different machine families, unlike resource-based CUDs that lock you to specific VM types – enabling financial flexibility when migrating between machine generations.
  • The combination addresses real adoption barriers: compatibility testing through gradual rollouts, regional capacity constraints with automatic fallbacks, and financial commitment alignment by allowing discounts to apply across multiple VM families including both new and legacy options.
  • Shopify successfully used this approach during Black Friday/Cyber Monday 2024, prioritizing new N4 machines with N2 fallbacks to handle massive scale while maintaining cost optimization through Flex CUDs.
  • This approach particularly benefits organizations running large GKE fleets or high-performance workloads that want to leverage new C4/C4D series VMs for better price-performance without sacrificing availability or losing existing discount commitments.

44:08 Justin – “So this is a solution to a problem that Google has because they’re terrible at capacity planning. Perfect.”
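The prioritized-fallback behavior described above is expressed through a GKE ComputeClass resource. A minimal sketch, assuming the field names match the ComputeClass CRD as we understand it (verify against current GKE documentation):

```yaml
# Hedged sketch of a GKE ComputeClass: prefer Gen4 N4 machines,
# fall back to N2 when N4 capacity is unavailable.
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: n4-with-n2-fallback
spec:
  priorities:
    - machineFamily: n4
    - machineFamily: n2
  nodePoolAutoCreation:
    enabled: true   # let GKE provision matching node pools automatically
```

Workloads would then opt in via a node selector on the class name (e.g. `cloud.google.com/compute-class: n4-with-n2-fallback`), so the autoscaler walks the priority list per pod rather than per cluster.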

45:35 AI-based forecasting and analytics in BigQuery via MCP and ADK | Google Cloud Blog

  • BigQuery now offers two new AI tools for data analysis: ask_data_insights enables natural language queries against structured data using Conversational Analytics API, while BigQuery Forecast provides time-series predictions using the built-in TimesFM model without requiring separate ML infrastructure setup.
  • These tools integrate with both Google’s Agent Development Kit (ADK) and Model Context Protocol (MCP) Toolbox, allowing developers to build AI agents that can analyze BigQuery data and generate forecasts with just a few lines of code – positioning Google against AWS Bedrock and Azure OpenAI Service in the enterprise AI agent space.
  • The ask_data_insights tool provides transparency by showing step-by-step query formulation and execution logs, addressing enterprise concerns about AI black boxes when analyzing sensitive business data, while BigQuery Forecast leverages the AI.FORECAST function to deliver predictions with confidence intervals.
  • Key use cases include retail sales forecasting, web traffic prediction, and inventory management, with the demo showing Google Analytics 360 data analysis – particularly valuable for businesses already invested in Google’s analytics ecosystem who want to extract deeper insights without data science expertise.
  • Both tools are available today in the MCP Toolbox and ADK’s built-in toolset, with users only needing read access to BigQuery tables, though specific pricing details aren’t mentioned beyond standard BigQuery query and ML costs.

46:38 Ryan – “…this is really neat. And then the fact that it does show you the logic all the way through, which I think is super important. You can ask natural-line questions, and it just comes back with a whole bunch of analysis, and then what happens if that doesn’t work consistently? How do you debug that? This is basically building it, which is how I learned anyway, so it works really well when it’s spitting out the actual config for me instead of just telling me what the results are.”
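The forecasting side of this is exposed as a BigQuery table function. A hedged sketch of what a call might look like – the argument names follow the announcement, and the table and column names are hypothetical:

```sql
-- Hedged sketch: AI.FORECAST argument names are assumptions from the
-- announcement; the dataset, table, and columns are hypothetical.
SELECT *
FROM AI.FORECAST(
  TABLE `mydataset.daily_sales`,
  data_col => 'units_sold',
  timestamp_col => 'sale_date',
  horizon => 30)  -- predict the next 30 time steps
```

The result set would include the predicted values plus the confidence intervals mentioned above, with no separate model training or ML infrastructure to stand up.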

Azure

49:06 Announcing migration and modernization agentic AI tools | Microsoft Azure Blog

  • Microsoft announced agentic AI tools for migration and modernization at their Migrate and Modernize Summit, with GitHub Copilot now automating Java and .NET app upgrades, cutting work that previously took months down to days or hours.
  • Azure Migrate introduces AI-powered guidance and connects directly with GitHub Copilot for app modernization, enabling IT and developer teams to collaborate seamlessly while providing application-awareness by default and expanded support for PostgreSQL and Linux distributions.
  • The new Azure Accelerate program combines expert guidance with funding for eligible projects and includes the Cloud Accelerate Factory where Microsoft engineers provide zero-cost deployment support for over 30 Azure services.
  • GitHub Copilot’s app modernization capabilities analyze codebases, detect breaking changes, suggest migration paths, containerize code, and generate deployment artifacts – with Ford China reporting 70% reduction in time and effort for middleware app modernization.
  • This positions Microsoft competitively against AWS and GCP by addressing the 37% of application portfolios requiring modernization, though specific pricing details weren’t provided beyond the zero-cost deployment support through Azure Accelerate.

50:12 Ryan – “Get these things migrated. Because you can’t run them on these ancient frameworks that are full of vulnerabilities.”

54:32 Introducing Microsoft Marketplace — Thousands of solutions. Millions of customers. One Marketplace. – The Official Microsoft Blog

  • Microsoft unifies Azure Marketplace and AppSource into a single Microsoft Marketplace, creating one destination for cloud solutions, AI apps, and agents with over 3,000 AI offerings now available for direct integration into Azure AI Foundry and Microsoft 365 Copilot.
  • The marketplace introduces multiparty private offers and CSP integration, allowing channel partners like Arrow, Crayon, and TD SYNNEX to resell solutions through their own marketplaces while maintaining Microsoft’s security and governance standards.
  • For Azure Consumption Commitment customers, 100% of purchases for Azure benefit eligible solutions count toward their commitment, providing a financial incentive to consolidate software procurement through the marketplace.
  • Configuration time for AI apps has been reduced from 20 minutes to 1 minute per instance according to Siemens, with solutions now deployable directly within Microsoft products using Model Context Protocol (MCP) standards.
  • This positions Microsoft competitively against AWS Marketplace and Google Cloud Marketplace by offering tighter integration with productivity tools like Microsoft 365, though AWS still maintains a larger overall catalog of third-party solutions.

55:23 Justin – “I guess it’s nice to have one marketplace to rule them all, but 3,000 AI apps sounds like a lot of AI slop.”

56:59 Public Preview: Soft Delete feature in Azure Compute Gallery

  • Azure Compute Gallery now includes soft delete functionality with a 7-day retention period, allowing recovery of accidentally deleted VM images and application packages before permanent deletion.
  • This feature addresses a common operational risk where teams accidentally delete critical golden images or application templates, providing a safety net similar to AWS AMI deregistration’s 24-hour pending state.
  • The 7-day retention window aligns with typical enterprise change control cycles, giving IT teams sufficient time to detect and recover from deletion errors during weekend maintenance windows.
  • Target use cases include DevOps teams managing large image libraries, enterprises with strict compliance requirements for image retention, and managed service providers handling multiple customer environments.
  • While pricing details aren’t specified, users should expect storage costs during the retention period similar to standard gallery storage rates, making this a low-cost insurance policy against operational mistakes.

57:21 Matt – “So essentially it’s an easy way to do upgrades versus the way AWS – and you have to press (and by press I mean type your cancel API command) to stop the rolling upgrade of the system…this also prevents the same issue that we’ve all run into where I’ve stopped sharing this across accounts and we just broke production somewhere.”

58:48 Switzerland Azure Outage

  • Azure experienced two major regional outages in September 2025 – Switzerland North suffered a 22-hour outage affecting 20+ services due to a malformed certificate prefix, while East US 2 had a 10-hour incident caused by an Allocator service issue that created cascading failures across availability zones
  • The East US 2 incident reveals critical architectural challenges in Azure’s control plane design – aggressive retry logic meant to improve reliability actually amplified the problem by creating massive backlogs that took hours to drain even after the initial issue was resolved
  • Both incidents highlight gaps in Azure’s incident communication systems – automated alerts only covered a subset of affected services, forcing manual notifications and public status page updates hours into the outages, leaving many customers uninformed during critical periods
  • Microsoft’s response includes immediate fixes like reverting the problematic Allocator behavior and adjusting throttling configurations, plus longer-term improvements to load testing, backlog drainage tools, and communication systems scheduled through June 2026. (So be prepared for this to happen at least three more times before then.)
  • These outages underscore the importance of multi-region deployment strategies for mission-critical workloads – customers relying on single-region deployments faced extended downtime with no failover options during these regional control plane failures.

Oracle

1:01:54 Oracle Corporation Announces Promotion of Clay Magouyrk and Mike Sicilia (2025-09-22)

  • Oracle promoted Clay Magouyrk, who has led Oracle Cloud Infrastructure, and Mike Sicilia, who has led Oracle Industries, signaling continued investment in cloud infrastructure and vertical market strategies despite Oracle’s distant third-place position behind AWS and Azure.
  • Magouyrk’s promotion after leading OCI engineering suggests Oracle is doubling down on their infrastructure-first approach, though they’ll need significant innovation to close the gap with hyperscalers who have 10+ year head starts and vastly larger customer bases.
  • Sicilia’s elevation to lead Oracle Industries indicates a focus on vertical-specific solutions, a strategy that could differentiate Oracle from AWS/Azure/GCP by leveraging their deep enterprise relationships in healthcare, financial services, and telecommunications.
  • These executive changes come as Oracle tries to position OCI as the preferred cloud for enterprise workloads, particularly for customers already invested in Oracle databases and applications who want integrated stack benefits.
  • The promotions suggest organizational stability at Oracle Cloud during a critical growth phase, though the real test will be whether new leadership can accelerate customer adoption beyond Oracle’s traditional installed base.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod
