# 357: Cache Me If You Can – Now With Durability
Duration: 60 minutes
Speakers: Justin Brodley, Justin
Date: 2026-06-10

## Transcript

[00:07] Justin Brodley: Welcome to The Cloud Pod, where the forecast is always cloudy.

[00:10] Justin: We talk weekly about all things AWS, GCP, and Azure. We are your hosts, Justin, Jonathan, Ryan, and Matt.

[00:18] Justin Brodley: Episode 357, recorded for June 2nd, 2026. Cache Me If You Can, Now With Durability. Good evening, Matt. How you doing?

[00:29] Justin: Good.

[00:29] Justin Brodley: How are you? Uh, you know, it's, uh, June. I don't know how that happened. Cause I swear yesterday was like January 1st. So it's just like, I see on the opposite.

[00:39] Justin: I feel like last week, when I think it was just you and Ryan, was like 6 years ago.

[00:44] Justin Brodley: Yeah. Yeah. I mean, I have that going on as well, but like the fact that it's June, I can't process either. So I don't know. But, uh, it just feels like every day there's something going on in the world or AI or, you know, the day job or, you know, people reach out to the podcast to do stuff. Like it's busy, it's busy, busy.

[01:05] Justin: Uh, which is— Busy in a good way.

[01:06] Justin Brodley: In a good way. Yeah, for sure. Uh, well, you know, we have a bunch of stuff to get to this week, including Microsoft Build, which happened, uh, since last time we recorded, which we did not do predictions for because Matt forgot. That's how I like to think about it. But we wouldn't, we don't, we'd never done Build. So in fairness, it's just like we don't do Google I/O, because they're both kind of like a little bit pie in the sky, uh, conferences or, or keynotes where they're like, you know, it's the art of the possible versus things you can kind of predict. So we, we typically don't do Build or Google I/O for that reason. But, uh, let's still talk about it though, cuz they had some cool stuff they announced.

[01:40] Justin: Yeah, they definitely have. Build, I feel like, started adding more Azure stuff, unlike Google I/O. Mm-hmm. So maybe at one point, but I don't know, we'll figure that out in the future.

[01:49] Justin Brodley: I mean, AI is blurring the lines of what cloud and development is anyway, so it makes sense that it's happening. Hmm. Uh, well, Microsoft, uh, has apparently canceled their internal Claude Code licenses just months after encouraging widespread adoption, redirecting employees to GitHub Copilot CLI instead. This does not affect their broader Foundry partnership with Anthropic, but it signals that token costs at scale become difficult to justify internally. Uh, this adds on to Uber's situation, which happened a little bit ago, where they said they burned their entire 2026 AI coding tools budget in 4 months after internal teams were incentivized to compete on usage. Yeah, don't token max. It's a bad plan. Uh, the core economic tension worth discussing is that AI tooling cost at scale can actually undercut the labor savings argument. When compute bills approach or exceed payroll savings, the ROI case for broad AI deployments gets more complicated for finance and engineering leaders to defend. Companies appear to be responding with tighter governance rather than full rollbacks, including usage caps, narrower approvals, and more targeted deployments focused on measuring productivity gains. There's also an infrastructure cost layer beyond software licensing as AI workloads drive substantial data center energy and water consumption. Of course, the other side of this too is, you know, the foundational models, Gemini and Anthropic, do a good job and they are great, but they're expensive. And so if you can use open models, uh, you can save yourself some money. And so a lot of companies are also doing a lot of experimentation with open models like GLM, Qwen, Kimi 2.6, etc., to see if there's different alternatives. And so, you know, spinning up GPUs on the cloud that you then run an open model with could save you a lot of money.
So this is going to be an interesting area to see what happens in the FinOps space as we start getting now more maturity in that area. Do we start seeing open models become a bigger deal and customers starting to look at different options beyond the foundational model? Uh, because yeah, these things are getting expensive. Now, you know, the whole Uber thing, I don't know what their budget was, but you know, the fact that they're saying, well, we ran out in 4 months, so are you just really bad at budgeting, or did you really, uh, overdo it that much? And I think it's a matter of a little bit of both, but, uh, yeah, do take some of those, uh, claims with some level of skepticism. Yeah.

[03:57] Justin: I mean, it's also the way people are using it, right? Like, I don't sit there and develop in Opus every single second. And especially as they started adding the larger context windows, people don't realize how fast that burns your budget. And you know, if you use it right, I think you and I have talked about it and others have talked about it, where you use different models in different locations, right? So if you're doing more of a project plan, start with the more expensive model and then drop down to the lower model for the more individual tasks that you're working on. But I do agree that these things are expensive and I think reality is starting to hit, especially with GitHub changing their model starting this week, you know, where it's more pay for what you use. And I think you're going to see FinOps really grow in it, you know, especially around LLMs, you know, they'll probably start to have a FOCUS output so you can start to really see what people are doing, because I think a lot of people are just told use more AI, use more AI. And there's the joke going around the internet for a while now. It's like, you know, you get a prize for using more tokens at work, but it shouldn't be that. It should be, how do you accomplish your work in a cost-effective manner? You know, it kind of also goes back to, was it Vogels' keynote 2 years ago? The, um, Frugal Architect? Might have been 3 years ago. I don't really have a sense of time anymore in my life. But the same concepts apply here, right? Don't just throw everything at it. Target your code base. You know, don't just say, hey, go look at all the logs from production for the last 48 hours and find the one bug. No, say, hey, go query production logs for 7:50, when we had this error that told us the podcast starts in 6 days, not in 1 hour. Might be something I'm doing right now. You know, but you gotta use stuff in the most effective way.
And right now I think a lot of people are just, you know, using, what's the phrase, a hammer for every solution versus, okay, maybe we can do something more targeted in this location that will save a ton of money.

[06:04] Justin Brodley: Yeah, I agree. So not only, uh, you know, is Microsoft saying that AI could be more expensive, but, uh, TechCrunch had an article saying that tech CEOs are apparently suffering from AI psychosis. Box CEO Aaron Levie coined the term, uh, to describe how executives overestimate AI capabilities because they interact with polished demos and prototypes rather than the messy last-mile work required to actually deploy and maintain AI systems in production. Layoff data is worth noting, with 115,000 tech jobs having been cut so far in the first 5 months of 2026, nearly matching all of what was laid off in 2025, with many companies citing AI productivity gains as justification, even when other business factors are driving the decision. Research does not support the productivity assumptions behind those decisions. A UC Berkeley meta-analysis found no robust relationship between AI adoption and aggregate productivity gain. And MIT researchers project agents will reach base competence on most text tasks by 2029 and will need additional years to outperform humans. A Harvard Business Review study identified a practical bottleneck problem: when AI increases output volume across an organization, the constraint shifts to the executives who must review and authorize that output. That can create organizational slowdowns rather than efficiency gains. And for cloud practitioners and developers, the practical takeaway is that AI agents still require substantial human review for code, contract terms, and hallucinated library calls. So keep that in mind as you're listening to all the hype.

[07:22] Justin: Yeah, I mean, I think the world is not just, you know, tech CEOs. I feel like the world's starting to get sick of hearing the term AI. We joke around about it on the podcast too. We're slowly becoming the AI podcast versus, you know, The Cloud Pod. And I think people are using AI as an excuse to cut people, not always just doing it for logic's sake. You know, there's a lot of other, you know, things going on in the world that are causing enough concerns that companies are laying people off and using AI as the excuse. It's just an excuse, I feel like, in a lot of places. You know, and while it definitely can increase productivity, again, like I feel like I just said, you gotta use AI in a smart way and in a targeted way. Don't use the hammer on everything, you know, figure out where it works and how it works, especially when you're getting into legal and other things. While it can validate things, you still need a human in the loop on a lot of things.

[08:25] Justin Brodley: Well, we'll continue to see, uh, I assume layoffs. I, I mean, I think a lot of these layoffs that are being blamed on AI are also just general business problems. And so as long as it's a convenient excuse the market isn't going to punish you for, if you're publicly traded, why not blame the AI? I think it's really what it comes down to, right?

[08:41] Justin: And it, you know, it also looks good. Oh look, we're using AI to make us be leaner and better. So it kind of ends up with, with a positive.

[08:49] Justin Brodley: I mean, I look at the, like, you know, number of middle managers who are gone, you know, who are being laid off in these things, and spans of control that are getting increased. I'm like, you know, eventually, when you do need to start doing other things, this becomes a much more complicated, uh, problem in the future where now you're dealing with how do you do performance management when you don't have enough managers to do the work? And so there's all kinds of things like, I'm like, yes, it makes sense that you think this is a great, a great plan, but, uh, the reality is that there's a lot of risk in these things. All right, moving on to AI Is How ML Makes Money this week. Uh, Databricks. Databricks is introducing always-on pricing for Lakebase, its managed PostgreSQL offering, which gives a 25% discount on baseline compute capacity while retaining full auto-scaling for traffic spikes, eliminating the traditional forced choice between provisioned and serverless database tiers. The pricing model activates automatically after 24 hours of continuous use with scale-to-zero disabled, requiring no new contracts, no downtime, and no separate product provisioning. Databricks recommends keeping scale-to-zero as the default for new or intermittent workloads where load patterns are unknown, and switching to always-on only once historical usage data shows a consistent baseline of activity. Now through January 31st, 2027, you'll get an additional 50% promotional discount stacked on top of the always-on rate, which would meaningfully reduce cost for production PostgreSQL workloads running continuously on Lakebase. This is a big commercial model change and so something to factor into all your FinOps models around Databricks. That's nice.
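
For anyone folding the Lakebase numbers into a FinOps model, here is a minimal sketch of the discount arithmetic. It assumes the 50% promotion compounds multiplicatively on the already discounted always-on rate, which is one reading of "on top of"; an additive reading would give a different figure. The function name and the $1.00 list rate are hypothetical and do not come from Databricks.

```python
def always_on_rate(list_rate: float,
                   always_on_discount: float = 0.25,
                   promo_discount: float = 0.50) -> float:
    """Effective baseline compute rate after both discounts.

    Assumption: the promo applies to the already discounted always-on
    rate (multiplicative stacking), not to the list rate.
    """
    for d in (always_on_discount, promo_discount):
        if not 0 <= d < 1:
            raise ValueError("discounts must be in [0, 1)")
    return list_rate * (1 - always_on_discount) * (1 - promo_discount)


if __name__ == "__main__":
    list_rate = 1.00  # hypothetical list price per compute-hour
    print(always_on_rate(list_rate))                      # during the promotion
    print(always_on_rate(list_rate, promo_discount=0.0))  # after the promotion ends
```

Under that assumption, baseline compute costs 37.5% of list price during the promotion and 75% afterwards. Capacity above the baseline that auto-scaling adds is not modeled here.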

[10:16] Justin: I kind of like this model, you know, where they give you the discount because it's been running for X period of time. I've always kind of appreciated that about the Google model. You know, AWS, you have to say, hey, we're, we're gonna do a savings plan and whatnot, and then you get that. And I know Google has kind of an equivalent, but just because it's running for, was it, is it Google 30 days, then they drop it a little bit or something like that?

[10:43] Justin Brodley: So, um, I think so.

[10:45] Justin: Yeah. Yeah. Like I kind of appreciate it's like it's less management and it feels a little bit like they're just not trying to take money for the sake of taking money. You know, if you have an EC2 instance running for 3 years, Amazon at one point just assumes that thing's gonna be running, but they won't give you a discount if you're on on-demand. So it's kind of nice that they're kind of giving this benefit back to the people. To the customers.

[11:14] Justin Brodley: Well, it's, you know, the reality is that a lot of the things that you pay for are the automation and deploying the server and updating the things. And so if you're not running those code paths and you're not running those architectures, then, you know, the company's also saving money, which is why they can turn some of that savings over to you. But yeah, I mean, the ideal thing is you want to keep people sticky. And so if I can keep myself running, I think it's a good, a good model and it makes sense to me. Well, Anthropic had a busy week this last week. First, they introduced Claude Opus 4.8, an incremental upgrade over Opus 4.7 with improvements in agentic task performance, tool calling efficiency, and honesty. Pricing remains unchanged at $5 per million input tokens and $25 per million output tokens, while fast mode is now 3 times cheaper than previous models at 2.5x the speed. A notable reliability improvement is that Opus 4.8 is approximately 4 times less likely than Opus 4.7 to let code flaws pass unremarked, and early testers report it proactively flags uncertainties rather than making unsupported claims. On the SuperAgent benchmark, it completed every case end-to-end and scored 84% on Online-Mind2Web for browser agent tasks. Dynamic Workflows is a new research preview feature in Claude Code for Enterprise, Team, and Max plans that lets the model plan and run hundreds of parallel subagents in a single session. This enables codebase-scale migrations across hundreds of thousands of lines of code from start to merge, which is a practical capability for large engineering teams. The Messages API now accepts system entries inside the messages array, letting developers update Claude's instructions mid-task without breaking the prompt cache, which is useful for agentic workflows where permissions, token budgets, or environment contexts need to be changed as a task runs.
Anthropic previewed a higher-capacity LLM called Mythos, which they are also announcing is being expanded out to an additional 100 or 50, sorry, an additional 150 partners, making 200 total organizations across 15 countries, uh, now having access to the Mythos preview, which has apparently surfaced more than 10,000 high or critical severity security vulnerabilities across partner codebases. And partners are now using the model not just for detection, but also for writing patches and pre-release vulnerability checks. So that's pretty cool.
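
To put the quoted Opus pricing in concrete terms, here is a small sketch that turns token counts into dollars at $5 per million input tokens and $25 per million output tokens. The function name and the example token counts are hypothetical; prompt caching, fast mode, and any other billing adjustments are ignored.

```python
INPUT_USD_PER_MTOK = 5.00    # per million input tokens, as quoted in the episode
OUTPUT_USD_PER_MTOK = 25.00  # per million output tokens, as quoted in the episode


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at the quoted list prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
            + output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK)


if __name__ == "__main__":
    # Hypothetical agent turn: a large context going in, a modest answer coming out.
    print(f"${request_cost(200_000, 20_000):.2f}")
```

A single turn with 200,000 input tokens and 20,000 output tokens comes to about $1.50 at these rates, which is why long contexts repeated across many agent steps add up quickly.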

[13:19] Justin: Yeah, I mean, I feel like every week Anthropic is just on a roll. You know, so I mean, 4.8, I switched over to it. I feel like I saw some improvements, but you know, I feel like not that much. You read the internet, everyone was complaining about 4.7, but they complained about 4.6 too. But I would kind of also like to see them update some of the older models too. The, I'm going to call it the lower end models, the Sonnet and the Haiku. It's great that they keep improving the Opus model, but getting the Haiku, I think Haiku's on 4.5 still and Sonnet's on 4.6, you know, getting those a little bit more updated with new training models and everything, I think would be nice. So I would like to see that, I guess, is kind of my feedback. Like, yes, it's an incremental upgrade, but I would like incremental updates 'cause I'm trying to use Sonnet more. You know, I don't need the power of Opus on everything. 'Cause especially, you know, we talked about how expensive it is.

[14:17] Justin Brodley: Yeah. I mean, I, so it's been interesting cuz I've, I typically don't use Opus that often, but I've, because of this new, this new capability around dynamic workflows, I've been playing with a little bit more, uh, than usual. And so it's, it's interesting because I can see the value and if the price would come down on it, it could be huge. But, uh, I do try to use Sonnet as much as possible to kind of help, uh, keep things, keep things cheap and, uh, and cost effective for what we do around here. But I also do a lot of experimenting with the open models. I mean, most of the code I now write for Bolt, I actually use GLM 5.1 for. I don't typically use the Claude model for that, mostly because I, you know, I have work stuff I'm doing. I don't want to use my work tokens for my personal project. That'd be bad. And so yeah, I just use my personal Ollama subscription and cloud GLM 5.1. So it works great.

[15:09] Justin: Yeah, I feel like, uh, I, with the two young kids that I have, uh, time to do some of these things are less, so I don't quite consume all my tokens. So I'm like, here you go, Anthropic, have, uh, have fun with my tokens for a while.

[15:21] Justin Brodley: Yeah.

[15:21] Justin: But when I do, there have been weeks that I get, you know, kids go to sleep early or whatever happens, I get some time. I definitely consume it. I have done like you have, I switched to LLM models or I start doing more and more in Haiku, which then I hate myself in the future for a little bit.

[15:37] Justin Brodley: But you know, I mean, Haiku's fine if you know exactly what you want it to do and you don't want it to think in any way, shape, or form, then Haiku is great. It can, it can templatize anything. But it, yeah, if you, if you're trying to make it think, don't, don't try that.

[15:48] Justin: Let's be honest. Yeah, no, it's bad news bears across the board. I have not played with the dynamic workflows though, so I may have to spend a little bit of time on it.

[15:54] Justin Brodley: It's worth, it's worth spending some time with. I was, uh, pretty impressed, um, with its capabilities, to be honest. So definitely recommend checking that out.

[16:01] Justin: Yeah, I missed that announcement this week until we were, uh, prepping for the podcast a few, you know, an hour ago.

[16:06] Justin Brodley: I understand. It's, uh, news is coming fast and furious. Uh, you know, Anthropic, as was just mentioned, is releasing a lot of code and features, but they're also burning a lot of money. And so they have, uh, raised a $65 billion Series H, which is about a $965 billion post-money valuation with run rate revenue crossing $4.7 billion earlier this month, reflecting substantial enterprise adoption of Claude across global organizations. I mean, that's a crazy number. Just absolutely mind-boggling how much money they are raising and what their post-money valuation is. And I was seeing someone talk about it on Blind. They were saying, you know, if you, if you joined, you know, 2 or 3 years ago, I think when they did like the Series D and you got 100,000 shares, uh, you know, basically with this valuation, those are worth like $20 million now. Cool. I'm like, damn it, I wish I had gotten into Anthropic. So yeah, it's gonna be interesting when they go public cuz they did also file a secret S-1 this week, which, uh, we, we aren't talking about today cuz I can't look at it. It's a secret. But, um, you know, the fact that they are gonna go public along with, uh, a bunch of other AI companies this year will be a lot of interesting conversations for us here on the show. But, you know, a lot of these people who have built these companies are gonna make a lot of money potentially in an IPO. And then, you know, if you're making $200,000 to $300,000 a year and you have $20 million from your IPO, are you still doing that job? I don't know. That's a, it's a question mark.

[17:27] Justin: I mean, I know a couple people that made some good money on, you know, some companies that IPO'd. Was it like during COVID or before COVID? I felt like there was a large slew of IPOs that occurred and a lot of them took the money, you know, got the money, but they didn't cash out. So while they had it, it worked well, but either they cashed out part of it or didn't diversify or whatnot, and the stock went up and down. It was kind of all over, you know. So I think a lot of people still will work, especially if you've been in, you know, Anthropic for, honestly, I feel like if you're a live engineer, it's like you just like what you do. You know, maybe you cut back or you go join a startup that's, you know, or a different company that's a little bit less pressure. But I don't know, I have different opinions about that.

[18:13] Justin Brodley: I mean, if you have that kind of money, do you just go start another company, you know, doing things that you enjoy doing at Anthropic? I don't know, it'd be interesting to see kind of how, you know, once a bunch of new millionaires are minted in the Bay and these other places, what happens, so.

[18:26] Justin: I feel like you need to see, like, um, LinkedIn does it every month, I think, or they used to do it where, like, they would show you who's going where. And it would be interesting to see in, like, you know, 5 years, people that were at Anthropic, you know, 3 years prior, 2 years prior to IPO, where they all end up. Kind of see that mapping of it. I don't know if they still do that, like, who moved where for what jobs, but it was always interesting to see.

[18:50] Justin Brodley: Yeah, I remember seeing that, but I don't know if they still make that or if that's third-party stuff you now get. But yeah, I remember that view. That was kind of neat. Yeah. Moving on to cloud tools, uh, we're talking about Gremlin today, which, uh, is always fun. Gremlin Failure Flags, uh, by Proxy lets teams run fault injection tests on serverless applications by routing traffic through a sidecar container, requiring zero code changes to the application itself, addressing a longstanding gap where serverless platforms lack the infrastructure-level access needed for traditional reliability testing. This new proxy approach supports common failure scenarios like dropping availability zone-specific traffic, injecting latency, and generating exceptions to test error handling logic. It works across Kubernetes, AWS Lambda, AWS ECS, and Pivotal Cloud Foundry. The intelligent health checks automatically establish baseline metrics for network throughput, latency, and error rate, then halt tests if any metric exceeds its threshold during a test run. The practical value here is that your team can validate failure modes in serverless environments that were previously difficult or impossible to test, such as bad API responses, corrupted payloads, and message ordering issues. This new no-code deployment model lowers the barrier for teams that want chaos engineering coverage without modifying application code or waiting on development cycles to instrument SDKs. I mean, a sidecar is a great way 'cause you can basically proxy everything man-in-the-middle style and inject all kinds of fun things. So this is a nice enhancement to the Gremlin platform if you are trying to do chaos engineering. Although AI does a pretty good job causing chaos engineering too.
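
The "halt the test if any metric exceeds its threshold" behavior described above can be sketched roughly as follows. This is not Gremlin's API; the function names, metric names, and threshold values are all hypothetical, and every metric is treated as an upper bound for simplicity (a throughput check would more likely need a lower bound).

```python
from typing import Dict, List, Optional, Tuple


def breached_metrics(sample: Dict[str, float],
                     thresholds: Dict[str, float]) -> List[str]:
    """Names of metrics in this sample that exceed their threshold."""
    return sorted(name for name, limit in thresholds.items()
                  if sample.get(name, 0.0) > limit)


def run_experiment(samples: List[Dict[str, float]],
                   thresholds: Dict[str, float]) -> Optional[Tuple[int, List[str]]]:
    """Walk through metric samples taken during a fault injection run.

    Returns (step, breached metric names) at the first breach, which is
    where the experiment would be halted, or None if it ran to completion.
    """
    for step, sample in enumerate(samples):
        breached = breached_metrics(sample, thresholds)
        if breached:
            return step, breached
    return None


if __name__ == "__main__":
    thresholds = {"latency_ms": 300.0, "error_rate": 0.05}  # hypothetical limits
    samples = [
        {"latency_ms": 120.0, "error_rate": 0.01},
        {"latency_ms": 480.0, "error_rate": 0.02},  # latency breach, halt here
        {"latency_ms": 150.0, "error_rate": 0.01},
    ]
    print(run_experiment(samples, thresholds))
```

In a real setup the thresholds would be derived from the baseline measured before the experiment rather than hard-coded.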

[20:08] Justin: I was gonna say, you and I doing the Vault pod, we cause some chaos engineering on that thing on a weekly basis.

[20:15] Justin Brodley: Yes, we do.

[20:17] Justin: I mean, I always love chaos engineering, but I feel like so few companies get the time, effort, and buy-in to do it. You know, the engineer in me, I'm like, this is awesome. As a, you know, management person, I'm like, okay, but what project do I punt out to do this? You know, when our platform, it does have high availability. And while this is something nice, you don't always get time to go do these things, you know. As a SOC, right? So you gotta do your yearly DR test and that's like prying, you know, time out of the, you know, PM org's cold dead hands in order to get the time to do the things you have to do so your compliance team doesn't yell at you. So doing something like this, I feel like you are less likely to get time for. Yeah. You gotta figure out how to sell it.

[21:05] Justin Brodley: Yeah. I mean, it just, it, most companies care about the features. They care about shipping the features. They don't care about making the feature reliable unless they're in a business that requires it, you know, it's a life and death situation or it's, you know, huge amounts of money for downtime. That's where typically you see a mature chaos engineering process. But for everybody else, it's very much, you know, not something to get budget to do unless it's easy and cheap and free. All right, let's move on to AWS. AWS is announcing the next generation of Amazon OpenSearch Serverless, which scales from zero to thousands of requests per second and back to zero when idle, offering up to 60% cost savings compared to provisioned OpenSearch Service clusters sized for peak capacity. The new generation provisions resources in seconds and scales capacity up to 20 times faster than the previous generation, supporting full-text search and vector search collection types with an express create option that requires no manual configuration. Native integrations with Vercel and Kiro allow developers to deploy search and vector backends for AI agents directly from those platforms. And the OpenSearch Agent Skills repository provides pre-built domain knowledge and multi-step execution logic for common agent workflows. Pricing is consumption-based using OpenSearch compute units for indexing, search, and GPU acceleration, with storage billed separately per gigabyte-month. The classic OpenSearch Serverless infrastructure remains available for existing users who prefer it.

[22:21] Justin: Yeah, I mean, getting these serverless to go down to zero and with a response time as fast as possible is always great. So seeing them continue to make improvements here just means that the average consumer that's just running a small OpenSearch cluster is going to get the savings out of it. You know, the, the large business that's running it, you know, they're probably running with enough other things. But let's say we were running OpenSearch for the podcast, it probably would sit idle, you know, 90% of the time. So these savings are real if you can actually get them to that point. But I feel like forever the ability of these systems to actually scale down to zero was minimal. You know, even Aurora never, I felt like, truly scaled down to zero unless your application truly made sure that there were no open sockets anywhere to the system. So this type of scale-up capability, especially with it going faster, is just going to be really good for potentially production workloads too.

[23:29] Justin Brodley: I agree.

[23:33] Justin: There are a lot of cloud cost management tools out there, but only Archera provides insured commitments. It sounds fancy, but it's really simple. Archera gives you the cost savings of a 1 or 3-year AWS Savings Plan with a commitment as short as 30 days. If you do not use all the cloud resources you've committed to, Archera will literally cover the differences. Other cost management tools may say they offer insured commitments, but remember to ask, will you actually give me my rebate? Archera will. Check out thecloudpod.net/archera to schedule a demo today.

[24:16] Justin Brodley: All right. AWS Shield Advanced now provides packet-level DDoS attack flow logs, capturing source and destination IPs, ports, protocols, packet counts, and source country data during active attacks, published to S3, CloudWatch Logs, or Data Firehose at 5-minute intervals. I mean, thank you. You only took 100 years to get us this quality of life improvement. So I appreciate it, but, uh, I'm not gonna spend a lot of time on this one.

[24:40] Justin: No, I mean, this is truly just a quality of life. Somebody pointed Kiro at this tool and said, how many people have built the solution? Great, let's go actually fix the problem now, which is amazing. So it's much better to be able to get that data, especially when you are in the middle of a situation, than being like, well, it's just blocked at the edge, so there's not much we can really do about it. We don't even see it.

[25:05] Justin Brodley: Yep. Uh, Amazon has deployed a new networking architecture called RNG, or Resilient Network Graphs, in data centers since late 2024, starting in Dublin and expanded to Germany and Spain, with most newly built data centers now using this RNG design. RNG uses a quasi-random flat network topology that eliminates the traditional fat tree hierarchy of switches and routers, addressing longstanding inefficiencies in data center cabling and routing that have persisted since the mid-1980s. The reported performance numbers are notable: 69% fewer routers and switches, 33% higher data throughput, 40% reduction in network power consumption, and 27% lower operating costs compared to traditional network devices. A key hardware component is the shufflebox, a new optical device Amazon developed internally that physically organizes and shuffles cable connections between routers, replacing the tangled cable bundles typical of fat tree setups with more structured physical layouts. Notably, Amazon says RNG is not optimized for AI training workloads, which require more coordinated and centrally orchestrated data patterns. So this is primarily an efficiency improvement for general cloud infrastructure rather than a direct response to AI compute demand. I mean, in general, like, anytime you hear from GCP or Azure or Amazon about their networking, it blows your mind. And so, you know, the fact that you got away from, you know, the, the tree-leaf fat-tree setup is great. I mean, there's a lot of inefficiencies of that model, lots of central bottlenecks. And so, you know, in a situation where every customer has their own VPCs, you know, the reality is the network has to constantly morph and evolve. And so it's good to see they've done this, and I'm glad to see that this is solving a big problem for them. Fully powered by the fact that they have ASICs that can custom do this work.

[26:42] Justin: Yeah, I mean, I assume a lot of this is their custom hardware that they're able to set up. And the bottlenecks, you know, I never truly thought about it. You know, when I built a data center back in 2012 or so, which was probably the last data center rack I really built out, you know, you had your switches in each rack that went up to a core centralized switch, and there were different switches for storage and for, you know, network flow and then the end users, and it all merged up into different areas. But there was still a core switch in there, and that's not logical in a multi-customer, multi-tenant with the abstracted networking layer out. So being able to really rebuild the network kind of from the ground up and get rid of that 1980s philosophy, that's amazing that they were able to do it, and gaining those improvements is going to be awesome. So it's probably just things that we all are getting the benefits of and Amazon's taking the profit on the backend, but it's just going to be faster for everyone.

[27:50] Justin Brodley: Yeah. Amazon RDS for SQL Server supports bring your own media. Now, for those of you who were like, what? This isn't what it sounds like, media to install SQL Server. This is really about bringing your own licensing. So Amazon RDS for SQL Server now supports bring your own media, allowing customers to reuse existing Microsoft SQL Server licenses, including Software Assurance, through Microsoft's License Mobility Program when migrating to RDS. This feature directly addresses a common migration blocker where organizations were either paying for duplicate licenses or waiting for existing agreements to expire before moving to a managed database service. BYOM integrates with AWS License Manager, giving customers a centralized way to track SQL Server license usage across their AWS environment and maintain license compliance. The feature targets customers running SQL Server on-premise, on other clouds, or self-managed instances on EC2 who want the operational benefit of RDS, such as automated backups, high availability, and monitoring without additional licensing costs. Pricing for BYOM differs from standard RDS SQL Server pricing, so customers should review the Amazon RDS for SQL Server pricing page before making a change.

[28:48] Justin: I just want to point out that I don't know that anybody is ever actually in compliance with their licensing agreement.

[28:54] Justin Brodley: I mean, it's, uh, I mean, considering how often they change it, that is a true statement for sure.

[29:03] Justin: I mean, it's a great quality of life that somebody cares about, and I avoid Microsoft SQL, I would say, as much as I can, but I've dealt with it a lot in the past, and that's just where the world is. But being able to move it and the license mobility and everything is definitely useful. So it's a great quality of life improvement.

[29:23] Justin Brodley: Agree. I mean, I feel like they had this and they lost it and now they have it back. I mean, it's kind of nice. Like, it definitely is a big blocker to using things like Cloud SQL or RDS when you couldn't use the licensing that you've, you know, paid for. You do have to have License Mobility, which they do charge extra for, which is still annoying, since you don't have to pay for it in Azure. So it's still not quite, you know, resolving that issue, but it's definitely getting closer to a better place for Microsoft customers, which is good.

[29:49] Justin: Wasn't Microsoft getting sued in the EU for that? And I wonder whatever happened with that lawsuit. Do you remember?

[29:55] Justin Brodley: I mean, I think it got settled and they agreed to do some special things in the EU that they don't do anywhere else to solve that. Okay.

[30:03] Justin: Yeah.

[30:03] Justin Brodley: That's nice. ElastiCache for Valkey now supports durability via a multi-AZ transactional log, allowing it to serve workloads where data loss is unacceptable (should you be using Valkey), not just traditional caching scenarios. Two write modes are available. Synchronous writes guarantee zero data loss at single-digit millisecond write latency, while asynchronous writes maintain microsecond latency with a potential loss window of up to 10 seconds and come at no additional cost. Both options preserve microsecond read latency, meaning customers do not have to trade read performance for durability, which is a meaningful distinction compared to traditional durable databases. AWS specifically calls out AI-oriented use cases like agent long-term memory, RAG knowledge bases, and workflow state management, positioning ElastiCache as a viable primary store for latency-sensitive AI applications rather than just a cache layer. This feature is available to you in any commercial region, China, and GovCloud where Valkey 9.0 is available.
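The trade-off between the two write modes described above can be illustrated with a toy model. This is only a sketch, not how ElastiCache or Valkey implement their transactional log: the `DurableCache` class, its fields, and the fixed 10-second flush interval are all hypothetical stand-ins chosen to show why asynchronous logging leaves a loss window while synchronous logging does not.

```python
from dataclasses import dataclass, field


@dataclass
class DurableCache:
    """Toy in-memory store with a separate 'durable log' (hypothetical model)."""
    sync: bool                       # True: log every write before acknowledging
    flush_interval_s: float = 10.0   # assumed async flush period (the loss window)
    memory: dict = field(default_factory=dict)
    log: dict = field(default_factory=dict)      # what survives a crash
    pending: list = field(default_factory=list)  # acknowledged, not yet logged
    last_flush: float = 0.0

    def write(self, key, value, now):
        self.memory[key] = value
        if self.sync:
            # Synchronous mode: the write is durable before it returns.
            self.log[key] = value
        else:
            # Asynchronous mode: acknowledge now, persist on the next flush.
            self.pending.append((key, value))
            if now - self.last_flush >= self.flush_interval_s:
                self.flush(now)

    def flush(self, now):
        for key, value in self.pending:
            self.log[key] = value
        self.pending.clear()
        self.last_flush = now

    def crash_and_recover(self):
        # After a crash, only what reached the durable log comes back.
        self.memory = dict(self.log)
        self.pending.clear()
        return self.memory


def simulate(sync):
    cache = DurableCache(sync=sync)
    for t in range(15):  # one write per second for 15 seconds, then a crash
        cache.write(f"k{t}", t, now=float(t))
    return cache.crash_and_recover()


sync_survivors = simulate(sync=True)
async_survivors = simulate(sync=False)
print(len(sync_survivors), len(async_survivors))
```

In this model the synchronous run recovers all 15 writes, while the asynchronous run recovers only the 11 that had been flushed before the crash; the last four seconds of acknowledged writes are gone, which is the kind of loss window the announcement is describing.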

[30:54] Justin: I understand the use cases. I still say you're using cache wrong. Like, I totally get why you could use this with, you know, like you're in a Claude coding session or whatever, but at the same point, it's a cache. It's okay. You know, unless you're going to do something like, okay, we've cached a ton of, you know, our data and rebooting the cache or having to reload the cache would kill our SQL database to the point of no existence. You know, like that's the only other use case I could think of. But at that point, how do you trust the data in the cache to start off? So I do like the synchronous writing mode though. That aspect of it to ensure that consistency, that is nice, especially with multiple nodes, because then you make sure you're actually returning the right data.

[31:45] Justin Brodley: Exactly. All right, CUR 2.0 now matches CUR 1.0's Athena and Redshift integration capabilities, closing a feature gap that had been a barrier for customers considering a migration to the newer report family. When selecting Athena or Redshift integration, exports are automatically delivered in the optimal format, either Parquet or GZIP, along with infrastructure templates, table definitions, and data loading instructions, removing the need for manual configuration or custom ETL pipelines. Cost data refreshes in CUR 2.0 are automatically reflected in Athena and Redshift tables, meaning customers can query up-to-date billing data using standard SQL without building or maintaining additional data pipeline infrastructure. Pricing of this feature follows existing costs for the underlying services: S3 storage for exports, Athena query costs at $5 per terabyte scanned, and Redshift cluster or serverless costs depending on the query engine chosen. The feature is available across all commercial AWS regions, but excludes GovCloud US and China regions.

[32:36] Justin: All right. So this is just, I misread this article when it first came out. It's really just feature parity between 1.0 and 2.0.

[32:43] Justin Brodley: Well, 1.0 has been missing though for a while. Cause I mean, I know I've used CUR 2.0 and I had to build my own pipeline for this, which was annoying because it was there for CUR 1.0. Right. And so, you know, I built this pipeline a couple of times and used the newer format cause it's much better. But this is actually even better because they've kind of automated some of the other sharp edges of dealing with the CUR, like the pricing lists and that stuff and the automatic index updates for Athena. So this is a nice quality of life improvement, but yeah, it is annoying because it didn't exist, but it is a little bit of catch-up for something they lost when they moved to CUR 2.0.

[33:17] Justin: Yeah. What I didn't realize, you know, and I've seen a bunch of CUR reports in the past. I've yet to see a CUR report that is 5 terabytes though. But I guess if you're processing it multiple times a day or, you know, daily, then I can definitely see it being a terabyte. But for a single month, I've never seen it at a terabyte. And that was even with like a reseller account where we had, you know, tens, if not hundreds of sub-accounts that rolled up, back before you had like Organizations and such.
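For scale, the $5 per terabyte figure quoted for Athena is a rate on data scanned per query, so the bill depends on how much of the export each query reads and how often it runs, not only on the report's size. A minimal sketch of that arithmetic, assuming a binary terabyte and hypothetical helper names (`athena_query_cost`, `monthly_cost`) and example volumes that are not from the episode:

```python
ATHENA_USD_PER_TB = 5.0    # rate quoted in the episode
BYTES_PER_TB = 1024 ** 4   # assumption: binary terabyte


def athena_query_cost(bytes_scanned: int) -> float:
    """Cost in USD of one query that scans `bytes_scanned` bytes."""
    return bytes_scanned / BYTES_PER_TB * ATHENA_USD_PER_TB


def monthly_cost(bytes_per_query: int, queries_per_day: int, days: int = 30) -> float:
    """Cost in USD of running the same query repeatedly for `days` days."""
    return athena_query_cost(bytes_per_query) * queries_per_day * days


# Hypothetical example: each query scans 50 GiB of CUR data, four times a day.
scan = 50 * 1024 ** 3
print(f"per query: ${athena_query_cost(scan):.4f}")
print(f"per month: ${monthly_cost(scan, queries_per_day=4):.2f}")
```

Under these assumed numbers a single query costs about 24 cents and a month of four daily runs costs about $29, so a full terabyte of scanning only accumulates through repeated queries over a large export.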

[33:48] Justin Brodley: Yeah. Let's move on to Google. Google AI Threat Defense is a new automated security system that combines Wiz for exposure mapping, CodeMender for code remediation, Gemini for AI reasoning, and Mandiant for threat intelligence into a single vulnerability management workflow. The goal is to shrink remediation time from weeks to minutes by automating the scan, prioritize, remediate, and monitor cycle itself. The multi-model approach is a notable technical detail here. Google explicitly acknowledges no single AI model catches all vulnerability types. So the platform uses multiple frontier models via the Gemini Enterprise Agent Platform to cover application logic, cloud configuration, binary analysis, and exploitability validation across different asset types. CodeMender is the code remediation agent at the center of the fix workflow, generating patches directly in the developer's IDE or CLI, rewriting code to memory-safe languages, and automatically generating tests to verify fixes before deployment. It integrates with Wiz and a tool called Antigravity to coordinate library dependency changes across source control and production environments. Wiz's context-aware pentesting agent continuously simulates attacks to validate exploitable paths, including application-layer and identity-driven risks, which distinguishes this from traditional attack surface management tools that only identify what is exposed without confirming actual exploitability. Uh, no pricing was announced with this one, but ecosystem partners including Accenture, Deloitte, and PwC will handle deployment, ongoing management, and custom workflow integration for their enterprise customers. So I read that as expensive.

[35:10] Justin: I read that as here's my wallet, just set it on fire. I mean, especially because Wiz and other tools that they're leveraging already are not cheap. So yeah, I assume you're paying for all of these tools plus the additional, but it is nice to see them really start to bring that security ecosystem into a single— I really don't want to say this out loud— pane of glass and a single point. You know, with the Wiz acquisition and everything else that they've done over the last couple years, they keep adding security, but they're still kind of independent. I feel like this is their first big, you know, swing at merging everything together.

[35:50] Justin Brodley: I agree. I mean, it was interesting. The— I wonder if this feedback about the model not being able to handle each of these needs, if that becomes something that they try to train the models to do better so they can just use a Gemini or just a single model. That'd be kind of interesting to see.

[36:04] Justin: I don't know. I feel like this is where potentially you— they, they might have custom models. They could for each of these independent things because they need to get that, you know, pinpoint accuracy on it. Where, you know, a Claude, you know, I'm just gonna go back to Claude Opus or, you know, whatever, might not be able to pull that. They might— I would assume they probably have a Gemini custom model, you know, based on, you know, whatever the latest Gemini is. I'm stuck on— in my head went to, uh, OpenAI model numbers. But, you know, based on latest Gemini model, they then retrain and/or RAG on top of it to get more security-focused models.

[36:44] Justin Brodley: Yeah, I, yeah, we'll see. I don't know, but looking forward to seeing continued development in the security space for sure. I do wonder if they're gonna come out with their own Mythos-type model at some point. I mean, considering the Mandiant knowledge that they have, you would think they'd be able to create some pretty cool security-based findings and research.

[37:03] Justin: Or they have and they're just not telling anyone, which is what I kind of assume.

[37:06] Justin Brodley: That could be too. Yeah. Google AI Studio now supports full-stack app deployment to Cloud Run with either Firestore for document storage or Cloud SQL for PostgreSQL as a relational option, with the AI agent automatically selecting the appropriate database based on your prompt, removing a common decision point for developers prototyping new applications. Users can deploy up to two full-stack applications to the Google Cloud Starter tier at no cost and without a billing account, lowering the barrier for developers who want to test production-grade infrastructure before committing to a paid plan. Cloud SQL integrates using a new PostgreSQL Developer Edition that scales to zero when not in use, meaning you only pay during active usage. Cloud SQL support on the Starter tier is noted as coming next month. So it's not fully available right now, but will be soon. Firebase Auth serves as the single login layer across the stack and enables Google Workspace integration, including Sheets, Calendar, and Gmail, through a standard Sign in with Google flow. And the agent handles provisioning, authentication, Firestore security rules, and database functions automatically, though Google notes users should review security rules before sharing the app publicly. When a project outgrows the Starter tier limits, resources transfer to a standard billable Google Cloud project without requiring a rebuild, providing a straightforward path from prototype to production. It's at aistudio.google.com.

[38:16] Justin: I really like this. I feel like a lot of people are building these side apps or whatever from there, and they are not thinking about how do we actually make this thing be GA. You know, I've seen a lot of, like, friends say, oh look, I built this POC in Lovable or here or there, and I'm like, cool, that's gonna work for 3 people. How are you gonna run it for 30 people, let alone 3,000 or 300,000, which is what you're telling me you're interested in? So at least here they've kind of handled security, uh, storage of the data. You know, they put you in a decent starting point for a real production workload, because what AI is great at is, hey, yesterday I did an hour-long quick POC to see if I want to do this thing at my day job in a web interface and I got it up and I said, you know what, that's stupid. This is not the way it needs to go. But I had to build in all these different layers because at my day job, I'm not on Google, you know? So having this nice easy place to say, okay, let's start with this and then throw these in as a quick POC to see how it works, gives you that, you know, multiple layers, you know, with storage and authentication and all the things that you need. So, to get to production, your, you know, barrier to entry is much lower.

[39:38] Justin Brodley: Yeah. Well, and the fact that it converts right over makes it so much easier to—

[39:42] Justin: Oh yeah, that's fantastic.

[39:43] Justin Brodley: Get from POC. I mean, I think this is an answer to Vercel, right? A lot of developers right now are doing POCs and building apps on top of Vercel because of how easy it is and how much they don't have to do. And so I feel like this is a direct response to that in many ways. So we'll see. Well, uh, Nano Banana 2 and Nano Banana Pro are now available to everyone. I mean, I've had 'em for a while. This is basically the Imagen 4 and Imagen 4 Pro models, which are marketed as Nano Banana, which I don't really understand why they did that, but I love the name. I'm just saying. I mean, I love it. It's great, but I don't know why they're even still calling it Imagen. Like, just retire Imagen. Just call it Nano Banana. That's what you're gonna do.

[40:18] Justin: In my head, Imagen is retired. It's just Nano Banana.

[40:21] Justin Brodley: Yeah. In my brain, it's that way as well. But the problem is the API isn't Nano Banana, so it messes me up sometimes. Oh, really? When I'm doing things.

[40:28] Justin: Yeah.

[40:28] Justin Brodley: It's, it's like Image Pro or something. But, um, you know, yeah, I've been using this for a while. Like I mentioned, you know, the podcast covers are all, you know, AI generated now based off our show titles. And so if you haven't checked out our covers lately, they're a lot of fun, uh, especially me trying to figure out how to, um, come up with good images for some of these. Like, today's was a bit of a struggle because, uh, our last week's episode was named, uh, Holy Labor Displacement Batman: The Vatican Weighs In. So I was like, how do I combine the Vatican and Batman and all these things together? And I think I came out pretty well on that this week in particular. But, uh, yeah, it's been a lot of fun doing these and, you know, I also use the ChatGPT image generation as well, but most of the time I end up choosing the Imagen image because it's typically a little bit better or more in line with what we want. But, uh, yeah, I mean, I'm glad it's GA. I mean, it has some rough edges. It had some challenges early on, but, uh, yeah, I continue to iterate on our cover generator quite a bit to make sure that works really well and, uh, very happy with it.

[41:30] Justin: I just like the name. I'm not gonna lie.

[41:34] Justin Brodley: All right. You don't like my covers? That's fine. I understand too.

[41:36] Justin: I do like your covers. I don't look at them as closely as I'm sure I'm supposed to, but maybe we'll have to have BoltBot actually, uh, force us to and have a vote. But then, you know, maybe getting 3 of us on a podcast every week sometimes doesn't work. So maybe we should not do this via democracy. Yeah.

[41:54] Justin Brodley: You guys really messed me up with The Cloud Pod's AI pleads not guilty, blames Philip K. Dick. That one, that one took some work. You're welcome. That one, uh, that one I was like, wow, how am I going to get, you know, and the thing is like the cover generator, you know, runs a script to basically generate ideas and then I can modify the ideas or I can write my own idea if I have a better one. And then it generates a bunch of variations based on, like, with Bolt, without Bolt, with the hosts, without the hosts, et cetera. And so I had a lot of fun ones for these. I let you guys vote on them a couple of times, like where I wasn't really sure. And you guys always kind of enjoy picking the one you like best, which is pretty good. Sometimes you guys don't agree with what I think is the best one, which is fine. I just choose yours when that happens, so it works out. All right, AlloyDB for PostgreSQL now offers hot standby HA, where the standby node continuously applies write-ahead logs from the primary instead of sitting idle, allowing it to skip the database startup phase during failover and reducing downtime to approximately 15 seconds in testing. The key practical benefit beyond faster failover is post-failover performance consistency: because the standby node keeps its buffer cache warm by actively replaying the logs, the new primary serves requests at normal throughput almost immediately rather than degrading for several minutes while the cache rebuilds off the disk. Yeah, this is nice if you need high performance with AlloyDB.

[43:04] Justin: I'm kind of surprised that this wasn't there. Like, what was the standby doing at that point? I mean, I guess that's where I'm a little bit confused. If it wasn't replaying the logs, or was it like replaying the logs every like 15 minutes versus in real time? I'm not really sure, but you know, it kind of felt like that's what the standby should be doing. I mean, RDS failover from primary to secondary, I know they say it can take up to 5 minutes, or that's what they used to say. I probably should update my knowledge set to see if they've adjusted that, but for the longest time it could take up to like a minute or 5 minutes or something. It was normally like a second.

[43:42] Justin Brodley: You would drop a couple queries and call it a day. But I mean, there was at one point where RDS nodes, you know, they had the same problem where they would have to build from cache and they, you know, so there are like, yes, I think other clouds have had this for a lot longer, especially in Aurora. But I think RDS had some of these similar pain points as well.

[44:02] Justin: But that was like 10 years ago.

[44:04] Justin Brodley: Yeah. I mean, Google's not that old. They had to take the time to know that these are problems you have to solve. And AlloyDB, I feel, is several years behind Aurora because they kind of did Spanner first and wanted everyone to do Spanner. Then everyone was like, no, that's too much lock-in. Well, at least AlloyDB is Postgres compatible. So I feel like it, you know, it's just a factor of the fact that it's, you know, multiple years younger than Aurora in some ways. Well, uh, you also now have a new AlloyDB remote MCP server. This remote MCP server for AlloyDB, uh, gives AI agents a managed HTTP endpoint to securely query operational database data without the infrastructure overhead of local MCP server deployments. This is part of Google's broader rollout of 50+ managed MCP servers across its cloud services. So I mean, managed MCPs are definitely all the rage right now. All the cloud providers are dropping them. I like this one though. It's kind of interesting. I don't know that this should be your primary interface to a DB at performance scale, but if you need an interface for an admin or for a user to do more ad hoc querying, this is way better than giving them SELECT against the database.

[45:06] Justin: I always give my business associates, like, SELECT * ability against my production databases. Nothing bad ever happens.

[45:13] Justin Brodley: Trust me. Explain some of the outages you've had to deal with recently. No, I mean, I also gotta take a note to not let you write any SQL code for Bolt.

[45:24] Justin: You should never. No, no, that's just a given life choice. I, I could do infrastructure, I could do AI. I've never gotten into SQL and I'm perfectly okay with that. I can do SELECT FROM, and that's it. You want me to do INNER JOINs, OUTER JOINs, CROSS JOINs, whatever else? No. You want me to write stored procedures? It's bad news bears. I've always just let somebody else do it. You know, my old team used to always make fun of me. They'd be like, hey, can you like go and do this thing? Because I had permissions, and I'd be like, how do I do that? They're like, you have to write me the SQL, whatever you want me to do. So SQL, not my thing. Writing SQL queries is not my thing.

[46:03] Justin Brodley: I know enough SQL to be dangerous, but I can, I can do a left inner join, I can do a right outer join. I know how that stuff all works. But I, you know, I definitely enjoy having AI write my queries now for me.

[46:13] Justin: I was just saying, I just have AI do it. It's much easier. Yeah.

[46:16] Justin Brodley: I need to query this data from a table. It's like, oh, here you go. I'm like, wow, that was so much faster than me remembering the syntax. So I appreciate it a lot.

[46:24] Justin: We've talked in the past about how SQL Server Management Studio has AI Copilot in it. And while I don't know that I actually want that, I've definitely played with it and had it make me queries on the fly, because otherwise what I was doing was going to Claude or, you know, Copilot and having it generate the queries for me and then putting it back in and going back and forth in circles.

[46:50] Justin Brodley: Yeah. I mean, I've used Beekeeper Studio, which has AI built into it for that exact reason as well. You know, and that one, that was nice because it handles multiple different types of databases, not just MySQL or PostgreSQL or whatever. It supports a wide array of databases. So that's my go-to, which I've loved, uh, and definitely something to, uh, use if you're looking for something for AI for other database platforms you might want. All right. GKE standby buffers address the longstanding trade-off between over-provisioning costs and slow cold starts by suspending pre-initialized nodes to disk, releasing compute and memory costs while retaining only persistent disk and IP address charges. This results in cost overhead in the low single-digit percentage range compared to full overprovisioning. Standby buffers resume 2 to 3x faster than provisioning fresh nodes, and when combined with active buffers, the two work in sequence, with active buffers handling the immediate spike while standby nodes resume to cover sustained load. Benchmarks show P50 latency dropping from 4 to 6 minutes to single-digit seconds under identical traffic conditions. That's nice, but I mean, didn't we hear about Amazon doing this not that long ago?

[47:53] Justin: I thought it might have been Azure.

[47:54] Justin Brodley: I think it was both of them did it.

[47:56] Justin: Yeah, I thought it was Azure, but I could be wrong. I mean, it's just a warm standby pool.

[48:00] Justin Brodley: I feel like that's basically what it is with some automation that was sort of lacking before.

[48:06] Justin: Yeah. I mean, your P50, you know, going from 4 minutes to single-digit seconds, like 4 minutes to spin up a new node, you know, that's where, at that point, I don't always, you know, while I totally understand scaling everything based on, you know, CPU and memory usage and everything else, most workloads at a lot of businesses, I'm speaking specifically from a business standpoint, are known ups and downs. You know, everyone turns on their work computer at a given time. You know what your workload's going to be at that point. So, you know, while this might help in the situation, probably a lot of companies aren't going to leverage it right away.

[48:43] Justin Brodley: Yep, agreed. Well, Google has broken ground on its first data center in Horndal, Sweden, expanding GCP infrastructure into the Nordic region to support growing demand for Search, Google Cloud, and YouTube services. The facility will use air cooling instead of water cooling because it's freaking freezing there, which reduces water consumption compared to traditional data center designs, and includes offsite heat recovery to provide warmth to nearby homes and businesses. For GCP customers in Northern Europe, this expansion means lower latency and improved regional availability for cloud workloads. And Google has supported over 700 megawatts of renewable energy additions to the Swedish grid since 2013. This new facility continues the sustainability focus, which matters for enterprises with carbon reporting requirements. So very soon you'll have a new Swedish region.

[49:25] Justin: I really like the fact that they're using the air to cool it down, but then not just venting it out the other side, but venting it to homes, right? Which need the heat and kind of getting that double whammy, right? Yeah. So. While they are consuming power, I understand it's never, you know, infinite power here, you know, they are at least outputting some power essentially and saving power elsewhere in the grid, which is nice. So it's a good way to kind of hit their green requirements as they build out these places.

[49:58] Justin Brodley: Yeah, that's a great, great solution. Azure has a bunch of stuff this week. First up, Application Gateway for Containers has now reached general availability on its integration with Istio service meshes, automating mutual TLS connectivity between the gateway and mesh-enabled services, which simplifies secure north-south traffic management in Kubernetes environments. The integration supports both upstream open-source Istio and the managed Istio add-on for AKS, giving teams flexibility to choose a preferred deployment model without changing their ingress configuration approach. The notable operational benefit is a single ingress path for routing traffic to services both inside and outside the mesh, which reduces the need for repetitive mTLS definitions and separate gateway configurations.

[50:38] Justin: If I remember correctly from the beta of it, the biggest annoyance of this is that you can't go from a current App Gateway to an App Gateway for Containers. They're different resources inside of Azure. So the fun part about this is you actually need to move and relaunch. And even though you tell customers to never whitelist your IP address on your App Gateway, there's definitely always one customer out there that does and then opens a SEV1 ticket. So great feature. Kind of wish the Application Gateway was all under one bigger umbrella, but I have other issues with the Application Gateway, so I'll just leave that as that. And look, we hit my Application Gateway complaint for the day.

[51:18] Justin Brodley: Yeah, good. Perfect. So again, Microsoft Build 2026 just happened and The Verge nicely summarized the 7 biggest announcements, in their opinion. First up was Scout, the always-on assistant built on OpenClaw that integrates with Microsoft 365 apps, including Outlook, OneDrive, and Teams, handling background tasks like calendar management and expense reporting, and it's currently in desktop preview for frontier customers in the US with broader availability planned. I mean, that's pretty cool if you want all your data to be leaked. So OpenClaw concerns still exist, so do be careful about that. Microsoft revealed 7 new AI models under its MAI lineup, including MAI Thinking 1, its first reasoning model, featuring 35 billion active parameters and a 128K context window. The model targets complex multi-step instructions, long context reasoning, and code generation, signaling a continued push towards in-house model development rather than reliance on OpenAI for all things. Which, I mean, yeah, thank God. We were talking about this not too long ago, that they really needed to drop some new models because their stuff is getting a little old in their process. But their flagship MAI Thinking 1 model is that reasoning model they trained from scratch with no distillation, which Microsoft positions as a clean data lineage option for enterprises, and it benchmarked even Claude Sonnet 4.6 in blind human testing and matches Claude Opus 4.6 on coding benchmarks. So it's not as good as the latest Opus, but Opus 4.6 wasn't anything to sneeze at. So it's not bad for an open model coming from Microsoft.

[52:48] Justin: Yeah, I mean, I feel like, uh, I think last year I predicted a new model and the prior time, so it's nice to see them actually get a new model out there. Because it's been, it's been a while, like you said, and getting something out there that people can use and honestly them internally getting off of OpenAI, you know, for all the, I'm just gonna say Copilot Star things out there have, I'm sure for a long time they were using OpenAI and paying somewhere for it. So if they can run it all under their own model, probably gonna be better off in the long run as a business.

[53:21] Justin Brodley: Yep, I agree. Next up at Build was Microsoft Execution Containers, which introduces a sandbox security layer for AI agents running on Windows via OpenClaw, giving developers defined guardrails over what agents can access on a device. A companion app lets users configure their own agents or connect to existing ones within this controlled environment. So yeah, they give you a secure place to do it. I'm hoping like Claude and others will also adopt this for, you know, places where Claude Code can run, or if you're using, um, some of the other Claude capabilities like Cowork, those can all live inside potentially this MXC environment, which would be a nice improvement.

[53:59] Justin: Yeah, definitely. This at a global level will be nice, not just for here. So I hope this gets picked up like the, uh, MCP protocol, you know, and other things that Anthropic has built out to really help. Because I always worry what Claude is doing on my laptop, even though I have it set to only look at certain areas, I'm like, do I trust the people that are putting their own guardrails in place to notify me that they're breaking their own rules? 'Cause you know, I have to tell, oh, you know, AI, do not modify my test to make them pass by setting it to return true. So, you know, I worry about things.

[54:36] Justin Brodley: And they also announced the Surface RTX Spark Dev Box, which targets developers running local AI models, featuring NVIDIA's ARM-based Spark RTX chip and 128 gigs of unified memory with Visual Studio Code and GitHub Copilot pre-installed. The pricing and specs were not fully disclosed, but basically this will be a Surface computer designed to run local AI models, uh, for potentially using, you know, their Scout product or others on your laptop, which is actually a really nice improvement because using a local model that you control versus a public model, like, there's a lot less data leakage concern there.

[55:12] Justin: Yeah, it's an area I'm kind of curious about. Is this getting pushed for by, I don't know, government entities to build out, you know, these open models that they can use? Like, I feel like in the last 6 months, which in the world of AI feels like forever ago, you know, I feel like there's more conversation around open models and running more things locally, and I don't know if it's the cost or if it's security, you know, Ryan's not here, so we can make fun of him, you know, just being a pain in the butt in places, or if larger businesses and government entities are kind of forcing that forward. Yeah.

[55:50] Justin Brodley: I mean, I think it's a bit of both. I think it's, you know, there's a, a need to do more local models, more local governments. The costs and availability of large, you know, expensive GPUs is, is difficult. And if your workload isn't super heavy, having a local model that can do some of the light weightlifting like text editing and stuff like that, that's, that's a big win. So I think that's what you're seeing. Uh, and then our final announcement, uh, that The Verge covered was Microsoft's Majorana 2 quantum chip delivers qubits rated at 1,000 times greater accuracy than its predecessor, which came out last year. We talked about it quite a bit about how they didn't release any of the research about the Majorana chip and all the new, uh, matter they supposedly created. Uh, but Microsoft predicts it could achieve a practical quantum computer by 2029 based on this new chip and what they're doing in their research side. So still skeptical on some of that one, like show the science, but, uh, definitely interesting.

[56:42] Justin: It's definitely interesting. And, uh, it goes back to what I guess we talked about two weeks ago, time to get your quantum computing cipher suites for security protocol, security communications, post-quantum security encryption. Thank you. Post-quantum. Quantum scary. That's what happens when we record at 10 o'clock at night.

[57:02] Justin Brodley: Yep. So the, uh, basically, you know, all this kind of wraps up into their overall positioning around how AI alone won't change your business, but the systems that run your AI will change your business, is what Microsoft's basically claiming now. And it's positioning its entire agent platform as a 5-layer system covering build, contextualize, run, govern, and improve by integrating GitHub, Azure Foundry, Microsoft IQ, Azure 365, and the Microsoft security stack into a single workflow rather than separate tools. Microsoft IQ is a notable new component that grounds agents in enterprise data from Microsoft 365, business systems, and the web via WebIQ, with Frontier tuning allowing organizations to post-train models on their workflows and data while keeping that trained intelligence within their own environment. So yeah, Microsoft's not been waiting around. They're investing a lot of money to make AI a thing in the enterprise.

[57:48] Justin: And they're hoping to pay it all back. I mean, I think AI has already changed businesses. Mm-hmm. The question is how much more will it change it and how will it continue to change it? You know, like we talked about at the top of the show, the cost of AI is not what, you know, free and it, it's getting more and more expensive as people use it more 'cause it's all computationally heavy. So it's definitely changing the landscape out there now. Now, given that they own GitHub and Foundry and IQ and all the data for businesses and SharePoint and all that, I just am concerned from Microsoft's point of view with the EU or potentially other places that are they gonna run into an, like an, an old, not monopoly, I guess monopoly, you know, on the data or get in trouble with the EU for kind of locking out their vendors over time.

[58:43] Justin Brodley: For sure. That's it. We made it through another week here, Matt. I appreciate you as always.

[58:49] Justin: I enjoy this. It's always fun. You know, it's, it's a little bit different with two people. I feel like it's just back and forth where the third person, we get a little bit more conversation, but it's always fun to, to do.

[59:00] Justin Brodley: Yeah, I always enjoy it. Oh, well, I do appreciate you. Uh, I didn't— you weren't here last week for me to hear my thank you, but I thought your guys' episode without me was fantastic. So thank you very much for doing that.

[59:09] Justin: I thought you were gonna rip it apart.

[59:10] Justin Brodley: No, it was good. I, I even, uh, wrote some nice things in this week's, uh, newsletter comments for me, um, so you can see what I had to say there. But, uh, yeah, it's, uh, great as always. Uh, one thing, I am trying to get us to 50 iTunes reviews. So if you have not written a review of our show on iTunes or Spotify or wherever you'd like to listen to your podcast, please do. Uh, we are pushing for 50 by the end of, uh, June if we can. So get out there, do it. If you've done it before, you can always do it again, uh, especially since, you know, the last review someone posted on our podcast was, uh, from 2 years ago. So definitely, uh, looking for people to get out there and post some reviews onto our show. Helps people find it and it helps, uh, us know that you're out there listening. So we appreciate you all. All right, uh, Matt, I'll see you next week, hopefully with, uh, Jonathan or Ryan joining us.

[59:58] Justin: Perfect. Talk to you then. Another week of cloud news wrapped up. Bolt will collect the news. Justin will get the notes. Jonathan will write some code. Ryan will watch the perimeter. And Matt will reluctantly watch Azure. Till next week for AI, Amazon, Google Cloud, and Azure. And hey, maybe even Oracle, who knows? Check out thecloudpod.net for our newsletter. Join our Slack, message us on socials, or leave a review.