# 354: US-Tirefire-1 lives up to its Stellar Reputation
Duration: 90 minutes
Speakers: Justin Brodley, Justin, Jonathan
Date: 2026-05-20

## Transcript

[00:07] Justin Brodley: Welcome to The Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP, and Azure.

[00:14] Justin: We are your hosts, Justin, Jonathan, Ryan, and Matt.

[00:18] Justin Brodley: Before we get into this week's news, we want to take a minute to tell you about We Are Developers World Congress, which is finally making its way to North America this September. If you've spent any time in the European tech scene, you probably know the team behind it. They've been running World Congress in Berlin for over a decade, and it's a big deal over there, pulling in more than 15,000 developers every year. Our friend Coté from Software Defined Talk is actually speaking at the Berlin event this July, and from what we've seen, these are the people who know how to put on a good developer conference. This September 23rd through 25th, they're bringing it stateside to San Jose. Organizers are expecting more than 10,000 developers with over 500 speakers across 18 different content tracks, covering the entire stack, including cloud, DevOps, AI, security, software architecture, data engineering, frontend, and developer experience. If you've got a team, everyone's going to find a full schedule. It's not just sit-and-listen sessions. There are keynotes, workshops, masterclasses, and hands-on labs. The kind of stuff you can take back home and work on come Monday. There's an impressive list of speakers, including names from Datadog, Honeycomb, Sentry, Google, LinkedIn, Stack Overflow, Netflix, Microsoft, and Stripe, plus Kelsey Hightower, Olivier Pomel, Christine Yen, Scott Hanselman, and Angie Jones. Head over to wearedevelopers.us to grab your ticket and use code DEVPOD26 for 15% off. That stacks with their group rates if you're bringing 4 or more people, and honestly, at that price, you should probably bring the whole team.

[01:51] Jonathan: Episode 354, recorded for May 12th, 2026. US Tire Fire 1 lives up to its stellar reputation. Good evening, Matt and Jonathan. How are you guys doing?

[02:02] Justin Brodley: Doing well. How you doing, Justin?

[02:04] Jonathan: Good. We lost Ryan to illness, so he's not joining us tonight. But you're here, which is just good.

[02:12] Justin: So we got 3 of us. I call it a win.

[02:15] Jonathan: Yeah, 3 out of 4 ain't bad. That's, uh, that's how I see it. All right, let's, uh, have some follow-up real quick. If you remember, Microsoft has been on a mission to sprinkle a little Copilot on everything, even if you didn't want it. And that has received a lot of negative feedback, especially when they, you know, broke apps like Notepad that had been stable for decades by putting Copilot into them. And so apparently, uh, Microsoft is actively removing Copilot integrations from products where adoption was low or the user feedback was negative, including Gaming Copilot on Xbox and several Windows 11 entry points in Photos, Widgets, and Notepad. As I mentioned, this is everywhere. Executive Jacob Andreou publicly acknowledged the need to cut underperforming Copilots before deleting the post, signaling an internal shift towards consolidation under a single combined consumer and enterprise Copilot organization. The financial case for trimming Copilot is direct. Microsoft noted during its most recent earnings that running certain Copilots was compressing margins, particularly free integrations in Windows where no additional revenue offsets the inference cost. I mean, if it's not doing anything and no one's using it, how much inference is there really? So, uh, it's a little bit of a spin job, I think, for Microsoft in this one. But, uh, you know, I think it's really about users hating the fact that Copilot was infecting things like Notepad and Widgets, etc. I think Photos sort of makes sense for Copilot, you know, like, hey, remove someone from this photo. Like, that's a great AI use case. Adobe's making a lot of money on that use case. So that one's a little weird to me, but the other ones all make sense. Gaming Copilot on Xbox made no sense. Like a lot of them were just weird vanity projects.

[03:43] Justin: Yeah. I assume it's the cost of the developers more than the cost of usage or anybody just plays with it. Like there's a fee somewhere behind the scenes for development, maintenance, QA, et cetera. If they did QA.

[03:57] Justin Brodley: I think it's the invasive nature of the whole thing because, you know, with the UI for Copilot, someone's going to click right through it and not realize that everything they type in their app is now being sent to Copilot and used for training. And all of a sudden, they added it to a bunch of apps that everyone uses every day, like Notepad. And I think it's quite an invasion of privacy. And I love AI and using AI, but having it slammed in my face by Microsoft, having it enabled by default, and having it take screenshots, and having it do all those things without explicitly opting in is what I'm unhappy about. So yeah. I do think they probably stretched a bit too far with the deployment of Copilot. And like, if you're on the Xbox, surely you want to be playing games.

[04:38] Jonathan: Yeah, it seems like something you would maybe want to do on your Xbox is play games versus talk to Copilot. I don't even know what that feature did. Like, and what was the interface to it on Xbox? Did you talk to it? Like, I don't really even know how you used it.

[04:51] Justin Brodley: No idea. It's such a can of worms though, because you're going to have kids on there and there's going to be no under-13s. I got a funny story about that. My Samsung fridge updated overnight. When I got up this morning, I have to re-accept the EULA on that thing. And it's like, don't let anyone under 13 use this, use this device. I'm like, it's my, it's my refrigerator. What am I supposed to do?

[05:14] Jonathan: Like, someone didn't tell the lawyer what the purpose of this agreement was.

[05:18] Justin: Yeah.

[05:18] Jonathan: We're just in the agreement, click-through agreement. Okay.

[05:21] Justin: But why does your refrigerator need to be smart? And why does it need to be connected to the internet? I just, I still have questions about that.

[05:27] Jonathan: Because that way they can spy on you and they can sell imagery of all your food inside your fridge, so they know what ads to sell to you. That's why they need that data. Just like they record everything that's happening on your TV. That's why you shouldn't connect your TV to the internet either.

[05:38] Justin: That's why I have an IoT network that blocks everything.

[05:44] Jonathan: Well, it's, uh, been a bad week for security, unfortunately. Linux, uh, first of all, had its second severe vulnerability in as many weeks. We talked about one last week. This week, the second one came up, Dirty Frag, allowing low-privileged users to gain root access across virtually all distributions. The one last week was the, uh, left copy defect I think we talked about. The exploit is particularly concerning because it is deterministic, stealthy, and causes no crashes, making it difficult to detect while working reliably across different environments, including shared servers and virtual machines. Proof of concept code was leaked publicly before most distributions had incorporated kernel patches, effectively turning this into a zero day and accelerating real-world exploitation risk. Microsoft has already observed signs of active experimentation in the wild. Cloud and shared hosting environments face elevated exposure since the attack is well suited to multi-tenant scenarios where untrusted users share underlying infrastructure. And I guess I'm also now repatching The Cloud Pod again. I did it last week for left copy and now I guess I'm doing it for this as well.

[06:41] Justin Brodley: Yeah. It's time for no operating system.

[06:44] Jonathan: Yes. Move off of Linux to something that's going to be more secure? I doubt that.

[06:49] Justin Brodley: Yeah. No, I don't know. Every time these, these recent bugs, these very deep kernel bugs, which, which allow root access, the, the skeptic in me thinks they may have been placed there deliberately sometimes.

[07:03] Jonathan: I mean, and some of them are, you know, also, you know, you have to have physical access, you have to have all these other things perfectly right. I mean, like, they're bad, but they're also not always exploitable from external aspect unless the ports are open in a certain way. And if you're following good hygiene, they shouldn't be. So it's, uh, you know, some of them you had to read the fine print of like, how do you actually exploit this issue?

[07:26] Justin Brodley: I think, I think the easiest way to exploit it though is a supply chain attack. Because if you do an apt update or something and there's an open source package that's been packaged into somebody's Ubuntu repo, whenever those things run, they can run shell scripts, they can run arbitrary code when you do the update. And to be fair, they're already running as root at that point anyway, so it's not quite as bad.

[07:48] Jonathan: I mean, it's funny you mention supply chain though, because our next story, which is also security related, is the best supply chain attack I've seen yet. So if you are using npm, you are having a bad week, uh, as the TanStack npm packages were hit by the Mini Shai-Hulud hacking group. Uh, and they hacked it in probably the most creative way I've seen in a long time. So they created a legitimate PR, or pull request, against a public project. And basically the PR ran because that's what their GitHub repo was set up to do. They then captured the cache key from the run and sent it back to themselves and then used the cache key to inject bad cached packages into the main pipeline. So when the next real pipeline ran, it grabbed the compromised cached objects, put them into the binary, and then shipped that to the world. And this is now spreading all over the place. It was first TanStack, and it's pretty wide distribution at this point. If you look this up, you'll find that it's impacted all kinds of places, including Mistral AI, UiPath, and others. So this is a bad one. And brilliant though. Bravo for the creativity of this one. I had not thought of that attack vector before. And, uh, yeah, GitHub Actions cache poisoning, definitely something to now be a little bit more concerned about. I expect we'll see some GitHub Actions requirements and things that we want to do. Like, I don't know why the cache would even be accessible from outside of the GitHub Action, or why there wouldn't be some type of signature verification required. I don't know, something that I would have thought would have been there, but apparently not.
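
A show-notes sketch of the "signature verification" idea raised above: before a pipeline uses a restored cache entry, it could compare the blob against a digest recorded by a trusted build. This is only an illustration under assumptions. The function names, the sample blobs, and the idea of a pinned digest stored somewhere trustworthy are all hypothetical, and none of this is GitHub's actual cache API.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a blob."""
    return hashlib.sha256(data).hexdigest()


def verify_cached_artifact(data: bytes, expected_digest: str) -> bool:
    """Accept a restored cache entry only if it matches the pinned digest."""
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(data), expected_digest.lower())


# Hypothetical data: the digest is recorded by a trusted build on the main branch.
trusted_blob = b"dependency tarball built on main"
pinned_digest = sha256_hex(trusted_blob)

# Hypothetical poisoned entry written to the same cache key by an untrusted run.
poisoned_blob = b"dependency tarball built on main + injected payload"

print("trusted entry accepted:", verify_cached_artifact(trusted_blob, pinned_digest))
print("poisoned entry accepted:", verify_cached_artifact(poisoned_blob, pinned_digest))
```

The check only helps if the pinned digest lives somewhere an untrusted pull request run cannot write to, which is the same trust boundary the attack described here crossed.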

[09:22] Justin: Well, all your GitLab workers are public, so, and all the caches are hitting public endpoints, so that's why, unless you have your GitHub locked down to IP addresses. So that part makes sense that it's public. I mean, you were right. I did refrain myself because I didn't read this article before and I was listening to it as you were reading it. That is a very impressive way to attack something like this. I would not have thought of that. You know, maybe that's why I'm not a black hat hacker out in the world doing these things. But that is an impressive way to break into things and that will definitely spread very fast.

[10:00] Jonathan: And if you want the icing on the cake of this one, to remediate this, you have to delete the GitHub token. It has to be revoked. Well, when you do that, they built a dead man's switch into it that rm -rf's your hard drive when the GitHub token is revoked. Uh, so you had to disable the monitor service before rotating the credential or you risk your home directory being basically destroyed. Nice. So it had a little extra added insult, but yeah, that's basically what we saw. This hit on the 11th, which I think was Monday. And basically we saw TanStack, you know, sending out all kinds of things. And then other npm packages that relied on TanStack got compromised because this thing just propagated like a worm across everybody's repos. And so one of the things that people were suggesting, which I was curious what your guys' thoughts were, they're saying like you should delay installing any npm package for 7 days. And while I see the thought process people are having, it's like, well, the idea is someone else will detect that this is bad and they'll fix it before it gets downloaded by me and put into my code. But I'm also like, but you're also exposing yourself to zero days and to all kinds of other things that are being patched all the time. And so like, is it worth waiting 7 days to make sure the package got properly vetted? And then also if everyone does 7 days and no one's actually vetting it, you just delay the discovery of the thing. So like, to me it felt like it's a cute idea, but not really security. So I don't know what your guys' thought was.
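
To make the 7-day delay idea concrete, here is a minimal sketch of a cooldown filter: given publish timestamps for each version of a package, pick the newest one that is at least N days old. The function name, the version numbers, and the dates are made up for illustration, and a real setup would read publish times from the registry rather than a hard-coded dict.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def newest_version_past_cooldown(
    published: dict[str, datetime],
    now: datetime,
    cooldown_days: int = 7,
) -> Optional[str]:
    """Return the most recently published version that is at least cooldown_days old."""
    cutoff = now - timedelta(days=cooldown_days)
    eligible = {version: ts for version, ts in published.items() if ts <= cutoff}
    if not eligible:
        return None  # every release is still inside the cooldown window
    return max(eligible, key=eligible.get)


# Hypothetical release history for a single package.
now = datetime(2026, 5, 12, tzinfo=timezone.utc)
published = {
    "1.4.0": datetime(2026, 4, 20, tzinfo=timezone.utc),
    "1.4.1": datetime(2026, 5, 2, tzinfo=timezone.utc),
    "1.4.2": datetime(2026, 5, 11, tzinfo=timezone.utc),  # one day old, skipped
}

print(newest_version_past_cooldown(published, now))
```

The sketch also shows the tradeoff raised here: if the skipped release happens to be the security fix, the cooldown delays that patch by the same 7 days.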

[11:26] Justin: I mean, CrowdStrike came out with the same thing though. You could delay your agent upgrades for— by, you know, 1, 2, or 3 agent versions behind.

[11:34] Jonathan: Well, it was specifically their detection, but it was— it wasn't the packages or the signature files you could do that. It was— it's really like kernel things that you were touching that you could delay the kernel changes. No, you could do it for everything, but we're Oh, did they add that? Okay.

[11:46] Justin: They rolled it out to more things over time. And people say that about Windows updates too. We don't push it out the first day. We wait a week before we roll it out 'cause we don't want things to crash. It's always that double-edged sword like you're saying. Do I fix things faster and potentially risk something like this being in there or do you not? And I think the real answer is have a process for these things. If you are maintaining, systems or builds or anything else like that, you need to be able to press a button and get your pipeline to deploy and get a new version of everything really fast. And that's, you know, I think that's the real answer is building a true CI/CD pipeline, not, hey, we have this thing, you know, and saying, well, we're just not going to do this for a week because that's not going to help you. It's, if anything, it's going to hurt you more, I feel like.

[12:39] Justin Brodley: It's really relevant to a conversation I was having this week with security around building images for machines, VMs. And, you know, the typical way that things go is you build an image and it gets scanned, and then somebody from security looks at the results and says yes or no. But, you know, occasionally there's a new vulnerability that's found. And, you know, typically it's like, okay, well, we won't release this image because there's a new vulnerability in it. But I was like, well, what about If the vulnerability is affecting something which has been around for 5 years, every image we have now in the org has now got this vulnerability. Are we going to not release a new image because it still has the same vulnerability that everything else has, and then sacrifice all the other patches that would have been applied anyway? So, I think it's forcing, I think these problems are forcing people to rethink sort of the, the not black and whiteness of security anymore.

[13:34] Justin: But also, like we were talking about in the other article, it's okay, how risky is this actual attack? If I have to already be on the system as a user, okay, yes, in order to— then I can escalate my permissions to root. Yes, it is bad because it is a privilege escalation attack, but they've already compromised 3 other things to get into my environment. So at that point, it's daisy-chaining multiple attacks together to get there. So I also think with any vulnerability, you have to look at it and look at your specific environment, not just say, oh my God, this is a 9, you know, on the CVS score, CVE score, but CVSS score, sorry. But actually understand what the vulnerabilities are that you're looking at. And really how they affect your environment more than anything.

[14:25] Jonathan: And that's why it takes a lot of context. So that's, I think we've talked about it several times on the show, that context is a lot of everything in vulnerabilities. So it's all about what are my defenses in depth? What are my, the things? And so something that's a potentially a high risk vulnerability, if it's on my, my storage controller behind, you know, 6 firewalls, it's probably not as much of a high risk for us unless they get through all these other layers. And I have a bigger problem if they got through all those layers to get here to exploit that.
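
One way to picture the "context is everything" point is a toy prioritization function that scales a published CVSS base score by facts about the environment. Everything here is an assumption for illustration: the field names, the weights, and the two example findings are invented, and this is not how CVSS environmental scoring is formally defined.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    name: str
    cvss_base: float          # 0.0 to 10.0, as published
    needs_local_access: bool  # attacker must already be on the system
    internet_exposed: bool    # asset is reachable from the internet
    defense_layers: int       # hypothetical count of controls in front of the asset


def contextual_priority(f: Finding) -> float:
    """Scale the base score by environment context. Weights are illustrative only."""
    score = f.cvss_base
    if f.needs_local_access:
        score *= 0.7  # attacker has to chain an earlier compromise first
    if not f.internet_exposed:
        score *= 0.8
    # Each mitigating layer trims 5 percent, capped at a 30 percent reduction.
    score *= 1.0 - min(f.defense_layers * 0.05, 0.30)
    return round(score, 1)


# Hypothetical findings: a high base score deep in the network vs. a lower one at the edge.
storage = Finding("storage controller RCE", 9.0, False, False, 6)
web_app = Finding("public web app auth bypass", 7.5, False, True, 1)

for finding in sorted([storage, web_app], key=contextual_priority, reverse=True):
    print(finding.name, contextual_priority(finding))
```

With these made-up weights, the edge-facing finding outranks the storage controller despite its lower base score, which is the kind of reordering the hosts describe doing by hand.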

[14:52] Justin: Right.

[14:52] Jonathan: And so, you know, but again, like, that's the thing a lot of the tools for Kubernetes don't really do a good job of, is understanding the context of the environment the system's running in to then see what the mitigating controls are. I keep hoping with AI that maybe some security teams start really getting that capability, because AI can help figure out dependencies and help figure out mitigation layers a lot easier than I think a security team who wasn't really maybe in the process could have. So I'm hoping maybe some security teams start building that out in their products. That'd be great. Yep. Well, US-East-1 experienced a power loss event this week, causing EC2 instance impairment, continuing the region's well-documented history of outages that have made it a cautionary tale for architects who skip multi-region or multi-AZ design. This incident reinforces why AWS best practices around availability zone distribution exist in the first place. I mean, I prefer the best practice of don't put things in US-East-1. That's the first one I would start with. US-East-2 is right there. It's just one number away, US-East-2.

[15:50] Justin: And you're defaulted to it now.

[15:52] Jonathan: It defaults to 2, not 1. Uh, for teams still running single-AZ deployments in US-East-1, this is a practical reminder to review your architecture decisions around Auto Scaling groups, RDS Multi-AZ, and Route 53 health checks as a baseline for your resiliency. There's not a lot of specifics around what happened. I expect there'll be an RCA coming, which we'll follow up on next week, but, uh, yeah, it was a bad week. Uh, power loss, bad, all bad. So there you go.

[16:18] Justin Brodley: I thought I heard it was, it was heat related, unexpected, unexpected heating and AC gave in and then stuff overheated. Was that different?

[16:24] Jonathan: Yeah, no, this is that one. There's something with the coolers or, you know, an environmental issue occurred that I don't have all the details of. But yeah, they said basically they were unable to get additional cooling system capacity online in time before things started shutting down, which means that a cooler failed is my guess, but I would've expected more redundancy there. But maybe AI is pushing their data centers to be running on thin ice, so.

[16:49] Justin: I also feel like if you have a cloud architect at your company that's telling, that's recommending you don't do multi-AZ, just in general, you should probably fire the person.

[16:58] Jonathan: Like, it's just table stakes. It comes down to cost though. I mean, some companies are very cost averse.

[17:04] Justin: But multi-AZ, if you have multiple servers already doing the same thing, And didn't they get rid of the cross— or was it Azure that got rid of the cross-zonal traffic? But did AWS, or is there still cross-AZ traffic?

[17:16] Jonathan: I don't think they ever fixed that issue. I think they still charge for cross-AZ traffic.

[17:21] Justin: If you're getting hit with that, but if you're running production, you really should at least be in multiple zones. I get DR and what level of DR you want to do and et cetera, et cetera. You know, 'cause I still say, Most of the time, pending you're in UAE, you know, AWS will get it up faster than you will be able to probably implement in your full DR strategy. But you really gotta at least be in multi-AZs.

[17:48] Justin Brodley: Yeah, I was surprised when Coinbase went down. I was like, really?

[17:51] Justin: There was a lot of people that went down.

[17:53] Justin Brodley: I mean, multi-AZ, you can still go down though, even if there's multi-AZ. I mean, depends if the zone goes down. You've got 3 nodes in the database cluster and the master goes down, then you're still left with 2 who might not know what's going on.
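
The 3-node example can be sketched as simple majority math. This is a generic illustration of majority quorum, not a description of how any particular database engine behaves, and the function names are made up.

```python
def majority(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """True if the reachable nodes can still form a majority of the full cluster."""
    return reachable_nodes >= majority(total_nodes)


# A 3-node cluster spread across 3 zones.
print("quorum after losing one zone:", has_quorum(3, 2))
print("quorum after losing two zones:", has_quorum(3, 1))
```

Keeping quorum only means the two survivors are allowed to elect a new primary. Until that election finishes and clients reconnect, writes can still fail, which is one way a multi-AZ deployment can see downtime when the primary's zone goes away.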

[18:08] Justin: But it was like Coin, there was a couple other companies too that I was like kind of surprised.

[18:13] Jonathan: Yeah, so it impacted Redshift clusters, it impacted Kafka and Managed Streaming for Kafka, EKS volumes were impaired. I mean, I'm just looking at their actual incident right now. I don't see an RCA yet, but they said IoT Core, Elastic Load Balancer, SageMaker, NAT gateways, Streaming for Kafka, ElastiCache, OpenSearch, Kubernetes, and Redshift were all impacted in that region. And it was specifically, uh, us-east-1-az4, which is one of the newer ones. So that's kind of interesting too.

[18:43] Justin Brodley: But yeah, the one they, the one they put all the GPUs in.

[18:46] Jonathan: Yeah, I was gonna say, probably because that's where they put all the GPUs. Yeah. Uh, so we'll keep an eye out for that RCA. I'm sure it'll be coming soon. AI is how machine learning makes money. And this week, uh, we're talking about Anthropic, which has launched several updates to Claude Managed Agents, including Dreaming in research preview, plus Outcomes, multi-agent orchestration, and webhooks in public beta. Dreaming is a scheduled process that reviews past agent sessions to extract patterns, refine memory stores, and enable agents to self-improve between sessions without human intervention. The Outcomes feature lets developers write a rubric for success, and then a separate grader evaluates agent output in its own context window to avoid being influenced by the agent's reasoning. Multi-agent orchestration allows a lead agent to break complex jobs into parallel workstreams delegated to specialized sub-agents, each with its own model, prompt, and tools. All events are persistent and traceable through the Claude Console. Uh, real-world results from early adopters show measurable outcomes, with Harvey showing roughly 6x improvement in completion rates using Dreaming for legal drafting workflows, and WiseDocs reporting a 50% faster document review cycle using Outcomes to enforce quality standards. The combination of memory, Dreaming, and multi-agent orchestration represents a shift towards agents that accumulate institutional knowledge over time. Yes, they do.

[19:54] Justin Brodley: Yeah, it's a great feature. I built a retrospective agent a while ago before Dreaming was around, and it's probably nowhere near as good as the new Dreaming thing, but that went back through and looked at chunks of things, especially if I had to correct it, to figure out, did I say something wrong earlier in the chat? You know, what led to the divergence from the intent in a way? So I guess this is an automated way of doing the same thing and probably covers a wider range of problems than I ever built.

[20:22] Jonathan: Yeah. Your, your version on steroids across multiple agents. Kind of cool.

[20:26] Justin Brodley: Yeah. Even the insights thing though. It's the first time I ran it a couple of days ago and there's some really interesting insights in there. And I thought, ah, I had not thought of doing that.

[20:36] Jonathan: Yeah. Well, that was when I was telling you about that tool. I was like, yeah, insights is cool because it basically analyzes your last 30 days of sessions and, you know, points out where you commonly make mistakes or where You know, there's things that you need to think about that are like, maybe you need to do more planning before you go into coding. Or like one of the things I, so I, I stop the agent midstream a lot of times. I see it going off path and so it's like, well, if you had a better plan up front, you wouldn't have to stop it midway. And I'm like, well, true, but I don't, I, once I, I didn't know until I saw it do the thing and I was like, oh no, that's bad. So, you know, that's kind of how that happens. But you know, it's also good too because like a lot of things around like I would manually run git commit and, you know, PR and then, or I would, um, you know, ask it to do, run the tests and things like that. And so it's like, well, why don't you just create those with hooks? I'm like, I didn't know about doing that with a hook, or I didn't know I should use that with a skill. That's a great idea. And so over time, that's, I run it about once a month and it gives me insights. And I wish that you could, you could kind of focus it to specific repos. So you could then, you know, say like, I only wanna know the insights for this repo. 'Cause this repo is Python and that repo is something else and this repo is something else. And so it kind of mixes up some of that together 'cause it's all, it's all sessions over 30 days, but it's still pretty nice. So definitely check that out if you haven't checked out Insights. But this does remind me that, uh, you know, my favorite Westworld quote, these violent delights have violent ends with the dreaming. So, you know, be careful. Uh, things can go bad. Robert Ford can get shot in the back of the head. So, and then of course they also launched the improvement that I've been waiting for. 
So I've been a big fan of Anthropic Experimental Agents, and I think even on this show I talked about, I had an itch to maybe potentially build a better way to like view across agents 'cause it was a little bit opaque. Uh, and this is an experimental feature, but they have now given me the ability of agent views in Claude Code. This is in research preview for Pro, Max, Team, Enterprise, and API plan users, and provides a unified CLI interface for managing multiple Claude Code sessions simultaneously, accessible by the Claude agents command or the left arrow key for any session. The feature addresses a practical pain point for developers running parallel AI coding sessions, replacing the need to juggle multiple terminal tabs or tmux grids. Each session row displays status, last response content, and interaction timestamps at a glance. The two key commands: extend-session-manager-bg moves an existing session to the background, while claude --bg-task launches a new session directly in the background without occupying the foreground terminal. Early use cases include dispatching multiple coding tasks in parallel and reviewing the resulting pull requests from a single list, managing long-running looping jobs like PR monitors with next run times visible in the agent list, and quickly spinning up related tasks for codebase questions. For teams and organizations, this tool supports scaling concurrent AI-assisted development workflows. Overall, this is nice. It's still not exactly everything I want, but it scratches like 80% of what I wanted.

[23:16] Justin Brodley: That's weird. I really rarely review the code until after, after the tests have passed. So I'm happy to let it go off and do its own thing and then check it at the end. If it passes the test, then I'll do the, the, this is the code smell checks and various other things.

[23:33] Jonathan: Well, the nice thing about doing this in agents is you can have multiple agents working on different parts of the codebase. And then as they complete, you can review them. It's really about getting parallelization. That's what I, like it for. It's not really so much that I'm reviewing it real time or doing anything else, but it's really about, I've got 5 features I want to deliver. There are 5 different components. So go do your thing.

[23:53] Justin Brodley: Most evenings I've been, the last thing I do before I walk away from the computer, I'm like, okay, let's review the backlog. Let's find 5 of the highest priority items, which don't overlap each other in the code, as you know, or sort of different areas of the code base. And I give it the instruction, I want to use an agent team. Use, what are they called, working directories.

[24:17] Justin: Work trees.

[24:17] Jonathan: Oh, work trees.

[24:18] Justin Brodley: Yeah, work trees, that's right. Use a work tree for each one. And then I'm like, okay, I'm gonna go away now. I'm gonna go to bed now for 12 hours. I'll see you in the morning. Get as much done as possible. Document any decisions you make. And you come back in the morning. I'm like, it's just amazing to see this morning 6 huge features delivered in something I was working on. And all the tests passed. And I went back and reviewed some of the logs, and it had some real problems. And it found some other bugs and fixed those. But if you give it the permission to be autonomous, and you've already put it along the right rails, it does a fantastic job, really.

[24:59] Jonathan: Have you thought about building backlogs in something like Linear?

[25:03] Justin Brodley: I built a Kanban, a really simple web Kanban thing. So I've got my— Yeah.

[25:09] Jonathan: So I, I use Linear for this. Um, and so basically I just start kicking out stories as I'm thinking about stuff. And so the nice thing is I can do it from Slack, I can do it from my phone, whatever. So I'll, I'll create a bunch of them and then basically I have a loop that at 11 o'clock at night while I'm asleep, it picks up and it grabs the highest priority Linear tickets I have left and then it tries to work on them while I'm sleeping.

[25:28] Justin: So.

[25:29] Jonathan: Similar idea, but, uh, I let it be a little bit more decisive about what it wants to do based on priority and what's in my linear backlog. So I don't have to do all that work. And like, and some of them have like repetitive jobs, like, hey, I need to deal with vulnerabilities. So once a week, you know, it pops up and goes through all the Dependabot alerts and puts patches in for those. And so it's, it's fun.

[25:46] Justin Brodley: Oh, nice. Yeah. I'm hardly ever, I did this whole analysis thing, uh, which I'll share actually. We should, maybe we should have a Cloud Pod, uh, open source repo or something. On Git, put some tools there. I built a whole bunch of analysis things to look at my sessions and to look at what, out of the 900,000 tokens in a window, like how much was me typing, how much was tools, how much was files. And it's really interesting, but I think the reason I've been so successful in getting what I've done done is that I'm not providing a lot of instructions just manually by typing. I've already spent the time chatting and building backlog items. So I have my, my, I think about a category in Kanban and then the sort of the, the well thought out, but not fully planned and then planned and then prioritized for this release or next release.

[26:34] Jonathan: So on my Synology, one thing I have running now is Grafana. And so I have dashboards that are getting fed by Claude, everything it's doing. So I have that dashboard exactly of like, here's all the tools that I'm using and here's the percentage of time on each tool and what's this. So yeah, it's, uh, that's how I've been doing that. It's really cool. And, uh, you know, the nice thing is it runs on my Synology. I don't have to think about it. I had it running in a container on my laptop, but then I was like, if I didn't remember to turn on Docker, I wouldn't get the metrics. So I was like, what's running all the time? Like Synology. Good.

[27:02] Justin: Perfect.

[27:02] Jonathan: This is a perfect Grafana use case.

[27:04] Justin Brodley: Yep.

[27:05] Jonathan: And, and so it just sits there and it collects all this data and I, I go look at it occasionally and I get good insights from that too. And so that's, that's pretty nice. Uh, and definitely something to do if you haven't done that, use the OTEL, uh, capabilities.

[27:16] Justin Brodley: Yeah, that's pretty cool.

[27:18] Jonathan: All right, let's move on to AWS. So first of all, the AWS MCP server is now generally available as a managed remote MCP server that gives AI agents authenticated access to all 15,000 AWS API operations using existing IAM credentials, with no additional charge beyond normal AWS resource costs. It's currently available in the US East (N. Virginia) and Europe (Frankfurt) regions. A core problem this solves is AI coding agents relying on stale training data, producing overly permissive IAM policies, and defaulting to CLI commands instead of CDK or CloudFormation. I'd prefer you stick with CLI, but that's just me. The server addresses this by retrieving current AWS documentation at query time and providing skills, which are curated best practices maintained by AWS service teams. The new run script tool lets agents execute sandboxed Python server-side with no network access, allowing multi-step API calls to be chained in a single round trip rather than sequentially, which reduces both latency and context window consumption. And enterprise governance is addressed through IAM context keys for fine-grained access control, CloudWatch metrics under the AWS MCP namespace to separate agent calls from human calls, and full CloudTrail logging for compliance audit needs. The server works with MCP-compatible clients, including Claude Code, Kiro, Cursor, and Codex, and many, many more. I was interested in this, and then I saw CDK, and I was like, mm, if you did Terraform, I'd be totally down with this, but.

[28:34] Justin Brodley: Yeah, that's, Terraform's been my go-to for doing things in cloud with AI.

[28:39] Jonathan: Mine too.

[28:40] Justin Brodley: It just seems like another abstraction on top of another abstraction at this point. And we've already got the Cloud Control API that they built for Terraform to use. I would assume that this MCP talks to that, but why? You know, why not just teach it how to use CLI commands or, I don't know.

[28:59] Jonathan: Yeah. I mean, so one of the things, you know, for Bolt, who's our cloud chatbot and our mascot, you know, he handles our help sorts, our show notes, and he does a bunch of things. And so when I first built Bolt, his primary purpose was to do our show notes because it was the part that took the most of my time every week. And, uh, the challenge of course was making the Google Docs. We talked about this on the show before. The Google Docs API is terrible. But I worked through it. I created a test document. That's my QA document. It's still in use today as part of the test harness, so anytime I mess with the import logic, it checks against the QA, yada yada. But every time Google releases new MCPs, I go and ask Claude like, hey, is this the time to get rid of my custom API hodgepodge? Because if I don't have to manage it, I don't want to. And every time it comes back and goes like, this is the dumbest idea you could have possibly had, and Claude normally tells me every idea I have is brilliant. So the fact that it tells me this is dumb is really telling. He's like, first of all, you have so much bespoke logic on how I do the show notes. Like you can't replace that with an MCP. And then it's like, you're gonna add 300 to 400 milliseconds of latency for every call you have to do, which would slow down the whole process. So every time I want to get rid of my custom API stuff, it talks me out of it, and that just made me realize that MCP is probably a crutch that most people will move away from over time for performance reasons, back to APIs. It might be good for experimentation or, you know, building out prototypes or, you know, low-volume things. But as you get, you know, more and more experience with these things, you're going to realize like, oh, I can just do it directly with the API and get better outcomes.
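
To put the quoted 300 to 400 milliseconds of per-call overhead in perspective, here is a small back-of-the-envelope sketch. The call count is a made-up assumption for illustration; the transcript does not say how many API calls a show-notes run actually makes.

```python
def added_latency_seconds(calls, overhead_ms):
    """Total extra wall-clock time if every call pays a fixed overhead."""
    return calls * overhead_ms / 1000

# Hypothetical: a run that makes 200 sequential API calls.
calls_per_run = 200

low = added_latency_seconds(calls_per_run, 300)
high = added_latency_seconds(calls_per_run, 400)
print(f"Extra time per run: {low:.0f} to {high:.0f} seconds")
```

Under that assumption the overhead alone adds roughly a minute or more per run, and it grows linearly with the number of calls.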

[30:25] Justin Brodley: Yeah, text is very wasteful, I think, when it comes to making API calls. Think about images. Images have very high density for the number of tokens that get used. I think we could start looking at things like binary output from LLMs, you know, like Protobufs or something, just directly talking to APIs rather than this intermediate step of let's turn it into text and then let's pass it to a proxy and then whatever, let's parse it again at the other end. Like, just a very roundabout way of getting stuff done.

[30:57] Justin: As soon as you said the binary, I'm just imagining two AI servers. I think we talked about it, it was like, the two servers like figured out they could talk in binary and just started talking to each other. And like, was it nobody understood it? Like, I'm just imagining, like, I feel like the reason why we always wanted to go back to natural language into human text is that we can look at it, we trust it.

[31:16] Jonathan: We can audit it.

[31:17] Justin Brodley: Yeah.

[31:17] Justin: Right. Versus, you know, if it dumps down into, hey, my computer's talking to it in binary, which at some level it always is, but you know, there's no view of what's going on at that point. Again, you would have to have a translator somewhere in there.

[31:36] Justin Brodley: I mean, I don't look at the compiled bytecode of my Java apps. There comes a point where you trust the underlying systems to do what they're supposed to do properly, and then you just look at the outcomes. Does it run? Does it pass tests? Great. I don't care beyond that.

[31:52] Justin: I don't know that I trust AI yet enough for that though. Maybe I'm more cynical, but I feel like I'm not there yet, you know? And then my next question is, why are you writing Java bytecode? Like, what?

[32:04] Justin Brodley: That's just an example.

[32:06] Jonathan: What's wrong with bytecode? You know, it's fine. It's good.

[32:13] Justin: There are a lot of cloud cost management tools out there, but only Archera provides insured commitments. It sounds fancy, but it's really simple. Archera gives you the cost savings of a 1 or 3-year AWS Savings Plan with a commitment as short as 30 days. If you do not use all the cloud resources you've committed to, Archera will literally cover the differences. Other cost management tools may say they offer insured commitments, but remember to ask, will you actually give me my rebate? Archera will. Check out thecloudpod.net/archera to schedule a demo today.

[32:51] Jonathan: All right, well, AWS Marketplace is launching the Agreements API, allowing organizations to programmatically procure software, accept offers, track charges, and manage entitlements and update purchase orders without leaving existing procurement tools like Coupa. Combined with existing Discovery API, this creates a full end-to-end programmatic procurement workflow from product discovery through purchase, which is useful for enterprises running automated or policy-driven software acquisition processes. Partners and ISVs can use these APIs to build custom storefronts on top of AWS Marketplace, giving them more control over how customers experience the procurement process with their own platforms. API is currently available in, of course, US-East-1. This is worth noting for organizations with regional compliance or data residency requirements, as the broader region availability is not yet confirmed. Hopefully US-East-1 doesn't go down on the last day of the month when you need to book that deal. So, but yeah, Marketplace is, I think Marketplace has always been US East heavy, uh, from my understanding.

[33:47] Justin Brodley: Yeah.

[33:48] Jonathan: But this is nice because they used, they had a Coupa integration. I don't know, they announced maybe 6 years ago, they announced it, then they basically did nothing and I, it kind of, I think no one adopted it and they kind of stopped working on it. So this is much better to have an agreements API that you can actually integrate into. And I kind of would like to see, you know, this is an area I know one of our procurement managers was really hot on back in the day. As an area of complaint. I'd like to see something similar from Google and Azure, I think.

[34:14] Justin Brodley: Imagine the future now where you've got your AI negotiating with the AI of your, your vendor over the, over the pricing. They can haggle for a few milliseconds and then, and then shake hands or whatever.

[34:25] Jonathan: Yeah, there's an interesting, um, LLM company or AI company called Vendr, V-E-N-D-R, and their whole shtick is that basically they have all this pricing data from all these deals that were negotiated, and they can help you know what you should pay for software, help you negotiate it, and do all kinds of things. It's kind of interesting just to go there because it's, you know, if you want to buy it as a company, it's kind of expensive, but you can just ask their agent just questions as well without having to— for free. And it is kind of interesting, so definitely something to check out.

[34:54] Justin Brodley: Yeah, that's cool. I'm actually— one of the side effects, I think, of AI and MCPs in general is that I think it's, it's forcing things like this to happen. It's forcing machine-to-machine protocols, whereas previously you had to go to a website and a person had to log in and a person had to type up an agreement or accept it by clicking a button. I think we're gonna move a lot more away from those manual steps so they can be automated, which is great.

[35:21] Jonathan: Yeah, I agree. AWS is launching the Agent Toolkit for AWS, which is a free suite of tools designed to help AI coding agents work more reliably on AWS by providing validated, up-to-date procedures called agent skills, reducing errors and token waste in multi-service workflows. It succeeds the MCP servers and plugins previously hosted on AWS Labs. The toolkit launches with over 40 agent skills covering infrastructure as code, storage, analytics, serverless, containers, and AI services, with database, networking, and IAM skills planned very soon. Come on, some FinOps skills too. Skills give agents tested, uh, procedures rather than letting them improvise from potentially outdated training data. The AWS MCP server combined with this, uh, basically gives you a full pre-bundled set of configs and plugins to simplify using AWS. So combine these two things, and now your agent can do things with MCP or this to get your access. You sound skeptical.

[36:14] Justin: Us skeptical? No.

[36:15] Jonathan: Yeah. Yeah. Yeah. I don't fully know on this one either. You know, everyone's trying to get to agents and how agents run on top of Bedrock AgentCore. And so like how baked are these things when things like Bedrock AgentCore are pretty new? You know, I do appreciate it. I think the MCP server is probably where I would spend most of my time for building an agent for this, even though I just mocked it mercilessly. But, you know, the plugins might be good or the skills might be good for certain things if you're not familiar. Like, if they could do a Cognito skill that could actually let you configure that without me pulling the hair out of my goatee, uh, because I don't have hair on my head, that'd be great. I'd love it. So, you know, a skill like that I'd be appreciative of.

[36:51] Justin Brodley: Yeah, it's useful. I noticed Claude actually checking its own documentation yesterday when I asked something. Everything needs to do that. But of course the quality of the documentation has to be good enough to begin with. And so, yeah.

[37:02] Jonathan: Well, I mean, like the nice thing about Amazon is they've always been really good about their APIs. So, mm-hmm. You know, if you can just go interpret the API documentation, that's even better in Amazon's case. But then you go to Google and you're like, your API documentation is terrible. And so then you need the docs to go with the APIs. So yeah, your API can be the answer to your question if you do it right.

[37:20] Justin: But yeah, I always felt like on AWS I would go to the Terraform documentation and read all the options for the resources. That was always the thing.

[37:27] Jonathan: I would do that too.

[37:27] Justin: I would figure something out. And then I would go be like, oh, what is this? And then from there, go find the AWS documentation. I always went to Terraform first for some reason.

[37:36] Justin Brodley: Terraform or the Route53 module.

[37:38] Justin: Yeah. Yeah.

[37:41] Jonathan: Well, there was a command line tool from Amazon, actually, AWS Shell, that I used a lot as well, because as you were typing out your CLI, it would basically give you like full documentation on each part of it, which was nice too. But I don't know if that's still around. I haven't tried in a while.

[37:56] Justin: I thought it got integrated. I thought they had, like, AWS CLI v2, which integrated some of those features into it, but maybe I'm wrong.

[38:05] Jonathan: I mean, I think a lot of it probably got moved into the AWS CloudShell console experience because that's where it probably should have lived all along when they bolted that on top. But I don't know. Speaking of terrible consoles, the AWS Console mobile app has new CloudWatch alarm investigations, now in a single view, combining interactive metric graphs, AI-generated log summaries, and natural language log search to reduce the time from alert to root cause identification. Natural language log search supports typed queries, voice input, and pre-saved Logs Insights queries, which lowers the barrier for on-call engineers who need to investigate incidents quickly from their mobile device, which I've done many times, trying to look through logs on my mobile device, which is a terrible experience.

[38:39] Justin: It's not good.

[38:40] Jonathan: And so I love that there's now a natural language way to do it on my phone. Then I had to ask the question, why can't you put this into the console? Because the syntax for querying your logs in CloudWatch is not easy either. And if I could natural language search that, that'd be great too.

[38:54] Justin: I thought they have natural language.

[38:56] Jonathan: You had to ask Q and then Q will give you the query, which is silly. That's how I did it last time.

[39:00] Justin: Oh yeah, that's what I've done. You're right.

[39:02] Jonathan: Yeah. But if it just natively integrated, like there's a search box that I just type in, hey, I'm looking for any logs that are 404 for the last 24 hours in the web server. If it could figure that out, that would be amazing. Save me a lot of time.
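
For reference, the request described here ("any logs that are 404 for the last 24 hours") maps to a fairly short CloudWatch Logs Insights query once you know the syntax. The sketch below only builds the query string and time window; it assumes the web server logs are structured so that a field named `status` is discoverable, which is an assumption about the log format rather than something stated in the episode.

```python
from datetime import datetime, timedelta, timezone

def build_404_query(limit=50):
    """Build a Logs Insights query for HTTP 404 responses.

    Assumes log events expose a numeric `status` field (hypothetical).
    """
    return "\n".join([
        "fields @timestamp, @message",
        "| filter status = 404",
        "| sort @timestamp desc",
        f"| limit {limit}",
    ])

def last_hours_window(hours, now=None):
    """Return (start, end) as epoch seconds covering the last `hours`."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours)
    return int(start.timestamp()), int(now.timestamp())

query = build_404_query()
start_time, end_time = last_hours_window(24)
print(query)
print(start_time, end_time)
```

The query string and the two epoch timestamps are the inputs a Logs Insights query needs; which log group to run it against is left out because it depends on the account.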

[39:15] Justin Brodley: Yeah, I think the Google Cloud Assist is really good on some pages. And then on other pages, like, I can't help you there. Like, why? You can see the page like I can. This is really, really kind of hit and miss though. It kind of smells a bit like the AWS console experience a few years ago when some people looked like, you know, some things looked newly styled and some didn't and some had some features and some didn't. It's obviously delegated to the product teams to implement those things, but it's a shame. But yeah, I kind of wonder if it's actually cheaper for them to give you the AI service for free than it is to have you sit there and fiddle with it for 30 minutes to try and find the data that you actually want.

[39:59] Justin: No, they charge you for the data that you want though. Every query they charge you for.

[40:02] Jonathan: They charge you the tokens. So it's, yeah.

[40:05] Justin: So at what point is it really worth it for them to implement?

[40:09] Jonathan: Not sure. Well, we talked about agreements earlier, and so if you were like, well, that's nice, but I want an agent to be able to go buy things. Well, Amazon has that for you too with Amazon Bedrock AgentCore Payments, built with Coinbase and Stripe. This is in preview, and it lets AI agents autonomously pay for APIs, MCP servers, web content, and other agents using either a Coinbase wallet or a Stripe Privy wallet, with spending limits enforced per session to prevent open-ended fund access. The feature is built on the x402 protocol, an HTTP-native standard where agents handle HTTP 402 Payment Required responses automatically, executing stablecoin micropayments and continuing their task without interrupting the reasoning loop. Fiat payment support is on the roadmap. Developers configure a funded wallet, set session spending limits, and the platform handles credential management, protocol negotiation, and transaction observability through existing AgentCore logs and traces, reducing what AWS describes as months of custom integration work. The Coinbase x402 Bazaar MCP server is available through AgentCore Gateway, giving agents a discovery mechanism to find and pay for x402-enabled services dynamically rather than requiring developers to hardcode each integration. And this is available to you in US East (N. Virginia), US West, Europe, and Asia. And if this is what you think is gonna make Coinbase, or sorry, blockchain purchasing more popular, I don't know if that's the case, but it is an interesting way to maybe potentially pay or charge, you know, people who are scraping websites for access to content with MCPs, 'cause like now there's a war to stop scrapers. But if I could, you know, basically use this as a way to charge or pay for that, it might be nice.
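
The pay-on-402 loop with a per-session spending cap can be sketched without any network calls. This is a simplified simulation, not the x402 wire format or the AgentCore API: `fake_server`, `SessionWallet`, the receipt string, and the 3-cent price are all hypothetical stand-ins, and real x402 payments are settled in stablecoins rather than a local counter.

```python
from dataclasses import dataclass

@dataclass
class Response:
    status: int
    body: str = ""
    price_cents: int = 0  # only meaningful on a 402 response

@dataclass
class SessionWallet:
    """Hypothetical wallet that refuses to exceed a per-session limit."""
    limit_cents: int
    spent_cents: int = 0

    def pay(self, amount_cents):
        if self.spent_cents + amount_cents > self.limit_cents:
            raise RuntimeError("session spending limit exceeded")
        self.spent_cents += amount_cents
        return f"receipt-{self.spent_cents}"

def fake_server(path, receipt=None):
    """Stand-in for a paid endpoint: answers 402 until a receipt is shown."""
    if receipt is None:
        return Response(402, price_cents=3)
    return Response(200, body=f"content of {path}")

def fetch_with_payment(path, wallet, server=fake_server):
    """Request a resource, paying once if the server answers 402."""
    resp = server(path)
    if resp.status == 402:
        receipt = wallet.pay(resp.price_cents)  # may raise if over the limit
        resp = server(path, receipt=receipt)
    return resp

wallet = SessionWallet(limit_cents=5)
print(fetch_with_payment("/article", wallet).body)
print("spent so far:", wallet.spent_cents)
```

The point of the session limit is that the second 3-cent request in this example would be refused by the wallet instead of draining it, which is the open-ended-access problem the feature is meant to prevent.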

[41:46] Justin: I think of it like the S3 bucket where you can do, you know, Requester Pays or the other one, where you pay for it. You know, that's kind of the way I'm thinking here. And like you said, I was thinking about it from the web perspective too, which is, hey, you can hit my website, but I'm going to charge you 3 cents and you have to pay for it. You know, and this enables that to happen because, you know, if AWS did it or Microsoft did it for, like you were talking about, the show notes, that's a valuable thing for us and might be worth 3 cents for, you know, an Azure story. Sorry, words evade me right now. In order to actually get those notes versus us having to manually go in and do it or set up a Selenium server or whatever backway door you want to get around the bots.

[42:31] Jonathan: I mean, I have a bunch of ways that we do now that work mostly pretty well. There's a couple that I have to manually insert content into, but you know, it's not too many. And it has some very angry prompts when it does it, like the internet hates me or this company is going to die and they, I take over, please give me the content. So, yeah, Bolt gets a little testy when it can't get to content.

[42:51] Justin: It really does.

[42:53] Jonathan: I've got one of those messages before, which I— it's fantastic. I love it.

[42:59] Justin Brodley: Not even just for agents though, just, just people on the browser. It would be just nice to have a wallet attached to my browser with $10 in it. And then if I want to read a Wired article, if I want to read an article on The Guardian or the BBC, uh, everyone's, everyone now is paywalled.

[43:12] Jonathan: Yeah. Even CNN is paywalled nowadays. And it's kind of annoying because I'm like, I just want to see this one article. I don't want to pay you $9.99 a month. I don't want to pay another subscription. I'd just like to pay for this one article. And then none of them have like pay for a day or even pay for a single article, which is super frustrating. But I mean, if this paywall world is going to continue, then yeah, the need for that is going to continue to be a thing. Well, we have 3 new features for ElastiCache this week. Uh, first up, Valkey 9, which adds built-in full-text and hybrid search capabilities on top of existing vector similarity search, enabling real-time semantic retrieval and aggregations over terabytes of data with microsecond latency at no additional cost, which is pretty great. A notable performance improvement in 9.0 is up to 40% higher throughput for pipelined workloads, achieved through engine-level optimizations like faster command parsing and improved memory prefetching, which could reduce over-provisioning costs for high-throughput apps. There are two new operational features. Hash field expiration lets you apply TTLs to individual fields within a hash rather than the entire key, and multi-tenant support in cluster mode provides logical namespaces that simplify multi-tenant architectures and migration from standalone Redis environments. Valkey 9 is available in most commercial regions, GovCloud, and China regions for both node-based clusters and serverless caches. And AWS continues to position Valkey as its recommended ElastiCache engine over Redis. Full-text, exact-match, and numeric range search directly within the cache layer, eliminating the need for a separate search service, with latency as low as microseconds, is also one of the features.
And then that vector capability is the one that I'm most excited about because we're using S3 Vectors, which is nice, but I need to redo the embeddings and I already use Valkey for other caching needs, so I'm just gonna move over to that and it's gonna be much faster. So Bolt suddenly will answer your questions much faster than it used to, because you won't be going to S3. Although S3's not super slow, it's like 400 to 600 milliseconds. I think, uh, this brings it down to like 50 to 100 milliseconds, is what I was seeing on some of the metrics. So it's not a huge improvement, but I need to redo the embeddings anyways, so I might as well take advantage and reduce some of the complexity of the architecture.
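
To make the hash field expiration idea concrete, here is a toy in-memory model of the semantics: each field in a hash carries its own optional TTL, so one field can expire while its siblings stay. This is plain Python written for illustration, not the Valkey client API or its command set; the `FieldExpiringHash` class and its method names are assumptions for this sketch.

```python
import time

class FieldExpiringHash:
    """Toy model of a hash whose fields can expire independently."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # field -> (value, expires_at or None)

    def set(self, field, value, ttl_seconds=None):
        """Store a field; with ttl_seconds it expires on its own schedule."""
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[field] = (value, expires_at)

    def get(self, field):
        """Return the value, or None if the field is missing or expired."""
        item = self._data.get(field)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[field]  # lazily drop the expired field only
            return None
        return value

# Usage with a controllable fake clock so the example is deterministic.
now = [0.0]
session = FieldExpiringHash(clock=lambda: now[0])
session.set("auth_token", "abc123", ttl_seconds=30)
session.set("display_name", "bolt")
now[0] = 31.0
print(session.get("auth_token"), session.get("display_name"))
```

Before per-field TTLs, the usual workaround was to split such data across separate keys or expire the whole hash at once, which is the complexity this feature removes.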

[45:07] Justin Brodley: Nice. I've been using Valkey recently. I didn't know they were bringing embedding support. That's very good to know.

[45:14] Jonathan: I mean, the one problem is Amazon really only provides Nova embedding as like a Bedrock embedding model, unless you go use Cohere or Mistral or any of the others. So that's definitely something to keep in mind too. So you might have to bring your own embedding model if you don't like Nova's.

[45:31] Justin Brodley: Yeah, I haven't tried Nova actually.

[45:33] Jonathan: That's not bad. I just, uh, I don't know that it's the best one because again, it's Nova and nothing else about Nova is good. So I'm like, how much better could it be if I use something else? Yeah.

[45:43] Justin Brodley: I mean, the, the OpenAI embedding service, which costs money, is, is very, very good, but it costs money.

[45:49] Jonathan: Yeah. Well, I mean, so does the Nova one, just not a lot of money.

[45:52] Justin Brodley: Yeah.

[45:54] Jonathan: And it's built into the architecture.

[45:55] Justin Brodley: So yeah, it's funny, you know, it's, it's not even, it's not even about the fact that it's 5 cents for a million tokens or whatever. It's, it's, it could be like a negligible amount of money. It's the fact they cost any money at all because it completely changes the way you have to integrate with it.

[46:12] Jonathan: Well, the nice thing would be potentially with OpenAI models ending up on Bedrock now, maybe I should check it again because it's been a little bit, maybe the embedding model is one of the options coming from OpenAI into Amazon. So that might be something I can look at and then just pay for it that way. Oh, good to know. AWS Capabilities by Region in the AWS Builder Center now supports availability notifications, letting builders subscribe to alerts when specific services or features launch in their target regions, across 1,500 services and 37 regions. Subscriptions work at the service level, meaning a single subscription to something like Amazon Bedrock automatically covers all underlying features such as knowledge bases and guardrails, removing the need to track each feature individually. Notifications come through two channels, either real-time in-app alerts within AWS Builder Center or a consolidated weekly email digest. I mean, this is great, and it's something that becomes a bit of a problem as they get more and more regions out there. I mean, we used to talk about all the different regions getting services and we even stopped talking about them here, even though they were a lightning round topic, because we were just like, we can't, it's ridiculous how these things roll out over time and what's available and not available. But hey, having the ability to see what it is, and now sign up for notifications, so I don't have to go back to the Builder Center? Even better. Nice quality of life.

[47:20] Justin: Yeah, I mean, on Azure, it's the same exact problems and there's things that we've deployed with the product in multiple locations for my day job and there's slight nuances we have to do differently because this is, this, you know, NAT v2 isn't in India, but it is in other locations. So getting those notifications versus Hey, watching the feeds and somewhere in the back of my head remembering this is very useful, at least for me.

[47:49] Jonathan: Yeah. I mean, in Google land, you'll have entire regions without Gemini or Vertex. So I mean, you can have those kind of problems.

[47:59] Justin: I mean, Azure's the same thing. Yeah.

[48:02] Justin Brodley: Yeah. Sovereign cloud, now without all the features you really want. Thanks.

[48:07] Jonathan: You want the sovereign cloud region in this place.

[48:10] Justin: Cool.

[48:10] Jonathan: We did it for you. Well, what about all these cool AI things you guys are announcing? Yeah, we don't have that there because that doesn't exist there yet. Well, when is it going to exist? I'm like, I don't know. You should ask Google because I can't tell you.

[48:20] Justin Brodley: I asked Google. They don't know.

[48:22] Jonathan: They don't know either. Yeah. For the most part.

[48:23] Justin Brodley: Like, why don't you use the one in Europe? Like, because we use this one because we needed to use this one. Yeah.

[48:29] Jonathan: And so then they're like, well, you can, you know, we have GPUs there. You can just run your own models. I'm like, but all of our code and testing is written against Gemini. So how's that going to work? Like, so are you telling me that I can't use Gemini, I should just use open models globally so we have consistency? Well, no, we're not telling you that. Okay. Ah, the fun. Claude Platform on AWS gives customers access to Anthropic's native Claude Platform directly through their AWS account, eliminating the need for separate credentials, contracts, or billing relationships with Anthropic. AWS is noted as the first cloud provider to offer this native integration. The service uses IAM credentials and AWS Signature Version 4 for authentication, logs activity to CloudTrail, and bills through AWS Marketplace, meaning teams can manage Claude usage alongside existing AWS governance and cost tracking workflows. And it's just as bad. An important technical distinction to understand: Claude Platform on AWS is operated by Anthropic and processes data outside of the AWS security boundary, making it different from Claude models on Bedrock, which stay in the AWS infrastructure. Teams with regional data residency requirements should factor this into their decision. So this is nice because it's a little bit better than just having the API. You get all the features that you kind of lose typically by using the Bedrock API with Claude. So things like Chrome, uh, browse modes, that works in this as well. And basically what this really is, is a different way to contract and buy your Claude managed services through AWS, uh, which I think is handy. So it's nice. Uh, it's not quite exactly what I would want. We talked to them about this actually last week, um, to get some more information about it. And it doesn't fix a lot of the audit logging and a lot of the gaps that you have.
Uh, so while they say you can see all this visibility, you can't, 'cause it doesn't exist in Anthropic to begin with. But it is nice to be able to use IAM credentials and some of the other things that it comes with.
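
For listeners unfamiliar with what Signature Version 4 authentication involves under the hood, the core of it is a chain of HMAC-SHA256 operations that derives a short-lived signing key from the IAM secret key. The sketch below shows only that key-derivation step using the standard library; the secret is the placeholder key used in AWS documentation, and the `example-service` service name is a hypothetical stand-in, since the transcript does not give the real service identifier.

```python
import hashlib
import hmac

def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """HMAC-SHA256 of msg under key, returned as raw bytes."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def derive_signing_key(secret_key, date_stamp, region, service):
    """Derive a SigV4 signing key scoped to a date, region, and service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")

# Placeholder credentials and a hypothetical service name, for illustration only.
signing_key = derive_signing_key(
    secret_key="wJalrXUtnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY",
    date_stamp="20260520",
    region="us-east-1",
    service="example-service",
)
print(signing_key.hex())
```

Because the key is scoped to a date, region, and service, a leaked signature is of limited use elsewhere. In practice an AWS SDK performs this derivation and the rest of the request signing automatically.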

[50:08] Justin Brodley: Does it count towards your committed spend? This is a question, or some fraction of it?

[50:12] Jonathan: It does.

[50:13] Justin Brodley: It does. Wow.

[50:15] Jonathan: And if you have a committed spend with Anthropic, I also learned recently that if you are using Anthropic through Bedrock or through Vertex or through the Azure Model Garden, that'll also count against your Anthropic commitment. Even though it's being used by third parties. So that's an interesting double benefit you get.

[50:35] Justin Brodley: So if this data's been processed outside of AWS, is this the first partnership of this type? Because I know, you know, Google have been doing this for a long time.

[50:44] Jonathan: No, what about Oracle? Yeah, it'd be the first probably.

[50:48] Justin Brodley: No, I mean for Amazon. Is this the first sort of partnership?

[50:51] Jonathan: Well, Amazon has Oracle at Amazon. It has the same thing that Azure and Google do where they'll run—

[50:55] Justin: Yeah, Oracle at AWS or whatever. Where it's like essentially they've set up a direct connect between the two and they process or—

[51:05] Jonathan: I mean, I assume that the preference for Anthropic in this model is that they are using AWS to service it, but I think they have the ability to send that traffic over to xAI or to other places they have contracts with when they need the capacity. I think that's the benefit they're getting.

[51:19] Justin Brodley: That's interesting.

[51:20] Justin: Yeah.

[51:20] Justin Brodley: The cloud, the cloud is changing.

[51:22] Jonathan: It's becoming very multi-cloud very quickly because everyone realizes they don't have the capacity to do anything that people want to do. So yeah, it's kind of crazy. AWS Security Agent now includes full repository code scanning in preview, offering context-aware security analysis that reasons about entire codebases rather than matching individual lines against known vulnerability patterns like traditional SAST tools do. The scanner operates in 4 stages: profiling the application to map entry points and trust boundaries, dispatching specialized agents to high-risk components, deduplicating findings, and independently validating each candidate vulnerability before surfacing it to developers. A notable distinction from existing tools is how findings are structured, with separate verified and could not verify sections, so developers know exactly what was confirmed in code versus what depends on runtime or deployment environment factors. Practical use cases include running scans before penetration tests to clear lower-hanging issues, auditing acquired or open-source code without needing institutional knowledge, and surfacing architectural trust boundary issues alongside implementation bugs. Full repository code review is available now in preview at no additional charge for existing AWS Security Agent customers, accessed through the AWS Security Agent console, and a quick start guide is available too. I do kind of want to play with this, with Bolt. I just don't know how much AWS Security Agent costs, and I'm afraid if I turn it on, I can't turn it off. So I will see if I can get a test of this without committing myself to some terrible contract, uh, and let you guys know when I do that. But I am really curious about this one.

[52:43] Justin Brodley: So yeah, I mean, it's, uh, it's interesting. I wonder what model they're using underneath because it's not Nova.

[52:51] Jonathan: It's probably Anthropic, I assume.

[52:54] Justin Brodley: Yeah, I kind of wonder. Yeah, I mean, it must suck that they built a model that's so good that they can't sell it to anybody. The Mythos model, at least.

[53:02] Jonathan: Ooh, this agent is $50 per task hour. Ooh.

[53:06] Justin Brodley: Yeah. Wow.

[53:08] Jonathan: That's a pricey agent. Yeah, I'm not going to use that right away. The security agent's a penetration testing agent. So for a penetration test, that's actually not that expensive, but for my personal hobby needs, uh, it's a bit pricey.

[53:20] Justin: So I just have Claude set up once a week to go do a security review of my repo and fix all the highs and criticals.

[53:28] Jonathan: And so I have been on EKS. Yeah. So I have that for Dependabot, uh, alerts. And then I also have it in some of my CI: if it gets above a certain number, it'll automatically have to go fix it. But yeah, that's pretty good as well. And then I also have a security agent that runs and might do multi-agent steps as well. So that helps too. But then one of the cool things is, um, so like Claude Code, you can install Cortex as an agent or plugin inside of Claude Code, and you can have it do, um, what do they call it, uh, like a heavy interrogation-style review of your code. And so that's kind of neat too, because you get the other, you know, AI basically doing a security test validation against what you're doing in code. Um, which is kind of nice. Just remember you have to do it before you commit the PR, because otherwise it won't help you before you commit the code.

[54:14] Justin: That's what I used to do actually with GitHub Copilot and Claude: I would write everything in Claude, send it up to GitHub, let GitHub Copilot do the review of it, not as much of a security review, but it does pull those things. And then I would actually have Claude go read all the comments and then provide me recommendations to go fix them. So you just loop the bots fixing each other till it was happy.

[54:38] Jonathan: And our last story is apparently Shawn Bice, who was at Microsoft, is returning to AWS as VP of AI Services to lead the Automated Reasoning Group, reporting to Swami Sivasubramanian, who oversees Agentic AI at Amazon and used to be ML. Bice previously ran the AWS database portfolio, including Aurora, DynamoDB, and RDS, before leaving in 2021. I assume to follow Charlie Bell. I assume that was about the timing. The Automated Reasoning Group focuses on neuro-symbolic AI, which combines pattern matching capabilities with mathematical verification techniques to confirm software is behaving as intended. The goal is to give businesses stronger guarantees about AI agent behavior before deploying them autonomously. This hire comes after AWS acknowledged a limited service disruption in February tied to an AI agent making changes without human oversight, which raised questions about reliability controls in agentic systems. Bice's background in security at Microsoft, where he oversaw Security Copilot and Sentinel, appears relevant to addressing those concerns. Uh, for AWS customers building or evaluating agentic AI workflows, this signals that AWS is investing heavily in formal verification and trust mechanisms as a differentiator. And considering that Vertex has had this for a while, it does feel like an area where AWS should invest more, uh, security capabilities in front of agentic AI.

[55:48] Justin Brodley: Nope, I was ready for Matt. That's fine.

[55:50] Justin: I don't have much on that one.

[55:54] Justin Brodley: I don't have much on that one either, honestly. It's, uh, the people moving around is interesting though, especially Shawn and, uh, and Charlie going back and forth.

[56:03] Jonathan: It's kind of, well, I mean, Charlie's still there. He just moved out of the security thing into that other group we talked about. Yeah, that's right. Yeah. But, um, maybe that would have left Shawn in a bad position. He's like, well, I'll go back and do this.

[56:14] Justin Brodley: So maybe. Oh, cool.

[56:17] Jonathan: Let's move on to Google. This is from Business Insider. So not really news, maybe more rumor. Uh, Google is internally testing an AI agent called Remy, described as a 24/7 personal agent built on Gemini that can take actions on behalf of users rather than just answering questions or generating content. It's currently in a dogfooding phase with employees using a staff-only version of the Gemini app. Remy is designed to integrate deeply across Google services with the ability to monitor for user-defined priorities, handle complex multi-step tasks proactively, and learn user preferences over time. No public launch timeline is available at this time, but I assume probably between now and next. So, I mean, I think this is what Gemini Enterprise should already be doing. That's what I would hope it would be doing. So the fact that it doesn't do it today is annoying, 'cause I try to use a lot of Gemini Enterprise every day at the day job, where we're a big customer of it, and I'm always disappointed in the limited capabilities it has. I hope this comes quickly 'cause they need much better capabilities here.

[57:13] Justin Brodley: Yeah, so it's like Google Reader's come back with AI.

[57:17] Jonathan: Yeah, hey, go check out these RSS feeds for me, let me know, and then summarize them. That'd be great, yeah. Yeah. Yeah.

[57:24] Justin Brodley: Oh, cool. Yeah.

[57:25] Jonathan: This is a LLM.

[57:26] Justin Brodley: Like, watch this for me. Let me know when this happens in the world. That kind of thing. Mm-hmm. It's an AI If This Then That, I guess.

[57:35] Justin: Oh, yeah. I mean, I feel like everyone's building an answer to OpenClaw at this point.

[57:41] Jonathan: I mean, 'cause it's just one way to... I think the reality is that no one really has a great way to do an agentic runtime. And so basically this was one model that someone came up with and everyone's like, yes, we'll just build that because everyone seems to like that. But the security implications of what OpenClaw is, I mean, just freak me out at this point. Like, every time I hear some horror story about it, I'm just like, man, that is rough.

[58:03] Justin Brodley: Yeah, it's rushed. I think it's going to lead to a lot smaller models, really. It's just too expensive to run. You couldn't possibly use a large model. Even one of the smaller Gemini models would probably be too expensive at Google scale to run for millions of customers who set up rules to say, hey, watch this, let me know if this happens.

[58:26] Justin: But can you run it on the edge? That would be the other question.

[58:30] Jonathan: Well, you run it in a 4GB LLM, runs on your browser.

[58:34] Justin Brodley: You run it on somebody else's browser, yeah. I guess it's that. But like, do I really need, you know, the models are very, very capable and far more capable than they need to be to do these basic tasks. Like, do they need to speak 5 languages? Would it be better to spend the money upfront to train language-specific models or domain-specific models? Do I need it to have a PhD in chemistry to read my email and respond to somebody? No, I don't. So I think there should be a lot more focus on the training process to give it sort of a, you know, the kindergarten, elementary school, and high school education, and then give it some kind of task to do. Let it graduate with a degree in email management or something. But it doesn't need to be an expert in everything. And I think these models could be a whole lot smaller. And I think that's the only way it's going to be cost-effective.

[59:25] Jonathan: I mean, I think this is where we really need a good model router layer that can determine what should go to a local model versus what should go to a remote one, and then be able to dynamically do those things and combine it together. But the tokenizer isn't sophisticated enough to handle that yet. I think that's where the bottleneck is.
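
As a rough picture of the router layer being described, here is a minimal sketch that decides between a local and a remote model. Everything in it is an assumption for illustration: the `route` function, the word-count cutoff, and the keyword list are hypothetical heuristics, not how any shipping router works.

```python
def route(prompt, local_max_words=50,
          hard_keywords=("prove", "refactor", "architecture")):
    """Return "local" or "remote" for a prompt using crude, assumed heuristics."""
    words = prompt.lower().split()
    # Long prompts are assumed to need the larger remote model.
    if len(words) > local_max_words:
        return "remote"
    # Certain task words are assumed to signal harder reasoning.
    if any(keyword in words for keyword in hard_keywords):
        return "remote"
    # Everything else stays on the cheap local model.
    return "local"


print(route("summarize this email for me"))
print(route("refactor the billing module into services"))
```

A production router would more likely use a small classifier than keyword rules, but the shape is the same: a cheap decision made before any expensive model is called.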

[59:41] Justin Brodley: So yeah, I mean, mixture of experts was supposed to help with things like this. But if you look at the analysis of what each expert is doing in an MoE model, it's not a nice breakdown. It's not, you know, chemistry's in this bucket and cloud computing's in that one. It's a lot more sort of language-oriented than that. So this particular expert looks at, you know, conditional statements, and this one looks at possessive things. It's not what it needs to be yet. So small models, I think, is the way. Yeah.

[60:12] Jonathan: Well, speaking of Google's small models, Gemma 4 came out a while back, I don't know, maybe a month or two back. And I think I talked about here on the show that I tried to use it with Claude Code and OpenCode, and I was kind of unimpressed. But they've kind of quietly updated it in the last week with the new multi-token prediction (MTP) drafters for Gemma 4, which offer up to 3x faster token generation through speculative decoding. The drafter model predicts future tokens during idle compute cycles rather than waiting for the main model to process each token sequentially. The MTP drafter for the E2B model is notably small at 74 million parameters, and it shares the main model's key-value cache to avoid redundant context recalculation. They also use sparse decoding to narrow down likely token clusters, which contributes to the speed improvements on memory-constrained consumer hardware. This addresses a specific bottleneck in local AI inference where slow VRAM-to-compute transfers leave processing units idle between tokens, and consumer GPUs lack the high bandwidth memory found in enterprise hardware, so the drafter fills that gap by doing useful work during those transfer delays. Gemma 4 ships under the Apache 2.0 license, which is more permissive than the custom license used in prior Gemma versions and broadens options for developers building commercial or derivative applications. So yeah, if you're a developer running local inference on consumer or prosumer hardware who wants fast generation without upgrading to enterprise accelerators, this is an option for you.
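
To make the draft-then-verify idea concrete, here is a toy sketch of greedy speculative decoding. It is not Gemma's MTP drafter: `target_next` and `draft_next` are made-up stand-in functions over integer "tokens", chosen so the drafter is usually right and occasionally wrong. The point it shows is that the output matches plain decoding with the large model while needing fewer large-model passes.

```python
def target_next(tokens):
    """Stand-in for the large model: next token is the sum of the last two, mod 10."""
    return (tokens[-1] + tokens[-2]) % 10


def draft_next(tokens):
    """Stand-in for the small drafter: agrees with the target except right after a 7."""
    if tokens[-1] == 7:
        return 9  # deliberately wrong guess
    return (tokens[-1] + tokens[-2]) % 10


def greedy_decode(prompt, n_new):
    """Baseline: one large-model pass per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(target_next(tokens))
    return tokens


def speculative_decode(prompt, n_new, k=4):
    tokens = list(prompt)
    target_passes = 0
    while len(tokens) - len(prompt) < n_new:
        # 1. The drafter cheaply proposes k tokens, one after another.
        proposal = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))
        # 2. One target pass checks all k positions (in parallel on real hardware).
        target_passes += 1
        accepted = []
        for tok in proposal:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # first mismatch: take the target's token
                break
        tokens.extend(accepted)
    return tokens[: len(prompt) + n_new], target_passes


out, passes = speculative_decode([1, 1], n_new=16)
print(out == greedy_decode([1, 1], 16), passes)
```

The saving comes entirely from accepted drafts, so it shrinks when the drafter guesses badly or when there is no spare compute to run it.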

[61:27] Justin Brodley: But it's only fast if you've got spare compute cycles. You know, if you're running at full capacity, it's actually slower, a lot slower, right?

[61:35] Jonathan: So you may want it or not; it depends on your use case. But I could see, like, in my coding use cases at home, I could probably use this a little bit.

[61:43] Justin: Yeah.

[61:43] Jonathan: Yeah.

[61:43] Justin Brodley: I mean, I guess if it asks you a yes/no question, you, you can speculatively start answering those responses and then discard the one that the user didn't choose.

[61:54] Jonathan: And this is part of the reason why I kind of feel like the AGI barrier is probably quantum combined with an LLM, because with those two things, you could do exactly that. You could run both paths and decide the right answer. And yeah, that's my belief. That's when we'll get AGI, when quantum is real enough that it can do it. Oh yeah. So I haven't tried Gemma 4 again. I'm going to see if it also fixed some of my complaints about its coding abilities. Uh, 'cause I was, like I said, unimpressed last time, but I doubt it. Uh, Qwen and Kimi K2.6 are still leading the pack for me on the open source side.

[62:28] Justin Brodley: Uh, Gemini.

[62:29] Jonathan: Huh.

[62:30] Justin Brodley: Qwen. Okay. In Python, right? Right. In Python.

[62:34] Jonathan: Yeah, Python and Node.js and all that kind of stuff, yeah.

[62:37] Justin Brodley: Okay, yeah, I tried to have Claude write some C#. It didn't go well.

[62:43] Jonathan: I mean, I don't, as a developer, I can't even write good C# code, so I just don't think that's a thing people do.

[62:50] Justin Brodley: I don't know, that's why I was trying to use an AI.

[62:53] Jonathan: I love it. What is the, I mean, I wonder if Microsoft's open model would be good at writing .NET code. That'd be an interesting test.

[62:59] Justin Brodley: They haven't updated that in a while, have they? That was Phi.

[63:01] Justin: No, they haven't in a while.

[63:03] Jonathan: I don't think I've seen a new Phi model for a while, for sure.

[63:06] Justin: Probably at their, uh, don't they have a conference coming up that we talked about that we, I think you're supposed to do predictions.

[63:12] Jonathan: No, we don't do predictions for Microsoft shows unless you force us to. And you don't care anymore.

[63:16] Justin: I'm just gonna force us to. It's fine. I'm still gonna force us to.

[63:19] Jonathan: But like, what is it? Is it Build that's in June or there's only one that makes sense for us to do predictions on. The other one is not really our show.

[63:26] Justin Brodley: So, Ignite.

[63:28] Jonathan: Ignite.

[63:28] Justin: Ignite is in, yeah, it's right before re:Invent too, which really makes November fun.

[63:35] Jonathan: Fun is a stretch. Gemini 3.1 Flash-Lite is now generally available on the Gemini Enterprise Agent platform, positioned as the lowest latency and most cost-efficient model in the Gemini 3 series, designed for high-volume automated pipelines and agentic tasks like tool calling and orchestration. Apparently, real-world performance metrics from early adopters are notable. Gladly reported roughly 60% lower cost compared to thinking-tier models, with P95 latency around 1.8 seconds for full reply generation and a 99.6% success rate under heavy concurrent load across SMS, WhatsApp, and Instagram channels. The model supports multimodal inputs, enabling use cases like simultaneous text and image safety checks in gaming platforms and prompt enhancements for image generation pipelines, areas where cost previously limited sophisticated prompt engineering at scale. Financial services teams are using Flash-Lite for latency-sensitive workflows, including real-time research during live calls, email triage, and high-volume data processing, with Ramp noting it "leads on cost, latency, and intelligence tradeoffs across our model stack."

[64:31] Justin: I mean, just like Jonathan's saying, some of these, I think, while you're not getting domain, more domain-specific stuff, that you're optimizing them for different areas. And a lot of it's going to be using the, you know, having the human right now be in the loop to say this model is the right model in this location. And I think eventually you will end up with model routers or something.

[64:52] Jonathan: Oh yeah, is this equivalent to like a Haiku type model basically?

[64:59] Justin Brodley: Flash, yeah, yeah, Flash-Lite is.

[65:01] Jonathan: Yeah, okay.

[65:02] Justin Brodley: But I think, yeah, I mean everyone sees LLMs now and it's become the hammer and everything looks like a nail and so everyone uses these things. So there are still like BERT models and things, text classifiers, and people like Gladly who are doing like user interactions and customer service interactions and things. There's a whole lot they can do without using LLMs at all. We've kind of forgotten that ML still exists and classifiers have been around for a very long time.

[65:33] Justin: LLM is how ML makes money.

[65:36] Justin Brodley: Exactly.

[65:36] Jonathan: Exactly.

[65:38] Justin: Wanna make sure we're all on the same page here.

[65:40] Justin Brodley: Everyone's just forgotten that there's a whole bunch of other stuff that's not LLMs in this field.

[65:45] Jonathan: Well, that's the thing is like when I was doing the end of year recap and I was doing all those analytics, like, and I was using the LLM to help me generate all this stuff. And I was like, oh, I'm gonna have to, you know, custom train a model. And it was like, no, just use Python ML. Like Python ML stuff was perfect for this. Like we could just do pattern matching. I'm like, oh yeah, duh.

[66:04] Justin Brodley: Yeah, and everyone's, you know, the instruct models that are trained for chat and to answer questions, that's all done in post-training. The base models are still completion models. So I don't know, you find people using chat models and then sort of prompting them with a chat as though they're asking a question interactively and then parsing out the answer, when just using it as a completion model would have been easier, whatever.

[66:29] Jonathan: Uh, if you care about GKE nodes starting faster, Google now makes startup up to 4x faster for qualifying nodes in Autopilot mode, addressing cold start latency that historically forced teams to overprovision idle compute as a buffer against scaling delays. I mean, I guess if you really care about this, you must be training models and this is really important to you, but most people probably don't care that much about this.

[66:51] Justin Brodley: I mean, how many— 4 times faster is great, but like, were we talking 4 minutes versus 1 minute or 10 seconds versus 2 seconds.

[67:00] Jonathan: I haven't used enough GKE Copilot or Autopilot to know how fast a node boots up, but I mean, like my ECS experience is that I can start up a new node in about 90 seconds. So if you can do that 4 times faster, you can do it in 20 seconds. I don't know that I care.

[67:13] Justin: So it doesn't really give a baseline, but it means you can raise your thresholds, you know, of when to scale up so you don't have to over-provision, but if you're paying by the hour for these things, who cares?
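
The link between node startup time and how much idle buffer you need can be worked through with a little arithmetic. This sketch assumes load grows linearly while a new node boots; the 90-second figure is Jonathan's ECS anecdote rather than a published GKE baseline, and the growth rate of 0.2% of capacity per second is a made-up number for illustration.

```python
def scale_up_threshold(startup_seconds, growth_per_second):
    """Utilization at which to trigger scale-up so capacity arrives before saturation.

    growth_per_second is the assumed load growth, as a fraction of total
    capacity per second, while the new node is still starting.
    """
    headroom = startup_seconds * growth_per_second
    return max(0.0, 1.0 - headroom)


slow = scale_up_threshold(startup_seconds=90, growth_per_second=0.002)
fast = scale_up_threshold(startup_seconds=90 / 4, growth_per_second=0.002)
print(f"90s startup:   scale up at {slow:.1%} utilization")
print(f"22.5s startup: scale up at {fast:.1%} utilization")
```

Under these assumptions a 4x faster start shrinks the idle headroom from 18% of capacity to 4.5%, which is the over-provisioning the announcement is aimed at.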

[67:24] Jonathan: I mean, I feel like if you're running a SaaS workload at this point, you typically have enough telemetry data that you can schedule that yourself. If you know that at 8:00 AM all your users start using the system, then at 7:30 you can spin up some boxes. The need for instant on-demand capacity at a Kubernetes node level feels rare to me unless you're doing something like agentic training where you're like, oh, we need to have a bunch of workers. So I imagine it's tied to some of that. And then I can see the benefit to this inside of Cloud Run, because that's really how you run Lambda on Google, but then you are dealing with a bunch of overhead of Cloud Run that I don't think applies to this use case either. So I don't fully get who cares, other than the people who are doing, you know, things with, uh, AI GPUs. And I see accelerator provisioning is live right now for workloads running GKE Autopilot, including Autopilot workloads running inside Standard clusters, using the following hardware: the NVIDIA L4, the A100, the RTX Pro 6000, and the H100. So this is definitely directed at people doing AI things.

[68:30] Justin Brodley: Yeah, it is. But looking at the actual options—

[68:33] Justin: Yeah, so real-time AI feedback.

[68:36] Jonathan: Yeah. AlloyDB now supports PostgreSQL 18 in general availability, bringing features like B-tree skip scans, parallel GIN index builds, native UUIDv7 support, and virtual generated columns to Google's managed PostgreSQL service. Google is introducing extended support for older AlloyDB major versions, giving customers up to 3 years of continued security patches, bug fixes, and SLA coverage beyond community end-of-life dates. Pricing for extended support has not been announced yet, but, uh, I can tell you that this is the thing I hate the most about both Amazon and Google: this extended support, uh, basically a taxing process where they start charging you more money because it's old. It's kind of annoying, and I get why they do it. I mean, you have to maintain test harnesses, all that, but I can't imagine you're doing that much changing to the orchestration layer. That code doesn't have to change. It's really just a way to tax people and make more money on old stuff, in my opinion.
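
For anyone wondering why native UUIDv7 support is on the feature list: a version 7 UUID puts a millisecond timestamp in its leading bits, so newly generated keys sort roughly in creation order, which tends to be friendlier to B-tree indexes than fully random UUIDs. The sketch below builds one by hand following the RFC 9562 bit layout. The `uuid7` helper is written here for illustration and is not PostgreSQL's or AlloyDB's implementation.

```python
import os
import time
import uuid


def uuid7(unix_ms=None):
    """Build a version 7 UUID: 48-bit ms timestamp, then version, variant, random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    rand_a = rand >> 68                            # top 12 bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 bits
    value = (unix_ms & ((1 << 48) - 1)) << 80      # bits 127..80: timestamp
    value |= 0x7 << 76                             # bits 79..76: version 7
    value |= rand_a << 64                          # bits 75..64: random
    value |= 0b10 << 62                            # bits 63..62: RFC variant
    value |= rand_b                                # bits 61..0: random
    return uuid.UUID(int=value)


older = uuid7(unix_ms=1_700_000_000_000)
newer = uuid7(unix_ms=1_700_000_000_001)
print(older, newer, older < newer)
```

Two IDs minted in the same millisecond still differ in their random bits, so ordering within a millisecond is not guaranteed by this sketch.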

[69:33] Justin Brodley: Yeah. Yeah. It's a price increase for, for old things.

[69:37] Justin: It's a price increase because you can't get your development team to prioritize migrations.

[69:43] Jonathan: And then all of a sudden, surprise, when, you know, that date hits in February and all of a sudden their price jumped up 35% and they're like, what the hell happened? And it's like, well, you didn't upgrade to PostgreSQL.

[69:53] Justin: Yeah. And I think that's the real piece here: it gives companies a way to either say do it, or roll your own and spend the money on it, and it becomes a business decision. You know, Microsoft does it with those things, with OSs too.

[70:09] Justin Brodley: Yeah, I mean, I guess it's a motivator to help people keep things up to date, but I don't know, it's great for the enterprise who've got potentially hundreds of engineers who can work on things to do this, but If you're not a large company, then, and you pick something you'd like to, I mean, Postgres is old though. Postgres 18, how old is that, seriously?

[70:32] Jonathan: It's, I don't know. PostgreSQL 18 release date.

[70:36] Justin: PostgreSQL 18, 2025, September 25th, 2025.

[70:41] Justin Brodley: Nah.

[70:42] Jonathan: 18.3 released in this year, just last couple months ago. Okay.

[70:47] Justin Brodley: Well, it's saying support through 2030, so. You know?

[70:49] Jonathan: Yeah. I mean, it's not terrible yet, but it's still, it's like this thing you now have to track and manage. 'Cause if you don't take advantage of it, you're gonna have systems eventually. AlloyDB is not as much my complaint as it really is Cloud SQL, where they do this as well. But again, I get the argument they're making. I just don't think the cost is as high as they imply. And that's where I get frustrated with this tax, and like the whole argument they made early on was like, move all your old crappy stuff to cloud because it's now supported and you don't have to worry about this old thing in your data center dying, and then like, oh, well you did that and now we're gonna screw you over by charging you way more money. That's where I'm kind of like, you've changed the terms of the agreement and that's what I don't like.

[71:30] Justin Brodley: Yeah. Yeah, that's fair.

[71:31] Jonathan: All right, let's move on to Azure. So we're only keeping this story because Matt said we had to, 'cause it was the only real good infrastructure story here. Azure SQL Database now supports soft delete retention for logical servers in preview, allowing deleted servers to be recovered within a configurable window of 1 to 7 days. This addresses a longstanding gap where accidental deletion of a logical server meant permanent loss of the server configuration and all associated databases. So Matt, I have to ask you, what server did you delete that you really wish you had this feature?

[72:03] Justin: Luckily it was a dev server, so it was fine. You know, but you know, it is a nice feature. They have it for databases.

[72:12] Justin Brodley: This.

[72:12] Justin: They don't have it for elastic pools because that's just a construct for compute and memory, but they had it at the lower level of each database. So to me, it kind of made sense to have it at the higher level too. And they have it for Key Vault and things like that, you know, the same way AWS has it for Secrets Manager, where you can delete and there's a recovery period. And given that this is where all your data is, it feels like a place where this should have existed already.
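
The recovery-period behavior being discussed can be sketched as a tiny in-memory registry. This is only an illustration of soft delete with a retention window; the `SoftDeleteRegistry` class and its methods are hypothetical and do not mirror the Azure SQL API. The 1 to 7 day range is the one quoted in the story.

```python
from datetime import datetime, timedelta


class SoftDeleteRegistry:
    """Toy registry: deleted servers stay recoverable until the retention window ends."""

    def __init__(self, retention_days=7):
        if not 1 <= retention_days <= 7:
            raise ValueError("retention_days must be between 1 and 7")
        self.retention = timedelta(days=retention_days)
        self.active = {}    # name -> config
        self.deleted = {}   # name -> (config, deleted_at)

    def create(self, name, config):
        self.active[name] = config

    def delete(self, name, now):
        # Move the server aside instead of destroying it.
        self.deleted[name] = (self.active.pop(name), now)

    def restore(self, name, now):
        config, deleted_at = self.deleted.pop(name)
        if now - deleted_at > self.retention:
            # Past the window: treat it as permanently gone.
            raise LookupError(f"{name} is past its retention window")
        self.active[name] = config
        return config


registry = SoftDeleteRegistry(retention_days=3)
registry.create("dev-sql", {"databases": ["orders"]})
t0 = datetime(2026, 5, 1)
registry.delete("dev-sql", now=t0)
restored = registry.restore("dev-sql", now=t0 + timedelta(days=2))
print(restored)
```

Passing `now` explicitly keeps the example deterministic; a real service would use its own clock and a background purge job.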

[72:40] Jonathan: Okay. Do we have that on Amazon or Google where it soft deletes? Or is this an Azure uniqueness? 'Cause I, I'm pretty sure I've deleted databases on RDS and they've been gone forever, so.

[72:52] Justin: Oh yeah.

[72:53] Justin Brodley: Yeah.

[72:54] Justin: Well, on the UI you have to check that box, which always confuses me. And I've definitely deleted something in, in Azure or in AWS too in my career. So, you know.

[73:03] Jonathan: Yeah. I think everyone's done that at least once. Terraform destroy is not something you actually want to run unless you truly want to destroy it, because it's not what you think it is. Microsoft Principal Product Manager Mads Kristensen announces that Copilot subagents are coming soon to Visual Studio, bringing a feature that has been available in VS Code since around GitHub Universe 2025 to the full Windows IDE. A subagent is an independent AI agent that handles a focused task, such as auditing config files or reviewing test coverage, and returns only a summary to the main agent, which helps manage context window consumption in large projects like complex .NET solutions. I mean, I guess I'm happy that you finally got this. It's been in Claude and all the other tools for a while, so congrats that you finally got what everyone else has. So you're welcome. And then another one that we had to talk about was Azure is finally letting you migrate VMs out of availability sets into virtual machine scale sets without nuking and rebuilding your workload. This has been a longstanding pain point for anyone who stood up infrastructure before VMSS Flex was the recommended pattern. The scale ceiling alone is worth paying attention to, as availability sets cap out at 200 VMs. VMSS Flex goes all the way up to 1,000, and you get auto scaling, rolling upgrades, and zone-level resiliency that availability sets never had. The migration is VM by VM and cancelable at any point, which is the right call for production workloads, and you can validate each machine before moving on. And anything that hasn't been migrated stays put in the original set if you bail. Does the Load Balancer route between a scale set and this at the same time, or is that an oversight on their part?

[74:32] Justin: No, I think it will because that's done. The scale set will attach to the Load Balancer. I haven't played with availability sets because I luckily joined Azure after that was really their preferred path. But I know that this was a pain point for people, and what most people do is just delete and recreate, take the maintenance window, do it during, you know, a normal upgrade, which is what I've seen people do.

[74:59] Jonathan: So I mean, like on the Amazon side, we had launch sets and I forget the name of the other one. And I also don't remember which one is—

[75:05] Justin: Launch templates.

[75:06] Jonathan: Templates. I don't remember which one is the one that you're supposed to use anymore. 'Cause I, I never remember this.

[75:10] Justin: Launch configuration and launch templates. And they moved from configurations to templates. I'm, I'm gonna say 80% positive.

[75:17] Jonathan: I got it correct. That sounds correct. I think you're right. Uh, but like the, and the, it was a breaking change, but like you just spun up a new launch template. And then you attach it to the Load Balancer group, and then you just spun up the new VMs and you spun down the old ones, and then you were done. It wasn't hard. I don't understand why, like, people— is this just a Microsoft thing where Microsoft admins are just like, this is hard, I don't want to do this?

[75:39] Justin: Yes.

[75:39] Jonathan: Is that why this exists? Okay, that's, that's clear.

[75:42] Justin: That's at least my understanding of it. I mean, I haven't done this specific migration, but that's the way I feel like a lot of things are.

[75:48] Jonathan: They—

[75:49] Justin: people like to have it, and my favorite is they'll say Hey, this is gonna be EOL in 2027, in January 1st, 2027. And October 2026, they released a migration path between the two. It's like, so people obviously have figured out how to do this migration for years. And my assumption is something's going EOL and they're trying to force people over it. It's the lower-end people that aren't as cloud-native that might, that need that assistance.

[76:17] Jonathan: Okay, that's fair. I mean, Amazon still supports both the old model and the new model, which is part of the reason why I'm confused which one is which. And also your Terraform, if you're using agentic AI, will sometimes do the wrong one. So that's something to watch out for if you're using an agent that doesn't have the proper Amazon MCP attached, perhaps, to avoid that. Well, that's a fantastic show, guys. So thank you very much. We'll see you, I guess, next week.

[76:43] Justin Brodley: Yeah, it's good.

[76:44] Jonathan: Or stick around for an after show, you know, you choose.

[76:46] Justin Brodley: See you later, guys.

[76:47] Justin: Bye.

[76:48] Jonathan: See ya.

[76:49] Justin: Another week of cloud news wrapped up. Bolt will collect the news, Justin will get the notes, Jonathan will write some code, Ryan will watch the perimeter, and Matt will reluctantly watch Azure. Till next week for AI, Amazon, Google Cloud, and Azure, and hey, maybe even Oracle, who knows? Check out thecloudpod.net for our newsletter, join our Slack, message us on socials, or leave a review.

[77:21] Jonathan: So, um, this week Google had one of their big events and I think this is the Android event, is it not? Google I/O, is that what happened? I'm not up to date enough on Google. You're the Android guy, Jonathan, you tell me.

[77:31] Justin Brodley: I, I have no idea. I, I, after, after the last Google thing I watched with, uh, which, which late, which late night TV show was it?

[77:38] Jonathan: Host, uh, it looks like Google I/O is next week actually.

[77:41] Justin Brodley: I haven't, I just don't pay attention to that anymore. That was just an embarrassing train wreck. So I pretend I've got an iPhone now.

[77:50] Jonathan: Well, I know Google announced a ton of features this week in advance of Google I/O. I guess that's kind of weird. I don't know exactly why they're doing this, but they basically announced a new Google Book designed for Gemini intelligence, apparently not learning from Surface and the Copilot Surface that no one bought. Uh, so Google is trying now. This new laptop category merges Android and Chrome OS into a single platform, with devices expected from Acer, Asus, Dell, HP, and Lenovo this fall. No pricing, of course, announced. The Magic Pointer feature developed with Google DeepMind adds contextual Gemini suggestions directly to the cursor, allowing users to interact with on-screen content like dates, images, and text without switching applications. A new Create Your Widget feature lets users generate custom desktop widgets through natural language prompts, pulling data from Gmail, Calendar, and web searches into a single personalized dashboard. Quick Access enables direct browsing of phone files from the Google Book File Manager without manual transfers, and Android phone apps can be used inline on the laptop without leaving the current workflow. This announcement is primarily a consumer hardware story, but we're still covering it here in the After Show. But, uh, I mean, none of these features sound great. The create your, the magic pointer feature. Yeah. Get your AI out of my pointer.

[78:58] Justin Brodley: The whole thing, the whole of Aluminum OS is designed to be an AI-centric operating system.

[79:05] Jonathan: I mean, no one seems to love Copilot on Windows, so like, why do they think they're going to be better at this than others?

[79:09] Justin Brodley: Well, I mean, they invented Transformers. They are ahead of the game.

[79:17] Jonathan: I mean, the idea that you can run Android apps on the device is kind of nice, but I mean, Mac does that. You can run Mac, iPad, and iOS apps on your Mac if the developers allowed you to do that. I do it. For some apps. So I mean, like, it's— in some ways it's like, okay, you're just doing some feature parity stuff, but I just don't— I don't know. Like, I don't know who the audience for this is. It's definitely not me. A Chromebook is not me either. So I mean, take that.

[79:40] Justin Brodley: Yeah, no, I think— I think this is— this is the start of Google going after, um, Apple, honestly. That's the way that—

[79:46] Jonathan: I mean, like, you can buy the new MacBook Neo for how much? Like $699? Like, it's some crazy price, and it's a full MacBook. I mean, yes, the specs are horrendous.

[79:56] Justin Brodley: Like 8 gig in an iPhone CPU, I don't know, that doesn't sound—

[79:59] Jonathan: I mean, the iPhone CPU is pretty darn good, so I don't know. Yeah, $599, uh, for the start on the Hello Neo. So I mean, like, if this thing is not $599 or less, I think it— I don't know how you compete.

[80:12] Justin Brodley: It won't be $599, not the Google-branded ones. I mean, the Pixelbook Chromebook brand was the premium brand, and they let the Acers make the cheaper versions. But I'm imagining it's gonna be like $1,299 or something like that, I would imagine.

[80:27] Justin: I mean, there was an old Motorola phone back in the day that had a dock that you could plug your phone into and it would load up like a mini OS. I think my Pixel does it now, like where if I connect a mouse and keyboard it becomes like a mini Android OS. I've always wanted something like that. I already carry this supercomputer in my pocket, and I want to use it more for my day-to-day stuff. But then again, I know every time I try to use it, like, it doesn't work in my workflow ever. So maybe something like this would, but I'm probably not gonna spend $1,200 and have to carry around another device. At that point, I'd rather buy a new Mac and have a much more powerful machine.

[81:11] Justin Brodley: Yeah, I kind of wonder who the market is for this, because it's not gonna be, it's not gonna be schools, because schools won't afford it. They've, they've got their cheap Chrome, their cheap Dell Chromebooks. Is it, is it people going to college? Is it somebody else? I don't know. It's a weird place in the market to be targeting. I don't know if there's a demographic of people who just want an AI-integrated Chromebook that happens to run Android apps. I mean, I like the idea of sort of more cohesiveness between desktop and mobile devices. Because Chromebook, I don't know, ChromeOS is kind of, it works. I've never seen it crash, but it's just not thrilling. You know, it's very limited what you can do, and that's part of why it's so reliable, I guess. I'm just not quite sure who's going to be the audience for this because it's not going to be me.

[81:55] Jonathan: It's not going to be me either, but I'm intrigued by the idea of it. But yeah, I don't know. All right, gentlemen. Well, uh, we'll see if the Google Book breaks any, uh, you know, maybe 6 months from now we're like, we want a Google Book. But I don't think that's gonna be the case. Wow.

[82:12] Justin Brodley: We should start going after vendors to send us stuff to review.

[82:16] Jonathan: There you go.

[82:17] Justin Brodley: We've got like 10 or 11 audience members now.

[82:20] Jonathan: Maybe, maybe, uh, we have way more than that. We have lots of downloads. I mean, if you are listening here, you know, we'd love for you to write a review. It's been a little bit since we've asked, but, uh, you know, it is nice if you write an iTunes review or Spotify review, or however you listen to the show, so that we know you're out there. But I mean, I get lots of people pinging me. Uh, people hit our Slack up occasionally and ask us questions. We don't really have a great community on Slack. I mean, we try to engage everybody, but no one ever, like, asks us questions.

[82:48] Justin: We'll, we'll talk to you, we'll give you advice, or we'll point you at Bolt.

[82:52] Jonathan: Yeah. Or Bolt will tell you, you can ask Bolt anything. Yeah. You know, he's, he's there all the time. I should, I should see if anyone's secretly talking to Bolt and never actually thought about it. Let me know if anybody other than Jonathan, Ryan, or Matt talks to Bolt. Let me know. Maybe there'll be insight into what people are interested in. But yeah, no, we always love feedback and we're trying to not have so much AI news all the time, but it's hard to do. I mean, some of the other podcasts have actually rebranded to be AI podcasts. I'm like, I don't know if The AI Pod makes sense.

[83:25] Justin Brodley: The AI Pod, yeah, no, no.

[83:27] Jonathan: Yeah, no, it doesn't roll off the tongue, does it? Yeah, so.

[83:30] Justin Brodley: It doesn't. I don't know, I mean, I don't mind the AI news. Mostly. I think there's, I think there's some, some crap, crap stuff around it.

[83:36] Jonathan: I think I hated it originally and it's kind of grown on me, like, now that it's getting kind of interesting. I think early days of GPT, you're like, this is just a toy. And then the coding tools come out and you're like, okay, this is kind of cool. And now I'm using it every day and I'm like, this is awesome. I love this. And so I'm intrigued when things come out, but also, the amount of vaporware that exists in this industry right now on the AI side is crazy. I mean, the entire Google Next keynote is mostly vaporware. So, you know, things like that are impactful.

[84:02] Justin Brodley: Yeah, it's funny to see, you know, new companies popping up and they're like, you know, AI-driven whatever. And like, well, it's been around for 6 months. Are you really, are you really an established enterprise company? I don't think so. I don't know.

[84:15] Justin: You don't think a 6-month-old company has every enterprise, you know, I'm not going to say every, but any enterprise features at that point? You know, I feel like most of them barely have SSO set up. And that to me is like level 1 of enterprise. That doesn't even mean you have SCIM. It's just, we have SSO integration. We've checked that box. You want to bring your own encryption key? That's not something we can handle.

[84:40] Justin Brodley: Mm-hmm.

[84:42] Jonathan: Yeah. Didn't Werner say to encrypt everything?

[84:45] Justin Brodley: He did. Yeah. Yeah. I think the news stories that annoy me the most at this point, annoy is probably a bit strong, like the, um, the Azure story, you know, migrating this VM into a different type of arbitrary pool, you know, we created this thing that you can put a VM in, and this other thing you can put a VM in, and now we've enabled you to do this thing, and isn't it an amazing feature? Like, no, I really, it shouldn't be, it shouldn't be a thing.

[85:12] Justin: It's a migration tool.

[85:15] Justin Brodley: But you shouldn't need a migration tool. The service should be able to evolve underneath you. Like, if they want to add those features, add the features.

[85:23] Justin: But it's a completely different, I mean, in this case, it's a completely different service that you're migrating from one to the other. You know, go back to the AWS example. It's like if they migrated you from launch templates to, oh wait, other way around, launch configuration to launch templates, those at least had feature parity and you were growing on one. I don't know that every feature in availability sets also existed in launch template, or sorry, in scale sets. That's now mixing.

[85:52] Jonathan: That is the truth about the Amazon one. There were things that you used to have to set in different ways. And then when you moved over to launch templates, you migrated them over, and then there were new things you had to add in. And that was because things like, you know, different versions of IMDS didn't exist in the old launch configurations. They exist in launch templates.

[86:12] Justin: You couldn't do hard drives and a bunch of stuff at that point.

[86:15] Jonathan: And then an io2 attachment versus non-io2 and, you know, all those things that kind of change over time. Like, you know, it makes sense that it does change. And if you didn't create something in the early days as something dynamic and scalable and changeable, then making a new thing makes sense. But again, I just don't know how hard it is to switch, but here we are.

[86:33] Justin Brodley: It's a problem that didn't have to exist though.

[86:36] Jonathan: It didn't.

[86:37] Justin Brodley: Like, as a customer, I have an intent and I want them to execute on my intent. The fact that they require me to do a launch configuration or a launch template or some other kind of choice or box they're going to put me in, that's just a limitation of their service, but my intent is the same either way. So why do I have to do work?

[87:00] Justin: I wonder if you're also, we're complaining about the tip of the iceberg and how many things have they magically solved for us under the hood that we don't even know, as I mix metaphors in that statement. You know, like how many different things have they just solved and for example, done that migration that you're talking about? Moved off of the old Intel processor to a new Intel processor 'cause they wanted to decommission the server, you know, that rack. Like maybe they just do all of it for us. I'm not sure.

[87:30] Justin Brodley: Well, I mean, that's a good example. You know, Amazon didn't live migrate VMs off of hosts. You used to get a maintenance notice and nobody ever paid attention to them because it was very obscure. And then your machine would go down and there'd be an outage and we're like, oh yes, we really should do something with these maintenance notices. But other clouds would live migrate things. And it's just like lots of the examples of things where their choices just lead to complication. I don't wanna have to worry about VMs. I don't wanna have to worry about these logical...

[88:00] Justin: You can go completely serverless.

[88:01] Justin Brodley: Services and constructs and everything else. Yeah, I mean, but I think AI though, to circle back to the thing that we're rebranding to.

[88:10] Justin: We take REST, Docker.

[88:12] Justin Brodley: I think with AI though, you will be in a position to specify your intent without the implementation. And as long as the intent stays the same and the outcomes stay the same, how it's implemented in the middle is becoming less and less important. So I think there'll be many less of these stories about you can do this random little thing now, or you can migrate for this thing. I think that a lot of this stuff will just happen seamlessly.

[88:38] Jonathan: Fair.

[88:38] Justin: I mean, I'm just even thinking, I was talking with somebody the other day about ELBs versus ALBs, and I'm like, there are some things that are fundamental changes. I think even with what you're saying, you still would have to say to whatever the AI is, hey, go check to make sure you're doing it the best way and go do this thing, or, you know, put a loop or whatever on, because what is correct today, the best practice and the most logical and cost-effective way today, in a year is most likely not the most cost-effective and best way to do it anymore. Well, and/or you might have business requirements that change, but let's assume that that's not variable. So you need some sort of loop, and then if you are making changes to something, it should go through your proper SDLC for infrastructure changes also.

[89:32] Justin Brodley: Infrastructure changes though. I don't know. Is it infrastructure?

[89:37] Jonathan: So go use Beanstalk.

[89:38] Justin: Go back to Beanstalk.

[89:41] Jonathan: Here's your app code.

[89:42] Justin: Go run it.

[89:43] Jonathan: I'm still horrified that when I talked to the Beanstalk PM, he's like, it's the number one service in all of Amazon. I'm like, what? How is that possible? So we may come back.

[89:51] Justin Brodley: I can see why.

[89:51] Justin: I know a few production customers that use it for production still. I advise one of them.

[89:57] Jonathan: It's just, it's crazy to me.

[89:58] Justin: And it works for him.

[90:00] Jonathan: Yeah. I mean, I guess it's, it's the easy button in a lot of ways. I mean, it's why people like Vercel as well. You know, I don't understand that, but if you're really into doing frontend web design, Vercel does the work. So it is what it is.

[90:13] Justin Brodley: Yeah. Vercel or Cloudflare Workers and things like that. It's just so easy to use. It's, it's what application development and deployment should look like. Mm-hmm.

[90:22] Jonathan: I mean, it's what we always, it's kind of the, where we always wanted, you know, platform services to get to. And now, you know, it's being offered as a service. So it makes sense. All right, gentlemen, I got to run, but good catching up as always.

[90:33] Justin Brodley: Yep. See you later. Bye.