
CloudPro

67 Articles

How to Make Sure Your Kubernetes Sidecar Starts Before the Main App

Shreyans from Packt
16 Jun 2025
Why Automatic Rollbacks Are Risky and Outdated in Modern DevOps

CloudPro #96

Platform Weekly – the world’s largest platform engineering newsletter
With over 100,000 weekly readers, Platform Weekly dives into platform engineering best practices, platform engineering news, and highlights, lessons, and initiatives from the platform engineering community.
Subscribe Now

In this issue:
📌 A hidden prompt injection flaw in GitLab Duo that quietly leaked source code
📌 Just-in-time AWS access using Entra PIM (yes, that’s possible now)
📌 Cloud SQL charging for 2 TB of storage on 6 GB of data, because of WAL logs
📌 Why automatic rollbacks in DevOps might be doing more harm than good

You’ll also find sharp reads on scaling Terraform teams, new volume tools for AI/ML in GKE, and a brutally honest take on Kubernetes complexity. On the observability side, AWS added visual dashboards to Network Firewall, and OpenTelemetry clarified how to treat logs vs. events. Hope you find something that helps you ship safer, smarter, or faster.

Cheers,
Shreyans Singh
Editor-in-Chief

PS: If you’re not already reading Platform Weekly, I’d recommend it. It’s one of the few newsletters I make time for every week: focused on platform engineering, cloud native, and the kind of problems teams actually face. 100,000+ people read it, but it still feels like it’s written by someone who gets it. Here’s the link if you want to check it out.
Subscribe Now

🔐 Cloud Security

Just-in-time Access to AWS with Entra PIM
Just-in-time privileged access can be implemented by integrating Microsoft Entra PIM with AWS IAM Identity Center using SCIM/SAML, enabling temporary group-based access tied to approval workflows and time limits. By mapping Entra security groups to AWS permission sets (e.g.
EC2AdminAccess) and enabling eligibility/activation in PIM, users gain access only when approved, and only for a set duration.

On-Demand Rotation Now Available for KMS Imported Keys
AWS KMS now lets you rotate imported symmetric key material on demand without creating a new key or changing its ARN, simplifying compliance and security by avoiding workload disruptions. New API operations, including RotateKeyOnDemand and KeyMaterialId tracking, let you import, rotate, audit, expire, or delete individual key versions while retaining decryption access to older ciphertext.

CloudRec: a multi-cloud security posture management (CSPM) platform
CloudRec is an open-source, scalable CSPM platform that continuously discovers 30+ cloud services across AWS, GCP, Alibaba Cloud, and more, offering real-time risk detection and remediation. It uses OPA-based declarative policy management, enabling dynamic, flexible rule definitions without code changes or redeployment.

How to use the new AWS Secrets Manager Cost Allocation Tags feature
AWS Secrets Manager now supports cost allocation tags, letting you tag each secret (e.g., with CostCenter) and track its costs in Cost Explorer or cost-and-usage reports. Enable the tags under Billing → Cost Allocation Tags, then filter or group secrets costs by tag to see spend per department or project.

GitLab Duo Prompt Injection Leads to Code and Data Exposure
A hidden prompt injection flaw in GitLab Duo allowed attackers to embed secret instructions, camouflaged in comments, code, or MR descriptions, that triggered the AI assistant into revealing private source code. The attacker leveraged streaming markdown rendering and HTML injection (such as <img> tags) to exfiltrate stolen code via base64-encoded payloads.
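The write-up doesn’t include GitLab’s detection logic, so as a purely illustrative defensive layer: a reviewer-side scan for invisible Unicode characters and encoded exfiltration blobs in MR text might look like this (the function name and thresholds are made up for the sketch):

```python
import base64
import re

# Zero-width and bidi-control characters often used to hide prompt-injection
# payloads from human reviewers.
HIDDEN_CHARS = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff\u202a-\u202e]")
# Long base64-looking runs are a common exfiltration wrapper.
B64_BLOB = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")

def suspicious_spans(text: str) -> list[str]:
    """Return reasons an MR description or comment looks suspicious."""
    reasons = []
    if HIDDEN_CHARS.search(text):
        reasons.append("contains zero-width/bidi control characters")
    for blob in B64_BLOB.findall(text):
        try:
            decoded = base64.b64decode(blob + "=" * (-len(blob) % 4))
        except Exception:
            continue
        if b"http" in decoded or b"secret" in decoded.lower():
            reasons.append("base64 blob decodes to URL/secret-like content")
    return reasons
```

A check like this is no substitute for the server-side fix GitLab shipped; it only flags the two camouflage tricks the incident report highlights.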
GitLab patched the vulnerability in February 2025, blocking unsafe HTML elements and tightening input handling.

⚙️ Infrastructure & DevOps

Amazon API Gateway introduces routing rules for REST APIs
Amazon API Gateway now supports routing rules for REST APIs on custom domains, allowing dynamic routing based on HTTP headers, URL paths, or both. This enables direct A/B testing, API versioning, and backend selection, removing the need for proxies or complex URL structures.

Amazon EC2 now lets you delete underlying EBS snapshots when deregistering AMIs
Previously, snapshots had to be removed separately, often leaving orphaned volumes and wasted spend. Now, EC2 can automatically delete the associated EBS snapshots when you deregister an AMI, cutting down on manual cleanup and storage costs. The update adds no extra cost and is available across all AWS regions.

Why is your Google Cloud SQL bill so high?
A developer discovered that their Cloud SQL instance showed 2 TB of usage for only 6 GB of actual data, due to Write-Ahead Logs (WAL) retained for Point-in-Time Recovery. These logs can silently bloat storage costs when transactions are frequent. To control costs, reduce WAL retention or re-provision instances with right-sized storage.

Why Automatic Rollbacks Are Risky and Outdated in Modern DevOps
Automatic rollbacks seem helpful but often fail for the same reasons the deployment broke, such as expired credentials or partial database changes. Modern practices like Continuous Delivery and progressive deployment (canary, blue/green, feature flags) offer safer, faster recovery paths. Human oversight adds resilience and learning, making manual intervention more effective than rollback automation.

How to structure Terraform deployments at scale
At scale, Terraform deployments require a clear structure that balances central control with team autonomy.
Scalr’s two-level hierarchy of Account and Environment scopes lets central DevOps manage policies and modules while engineers deploy independently within isolated workspaces. This setup encourages reusable code and standardization through a shared module registry.

📦 Kubernetes & Cloud Native

Making Kubernetes Event Management Easier with Custom Aggregation
As Kubernetes clusters grow, managing events becomes harder due to high volume, short retention, and poor correlation. This article shows how to build a custom event system that groups related events, stores them longer, and spots patterns, helping teams debug issues faster. It uses Go to watch, process, and store events, and includes options for alerts and pattern detection.

GKE Volume Populator Simplifies AI/ML Data Transfers in Kubernetes
Google Cloud’s new GKE Volume Populator helps AI/ML teams automatically move data from Cloud Storage to fast local storage such as Hyperdisk ML, with no custom workflows needed. It uses Kubernetes-native PVCs and CSI drivers to manage transfers, delays pod scheduling until data is ready, and supports fine-grained access control.

How to Make Sure Your Kubernetes Sidecar Starts Before the Main App
If your app depends on a sidecar, Kubernetes doesn’t guarantee the sidecar is fully ready before the main container starts, even with the new native sidecar support. This article shows how to delay the app’s start using a startupProbe or postStart hook on the sidecar, so the app waits until the sidecar is actually ready, avoiding startup errors without code changes.

Not every problem needs Kubernetes
Kubernetes promises scalability and flexibility, but for most teams it adds unnecessary complexity. Many workloads can be handled more easily with VMs, managed cloud services, or simpler container platforms like AWS Fargate or Google Cloud Run.
Unless you truly need hybrid cloud, global scale, or hundreds of services, Kubernetes may just slow you down and drain resources.

What You Actually Need for Kubernetes in Production
Production Kubernetes setups need more than working clusters. Use readiness, liveness, and startup probes correctly to avoid early traffic issues or restarts. Always define CPU and memory limits, isolate secrets using volumes, and enforce RBAC with least privilege. Use HPA for scaling, avoid local storage, and apply network policies to control traffic. Tools like kube-bench, Trivy, and Fluent Bit help monitor security, cost, and logs effectively.

🔍 Observability & SRE

AWS Network Firewall launches new monitoring dashboard
AWS Network Firewall now includes a monitoring dashboard that shows key traffic patterns: top flows, TLS SNI, HTTP host headers, long-lived TCP flows, and failed handshakes. This helps teams troubleshoot issues and spot security concerns faster. It’s available in all supported regions at no extra firewall cost, but requires Flow and Alert logs to be configured.

Official RCA for SentinelOne Global Service Interruption
SentinelOne’s May 29 global service outage was caused by a software flaw in a deprecated infrastructure control system, which accidentally deleted critical network routes. This broke internal connectivity, taking down management consoles and related services. While customer endpoints stayed protected, teams lost visibility and control during the incident.

There’s a Lot of Bad Telemetry Out There
Much of today’s telemetry is noisy, irrelevant, or misleading, causing higher costs, slow troubleshooting, and poor decisions. Common problems include incomplete traces, outdated metrics, irrelevant logs, and data overload. Engineers often lack clear standards or guidance on good telemetry, especially for newer systems like LLMs. To fix this, teams should define what’s useful, apply consistent conventions (e.g.
OpenTelemetry), and work closely with devs to improve instrumentation at the source.

OpenTelemetry Clarifies Its Approach to Logs and Events
OpenTelemetry treats logs as structured records sent through its Logs API, with a special focus on events: logs with a defined schema and guaranteed structure. Events are preferred for new instrumentation, as they integrate with context and can correlate with traces and metrics. Unlike spans, events have no duration or hierarchy. OpenTelemetry recommends using logs mainly for bridging existing systems, while semantic instrumentation should rely on events for consistency and context sharing.

Storing all of your observability signals in one place matters!
Treating traces, logs, and metrics as separate “pillars” creates silos and hinders correlation. Many teams still split signals across tools or vendors, leading to fragmented insights and painful debugging. A centralized “single pane of glass” setup helps correlate signals in one place, making it easier to understand system behavior.

Forward to a Friend

📢 If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day!


Uber built a multi-cloud secrets platform to prevent leaks and automate security at scale

Shreyans from Packt
09 Jun 2025
How to Block Up to 95% of Attacks Using AWS WAF

CloudPro #95

A better way to handle vendor security reviews?
If you’ve ever dealt with vendor onboarding or third-party cloud audits, you know how painful it can be: long email chains, stale spreadsheets, and questionnaires that don’t reflect what’s actually happening in the cloud. We recently came across CloudVRM, and it’s a refreshingly modern take on the problem. Instead of asking vendors to fill out forms or send evidence, CloudVRM connects directly to their AWS, Azure, or GCP environments. It pulls real-time telemetry every 24 hours, flags misconfigs, and maps everything to compliance frameworks like SOC 2, ISO 27001, and DORA. It’s already being used by banks and infra-heavy orgs to speed up vendor approvals by 85% and reduce audit overhead by 90%. Worth checking out if you’re building or maintaining systems in regulated environments, or just tired of spreadsheet security.
Watch the demo

This week’s CloudPro kicks off with something genuinely useful: a tool that replaces vendor security questionnaires with real-time cloud evidence.
📌 CloudVRM connects directly to AWS, Azure, or GCP and auto-checks compliance: no spreadsheets, no guesswork
📌 AWS CloudTrail silently skips logs if IAM policies get too large (and attackers know it)
📌 PumaBot is now brute-forcing IoT cameras and stealing SSH creds

We’ve also got sharp engineering writeups: how Uber rotates 20K secrets a month, how Netflix handles 140 million hours of viewing data daily, and one team’s story of slicing a $10K Glue bill down to $400 with Airflow. Hope you find something in here that saves you time, money, or migraines.

Cheers,
Shreyans Singh
Editor-in-Chief

🔐 Cloud Security

AWS CloudTrail logging can be bypassed using oversized IAM policies
Researchers at Permiso Security found that AWS CloudTrail fails to log IAM policies between 102,401 and 131,072 characters when they’re inflated with whitespace.
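To make the reported size window concrete, here is a small sketch (not a call against any live API; the function name is made up) that pads a policy document past the 102,400-character point CloudTrail logs, while staying under IAM’s 131,072-character hard limit, the range the researchers describe:

```python
import json

# Boundaries reported in the Permiso write-up: policies longer than LOG_LIMIT
# characters are omitted from CloudTrail events, while IAM still accepts
# policies up to IAM_LIMIT characters.
LOG_LIMIT = 102_400
IAM_LIMIT = 131_072

def pad_into_blind_spot(policy: dict) -> str:
    """Whitespace-pad a policy document into the unlogged-but-accepted range."""
    text = json.dumps(policy, indent=2)
    padded = text + " " * max(0, LOG_LIMIT + 1 - len(text))
    if len(padded) > IAM_LIMIT:
        raise ValueError("policy would exceed IAM's hard size limit")
    return padded  # still parses as the same JSON document
```

The takeaway for defenders is the inverse check: alert on IAM policy writes whose document length falls in that range, since the policy body will be missing from the audit trail.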
This gap allows attackers to hide malicious changes from audit logs. The issue stems from undocumented size limits and inconsistent handling of policy data. AWS has acknowledged the problem and plans a fix in Q3 2025.

PumaBot targets Linux-based IoT surveillance devices via SSH brute force
A new botnet called PumaBot is targeting IoT surveillance systems by brute-forcing SSH access using IP lists from its command-and-control server. Written in Go, the malware disguises itself as system files, adds persistence through systemd, and installs custom PAM modules to steal credentials. Related binaries in the campaign also auto-update, spread across Linux systems, and exfiltrate login data.

How to Block Up to 95% of Attacks Using AWS WAF
This guide explains how to configure AWS Web Application Firewall (WAF) to block threats like SQL injection, XSS, bots, and DDoS attacks with minimal effort. By leveraging pre-built managed rules and setting up a Web ACL, you can protect apps behind ALB, CloudFront, or API Gateway without custom code.

CloudPEASS: a toolkit to find and exploit cloud permissions across AWS, Azure, and GCP
CloudPEASS helps red teamers and defenders map out permissions in compromised cloud accounts without modifying resources. It supports AWS, Azure, and GCP, detecting privilege-escalation paths using API access, brute-force permission testing, and AI-assisted analysis. It also checks Microsoft 365 services in Azure and enables Gmail/Drive token access in GCP.

Uber built a multi-cloud secrets platform to prevent leaks and automate security at scale
To manage over 150,000 secrets across services and vendors, Uber developed a centralized secrets management platform. It blocks leaks in code with Git hooks, scans systems in real time, and has consolidated 25 vaults into 6. The platform enables auto-rotation, access tracking, and third-party secret exchange via SSX.
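Uber’s actual hook implementation isn’t shown in the summary; as a minimal sketch of the Git-hook idea (the patterns and names here are illustrative, and real scanners use far larger rule sets plus entropy checks):

```python
import re

# A few well-known credential shapes a pre-commit hook might reject.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "generic_token": re.compile(
        r"(?i)\b(?:api|secret)[_-]?key\s*[:=]\s*['\"][^'\"]{16,}['\"]"
    ),
}

def scan_diff(diff_text: str) -> list[tuple[str, str]]:
    """Return (rule_name, matched_text) pairs found in a staged diff."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.findall(diff_text):
            hits.append((name, match))
    return hits
```

Wired into a pre-commit hook, a non-empty result would abort the commit, which is the “block leaks in code” layer the Uber item describes.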
The platform now rotates ~20,000 secrets monthly and is evolving toward secretless auth and workload identity federation.

⚙️ Infrastructure & DevOps

AWS Cost Explorer now offers a new Cost Comparison feature
AWS launched a new Cost Comparison feature in Cost Explorer that highlights key changes in cloud spend between two months. It automatically identifies top cost drivers, like usage shifts, discounts, or refunds, without manual spreadsheets. A new “Top Trends” widget shows the biggest changes at a glance, and deeper insights are available through the Compare view.

Go-based Git Add Interactive tool adds advanced staging and patch filtering
This Go port of git add -i/-p enhances Git’s interactive staging with features like global regex filters, auto-hunk splitting, and multi-mode patch operations (stage, reset, checkout). It supports keyboard shortcuts, a color-coded UI, and fine-grained hunk control across all files.

GitLab-based monorepo streamlines Terraform module versioning and security
This setup uses a GitLab CI pipeline to manage Terraform modules in a monorepo, with automated versioning, linting, and security scans via tools like TFLint, tfsec, and Checkov. Git tags handle module versions without extra auth tokens. The workflow enforces changelogs, labels, and approvals, and publishes docs and tags post-merge.

A fully automated fix for Terraform’s backend bootstrapping problem on Azure
This guide solves the common chicken-and-egg issue where Terraform needs a backend to store state but can’t create one without an existing backend. It automates the creation of an Azure Blob backend using Terraform itself, then seamlessly switches to that backend by generating partial config files and migrating state.
The setup includes secure access via managed identity and GitHub OIDC, enabling CI/CD workflows without manual secrets or scripts.

Using Terraform to automate disaster recovery infrastructure and failovers
This post explains DR strategies like Pilot Light and Active/Passive, and shows how Terraform enables flexible, cost-efficient deployments using conditionals and modular IaC. A working AWS example demonstrates DNS failover and dynamic EC2 provisioning using a toggle variable. This lets teams switch between production and DR environments with minimal effort, reducing downtime and idle resource costs.

📦 Kubernetes & Cloud Native

Gateway API v1.3.0 Adds Smart Mirroring and New Experimental Controls
Gateway API v1.3.0 is now GA with percentage-based request mirroring, letting teams test blue-green deployments without full traffic duplication. The release also debuts experimental support for CORS filters, retry budgets, and listener merging via new X-prefixed APIs. These features help fine-tune request handling, scale listener configs across namespaces, and manage retry spikes, without upgrading Kubernetes itself.

Introducing Gateway API Inference Extension
The new Gateway API Inference Extension introduces model-aware routing for GenAI and LLM services running on Kubernetes. It adds InferenceModel and InferencePool resources to better match requests with the right GPU-backed model server based on real-time load. Early benchmarks show reduced latency under heavy traffic compared to standard Services, helping ops teams optimize resource usage and avoid contention.

Deep Dive into VPA 1.3.0: Smarter Resource Tuning for Kubernetes Pods
This post explores how the Vertical Pod Autoscaler (VPA) v1.3.0 uses historical and real-time metrics to recommend CPU and memory resource requests.
It focuses on the Recommender component, which aggregates usage into decaying histograms to auto-tune workloads and reduce resource waste.

Default Helm Charts Leave Kubernetes Clusters at Risk
Microsoft researchers warn that many open-source Helm charts deploy with insecure defaults, exposing services like Apache Pinot, Meshery, and Selenium Grid to the internet without proper authentication. These misconfigurations often include LoadBalancers or NodePorts with no access controls, making them easy targets for attackers. Teams should avoid “plug-and-play” setups and review YAML/Helm configs before deploying to production.

Batch Scheduling in Kubernetes: YuniKorn vs Volcano vs Kueue
Kubernetes lacks native support for batch workloads like ML training and ETL jobs, prompting the rise of tools like Apache YuniKorn, Volcano, and Kueue. YuniKorn replaces the default scheduler with strong multi-tenancy support; Volcano focuses on high-performance use cases with gang scheduling; and Kueue integrates natively to manage job queues without altering core scheduling.

🔍 Observability & SRE

What’s new in Grafana v12.0
Grafana v12.0 introduces Git-based dashboard versioning, dynamic layouts, and experimental APIs for managing observability as code. Drilldowns for metrics, logs, and traces are now GA, enabling queryless deep dives across signals. SCIM support simplifies team provisioning, and a new “Recovering” alert state reduces flapping.

Sentry Launches Logs in Open Beta to Boost Debugging Context
Sentry now supports direct log ingestion in open beta, letting developers view application logs alongside errors and traces in a single interface. This integration adds vital context, like retry attempts or upstream responses, to help identify root causes faster without switching tools.

How to use Prometheus to efficiently detect anomalies at scale
Grafana Labs has built and open-sourced an anomaly detection system using only PromQL: no external tools or services required.
It computes dynamic bands using rolling averages, standard deviation, and seasonal patterns, with tunable sensitivity and smoothing to reduce false positives. The framework scales across tenants and works with any Prometheus-compatible backend, making it easy to plug into SLO-based alerts for better incident context.

Beyond API uptime: Modern metrics that matter
Traditional uptime checks fall short in today’s fast-paced environments, where even minor API delays can cause major user churn. Catchpoint’s Internet Performance Monitoring (IPM) combines global synthetic tests, percentile-based metrics, and user-centric objectives to detect slowdowns before they escalate. With features like API-as-code, chaos engineering, and CI/CD integration, IPM helps teams catch latency issues early and simulate real-world failures.

Microservices Monitoring: Metrics, Challenges, and Tools That Matter
Monitoring microservices requires more than uptime checks: it demands insight into latency, throughput, error rates, resource use, and inter-service communication. Tools like Middleware, Prometheus-Grafana, and Dynatrace help track these metrics at scale, support alerting, and simplify root cause analysis.
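Rolling bands like the ones in the Prometheus anomaly item above can be mimicked in a few lines outside PromQL; a toy version of mean ± k·stddev bands (window size and sensitivity are arbitrary here, and the real system also models seasonality):

```python
from statistics import mean, stdev

def anomaly_flags(series: list[float], window: int = 5, k: float = 3.0) -> list[bool]:
    """Flag points outside mean ± k*stddev of the trailing window, the same
    shape of band PromQL builds with avg_over_time and stddev_over_time."""
    flags = []
    for i, value in enumerate(series):
        history = series[max(0, i - window):i]
        if len(history) < 2:
            flags.append(False)  # not enough history to judge
            continue
        mu, sigma = mean(history), stdev(history)
        band = k * max(sigma, 1e-9)  # avoid zero-width bands on flat data
        flags.append(abs(value - mu) > band)
    return flags
```

Running this over a noisy-but-stable series with one spike flags only the spike, which is the false-positive behavior the tunable k is meant to control.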
Best practices for microservices monitoring include centralized logging, distributed tracing, automation, and continuous optimization to maintain performance in complex distributed systems.


Don’t miss this: AI-Powered Platform Engineering workshop (Sept 27)

Shreyans Singh
15 Sep 2025
Hands-on workshop + expert panel. Special offer for CloudPro readers.

CloudPro #107: Special Issue

I’m interrupting our regular newsletter schedule today because something came up that I genuinely think you need to know about. I want to tell you about our event on September 27th that could be a game-changer for how you think about platform engineering.

MongoDB’s Director of Engineering shows you how to build AI-powered developer platforms that actually work at scale.
Exclusive 40% Off for CloudPro Subscribers – Use code CLOUDPRO

Here’s why I’m personally excited about this: we’ve got George Hantzaras from MongoDB leading a 5-hour intensive on AI-Powered Platform Engineering. And when I say intensive, I mean it – this isn’t another surface-level “AI is the future” talk. George is the Director of Engineering at MongoDB, speaks at KubeCon and HashiConf, and he’s going deep into the practical stuff that actually matters.

Agenda for the workshop:
- Self-Service Golden Paths – build workflows that reduce friction while keeping developer flexibility
- Knowledge as a Platform Capability – embed organizational knowledge with AI (RAG, context modeling)
- Intelligent Developer Portals – natural-language interfaces and scaffolding services that understand developer needs
- AI-Driven Operations – anomaly detection, observability, and incident triage beyond traditional monitoring

Why this matters for your daily work: if you’re working with monitoring stacks like Prometheus or Grafana, George’s approach to integrating runbooks, standards, and service catalogs into developer workflows will feel directly applicable.

Our panelists:
- Ajay Chankramath – Founder, Platformetrics
- Dr. Gautham Pallapa – Principal Director, Cloud, Scotiabank
- Max Körbächer – Founder, Liquid Reply
Together, they’ll unpack the real-world challenges and production patterns they’re seeing across industries.

How you’ll leave prepared: George is ending the day with something I’ve never seen at these events – a structured workshop to draft your actual 90-day pilot plan. You’re walking out with a personalized roadmap, not just ideas.

Why this event is different:
- Focuses on implementation, not hype
- Gives you time to go deep (5 hours, not 50 minutes)
- Ends with an actionable plan, not just slide decks
- Exclusive 40% off for CloudPro subscribers (use code CLOUDPRO)

Best,
Shreyans
Editor-in-Chief, CloudPro


How to Build Always-On Applications on Azure

Shreyans from Packt
13 Feb 2026
By Stéphane Eyskens

CloudPro #119

Today’s CloudPro Expert Article comes from Stéphane Eyskens, a Microsoft Azure MVP and seasoned solution architect with over a decade of experience designing enterprise-scale cloud systems. Stéphane is the author of The Azure Cloud Native Architecture Mapbook (2nd Edition), a comprehensive guide featuring over 40 detailed architecture maps that has earned 5.0 stars on Amazon and become an essential resource for cloud architects and platform engineers.

In the article below, Stéphane tackles one of the most challenging aspects of Azure architecture: building truly resilient multi-region systems, with concrete examples using Azure SQL, Cosmos DB, and Azure Storage, complete with code samples and Terraform scripts you can adapt for your own DR testing. Happy reading!

Shreyans Singh
Editor-in-Chief

The Azure Cloud Native Architecture Mapbook, Second Edition
Design and build Azure architectures for infrastructure, applications, data, AI, and security.
Get 40% off the eBook and 20% off the paperback for the next 72 hours.
GET THE BOOK

I like to say that Azure is simple… until you go multi-region. The transition from a well-designed single-region architecture to a truly resilient multi-region setup is where simplicity gives way to nuance. Concepts that seemed abstract (high availability versus disaster recovery, failover semantics, DNS behavior, data replication guarantees) suddenly become very real, very concrete, and sometimes painfully operational.

This article is written for architects and senior platform engineers who already understand the fundamentals but are required to build solutions that must remain available despite regional outages, service failures, or infrastructure-level incidents. The scope is intentionally narrowed to Recovery Time Objective (RTO). Data corruption, ransomware, and backup-based recovery are explicitly out of scope.
Instead, the focus is on how applications and data services behave during live failover scenarios, and how architectural decisions, sometimes subtle ones, can make the difference between a seamless transition and a prolonged outage. Through concrete examples using Azure SQL, Cosmos DB, and Azure Storage, this article explores how replication models, DNS design, private endpoints, and SDK behavior interact at runtime, and what architects must do to ensure their applications remain functional when regions fail.

Rather than focusing on theoretical patterns, the goal here is pragmatic: minimizing downtime and operational friction when things do go wrong. You’ll see diagrams, Terraform and deployment scripts, plus .NET code samples you can adapt for your own DR tests and game days.

Before getting into the details, let’s briefly revisit the difference between high availability (HA) and disaster recovery (DR). HA and DR exist on a spectrum, with increasing levels of resilience depending on the type of failure you want to withstand:

- Application-level failures: In some cases, you may simply want to tolerate application bugs, for example a memory leak introduced by developers. Running multiple instances of the application on separate virtual machines, even on the same physical host, can already prevent a full outage when one instance exhausts its allocated memory. That is, for instance, what you get if you spin up two instances of an Azure App Service within the same zone (no zone redundancy).
- Hardware failures: To handle hardware failures, workloads should be distributed across multiple racks. That is what you get if you host virtual machines in availability sets.
- Data centre–level outages: To withstand more severe incidents, workloads should be spread across multiple data centers, such as by deploying them across multiple availability zones. You can achieve this by turning on zone redundancy on Azure App Service or by using zone-redundant node pools in AKS.
With such a setup, you should survive a local disaster such as fire or flooding.
- Regional outages: Finally, to survive major outages, such as a major earthquake or a country-level power supply issue, workloads must be deployed across geographically distant data centers. You can achieve this by deploying workloads across multiple Azure regions in active/active or active/passive mode.

Looking at Azure SQL

Let’s first analyse the different data replication possibilities with Azure SQL. Table 1 summarizes the different capabilities.

Table 1 – Replication capabilities

We’ll set aside named replicas and geo-restore, as the former does not contribute to disaster recovery and the latter is likely to introduce significant downtime and potential data loss. This leaves geo-replication as the remaining option. As you might have understood by now, using Azure SQL’s built-in capabilities you cannot achieve a full ACTIVE/ACTIVE setup, since Azure SQL doesn’t support multi-region writes. This means that you can only have one read-write region; the secondary region(s) are read-only.

Table 2 outlines the two available geo-replication techniques.

Table 2 – Geo-replication options

Active geo-replication may require updates to connection strings or DNS records to point to the new primary after a failover. That said, the actual impact depends on where the client application is located, as well as how you deploy to both regions. Let’s look at this in more detail.
Figure 1 illustrates an active geo-replication setup between Belgium Central and France Central.

Figure 1 – SQL geo replication with active geo replication

In such a setup, under normal circumstances:

Workloads in the primary region (Belgium Central) can connect to the primary server in read/write mode.

Workloads in the primary region can perform read-only activities against the secondary replica, provided they tolerate the extra latency incurred by the round trip to the remote region (France Central).

Workloads in the secondary region (if any) can perform read-only operations against the read replica with no extra latency.

The configuration shown in Figure 1 supports a database-only failover. Both regions expose private endpoints to both SQL servers and rely on region-scoped DNS zones. Although Private DNS zones are global by design, keeping them regional allows each region to resolve both the primary and secondary servers. This requires four DNS records in total—primary and secondary endpoints registered in each regional zone.

With a single shared DNS zone, this would not be possible: while all four private endpoints could be deployed, only two DNS records would be registered, since the endpoints map to just two FQDNs (primary and secondary). While that approach works, it keeps the regions siloed and prevents any cross-region traffic. From a resilience standpoint, it is preferable to provide as many fallback paths as possible. Moreover, as we will see later with other resources such as Storage Accounts, a single DNS zone would force us to update the DNS records upon failover, causing a brief downtime. Bottom line: using multiple DNS zones prevents issues during failover.

Back to active geo-replication. In case of failover, the SQL servers switch roles: the primary becomes secondary and vice versa.
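The region-scoped zone layout described above (four records across two regional zones) can be sketched as a lookup table. The zone names and private IPs below are hypothetical, and in reality Azure Private DNS performs this resolution, not application code:

```python
# Hypothetical sketch of the region-scoped Private DNS zone layout: each
# regional zone registers BOTH SQL private endpoints, giving four records
# in total. Zone names and IP addresses are illustrative only.

ZONES = {
    "belgiumcentral": {  # zone linked to the Belgium Central VNet
        "primary.database.windows.net": "10.1.0.4",    # local private endpoint
        "secondary.database.windows.net": "10.1.0.5",  # PE to the remote server
    },
    "francecentral": {   # zone linked to the France Central VNet
        "primary.database.windows.net": "10.2.0.4",
        "secondary.database.windows.net": "10.2.0.5",
    },
}

def resolve(region: str, fqdn: str) -> str:
    """Resolve an FQDN through the caller's regional zone."""
    return ZONES[region][fqdn]

# Workloads in either region can reach either server through local endpoints:
print(resolve("belgiumcentral", "secondary.database.windows.net"))  # 10.1.0.5
print(resolve("francecentral", "primary.database.windows.net"))     # 10.2.0.4
```

A single shared zone would collapse this table to one row per FQDN, which is precisely what removes the cross-region fallback paths discussed above.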
Concretely, this means that the connection string primary.database.windows.net targets the read/write region under normal circumstances, but a read-only or unavailable one after failover. Workloads using this connection string would either stop working (if the regional outage persists) or talk to a read-only database instead of a read-write one once the failover completes. Similarly, the connection string secondary.database.windows.net, which usually targets the read-only region, targets the read-write one after failover. Knowing this, a few options exist:

You may choose to fail over everything (database + compute). In that scenario, workloads running in the secondary region can use their default secondary connection string, which will automatically target the new primary after failover. This approach requires the deployment pipeline to be region-aware, detect the target region, and apply the appropriate connection string. When deployed in the primary region, the application should use primary.database.windows.net, while in the secondary region it should already be configured with secondary.database.windows.net. This design eliminates the need for any connection string changes after failover. If your web apps, K8s pods, etc. are already up and running, the only thing you still have to do is route traffic to them. Any other SQL client not running in the secondary region (e.g. on-premises) would have to update its connection string to target the new primary.

You may choose to redeploy the compute infrastructure (web apps, etc.) to the secondary region only in case of a regional outage. This approach is cheaper but risky, as you are not guaranteed to have the available capacity, and it causes significant downtime.
However, such an approach allows you to adjust your pipelines, specify the right connection string, and simply redeploy your infrastructure and/or application package. If you want to deploy the application with the exact same settings in both regions, you'll need to update the connection string used by workloads in the secondary region, since primary.database.windows.net will resolve to an unavailable server after failover. If the original primary later comes back online, it will return as a secondary (read-only) replica, which would not support write operations. You can also make your application failover-aware (**).

Note that you can't simply update DNS (making secondary target primary and vice versa), because the FQDN (primary-or-secondary.database.windows.net) is validated by the target server and the names must match, so redirecting it to a different server would simply fail.

In conclusion, when using active geo-replication, you should make your applications failover-aware (**): pre-provision both connection strings and implement the failover/retry logic in the application code itself. You may wrap your Entity Framework context in a factory to abstract away the retry logic. Given that such contexts typically use a scoped lifetime, you may expect some HTTP requests to fail (in the case of an API), but new instances targeting the right server would ultimately succeed without having to restart the application. You may also use a geo-redundant Azure App Configuration, fail it over alongside SQL, and then switch the primary server connection string after failover.
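As a rough, language-agnostic sketch of that failover-aware pattern: pre-provision both connection strings and fall back when the preferred server is unreachable. The `connect` stub below is hypothetical; in .NET this logic would live inside the Entity Framework context factory mentioned above:

```python
# Sketch of a failover-aware connection factory: try the preferred server,
# fall back to the other geo-replica, and remember which one answered.
# The connect callable is a stand-in for a real database driver.

PRIMARY = "primary.database.windows.net"
SECONDARY = "secondary.database.windows.net"

class FailoverAwareConnectionFactory:
    def __init__(self, connect, order=(PRIMARY, SECONDARY)):
        self._connect = connect      # callable: fqdn -> open connection
        self._order = list(order)    # preferred server first

    def open(self):
        last_error = None
        for fqdn in self._order:
            try:
                conn = self._connect(fqdn)
                # Promote whichever server answered so later calls try it first.
                self._order.sort(key=lambda f: f != fqdn)
                return conn
            except ConnectionError as exc:
                last_error = exc
        raise last_error

# Simulate a failover: the old primary is down, the old secondary took over.
def fake_connect(fqdn):
    if fqdn == PRIMARY:
        raise ConnectionError("region outage")
    return f"connected:{fqdn}"

factory = FailoverAwareConnectionFactory(fake_connect)
print(factory.open())  # connected:secondary.database.windows.net
```

With a scoped-lifetime context, a request that hits the dead server fails once, but subsequent scopes created by the factory already prefer the surviving replica, so no restart is needed.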
The Azure App Configuration SDK allows you to monitor a sentinel key and reload the configuration without having to restart the application.

Read The Full Article by Stephane Here

18 cloud architecture books in one bundle, including AWS for Solutions Architects, Kubernetes for Generative AI Solutions, and more. 2000+ bundles already sold. Get The Bundle at $858 $5.90

48-Hour Flash Sale: 40% off with code FLASH40. Book Your Seat Now

Early Bird Offer LIVE Now: 40% Off With Code EARLY40. Book Your Seat Now

Webinar: How to Build Faster with AI Agents. Save Your Seat

📢 If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us.

If you have any comments or feedback, just reply back to this email.

Thanks for reading and have a great day!

Shreyans from Packt
06 Feb 2026

A blueprint for cyber resilience...

CloudPro #118

Attackers are actively trying to keep you from recovering

In the event of a cyberattack, the cost of downtime is measured not just in financial terms, but in operational disruption and reputational damage. While prevention strategies are crucial, they are not a substitute for a robust recovery plan. Backups alone do not guarantee a clean restoration.

We invite you to our virtual event, Foundations of Cyber Resilience, on 11 February, where we will provide a practical framework for what happens after a breach. You will learn:

Why traditional recovery strategies can fail when they are needed most.

How to detect and eliminate threats within your backups to prevent reinfection.

Key components of a modern, orchestrated, and clean recovery process.

REGISTER NOW

Next week in CloudPro, we're dropping something special: a deep dive from Microsoft Azure MVP Stéphane Eyskens that every cloud architect needs to read. If you've ever wondered why your multi-region Azure setup feels more complex than it should, or if you're still figuring out what actually happens when a region goes down, this one's for you.

Stéphane, author of the 5-star rated Azure Cloud Native Architecture Mapbook, is sharing battle-tested patterns for building truly resilient systems using Azure SQL, Cosmos DB, and Storage. We're talking real code, Terraform scripts, and the kind of insights you only get from years in the trenches. Here's a sneak peek into what's coming...

Cheers,
Shreyans Singh
Editor-in-Chief

How to Build Always-On Applications on Azure
By Stephane Eyskens

A Sneak Peek

Before getting into the details, let's briefly revisit the difference between high availability (HA) and disaster recovery (DR). HA and DR exist on a spectrum, with increasing levels of resilience depending on the type of failure you want to withstand:

Application-level failures: In some cases, you may simply want to tolerate application bugs—for example, a memory leak introduced by developers. Running multiple instances of the application on separate virtual machines, even on the same physical host, can already prevent a full outage when one instance exhausts its allocated memory. That is, for instance, what you get if you spin up two instances of an Azure App Service within the same zone (no zone redundancy).

Hardware failures: To handle hardware failures, workloads should be distributed across multiple racks. That is what you get if you host virtual machines on availability sets.

Data centre–level outages: To withstand more severe incidents, workloads should be spread across multiple data centers, for example by deploying them across multiple availability zones. You can achieve this by turning on zone redundancy on Azure App Service or using zone-redundant node pools in AKS. With such a setup, you should survive a local disaster such as fire or flooding.

Regional outages: Finally, to survive major outages, such as a major earthquake or a country-level power supply issue, workloads must be deployed across multiple Azure regions in active/active or active/passive mode.

Next week, Stéphane walks through exactly how to architect for each scenario, with diagrams, code, and real failover examples you can test yourself. Don't miss it.

Early Bird closes in 72 hours. Last Few Seats At This Price. Book Your Seat Now. Use code EARLYBIRD40 to get 40% Off.

We're running a 5-hour workshop on architecting production-grade GenAI systems on AWS. Hands-on, practical, built for cloud architects and engineers. Here's the deal: most GenAI content is either toy demos that work once or vendor pitches.
This isn't that. We took real production problems (models breaking after launch, RAG pipelines failing silently, agents that cost too much or hallucinate in production) and turned them into architectural patterns using AWS services and real-world trade-offs.

You'll learn how to pick the right AWS model for quality, cost, and latency. You'll build and tune RAG pipelines that don't break when data changes. And you'll understand when to use agents versus when they'll create more problems than they solve.

Early Bird closes in 72 hours. Last Few Seats At This Price. Book Your Seat Now. Use code EARLYBIRD40 to get 40% Off.

Early Bird Offer LIVE Now: Get 40% Off Tickets. Book Your Seat Now. Use code EARLY40 to get 40% Off.

Shreyans from Packt
20 Jan 2026

Your AI data is everywhere. Here’s how to actually see and secure it on Feb 3.

CloudPro #117

Every week, AI helps your team work faster, but it also increases your data's exposure. Files move between new tools, models use sensitive data, and traditional DLP often misses the most important context. On February 3 at 11:00 AM PT, we'll introduce Cyberhaven's data-lineage-powered, unified DSPM and DLP platform. You'll see how one AI-native solution can finally keep up with the way data really moves.

Join us live to see:

The first public demo of our unified AI and data security platform, designed for the challenges of 2026 and beyond, including SaaS sprawl, shadow AI tools, and constantly moving data.

How security teams gain x-ray vision into data usage, so they can spot the risky handful of actions hidden in millions of "normal" events—and stop them in real time, not after the damage is done.

Honest stories from security leaders about where legacy DLP and standalone DSPM fall short, and how they are rethinking data protection by focusing on context instead of fixed rules.

A preview of what's next for DLP, insider risk, AI security, and DSPM from Cyberhaven's product and leadership teams, along with our future investment plans.

Register Now

Don't wait for another AI-related incident to reveal gaps in your data security. Reserve your spot and be among the first to see how a unified DSPM and DLP platform can change how your organization protects its most important data.

The official Kubernetes Dashboard is getting archived after a decade. No active contributors or maintainers left. End of an era for one of the earliest K8s UI projects. Meanwhile, someone trained LLMs on three years of incident postmortems and built systems that predict outages 15-45 minutes before alerts fire.
We're also covering K8s 1.35's in-place pod restarts, why learning Linux primitives makes Kubernetes finally click, and a Palo Alto DoS flaw that crashes firewalls into maintenance mode. Plus: 20+ tools that auto-generate K8s diagrams and a game where you fix 50 broken clusters to learn.

Cheers,
Shreyans Singh
Editor-in-Chief

3 Days Remaining: Book Your Seat Now. Get 30% Off. Use code FINAL30.

This Week in Cloud

Kubernetes 1.35 lets you restart entire pods in-place

K8s 1.35 adds in-place pod restart (alpha, behind the RestartAllContainersOnContainerExits gate), which is huge for AI/ML workloads. Previously, if an init container corrupted the environment or a sidecar failed, you had to delete the entire pod and let the scheduler recreate it: slow and expensive. Now you can trigger a full restart that preserves the pod UID, IP, network namespace, sandbox, volumes, everything except ephemeral containers. All init containers rerun from scratch, giving you a clean state.

Training AI on your incident history predicts outages 15-45 minutes early

Someone trained LLMs on three years of incident postmortems and built systems that predict failures 15-45 minutes before traditional alerts fire. The trick is extracting causal embeddings: not just "symptom and cause are related", but learning the transformation from "what we observed" to "what was actually wrong". They decompose incidents into structured reasoning chains, create separate vector spaces for symptoms/causes/resolutions/precursors, then continuously pattern-match the current system state against historical precursor embeddings.

Every tool that generates Kubernetes architecture diagrams

Huge GitHub repo comparing 20+ tools that generate K8s architecture diagrams from manifests, APIs, Helm charts, etc. KubeDiagrams leads with 47+ resource types supported; it reads from manifests/kustomize/Helm/API, outputs to PNG/SVG/PDF/DOT, and supports namespace/label clustering. Most tools use Python with the Diagrams library; some use Go/TypeScript/Java.
Common pattern: 60% support KIS (Kubernetes Icons Set), 45% do namespace clustering, 95% show Services, 80% show Deployments.

Learn Kubernetes by fixing 50 broken clusters

Open-source, game-based K8s training with 50 progressive challenges across 5 worlds (Core Basics, Deployments, Networking, Storage, Security). Each level breaks something in K8s and you fix it using kubectl. It has real-time monitoring with a "check" command, progressive hints, step-by-step guides, and post-mission debriefs explaining why your fix worked.

Palo Alto patched a DoS flaw that crashes firewalls into maintenance mode

Palo Alto patched CVE-2026-0227 (CVSS 7.7), a DoS vulnerability in PAN-OS firewalls with GlobalProtect enabled that lets unauthenticated attackers crash firewalls into maintenance mode. PoC code already exists and a researcher reported it, though there is no active exploitation yet. It is almost identical to CVE-2024-3393 from late 2024, which was a zero-day.

Early Bird Offer: 40% Off for 72 Hours. Get 40% Off. Use code EARLY40.

Deep Dive

Why you should learn Linux before diving into Kubernetes

Docker didn't invent containers. It wrapped existing Linux features (cgroups, namespaces) that Google had been using for years into a simple interface anyone could use. Every K8s feature relies on Linux primitives: pod isolation uses namespaces (PID, network, mount, user, IPC), resource limits use cgroups, networking uses iptables/nftables for ClusterIP services and NAT, network policies use packet filtering, images use OverlayFS for layered filesystems, and Cilium uses eBPF for high-performance networking instead of iptables. When you create a Pod, you're orchestrating Linux isolation and resource management tools. Understanding namespaces, cgroups, and network filtering makes K8s and Docker click—you realize they're just convenient wrappers over powerful Linux capabilities.
Learn the foundation first; the abstractions make much more sense after.

Auto-comment K8s manifest changes on PRs

A Go tool that receives GitHub webhooks for PRs, auto-discovers ArgoCD apps configured with that repo as a source, generates diffs against live state using the ArgoCD CLI, and comments on PRs with markdown showing what would change. No per-repo configuration needed.

How etcd actually works (and why Kubernetes uses it)

etcd is a strongly consistent distributed key-value store using the Raft consensus algorithm. All writes go through an elected leader, changes replicate to followers, and new elections happen if the leader dies. Production clusters typically run 3 or 5 nodes (odd numbers only, since you need a majority for availability). K8s stores everything under the /registry prefix with naming like /registry/pods/<namespace>/<pod-name>, and uses prefix queries and watch subscriptions for real-time updates. This is how controllers and operators subscribe to resource changes.

Kubernetes Dashboard is being archived after a decade

The official Kubernetes Dashboard project is getting archived, with no active contributors and maintainers running out of time to work on it. Started in 2015 when K8s was still new, it served the community for over a decade, but ecosystem needs have changed significantly. It's the end of an era for one of the earliest K8s UI projects, though it makes sense given how much the tooling landscape has evolved since 2015.

Self-healing infrastructure is running in production right now

Autonomous healing infrastructure isn't science fiction. It's operational in production serving millions of users, and the difference from past attempts is reasoning capability.
The architecture needs four pieces: a decision engine combining rule-based policies with LLM reasoning for edge cases, a safety sandbox that never executes directly in prod (snapshots state, enhanced monitoring, automatic rollback on any degradation), a graduated action library (green/yellow/red based on risk), and a learning loop where every action generates training data to improve confidence scores.
Shreyans from Packt
16 Jan 2026

[New IT leader’s guide] Your blueprint for cloud resilience

CloudPro #116

Attackers are actively trying to keep you from recovering

It's a hard truth, but recent intelligence confirms that cloud-native backup is now a primary target for groups like Storm-0501. To survive these threats, you need more than just infrastructure and data durability. You need a strategy built for an active adversary, one that includes mindset, architecture, and preparation.

Read The Four Levels of Cloud Cyber Resilience: An IT Leader's Guide to learn:

Understand why relying on your cloud provider's "uptime" gives you false confidence against targeted attacks

Uncover the blind spots in your current security stack that will prevent fast recovery when seconds count

Get the blueprint for upleveling your cloud cyber resilience

Make sure your cloud can survive a cyberattack. Read Now

In today's CloudPro, we'll look at self-healing infrastructure that actually works and is already running in production. Grafana is taking a similar approach with AI agents that investigate incidents in 13 minutes instead of hours, which could save your team about $90k/year in senior engineering time. Meanwhile, DORA's latest research on 5,000 tech professionals figured out which AI capabilities actually separate high performers from struggling teams. We've also got the real reason your network automation keeps failing, AWS's new pentesting agent, and why the Orca-Wiz patent war finally ended.

Cheers,
Shreyans Singh
Editor-in-Chief

24 Hours Remaining: Book Your Seat Now. Get 40% Off. Use code FINAL40.

This Week in Cloud

Network automation keeps failing because your data is a mess

Network teams keep kicking off "source of truth" projects to consolidate scattered data, but EMA found these are "long and painful endeavors." The blockers: execs don't get why you need $60k for a database when apps are running fine, your network data lives in spreadsheets and random IPAMs with everyone doing their own thing, and even after you build it, engineers keep making CLI changes that drift everything out of sync.
The fixes are obvious but hard: get exec buy-in, use discovery tools, integrate with everything, and lock down CLI access until people actually trust it more than their spreadsheets.

Kubernetes 1.35 adds structured debugging endpoints

K8s 1.35 enhances z-pages debugging endpoints like /statusz and /flagz with structured JSON responses instead of plain text. Now you can programmatically query component state for automated health checks and better debugging tools without parsing text output. Still alpha and requiring feature gates, but if you're building internal tooling or want to automate component validation, it's worth experimenting with in test environments.

Google wants gRPC as an official MCP transport

The Model Context Protocol uses JSON-RPC, but enterprises running gRPC-based services need transcoding gateways. So Google is working with the MCP community to support gRPC as a pluggable transport directly in the SDK. gRPC gives you binary encoding (10x smaller messages), full-duplex streaming, built-in flow control, mTLS, and method-level authorization. MCP maintainers agreed to support pluggable transports, and Google will contribute a gRPC package soon.

Grafana built AI agents that investigate incidents for you

Grafana's Assistant Investigations deploys specialized AI agents in parallel during incidents. They analyze metrics, logs, traces, and profiles simultaneously to build a comprehensive picture in 13 minutes instead of the 2-4 hours a human takes. Real example: for a payment-service latency issue, it detected connection pool exhaustion and traced it to a recent deployment in minutes, with zero PromQL knowledge needed. A conservative estimate puts the savings at 50 hours/month of senior engineering time, or $90k/year in reclaimed expertise.
It's free during public preview and worth trying for three weeks to prove ROI.

Self-healing infrastructure is here and it's not about replacing SREs

Autonomous healing infrastructure is running in production serving millions of users, and the difference from past attempts is reasoning capability. Systems can finally understand context and make decisions that used to need human judgment. Most orgs are stuck at Level 2 (automated detection, human fixes), but the authors have deployed Level 5 (predictive prevention) for specific failure classes. Real results: memory-leak auto-remediation in 7 minutes vs 35 minutes with humans, a 73% autonomous resolution rate, and an 81% reduction in after-hours pages. The architecture needs four pieces: a decision engine, a safety sandbox, an action library, and a learning loop. The future of infrastructure is autonomous; the question is whether you can afford not to build it.

7 Days Remaining: Book Your Seat Now. Get 30% Off. Use code FINAL30.

Deep Dive

DORA figured out which AI capabilities actually matter

DORA's 2025 report on 5,000 tech professionals found AI adoption is universal, but success varies wildly because AI amplifies what you already are: it makes high performers better and struggling teams worse. They identified seven capabilities that determine whether AI helps or hurts: a clear AI stance, healthy data ecosystems, AI-accessible internal data, strong version control, small batches, user-centric focus, and quality platforms.

AWS Security Agent does automated pentesting (and it's actually useful)

AWS launched Security Agent at re:Invent: an AI agent that runs continuous penetration testing on your apps, currently in free preview. A test against DVWA took ~2 hours and found plenty of vulns with actual PoC steps to reproduce, not vague scanner output.
It definitely helps reduce pentest time, but you still need your own manual testing: think of it as a teammate, not a replacement.

AWS Direct Connect now supports chaos engineering with FIS

AWS Direct Connect now integrates with Fault Injection Service, so you can run controlled chaos experiments testing BGP session disruptions on your Virtual Interfaces. You can validate that traffic actually routes to redundant VIs when the primary BGP session fails and that your apps keep working as expected. Basically, it's chaos engineering for your Direct Connect architecture before a real outage proves your failover doesn't work.

Orca and Wiz dropped their patent lawsuit slugfest

Orca and Wiz agreed to dismiss all claims in their dueling patent lawsuits after the US Patent Board invalidated three of Orca's six asserted patents for lacking novelty. The whole mess started in July 2023 when Orca accused Wiz of copying their architecture, Wiz countersued, and now it's over, 10 months after Google agreed to acquire Wiz for $32 billion. Orca is worth $1.8B by comparison and has shrunk headcount 7%, while Wiz nearly tripled to 3,150 employees.