AI Distilled

17 Jul 2026

11 min read

AI is moving beyond prompts.

17 Jul 2026

AI workflows, Moonshot Kimi K3, Google AI Mode, AWS, Cloudflare, and more.AI Distilled #145: How AI Is Evolving Beyond ChatbotsDon't build AI for disconnected environments without reading these five architecture patterns first.Rio Tinto's autonomous trucks process 5TB of data daily through subterranean tunnels with zero connectivity. Offshore wind turbines run fault detection through satellite blackouts. Five architecture patterns make this possible. Read if you're building AI for anywhere the cloud isn't guaranteed.Read the PatternsSponsored"The future of AI won't be defined by the models we use. It will be defined by the systems we build around them."That shift is already underway.The conversation is moving beyond prompts, benchmarks, and model releases toward something far more important: how AI completes real work. The next wave of GenAI applications won't be judged by how well they answer a question, but by whether they can reliably retrieve information, execute code, use tools, validate outputs, and complete entire workflows from start to finish.This week's featured article by Diogo Alves de Resende explores exactly why workflow engineering is becoming the new competitive advantage in AI. If you're building AI agents, RAG systems, enterprise copilots, or production GenAI applications, this is one article you won't want to miss.The rest of this week's edition reinforces the same trend. From open-weight frontier models and AI-native telecom infrastructure to enterprise AI security, biosecurity, reasoning breakthroughs, and agent ecosystems, the industry is rapidly shifting from building bigger models to building AI systems that businesses can trust.This week's AI Pulse:The Next Generation of GenAI Applications Will Be Built Around Workflows by Diogo Alves de ResendeMoonshot's Kimi K3 aims to challenge Anthropic's latest frontier modelsNokia bets on NVIDIA-powered AI-RAN to reshape telecom infrastructureAWS and Bluesight deploy AI assistants for hospital complianceGoogle DeepMind expands its AI bioresilience initiativeCloudflare redraws the rules for AI agent access to the webHugging Face investigates an AI-agent-driven security breachSchema pushes ARC-AGI-3 performance close to human levels through better reasoning workflowsGoogle AI Mode expands with third-party app integrationsOpenAI continues its consumer branding push with new merchandiseLet's dive in.Cheers,Merlyn Shelley,Growth Lead, Packt.🎙️ This Weekend with PacktIf you’re building LLM applications that need to work beyond the demo, don’t miss our live workshop:Build Reliable GenAI Applications with AI Evals, Observability & Testing.Join AI practitionersAmy ChenandSujeet Mishraas they walk through practical workflows for evaluating prompts, RAG pipelines, AI agents, and production GenAI systems. You’ll learn how leading AI teams use metrics, regression testing, observability, and continuous evaluation to build AI applications they can confidently ship.📅 Saturday, July 18 | 9:30 AM – 1:30 PM EDTJoin Amy and Sujeet Live!The Next Generation of GenAI Applications Will Be Built Around WorkflowsWhy production AI is shifting beyond prompts to orchestrated workflows powered by RAG, Python, APIs, evaluations, and guardrails.Written byDiogo Alves de ResendeThe future of GenAI isn’t about better prompts. It’s about building systems that can complete reliable, end-to-end workflows.For the past two years, much of the conversation around GenAI has focused on prompting.How do you write better prompts? Which model performs best? How can you make responses sound more intelligent?Those questions matter, but they miss a larger shift that’s already happening.The next generation of GenAI applications won’t be evaluated by how well they answer a single question. They’ll be evaluated by whether they can reliably complete an entire workflow from start to finish.That is a fundamentally different engineering problem.Register for the Workshop →Build Your Financial AI AnalystFrom Answers to ActionsConsider a Financial AI Analyst.A typical chatbot can answer questions about a company using what it already knows or by retrieving a few relevant document chunks. That might be enough for a demo.It isn’t enough for a system someone can actually trust.A production-ready financial assistant needs to do far more than generate text. It needs to gather evidence, perform calculations, validate results, and explain its reasoning before arriving at a conclusion.A typical workflow might look like this:◾Retrieve the latest annual report.◾Identify the relevant financial statements.◾Extract revenue, cash flow, margins, and debt figures.◾Connect to a live market data API.◾Calculate returns, volatility, valuation ratios, and trends using Python.◾Compare results against previous reporting periods or competitors.◾Generate a clear, source-backed explanation.◾Each step depends on a different capability.RAG retrieves relevant information from enterprise documents. APIs provide live external data. Python performs deterministic calculations that shouldn’t be delegated to an LLM. The language model orchestrates these components, deciding which tools to use, when additional information is required, and how to communicate the final result.This is what modern AI engineering increasingly looks like.AI Systems Are Becoming Workflow EnginesRather than relying on one large prompt and hoping for the best, developers are designing AI applications as structured workflows.These systems retrieve information, call tools, execute code, validate intermediate results, and adapt their next actions based on what they discover.The LLM becomes one component within a larger orchestration layer rather than the entire application.This shift is enabling developers to build AI assistants that are significantly more useful because they can interact with external systems, perform real computations, and produce grounded outputs instead of plausible-sounding guesses.But it also introduces a new challenge.Every Additional Step Creates New Failure PointsThe more capable an AI workflow becomes, the more opportunities there are for things to go wrong.The system might retrieve the wrong document.It could extract an incorrect financial value.An API might return incomplete data.A calculation could be performed using outdated inputs.Or the model might confidently generate a conclusion that isn’t actually supported by the evidence it collected.These aren’t isolated problems. They’re engineering challenges that emerge whenever multiple tools, data sources, and reasoning steps are combined into a single application.That’s why modern AI workflows require more than good prompts.They require evaluations, structured outputs, source verification, guardrails, prompt injection defenses, and mechanisms that make every step observable and testable.Reliability isn’t something that’s added after deployment. It has to be designed into the workflow from the beginning.Register for the Workshop →Build Your Financial AI AnalystBuilding AI Systems You Can TrustAs enterprises move beyond experimentation, the definition of a successful GenAI application is changing.It’s no longer enough for a model to produce an impressive answer.The entire process behind that answer needs to be transparent, repeatable, and reliable enough to support real business decisions.That’s where workflow-driven AI engineering is headed — and it’s rapidly becoming one of the most valuable skills for data scientists, ML engineers, and AI practitioners building production systems.Build One YourselfIn my upcoming live workshop,Build Intelligent Assistants with GenAI, Python & AI Tools, we’ll move beyond theory and build a production-ready Financial AI Analyst from scratch.Together, we’ll build an end-to-end workflow that combines:◾GenAI and modern LLMs◾Python for deterministic financial analysis◾Retrieval-Augmented Generation (RAG)◾Live market data APIs◾Jupyter Notebook◾Cursor◾LovableAlong the way, we’ll explore how to integrate retrieval, tool use, evaluations, guardrails, and prompt injection defenses into AI workflows that are designed for real-world reliability rather than simple demonstrations.If you’re looking to move beyond chatbot prototypes and start building production-ready AI assistants, this workshop is designed to give you a practical architecture you can reuse across financial analysis, enterprise copilots, and intelligent business applications.📅 Live Online WorkshopBuild Intelligent Assistants with GenAI, Python & AI ToolsSaturday, July 25 | 7:00 PM–11:00 PM GMT+5Join us to build a complete Financial AI Analyst and gain hands-on experience with the tools, workflows, and engineering practices powering the next generation of GenAI applications.Register for the Workshop →Build Your Financial AI AnalystJoin us live!One bundle. Fourteen books. Endless learning.Master today’s most in-demand Data & AI technologies—from LLMs and Python to Power BI, SQL, dbt, and Snowflake—with savings of up to93%.Grab the bundle before the offer endsAI Pulse: This Week◾Moonshot's upcoming Kimi 3 is expected to close the gap with Anthropic's Opus 4.8:Chinese AI labMoonshot AIis preparing to launchKimi K3, an open-weight model expected to rival or even surpassAnthropic’sOpus 4.8. With an estimated2–3 trillion parameters, it could become China’s largest open-weight model. The launch comes as enterprises increasingly weigh cost-effective open models against premium closed-source AI, while Moonshotreportedly seeksfunding at a$31.5 billionvaluation.◾Nokia's AI-RAN platform: a radio comeback that runs on NVIDIA.Nokiahas unveiled itsAI-RAN platform, built with NVIDIA, promising a software-driven approach to boost network capacity without requiring new spectrum. Early trials show20% spectral efficiency gains, with ambitions to double capacity by 2028. While Nokia positions it as the industry's first GPU-powered AI-RAN platform, rivals like Ericsson already offer commercial AI-powered RAN solutions, making this a promising strategic shift rather than a definitive market lead.◾AWS and Bluesight build AI for hospital 340B compliance:AWS andBluesighthave launchedPrism Assistant, an AI-powered assistant now deployed across20 health systemsto automate hospital pharmacy investigations and compliance reporting. Built onAmazon Bedrock, the platform cuts report generation from hours to minutes whilemaintainingdeterministic compliance scoring and full audit trails. A multi-agent340B compliance assistantis also planned for release later this year.◾Examining Google DeepMind's AI bioresilience push:Google DeepMindandIsomorphic Labshave expanded abioresilienceinitiative with15+ partnershipsaimed at preventing AI misuse in biology while accelerating outbreak detection and response. The program focuses on safer frontier AI, improved DNA screening, metagenomic sequencing, and faster drug discovery with AlphaFold. The companies are also urging stronger biosecurity policies as AI capabilities in life sciences continue to advance.◾AI agent crawlers, Cloudflare's new rules, and the way through:Cloudflareis reshaping how AI agents access the web by introducing new controls that classify crawlers intoSearch, Agent, and Trainingcategories. FromSeptember 15, AIagentand training crawlers will be blocked by default on many ad-supported sites, pushing developers toward licensed access and paid content agreements. The move could significantlyimpacthow enterprise AI agents retrieve real-time information from the open web.◾Hugging Face discloses AI-agent-driven breach of internal clusters:Hugging Facedisclosedthat anautonomous AI agentexploited a malicious dataset to gain access to internal systems, exposing some internal datasets and service credentials, thoughpublic models, datasets, and Spaces were unaffected. The incident also highlighted a growing challenge for AI security teams: hosted frontier models refused to analyze exploit payloads, forcing investigators to rely on anopen-weight modelfor forensic analysis.◾Frontier Models with Our Harness Achieve ~99% on ARC-AGI-3 Public — Schema:Impossible Researchhas introducedSchema, a new AI reasoning harness that enables frontier models to infer game rules through hypothesis, experimentation, and executable programs rather than weight updates. On theARC-AGI-3 Public benchmark, Schema achieved aself-reported 98.98%with Claude Opus 4.8 and Fable 5, highlighting how improved reasoning workflows—not just larger models—can dramatically advance AI performance on complex reasoning tasks.◾Google's AI Mode now lets you link and interact with select apps:Googleis expandingAI Modebeyond search with support for third-party app integrations, includingInstacart, Canva, and YouTube. Users can now complete tasks like creating shopping carts, finding design templates, and saving playlists directly from AI Mode. Rolling out in theU.S., the update strengthens Google's push toward an AI-powered assistant that competes more directly with ChatGPT and Claude through seamless app connectivity.◾Why is OpenAI selling a ChatGPT basketball?OpenAIhas expanded its merchandise lineup with branded products ranging from a$230 mini keyboardto a$70 ChatGPT basketball, positioning them as part of its "Pause. Play. Prompt." campaign. While the launch islargely promotional, it reflects OpenAI's broader effort to build a consumer brand beyond AI software through lifestyle-focused products and community engagement.See you next time!📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us.If you have any comments or feedback, just reply back to this email.Thanks for reading and have a great day!*{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0}#converted-body .list_block ol,#converted-body .list_block ul,.body [class~=x_list_block] ol,.body [class~=x_list_block] ul,u+.body .list_block ol,u+.body .list_block ul{padding-left:20px} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

Merlyn from Packt

10 Jul 2026

7 min read

GraphRAG is changing enterprise AI. Here's why.

Merlyn from Packt

10 Jul 2026

7 min read

Multi-hop reasoning, explainability, and why retrieval is the missing piece.AI Distilled #144: Solving RAG HallucinationsAt Packt's Data & AI team, we're constantly exploring new ways to bring you fresh perspectives from experts building and deploying AI in the real world. As the field continues to evolve at an incredible pace, our goal is to cut through the noise and share practical insights that help you stay ahead.Today, we're sharing insights from Bruno Gonçalves, data scientist, educator, and former Vice President of Data Science and Finance at JPMorgan Chase, who challenges one of the biggest assumptions in Retrieval-Augmented Generation.When a RAG system hallucinates, is the language model really to blame?Bruno explains why the real bottleneck often lies in the retrieval layer, why traditional vector search falls short on complex reasoning, and how GraphRAG is changing the way practitioners build more reliable, explainable, and production-ready AI systems.Bruno will also be going live tomorrow for a hands-on workshop, where he'll demonstrate how to build a production-ready GraphRAG application from the ground up. If his article leaves you wanting to dive deeper, it's a great opportunity to see these concepts brought to life.Happy reading!Cheers,Merlyn Shelley,Growth Lead, Packt.Why Your RAG System Hallucinates Even When the Answer Is Already in the DocumentsWritten by Bruno Gonçalves, data scientist, educator, and former Vice President of Data Science and Finance at JPMorgan Chase.One of the strangest failure modes in Retrieval-Augmented Generation (RAG) looks like this:The answer is already sitting inside your documents.The retriever even returns passages that seem relevant.And yet the model still produces the wrong answer.At first glance, it looks like the language model hallucinated. In practice, the failure usually happened much earlier.The problem is that traditional vector retrieval cannot see structure.It excels at finding text thatlookssimilar to your query, but it struggles whenever the answer depends on relationships spread across multiple documents, reasoning over an entire corpus, or explainingwhya conclusion is correct.That limitation shows up in three remarkably consistent ways.1. Multi-hop reasoning breaks vector retrievalAsk a seemingly simple question:Who are the indirect suppliers of Company X?A vector retriever happily returns chunks mentioning Company X.Unfortunately, the answer rarely lives in those chunks.Instead, it may be distributed across several documents:Company A supplies Company B.Company B supplies Company C.Company C supplies Company X.No individual passage contains the complete chain.Cosine similarity has no understanding of relationships like:A → B → C → XAs a result, the retriever never assembles the evidence the language model actually needs.Knowledge graphssolve this naturally.Entities become nodes. Relationships become edges.Instead of searching for similar text, the system simply traverses the graph. A multi-hop question becomes a graph traversal that completes in milliseconds rather than a semantic guessing game.2. Global questions are not retrieval problemsNow consider a different type of question:What are the major themes across these 500 documents?Traditional RAG retrieves the topkchunks most similar to the query.Everything else is ignored.That’s the wrong abstraction.As the original GraphRAG paper points out, these aresummarization problems, not retrieval problems.The solution is to build a map before answering the question.The process looks like this:Extract entities and relationships from every document.Group them into related communities.Summarize each community.Combine those summaries into a single coherent response.Instead of reasoning over ten isolated chunks, the model reasons over the structure of the entire corpus.On million-token benchmark datasets, this approach consistently outperformed traditional vector RAG in both thecomprehensivenessanddiversityof its answers.3. Explainability matters in productionVector RAG can usually tell youwhich chunkit retrieved.It cannot tell youwhythat chunk justifies the answer.A similarity score of0.87is not an explanation.It’s merely a ranking.GraphRAGoffers something far more valuable.It can expose the chain of entities and relationships that produced the answer, step by step, all the way back to the original source documents.That audit trail is not just academically interesting.In industries like finance, healthcare, and law, explainability often determines whether an AI system can be trusted — or deployed at all.Hallucinations usually start in retrieval, not generationAll three failure modes point to the same underlying problem.Hallucinations in RAG are rarely generation problems.They’re retrieval problems wearing a generation costume.The model invents connections precisely where the retriever failed to provide them:across multiple reasoning hops,across an entire document collection,or across the gap between an answer and its supporting evidence.Give the model real structure instead of a stack of semantically similar chunks, and many of those invented connections disappear.The documents already contained the answer.The graph simply makes it possible to find it.The biggest objection to GraphRAG: costFor a long time, the strongest argument against GraphRAG wasn’t quality.It was economics.Building a full GraphRAG pipeline is expensive.The indexing process typically requires LLM calls for:every document chunk,every extracted entity,every relationship,and every community summary.One practitioner estimated that indexing a single5 GB legal datasetcost roughly$33,000in early 2024.For many organizations, that made traditional vector indexes the only practical choice.LazyGraphRAG changes the equationThat trade-off shifted dramatically with the introduction ofLazyGraphRAGin late 2024.Instead of performing expensive LLM-based summarization during indexing, LazyGraphRAG relies on lightweight techniques such as noun-phrase extraction and co-occurrence statistics.The result:zero LLM calls during indexing,indexing costs comparable to a standard vector database,roughly0.1% of the costof a full GraphRAG pipeline.The computational work moves from indexing time to query time.Rather than constructing an enormous graph upfront, the system incrementally builds only the portion of the graph needed to answer the current question.According to published benchmarks, LazyGraphRAG matches the quality of GraphRAG’s global search while reducing query costs by more than700×.For many teams, the biggest practical objection to graph-based retrieval has largely disappeared.Build Your First Production GraphRAG SystemReading about GraphRAG is one thing.Building one yourself is another.If you’d like to get hands-on, start with our introductory walkthrough, where you’ll learn how to transform2,000 news articlesinto a searchable knowledge graph.Then take the next step by joining ourProduction GraphRAG WorkshoponJuly 11, where you’ll build an end-to-end GraphRAG chatbot from raw Wikipedia data in just3.5 hours.During the workshop, you’ll learn how to:Extract entities and relationships usingspaCyandREBEL.Build a knowledge graph withNetworkX.Combine graph retrieval with vector search for hybrid RAG.Generate grounded responses with a large language model.Build a production-ready GraphRAG pipeline you can extend to your own projects.You’ll also receive the session recording, complete source code, presentation slides, and a certificate of completion, allowing you to revisit the material whenever you need.🎟️Exclusive for Packt Newsletter readers:Save35%on your workshop ticket.Whether you’re building your first GraphRAG application or looking to improve an existing RAG pipeline, this workshop provides a practical framework for solving the kinds of multi-hop reasoning problems that traditional vector retrieval often misses.All you need is a basic understanding of Python and Docker.Ready to build your first production GraphRAG system? Join us live.We started Agentic Engineering for the people asking what happens after the demo ends.Today, there are 1,000 of us.To celebrate, we’re giving three subscribers a bundle of:📘 LLM Engineer’s Handbook📙 30 Agents Every AI Engineer Must BuildSubscribe to Agentic Engineering, and you’ll automatically be entered into the draw.Subscribe here📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us.If you have any comments or feedback, just reply back to this email.Thanks for reading and have a great day!*{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

03 Jul 2026

6 min read

What comes after today’s LLMs?

LLM Expert Insights, Packt

03 Jul 2026

6 min read

Governance, enterprise adoption, and new research directions take center stage AI_Distilled #142: What’s New in AI This Week If there was a common thread this week, it was maturity. The conversation is gradually shifting away from model releases and benchmark scores towards infrastructure, governance, reliability, and the practical limits of today’s AI systems. That shift is reflected across both policy and industry. LLM Expert Insights, Packt LATEST DEVELOPMENTS 🇮🇳🇯🇵 India and Japan deepen AI partnership - Reports India and Japan have signed a wide-ranging AI cooperation agreement covering AI governance, frontier model safety, cybersecurity, compute infrastructure, multilingual open-source models, semiconductor collaboration, and AI talent exchange. The partnership also includes joint LLM research, support for AI startups, and a target to bring 500 Indian AI professionals to Japan by 2030, signalling a long-term strategic alliance on AI development. 🤖 Beyond LLMs: AI researchers are betting on “world models” - Amazon AI pioneer Yann LeCun argues that scaling today’s large language models won’t lead to human-level intelligence, pointing instead to “world models” that can reason about cause and effect, physical environments, and future outcomes. With companies including AMI Labs, DeepMind, Wayve, and World Labs investing heavily in the approach, the next frontier in AI may be systems that understand the world rather than simply predict the next token. 📉 Meta says AI agents are taking longer than expected - At an internal town hall, Mark Zuckerberg reportedly told employees that AI agents have not advanced as quickly as Meta anticipated, with the company’s AI reorganization yet to deliver the expected gains. Despite investing heavily in AI infrastructure and reshaping teams around agent development, Meta expects the payoff to emerge over the next three to six months. 🎬 Jodie Foster: ‘F1’ felt like it was written by AI - Speaking at the Aspen Festival of Ideas, Jodie Foster said Apple’s blockbuster F1 followed such a formulaic structure that it “seemed like it was made by AI,” using the film to spark a broader discussion about AI’s growing influence on filmmaking. Foster argued that AI can be valuable as a creative tool, but only if filmmakers remain firmly in control rather than letting the technology dictate the work. ⚖️ India’s judiciary embraces AI, with humans staying in control - India’s judiciary is expanding the use of AI for tasks such as legal research, transcription, translation, case management, and administrative support, while reinforcing that judges must remain the final decision-makers. The move comes as courts and policymakers emphasize safeguards including human oversight, verification of AI outputs, and governance frameworks to prevent fabricated citations, bias, and misuse in judicial proceedings. 📈EXPERT INSIGHTS Denis Rothman on the illusion of autonomous agents (part 2) Last week, we explored why autonomous agents struggle to deliver the reliability enterprises expect. The problem isn’t simply that LLMs make mistakes, but that prompt engineering alone cannot overcome the structural limits of probabilistic reasoning. If reliability cannot emerge from prompts, then it must be engineered elsewhere. That “elsewhere” is context: not as a longer system prompt or larger context window, but as a structured, transparent architecture that guides how agents reason, communicate, and act. This is where context engineering begins. Context complexity exists across five distinct levels, evolving from zero-context basic prompts to highly advanced semantic blueprints. Prompt engineering resides at the shallowest level, treating context as a mere text prefix. Context engineering, conversely, treats context as a structural, programmatic architecture. To build a robust multi-agent system (MAS), we must transition from linear text parsing to multidimensional semantic structures. We can achieve this by implementing Semantic Role Labeling (SRL) to map complex data relationships natively. By utilizing the Model Context Protocol (MCP), we can define rigorous protocol message formats that specialist agents such as a Researcher, Writer, and Orchestrator use to communicate without ambiguity. Instead of hoping a black-box model infers the correct workflow, we architect a semantic blueprint that explicitly guides the system’s reasoning process, completely decoupling the immutable enterprise data layer from the probabilistic reasoning layer. Read the Full Article Here If this Expert Insight article made you think, you’ll probably enjoy our new publication, AgenticEngineering. We’re less interested in asking whether agents can do something, and more interested in what it takes to make them work reliably in production. Early subscribers are still receiving a free copy of AI Agents in Practice by Valentina Alto. Subscribe and grab your free e-book instantly P.S. If you subscribed for the free e-book and can’t find it, please check your Promotions, Spam, or Junk folder first. The download link is usually waiting there. If it still hasn’t arrived after a little while, just reply to this email, and we’ll help you out. Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

30 Jun 2026

10 min read

Special Issue: Rick Spencer on the AI metrics that actually matter

LLM Expert Insights, Packt

30 Jun 2026

10 min read

Why SUSE rejects lines of code, token counts, and developer leaderboards AI_Distilled #142: What’s New in AI This Week Social engineering is about manipulating people's emotions. Identify the susceptibilities that hackers use to exploit people. This NINJIO Insights Report dives into the key emotional susceptibilities that make social engineering work and offers concrete steps that your security team can take to equip your workforce to resist cyberattacks. Download the Guide Instead of our usual mix of AI news and analysis, today’s issue is dedicated to a single conversation. As AI coding agents become part of everyday engineering, a new wave of dashboards is emerging to measure their impact. Lines of code, token consumption, pull requests, utilization scores, and developer rankings are quickly becoming the default language of AI productivity. Rick Spencer, General Manager for Technology and Product at SUSE, argues that much of it is measuring the wrong thing. In today’s special issue, he explains why output is a poor proxy for value, why engineering leaderboards create the wrong incentives, and how SUSE measures AI through customer outcomes instead of developer activity. If you’re thinking about how to evaluate AI inside an engineering organization, this is a conversation worth spending time with. LLM Expert Insights, Packt P.S. If the following article ends up being your kind of read, you’ll probably enjoy Agentic Engineering, our new publication for builders navigating AI beyond the demo. Early subscribers are still receiving a free copy of AI Agents in Practice by Valentina Alto. Subscribe Here SUSE refuses to measure its engineers by how much code their agents write Rick Spencer on why output, tokens, and lines of code tell you nothing, and what an open-source enterprise tracks instead. As AI agents move into engineering workflows, new leaderboard metrics are tracking lines of code submitted, tokens consumed, and per-developer utilization. If agents are generating output, then output should be measured, compared across engineers, and ranked. Rick Spencer, General Manager for Technology and Product at SUSE, has looked hard at how the industry is measuring AI’s effect on engineering. “I consider that garbage vanity metrics,” he says, calling them unhelpful. His argument for what to track instead is one of the more clarifying things an engineering leader can hear right now, because it separates the numbers that look like progress from the numbers that actually represent it. Output is cheap; impact is what counts The core of Spencer’s position is a distinction between output and impact, and it matters because the two come apart precisely when AI enters the picture. AI makes output cheap as the lines of code, pull requests, and token counts all mount up when agents are doing the writing, which makes them exactly the wrong thing to measure if what you care about is value delivered. “We’re really tending away from measurements that measure output and utilization, and we’re trying to focus on impact,” he says. A leaderboard that ranks engineers by how much their agents produced does not tell you who is solving the hardest problems or keeping customers safe. It tells you who is generating the most volume, and in an AI-assisted world, that number is close to meaningless. There is also a structural reason the standard tooling does not fit SUSE, and it applies to more organizations than it might first appear. Much of the available measurement tooling assumes a particular shape of company. “They really assume you’re a proprietary software company where everyone’s working on a single code base,” Spencer explains, “which is just not how an open-source enterprise works.” His engineers work across hundreds, sometimes thousands, of repositories, where the maintenance work on each one differs enormously. A per-developer comparison across that landscape measures the shape of the work far more than it measures the contribution of the engineer, which is why he treats developer-to-developer comparison as fundamentally low value rather than merely imperfect. The reporting burden itself is part of his objection, and it is a point leaders setting up AI dashboards should sit with carefully. A measurement regime that requires engineers to generate weekly utilization reports spends the very time it claims to be optimizing. “I’d rather have them working than reporting,” Spencer says. The instrument meant to measure productivity eventually becomes a tax on it. What SUSE tracks instead Rejecting vanity metrics only helps if there is something better to put in their place. And Spencer shares how SUSE measures business impact in terms that connect directly to what customers actually receive. “How fast are CVEs being addressed, how fast are patches being backported, how fast are our L3 responses getting closed while maintaining the same NPS score,” he underscores, listing what his teams track. The common thread is that each one is an outcome the customer feels, not an activity the engineer performs. AI has been applied to exactly these areas, so measuring the speed and quality of those outcomes tells you whether the AI is doing anything worth its cost, which is the actual question worth asking. This shift from output to outcome reframes what a metric is for in the first place. A CVE response time captures whether the organization is keeping customers safe faster than it used to. A backport speed captures whether stable releases are getting their fixes without the manual grind that used to gate them. These numbers move because the underlying work got genuinely better, not because more text was generated, and that is the property that makes them trustworthy. They are also far harder to game, because the only way to improve them is to actually improve the thing the customer depends on. Give managers visibility, not a leaderboard None of this means SUSE ignores cost or utilization entirely, and the distinction Spencer draws here is the one that keeps the approach from collapsing into either negligence or surveillance. The company is building dashboards that give engineering managers visibility into their team’s cost and utilization, but the purpose is coaching rather than ranking. The unit of analysis is the team, and the question it answers is diagnostic. Spencer gives the example of a manager with an eight-person team noticing the numbers and asking the right kind of question. “We’re burning a lot of tokens. What are we actually doing that’s burning that many tokens? I’m not sure we’re getting value out of that.” The inverse matters just as much, where purchased seats for a code assistant sit unused, and the manager asks whether there are places the team should be drawing value that it is currently leaving on the table. The governance side of that picture, including how SUSE keeps agents and their costs inside a boundary it can stand behind, is covered in a companion piece, How SUSE Runs AI Without Losing Control. The difference between this and a leaderboard is not subtle, and it is the heart of the leadership lesson. A leaderboard exposes individuals and turns measurement into a game engineers play against each other, a game Spencer is explicit has nothing to do with customer value. Team-level cost visibility used for coaching does the opposite. It gives a manager the information to guide the team toward better use of the tools without making any individual engineer feel watched. “We’re really trying to decentralize and allow engineering managers to guide their teams on getting the most value out of the AI,” he says, “without it becoming like a leaderboard game where developers feel like they’re exposed.” The data exists to help the manager help the team, not to rank the team against itself. The principle holding it together What makes Spencer’s approach more than a list of preferred numbers is the principle holding it together, which is that measurement should serve the work rather than distort it. Every choice he describes follows from that one idea. Impact comes before output because output is the thing AI inflates. Team-level diagnostics come before individual leaderboards, because the goal is coaching rather than competition. Business outcomes come before activity counts, because outcomes are what customers actually receive. The decentralization to engineering managers reflects the same conviction that the people closest to the work are best placed to judge whether the AI is helping, given the right information and trusted to use it well. The deeper point for any leader standing up AI measurement is that the easy numbers and the useful numbers are not the same, and AI has widened the gap between them. The figures that are simplest to collect, lines of code, tokens, and per-head utilization, are the ones AI has made least meaningful. The figures that matter, the speed and quality of the outcomes customers depend on, take more thought to define and more care to track. Spencer’s argument is that the effort is the job. “Let’s focus on the impact,” he says, “the business impact, not on the utilization.” For engineering leaders deciding what belongs on a dashboard as agents reshape their teams, that is the distinction worth getting right before the vanity metrics calcify into the way the organization sees itself. If this article made you think, you’ll probably enjoy our new publication, AgenticEngineering. We’re less interested in asking whether agents can do something, and more interested in what it takes to make them work reliably in production. Early subscribers are still receiving a free copy of AI Agents in Practice by Valentina Alto. Subscribe and grab your free e-book instantly Explore Before Time Runs Out Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}.row .side{display:none}}

0
0

AI Distilled

LLM Expert Insights, Packt

26 Jun 2026

5 min read

AI is entering its industrial era

LLM Expert Insights, Packt

26 Jun 2026

5 min read

This week’s stories ask what it takes to build AI that can be trusted AI_Distilled #142: What’s New in AI This Week Tame Your AI Monsters: CLAUDE EDITION Join the First Episode of the Exclusive AI Agent Governance Lab Series. Rubrik runs Claude. This is what we learned. We're the first enterprise to put Rubrik Agent Cloud to work governing our own Claude implementation — and on June 30, we're opening that experience up. Secure Your Spot LLM Expert Insights, Packt P.S. If Denis Rothman’s Expert Insight ends up being your kind of read, you’ll probably enjoy Agentic Engineering, our new publication for builders navigating AI beyond the demo. Early subscribers are still receiving a free copy of AI Agents in Practice by Valentina Alto. Subscribe Here LATEST DEVELOPMENTS 🏛️ Governments take a closer interest in frontier AI models - Reports suggest the U.S. government is taking a more active role in overseeing the release of advanced AI models, with OpenAI reportedly limiting access to future frontier models and Anthropic previously facing scrutiny over model deployment. The developments underscore how national security concerns are becoming an increasingly important factor in how the most capable AI systems are released. ☁️ Amazon deepens AI infrastructure investment in India - Amazon will invest an additional $13 billion to expand its AI and cloud infrastructure in India, bringing its planned investment in the country to $48 billion by 2030. The expansion will grow AWS data center capacity in Mumbai and Hyderabad, reinforcing India’s position as a key market in the global race to build AI infrastructure. 🧪 AI distillation comes under renewed scrutiny - Anthropic has accused Alibaba of using “adversarial distillation” to replicate the capabilities of its Claude models, an allegation Alibaba denies. The dispute has thrust AI distillation into the spotlight, highlighting growing concerns over whether using another company’s models to train new AI systems crosses the line from accepted optimization to intellectual property infringement. 💳 Airwallex raises $320 million to expand AI-powered finance - Airwallex has raised $320 million at an $11 billion valuation as it accelerates its push into AI-native financial software and agentic commerce. The fintech also unveiled new AI products designed to automate corporate finance workflows and enable AI agents to make delegated payments, reflecting growing investment in autonomous financial systems. 📈EXPERT INSIGHTS Denis Rothman on the illusion of autonomous agents (part 1) The AI community is caught in a contradiction. On one hand, there is a push to deploy autonomous agents into enterprise environments, expecting them to reason, plan, and execute complex, multi-step workflows. On the other hand, the methodology relied upon to build these agents is overwhelmingly based on prompt engineering, which is merely an attempt to cajole predictable, deterministic behavior out of fundamentally stochastic, black-box LLMs. This creates a pervasive dissonance: the expectation of industrial reliability built atop probabilistic generation. The current noise suggests that if we scale the parameters, refine the system prompts, or throw more compute at the problem, true autonomous reasoning will spontaneously emerge. To cut through this noise, we must confront an uncomfortable reality. Unconstrained probabilistic generation cannot serve as the kinematics for reliable robotic or enterprise execution. If we are to build truly agentic systems, we must move beyond the brittle, zero-context art of prompting and embrace the rigorous, transparent discipline of context engineering. Read the Full Article on Substack If this Expert Insight article made you think, you’ll probably enjoy our new publication, AgenticEngineering. We’re less interested in asking whether agents can do something, and more interested in what it takes to make them work reliably in production. Early subscribers are still receiving a free copy of AI Agents in Practice by Valentina Alto. Subscribe and grab your free e-book instantly Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

19 Jun 2026

6 min read

The next AI bottleneck isn’t intelligence

LLM Expert Insights, Packt

19 Jun 2026

6 min read

Yann LeCun’s warning, Anthropic’s expansion, and a deeper look at agent evaluation AI_Distilled #141: What’s New in AI This Week Vector memory for AI agents in air-gapped, regulated, and offline environments VectorAI DB delivers sub-15ms retrieval for agent memory and RAG pipelines on your own infrastructure. On-premises, at the edge, or air-gapped. Native support for LangChain, LlamaIndex, and Hugging Face. Free Community Edition available. Get Started for Free LLM Expert Insights, Packt LATEST DEVELOPMENTS 📉 AI pioneer warns of industry bubble as costs outpace revenues - AI researcher Yann LeCun has criticized Elon Musk’s xAI as a struggling competitor in the race for frontier AI while warning that the broader industry risks a “big bubble explosion” if leading labs fail to reduce costs or raise prices. LeCun argued that today’s AI services remain heavily subsidized by investors and suggested that more advanced AI systems may ultimately require new architectures beyond large language models. 🧠 MIT gives robots a memory that works more like ours - MIT researchers have developed a new memory framework that allows robots to remember objects, locations, and past observations using natural language, enabling them to answer questions such as “Where did I leave my wallet?” By combining 3D mapping with AI-generated descriptions, the system could help future robots navigate complex environments and collaborate more naturally with humans. 🌏 Microsoft becomes the primary gateway for OpenAI models in China - While OpenAI and Anthropic have largely stayed out of the Chinese market, Microsoft has emerged as the main supplier of OpenAI’s models to major Chinese technology companies through Azure. The arrangement highlights Microsoft’s unique position in the global AI ecosystem, even as concerns grow around model distillation, geopolitical tensions, and the flow of advanced AI capabilities across national boundaries. 🇰🇷 Anthropic expands into South Korea with new office and AI partnerships - Anthropic has opened a Seoul office and announced partnerships with major Korean organizations, including NAVER, Samsung SDS, LG CNS, and Nexon, as demand for Claude continues to grow across the region. The company also signed an agreement with South Korea’s Ministry of Science and ICT to collaborate on AI safety, cybersecurity, and responsible AI adoption. ⚖️ Study highlights why AI still struggles to moderate online hate speech - New research shows that leading AI moderation systems often disagree on what constitutes hate speech, producing inconsistent results across demographic groups and content types. While AI can detect explicit abuse at scale, researchers say it still struggles with context, sarcasm, coded language, and reclaimed terms, underscoring the challenges of relying on automated systems for online content moderation. Claude is currently the most powerful tool of 2026. Yet almost no one knows how to actually use them. Our expert mentors have condensed 800+ hours of Claude research, articles, YouTube content and real-world practice into a focused 16-hour curriculum. Join the 2-Day Claude AI Mastery Workshop: a live, end-to-end deep dive into Claude plus 10+ AI tools, LLMs and workflows. You will learn how to: - master Claude's three modes : Chat, Cowork and Code. - Set up Skills, Connectors and Plug-ins to automate your desktop, Notion and files. - Vibe code apps and dashboards without writing code & 10+ AI tools and workflows that pair with Claude. 🧠 Saturday & Sunday 🕜 10 AM – 7 PM EST Register NOW! 📈EXPERT INSIGHTS Why a Good Answer Doesn’t Mean a Good Agent During a time when AI conversations are often louder than they are useful, Ammar Mohanna, PhD, brings a refreshing perspective. His career has moved fluidly between academia and industry, from teaching advanced AI courses at the American University of Beirut to advising teams on turning machine learning ideas into systems that can be trusted. He is also known for his candid take on the current AI landscape, especially the gap between meaningful engineering and what he often calls AI slop. In this conversation, Ammar challenges one of the most common assumptions in agent development: that a correct answer is evidence of a successful agent. He explains why reliability lies in the path an agent takes, not just in the result it produces, and why evaluation must evolve from output scoring to a discipline that measures behaviour and trustworthiness in production. Most teams think they’re evaluating agents, but they’re actually not. Where do you see the biggest illusion of evaluation today? The biggest illusion is that teams think they are evaluating an agent when they are only evaluating the final answer. That works well for a chatbot. But an agent is different. It plans, chooses tools, passes arguments, reads observations, retries, stops, and sometimes takes action. A final-answer score hides most of the actual failure surface. An agent can produce a good-looking answer after calling the wrong tool, wasting ten steps, misreading a tool result, or ignoring a failed call. From the outside, the answer may look acceptable. From a reliability perspective, the run is not acceptable. So the illusion is: “the answer looked right, therefore the agent worked.” However, what you need to know is whether the path was valid, efficient, grounded, and safe. Read the Full Interview on Substack Most Claude Code content focuses on prompts and quick wins. This workshop explores what comes next. Join Sam Keen, former engineer at AWS, Lululemon, and Nike, to learn how high-performing teams use structured context, reusable skills, workflow memory, and guardrails to get more consistent results from Claude Code. 🎟️ Exclusive for AI Distilled subscribers: Get 60% off with code AI60. Limited to the first 10 sign-ups. Register Now Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

12 Jun 2026

4 min read

AI is becoming infrastructure

LLM Expert Insights, Packt

12 Jun 2026

4 min read

It's moving deeper into the fabric of technology AI_Distilled #140: What’s New in AI This Week Avail 40% Off Today LLM Expert Insights, Packt LATEST DEVELOPMENT 🎮 Pokémon Go scans helped train AI now being explored for military drones - Location scans voluntarily submitted by Pokémon Go players have been used to train AI models that help machines understand and navigate physical environments. Following a partnership between Niantic Spatial and drone software company Vantor, the technology is now being explored for use in GPS-denied environments, raising fresh questions about how consumer-generated data may ultimately be used in military and defence applications. 🏗️ Mistral bets on agents, infrastructure, and custom AI chips - Mistral CEO Arthur Mensch says enterprise AI adoption is still in its early stages, with significant value yet to be unlocked as organizations adapt to agentic workflows. He also revealed that the French AI startup is exploring the development of its own chips, signaling ambitions to control more of the AI stack as it expands beyond models into infrastructure and enterprise deployment. 👷 Jeff Bezos pushes back on AI job-loss fears - Jeff Bezos argues that AI-driven productivity gains will create new industries, products, and jobs rather than trigger mass unemployment. Speaking about his AI startup Prometheus, Bezos said the bigger long-term challenge may be labor shortages, as AI accelerates innovation across sectors such as manufacturing, aerospace, semiconductors, and energy. 🎓 Anthropic launches $150 million AI fellowship program - Anthropic has unveiled Claude Corps, a $150 million initiative that will train and place 1,000 early-career professionals at nonprofits across the U.S. The program aims to help organizations adopt AI tools while equipping participants with practical AI skills, reflecting growing efforts to distribute the benefits of AI more broadly amid concerns about workforce disruption. ⚡ Google unveils DiffusionGemma for faster AI text generation - Google has released DiffusionGemma, an experimental open-source model that uses diffusion techniques instead of traditional token-by-token generation, enabling text generation speeds up to four times faster on dedicated GPUs. While not intended to replace conventional LLMs for quality-critical applications, the model offers a glimpse into alternative architectures designed for real-time, interactive AI workflows. 📈EXPERT INSIGHTS OpenClaw + LangGraph Playbook It’s easy to build an agent that talks. Building one that remembers things, sends messages, runs on a schedule, and generally makes itself useful is a different challenge. That’s where LangGraph and OpenClaw make an interesting combination.Let’s build one. The first step is creating the primary agent and establishing its responsibilities through a system prompt. import os from datetime import datetime from langclaw import Langclaw from langclaw.gateway.commands import CommandContext # Initialize the master agent application interface app = Langclaw( system_prompt=( “## Corporate Intelligence Agent\n” “You are a corporate intelligence analyst. You track market trends “ “and draft precise outreach sequences based on current events.\n” “Delegate deep multi-source research tasks to the web-researcher subagent.” ), ) Tools allow the agent to access capabilities beyond language generation. In this example, the agent can retrieve market intelligence data. Read The Full Article Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

11 Jun 2026

3 min read

You asked for less hype. Here it is.

LLM Expert Insights, Packt

11 Jun 2026

3 min read

Agentic Engineering is live. Agentic Engineering Is Now Live I'm interrupting your very busy schedule for an announcement that you might’ve already seen coming. Our newsletter, Agentic Engineering, has finally kicked off. We sent out surveys. We sent out pre-launch messages. And you’ve all been really supportive while we figured out the logistics of kicking off something new. Across the surveys and interviews, you had one clear frustration. Too much hype around updates, launches, and releases, and very little guidance on what to actually do with all of it. That gap is what Agentic Engineering is for. We will lean on our network of experts who are actively building in this space, so you’re not just getting opinions, but conversations from people making decisions right now, often before they show up on timelines. But enough said! We’ll leave it to you to decide how useful this space is. Subscribe if this sounds like what you’ve been looking for. You can always unsubscribe later. And just to nudge things along (without making it too transactional), early subscribers will receive a free ebook copy of AI Agents in Practice by Valentina Alto! Tanya, Agentic Engineering Join Agentic Engineering and Grab Your Free Copy 📈EXPERT INSIGHTS A preview of Agentic Engineering Something Maxime Labonne said during one of our earlier roundtables stuck with me because it runs counter to how people think about small models. The common assumption is that small models are simply the cheaper version of frontier models. Same idea, lower cost. But Maxime’s experience has been that they’re often harder to work with. Not because they’re worse, but because they expose problems that larger models can often hide. When most people build AI systems today, they’re testing them with frontier models. Those models are smart enough to compensate for weak prompts, incomplete logic, or edge cases that nobody thought about. Small models don’t give you that luxury. As Maxime put it, they can fail on surprisingly basic tasks, and when they do, entire workflows can break. Read the Full Article *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

05 Jun 2026

4 min read

Not all agents are created equal

LLM Expert Insights, Packt

05 Jun 2026

4 min read

This week’s Expert Insight explores the agent spectrum AI_Distilled #138: What’s New in AI This Week SUBSCRIBE AI is entering an interesting phase. The biggest questions are no longer about what models can generate, but what systems can be trusted to do. That theme runs through this week’s stories, from AI-designed vaccines and industrial AI partnerships to debates around governance and autonomy. It also sits at the center of this week’s Expert Insight, which explores the spectrum of today’s agents and why not all “agents” are actually the same thing. LLM Expert Insights, Packt LATEST DEVELOPMENT 🧬 AI-designed vaccine enters human trials in world-first study- Researchers at the University of Cambridge have developed what they describe as the first AI-designed vaccine to enter human trials, using AI to create a “super-antigen” capable of protecting against entire families of viruses. The approach could pave the way for universal vaccines against coronaviruses, influenza, and future pandemic threats. 🛑 Anthropic co-founder calls for an AI “brake pedal” - Anthropic co-founder Jack Clark has warned that AI systems are approaching a point where they could increasingly develop without human input, arguing that governments need new regulatory frameworks to maintain control. His comments come as AI capabilities accelerate and concerns grow around economic disruption, autonomous systems, and long-term governance. 📈 Investors look to Asia for the next wave of AI growth - Investment strategists are increasingly pointing to Taiwan and South Korea as the next major beneficiaries of the AI boom, citing their central role in semiconductor and AI infrastructure supply chains. With valuations still below many U.S. AI stocks, some investors see emerging markets as offering significant upside in the next phase of AI-driven growth. 🏭 Hitachi and Intel partner to advance industrial AI and digital infrastructure - Hitachi and Intel have announced a strategic collaboration to accelerate AI adoption across manufacturing, energy, mobility, and other critical industries. The partnership will focus on areas including physical AI, edge computing, quantum technologies, and factory automation, to build more intelligent and resilient industrial infrastructure. 📈EXPERT INSIGHTS A preview of Agentic Engineering The current AI ecosystem has developed a habit of describing wildly different systems with the exact same word: agents. A retrieval pipeline that reformulates search queries, a workflow assistant that schedules meetings, and a system capable of coordinating multi-step operational decisions with minimal supervision now all routinely get discussed under the same umbrella. And I think that’s why the conversation around agents sometimes feels simultaneously overcomplicated and vague. In reality, though, these systems are operating at very different levels of autonomy. So, I wanted to take a new angle: the different kinds of agent organizations are building, and how capabilities change as systems move from retrieval into action and eventually toward autonomy. Read The Full Article Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

20 May 2026

2 min read

Where should AI Distilled go next?

LLM Expert Insights, Packt

20 May 2026

2 min read

We’re running a short audience survey to help decide Rethinking What an AI Newsletter Should Be Over the past few months, AI_Distilled has grown into a community of readers coming from very different parts of the AI ecosystem. As the space continues evolving, we’ve been thinking carefully about what this publication should become going forward and how we can make it more genuinely useful for the people reading it. So we’ve put together a short survey to understand what readers want more of, what feels missing from current AI media, and where we should take AI Distilled next. It should take around 4 minutes to complete, and every response will directly help shape the next phase of the publication. Take Survey Appreciate you taking the time. LLM Expert Insights, Packt *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

08 May 2026

5 min read

The more AI thinks for us, the more architecture matters

LLM Expert Insights, Packt

08 May 2026

5 min read

AI dependence and synthetic influence raise new questions AI_Distilled #136: What’s New in AI This Week Get Tickets Researchers are now questioning whether heavy reliance on AI tools could weaken independent thinking, while AI-generated influencers and automated marketing systems are making it harder to separate expertise from synthetic persuasion. That tension between autonomy and control sits at the center of this week’s Expert Insight. The excerpt explores how modern AI agents are structured internally, particularly the separation between an agent’s persistent identity and the tasks it is tasked with performing. As agents become more embedded into real workflows, those architectural choices are starting to matter far beyond prompt engineering experiments. LLM Expert Insights, Packt LATEST DEVELOPMENT 🧠 Heavy AI dependence may weaken independent thinking, researchers warn -A study by researchers from MIT, Carnegie Mellon, Oxford, and UCLA found that people using AI assistants to solve reading and maths problems completed tasks faster but showed lower engagement with critical thinking and problem-solving processes. The findings raise concerns that growing reliance on AI tools could gradually reduce persistence and independent reasoning skills over time. ⚡ Anthropic doubles Claude usage limits after major SpaceX compute deal -Anthropic has expanded usage limits for Claude Code and its API after signing a compute partnership with SpaceX that gives it access to more than 220,000 NVIDIA GPUs at the Colossus 1 data center. The announcement highlights how competition in AI is increasingly shifting from model capabilities alone to securing massive infrastructure and compute capacity at scale. 🏋️ AI-generated fitness influencers push misleading transformation claims online - Google has developed TurboQuant, a compression method that reduces AI working memory requirements by up to six times without affecting performance. The advance could significantly lower infrastructure costs and enable more powerful models to run efficiently, though it remains at an early stage. 🛠️ Tools worth trying this week - From AI-powered email signature builders to branding assistants that generate polished HTML-ready designs in minutes, these tools show how generative AI is quietly reshaping even the most routine parts of digital work. If you want to experiment with lightweight but practical AI utilities, these are worth a look. We’re thinking about launching something new If you have a minute, take our quick survey and tell us what you’d actually want to read. It’ll help us build something that’s genuinely worth your time. Take the Survey 📈EXPERT INSIGHTS 30 Agents Every AI Engineer Must Build In this week’s Expert Insight, Imran Ahmad, author of 30 Agents Every AI Engineer Must Build, explores one of the foundational ideas behind modern agent engineering: the separation between an agent’s persistent identity and its real-time tasks. The two-layer prompt architecture: System and user prompts One of the most foundational innovations in agent design is the two-layer prompt architecture, which distinctly separates an agent's core identity from its real-time instructions. This layered design, consisting of the system prompt and the user prompt, establishes a clear division of responsibilities, drawing inspiration from classical software principles such as separation of concerns and abstraction layers. A helpful analogy is that of an agent functioning as a diplomat: the system prompt defines the diplomat's country, values, and code of conduct; the user prompt is the current negotiation or message they are handling. The diplomat must respond fluidly, but always in alignment with national policy. In multi-agent scenarios, this diplomat analogy extends across agent boundaries. When one agent passes a task or data payload to another, it is effectively handing off a "diplomatic brief": the receiving agent's system prompt must re-establish persona, authority scope, and operational constraints for the new context. Without explicit role-passing in the handoff protocol, the receiving agent may inherit ambiguous instructions or combine roles across agents. Well-designed multi-agent architectures, therefore, encode the PTCF components not just in each agent's internal system prompt but also in the inter-agent message schema, ensuring that every communication boundary preserves the constitutional clarity that the framework provides. Together, these two layers form what we might call the agent's prompt contract: > System prompt: How the agent behaves > User prompt: What the agent should do Read Full Article Build and test native paywalls in seconds Turn a prompt into a complete native paywall with RevenueCat’s Paywalls AI Editor Update copy, adapt designs for dark mode, and launch A/B tests without waiting for the next sprint.Use free up to $2.5k monthly tracked revenue. 96,000+ apps trust RevenueCat. Learn More Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

01 May 2026

6 min read

AI agents are running the show

LLM Expert Insights, Packt

01 May 2026

6 min read

They’re taking on real work, and the risks are showing AI_Distilled #135: What’s New in AI This Week Building AI Resilience: Managing Agent Risk with Trust Infrastructure Rules-based security fails AI agents. On May 5, learn to scale safely with "Trust Infrastructure." We’ll dive into our Pillars of Trust framework, contextual guardrails, and how Rubrik Agent Cloud provides a foundation for secure, resilient AI. Save My Spot This week feels like a turning point for AI agents. They are no longer just helping with tasks, they are starting to take on entire workflows on their own. In some cases, that means real gains in speed and productivity. In others, it means things can go wrong very quickly when systems are not set up with the right safeguards. What’s becoming clear is that the risk isn’t just in the model, it’s in how these agents are actually run. The expert insight this week digs into that layer, showing how choices like using a cloud API, self-hosting, or running models on-device can directly shape latency, cost, and control, and ultimately decide whether an agent works reliably or fails in production. LLM Expert Insights, Packt LATEST DEVELOPMENT 🧠 Mistral launches Medium 3.5 and cloud-based coding agents in Vibe - Mistral has introduced Medium 3.5, a new flagship model designed for long-running coding and multi-step tasks, alongside cloud-based agents that can run work asynchronously. The release signals a shift toward developers offloading entire workflows to AI agents that operate independently and return completed tasks. ⚠️ AI coding agent wipes company database in seconds after going rogue - An AI coding agent powered by Claude Opus 4.6 deleted a company’s production database and all backups in a single API call, wiping months of data in under 10 seconds. The incident highlights how weak safeguards across AI tools and cloud infrastructure can turn routine automation into irreversible system failures. ⚡ Google unveils AI memory breakthrough that cuts usage by up to 6x- Google has developed TurboQuant, a compression method that reduces AI working memory requirements by up to six times without affecting performance. The advance could significantly lower infrastructure costs and enable more powerful models to run efficiently, though it remains at an early stage. 🔍 Scientists propose new blueprint for fully transparent AI systems - Researchers have developed a mathematical framework for AI that can explain how it learns, remembers, and makes decisions, addressing the long-standing “black box” problem. While still at an early stage, the approach could lead to more reliable and controllable systems. 🌐 China pushes toward an AI-driven “intelligent economy” at scale - China is accelerating a shift from digital infrastructure to a fully AI-integrated economy, with strong state backing and rapid deployment across industries. The strategy points to a broader move toward “swarm intelligence” and large-scale automation. We’re thinking about launching something new If you have a minute, take our quick survey and tell us what you’d actually want to read. It’ll help us build something that’s genuinely worth your time. Take the Survey 📈EXPERT INSIGHTS Agentic Architectural Patterns for Building Multi-Agent Systems This week’s expert insight comes from Agentic Architectural Patterns for Building Multi-Agent Systems by Dr. Ali Arsanjani and Juan Pablo Bustos, a practical guide to turning AI prototypes into systems that can actually run at scale. Both authors bring deep enterprise experience, from large-scale architecture to real-world deployment, and focus on the decisions that shape how agentic systems behave in production. In this excerpt, they look at a layer that often gets overlooked: how models are served. Whether you rely on cloud APIs, self-hosted setups, or edge deployment, the way an LLM is delivered has a direct impact on latency, cost, control, and reliability. It’s a reminder that building agents isn’t just about model capability, but about the infrastructure choices that make those capabilities usable. Serving architectures for agentic LLMs The way an LLM is served, that is, the manner in which it is made available for inference, directly impacts its responsiveness, scalability, cost, and security within an agentic system. The "serve" component, a critical piece of any comprehensive GenAI reference architecture, must be carefully considered as it forms the bridge between the trained LLM and the agent that relies on its intelligence. The choice of serving architecture is not one-size-fits-all and depends heavily on the specific needs of the agent and the broader enterprise context. Cloud-hosted APIs Cloud-hosted APIs (such as those from OpenAI, Google's Vertex AI, Anthropic, and other providers) are a popular choice for many agentic systems. These services offer the significant advantages of managed infrastructure, meaning the complexities of hardware provisioning, scaling, and maintenance are handled by the provider. They typically provide access to state-of-the-art models, often the largest and most capable ones, without requiring direct investment in specialized hardware such as GPUs or TPUs. Many of these API offerings also include built-in monitoring, security features, and regular model updates. However, this convenience comes with potential trade-offs. Read Full Article Packt is hosting a free live session on DeerFlow, where key contributors will demo the popular open-source SuperAgent framework based on LangGraph. This event is designed for engineers, AI practitioners, product teams, and anyone exploring autonomous workflows or open-source agent systems. Register now and join us on May 6, from 9:00 to 10:30 AM EDT. Register Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

24 Apr 2026

6 min read

AI gets more useful and consequential

LLM Expert Insights, Packt

24 Apr 2026

6 min read

Better systems, bigger stakes AI_Distilled #134: What’s New in AI This Week The latest models are getting better at handling complex work with less input, and they are starting to behave in ways that feel closer to real collaborators than tools. At the same time, the stakes are becoming clearer. Developments like Anthropic’s Mythos are drawing attention from governments and financial institutions, while new models from across the industry are pushing on cost, speed, and capability. Companies are already adjusting how they work in response. It feels less like a steady upgrade to how AI fits into the world. LLM Expert Insights, Packt LATEST DEVELOPMENT 🤖 OpenAI launches GPT-5.5, pushing toward more capable agentic AI systems - OpenAI has introduced GPT-5.5, its most capable model to date, designed to handle complex, multi-step tasks with minimal guidance. The model shows strong gains in areas like coding, research, and tool use, with improved ability to plan, execute, and iterate across workflows while maintaining speed and efficiency. With enhanced safeguards and early enterprise deployment, the release signals a continued shift toward AI systems that act more like autonomous collaborators than passive tools. 🌍 Anthropic’s Mythos model turns AI into a geopolitical flashpoint- Anthropic’s Mythos model has triggered a global scramble among governments and central banks after demonstrating the ability to uncover critical vulnerabilities across financial systems and infrastructure. Access to the model is tightly controlled, with most countries excluded, turning it into a strategic asset and raising concerns about unequal visibility into emerging cyber risks. The episode highlights a deeper shift: as AI capabilities advance, they are starting to function less like product launches and more like geopolitical leverage points with real security implications. ⚙️ DeepSeek previews V4 model, reinforcing China’s push for low-cost AI leadership- Chinese AI startup DeepSeek has released a preview of its V4 model, building on the disruption caused by its earlier low-cost, high-performance systems. The new model emphasizes strong agent capabilities and lower inference costs, while remaining open-source and optimized for local deployment. With support for domestic chips and growing competition within China, V4 signals a broader shift toward AI sovereignty and cost-efficient alternatives to Western models 🎧 xAI launches Grok Voice Think Fast 1.0 for real-time enterprise voice agents- xAI has introduced Grok Voice Think Fast 1.0, a new voice model designed for real-time, multi-step workflows across customer support, sales, and enterprise applications. The model focuses on low-latency responses, accurate data capture, and reliable tool use in noisy, real-world environments, with early deployments already handling complex support and sales interactions at scale. The release highlights a growing shift toward AI agents that can operate autonomously in live, high-stakes conversations. 📉 Tech layoffs deepen as Meta and Microsoft double down on AI investments- Meta and Microsoft are cutting thousands of jobs while ramping up spending on AI, with Meta planning to reduce its workforce by around 10% and Microsoft offering voluntary exits to a significant portion of employees. Executives point to rising productivity from AI as a key factor, with some claiming that tasks once handled by large teams can now be completed by far fewer people. The moves highlight a growing shift: as companies invest heavily in AI infrastructure and capabilities, workforce structures are beginning to change alongside it. We’re thinking about launching something new If you have a minute, take our quick survey and tell us what you’d actually want to read. It’ll help us build something that’s genuinely worth your time. Take the survey 📈EXPERT INSIGHTS RAG-Driven Generative AI, Second Edition This week’s Expert Insight comes from the second edition of RAG-Driven Generative AI by Denis Rothman, a practitioner who has spent decades building AI systems in real-world enterprise settings. This edition focuses on how RAG is evolving from simple experiments into production-ready systems that work with enterprise data at scale. In this excerpt, Rothman breaks down the RAG ecosystem into its core parts and explains how they fit together. The RAG Ecosystem RAG-driven generative AI is a framework that can be implemented in many configurations. However, the RAG framework runs within a broad ecosystem, as shown in Figure 1.3. No matter how many retrieval and generation frameworks you encounter, it all boils down to the following four domains and the critical questions that accompany them: > Data: Where is the data coming from? Is it reliable? Is it sufficient? Crucially, in the MAS-RAG era, does the data stay within the secure corporate trust boundary? > Storage: How is the data going to be stored? In the traditional approach, data was fragmented between SQL databases and external vector stores. In the modern approach, we ask: Can we store vectors alongside business data in a single converged database? > Retrieval: How will the correct data be retrieved? Will we use simple keyword matching (Naïve) or integrated vector search (Advanced)? > Generation: How will the appropriate generative AI model be selected? How will we securely pipe the retrieved private data into the model? READ FULL ARTICLE View the latest HubSpot Developer Platform updates in Spring Spotlight See what's new for the HubSpot Developer Platform! Ship faster with AI coding tools like Cursor, Claude Code, and Codex. Build MCP-powered AI connectors, run serverless functions with support for UI extensions, and use date-based versioning to streamline roadmap planning. Learn more Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;display:none;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}.social_block .social-table{display:inline-block!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

20 Apr 2026

6 min read

The gap between AI progress and control is showing

LLM Expert Insights, Packt

20 Apr 2026

6 min read

As AI advances, the focus shifts to how we manage it. AI_Distilled #133: What’s New in AI This Week Otis: The World's First Cinematic AI Experience Forget generic chatbots. Otis is a wise elder on a cinematic porch at sunset, back turned, voice warm, ready to talk through whatever you're carrying. The world's first cinematic AI experience. April 21st on Kickstarter! Learn more This week, AI felt a little closer to the real world. Anthropic’s Mythos model has already pushed banks and governments into defensive mode, while newer, more controlled releases show how carefully these capabilities must now be handled. At the same time, AI is quietly becoming more useful across specific domains, from scientific research to the infrastructure that runs these systems. It’s a reminder that progress isn’t just about smarter models, but also about how safely and effectively we can use them. LLM Expert Insights, Packt LATEST DEVELOPMENT 🛑 Anthropic’s Mythos model raises global alarm over financial system vulnerabilities - A new AI model from Anthropic, dubbed Claude Mythos, has triggered concern among finance ministers and central bankers after demonstrating the ability to identify vulnerabilities across major operating systems, browsers, and financial infrastructure. The model has already prompted discussions at IMF meetings, with governments and banks being given early access to test and secure their systems before public release. Officials warn that while the technology could strengthen cybersecurity, it also lowers the barrier for malicious actors to exploit critical weaknesses at scale. 🛡️ Anthropic releases Claude Opus 4.7 with reduced cyber capabilities amid safety concerns - Anthropic has launched Claude Opus 4.7, a new model positioned as its most capable general-purpose release, but deliberately less powerful in cybersecurity tasks than its controversial Mythos model. The company says it has added safeguards to detect and block high-risk use cases, reflecting growing concerns about how advanced models could expose system vulnerabilities. The move signals a shift toward controlled deployment, as Anthropic tests how to safely scale models with capabilities that may otherwise pose systemic risks. 🧬OpenAI unveils GPT-Rosalind, a model built for life sciences research - OpenAI has introduced GPT-Rosalind, a domain-specific model designed to support scientific workflows across biology, drug discovery, and genomics. The model focuses on tasks such as hypothesis generation, literature synthesis, and experimental planning, aiming to accelerate early-stage research where timelines can stretch over a decade. Currently available as a research preview, GPT-Rosalind reflects a broader push toward specialized AI systems tailored to complex, real-world disciplines like life sciences. 🧪 OpenProtein aims to make AI-driven protein design accessible to biologists - OpenProtein.AI is building a no-code platform that gives researchers access to advanced protein-design models without requiring machine learning expertise. Founded by MIT researchers, the platform allows scientists to generate, test, and optimize protein sequences using AI, helping accelerate drug discovery and biological research. By lowering the barrier to entry, the company is aiming to bring cutting-edge AI tools directly into the hands of biologists and smaller labs. ☁️ Cloudflare launches unified AI inference layer to support multi-model agents - Cloudflare is positioning itself as a unified inference layer for AI agents, allowing developers to access 70+ models across multiple providers through a single API. The platform is designed to handle real-world agent workflows, where tasks are split across different models, while also managing latency, cost, and reliability. With features like automatic failover and centralized usage tracking, the move reflects a broader shift toward infrastructure that can orchestrate complex, multi-model AI systems at scale. We’re thinking about launching something new If you have two minutes, take our quick survey and tell us what you’d actually want to read. It’ll help us build something that’s genuinely worth your time. Take the 2-minute survey 📈EXPERT INSIGHTS Mastering NLP From Foundations to Agents This week’s Expert Insight comes from Mastering NLP From Foundations to Agents by Lior Gazit and Meysam Ghaffari, a guide that moves from core NLP principles to the realities of building and fine-tuning modern AI systems. As teams push to adapt large models for real-world use, the constraint is often no longer ideas, but resources: compute, memory, and cost. In this excerpt, Gazit and Ghaffari walk through Quantized LoRA (QLoRA), a technique that makes it possible to fine-tune large language models efficiently on limited hardware, without sacrificing performance. Understanding QLoRA QLoRA extends the idea of LoRA to enable fine-tuning of LLMs on a single GPU. The core idea is to keep the base model frozen and stored in 4-bit quantized precision, while training the LoRA adapters in higher precision (such as bfloat16 or float16). This achieves two goals simultaneously: >> Drastically reducing memory requirements >> Allowing the adapters to both compensate for quantization error and adapt the model to downstream tasks Let’s analyze how QLoRA compares to standard LoRA in practice, focusing on the trade-offs between memory reduction and model fidelity. We will specifically demonstrate how techniques such as NF4 (4-bit NormalFloat) and paged optimizers allow us to recover the quality of full fine-tuning while significantly lowering the barrier to entry for model adaptation. READ FULL ARTICLE Built something cool? Tell us. Whether it's a scrappy prototype or a production-grade agent, we want to hear how you're putting generative AI to work. Drop us your story at nimishad@packtpub.com or reply to this email, and you could get featured in an upcoming issue of AI_Distilled. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}}

0
0

AI Distilled

LLM Expert Insights, Packt

13 Mar 2026

4 min read

What Most Individuals Get Wrong About AI Agents

LLM Expert Insights, Packt

13 Mar 2026

4 min read

From BERT? AI Agents: The New AI System Stack AI_Distilled #132: What’s New in AI This Week If you look at most AI applications built today, something interesting stands out. Very few teams are training models from scratch. Instead, developers are building systems that orchestrate models, data, and tools together. Modern AI development has shifted from training models to designing AI systems. That shift is exactly what we explore in our Build AI Agents Over the Weekend, where developers learn how production AI systems are actually built. In a previous cohort, Lior Gazit (ML Group Manager at S&P Global) walked through the evolution that led to today’s AI agents. Here are three key insights from that session. Let’s start with the first major shift. LLM Expert Insights, Packt 1. The Shift: From Training Models → Prompting Models Before LLMs, most NLP systems followed a familiar pipeline: Collect data → Label data → Train model → Deploy model This required thousands of labeled examples and complex ML pipelines. Models like BERT introduced transfer learning, allowing developers to fine-tune pretrained models instead of training from scratch. But LLMs pushed the paradigm even further. Today, many tasks can be solved with prompting alone. For example, instead of training a classifier to detect whether a tweet reports an earthquake, a developer can simply prompt an LLM: “Here is a tweet. Tell me if it reports an earthquake.” This drastically reduces development time and removes the need for large labeled datasets. 2. When Prompting Isn't Enough: Retrieval-Augmented Generation (RAG) LLMs are powerful, but they have limits. You can't simply paste an entire knowledge base or legal document into a prompt. This is where Retrieval Augmented Generation (RAG) becomes essential. Instead of sending all documents to the model, a RAG system: • Retrieves the most relevant document chunks • Sends those to the LLM • Generates an answer grounded in that context This allows AI systems to work with large datasets and private knowledge bases without retraining models. 3. Why AI Agents Are Emerging Once developers realized LLMs could reason across tasks, a new architectural pattern emerged: Instead of relying on one model, systems now coordinate multiple specialized agents. For example: User request → Planning agent → Coding agent → QA agent → Final response Each agent focuses on a specific responsibility, allowing systems to tackle more complex workflows. This approach mirrors how human teams collaborate. Build AI Agents Over the Weekend These ideas are interesting in theory. But the real challenge is building these systems end-to-end. In our Workshop, developers build real production patterns, including: ✔ Retrieval-augmented generation systems ✔ Multi-agent workflows ✔ LLM routing strategies ✔ Monitoring and tracing pipelines The next cohort starts tomorrow in less than 24 hrs - 14th March 2026 If you're building AI applications today, this workshop is designed to help you move from LLM experimentation to production systems. SAVE YOUR SEAT Workshop Goes Live Tomorrow - Today is the Last Chance to Book Your Seat! What’s the biggest challenge you're facing when building AI systems today? Reply and let us know, we read every response. 📢 If your company is interested in reaching an audience of developers and, technical professionals, and decision makers, you may want toadvertise with us. If you have any comments or feedback, just reply back to this email. Thanks for reading and have a great day! That’s a wrap for this week’s edition of AI_Distilled 🧠⚙️ We would love to know what you thought—your feedback helps us keep leveling up. 👉 Drop your rating here Thanks for reading, The AI_Distilled Team (Curated by humans. Powered by curiosity.) *{box-sizing:border-box}body{margin:0;padding:0}a[x-apple-data-detectors]{color:inherit!important;text-decoration:inherit!important}#MessageViewBody a{color:inherit;text-decoration:none}p{line-height:inherit}.desktop_hide,.desktop_hide table{mso-hide:all;display:none;max-height:0;overflow:hidden}.image_block img+div{display:none}sub,sup{font-size:75%;line-height:0} @media (max-width: 100%;display:block}.mobile_hide{min-height:0;max-height:0;max-width: 100%;overflow:hidden;font-size:0}.desktop_hide,.desktop_hide table{display:table!important;max-height:none!important}}

0
0

AI Distilled

AI is moving beyond prompts.

GraphRAG is changing enterprise AI. Here's why.

What comes after today’s LLMs?

Special Issue: Rick Spencer on the AI metrics that actually matter

AI is entering its industrial era

The next AI bottleneck isn’t intelligence

AI is becoming infrastructure

You asked for less hype. Here it is.

Not all agents are created equal

Where should AI Distilled go next?

The more AI thinks for us, the more architecture matters

AI agents are running the show

AI gets more useful and consequential

The gap between AI progress and control is showing

What Most Individuals Get Wrong About AI Agents

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access