From multimodal LLMs to self-thinking agents, see what's driving AI's next leap.

👋 Hello, welcome to DataPro #155 ➖ Where Models Get Smarter, Agents Get Autonomous, and AI Gets Real-Time.

This week's edition explores the frontier of intelligent systems that see, reason, and act. From Meituan's LongCat Flash Omni and DeepAgent's unified reasoning to OpenAI's gpt-oss-safeguard and SkyRL tx, AI is rapidly evolving toward autonomy, speed, and safety. We also look at how multimodal RAG, ethical AI, and data mesh are redefining how we build and scale intelligence.

Knowledge Partner Spotlight: Outskill

At Packt, we've partnered with Outskill to help readers gain practical exposure to AI tools through free workshops, complementing the deeper, hands-on, expert-led experiences offered through Packt Virtual Conferences.

If you're interested in enhancing your AI skills, Outskill's LIVE 2-Day AI Mastermind offers 16 hours of training on AI tools, automations, and agent building. This weekend's sessions (Saturday and Sunday, 10 AM–7 PM EST) are available at no cost as part of their Black Friday Sale, a great opportunity to elevate your knowledge in just two days.

Learn AI tools, agents & automations in just 16 hours.
Join now, limited free seats available!

This week's highlights:

🔸 LongCat Flash Omni: Meituan's open 560B multimodal model for real-time interaction
🔸 DeepAgent: A unified reasoning agent that thinks, searches, and acts autonomously
🔸 SkyRL tx v0.1.0: A Tinker-style reinforcement learning engine for local clusters
🔸 OpenAI gpt-oss-safeguard: Policy-conditioned safety reasoning models, open-weight and Apache 2.0
🔸 Does AI Need to Be Conscious to Care? Exploring the philosophy of artificial moral concern
🔸 Building Multimodal RAG: How to make retrieval truly visual and contextual
🔸 Covestro x Amazon DataZone: A blueprint for scaling data governance through data mesh

Each story in this issue unpacks a new layer in how AI learns, governs, and grows, so grab a coffee, settle in, and let's dive into the full roundup.

Cheers,
Merlyn Shelley
Growth Lead, Packt

Sponsored:

🔸 82% of data breaches happen in the cloud. Join Rubrik's Cloud Resilience Summit to learn how to recover faster and keep your business running strong. [Save Your Spot]
🔸 Build your next app on HubSpot's all-new Developer Platform, the flexible, AI-ready foundation to create, extend, and scale your integrations with confidence. [Start Building Today]

Subscribe | Submit a tip | Advertise with Us

Top Tools Driving New Research 🔧📊

🔶 LongCat-Flash-Omni: A SOTA Open-Source Omni-Modal Model with 560B Parameters (27B Activated), Excelling at Real-Time Audio-Visual Interaction. Meituan's LongCat Flash Omni is a 560B-parameter open-source multimodal model that activates 27B per token using shortcut-connected MoE. It extends text LLMs to vision, video, and audio with 128K context and real-time streaming through 1-second audio-visual interleaving and duration-conditioned sampling at 2 fps. With modality-decoupled parallelism, it retains 90% of text-only throughput and scores 61.4 on OmniBench, 78.2 on VideoMME, and 88.7 on VoiceBench, nearing GPT-4o performance.

🔶 DeepAgent: A Deep Reasoning AI Agent that Performs Autonomous Thinking, Tool Discovery, and Action Execution within a Single Reasoning Process. Most agent frameworks still follow a fixed Reason–Act–Observe loop, but DeepAgent from Renmin University and Xiaohongshu redefines this with end-to-end deep reasoning. Built on a 32B QwQ backbone, it unifies thought, tool search, tool call, and memory folding within one stream. It dynamically retrieves tools from 16K+ APIs, compresses long histories into structured memories, and trains via Tool Policy Optimization (ToolPO) for precise tool use. DeepAgent achieves 69.0 on ToolBench and 91.8% success on ALFWorld, outperforming ReAct-style workflows in both labeled and open tool settings.
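For readers curious what a single-stream, tool-using reasoning loop looks like in practice, here is a minimal, hypothetical sketch. The class and function names (StubLLM, fold_memory, the toy tool catalogue) are illustrative assumptions, not DeepAgent's released code.

```python
# A hypothetical, minimal sketch of a DeepAgent-style loop: one reasoning
# stream that interleaves thinking, tool retrieval, tool calls, and memory
# folding. The LLM and tool index are stubbed out; names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Step:
    kind: str                 # "search" | "call" | "answer"
    payload: dict = field(default_factory=dict)

class StubLLM:
    """Stands in for the reasoning backbone; returns a scripted plan."""
    def __init__(self):
        self.script = [
            Step("search", {"query": "weather api"}),
            Step("call", {"tool": "get_weather", "args": {"city": "Berlin"}}),
            Step("answer", {"text": "It is 12°C in Berlin."}),
        ]
    def generate(self, context: str, memory: list) -> Step:
        return self.script.pop(0)

TOOL_CATALOGUE = {"get_weather": lambda args: {"temp_c": 12}}   # toy tool index

def fold_memory(context: str) -> str:
    """Compress a long interaction history into a short structured note."""
    return f"[folded: {len(context)} chars of earlier steps]"

def run_agent(task: str, llm, max_steps: int = 10) -> str:
    context, memory = f"Task: {task}", []
    for _ in range(max_steps):
        step = llm.generate(context, memory)
        if step.kind == "search":
            # retrieval over a large API catalogue (16K+ tools in the paper)
            hits = [n for n in TOOL_CATALOGUE if step.payload["query"].split()[0] in n]
            context += f"\nRetrieved tools: {hits}"
        elif step.kind == "call":
            result = TOOL_CATALOGUE[step.payload["tool"]](step.payload["args"])
            context += f"\nTool result: {result}"
        elif step.kind == "answer":
            return step.payload["text"]
        if len(context) > 2000:            # fold long histories instead of truncating
            memory.append(fold_memory(context))
            context = f"Task: {task}"
    return "No answer within the step budget."

print(run_agent("What is the weather in Berlin?", StubLLM()))
```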
🔶 Anyscale and NovaSky Team Releases SkyRL tx v0.1.0: Bringing a Tinker-Compatible Reinforcement Learning (RL) Engine to Local GPU Clusters. Anyscale and UC Berkeley's NovaSky team released SkyRL tx v0.1.0, a local, Tinker-compatible engine that unifies training and inference for LLM reinforcement learning. It implements Tinker's low-level API (forward_backward, optim_step, sample, save_state) and runs on user infrastructure. The update adds end-to-end RL, jitted sharded sampling, LoRA adapter support, gradient checkpointing, micro-batching, and Postgres integration, enabling full RL training on 8×H100 GPUs with Tinker-level efficiency and open deployment.

🔶 OpenAI Releases Research Preview of 'gpt-oss-safeguard': Two Open-Weight Reasoning Models for Safety Classification Tasks. OpenAI released gpt-oss-safeguard, two open-weight safety reasoning models at 120B and 20B parameters that let developers enforce custom safety policies at inference time. Fine-tuned from gpt-oss and Apache 2.0 licensed, they replicate OpenAI's internal Safety Reasoner used in GPT-5 and Sora 2. The models reason step by step over developer-supplied policies, outperform gpt-5-thinking on multi-policy accuracy, and fit on single-GPU setups for real moderation pipelines.

Topics Catching Fire in Data Circles 🔥💬

🔶 Does AI Need to Be Conscious to Care? This philosophical study explores that question through a precise framework. It distinguishes functional, experiential, and moral caring, showing that caring behaviors can exist without consciousness, as seen in bacteria, plants, and immune systems. While current AI systems display goal-directed, welfare-promoting behavior, they lack genuine concern. Consciousness-based and agency-based routes could both lead to artificial moral concern, suggesting caring exists on a spectrum. Future AI may combine conscious experience with robust agency, raising urgent ethical questions about artificial moral significance.

🔶 Building a Multimodal RAG That Responds with Text, Images, and Tables from Sources. Retrieval-Augmented Generation (RAG) has long powered text-based chatbots, but extending it to images, tables, and graphs is far harder. Real documents, like research papers and corporate reports, mix text, formulas, and figures without consistent formatting, breaking the link between visuals and context. To fix this, a new multimodal RAG pipeline introduces context-aware image summaries that use nearby text instead of isolated captions, and text-response-guided image selection, where visuals are chosen after the textual answer is generated. Together, these steps yield consistent, contextually grounded multimodal retrieval across complex documents.
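To make those two ideas concrete, here is a minimal, hypothetical sketch of a multimodal retrieval step, assuming a toy embedding function and a pre-generated text answer; it is not the article's actual pipeline.

```python
# Sketch of (1) context-aware image summaries built from nearby text rather
# than captions, and (2) text-response-guided image selection after the
# textual answer exists. embed() and the document chunks are illustrative.
import math

def embed(text: str) -> list[float]:
    # Toy embedding: normalized character-frequency vector (stand-in for a real encoder).
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

def summarize_image(image_id: str, nearby_text: str) -> str:
    # (1) Context-aware summary: describe the figure using surrounding prose,
    # which is what ties "Figure 3" back to the concept it illustrates.
    return f"{image_id}: figure discussed in context: {nearby_text[:120]}"

def answer_with_images(question: str, chunks: list[dict], llm_answer: str):
    summaries = [
        (c["image_id"], summarize_image(c["image_id"], c["nearby_text"]))
        for c in chunks if c.get("image_id")
    ]
    # (2) Select visuals by similarity to the *generated answer*, not the question.
    answer_vec = embed(llm_answer)
    ranked = sorted(summaries, key=lambda s: cosine(embed(s[1]), answer_vec), reverse=True)
    return llm_answer, [img for img, _ in ranked[:2]]

chunks = [
    {"image_id": "fig_3", "nearby_text": "Figure 3 shows quarterly revenue growth by region."},
    {"image_id": "fig_7", "nearby_text": "Figure 7 depicts the molecular structure of the catalyst."},
]
text_answer = "Revenue grew fastest in the APAC region during Q3."
print(answer_with_images("How did revenue change?", chunks, text_answer))
```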
🔶 From Classical Models to AI: Forecasting Humidity for Energy and Water Efficiency in Data Centers. This blog explores how accurate humidity forecasting can improve the efficiency, reliability, and sustainability of AI data centers. It explains how temperature and humidity directly affect cooling systems, energy use, and water consumption, and presents a real-world case study using Delhi's climate data. The post compares forecasting methods (AutoARIMA, Prophet, XGBoost, and deep learning) using prediction intervals to assess accuracy and uncertainty, aiming to identify the best tools for operational planning and environmental optimization in large-scale AI infrastructure.

🔶 Scaling data governance with Amazon DataZone: Covestro success story. This blog explores how Covestro Deutschland AG reengineered its global data architecture by transitioning from a centralized data lake to a domain-driven data mesh using Amazon DataZone and the AWS Serverless Data Lake Framework (SDLF). The transformation empowered teams to manage data products independently while maintaining consistent governance, improving data sharing and visibility. Through AWS Glue, S3, and automated data quality checks, Covestro now operates over 1,000 standardized data pipelines, achieving faster delivery, stronger governance, and scalable analytics across the enterprise.

New Case Studies from the Tech Titans 🚀💡

🔶 How to design conversational AI agents? This blog explores how conversational AI is transforming the online shopping experience by replacing rigid keyword-based search with natural, intuitive interactions. It outlines seven key design principles for creating AI shopping agents that understand user intent, personalize recommendations, support multimodal input, and present rich visuals. The post also highlights best practices for building user trust, handling ambiguity gracefully, and leveraging Google Cloud's Conversational Commerce tools and Figma's component library to design adaptable, on-brand, and intelligent shopping experiences.

🔶 How 5 agencies created an impossible ad with Gemini 2.5 Pro. Generative AI is rewriting the rules of creativity. With Gemini 2.5 Pro and Google's suite of generative media models (Imagen, Veo, Lyria, and Chirp), brands are moving beyond traditional campaigns to design what was once impossible. From Slice's AI-powered retro radio station and Virgin Voyages' personalized "postcards from your future self," to Smirnoff's interactive party co-host and Moncler's cinematic AI film, these projects show how imagination and technology now merge to create entirely new forms of storytelling and brand expression.

🔶 Build intelligent ETL pipelines using AWS Model Context Protocol and Amazon Q: Building and maintaining ETL pipelines has long been one of the most time-consuming parts of data engineering. With conversational AI and Model Context Protocol (MCP) servers, teams can now automate much of that process, turning complex scripting into guided, natural-language interactions. By integrating with AWS services like Redshift, S3 Tables, and Glue, organizations can generate, test, and deploy pipelines faster while preserving security and governance standards. This post demonstrates how data scientists and engineers can use conversational AI to extract data, validate quality, and automate end-to-end migrations from Redshift to S3, reducing manual effort, improving accuracy, and accelerating insight generation.

🔶 Amazon Kinesis Data Streams launches On-demand Advantage for instant throughput increases and streaming at scale: Managing real-time data streams just became simpler and more cost-efficient with the launch of Amazon Kinesis Data Streams On-demand Advantage mode. This new capability introduces warm throughput for instant scalability during traffic spikes and a committed-usage pricing model that significantly lowers costs for steady, high-volume workloads. Designed for use cases ingesting at least 10 MiB/s or operating hundreds of streams per region, it eliminates the need to manually switch between capacity modes. The post explains how On-demand Advantage helps organizations handle predictable surges, optimize costs, and configure warm throughput up to 10 GiB/s, along with setup steps, pricing details, and best practices for maintaining high-performance streaming pipelines.
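For teams getting started, here is a brief boto3 sketch of creating and switching Kinesis streams to on-demand capacity mode, the baseline the new Advantage offering builds on. The stream names are placeholders, and the Advantage-specific warm-throughput settings are intentionally omitted rather than guessed.

```python
# Working with Kinesis Data Streams in on-demand capacity mode via boto3.
# Stream names and region are placeholders; Advantage-mode warm-throughput
# configuration is not shown here.
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

# Create a new stream in on-demand mode (no shard count to manage).
kinesis.create_stream(
    StreamName="clickstream-events",
    StreamModeDetails={"StreamMode": "ON_DEMAND"},
)

# Or switch an existing provisioned stream to on-demand capacity.
summary = kinesis.describe_stream_summary(StreamName="orders")["StreamDescriptionSummary"]
if summary["StreamModeDetails"]["StreamMode"] == "PROVISIONED":
    kinesis.update_stream_mode(
        StreamARN=summary["StreamARN"],
        StreamModeDetails={"StreamMode": "ON_DEMAND"},
    )
```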
Blog Pulse: What's Moving Minds 🧠✨

🔶 The Pearson Correlation Coefficient, Explained Simply: Understanding how variables move together is the foundation of predictive modeling. In this walkthrough, we explore how to calculate and interpret the Pearson correlation coefficient, a key step before fitting a regression model. Using a simple salary dataset with Years of Experience and Salary, the post explains how to visualize relationships with scatter plots, compute variance, covariance, and standard deviation, and finally derive the correlation coefficient. With a result of r = 0.9265, the example shows a strong positive linear relationship, confirming that simple linear regression is well suited for predicting salary based on experience.
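For a hands-on feel, here is a short NumPy sketch of the same calculation on a made-up salary dataset; because the numbers are invented, the resulting r differs from the article's 0.9265.

```python
# Pearson correlation computed two ways on a small, invented salary dataset.
import numpy as np

years = np.array([1, 2, 3, 5, 7, 10], dtype=float)
salary = np.array([45, 50, 60, 80, 95, 130], dtype=float)   # in thousands

# r = cov(x, y) / (std(x) * std(y))
cov_xy = np.mean((years - years.mean()) * (salary - salary.mean()))
r_manual = cov_xy / (years.std() * salary.std())

# Cross-check against NumPy's built-in correlation matrix.
r_builtin = np.corrcoef(years, salary)[0, 1]

print(round(r_manual, 4), round(r_builtin, 4))   # both ≈ 0.997 for this toy data
```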
🔶 Graph RAG vs SQL RAG: Comparing how large language models reason over structured and connected data reveals valuable insights into retrieval-augmented systems. In this experiment, a Formula 1 results dataset was stored in both a SQL database and a graph database, then queried using retrieval-augmented generation (RAG) with models like GPT-3.5, GPT-4, and GPT-5. Each model translated natural language into SQL or graph queries to answer questions about drivers, races, and championships. The results show that newer models like GPT-5 achieved near-perfect accuracy across both databases, while simpler models struggled more with graph data. The study concludes that RAG-equipped LLMs can reason reliably over either database type, letting teams choose whichever structure best fits their data without sacrificing performance.
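As a rough illustration of the comparison, here is a toy example: the same question answered over a small invented relational schema with SQLite, alongside the kind of Cypher a graph RAG pipeline might generate (shown as a string only). Neither the schema nor the data comes from the article's Formula 1 dataset.

```python
# Toy SQL-vs-graph comparison for one natural-language question:
# "Which driver won the most races?"
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE drivers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE results (race TEXT, driver_id INTEGER, position INTEGER);
    INSERT INTO drivers VALUES (1, 'Hamilton'), (2, 'Verstappen');
    INSERT INTO results VALUES
        ('Monza', 1, 1), ('Spa', 2, 1), ('Suzuka', 2, 1), ('Monaco', 1, 2);
""")

# SQL that an LLM might produce for the question in a SQL RAG pipeline.
sql = """
    SELECT d.name, COUNT(*) AS wins
    FROM results r JOIN drivers d ON d.id = r.driver_id
    WHERE r.position = 1
    GROUP BY d.name ORDER BY wins DESC LIMIT 1;
"""
print(conn.execute(sql).fetchone())       # ('Verstappen', 2)

# The Cypher a graph RAG pipeline might generate for the same question
# (shown only as a string; running it would require a graph database).
cypher = """
    MATCH (d:Driver)-[:WON]->(r:Race)
    RETURN d.name, count(r) AS wins
    ORDER BY wins DESC LIMIT 1
"""
```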
🔶 RF-DETR Under the Hood: The Insights of a Real-Time Transformer Detection. Object detection has come a long way from rigid anchor grids to adaptive Transformer architectures. RF-DETR, Roboflow's latest real-time detection model, embodies that evolution. Building on DETR's end-to-end design, Deformable DETR's adaptive attention, and LW-DETR's lightweight efficiency, RF-DETR fuses these innovations with a DINOv2 self-supervised backbone for domain adaptability and speed. The result is a model that achieves real-time performance without sacrificing accuracy, capable of both bounding box detection and segmentation. In essence, RF-DETR showcases how adaptive attention and self-supervised vision have made Transformers fast, flexible, and production-ready for modern computer vision tasks.

🔶 Building secure Amazon ElastiCache for Valkey deployments with Terraform. Managing infrastructure through code is becoming essential for secure, scalable cloud deployments. Using Infrastructure as Code (IaC) with Terraform, this guide walks through building a secure Amazon ElastiCache for Valkey cluster, covering both serverless and node-based options. It demonstrates how IaC ensures consistent configurations for encryption, authentication, and network isolation across environments. The walkthrough details step-by-step deployment, from provisioning private subnets and KMS-encrypted storage to implementing token-based authentication and CloudWatch logging. The result is a reproducible, production-grade ElastiCache setup that combines automation, security, and cost efficiency through a modern Terraform workflow.

See you next time!