They generate plausible code or queries, but often without validating against actual structures or edge cases. This is where prompt engineering comes in, and it doesn’t just mean better phrasing: it means translating context into constraints the model can work with. Otherwise, you're just as likely to get broken logic as usable code.
Providing that structure up front significantly improves the accuracy of the output, and that kind of specificity is critical in data projects.
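For example, a prompt that spells out the schema, types, and edge cases gives the model something concrete to respect. Here's a minimal sketch of what that might look like; the table, columns, and rules are hypothetical:

```python
# A hypothetical example of translating context into explicit constraints.
# The table name, columns, and rules below are illustrative, not from a real project.
prompt = """
You are helping write a PySpark transformation.

Schema (table: orders):
- order_id: string, never null
- order_ts: timestamp, stored in UTC
- amount: decimal(10,2), may be negative for refunds

Constraints:
- Do not drop rows with negative amounts; flag them instead.
- Output column names must be snake_case.
- Handle days with zero orders explicitly.

Task: aggregate net revenue per day, counting refunds separately.
"""
```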
Then there’s precision. You could be using GenAI to craft transformation pipelines or write code. The problem is that generative AI often hallucinates: it confidently suggests syntax, libraries, or functions that don’t behave as described, or sometimes don’t exist at all. This is especially risky when you're deploying to production or relying on subtle transformations that affect business-critical logic.
That’s why you still need to vet the output carefully. Check the generated code against official documentation, test it in a sandbox, and validate the assumptions it's making. Even better, turn the AI into a research assistant. Ask it to cite its sources, link to relevant docs, or summarize the best practices from trusted repositories. Perhaps even ask the LLM to explain the rationale behind the code it generates. This not only helps you understand what it's trying to do, but also gives you a chance to spot gaps in its logic or mismatches with your data context before integrating anything into your pipeline.
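In practice, that sandbox check can be as simple as running the suggested snippet on a tiny, hand-built sample where you already know the right answer. A rough sketch, assuming the AI proposed a pandas deduplication step (the column names are made up):

```python
import pandas as pd

# Tiny hand-built sample with a known expected result, used to
# sanity-check an AI-suggested transformation before it touches real data.
sample = pd.DataFrame({
    "order_id": ["a1", "a1", "b2"],
    "amount": [10.0, 10.0, 5.0],
})

# AI-suggested step: drop exact duplicate orders.
deduped = sample.drop_duplicates(subset=["order_id", "amount"])

# Make the assumption explicit instead of eyeballing the output.
assert len(deduped) == 2, "Expected exactly one duplicate row to be removed"
```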
They’re also stateless. Most models can’t track your session context or versioned data logic across interactions, and unless you prompt carefully, they’ll forget key constraints or project-specific naming conventions. A workaround for statelessness is maintaining a session summary: a running list of decisions, assumptions, and outputs that you paste into each new prompt to keep the model aligned.
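A session summary doesn’t need to be elaborate. Something like the sketch below, pasted at the top of each new prompt, is usually enough; the project details are invented for illustration:

```python
# A hypothetical running session summary, reused across prompts.
# Every decision and naming convention below is invented for illustration.
session_summary = """
Project context (carry this forward in every prompt):
- Warehouse: BigQuery; all timestamps stored in UTC.
- Naming: staging tables prefixed stg_, marts prefixed fct_ / dim_.
- Decision: refunds are kept as negative amounts, never filtered out.
- Open assumption: order_id is unique per source system, not globally.
"""
```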
Until LLMs gain persistent memory or better long-context performance, the burden of context management is on you. Being explicit pays off.
Finally, there’s trust. In data engineering, pipelines break when assumptions are wrong. You can’t just eyeball AI output; you need test coverage, validation, and deployment-aware thinking that these tools can’t yet offer. To work around this, treat any AI-generated code or config as a first draft, not production-ready logic, and always assume it's incomplete. Build unit tests to check how the generated code actually behaves. In addition, consider working in a virtual environment when testing AI-suggested code: it lets you safely install and trial new dependencies without affecting your core environment or other projects.
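As a minimal sketch, a pytest file like the one below can pin down the assumptions you care about before any AI-generated transformation goes near production; the `normalize_amounts` function and its module path are hypothetical stand-ins for whatever the model produced:

```python
# test_ai_suggested.py: a minimal sketch of unit-testing AI-generated logic.
import pandas as pd
import pytest

# Hypothetical module containing the AI-generated function under test.
from pipeline.transforms import normalize_amounts


def test_refunds_are_preserved():
    df = pd.DataFrame({"amount": [10.0, -2.5, 0.0]})
    result = normalize_amounts(df)
    # Business-critical assumption: refunds (negative amounts) must not be dropped.
    assert (result["amount"] < 0).any()


def test_missing_amount_column_raises():
    # The generated code should fail loudly, not silently, on bad input.
    with pytest.raises(KeyError):
        normalize_amounts(pd.DataFrame({"order_id": ["a1"]}))
```

Running these inside a fresh environment (for example, one created with `python -m venv`) keeps any dependencies the AI suggests isolated from the rest of your projects.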
Used well, generative AI can accelerate boilerplate, improve documentation, and even suggest alternatives. But it’s not a drop-in replacement for domain knowledge, testing discipline, or production-readiness.
Where Does AI Fail You?
Generative AI is everywhere, and it’s not perfect. If you’ve ever been frustrated by code hallucinations or vague answers, or found that AI has a knowledge gap when it comes to your industry, we want to know. Help us map the real-world gaps in AI adoption by sharing your experience. We’ll publish the results (anonymized) in an upcoming issue on prompt engineering for data professionals.