Data Leadership

From “AI-Ready” to AI Reality: Why Actionable Data Strategies Beat Endless Planning

Shane Murray on AI-ready data, the truth about RAG, and why building beats planning for trustworthy AI

Listen now on YouTube | Spotify | Apple Podcasts

The Data Faces Podcast with Shane Murray, Field CTO at Monte Carlo Data

The AI imperative and its data foundation

The current AI craze creates significant opportunities for many businesses. However, the success of AI applications depends on reliable, quality data. Many organizations now face the challenge of moving beyond tire-kicking to build trustworthy AI data products. A foundational question emerges: what does “AI-ready” truly mean? While the term is widely used, a clear consensus on its definition can be elusive.

On the Data Faces Podcast, I recently sat down with Shane Murray of Monte Carlo Data to explore what “AI-ready” data means in practice, how to cultivate trust in generative AI, and what effective governance involves when the stakes are higher than ever.

About Shane Murray 

Shane Murray is the field CTO at Monte Carlo Data. He previously spent nearly a decade at The New York Times, where he led a data organization of about 160 people covering data platforms, engineering, machine learning, data science, and embedded analytics. He also established a data product function there. Shane’s journey in data began at a multivariate testing startup in Sydney, which was later acquired by Accenture. This move brought him from a 30-person startup to a 250,000-person consulting company and eventually to the US. His current work at Monte Carlo involves research and development focused on how teams build with generative AI and how data observability can support them.

In our conversation, we discuss:

  • What it means to build AI-ready data products.
  • How to think about trust with generative AI.
  • What governance looks like when data reliability is paramount.
  • Practical steps for data teams starting their AI journey.

Deconstructing “AI-ready”: from slippery term to actionable strategy

The term “AI-ready” shows up a lot in discussions about data, trust, and governance these days. But what does it actually mean when the rubber meets the road? Shane Murray shared that it can be a “slippery term” where “a lot of people said it, and everyone kind of nods, but doesn’t have a super clear definition”. He sees organizations tackling AI readiness in a couple of main ways, depending on what they’re trying to achieve. These paths are not mutually exclusive; rather, they represent different facets of AI readiness that business leaders should consider.

“I’ve seen the term [AI-ready] now pretty much everywhere, and I feel like it can be one of those kind of slippery terms… where a lot of people said it, and everyone kind of nods, but doesn’t have a super clear definition.” –Shane Murray, Field CTO at Monte Carlo Data

Path 1: AI-ready for internal analytics and conversational BI 

Some data teams are focused on getting their internal analytics and business intelligence AI-ready. This means preparing data for new conversational BI initiatives, such as enabling tools like Snowflake’s Cortex Analyst or various AI/BI solutions in Databricks. The main jobs here are:

  • Ensuring high-quality data is available as an essential starting point.
  • Contextualizing that data with good metadata and perhaps provenance so the AI can reliably query it and really understand what it means.

For business leaders, a key question for your teams here is: How are we ensuring our foundational data quality and contextual metadata are sufficient for reliable AI-driven analytics and the emerging conversational BI tools?

Path 2: AI-ready for building new applications 

Other organizations are concentrating on how to get ready for building brand-new AI applications directly on their data. This means preparing and storing unstructured data with the same rigor around accuracy and completeness that teams already apply to structured data. One of the most important insights Shane offers is that data often does not become truly “ready” until teams start actively using it in AI systems.

“The truth I’ve found is that that data doesn’t really become ready until you start using it in AI.” Shane Murray, Field CTO at Monte Carlo Data

He’s found that the teams making real headway are those that deploy prototype and production AI products. They learn firsthand where these products break, where bias creeps in, or what level of data reliability is actually good enough for those specific applications.

Business leaders can foster this by championing a culture of experimentation. Encourage pilot projects for new AI applications, with the understanding that true data readiness and limitations are often best discovered through these initial deployments. You should ask: What strategic investments and resources do our teams need to effectively prepare and manage our unstructured data for these innovative applications?

Data quality reimagined: navigating unstructured data for AI

AI, and especially generative AI, relies on unstructured data sources like text, video, audio, and images. This means that traditional thinking about data quality, which often centers on numbers neatly arranged in rows and columns, needs to evolve. As many of us know, when people say “data quality,” the mind often goes to numbers and structure, not necessarily to text, video, and audio. For leaders, understanding this broader view of data quality is vital for setting realistic expectations and allocating resources appropriately for AI projects that use diverse data types.

Shane Murray anchors his view in a core principle: he has “always held the position that data quality is contextual to the use case”. For example, financial use cases might demand data accurate to the penny against a general ledger, requiring daily verification. In contrast, past ML applications sometimes had fuzzier accuracy requirements but needed high availability at sub-second latency. Current AI applications often need a mix of both high quality and high availability.

“I guess I’ve always held the position that data quality is contextual to the use case.” Shane Murray, Field CTO at Monte Carlo Data

When considering unstructured data, some of the same high-level objectives for quality apply. Shane mentions that unstructured data needs to be “relevant, it needs to be complete, it needs to be fresh… it needs to be consistent”. However, defining and measuring these attributes can be more complex. Often, teams must perform some structuring of the unstructured data to obtain a metric they can monitor.

“At a high level, it’s some of the same things we talk about with structured data… like, it needs to be relevant, it needs to be complete, it needs to be fresh, right? It needs to be consistent.” –Shane Murray, Field CTO at Monte Carlo Data

Business leaders should direct their data strategists to explicitly define and address the quality requirements for unstructured data, recognizing these may differ significantly from traditional metrics for structured data. In strategy reviews, question your technical leaders: How are we defining and measuring the relevance, completeness, freshness, and consistency of the unstructured data fueling our key AI initiatives? What are the business risks if these are not met?

Practical assessment techniques 

One common approach involves creating vector embeddings from unstructured data such as images or text. These mathematical representations let teams compute distance metrics between the context drawn from unstructured data and the questions users ask, or between that context and the AI’s responses, helping confirm that responses are grounded.
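
To make this concrete, here is a minimal sketch of such a distance check, assuming the open-source sentence-transformers library: it embeds a piece of retrieved context and a model response, then compares them with cosine similarity. The model choice, threshold, and example strings are illustrative, not a prescribed standard.

```python
# Minimal sketch: compare an AI response against its source context with embeddings.
# Assumes the sentence-transformers library; model name and threshold are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model could be used

def groundedness_score(context: str, response: str) -> float:
    """Cosine similarity between the retrieved context and the model's response."""
    vectors = model.encode([context, response], convert_to_tensor=True)
    return util.cos_sim(vectors[0], vectors[1]).item()

context = "Refunds are available within 30 days of purchase with a valid receipt."
response = "You can return the item within 30 days if you still have the receipt."
score = groundedness_score(context, response)

# The cutoff is a team-specific choice, typically tuned against labeled examples.
if score < 0.5:
    print(f"Possible ungrounded response (similarity={score:.2f}) - route for review")
else:
    print(f"Response appears grounded in its context (similarity={score:.2f})")
```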

Even before employing complex embeddings, simpler methods can offer value. Shane sees teams taking unstructured data and using modern data warehouse capabilities, including LLMs, to extract topics and understand their frequency of occurrence. For many teams new to using these unstructured data sources, “even just getting an understanding of the topic distribution and prevalence is kind of this necessary profiling to know that their knowledge base is going to be up to date”.
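
As a rough illustration of that kind of profiling, here is a minimal sketch that estimates topic distribution and prevalence over a handful of documents using scikit-learn topic modeling; a warehouse-native LLM function (as in Snowflake or Databricks) could play the same role. The corpus and topic count are illustrative assumptions.

```python
# Minimal sketch: profile topic distribution and prevalence in an unstructured knowledge base.
# Uses scikit-learn topic modeling; documents and topic count are illustrative stand-ins.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Stand-in for help-center articles pulled from a data lake.
docs = [
    "How to reset your password and recover your account",
    "Troubleshooting login errors and two-factor authentication",
    "Steps to cancel a subscription and request a refund",
    "Refund policy for annual plans and billing disputes",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(doc_term)  # per-document topic weights

# Prevalence: the share of the corpus each topic accounts for.
terms = vectorizer.get_feature_names_out()
for idx, share in enumerate(doc_topics.mean(axis=0)):
    top_terms = [terms[i] for i in lda.components_[idx].argsort()[::-1][:3]]
    print(f"Topic {idx}: ~{share:.0%} of corpus, top terms: {', '.join(top_terms)}")
```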

Support necessary investments in technologies and skills (e.g., for vector embeddings, topic modeling) that enable your teams to effectively assess and enhance the quality of critical unstructured data assets.

The trust gauntlet: common roadblocks in AI data reliability

Building trust in AI applications means maneuvering around several significant roadblocks, which can pose real business risks to reputation, operational efficiency, and sound decision-making. While LLM hallucinations, or AI confabulations, often grab headlines, they can be just the tip of the proverbial iceberg. These errors frequently mask deeper issues with an AI agent’s effectiveness or its foundational data. Shane Murray points out that the “thing that gets most publicized is the hallucinations”, like a chatbot selling a car for one dollar or a customer success bot going rogue due to a spate of cancellations. Business leaders should challenge their teams to perform root cause analysis for such AI errors, looking beyond surface-level issues to underlying data integrity or design flaws.

“Oftentimes, that hallucination… is often masking issues that are affecting the effectiveness of the AI agent.” Shane Murray, Field CTO at Monte Carlo Data

Beyond these publicized errors, several persistent challenges threaten the trustworthiness of AI applications.

Persistent challenges threatening trust

  • “Garbage in, garbage out”: This age-old data adage remains a primary concern. Shane notes it as “one of the most common examples”, where outdated or incomplete source data is a significant culprit. Many teams are, for the first time, using unstructured knowledge-base data that might have sat in a data lake or was never previously ingested. They might find that business logic or help pages “haven’t been updated in, you know, two years”. If models lack a complete knowledge base, they tend to “make things up”. Leaders should prioritize and fund initiatives to systematically update and maintain critical knowledge bases, especially unstructured ones, to mitigate this risk.

“A lot of these teams, for the first time, are looking at this unstructured knowledge base data… and finding that their business logic, or their help pages haven’t been updated in two years, right?” Shane Murray, Field CTO at Monte Carlo Data

  • Embedding drift: The relevance of vector embeddings, which represent data in AI models, can change over time. Shane has seen people encountering “drift over time. And so what was relevant yesterday may not be relevant tomorrow”.
  • “Small tweaks, big surprises”: Shane highlights an interesting issue he calls “small tweaks big surprises”. Seemingly minor alterations, such as a prompt change or an upgrade to an underlying foundation model like Claude, Gemini, or GPT, “can have huge effects downstream on the effectiveness of their application”. He has talked to many teams that “almost had to go back to the drawing board” or re-run alpha and beta testing due to these changes. Even recent versions of models, like a newer ChatGPT, are sometimes perceived as more likely to hallucinate, impacting applications built on them. Business leaders should inquire about application stability, asking: How are we managing the risks associated with ‘small tweaks, big surprises’? What are our contingency plans?

“This can have huge effects downstream on the effectiveness of their application. And so I’ve talked to many teams that have almost had to go back to the drawing board.” Shane Murray, Field CTO at Monte Carlo Data

  • Ecosystem complexity: The new AI ecosystem introduces additional operational overhead. Teams must manage components like vector databases, new orchestrators, and API calls to various models.
  • The evaluation bottleneck: A significant challenge is how teams evaluate their AI models effectively, as approaches vary widely. Some organizations have a human “checking every output, right, or hundreds a day”. While human evaluation is necessary, Shane questions its scalability, remarking, “if a human’s required to check these outputs, then, then I’m not sure you can even call it AI at the end of the day”. At the other end of the spectrum, teams are trying to figure out the right ways to “automatically evaluate these things at a scale that allows them to roll out into production”. David Sweenor, the podcast host, suggests a manufacturing-like quality control process where you sample outputs instead of inspecting every single one (a minimal sketch of this sampling idea follows this list). Leaders should drive discussions on establishing scalable and efficient AI model evaluation processes. Ask your teams: Are we overly reliant on manual checks? What is our strategy for adopting automated validation methods to ensure reliability as we scale our AI deployments?
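
Below is a minimal sketch of that sampling idea: review a fixed share of production outputs rather than every one, so evaluation effort scales with risk rather than volume. The record structure, sample rate, and downstream routing are illustrative assumptions, not a prescribed process.

```python
# Minimal sketch: sample a fixed share of production AI outputs for review,
# rather than inspecting every one, in the spirit of manufacturing quality control.
# Record structure, sample rate, and downstream routing are illustrative assumptions.
import random
from dataclasses import dataclass

@dataclass
class AIOutput:
    request_id: str
    response: str

def sample_for_review(outputs: list[AIOutput], rate: float = 0.05, seed: int = 42) -> list[AIOutput]:
    """Return a random sample of outputs for human (or LLM-judge) evaluation."""
    rng = random.Random(seed)
    k = max(1, int(len(outputs) * rate))
    return rng.sample(outputs, k)

# Example: 1,000 daily responses at a 5% sample rate -> 50 reviews instead of 1,000.
daily = [AIOutput(request_id=str(i), response=f"answer {i}") for i in range(1000)]
review_queue = sample_for_review(daily, rate=0.05)
print(f"Sampled {len(review_queue)} of {len(daily)} outputs for review")
```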

AI governance in flux: from data control to AI enablement

With organizations going all-in on generative AI, old approaches to governance are no longer sufficient. Traditionally, data teams managed deterministic data value chains, where the primary goal was to make processes reliable, accurate, and repeatable. Machine learning then introduced probabilistic systems, and generative AI systems are more stochastic still. This means rethinking data governance for an AI-first world and transforming it into a competitive differentiator that enables both speed and safety.

A central issue for data teams is ensuring their involvement from the outset. Shane Murray notes, “it’s not a given that the data team is in the room” when CTOs push for generative AI adoption across the board. If data teams are not positioned correctly, they “may struggle to actually have a say in these conversations”. Business leaders must actively ensure that their data and AI governance leaders are integral to strategic AI discussions from inception, not merely as a compliance checkpoint.

“It’s not a given that the data team is in the room, which is potentially a problem for some of these applications.” –Shane Murray, Field CTO at Monte Carlo Data

Shane observes several emerging governance models as teams navigate this new territory.

Emerging governance frameworks

  • Data team end-to-end ownership: In some cases, the data team owns the AI use case entirely. This often applies to internal solutions like document processing, where LLMs automate previously human-driven tasks. Here, the data team controls the data quality fed into these systems and ensures quality controls on the output.
  • Data team as expert enablers: In other scenarios, data teams support AI use cases led by software or product teams, such as personalization or discovery features. The data team does not lead these initiatives but contributes their knowledge and expertise. They advise on which data streams are most reliable and how to evaluate the output to meet accuracy or precision requirements.
  • Platform-centric AI governance: Some forward-thinking leaders are building a foundational platform approach. Data platform or machine learning platform leads are evolving their roles to become heads of “data and AI platform” teams. Their goal is to provide “privacy, security, observability, maybe experimentation infrastructure, to the way these things are rolled out, and ensure there’s some standards to how we do this”.

Business leaders should strategically determine which governance model or combination of models best suits different AI initiatives across their organization and allocate resources accordingly.

“How do I bring privacy, security, observability, maybe experimentation infrastructure, to the way these things are rolled out, and ensure there’s some standards to how we do this?” Shane Murray, Field CTO at Monte Carlo Data

Successfully implementing these new governance structures requires a delicate balancing act. Data teams must demonstrate the same speed that the organization expects for deploying new applications. They need to avoid being perceived as “the person pumping the brakes”. While enabling rapid innovation, they must also “bring the standards to make sure they have high-quality inputs and high-quality outputs”. Leaders can foster this by empowering their data and AI platform teams to establish robust standards while also cultivating a culture where these teams are seen as enablers of innovation, helping the business move faster, safely.

Defining trust in the age of generative AI: beyond anthropomorphism

A central question when working with generative AI is the extent to which these systems can be trusted, not in a human sense, but in terms of their predictability, reliability, and alignment with business objectives. Can we rely on them as we would a human colleague? Shane Murray’s perspective is that we cannot trust them “implicitly”. He suggests that success in building with these agents sometimes involves treating them like a team of people, with specialists for different tasks, all orchestrated and managed. However, the analogy has its limits; in Shane’s words, an LLM would be “almost equivalent to someone who didn’t have common sense”. Therefore, we need clear frameworks for interaction. Business leaders should set clear and realistic expectations within the organization regarding AI capabilities. Emphasize that AI tools require clear objectives, constraints, and feedback loops; they are not autonomous ‘colleagues’.

“I don’t think we’re applying the same [trust] to these LLMs or AI. It’d almost be equivalent to someone who didn’t have common sense, and so you need to establish very clear objectives, tight constraints and a really tight feedback loop.” –Shane Murray, Field CTO at Monte Carlo Data

To build a foundation for reliable AI performance, teams must “establish very clear objectives, tight constraints and a really tight feedback loop to make sure they don’t go off in the wrong direction”.

The indispensable human element 

The “human in the loop” remains the “predominant way teams are actually making use of these things”. There is a visible shift away from purely autonomous chatbots towards more “collaborative systems that have people checking in in multiple steps”. Shane believes this is the most viable approach because these systems cannot always be relied upon to possess common sense. Leaders should champion human-in-the-loop processes for critical AI applications, ensuring appropriate oversight and intervention points.

A word of caution on RAG

Retrieval-Augmented Generation (RAG) is a technique that supplies AI models with relevant, proprietary data to inform their responses. While helpful, Shane cautions: “I definitely don’t think RAG is a fail-safe for hallucination.” It “brings relevant context from your proprietary data… but it doesn’t necessarily prevent this idea that the model will get outside of that knowledge base and start making things up”. Getting a prototype RAG pipeline out the door might be easy, but making it production-ready means dealing with issues like hallucination and bringing observability across inputs, the value chain, and outputs. When teams propose solutions using RAG, business leaders should probe deeper: What are the specific measures to monitor its effectiveness and prevent reliance on out-of-context or fabricated information?

“I definitely don’t think RAG is a fail-safe for hallucination.” –Shane Murray, Field CTO at Monte Carlo Data
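
One way teams approximate this kind of observability is an input-side relevance check: if the best-matching chunk retrieved for a question is only weakly related to it, the model is more likely to answer from outside the knowledge base, and the request can be flagged or declined. Here is a minimal sketch assuming the sentence-transformers library; the model, threshold, and tiny knowledge base are illustrative.

```python
# Minimal sketch: flag RAG queries whose retrieved context is only weakly relevant,
# a signal the model may answer from outside the knowledge base.
# Embedding model, threshold, and the tiny knowledge base are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

knowledge_base = [
    "Our premium plan includes 24/7 support and a 99.9% uptime SLA.",
    "Data exports are available in CSV and Parquet formats.",
]
kb_vectors = model.encode(knowledge_base, convert_to_tensor=True)

def retrieve_with_confidence(question: str, min_similarity: float = 0.4):
    """Return the best-matching chunk, its score, and whether it clears a relevance bar."""
    q_vec = model.encode(question, convert_to_tensor=True)
    scores = util.cos_sim(q_vec, kb_vectors)[0]
    best = int(scores.argmax())
    return knowledge_base[best], float(scores[best]), float(scores[best]) >= min_similarity

chunk, score, confident = retrieve_with_confidence("What formats can I export my data in?")
if confident:
    print(f"Grounding context (score={score:.2f}): {chunk}")
else:
    print(f"Weak retrieval (score={score:.2f}) - decline or escalate rather than answer")
```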

The future of trustworthy AI: essential frameworks and technologies

Looking ahead, several key frameworks and technological advancements will be essential for building and maintaining trustworthy AI. These are not just technical upgrades; they represent strategic goals for organizations serious about long-term AI success.

Pillar 1: Responsible AI as a mandate 

A core component of trust will be the continued development and adoption of responsible AI principles. This involves ensuring that AI systems are fair, unbiased, and incorporate explainability and auditability. Shane anticipates that upcoming “legislation… is going to mean many teams have to build in explainability and auditability of these systems, which is quite challenging today”. There is a significant field of research and development focused on how to trace all responses in complex multi-agent systems to understand their operations. Business leaders should mandate and allocate budgets for the development and implementation of Responsible AI principles, positioning these as core to the AI strategy, especially in anticipation of evolving regulations.

“Trust… a pivotal component is going to be, you know, this idea of responsible AI, ensuring these are fair and unbiased. And I think with legislation, that’s going to mean many teams have to build in explainability and auditability of these systems.” –Shane Murray, Field CTO at Monte Carlo Data

Pillar 2: The evolution to AI observability 

The field of data observability is evolving into a broader “data and AI observability space”. This expanded scope includes managing all structured and unstructured inputs to the models and handling the complexity of the AI system itself. A huge requirement will be evaluating model outputs to ensure they are grounded and relevant. This evaluation might involve human review, machine evaluation, or approaches like using LLMs as a judge. Leaders should invest in expanding their organization’s data observability capabilities to encompass the entire AI lifecycle. This includes managing diverse inputs and implementing robust, efficient methods for evaluating model outputs against business objectives.
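
For the LLM-as-a-judge approach Shane mentions, here is a minimal sketch assuming the OpenAI Python client; the model name, prompt wording, and 1-to-5 scale are illustrative choices rather than a prescribed evaluation standard.

```python
# Minimal sketch: use an LLM as a judge to grade whether an answer is grounded in
# its source context. Assumes the OpenAI Python client; model name, prompt wording,
# and the 1-5 scale are illustrative choices, not a prescribed evaluation standard.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are evaluating an AI assistant's answer.
Context: {context}
Question: {question}
Answer: {answer}
On a scale of 1-5, how well is the answer grounded in the context?
Reply with only the number."""

def judge_groundedness(context: str, question: str, answer: str) -> int:
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            context=context, question=question, answer=answer)}],
    )
    return int(completion.choices[0].message.content.strip())

score = judge_groundedness(
    context="Refunds are available within 30 days of purchase.",
    question="Can I get a refund after two months?",
    answer="Yes, refunds are available any time within a year.",
)
print(f"Groundedness score: {score}/5")  # low scores can be routed to a human reviewer
```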

Pillar 3: Production monitoring is non-negotiable 

The traditional approach where teams might “solve this in dev and then let it go in prod” is no longer sufficient for AI. Shane emphasizes that for teams building these applications, “a lot of the work… is in prod”. While it might be relatively easy to launch an AI application, the significant challenge lies in “maintaining something that’s reliable and trustworthy in production” post-launch. Therefore, “monitoring and observing these systems in prod is going to be essential”. Trust cannot be siloed; it must be ensured across every piece of the system, from development through production. Business leaders must lead a cultural shift to recognize that AI system reliability is an ongoing operational concern, not just a development task. Ensure robust support for continuous monitoring, maintenance, and improvement of AI systems post-deployment.

“A lot of the work for teams that are building these is in prod, and monitoring and observing these systems in prod is going to be essential.” –Shane Murray, Field CTO at Monte Carlo Data 

Getting started: actionable advice for data teams from the field

For data teams wondering where to begin or how to advance their AI initiatives, Shane Murray offers practical advice grounded in his experience. Business leaders can enable their teams to act on these insights by fostering the right environment and strategic focus.

1. Build to learn 

Perhaps the most crucial piece of advice is to start building. Shane states, “you’ll learn more in a couple of weeks of building than months of planning and prognosticating”. He encourages teams to engage in research and development, getting proofs of concept (POCs) up and running. This hands-on experience helps teams learn what “trustworthy and reliable really means” and informs conversations with leadership about productionizing these systems. Initial projects do not need to be complex chatbots. Teams can start by “summarizing call transcripts, or it could be taking some document processing, structuring the unstructured data”. Many teams are doing interesting work in these areas. Business leaders should foster an organizational culture where rapid prototyping and learning from AI development cycles are valued. Encourage teams to start with manageable projects to build internal expertise and understanding.

“You’ll learn more in a couple of weeks of building than, you know, months of planning and prognosticating… And so the teams that are… doing research… and getting POCs up and running, and learning about what trustworthy and reliable really means.” –Shane Murray, Field CTO at Monte Carlo Data

2. Embrace human-AI collaboration 

It is important to “keep humans in the loop”. AI development does not have to be about creating an “all-intelligent robot”. Instead, tasks can transition from being human-driven to human-reviewed. Shane observes that “all the successful teams I’m seeing are doing that”. Business leaders should advocate for AI systems that augment human capabilities through “human-review” tasks, rather than aiming for full automation in all areas, especially critical ones.

“Keep humans in the loop. This doesn’t have to be an all-intelligent robot. This can be something where you move from human-driven tasks to human review tasks.” –Shane Murray, Field CTO at Monte Carlo Data

3. Focus on meaningful, long-term impact 

Teams should focus on “problems that have meaningful long-term impact”. AI projects should not be viewed as items to build and then “throw it over the fence”. Instead, they should be considered “something that’s going to be long-lived”. This means selecting problems that will remain useful over months or years, not just weeks. Business leaders should guide AI investments towards solving core business problems with sustained value, treating AI initiatives as long-term product commitments, not short-term experiments.

About David Sweenor

David Sweenor is an AI, Generative AI, and Product Marketing Expert. He brings this expertise to the forefront as founder of TinyTechGuides and host of the Data Faces podcast. A recognized top 25 analytics thought leader and international speaker, David specializes in practical business applications of artificial intelligence and advanced analytics.


With over 25 years of hands-on experience implementing AI and analytics solutions, David has supported organizations including Alation, Alteryx, TIBCO, SAS, IBM, Dell, and Quest. His work spans marketing leadership, analytics implementation, and specialized expertise in AI, machine learning, data science, IoT, and business intelligence.

David holds several patents and consistently delivers insights that bridge technical capabilities with business value.

Follow David on Twitter @DavidSweenor and connect with him on LinkedIn.

Podcast Highlights – Key Takeaways from the Conversation

David Sweenor 0:00: Hello everyone. Welcome to Data Faces… Today, I’m super excited to be joined by Shane Murray. He’s field CTO at Monte Carlo Data… We’re going to talk about what it means to build AI-ready data products, how to think about trust with generative AI, and what governance looks like when the stakes are probably higher than ever.

Shane Murray 0:44: Thank you. Nice to be joining you today.

David Sweenor 3:23: So I’d love to get your take on what the heck does it mean to be AI ready? And, you know, do people underestimate this challenge?

Shane Murray 3:46: Yeah, I think I’ve seen the term [AI-ready] now pretty much everywhere, and I feel like it can be one of those kind of slippery terms… where a lot of people said it, and everyone kind of nods, but doesn’t have a super clear definition and probably people interpret it very differently. For some teams… AI ready can mean actually getting your data ready for these kind of conversational BI initiatives… But it also means, how do you contextualize that data, you know, with metadata and with maybe provenance, so that the AI can actually query it reliably. I will say that the truth I’ve found is that that data doesn’t really become ready until you start using it in AI.

David Sweenor 7:01: Let’s talk a little bit about data products… moving from structured data to unstructured data… does the term data quality still apply, or is there something else that people need to think about?

Shane Murray 7:54: I guess I’ve always held the position that data quality is contextual to the use case….At a high level, [for unstructured data] it’s some of the same things we talk about with structured data… it needs to be relevant, it needs to be complete, it needs to be fresh, right? It needs to be consistent.

David Sweenor 12:19: What are the biggest challenges when we talk about trust and reliability of data?

Shane Murray 12:43: I think the thing that gets most publicized is the hallucinations… But I think what’s been interesting as I talk to more and more data teams… oftentimes, that hallucination… is often masking issues that are affecting the effectiveness of the AI agent. One of the most common examples… is the source data outdated? Is it incomplete?… If these models don’t have a complete knowledge base, that’s when they have a tendency to make things up… [What] I keep coming across is this thing that I’ve kind of been calling “small tweaks big surprises,” where maybe it’s a prompt change or a change in the underlying model… This can have huge effects downstream.

David Sweenor 19:37: With generative AI, you’re sort of going to use some sort of foundation model… How are organizations rethinking data or AI governance for this AI world?

Shane Murray 20:31: I think what it’s doing to data meets software engineering… it’s not a given that the data team is in the room, which is potentially a problem….Forward thinking leaders… are like, “All right, we’re building applications in a decentralized way. How do I bring a foundation and a platform?” … “How do I bring privacy, security, observability, maybe experimentation infrastructure… and ensure there’s some standards?”

David Sweenor 24:34: What does trust mean in this context [of generative AI]?

Shane Murray 25:35: I think success in building these agents is sometimes treating them like a team of people… but I think to the question of whether we can trust them as humans, I’d say not implicitly. It’d almost be equivalent to someone who didn’t have common sense, and so you need to establish very clear objectives, tight constraints and a really tight feedback loop.

David Sweenor 28:29: Conventional wisdom says, when we use a RAG implementation, it reduces hallucinations… Have you come across this?

Shane Murray 29:31: Well, I definitely don’t think RAG is a fail safe for hallucination. I think it brings relevant context from your proprietary data… but it doesn’t necessarily prevent this idea that the model will get outside of that knowledge base and start making things up.

David Sweenor 30:45: What sort of frameworks or tech do you think will really define the future of trustworthy AI and data products?

Shane Murray 31:12: I think trust can also… a pivotal component is going to be this idea of responsible AI, ensuring these are fair and unbiased… I think the data observability space has evolved into the data and AI observability space….A lot of the work for teams that are building these is in prod, and monitoring and observing these systems in prod is going to be essential.

David Sweenor 34:38: What sort of words of wisdom or advice could you impart to our listeners who are like, I don’t know where to start?

Shane Murray 34:38: You’ll learn more in a couple of weeks of building than, you know, months of planning and prognosticating….Keep humans in the loop… move from human driven tasks to human review tasks. And finally, they’re also focusing on problems that have meaningful long term impact.