Artificial Intelligence

Generative AI’s Force Multiplier: Your Data

A Business Leader’s Guide to Providing Context to LLMs

Generative AI requires context to gain a competitive advantage

The Dawning Age of Generative AI in Business

With the launch of OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s Bard, the long-standing dream of text analytics and natural language processing (NLP) at scale became a reality. Not only did AI convincingly pass the Turing Test, it also catalyzed a paradigm shift in corporate strategy. Suddenly, AI wasn’t just a topic of conversation; it became an integral part of every forward-thinking company’s roadmap. The reason?

Well, before generative AI, most of the use cases discussed involved numbers (see my blog on Business Applications of Generative AI). Now, with human-like text and simple-to-use interfaces that let you chat in plain language and receive realistic responses, AI is accessible to everyone, no matter their skill level, across the entire planet. Generative AI technologies can certainly echo what they’ve learned, but what happens when you ask them to create something unique? For the most part, they struggle.

Technologies like OpenAI’s ChatGPT, Meta’s AudioCraft, and Midjourney can craft words, code, melodies, and visual masterpieces with a fluency that feels almost human. But it’s essential to remember that, at their core, they are only as adept and innovative as the data they’ve been fed. If your competitors are simply using the same publicly available models that you are, then how can this form the basis of a competitive advantage?

For savvy business leaders, the message is crystal clear: your organization’s data isn’t just a commodity; it’s your mightiest weapon. Having spent my entire career in data and analytics, I’ve watched the dream of harnessing the potential of unstructured data tantalize businesses, always seeming just a bit out of reach. In fact, I was joking with an industry analyst friend last summer and said, “Perhaps this is the year that we’ll crack the nut on text analytics.” Little did I know what awaited us in November 2022.

Yet the journey to AI-driven business success isn’t about adopting a generic model; it’s about adapting AI to understand and operate within your organization’s unique context, fueled by your data.

Applying Your Company’s Unique Context to Generative AI

Putting all the puzzle pieces together can be a bit daunting for any business. However, the picture is clear: to create a competitive AI advantage, you need a system that understands your organization’s unique needs, interactions, and ways of operating. Fundamentally, there are a few different approaches to applying this context, outlined below.

Adding Context to the LLMs

Let’s start with your large language models (LLMs); you can build from scratch, fine-tune, or buy a package.

  • Building from scratch: A tall order for sure. Constructing your own AI solution offers infinite customization, but it comes with a heavy price tag, not just in monetary terms but also in the time, expertise, and resources required. An article from McKinsey suggests that this could cost between $5M and $200M for an initial build, with a $1M to $5M recurring annual fee. Only the largest organizations can afford this approach.
  • Fine-tuning existing models: Rather than starting from ground zero, you could adapt or tweak an existing generative AI model. This approach provides a balance between customization and cost, allowing you to tailor the AI to your context while using a foundation that’s already been pre-trained (a sketch of what fine-tuning data can look like follows this list). McKinsey estimates this would cost between $2M and $10M, with a $0.5M to $1M recurring annual maintenance budget.
  • Buying a pre-packaged solution: For companies that prefer a ready-made solution, there are off-the-shelf AI products. These may lack the same degree of personalization, but they offer quicker deployment and lower initial costs. They are typically used to streamline business operations rather than to create a competitive advantage. McKinsey calculates that this approach runs $0.5M to $2M, with a $0.5M recurring annual fee.
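To make the fine-tuning option concrete, here is a minimal sketch of what preparing training data can look like, assuming an OpenAI-style workflow where examples are supplied as JSONL chat transcripts. The company, file name, and example content are hypothetical placeholders.

import json

# Hypothetical examples drawn from your own support logs, documentation, etc.
# Each record is one conversation the fine-tuned model should learn to imitate.
training_examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant for Acme Corp."},
            {"role": "user", "content": "What is your return policy?"},
            {"role": "assistant", "content": "Acme Corp accepts returns within 30 days of purchase."},
        ]
    },
    # ...more examples; fine-tuning typically needs hundreds or thousands of them.
]

# Many fine-tuning services expect one JSON object per line (JSONL).
with open("training_data.jsonl", "w") as f:
    for example in training_examples:
        f.write(json.dumps(example) + "\n")

The heavy lifting here, curating high-quality, representative examples from your own data, is where both the real cost and the real advantage come from.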

Now, how can we augment your models? 

Adding Context via RAG

For many companies, security and privacy are foundational to how they operate. Sending your proprietary data to a service provider that can use it to train models for others is simply unconscionable. In fact, I like Hippocratic AI’s mantra, “do no harm.” Fields like healthcare, insurance, banking, manufacturing, legal, and many others have strict regulations on how data can be used, where it can be transmitted, and how it must be stored. So, if using your data to fine-tune a model is not in the cards for your organization, consider retrieval-augmented generation (RAG). Essentially, a RAG approach connects your organization’s data to the LLM without actually giving your data to the LLM. RAG allows the LLM’s response to your chat to include your proprietary data. In other words, it provides the context!

  • Leveraging retrieval-augmented generation (RAG) with vector databases: This approach combines the capabilities of large language models with the specificity and grounding of your own databases, offering a nuanced way to ensure accurate, company-relevant outputs (a minimal sketch follows).
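To make the pattern concrete, here is a minimal sketch of the RAG flow in Python. The embed() function below is a toy stand-in for a real embedding model, and the documents are hypothetical; in production you would use a proper embedding service and a vector database, but the flow is the same: embed your documents, retrieve the most relevant ones for each question, and pass them to the LLM as context.

import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy stand-in for a real embedding model: hash each word into a vector.
    In practice you would call an embedding model or service here."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Your proprietary documents; these never leave your environment for training.
documents = [
    "Our 2023 warranty policy covers parts and labor for 24 months.",
    "Enterprise support tickets are answered within 4 business hours.",
    "The Q3 release adds SSO and audit logging for compliance teams.",
]
doc_vectors = [embed(d) for d in documents]  # lives in a vector database in practice

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question (cosine similarity)."""
    q = embed(question)
    scores = [float(np.dot(q, v)) for v in doc_vectors]
    top = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)[:k]
    return [documents[i] for i in top]

question = "How long is the warranty?"
context = "\n".join(retrieve(question))

# The retrieved context is stuffed into the prompt; the LLM never trains on it.
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
print(prompt)

Because only the retrieved snippets travel inside the prompt, your corpus never becomes training data for someone else’s model.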

So, now that we’ve connected your organization’s data to the LLM, by building your own model, fine-tuning an existing one, or adopting the RAG approach, it’s time to make sure we ask it good questions.

Adding Context via Prompt Engineering

This is where prompt engineering comes into play. 

  • Prompt engineering: Prompt engineering is essentially a structured way to ask your LLM a question so that it understands what you want it to do. Remember, the LLM doesn’t “know” anything; it simply knows which words statistically and frequently appear next to one another.

There are a few different types of prompts, which include:

  • Zero-shot prompting: this is where you ask the LLM a question without any specific context. For example, you could write a prompt like this:

Why is the sky blue?

  • Few-shot prompting: for this use case, you provide the model with examples so that it produces output similar to the ones you provide. So, you could ask the following (output from ChatGPT; a code sketch follows these examples):

Example: Write a rhyming couplet based on the given prompts.

Prompt 1: Moon
Couplet 1: In the silent night, it’s the brightest boon,
Glowing softly, is the silver moon.

Prompt 2: Rain
Couplet 2: Drops from the sky, clear and plain,
Quenching the earth is the gentle rain.

Your Task:
Prompt: Blue Sky
Couplet: ?

Response: 

Prompt: Blue Sky
Couplet: Above where birds and planes do fly,
Stretches the vast expanse of the blue sky.
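Assembled in code, the two styles differ only in the prompt string. A minimal sketch, where call_llm() is a hypothetical placeholder for whichever LLM API your organization uses:

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for your provider's chat endpoint."""
    raise NotImplementedError("wire this to your LLM provider")

# Zero-shot: the question alone, with no examples.
zero_shot_prompt = "Why is the sky blue?"

# Few-shot: worked examples first, then the real task in the same shape.
few_shot_prompt = """Write a rhyming couplet based on the given prompts.

Prompt: Moon
Couplet: In the silent night, it's the brightest boon,
Glowing softly, is the silver moon.

Prompt: Rain
Couplet: Drops from the sky, clear and plain,
Quenching the earth is the gentle rain.

Prompt: Blue Sky
Couplet:"""

# Both go to the same model; only the structure of the prompt changes.
# answer = call_llm(few_shot_prompt)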

  • Delimiter use: when you use delimiters, you insert special tokens or phrases to provide structure for the model. For example, you could write a prompt like this (a code sketch follows the example):

I’m a Product Marketing Manager at [[UVM Medical Center, a teaching hospital]]. I want you to read the user reviews below that we received and summarize them to me. [[paste reviews below]]
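In code, delimiters are just consistent markers that separate your instructions from pasted data so the model can tell where each part begins and ends. A minimal sketch using the [[...]] convention from the example above, with hypothetical review text:

# Hypothetical reviews; in practice these would be pasted in or loaded from a file.
reviews = [
    "The nursing staff was attentive and kind during my stay.",
    "Parking was difficult to find, but check-in was fast.",
]

# The [[...]] delimiters mark off context and data so the model cannot
# confuse your instructions with the content it is asked to summarize.
prompt = (
    "I'm a Product Marketing Manager at [[UVM Medical Center, a teaching hospital]]. "
    "Read the user reviews below and summarize them for me.\n\n"
    "[[" + "\n".join(reviews) + "]]"
)
print(prompt)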

Prompt engineering is certainly both an art and a science. Fortunately, we are starting to see pre-built templates in applications so that casual users (like myself) can type inputs into a box in a web browser and have the query appropriately structured for the LLM.

For example, what if you wanted to build a case study? Well, case studies are pretty structured, so some generative AI software solutions offer templates tailored to specific use cases.

The case study template includes the following fields:

  • Company name: [add company name here]
  • Customer name: [add customer name here]
  • Challenge: [describe challenge here]
  • Solution: [add solution here]
  • Results: [add results here]

So, if I simply type those responses into the text boxes in my web browser, out comes a pretty good case study. Another example is a press release; the software asks for:

  • What type of press release? [add type here, e.g., product announcement, event, quarterly results]
  • What people or entities should be mentioned? [add people or entities here]
  • What is the press release about? [add press release here]
  • Additional context: what quotes would you like to include? [add quotes and attribution here]
  • About: [add company description here]

And voilà, we have a great first draft of a press release!
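Under the hood, these template-driven tools are doing straightforward prompt assembly: each form field slots into a pre-written prompt before it is sent to the LLM. A minimal sketch of the case study template, with hypothetical field values:

# Pre-written prompt; the web form's fields map to the placeholders below.
CASE_STUDY_TEMPLATE = """Write a customer case study using the following details.

Company name: {company}
Customer name: {customer}
Challenge: {challenge}
Solution: {solution}
Results: {results}

Use a professional tone and structure it as challenge, solution, results."""

# Hypothetical inputs a user might type into the text boxes.
prompt = CASE_STUDY_TEMPLATE.format(
    company="Acme Analytics",
    customer="Northwind Traders",
    challenge="Quarterly reports took two weeks to assemble by hand.",
    solution="An automated reporting pipeline built on Acme's platform.",
    results="Report turnaround dropped from two weeks to one day.",
)
# The assembled prompt is then sent to the LLM like any other query.
print(prompt)

The press release template works the same way; only the pre-written prompt and the field names change.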

Most business leaders understand that integrating generative AI into a business isn’t a linear journey. It will be challenging, but the payoff can be immense as long as you apply your company’s unique context.

Summary: Generative AI Starts With Your Data

If your organization is going to capitalize on generative AI, you need to provide your company’s unique context. You can do this by building or fine-tuning an LLM, using RAG, applying prompt engineering, or combining the three approaches. Keep in mind that context is use-case specific, so start gathering your list of use cases and data examples so you can test the generative AI waters.