LLM to ROI: How to scale gen AI in retail – McKinsey

Once generative AI (gen AI) hit the mainstream, in late 2022, it took little time for retail executives to realize the potential in front of them. Mentions of artificial intelligence (AI) in retailers’ earnings calls soared last year, which was no surprise, given that gen AI is poised to unlock between $240 billion and $390 billion in economic value for retailers, equivalent to a margin increase across the industry of 1.2 to 1.9 percentage points. This, combined with the value of nongenerative AI and analytics, could turn billions of dollars in value into trillions.
This article is a collaborative effort by Alexander Sukharevsky, Andreas Ess, Denis Emelyantsev, Emily Reasor, and Holger Hürtgen, with Oleg Sokolov and Sergey Kondratyuk, representing views from QuantumBlack, AI by McKinsey, and McKinsey’s Retail Practice.
Over the past year, most retailers have started testing different gen AI use cases across the retail value chain. Even with all this experimentation, however, few companies have managed to realize the technology’s full potential at scale. We surveyed more than 50 retail executives, and although most say they are piloting and scaling large language models (LLMs) and gen AI broadly, only two executives say they have successfully implemented gen AI across their organizations (see sidebar, “Our survey findings”).
Some retailers have found it difficult to implement gen AI widely because it requires rewiring parts of the retail organization, such as technical capabilities and talent. Data quality and privacy concerns, insufficient resources and expertise, and implementation expenses have also challenged the speed at which retailers can scale their gen AI experiments.
Our survey findings. In April 2024, we conducted a survey of 52 global Fortune 500 retail executives. Our survey focused on the progress retailers have made in exploring and experimenting with generative AI (gen AI). We found that most retail executives (90 percent) say they began experimenting with gen AI solutions and scaling priority use cases and that these experiments had knock-on effects across their other AI initiatives; two-thirds of retail leaders say they want to invest in and focus more on data and analytics.
Sixty-four percent of retail leaders say they have conducted gen AI pilots that have augmented their organizations’ internal value chains, while 26 percent say they are already scaling gen AI solutions in this area. What’s more, 82 percent of retailers say they have conducted pilots for gen AI use cases related to the reinvention of customer service. Thirty-six percent say they are scaling gen AI solutions in this area.
Off-the-shelf gen AI tools have become more readily available in the past year. More than half of retail leaders surveyed (60 percent) opted for ready-made platforms, although the adoption rate of these third-party platforms is lower in areas such as procurement (18 percent) and commercial (25 percent). The adoption of third-party gen AI solutions will likely grow as the gen AI platform market matures. Two-thirds of retailers say they intend to increase their gen AI budgets over the next year.
Meanwhile, 10 percent of retailers say they are adopting a wait-and-see approach to gen AI. They plan to integrate gen AI into their operations at a later date, particularly in the areas where there are no full-service platforms yet. This decision may stem from factors such as having insufficient expertise or organizational resources, issues related to data quality and privacy, as well as the expenses associated with gen AI implementation.
Retail companies that have succeeded in harnessing gen AI’s power typically excel in two key areas. First, they consider how gen AI use cases can help transform specific domains rather than spreading their resources too thin across a range of scenarios. Second, they effectively transition from pilot and proof-of-concept to deployment at scale. This requires not just data prioritization and technological integration but also significant organizational changes to support widespread AI adoption.
In this article, we explore which use cases can offer the most value and what organizational transformations are necessary to scale these technologies successfully.
Retailers we spoke with have already piloted gen AI use cases within their internal value chains, and some are even beginning to scale gen AI solutions. Gen AI can help streamline operations, allowing leaders to make faster, better-informed decisions across retailers’ internal value chains. The technology also offers immediate, no-regret efficiency gains, as well as applications that could redefine decision making in retail (more on this later).
Retailers have also experimented with gen AI to reinvent the customer experience. Gen AI can deepen relationships with customers (in part, by extending the interactions between retailers and customers across the customer journey) and help make the customer experience more personalized and fulfilling. The advanced conversational abilities of gen AI chatbots, powered by natural-language models, can make the smart-shopping assistant a primary shopping channel.
Gen AI has the potential to boost productivity and efficiency along each step of the retail value chain, including in marketing, commercialization, distribution, and back-office work (Exhibit 1).
Retailers can start to realize gen AI’s impact across the value chain through quick-win use cases. These use cases generally require fewer resources to implement relative to their impact and compared with other gen AI use cases. In fact, retailers may more easily deploy current off-the-shelf tools without the need for much customization. Real examples of these use cases include the following:
While the above examples can help simplify daily tasks, gen AI can also help retailers accelerate their decision making by automatically generating insights, root causes, and domain-level and company-wide responses (Exhibit 2).
Retail operations are affected by countless forces that are difficult to quantify and track, making performance analytics and forecasting an arduous task. Traditionally, teams might spend weeks studying competitors’ tactics, changes in pricing and promotion, supply chain issues, and unexpected disruptions to understand sales declines and devise strategies to avoid future sales drops. The combination of gen AI and advanced analytics can revolutionize this process: rather than manually assessing that data, workers from across the company—from CEO to category manager—can access a personalized report featuring key performance insights and suggested actions.
Let’s use a hypothetical electronics retailer as an example. The retailer’s television sales are 6 percent lower than it had forecasted. The retailer’s team spent a week looking for the root cause of the decline and came up with a dozen potential reasons: Could the missed sales forecast have been caused by the unusually rainy weather? A delayed product release? Or were temporary out-of-stock items and a weak promotional campaign to blame?
In this example, a gen AI system, trained on the retailer’s proprietary data, could automatically analyze the impact of not only these potential root causes but also additional scenarios, such as what actions its competitors may have taken at the same time. A cross-functional team, led by the retailer’s technology leaders and considering input from sales and commercial teams, could work with technology providers to customize the retailer’s AI- and gen-AI-powered system. The gen AI platform could then create a list of causes by impact, as well as a set of actions the retailer could consider to help reduce sales drops in the future.
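The final step described above, a list of causes ordered by impact, can be sketched in a few lines of Python. This is a minimal illustration only: the impact figures are invented so that they sum to the 6 percent shortfall in the hypothetical example, and the names `estimated_impact_pp` and `rank_causes` are assumptions, not part of any retailer's system. In practice, the estimates would come from the analytics models that sit behind the gen AI layer.

```python
# Hypothetical estimates of how much each candidate cause moved television
# sales relative to forecast, in percentage points. These figures are
# invented for illustration; together they add up to the 6 percent shortfall.
estimated_impact_pp = {
    "unusually rainy weather": -0.5,
    "delayed product release": -2.5,
    "temporary out-of-stock items": -1.8,
    "weak promotional campaign": -1.2,
}


def rank_causes(impacts):
    """Return (cause, impact, share of total shortfall), largest impact first."""
    total = sum(impacts.values())
    ranked = sorted(impacts.items(), key=lambda item: abs(item[1]), reverse=True)
    return [(cause, impact, impact / total) for cause, impact in ranked]


for cause, impact, share in rank_causes(estimated_impact_pp):
    print(f"{cause}: {impact:+.1f} pp ({share:.0%} of the shortfall)")
```

Ranking by absolute impact keeps the list useful even if some factors helped sales while others hurt them.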
Based on our early work with retailers, we expect gen-AI-powered decision-making systems to propel up to 5 percent of incremental sales and improve EBIT margins by 0.2 to 0.4 percentage points.
When it comes to using gen AI copilots, companies will need to decide if they are a “taker” (a user of preexisting tools), a “shaper” (an integrator of available models with proprietary data for more customized results), or a “maker” (a builder of foundation models). Across the internal value chain, most retailers will likely adopt the taker archetype, using publicly available interfaces or APIs with little to no customization to meet their needs.
However, many of today’s off-the-shelf solutions don’t offer the functionality that some retailers need to fully realize the technology’s value, since the technology powering these solutions typically doesn’t account for sector- and company-specific data. At the same time, most retailers won’t be able to adopt the maker archetype, given that the costs associated with building foundation models are outside the typical retailer’s budget. In these cases, retailers may opt for the shaper archetype, customizing existing LLM tools with their own code and data. The shaper archetype will also be relevant for gen AI decision-making use cases. How many resources a retailer invests in shaping its gen AI tools will depend on the market it intends to serve, which use cases it wants to prioritize, and how these use cases complement the retailer’s core value proposition.
Today, retailers typically engage in only three of the seven steps of the customer journey. Gen AI has the potential to increase retailer engagement and reinvent the customer experience across the entire customer journey (Exhibit 3).
Gen-AI-powered chatbot assistants are one primary tool retailers can use to better engage with customers. Customers can use chatbots to receive product recommendations, learn more about a product or retailer, or add or remove items from their virtual shopping carts. Importantly, since many consumers will use these chatbots before deciding to purchase a product rather than after, using chatbots allows retailers to engage with customers earlier in their shopping journey, which can help increase customers’ overall satisfaction.
Gen AI chatbots work by recognizing the intent of a customer’s message. An LLM agent—the system that the chatbot relies on for its reasoning engine—processes the customer’s message and is then connected to various data sets (such as a retailer’s SKU base) and to other models, such as an analytical personalization engine. To create the best outputs, a retailer must dedicate resources toward product design and conduct frequent user testing to calibrate how it wants the chatbot to process the customer’s message. (How customers most frequently use the chatbot will largely determine this calibration.)
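The flow described above (recognize intent, then draw on the SKU base and a personalization engine) can be sketched as follows. This is a simplified illustration under stated assumptions, not a production design: `classify_intent` uses first-word and keyword rules as a stand-in for the LLM agent, `personalize` uses a simple category count as a stand-in for an analytical personalization engine, and the SKU data and all function names are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical SKU base: SKU id -> (product name, category).
SKU_BASE = {
    "sku-001": ("sparkling water", "drinks"),
    "sku-002": ("pasta", "pantry"),
    "sku-003": ("tomato sauce", "pantry"),
    "sku-004": ("gluten-free pasta", "pantry"),
}


@dataclass
class Session:
    purchase_history: list = field(default_factory=list)
    cart: list = field(default_factory=list)


def classify_intent(message):
    """Stand-in for the LLM agent: simple rules instead of a model call."""
    words = message.lower().split()
    first = words[0] if words else ""
    if first == "remove":
        return "remove_from_cart"
    if first == "add":
        return "add_to_cart"
    if "recommend" in words or "suggest" in words:
        return "recommend"
    return "product_question"


def find_sku(message):
    """Match the longest product name first so specific items win."""
    text = message.lower()
    by_name_length = sorted(SKU_BASE.items(), key=lambda kv: -len(kv[1][0]))
    for sku, (name, _category) in by_name_length:
        if name in text:
            return sku
    return None


def personalize(session, limit=2):
    """Stand-in personalization: prefer categories the customer bought before."""
    bought = [SKU_BASE[sku][1] for sku in session.purchase_history]
    candidates = [sku for sku in SKU_BASE if sku not in session.cart]
    ranked = sorted(candidates, key=lambda sku: -bought.count(SKU_BASE[sku][1]))
    return ranked[:limit]


def handle(message, session):
    """Route a customer message to the cart, the recommender, or a fallback."""
    intent = classify_intent(message)
    sku = find_sku(message)
    if intent == "add_to_cart" and sku is not None:
        session.cart.append(sku)
        return f"Added {SKU_BASE[sku][0]} to your cart."
    if intent == "remove_from_cart" and sku in session.cart:
        session.cart.remove(sku)
        return f"Removed {SKU_BASE[sku][0]} from your cart."
    if intent == "recommend":
        names = [SKU_BASE[s][0] for s in personalize(session)]
        return "You might like: " + ", ".join(names)
    return "Let me look into that for you."
```

Keeping intent recognition separate from the data lookups mirrors the calibration point above: a retailer can tune how messages are interpreted without touching the SKU base or the personalization logic.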
For example, a shopper might be interested in planning a dinner party but may not know what to buy. After the customer provides the gen AI assistant with a few details about the dinner party—such as how many people are attending, whether any guests have dietary restrictions, and overall budget—the gen AI assistant could provide specific product recommendations based on the customer’s preferences or purchase history.
Retailers embark on the chatbot journey. While some retailers have adopted a wait-and-see approach to generative AI (gen AI), others have already started experimenting with the technology (exhibit). Leading retailers, particularly those in grocery and fashion, started experimenting with gen-AI-powered chatbots in late 2023. These chatbots have taken different forms. Walmart launched its “Text to Shop” tool, where customers can text the retailer to search for items, add or remove items from their carts, reorder products, and schedule deliveries. Instacart created a ChatGPT plug-in that allows users to plan a meal in ChatGPT and then convert the output into a basket on Instacart’s website.
While chatbots can be a convenient tool to help reduce customers’ mental load and shopping time, to truly transform the shopping experience and win over customers, chatbots will need to be deeply personalized—for example, being able to remember customers’ order histories, product preferences, and shopping habits. Many leading retailers, particularly in the grocery and fashion spaces, have already begun experimenting with chatbots, though most of these early experiments have not yet harnessed the power of personalization (see the sidebar “Retailers embark on the chatbot journey”).
As is the case with internal value chain gen AI use cases, retailers often adopt the “shaper” archetype for gen AI use cases that transform the customer experience.
Determining the costs of chatbots in retail. The first concern many retailers have about integrating gen-AI-powered chatbots into their business is how much it will cost. That depends on a few factors. Product performance metrics, such as the length of a conversation between a customer and chatbot, are one of the first considerations. The length of the conversation is inversely related to the quality of personalization, meaning that the more personalized a chatbot is for a given customer, the shorter their conversation. Purchase conversions are another factor. The higher the conversion rate, which is linked to the effectiveness of the chatbot, the lower the net operational costs of that chatbot. A third factor is the price of LLM APIs. The cost of using these LLM APIs has dropped dramatically in the past year (for example, when comparing the cost of input tokens, GPT-4o, released in May 2024, is half as expensive to operate as GPT-4 Turbo, released in November 2023). AI experts believe that the price of LLM APIs will continue to drop substantially, with some estimates showing a drop of as much as 80 percent within the next two to three years.
Based on our experience building gen AI chatbots with retail companies across a range of realistic scenarios, a 2 to 4 percent basket uplift can justify LLM costs. Retailers can also combine the power of their generative and analytical AI products to further justify LLM costs. For example, companies can first use gen AI to learn more about a customer, then use analytical models to surface personal offers relevant to that customer. Together, these two technologies can help increase sales conversions.
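To see how conversation length, conversion rate, and API price combine, the following back-of-the-envelope sketch computes the basket uplift at which extra margin covers LLM spend. It is a simplified model under stated assumptions, not the method behind the figures above: it assumes LLM cost is purely per token, that every conversation costs the same, and that uplift is earned only on converted orders. All input values and function names are hypothetical.

```python
def llm_cost_per_conversation(turns, input_tokens_per_turn, output_tokens_per_turn,
                              input_price_per_million, output_price_per_million):
    """LLM API cost of one conversation, in the same currency as the prices."""
    input_cost = turns * input_tokens_per_turn * input_price_per_million / 1_000_000
    output_cost = turns * output_tokens_per_turn * output_price_per_million / 1_000_000
    return input_cost + output_cost


def break_even_uplift(cost_per_conversation, conversion_rate,
                      average_basket, gross_margin):
    """Basket uplift (as a fraction) at which added margin equals LLM cost.

    Assumes only converted conversations generate uplift, so the LLM cost of
    unconverted conversations is spread across the orders that do convert.
    """
    cost_per_order = cost_per_conversation / conversion_rate
    return cost_per_order / (average_basket * gross_margin)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
cost = llm_cost_per_conversation(
    turns=10,
    input_tokens_per_turn=2_000,
    output_tokens_per_turn=300,
    input_price_per_million=5.0,
    output_price_per_million=15.0,
)
uplift = break_even_uplift(
    cost_per_conversation=cost,
    conversion_rate=0.25,
    average_basket=60.0,
    gross_margin=0.25,
)
print(f"LLM cost per conversation: {cost:.3f}")
print(f"Break-even basket uplift: {uplift:.1%}")
```

The sketch makes the three cost factors visible: shorter conversations lower `turns`, better conversion raises `conversion_rate`, and falling API prices lower the per-token inputs, each of which reduces the uplift needed to break even.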
When building a business case, retailers should also consider the investment required to develop a chatbot. Sometimes, the basket uplift may not be high enough to cover the cost of the investment. To understand the full return on their investment, retailers should factor in the cost of attracting new customers who will use these tools, as well as how much the tool can increase the purchase frequency for existing customers.
Measuring chatbots’ impact. In controlled customer experiments, we’ve seen chatbots create a significant increase in convenience for customers. When comparing a traditional retailer app with the minimum viable product of a gen-AI-enabled chatbot, the chatbot reduced the time spent to complete an order by 50 to 70 percent.
QuantumBlack, McKinsey’s AI arm, helps companies transform using the power of technology, technical expertise, and industry experts. With thousands of practitioners at QuantumBlack (data engineers, data scientists, product managers, designers, and software engineers) and McKinsey (industry and domain experts), we are working to solve the world’s most important AI challenges. QuantumBlack Labs is our center of technology development and client innovation, which has been driving cutting-edge advancements and developments in AI through locations across the globe.
Retailers that aren’t ready to invest in chatbots may instead choose to launch smart-search functionality. Smart-search tools allow a customer to receive a list of recommended products by asking a question rather than needing to engage in a conversation with a chatbot. (For example, a customer might search for “dinner party supplies,” and the smart-search tool would provide a list of products that one might need for a dinner party.) While traditional search uses basic algorithms and relies on keyword matching, smart-search tools powered by gen AI can better understand the context and intent of a search term, even if it veers away from keyword use. Although these smart-search tools may be limited in functionality compared with chatbots, and therefore limited in impact, they are easier and less expensive to develop. They also carry fewer risks than chatbots: their outputs are generally limited to a list of products rather than the longer text that a chatbot would give, which means the responses are less likely to be harmful, offensive, or inaccurate.
Gen AI is no longer a novelty. As companies figure out how to implement the technology to create real value, best-in-class retailers will need to move from testing to scaling or else risk falling behind their competitors—or, worse, losing customers. To scale their gen AI tools, retail executives can consider five imperatives for outcompeting in digital and AI:
Some of the guidance outlined above may be sector-agnostic, but scaling gen AI in retail is unique because several of the technology’s use cases involve direct interactions with consumers. In retail, even a 1 percent margin of error could result in millions of customer-facing mistakes. This emphasizes the importance of strong gen AI risk guidelines and safety testing. The stakes may be higher, but the rewards are, too.
Alexander Sukharevsky is a senior partner in McKinsey’s London office, where Sergey Kondratyuk is an associate partner; Andreas Ess is a partner in the Zurich office; Denis Emelyantsev is a partner in the Atlanta office; Emily Reasor is a senior partner in the Denver office; Holger Hürtgen is a partner in the Düsseldorf office; and Oleg Sokolov is an associate partner in the Stockholm office.
The authors wish to thank Andrei Persh and Sergei Sereda for their contributions to this article.
This article was edited by Alexandra Mondalek, an editor in the New York office.
