Say what you will about enterprise perceptions of generative AI (genAI), but they’re certainly neither nuanced nor balanced.
For months, virtually everyone thought genAI was going to solve all business and global problems. Then the reality pendulum swung the other way, with various reports and experts arguing that it won’t work, that nothing will come of it, that the “bubble is bursting,” and that, simply, “the numbers aren’t there.”
Consider Gartner’s report that “at least 30% of generative AI (genAI) projects will be abandoned after proof of concept by the end of 2025, due to poor data quality, inadequate risk controls, escalating costs or unclear business value.” The problem with the Gartner figure is that roughly that same percentage of all IT projects never survive trial tests — so it’s not clear how genAI is worse.
Of course, there’s the report about the CIO of a major pharmaceutical company who paid Microsoft to have 500 employees use Copilot, only to cancel the project after six months, saying it delivered slides that looked like “middle school presentations.” (Note: At least most middle school slideshows quickly get to the point, unlike every Microsoft presentation I have seen. But I digress.)
The practical truth is that both views are wrong. GenAI tools absolutely have value, but that value won’t come easily. IT needs to do a lot more homework.
What kind of homework?
Clean your data: As I noted recently about Agentic RAG strategies, many enterprises suffer from terrible data. It’s out of date, error-ridden, obtained from dubious sources, and might unintentionally contain sensitive data (including PII and health data) that is not supposed to be there. No genAI magic can ever work if the data foundation is a mess. Have your team generate pristine data, and your AI ROI has a chance.
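To make that concrete, here is a minimal sketch of what a first-pass audit might look like before records ever reach a genAI or RAG pipeline. The field names, regexes, and staleness threshold are hypothetical placeholders, not taken from any particular toolkit.

```python
import re
from datetime import datetime, timedelta

# Hypothetical first-pass audit for records headed into a genAI/RAG corpus.
# Flags likely PII (emails, US-SSN-style numbers), stale entries, and missing sourcing.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
MAX_AGE = timedelta(days=365)  # treat anything untouched for a year as stale


def audit_record(record: dict) -> list[str]:
    """Return a list of human-readable issues found in one record."""
    issues = []
    text = record.get("text", "")
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            issues.append(f"possible {label} in text")
    updated = record.get("last_updated")
    if updated and datetime.now() - updated > MAX_AGE:
        issues.append("stale: not updated in over a year")
    if not record.get("source"):
        issues.append("missing source attribution")
    return issues


if __name__ == "__main__":
    sample = {
        "text": "Contact jane.doe@example.com about the Q2 numbers.",
        "last_updated": datetime(2022, 1, 15),
        "source": "",
    }
    for issue in audit_record(sample):
        print("FLAG:", issue)
```

Even a crude pass like this surfaces the records that should never reach a model in the first place.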
Select more suitable projects: This is actually a twofer. First, talk with your team about genAI particulars so you can identify where the technology can help. GenAI can indeed attempt almost anything, but it handles only a small subset of tasks really well. Second, far too many projects have been selected simply because execs wanted an experiment to see what genAI can truly do. You need to be far more selective if you want to give genAI a fair chance.
Assess your hallucination comfort zone: This is arguably the most crucial step. GenAI will hallucinate, and it will do so unpredictably. There are mechanisms you can deploy to reduce hallucinations to a small degree — such as using AI to double-check AI, as Morgan Stanley is attempting, or limiting the data sources genAI is permitted to use.
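As a rough sketch of that combination (restricting sources, then having a second pass audit the first), the skeleton below drafts an answer only from approved context and then asks for an audit of the draft. The call_model function is a hypothetical placeholder for whatever LLM endpoint your enterprise actually uses; nothing here reflects Morgan Stanley’s implementation.

```python
# Minimal sketch of "AI double-checking AI" plus source restriction.
# call_model() is a hypothetical stand-in for a real LLM endpoint;
# it is not an actual library call and must be wired to your provider's SDK.

def call_model(prompt: str) -> str:
    """Placeholder for an actual LLM API call."""
    raise NotImplementedError("connect this to your chosen model provider")


def answer_with_verification(question: str, allowed_sources: list[str]) -> dict:
    """Draft an answer from approved sources only, then audit the draft with a second pass."""
    context = "\n\n".join(allowed_sources)  # the model may draw only on this material
    draft = call_model(
        "Answer using ONLY the context below. If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    audit = call_model(
        f"Context:\n{context}\n\nAnswer:\n{draft}\n\n"
        "List any claims in the answer that are not supported by the context. "
        "Reply SUPPORTED if every claim is grounded."
    )
    return {
        "answer": draft,
        "audit": audit,
        "needs_human_review": "SUPPORTED" not in audit.upper(),
    }
```

The design choice worth noting is the final flag: anything the audit pass cannot confirm gets routed to a human rather than shipped.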
But hallucinations can’t be stopped, and many argue they can’t even be meaningfully reduced. That means difficult conversations. What tasks do you need done where you can tolerate a few blatant lies here and there? Do you want to ban its use with anything customer-facing, such as customer service chatbots?
Even using it to summarize documents or meeting notes requires a discussion. How much human oversight can you apply before the efficiency gains go away? One way to look at it: Which projects are complex enough to benefit from genAI, but not so critical that lies and errors become deal-killers?
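One way to pressure-test that question is plain arithmetic: compare the manual time for a task against drafting time plus review time. The minutes below are hypothetical; the point is that once review effort approaches the manual effort, the efficiency case evaporates.

```python
# Back-of-the-envelope check on when human oversight erases the efficiency gain.
# All numbers are hypothetical; substitute your own measurements.
manual_minutes = 20    # writing a meeting summary by hand
draft_minutes = 1      # genAI producing a draft
review_minutes = 12    # a human verifying and fixing the draft

print("Net savings per summary:", manual_minutes - (draft_minutes + review_minutes))  # 7

# If errors force a near-rewrite, review time approaches manual time and savings vanish.
heavy_review_minutes = 20
print("Savings with heavy review:", manual_minutes - (draft_minutes + heavy_review_minutes))  # -1
```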
Be realistic about ROI objectives
Line-of-business chiefs are used to running ROI objectives by someone in the CFO’s office or at least a division general manager’s office. With genAI efforts, it’s essential to also check with an IT specialist who intimately understands what the technology can and can’t do.
My recommendation: Start with the genAI expert — don’t even discuss it with the number-crunchers until IT okays goals that are reasonable from a tech perspective.
Is it even something you want to bring to the CFO’s office at all? If this is experimentation to see what genAI can do — a perfectly reasonable goal at this point — then perhaps it doesn’t need a spreadsheet-friendly ROI yet.
Rita Sallam, distinguished vice president analyst at Gartner who tracks genAI strategies, said she understands the frustrations CIOs have when trying to apply ROI standards to genAI.
“You can’t get your hands around the actual value,” Sallam said. “There is additional work on your data that has to be done. Your proof of concept needs to be a proof of value. There is a certain percentage that will fail due to lack of the right data, the right guardrails or the absence of being able to properly demonstrate the value. Enterprises are sometimes not acknowledging the foundations that are necessary for genAI success.”
Another industry AI expert, Wirespeed CTO Jake Reynolds, was more blunt. “Believe how excited I was to learn we’re now moving away from statistics and math and instead using a drunken toddler to make these decisions for us,” he said.
About those hallucinations
As for hallucinations, some experts have questioned whether the term itself is apt, mostly because it puts the blame on the software. GenAI is not necessarily malfunctioning when it hallucinates; it is doing precisely what it was programmed to do.
“AI hallucination is all that genAI does,” said Symbol Zero CEO Rafael Brown. “All that it does is throw things together, like throwing pasta and sauce at a wall and waiting to see what sticks. This is done based on what the viewer likes and doesn’t like. There’s no real rhyme or reason. There isn’t true structure, context, simulation, or process. There is no skill, insight, emotion, judgment, inspiration, synthesis, iteration, revision, or creation. It’s like a word jumble or a word salad generator. It’s not even as good as Scrabble or Boggle. It’s better to think of it as AI Mad Libs — trust your business, your future, and your creation to AI Mad Libs.”
There’s also the possibility that genAI might well implode as it starts feeding on itself and all reality-based data vanishes. That’s how Pascal Hetzscholdt, senior director of content protection at publisher Wiley, sees it.
“Models like ChatGPT4 must constantly be retrained on new data to stay relevant and useful,” he said. “As such, this means generative AI is already starting to eat itself alive by being trained on its own output or other AI output.
“Why is this a problem? Well, it means that they will start recognizing patterns of AI generative content, not human-made content,” Hetzscholdt said. “This can lead to a rabbit hole of development, in which the AI is optimizing itself in a counterproductive way. The patterns it sees within the AI content might also go directly against those it sees in human content, leading to incredibly erratic and unstable outputs, which could render the AI useless. This is known as model collapse.”
Hetzscholdt pointed to a study that found that “it only takes a few cycles of training generative AI models on their own output to render them completely useless and output complete nonsense. In fact, one AI they tested only needed nine cycles of retraining it on its own output before the output was just a repetitive list of jackrabbits. As such, by 2026, these generative AIs will likely be trained on data that is primarily of their own creation, and it will only take a few rounds of training on this data before these AIs fall apart.”
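A toy simulation makes the intuition easy to see. The snippet below “trains” a trivial frequency model on its own output each generation; because tokens that drop out never come back, vocabulary diversity only shrinks. It is a sketch of the dynamic, not the methodology of the study Hetzscholdt cites.

```python
import random
from collections import Counter

# Toy illustration of the model-collapse dynamic: a trivial frequency "model"
# retrained only on its own samples loses vocabulary diversity over generations.
# This is a sketch of the idea, not the methodology of the study cited above.

random.seed(1)
vocab = [f"term_{i}" for i in range(20)]
corpus = vocab * 2  # tiny corpus: 40 tokens, 20 distinct


def train(data):
    return Counter(data)  # "training" = recording token frequencies


def generate(model, n):
    tokens, weights = zip(*model.items())
    return random.choices(tokens, weights=weights, k=n)


data = corpus
for generation in range(1, 11):
    model = train(data)
    data = generate(model, len(corpus))  # the next generation sees only model output
    print(f"generation {generation}: {len(set(data))} distinct tokens remain")

# Because sampling noise compounds and lost tokens never return, the distinct-token
# count only ratchets downward, a crude analogue of recursive training collapsing
# toward repetitive output.
```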
His less-than-optimistic conclusion: “This is the paradox of AI — the more we use it, the worse it will get. It’s also why we shouldn’t build our industries or digital social systems around this technology, as it could crumble away very soon, leaving our economy and digital lives like a hollowed-out rotten tree waiting for the next storm to topple it.”