This article is sponsored by NLP Logix and was written, edited, and published in accordance with Emerj's sponsored content guidelines. To learn more about our thought leadership and content creation services, please visit our Emerj Media Services page.
The increase in capital allocated to GenAI over the past few years, particularly to enterprise AI assistants and collaboration platforms such as ChatGPT Enterprise and Microsoft Copilot, is well documented. However, several widely circulated research papers and studies from reliable sources such as Gartner and the Massachusetts Institute of Technology point to a worrying finding: quantifying the value of these tools is proving more difficult than leaders expected.
Research described in MIT’s Project NANDA State of AI in Business 2025 report distinguishes between activity and outcomes, arguing that signals of adoption can outpace provable returns. By the report’s framework, 95% of organizations report “zero return” from their GenAI investments.
Rather than a lack of adoption, the report points to structural challenges that prevent AI tools from delivering sustained value at scale. NANDA highlights weak workflows, limited feedback loops, and weak alignment of AI systems with day-to-day operations as key factors preventing organizations from converting usage into measurable business outcomes.
Taken together, these findings help explain why many GenAI and enterprise assistant programs struggle to maintain executive trust. License and platform spending is visible, but without a clear measurement model or operational approach tied to workflow performance, leaders often lack a defensible way to demonstrate productivity gains or justify continued investment.
In an interview on Emerj’s “AI in Business” podcast, Matt Berseth, co-founder and CIO of NLP Logix, and Russell Dixon, strategic advisor at NLP Logix, joined Emerj Editorial Director Matthew DeMello to explore why AI assistant adoption is stalling and what leaders can operationalize to ensure measurable ROI.
Berseth argues that Copilot and ChatGPT Enterprise are often treated as “collaboration apps” rather than being managed strategically. At the same time, Dixon emphasizes that many rollouts fail upstream, before use cases, guardrails, and measurements are defined.
This article analyzes two core insights for enterprise leaders deploying Microsoft Copilot and ChatGPT Enterprise:
1. Treat Copilot and ChatGPT Enterprise as strategic systems. By operationalizing enablement, ownership, and usage measurement, GenAI assistants can deliver measurable ROI beyond license deployment.
2. Start with goals, guardrails, and measurements. Define the up-front blueprint for implementation (use cases, governance constraints, training expectations, and measurement design) before an assistant joins the workforce.
Listen to the full episode below.
Guest: Matt Berseth, Co-Founder and CIO, NLP Logix
Expertise: AI, data science, software engineering
Brief recognition: Berseth is the co-founder and CIO of NLP Logix, where he leads the delivery of advanced machine learning solutions for industries such as healthcare, logistics, and finance. He has over 20 years of technical leadership experience and previously held engineering and architecture roles at Microsoft and CEVA Logistics. He is an adjunct professor and holds a master’s degree in software engineering from North Dakota State University.
Guest: Russell Dixon, NLP Logix Strategic Advisor
Expertise: Technology innovation, business transformation, information technology
Brief recognition: Dixon is a strategic advisor at NLP Logix, specializing in global operations and business transformation. With over 20 years of experience in information technology, he advises organizations on implementing AI solutions and cloud technologies. Russell’s expertise includes enterprise sales and business automation with a focus on identifying high-value use cases to drive ROI.
Treat Copilot and ChatGPT Enterprise as strategic systems
Berseth describes the current market as one in which leaders’ focus is shifting toward “agentic AI” and bespoke GenAI systems. These efforts are complex, and he points to a broader pattern, also reflected in analyst and industry commentary, of pilots stalling or failing to transition into durable, scaled production value.
In Berseth’s view, assistant platforms such as ChatGPT Enterprise and Microsoft Copilot are being introduced into the environment, but often without the operating model needed to reap their benefits. He describes these tools as “collaboration apps” that are “not viewed through a strategic lens,” and links weak results to a lack of enablement, unclear ownership, and shallow measurement that stalls adoption.
This, he argues, is how “tool creep” begins: companies distribute access, assume value will follow, and then discover uneven usage patterns and low reliability, leaving leaders paying for features that are never operationalized.
“I think it’s more like an instinct in the average employee to say, ‘I use ChatGPT at home. I like that interface better. I come to work. I don’t want to learn how to use this tool either.’

So you have to enable the organization. If you do that, I think you can achieve your goals. If you don’t, I think you’ll be back in three or six months trying to achieve them again because you took the wrong path.”
– Matt Berseth, Co-Founder and CIO, NLP Logix
The assistant becomes “just another tool,” and users return to the consumer interface they already understand. In this scenario, leaders see cost rather than value, and assistant programs come to be viewed as expendable during budgeting and renewal cycles.
Berseth’s prescription is simple: treat these assistants as managed functions with goals, operational ownership, and measurement. The practical implication for executives is that assistant programs need the same scaffolding as any other corporate effort to change how work is done. He emphasizes that simply “enabling it” is not an implementation plan.
He argues that planning needs to be clear about which workflows are important, what the “appropriate” use of those workflows is, and what business metrics will be used to determine progress. It also requires a clear owner who is responsible not only for licenses and access, but also for the validation and adoption of the results.
Next, Berseth zeroes in on measurement, arguing that many organizations rely on adoption metrics too shallow to guide decision-making. He recommends gathering qualitative information through surveys of leaders and end users, combined with an emphasis on quantitative usage data focused on “who is using what features and how.”
The important point is that adoption is not the same as value. Berseth explains that adoption is too vague a measure, arguing that usage patterns vary widely by user, team, and department.
In his framework, ROI depends on identifying “high-leverage uses” and creating mechanisms to extract and distribute them throughout the organization. Without that mechanism, effective usage remains siloed and the assistant’s value appears inconsistent from an executive perspective.
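As a rough illustration of the feature-level usage analysis Berseth describes, the sketch below aggregates hypothetical assistant interaction logs per user and flags heavy users as candidates for high-leverage study. The log format, field names, and threshold are all illustrative assumptions, not a real Copilot or ChatGPT Enterprise export schema.

```python
from collections import defaultdict

# Hypothetical usage-log records: (user, team, feature) per assistant
# interaction. This schema is an assumption for illustration only.
events = [
    ("ana", "legal", "summarize"),
    ("ana", "legal", "summarize"),
    ("ana", "legal", "draft"),
    ("bo", "legal", "summarize"),
    ("cy", "sales", "search"),
]

def usage_profile(events):
    """Count feature-level usage per user: who uses what, and how often."""
    profile = defaultdict(lambda: defaultdict(int))
    for user, _team, feature in events:
        profile[user][feature] += 1
    return profile

def high_leverage_users(events, min_events=3):
    """Flag users whose total usage exceeds a threshold -- candidates whose
    workflows can be studied and redistributed as training material."""
    totals = defaultdict(int)
    for user, _team, _feature in events:
        totals[user] += 1
    return sorted(u for u, n in totals.items() if n >= min_events)

print(high_leverage_users(events))  # ['ana']
```

In practice, the interesting signal is not the raw counts but the differences between teams and users that the counts expose, which is what Berseth suggests feeding back into training and norms.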
Such a system also works as a governance tool: it allows leaders to reduce noise (“everyone is experimenting in different ways”) and replace it with repeatable practices that can be taught, audited, and measured.
It also reduces the chance that the most valuable usage patterns remain siloed within a single team or department, a common failure mode when AI assistants are deployed widely with minimal enablement.
In practice, Berseth’s model implies a set of operational moves that can be owned by an AI enablement function, a CIO/CTO office, or a cross-functional transformation team.
- Define a small set of high-priority workflows where the assistant is expected to change throughput, quality, or cycle time.
- Track usage beyond logins by monitoring feature-level patterns and encouraging use within workflows.
- Identify high-leverage users, understand how their behavior differs, and use those differences to inform training and norms.
- Continually update training as the assistant’s “surface area” changes, since features ship weekly or monthly and best practices evolve with them.
Berseth also emphasizes the speed at which these products are evolving. When assistants change frequently, one-time training quickly becomes stale. Organizations need a mechanism to continually update usage guidance and fold new product features into day-to-day practice; otherwise, they are systematically underutilizing the tools they are paying for.
Start with goals, guardrails, and measurements
While Berseth focuses on the operating model needed to sustain value from assistants over the long term, Dixon focuses on the decisions that determine success before deployment begins.
He argues that many adoptions fail upstream when leaders release tools without defining goals, realistic use cases, governance guardrails to protect internal and client data, and a plan for how productivity and usage will be measured.
In Dixon’s framework, these decisions are not administrative overhead. They determine whether users perceive consistent value, whether adoption becomes part of daily work, and whether leaders can evaluate programs with confidence rather than intuition.
To achieve this, Dixon argues, leaders need to solidify realistic use cases, deployment approaches, precise expected results, and plans to train users on daily usage. Without these, he says, leaders won’t get the results they want.
This means “use cases” should not be treated as a list of generic AI assistant functions (summarization, drafting, search) but expressed as workflow intent and output expectations: which work products will improve, what changes in turnaround time and quality are expected, and which roles will be affected.
In this way, Dixon argues, leaders can prevent assistant programs from becoming diffuse experiments that cannot be reliably evaluated, managed, or measured.
Next, Dixon ties assistant implementation to governance, including how tools are used and what guardrails ensure internal and client data is protected. Rather than presenting governance as an abstract compliance layer, he argues it must be a prerequisite for driving large-scale adoption: when governance is unclear, users circumvent tools or use them in uncontrolled ways, increasing risk.
Measurement is the last step in Dixon’s sequence. He argues that leaders need to decide whether to rely on user feedback or to implement formal measurement tools and processes that monitor adoption and usage “on the fly” as the rollout progresses.
“You need to understand what you want from a tool from the beginning: what are your use cases, how will you train your users, what are the guardrails to protect your data? Finally, how will you measure productivity?”
– Russell Dixon, NLP Logix Strategic Advisor
Dixon also highlights the cost of skipping these steps by describing how such deployments actually play out. When leaders “just release these tools for a while,” users grow frustrated because they don’t get the results they’re looking for. They then either “look for something else” or fail to take full advantage of the organization’s investment in the system.
This warning is particularly relevant in environments where “shadow AI” is already present. As consumer tools become more easily accessible, enterprise adoption faces competition from familiar experiences. Organizations cannot rely on novelty or obligation; adoption requires clarity, training, guardrails, and a measurement approach that makes value visible.
Dixon’s sequence defines the decisions that must be made before deployment so that an AI assistant program can be evaluated and improved with discipline:
- Goals: what the organization wants from the assistant.
- Use cases, deployment, and results: what to do, where to do it, and what outcomes to expect.
- Training: how users apply the assistant daily, rather than once in a while.
- Governance/guardrails: how usage is constrained to protect internal and client data.
- Measurement: how productivity, adoption, and usage are tracked throughout the deployment.
- Tool and workflow adaptation: how the assistant is tuned to fit existing workflows.
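One way to make Dixon’s point that these decisions are prerequisites rather than overhead is to treat the blueprint as a structured record that must be fully populated before rollout. The sketch below is a minimal illustration under that assumption; all field names and example values are hypothetical, not a product configuration.

```python
from dataclasses import dataclass

# A minimal sketch of a pre-deployment blueprint in the spirit of Dixon's
# sequence. Field names and example values are illustrative assumptions.
@dataclass
class AssistantBlueprint:
    goals: list          # what the organization wants from the assistant
    use_cases: list      # workflows, expected outputs, affected roles
    training_plan: str   # how users are enabled for daily use
    guardrails: list     # data-protection and usage constraints
    metrics: list        # productivity, adoption, and usage measures

    def ready_to_deploy(self) -> bool:
        """A rollout is defined upstream only if every element is populated."""
        return all([self.goals, self.use_cases, self.training_plan,
                    self.guardrails, self.metrics])

plan = AssistantBlueprint(
    goals=["reduce contract review turnaround"],
    use_cases=["summarize inbound contracts for legal review"],
    training_plan="recurring enablement sessions covering new features",
    guardrails=["no client data outside approved tools"],
    metrics=["feature-level usage by team", "review cycle time"],
)
print(plan.ready_to_deploy())  # True
```

The design choice mirrors Dixon’s argument: an incomplete blueprint (say, one with no metrics) fails the readiness check, which is exactly the “released for a while with no measurement” failure mode he warns about.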
He argues that without such a framework as a basis, the assistant’s universality becomes a drawback: when a tool can be used for everything, it is easy for an organization to end up measuring nothing. For Dixon and Berseth, that scenario makes it difficult for leaders to confidently defend renewal decisions.

