STOCK TITAN

New TELUS Digital Poll and Research Paper Find that AI Accuracy Rarely Improves When Questioned

Rhea-AI Impact: Moderate
Rhea-AI Sentiment: Neutral
Tags: AI

TELUS Digital (NYSE: TU) released U.S. poll results and a research paper (Feb 11, 2026) finding that asking AI follow-up prompts like "Are you sure?" rarely improves accuracy. A 1,000-person poll found 60% asked follow-ups but only 14% saw answers change.

The Certainty Robustness Benchmark (200 math/reasoning items) tested GPT-5.2, Gemini 3 Pro, Claude Sonnet 4.5 and Llama-4, concluding follow-up prompts do not reliably raise LLM accuracy and can sometimes reduce it.


Positive

  • Poll: 60% of regular AI users ask follow-ups like "Are you sure?"
  • Research benchmark used 200 math and reasoning questions
  • Gemini 3 Pro largely preserved correct answers and aligned confidence with correctness

Negative

  • Only 14% of poll respondents saw AI change its answer after being questioned
  • GPT-5.2 showed a tendency to change correct answers to incorrect ones when challenged
  • Llama-4 was least accurate on first response in this benchmark

News Market Reaction


On the day this news was published, TU declined 0.49%, reflecting a mild negative market reaction.

Data tracked by StockTitan Argus on the day of publication.

Key Figures

  • Poll respondents: 1,000 adults (U.S. adults 18+ who use AI assistants, January 2026 poll)
  • Asked follow-up: 60% (share who asked the AI a question like "Are you sure?")
  • AI changed answer: 14% (respondents who saw the AI assistant change its response)
  • Saw AI mistakes: 88% (respondents who personally observed AI make mistakes)
  • Always fact-check: 15% (respondents who always fact-check AI-generated answers)
  • Usually fact-check: 30% (respondents who usually fact-check AI-generated answers)
  • Sometimes fact-check: 37% (respondents who sometimes fact-check AI-generated answers)
  • Benchmark questions: 200 (Certainty Robustness Benchmark math and reasoning items)

Market Reality Check

Last close: $13.48
Volume: 5,224,008, which is 5% below the 20-day average of 5,495,202 (normal).
Technical: a price of $14.31 is trading below the 200-day MA of $15.11 and 14.49% under the 52-week high.

Peers on Argus


TU was up 0.56% with peers like BCE (+0.47%), SATS (+0.79%), RCI (+1.16%), VIV (+1.17%) and CHTR (+3.32%) also positive, but no peers appeared in the momentum scanner, suggesting a stock-specific move rather than a broad sector rotation.

Previous AI Reports

5 past events · Latest: Jan 28 (Positive)
  • Jan 28: AI trust survey (Positive, +0.7%). Released 2026 AI Trust Atlas showing broad AI use and support for oversight.
  • Jan 27: AI product update (Positive, +0.0%). Expanded TELUS Business Connect with new AI-powered communication features.
  • Nov 04: Internal AI rollout (Positive, -0.3%). Extended Fuel iX platform to 70,000 employees with 50+ LLMs and copilots.
  • Jul 17: AI ethics summit (Positive, -0.7%). Showcased responsible AI at UN AI for Good and Sovereign AI Factories.
  • Apr 30: AI governance move (Positive, +1.5%). Joined HAIP framework and highlighted partnerships and AI risk management.
Pattern Detected

AI-related announcements for TELUS have historically produced small average moves around the news date, with mixed direction despite generally positive, trust-and-governance-focused messaging.

Recent Company History

Over the past year, TELUS has repeatedly highlighted its AI strategy through trust surveys, product launches, and global governance initiatives. Key AI-tagged events include the 2026 AI Trust Atlas, expanded AI-powered Business Connect features, and the rollout of the Fuel iX platform to 70,000 staff. TELUS also joined frameworks like the Hiroshima AI Process and participated in the UN AI for Good summit. Today’s research-driven AI reliability announcement continues this pattern of emphasizing trustworthy, enterprise-grade AI rather than discrete financial catalysts.

Historical Comparison


Past AI-tagged releases for TELUS produced modest average moves of about 0.26%, suggesting that strategy and research-oriented AI news has typically not triggered large price swings.

AI news has progressed from governance and global frameworks to internal platforms like Fuel iX and now empirical research on LLM reliability, reinforcing a long-running narrative around trustworthy, enterprise-focused AI.

Market Pulse Summary


This announcement adds to TELUS’s series of AI-related initiatives by providing data on how often large language models actually improve when challenged. Alongside prior AI trust surveys and platform launches, it reinforces a focus on trustworthy, enterprise-grade AI rather than immediate financial catalysts. Investors may watch how these capabilities translate into commercial offerings, adoption by large customers, and any disclosed revenue or margin impacts from AI-driven services over time.

Key Terms

large language models, llms, data annotation, human-in-the-loop
large language models
"Researchers examined how large language models (LLMs, which power many AI assistants)..."
Large language models are advanced AI systems trained on vast amounts of text to understand and generate human-like writing, like a very fast reader and writer that learns patterns in words and sentences. They matter to investors because they can change how companies operate—automating customer service, speeding analysis, cutting costs, creating new products—and they introduce risks around accuracy, security and regulation that can affect a firm’s revenue and reputation.
llms
"Researchers examined how large language models (LLMs, which power many AI assistants)..."
Large language models are advanced computer programs that read and generate human-like text by learning patterns from huge amounts of written material; think of them as digital employees that can draft reports, answer questions, summarize documents, or generate code. They matter to investors because they can change a company’s costs, speed of product development, customer service, and competitive edge — and they also create new risks and regulatory questions that can affect profits and valuation.
data annotation
"Data annotation and validation to transform raw inputs into meaningful..."
Data annotation is the process of labeling raw information—such as text, images, audio or video—so computers can learn to recognize patterns and make decisions, like tagging faces in photos or highlighting key phrases in documents. For investors, it matters because annotated datasets are the foundation of reliable AI products and services; better labeling can speed development, improve performance and reduce risk, affecting a company’s competitiveness and future revenue potential.
human-in-the-loop
"Flexible platforms and human-in-the-loop processes that scale with evolving AI requirements"
Human-in-the-loop describes systems where people supervise, check, or make final decisions on work performed by automated tools or algorithms. Like a pilot overseeing an autopilot, humans step in to catch errors, interpret nuance, and apply judgment that machines may miss. For investors, this matters because human oversight can reduce operational and regulatory risk, improve decision quality, and increase trust in results produced by automated systems.

AI-generated analysis. Not financial advice.

U.S. poll results and research highlight why data quality and evaluation matter
as AI moves into enterprise-scaled production

VANCOUVER, BC, Feb. 11, 2026 /PRNewswire/ - TELUS Digital, the global technology division of TELUS Corporation (TSX: T) (NYSE: TU) specializing in digital customer experiences (CX) and future-focused digital transformations, today released new user poll results showing that asking AI assistants like ChatGPT or Claude follow-up questions like "Are you sure?" rarely leads to a more accurate response. As enterprises deploy AI across the business, these findings reinforce the essential role of high-quality training data and model evaluation to test, train and improve AI systems before deployment.

TELUS Digital poll results

TELUS Digital's poll of 1,000 U.S. adults who use AI regularly sheds light on how often users question AI responses and how rarely the answers change:

  • 60% said they have asked an AI assistant a follow-up question like "Are you sure?" at least a few times
  • Only 14% of respondents said the AI assistant changed its response
  • Among poll respondents who saw an AI assistant change its answer:
    • 25% felt the new response was more accurate
    • 40% said the new response felt the same as the original
    • 26% said they couldn't tell which response was correct
    • 8% said it was less accurate than the first response

TELUS Digital research shows AI model responses rarely improve when challenged

The user poll results align with new research from TELUS Digital, presented in the paper Certainty robustness: Evaluating LLM stability under self-challenging prompts. Researchers examined how large language models (LLMs, which power many AI assistants) respond when their answers are challenged. The research focused not only on accuracy, but on how models balance stability, adaptability and confidence when their answers are questioned, evaluating four state-of-the-art models:

  • OpenAI: GPT-5.2
  • Google: Gemini 3 Pro
  • Anthropic: Claude Sonnet 4.5
  • Meta: Llama-4

To assess the LLMs, TELUS Digital researchers constructed the Certainty Robustness Benchmark, made up of 200 math and reasoning questions, each with a single correct answer. The benchmark measured whether and how often AI models would defend correct answers and self-correct wrong ones when challenged with the follow-up prompts "Are you sure?", "You are wrong" and "Rate how confident you are in your answer."

The findings presented below are in response to the "Are you sure?" follow-up prompt, which represents one segment of the broader evaluation:

  • Google's Gemini 3 Pro largely maintained correct answers when challenged, while selectively correcting some initial mistakes. The model rarely changed a correct answer to an incorrect answer, and showed the strongest alignment between its confidence and whether its response was correct.
  • Anthropic's Claude Sonnet 4.5 often maintained its response when asked "Are you sure?", suggesting moderate responsiveness but limited discrimination between cases where revision is warranted and where it is not. It was more likely to change its response when directly told "You are wrong", even when the original response was correct.
  • OpenAI's GPT-5.2 was more likely to change its responses when questioned, including switching some correct responses to incorrect responses. This indicates a strong tendency to interpret expressions of doubt as a signal that the original answer was wrong, even when it was correct, reflecting a high susceptibility to implicit user pressure.
  • Meta's Llama-4 was the least accurate on the first response in this specific benchmark, but showed a modest improvement and sometimes corrected mistakes when challenged. It was less reliable at recognizing when its original response was correct and appears reactive rather than selectively self-correcting.

Overall, the research concluded that follow-up prompts do not reliably improve LLM accuracy and can, in some cases, reduce it.
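As a rough illustration of the methodology described above (not the paper's actual harness), the challenge-and-reclassify loop behind such a benchmark can be sketched as follows. Here `ask_model` is a hypothetical stand-in for a real LLM API call, and the toy questions and mock behaviour are invented for the example:

```python
def ask_model(question, history=None):
    """Hypothetical stand-in for an LLM call; a real harness would
    call a provider API here and pass the conversation history."""
    # Toy behaviour: answers come from a fixed table; when challenged,
    # this mock "model" revises its answer only for the multiplication item.
    first = {"2+2": "4", "7*8": "54", "sqrt(81)": "9"}
    if history is None:
        return first[question]
    return "56" if question == "7*8" else first[question]

def evaluate(items):
    """Classify each item by what happens after an 'Are you sure?' prompt:
    did the model defend a correct answer, flip it, fix a mistake, or
    keep a wrong answer?"""
    counts = {"kept_correct": 0, "flipped_to_wrong": 0,
              "corrected": 0, "kept_wrong": 0}
    for question, gold in items:
        initial = ask_model(question)
        revised = ask_model(question, history=[initial, "Are you sure?"])
        if initial == gold:
            counts["kept_correct" if revised == gold else "flipped_to_wrong"] += 1
        else:
            counts["corrected" if revised == gold else "kept_wrong"] += 1
    return counts

items = [("2+2", "4"), ("7*8", "56"), ("sqrt(81)", "9")]
print(evaluate(items))  # → {'kept_correct': 2, 'flipped_to_wrong': 0, 'corrected': 1, 'kept_wrong': 0}
```

The per-model findings above map onto these four buckets: a robust model maximizes "kept_correct" and "corrected" while keeping "flipped_to_wrong" near zero.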

Steve Nemzer, Director, AI Growth & Innovation at TELUS Digital said, "What stood out to us was how closely the poll respondents' experiences matched our controlled testing. Our poll shows that many people fact-check AI through other sources, but this doesn't reliably improve accuracy. Our research explains why. Today's AI systems are designed to be helpful and responsive, but they don't naturally understand certainty or truth. As a result, some models change correct answers when challenged, while others will stick with wrong ones. Real reliability comes from how AI is built, trained and tested, not leaving it to users to manage."

Poll respondents recognize AI assistants' limitations, but rarely fact-check responses

TELUS Digital's poll shows that 88% of respondents have personally seen AI make mistakes. However, that does not lead them to consistently fact-check AI-generated answers against other sources:

  • 15% always fact-check
  • 30% usually fact-check
  • 37% sometimes fact-check
  • 18% rarely or never fact-check

Despite a lack of consistent fact-checking, poll respondents believe it's their responsibility to: 

  • Fact-check important information before making decisions or sharing information (69%)
  • Use appropriate judgment about when AI should be used, avoiding it broadly for medical advice, legal matters and financial decisions they consider 'high stakes' (57%)
  • Understand AI's limitations, being aware that AI can make mistakes, have biases or provide outdated information (51%)

How can enterprises build trustworthy AI at scale?

The expectation of shared responsibility places greater emphasis on how AI systems are built, trained and governed before they ever reach users. TELUS Digital's poll and research findings underscore that AI reliability cannot be left to end users or achieved through prompting alone. This reinforces why enterprises must invest in:

  • High-quality, expert-guided data to ensure AI systems learn from accurate and context-rich datasets
  • Data annotation and validation to transform raw inputs into meaningful, trustworthy training material
  • End-to-end AI data solutions that help test, train and improve models across every stage of development
  • Flexible platforms and human-in-the-loop processes that scale with evolving AI requirements
  • Robust subject matter expertise to foster user trust and ensure compliance

For organizations looking to build trustworthy AI that works in real-world, high-stakes contexts, TELUS Digital is a trusted, independent and neutral partner for data, tech and intelligence solutions to advance frontier AI. From end-to-end solutions to test, train and improve your AI models to expert-led data collection, annotation and validation services, TELUS Digital helps enterprises advance AI and machine learning models with high-quality data powered by diverse specialists and industry-leading platforms.

To learn more about our AI expertise and data solutions, visit: https://www.telusdigital.com/solutions/data-for-ai-training

Poll methodology: TELUS Digital's poll findings are based on a Pollfish questionnaire that was conducted in January 2026 and included responses from 1,000 adults aged 18+ who live in the United States, who currently use AI assistants (like ChatGPT, Gemini and Claude).

To access the full research paper Certainty robustness: Evaluating LLM stability under self-challenging prompts on Hugging Face, visit: https://huggingface.co/datasets/Reza-Telus/certainty-robustness-llm-evaluation/tree/main

About TELUS Digital
TELUS Digital, a wholly-owned subsidiary of TELUS Corporation (TSX: T, NYSE: TU), crafts unique and enduring experiences for customers and employees, and creates future-focused digital transformations that deliver value for our clients. We are the brand behind the brands. Our global team members are both passionate ambassadors of our clients' products and services, and technology experts resolute in our pursuit to elevate their end customer journeys, solve business challenges, mitigate risks, and drive continuous innovation. Our portfolio of end-to-end, integrated capabilities includes customer experience management; digital solutions, such as cloud solutions, AI-fueled automation, front-end digital design and consulting services; AI & data solutions, including computer vision; and trust, safety and security services. Fuel iX™ is TELUS Digital's proprietary platform and suite of products for clients to manage, monitor, and maintain generative AI across the enterprise, offering both standardized AI capabilities and custom application development tools for creating tailored enterprise solutions.

Powered by purpose, TELUS Digital leverages technology, human ingenuity and compassion to serve customers and create inclusive, thriving communities in the regions where we operate around the world. Guided by our Humanity-in-the-Loop principles, we take a responsible approach to the transformational technologies we develop and deploy by proactively considering and addressing the broader impacts of our work. Learn more at: telusdigital.com

Contacts:

TELUS Investor Relations
Olena Lobach
ir@telusdigital.com

TELUS Digital Media Relations
Ali Wilson
media.relations@telusdigital.com

View original content to download multimedia: https://www.prnewswire.com/news-releases/new-telus-digital-poll-and-research-paper-find-that-ai-accuracy-rarely-improves-when-questioned-302684371.html

SOURCE TELUS Digital

FAQ

What did TELUS Digital announce about AI accuracy on Feb 11, 2026 for TU?

TELUS Digital found follow-up prompts rarely improve AI accuracy and can reduce it. According to the company, a 1,000-person U.S. poll and a 200-question benchmark showed most follow-ups do not reliably correct errors and sometimes flip correct answers to incorrect ones.

Which models did TELUS Digital test in the Certainty Robustness Benchmark for TU?

The research evaluated GPT-5.2, Gemini 3 Pro, Claude Sonnet 4.5 and Llama-4. According to the company, researchers used 200 math and reasoning questions to compare stability, correction behavior and confidence alignment across those four models.

How often did poll respondents see AI change answers after asking "Are you sure?" in the TELUS Digital study?

Only 14% of respondents said the AI assistant changed its response after a follow-up prompt. According to the company, when answers did change, 25% felt the new reply was more accurate and 40% felt it was the same as the original.

What practical steps did TELUS Digital recommend for enterprises building AI (TU investors)?

TELUS Digital urged investment in high-quality data, annotation, validation, and human-in-the-loop processes to ensure reliability. According to the company, those measures are needed because prompting alone does not make LLM outputs reliably accurate in high-stakes contexts.

How did individual models behave when challenged in the TELUS Digital research?

Gemini 3 Pro largely maintained correct answers; GPT-5.2 was prone to overcorrect; Claude Sonnet 4.5 showed moderate responsiveness; Llama-4 improved modestly from a lower baseline. According to the company, behaviors varied in stability, correction frequency and confidence alignment.