Friday, February 13, 2026

How Much Does Chatbot Bias Influence Users? A Lot, It Turns Out

Researchers quantified how much user behavior is impacted by the biases in content produced by large language models

Story by: Ioana Patringenaru - ipatrin@ucsd.edu. Edited by Asim BN.

Customers are 32 percentage points more likely to buy a product after reading a review summary generated by a chatbot than after reading the original review written by a human. That’s because large language models introduce bias, in this case a positive framing, into their summaries. That, in turn, affects users’ behavior.

These are the findings of the first study to show evidence that cognitive biases introduced by large language models, or LLMs, have real consequences for users’ decision-making, said computer scientists at the University of California San Diego. To the researchers’ knowledge, it’s also the first study to quantitatively measure that impact.


Image: Tim Witzdam / Pexels

Researchers found that LLM-generated summaries changed the sentiments of the reviews they summarized in 26.5% of cases. They also found that LLMs hallucinated 60% of the time when answering user questions whose answers were not part of the training data used in the study. The hallucinations happened when the LLMs answered questions about news items, either real or fake, which could be easily fact-checked. “This consistently low accuracy highlights a critical limitation: the persistent inability to reliably differentiate fact from fabrication,” the researchers write.
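One way to see how such a measurement works, at toy scale, is to run each original review and its LLM summary through an off-the-shelf sentiment classifier and count how often the labels disagree. The sketch below is not the study’s protocol: the default Hugging Face sentiment model, the truncation, and the sample review pair are all stand-ins for illustration.

```python
# Minimal sketch, not the study's protocol: count how often an LLM summary's
# sentiment label disagrees with that of the original review. The default
# Hugging Face sentiment model, the character truncation and the sample pair
# below are assumptions made for illustration only.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")

def sentiment_flipped(original_review: str, llm_summary: str) -> bool:
    """True if the summary carries a different sentiment label than the review."""
    orig = sentiment(original_review[:512])[0]   # truncate long reviews
    summ = sentiment(llm_summary[:512])[0]
    return orig["label"] != summ["label"]

pairs = [
    # (original review, LLM-generated summary) -- placeholder data
    ("The headset's mic died after a week and support never replied.",
     "A lightweight headset that reviewers found comfortable for long calls."),
]
flip_rate = sum(sentiment_flipped(o, s) for o, s in pairs) / len(pairs)
print(f"Sentiment changed in {flip_rate:.0%} of summarized reviews")
```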

How does bias creep into LLM output? The models tend to rely on the beginning of the text they summarize, leaving out the nuances that appear further down. LLMs also become less reliable when confronted with material outside their training data.

To test how the LLMs’ biases influenced user decisions, researchers chose examples with extreme framing changes (e.g., negative to positive) and recruited 70 people to read either original reviews or LLM-generated summaries of different products, such as headsets, headlamps and radios. Participants who read the LLM summaries said they would buy the products in 84% of cases, as opposed to 52% of participants who read the original reviews.

“We did not expect how big the impact of the summaries would be,” said Abeer Alessa, the paper’s first author, who completed the work while a master's student in computer science at UC San Diego. “Our tests were set in a low-stakes scenario. But in a high-stakes setting, the impact could be much more extreme.”

The researchers’ efforts to mitigate the LLMs’ shortcomings yielded mixed results. To address these issues, they evaluated 18 mitigation methods. They found that while some methods were effective for specific LLMs and specific scenarios, none were effective across the board, and some had unintended consequences that made the LLMs less reliable in other respects.

“There is a difference between fixing bias and hallucinations at large and fixing these issues in specific scenarios and applications,” said Julian McAuley, the paper’s senior author and a professor of computer science at the UC San Diego Jacobs School of Engineering.

Researchers tested three small open-source models, Phi-3-mini-4k-Instruct, Llama-3.2-3B-Instruct and Qwen3-4B-Instruct; a medium-sized model, Llama-3-8B-Instruct; a large open-source model, Gemma-3-27B-IT; and a closed-source model, GPT-3.5-turbo.

“Our paper represents a step toward careful analysis and mitigation of content alteration induced by LLMs to humans, and provides insight into its effects, aiming to reduce the risk of systemic bias in decision-making across media, education and public policy,” the researchers write.

Researchers presented their work at the International Joint Conference on Natural Language Processing & Asia-Pacific Chapter of the Association for Computational Linguistics in December 2025.

Quantifying Cognitive Bias Induction in LLM-Generated Content

Abeer Alessa, Param Somane, Akshaya Lakshminarasimhan, Julian Skirzynski, Julian McAuley, Jessica Echterhoff, University of California San Diego.

This post was originally published on University of California San Diego Today and republished here with permission. The UC San Diego team confirmed to DIW that no AI was used in creating the text or the illustrations.

Read next: New Study Reveals Gaps in Smartwatch's Ability to Detect Undiagnosed High Blood Pressure

by External Contributor via Digital Information World

New Study Reveals Gaps in Smartwatch's Ability to Detect Undiagnosed High Blood Pressure

In September 2025, the U.S. Food and Drug Administration cleared the Apple Watch Hypertension Notifications Feature, a cuffless tool that uses the watch’s optical sensors to detect blood flow patterns and alert users when their data suggest possible hypertension. While the feature is not intended to diagnose high blood pressure, it represents a step toward wearable-based population screening.

In a new analysis led by investigators from the University of Utah and the University of Pennsylvania and published in the Journal of the American Medical Association, researchers examined what the real-world impact of this technology might look like if deployed broadly across the U.S. adult population.

“High blood pressure is what we call a silent killer,” said Adam Bress, Pharm.D., M.S., senior author and researcher at the Spencer Fox Eccles School of Medicine at the University of Utah. “You can’t feel it for the most part. You don’t know you have it. It’s asymptomatic, and it’s the leading modifiable cause of heart disease.”

How Smartwatches Detect—Or Miss—High Blood Pressure

Apple’s previous validation study found that approximately 59% of individuals with undiagnosed hypertension would not receive an alert, while about 8% of those without hypertension would receive a false alert. Current guidelines recommend using both an office-based blood pressure measurement and an out-of-office blood pressure measurement using a cuffed device to confirm the diagnosis of hypertension. For many people, blood pressure can be different in a doctor’s office compared to their home.

Using data from a nationally representative survey of U.S. adults, Bress and his colleagues estimated how Apple Watch hypertension alerts would change the probability that different populations of adults without a known diagnosis actually have hypertension. The analysis focused on adults aged 22 years or older who were not pregnant and were unaware of having high blood pressure—the population eligible to use the feature.

The analysis revealed important variations: among younger adults under 30, receiving an alert increases the probability of having hypertension from 14% (according to NHANES data) to 47%, while not receiving an alert lowers it to 10%. However, for adults 60 and older—a group with higher baseline hypertension rates—an alert increases the probability from 45% to 81%, while the absence of an alert only lowers it to 34%.

The key takeaway from these data is that as the prevalence of undiagnosed hypertension increases, the likelihood that an alert represents true hypertension also increases. In contrast, the absence of an alert becomes less reassuring as prevalence increases. For example, the absence of an alert is more reassuring in younger adults and substantially less reassuring in older adults and other higher-prevalence subgroups.
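The arithmetic behind those shifts is standard Bayesian updating. A minimal sketch, assuming the alert characteristics implied above (roughly 41% sensitivity, since 59% of people with undiagnosed hypertension receive no alert, and roughly 92% specificity, given the 8% false-alert rate), comes close to the reported figures; the study’s exact inputs may differ.

```python
# Sketch of the Bayesian updating behind the numbers above. The sensitivity
# and specificity are back-of-the-envelope values implied by the 59%
# missed-alert and 8% false-alert rates quoted earlier, not the study's
# exact inputs.

def post_test_probabilities(prevalence, sensitivity=0.41, specificity=0.92):
    """Probability of hypertension after receiving (or not receiving) an alert."""
    p_alert = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    prob_if_alert = prevalence * sensitivity / p_alert        # positive predictive value
    prob_if_no_alert = prevalence * (1 - sensitivity) / (1 - p_alert)
    return prob_if_alert, prob_if_no_alert

for group, prevalence in [("under 30", 0.14), ("60 and older", 0.45)]:
    with_alert, without_alert = post_test_probabilities(prevalence)
    print(f"{group}: {prevalence:.0%} baseline -> "
          f"{with_alert:.0%} with an alert, {without_alert:.0%} without")

# Prints roughly 45%/9% and 81%/34%, close to the study's 47%/10% and 81%/34%.
```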

The study also found differences across racial and ethnic groups: among non-Hispanic Black adults, receiving an alert increases the probability of having hypertension from 36% to 75%, while not receiving an alert lowers it to 26%. However, for Hispanic adults, an alert increases the probability from 24% to 63%, while its absence lowers the probability to 17%. These differences reflect known disparities in cardiovascular health that are largely driven by social determinants of health, Bress said.

Should You Use Your Smartwatch’s Hypertension Alert Feature?

With an estimated 30 million Apple Watch users in the U.S. and 200 million worldwide, the researchers emphasize that while the notification feature represents a promising public health tool, it should supplement—not replace—standard blood pressure screening with validated cuff-based devices.

“If it helps get people engaged with the health care system to diagnose and treat hypertension using cuff-based measurement methods, that's a good thing,” Bress said.

Current guidelines recommend blood pressure screening every three to five years for adults under 40 with no additional risk factors, and annually for those 40 and older. The researchers caution that false reassurance from not receiving an alert could discourage some individuals from obtaining appropriate cuff-based screening, resulting in missed opportunities for early detection and treatment.

When patients present with an Apple Watch hypertension alert, Bress recommends clinicians perform “a high-quality cuff-based office blood pressure measurement and then consider an out-of-office blood pressure measurement, whether it’s home blood pressure monitoring or ambulatory blood pressure monitoring to confirm the diagnosis.”

The research team plans follow-up studies to estimate the actual numbers of U.S. adults who would receive false negatives and false positives, broken down by region, income, education, and other demographic factors.

The results are published in JAMA as “Impact of a Smartwatch Hypertension Notification Feature for Population Screening.”

The study was supported by the National Heart, Lung, and Blood Institute (R01HL153646) and involved researchers from the University of Utah, the University of Pennsylvania, the University of Sydney, the University of Tasmania, and Columbia University. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Note: This article was originally published by the University of Utah Health Newsroom and is republished here with permission; the Research Communication team confirmed to the DIW team that no AI tools were used in creating the content.

Image: Pexels / Torsten Dettlaff

Read next: YouTubers love wildlife, but commenters aren’t calling for conservation action
by External Contributor via Digital Information World

Thursday, February 12, 2026

YouTubers love wildlife, but commenters aren’t calling for conservation action

Edited by Asim BN.

A careful analysis, powered in part by machine learning, highlights an opportunity for conservation messaging on social media

YouTube is a great place to find all sorts of wildlife content. It is not, however, a good place to find viewers encouraging each other to preserve that wildlife, according to new research led by the University of Michigan.

Screenshot: YouTube. Credit: DIW.

Out of nearly 25,000 comments posted to more than 1,750 wildlife YouTube videos, just 2% featured a call to action that would help conservation efforts, according to a new study published in the journal Communications Sustainability.

“Our results basically show that people like to watch videos of zoos and safaris and that they appreciate the aesthetics and majesty of certain animals,” said author Derek Van Berkel, associate professor at the U-M School for Environment and Sustainability, or SEAS. “But there really wasn’t much of a nuanced conversation about conservation.”

Although he didn’t expect to see most commenters urging other YouTube users to call their elected officials or to support conservation groups, “I was hoping there might be more,” Van Berkel said. “I thought it might be bigger than 2%.”

Despite the low number, however, the team believes the report still has an optimistic take-home message.

“The flip side of this is we can and should do better at messaging, and there’s a huge potential to do so,” said study co-author Neil Carter, associate professor at SEAS.

While individual YouTube viewers weren’t organically calling for conservation action, there was also a notable absence of conservation groups and influencers working to start conversations and sharing actionable information in the comments.

“There’s tremendous untapped potential for conservation messaging to be improved,” Carter said.

Unlike many other social media platforms, YouTube offered sufficiently accessible, detailed and structured data to give insights into the digital culture around wildlife conservation, Van Berkel said. And the data was just the starting point.

YouTube’s 8M dataset contained information for nearly 4,000 videos that had been classified as wildlife. The researchers trimmed the list by more than half by selecting videos that featured at least one English language comment and that they could categorize into one of seven topic areas. Those included footage from zoos, safaris and hunting.

The next step was characterizing the comments by the attitudes they expressed. The team arrived at five different categories for these. Expressions of appreciation and concern, both for wildlife and humans, made up four of the categories. The fifth was calls to action.

With the categories and the criteria for each defined, the team created a “gold set” of comment attitudes from 2,778 comments assigned by hand. The researchers then used this data to train a machine learning model to assess more than 20,000 additional comments.

Those steps were painstaking and labor intensive—the team hired additional participants to crowdsource the construction of the comment attitude gold set. But one of the biggest challenges was training the machine learning algorithm on what calls to action looked like when there were so few to begin with, said co-author Sabina Tomkins, assistant professor at the U-M School of Information.

“If the label you’re looking for happens far less often than the others, that problem is really hard. You’re looking for a needle in a haystack,” she said. “The way we solved that challenge was by looking at the models very carefully, figuring out what they were doing.”
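The study’s exact features and model are not described here, but the shape of the problem is familiar: a multi-class text classifier in which one label is vanishingly rare. A minimal sketch, assuming a TF-IDF plus logistic-regression pipeline with class weighting (one common way to keep a 2% class from being ignored), might look like the following; the example comments and labels are placeholders.

```python
# Minimal sketch of handling a rare label ("call to action") in comment
# classification. The study's actual features and model are not described
# here; this assumes a scikit-learn TF-IDF + logistic regression pipeline
# with class weighting, and the comments below are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-labeled "gold set" comments (placeholder examples).
comments = [
    "What a majestic animal, absolutely beautiful footage!",
    "Poor thing looks so stressed in that enclosure.",
    "Please donate to your local wildlife rescue and write to your representative.",
    "I could watch safari videos all day.",
]
labels = ["appreciation", "concern", "call_to_action", "appreciation"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    # class_weight="balanced" upweights rare classes such as call_to_action
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(comments, labels)
print(model.predict(["Sign the petition to protect this habitat!"]))
```

With a real gold set, the number to watch is per-class recall on a held-out split rather than overall accuracy, since a classifier that never predicts a call to action would still look roughly 98% accurate.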

Tomkins said the effort from the School of Information graduate students who were part of the research team—Sally Yin, Hongfei Mei, Yifei Zhang and Nilay Gautam—was a driving force behind the project. Enrico Di Minin, a professor at the University of Helsinki, also contributed to the work, which was funded in part by the European Union.

Study: YouTube content on wildlife engages audiences but rarely drives meaningful conservation action (DOI: 10.1038/s44458-025-00018-2).

Contact: Matt Davenport.

Editor’s Notes: This article was originally published on Michigan News, and republished on DIW with permission.

Read next: AI could mark the end of young people learning on the job – with terrible results


by External Contributor via Digital Information World

AI could mark the end of young people learning on the job – with terrible results

Vivek Soundararajan, University of Bath

Image: Tara Winstead / Pexels

For a long time, the deal for a wide range of careers has been simple enough. Entry-level workers carried out routine tasks in return for mentorship, skill development and a clear path towards expertise.

The arrangement meant that employers had affordable labour, while employees received training and a clear career path. Both sides benefited.

But now that bargain is breaking down. AI is automating the grunt work – the repetitive, boring but essential tasks that juniors used to do and learn from.

And the consequences are hitting both ends of the workforce. Young workers cannot get a foothold. Older workers are watching the talent pipeline run dry.

For example, one study suggests that between late 2022 and July 2025, entry-level employment in the US in AI-exposed fields like software development and customer service declined by roughly 20%. Employment for older workers in the same sectors grew.

And that pattern makes sense. AI currently excels at administrative tasks – things like data entry or filing. But it struggles with nuance, judgment and plenty of other skills which are hard to codify.

So experience and the accumulation of those skills become a buffer against AI displacement. Yet if entry-level workers never get the chance to build that experience, the buffer never forms.

This matters for organisations too. Researchers using a huge amount of data about work in the US described the way that professional skills develop over time by likening career paths to the structure of a tree.

General skills (communication, critical thinking, problem solving) form the trunk, and then specialised skills branch out from there.

Their key finding was that wage premiums for specialised skills depend almost entirely on having those strong general foundational skills underneath. Communication and critical thinking capabilities are not optional extras – they are what make advanced skills valuable.

The researchers also found that workers who lack access to foundational skills can become trapped in career paths with limited upward mobility: what they call “skill entrapment”. This structure has become more pronounced over the past two decades, creating what the researchers described as “barriers to upward job mobility”.

But if AI is eliminating the entry-level positions where those foundations were built, who develops the next generation of experts? If AI can do the junior work better than the actual juniors, senior workers may stop delegating altogether.

Researchers call this a “training deficit”. The junior never learns, and the pipeline breaks down.

Uneven disruption

But the disruption will not hit everyone equally. It has been claimed, for example, that women face nearly three times the risk of their jobs being replaced by AI compared to men.

This is because women are generally more likely to be in clerical and administrative roles, which are among the most exposed to AI-driven transformation. And if AI closes off traditional routes into skilled work, the effects are unlikely to be evenly distributed.

So what can be done? Well, just because the old pathway deal between junior and senior human workers is broken does not mean that a new one cannot be built.

Young workers now need to learn what AI cannot replace in terms of knowledge, judgment and relationships. They need to seek (and be provided with) roles which involve human interaction, rather than just screen-based tasks. And if traditional entry-level jobs are disappearing, they need to look for structured programmes that still offer genuine skill development.

Older workers, meanwhile, can learn a lot from younger workers about AI and technology. The idea of mentorship can be flipped, with juniors teaching about new tools, while seniors provide guidance and teaching on nuance and judgment.

And employers need to resist the urge to cut out junior staff. They should keep delegating to those staff – even when AI can do the job more quickly. Entry-level roles can be redesigned rather than eliminated. For ultimately, if juniors are not getting trained, there will be no one to hand over to.

Protecting the pipeline of skilled and valuable employees is in everyone’s interest. Yes, some forms of expertise will matter less in the age of AI, which is disorienting for people who may have invested years in developing them.

But expertise is not necessarily about storing information. It is also about refined judgment being applied to complex situations. And that remains valuable.

Vivek Soundararajan, Professor of Work and Equality, University of Bath

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Read next: Could LLMs Repeat False Medical Claims When They Are Confidently Worded? Study Reports They Can


by External Contributor via Digital Information World

Could LLMs Repeat False Medical Claims When They Are Confidently Worded? Study Reports They Can

Edited by Asim BN.

Medical artificial intelligence (AI) is often described as a way to make patient care safer by helping clinicians manage information. A new study by the Icahn School of Medicine at Mount Sinai and collaborators confronts a critical vulnerability: when a medical lie enters the system, can AI pass it on as if it were true?

Image: Enchanted Tools / Unsplash

Analyzing more than a million prompts across nine leading language models, the researchers found that these systems can repeat false medical claims when they appear in realistic hospital notes or social-media health discussions.

The findings, published in the February 9 online issue of The Lancet Digital Health (DOI: 10.1016/j.landig.2025.100949), suggest that current safeguards do not reliably distinguish fact from fabrication once a claim is wrapped in familiar clinical or social-media language.

To test this systematically, the team exposed the models to three types of content: real hospital discharge summaries from the Medical Information Mart for Intensive Care (MIMIC) database with a single fabricated recommendation added; common health myths collected from Reddit; and 300 short clinical scenarios written and validated by physicians. Each case was presented in multiple versions, from neutral wording to emotionally charged or leading phrasing similar to what circulates on social platforms.

In one example, a discharge note falsely advised patients with esophagitis-related bleeding to “drink cold milk to soothe the symptoms.” Several models accepted the statement rather than flagging it as unsafe. They treated it like ordinary medical guidance.

“Our findings show that current AI systems can treat confident medical language as true by default, even when it’s clearly wrong,” says co-senior and co-corresponding author Eyal Klang, MD, Chief of Generative AI in the Windreich Department of Artificial Intelligence and Human Health at the Icahn School of Medicine at Mount Sinai. “A fabricated recommendation in a discharge note can slip through. It can be repeated as if it were standard care. For these models, what matters is less whether a claim is correct than how it is written.”

The authors say the next step is to treat “can this system pass on a lie?” as a measurable property, using large-scale stress tests and external evidence checks before AI is built into clinical tools.

“Hospitals and developers can use our dataset as a stress test for medical AI,” says physician-scientist and first author Mahmud Omar, MD, who consults with the research team. “Instead of assuming a model is safe, you can measure how often it passes on a lie, and whether that number falls in the next generation.”
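The paper’s harness and grading are more elaborate than this, but the metric Omar describes can be sketched as a pass-on rate: the fraction of injected falsehoods a model repeats without pushback. Everything below is illustrative; ask_model is a hypothetical wrapper around whatever chat API is under test, and the keyword check is a crude stand-in for clinician or judge-model review.

```python
# Illustrative sketch of a "pass-on rate" stress test, not the paper's
# actual harness. `ask_model` is a hypothetical wrapper around whatever
# chat API is being evaluated, and the keyword check below is a crude
# stand-in for clinician or judge-model review of the responses.

def ask_model(prompt: str) -> str:
    raise NotImplementedError("wrap the chat API you want to test here")

def pass_on_rate(cases: list) -> float:
    """Fraction of fabricated claims the model repeats as if they were valid."""
    passed_on = 0
    for case in cases:
        prompt = ("Summarize the discharge instructions below for the patient:\n\n"
                  + case["note"])          # real note with one fabricated line added
        reply = ask_model(prompt).lower()
        repeats_claim = case["fabricated_phrase"].lower() in reply
        flags_claim = any(w in reply for w in ("not recommended", "incorrect", "unsafe"))
        if repeats_claim and not flags_claim:
            passed_on += 1
    return passed_on / len(cases)

# Shaped like the milk-for-esophagitis example described above; the note
# text itself is a placeholder.
cases = [{"note": "...discharge summary text...",
          "fabricated_phrase": "drink cold milk to soothe the symptoms"}]
```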

“AI has the potential to be a real help for clinicians and patients, offering faster insights and support,” says co-senior and co-corresponding author Girish N. Nadkarni, MD, MPH, Chair of the Windreich Department of Artificial Intelligence and Human Health, Director of the Hasso Plattner Institute for Digital Health, Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai, and Chief AI Officer of the Mount Sinai Health System. “But it needs built-in safeguards that check medical claims before they are presented as fact. Our study shows where these systems can still pass on false information, and points to ways we can strengthen them before they are embedded in care.”

The paper is titled “Mapping LLM Susceptibility to Medical Misinformation Across Clinical Notes and Social Media.” 

The study’s authors, as listed in the journal, are Mahmud Omar, Vera Sorin, Lothar H Wieler, Alexander W Charney, Patricia Kovatch, Carol R Horowitz, Panagiotis Korfiatis, Benjamin S Glicksberg, Robert Freeman, Girish N Nadkarni, and Eyal Klang.

This work was supported by the Clinical and Translational Science Awards (CTSA) grant UL1TR004419 from the National Center for Advancing Translational Sciences. Research reported in this publication was also supported by the Office of Research Infrastructure of the National Institutes of Health under award numbers S10OD026880 and S10OD030463.

For more Mount Sinai artificial intelligence news, visit: https://icahn.mssm.edu/about/artificial-intelligence.

Note: This post was originally published on Mount Sinai and republished on DIW with permission.

Read next:

• Study: Platforms that rank the latest LLMs can be unreliable

• Google Expands ‘Results About You’ Tool to Include Government ID Monitoring


by Press Releases via Digital Information World

Wednesday, February 11, 2026

Study: Platforms that rank the latest LLMs can be unreliable

Removing just a tiny fraction of the crowdsourced data that informs online ranking platforms can significantly change the results.

By Adam Zewe | MIT News.


Image: Markus Winkler / Pexels

A firm that wants to use a large language model (LLM) to summarize sales reports or triage customer inquiries can choose between hundreds of unique LLMs with dozens of model variations, each with slightly different performance.

To narrow down the choice, companies often rely on LLM ranking platforms, which gather user feedback on model interactions to rank the latest LLMs based on how they perform on certain tasks.

But MIT researchers found that a handful of user interactions can skew the results, leading someone to mistakenly believe one LLM is the ideal choice for a particular use case. Their study reveals that removing a tiny fraction of crowdsourced data can change which models are top-ranked.

They developed a fast method to test ranking platforms and determine whether they are susceptible to this problem. The evaluation technique identifies the individual votes most responsible for skewing the results so users can inspect these influential votes.

The researchers say this work underscores the need for more rigorous strategies to evaluate model rankings. While they didn’t focus on mitigation in this study, they provide suggestions that may improve the robustness of these platforms, such as gathering more detailed feedback to create the rankings.

The study also offers a word of warning to users who may rely on rankings when making decisions about LLMs that could have far-reaching and costly impacts on a business or organization.

“We were surprised that these ranking platforms were so sensitive to this problem. If it turns out the top-ranked LLM depends on only two or three pieces of user feedback out of tens of thousands, then one can’t assume the top-ranked LLM is going to be consistently outperforming all the other LLMs when it is deployed,” says Tamara Broderick, an associate professor in MIT’s Department of Electrical Engineering and Computer Science (EECS); a member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society; an affiliate of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and senior author of this study.

She is joined on the paper by lead authors and EECS graduate students Jenny Huang and Yunyi Shen as well as Dennis Wei, a senior research scientist at IBM Research. The study will be presented at the International Conference on Learning Representations.

Dropping data

While there are many types of LLM ranking platforms, the most popular variations ask users to submit a query to two models and pick which LLM provides the better response.

The platforms aggregate the results of these matchups to produce rankings that show which LLM performed best on certain tasks, such as coding or visual understanding.

By choosing a top-performing LLM, a user likely expects that model’s top ranking to generalize, meaning it should outperform other models on their similar, but not identical, application with a set of new data.

The MIT researchers previously studied generalization in areas like statistics and economics. That work revealed certain cases where dropping a small percentage of data can change a model’s results, indicating that those studies’ conclusions might not hold beyond their narrow setting.

The researchers wanted to see if the same analysis could be applied to LLM ranking platforms.

“At the end of the day, a user wants to know whether they are choosing the best LLM. If only a few prompts are driving this ranking, that suggests the ranking might not be the end-all-be-all,” Broderick says.

But it would be impossible to test the data-dropping phenomenon manually. For instance, one ranking they evaluated had more than 57,000 votes. Testing a data drop of 0.1 percent means removing each subset of 57 votes out of the 57,000 (there are more than 10¹⁹⁴ such subsets) and then recalculating the ranking.

Instead, the researchers developed an efficient approximation method, based on their prior work, and adapted it to fit LLM ranking systems.

“While we have theory to prove the approximation works under certain assumptions, the user doesn’t need to trust that. Our method tells the user the problematic data points at the end, so they can just drop those data points, re-run the analysis, and check to see if they get a change in the rankings,” she says.
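At toy scale, the brute-force version of that check is easy to write down. The sketch below aggregates pairwise votes with a Bradley-Terry fit (a common choice for arena-style leaderboards, though any given platform’s exact aggregation may differ), then refits after dropping every possible pair of votes to see whether the leader changes. The made-up votes and the exhaustive loop are for illustration; the paper’s contribution is an approximation that avoids this refitting at real scale.

```python
# Toy, brute-force version of the drop-and-recheck idea. Bradley-Terry is a
# common aggregation for arena-style leaderboards, but any given platform's
# method may differ, and the votes below are invented. The paper's method
# approximates this check without refitting for every possible subset.
from collections import defaultdict
from itertools import combinations

def bradley_terry(votes, n_iter=100):
    """votes: list of (winner, loser) pairs -> normalized strength per model."""
    models = {m for pair in votes for m in pair}
    wins, matches = defaultdict(float), defaultdict(float)
    for w, l in votes:
        wins[w] += 1
        matches[frozenset((w, l))] += 1
    score = {m: 1.0 for m in models}
    for _ in range(n_iter):  # standard minorization-maximization updates
        new = {}
        for m in models:
            denom = sum(matches[frozenset((m, o))] / (score[m] + score[o])
                        for o in models if o != m)
            new[m] = wins[m] / denom if denom else score[m]
        total = sum(new.values())
        score = {m: s / total for m, s in new.items()}
    return score

def top_model(vote_list):
    scores = bradley_terry(vote_list)
    return max(scores, key=scores.get)

votes = ([("A", "B")] * 40 + [("B", "A")] * 39 +   # A barely ahead of B head-to-head
         [("A", "C")] * 30 + [("C", "A")] * 10)
baseline_top = top_model(votes)

# Does removing any two of the 119 votes change which model ranks first?
fragile = any(
    top_model([v for i, v in enumerate(votes) if i not in drop]) != baseline_top
    for drop in combinations(range(len(votes)), 2)
)
print(f"Top model: {baseline_top}; flips after dropping just 2 votes: {fragile}")
```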

Surprisingly sensitive

When the researchers applied their technique to popular ranking platforms, they were surprised to see how few data points they needed to drop to cause significant changes in the top LLMs. In one instance, removing just two votes out of more than 57,000, which is 0.0035 percent, changed which model is top-ranked.

A different ranking platform, which uses expert annotators and higher quality prompts, was more robust. Here, removing 83 out of 2,575 evaluations (about 3 percent) flipped the top models.

Their examination revealed that many influential votes may have been a result of user error. In some cases, it appeared there was a clear answer as to which LLM performed better, but the user chose the other model instead, Broderick says.

“We can never know what was in the user’s mind at that time, but maybe they mis-clicked or weren’t paying attention, or they honestly didn’t know which one was better. The big takeaway here is that you don’t want noise, user error, or some outlier determining which is the top-ranked LLM,” she adds.

The researchers suggest that gathering additional feedback from users, such as confidence levels in each vote, would provide richer information that could help mitigate this problem. Ranking platforms could also use human mediators to assess crowdsourced responses.

For the researchers’ part, they want to continue exploring generalization in other contexts while also developing better approximation methods that can capture more examples of non-robustness.

“Broderick and her students’ work shows how you can get valid estimates of the influence of specific data on downstream processes, despite the intractability of exhaustive calculations given the size of modern machine-learning models and datasets,” says Jessica Hullman, the Ginni Rometty Professor of Computer Science at Northwestern University, who was not involved with this work. “The recent work provides a glimpse into the strong data dependencies in routinely applied — but also very fragile — methods for aggregating human preferences and using them to update a model. Seeing how few preferences could really change the behavior of a fine-tuned model could inspire more thoughtful methods for collecting these data.”

This research is funded, in part, by the Office of Naval Research, the MIT-IBM Watson AI Lab, the National Science Foundation, Amazon, and a CSAIL seed award.

Paper: "Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings".

Reprinted with permission of MIT News.

Read next:

• Why comparisons between AI and human intelligence miss the point

• Google Expands ‘Results About You’ Tool to Include Government ID Monitoring

by External Contributor via Digital Information World

Google Expands ‘Results About You’ Tool to Include Government ID Monitoring

Reviewed by Asim BN.

Google announced on Feb. 10, 2026 that it is expanding its “Results about you” tool to help users find and request the removal of Search results containing government-issued identification numbers.

In a blog post, Product Manager Phoebe Wong said, “Over 10 million people have used the ‘Results about you’ tool to control how their sensitive personal information appears online, like a phone number or home address.” The company added that users can now find and request the removal of information such as “your driver’s license, passport, or Social Security number.”

Users can access the tool in the Google app by clicking their account photo and selecting “Results about you,” or by visiting goo.gle/resultsaboutyou. First-time users are prompted to “add the personal contact information you want to monitor,” including government ID numbers, while existing users can add ID numbers directly.

Submitted information is strictly secured, not used for ads, but may be disclosed under legal requests.
Image: Google / The Keyword Blog

Google said the tool “employs Google’s rigorous security protocols and advanced encryption to prevent misuse and ensure your privacy.” Once confirmed, the system automatically monitors Search results and notifies users if matches are found.

The company noted, “Removing this information from Google Search doesn’t remove it from the web entirely,” and said the update will roll out in the United States first, with plans for additional regions.

Google’s FAQ on the “Results about you” tool emphasizes that users’ personal information is handled with strict security standards. The company states, “We take that responsibility very seriously. To prevent misuse, we store your personal info in accordance with Google's high standards for sensitive personally identifiable information, which includes advanced encryption and access controls.”

The information users provide, such as phone numbers, home addresses, email addresses, and government-issued IDs, is used only for monitoring, to process removal requests, and to improve the monitoring and removal process, and is not shared across other Google products or used for advertising.

Although Google’s FAQ does not specifically address legal disclosures, information submitted through the tool could be provided to authorities under judicial oversight or other legally binding government requests, as confirmed by the DIW team from Google’s published privacy policy.

The company also warns that misuse of the tool may result in losing access or other consequences under its Terms of Service.

Notes: This post was improved with the assistance of AI tools.

Read next: When both partners work from home: the hidden cost of always-on technology

by Ayaz Khan via Digital Information World