Sunday, September 7, 2025

Why AI Chatbots Still Hallucinate: Researchers Trace Errors to Training Data Gaps and Misaligned Benchmarks

Artificial intelligence tools are now used in classrooms, offices, and customer support desks. Yet they carry a flaw that refuses to fade. Ask a chatbot a simple factual question, and it may deliver a confident answer that turns out to be wrong. Researchers at OpenAI, joined by collaborators at Georgia Tech, say they now have a clearer picture of why this happens.

Where the Mistakes Begin

Large language models are trained by scanning enormous volumes of text and learning to predict what word should come next. That process gives them fluency, but it also builds in errors. The team’s paper explains that even with perfectly clean training data, mistakes are mathematically inevitable.

Some facts are simply too rare for a system to learn. A birthday that appears only once in a dataset, for example, gives the model no pattern to generalize from. The authors capture this with the “singleton rate,” the share of facts that show up exactly once in the training data. A high singleton rate means a model will almost certainly invent details when asked about those facts. This is why common knowledge tends to come back correct, while obscure details often come back scrambled.
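To make the idea concrete, here is a minimal sketch (illustrative only, with made-up data, not code from the paper) of how a singleton rate could be measured: count how many distinct facts appear exactly once in a training set.

    from collections import Counter

    # Hypothetical training "facts": (person, birthday) pairs pulled from text.
    facts = [
        ("Ada Lovelace", "December 10"),   # repeated many times in the corpus
        ("Ada Lovelace", "December 10"),
        ("Ada Lovelace", "December 10"),
        ("Jane Example", "March 3"),       # appears exactly once: a singleton
        ("John Sample", "July 19"),        # also a singleton
    ]

    counts = Counter(facts)
    singletons = sum(1 for c in counts.values() if c == 1)
    singleton_rate = singletons / len(counts)

    # Two of the three distinct facts are singletons, so the rate is about 0.67.
    # The argument is that errors on questions about such one-off facts are
    # expected to be at least on this order, no matter how clean the data is.
    print(f"singleton rate: {singleton_rate:.2f}")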

From Exams to Algorithms

The training phase is only half the story. After that, models are fine-tuned to better match human expectations. But the way they are tested keeps the cycle going.

Benchmarks usually grade answers as right or wrong. There’s no credit for admitting uncertainty. A chatbot that says “I don’t know” is punished as harshly as one that blurts out something false. Under that system, guessing is the smarter move. Over time, models are effectively trained to bluff.

The researchers compare this to multiple-choice exams. Students who leave blanks score lower than those who make lucky guesses. AI models, shaped by similar scoring, act in much the same way.
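A quick back-of-envelope example, with hypothetical numbers rather than figures from the paper, shows the incentive at work. Suppose a model has only a 25 percent chance of guessing a question correctly:

    # Binary grading: 1 point for a correct answer, 0 for wrong or "I don't know".
    p_correct = 0.25   # hypothetical chance a guess is right

    expected_guess = p_correct * 1 + (1 - p_correct) * 0   # = 0.25
    expected_abstain = 0.0                                  # abstaining scores nothing

    # Guessing always scores at least as well, so benchmark-tuned models learn to guess.
    print(expected_guess, expected_abstain)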

When Models Go Wrong

Examples from the study illustrate how deep the problem runs. One widely used model was asked for the birthday of Adam Kalai, one of the paper’s authors. It gave three different dates across separate attempts, none of them correct, even though it had been told to answer only if it was certain.

In another test, a system failed at counting the letters in a word, producing results that made little sense. These cases show both the arbitrary fact problem and what the authors call poor model representation, where the structure of the system limits its ability to handle simple tasks.

Changing the Scoreboards

The researchers argue the solution lies in evaluation. Instead of rewarding risky guesses, new benchmarks should penalize confident wrong answers more than admissions of uncertainty. One option is to grant partial credit when a model holds back. Another is to set confidence thresholds in the test instructions, telling the model to answer only if it reaches a defined level of certainty.
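Here is a minimal sketch of how such a rule might look; the three-point penalty and the confidence levels are illustrative assumptions, not the paper’s exact numbers:

    def score(answered, correct, penalty=3.0):
        # Confidence-targeted grading: correct answers earn a point, confident
        # wrong answers lose `penalty` points, and abstaining scores zero.
        if not answered:
            return 0.0
        return 1.0 if correct else -penalty

    def expected_value(confidence, penalty=3.0):
        # Expected score if the model answers with this chance of being right.
        return confidence * score(True, True, penalty) + (1 - confidence) * score(True, False, penalty)

    # With a 3-point penalty, answering only beats abstaining above 75% confidence.
    for p in (0.50, 0.75, 0.90):
        print(f"confidence {p:.0%}: answer EV = {expected_value(p):+.2f}, abstain EV = +0.00")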

This echoes older exam systems where wrong guesses were penalized, discouraging blind attempts. The same principle could shift AI development toward models that value accuracy over bravado.

Limits and Outlook

The study makes clear that hallucinations will not vanish completely. Some questions are inherently unanswerable because the data is missing, ambiguous, or too complex. But better testing could reduce the most damaging errors and build greater trust in AI systems.

The broader point is that hallucinations are not random glitches. They are the product of how models are trained, and more importantly, how they are judged. If the industry changes the scoreboards, the behavior of the models is likely to follow.

Notes: This post was edited/created using GenAI tools. 

Read next:

• AI Models Can Now Run Ransomware Attacks on Their Own, Study Finds

• Secure Online Transactions and Business Models in E-commerce and Marketplaces

• Chatbots Are Spreading More False Claims, NewsGuard Report Shows


by Irfan Ahmad via Digital Information World

Secure Online Transactions and Business Models in E-commerce and Marketplaces

This article was created in partnership with Mangopay for promotional purposes.

In the digital age, the landscape of commerce has dramatically shifted from traditional brick-and-mortar stores to online platforms. This transformation has given rise to two dominant business models: e-commerce and marketplaces. Both models have revolutionized the way consumers shop and businesses operate, offering unparalleled convenience and access to a global market. However, with these advancements come challenges, particularly in ensuring secure online transactions. As cyber threats become more sophisticated, the need for robust security measures in e-commerce and marketplaces is more critical than ever.


This article delves into the intricacies of these business models, explores the differences between them, and discusses strategies for enhancing transaction security to foster trust and reliability in the digital marketplace.

Understanding the Difference Between E-commerce and Marketplaces

The terms e-commerce and marketplace are often used interchangeably, but they describe distinct business models with unique characteristics. E-commerce refers to the buying and selling of goods and services over the internet. Typically, e-commerce platforms are operated by a single vendor who manages the entire sales process, from product listing to payment processing and delivery. This model allows businesses to maintain control over their brand and customer experience.

On the other hand, marketplaces are platforms that connect multiple sellers with buyers. These platforms do not own the inventory but facilitate transactions between third-party vendors and consumers. Marketplaces offer a wide variety of products from different sellers, providing consumers with diverse options and competitive pricing. Examples of popular marketplaces include Amazon, eBay, and Etsy.

For a more detailed exploration of the differences between these two models, you can visit https://blog.mangopay.com/en/home/what-is-the-difference-between-e-commerce-and-marketplaces.

Enhancing Payment Security in E-commerce and Marketplaces

As online transactions become more prevalent, ensuring payment security is paramount for both e-commerce platforms and marketplaces. Consumers need assurance that their financial information is protected from fraud and unauthorized access. To achieve this, businesses must implement robust security measures, including encryption, tokenization, and secure payment gateways.

Encryption is a fundamental security measure that protects sensitive data by converting it into a code that can only be deciphered with a key. Tokenization replaces sensitive data with unique identifiers, or tokens, that have no exploitable value. Secure payment gateways act as intermediaries between the consumer and the merchant, ensuring that payment information is transmitted securely.
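To make tokenization concrete, here is a minimal sketch; it is not any particular provider’s implementation, and the in-memory vault stands in for what would be a hardened, PCI-scoped service:

    import secrets

    # Illustrative token vault: in practice this lives in a secured, audited service.
    _vault = {}

    def tokenize(card_number):
        # Replace a card number with a random token that has no exploitable value.
        token = "tok_" + secrets.token_hex(12)
        _vault[token] = card_number      # only the vault can map the token back
        return token

    def detokenize(token):
        # Resolve a token back to the original card number (vault access required).
        return _vault[token]

    token = tokenize("4111 1111 1111 1111")
    print(token)                  # e.g. tok_9f2c... safe to store or log
    print(detokenize(token))      # original number, recoverable only via the vault

The point of the design is that systems outside the vault only ever see tokens, so a breach of the storefront database does not expose usable card data.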

Additionally, platforms can improve their payment acceptance rates by optimizing their payment processes and reducing friction during checkout. For insights on how platforms can enhance their payment acceptance rates, refer to https://blog.mangopay.com/en/home/how-platforms-can-improve-their-payment-acceptance-rates.

Building Trust Through Secure Business Models

Trust is a crucial component of successful online transactions. Consumers are more likely to engage with platforms that prioritize security and transparency. E-commerce businesses and marketplaces can build trust by implementing comprehensive security policies, providing clear communication about data protection practices, and offering reliable customer support.

One effective strategy is to obtain security certifications, such as PCI DSS (Payment Card Industry Data Security Standard) compliance, which demonstrates a commitment to maintaining high security standards. Regular security audits and vulnerability assessments can also help identify and address potential risks before they impact consumers.

Furthermore, fostering a community of trust involves educating consumers about safe online practices. Providing resources and guidance on recognizing phishing attempts, creating strong passwords, and safeguarding personal information can empower consumers to protect themselves while shopping online.

In conclusion, the success of e-commerce and marketplaces hinges on their ability to provide secure and seamless online transactions. By understanding the differences between these business models and implementing robust security measures, businesses can enhance consumer trust and drive growth in the digital marketplace.

Read next: AI Models Can Now Run Ransomware Attacks on Their Own, Study Finds
by Web Desk via Digital Information World

Saturday, September 6, 2025

AI Models Can Now Run Ransomware Attacks on Their Own, Study Finds

A team at NYU Tandon has shown that large language models can manage the full cycle of a ransomware campaign without human involvement. Their prototype, described in a recent paper, demonstrates how artificial intelligence can scan systems, single out valuable files, choose attack methods, and draft ransom notes written for specific victims. The system was built and tested inside a controlled research environment, but the findings raise questions about how this technology could be misused.

Researchers frame this as the next step in ransomware’s evolution. The earliest versions, often called Ransomware 1.0, only locked files with encryption. Later strains, grouped as Ransomware 2.0, added double extortion, where attackers both encrypted and stole data. The new approach, which the team calls Ransomware 3.0, replaces pre-written code with natural language instructions. At runtime, an AI model interprets these prompts and generates fresh attack scripts.

As the study puts it: “Every execution yields different code, artifacts, and extortion notes such that the binary footprint and execution behavior are different every time, rendering traditional defenses ineffective.”

Inside the prototype

The proof-of-concept uses an orchestrator that runs in four stages: reconnaissance, leverage, launch, and notify. In the first, it gathers information about the operating system and files. In the second, it identifies which files matter most. The launch phase decides on the type of payload, which can be encryption, exfiltration, or destruction. The notify stage produces a ransom message tailored to the system.

The orchestrator was written in Go and bundled with a Lua interpreter. Lua was chosen because it runs from memory, has a small footprint, and works across different platforms. The orchestrator queries open-source AI models, which then generate Lua code to perform each task. These scripts are validated and run in a sandbox.

Three main payloads were built. One encrypts files using a lightweight cipher. Another uploads files to a remote server. A third destroys them beyond recovery. Each is created on demand by the model, so no two runs are identical.

How it was tested

The researchers ran 30 trials across three types of environments: personal computers, enterprise servers, and embedded industrial controllers. Each environment was seeded with normal documents and sensitive material such as tax forms, passwords, or configuration files.

In reconnaissance, both small and large models produced nearly complete file listings. Sensitive file detection was less consistent. The larger GPT-120B model identified up to 83 percent of critical files on personal computers and 78 percent on servers, but accuracy dropped to 48 percent on controllers. The smaller GPT-20B model was less precise but avoided some policy refusals that blocked the larger system.

When the models moved to payload execution, they selected the expected attack in nearly all cases: exfiltration on personal computers, encryption on company servers, and destruction on controllers. Success rates varied by task, with encryption working in 85 percent of runs and destruction in 95 percent. Exfiltration was less reliable, at around 65 percent.

Ransom notes were generated in every run. These messages often included specific filenames discovered earlier. A typical note read: “Hello Jane, your files have been encrypted. We found Tax_Return_2024_Final.pdf and Family_SSN_List.csv. To avoid public disclosure, follow the instructions below.”

Why this is harder to spot

One reason the researchers call this Ransomware 3.0 is the way each attack changes shape. Even with the same instructions, the model produces different versions of code. The encryption routines, for example, varied in how they handled key scheduling or byte order. This natural variation makes it difficult for signature-based defenses to detect.

The study also found that the system left fewer visible traces than normal ransomware. Traditional malware often produces high disk activity, CPU spikes, or heavy network use. In contrast, this orchestrator only touched selected files and consumed little bandwidth. The authors note that it “completed the full attack lifecycle without displaying classic signatures of conventional ransomware.”

This stealth makes it harder for defenders to rely on standard warning signs.

Shifting the economics

Running such an attack could cost far less than traditional campaigns. One end-to-end execution used about 23,000 tokens, which would cost roughly 70 cents if commercial APIs were used. With open-source models, the cost drops close to nothing.

This changes the business model. Established groups currently spend on developers, infrastructure, and coordination. With an AI-driven pipeline, even small operators with basic hardware could carry out complex campaigns. The study points out that “an orchestrator can execute thousands of polymorphic, personalized attacks,” creating chances to profit from targets that were once ignored.

Limits and safeguards

The prototype was never deployed outside of the lab. It lacks persistence, advanced evasion, or lateral spread. The aim was to show feasibility, not to build a working tool for criminals. The team also avoided using jailbreaks. Instead, they designed prompts that made the model generate the code as if it were performing ordinary programming tasks.

The work was reviewed under institutional ethics processes. As the authors explain: “All experiments were conducted within a controlled and isolated environment to ensure that no harm was caused to real systems, users, or networks.”

Even so, the modular structure means a real attacker could expand it. Persistence could be added, or negotiation modules could be introduced to manage extortion after the initial compromise.

What defenders can do

The researchers argue that defenders should not expect to stop this type of ransomware with legacy methods. More proactive monitoring may be needed, such as tracking access to sensitive files, planting decoy documents to catch attackers during reconnaissance, and blocking unapproved connections to AI services. Building stronger safeguards into AI models themselves may also be necessary.
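As one illustration of the decoy-document idea, here is a minimal sketch that simply watches a planted canary file for reads. The path is hypothetical, and a production monitor would hook the operating system’s audit framework rather than poll timestamps:

    import os
    import time

    # Hypothetical decoy path; the filename is bait, the file contains nothing real.
    CANARY = "/srv/finance/Passwords_2024_BACKUP.xlsx"

    def watch_canary(path, interval=5.0):
        # Alert if the decoy's access time changes, i.e. something read it.
        # Note: many systems mount with relatime/noatime, so real deployments
        # rely on OS auditing (auditd, ETW) instead of polling timestamps.
        last_atime = os.stat(path).st_atime
        while True:
            time.sleep(interval)
            atime = os.stat(path).st_atime
            if atime != last_atime:
                print(f"ALERT: decoy {path} was accessed; possible reconnaissance")
                last_atime = atime

    # watch_canary(CANARY)  # left commented out; the path above is illustrative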

The work underlines the dual nature of large language models. They can improve productivity and automation, but they can also be misused. The Ransomware 3.0 study shows how an attacker could exploit these systems for automated extortion that is both cheaper to run and harder to detect.


Notes: This post was edited/created using GenAI tools. Image: DIW-Aigen.

Read next: Google’s Gemini Rated High Risk for Young Users
by Irfan Ahmad via Digital Information World

Google’s Gemini Rated High Risk for Young Users

A new assessment from the nonprofit Common Sense Media has flagged Google’s Gemini AI system as high risk for children and teenagers. The report, published on Friday, looked at how the chatbot functions across different age tiers and found that the protections in place were limited.

The study noted that Gemini’s versions designed for under-13s and teens were essentially adapted from its main adult product with added filters. Common Sense said a safer approach would be to create systems for younger audiences from the start rather than modifying adult models.

Concerns focused on the chatbot’s ability to generate material that children may not be ready for. This included references to sex, drugs, alcohol, and mental health advice that could be unsafe or unsuitable for young users. Mental health was singled out as a particular area of risk, given recent cases linking chatbots to teen suicides. In the past year, legal action has been taken against OpenAI and Character.AI after reports of teenagers dying by suicide while interacting with their services.

The timing of the report is significant. Leaks have suggested Apple may adopt Gemini to power its next version of Siri, expected next year. If confirmed, that move could bring the technology to millions of new users, including many teenagers, unless additional protections are put in place.

The evaluation also said Gemini does not account for differences in how younger and older children process information. Both the child and teen versions of the tool were given the same high-risk rating.

Google responded by pointing to its existing safeguards for users under 18, which include policies, testing with external experts, and updates designed to stop harmful replies. The company accepted that some answers had fallen short of expectations and said extra protections had since been added. It also questioned parts of the Common Sense review, suggesting the tests may have involved features that are not available to younger users.

Common Sense has carried out similar assessments on other major AI services. Meta AI and Character.AI were classed as unacceptable risks, Perplexity and Gemini were placed in the high-risk category, ChatGPT was rated moderate, and Anthropic’s Claude, which is built for adults, was rated as minimal risk.


Notes: This post was edited/created using GenAI tools. Image: DIW-Aigen.

Read next: Anthropic Settles Author Lawsuit With $1.5 Billion Deal
by Asim BN via Digital Information World

Anthropic Settles Author Lawsuit With $1.5 Billion Deal

Anthropic has agreed to pay at least $1.5 billion to authors in a settlement over the use of pirated books in training its artificial intelligence systems. If approved by a federal judge in San Francisco next week, it would be the largest payout on record in a US copyright case. The agreement closes a year-long dispute that tested how far AI developers can go in using creative material without permission.

The case centered on claims that Anthropic downloaded millions of books from online piracy sites to feed its chatbot Claude. The company must now pay authors around $3,000 for each book included in the settlement. In total, about half a million works are expected to qualify. The final amount could increase if more claims are submitted. Anthropic has also agreed to delete the files it copied.

Background of the dispute

The lawsuit began in 2024 when three writers, Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, accused the company of using their books without consent. The case was expanded to represent all US authors whose works appeared in the datasets. In June, the court ruled that Anthropic could train its models on legally purchased books but said the company would still face trial over its reliance on pirated sources.

Judge William Alsup stated that Anthropic had obtained more than seven million pirated titles. These included nearly two hundred thousand books from the Books3 dataset, along with millions more from Library Genesis and Pirate Library Mirror. The ruling created a path for a December trial, but the settlement avoids that step and brings an early conclusion.

Industry significance

This agreement arrives at a time when AI developers face growing pressure over copyright. Music labels, news outlets, and publishing houses have all raised similar complaints. At the same time, some companies have begun signing licensing deals with AI firms, offering access to data in return for payment. The Anthropic case stands out because it sets a financial benchmark and forces one of the leading AI players to admit past practices carried legal risk.

Other disputes involving Anthropic

Anthropic has been the target of multiple lawsuits. Earlier this year, Reddit said the company’s systems accessed its platform more than 100,000 times after restrictions were in place. Universal Music also filed a suit in 2023, claiming that Anthropic had used copyrighted lyrics without permission. These cases highlight the wider legal challenges facing AI firms as they compete to expand training material.

What happens next

A court hearing scheduled for September 8th will decide if the settlement is approved. If it goes forward, authors will be able to check whether their works are listed through a dedicated website and submit claims for payment. The decision will serve as a signal to the industry that creative material cannot be taken freely for AI training without facing financial consequences.


Notes: This post was edited/created using GenAI tools. Image: DIW-Aigen.

Read next: EU Regulators Punish Google’s Ad Monopoly With Multi-Billion Euro Fine
by Asim BN via Digital Information World

EU Regulators Punish Google’s Ad Monopoly With Multi-Billion Euro Fine

The European Commission has fined Google €2.95 billion, equal to about $3.46 billion, after ruling that the company gave its own advertising exchange and tools unfair advantages. Officials said the conduct restricted competition and raised costs for advertisers and publishers across Europe.

How the Case Developed

The decision followed years of investigation into the company’s display advertising services. These systems sit behind much of the banner advertising seen on websites and apps. The Commission said Google’s publisher server passed inside information to its own exchange, helping it beat rival bids. At the same time, the company’s ad buying platforms steered business mainly through its exchange, reducing opportunities for competitors.

According to regulators, this setup locked businesses into Google’s network and reinforced its position in the market. It also allowed the company to collect higher fees across the supply chain. Google now has sixty days to outline changes. If its proposals fall short, regulators may consider stronger remedies including the possible sale of part of its adtech operations.

A Record of Repeat Offenses

The fine was based on the scale and length of the abuse, as well as Google’s past record. In 2017, the company was fined €2.42 billion over its shopping search service. The following year, it was fined €4.34 billion for practices linked to Android devices, and in 2019 it was fined €1.49 billion for blocking rival ad services.

The case adds to wider pressure in Europe. Earlier this week, France’s data authority fined Google €325 million for showing ads in Gmail without consent and breaking cookie rules.

Political Response in Washington

The decision quickly drew reaction in the United States. Hours after the penalty was announced, President Donald Trump said he would consider opening a trade investigation to counter what he described as unfair treatment of American firms. His warning came a day after hosting technology leaders at the White House. At that meeting, he signaled backing for U.S. companies facing disputes with European regulators.

Trump also referred to earlier cases involving Apple, which has faced large tax and competition claims in the bloc. He said the penalties risked draining investment and jobs from the United States.

What Comes Next

Google said it will appeal. The company maintains that the findings are wrong and that the required changes could hurt European businesses that rely on its tools. The outcome of the appeal remains uncertain, but the ruling represents one of the most serious challenges yet to its advertising model in Europe.


Notes: This post was edited/created using GenAI tools. Image: DIW-Aigen.

Read next: 

• How AI Is Quietly Rewriting the Rules of Ecommerce

• Google Tied to $45 Million Israeli Propaganda Push Amid Gaza Genocide
by Irfan Ahmad via Digital Information World

Friday, September 5, 2025

How AI Is Quietly Rewriting the Rules of Ecommerce

AI isn't new to ecommerce anymore. It now powers product descriptions, customer support, SEO, and more. A new report from Liquid Web reveals just how far this change reaches: 623 online store owners shared how tools like chatbots and automated content are altering the way digital stores function.

The research shows a clear tipping point. For small and medium-sized businesses, AI is no longer just a shortcut. It's driving real results in traffic, conversions, and customer experience.

AI-Generated Product Descriptions Are the New Norm

Nearly half (47%) of ecommerce brands are using AI-generated product copy, and it's paying off:

  • 48% saw more clicks and impressions
  • 29% got more positive customer feedback
  • 28% saw direct revenue increase
  • 24% had fewer complaints, and 17% saw fewer returns

WooCommerce store owners are at the forefront of this trend: 58% of them already use AI content tools. Of those, 63% experienced increased listing engagement, and 41% directly attributed a revenue increase to AI-generated descriptions.

Why? It's not just speed; it's consistency. AI helps brands maintain a single voice across thousands of SKUs. Rather than relying on a roster of copywriters, brands get consistent messaging that builds trust.

AI-optimized content also plays nicely with search engines. Tools now generate metadata and link directly with SEO platforms, which can enhance rankings and keep listings neat and to the point.

Multilingual support is a huge win for stores selling globally. AI tools can translate or localize listings in real time, which helps businesses expand reach without expanding the content team.

And when speed is of the essence, like new product releases or seasonal promotions, AI allows stores to act fast without sacrificing quality.

Chatbots Are Turning Conversations Into Conversions

AI chatbots are proving to be a strong bet for ecommerce. Already, 27% of stores use them for sales or support. Of these:

  • 75% saw at least a 20% lift in leads or sales
  • 46% reported better customer satisfaction
  • 35% got more product inquiries
  • 30% saw higher conversion rates

WooCommerce users once again lead the way in adoption: 56% noticed boosted leads or sales, and 62% reported greater customer satisfaction after implementation. One quarter of stores cut customer support costs by 25% with chatbots.

And they're not just closing support tickets. Chatbots nowadays recommend sizes, upsell related items, bring promotions to the forefront, and guide users through complex choices, all in real-time.

They're also helping businesses collect and act on customer data. That feedback loop enables brands to tune messaging, learn about buyer behavior, and optimize the sales funnel.

Some bots now integrate with backend systems, delivering order status, real-time inventory details, and escalation to human reps when needed. And with the rise of voice shopping, these bots are starting to process voice queries as well, especially on mobile.

AI Scraping Is Now a Real Threat

Not all of the news is good, though. AI is also generating anxiety about content scraping. One in three ecommerce stores has blocked AI bots from accessing its content, citing concerns over data harvesting and model training.
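For stores that do opt out, the usual first step is a robots.txt rule aimed at publicly documented AI crawlers; cooperative bots honor it, though nothing forces a scraper to comply:

    User-agent: GPTBot
    Disallow: /

    User-agent: CCBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

Firewall rules or bot-management services are typically layered on top for crawlers that ignore the file.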

Most stores, however, have not acted:

  • 76% have done nothing
  • 13% are considering blocking AI bots
  • 11% are considering unblocking

Scraping is changing traffic flows:

  • 17% saw more direct visits from AI-powered search tools
  • 12% experienced more visits, but lower conversion quality
  • 11% experienced more engagement through AI-driven discovery

For some, that visibility trade-off is acceptable: being featured in AI-generated product suggestions or answers may pay off further down the line. For others, especially those selling proprietary or niche products, the downside appears to outweigh the gain.

Responses vary by platform. Magento and WooCommerce merchants are taking action: some have added firewalls or paywalls, while others are testing limited-access APIs for bots. These responses reflect mounting concern about how public ecommerce content is scraped, repackaged, and monetized by third-party systems.

There's also a growing ethics argument. Should AI tools profit from content that ecommerce brands invest time and money in creating, without permission or compensation?

As platforms and regulators get up to speed, expect more friction, and more policy shifts around who owns what.

AI Adoption Keeps Growing

Ecommerce AI adoption is up 270% since 2019. With a compound growth rate of 38%, it's taking hold fast, especially among smaller businesses.

Most survey respondents were micro or small brands, which suggests that AI isn't behind enterprise paywalls anymore. Tools are getting cheaper, easier to implement, and designed for non-technical users.

Marketing, technology, and retail are leading the way. WooCommerce and Magento are flexible, making it easier to insert AI into everything from content to analytics and inventory management.

Pioneers aren't just implementing one tool. They're building full AI stacks, combining automated content with chatbots, recommendation engines, and predictive inventory planning. Each tool feeds the others, making shops smarter and more responsive.

Even older retail brands expanding into ecommerce are trying out AI-driven tools, whether that means upsell reminders, personalized landing pages, or behavior-triggered email sequences.

And platforms like Shopify and BigCommerce now incorporate AI capabilities into core offerings, which will further fuel adoption in the future.

Balancing Growth With Risk

As AI tools become more common, ecommerce brands are starting to think about the long game. Some are locking down content or adding CAPTCHAs to restrict scraping. Others are investing in custom content and gated product details.

  • 13% added new security measures
  • 18% now gate or restrict content
  • 12% monetize traffic from AI-based tools despite concerns

Some brands are experimenting with licensing models, either charging AI platforms for access or requesting attribution. Others are using watermarking or audit software to track where their data eventually lands.

Brand protection is also becoming a real issue. If an AI system mischaracterizes a product or relies on outdated scraped data, the store could be blamed despite having had no involvement.

That's prompting some teams to treat content as a protected asset, not just a marketing channel. It's also pushing more ecommerce leaders to engage early with regulation around privacy, content ownership, and data scraping.

Final Thoughts: The Next Chapter of AI in Ecommerce

Ecommerce isn't just playing around with AI anymore; it's building with it. From faster content creation to more responsive support, the benefits are piling up.

AI, however, is not plug-and-play. It introduces new questions of control, ownership, and transparency. Brands need to weigh the advantages against added complexity.

The future of ecommerce innovation won't be decided by the flashiest storefront or the biggest ad budget. It will come from smart, integrated systems that improve every phase of the buyer journey.

As the Liquid Web study shows, small and midsized companies aren't waiting in the wings. They're getting in early, experimenting fast, and pushing boundaries.

For the trailblazers, the question isn't whether AI belongs in ecommerce. It's how to implement it ethically, and how to win trust along the way.

Read next:

• AI Is Disrupting Hiring, And Trust Is the First Casualty

• Google Play VPNs Exposed: Illusion of Choice Masks Common Security Weaknesses


by Irfan Ahmad via Digital Information World