Digital Phablet

57% Of The Internet Could Be AI Generated Content

by Rukhsar Rehman
September 10, 2024
in News

You’re not imagining things: the quality of search results is indeed declining. Recent research from Amazon Web Services (AWS) suggests that as much as 57% of online content is either generated by artificial intelligence or translated by machine.

The research paper, titled “A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism,” points to the widespread use of low-cost machine translation (MT) as a major factor. This method reprocesses existing content into multiple languages, which the study identifies as a significant contributor to the overwhelming volume of online material now deemed to be AI-generated. According to the researchers, “Machine-generated, multi-way parallel translations not only dominate the translated content on the web in low-resource languages where MT is prevalent, but they also make up a substantial portion of all web content available in those languages.”

The researchers also uncovered a type of selection bias in the content that is translated into multiple languages versus that which remains in a single language. “Content that is translated into multiple languages tends to be shorter, more predictable, and has a different distribution of topics compared to single-language content,” they noted.

Furthermore, the proliferation of AI-generated content, along with an increasing dependence on AI tools to alter and refine such content, could trigger a phenomenon known as model collapse, in which models trained on machine-generated output degrade over successive generations. The flood of such content is already affecting the quality of search results across the web. Advanced AI models, such as ChatGPT, Gemini, and Claude, rely heavily on vast amounts of training data sourced from the public web, regardless of potential copyright violations. As the internet becomes saturated with AI-generated content, which is frequently erroneous, the performance of these models could deteriorate significantly.

“The speed at which model collapse occurs can be quite surprising, and it often goes unnoticed,” warned Dr. Ilia Shumailov from the University of Oxford in an interview with Windows Central. “Initially, it primarily affects minority data sets—those that are poorly represented. Over time, this issue reduces the diversity of outputs, which might create a false impression of improved performance on majority data while masking deterioration in minority data. Model collapse can lead to significant repercussions.”

The paper illustrates the effect with an analysis in which professional linguists categorized 10,000 randomly selected English sentences by topic. The findings showed a “dramatic change in topic distribution” when comparing sentences translated into two languages with those translated into eight or more, particularly a surge in the “conversation and opinion” category, which jumped from 22.5% to 40.1%.

This finding underscores the selection bias in the kind of content that gets translated many times: it is “substantially more inclined” to fall under the “conversation and opinion” topic.

Additionally, the study showed that “translations involving more than eight languages are significantly of lower quality (6.2 Comet Quality Estimation points lower) than those involving only two languages.” An audit of 100 of these highly multi-way translations revealed that “the vast majority” originated from content farms, producing articles deemed low quality and requiring minimal expertise or effort to generate.

This trend helps clarify why OpenAI’s CEO, Sam Altman, continually emphasizes the necessity of unrestricted access to copyrighted materials for creating tools akin to ChatGPT, stating it is “impossible” without it.

Rukhsar Rehman

A University of California alumna with a background in mass communication, she now resides in Singapore and covers tech with a global perspective.

© 2025 Digital Phablet
