Apple Breaks New Ground with $50 Million AI Training Deals with Major Publishers

Apple is taking a different approach to training its large language models (LLMs). The tech giant has reportedly approached major publishers and news organizations with multi-year licensing deals worth at least $50 million. The publishers named include Condé Nast, NBC News, and IAC, owner of widely read titles such as People, The Daily Beast, and Better Homes and Gardens.

Rather than scraping data from the internet and books to train its AI, Apple aims to build a controlled, reliable, and legally sound body of training material by licensing content directly from publishers. This strategy marks a significant departure from common practice in the AI sector.

Despite its promise, Apple’s approach has raised concerns among publishers and legal authorities. These concerns center on the licensing terms, potential legal liabilities, and how generative AI will be used in news distribution.

These concerns have not slowed Apple’s investment in AI training, which reportedly runs to millions of dollars per day. The effort is driven by the “Foundational Models” team, led by renowned AI engineer John Giannandrea. Under his leadership, the team has been developing conversational AI akin to ChatGPT, along with other advanced models. One of its leading projects is an LLM chatbot designed to interact with customers who use AppleCare services.

The team has also invested in training Apple’s most advanced LLM, known internally as Ajax GPT. With more than 200 billion parameters, Ajax GPT is poised to rival OpenAI’s GPT-3.5 in power and functionality.

Despite its advantages, Apple’s approach is not free from scrutiny. Because the strategy relies on licensing content from a select group of publishers, the knowledge available to its AI models may be relatively narrow. In effect, Apple is trading the nearly limitless pool of internet data for a legally safer but substantially smaller base for its AI to learn from.

With Apple committed to licensing content for its AI training, it remains to be seen how the initiative will affect the broader AI and news generation landscape. For industry insiders, it is a development worth watching.