AI is setting off a great scramble for data

Published on August 14, 2023

Adobe has defied predictions that AI tools would render its software obsolete. It developed its own AI suite, Firefly, which has generated over 1 billion images since its release in March. Unlike competitors that mine the internet for images, Adobe draws on its massive stock photo database to avoid copyright disputes. This illustrates the growing demand for quality data to power AI models: the text-based models of companies like Google and Meta are trained on over 1 trillion words. However, as the quest for data intensifies, content creators are increasingly demanding compensation for their material, leading to copyright infringement cases. This has spurred AI companies to secure data sources through deals with news agencies, stock photography providers, and other data-rich organizations, to improve the quality of existing inputs, and to gather user interaction data to refine their models. Another potential data source lies with corporate customers, who possess valuable but often untapped unstructured datasets that could be used to fine-tune AI tools. As businesses seek to manage and unlock these datasets, startups in the AI-focused database sector are gaining traction and investment.

Link to the full article: AI is setting off a great scramble for data (economist.com)