Meta Platforms used public Facebook and Instagram posts to train its new Meta AI virtual assistant but, in an effort to respect consumers' privacy, excluded private posts shared only with family and friends, the company's top policy executive told Reuters in an interview.
Meta also did not use private chats on its messaging services as training data for the model and took steps to filter private details from the public datasets used for training, Meta President of Global Affairs Nick Clegg said on the sidelines of the company's annual Connect conference this week.
"We've tried to exclude datasets that have a heavy preponderance of personal information," Clegg said, adding that the "vast majority" of the data used by Meta for training was publicly available.
He cited LinkedIn as an example of a website whose content Meta purposefully avoided using due to privacy concerns.
Clegg's remarks come as tech companies including Meta, OpenAI, and Alphabet's Google face criticism for using information scraped from the internet without permission to train their AI models, which ingest massive amounts of data to summarise information and generate imagery.
The companies are weighing how to handle the private or copyrighted material swept up in that process, which their AI systems may reproduce, while also facing lawsuits from authors who accuse them of infringing copyrights.