OpenAI’s CTO, Mira Murati, was unable to provide a clear response when asked by The Wall Street Journal’s technology reporter, Joanna Stern, about the data used to train the company’s text-to-video generation tool, Sora. This lack of transparency became even more evident following recent comments from YouTube’s CEO, Neal Mohan.
Mohan’s Remarks
In an interview with Bloomberg’s Emily Chang, Mohan stated, “From a creator’s perspective, when a creator uploads their hard work to our platform, they have certain expectations. One of those expectations is that the terms of service are going to be abided by. It does not allow for things like transcripts or video bits to be downloaded, and that is a clear violation of our terms of service. Those are the rules of the road in terms of content on our platform.”
Essentially, Mohan is saying that if OpenAI used YouTube videos to train Sora, that would violate YouTube’s terms of service.
OpenAI’s Sora
Sora is OpenAI’s AI-based text-to-video generator. The tool has come under scrutiny over its training data, which is alleged to include large amounts of content from YouTube videos.
It’s refreshing to see publishers and platforms like YouTube taking a stand against AI systems using their content without permission. Mohan’s stern warning to OpenAI is a clear message to respect the rights of content creators.
Irony in Google’s Stance
However, there’s an element of irony in Google’s stance. Google, YouTube’s parent company and the dominant player in Internet search, uses publisher data to train its own search engine and AI. Yet it warns OpenAI against doing the same with YouTube data.
While it’s not a matter of taking sides, it’s important to note that both tech giants are essentially consuming the open web, repackaging it, and presenting it as innovation. Of the two, OpenAI appears more ethically questionable, as it has built its systems on the creativity and work of others who were unaware of their indirect involvement in creating a massive imitation machine.