Unique Issues To Look Out For in Generative AI Transactions
Aaron Rubin and Heather Whitney authored an article for Law360 about the unique issues raised by certain artificial intelligence (AI) technology transactions, particularly those that involve generative AI models.
"For their initial training, many generative models — particularly large language models — are trained on data scraped from the internet," the authors wrote. "There have been years of litigation on the permissibility of web-scraping, which is independent of scraping for training purposes, and there are ongoing cases addressing whether training on copyrighted materials without permission qualifies as a fair use."
They added: "Where a generative model is initially trained on large swaths of data scraped from the internet, the model provider may be reluctant to indemnify the model customer for claims arising from such data, particularly given that the law around such training is unsettled. The model provider may argue that, at least for large language models, this type of training is a necessary aspect of developing the technology and that there is no practical way for the model provider to affirmatively secure rights to the data. Without any contractual risk allocation, the model provider will likely bear most of the risk arising from use of such data for initial training, given that the model provider will be the party sourcing the data — e.g., through scraping — and performing the activities that could give rise to claims, like making copies."