Are there any initiatives aimed at training generative AI using 100% public domain works and works authorized by the creator?

HiddenLayer555@lemmy.ml · 21 hours ago


isekaihero@ani.social · 10 hours ago

It’s not an issue to me, and is completely befuddling to begin with. Training an AI on copyrighted material doesn’t mean the AI violates that material when it generates new artwork. AI models don’t contain a copy of all the works they were trained on - which could be petabytes of data. They reduce what they learned to math algorithms and use those algorithms to generate new stuff.

Humans work much the same way. We are all exposed to copyrighted material all the time, and when we create new artwork a lot of the ideas churning inside our heads originate from other people’s works. When a human artist draws a mouse man smiling and whistling a tune, for some reason it’s not considered a copyright violation as long as it doesn’t strictly resemble Mickey Mouse. But when an AI generates a mouse man smiling and whistling a tune? Suddenly the anti-AI crowd points at it and screams about it violating Disney IP.

It’s not an issue. It never was. AI training is a strawman argument manufactured by the anti-AI crowd to justify their hatred of AI. If you created an AI trained on public domain stuff, they would still hate it. They would just clutch at some other reason.

RandomVideos@programming.dev · 9 hours ago

Has anyone ever defended the copyright of a massive corporation when talking about AI?

Image generation, during its training, tries to get as close as possible to the image it’s training on. The way the AI trains isn’t even remotely close to how humans do it.

Also, copyright is not the only reason why people hate AI. Obviously another reason would be presented if one is eliminated. It doesn’t just appear out of nowhere.

isekaihero@ani.social · 8 hours ago

No, that’s not how it works. AI models don’t carry a repository of images. They use algorithms. The model itself is a few gigabytes, whereas the training data would be petabytes - far larger than I could fit on my home desktop running Stable Diffusion.
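For a rough sense of scale, here is a quick back-of-the-envelope sketch. The 4 GB and 1 PB figures are assumptions picked only to match "a few gigabytes" and "petabytes" above; they are not measurements of any particular model or dataset.

```python
# Back-of-the-envelope size comparison.
# Both sizes are ASSUMED values, chosen to match "a few gigabytes"
# and "petabytes" in the comment; they are not measured figures.
GB = 10**9   # bytes in a gigabyte (decimal)
PB = 10**15  # bytes in a petabyte (decimal)

model_bytes = 4 * GB      # assumed model checkpoint size
training_bytes = 1 * PB   # assumed training set size

# How many times larger the training data is than the model file.
ratio = training_bytes // model_bytes
print(f"Training data is about {ratio:,}x larger than the model")
# prints: Training data is about 250,000x larger than the model
```

This only compares file sizes under those assumptions. It doesn't by itself settle what a model does or doesn't retain from training.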

It actually is close to how humans do it. You’re thinking “it’s copying that image” and it’s not. It’s using algorithms to create an image in a similar style. It knows different artistic styles because it has been fed a repository of millions of images in that style and can generate similar images in that style.

As for copyright, it was recently all over social media that AI could copy Studio Ghibli’s art style. To the rage of social media and their fanbase, this is allowed. Studio Ghibli can’t copyright an art style, and that’s why AI image generators continue to include the option to generate art in that art style.

RandomVideos@programming.dev · 5 hours ago

I never said that the images were saved. I said that the AI was trained to copy the images, not that it had a way to check them after it trained

Even though both can “know” styles, the methods used to train humans and AI, and how they act, are completely different. A human doesn’t start with noise and gradually remove it to create an image.
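To make the "start with noise and gradually remove it" loop concrete, here is a toy sketch of its shape only. In a real diffusion model the noise estimate comes from a trained neural network. Here `predict_noise` is a hypothetical stand-in that just compares each value to its neighbors, so this smooths a list of numbers and does not generate images. The step count and step size are arbitrary assumptions.

```python
import random

def predict_noise(x):
    """HYPOTHETICAL stand-in for a trained network's noise estimate:
    how far each value sits from the average of itself and its two
    neighbors (indices wrap around at the ends)."""
    n = len(x)
    return [x[i] - (x[i - 1] + x[i] + x[(i + 1) % n]) / 3 for i in range(n)]

def roughness(x):
    """Sum of squared differences between neighboring values (wrapping)."""
    return sum((x[i] - x[i - 1]) ** 2 for i in range(len(x)))

def denoise(x, steps=20, step_size=0.5):
    """Repeatedly subtract a fraction of the estimated noise."""
    for _ in range(steps):
        noise = predict_noise(x)
        x = [xi - step_size * ni for xi, ni in zip(x, noise)]
    return x

random.seed(0)
start = [random.gauss(0.0, 1.0) for _ in range(16)]  # begin with pure noise
result = denoise(start)

# The result is smoother than the starting noise.
print(roughness(start), roughness(result))
```

The point is only the control flow: begin with random values, then take many small steps that each remove a bit of what is estimated to be noise.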

masterspace@lemmy.ca · 6 hours ago

It’s not a popular opinion but you’re entirely right.

AI isn’t copying in the way that most people think it is. It truly is transformative in all the traditional copyright ways.

Is it copyright infringement if my company pays an employee to study the internet and that makes them capable of animating a frame from the Simpsons? No, it’s copyright infringement when that company publishes that copyright infringing work.

The reality is that copyright has always been a nonsense system and ‘fair use’ concepts were also nonsense and arbitrary. AI algorithms just let us expose how nonsense they are at scale.