• 20 Posts
  • 397 Comments
Joined 2 years ago
Cake day: June 23rd, 2023

  • I guess the idea is that the models themselves are not infringing copyright, but the training process DID.

    I’m still not understanding the logic. Here is a copyrighted picture. I can search for it, download it, view it, see it with my own eyeballs. My browser already downloaded the image for me, in order for me to see it in the browser. I can take that image and edit it in a photo editor. I can do whatever I want with the image on my own computer, as long as I don’t publish the image elsewhere on the internet. All of that is legal. None of it infringes on copyright.

    Hell, it could be argued that if I transform the image to a significant degree, I can still publish it under Fair Use. But, that still gets into a gray area for each use case.

    What is not a gray area is what AI training does. They download the image and use it in training, which is like me looking at a picture in a browser. The image isn’t republished, or stored in the published model, or represented in any way that could be reconstructed back to the source image in any reasonable form. It just changes a bunch of weights in the model. It’s mathematically impossible for a 4GB model to somehow store the many, many terabytes of images on the internet.

    Where is the copyright infringement?
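
To put rough numbers on the 4GB point, here is a back-of-the-envelope sketch. The training-set sizes are hypothetical round figures picked for illustration, not measurements of any real dataset, and `bytes_per_image` is just a helper name made up for this example.

```python
# Back-of-the-envelope: how many bytes of model weights are there per training image?
# The only number taken from the comment is the 4GB model size.
MODEL_SIZE_BYTES = 4 * 1024**3  # 4 GiB


def bytes_per_image(model_bytes: int, num_images: int) -> float:
    """Average number of model bytes available per training image."""
    return model_bytes / num_images


# Hypothetical training-set sizes (assumptions, not real dataset figures).
for num_images in (100_000_000, 1_000_000_000, 2_000_000_000):
    budget = bytes_per_image(MODEL_SIZE_BYTES, num_images)
    print(f"{num_images:>13,} images -> {budget:.2f} bytes of weights per image")
```

With a billion images, that works out to a little over 4 bytes of weights per image, while even a small JPEG is tens of kilobytes. This is only an average budget, not a statement about any individual image.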

    I remember stories about the RIAA suing individuals for many thousands of dollars per mp3 they downloaded. If you applied that logic to OpenAI — maximum fine for every individual work used — it’d instantly bankrupt them. Honestly, I’d love to see it. But I don’t think any copyright holder has the balls to try that against someone who can afford lawyers. They’re just bullies.

    You want to use the same bullshit tactics and unreasonable math that the RIAA used in their court cases?


  • Legislators have to come up with a way to handle how copyright works in conjunction with AI.

    That’s the neat part. It doesn’t.

    Copyright hasn’t worked for the past 100 years. Copyright was born out of a social agreement that works generated under it would enter the public domain in a reasonable time frame. Thanks to Mark Twain and Disney, the limit is basically forever, or it might as well be. Here we are still arguing about the next Bond film for a book series that was made in the fucking 1950s. Or the Lord of the Rings series, the genesis of all fantasy. Or thousands of other things that deserve to be in the public domain already.

    Copyright is a blunt tool that rich people use to bash the poor with. Whatever you think copyright is doing to protect your rights or your works, it’s easy enough for them to just throw money at lawyers and court cases until you cave. If copyright isn’t working for the public good, then we should abolish it.

    People hate AI because it’s mostly developed and used by the rich as a shitty way to save money and lay off even more people than we’ve already had. But it doesn’t have to be. All of these LLM projects were based on freely available research. Hell, Stable Diffusion is still something you can just download and use for free, despite the fact that Stability AI is still trying to wrest back control of the model.

    Instead of sticking our fingers in our ears and saying “la la la la, AI doesn’t exist, it must be destroyed/regulated/fined”, we could push this technology to be as open source as possible. I mean, let’s assume that we somehow regulate AI so that people have to pay to use copyrighted works for training (as absurd as that is). AI training goes down drastically, and stagnates. Countries like China are not going to follow those same rules, and eventually, China will be the technological leader here.

    Or the program works, and other people who don’t give a shit about copyright freely allow AI to train on their works. Then you have AI models that have to follow these arcane rules, but arrive at the same spot anyway, and only for the rich people who can afford the systems that allow for that regulation. What the fuck was the point of the regulation, except to make the models even more expensive to build?