The biggest issue with generative AI, at least to me, is that it’s trained on human-made works whose original authors didn’t consent to, or even know about, their work being used to train the AI. Are there any initiatives to address this issue? I’m thinking of something like an open source AI model and training data store that only contains works in the public domain or under highly permissive no-attribution licenses, as well as original works submitted by the open source community and explicitly licensed to allow AI training.

I guess the hard part is moderating the database and ensuring all works are licensed properly and people are actually submitting their own works, but does anything like this exist?

  • isekaihero@ani.social
    10 hours ago

    It’s not an issue to me, and is completely befuddling to begin with. Training an AI on copyrighted material doesn’t mean the AI violates that material when it generates new artwork. AI models don’t contain a copy of all the works they were trained on, which could be petabytes of data. They reduce what they learned to numerical parameters and use those parameters to generate new stuff.

    Humans work much the same way. We are all exposed to copyrighted material all the time, and when we create new artwork a lot of the ideas churning inside our heads originate from other people’s works. When a human artist draws a mouse man smiling and whistling a tune, for some reason it’s not considered a copyright violation as long as it doesn’t strictly resemble Mickey Mouse. But when an AI generates a mouse man smiling and whistling a tune? Suddenly the anti-AI crowd points at it and screams about it violating Disney IP.

    It’s not an issue. It never was. AI training is a strawman argument manufactured by the anti-AI crowd to justify their hatred of AI. If you created an AI trained on public domain stuff, they would still hate it. They would just clutch at some other reason.

    • RandomVideos@programming.dev
      9 hours ago

      Has anyone ever defended the copyright of a massive corporation when talking about AI?

      Image generation, during its training, tries to get as close as possible to the image it’s training on. The way the AI trains isn’t even remotely close to how humans do it.

      Also, copyright is not the only reason why people hate AI. Obviously another reason would be presented if one is eliminated. It doesn’t just appear out of nowhere.

      • isekaihero@ani.social
        8 hours ago

        No, that’s not how it works. AI models don’t carry a repository of images. They use algorithms. The model itself is a few gigabytes, whereas the training data would be petabytes, far larger than I could fit on my home desktop running Stable Diffusion.
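
The size comparison above can be made concrete with a quick back-of-the-envelope calculation. This is only a sketch: the 4 GB checkpoint size and the 2 billion image count are hypothetical round numbers chosen for illustration, not figures from this thread or from any specific model.

```python
def bytes_per_training_image(model_bytes: int, num_images: int) -> float:
    """Average model storage available per training image, in bytes."""
    return model_bytes / num_images


# Hypothetical round numbers (assumptions, not measured values):
MODEL_BYTES = 4 * 10**9   # a checkpoint of roughly 4 GB
NUM_IMAGES = 2 * 10**9    # a training set of roughly 2 billion images

print(bytes_per_training_image(MODEL_BYTES, NUM_IMAGES))  # 2.0
```

Under those assumptions the model has about two bytes of capacity per training image on average. An average says nothing about whether particular images that appear many times in the data can be reproduced closely.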

        It actually is close to how humans do it. You’re thinking “it’s copying that image” and it’s not. It’s using algorithms to create an image in a similar style. It knows different artistic styles because it has been fed a repository of millions of images in that style and can generate similar images in that style.

        As for copyright, it was recently all over social media that AI could copy Studio Ghibli’s art style. To the rage of social media and their fanbase, this is allowed. Studio Ghibli can’t copyright an art style, and that’s why AI image generators continue to include the option to generate art in that art style.

        • RandomVideos@programming.dev
          5 hours ago

          I never said that the images were saved. I said that the AI was trained to copy the images, not that it had a way to check them after it trained

          Even though both can “know” styles, the methods used to train humans and AI, and how they act, are completely different. A human doesn’t start with noise and gradually remove it to create an image.
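
For readers unfamiliar with the "start with noise and gradually remove it" description, here is a minimal sketch of one common formulation (noise prediction, as used in DDPM-style diffusion models). It is an assumption that this matches any particular image generator discussed here. The function names, the 3-pixel "image", and the `alpha_bar` value are all hypothetical, and no neural network is involved: the true noise stands in for a perfect prediction.

```python
import math
import random


def add_noise(x0, alpha_bar, rng):
    """Forward step: blend a clean signal with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps


def denoising_loss(predicted_eps, true_eps):
    """Training objective: mean squared error on the predicted noise."""
    return sum((p - t) ** 2
               for p, t in zip(predicted_eps, true_eps)) / len(true_eps)


def recover_clean(xt, eps, alpha_bar):
    """Invert the forward step, given an estimate of the noise."""
    return [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(xt, eps)]


rng = random.Random(0)
pixels = [0.1, 0.5, 0.9]  # hypothetical 3-pixel "image"
noisy, eps = add_noise(pixels, alpha_bar=0.5, rng=rng)

# A perfect noise prediction gives zero loss and recovers the original pixels.
print(denoising_loss(eps, eps))
restored = recover_clean(noisy, eps, alpha_bar=0.5)
```

During training, a real model only ever sees the noisy input and is scored on how well it predicts the noise. At generation time it starts from pure noise and applies its learned estimate over many steps, with no training image to compare against.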

        • masterspace@lemmy.ca
          6 hours ago

          It’s not a popular opinion but you’re entirely right.

          AI isn’t copying in the way that most people think it is. It truly is transformative in all the traditional copyright ways.

          Is it copyright infringement if my company pays an employee to study the internet and that makes them capable of animating a frame from the Simpsons? No, it’s copyright infringement when that company publishes that copyright-infringing work.

          The reality is that copyright has always been a nonsense system and ‘fair use’ concepts were also nonsense and arbitrary. AI algorithms just let us expose how nonsense they are at scale.