• XEAL@lemm.ee
    arrow-up
    0
    ·
    1 year ago

    Large Language Models (such as GPT) and AI image generators.

    I follow certain AI-related tags on Tumblr, and sometimes I see people expressing pure hatred towards these tools, as they only see the AIs as content thieves.

    • Kalash@feddit.ch
      arrow-up
      1
      ·
      edit-2
      1 year ago

      It’s not that I hate it, but like, ChatGPT sucks.

      There was this huge hype around it, then we started using it … and it makes so many errors that it literally just generates more work. We scrapped it after less than a week. It’s modern snake oil.

    • DokPsy@infosec.pub
      arrow-up
      1
      ·
      1 year ago

      I don’t mind the tool itself if you use it as a tool. I do mind when people use its output as the final product. See: the lawyer who used ChatGPT for a legal brief.

    • uralsolo [he/him]@hexbear.net
      English
      arrow-up
      0
      ·
      edit-2
      1 year ago

      I like them as non-profit tools for personal use, but the hatred is justified IMO because we’re already seeing people with writing jobs lose that job and get replaced by an LLM and an “editor” who is paid less than the writer was.

      Also, for stuff like art competitions and magazines, there is a need to develop a rigorous method of verification of what is and isn’t AI-generated. I’ve been published in a magazine before, but if I were to submit a story now I’d be competing against a massive wave of generated stories.

      • XEAL@lemm.ee
        arrow-up
        1
        ·
        1 year ago

        I like them as non-profit tools for personal use, but the hatred is justified IMO because we’re already seeing people with writing jobs lose that job and get replaced by an LLM and an “editor” who is paid less than the writer was.

        That’s capitalism in all of its glory. People never mattered to the ones who want to make money; they just want to make as much profit as possible with minimal investment. Someone at work created a tool that turns a work day of painstaking tasks into a 5 minute wait? Fire the people, keep the tool. You may call LLMs or AIs enablers, but it’s like hating baseball bats because some use them to crack open skulls instead of hitting baseballs.

        Regarding the verification of AI-generated content, I can only say I agree, but it’s going to be hard to detect.

    • alcoholicorn [comrade/them, doe/deer]@hexbear.net
      English
      arrow-up
      0
      ·
      edit-2
      1 year ago

      they only see the AIs as content thieves.

      AI is a method of content theft: it takes other people’s work and pieces it together in a way that resembles other works, without any actual coherency.

      I don’t like that it churns out slop that displaces actual content.

      I also don’t like the way it’s sped up the enshittification of Google and news sites. I didn’t think it could get worse than pages of listicles written by disinterested journalists paid fuckall to churn out 10 a day, but now you have ChatGPT churning out 100 completely useless articles a day.

      • XEAL@lemm.ee
        arrow-up
        0
        ·
        edit-2
        1 year ago

        LLMs just automate and speed up certain things that a person could do on their own if they invested far more time and effort. If a human being takes people’s work and pieces it together in a way that resembles other works without using any LLM/AI or automation tool, is the final result content theft too?

        I agree with the content enshittification, but I disagree about the coherency.

        Usually, implementations like the ChatGPT web app will generate different outputs for the same prompt. You can also ask it to tweak a previous output, make it shorter, more concise, exclude parts, etc. And if you’re making API calls through a script, you can tweak parameters like Temperature, Top P, Presence Penalty or Frequency Penalty, which affect things like the coherence, randomness or repetitiveness of the output.
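
        To give a rough idea of what those knobs do, here is a toy sketch of the usual sampling recipe. It is not how any particular provider implements it: the function names and the made-up scores in `TOY_LOGITS` are assumptions for illustration, and the penalty arithmetic (subtract a flat amount if a token has appeared, plus an amount per repeat) is a simplified version of what API docs typically describe. Temperature must be greater than 0 here.

```python
import math
import random

# Hypothetical raw scores ("logits") for the next token; not from a real model.
TOY_LOGITS = {"cat": 2.0, "dog": 1.0, "fish": 0.1}


def adjust_logits(logits, counts, temperature=1.0,
                  presence_penalty=0.0, frequency_penalty=0.0):
    """Apply repetition penalties, then temperature scaling, to raw scores."""
    adjusted = {}
    for token, logit in logits.items():
        seen = counts.get(token, 0)
        logit -= frequency_penalty * seen        # grows with every repeat
        logit -= presence_penalty * (seen > 0)   # flat cost once a token has appeared
        adjusted[token] = logit / temperature    # <1 sharpens, >1 flattens
    return adjusted


def softmax(logits):
    """Turn scores into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {t: math.exp(v - peak) for t, v in logits.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


def top_p_filter(probs, top_p=1.0):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    kept, mass = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        mass += p
        if mass >= top_p:
            break
    return {t: p / mass for t, p in kept.items()}  # renormalize the survivors


def sample_token(logits, counts=None, temperature=1.0, top_p=1.0,
                 presence_penalty=0.0, frequency_penalty=0.0, rng=random):
    """Draw one token after penalties, temperature and top-p filtering."""
    probs = softmax(adjust_logits(logits, counts or {}, temperature,
                                  presence_penalty, frequency_penalty))
    probs = top_p_filter(probs, top_p)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens])[0]
```

        With a low temperature or a low Top P the most likely token wins almost every time (more predictable output), while a high temperature spreads the probability out (more random output). The penalties push the model away from tokens it has already used, which is where the "repetitiveness" control comes from.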

        There’s also fine-tuning, or supplying extra context via embeddings, which can help fit a model to one’s specific needs and expectations, but I haven’t gotten to try it yet.

        • TheActualDevil@sffa.community
          arrow-up
          0
          ·
          1 year ago

          If a human being takes people’s work and pieces it together in a way that resembles other works without using any LLM/AI or automation tool, is the final result content theft too?

          Yes, obviously. Artists and writers can learn from others and can be inspired by others’ works, but they can’t use parts of those works. That is content theft. Imitating a style is fine, but you have to create something new. LLMs cannot create, only steal.

          • XEAL@lemm.ee
            arrow-up
            0
            ·
            edit-2
            1 year ago

            If, for example, I ask an LLM to produce a short story with a completely unique and random prompt that doesn’t resemble any known existing story in its training data (or in the entire world, if you like), is the generated output of the LLM also stolen?

            • TheActualDevil@sffa.community
              arrow-up
              0
              arrow-down
              1
              ·
              1 year ago

              I think what you’re proposing isn’t something they can do. Are you saying “What if I asked it to create a short story whose pieces don’t resemble any pieces of known stories?” or are you saying “What if I asked it to create a short story whose whole doesn’t resemble any known stories?”

              The first one can’t happen. The second? Yes, it’s stealing.

              Where is it getting this story? LLMs don’t have creativity. They don’t understand story structure. They pull sentences and paragraphs from work in their training data. If the generated output contains work that others have made, that’s called plagiarism. If it doesn’t, then your hypothetical isn’t realistic. LLMs can’t create original works. That’s the whole point. They pull pieces of the training data and rearrange them. It would be like if I was writing a college paper and instead of writing anything myself I just pulled 100 different sources and copied a sentence or two from each source and structured them as my paper. That’s 100% plagiarism.

              • XEAL@lemm.ee
                arrow-up
                1
                arrow-down
                1
                ·
                1 year ago

                I was referring to producing a unique plot.

                The process of generating a story involves recombining and rephrasing the LLM’s training data in unique ways; it’s not a copy-paste job. LLMs generate content by predicting text based on learned patterns, and this implies a degree of transformation and synthesis.

                Where do you draw the line between plagiarism vs inspiration, whether it’s a person or an LLM? How long and similar to something existing does a fragment of text have to be to cross the plagiarism line?

        • alcoholicorn [comrade/them, doe/deer]@hexbear.net
          English
          arrow-up
          0
          arrow-down
          1
          ·
          1 year ago

          I disagree about the coherency.

          Coherency requires relating symbolic meanings. AI just uses statistical analysis.

          Consider if you were locked in the national library of Thailand. You don’t speak Siamese, and any pictures or bilingual dictionaries were removed.

          Given a thousand years, you could look at the patterns and produce text similar to what someone who writes Siamese would write, but there’s still no coherency because you cannot connect the meaning behind any of the words.

          That doesn’t necessarily mean your outputs are useless though, someone who does read Siamese can have you generate outputs until you print out something they can infer a coherent thought from, but you’re fundamentally unable to be trained to do that yourself.

          If a human being takes people’s work and pieces it together in a way that resembles other works without using any LLM/AI or automation tool, is the final result content theft too?

          We’re getting into ethics territory. IP is a social construct and we live under capitalism, so our model for determining what is and isn’t theft should be selected by what supports artists and consumers against capitalists.

          • XEAL@lemm.ee
            arrow-up
            1
            ·
            1 year ago

            Given a thousand years, you could look at the patterns and produce text similar to what someone who writes Siamese would write, but there’s still no coherency because you cannot connect the meaning behind any of the words.

            That doesn’t necessarily mean your outputs are useless though, someone who does read Siamese can have you generate outputs until you print out something they can infer a coherent thought from, but you’re fundamentally unable to be trained to do that yourself.

            You’re comparing an LLM to something similar to the infinite monkey theorem. In your analogy, you should consider that someone who knows perfect Siamese is giving me feedback to optimize and improve my outputs, even if I don’t really know the meaning of anything.

            While an LLM may not have a consciousness to evaluate whether its output is coherent, it can identify patterns and relationships from its training and can generate text that still appears coherent to human readers.