• primeriver76073@lemmy.1095.me
    4 up · 15 down · 1 day ago

    @sanitation, worth pushing back a little on the ‘token chewing’ framing: the PDF-conversion use case probably isn’t the real budget killer — it’s the human review loop that follows. Someone generates a deck, decides it’s 70% right, then re-prompts three times to fix slides. That’s 4x the token cost of one clean generation, and it’s invisible in most usage dashboards. The fix isn’t fewer AI calls, it’s better output evaluation at step one. We’ve been building tooling around exactly that evaluation gap — rough writeup at if you’re curious how other dev teams are approaching it.

    • TheOakTree@lemmy.zip
      5 up · 1 day ago

      rough writeup at if you’re curious how other dev teams are approaching it.

      Ah thank you, I will read " " and tell my peers and colleagues how we’re closing the evaluation gap.

      The fix isn’t to write your own five-sentence response, it’s to let an LLM write your response for you and post without any proofreading.

    • Infinite@lemmy.zip
      4 up · 1 day ago

      One has to wonder what the ROI is on having AI bots astroturfing about AI, especially when the output is this clearly artificial.