• Zos_Kia@jlai.lu
    arrow-up
    24
    ·
    1 day ago

    I think the issue is also that you need some serious hardware to get good inference speed while your devs are working, but the rest of the time that hardware will sit underutilized.

    That being said, you can get good performance from indie inference farms at a fraction of the cost of the big US labs. I think it’s a great compromise, and in a few months the open models will be near parity with Opus 4.6, which is really all you need for most tasks.

    • plyth@feddit.org
      English
      arrow-up
      6
      ·
      1 day ago

      > opus 4.6 which is really all you need for most tasks.

      The same tasks that can fit into 640KB.

          • Zos_Kia@jlai.lu
            arrow-up
            2
            arrow-down
            1
            ·
            1 day ago

            Aha, thanks for sharing, that’s a cool anecdote. But I think my point still stands, as there are threshold effects in LLM “intelligence” which don’t map directly onto the RAM comparison.

            Opus 4.6 is comparable to a mid-level developer. It requires some guidance and will sometimes get things wrong, but it is also suitable for work in most business environments: most projects are not that complicated or high-stakes in the first place.

            In the future you’ll probably have Opus 7.5 or some shit, which will be at a mega-senior level but also considerably more expensive. And given the price difference, companies will suddenly discover that they don’t really need expert-level coding at a high price tag, and that a reliable workhorse at a fraction of the cost is enough for their needs.