• hendrik@palaver.p3x.de
    1 month ago

    What kind of AI workloads are these NPUs good at? I mean, it can’t be most generative AI like LLMs, since that’s mainly limited by memory bandwidth, and at that point it doesn’t really matter whether you have an NPU, GPU or CPU… You first need lots of fast RAM and a wide interface to it.
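To put rough numbers on the bandwidth argument, here is a minimal back-of-the-envelope sketch. It assumes that generating each token reads every model weight from memory once, so bandwidth divided by model size gives an upper bound on tokens per second. The function name and the bandwidth figures are hypothetical and for illustration only, not measurements of any real chip.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if all weights are read once per token."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    model_size_gb = 8.0  # assumed model size, in the range mentioned in the thread
    # Hypothetical memory bandwidths in GB/s, chosen only to show the scaling.
    for label, bandwidth in [("narrow memory interface", 50.0),
                             ("wide memory interface", 400.0)]:
        bound = max_tokens_per_second(bandwidth, model_size_gb)
        print(f"{label}: at most {bound:.1f} tokens/s")
```

Under this assumption the compute unit does not appear in the bound at all, which is the point being made: a faster NPU, GPU or CPU does not help until the memory interface gets wider.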

    • SlopppyEngineer@lemmy.world
      1 month ago

      That’s why NPUs will have high-bandwidth memory on chip. They’re also low precision to save power, but massively parallel. A GPU or CPU can do it too, just less optimized.

      • hendrik@palaver.p3x.de
        1 month ago

        That was my question… How much on-chip memory do they have? And what are the applications for that amount of memory? I think an image generator needs like 4-5GB and an LLM that’s smart enough as a general purpose chatbot needs like 8-10GB. More will be better. And at that point you’d better make it unified memory like with the M-series Macs or other APUs? Or this isn’t targeted at generative AI but at some other applications. Hence my question.
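The memory figures above can be sanity-checked with a small sketch. It assumes the footprint is dominated by the weights, so size is roughly parameter count times bits per weight, and it ignores activations and caches. The function name and the parameter counts are hypothetical and not taken from the thread.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    bytes_per_weight = bits_per_weight / 8
    return params_billion * bytes_per_weight


if __name__ == "__main__":
    # Hypothetical model sizes and precisions, for illustration only.
    for params_billion, bits in [(8, 16), (8, 8), (8, 4)]:
        size = weight_footprint_gb(params_billion, bits)
        print(f"{params_billion}B parameters at {bits}-bit: about {size:.1f} GB")
```

This also shows why the low precision mentioned earlier matters: halving the bits per weight halves the memory needed for the same parameter count.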

        • SlopppyEngineer@lemmy.world
          1 month ago

          Last I heard this is for onboard speech recognition and basic image recognition/OCR so these things can more intelligently listen, see and store what you’re doing without sending it to a server. Not creepy at all.