I hope this bubble bursts soon; I’m getting real tired of it.
Let me guess: AWS, Azure, GCP, NSA
Damn, missed Meta :(
Yeah, you are likely correct. It’s not hard to guess. All this hype around AI is to temporarily increase share price.
From what I remember from a few months back, the H100 GPU (I think) cost something like $30k, and there were a few companies each buying up to 100k of them, so something like this makes sense.
So if the demand for these GPUs doesn’t stay high, which is possible as the newer LLMs are smaller and much better while not taking as much compute to run once trained, then what matters is by how much this demand will fall. I really doubt the demand will fall to 0, but we could be one development away from the demand tanking, holding steady, or even going up.
Meta trained Llama 3.1 405b model on 16 thousand H100s: https://ai.meta.com/blog/meta-llama-3-1/
So yeah, the scale of it can get wild. But from what I can tell, there are clearly diminishing returns on the usefulness of throwing more processing power at model training, and more breakthroughs are needed in the architecture to get meaningfully further on general model “competence.” The main problem seems to be that you need a ridiculous amount of decent data to make it worth scaling up. Not only in terms of the model showing signs of actually being “better”, but in terms of the cost to run inference on it when a given user actually uses the model. And quantization can somewhat reduce the cost to run it, but in exchange for reducing overall model competence.
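To make that quantization trade-off concrete, here’s a toy sketch in plain NumPy of per-tensor symmetric int8 quantization (a hypothetical minimal example; real deployments use fancier per-channel or grouped schemes): you get 4x smaller weights, but every value gets rounded to the nearest step, which is exactly the “reduced competence” cost.

```python
import numpy as np

# Toy weight matrix standing in for one layer of a model.
rng = np.random.default_rng(0)
weights = rng.normal(0, 0.02, size=(1024, 1024)).astype(np.float32)

# Symmetric int8: map the float range to [-127, 127] with one scale factor.
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Dequantize to see how much precision was lost.
dq = q.astype(np.float32) * scale

print("memory reduction: %dx" % (weights.nbytes // q.nbytes))  # 4x (float32 -> int8)
print("max abs rounding error:", np.abs(weights - dq).max())
```

The error per weight is bounded by half a quantization step (scale/2), which is why models usually still mostly work after quantization, just a bit worse.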
Right now, my general impression is that the heavy hitter companies are still trying to figure out where the boundaries of the transformer architecture are. But I’m skeptical they can push it much further than it has already gone by brute-forcing scale. I think LLMs are going to need a breakthrough along the lines of “learning more from less” to make substantial strides beyond where they’re at now.
I guess these whales do benefit from more efficient LLMs as well; it’s not like their choice is “expand compute power” XOR “use more efficient LLMs”. Worst case, they can rent spare compute power to other companies.
Maybe. Idk
Yeah this is true. For companies that operate the scale of FAANG, even seemingly insignificant savings add up to become significant.
I think it’s less about models getting smaller and more about the current approach to AI (transformer-based LLMs) hitting a ceiling that will cause problems for them.
What are whales again? I thought they were just beautiful animals.
Whale is a term that became popular with the rise of online games, especially on mobile, that rely heavily on microtransactions for their income stream.
The companies behind such games internally use the term whales for players/customers who spend ludicrously and disproportionately high amounts of money on the game. Often these players are addicted to the game or there are other psychological factors at play.
Most importantly, income from whales accounts for a large portion of the total income. There is this talk by a mobile game dev that went viral because he very casually talks about the exploitative nature of his games with no self awareness: https://youtube.com/watch?v=xNjI03CGkb4
Since then, the term has been used in other businesses where applicable. Here, a small number of customers account for about half the revenue, so they are being called whales.
Whales ruined gaming, whales ruined PC hardware. Now whales are about to ruin Nvidia. Critical support to the whales, ig.