• Hotzilla@sopuli.xyz
    24 hours ago

    These new LLMs and vision models have their place in the software stack. They enable some solutions that were nearly impossible in the past (mandatory xkcd ref: https://xkcd.com/1425/ ; this is now a trivial task).

    ML works very well on large datasets and numbers, but it is poor at handling text data. LLMs, conversely, are shit with large data and numbers, but they are good at handling small amounts of text data. It is a tool, and, properly used, a very powerful one. And it is not a magic bullet.

    One easy example from real-world requirements: you have five paragraphs of human-written text, and you need to summarize them into a header automatically. Five years ago, if some project owner had requested this feature, I would have said string.substring(100), live with it. Now it is pretty much one line of code, as sketched below.
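    A minimal sketch of that "one line" in Python, using the OpenAI SDK as one possible provider; the model name and the prompt wording are my own illustrative assumptions, not part of the claim above:

        # pip install openai; expects OPENAI_API_KEY in the environment
        from openai import OpenAI

        client = OpenAI()

        def headline(text: str) -> str:
            # The actual "one line": ask the model to compress the text.
            resp = client.chat.completions.create(
                model="gpt-4o-mini",  # assumed model; any chat model works
                messages=[{"role": "user",
                           "content": f"Summarize into a short headline:\n\n{text}"}],
            )
            return resp.choices[0].message.content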

    • TheTechnician27@lemmy.world
      22 hours ago

      Even though I understand your sentiment that different types of AI tools have their place, I’m going to try clarifying some points here. LLMs are machine learning models; the ‘P’ in ‘GPT’ – “pretrained” – refers to how it’s already done some learning. Transformer models (GPTs, BERTs, etc.) are a type of deep learning, which is a branch of machine learning, which is a field of artificial intelligence. (edit: so for a specific example of how this looks nested: AI > ML > DL > Transformer architecture > GPT > ChatGPT > ChatGPT 4.0.) The kind of “vision-guided industrial robot movement” the original commenter mentions is a type of deep learning (so they’re correct that it’s machine learning, but incorrect that it’s not AI). At this point, it’s downright plausible that the tool they’re describing uses a transformer model instead of a traditional deep learning architecture like a CNN or RNN.

      I don’t entirely understand your assertion that “LLMs are shit with large data and numbers”, because LLMs are trained on the largest datasets in human history. If you mean you can’t feed a large, structured dataset into ChatGPT and expect it to categorize new information from that dataset, then sure, because: 1) it’s pretrained, not a blank slate that specializes on the new data you give it, and 2) it’s taking the data in as plaintext rather than in a structured format. If you took a transformer model and trained it on the “large data and numbers”, it would work better than traditional ML. Non-transformer machine learning models do work with text data; LSTMs (a type of RNN) do exactly this. The problem is that they’re just way too computationally inefficient to scale well to gargantuan training datasets (and consequently don’t generate text well, if you want to use them for generation and not just categorization). In general, transformer models do literally everything better than traditional machine learning models (unless you’re doing binary classification on data which is always cleanly bisected, in which case the perceptron reigns supreme /s). Generally, though, yes: if you’re using an LLM to do things like image recognition or taking in large datasets for classification, what you probably have isn’t just an LLM; it’s a series of transformer models working in unison, one of which is an LLM.
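      To make the LSTM point concrete, here is a toy PyTorch classifier of the kind I mean; the layer sizes and class count are arbitrary illustrations, not a reference implementation:

          import torch
          import torch.nn as nn

          class LSTMClassifier(nn.Module):
              # Toy example: variable-length token ids in, class logits out.
              def __init__(self, vocab=10_000, embed=64, hidden=128, classes=2):
                  super().__init__()
                  self.embed = nn.Embedding(vocab, embed)
                  self.lstm = nn.LSTM(embed, hidden, batch_first=True)
                  self.head = nn.Linear(hidden, classes)

              def forward(self, token_ids):        # (batch, seq_len)
                  x = self.embed(token_ids)        # (batch, seq_len, embed)
                  _, (h_n, _) = self.lstm(x)       # h_n: (1, batch, hidden)
                  return self.head(h_n[-1])        # (batch, classes)

          # Any sequence length works, but each timestep waits on the previous
          # one, which is exactly the scaling problem described above.
          logits = LSTMClassifier()(torch.randint(0, 10_000, (4, 37)))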


      Edit: When I mentioned LSTMs, I should clarify this isn’t just about text data: RNNs (of which LSTMs are a type) are designed to work on pieces of data that don’t have a definite length, e.g. a text article, an audio clip, and so forth. The description of the transformer architecture in 2017 catalyzed generative AI so rapidly because it could train so efficiently on data not of a fixed size and then spit out data not of a fixed size. That is: like an RNN, the input data is not of a fixed size, and the transformed output is not of a fixed size. Unlike an RNN, the data processing is vastly more efficient in a transformer because it can make great use of parallelization. RNNs were our main tool for taking in variable-length, unstructured data and categorizing it (or generating something new from it; these processes are more similar than you’d think), and since that describes most data, suddenly all data was trivially up for grabs.
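      A toy illustration of that parallelization difference (just the shape of the computation, not the real architectures):

          import torch

          seq_len, d = 128, 64
          x = torch.randn(seq_len, d)              # a sequence of embeddings

          # RNN-style recurrence: h_t depends on h_{t-1}, so the loop over
          # timesteps is inherently sequential.
          W = torch.randn(d, d) * 0.01
          h = torch.zeros(d)
          for t in range(seq_len):
              h = torch.tanh(x[t] + h @ W)

          # Attention-style: every position attends to every other position in
          # one matrix product, so the whole sequence is computed at once.
          scores = (x @ x.T) / d ** 0.5            # (seq_len, seq_len)
          out = torch.softmax(scores, dim=-1) @ x  # (seq_len, d)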