

I can deal with the sideways ones but the trains in my city have backwards seats :(
I heard that before going on strike, Canadian posties were escalating labour action incrementally, where first they refused overtime work, then they threatened to stop delivering flyers, which apparently makes up a substantial amount of Canada Post revenue.
Just as they ban Venezuelan president Nicolás Maduro, they unban the anti-vaxxers
RIP to the good bit about saying “I wonder what Milo Yabadabadopolis-fe-fi-filopolis has to say about this” then linking to his Twitter account that just said “@Nero was banned from Twitter”
Right after your quote:
For companies and investors caught in the fray, it’s been “total misery,” says Dan Wang, research fellow at Stanford University’s Hoover History Lab and the author of Breakneck. China’s model relies on “a lot of state power, a lot of consumer power, but not very much financial investor benefit,” he says.
won’t someone think of the poor shareholders?
No.
Like when Biden’s campaign claimed he had a history of civil rights activism… They failed to mention that it was against civil rights
Currently there’s a benchmark for creative writing where samples are produced by various AIs and then graded by one particular “judge” AI, so I’m sure they’re working on this
Canola is a genetic variant based on that type. All canola oil is that oil but not all that oil is canola. It was specifically bred to have particular properties, like being low in erucic acid.
There are some differences between canola and the type of oil you’re referring to (but they’re pretty much the same)
I get my info from a bunch of places, here are some of them:
Also, do you see anything on the horizon regarding people trying other ways to conceptualize “AI”? I mean, nowadays, for all intents and purposes, AI equals LLM.
Good question. I think that LLMs are definitely the dominant metagame currently. I think they will still get better, and I tend to agree that I don’t think this will lead to AGI, but I also have no idea what will. I think Anthropic’s research in understanding how the LLM “brain” works is very compelling and might lead to new developments, but I don’t know what they are. Here’s an essay talking about what might be the next improvement to LLMs: LLM Daydreaming · Gwern.net. I think this is very interesting and Gwern is good at predicting this kind of thing I think. But it also requires companies to be invested in a longer horizon of profit, which they’re notoriously pretty bad at doing.
I also came across this article: Xi Jinping warns against China’s overinvestment in EVs and AI, which seems potentially relevant. It’s interesting that the US is saying nothing of the sort.
I mean, what if Chinese researchers (because let’s face it, a great novel breakthrough would most likely come from there) find out that there’s another way to do AI that has nothing to do with LLMs and does not require GPUs? Just spitballing here, but if that were the case, then how would the US government and AI companies pivot, now that they’re so heavily invested into this?
Yeah, I really don’t know. I mean I think that GPUs are likely in any future AI breakthrough because massively parallelizing computation is what they’re good at, and they’ve been a staple of every kind of ML breakthrough in the last long while. Of course, massive parallelization doesn’t equal AI, but it’s hard for me to imagine an AI breakthrough that doesn’t use massive parallelization.
I feel like the gargantuan sunk cost of billions upon billions invested in one particular technology is at least partially driving this monotonic search for ways to make LLMs better and better, rather than branching off in some novel direction. We’re at a point where NVIDIA stock prices are so central we’re not even going to pretend to do something different.
Yes, I agree for sure. I saw a hexbear the other day (sorry, can’t remember who) saying that they didn’t think it was a coincidence that the market shifted to LLMs right after the crypto bubble popped. There’s a lot of GPU capacity that was freed up by that, which conveniently feeds perfectly into LLMs! And Nvidia is ridiculously overvalued as a company for sure. That being said, I think that even if LLMs are a bubble, it will be one more like the internet than like crypto - still overvalued but based on a fundamentally compelling technology, and it’ll stick around even after the bubble pops.
First, if I need to find something specific online, I’ll sometimes go to ChatGPT and use its online search function to see if it ends up pointing me towards useful references, something that Google can no longer do most of the time. I don’t do this often at all, and it’s kinda helpful in that regard, but not very much. I’ll also use it sometimes for grammaticality checks since I’m not a native English speaker, but I take the answers with a grain of salt… what if it’s trying to suck up to me by saying my sentences are “not only beautifully crafted — they’re very deep and meaningful”?
Second, and this is what I do with AI 95% of the time, is I use Deepseek to study Chinese, confirming everything it tells me with my native-speaker tutor, of course, which is why I’ll gladly accept cheaper, more efficient Chinese models.
Edit: another thing I find LLMs to be useful for is to search for collocations. This is entirely unsurprising as a useful feature, since collocations are by definition a function of natural frequent association, and the entire concept of LLMs revolves around word associations.
Firstly, your English is very good. I never would have guessed you weren’t a native speaker.
Secondly, these all seem like good use cases of LLMs. In my experience, Claude and Gemini both have decent search tools, with Gemini’s especially good for academic research. I’m curious to see open source models get better at search, but that might just be a function of access to search infrastructure, which obviously costs money. I also don’t think I’ve fairly assessed this yet.
Thirdly, using it for language-related tasks is probably its most compelling use case for most people. They’re getting really good at writing and editing. ChatGPT really likes to use very obviously AI language, but you can get DeepSeek/GLM/Claude to generate much less AI-sounding content.
Well, I think that’s part of it. I think that companies are probably trying to do that, but I think most software jobs won’t be replaced by Claude for a long while. I don’t know what people who work for OpenAI/Anthropic/Google think about this - maybe they think it’s coming, maybe not. IME, they’re good at writing code when instructed by a skilled engineer, but on their own, not very good. And the code they write is not always very maintainable. As someone who is currently unemployed but normally works in software, I’ve been using LLMs for brainstorming software design decisions for personal projects, and I find that they are good at “talking through” these kinds of things but less good at actually implementing them in a way that makes sense.
Some more reasons why LLMs are getting better at programming tasks:
There’s a ton of training data. This is separately frustrating to me because all of the open source code that people wrote for the greater good or whatever is now just getting hoover’d up into these closed source models (tbh I care less when Qwen or DeepSeek does this because they release the weights). In open source, there’s a license called the GPL that says that any derivative work has to be open sourced and released under the same terms as the GPL, and I think this is a Really Good way to make copyright law work for the public interest. Of course, the Silicon Valley mindset of move fast and break things (especially the law, until you’re big enough that the law doesn’t matter anymore - see Uber, Airbnb, etc) doesn’t give a fuck about this, and now LLMs are already too entrenched for anything to happen about this.
A lot of the time, there is a concrete right or wrong in programming. It’s a lot harder to evaluate how well your model works as a creative writer, but for code you can run the code and see if it does the thing you wanted it to. (Obviously there are other factors like code style and stuff but at least you have the baseline. It’s also easier for an LLM to grade whether output code is good style than it is to assess if a story is good.) Most LLMs nowadays are trained significantly on “synthetic data” which means data generated by other LLMs, and doing this at scale means you don’t have a human in the loop reading over the training data and grading essays or anything. It’s computers all the way down.
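The “run it and check” idea can be sketched in a few lines: execute a candidate solution against test cases and score it by how many pass. This is just an illustration of the concept, not any lab’s actual pipeline, and the function names are made up.

```python
# Sketch: grade generated code by executing it against test cases.
# All names here are illustrative; real systems run this sandboxed.
def grade_candidate(source: str, tests: list[tuple[tuple, object]]) -> float:
    """Exec code expected to define `solve(...)` and return the pass rate."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # compile and run the candidate's definitions
    except Exception:
        return 0.0  # code that doesn't even run scores zero
    solve = namespace.get("solve")
    if solve is None:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if solve(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash on one test just means that test fails
    return passed / len(tests)

candidate = "def solve(a, b):\n    return a + b\n"
score = grade_candidate(candidate, [((1, 2), 3), ((0, 0), 0), ((2, 2), 5)])
# score is 2/3: the last test deliberately expects a wrong answer
```

The pass rate becomes a reward signal no human ever has to look at, which is why code is such convenient synthetic training data.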
Reasoning is important, and programming is a very concrete task on which to train chain-of-thought style responses, e.g. DeepSeek. That’s also why they are trained on a lot of math, e.g. AIME benchmarks.
I think that GPT3 got a lot of gains from training on programming, in a way that generalized to other tasks. Somehow having all of that structured data in its training data made it better at tasks across the board, and that was one of the breakthroughs that led to the original ChatGPT. I think that the companies think that because it’s easy to train on code, and it seems that training on code makes it better at other things too, it’s the easiest path to more intelligent LLMs.
Programmers are early adopters of technology and potentially willing to pay large sums of money. Claude Code costs $200/month, which is crazy. And people buy it. Because when you make $200k/year, if a tool helps you do your job 30% better, it’s worth that kind of money. I think that this is a unique phenomenon - other high paying jobs like lawyer or doctor wouldn’t adopt this kind of technology as urgently. Hopefully most people in those professions realize that LLMs hallucinate too much to be useful in the general case. They can be good at reading papers or documents and summarizing them, but that’s a much easier task than programming, so even if they were widely used for that narrow purpose it wouldn’t justify a $200/month subscription. Like if ChatGPT can do it for free, or you buy their $20 tier and it can do that fine, who would pay $200?
For writing in English, GLM 4.5 is pretty good, open weights, and free or very cheap (free if you want just a chat interface, go to z.ai; cheap if you want API access, I’d recommend OpenRouter). It’s imo the best non-closed source LLM for writing in English. DeepSeek can be good for that too, but I’ve found that it can sometimes produce sentences that flow a lot worse (you can use DeepSeek for free via their website as a chat interface). For other creative pursuits, I’m not sure - if you give me an idea of what you’d want out of one, I can try to give you advice.
Parenti stopped being friends with Bernie after the Kosovo vote afaik
I think Cloudflare uses lava lamps because it’s a cool story, but there are ways you can get truly random bits from other things, like this. Generally, you want to have some sort of physical process going on that provides random entropy, because CPUs by themselves can only produce pseudorandom numbers. For example, random.org uses atmospheric noise, which is random and unpredictable when you look at very small variations. You can also use, e.g. a super sensitive Geiger counter to measure random bits of radiation, or if you shoot photons at a semi-reflective surface, sometimes they go through and sometimes they reflect. For the type shown here, though, the most common kind of noise they use is from quantum effects in transistors, as far as I know. This is an actual source of randomness, so if it’s done right it can be just as good as lava lamps or Geiger counters or whatever.
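The pseudorandom point is easy to demonstrate: a software PRNG is completely determined by its seed, while the OS entropy pool (which the kernel feeds with physical noise like interrupt timing and on-chip noise sources) isn’t reproducible. A minimal illustration, not anything like Cloudflare’s actual setup:

```python
import os
import random

# A plain PRNG is deterministic: same seed, same stream of "random" numbers.
a = random.Random(42)
b = random.Random(42)
assert [a.random() for _ in range(5)] == [b.random() for _ in range(5)]

# os.urandom draws from the OS entropy pool, which is seeded with
# physical noise - there's no seed you can replay to reproduce it.
print(os.urandom(16).hex())  # 16 unpredictable bytes
```

This is why crypto libraries insist on OS-provided randomness (os.urandom, /dev/urandom, getrandom) rather than anything seeded in userspace.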
Any chance you could post an (anonymized) sample of the writing style? I think I have a pretty good vibes-based intuition about what’s AI. Also,
(just as the state of the world)