The entire point of em-dashes as an identifier for LLMs is not the use of dashes, hyphens, or whatever in general - dashes are just part of normal human writing. The point is that almost no human would type an actual em-dash in a normal conversation, since it is very annoying to enter on a full-sized keyboard and a pointless detour on a phone at best. That’s why they’re usually reserved for professional writing (books, studies, etc.). But LLMs don’t distinguish; they just use the most common token in their training data, which is the em-dash, even when it doesn’t fit.
On my keyboard layout, it’s Shift + comma, which makes sense as a stronger, "capital" separator, so I do use them frequently on my PC.
https://en.m.wikipedia.org/wiki/Neo_(keyboard_layout)
It was a joke: that commenter was an AI and was trying to fool us.