• Lucy :3@feddit.org
    17 days ago

    The entire point of em-dashes as an identifier for LLMs is not the use of dashes/hyphens/whatever themselves - dashes are just part of normal human writing. The point is that almost no human would use actual em-dashes in a normal conversation, since typing them is very annoying on a full-sized keyboard and a pointless detour on a phone at best. Therefore, they’re usually reserved for professional writing (books, studies, etc.). But LLMs don’t distinguish, and just use the most common token in their training data, which is the em-dash, even when it doesn’t fit.