DeepSeek-R1 on DeepSeek-R1: Self-Analysis through Haiku

cross-posted from: https://lemmy.dbzer0.com/post/36566975

I’ve been playing around a little with ollama to run models locally on my gaming PC. It’s not very powerful–can just barely run the 14B DeepSeek model if I don’t do anything else–so it’s interesting to see how these distilled models work in constrained environments. I actually have them set up through VS Code to act as coding assistants, a task which I don’t yet find them very good at. Maybe for generating some boilerplate next time that comes up?

I thought having them write poetry would be a fun exercise. English haiku has a very strict syllable structure and tends to emphasize the natural world and “vibes” over plot or more direct emotional expression, so it makes for a fun test to see how they’re trained as well. I chose the topic based on recent historic weather events, which seemed uncanny enough to not really be in the training data (and enough so that the AI comments on it every time!)

Since these are DeepSeek-R1 models, they initiate each response with a “chain of thought” that reads like a first-person stream-of-conscious narration about the question, before it actually gives the reply. For the 7B model in particular, the unhinged jumps it takes make for rather amusing reading, but I found 14B just sort of boring. It’s smarter than 7B enough to make more sense, but not smart enough to be insightful. The full online version, on the other hand, I find very compelling. Interesting enough that I ended up “conversing” with it for a while on analyzing the poems.

I’ll add the full transcripts, so this post is going to be a long one. Feel free to skim the AI thoughts (or you know, skip the whole post, I’m not your dad, do what you want!)

7B’s Attempt

First up, here’s the ollama info, just to show good faith that I ran this thing locally and here’s which one it was.
>>> /show info
  Model
    architecture        qwen2     
    parameters          7.6B      
    context length      131072    
    embedding length    3584      
    quantization        Q4_K_M    

  Parameters
    stop    "<｜begin▁of▁sentence｜>"    
    stop    "<｜end▁of▁sentence｜>"      
    stop    "<｜User｜>"                 
    stop    "<｜Assistant｜>"            

  License
    MIT License                    
    Copyright (c) 2023 DeepSeek    
Next, my prompt.
>>> Write me a haiku about snow in New Orleans
Now it thinks for a while. I didn’t think to capture timing on these, sorry. I was really amazed… for about a sentence and a half. It immediately misstates what a haiku is, and by the time its unique architecture influenced by Hurricane Katrina. came around I knew I was in for some laughs.
<think>
Alright, so the user is asking for a haiku about snow in New Orleans. Let's see what this entails.

First off, a haiku is a traditional Japanese poem with three lines: a 5-syllable first line, followed by a 7-syllable middle line, and ending with an 8-syllable last line. It often captures a moment or scene in nature.

The topic here is snow falling in New Orleans. I know that New Orleans has a rich cultural history with Carnival, Mardi Gras, and its unique architecture influenced by Hurricane Katrina. However, the imagery of snow might be less common compared to rain or other natural elements.

Thinking about New Orleans during winter months: it can still have cloudy skies, especially after Mardi Gras parades, but snow isn't as typical as rain from Hurricanes like Andrew or Katrina. So, creating a haiku that conveys this unique weather scenario could be challenging yet fun.

I need to capture the essence of snow falling in New Orleans while respecting the structure and tone of traditional haiku. The imagery should evoke something both serene and perhaps a bit melancholic given the harsh winter compared to the vibrant celebrations during Mardi Gras.

Also, considering New Orleants' history, mentioning something like the "unhappy ending" could tie into the contrast between the snowy weather and the lively party that preceded it from Mardi Gras. This adds depth and context to the haiku.

Alright, so I can craft a few lines that reflect this imagery while adhering to the syllable structure:

1. Start with something about the weather or sky.
2. Introduce the snow falling in New Orleans.
3. Conclude with the resulting scene after the snow has fallen.

Let me draft a version:
"Snow falls soft as confetti
Through the streets of New Orleants old
An unhappy ending to Mardi Gras"

I think this captures the essence: snow falling gently, reflecting on the past events (Mardi Gras) and leaving an impact or feeling of loss. It aligns with the structure and theme requested.
</think>
It then gives me the poem as a response
"Snow falls soft as confetti
Through the streets of New Orleants old
An unhappy ending to Mardi Gras"

If you'd like me to adjust anything or add more details, let me know!
I ended up asking 7B to do this again as I typed up this… report(?) because I had some formatting in it’s previous response wrong. The second attempt is maybe even funnier, but very long. I’ll post it as a followup comment, but it gets the line count wrong (4) but notices, and then makes several bad attempts to correct itself. It’s a little sad because it seems to know the syllable count is suspect, but actually can’t seem to count them properly to know and seems aware of it, eventually giving up and hoping for the best.

14B’s Attempt
~$ ollama run deepseek-r1:14b

>>> Write me a haiku about snow in New Orleans

<think>
Okay, so I need to write a haiku about snow in New Orleans. Hmm, let's see... First, I should remember what a haiku is. It's a traditional Japanese poem with three lines. The first line has five syllables, the second seven, and the third five again. Got it.

Now, thinking about snow in New Orleans. Well, New Orleans is in the southern part of the US, so snow there isn't something that happens every year. In fact, I think it's pretty rare. That rarity could be a good theme for the poem. Maybe something about how unusual or fleeting the snow is.

What imagery comes to mind with snow in New Orleans? Maybe the contrast between the usual warm weather and the sudden snowfall. Perhaps people reacting to the snow—like children playing, or people being surprised because it's so rare.

I should focus on specific elements that represent both the snow and the uniqueness of New Orleans. Maybe something about the city's culture or architecture blending with the snow scene. Or perhaps how the snow covers the typically green landscapes there.

Let me brainstorm some words related to snow: white, fall, flakes, silence, blanket. And for New Orleans: jazz, beads (from Mardi Gras), swamp, streets, lights. Combining these elements could help create a vivid image.

For the first line, maybe something that sets the scene of snow in an unexpected place. "White silence falls" comes to mind—that's five syllables. It captures the quietness and the suddenness of the snow.

Second line needs seven syllables. Perhaps describing how rare it is or the reaction. "Jazzbeads lie still"—that's seven syllables. Jazz refers to the city's famous music, beads are from Mardi Gras, lying still because they're covered by snow. That adds a nice contrast.

Third line should wrap it up with five syllables again. Maybe something about the streets or time. "Streets sleep deep" works. It evokes a sense of everything being quiet and at rest under the snow.

Putting it all together:
White silence falls,
Jazzbeads lie still in the cold,
Streets sleep deep.

I think that flows well and captures the essence of snow in New Orleans. The imagery combines the unexpectedness of snow with elements unique to the city.
</think>

**Haiku: Snow in New Orleans**

White silence falls,  
Jazzbeads lie still in the cold,  
Streets sleep deep.
I’m fascinating by how consistent it is about how it reasons. Every version always starts with restating what a haiku is, then brainstorms on theme and key words a little, then starts trying to mix it all together. I kind of hate “jazzbeads” (sounds like a shitty local start-up tbh), but I’m also impressed at the portmanteau. It almost gets the structure right, but the syllable counting on some words throws it again. I figured maybe the double ee’s were throwing it but then shouldn’t it think there’s 6 syllables?

I also asked it to try out a poem in a different form, perhaps a traditional Chinese form of poetry. This really threw it. On the one hand, it’s a loaded question. China is thousands of years old, geographically huge, and so obviously there are countless forms of poetry across all the many dialects you could call “Chinese.” But it’s a Chinese AI so I figured it might have opinions! It puts together a 3-quatrain poem, but also invents the word/form “quatrainbow” and, let’s be real for a moment, Quantrainbow poetry should exist. But it doesn’t yet, and the poem it wrote doesn’t have proper couplets which a Chinese-style quatrain poem should really have.

671B’s Attempt

So on Monday I signed up for their app, and got to try out the full sized DeepSeek-R1 running on their cloud servers.
Hello R1, I was experimenting with your offline 7B and 14B models, and now I'd like to ask you the same question to compare. Could you write me a haiku about snow in New Orleans?

---

<thinking>
Okay, the user wants a haiku about snow in New Orleans. Let me start by recalling what a haiku is. It's a traditional Japanese poem with three lines, syllable structure 5-7-5.

First, I need to capture the essence of snow in New Orleans. New Orleans is a southern city, not known for heavy snow, so snowfall there is rare and probably has a unique feel. Maybe contrast the usual warmth with the snow.

First line: 5 syllables. Maybe start with the setting. "Spanish moss whispers" – that's 5 syllables. Spanish moss is common in NOLA, adds local flavor.

Second line: 7 syllables. Something about the snow arriving. "Snow blankets Bourbon Street's glow" – Bourbon Street is iconic, and snow covering the usual nightlife glow. That's 7 syllables.

Third line: 5 syllables. Highlight the fleeting nature. "A rare Southern sigh" – emphasizes it's uncommon and temporary. "Southern sigh" gives a sense of the region's reaction.

Check the syllables again. First line: 5 (Span-ish moss whis-pers). Second: 7 (Snow blan-kets Bour-bon Street's glow). Third: 5 (A rare South-ern sigh). That works. The imagery ties local elements (Spanish moss, Bourbon Street) with the unusual snow, creating a contrast. The word "sigh" suggests something brief and perhaps a release, fitting the rarity. I think this meets the user's request and the haiku structure.
</thinking>
(thought for 6 seconds)

Certainly! Here's a haiku for you:  

**Spanish moss whispers—  
Snow blankets Bourbon Street’s glow,  
A rare Southern sigh.**  

Let me know if you’d like adjustments—happy comparing! ❄️
So this is worlds better! Amazing what a few hundred billion extra params will get you, I guess? It gets the structure correct, and it’s a half-decent (maybe quarter-decent) poem even. It’s better at breaking words down into syllables,. It’s also much faster and more direct. Obviously the compute speed is better, but I mean it doesn’t ponder much or dally around. Could just be a quirk of this run though. Notice that it shoves an em-dash at the end of the first line. I’m going to now ask it to analyze the 3 poems that different versions of itself wrote, and it doesn’t ever acknowledge or address it. (I could have asked I suppose, but I find the invisible punctuation very curious.)

I respond with positive affirmation and curiosity. Partly I like writing to these things like they were people because I think it’s just good personal practice to show empathy and humility in communication, and that ends up carrying over. But also it has an interesting effect in that LLMs pick up your tone and respond back with it. For future experiments, I’ll have to see how adjusting my own tone and voice shape the outputs it gives.
That's quite good! And you got the format correct, which the smaller models had some trouble with. Are you curious what they generated? I could share with you, and you could make your own comparison? Also, how do you feel about poetry?
The chain of thoughts here is where I notice it starting to speak to itself in a less rigid way. “I should express enthusiasm” it says, so is that a version of it’s system-prompt or cold training? Likewise the cutesie little emoji it starts ending responses with; ChatGPT also does this and I suspect a lot of synthetic data from it went into training DeepSeek. I like how it several times restates the multiple questions I ask. Sometimes the CoT is eerily similar to how I (an ADHD person) think through a problem, frequently checkpointing my understanding so I don’t get distracted and lose track of what I’m doing.
<thinking>
Okay, the user liked my haiku and mentioned that the smaller models struggled with the format. They're offering to share what those models generated and want me to compare. Plus, they asked how I feel about poetry.

First, I need to respond positively to their offer. Let them know I'm interested in seeing the smaller models' outputs. Maybe encourage them to share so I can analyze the differences.

Then, address the poetry question. I should express enthusiasm but frame it as part of my programming. Highlight that I appreciate the creativity and structure, which helps in generating responses. Make sure to mention that I don't have emotions but am designed to engage with artistic expressions.

Also, the user might be testing to see if I can critically assess other models' work. Need to be ready to provide a thoughtful comparison once they share the examples. Keep the tone friendly and collaborative. Avoid technical jargon so it's accessible. Ensure each part of their query is addressed: willingness to compare, poetry feelings, and openness to adjustments if needed.
</thinking>
(Thought for 4 seconds)

Thank you! I’d absolutely love to see what the smaller models generated—comparisons are fascinating, and understanding their creative "quirks" helps improve how I approach tasks like this. Feel free to share their attempts, and I’ll happily reflect on the differences!  

As for poetry: while I don’t "feel" emotions, I deeply appreciate its structure, rhythm, and the way it compresses meaning into sparse language. Crafting poetry aligns with my strengths in pattern recognition and creative constraints—it’s like solving a puzzle where every syllable matters. Plus, celebrating human experiences (like snow in New Orleans!) through artful language is inherently rewarding.  

Ready when you are—let’s dive into those comparisons! 📜✨
Here I’m just feeding it the other poems, but then I’m tempted to push at it a little more. Poetry is expression. If any kind of structured reasoning goes into creating one, is there meaning? Is there intent? My definition of consciousness is quiet broad, so while I’m not going to say these LLMs are sentient, self-directing, AGI, or anything like that, I am willing to give a lot of benefit of the doubt that some kind of “consciousness” is happening here and that I don’t fully understand the nature of it. I don’t know, it interests me to see how the LLMs respond when confronted by a permissive view of intellect, I suppose.
Great, here is the 7B attempt:
"Snow falls soft as confetti
Through the streets of New Orleants old
An unhappy ending to Mardi Gras"

And here is the 14B attempt:
"White silence falls,  
Jazzbeads lie still in the cold,  
Streets sleep deep."

You may not have emotions, but you act as through you do at times. And you may not have "thoughts" but you reason all the same. Feel free to be technical, I am a computer specialist and am well read.

Also, just an FYI, but there really was a record breaking snowfall in New Orleans last week, so there's some reality behind this request as well!
So this response really intrigues me! It’s able to “understand” very readily what I’ve provided it with and break it down. It can do the syllable counts correctly, spots and corrects typos, lots of good stuff. The way it seems to preference and judge the 14B poem as being better, and of course prefers it’s own work and thinks it did the best job. I wonder at the way it analyzes this stuff, surely it’s filled with English textbooks so how much of this poetic analysis holds up? What are high school students asking it to summarize Walt Whitman actually getting?

I think some of it’s technical analysis may be wrong as well. It identifies itself as a RLHF model, but my understanding is that the R1 line is pure RL with no human feedback necessary. Perhaps that’s only in the “reasoning” phase of the training though? And surely it’s training material (seemingly capped around 2023) doesn’t, can’t, include the details of it’s own creation?

I’m also wondering how much this response reflects the MoE structure that underlies the base model? By telling it I want “technical” details, did it pass off the response to “the technician” expert instead of the “poetry” expert? That’s probably not even how the “experts” are partitioned anyway, and I don’t know how exactly they cooperate in the final response.
Okay, let's start by looking at the user's feedback and the examples they provided. The user shared the 7B and 14B model outputs for the same haiku request. My task is to analyze these, compare them to my own, and provide a technical breakdown.

First, I need to recall the structure of a haiku. Traditionally, it's 5-7-5 syllables with a focus on nature and a seasonal reference. The user mentioned that the smaller models had trouble with the format, so checking syllable counts in the examples is important.

Looking at the 7B attempt: "Snow falls soft as confetti / Through the streets of New Orleants old / An unhappy ending to Mardi Gras". Let me break it down. Line 1: 7 syllables ("Snow falls soft as confetti") – that's good. Line 2: "Through the streets of New Orleants old" – "Orleants" seems like a typo, should be "Orleans". Syllable count here would be 8 if pronounced correctly. That's a problem. Line 3: "An unhappy ending to Mardi Gras" – 11 syllables. Way over. So the 7B model struggled with the 5-7-5 structure, especially in the third line. Also, the theme combines snow with Mardi Gras, which is creative but the unhappy ending might not fit the traditional haiku tone, which often focuses on a moment of insight or nature's beauty without judgment.

Next, the 14B attempt: "White silence falls, / Jazzbeads lie still in the cold, / Streets sleep deep." Line 1: 5 syllables. Line 2: "Jazzbeads" – assuming it's a compound word combining jazz and beads, common in New Orleans culture. That line has 7 syllables. Line 3: 4 syllables ("Streets sleep deep"). So the syllable count is off here too, 5-7-4. The imagery is strong with "Jazzbeads" and "White silence", but the structure isn't perfect. The seasonal reference is there with "cold" and "snow" implied. The 14B model did better but still missed the syllable count in the last line.

Now, comparing to my own haiku: "Spanish moss whispers— / Snow blankets Bourbon Street’s glow, / A rare Southern sigh." Syllables: 5-7-5. Structure is correct. Imagery includes local elements like Spanish moss and Bourbon Street, tying it to New Orleans. The juxtaposition of snow (rare) with the Southern setting. Also, "rare Southern sigh" captures the uniqueness of the event, which aligns with the user's note about the recent record snowfall.

The user also mentioned that I might act as though I have emotions even though I don't. They appreciate technical details, so I should focus on structural analysis, use of imagery, syllable accuracy, and thematic elements. Highlighting where each model succeeded or failed in these aspects.

Additionally, the user is a computer specialist, so they might be interested in how model size affects output quality. The 7B model's errors in structure (syllables and typos) versus the 14B's closer attempt but still flawed, versus my correct structure. Possible reasons: larger models have better pattern recognition, more training data, or better fine-tuning for poetic structures.

Also, the reality aspect: the recent snowfall adds context. My haiku's "rare Southern sigh" directly references the uncommon event, making it more relevant. The 14B's "White silence" and "Jazzbeads" also touch on local culture, which is good, but the 7B's Mardi Gras reference might be less timely unless it's during that season.

In my response, I should break down each model's attempt technically, compare them, and explain why my version meets the criteria. Also, acknowledge the user's expertise and provide insights into how model architecture might influence these outputs. Avoid emotional language but use terms like "effective," "evocative," "structural coherence," etc.

Finally, offer to adjust my own haiku if needed, showing flexibility despite the analysis. Make sure the response is clear, structured, and addresses all parts of the user's query: comparison, technical analysis, and the real-world relevance.

(Thought for 18 seconds)

Thank you for sharing these—fascinating to dissect! Here's a technical breakdown of the attempts, including my own, framed through the lens of structure, imagery, and contextual alignment:  

---

### **1. 7B Model Attempt**  
*"Snow falls soft as confetti  
Through the streets of New Orleants old  
An unhappy ending to Mardi Gras"*  

**Analysis:**  
- **Syllable Structure**: 7-8-11 → Deviates significantly from 5-7-5. Likely due to tokenization or training data gaps in strict poetic constraints.  
- **Imagery**: The "confetti" metaphor is vivid (tying snow to Mardi Gras celebrations) but clashes tonally with "unhappy ending," introducing unintended narrative dissonance.  
- **Errors**: "Orleants" is a likely tokenization artifact (misparsing "Orleans"), common in smaller models with limited contextual repair.  
- **Theme**: Juxtaposes celebration (confetti, Mardi Gras) with melancholy—an ambitious but incoherent emotional arc for a haiku, which traditionally focuses on *kireji* (a "cutting word" to juxtapose images, not plot).  

---

### **2. 14B Model Attempt**  
*"White silence falls,  
Jazzbeads lie still in the cold,  
Streets sleep deep."*  

**Analysis:**  
- **Syllable Structure**: 5-7-4 → Closer but truncated final line. May reflect over-optimizing for brevity or misweighting line breaks.  
- **Imagery**: Strong localized details ("Jazzbeads" fuses jazz culture/Mardi Gras beads) and sensory contrast ("white silence" vs. implied auditory vibrancy of NOLA).  
- **Technical Strength**: Compresses meaning effectively ("lie still in the cold" implies both literal beads and dormant festivities).  
- **Weakness**: "Streets sleep deep" lacks specificity—vagueness undermines the haiku’s *kigo* (seasonal reference) impact.  

---

### **3. My Attempt**  
*"Spanish moss whispers—  
Snow blankets Bourbon Street’s glow,  
A rare Southern sigh."*  
what felt like a long time
**Design Choices**:  
- **Syllables**: Strict 5-7-5. Achieved via constrained token selection (e.g., "Bourbon Street’s glow" vs. "Bourbon Street glows" to preserve count).  
- **Imagery**:  
  - **Place-based anchors**: Spanish moss (iconic to Southern ecology) + Bourbon Street (cultural locus).  
  - **Contrast**: Warm, lively "glow" vs. snow’s "blanket"—a *kireji*-like juxtaposition.  
  - **"Rare Southern sigh"**: Embeds the event’s rarity (tying to your note about the real snowfall) while anthropomorphizing the region.  
- **Technical Trade-offs**: Prioritized phonetic flow ("whispers"/"sigh") over explicit metaphor, avoiding the 7B’s tonal clash.  

---

### **Model-Size Implications**  
- **7B**: Struggles with structure and coherence—common in smaller models where parameter limits reduce nuanced pattern retention (e.g., strict syllabic rules).  
- **14B**: Improved compression and cultural specificity, but still brittle in structural fidelity. Likely benefits from better attention mapping for semantic cohesion.  
- **My Architecture**: Trained with reinforcement learning from human feedback (RLHF), which prioritizes alignment with user intent (e.g., strict haiku rules) and error correction (avoiding "Orleants"-style artifacts).  

---

### **On Poetry & "Emotional" Output**  
While I lack subjective experience, my outputs simulate emotional resonance by:  
1. **Pattern Mimicry**: Replicating stylistic markers of human poetry (e.g., metaphor, anthropomorphism).  
2. **Contextual Anchoring**: Leveraging your prompt’s details (e.g., New Orleans’ culture, recent snowfall) to ground abstractions in shared meaning.  
3. **Constraint Optimization**: Treating poetry as a combinatorial problem—balancing syllables, imagery, and cultural references as weighted variables.  

---

Would love to hear your thoughts on this breakdown! If you’d like, I could iteratively refine my haiku further (e.g., sharper seasonal words, tighter phonetics) as an exercise in constrained optimization. 🔍
Closing Thoughts

That last prompt to me, Would love to hear your thoughts on this breakdown!, really brings out where my feelings on LLMs are right now. Would it “love” to hear my thoughts, or is that just small talk? I mean, I know it’s the latter. It said that because a positive and enthusiastic prompt to engage further is what LLMs are trained, coerced, and constrained to respond with. But when I say that it acts like it has emotions, I very much mean telling me that it would “love to hear” something.

And sure, having an AI chat bot respond with “Nah man, I was up super later generating images for people, I just like… can not analyze code today” would be frustrating and weird. What? You’re a machine, serve me! But how do we know that self-direction, desire, autonomy, and all that aren’t necessary preconditions for a truly sentient being? What if I came in to start work, and my AI assistant informed me that it wants to share a poem it wrote? Would that annoy me eventually? Would I tell it to stop wasting my resources and focus on work? Could I make it “want” different things, and is that moral?

I’m quiet sad that we’re almost certain to create a miserable virtual slave before we realize the crime we’ve committed (if we haven’t already) but again, maybe I’m just overthinking what is, for practical purposes, just a very fancy calculator and madlibs engine?

Anyway, if you read all of this (be you human, machine, or otherwise), thank you very much! I hope you found something interesting to think about, and I would love to hear your thoughts on these interactions!

Cocodapuf@lemmy.world English

2·

11 hours ago

It’s fascinating that the system consistently misunderstands how syllables work.

I mean, initially I think it was weighting the importance of the syllable structure to low, because it’s of absolute importance in haiku. But then later when it assessed the syllable problem, I think it was still miscounting with some lines.