Sara Vera Marjanovic
saravera.bsky.social
PhD fellow in XAI, IR & NLP
✈️ Mila - Quebec AI Institute | University of Copenhagen 🏰
#NLProc #ML #XAI
Recreational sufferer
DeepSeek-R1 also exhibits greater safety vulnerabilities than its non-reasoning counterpart DeepSeek-V3, and its reasoning capabilities can be used to generate jailbreak attacks that successfully elicit harmful responses from other safety-aligned LLMs.
April 1, 2025 at 8:07 PM
Notably, we show DeepSeek-R1 has a ‘sweet spot’ of reasoning: beyond it, extra inference time can impair model performance, and continually scaling the length of its thoughts does not necessarily improve results.
April 1, 2025 at 8:07 PM
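The ‘sweet spot’ finding suggests a simple experiment: cap the reasoning-token budget at several lengths and watch task performance rise and then fall. A minimal Python sketch of such a budget sweep, where `truncate_thought`, `sweep_budgets`, and `toy_score` are hypothetical stand-ins (the scorer is a toy that mimics the inverted-U shape, not real model output):

```python
# Cap the reasoning chain at a token budget, forcing the model to answer early.
def truncate_thought(thought_tokens, budget):
    return thought_tokens[:budget]

# Sweep several budgets and return {budget: score}; the 'sweet spot' is the argmax.
def sweep_budgets(thought_tokens, budgets, score):
    return {b: score(truncate_thought(thought_tokens, b)) for b in budgets}

# Toy scorer that peaks at a mid-sized budget, mimicking the inverted-U
# described in the post (illustrative only, not real evaluation data).
def toy_score(tokens):
    return -abs(len(tokens) - 6)

chain = "first decompose then verify then verify again and again and again".split()
scores = sweep_budgets(chain, [2, 4, 6, 8, 10], toy_score)
sweet_spot = max(scores, key=scores.get)  # -> 6: more thinking stops helping
```

In a real study, `toy_score` would be replaced by task accuracy under each budget; the point of the sketch is only that the optimum sits at an intermediate budget rather than the longest one.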
DeepSeek-R1’s thoughts follow a consistent structure. After determining the problem goal, it decomposes the problem and works towards an interim solution. It will then re-explore or re-verify that solution multiple times before completion, though these re-verifications can lack diversity.
April 1, 2025 at 8:07 PM
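The structure described above can be probed by splitting a reasoning chain at rumination cues, so the first segment is the decomposition and later segments are re-verification passes. A rough sketch, where the cue words are illustrative assumptions rather than markers defined by the paper:

```python
import re

# Split points: zero-width lookaheads before common "second-guessing" cues.
# These cue words are assumed for illustration, not taken from the paper.
CUES = re.compile(r"\b(?=(?:Wait|Alternatively|Let me verify|Hmm)\b)")

def segment_chain(thought: str) -> list[str]:
    """Split a reasoning chain at rumination cues.

    Segment 0 is the initial goal/decomposition; each later segment is one
    re-exploration or re-verification pass, so their count and similarity
    hint at how (un)diverse the re-verifications are.
    """
    return [s.strip() for s in CUES.split(thought) if s.strip()]

chain = ("The goal is to find x. Splitting into cases gives x = 3. "
         "Wait, let me check case two again. "
         "Alternatively, substitute back to confirm.")
segments = segment_chain(chain)  # decomposition + 2 re-verification passes
```

Comparing later segments against each other (e.g. by token overlap) would then give a crude measure of the re-verification diversity the post mentions.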
The availability of R1’s reasoning chains allows us to systematically study its reasoning process, an endeavor we term Thoughtology💭. Starting from a taxonomy of R1’s reasoning chains, we study the complex reasoning behaviour of LRMs and provide some of our main findings below👇.
April 1, 2025 at 8:07 PM
Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks, investigating its capabilities, limitations, and behaviour.
🔗: mcgill-nlp.github.io/thoughtology/
April 1, 2025 at 8:07 PM