Recent Posts
- LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals
- The Tool Illusion: Rethinking Tool Use in Web Agents
- Do LLMs Follow Their Own Rules? A Reflexive Audit of Self-Stated Safety Policies
- Human Values Matter: Investigating How Misalignment Shapes Collective Behaviors in LLM Agent Communities
- Litmus (Re)Agent: A Benchmark and Agentic System for Predictive Evaluation of Multilingual Models