When Less is More: Layer Removal That Improves LLM Reasoning
Part 2 of 2: Systematic analysis shows that removing specific layers improves mathematical reasoning performance by up to 11 percentage points. The Core Finding: In The Perplexity Trap, I documented...
The Perplexity Trap: Why Standard Metrics Fail for LLM Layer Pruning
Part 1 of 2: An empirical investigation showing that perplexity-based importance metrics provide no predictive signal for reasoning task performance. The Redundancy Assumption: A key assumption that drives a lot...