Why most leadership assessments measure the wrong thing
A leader can top every personality assessment and still freeze when a decision carries real weight. Most tools measure traits and preferences. Few measure the capability that decides whether a leader holds under pressure.

A leader can score in the top decile of every personality assessment you put in front of them and still freeze the moment a decision carries real weight. The profile said confident, decisive, strategic. The behaviour, when a supplier collapsed and the board wanted answers by Friday, said something else. This gap is not a measurement error. It is the predictable result of measuring the wrong thing.
Traits are not capability
Most leadership assessment tools were built to describe disposition. Personality typologies sort people by preference: how they take in information, how they prefer to decide, where they draw their energy. That has real value for self-awareness and for helping a team understand its own wiring. It was never designed to predict how a person leads when the situation turns and the easy options run out.
The distinction matters because organisations keep asking a description to do a prediction's job. A typology tells you that a manager prefers structure and dislikes ambiguity. It does not tell you whether that manager can hold a steady judgement when the structure falls away. Those are different questions, and only the answer to the second tells you whether capability is actually present.
This is not a semantic distinction, because it changes what you do with the result. If you assess traits, the logical next step is to match people to roles that suit their wiring, which is useful but static. If you assess capability, the next step is development: capability can be built, and the assessment becomes a map of where to build rather than a label fixed to a person. The most expensive consequence of shallow assessment is that it quietly tells an organisation its leaders are finished products when they are not.
There is a quieter problem underneath. Many of these instruments rely on self-report. A leader answers questions about how they typically behave, and the tool scores the answers. Self-report measures self-perception, which is exactly the faculty that pressure distorts first. The leaders who most need an honest reading are often the least able to give one about themselves.
Pressure is the discriminator. In calm conditions almost any competent manager looks like a leader, because calm conditions forgive a great deal. The questions that separate leaders are the ones that only arrive under load: whether judgement holds when the information is incomplete, whether the difficult conversation still happens when it would be easier to defer it, whether the person can stay present rather than retreat into procedure or bravado. An assessment that never examines behaviour under that load is, by design, blind to the capability that matters most.
Assess capability, not just disposition
At CapabilityFX we draw a hard line between leadership skills and leadership capability. Skills are specific and learnable: running a meeting, giving feedback, reading a balance sheet. Capability is the broader ability to apply judgement well across situations you have not seen before, especially when the stakes are high. Skills are a subset of capability. An assessment that stops at skills and traits stops short of the thing that matters.
This follows directly from how leadership actually changes. A decade of doctoral research underpinning our work points to a consistent finding: lasting change happens inside-out, at the level of who a leader is, not only what a leader can do. If development works inside-out, assessment has to look there too. Measuring only the outer layer, the visible behaviours and stated preferences, tells you little about whether the inner capability that drives them under pressure is present or absent.
Capability that endures has a particular signature. It shows up as consistency between calm and pressure rather than a sharp drop when conditions harden. It shows up as judgement that improves with experience instead of hardening into habit. And it shows up in the willingness to accept an uncomfortable reality before acting on it, which is often the hardest part of leading and the part that disposition tools never test. An assessment worth trusting looks for that signature, not for a flattering self-portrait.
What a capability lens looks at
A capability-oriented assessment asks a different set of questions. Not "what is your style" but "what do you do when your style stops working." Ennea International's Five Lens Development Platform, which we use whenever coaching needs to go beneath the surface, reads leadership across several dimensions of capability rather than collapsing a person into a type. The Tomorrows Compass future-readiness assessment, which CapabilityFX is licensed to use as its measurement backbone, looks forward, at whether a leader is equipped for the demands coming rather than the ones already mastered. Both are built to surface capability that holds, not preferences that flatter.
The point is not that personality tools are worthless. The point is that they answer a narrow question and are routinely asked a much larger one. When the larger question is "can this person lead when it counts," disposition is necessary context and nothing more.
What this looks like in the room
The gap between trait and capability is easiest to see in specific people doing specific work.
The regional operations director. Consider a regional operations director in a wholesale and retail group, strong on every profile the business had run, marked as high potential. In stable conditions the rating held. Under pressure a pattern showed itself: every exception was escalated upward, every difficult call deferred until someone more senior signed it off. The disposition was decisive. The capability to carry judgement under load was not yet built. No personality assessment had flagged it, because none had asked the right question.
The HR director's rollout. Consider an HR director who introduced a full battery of psychometrics and a 360-degree feedback cycle across the leadership population, expecting the data to lift performance. A year on, the reports were thorough and the behaviour was unchanged. The instruments had described the leaders accurately and developed none of them. Accurate description is not the same as a useful reading, and neither is the same as change.
The plant manager who held. Now consider the opposite. A manufacturing plant manager scored unremarkably on the company's preferred profile, middling on confidence, low on the traits the business associated with leadership presence. When a safety incident hit, that manager ran a calm, sequenced response, held the difficult conversations, and recognised honestly where the system had failed. The profile had under-read them precisely because it was measuring presentation, not capability. The capability was there. The tool could not see it. What the tool also missed was the effect on everyone around that manager: the team gave straight answers because they trusted the response would be measured, and the review surfaced the real cause rather than a convenient one. Capability is visible in the room it creates, not only in the person.
Across all three, the lesson repeats. The observable behaviour that decides a leader's value shows up under pressure, in the small moments where judgement is either present or it is not. An assessment that only measures calm-weather disposition fails in all three cases: it overrated the operations director, it described the HR director's leaders without developing any of them, and it missed the plant manager entirely.
How to tell what your assessment is actually measuring
You do not need to abandon the tools you have to ask sharper questions of them. Before you trust an assessment with a development decision, put it to a short test.
- Does it predict, or only describe? Ask what the tool claims to forecast about behaviour under pressure, and what evidence backs that claim. If the honest answer is that it describes preference, treat it as context, not a verdict.
- Does it measure capability or self-perception? A reading built mostly on self-report measures how a leader sees themselves. Useful, but not the same as how they lead when it counts.
- Does it look forward? Most tools assess the leader you have for the situations you have already faced. Ask whether it reads readiness for the demands coming next.
- Does it survive the pressure test? Picture your most demanding scenario of the last two years. Ask whether the assessment would have told you, in advance, which leaders would hold through it. If you cannot say yes with confidence, you are measuring fair-weather leadership.
If you want a structured way to think about how leaders move from describing a situation to leading through it, our DUAL model sets out the sequence: Discover, Understand, Accept, Lead. It is a useful frame for seeing where, in a real decision, capability either holds or gives way. Our 4D method shows how that capability is built rather than merely measured.
Measure the half that matters
Assessment is not the enemy of good leadership development. Shallow assessment is. A tool that measures traits and preferences answers a real but narrow question, and the cost of mistaking it for the whole picture is high: leaders promoted on disposition who falter under load, and capable leaders overlooked because they do not present in the expected way. If you want to know whether a leader will hold when the stakes are highest, measure the capability that decides it. If you would like to see what a capability-led reading looks like for your own leadership team, start a conversation.
The leaders described here are representative composites drawn from patterns we observe in practice, not identifiable individuals. The research claim refers to the doctoral work underpinning the CapabilityFX approach to leadership capability.
Dr Eric Albertini · Co-Founder, CapabilityFX
Originator of the DUAL model, developed through his doctoral research at the University of Johannesburg. Eric has spent his career building leadership capability inside executive teams.


