
How to choose a leadership assessment that predicts, not just describes

Most leadership assessments describe who someone is. Far fewer predict what they will actually do under pressure. A practical buyer's guide to choosing the right tool for the right purpose.

Dr Eric Albertini · Co-Founder, CapabilityFX

The cost of asking the wrong question

Every year, organisations invest in leadership assessments and walk away with beautifully formatted reports that say very little about whether the person will actually lead well. The problem is rarely the tool. It is the question behind the purchase.

Most assessments answer the question: who is this person? That is a legitimate question. But the question most buyers actually need answered is different: what will this person do when it counts? Those are not the same question, and no single instrument answers both equally well.

This is not an argument against psychometrics. It is an argument for clarity about purpose. Choosing the wrong tool for the job is not merely an academic concern. It shapes who gets promoted, who receives coaching, and which capability investments your organisation makes. The stakes are real.

What psychometric and behavioural assessments actually measure

The distinction between psychometric and behavioural assessment is frequently collapsed into a vague preference debate. It should not be. The two approaches measure genuinely different things.

Psychometric assessments measure stable psychological constructs: personality traits, cognitive styles, values, motivational drivers. They are self-reported, normed against reference populations, and designed to be consistent across time. When a leader completes a personality inventory, they are providing a description of how they tend to perceive themselves and the world. The data is rich, especially for understanding patterns that sit below the surface of day-to-day behaviour.

Behavioural assessments observe or simulate what a person does. This includes structured observation tools, 360-degree feedback instruments, situational judgement tests, and scenario-based simulations. Rather than asking how you describe yourself, they ask: what did you actually do, or what would you do, in this specific situation? The data is more directly predictive of performance because it is grounded in action.

Neither approach is categorically superior. Psychometrics offer depth and developmental richness. Behavioural tools offer proximity to performance. The mistake most organisations make is not choosing one over the other. It is using a psychometric tool in isolation and expecting it to do a behavioural tool's job.

A personality profile cannot tell you whether a leader will hold a difficult conversation with a direct report next Tuesday. It can tell you something about their tendencies around conflict, discomfort, and relational courage. That is valuable. But it is a starting point for interpretation, not a prediction.

The isolation mistake

When organisations treat a single psychometric score as the basis for a hiring or promotion decision, they are asking a description to carry the weight of a prediction. The research on this is consistent. Schmidt and Hunter's 1998 meta-analysis, still widely referenced in industrial-organisational psychology, ranked personality measures such as conscientiousness below work-sample tests, structured interviews, and cognitive ability measures as predictors of job performance. Combinations of methods, such as a cognitive ability measure paired with a structured interview or a work-sample test, performed better still.

More recent work reinforces the direction. A 2023 Chartered Institute of Personnel and Development review of assessment validity found that multi-method approaches consistently outperformed single-tool selection processes, particularly for senior roles where context complexity is high.

The isolation problem is not unique to psychometrics. A behavioural assessment used without any deeper understanding of the person can favour leaders who perform well in structured simulations but lack the self-awareness to sustain that performance when conditions change. Each type of tool has limits when it is used alone.

The framing that serves buyers best is not "which type?" but "what are we trying to learn, and which combination of evidence gets us there?"

Selection versus development: two different jobs

This is perhaps the most important distinction in the buyer's conversation, and it is often absent from it entirely.

Selection asks: will this person perform in this role? The evidentiary standard is predictive validity. You need data that connects assessment results to actual outcomes: performance ratings, retention, readiness to lead at the next level. Behavioural assessments, especially structured simulations and scenario-based tools, are better suited to selection because they generate data closer to actual job performance. Psychometrics can contribute to a selection picture, but they should not carry it alone.

Development asks: what does this person need to grow, and how can we help them? Here, the evidentiary standard shifts. You are not predicting; you are understanding. Depth, nuance, and interpretive richness matter more than predictive coefficients. Psychometric tools, when used with skilled facilitation, can surface patterns of identity, self-concept, and motivation that purely behavioural data does not reach. That depth is precisely where lasting capability change begins.

CapabilityFX's orientation toward inside-out development rests on this distinction. The DUAL model (Discover, Understand, Accept, Lead) moves a leader through a process that begins with genuine self-knowledge, not competency mapping. That requires the right instrument for the developmental purpose.

The practical consequence for buyers: if you are choosing an assessment for a selection process, weight predictive validity heavily. Ask vendors for independent criterion-related validity studies. If you are choosing for a development programme, ask what the instrument surfaces about the person's relationship to their own patterns, not just their behavioural tendencies.

Most procurement conversations conflate these two purposes, which is one reason organisations cycle through tools without getting what they actually need.

Five sharper questions to ask any vendor

When evaluating an assessment, the standard vendor checklist (reliability, norm group, user experience) is necessary but insufficient. These questions go further.

Does it predict or describe? Push the vendor to distinguish between concurrent validity (does this correlate with current performance ratings?) and predictive validity (does this correlate with future performance outcomes?). Many tools claim validity without specifying which kind. They are not equivalent.

Is the evidence independent? Validity studies conducted by the tool's own publisher deserve more scrutiny than those published in peer-reviewed journals or conducted by independent researchers. Ask whether the validity data applies to your context: sector, role level, cultural setting. A tool normed on North American middle management may not transfer cleanly to South African senior leadership.

Self-report or observed behaviour? Self-report instruments are faster to administer and often richer for developmental insight. But they are vulnerable to social desirability effects, particularly in high-stakes selection contexts. Observed or simulated behavioural tools reduce (though do not eliminate) that vulnerability.

Is it forward-looking? Most leadership assessments are built around the demands of the present role. That is a reasonable starting point. But for organisations planning succession or building capability for a shifting context, you need an instrument that asks something about readiness for what is coming. Does the tool assess a leader's capacity to operate in conditions of uncertainty? To lead through change they did not initiate? These are different questions from "is this person competent at their current level?"

What does the debrief require? An assessment without skilled interpretation and a structured developmental conversation is a report file. Ask what facilitation is built into the process, and whether the practitioners who debrief the results are qualified to work with the depth of the data the tool produces. Complexity assessment data, in particular, should not be handed over without a supported unpacking.
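For buyers who want to see the first question above in concrete terms, here is a minimal sketch in Python. It assumes validity is reported as a simple Pearson correlation between assessment scores and performance ratings. The function name, the variable names, and every number are hypothetical and made up purely for illustration; they are not drawn from any real instrument or study. The only difference between the two figures is when the performance ratings were collected.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical, made-up data for six leaders (illustration only).
assessment_scores = [62, 71, 55, 80, 68, 74]

# Ratings gathered at the same time as the assessment.
ratings_at_assessment = [3.1, 3.8, 2.9, 4.2, 3.5, 3.9]

# Ratings gathered for the same leaders at a later review (assumed: 18 months on).
ratings_at_later_review = [3.0, 3.2, 3.4, 3.9, 2.8, 3.6]

# Concurrent validity: scores against performance measured now.
concurrent_validity = pearson_r(assessment_scores, ratings_at_assessment)

# Predictive validity: scores against performance measured later.
predictive_validity = pearson_r(assessment_scores, ratings_at_later_review)

print(f"concurrent validity: {concurrent_validity:.2f}")
print(f"predictive validity: {predictive_validity:.2f}")
```

A vendor quoting a single "validity coefficient" could be reporting either figure. A sample this small would not support any real conclusion; the point is only that the two numbers answer different questions, so it is worth asking which one you are being shown.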

How CapabilityFX combines tools

CapabilityFX does not advocate for a single instrument. The approach depends on what the client needs to learn and at what stage of the process.

For developmental depth, CapabilityFX uses the Five Lens Development Platform, developed by Ennea International. The Five Lens integrates Enneagram-based personality insights with behavioural observation, making it more nuanced than a standard personality inventory. It surfaces patterns at the level of motivation, identity, and relational style. These patterns are not easily visible through behavioural data alone, and they are precisely the patterns that shape how a leader responds under pressure. CapabilityFX is licensed to use and facilitate the Five Lens; the intellectual property and framework belong to Ennea International.

You can read more about how CapabilityFX applies the Five Lens in practice on the assessments page.

For forward-looking readiness, CapabilityFX works with the Tomorrows Compass Future Readiness Assessment. Tomorrows Compass developed this instrument to assess a leader's readiness to perform in complex, uncertain, and rapidly shifting conditions. It is behavioural in orientation and forward-facing in design. CapabilityFX is a licensed distributor and measurement partner for Tomorrows Compass; the assessment belongs to them.

Details on how this instrument works and who it is for are on the Tomorrows Compass assessments page.

Neither tool replaces the other. Used together, within the CapabilityFX method, they provide a picture that neither alone can produce: the depth of who the leader is, and the forward-readiness evidence for where they need to go. The relationship between those two data sets is where the most useful developmental conversations happen.

If you are uncertain which combination is right for your context, the assessments overview outlines the options, and the nuance of fit-to-purpose is a conversation worth having before any procurement decision is made.

What this looks like in practice

A financial services organisation investing in its next tier

A major financial services firm approached CapabilityFX ahead of a leadership transition programme for 14 senior managers identified as successors for executive roles. The initial brief was to "run assessments" as a baseline for the programme. When CapabilityFX asked what decisions the data would inform, the answer was: selection of the final eight participants for an intensive development track, plus developmental planning for all 14.

Those are two different questions requiring different evidentiary standards. For the selection component, CapabilityFX recommended including the Tomorrows Compass assessment alongside the Five Lens, precisely because selection for a senior leadership track requires forward-looking behavioural evidence, not only a developmental depth picture. The combination gave the organisation something a single psychometric tool could not: both the identity-level patterns that would shape how each leader engaged with the development process, and a structured view of readiness for the complexity they were being prepared to lead.

The debrief process, facilitated by CapabilityFX, surfaced several cases where the two data sets told a usefully different story. A head of operations who scored strongly on forward-readiness indicators showed, through the Five Lens work, a deep-seated pattern around perfectionism and control that she had not previously named. Her readiness to lead strategically was genuine. But without addressing the identity-level driver of her perfectionism, that readiness would stall the moment strategic ambiguity required her to let go of operational certainty. The selection recommendation was inclusion; the developmental priority was clear.

An HR director reconsidering a long-standing assessment

An HR director at a wholesale distribution business contacted CapabilityFX after a difficult experience. The company had used a standard personality inventory as a key input for leadership appointments for several years. The inventory was well-normed and widely used. It was also, she had come to believe, not telling them much that was useful.

When she described the problem more specifically, two issues emerged. First, the tool was being used for selection without any corresponding validity data connecting its scores to the outcomes that mattered to the business: whether appointed leaders built capable teams, held accountability, or performed under the particular pressures of distribution operations. Second, because the same tool was also used for development, neither purpose was being served well. The selection use created social desirability pressure that compromised the developmental debrief.

The recommendation was to separate the two purposes, not to abandon psychometric assessment altogether. For selection, a structured behavioural component was added. For development, the Five Lens, with its Ennea International depth and skilled facilitation, replaced the personality inventory at senior levels. Separating the purposes made both processes more honest and more useful.

You can read a related discussion of what leadership assessments frequently miss, and why, in Why most leadership assessments measure the wrong thing.

Starting the right conversation

Choosing a leadership assessment is a procurement decision that sits inside a more important question: what do you need to know, and what will you do with the answer?

If the purpose is selection, ask for predictive validity evidence, independent validation, and a behavioural component. If the purpose is development, ask what the instrument surfaces about identity, motivation, and self-concept, and whether the facilitation process is designed to hold that depth.

If the purpose is both, separate them deliberately, or acknowledge what you are trading off.

The organisations that get the most from assessment investment are not necessarily the ones with the most sophisticated tools. They are the ones who have asked the right questions before the purchase, and who have a clear line between the data they collect and the conversations that data enables.

If you are working through these questions and want a practical conversation before any procurement decision, the team at CapabilityFX is available. The assessments page is a useful starting point, and you can reach us directly via contact.

The leaders and organisations described here are representative composites drawn from patterns we observe in practice, not identifiable individuals.

Dr Eric Albertini · Co-Founder, CapabilityFX

Originator of the DUAL model, developed through his doctoral research at the University of Johannesburg. Eric has spent his career building leadership capability inside executive teams.
