How to design a leadership development programme that actually holds
A practical blueprint for designing a leadership development programme that builds capability you can still see a year later: start from a real baseline, run both paths together, and measure behaviour, not satisfaction.

Most leadership programmes are designed backwards. Someone decides on a budget, a number of cohorts, and a calendar of content, and the design question becomes which modules fill the slots. The harder question, the one that decides whether anything holds twelve months later, gets answered by accident: who do these leaders need to become, and what conditions would actually let them get there? If you are responsible for designing the next programme, that is the question worth starting from. This is a guide to doing exactly that.
Start where the diagnosis ends
We have written elsewhere about why your leadership programme is not building capability: the short version is that most programmes train the visible competency and leave the underlying capability untouched. That piece is the diagnosis. This one is the design. If you have read it and recognised your own programme in it, the natural next question is not "what went wrong" but "what would I build instead." Everything below assumes you accept the premise and want the blueprint.
A good design rests on a small number of decisions made well, not a long syllabus assembled efficiently. There are five of them. Get these right and the content almost designs itself. Get them wrong and no amount of polished facilitation will save the result.
Decision one: baseline from capability, not from a competency wish-list
The first instinct in most programme design is to agree a target: the behaviours you want leaders to display by the end. That is a destination without a starting point. You cannot design a developmental arc if you do not know where each leader actually begins, and "begins" here means more than a skills gap.
A real baseline answers two different questions at once. The first is what these leaders can currently do: the outside-in picture that a competency model or a 360 captures reasonably well. The second is who these leaders are being under pressure, which those tools barely touch. That second question is where the capability lives, and it is the one most baselines skip because it is harder to read.
This is where measurement earns its place, used honestly. A structured assessment that looks below behaviour gives you a starting line you can design against and, later, measure change against. We set out the instruments we use and why on the assessments page. Two points matter for design. First, credit where it is due: the Five Lens Development Platform is Ennea International's, and the future-readiness assessment is Tomorrows Compass's. CapabilityFX is licensed to use both; we did not author either. Second, treat the output as a starting point, not a verdict. A baseline tells you where the developmental work needs to happen for each leader. It does not sort people into keepers and write-offs.
A worked example. A regional retail group ran its usual high-potential programme for two years and could not understand why the same people kept stalling at the same level. When they baselined properly, the pattern was clear. The cohort was strong on outside-in capability, fluent in strategy, finance, and operations, and consistently thin on one inside-out dimension: the willingness to hold a hard position when a more senior person was uncomfortable. No competency wish-list had named that, because it does not show up as a missing skill. It shows up as a leader who goes quiet at the moment the business most needs them to speak. Once the baseline named it, the programme had something real to build on, rather than another round of strategy content for people who already knew the strategy.
Decision two: design for both paths, deliberately
The most common design failure is to build a programme that occupies one path and assume it covers the whole leader. We have made the full case for this in "leadership development keeps training the wrong half", so the principle here is brief: leadership capability grows on two paths at once, and a programme that runs only one will not hold.
The two paths, in the language of the DUAL model:
Outside-in. The skills the role demands, the decisions the business requires, the behaviours a leader can learn and practise. This is the path most programmes already do well. It is teachable, schedulable, and measurable.
Inside-out. Character, judgement, and how a leader holds themselves when conditions are hard. This is the path that decides whether the outside-in work survives contact with reality. It cannot be delivered as content. It is built through challenge, honest feedback, and time.
Designing for both paths does not mean adding an "inner work" module to an otherwise unchanged programme. It means alternating between the two deliberately, session by session, so that a concrete leadership challenge (outside-in) is followed by honest reflection on what patterns showed up in handling it (inside-out). The same piece of feedback data serves both: it is information about behaviour and, in the right hands, a mirror for identity. The design skill is keeping both paths in motion without overwhelming either. Push the inside-out work faster than a leader can integrate it and you produce defensiveness. Run only the outside-in and you produce surface change that does not transfer.
Practically, this is the difference between a programme map that reads as a list of topics and one that reads as a sequence of movements. Discover, Understand, Accept, Lead is the arc, and the design job is to build experiences that move leaders along it rather than lectures that describe it.
Decision three: build in time, not more days
Here is the trade most programmes get exactly wrong. They compress development into the fewest possible contact days because days are the visible cost. But capability is not built by adding input. It is built by encountering difficulty, responding, reflecting on the response, returning to the work, and encountering difficulty again. That cycle does not complete in a two-day offsite. It completes across months.
So the design choice is not "how many days of content" but "how long an arc." A programme that runs as a series of shorter sessions spread across four to six months, with structured practice and reflection between each, will do more developmental work than the same hours compressed into a single block. The compression is almost always driven by cost and diary convenience, not by what development requires. The business case for spreading the arc is simply the business case for the investment working at all.
Two design elements make the arc real rather than nominal:
Intimacy. Smaller groups are not a luxury; they are a mechanism. A leader can sit through three days in a cohort of 30 and never have a single genuine assumption examined. In a group of eight working over six months, concealment is much harder to sustain, and the inability to conceal is where the real work begins. If budget forces a choice between reaching more leaders shallowly and fewer leaders properly, the design should usually choose depth.
Accountability between sessions. Development that vanishes into the busyness of the organisation the moment a session ends builds nothing durable. The strongest programmes build in structured return: what did you try, what did you notice, what did you avoid, and why. Not as a reporting chore, but as the practice of honest examination repeated until it becomes the leader's own habit.
Our 4D method is built around exactly this architecture: an arc that holds duration, intimacy, and accountability together, rather than a sprint that simulates them.
Decision four: connect the work to the work
A programme that runs in a room, on content unconnected to the live pressures of the business, teaches leaders to perform in that room. The development needs to attach to real decisions the leaders are actually carrying, or it stays theoretical.
In design terms, this means the raw material of each session should be the participants' own current leadership challenges, not generic case studies. The supplier negotiation that is going sideways. The team member who is quietly disengaging. The strategy that is clear on the page and stalling in execution. When the programme works on live material, two things happen. The outside-in skill is practised on something that matters, so it transfers. And the inside-out pattern shows up in real time, because people behave under genuine stakes the way they actually behave, not the way they would in a hypothetical.
A second worked example. A professional services firm designed a programme for its newly promoted partners and, instead of a leadership curriculum, anchored every session in a real client or team problem each partner brought. One partner kept presenting a stalled cross-team project as an organisational design issue. Working it live, the group surfaced something he had not: he was avoiding the one conversation that would unstick it, with a peer he disliked, and had built an elaborate structural explanation so that he never had to name that. The skill he needed (running a direct, repair-oriented conversation) was outside-in. The reason he had avoided it for months was inside-out. Because the programme worked on his actual project rather than a case study, both halves became visible and addressable at once. The project moved within weeks, which is the kind of signal a board can see. Our use cases describe more of how this connection takes shape in practice.
Decision five: measure behaviour change, not satisfaction
The final design decision is the one most quietly abandoned: how you will know it worked. Most programmes measure the wrong thing because the wrong thing is easy to measure. Post-programme surveys capture satisfaction and immediate reaction. They tell you the facilitator was good and the lunch was adequate. They tell you almost nothing about whether capability moved.
Design the measurement in from the start, and design it around behaviour. The useful questions are not asked on the last day; they are asked six months later, and they are behavioural:
- Are these leaders holding difficult conversations they would previously have deferred?
- Are they visible in moments of uncertainty, or still absent from them?
- Are their teams bringing them real problems, or only safe ones?
- Has the specific pattern the baseline identified actually shifted under pressure, not just in the room?
This is why decision one matters so much to decision five. A real baseline gives you something honest to measure against. If you know that this cohort started thin on holding a hard position under senior discomfort, you have a precise, observable thing to look for a year on, far more useful than an aggregate satisfaction score. Measuring behaviour change is harder than scoring a survey. It is also the only measurement that tells you the truth. We have written separately on the metrics that actually track capability rather than activity, and the same logic applies to any programme you design: pick signals that a sceptical board would accept as evidence the leaders genuinely changed.
Putting the five together
These decisions are not a menu. They reinforce each other or they fail together. A real baseline (one) gives the two-path design (two) something honest to work on and gives the behavioural measurement (five) something precise to track. The time and intimacy of the arc (three) are what allow the inside-out path to move at all. Connecting the work to live business challenges (four) is what makes both paths transfer beyond the room. Drop any one and the others weaken.
If you want a single test to run over any programme design, including a vendor's proposal landing on your desk this week, ask: does this build the half that holds under pressure, or only the half that performs in good conditions? Does it start from where these leaders actually are, run long enough to change anyone, work on real stakes, and measure whether behaviour moved? Most proposals answer the easy half of each question and skip the hard half. The hard half is the design.
Where to start
You do not need to redesign every programme at once to put this into practice. The first move that pays back most is usually the baseline, because it changes every decision downstream and because it surfaces, often uncomfortably, the gap between what the current programme is building and what the leaders actually need. From there, the design choices follow more naturally than they appear to from the outside.
If you are weighing how to build, or rebuild, a leadership programme that holds, that is the conversation we are built for. You can see how the work takes shape across our services, or start a conversation with us about your own leadership population. The right design is specific to where your leaders actually are. The starting point is finding that out honestly.
The organisations and leaders described here are representative composites drawn from patterns we observe in practice, not identifiable clients or individuals.
Ricardo Albertini · Co-Founder, CapabilityFX
Ricardo Albertini is a co-founder of CapabilityFX. His career spans leadership consulting, EdTech, FinTech, and media across South Africa and internationally. He launched Africa's first multiplayer VR training tool and has designed development programmes for some of the country's largest financial and automotive organisations. He holds certifications in team performance and Enneagram-based coaching, and writes about what it takes to build capability that lasts.


