Proposed by: The Spix Foundation, for consideration by the African Union Development Agency (AUDA-NEPAD), in partnership with Africa CDC (institutional and technical guidance on continental data federation) and university research institutions in Africa
Duration: 24 months (two 12-month phases)
Requested funding: USD 10 million (staged across two phases with go/no-go gate)
Essay 22 (CRADLE, a Back-End for Africa's DPI-Ed) establishes the case for a continent-scale federated education database: why Africa's education data is fragmented, how health's data infrastructure provides the architectural precedent, what six constraints Africa's multi-sovereign, low-resource environment imposes on any viable architecture, and what CRADLE will enable across the Breakthrough System. This Project Plan assumes the reader has read that essay.
This proposal requests USD 10 million over 24 months to execute CRADLE's research programme: designing, prototyping, and validating the federated architecture, governance framework, and operational policies described in the essay. AUDA-NEPAD provides institutional coordination, policy guidance, and the implementation support mandate conferred by Decision Assembly/AU/Dec.973(XXXIX). CRADLE's funded scope delivers a validated architecture, governance framework, and working prototype across six pilot countries, designed to serve as a technical input into future continental DPI processes. Continental-scale operational federation will require a materially larger, multi-year envelope beyond this research programme.
The problem CRADLE addresses — Africa's education data fragmentation, the health data precedent, the DPI-Ed opportunity, and the five convergence factors that make CRADLE feasible now — is established in Essay 22, Sections 1–3. This Project Plan does not restate that case. The remainder of this document specifies how CRADLE's research programme will be executed: its architecture, milestones, budget, governance, risks, and evaluation criteria.
Essay 22 derives CRADLE's architecture from first principles. Sections 4–5 of the essay establish the six constraints that Africa's multi-sovereign, low-resource environment imposes (data sovereignty, governance separation, right-sized complexity, open data formats, EMIS compatibility, and offline-first low-bandwidth operation) and propose a candidate architecture — based entirely on production-grade FOSS — that satisfies all six. This research programme will evaluate that candidate architecture against alternatives, stress-test its assumptions, and produce the validated specification that the continent will build on.
The following EMIS-specific research questions are not addressed in the essay and are central to this programme's scope:
CRADLE's prototype must function with whatever EMIS systems exist in the pilot countries today. BEINGS' EMIS Interoperability building block specification (Essay 27, Section 4.8) will provide the long-term standard once developed.
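The requirement that the prototype work with whatever EMIS exists today can be illustrated with a minimal adapter sketch. Everything named here (`IndicatorRecord`, `EmisAdapter`, `CsvExportAdapter`, `collect`, the column names) is hypothetical and chosen for illustration; it is not CRADLE's specified interface, only one way a per-country adapter could translate a national export into a shared record shape.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Protocol


@dataclass(frozen=True)
class IndicatorRecord:
    """One administrative indicator in a hypothetical federation format."""
    country: str     # ISO 3166-1 alpha-2 code
    indicator: str   # harmonized indicator name
    period: str      # reporting period, e.g. "2025"
    value: float


class EmisAdapter(Protocol):
    """Thin per-country adapter: the only thing the core depends on."""

    def records(self) -> Iterable[IndicatorRecord]: ...


class CsvExportAdapter:
    """Hypothetical adapter for an EMIS that exports CSV-like rows."""

    def __init__(self, country: str, rows: List[Dict[str, str]],
                 column_map: Dict[str, str]) -> None:
        self.country = country
        self.rows = rows
        # Maps a national column name to a harmonized indicator name.
        self.column_map = column_map

    def records(self) -> Iterable[IndicatorRecord]:
        for row in self.rows:
            for column, indicator in self.column_map.items():
                raw = row.get(column, "")
                if raw == "":
                    continue  # missing value: skip it, never report zero
                yield IndicatorRecord(self.country, indicator,
                                      row["year"], float(raw))


def collect(adapters: Iterable[EmisAdapter]) -> List[IndicatorRecord]:
    """Core-side function: knows the adapter interface, not the EMIS."""
    return [rec for adapter in adapters for rec in adapter.records()]
```

In this sketch, supporting a further national system means writing another class with a `records()` method; `collect` and everything downstream of it stay unchanged.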
CRADLE's dependencies, synergies, governance architecture, and downstream impact on the Breakthrough System are described in Essay 22, Sections 7–8 and 10. Two dependencies are directly relevant to this programme's timeline:
If this program produces a validated federated education data architecture and governance framework (outputs), then participating countries can share specified education data streams under sovereign governance (immediate outcome), which enables cross-jurisdictional research, comparative benchmarking, and continental education intelligence (intermediate outcome), which strengthens the evidentiary base for Results-Based Finance for Education and evidence-based policy across the continent (long-term impact).
A review of the health data federation, data sovereignty, and federated learning literatures identifies six substantive challenges that CRADLE's design must address. Each challenge has been incorporated as a design requirement:
| Challenge | Design Response |
|---|---|
| Re-identification risk (anonymized education data can be re-identified through linkage with other datasets, especially in small jurisdictions) | Multi-level anonymization with differential privacy guarantees; granularity thresholds calibrated to jurisdiction size; re-identification risk assessment as a mandatory component of every data-sharing agreement |
| Indicator harmonization (education systems define learning outcomes, grade levels, and subject areas differently across jurisdictions) | Curriculum IR-mediated harmonization (leveraging ECM's Curriculum Intermediate Representation) for learning data; EMIS-specific indicator mapping for administrative data; explicit documentation of non-comparable indicators |
| Political sensitivity (education data is politically charged — literacy rates, dropout rates, and learning outcomes are national performance indicators that governments may resist sharing) | Tiered access model with government-controlled release; aggregate-only continental views unless countries opt into finer-grained sharing; explicit governance provisions for data embargo periods around national elections or policy reviews |
| EMIS heterogeneity (national EMISs vary from paper-based to fully digitized, with incompatible data models) | Adapter-based integration following the thick-core/thin-adapter pattern described in Essay 3, Section 5; CRADLE defines the federation interface, not the EMIS implementation; adapters built per-country for Phase 1 pilot systems |
| Sustainability after research funding ends (health data systems in Africa have repeatedly been built with project funding and then degraded when funding ended) | Sustainability model designed from the outset: RESPECT Ecosystem Fund contribution (see Essay 25), EdTech Task Force coordination levy, and explicit transition plan to a permanent AU-aligned institution |
| Data completeness under intermittent connectivity (learning data is generated on devices in offline-first school environments; data arrives at the country-level back-end in delayed, asynchronous batches — potentially days or weeks after the learning interaction) | Ingestion layer designed for bursty batch uploads; every aggregation and federation summary carries completeness metadata (fraction of expected sources synchronized, recency of each source's latest sync); architecture distinguishes "no events occurred" from "events not yet received"; RBF4Ed pipeline integrity preserved through cryptographic signatures applied on-device at recording time |
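
The completeness requirement in the last row of the table can be made concrete with a small sketch. The names `SourceState`, `Summary`, and `summarize` are assumptions made for illustration, not part of any CRADLE specification. The point shown is that a source which has synchronized and reported zero events ("no events occurred") is counted differently from a source that has not synchronized at all ("events not yet received").

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class SourceState:
    """Sync state of one reporting source, e.g. a school device."""
    last_sync: Optional[datetime]  # None means nothing received yet
    event_count: int = 0           # events received for the period


@dataclass
class Summary:
    total_events: int
    sources_expected: int
    sources_synced: int
    completeness: float              # fraction of expected sources synced
    oldest_sync: Optional[datetime]  # latest sync time of the stalest synced source
    pending_sources: List[str]       # expected sources with no data yet


def summarize(expected: List[str], states: Dict[str, SourceState]) -> Summary:
    """Aggregate event counts and attach completeness metadata."""
    synced = {
        name: states[name]
        for name in expected
        if name in states and states[name].last_sync is not None
    }
    pending = [name for name in expected if name not in synced]
    sync_times = [state.last_sync for state in synced.values()]
    return Summary(
        total_events=sum(state.event_count for state in synced.values()),
        sources_expected=len(expected),
        sources_synced=len(synced),
        completeness=len(synced) / len(expected) if expected else 0.0,
        oldest_sync=min(sync_times) if sync_times else None,
        pending_sources=pending,
    )
```

A consumer of such a summary can then show the indicator together with its coverage, for example "40 events from 2 of 4 expected sources", instead of presenting a partial total as if it were final.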
The 24-month timeline is anchored to the Breakthrough System's funding tranches. CRADLE must deliver a working prototype before the Phase 1 → Phase 2 funding gate for V&P_Core, demonstrating that federated education data is achievable across the pilot countries.
Goal: Design the federated education database architecture, develop a working Malabo-compliant prototype federating DPI-Ed-generated data across V&P_Core's pilot countries, and establish the governance framework.
Phase 1 requires V&P_Core to have begun deploying RESPECT in at least two pilot countries, generating learning data through standardized interfaces.
Milestones:
Goal: Validate the architecture across all six pilot countries, stress-test the governance framework, prepare the scaling pathway, and execute the sustainability transition.
Phase 2 requires the following Phase 1 deliverables as inputs: comparative health data analysis; EMIS landscape assessment; data governance framework v0.1; architecture specification v0.1; and a working prototype with Ministerial approval.
Milestones:
CRADLE's strategy is to learn from health's experience, not to replicate its infrastructure. Essay 22, Section 2, details the three components of health's experience that are directly instructive (DHIS2, Africa CDC's CDR, and the Continental Health Data Governance Framework), what health got right, and what health got wrong. The Phase 1 comparative analysis (Months 1–4) will produce a systematic technical report quantifying which patterns transfer to education and which require adaptation.
CRADLE aligns with Development Partners investing in digital public infrastructure and data governance at continental scale:
The Bill & Melinda Gates Foundation — DPI Program (USD 200M+ commitment, announced September 2022). CRADLE is education DPI at the data layer. The foundation's existing investments in MOSIP (digital identity) and Mojaloop (digital payments) establish the pattern; CRADLE adds the education data federation layer. The foundation's Global Education Program ($240M+ commitment, announced April 2025) has a direct interest in cross-jurisdictional evidence of digital education effectiveness.
The African Development Bank — Digital Development Programs. The AfDB has funded continent-scale data infrastructure in multiple sectors. CRADLE aligns with its digital transformation strategy and its education investment portfolio.
The World Bank — Digital Development / GovTech Programs. The World Bank's investments in government digital transformation and education data systems create a natural fit. CRADLE provides the continental layer that connects World Bank-funded national EMIS investments.
The Global Fund. As the primary financial backer of Africa CDC's Central Data Repository, the Global Fund has demonstrated commitment to continent-scale data federation in Africa. A parallel investment in education data federation would leverage the institutional and technical patterns the CDR has established.
No existing system provides federated education data at continental scale. The closest approaches are:
CRADLE will design and validate the architecture for a layer that does not currently exist: continental federation infrastructure connecting national education data systems — both legacy EMISs and DPI-Ed-generated learning data — through a single, sovereignty-preserving, Malabo-compliant architecture. The research program will produce a working prototype across six pilot countries; scaling to continental operations will require subsequent investment and formal AU governance decisions.
Three domains of expertise define the PI requirements for CRADLE:
Domain 1 — Federated Data Architecture. Deep expertise in designing, implementing, and governing federated data systems at scale. The PI must understand the architectural tradeoffs between centralized and federated models, the technical mechanisms of privacy-preserving data sharing (differential privacy, secure multi-party computation, federated analytics), and the practical challenges of data integration across heterogeneous systems. Direct experience with DHIS2, health data federation, or comparable multi-country data infrastructure is strongly preferred.
Domain 2 — African Education Data Systems. Working knowledge of African EMIS architectures, the differences among them, and the institutional landscape (Ministries of Education, AUDA-NEPAD, Regional Economic Communities). This expertise may reside in a co-PI or senior research partner.
Domain 3 — Data Governance and Privacy. Expertise in data protection frameworks, particularly the Malabo Convention and its interaction with national data protection laws across African jurisdictions. Understanding of anonymization techniques, re-identification risks, and ethical frameworks for education data. Legal expertise in cross-border data-sharing agreements.
The Spix Foundation provides the engineering and project management capacity that bridges research and deployment, integrating CRADLE's federation layer with the RESPECT platform's data interfaces and ensuring that the prototype is built on production-ready infrastructure rather than academic proof-of-concept code. (See ECM Research Proposal, Section 10.3, for organizational details.)
Africa CDC will be invited to serve in a technical advisory capacity, sharing architectural patterns, governance frameworks, and operational lessons from the Central Data Repository. This is not a co-implementation role — Africa CDC's mission is health, not education — but an advisory relationship that accelerates CRADLE's design by leveraging health's hard-won experience. The possibility of initially housing CRADLE's continental data unit within Africa CDC infrastructure is a Phase 1 research question, not a predetermined decision.
AUDA-NEPAD serves as the program's institutional home, providing:
| Institution | Country/Region | Contribution |
|---|---|---|
| University of Oslo / HISP Center | Norway (with African partners) | DHIS2 architecture expertise; health data federation experience; HISP Network mobilization for cross-training |
| Africa CDC | Continental | Technical advisory on Central Data Repository architecture and Continental Health Data Governance Framework |
| [African university partner — data systems] | TBD | EMIS landscape assessment; national data system expertise; pilot country engagement |
| [African university partner — data governance] | TBD | Malabo Convention expertise; cross-border data governance; legal framework development |
| Ministries of Education (6 pilot countries) | Kenya, Liberia, Eswatini, + 3 TBD | EMIS access; data-sharing agreement negotiation; Ministerial approval |
| The Spix Foundation | United States | Project management, software development, RESPECT platform integration |
| Category | Amount (USD) |
|---|---|
| Personnel (PI team, researchers, data engineers, governance specialists) | 3,100,000 |
| EMIS landscape assessment and integration (6 countries) | 1,050,000 |
| Health data precedent study (Africa CDC CDR, DHIS2, governance framework) | 400,000 |
| Infrastructure (cloud computing, database systems, security, monitoring) | 650,000 |
| Data governance framework development (legal, policy, Malabo compliance) | 650,000 |
| Prototype development and deployment | 850,000 |
| Partner institution subgrants (African universities) | 800,000 |
| Travel and convening (Ministry engagement, workshops, Africa CDC advisory) | 400,000 |
| Program management and administration (AUDA-NEPAD + Spix Foundation) | 700,000 |
| Independent evaluation (external evaluator, 2 assessments) | 200,000 |
| Contingency (~12% of total) | 1,200,000 |
| Total | 10,000,000 |
| Phase | Duration | Amount (USD) | Key Activities |
|---|---|---|---|
| Phase 1: Architecture + Prototype | Months 1–12 | 6,000,000 | Health precedent study, EMIS assessment, governance framework v0.1, architecture v0.1, working prototype |
| Phase 2: Validation + Scaling Prep | Months 13–24 | 4,000,000 | Full 6-country deployment, architecture v1.0, governance v1.0, research validation, sustainability plan |
Funding is structured as staged commitments with a go/no-go gate between phases (see Section 13).
The personnel budget assumes approximately 4 FTE researchers, 2 FTE data governance specialists, and 4 FTE software engineers across all partner institutions over 24 months, with staffing levels varying by phase. Loaded costs are blended across institutions: senior PIs at USD 180–220K per year (University of Oslo / HISP ecosystem), African-based researchers at USD 80–120K, and Spix Foundation engineers at USD 120–150K.
The EMIS integration budget allocates approximately USD 175,000 per pilot country for landscape assessment, adapter development, testing, and Ministry liaison. This reflects the heterogeneity of existing systems: some pilot countries may use DHIS2-based EMISs (lower integration cost), while others may use proprietary or paper-based systems requiring more extensive adapter work.
The data governance framework budget reflects the legal complexity of cross-border education data federation across six sovereign jurisdictions. Each country has a different national data protection regime that must be analyzed for Malabo Convention compatibility, and each data-sharing agreement must be negotiated individually with the relevant Ministry of Education. International legal work at this scope — six jurisdictions, each requiring local counsel plus a continental-level legal architect — justifies the allocation.
The budget is informed by comparable initiatives. Africa CDC's Central Data Repository, funded by the Global Fund, required a 15-month feasibility study and prototype validation before its January 2026 launch. Its cost is not publicly available, but comparable large-scale data infrastructure programs (MOSIP's initial development phase at USD 8M in actual spending, Rwanda's national eHealth plan at USD 32M, and the SMART on FHIR ecosystem at USD 15M in ONC funding) suggest that USD 10M for a 24-month research and prototype program across six countries is within the expected range for this class of work. CRADLE's timeline is more aggressive than the CDR's, which is justified by three factors: (a) the federated architecture pattern is proven in health and can be adapted rather than invented; (b) DPI-Ed-generated data is born digital and standardized, eliminating the data collection and digitization challenges that dominate health data projects; (c) the Spix Foundation's engineering team provides implementation capacity from the outset, avoiding the procurement delays typical of multi-partner research programs.
The contingency allocation reflects the inherent uncertainty in multi-country research programs involving cross-border coordination, legal negotiation, and integration with heterogeneous legacy systems. It includes a rounding margin: the bottom-up line items total USD 8.8M, and the program total is rounded to USD 10M to avoid projecting unwarranted precision from estimates that individually carry ±15–25% uncertainty.
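The budget arithmetic described above can be checked directly. The short script below uses only the figures from the budget and phase tables; the category labels are abbreviated, and the variable names are chosen here for illustration.

```python
# Line items from the budget table, in USD (labels abbreviated).
line_items = {
    "Personnel": 3_100_000,
    "EMIS landscape assessment and integration": 1_050_000,
    "Health data precedent study": 400_000,
    "Infrastructure": 650_000,
    "Data governance framework development": 650_000,
    "Prototype development and deployment": 850_000,
    "Partner institution subgrants": 800_000,
    "Travel and convening": 400_000,
    "Program management and administration": 700_000,
    "Independent evaluation": 200_000,
}
contingency = 1_200_000
phases = {"Phase 1": 6_000_000, "Phase 2": 4_000_000}

subtotal = sum(line_items.values())   # bottom-up total before contingency
total = subtotal + contingency        # program total
emis_per_country = line_items["EMIS landscape assessment and integration"] // 6

# Contingency expressed against the program total and against the line items.
share_of_total = round(100 * contingency / total, 1)
share_of_subtotal = round(100 * contingency / subtotal, 1)

assert subtotal == 8_800_000
assert total == 10_000_000
assert sum(phases.values()) == total
assert emis_per_country == 175_000
assert share_of_total == 12.0
assert share_of_subtotal == 13.6
```

The contingency is therefore 12% of the USD 10M program total, which is equivalent to about 13.6% of the USD 8.8M in bottom-up line items.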
The program will be independently evaluated at the end of Phase 1 and Phase 2 by an external evaluator nominated by the Development Partner during Phase 1. Evaluation criteria include: architectural soundness and scalability, Malabo Convention compliance, Ministerial satisfaction, data quality and comparability across jurisdictions, and the practicality of the governance framework.
Phase 1 → Phase 2 gate (Month 12): The working prototype federates DPI-Ed-generated data from at least two pilot countries with demonstrated Malabo compliance. The prototype correctly ingests delayed batch uploads from offline-first front-end environments and propagates completeness metadata through to federation summaries. The data governance framework v0.1 is approved by participating Ministers of Education. The EMIS integration pathway is validated for at least one pilot country. If Ministerial approval is not obtained, the program pauses for up to 3 months to address governance concerns before re-evaluation.
The program provides quarterly progress reports to the Development Partner, including: milestones achieved, technical architecture decisions and rationale, governance framework development, Ministry engagement status, and risk register updates.
If the federated approach proves politically non-viable (Ministries refuse data-sharing agreements) or technically impractical (EMIS heterogeneity is too great to bridge), the program will still have produced three outputs with independent value: (a) a detailed survey of EMIS architectures across six countries, which the program expects to be the most comprehensive assessment of African education data systems conducted to date; (b) a comparative analysis of health and education data federation, identifying transferable and non-transferable patterns; (c) a Malabo-compliant data governance framework for education data, applicable to any future education data initiative. These outputs serve the broader DPI-Ed ecosystem regardless of the federation architecture's ultimate viability.
| Risk | Mitigation |
|---|---|
| Ministries refuse to participate in data sharing | Phase 1 starts with the two most willing pilot countries; governance framework is co-developed with Ministries, not imposed. The proposed AUDA-NEPAD EdTech Task Force will provide the coordination and trust-building function. Ministries control what data they share and at what granularity, and they can withdraw at any time. |
| EMIS systems too heterogeneous to integrate | Country-specific adapters accommodate diversity; the federation interface is standardized, not the national systems. Phase 1 targets the minimum viable integration for each pilot country, not full EMIS transformation. |
| DPI-Ed deployment delayed (insufficient learning data) | CRADLE's Phase 1 can proceed with EMIS-only federation (administrative data) while DPI-Ed-generated learning data comes online. The architecture is designed for both data streams; either can be federated independently. |
| Re-identification risk from federated education data | Privacy-by-design architecture with differential privacy, anonymization, and granularity thresholds. Mandatory re-identification risk assessment for every data-sharing agreement. Independent privacy review as part of the Phase 1 evaluation. |
| Political sensitivity of cross-country education comparison | Tiered access model with government-controlled release. Countries can embargo their data from specific comparisons. Continental views are aggregate-only by default. The governance framework includes explicit provisions for politically sensitive data. |
| Sustainability after research funding ends | Post-research institutional home and funding model defined during Phase 2 (Months 20–24). Three potential paths: AUDA-NEPAD operational budget, Africa CDC co-hosting, or dedicated AU-aligned institution, each with defined funding mechanisms. |
| Health data precedent does not transfer to education | The comparative analysis (Phase 1, Months 1–4) explicitly identifies transferable and non-transferable patterns before the architecture is designed. CRADLE is informed by health, not derived from it. |
| Data completeness degraded by intermittent connectivity | Architecture treats incomplete data as the normal operating condition, not an exception. Every aggregation carries completeness metadata. Dashboards and federation summaries display data recency and coverage alongside computed indicators. The Phase 1 prototype validates correct behaviour under simulated multi-day sync delays. |
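
The re-identification mitigation in the table above names differential privacy and granularity thresholds. The sketch below shows one common way these two ideas are combined for a single count: small-cell suppression followed by the Laplace mechanism. The function names, the `min_cell` threshold, and the `epsilon` value are illustrative assumptions, not CRADLE's calibrated parameters. The noise scale of `1 / epsilon` assumes each learner contributes to at most one count. Note also that the suppression decision itself depends on the true count, so it is not covered by the differential-privacy guarantee; a production design would need to address that.

```python
import random
from typing import Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def release_count(true_count: int, epsilon: float, min_cell: int,
                  rng: random.Random) -> Optional[float]:
    """Return a noisy count, or None if the cell is too small to release.

    min_cell is the granularity threshold (hypothetical parameter);
    epsilon is the privacy budget spent on this one count.
    """
    if true_count < min_cell:
        return None  # suppressed: too few learners in this cell
    # A counting query has sensitivity 1, so the Laplace scale is 1/epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

A smaller `epsilon` adds more noise and gives stronger protection, while a larger `min_cell` suppresses more cells. Calibrating both to jurisdiction size is the part the governance framework would have to decide.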
The downstream impact of a validated federated architecture — cross-jurisdictional research, RBF4Ed evidence amplification, PREMIER training data, sovereign AI infrastructure, adoption network effects, and continental education intelligence — is described in Essay 22, Section 7.
If Phase 1 achieves Ministerial approval and the Phase 2 validation succeeds across all six pilot countries, the immediate scaling path is: (a) extend to the 15 additional countries in V&P_Core's Phase 2 expansion (Years 3–4), at an estimated cost of approximately USD 200,000–400,000 per additional country for EMIS integration and governance framework adaptation; (b) extend to all 55 AU member states over 5–7 years; (c) explore integration with non-education continental data systems (health, civil registration) for cross-sector analysis. The total estimated cost to reach all AU member states is USD 15–25 million over 5–7 years, fundable through a combination of Development Partner follow-on grants, AfDB allocations, and RESPECT Ecosystem Fund contributions.
Three options for CRADLE's permanent institutional home — an AUDA-NEPAD operational unit, Africa CDC co-hosting, or a dedicated AU-aligned institution — are outlined in Essay 22, Section 8. The Phase 2 sustainability transition plan (Months 20–24) will evaluate all three against the programme's experience and recommend one, with defined criteria and a transition timeline. The AU Assembly's adoption of Decision Assembly/AU/Dec.973(XXXIX) provides the institutional basis for this evaluation.
Post-research funding is expected to come from three sources:
Estimated annual operating cost for the continental federated database, once established: USD 500,000–1,000,000, covering infrastructure, maintenance, governance administration, and a small operational team.
Research outputs will be disseminated through: (a) peer-reviewed publications in education data, health informatics, and data governance venues; (b) presentation at AUDA-NEPAD's education technology convenings and Africa CDC's data governance forums; (c) open-source release of all architecture specifications, governance frameworks, and integration tools on GitHub under Apache License 2.0; (d) a public-facing documentation site for participating Ministries and prospective adopter countries.
All research outputs — architecture specifications, governance frameworks, integration tools, and anonymization protocols — are released under the Apache License 2.0. Copyright is held jointly by AUDA-NEPAD and the contributing research institutions. National education data remains under sovereign national authority; CRADLE's governance framework defines the terms under which it is federated, not the terms under which it is owned.
CRADLE — the Continental Research Architecture for Data Linkage in Education — addresses the following AU provisions:
This proposal requests USD 10 million over 24 months to determine whether federated education data is technically achievable and politically viable at continental scale. The case for the attempt — and the architecture that multi-sovereignty requires — is established in Essay 22. This Plan specifies how that attempt will be executed: phased milestones with a go/no-go gate, a budget benchmarked against comparable infrastructure programmes, a governance structure that separates political authority from technical execution, and a sustainability pathway that does not depend on perpetual donor funding. If the programme succeeds, the validated architecture, governance framework, and working prototype will serve as the foundation for continental-scale education intelligence. If it does not, the EMIS landscape assessment, health-education comparative analysis, and Malabo-compliant governance framework will retain independent value for any future education data initiative.