CRADLE Database Project Plan (Draft)

CRADLE — Continental Research Architecture for Data Linkage in Education: A Research Proposal

Designing the Architecture for Africa's Federated Education Database

Proposed by: The Spix Foundation, for consideration by the African Union Development Agency (AUDA-NEPAD), in partnership with Africa CDC (institutional and technical guidance on continental data federation) and university research institutions in Africa

Duration: 24 months (two 12-month phases)

Requested funding: USD 10 million (staged across two phases with go/no-go gate)


1. Executive Summary

Africa has no continent-scale education database. Health has one — Africa CDC's Central Data Repository, launched in January 2026, federates surveillance, laboratory, and programme data from national health systems across the continent. Education has nothing comparable. Country-level education data remains siloed in incompatible national Education Management Information Systems (EMISs), inaccessible to cross-jurisdictional research, invisible to continental policy, and unusable for the evidence-based financing that Africa's education systems urgently require.

This proposal requests USD 10 million over 24 months to design, prototype, and validate the architecture, governance framework, and operational policies for federating education data across participating Member States, with AUDA-NEPAD providing institutional coordination and policy guidance. CRADLE (Continental Research Architecture for Data Linkage in Education) will take as its starting point the architectural patterns and governance lessons of Africa's continental health data infrastructure, adapting proven federated approaches to education's distinctive requirements: curricular sovereignty, linguistic diversity, and the integration of learning evidence with existing EMISs. CRADLE's funded scope delivers a validated architecture, governance framework, and working prototype across six pilot countries, designed to serve as a technical input into future continental DPI processes. Continental-scale operational federation will require a materially larger, multi-year envelope beyond this research program.

CRADLE will deliver: a working Malabo Convention-compliant prototype federating DPI-Ed-generated data across six pilot countries; a validated continent-scalable architecture specification; a data governance framework addressing anonymization, aggregation, sovereignty, and tiered access control; and peer-reviewed research. Each School Leader, whether a Ministry of Education or the leader of a private, NGO-based, or faith-based school system, retains sovereignty over its own data. CRADLE federates; it does not centralize.


2. The Problem: Africa's Education Data Fragmentation

2.1 The Intelligence Gap

Africa's education systems generate data — enrollment figures, examination results, teacher attendance, school infrastructure inventories — but this data remains trapped within national silos. Every African country maintains some form of Education Management Information System (EMIS), ranging from paper-based registers to partially digitized platforms. These systems were designed for national administrative purposes: reporting to Ministries, satisfying donor requirements, and tracking inputs (teachers hired, textbooks distributed, schools built).

What they were not designed for — and cannot currently support — is cross-jurisdictional analysis that reveals patterns invisible within any single country. Which teaching approaches produce measurably better outcomes across linguistically similar regions? How do learning trajectories differ between countries that adopted the same curriculum framework? What can a country struggling with numeracy learn from a neighboring country with similar demographics that achieved better results? These questions are unanswerable today because the data to answer them does not exist in any integrated, comparable form.

2.2 The Structural Barrier

The fragmentation is structural, not accidental. Africa's EMISs differ in data models, collection methods, indicator definitions, reporting frequencies, and technical platforms. Some countries use DHIS2 (originally a health platform, increasingly adapted for education). Others use proprietary systems built by different vendors under different donor-funded projects. Some still rely on paper-based data collection with periodic digitization. There is no shared data standard for education across the continent, no common indicator framework, and no institutional mechanism for cross-border data comparison.

This is the same problem that health faced before DHIS2 achieved continental adoption — and that health is still working to solve at the federation level, as evidenced by Africa CDC's Central Data Repository (launched January 2026) and the Continental Health Data Governance Framework (set for AU endorsement in February 2026). Education is a generation behind health in data infrastructure maturity.

2.3 The Opportunity: Africa's DPI-Ed Changes the Data Landscape

Africa's Digital Public Infrastructure for Education (DPI-Ed) fundamentally changes the education data landscape. Unlike legacy EMIS systems — which are administrative databases recording what was provided (inputs) — DPI-Ed generates continuous, curriculum-aligned evidence of what was learned (outcomes). The RESPECT platform, as the first reference implementation of Africa's DPI-Ed, produces standardized learning data by default: every interaction between a learner and a RESPECT Compatible App generates curriculum-aligned evidence through standardized interfaces.

This means that for the first time, comparable learning data will exist across the countries deploying Africa's DPI-Ed. But "exists" is not "is federated." Without deliberate architectural design, this data will fragment along the same national boundaries as legacy EMIS data. CRADLE's purpose is to ensure that the education data generated by DPI-Ed is federable from the outset — that the architecture supports continent-scale analysis while preserving the national sovereignty that makes participation possible.

2.4 Why Now

Five developments have converged to make CRADLE feasible and urgent:

Africa CDC has proven continent-scale data federation. The Central Data Repository, launched in January 2026 after 15 months of feasibility study and prototype validation, demonstrates that federated data infrastructure is technically achievable and politically viable at continental scale. The accompanying Continental Health Data Governance Framework provides an institutional template. Education can learn from health's architecture and governance — and from its mistakes.

Africa's DPI-Ed is entering pilot deployment. V&P_Core's six pilot countries will generate standardized learning data through RESPECT within the CRADLE program's timeline. CRADLE must be designed alongside — not after — this deployment, so that federation is built in rather than bolted on.

The Malabo Convention has entered into force. The African Union Convention on Cyber Security and Personal Data Protection entered into force on June 8, 2023, following its 15th ratification. It provides the continental legal framework for data protection, cross-border data flows, and privacy that CRADLE's governance framework must implement. Fifteen countries have ratified it; many others have enacted national data protection laws aligned with its principles.

Results-Based Finance requires cross-jurisdictional evidence. RBF4Ed (Essay 26, Section 4.2; see also Essay 7, "Making Education Outcomes Finance-Grade") depends on auditable, comparable outcome evidence. CRADLE provides the continental intelligence layer that makes RBF4Ed evidence considerably more valuable, because outcomes can be benchmarked across jurisdictions rather than measured in isolation.

AUDA-NEPAD has launched the final African EdTech 2030: Vision & Plan and the Policy Framework for Standards-based, Vendor-Neutral EdTech. Both instruments call for Africa's DPI-Ed. The V&P's transition from draft (July 2025) to final AUDA-NEPAD-approved form (February 2026) — through a public stakeholder consultation process — signals institutional readiness for the education data architecture that CRADLE will design.


3. The Proposed Solution: A Federated Education Database

3.1 The Core Design Principle: Federate, Don't Centralize

CRADLE's foundational design principle is federation, not centralization. Each School Leader, whether a Ministry of Education or the leader of a private, NGO-based, or faith-based school system, retains sovereignty over its own data. CRADLE defines the interfaces, protocols, and governance through which sovereign data holders choose to share specified data streams, at specified granularities, under specified conditions. This federated architecture also provides the sovereign data infrastructure on which African AI models for education will be trained: continent-scale learning data under African governance, available for research and AI development without extraction (see Essay 12: AI in Africa's DPI-Ed).

This is the same architectural principle that governs DHIS2's most successful deployments and Africa CDC's Central Data Repository: countries contribute data to a continental view while retaining full ownership of and access to their own data. The alternative — a centralized database where data is extracted from national systems and stored continentally — is politically non-viable, technically fragile, and ethically problematic.
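One way to picture "specified data streams, at specified granularities, under specified conditions" is as a per-holder sharing policy that the federation layer consults before any data leaves the holder's systems. The sketch below is illustrative only: the SharingPolicy class, the Granularity levels, and the stream names are assumptions made for this example, not part of any CRADLE or RESPECT specification.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Granularity(IntEnum):
    """Hypothetical granularity levels, coarsest first."""
    NATIONAL = 1
    REGION = 2
    DISTRICT = 3
    SCHOOL = 4


@dataclass
class SharingPolicy:
    """What one sovereign data holder has agreed to share (illustrative)."""
    holder: str
    # Maps a stream name to the finest granularity the holder permits.
    streams: dict[str, Granularity] = field(default_factory=dict)
    withdrawn: bool = False  # a holder can withdraw from sharing

    def permits(self, stream: str, requested: Granularity) -> bool:
        """True only if the stream is shared and the request is no finer than allowed."""
        if self.withdrawn or stream not in self.streams:
            return False
        return requested <= self.streams[stream]


# Hypothetical holder that shares one stream down to district level.
policy = SharingPolicy("Ministry A", {"numeracy_outcomes": Granularity.DISTRICT})
print(policy.permits("numeracy_outcomes", Granularity.REGION))  # coarser than the limit
print(policy.permits("numeracy_outcomes", Granularity.SCHOOL))  # finer than the limit
print(policy.permits("teacher_records", Granularity.NATIONAL))  # stream not shared
```

The design choice worth noting is that the default answer is refusal: a stream that the holder has not listed, or any request after withdrawal, is never shared.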

3.2 The Health Data Precedent

Africa's health data infrastructure provides the closest precedent for what CRADLE must build. The parallels and differences are instructive:

What health got right — and CRADLE will adopt:

What health got wrong — and CRADLE will address:

3.3 Two Data Streams, Two Governance Regimes

Africa's DPI-Ed (see Essay 15, Section 6) sends identical data into two separate data streams with entirely different functions, both of which CRADLE must accommodate:

Stream 1 — Educational data for policy, research, administration, and system learning. This data is to be federated continentally under CRADLE's governance framework. Researchers gain cross-jurisdictional datasets. App developers identify performance variations across contexts. Educators discover continent-wide patterns. School Leaders benchmark against peers (perhaps under NDA). This stream is the primary focus of CRADLE.

Stream 2 — Finance-grade outcome data that moves through the GEOS Data Pipeline for Results-Based Finance purposes. This data has stricter integrity requirements (it is used for financial disbursement) and is expected to terminate at the country level — i.e., not be federated continentally. CRADLE's architecture must cleanly separate these two streams, ensuring that the educational data federation does not compromise the integrity of the finance-grade pipeline, and that the finance-grade pipeline does not constrain the research and policy value of the educational data stream.
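The separation of the two streams can be sketched as a fan-out step followed by a federation check. This is a minimal illustration under stated assumptions: the Envelope type, the fan_out and may_federate functions, and the payload fields are hypothetical names, and the actual separation mechanism is a question for the research program.

```python
from dataclasses import dataclass
from enum import Enum


class Stream(Enum):
    EDUCATIONAL = "stream_1_educational"      # federated continentally
    FINANCE_GRADE = "stream_2_finance_grade"  # terminates at country level


@dataclass(frozen=True)
class Envelope:
    """One copy of a record, tagged with the stream it travels in (illustrative)."""
    stream: Stream
    country: str
    payload: dict


def fan_out(country: str, payload: dict) -> list[Envelope]:
    """Copy one DPI-Ed record into both streams as independent envelopes."""
    return [Envelope(stream, country, dict(payload)) for stream in Stream]


def may_federate(envelope: Envelope) -> bool:
    """Only Stream 1 is eligible to cross into the continental federation."""
    return envelope.stream is Stream.EDUCATIONAL


# Hypothetical record; field names are placeholders.
envelopes = fan_out("KE", {"item": "numeracy-01", "correct": True})
for env in envelopes:
    print(env.stream.value, may_federate(env))
```

Because each envelope carries its own copy of the payload, later processing in one stream cannot alter what the other stream holds.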

3.4 The EMIS Integration Challenge

Every country that participates in CRADLE will already have an EMIS: a system that is entrenched, politically embedded, and operationally essential regardless of its technical sophistication. CRADLE must interoperate with these systems, not replace them. This is the "post-entrenchment landscape" that BEINGS' EMIS Interoperability building block specification is designed to address (see BEINGS, Essay 26, Section 4.8; and Essay 3, Section 7).

CRADLE's research must determine: In addition to DPI-Ed-generated data, what data from existing EMISs should flow into the federated database? At what granularity? Through what interfaces? How should EMIS-originated administrative data (enrollments, teacher records, school infrastructure) be linked with DPI-Ed-originated learning data (curriculum-aligned assessment signals)? What governance prevents this linkage from creating surveillance risk?

3.5 Architecture Overview

CRADLE's technical architecture will be defined through the research program. The following constraints are design inputs, not predetermined solutions:


4. Positioning Within the Breakthrough Ecosystem

4.1 The Breakthrough System and Data Federation

CRADLE is one component within a larger system. The African EdTech Breakthrough System addresses four structural barriers to EdTech deployment in Africa: Policy, Technology, Data, and Economics (see Essay 7, "Making Education Outcomes Finance-Grade"). CRADLE addresses the Data Barrier directly: it will design and validate the architecture for a continental intelligence layer that will transform country-level pilot data into continent-wide evidence.

Africa's DPI-Ed generates the data. CRADLE federates it. The GEOS Organization certifies it for finance. RBF4Ed converts certified evidence into funding flows. Each component is loosely coupled to the others; CRADLE provides the connective tissue.

4.2 Dependencies and Synergies

CRADLE depends on:
- V&P_Core (Essay 26, Section 3), deploying RESPECT in six pilot countries and generating the standardized learning data that CRADLE federates.
- BEINGS (Essay 26, Section 4.8), developing the EMIS Interoperability building block specification that standardizes how EMIS data flows into the federated database. (CRADLE must function with existing EMIS interfaces during Phase 1; BEINGS provides the long-term standard.)

CRADLE amplifies:
- RBF4Ed: Cross-jurisdictional outcome evidence is considerably more valuable for results-based financing than single-country data. CRADLE enables comparative benchmarking that strengthens the evidentiary case for RBF4Ed disbursements.
- ECM: Researchers using CRADLE's federated data can validate ECM's Curriculum IR mappings across jurisdictions: does the IR-mediated alignment produce consistent results across different countries' curricula?
- V&P_Core: CRADLE strengthens the value proposition for prospective RESPECT adopters: each new country that joins the federation adds to the continental intelligence, creating a network effect that accelerates adoption.
- PREMIER Institute: Cross-jurisdictional data enables the PREMIER Institute's "Easy X" research projects to validate their results across diverse deployment contexts.

4.3 Data Flow

DPI-Ed-generated learning data flows from RESPECT Compatible Apps through the RESPECT platform's standardized data interfaces to School System-designated databases. At the country level, this data is available to the national School Leader (Ministry of Education or equivalent) through their sovereign data access. CRADLE adds a federation layer: specified data streams, at specified granularities, are shared with the continental federated database under governance rules defined by the data-sharing agreements negotiated during Phase 1. Researchers, policymakers, and authorized users access the federated view through tiered access controls. Learner-level PII never leaves the national boundary; federated data is anonymized or aggregated to levels defined by the governance framework.
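The rule that learner-level PII stays inside the national boundary while only aggregated data is federated can be illustrated with a small aggregation step run at the country level. The sketch below is an assumption-laden example: the record schema (learner_id, district, score), the function name aggregate_for_federation, and the minimum cell size of 10 are all hypothetical, since the real thresholds would be set by the governance framework.

```python
from collections import defaultdict

# Assumed threshold for illustration; not a value taken from the proposal.
MIN_CELL_SIZE = 10


def aggregate_for_federation(records, min_cell_size=MIN_CELL_SIZE):
    """Aggregate learner-level records to district level and suppress small cells.

    Each input record is a dict with 'learner_id', 'district', and 'score'
    (a hypothetical schema). The output carries no learner identifiers.
    """
    scores_by_district = defaultdict(list)
    for record in records:
        scores_by_district[record["district"]].append(record["score"])

    federated = []
    for district, scores in sorted(scores_by_district.items()):
        if len(scores) < min_cell_size:
            continue  # suppressed: too few learners to publish safely
        federated.append({
            "district": district,
            "n": len(scores),
            "mean_score": sum(scores) / len(scores),
        })
    return federated


# Synthetic data: 12 learners in district A, 3 in district B.
national_records = (
    [{"learner_id": f"a{i}", "district": "A", "score": 50} for i in range(12)]
    + [{"learner_id": f"b{i}", "district": "B", "score": 70} for i in range(3)]
)
print(aggregate_for_federation(national_records))
```

In this synthetic run, district B is dropped from the federated view because it has fewer learners than the assumed threshold, and district A appears only as a count and a mean.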

4.4 Governance Architecture: Separation of Roles

CRADLE follows the Breakthrough System's loosely coupled governance model (see Essay 7, Section 5):

4.5 Theory of Change

If this program produces a validated federated education data architecture and governance framework (outputs), then participating countries can share specified education data streams under sovereign governance (immediate outcome), which enables cross-jurisdictional research, comparative benchmarking, and continental education intelligence (intermediate outcome), which strengthens the evidentiary base for Results-Based Finance for Education and evidence-based policy across the continent (long-term impact).


5. Literature-Informed Design Principles

A review of the health data federation, data sovereignty, and federated learning literatures identifies five substantive challenges that CRADLE's design must address. Each challenge has been incorporated as a design requirement:

Challenge | Design Response
Re-identification risk (anonymized education data can be re-identified through linkage with other datasets, especially in small jurisdictions) | Multi-level anonymization with differential privacy guarantees; granularity thresholds calibrated to jurisdiction size; re-identification risk assessment as a mandatory component of every data-sharing agreement
Indicator harmonization (education systems define learning outcomes, grade levels, and subject areas differently across jurisdictions) | Curriculum IR-mediated harmonization (leveraging ECM's Curriculum Intermediate Representation) for learning data; EMIS-specific indicator mapping for administrative data; explicit documentation of non-comparable indicators
Political sensitivity (education data is politically charged: literacy rates, dropout rates, and learning outcomes are national performance indicators that governments may resist sharing) | Tiered access model with government-controlled release; aggregate-only continental views unless countries opt into finer-grained sharing; explicit governance provisions for data embargo periods around national elections or policy reviews
EMIS heterogeneity (national EMISs vary from paper-based to fully digitized, with incompatible data models) | Adapter-based integration following the thick-core/thin-adapter pattern described in Essay 3, Section 5; CRADLE defines the federation interface, not the EMIS implementation; adapters built per-country for Phase 1 pilot systems
Sustainability after research funding ends (health data systems in Africa have repeatedly been built with project funding and then degraded when funding ended) | Sustainability model designed from the outset: RESPECT Ecosystem Fund contribution (see Essay 24), EdTech Task Force coordination levy, and explicit transition plan to permanent AU-aligned institution
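The re-identification row above names differential privacy as one safeguard. As a minimal sketch of what that could mean for a single published count, the Laplace mechanism below adds noise scaled to 1/epsilon. The function name dp_count and the epsilon value are assumptions for illustration; the proposal does not fix a mechanism or a privacy budget.

```python
import random


def dp_count(true_count, epsilon, rng=None):
    """Return a count with Laplace noise, giving epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one learner changes
    the count by at most 1), so the Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = 1.0 / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Illustrative use: a district-level count released with an assumed epsilon of 1.0.
print(dp_count(true_count=240, epsilon=1.0, rng=random.Random(7)))
```

Smaller epsilon values give stronger privacy and noisier counts, which is one reason granularity thresholds would need to be calibrated to jurisdiction size: a noisy count is far more distorting for a cell of 20 learners than for one of 20,000.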

6. Research Goals and Milestones

Program Timeline: 24 Months (2 Phases)

The 24-month timeline is anchored to the Breakthrough System's funding tranches. CRADLE must deliver a working prototype before the Phase 1 → Phase 2 funding gate for V&P_Core, demonstrating that federated education data is achievable across the pilot countries.

Phase 1: Architecture Design and Prototype (Months 1–12) — USD 6 million

Goal: Design the federated education database architecture, develop a working Malabo-compliant prototype federating DPI-Ed-generated data across V&P_Core's pilot countries, and establish the governance framework.

Phase 1 requires V&P_Core to have begun deploying RESPECT in at least two pilot countries, generating learning data through standardized interfaces.

Milestones:

Phase 2: Validation and Scaling Preparation (Months 13–24) — USD 4 million

Goal: Validate the architecture across all six pilot countries, stress-test the governance framework, prepare the scaling pathway, and execute the sustainability transition.

Phase 2 requires the following Phase 1 deliverables as inputs: Comparative health data analysis; EMIS landscape assessment; data governance framework v0.1; architecture specification v0.1; working prototype with Ministerial approval.

Milestones:


7. Alignment with Health Data Precedent: Build, Don't Replicate

CRADLE's strategy is to learn from health's experience, not to replicate its infrastructure. The following table maps health data precedents to CRADLE's design:

Health Precedent | CRADLE Application | Key Difference
DHIS2 (organizational hierarchy, indicator management, aggregate reporting) | Organizational hierarchy adapted for education (nation → region → district → school); indicator framework for education-specific metrics | CRADLE receives born-digital data from DPI-Ed, not manually entered facility data; data quality at source is structurally higher
Africa CDC CDR (continental federation, sovereignty-preserving, interoperable platform) | Direct architectural model for CRADLE's federation layer; governance template for data-sharing agreements | Education data includes curriculum-aligned learning evidence, a data type that health does not have; requires ECM-mediated harmonization
Continental Health Data Governance Framework (cross-border data sharing, Malabo compliance, harmonized governance) | Template for CRADLE's education data governance framework; institutional pathway for AU endorsement | Education data is more politically sensitive than routine health surveillance: learning outcomes are national performance indicators
HISP Network (distributed implementation, local capacity building, open-source community) | RESPECT Certified Partners provide the distributed implementation capacity, cross-trained from HISP Network members (Essay 11, Section 4.1) | Education-specific training required; RESPECT's ecosystem governance differs from DHIS2's open-source governance

8. Natural Development Partner Profile

CRADLE aligns with Development Partners investing in digital public infrastructure and data governance at continental scale:

The Bill & Melinda Gates Foundation — DPI Program (USD 200M+ commitment, announced September 2022). CRADLE is education DPI at the data layer. The foundation's existing investments in MOSIP (digital identity) and Mojaloop (digital payments) establish the pattern; CRADLE adds the education data federation layer. The foundation's Global Education Program (USD 240M+ commitment, announced April 2025) has a direct interest in cross-jurisdictional evidence of digital education effectiveness.

The African Development Bank — Digital Development Programs. The AfDB has funded continent-scale data infrastructure in multiple sectors. CRADLE aligns with its digital transformation strategy and its education investment portfolio.

The World Bank — Digital Development / GovTech Programs. The World Bank's investments in government digital transformation and education data systems create a natural fit. CRADLE provides the continental layer that connects World Bank-funded national EMIS investments.

The Global Fund. As the primary financial backer of Africa CDC's Central Data Repository, the Global Fund has demonstrated commitment to continent-scale data federation in Africa. A parallel investment in education data federation would leverage the institutional and technical patterns the CDR has established.


9. Competitive Landscape

No existing system provides federated education data at continental scale. The closest approaches are:

CRADLE will design and validate the architecture for a layer that does not currently exist: continental federation infrastructure connecting national education data systems — both legacy EMISs and DPI-Ed-generated learning data — through a single, sovereignty-preserving, Malabo-compliant architecture. The research program will produce a working prototype across six pilot countries; scaling to continental operations will require subsequent investment and formal AU governance decisions.


10. Principal Investigator Profile and Team

10.1 Required Expertise

Three domains of expertise define the PI requirements for CRADLE:

Domain 1 — Federated Data Architecture. Deep expertise in designing, implementing, and governing federated data systems at scale. The PI must understand the architectural tradeoffs between centralized and federated models, the technical mechanisms of privacy-preserving data sharing (differential privacy, secure multi-party computation, federated analytics), and the practical challenges of data integration across heterogeneous systems. Direct experience with DHIS2, health data federation, or comparable multi-country data infrastructure is strongly preferred.

Domain 2 — African Education Data Systems. Working knowledge of African EMIS architectures, the differences among them, and the institutional landscape (Ministries of Education, AUDA-NEPAD, Regional Economic Communities). This expertise may reside in a co-PI or senior research partner.

Domain 3 — Data Governance and Privacy. Expertise in data protection frameworks, particularly the Malabo Convention and its interaction with national data protection laws across African jurisdictions. Understanding of anonymization techniques, re-identification risks, and ethical frameworks for education data. Legal expertise in cross-border data-sharing agreements.

10.2 Proposed Principal Investigators

10.3 Project Management and Software Development: The Spix Foundation

The Spix Foundation provides the engineering and project management capacity that bridges research and deployment, integrating CRADLE's federation layer with the RESPECT platform's data interfaces and ensuring that the prototype is built on production-ready infrastructure rather than academic proof-of-concept code. (See ECM Research Proposal, Section 10.3, for organizational details.)

10.4 Africa CDC Technical Advisory Role

Africa CDC will be invited to serve in a technical advisory capacity, sharing architectural patterns, governance frameworks, and operational lessons from the Central Data Repository. This is not a co-implementation role — Africa CDC's mission is health, not education — but an advisory relationship that accelerates CRADLE's design by leveraging health's hard-won experience. The possibility of initially housing CRADLE's continental data unit within Africa CDC infrastructure is a Phase 1 research question, not a predetermined decision.


11. Institutional Partners

11.1 Lead Institution: AUDA-NEPAD

AUDA-NEPAD serves as the program's institutional home, providing:

11.2 Research Partners

Institution | Country/Region | Contribution
University of Oslo / HISP Center | Norway (with African partners) | DHIS2 architecture expertise; health data federation experience; HISP Network mobilization for cross-training
Africa CDC | Continental | Technical advisory on Central Data Repository architecture and Continental Health Data Governance Framework
[African university partner: data systems] | TBD | EMIS landscape assessment; national data system expertise; pilot country engagement
[African university partner: data governance] | TBD | Malabo Convention expertise; cross-border data governance; legal framework development
Ministries of Education (6 pilot countries) | Kenya, Liberia, Eswatini, + 3 TBD | EMIS access; data-sharing agreement negotiation; Ministerial approval
The Spix Foundation | United States | Project management, software development, RESPECT platform integration

12. Budget Framework

12.1 Summary

Category | Amount (USD)
Personnel (PI team, researchers, data engineers, governance specialists) | 3,100,000
EMIS landscape assessment and integration (6 countries) | 1,050,000
Health data precedent study (Africa CDC CDR, DHIS2, governance framework) | 400,000
Infrastructure (cloud computing, database systems, security, monitoring) | 650,000
Data governance framework development (legal, policy, Malabo compliance) | 650,000
Prototype development and deployment | 850,000
Partner institution subgrants (African universities) | 800,000
Travel and convening (Ministry engagement, workshops, Africa CDC advisory) | 400,000
Program management and administration (AUDA-NEPAD + Spix Foundation) | 700,000
Independent evaluation (external evaluator, 2 assessments) | 200,000
Contingency (~12%) | 1,200,000
Total | 10,000,000

12.2 Budget by Phase

Phase | Duration | Amount (USD) | Key Activities
Phase 1: Architecture + Prototype | Months 1–12 | 6,000,000 | Health precedent study, EMIS assessment, governance framework v0.1, architecture v0.1, working prototype
Phase 2: Validation + Scaling Prep | Months 13–24 | 4,000,000 | Full 6-country deployment, architecture v1.0, governance v1.0, research validation, sustainability plan

Funding is structured as staged commitments with a go/no-go gate between phases (see Section 13).

12.3 Budget Rationale

The personnel budget assumes approximately 4 FTE researchers, 2 FTE data governance specialists, and 4 FTE software engineers across all partner institutions over 24 months, with staffing levels varying by phase. Loaded costs are blended across institutions: senior PIs at USD 180–220K per year (University of Oslo / HISP ecosystem), African-based researchers at USD 80–120K, and Spix Foundation engineers at USD 120–150K.

The EMIS integration budget allocates approximately USD 175,000 per pilot country for landscape assessment, adapter development, testing, and Ministry liaison. This reflects the heterogeneity of existing systems: some pilot countries may use DHIS2-based EMISs (lower integration cost), while others may use proprietary or paper-based systems requiring more extensive adapter work.

The data governance framework budget reflects the legal complexity of cross-border education data federation across six sovereign jurisdictions. Each country has a different national data protection regime that must be analyzed for Malabo Convention compatibility, and each data-sharing agreement must be negotiated individually with the relevant Ministry of Education. International legal work at this scope — six jurisdictions, each requiring local counsel plus a continental-level legal architect — justifies the allocation.

The budget is informed by comparable initiatives. Africa CDC's Central Data Repository required a 15-month feasibility study and prototype validation before its January 2026 launch, funded by the Global Fund; its cost is not publicly available, but comparable continent-scale data infrastructure programs (MOSIP's initial development phase at USD 8M actual, Rwanda's national eHealth plan at USD 32M, the SMART on FHIR ecosystem at USD 15M in ONC funding) suggest that USD 10M for a 24-month research and prototype program across six countries is within the expected range for this class of work. CRADLE's timeline is more aggressive than the CDR's, justified by three factors: (a) the federated architecture pattern is proven in health and can be adapted rather than invented; (b) DPI-Ed-generated data is born digital and standardized, eliminating the data collection and digitization challenges that dominate health data projects; (c) the Spix Foundation's engineering team provides implementation capacity from the outset, avoiding the procurement delays typical of multi-partner research programs.

The contingency allocation reflects the inherent uncertainty in multi-country research programs involving cross-border coordination, legal negotiation, and integration with heterogeneous legacy systems. It includes a rounding margin: the bottom-up line items total USD 8.8M, and the program total is rounded to USD 10M to avoid projecting unwarranted precision from estimates that individually carry ±15–25% uncertainty.
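The arithmetic behind the budget summary and this rationale can be checked directly. The short script below uses only figures stated in Sections 12.1 to 12.3; the variable names and abbreviated category labels are illustrative.

```python
# Bottom-up line items from Section 12.1 (USD), excluding contingency.
line_items = {
    "Personnel": 3_100_000,
    "EMIS landscape assessment and integration": 1_050_000,
    "Health data precedent study": 400_000,
    "Infrastructure": 650_000,
    "Data governance framework development": 650_000,
    "Prototype development and deployment": 850_000,
    "Partner institution subgrants": 800_000,
    "Travel and convening": 400_000,
    "Program management and administration": 700_000,
    "Independent evaluation": 200_000,
}
contingency = 1_200_000
phases = {"Phase 1": 6_000_000, "Phase 2": 4_000_000}
pilot_countries = 6

subtotal = sum(line_items.values())
total = subtotal + contingency

print(f"Line-item subtotal: {subtotal:,}")            # 8,800,000
print(f"Total with contingency: {total:,}")           # 10,000,000
print(f"Phase total: {sum(phases.values()):,}")       # 10,000,000
print(f"Contingency share of total: {100 * contingency / total:.1f}%")        # 12.0%
print(f"Contingency share of subtotal: {100 * contingency / subtotal:.1f}%")  # 13.6%
print(f"EMIS budget per pilot country: {line_items['EMIS landscape assessment and integration'] // pilot_countries:,}")  # 175,000
```

The "~12%" contingency figure in the summary table is therefore a share of the USD 10M total; measured against the USD 8.8M of line items it is closer to 13.6%.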


13. Evaluation and Go/No-Go Criteria

13.1 Independent Evaluation

The program will be independently evaluated at the end of Phase 1 and Phase 2 by an external evaluator nominated by the Development Partner during Phase 1. Evaluation criteria include: architectural soundness and scalability, Malabo Convention compliance, Ministerial satisfaction, data quality and comparability across jurisdictions, and the practicality of the governance framework.

13.2 Go/No-Go Gate

Phase 1 → Phase 2 gate (Month 12): The working prototype federates DPI-Ed-generated data from at least two pilot countries with demonstrated Malabo compliance. The data governance framework v0.1 is approved by participating Ministers of Education. The EMIS integration pathway is validated for at least one pilot country. If Ministerial approval is not obtained, the program pauses for up to 3 months to address governance concerns before re-evaluation.

13.3 Reporting

The program provides quarterly progress reports to the Development Partner, including: milestones achieved, technical architecture decisions and rationale, governance framework development, Ministry engagement status, and risk register updates.

13.4 What If the Federated Architecture Proves Non-Viable?

If the federated approach proves politically non-viable (Ministries refuse data-sharing agreements) or technically impractical (EMIS heterogeneity is too great to bridge), the program will have produced three outputs with independent value: (a) the most comprehensive assessment of African education data systems ever conducted — a detailed survey of EMIS architectures across six countries; (b) a comparative analysis of health and education data federation, identifying transferable and non-transferable patterns; (c) a Malabo-compliant data governance framework for education data, applicable to any future education data initiative. These outputs serve the broader DPI-Ed ecosystem regardless of the federation architecture's ultimate viability.


14. Risk Mitigation

Risk: Ministries refuse to participate in data sharing.
Mitigation: Phase 1 starts with the two most willing pilot countries; governance framework is co-developed with Ministries, not imposed. The proposed AUDA-NEPAD EdTech Task Force will provide the coordination and trust-building function. Ministries control what data they share, at what granularity, and can withdraw at any time.

Risk: EMIS systems too heterogeneous to integrate.
Mitigation: Country-specific adapters accommodate diversity; the federation interface is standardized, not the national systems. Phase 1 targets the minimum viable integration for each pilot country, not full EMIS transformation.

Risk: DPI-Ed deployment delayed (insufficient learning data).
Mitigation: CRADLE's Phase 1 can proceed with EMIS-only federation (administrative data) while DPI-Ed-generated learning data comes online. The architecture is designed for both data streams; either can be federated independently.

Risk: Re-identification risk from federated education data.
Mitigation: Privacy-by-design architecture with differential privacy, anonymization, and granularity thresholds. Mandatory re-identification risk assessment for every data-sharing agreement. Independent privacy review as part of the Phase 1 evaluation.

Risk: Political sensitivity of cross-country education comparison.
Mitigation: Tiered access model with government-controlled release. Countries can embargo their data from specific comparisons. Continental views are aggregate-only by default. The governance framework includes explicit provisions for politically sensitive data.

Risk: Sustainability after research funding ends.
Mitigation: Post-research institutional home and funding model defined during Phase 2 (Months 20–24). Three potential paths: AUDA-NEPAD operational budget, Africa CDC co-hosting, or dedicated AU-aligned institution, each with defined funding mechanisms.

Risk: Health data precedent does not transfer to education.
Mitigation: The comparative analysis (Phase 1, Months 1–4) explicitly identifies transferable and non-transferable patterns before the architecture is designed. CRADLE is informed by health, not derived from it.
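The re-identification mitigation listed above names granularity thresholds and differential privacy. As a minimal sketch of how the two might combine for aggregate counts, the Python below suppresses small groups and adds Laplace noise to the rest. The threshold of 10, the epsilon of 1.0, and all function and group names are hypothetical; the proposal does not specify parameters or mechanisms.

```python
import random

MIN_CELL_SIZE = 10  # hypothetical granularity threshold
EPSILON = 1.0       # hypothetical privacy budget per released count


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def release_counts(counts, min_cell_size=MIN_CELL_SIZE, epsilon=EPSILON, rng=None):
    """Return a releasable copy of per-group counts.

    Groups below the threshold are suppressed (None); the others receive
    Laplace noise with scale 1/epsilon, the standard calibration for a
    counting query with sensitivity 1.
    """
    rng = rng or random.Random()
    released = {}
    for group, n in counts.items():
        if n < min_cell_size:
            released[group] = None  # too few learners to publish safely
        else:
            noisy = n + laplace_noise(1.0 / epsilon, rng)
            released[group] = max(0, round(noisy))
    return released


# Illustrative, made-up counts of learners per district and grade.
example = {"district_a_grade_4": 412, "district_b_grade_4": 7}
print(release_counts(example, rng=random.Random(0)))
```

Note that suppressing on the true count is not itself differentially private, so a production design would need to handle the threshold step more carefully; this sketch only shows where each control sits in the release path.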

15. Expected Outcomes and Impact

15.1 Direct Outputs

15.2 Downstream Impact

If CRADLE achieves a validated, continent-scalable architecture across 6 pilot countries, it will:

15.3 If the Program Exceeds Expectations

If Phase 1 achieves Ministerial approval and the Phase 2 validation succeeds across all 6 countries, the immediate scaling path is: (a) extend to the 15 additional countries in V&P_Core's Phase 2 expansion (Years 3–4), at an estimated cost of approximately USD 200,000–400,000 per additional country for EMIS integration and governance framework adaptation; (b) extend to all 55 AU member states over 5–7 years; (c) explore integration with non-education continental data systems (health, civil registration) for cross-sector analysis. The total estimated cost to reach all AU member states is USD 15–25 million over 5–7 years, fundable through a combination of Development Partner follow-on grants, AfDB allocations, and RESPECT Ecosystem Fund contributions.
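The per-country figure above can be checked with simple arithmetic. The Python sketch below multiplies the stated USD 200,000–400,000 range by the number of countries at each step. Applying that range to every remaining AU member state is an assumption made here; the proposal states it only for the 15 expansion countries.

```python
PER_COUNTRY_USD = (200_000, 400_000)  # stated range per additional country
AU_MEMBER_STATES = 55
PILOT_COUNTRIES = 6
EXPANSION_COUNTRIES = 15


def cost_range(countries: int) -> tuple[int, int]:
    """Low and high integration cost for a given number of added countries."""
    low, high = PER_COUNTRY_USD
    return countries * low, countries * high


# Step (a): the 15 expansion countries.
expansion = cost_range(EXPANSION_COUNTRIES)

# Steps (a) and (b) together: every member state beyond the six pilots.
all_remaining = cost_range(AU_MEMBER_STATES - PILOT_COUNTRIES)

print(f"15 expansion countries: USD {expansion[0]:,} to {expansion[1]:,}")
print(f"All 49 remaining states: USD {all_remaining[0]:,} to {all_remaining[1]:,}")
```

Under these assumptions the per-country product comes to USD 3–6 million for the expansion step and USD 9.8–19.6 million for all 49 remaining states. The stated total of USD 15–25 million is higher at both ends, which suggests it includes costs beyond per-country integration; the proposal does not itemize them and the sketch does not model them.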


16. Sustainability and Scaling

16.1 Post-Research Institutional Home

Three options for CRADLE's permanent institutional home will be evaluated during Phase 2. The choice among these options will be informed by formal AU governance decisions on DPI that are expected during the CRADLE program's timeline. The research program will design the transition plan to be compatible with whichever institutional pathway the AU endorses.

Option A — AUDA-NEPAD operational unit. CRADLE becomes a permanent education data unit within AUDA-NEPAD's operations, alongside its existing statistical and data functions. Advantage: direct alignment with AUDA-NEPAD's continental coordination role. Risk: AUDA-NEPAD's operational capacity may be insufficient for a technical data platform. This option is contingent on formal AU decisions regarding DPI governance.

Option B — Africa CDC co-hosting. CRADLE's continental data infrastructure is initially housed within Africa CDC, leveraging its existing Central Data Repository infrastructure, security posture, and institutional LeTS (as proposed in Essay 15, Section 6). Advantage: proven infrastructure and governance. Risk: education is outside Africa CDC's core mandate, raising long-term concerns about mission drift.

Option C — Dedicated AU-aligned institution. A new continent-scale education data institution (provisionally designated "AU-IPED" — AU Institute for Pan-African Education Data, or similar) is established as conditions permit. Advantage: dedicated mission and governance. Risk: institution-building is slow and expensive; may not be viable in the medium term.

The Phase 2 sustainability transition plan will recommend one option based on the program's experience, with defined criteria and a transition timeline.

16.2 Funding Mechanism

Post-research funding is expected to come from three sources:

Estimated annual operating cost for the continental federated database, once established: USD 500,000–1,000,000, covering infrastructure, maintenance, governance administration, and a small operational team.

16.3 Dissemination

Research outputs will be disseminated through: (a) peer-reviewed publications in education data, health informatics, and data governance venues; (b) presentation at AUDA-NEPAD's education technology convenings and Africa CDC's data governance forums; (c) open-source release of all architecture specifications, governance frameworks, and integration tools on GitHub under Apache License 2.0; (d) a public-facing documentation site for participating Ministries and prospective adopter countries.

16.4 Intellectual Property

All research outputs — architecture specifications, governance frameworks, integration tools, and anonymization protocols — are released under the Apache License 2.0. Copyright is held jointly by AUDA-NEPAD and the contributing research institutions. National education data remains under sovereign national authority; CRADLE's governance framework defines the terms under which it is federated, not the terms under which it is owned.


17. Conclusion

Africa federates health data at continental scale. It does not federate education data. This gap means that the learning evidence generated by Africa's DPI-Ed — the most significant new source of education data in the continent's history — will fragment along national boundaries unless a deliberate federation architecture is built.

CRADLE builds that architecture. It learns from health's proven patterns — DHIS2's organizational hierarchy, Africa CDC's sovereignty-preserving federation, the Continental Health Data Governance Framework's institutional template — while addressing education's distinctive requirements: curricular sovereignty, politically sensitive learning outcomes, and the integration of born-digital DPI-Ed data with legacy EMIS systems.

The 24-month program will determine whether federated education data is technically achievable and politically viable at continental scale. If it is, the architecture, governance framework, and working prototype produced over those 24 months will enable continent-wide education intelligence that no single country's data could provide. Researchers will discover cross-jurisdictional patterns. Policymakers will benchmark with evidence. Results-Based Finance will operate on comparative data. And every additional country that joins the federation will make the whole more valuable.

Health built its continental data infrastructure over two decades, through trial, error, and iterative investment. Education can learn from that experience and build faster. The architectural patterns are proven in health. The governance template exists. The data is being generated. CRADLE provides the federation layer that connects it all.


Appendices