ECM Mapping Project Plan (Draft)

Easy Curriculum Mapping (ECM): A Research Proposal

Building the Curriculum Intermediate Representation for Africa's Digital Public Infrastructure for Education

Proposed to: The Bill & Melinda Gates Foundation — Global Education Program

Proposed by: The Spix Foundation, for consideration by the African Union Development Agency (AUDA-NEPAD), in partnership with university research institutions in Africa, India, and the United States

Duration: 48 months (4 years)

Requested funding: USD 10 million (staged across two phases with a go/no-go gate)

1. Executive Summary

Nine out of ten children in Sub-Saharan Africa cannot read a simple sentence by age ten. Digital courseware that teaches foundational literacy and numeracy exists, but it cannot be deployed across African countries because mapping courseware to each country's curriculum standards is manual, expert-dependent, and prohibitively expensive. This curriculum-mapping bottleneck is a structural barrier to continental-scale deployment of effective EdTech.

This proposal requests USD 10 million over 48 months to build Easy Curriculum Mapping (ECM): a Curriculum Intermediate Representation (Curriculum IR) that collapses the combinatorial cost of curriculum mapping from O(Apps×Standards) to O(Apps+Standards). The architectural pattern is proven — LLVM demonstrated it for compilers, TCP/IP for computer networking, and FHIR for healthcare interoperability. ECM applies the same structural insight to education. Phase 1 includes a funded desk pilot validating the concept against real curricula. Deliverables include the open-source Curriculum IR specification, 12 digitized curricula (6 African, 6 Indian), validated crosswalks, and mapping tools for Ministries and courseware developers. AUDA-NEPAD leads. The Spix Foundation provides project management and software development. University partners in Africa, India, and the United States provide research expertise. The program's technical methodology leverages AI-assisted research and development tooling — from LLM-based concept extraction to AI-accelerated software engineering modeled on the open-source LLVM, TCP/IP, and FHIR codebases — compressing technical milestones and enabling proportionally greater investment in the institutional architecture that determines long-term adoption.

1A. Decision Package

What a Funder Commits To

The Gates Foundation commits USD 10M over 48 months, disbursed in two phases with a go/no-go gate at Month 24. Phase 1 (Months 1–24): USD 5.5M. Phase 2 (Months 25–48): USD 4.5M. Phase 2 funding is contingent on Phase 1 deliverables.

What a Funder Gets

Founder attribution for the ECM program and naming/hosting rights for the SOCLE Board (Standard for Open Curriculum Logic in Education), the governance body for the Curriculum IR specification.
Legacy Attribution at the Founder tier (see Essay 25, Section 7).
Phase-gated accountability: Phase 2 funding is released only upon Phase 1 deliverable acceptance (see Section 1B).
Open-access research outputs: all specifications, tools, datasets, digitized curricula, and peer-reviewed publications produced by ECM are published under open licenses, extending the funder's impact beyond the Breakthrough System.
Direct amplification of existing Gates investments: every Gates-funded EdTech platform operating in Sub-Saharan Africa benefits from the Curriculum IR — any courseware application mapped to the IR deploys across all participating countries without additional mapping cost.

Success Criteria by Month 24

Phase 1 success is defined by the Month-24 Proof of Capability outcome set (Section 1B). If Phase 1 deliverables are met, Phase 2 funding is released. If they are not met, the program convenes a technical review to determine whether the IR architecture requires revision, the timeline requires extension, or the approach is non-viable (see Section 13.2).

Governance and Reporting

The program provides quarterly progress reports to the Gates Foundation, including milestones achieved, accuracy metrics, budget expenditure, and risk register updates. AUDA-NEPAD provides institutional coordination. The Spix Foundation provides project management and software development. Independent evaluation is conducted at the Phase 1 gate (Month 24) and at program completion (Month 48) by an external evaluator nominated by the Gates Foundation.

Intellectual Property Posture

The PREMIER Institute owns all intellectual property resulting from ECM research. The Gates Foundation, as funding partner, receives a worldwide, paid-up, royalty-free, sub-licensable, non-exclusive license to all such IP. Code and specifications are released under the Apache License 2.0. Creative works (illustrations, documentation artwork) are released under the appropriate Creative Commons license. University research partners retain academic publication rights; all code and infrastructure deliverables are owned by the PREMIER Institute.

Sovereignty Posture

Attribution is distinct from authority. Founder attribution and SOCLE Board hosting rights are recognition mechanisms; they confer no governance authority over curriculum standards, data access, or platform operations. National curriculum authority remains with Ministries of Education. Continental coordination authority remains with AUDA-NEPAD. Technical infrastructure authority remains with the RESPECT Platform's technical steward. The Curriculum IR does not set curriculum policy — it maps existing curricula as authored by sovereign governments.

1B. Month-24 Proof of Capability

The following concrete outcomes define Phase 1 success and gate Phase 2 funding:

Desk pilot: Completed and documented. Prototype Curriculum IR validated against at least 2 national curricula (Kenya CBC and Eswatini, K–3 Mathematics).
Curriculum IR v0.2 specification: Published as open-source specification, incorporating desk pilot findings and early mapping results.
Curriculum digitization: 12 curricula (6 African, 6 Indian) digitized in CASE-compliant machine-readable format, K–3 mathematics and literacy.
Curriculum mapping: All 12 curricula mapped to the Curriculum IR v0.2, with mappings validated by Ministry personnel and curriculum experts.
Courseware partnerships: At least 3 courseware developers have mapped content to the Curriculum IR.
Validation study: Formal validation achieving ≥85% concept-level accuracy for foundational literacy and numeracy across 12 curricula.
Open-source tooling: Ministry-facing tools for standards-to-IR mapping and developer-facing tools for courseware-to-IR tagging released.
Peer-reviewed submission: Validation methodology and empirical results submitted for publication.

2. The Problem: Africa's Curriculum Mapping Bottleneck

2.1 The Learning Crisis

Sub-Saharan Africa faces the world's most severe learning crisis. According to the World Bank's Learning Poverty indicator, functional illiteracy among ten-year-olds in the region stands at approximately 90%. The African Union has identified the elimination of learning poverty as a continental priority. The Gates Foundation's Global Education Program has identified foundational learning in Sub-Saharan Africa as a core priority, committing more than USD 240 million over four years (announced April 2025) to help 15 million children in Sub-Saharan Africa and India learn more effectively, and, with ADQ, an additional USD 40 million for responsible AI and EdTech deployment across Sub-Saharan Africa (announced December 2025).

Digital courseware addressing foundational literacy and numeracy exists and has demonstrated impact in controlled settings. The barrier to continental-scale deployment is the curriculum-mapping bottleneck described below.

2.2 The Structural Barrier

Across Africa, an estimated 100 or more distinct national or sub-national curriculum standards govern learning expectations. These standards differ in conceptual decomposition, sequencing, representation conventions, linguistic realization, and cultural embedding. For digital courseware to be used in a country's public education system, it must be mapped to that country's curriculum standards.

Today, this mapping is performed manually by subject-matter experts, separately for each country and separately for each courseware application. If there are N courseware applications and M national curricula, the total effort is N×M mappings, that is, O(Apps×Standards). Each new country requires N new mappings; each new application requires M new mappings. This combinatorial cost structure makes multi-country deployment economically irrational for all but the largest publishers.

At realistic rates using African curriculum specialists, expert-produced curriculum mapping costs an estimated USD 1,000–3,500 per application per country, depending on subject scope and curriculum complexity. (This estimate is based on 20–80 hours of curriculum specialist time at USD 30–50/hour for standards analysis, content mapping, gap analysis, and quality assurance, plus project management and coordination overhead — consistent with African education consultant market rates.) The counterfactual at scale is decisive: covering 55 AU member states for 30 courseware applications through manual mapping would cost USD 1.7–5.8 million per mapping cycle — and this cost recurs every time a curriculum is revised or a courseware application updates its content. National curricula are typically revised on 5–10 year cycles; courseware applications update far more frequently. Each revision triggers a new round of manual re-mapping across all affected jurisdictions and applications. Over a decade with expansion to 1,000+ global jurisdictions, manual mapping costs compound into hundreds of millions of dollars — with no infrastructure, no machine-readable standards, and no path to automation.
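The arithmetic behind these figures can be reproduced in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the per-mapping rates are the estimates quoted above. It compares the number of mappings required with and without an intermediate layer, then prices the manual case.

```python
# Illustrative arithmetic only; function names are hypothetical,
# and the figures are the estimates quoted in the text above.

def pairwise_mappings(apps: int, standards: int) -> int:
    """Manual approach: every application is mapped to every curriculum."""
    return apps * standards

def hub_mappings(apps: int, standards: int) -> int:
    """IR approach: each application and each curriculum maps once to the IR."""
    return apps + standards

def cost_range_usd(mappings: int, low: int = 1_000, high: int = 3_500) -> tuple[int, int]:
    """Cost of one mapping cycle at the quoted per-mapping rates."""
    return mappings * low, mappings * high

apps, member_states = 30, 55
manual = pairwise_mappings(apps, member_states)
via_ir = hub_mappings(apps, member_states)
low, high = cost_range_usd(manual)
print(manual, via_ir, low, high)
```

The manual totals (1,650 mappings, USD 1.65–5.775 million) round to the range quoted above. The IR figure of 85 is a count of mappings, not a cost: the proposal does not state what a single standards-to-IR or courseware-to-IR mapping would cost.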

The Curriculum IR transforms this cost structure. When a Ministry revises its national curriculum, the Ministry re-maps its standards to the Curriculum IR once; all courseware applications connected to the Curriculum IR receive the updated alignment automatically. When a courseware application updates its content, it re-maps to the Curriculum IR once; all jurisdictions receive the updated alignment automatically. An application mapped to an earlier version of the Curriculum IR can still be aligned to current national standards through the IR's versioning and transformation mechanisms. That alignment is imperfect, but it comes at zero marginal cost, whereas the current system requires a fresh manual mapping. The Curriculum IR converts a recurring, multiplicative expenditure into a one-time infrastructure investment that amortizes as the number of applications and jurisdictions grows.
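One way to picture the versioning and transformation mechanism is a migration table that forwards concept identifiers from an earlier IR version to the current one. The sketch below is an assumption made for illustration: the concept IDs, the `MIGRATION_V01_TO_V02` table, and the `upgrade` function are hypothetical and do not come from any published specification. A concept that was split in the newer version maps to several successors, and a concept with no recorded successor is flagged for review; both cases illustrate why such an alignment is imperfect.

```python
# Hypothetical migration table: IR v0.1 concept ID -> IR v0.2 concept ID(s).
MIGRATION_V01_TO_V02 = {
    "num.count.to20": ["num.count.to10", "num.count.11to20"],  # concept was split
    "num.add.single": ["num.add.single"],                      # unchanged
}

def upgrade(concepts: set[str], migration: dict[str, list[str]]) -> tuple[set[str], set[str]]:
    """Forward old concept IDs to the current IR version.

    Returns (upgraded IDs, old IDs with no known successor).
    """
    upgraded: set[str] = set()
    unresolved: set[str] = set()
    for concept in concepts:
        successors = migration.get(concept)
        if successors:
            upgraded.update(successors)
        else:
            unresolved.add(concept)
    return upgraded, unresolved

# An application tagged against the older IR version (made-up data).
old_app_mapping = {"num.count.to20", "num.add.single", "num.shape.2d"}
current, needs_review = upgrade(old_app_mapping, MIGRATION_V01_TO_V02)
```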

2.3 The Digitization Gap

The mapping bottleneck is compounded by a digitization gap. No African country has published its curriculum standards in an internationally interoperable machine-readable format. South Africa's CAPS exists as PDF documents. Kenya's CBC is digitized for internal KICD use in a format specific to that institution. The African Union's Decade of Education (2025–2034) has yet to address curriculum digitization.

The ECM program must therefore solve both problems simultaneously: build the Curriculum IR and produce the machine-readable curricula it requires as inputs.

2.4 Why Now

Four developments have converged to make ECM feasible today:

LLM-based concept extraction has reached usable accuracy. Current research reports F1 scores of up to 89% for goal-to-skill matching using large language models with human validation. Five years ago, automated concept extraction from curriculum documents was impractical. Today, LLMs enable the extraction pipeline that the Curriculum IR depends on.

India's Sunbird/DIKSHA has proven open-source education infrastructure at national scale. India's DPI ecosystem — specifically the Sunbird taxonomy service — provides a tested, MIT-licensed platform layer that ECM can adopt, dramatically reducing infrastructure development time.

AI-assisted research and development tooling has reached production capability. The Curriculum IR's software architecture can be modeled directly on the production codebases of LLVM, TCP/IP, and FHIR — open-source systems whose design patterns are well-documented and accessible to AI-accelerated development tools. Environments such as Claude Code, GitHub Copilot, and their successors compress the implementation timeline for the IR compiler, mapping tools, and validation infrastructure. This acceleration is methodologically significant: time saved on technical milestones is reinvested in the institutional work — Ministry engagement, governance formation, Mapper certification — that determines whether infrastructure achieves adoption. ECM is AI-era education infrastructure, built with AI-era tools.

The African Union's Decade of Education (2025–2034) has created institutional momentum. AUDA-NEPAD's mandate, the African Continental Qualifications Framework (ACQF), and the AU's renewed commitment to education reform provide the continental coordination structure that an IR-based approach requires.

3. The Proposed Solution: Easy Curriculum Mapping (ECM)

3.1 The Core Insight

At the foundational level, curriculum standards describe concepts. Representations vary — the concept of number exists regardless of notation system; phonemic awareness is a prerequisite for alphabetic decoding regardless of language. A canonical intermediate layer that captures concepts independently of any particular curriculum's representation enables a structural reduction in mapping cost.

3.2 The Curriculum Intermediate Representation

ECM centers on a Curriculum IR that encodes learning concepts at a stable, representation-independent level. National curriculum standards map once to the Curriculum IR; digital courseware maps once to the same Curriculum IR. This converts the O(Apps×Standards) mapping problem into two linear processes — Standards-to-IR and Courseware-to-IR — yielding O(Apps+Standards) total cost.

The Curriculum IR is designed to interoperate with existing education metadata standards, including CASE (Competency and Academic Standards Exchange) for standards representation and IEEE LOM for learning object metadata. The Curriculum IR extends these standards with concept-level semantics, dialect support (see Section 3.4), and weighting metadata that existing frameworks do not provide.

3.3 The TCP/IP, LLVM, and FHIR Precedents

This architectural pattern has been proven in three directly analogous domains.

TCP/IP introduced the Internet Protocol (IP) as a canonical intermediate layer between application protocols and network technologies. Before TCP/IP, each application had to be implemented separately for each network type — an N×M problem. IP collapses this: any application protocol maps to IP (N mappings), and IP maps to any network technology (M mappings), yielding O(N+M). Conceived by Vint Cerf and Robert Kahn in their 1974 paper "A Protocol for Packet Network Intercommunication" (IEEE Transactions on Communications), funded by DARPA, and formalized as RFC 791 (1981), TCP/IP became the mandatory standard for US defense networks on January 1, 1983 ("Flag Day") and the foundational infrastructure of the global internet. The IETF governs its evolution through an open, consensus-based standards process with no single controlling institution. (Full history in Appendix C3.)

LLVM introduced a compiler intermediate representation that enables any source language to compile to any hardware target through a single canonical layer. LLVM began as a graduate research project at the University of Illinois (2000), funded by NSF and DARPA, and became the standard infrastructure for new compiler and toolchain projects within a decade (ACM Software System Award, 2012). MLIR, a later project within the LLVM ecosystem, extends the IR concept through a "dialect" mechanism that enables domain-specific representations within a unified framework. (Full history in Appendix C1.)

FHIR introduced canonical resources, concept maps, and integration with the UMLS Metathesaurus to collapse the N×M problem across a dozen incompatible clinical coding systems. FHIR began as a volunteer initiative within HL7 International (2011), was accelerated by ONC's USD 15 million SMART on FHIR grant to Harvard/Boston Children's Hospital, became a normative standard in 2018, and was mandated for US healthcare systems by the ONC's 2020 Final Rule implementing the 21st Century Cures Act. (Full history in Appendix C2.)

All three precedents demonstrate that: (a) IR-based approaches work for complex, multi-stakeholder interoperability problems; (b) they require a governance body, an economic mandate, and sufficiently formal domain semantics; and (c) they can move from initial research to normative standard within 7–10 years, with deployable prototypes within 3–4 years.

3.4 MLIR's Dialect Concept: A Key Architectural Insight

MLIR's innovation — domain-specific "dialects" within a unified IR framework — directly addresses the most serious literature-based objection to a Curriculum IR (see Section 5). African curricula include competency-based frameworks (Kenya's CBC), content-standards frameworks (e.g., South Africa's CAPS), and outcomes-based frameworks (various). These are genuinely different organizational logics, each encoding distinct instructional commitments. A Curriculum IR with dialect support represents each curricular tradition in its own terms while enabling transformation between dialects — preserving cultural and pedagogical specificity while achieving interoperability.

Several African countries and most Indian states maintain sub-national curriculum variations (language-of-instruction differences, regional supplementary content). The Curriculum IR's dialect mechanism accommodates sub-national variation within the same national mapping.

3.5 Why This Has Not Been Attempted Before

Curriculum interoperability at this level has three prerequisites that did not co-exist until recently: (a) the foundational NLP/LLM technology for automated concept extraction from curriculum documents; (b) an institutional actor with both the continental mandate (AUDA-NEPAD) and the technical capacity (the MLIR/LLVM research community) to conceive the approach; and (c) an economic incentive at sufficient scale. The Global North's education systems, which fund most EdTech R&D, face the Apps×Standards problem at manageable scale (a few dozen curricula, mostly digitized). Africa's fragmentation is uniquely severe — an estimated 100+ curricula, none digitized in interoperable format — and Africa's institutions are uniquely positioned to solve it.

4. Positioning Within the Breakthrough Ecosystem

4.1 The Breakthrough System and DPI-Ed

ECM is one component within a larger system. The Breakthrough System addresses four structural barriers to EdTech deployment in Africa: Policy, Technology, Data, and Economics (see Essay 07, "Making Education Outcomes Finance-Grade"). ECM helps to address the Technology Barrier, that is, the curriculum-mapping bottleneck, and enables the Data Barrier to be addressed through automated, curriculum-aligned assessment.

Africa's Digital Public Infrastructure for Education (DPI-Ed) is the open-source infrastructure layer that produces continuous, curriculum-aligned, auditable learning evidence. The Spix Foundation's RESPECT system is the first reference implementation of Africa's DPI-Ed. ECM provides the curriculum interoperability layer within DPI-Ed, enabling courseware to connect to any participating country's curriculum through a single mapping.

4.2 The Two-Track Curriculum Mapping Strategy

The Breakthrough System employs a two-track strategy for curriculum mapping:

Track 1: RESPECT Certified Mappers (Years 1–4). While ECM is under development, human curriculum experts, the RESPECT Certified Mappers, perform manual, expert-validated curriculum alignments. RESPECT Certified Mappers are designed as a phase-limited profession: governance protocols include mandatory sunset clauses and transition pathways for Mappers into ECM-related auditing, standards-maintenance, and quality-assurance roles. (See Essay 24, "Mappers: Mapping Lessons to Curriculum Standards, Years 1–4.")

Track 2: ECM (Year 5+). By the end of Year 4, ECM is expected to deliver a deployable Curriculum IR and mapping toolset, collapsing the long-term cost of curriculum mapping. From Year 5 onward, ECM enables automated, curriculum-aligned assessment infrastructure, the foundation for cross-jurisdictional comparability that underpins Results-Based Finance for Education (RBF4Ed) at continental scale. (See Essay 23, "ECM: Mapping Lessons to Curriculum Standards, Year 5+.")

This 48-month research program spans Years 1 through 4, producing a deployable Curriculum IR by Month 48 — aligned with the Breakthrough System's timeline. Mapper-produced curriculum alignments from Track 1 serve as expert ground-truth data for ECM validation, providing a natural bridge between the transitional manual system and the automated IR-based system.

4.3 Data Flow

A national curriculum document (e.g., Kenya's CBC, K–3 Mathematics) enters the system as a PDF. The digitization pipeline converts it into a CASE-compliant machine-readable format. The ECM research team maps the national standards to the Curriculum IR, in consultation with Ministry of Education personnel and curriculum experts to ensure the mapping reflects the jurisdiction's understanding of its own standards. A courseware application (e.g., a numeracy app) independently maps its lesson content to the same Curriculum IR. The crosswalk between the curriculum and the courseware is computed automatically from these two independent mappings. Ministries retain the right to challenge any IR-mediated alignment for cause — for example, if a courseware application claims curriculum alignment that a Ministry considers inaccurate or misleading — through formal contestability procedures with defined timelines and independent adjudication. Ministries are not burdened with the administrative overhead of reviewing or approving every courseware-to-curriculum alignment.
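The automatic crosswalk step described above amounts to joining two independently produced mappings on their shared Curriculum IR concept identifiers. The following is a minimal sketch under stated assumptions: the standard codes, lesson names, concept IDs, and the `compute_crosswalk` function are all hypothetical, and a real implementation would also need to handle granularity, weighting, and contestability.

```python
from collections import defaultdict

# Hypothetical identifiers throughout; each mapping is produced independently.
standards_to_ir = {
    "KE-CBC-G1-NUM-1": {"num.count.to20"},
    "KE-CBC-G1-NUM-2": {"num.add.single"},
}
lessons_to_ir = {
    "lesson-counting-01": {"num.count.to20"},
    "lesson-addition-03": {"num.add.single", "num.count.to20"},
}

def compute_crosswalk(standards: dict[str, set[str]],
                      lessons: dict[str, set[str]]) -> dict[str, set[str]]:
    """Join two X-to-IR mappings on shared concept IDs.

    Returns, for each standard, the lessons that share at least one concept.
    """
    # Invert the lesson mapping: concept ID -> lessons tagged with it.
    by_concept: dict[str, set[str]] = defaultdict(set)
    for lesson, concepts in lessons.items():
        for concept in concepts:
            by_concept[concept].add(lesson)

    crosswalk: dict[str, set[str]] = {}
    for standard, concepts in standards.items():
        matched: set[str] = set()
        for concept in concepts:
            matched |= by_concept.get(concept, set())
        crosswalk[standard] = matched
    return crosswalk

crosswalk = compute_crosswalk(standards_to_ir, lessons_to_ir)
```

Neither side refers to the other: the curriculum mapping never mentions a lesson, and the courseware mapping never mentions a national standard. That independence is what allows one courseware-to-IR mapping to be reused across jurisdictions.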

The program's ultimate goal is to produce mapping tools that are sufficiently intuitive and well-documented that Ministries choose to produce and maintain their own curriculum-to-IR mappings independently — perhaps with consulting support, but under their own sovereign authority. The incentive is direct: a Ministry that maintains its own Curriculum IR mapping gives every courseware application in the network automatic access to its curriculum, which enables curriculum-aligned assessment and the cross-jurisdictional comparability that underpins Results-Based Finance for Education (RBF4Ed). The tools must be good enough that this value proposition is self-evident.

4.4 Governance Architecture: Separation of Roles

The ECM program is designed as a loosely-coupled system with explicit separation of roles (see Essay 07):

AUDA-NEPAD provides institutional coordination and continental legitimacy.
Ministries of Education retain sovereign authority over their national curricula and the right to challenge any courseware-to-curriculum alignment for cause. The program's ultimate goal is for each Ministry to produce and maintain its own curriculum-to-IR mapping.
The SOCLE Board (Standard for Open Curriculum Logic in Education), a proposed body based in the Gulf for political neutrality, with representation from participating Ministries, research institutions, and the GEOS Organization, governs the Curriculum IR specification.
The Global Education Outcomes Standards Organization (GEOS Organization), a proposed body also based in the Gulf, governs outcome certification for Results-Based Finance for Education, independently of the Curriculum IR governance. The GEOS Organization also governs the training and certification of curriculum mapping auditors worldwide (see Section 16).
Independent auditors validate mapping accuracy.
The Spix Foundation provides engineering capacity and project management.

This separation prevents any single institution from controlling the system and enables trust to accumulate across institutions with different mandates. All curriculum data, mapping outputs, and validation datasets are governed under Malabo Convention-compliant data sovereignty protocols, with each Ministry retaining full authority over its national curriculum data.

4.5 Theory of Change

If this program produces a validated Curriculum IR specification and mapping tools (outputs), then courseware developers can deploy across African and Indian jurisdictions by mapping once to the Curriculum IR (immediate outcome), which enables curriculum-aligned assessment at continental scale (intermediate outcome), which is the prerequisite for Results-Based Finance for Education at a projected benefit of approximately USD 35 per child per year (long-term impact; see Essay 07).

5. Literature-Informed Design Principles

A comprehensive review of the curriculum-mapping, ontology-alignment, and comparative education literatures identified five substantive objections to an IR-based approach. Each objection has been incorporated as a design requirement for the Curriculum IR:

Objection	Design Response
Interlingua problem (semantic drift in intermediate representations)	Continuous validation with formal feedback loops from Ministries; versioned Curriculum IR with built-in contestability
Pedagogical content knowledge (representation inseparable from concept)	Curriculum IR encodes concept relationships — prerequisites, co-requisites, and representational alternatives — alongside concept identifiers
Granularity problem (no universal "atomic" concept level)	Multi-granularity support; concepts can be leaf nodes in one curriculum's mapping and parent nodes in another's
Cultural embedding (curricula encode epistemological commitments)	Curriculum IR functions as mapping infrastructure, making limited, verifiable claims about concept overlap while preserving each curriculum's internal logic
Assessment validity (mapping does not preserve contextual weighting)	Curriculum IR encodes weighting and emphasis metadata alongside concept mappings
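
The relationship, granularity, and weighting requirements in the table above can be pictured as a small data model. This is a sketch under assumptions, not the IR schema: the `Concept` and `StandardMapping` classes, their fields, and all identifiers are hypothetical. It shows how a concept that is a leaf in one curriculum's mapping can still be related to a coarser parent concept used by another curriculum.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Concept:
    concept_id: str
    parent: Optional[str] = None   # coarser concept, if any
    prerequisites: tuple = ()      # concept IDs expected to be learned first

@dataclass(frozen=True)
class StandardMapping:
    standard_id: str
    concept_id: str
    weight: float                  # relative emphasis within the curriculum

# Hypothetical concept inventory.
CONCEPTS = {c.concept_id: c for c in [
    Concept("num.count.to20"),
    Concept("num.add"),
    Concept("num.add.single", parent="num.add", prerequisites=("num.count.to20",)),
    Concept("num.add.double", parent="num.add", prerequisites=("num.add.single",)),
]}

def ancestors_or_self(concept_id: str) -> list:
    """Walk parent links from a concept up to its root."""
    chain = []
    current = concept_id
    while current is not None:
        chain.append(current)
        current = CONCEPTS[current].parent
    return chain

def overlaps(a: str, b: str) -> bool:
    """True when one concept equals the other or is one of its ancestors."""
    return a in ancestors_or_self(b) or b in ancestors_or_self(a)

# Curriculum A maps at leaf level; curriculum B maps at the coarser parent.
mapping_a = StandardMapping("A-G1-3", "num.add.single", weight=0.6)
mapping_b = StandardMapping("B-G1-7", "num.add", weight=0.3)
related = overlaps(mapping_a.concept_id, mapping_b.concept_id)
```

Here the two standards are related through the parent link even though they were mapped at different granularities, while the differing `weight` values record that the two curricula place different emphasis on the shared concept.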

Each objection identifies a real design constraint. Each has been incorporated as a specification requirement for Phase 1, Milestone 1. The operating environment — Sub-Saharan Africa, where functional illiteracy among ten-year-olds stands at approximately 90% — demands infrastructure that is substantially better than the status quo. The status quo is no mapping system at all.

The architectural response to the literature's strongest objection — that heterogeneous curricular traditions cannot be represented in a single canonical layer — is MLIR's dialect concept (Section 3.4), which enables each tradition to retain its organizational logic within a unified transformation framework.

6. Research Goals and Milestones

Program Timeline: 48 Months (2 Phases)

The 48-month timeline aligns with the Breakthrough System's ecosystem design: Years 1–4 develop and validate ECM; Year 5 transitions to operational deployment with automated assessment and RBF4Ed integration.

Phase 1: Research and Validation (Months 1–24) — USD 5.5 million

Goal: Validate the Curriculum IR concept through a desk pilot, produce the IR v0.2 specification, digitize and map 12 curricula, and validate against real courseware.

Phase 1 requires no prior deliverables. This is the program's starting point.

Milestones:

Desk pilot (Months 1–6): Obtain curriculum documents for Kenya (CBC) and Eswatini (National Curriculum), K–3 Mathematics. Extract approximately 50 foundational concepts from each. Construct a prototype Curriculum IR. Measure concept-mapping accuracy against expert ground truth. Document results, including successful and failed mappings, in a desk pilot report (Appendix G). This is desk-based analytical work requiring no in-country fieldwork.
Complete formal specification of the Curriculum IR v0.1, incorporating desk pilot findings: concept schema, relationship types, granularity model, dialect support, and weighting/emphasis metadata. Publish as open-source specification under the Apache License 2.0.
Digitize 3 African national curricula and 3 Indian state curricula (K–3, mathematics and literacy) into CASE-compliant machine-readable format. Priority African countries: Kenya (CBC), Liberia, Eswatini — the three countries piloting the RESPECT system and Africa's EdTech Breakthrough, whose Ministers of Education have expressed eagerness to participate. Priority Indian states: Karnataka (proximity to IISc/Sunbird ecosystem), Tamil Nadu (distinctive curriculum, strong education system), Kerala (high literacy rates, strong public education). Final Indian state selection is solely the decision of the senior PI during Phase 1, after consultation with IISc and IIIT Bangalore.
Adopt and configure Sunbird's taxonomy service as the platform layer for Curriculum IR management.
License EdGate's standards database for non-African standards ingestion (providing comparative reference points across dozens of countries).
Establish LLM-based concept extraction pipeline (multi-model: GPT-4o, Claude, open-source models) with human validation workflow.
Map all 6 digitized curricula to the Curriculum IR v0.1, working with Ministry and curriculum-body personnel to ensure mappings reflect each jurisdiction's understanding of its own standards. Measure mapping accuracy against expert ground truth.
Recruit and onboard research teams at all partner institutions.
Digitize 3 additional African national curricula and 3 additional Indian state curricula (K–3, mathematics and literacy). Second-cohort African countries: to be selected from South Africa (CAPS), Rwanda, Nigeria, Senegal, Tanzania, or other AU member states based on curriculum accessibility and linguistic diversity. Second-cohort Indian states: Rajasthan, Maharashtra, Odisha (providing Hindi-belt, Marathi, and Odia linguistic contexts). Selections are finalized by the senior PI based on first-cohort results, curriculum accessibility, and linguistic diversity. Total: 6 African countries + 6 Indian states (12 jurisdictions).
Map all 12 curricula to the Curriculum IR v0.2 (revised based on desk pilot and early mapping findings).
Partner with 3–5 courseware developers (selected from RESPECT Ecosystem participants or Gates-funded EdTech partners) to map their content to the Curriculum IR. Measure: does a single Curriculum IR mapping enable accurate alignment to all 12 curricula?
Conduct formal validation study: compare IR-mediated mappings against expert-produced mappings (including RESPECT Certified Mapper outputs from Track 1) for accuracy, completeness, and assessment validity across both African and Indian curricula. Target: ≥85% concept-level accuracy for foundational literacy and numeracy.
Publish the Curriculum IR v0.2 specification, validation methodology, and empirical results as peer-reviewed research.
Develop and release open-source mapping tools: Ministry-facing tools for standards-to-IR mapping, developer-facing tools for courseware-to-IR tagging.
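
The ≥85% validation target in the milestones above presupposes a concrete accuracy measure, which the proposal does not define. The sketch below assumes one simple possibility: the share of expert-labelled curriculum items for which the IR-mediated mapping assigns the same concept as the expert. The function name, the item and concept IDs, and the toy data are all hypothetical.

```python
TARGET = 0.85  # Phase 1 target for foundational literacy and numeracy

def concept_level_accuracy(predicted: dict[str, str], expert: dict[str, str]) -> float:
    """Share of expert-labelled items whose predicted concept matches the expert's.

    Items missing from `predicted` count as incorrect.
    """
    if not expert:
        raise ValueError("expert ground truth is empty")
    correct = sum(1 for item, concept in expert.items()
                  if predicted.get(item) == concept)
    return correct / len(expert)

# Made-up example: four items, one disagreement.
expert = {
    "S1": "num.count.to20",
    "S2": "num.add.single",
    "S3": "lit.phoneme.blend",
    "S4": "lit.letter.sound",
}
predicted = {
    "S1": "num.count.to20",
    "S2": "num.add.single",
    "S3": "lit.phoneme.segment",
    "S4": "lit.letter.sound",
}
accuracy = concept_level_accuracy(predicted, expert)
meets_target = accuracy >= TARGET
```

In this toy case three of four items agree, so the accuracy is 0.75 and the target is not met. A real validation study would likely need a richer measure, for example one that credits partial matches across granularity levels or reports precision and recall separately.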

Phase 2: Deployment and Operational Readiness (Months 25–48) — USD 4.5 million

Goal: Prepare ECM for operational deployment, achieve operational readiness across all 12 jurisdictions, prepare the scaling pathway to at least 44 countries (80% of AU Member States), and execute the sustainability transition.

Phase 2 requires the following Phase 1 deliverables as inputs: Curriculum IR v0.2; validated crosswalks for all 12 curricula; at least 3 courseware-to-IR mappings; formal validation study results; LLM-based concept extraction pipeline; open-source mapping tools.

Milestones:

Publish Curriculum IR v1.0 specification with full documentation, governance framework, and versioning protocol. Submit to the proposed SOCLE Board (based in the Gulf) for adoption.
Complete mapping of all 12 curricula (6 African, 6 Indian) to Curriculum IR v1.0. Produce validated crosswalks.
Deliver operational mapping tools to AUDA-NEPAD for integration into continental education infrastructure, and to Indian partners for integration with Sunbird/DIKSHA and related DPI-Ed tools.
Train Ministry of Education personnel in 6 African countries and 6 Indian states on the standards-to-IR mapping tools and processes. The goal is for Ministries to eventually produce and maintain their own curriculum-to-IR mappings independently — motivated by the RBF4Ed funding that curriculum-aligned assessment enables.
Establish the Curriculum IR governance framework: versioning policy, validation requirements, contestability mechanisms enabling any participating Ministry to challenge any courseware-to-curriculum alignment for cause through formal review procedures with defined timelines and independent adjudication, and sunset/migration protocols for early-version mappings.
Include a curriculum update protocol: when a participating Ministry revises its national curriculum, the Ministry re-digitizes the updated standards, re-maps them to the Curriculum IR using the provided tools, and all affected courseware crosswalks are recomputed automatically.
Demonstrate operational end-to-end deployment: at least 3 courseware applications mapping to the Curriculum IR and deploying across all 12 jurisdictions with validated curriculum alignments.
Conduct independent evaluation of mapping accuracy, Ministry satisfaction, and courseware developer adoption (see Section 13).
Publish comprehensive technical report and policy recommendations for scaling to at least 44 countries (80% of AU Member States) and additional Indian states.
Execute sustainability transition: hand off Curriculum IR governance to the proposed SOCLE Board (Gulf-based), IR maintenance to AUDA-NEPAD's operational team for African operations, and ongoing development to the open-source community (see Section 16).
Produce a scaling cost estimate: projected cost to extend the Curriculum IR to at least 44 countries (80% of AU Member States).
Establish the SOCLE Compliance Auditors' Professional Association: develop a Body of Knowledge (derived from Tranche 1 Mapper experience), certification examination, training materials, and fee structure. Train and certify the first cohort of SOCLE Compliance Auditors, drawn primarily from RESPECT Certified Mappers transitioning from manual mapping to CuIR compliance certification (Essay 24, Section 8A).
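
The curriculum update protocol among the milestones above relies on crosswalks being derived rather than hand-maintained. The following is a minimal sketch of that idea, assuming a deliberately simple alignment rule (a courseware item aligns to a standard when the two share at least one IR concept). The rule, the function name, and all identifiers are illustrative assumptions, not the specified algorithm.

```python
from typing import Dict, Set

CoursewareToIR = Dict[str, Set[str]]  # courseware item -> IR concept IDs
CurriculumToIR = Dict[str, Set[str]]  # standard code -> IR concept IDs


def compute_crosswalk(
    courseware: CoursewareToIR, curriculum: CurriculumToIR
) -> Dict[str, Set[str]]:
    """Align each courseware item to every standard sharing an IR concept."""
    return {
        item: {
            code
            for code, standard_concepts in curriculum.items()
            if item_concepts & standard_concepts
        }
        for item, item_concepts in courseware.items()
    }


# The courseware developer maps content to the IR once (hypothetical IDs).
courseware = {
    "lesson-a": {"ir:count-to-20"},
    "lesson-b": {"ir:letter-sounds"},
}

# The Ministry's original mapping, then its re-mapping after a revision.
curriculum_v1 = {
    "STD-1": {"ir:count-to-20"},
    "STD-2": {"ir:letter-sounds"},
}
curriculum_v2 = {
    "STD-1": {"ir:count-to-20"},
    "STD-2R": {"ir:letter-sounds", "ir:blending"},
}

before = compute_crosswalk(courseware, curriculum_v1)
after = compute_crosswalk(courseware, curriculum_v2)
print(before["lesson-b"], "->", after["lesson-b"])
```

In this sketch the courseware-to-IR mapping is untouched by the revision; only the Ministry's side changes, and the crosswalk follows from recomputation.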

7. The Hybrid Build & Buy Strategy

7.1 Components to Be Built (Original Research)

The Curriculum Intermediate Representation itself (Phase 1, Milestone 2): A canonical, language-independent, curriculum-agnostic concept layer with dialect support, granularity flexibility, and weighting metadata. No equivalent exists anywhere in the world.
Curriculum digitization (Phases 1–2): Converting 6 African national curricula and 6 Indian state curricula from PDF/analog into CASE-compliant machine-readable format. No African country currently has its national curriculum in this format.
Cross-curriculum concept mapping algorithms (Phases 1–2): The validated mappings that connect each national or state curriculum to the canonical Curriculum IR.
Multilingual concept representation (Phases 1–2): Enabling content in any language to map to standards in any other language, initially targeting the African Union's official languages: Arabic, English, French, Portuguese, Spanish, and Kiswahili.
Validation infrastructure (Phase 2): Ground-truth datasets and tooling for expert validation of automated mappings.
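
To illustrate what "dialect support, granularity flexibility, and weighting metadata" might look like as a data structure, the sketch below models one canonical concept with per-jurisdiction expressions. The Curriculum IR schema is a research deliverable and does not yet exist, so every class, field, identifier, and sample statement here is a hypothetical placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialectEntry:
    """One curriculum's expression of a canonical concept (hypothetical)."""
    standard_code: str   # code used in the national or state curriculum
    language: str        # language tag of the source statement, e.g. "sw"
    statement: str       # wording of the standard in that language
    weight: float = 1.0  # relative emphasis the curriculum places on the concept


@dataclass
class IRConcept:
    """A canonical, language-independent concept node (hypothetical schema)."""
    concept_id: str
    parent_id: Optional[str] = None  # coarser-grained concept, if any
    dialects: Dict[str, List[DialectEntry]] = field(default_factory=dict)

    def add_dialect(self, jurisdiction: str, entry: DialectEntry) -> None:
        """Attach a jurisdiction-specific expression of this concept."""
        self.dialects.setdefault(jurisdiction, []).append(entry)

    def standards_for(self, jurisdiction: str) -> List[str]:
        """Return the standard codes a jurisdiction maps to this concept."""
        return [e.standard_code for e in self.dialects.get(jurisdiction, [])]


counting = IRConcept("num.counting.to-20", parent_id="num.counting")
counting.add_dialect(
    "KE", DialectEntry("KE-G1-NUM-1", "en", "Count numbers 1 to 20", weight=0.8)
)
counting.add_dialect(
    "TZ", DialectEntry("TZ-S1-NUM-3", "sw", "Kuhesabu namba 1 hadi 20")
)

print(counting.standards_for("KE"))
print(counting.standards_for("TZ"))
```

The `parent_id` link is one possible way to express granularity: a curriculum that states a concept more coarsely could attach its standard to the parent node instead.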

7.2 Components to Be Bought or Adopted

Component	Source	Cost	Notes
Standards database (US/intl reference)	EdGate	USD 13,100/yr	EdGate Pro annual subscription (USD 12,500/yr) plus international standards library (from USD 600/organization). Post-grant recurring cost assumed by the Spix Foundation. EdGate's foundational correlation patent (US9373264, priority date 2002) has likely reached its 20-year term, but EdGate announced additional patents in 2018; license terms should be reviewed at contracting to determine which patent claims, if any, remain in force and whether the database is independently protected by copyright or trade-secret law.
Standards identifiers	AB GUIDs (US), CASE identifiers (global)	Custom pricing	US-centric; Curriculum IR provides the African/Indian layer
Platform infrastructure	Sunbird ED (MIT licensed) — taxonomy service	Free (open source)	Designed for India; adaptation required for multi-country use
Curriculum digitization tooling	OpenSALT (MIT licensed)	Free (open source)	CASE framework management; requires curriculum source documents
LLM-based mapping acceleration	GPT-4o / Claude / Llama	API costs (variable)	F1 score of up to 89% for goal-to-skill matching; human validation at every stage
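
For readers unfamiliar with the F1 metric cited for LLM-based matching, the sketch below computes it over sets of predicted and expert-confirmed alignment pairs. Treating each (goal, skill) pair as one unit of evaluation is an assumption made for illustration, and the pairs themselves are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (learning goal ID, skill ID)


def f1_score(predicted: Set[Pair], gold: Set[Pair]) -> float:
    """Harmonic mean of precision and recall over alignment pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical expert-confirmed pairs and LLM-suggested pairs.
gold = {("g1", "s1"), ("g2", "s2"), ("g3", "s3"), ("g4", "s4")}
predicted = gold | {("g5", "s9")}  # all four correct, plus one spurious pair

score = f1_score(predicted, gold)
print(f"F1 = {score:.3f}")
```

Here precision is 4/5 and recall is 4/4, so a single spurious suggestion lowers F1 even though no correct pair was missed.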

8. Alignment with Gates Foundation Programs

ECM directly addresses three active Gates Foundation commitments, each representing a current funding stream:

Global Education Program (more than USD 240M over 4 years, announced April 2025): ECM directly enables the program's goal of helping children in Sub-Saharan Africa learn more effectively through evidence-based digital solutions. By collapsing the curriculum-mapping bottleneck, ECM allows effective courseware to reach learners across multiple countries — the prerequisite for the "regional exemplars" (the foundation's term) scaling strategy.

Digital Public Infrastructure (USD 200M+ commitment, announced September 2022): ECM is education DPI. It provides foundational, reusable digital infrastructure — the Curriculum IR, mapping tools, digitized curricula — designed for public benefit. The foundation's DPI investments — MOSIP for digital identity, Mojaloop for digital payments — establish the pattern. ECM is the curriculum interoperability layer within Africa's DPI-Ed, complementing the identity and payments layers the foundation already supports.

AI and EdTech for Africa (USD 40M ADQ partnership, announced December 2025): ECM uses LLMs to accelerate concept extraction and alignment suggestion, with human expert validation at every stage. The foundation's emphasis on responsible AI adoption — solutions that reflect local needs, empower teachers, and build capacity for sustained progress — aligns precisely with ECM's design: AI-accelerated, expert-validated, open-source, and governed by African institutions.

The program includes 6 Indian state curricula alongside 6 African national curricula for three reasons: (a) Indian state curricula are more digitized than African equivalents, providing a higher-fidelity validation environment for the Curriculum IR during early phases; (b) India's Sunbird/DIKSHA ecosystem is the primary open-source platform infrastructure the program adopts, and Indian state curriculum mapping enables direct integration testing; (c) cross-continental validation (Africa and India) provides stronger evidence of Curriculum IR generalizability than intra-continental validation alone. The primary beneficiaries of the program remain African children.

The Gates Foundation has invested in multiple EdTech platforms operating in Sub-Saharan Africa through its Global Education Program and ADQ partnership. ECM is infrastructure that amplifies the impact of all Gates-funded EdTech: any courseware application that maps to the Curriculum IR can deploy across all participating countries without additional mapping cost.

9. Competitive Landscape

No existing system provides curriculum interoperability at the level the Curriculum IR proposes. The closest approaches are:

EdGate's standards database catalogs standards from 35+ countries and provides an API for standards access. EdGate does not provide concept-level cross-curriculum mapping. ECM adopts EdGate's database as a reference layer and builds the canonical mapping infrastructure above it.
UNESCO-IBE's curriculum analysis tools enable manual curriculum comparison across countries. UNESCO-IBE does not produce machine-readable crosswalks or automated mapping. ECM builds on UNESCO-IBE's comparative expertise and the curriculum specialists trained through IBE Master's programs.
Platform-specific mappers (e.g., Khan Academy's internal alignment system) map content to individual curricula without an intermediate layer. ECM provides the canonical layer that enables any platform to map once and deploy everywhere.

ECM is complementary to all three. It occupies a layer that does not currently exist: the canonical concept representation that connects standards databases, curriculum expertise, and platform-specific content through a single interoperable infrastructure.

Structural precedents for the IR approach itself are described in Section 3.3 (TCP/IP, LLVM, FHIR). ECM differs from all three precedents in one critical respect: it must operate across sovereign jurisdictions with different educational philosophies, not merely across technical systems with different formats. This jurisdictional dimension — addressed through the dialect mechanism (Section 3.4), the contestability framework (Section 4.3), and the Sovereignty Posture (Section 1A) — is ECM's distinctive contribution to the IR pattern.

10. Principal Investigator Profile and Team

10.1 Required Expertise

Three domains of expertise define the PI requirements for ECM:

Domain 1 — Intermediate Representation Architecture. Deep expertise in designing, implementing, and scaling canonical intermediate representations for complex, heterogeneous systems. The PI must understand why IRs succeed (formal semantics, compositionality, separation of concerns) and why they fail (semantic drift, granularity mismatch, cultural embedding). Direct experience with TCP/IP, LLVM, MLIR, FHIR, or analogous IR systems is strongly preferred.

Domain 2 — African and Indian Education Systems. Working knowledge of African and Indian curriculum structures, the differences among them, and the institutional landscape (Ministries of Education, ACQF, regional qualification frameworks). This expertise may reside in a co-PI or senior research partner.

Domain 3 — Computational Linguistics / Knowledge Representation. Expertise in ontology alignment, multilingual concept representation, and LLM-based information extraction. The Curriculum IR's multilingual and multi-granularity requirements demand computational linguistics sophistication.

10.2 Proposed Principal Investigators

The following PI candidates are proposed based on expertise fit. Formal expressions of interest will be secured following AUDA-NEPAD's institutional endorsement of the program.

Vikram Adve (University of Illinois at Urbana-Champaign)
- Co-creator of LLVM. Donald B. Gillies Professor of Computer Science. ACM Fellow. ACM Software System Award (2012).
- Directly responsible for the most successful IR in computing history. Post-LLVM trajectory demonstrates domain transfer: from compilers to security (SVA, SAFECode, ALLVM; USD 5.6M NSF/ONR) to agricultural AI (AIFARMS; USD 100M NIFA/NSF program). Proven ability to apply IR-based thinking to domains beyond compilers.
- IIT Bombay alumnus. Established track record of securing large-scale federal research funding.
- Complementary requirement: strong co-PI in curriculum and pedagogy.

Uday Bondhugula (Indian Institute of Science, Bangalore)
- Co-author of the foundational MLIR paper and contributor to its design. Professor in the Department of Computer Science and Automation, IISc.
- Deep expertise in multi-level IR design, the specific architectural innovation (dialects) most relevant to ECM's challenge of representing heterogeneous curricular traditions.
- Based at IISc Bangalore, in geographic and institutional proximity to EkStep Foundation (Sunbird/DIKSHA) and India's DPI architects. Provides a natural bridge between IR research and education infrastructure.
- Founder of Polymage Labs (compiler building blocks for AI). Understands research-to-deployment transition.
- Complementary requirement: strong co-PI in curriculum and pedagogy. India-based; AUDA-NEPAD coordination via program management.

Lesley Le Grange (Stellenbosch University, South Africa)
- Distinguished Professor, Department of Curriculum Studies. Vice-President of the International Association for the Advancement of Curriculum Studies (IAACS). Over 250 publications.
- The most internationally connected African curriculum scholar. Deep expertise in curriculum theory, decolonization of curriculum, and cross-cultural knowledge systems.
- Based in Africa. Understands the cultural and epistemological dimensions that the Curriculum IR must navigate.
- Complementary requirement: strong co-PI in IR architecture or computer science.

10.3 Project Management and Software Development: The Spix Foundation

Research produces knowledge. Deployment requires code. The Spix Foundation provides the engineering and project management capacity that bridges the two. The Spix Foundation's development team implements the Curriculum IR specification, builds the mapping tools, integrates the Sunbird taxonomy service, develops the LLM-based extraction pipeline, and delivers the open-source tooling that Ministries and courseware developers will use. (Organizational details in Appendix H.)

Project management is led by Jim Plamondon, CEO of the Spix Foundation, who managed multi-million-dollar research budgets at Microsoft Research, notably the program to integrate third-party programming languages into Microsoft's .NET Common Language Runtime and Visual Studio. That program required the same kind of coordination ECM demands: academic researchers defining the specification (the Common Language Infrastructure), industry partners implementing against it, and a central project management function ensuring that research insights translated into shipping infrastructure on a fixed timeline. The .NET CLR is itself built around an intermediate representation: the Common Intermediate Language (CIL), a virtual-machine IR to which multiple source languages compile. This makes the experience directly relevant to ECM's architecture.

10.4 Operational Structure

Core team (approximately 18 FTE across 48 months, scaling by phase): 8 FTE researchers (IR architecture, computational linguistics, curriculum studies), 4 FTE curriculum experts (digitization, mapping validation, Ministry liaison), and 6 FTE software engineers (IR implementation, mapping tools, platform integration). Staffing levels vary by phase: Phase 1 emphasizes research and digitization; Phase 2 emphasizes tooling, training, and deployment.

Budget allocation model: Approximately 55–60% of the program budget flows to personnel and university partner subgrants (researcher salaries, curriculum expert fees, field work). The remaining 40–45% covers infrastructure, LLM costs, program management, travel, evaluation, and contingency.

Partner selection: University research partners are selected based on: demonstrated expertise in IR architecture, computational linguistics, or curriculum studies; presence in or partnership with African or Indian institutions; prior experience with applied (deployment-oriented) research; and ability to meet open-source and open-access requirements.

Principal Investigator responsibilities: The Lead PI is responsible for Curriculum IR design, validation methodology, and overall technical research direction. Each co-PI is responsible for their domain's research quality, deliverable acceptance criteria, and publication. All PIs report to the program's governance structure and to the Gates Foundation through quarterly reports.

Quality assurance: Each phase undergoes independent external evaluation (Month 24 and Month 48). Subgrant agreements include deliverable acceptance criteria, milestone-based disbursement, and financial audit provisions. The desk pilot at Months 1–6 provides an early feasibility checkpoint before the program's full investment.

10.5 Recommended Structure: Co-PI Team

The program should be led by a co-PI team:

Lead PI (IR Architecture): Vikram Adve or Uday Bondhugula. Responsible for Curriculum IR design, validation methodology, and technical research direction.
Co-PI (African Curriculum): Lesley Le Grange or a senior researcher from the ACQF team. Responsible for curriculum digitization, validation against African educational realities, and Ministry engagement.
Co-PI (Computational Linguistics): To be identified — a researcher with expertise in multilingual ontology alignment, LLM-based information extraction, and knowledge representation. Candidates may be drawn from IIIT Hyderabad's Language Technologies Research Center or from the broader NLP research community.
Project Management and Development: The Spix Foundation (see Section 10.3).

11. Institutional Partners

11.1 Lead Institution: AUDA-NEPAD

AUDA-NEPAD (African Union Development Agency) serves as the program's institutional home:

Continental legitimacy and alignment with the African Union's Continental Education Strategy for Africa (CESA 16-25).
Existing relationships with all 55 African Union member state Ministries of Education.
Coordination role in the African Continental Qualifications Framework (ACQF), the most relevant existing cross-country educational comparability effort in Africa.
Experience managing multi-country education programs, including the African School Curriculum Survey and the EdTech 2030 Vision.
Alignment with the foundation's emphasis on government systems partnerships and sovereign adoption.

11.2 African Partners

Institution	Country	Contribution
Kenya Institute of Curriculum Development (KICD)	Kenya	CBC curriculum source documents; curriculum expert validation; RESPECT pilot country
Ministry of Education, Liberia	Liberia	Liberian curriculum source documents; RESPECT pilot country; West African curriculum access
Ministry of Education, Eswatini	Eswatini	Eswatini curriculum source documents; RESPECT pilot country; Southern African curriculum access
Stellenbosch University	South Africa	Curriculum studies expertise; comparative curriculum analysis; validation research
University of Cape Town	South Africa	Education research; assessment validity
Makerere University	Uganda	Programming language and systems expertise (Bainomugisha); East African curriculum access
UNESCO-IBE Master's Programs	Senegal, Congo-Brazzaville, Mozambique	Trained curriculum specialists across West, Central, and Lusophone Africa

11.3 Indian Partners

Institution	Contribution
Indian Institute of Science (IISc), Bangalore	MLIR/IR research expertise (Bondhugula); proximity to EkStep/Sunbird ecosystem; Indian state curriculum digitization and mapping
EkStep Foundation	Sunbird taxonomy service expertise; DIKSHA implementation experience; open-source platform support
IIIT Bangalore (Center for Digital Public Infrastructure)	DPI architecture expertise; CDPI co-chaired by Pramod Varma; Indian state curriculum standards access and DPI-Ed integration

11.4 US Partners

Institution	Contribution
University of Illinois at Urbana-Champaign	LLVM/IR architecture expertise (Adve); large-scale research program management
The Spix Foundation	Project management and software development (see Section 10.3)

11A. Institutional Outputs

ECM produces governance and standards bodies that outlive the research program:

SOCLE Board (Standard for Open Curriculum Logic in Education): An expert body that maintains and evolves the Curriculum IR specification. Based in the Gulf for political neutrality. The SOCLE Board's authority derives from the African Union's endorsement of the curriculum standard; it operates as a technical standards body analogous to W3C working groups. It does not set curriculum policy — that authority remains with national governments and the African Union. The SOCLE Board maintains the standard's technical integrity, manages versioning and evolution, and adjudicates contestability challenges from participating Ministries.
SOCLE Compliance Auditors' Professional Association: The SOCLE Board establishes a professional certification program for SOCLE Compliance Auditors — certified professionals who validate that a Ministry of Education's CuIR expression of its curriculum standards complies with SOCLE Board standards. The Body of Knowledge is derived from Tranche 1 Mapper experience. CuIR compliance certification enables the cross-jurisdictional comparability that underpins Results-Based Finance for Education (RBF4Ed), providing the SOCLE Board with a self-sustaining revenue model. First cohort certified by end of Year 4.
Mapper transition pathway: RESPECT Certified Mappers (the manual curriculum alignment professionals active during Years 1–4) are the natural first cohort of SOCLE Compliance Auditors. The Mapper profession is phase-limited by design; the SOCLE Compliance Auditor role is its permanent successor.

ECM's role is to produce the research and infrastructure that these institutions require; it does not govern them after handoff. Governance authority flows from the Breakthrough System's established structures.

11B. Standards-Based Interoperability

All ECM research outputs will comply with or align to relevant standards. The distinction matters: "comply with" means ECM will implement the standard and test/certify against it; "align to" means ECM will follow the standard's design principles and interoperate with its interfaces, adapting where the standard does not fully address African educational contexts.

CASE (Competency and Academic Standards Exchange) for curriculum standards representation — comply with. All digitized curricula will be published in CASE-compliant format. Testable against CASE conformance requirements.
IEEE LOM for learning object metadata — align to. The Curriculum IR extends IEEE LOM with concept-level semantics, dialect support, and weighting metadata that IEEE LOM does not provide.
GovStack building block interfaces for DPI-Ed interoperability — align to. ECM outputs will follow GovStack's building block interface patterns. Full compliance depends on GovStack's education-sector specifications, which are still evolving.
Sunbird taxonomy service for platform infrastructure — comply with. ECM adopts Sunbird's taxonomy service as its platform layer and will maintain compatibility with Sunbird's API specifications.
CRADLE's federated data governance framework for research data access — comply with. All ECM research data access will go through CRADLE's tiered governance process where applicable.

12. Budget Framework

12.1 Summary

Category	Amount (USD)
Personnel (PI team, researchers, curriculum experts, developers)	4,000,000
Curriculum digitization (6 African countries + 6 Indian states, K–3, math + literacy)	800,000
Desk pilot (Phase 1 proof-of-concept: 2 curricula, 50 concepts each)	300,000
Infrastructure (Sunbird adaptation, EdGate license, cloud computing, tools)	650,000
LLM costs (API usage for concept extraction and alignment, 48 months)	450,000
Partner institution subgrants (African and Indian universities)	1,400,000
Travel and convening (Ministry engagement, connectathons, workshops)	750,000
Program management and administration (AUDA-NEPAD + Spix Foundation)	900,000
Independent evaluation (external evaluator, 3 assessments)	275,000
Contingency (5%)	475,000
Total	10,000,000
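
The summary figures can be checked with simple arithmetic. The sketch below sums the category amounts from the table above, confirms that they match the Phase 1 and Phase 2 amounts in Section 12.2, and prints each category's share of the total. The shortened category labels are an editorial convenience; the amounts are those stated in this section.

```python
budget_usd = {
    "Personnel": 4_000_000,
    "Curriculum digitization": 800_000,
    "Desk pilot": 300_000,
    "Infrastructure": 650_000,
    "LLM costs": 450_000,
    "Partner institution subgrants": 1_400_000,
    "Travel and convening": 750_000,
    "Program management and administration": 900_000,
    "Independent evaluation": 275_000,
    "Contingency": 475_000,
}
phase_usd = {"Phase 1": 5_500_000, "Phase 2": 4_500_000}

total = sum(budget_usd.values())

# Category totals and phase totals must both equal the requested amount.
assert total == sum(phase_usd.values()) == 10_000_000

for category, amount in budget_usd.items():
    print(f"{category:<40} {amount:>12,} {amount / total:6.1%}")
print(f"{'Total':<40} {total:>12,}")
```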

12.2 Budget by Phase

Phase	Duration	Amount (USD)	Key Activities
Phase 1: Research + Validation	Months 1–24	5,500,000	Desk pilot, IR v0.1→v0.2, 12 curricula digitized and mapped, courseware partnerships, validation study
Phase 2: Deployment + Operational Readiness	Months 25–48	4,500,000	IR v1.0, tools delivered, Ministry training, governance framework, end-to-end deployment, scaling plan, sustainability transition

Funding is structured as staged commitments with a go/no-go gate (see Section 13).

12.3 Budget Rationale

The personnel budget assumes approximately 8 FTE researchers, 4 FTE curriculum experts, and 6 FTE software engineers across all partner institutions over 48 months, with staffing levels varying by phase. A detailed staffing plan (Appendix D) will be developed by the Spix Foundation during Phase 1.

Based on the project's Build vs. Buy analysis (Appendix A), the hybrid Build & Buy strategy reduces the estimated cost from USD 8–12 million for a pure-build approach to USD 10 million with adopted infrastructure. The 48-month timeline is aggressive relative to the TCP/IP precedent (7 years from Cerf and Kahn's 1974 paper to RFC 791 in 1981) and the FHIR precedent (7 years from first proposal to normative standard), and comparable to the LLVM precedent (about 3 years from the project's start in 2000 to the 1.0 release in 2003). Four factors justify the pace: (a) the architectural pattern is well understood and the program builds on existing IR design knowledge from three proven open-source precedents whose production codebases are directly accessible; (b) AI-assisted research and development tooling, from LLM-based concept extraction to AI-accelerated software engineering, compresses technical milestones, enabling the team to model the Curriculum IR compiler and mapping tools directly on LLVM, FHIR, and TCP/IP architectures; (c) time saved on technical milestones is reinvested in institutional readiness (Ministry engagement, governance formation, and Mapper certification), which is the rate-limiting factor for adoption; (d) Africa's education crisis demands urgency: the children currently in K–3 will age out of foundational learning within 36 months.

12.4 Assumptions and Bounds

The budget and timeline rest on the following assumptions. If an assumption proves false, the corresponding bound applies.

Assumption	Bound (what ECM is not promising)
LLM-based concept extraction achieves an F1 score of ≥85% with human validation	If accuracy is lower, human expert effort increases; the budget absorbs this through the contingency allocation. ECM does not promise fully automated extraction.
At least 6 African countries' curriculum documents are accessible through AUDA-NEPAD's Ministry relationships	If fewer are accessible, ECM substitutes additional Indian state curricula or other available African curricula. The Curriculum IR's validity depends on typological diversity, not on specific countries.
The Sunbird taxonomy service is adaptable for multi-country use within the budgeted infrastructure allocation	If adaptation proves more complex, ECM builds a lightweight alternative using the same API specification.
University partner institutions can recruit and retain qualified researchers within project budgets	If recruitment proves difficult, the co-PI structure provides redundancy: the program can proceed with any two of the three domain leads.
48 months is sufficient to reach Curriculum IR v1.0 with operational mapping tools	Phase 1 alone (24 months) produces 12 digitized curricula, a validated IR v0.2, and open-source mapping tools — valuable even if Phase 2 requires extension.
ECM does not promise that the Curriculum IR will replace all manual curriculum mapping by Month 48	The IR reduces cost and enables automation; expert validation remains part of the process. The goal is O(Apps+Standards) cost structure, not zero human involvement.
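
The O(Apps+Standards) cost structure referred to in the table above can be illustrated by counting mappings. The sketch compares the number of mappings needed when every application is mapped to every curriculum directly against the number needed when each side maps once to the Curriculum IR. The scenarios are illustrative: 3 and 5 applications reflect the 3–5 courseware partners, 12 is the number of program jurisdictions, and 44 assumes one curriculum per country at the scaling target.

```python
def pairwise_mappings(apps: int, standards: int) -> int:
    """Every application mapped separately to every curriculum."""
    return apps * standards


def ir_mappings(apps: int, standards: int) -> int:
    """Each application and each curriculum mapped once to the shared IR."""
    return apps + standards


scenarios = [(3, 12), (5, 12), (5, 44)]
for apps, standards in scenarios:
    print(
        f"{apps} apps x {standards} curricula: "
        f"{pairwise_mappings(apps, standards)} pairwise vs "
        f"{ir_mappings(apps, standards)} via the IR"
    )
```

The counts treat every mapping as equal effort, which is a simplification; the point is only that the gap widens as either side grows.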

13. Evaluation and Go/No-Go Criteria

13.1 Independent Evaluation

The program will be independently evaluated at the end of Phase 1 and Phase 2 by an external evaluator nominated by the Gates Foundation during Phase 1. Evaluation criteria include: mapping accuracy against expert ground truth, time and cost per mapping, usability of mapping tools for Ministry personnel, effectiveness of contestability mechanisms, and courseware developer adoption rates.

13.2 Go/No-Go Gate

Phase 1 → Phase 2 gate (Month 24): The formal validation study achieves ≥85% concept-level accuracy across 12 curricula. At least 3 courseware developers have mapped content to the Curriculum IR. The Curriculum IR v0.2 specification is published. All 12 curricula are digitized and mapped. Peer-reviewed results are submitted for publication. If accuracy falls below 50%, the program convenes a technical review to determine whether the IR architecture requires fundamental revision or the approach is non-viable. Between 50% and 85%, the program may extend Phase 1 by up to 6 months for architectural refinement before re-evaluation.
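
The gate rule can be written as a small decision function, sketched here under stated assumptions. The accuracy thresholds and deliverable counts come from the paragraph above. The text does not say what happens when accuracy reaches 85% but another deliverable is outstanding, so the `HOLD` outcome is a hypothetical placeholder, as are all function and parameter names.

```python
from enum import Enum


class GateOutcome(Enum):
    PROCEED = "release Phase 2 funding"
    EXTEND = "extend Phase 1 by up to 6 months, then re-evaluate"
    REVIEW = "convene technical review of IR viability"
    # Assumption: this case is not defined in the proposal text.
    HOLD = "accuracy met, other deliverables outstanding"


def phase1_gate(
    accuracy: float,
    courseware_mappings: int,
    curricula_mapped: int,
    ir_v02_published: bool,
    results_submitted: bool,
) -> GateOutcome:
    """Apply the Month 24 thresholds to the validation-study results."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    if accuracy < 0.50:
        return GateOutcome.REVIEW
    if accuracy < 0.85:
        return GateOutcome.EXTEND
    deliverables_met = (
        courseware_mappings >= 3
        and curricula_mapped >= 12
        and ir_v02_published
        and results_submitted
    )
    return GateOutcome.PROCEED if deliverables_met else GateOutcome.HOLD


print(phase1_gate(0.88, 3, 12, True, True).value)
print(phase1_gate(0.72, 3, 12, True, True).value)
print(phase1_gate(0.41, 2, 12, True, False).value)
```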

13.3 Reporting

The program provides quarterly progress reports to the Gates Foundation, including: milestones achieved, accuracy metrics, budget expenditure, and risk register updates. A comprehensive mid-term review is conducted at the Phase 1 gate (Month 24).

13.4 What If the Curriculum IR Approach Proves Non-Viable?

If the Curriculum IR fails to achieve ≥50% accuracy at the Phase 1 gate, the program will have produced three outputs with independent value: (a) 12 digitized curricula in CASE-compliant format, including the first machine-readable African curriculum canon; (b) a rigorous empirical assessment of the IR hypothesis, informing future research directions; (c) an LLM-based concept extraction pipeline with documented accuracy metrics. The digitized curricula and extraction pipeline serve the broader DPI-Ed ecosystem regardless of the IR's ultimate viability.

14. Risk Mitigation

Risk	Mitigation
IR architecture proves too lossy for foundational subjects	Phase 1 targets math and literacy (K–3), where cross-curricular concept overlap is highest and the IR approach is on strongest theoretical ground. Desk pilot validates before full investment.
Curriculum source documents unavailable or incomplete	AUDA-NEPAD's Ministry relationships provide direct access to African curricula; IISc and IIIT Bangalore provide access to Indian state curricula; KICD and other national/state curriculum bodies are named partners
Ministry engagement insufficient for validation	The program includes funded Ministry training and engagement activities; AUDA-NEPAD's existing relationships de-risk sovereign participation. Ministry adoption is incentivized by three outputs: free curriculum digitization, access to the full courseware network, and the eventual ability to produce their own curriculum-to-IR mappings — the foundation for cross-jurisdictional comparability that underpins RBF4Ed funding. Ministries are not burdened with approving individual courseware alignments.
PI recruitment contingent on institutional commitment	AUDA-NEPAD's endorsement is pursued first; PI recruitment follows. Co-PI structure with multiple candidates per domain provides redundancy; the program can proceed with any two of the three domain leads
LLM accuracy insufficient for production use	Human-in-the-loop validation is built into the design at every stage. LLMs accelerate expert judgment through automation of initial concept extraction and alignment suggestion.
Lock-in risk (premature standardization)	Explicit versioning from v0.1; sunset mechanisms for early mappings; open-source licensing (Apache 2.0) prevents single-institution control
48-month timeline proves insufficient	Phase structure allows useful outputs at each stage; Phase 1 alone (24 months) produces 12 digitized curricula, a validated IR v0.2, and open-source mapping tools — valuable even if Phase 2 requires extension
LLMs become accurate enough to map curricula without an IR	The Curriculum IR provides three capabilities that direct LLM mapping lacks: (a) governance and contestability (Ministries can audit mappings against a published specification); (b) compositionality (new curricula and courseware connect to the full network); (c) institutional permanence (the IR persists across LLM model generations). The IR and LLMs are complementary.
Ministries or courseware developers do not adopt the tools	Phase 1 includes partnership with 3–5 courseware developers and direct Ministry engagement in 12 jurisdictions. Phase 2 measures adoption rates. AUDA-NEPAD's relationships with all 55 AU Ministries provide the sovereign engagement pathway.

14.1 Degraded Operations: What Ships If Dependencies Slip

ECM's principal external dependencies are AUDA-NEPAD's Ministry access (for curriculum documents) and the Sunbird taxonomy service (for platform infrastructure). Neither is a hard blocker.

If Ministry access is delayed in specific countries:

ECM substitutes available curricula from other African countries or additional Indian states. The Curriculum IR's validity depends on typological diversity across curriculum traditions — competency-based, content-standards, outcomes-based — not on specific countries. Delayed countries are added when access is secured.

If Sunbird adaptation proves more complex than expected:

ECM builds a lightweight taxonomy management layer using the same API specification, deferring full Sunbird integration. The Curriculum IR specification is platform-agnostic; any conformant taxonomy service can host it.

If LLM accuracy falls short:

Human expert effort increases for concept extraction and alignment suggestion. The budget's contingency allocation absorbs additional expert time. The program shifts from "AI-accelerated with human validation" to "human-led with AI assistance" — slower, but the IR is still produced.

14.2 Go/No-Go Gates

Gate	Timing	Condition	Action if Not Met
Desk pilot checkpoint	Month 6	Prototype Curriculum IR constructed; concept-mapping accuracy measured against expert ground truth for 2 curricula	If results are negative, the program pivots the design before committing Phase 1's full investment.
Courseware validation	Month 18	At least 1 courseware developer has mapped content to the Curriculum IR with measurable results	If no developer has adopted the tooling, the program intensifies partnership efforts and adjusts tooling priorities.
Phase 1 → Phase 2 release	Month 24	≥85% concept-level accuracy across 12 curricula; at least 3 courseware-to-IR mappings; IR v0.2 published; all 12 curricula digitized and mapped	If accuracy is ≥50% but <85%: extend Phase 1 by up to 6 months for architectural refinement. If accuracy is <50%: convene a technical review to assess viability.
Phase 2 completion	Month 48	IR v1.0 published; operational tools delivered; at least 3 courseware applications deploying across all 12 jurisdictions	Joint funder-program review; scope and timeline adjustment for any incomplete deliverables.
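
The Month 24 decision rule can be sketched in a few lines of code. This is an illustrative sketch only: the class name, function name, and action labels are hypothetical, and the table does not say what happens when accuracy reaches 85% but another condition is unmet, so the sketch flags that case rather than inventing an action.

```python
from dataclasses import dataclass


@dataclass
class Phase1Status:
    accuracy: float            # concept-level accuracy as a fraction, e.g. 0.87
    courseware_mappings: int   # completed courseware-to-IR mappings
    ir_v02_published: bool     # IR v0.2 has been published
    curricula_mapped: int      # curricula digitized and mapped to the IR


def phase1_gate(status: Phase1Status) -> str:
    """Return a hypothetical action label for the Month 24 gate."""
    if not 0.0 <= status.accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    if status.accuracy < 0.50:
        return "convene_technical_review"
    if status.accuracy < 0.85:
        return "extend_phase1_up_to_6_months"
    other_conditions_met = (
        status.courseware_mappings >= 3
        and status.ir_v02_published
        and status.curricula_mapped >= 12
    )
    # The gate table defines no action for high accuracy with another
    # condition unmet, so this sketch reports that case instead of guessing.
    return "release_phase2" if other_conditions_met else "not_specified_in_table"


print(phase1_gate(Phase1Status(0.87, 3, True, 12)))  # release_phase2
print(phase1_gate(Phase1Status(0.70, 3, True, 12)))  # extend_phase1_up_to_6_months
```

Encoding the thresholds this way makes the boundary cases explicit: exactly 85% releases Phase 2, and exactly 50% triggers an extension rather than a technical review.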

15. Expected Outcomes and Impact

15.1 Direct Outputs

Curriculum IR v1.0 specification — open-source (Apache License 2.0), peer-reviewed, with governance framework.
6 African national curricula and 6 Indian state curricula digitized in CASE-compliant machine-readable format (K–3, math and literacy). The first machine-readable canon of African curriculum standards and a significant addition to Indian curriculum interoperability.
Validated crosswalks mapping all 12 curricula to the Curriculum IR.
Open-source mapping tools for Ministries (standards-to-IR) and courseware developers (content-to-IR).
Validation dataset providing ground-truth mappings for ongoing accuracy measurement.
Desk pilot report (Phase 1) providing the first empirical assessment of Curriculum IR viability.
Peer-reviewed publications establishing the Curriculum IR as a research contribution.
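
How the validation dataset might support accuracy measurement can be illustrated with a short sketch. It assumes, purely for illustration, that concept-level accuracy means the share of expert-labeled curriculum standards whose proposed IR node matches the expert's choice; this section does not define the metric, and the identifiers below are placeholders rather than real curriculum codes.

```python
def concept_accuracy(predicted: dict[str, str], ground_truth: dict[str, str]) -> float:
    """Share of ground-truth standards whose predicted IR node matches the expert label.

    Assumed definition for illustration; a standard missing from `predicted`
    counts as incorrect.
    """
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    correct = sum(
        1 for standard, node in ground_truth.items()
        if predicted.get(standard) == node
    )
    return correct / len(ground_truth)


# Placeholder identifiers, not real curriculum codes or IR nodes.
expert_labels = {"std-001": "ir-A", "std-002": "ir-B", "std-003": "ir-C", "std-004": "ir-D"}
llm_suggestions = {"std-001": "ir-A", "std-002": "ir-B", "std-003": "ir-X", "std-004": "ir-D"}

print(concept_accuracy(llm_suggestions, expert_labels))  # 0.75
```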

15.2 Beneficiary Population

The 6 African countries and 6 Indian states targeted in this program collectively serve tens of millions of K–3 students. At USD 10 million for infrastructure serving this population, the per-student investment is under USD 1, and it amortizes toward zero as additional courseware applications and jurisdictions join the network. Spread across full AU-wide deployment (approximately 170 million K–3 children across 55 member states), the same USD 10 million comes to roughly USD 0.06 per student, below USD 0.10; the per-country extension costs estimated in Section 15.4 are additional.
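
The per-student figure follows from simple division. The sketch below uses only the two numbers stated in this proposal (the USD 10 million request and the approximate AU-wide K–3 population); the function name is illustrative, and per-country extension costs are not included.

```python
GRANT_USD = 10_000_000        # requested funding, from this proposal
AU_K3_CHILDREN = 170_000_000  # approximate AU-wide K-3 population, from this section


def per_student_cost(total_usd: float, students: int) -> float:
    """Spread a fixed infrastructure cost evenly over a student population."""
    if students <= 0:
        raise ValueError("students must be positive")
    return total_usd / students


cost = per_student_cost(GRANT_USD, AU_K3_CHILDREN)
print(f"USD {cost:.3f} per student")  # USD 0.059 per student
```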

15.3 Downstream Impact

If the Curriculum IR achieves its target of ≥85% concept-level accuracy for foundational literacy and numeracy across 12 curricula (6 African, 6 Indian), it will:

Enable any courseware application to deploy across 12 jurisdictions by mapping once to the Curriculum IR — reducing per-jurisdiction mapping cost from months of expert effort to automated alignment with human validation.
Provide the basis for automated, curriculum-aligned assessment, which is the foundation for measuring and comparing learning outcomes at continental scale, enabling Results-Based Finance for Education (RBF4Ed) at a projected benefit of approximately USD 35 per child per year (see Essay 07, "Making Education Outcomes Finance-Grade").
Establish the first machine-readable standard for African curriculum representation and demonstrate cross-continental interoperability with Indian state curricula, creating a de facto infrastructure that additional countries and states can adopt.
Position AUDA-NEPAD and the African Union, alongside Indian DPI-Ed partners, as the global originators of curriculum interoperability infrastructure — a contribution that other regions (Latin America, Southeast Asia) can subsequently adopt.
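
The "map once" claim in the first item above can be made concrete by counting mappings. The sketch below is a back-of-the-envelope illustration, assuming 3 courseware applications (the Phase 2 completion threshold) and the 12 program curricula; the function names are hypothetical.

```python
def pairwise_mappings(apps: int, standards: int) -> int:
    """Mappings needed when every application is aligned to every curriculum directly."""
    return apps * standards


def ir_mappings(apps: int, standards: int) -> int:
    """Mappings needed when each application and each curriculum maps once to a shared IR."""
    return apps + standards


apps, standards = 3, 12
print(pairwise_mappings(apps, standards))  # 36
print(ir_mappings(apps, standards))        # 15
```

The gap widens as the network grows: each new application adds one mapping under the IR approach, versus one per curriculum under direct alignment.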

Every year without curriculum interoperability infrastructure is another year in which the 90% illiteracy rate compounds. The children currently in K–3 across Sub-Saharan Africa will age out of foundational learning within this program's 48-month timeline. ECM is the structural prerequisite for deploying effective courseware across African countries at a cost that education financing can sustain. The Curriculum IR makes this possible.

15.4 If the Program Exceeds Expectations

If Phase 2 validation achieves ≥85% accuracy, the immediate scaling path is: (a) extend to all 55 AU member states over 3–5 years, at an estimated cost of approximately USD 500,000 per additional country for digitization and mapping; (b) extend to secondary subjects (science, social studies) and upper grades; (c) invite non-African, non-Indian countries to contribute dialects to the Curriculum IR. The total estimated cost to reach all AU member states at K–3 is USD 30–35 million, fundable through a combination of Gates follow-on grants, GPE allocations, and Ministry co-funding.
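
The USD 30–35 million estimate can be checked with a short calculation. The sketch assumes that the total includes the initial USD 10 million grant and that the 6 program countries need no further digitization spending; neither assumption is stated explicitly in the text.

```python
AU_MEMBER_STATES = 55
COUNTRIES_IN_PROGRAM = 6                   # African countries covered by this grant
COST_PER_ADDITIONAL_COUNTRY_USD = 500_000  # estimate from this section
INITIAL_GRANT_USD = 10_000_000             # assumed to be part of the total


def au_wide_k3_cost() -> tuple[int, int, int]:
    """Return (additional countries, extension cost, total cost) in USD."""
    additional = AU_MEMBER_STATES - COUNTRIES_IN_PROGRAM
    extension = additional * COST_PER_ADDITIONAL_COUNTRY_USD
    return additional, extension, INITIAL_GRANT_USD + extension


additional, extension, total = au_wide_k3_cost()
print(f"{additional} countries, extension USD {extension:,}, total USD {total:,}")
# 49 countries, extension USD 24,500,000, total USD 34,500,000
```

Under these assumptions the total lands at the upper end of the stated USD 30–35 million range.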

16. Sustainability and Scaling

16.1 Post-Grant Institutional Home

Following the 48-month program, Curriculum IR governance transfers to the proposed SOCLE Board (Standard for Open Curriculum Logic in Education), based in the Gulf for political neutrality. The Gulf is chosen because the Curriculum IR is designed as global infrastructure — serving African, Indian, and eventually Latin American, Southeast Asian, and other jurisdictions — and its governance body should be perceived as neutral by all participating regions. During the 48-month research phase, AUDA-NEPAD retains operational leadership; the transition to the Gulf-based SOCLE Board occurs as part of Phase 2's sustainability transition.

The GEOS Organization (also proposed for the Gulf) governs outcome certification for Results-Based Finance for Education. The GEOS Organization also assumes responsibility for training and certifying curriculum mapping auditors worldwide — a natural extension of its quality-assurance mandate, since the Curriculum IR mapping is upstream in the same pipeline as outcome certification.

16.2 Funding Mechanism and Global Scaling

The key post-grant challenge is scaling beyond the 12 research jurisdictions to the approximately 1,000 curriculum jurisdictions worldwide that will eventually need certified curriculum-to-IR mappings.

Infrastructure maintenance is funded through three channels: (a) the RESPECT Ecosystem Fund (see Essay 24), which allocates a percentage of platform-wide transaction fees to ecosystem maintenance; (b) successor grants or GPE allocations for curriculum update cycles; (c) AUDA-NEPAD's recurring continental education infrastructure budget for African-specific operations. Estimated annual maintenance cost: USD 500,000–800,000 for specification updates, curriculum re-digitization cycles, and tool maintenance.

Mapper training and certification is the larger scaling challenge. For every curriculum jurisdiction that participates in the Curriculum IR (ultimately numbering in the hundreds or thousands), Ministry personnel must be trained to produce certification-ready curriculum-to-IR mappings of their constantly evolving standards, especially as the Curriculum IR itself evolves across versions. The GEOS Organization governs this training and certification function, analogous to its role in certifying outcome assessors ("GEOSors"; see Essay 07). This is pipeline integration: GEOS certifies that a jurisdiction's curriculum-to-IR mapping meets quality standards (upstream) and separately certifies that learning outcomes measured through that mapping meet finance-grade evidence standards (downstream). The training and certification program is funded through certification fees, scaled to jurisdiction size, and cross-subsidized by RBF4Ed transaction fees flowing through the RESPECT Ecosystem.

16.3 Dissemination

Research outputs will be disseminated through: (a) peer-reviewed publications in education technology, computational linguistics, and standards interoperability venues; (b) presentation at AUDA-NEPAD's education technology convenings; (c) open-source release of all specifications, tools, and datasets on GitHub under Apache License 2.0; (d) a public-facing project website with documentation for Ministries and courseware developers.

16.4 Intellectual Property

The PREMIER Institute owns all intellectual property resulting from ECM research. Funding partners receive a worldwide, paid-up, royalty-free, sub-licensable, non-exclusive license to all such IP. Code and specifications (the Curriculum IR specification, mapping tools, validation infrastructure) are released under the Apache License 2.0. Creative works (documentation illustrations, training materials artwork) are released under the appropriate Creative Commons license. University research partners retain academic publication rights; all code and infrastructure deliverables are owned by the PREMIER Institute.

Digitized curricula are published under open licenses; the underlying curriculum content remains the intellectual property of the issuing government. EdGate's licensed database remains proprietary; the Curriculum IR specification is designed to function independently of any proprietary data source. The EdGate license should be structured to terminate when EdGate's relevant patents expire (expected during the program's timeline), unless EdGate can demonstrate independent copyright or other IP protection for its database content. Post-grant, the Spix Foundation assumes the recurring license cost for the duration of the license.

17. AU Mandate Alignment

ECM — building the Curriculum Intermediate Representation for Africa's DPI-Ed — addresses the following AU provisions:

Dec.973, para 23 — DPI-Ed investment: ECM's Curriculum Intermediate Representation is a foundational component of digital public infrastructure for education — the machine-readable curriculum layer that enables automated content alignment and certification.
AU DES, SO2 — Digital content and platforms: ECM enables curriculum-aligned digital content by creating the machine-readable specification that content must align to. Without ECM's intermediate representation, SO2's vision of curriculum-aligned content at continental scale cannot be realized.
AU DES, SO4 — Data management and analytics (EMIS 2.0): ECM's structured curriculum data is a prerequisite for the individual-level learning outcome measurement that EMIS 2.0 requires.
CESA 26–35, SA1/Obj 2 — Upgrade curricula: ECM creates the technical infrastructure for harmonizing curriculum development at continental scale, directly enabling CESA's call for regional and continental curriculum harmonization, especially for foundational education standards.
CESA 26–35, SA3/Obj 7 — Foundational learning: ECM's K–3 focus in Phase 1 ensures that curriculum mapping resources are concentrated where the learning crisis is most acute.
STISA-2034, SP1 — Accelerating sustainable and inclusive industrialization: ECM's open-source, reusable curriculum architecture lowers the cost of content development across Africa, supporting the industrial-scale production of educational technology.

18. Conclusion

ECM is AI-era education infrastructure. The Curriculum IR applies a proven architectural pattern — demonstrated by TCP/IP for networking, LLVM for compilers, and FHIR for healthcare interoperability — to the structural barrier preventing digital courseware from reaching the children who need it most. The program's methodology reflects the same convergence: AI-assisted research and development tools compress the technical timeline, while the institutional architecture — AUDA-NEPAD coordination, Ministry engagement, SOCLE Board governance, and global Mapper certification — receives the sustained investment that determines whether infrastructure achieves adoption.

The 90% functional illiteracy rate among ten-year-olds in Sub-Saharan Africa is a structural failure requiring a structural solution. ECM provides that solution: a canonical intermediate layer that converts the prohibitive O(Apps×Standards) cost of curriculum mapping into sustainable O(Apps+Standards) infrastructure. The Curriculum IR makes it economically rational for courseware developers to serve every African country — and, through automated curriculum-aligned assessment, creates the evidentiary foundation for Results-Based Finance for Education at continental scale.

This research program will determine whether the Curriculum IR works. If it does, the tools, specifications, and institutional framework produced over 48 months will enable deployment across the continent and beyond. The architectural pattern has been proven three times. The AI-accelerated tooling to build it exists. The institutional partners are ready. The children are waiting.

Appendices

Appendix A: Detailed curriculum landscape analysis (available — see "Curriculum Mapping Landscape: Build vs. Buy Analysis for ECM")
Appendix C1: LLVM and IR precedent analysis (available — see "LLVM History and IR Approaches Across Domains")
Appendix C2: FHIR precedent analysis (available — see "FHIR History and Parallels to ECM")
Appendix C3: TCP/IP precedent analysis [to be developed]
Appendix D: Detailed budget and staffing plan [to be developed]
Appendix E: Letters of intent from partner institutions [to be obtained following AUDA-NEPAD endorsement]
Appendix F: CVs of proposed PI team [to be assembled]
Appendix G: Desk pilot results [to be produced during Phase 1; methodology to be included in final proposal]
Appendix H: Spix Foundation organizational profile [to be developed]