Proposed to: The Bill & Melinda Gates Foundation, Global Education Program
Proposed by: The Spix Foundation, for consideration by the African Union Development Agency (AUDA-NEPAD), in partnership with university research institutions in Africa, India, and the United States
Duration: 48 months (4 years)
Requested funding: USD 10 million (staged across two phases with a go/no-go gate)
Nine out of ten children in Sub-Saharan Africa cannot read a simple sentence by age ten. Digital courseware that teaches foundational literacy and numeracy exists, yet it cannot deploy across African countries because mapping courseware to each country's curriculum standards is manual, expert-dependent, and prohibitively expensive. This curriculum-mapping bottleneck is a structural barrier to continental-scale deployment of effective EdTech.
This proposal requests USD 10 million over 48 months to build Easy Curriculum Mapping (ECM): a Curriculum Intermediate Representation (Curriculum IR) that collapses the combinatorial cost of curriculum mapping from O(Apps×Standards) to O(Apps+Standards). The architectural pattern is proven: LLVM demonstrated it for compilers, TCP/IP for computer networking, and FHIR for healthcare interoperability. ECM applies the same structural insight to education. Phase 1 includes a funded desk pilot validating the concept against real curricula. Deliverables include the open-source Curriculum IR specification, 12 digitized curricula (6 African, 6 Indian), validated crosswalks, and mapping tools for Ministries and courseware developers. AUDA-NEPAD leads. The Spix Foundation provides project management and software development. University partners in Africa, India, and the United States provide research expertise. The program's technical methodology leverages AI-assisted research and development tooling, from LLM-based concept extraction to AI-accelerated software engineering modeled on the open-source LLVM, TCP/IP, and FHIR codebases. This compresses technical milestones and enables proportionally greater investment in the institutional architecture that determines long-term adoption.
The Gates Foundation commits USD 10M over 48 months, disbursed in two phases with a go/no-go gate at Month 24. Phase 1 (Months 1–24): USD 5.5M. Phase 2 (Months 25–48): USD 4.5M. Phase 2 funding is contingent on Phase 1 deliverables.
Phase 1 success is defined by the Month-24 Proof of Capability outcome set (Section 1B). If Phase 1 deliverables are met, Phase 2 funding is released. If they are not met, the program convenes a technical review to determine whether the IR architecture requires revision, the timeline requires extension, or the approach is non-viable (see Section 13.2).
The program provides quarterly progress reports to the Gates Foundation, including milestones achieved, accuracy metrics, budget expenditure, and risk register updates. AUDA-NEPAD provides institutional coordination. The Spix Foundation provides project management and software development. Independent evaluation is conducted at the Phase 1 gate (Month 24) and at program completion (Month 48) by an external evaluator nominated by the Gates Foundation.
The PREMIER Institute owns all intellectual property resulting from ECM research. The Gates Foundation, as funding partner, receives a worldwide, paid-up, royalty-free, sub-licensable, non-exclusive license to all such IP. Code and specifications are released under the Apache License 2.0. Creative works (illustrations, documentation artwork) are released under the appropriate Creative Commons license. University research partners retain academic publication rights; all code and infrastructure deliverables are owned by the PREMIER Institute.
Attribution is distinct from authority. Founder attribution and SOCLE Board hosting rights are recognition mechanisms; they confer no governance authority over curriculum standards, data access, or platform operations. National curriculum authority remains with Ministries of Education. Continental coordination authority remains with AUDA-NEPAD. Technical infrastructure authority remains with the RESPECT Platform's technical steward. The Curriculum IR does not set curriculum policy; it maps existing curricula as authored by sovereign governments.
The following concrete outcomes define Phase 1 success and gate Phase 2 funding:
Sub-Saharan Africa faces the world's most severe learning crisis. According to the World Bank's Learning Poverty indicator, functional illiteracy among ten-year-olds in the region stands at approximately 90%. The African Union has identified the elimination of learning poverty as a continental priority. The Gates Foundation's Global Education Program has identified foundational learning in Sub-Saharan Africa as a core priority, committing more than USD 240 million over four years (announced April 2025) to help 15 million children in Sub-Saharan Africa and India learn more effectively, and, with ADQ, an additional USD 40 million for responsible AI and EdTech deployment across Sub-Saharan Africa (announced December 2025).
Digital courseware addressing foundational literacy and numeracy exists and has demonstrated impact in controlled settings. The barrier to continental-scale deployment is the curriculum-mapping bottleneck described below.
Across Africa, an estimated 100 or more distinct national or sub-national curriculum standards govern learning expectations. These standards differ in conceptual decomposition, sequencing, representation conventions, linguistic realization, and cultural embedding. For digital courseware to be used in a country's public education system, it must be mapped to that country's curriculum standards.
Today, this mapping is performed manually by subject-matter experts, separately for each country, separately for each courseware application. If there are N courseware applications and M national curricula, the total mapping effort scales as O(Apps×Standards). Each new country requires N new mappings; each new application requires M new mappings. This combinatorial cost structure makes multi-country deployment economically irrational for all but the largest publishers.
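The scaling argument can be made concrete with a minimal sketch. The function names below are illustrative only and are not part of any ECM deliverable; the example figures (30 applications, 55 curricula) are the ones this proposal uses elsewhere for its cost counterfactual.

```python
def pairwise_mappings(apps: int, standards: int) -> int:
    """Mappings needed when every application is mapped directly to every standard."""
    return apps * standards


def ir_mappings(apps: int, standards: int) -> int:
    """Mappings needed when each application and each standard maps once to a shared IR."""
    return apps + standards


# Illustrative figures: 30 courseware applications, 55 national curricula.
n_apps, n_standards = 30, 55
print(pairwise_mappings(n_apps, n_standards))  # 1650
print(ir_mappings(n_apps, n_standards))        # 85

# Adding one more application costs 55 new mappings pairwise, but only 1 via an IR.
print(pairwise_mappings(n_apps + 1, n_standards) - pairwise_mappings(n_apps, n_standards))  # 55
print(ir_mappings(n_apps + 1, n_standards) - ir_mappings(n_apps, n_standards))              # 1
```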
At realistic rates using African curriculum specialists, expert-produced curriculum mapping costs an estimated USD 1,000–3,500 per application per country, depending on subject scope and curriculum complexity. (This estimate is based on 20–80 hours of curriculum specialist time at USD 30–50/hour for standards analysis, content mapping, gap analysis, and quality assurance, plus project management and coordination overhead, consistent with African education consultant market rates.) The counterfactual at scale is decisive: covering 55 AU member states for 30 courseware applications through manual mapping requires 1,650 separate mappings and would cost USD 1.7–5.8 million per mapping cycle. This cost recurs every time a curriculum is revised or a courseware application updates its content. National curricula are typically revised on 5–10 year cycles; courseware applications update far more frequently. Each revision triggers a new round of manual re-mapping across all affected jurisdictions and applications. Over a decade with expansion to 1,000+ global jurisdictions, manual mapping costs compound into hundreds of millions of dollars, with no infrastructure, no machine-readable standards, and no path to automation.
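The per-cycle figure follows directly from the unit-cost range. The short calculation below reproduces it as a sanity check; the function name is hypothetical, and the inputs are the estimates stated in the paragraph above, not measured costs.

```python
def manual_mapping_cost(apps: int, jurisdictions: int,
                        cost_low: int, cost_high: int) -> tuple[int, int]:
    """Return the (low, high) total cost in USD of one full manual mapping cycle."""
    pairs = apps * jurisdictions  # one mapping per application per jurisdiction
    return pairs * cost_low, pairs * cost_high


low, high = manual_mapping_cost(apps=30, jurisdictions=55,
                                cost_low=1_000, cost_high=3_500)
print(f"USD {low:,} to USD {high:,}")  # USD 1,650,000 to USD 5,775,000
```

Rounded to one decimal place, this gives the USD 1.7–5.8 million range quoted above.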
The Curriculum IR transforms this cost structure. When a Ministry revises its national curriculum, the Ministry re-maps its standards to the Curriculum IR once; all courseware applications connected to the Curriculum IR receive the updated alignment automatically. When a courseware application updates its content, it re-maps to the Curriculum IR once; all jurisdictions receive the updated alignment automatically. An application mapped to an earlier version of the Curriculum IR can still be aligned to current national standards through the IR's versioning and transformation mechanisms. That alignment is imperfect, but it comes at zero marginal cost, whereas the current system requires a fresh manual mapping. The Curriculum IR converts a recurring, multiplicative expenditure into a one-time infrastructure investment that amortizes as the number of applications and jurisdictions grows.
The mapping bottleneck is compounded by a digitization gap. No African country has published its curriculum standards in an internationally interoperable machine-readable format. South Africa's CAPS exists as PDF documents. Kenya's CBC is digitized for internal KICD use in a format specific to that institution. The African Union's Decade of Education (2025–2034) has yet to address curriculum digitization.
The ECM program must therefore solve both problems simultaneously: build the Curriculum IR and produce the machine-readable curricula it requires as inputs.
Four developments have converged to make ECM feasible today:
LLM-based concept extraction has reached usable accuracy. Current research reports F1 scores of up to 89% for goal-to-skill matching using large language models, with human validation. Five years ago, automated concept extraction from curriculum documents was impractical. Today, LLMs enable the extraction pipeline that the Curriculum IR depends on.
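For readers less familiar with the metric, the sketch below shows how an F1 score could be computed when model-proposed goal-to-skill matches are compared with expert-validated ones. It is an illustration of the standard definition (the harmonic mean of precision and recall), not the evaluation protocol of the cited research; the identifiers `G1`, `S1`, and so on are hypothetical.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Harmonic mean of precision and recall for predicted vs. expert-validated pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0  # also covers empty inputs
    precision = true_positives / len(predicted)  # share of proposals that are correct
    recall = true_positives / len(gold)          # share of expert matches that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical (goal, skill) pairs: four expert-validated, four proposed by a model.
gold = {("G1", "S1"), ("G1", "S2"), ("G2", "S3"), ("G3", "S4")}
predicted = {("G1", "S1"), ("G2", "S3"), ("G3", "S4"), ("G3", "S5")}

print(f1_score(predicted, gold))  # 0.75
```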
India's Sunbird/DIKSHA has proven open-source education infrastructure at national scale. India's DPI ecosystem, specifically the Sunbird taxonomy service, provides a tested, MIT-licensed platform layer that ECM can adopt, dramatically reducing infrastructure development time.
AI-assisted research and development tooling has reached production capability. The Curriculum IR's software architecture can be modeled directly on the production codebases of LLVM, TCP/IP, and FHIR, open-source systems whose design patterns are well-documented and accessible to AI-accelerated development tools. Environments such as Claude Code, GitHub Copilot, and their successors compress the implementation timeline for the IR compiler, mapping tools, and validation infrastructure. This acceleration is methodologically significant: time saved on technical milestones is reinvested in the institutional work (Ministry engagement, governance formation, Mapper certification) that determines whether infrastructure achieves adoption. ECM is AI-era education infrastructure, built with AI-era tools.
The African Union's Decade of Education (2025–2034) has created institutional momentum. AUDA-NEPAD's mandate, the African Continental Qualifications Framework (ACQF), and the AU's renewed commitment to education reform provide the continental coordination structure that an IR-based approach requires.
At the foundational level, curriculum standards describe concepts. Representations vary, but the underlying concepts do not: the concept of number exists regardless of notation system, and phonemic awareness is a prerequisite for alphabetic decoding regardless of language. A canonical intermediate layer that captures concepts independently of any particular curriculum's representation enables a structural reduction in mapping cost.
ECM centers on a Curriculum IR that encodes learning concepts at a stable, representation-independent level. National curriculum standards map once to the Curriculum IR; digital courseware maps once to the same Curriculum IR. This converts the O(Apps×Standards) mapping problem into two linear processes, Standards-to-IR and Courseware-to-IR, yielding O(Apps+Standards) total cost.
The Curriculum IR is designed to interoperate with existing education metadata standards, including CASE (Competency and Academic Standards Exchange) for standards representation and IEEE LOM for learning object metadata. The Curriculum IR extends these standards with concept-level semantics, dialect support (see Section 3.4), and weighting metadata that existing frameworks do not provide.
This architectural pattern has been proven in three directly analogous domains.
TCP/IP introduced the Internet Protocol (IP) as a canonical intermediate layer between application protocols and network technologies. Before TCP/IP, each application had to be implemented separately for each network type, an N×M problem. IP collapses this: any application protocol maps to IP (N mappings), and IP maps to any network technology (M mappings), yielding O(N+M). Conceived by Vint Cerf and Robert Kahn in their 1974 paper "A Protocol for Packet Network Intercommunication" (IEEE Transactions on Communications), funded by DARPA, and formalized as RFC 791 (1981), TCP/IP became the mandatory standard for US defense networks on January 1, 1983 ("Flag Day") and the foundational infrastructure of the global internet. The IETF governs its evolution through an open, consensus-based standards process with no single controlling institution. (Full history in Appendix C3.)
LLVM introduced a compiler intermediate representation that enables any source language to compile to any hardware target through a single canonical layer. LLVM began as a graduate research project at the University of Illinois (2000), funded by NSF and DARPA, and became the standard infrastructure for new compiler and toolchain projects within a decade (ACM Software System Award, 2012). MLIR, a later project within the LLVM ecosystem, extends the IR concept through a "dialect" mechanism that enables domain-specific representations within a unified framework. (Full history in Appendix C1.)
FHIR introduced canonical resources, concept maps, and integration with the UMLS Metathesaurus to collapse the N×M problem across a dozen incompatible clinical coding systems. FHIR began as a volunteer initiative within HL7 International (2011), was accelerated by ONC's USD 15 million SMART on FHIR grant to Harvard/Boston Children's Hospital, became a normative standard in 2018, and was mandated for US healthcare systems by the ONC's 2020 Final Rule implementing the 21st Century Cures Act. (Full history in Appendix C2.)
All three precedents demonstrate that: (a) IR-based approaches work for complex, multi-stakeholder interoperability problems; (b) they require a governance body, an economic mandate, and sufficiently formal domain semantics; and (c) they can move from initial research to normative standard within 7–10 years, with deployable prototypes within 3–4 years.
MLIR's innovation (domain-specific "dialects" within a unified IR framework) directly addresses the most serious literature-based objection to a Curriculum IR (see Section 5). African curricula include competency-based frameworks (Kenya's CBC), content-standards frameworks (e.g., South Africa's CAPS), and outcomes-based frameworks (various). These are genuinely different organizational logics, each encoding distinct instructional commitments. A Curriculum IR with dialect support represents each curricular tradition in its own terms while enabling transformation between dialects, preserving cultural and pedagogical specificity while achieving interoperability.
Several African countries and most Indian states maintain sub-national curriculum variations (language-of-instruction differences, regional supplementary content). The Curriculum IR's dialect mechanism accommodates sub-national variation within the same national mapping.
Curriculum interoperability at this level has three prerequisites that did not co-exist until recently: (a) the foundational NLP/LLM technology for automated concept extraction from curriculum documents; (b) an institutional actor with both the continental mandate (AUDA-NEPAD) and the technical capacity (the MLIR/LLVM research community) to conceive the approach; and (c) an economic incentive at sufficient scale. The Global North's education systems, which fund most EdTech R&D, face the Apps×Standards problem at manageable scale (a few dozen curricula, mostly digitized). Africa's fragmentation is uniquely severe (an estimated 100+ curricula, none digitized in interoperable format), and Africa's institutions are uniquely positioned to solve it.
ECM is one component within a larger system. The Breakthrough System addresses four structural barriers to EdTech deployment in Africa: Policy, Technology, Data, and Economics (see Essay 07, "Making Education Outcomes Finance-Grade"). ECM helps to address the Technology Barrier (the curriculum-mapping bottleneck) and enables the Data Barrier to be addressed through automated, curriculum-aligned assessment.
Africa's Digital Public Infrastructure for Education (DPI-Ed) is the open-source infrastructure layer that produces continuous, curriculum-aligned, auditable learning evidence. The Spix Foundation's RESPECT system is the first reference implementation of Africa's DPI-Ed. ECM provides the curriculum interoperability layer within DPI-Ed, enabling courseware to connect to any participating country's curriculum through a single mapping.
The Breakthrough System employs a two-track strategy for curriculum mapping:
Track 1: RESPECT Certified Mappers (Years 1–4). While ECM is under development, human curriculum experts, the RESPECT Certified Mappers, perform manual, expert-validated curriculum alignments. RESPECT Certified Mappers are designed as a phase-limited profession: governance protocols include mandatory sunset clauses and transition pathways for Mappers into ECM-related auditing, standards-maintenance, and quality-assurance roles. (See Essay 23, "Mappers: Mapping Lessons to Curriculum Standards, Years 1–4.")
Track 2: ECM (Year 5+). By the end of Year 4, ECM is expected to deliver a deployable Curriculum IR and mapping toolset, collapsing the long-term cost of curriculum mapping. From Year 5 onward, ECM enables automated, curriculum-aligned assessment infrastructure, the foundation for cross-jurisdictional comparability that underpins Results-Based Finance for Education (RBF4Ed) at continental scale. (See Essay 22, "ECM: Mapping Lessons to Curriculum Standards, Year 5+.")
This 48-month research program spans Years 1 through 4, producing a deployable Curriculum IR by Month 48, in line with the Breakthrough System's timeline. Mapper-produced curriculum alignments from Track 1 serve as expert ground-truth data for ECM validation, providing a natural bridge between the transitional manual system and the automated IR-based system.
A national curriculum document (e.g., Kenya's CBC, K–3 Mathematics) enters the system as a PDF. The digitization pipeline converts it into a CASE-compliant machine-readable format. The ECM research team maps the national standards to the Curriculum IR, in consultation with Ministry of Education personnel and curriculum experts to ensure the mapping reflects the jurisdiction's understanding of its own standards. A courseware application (e.g., a numeracy app) independently maps its lesson content to the same Curriculum IR. The crosswalk between the curriculum and the courseware is computed automatically from these two independent mappings. Ministries retain the right to challenge any IR-mediated alignment for cause (for example, if a courseware application claims curriculum alignment that a Ministry considers inaccurate or misleading) through formal contestability procedures with defined timelines and independent adjudication. Ministries are not burdened with the administrative overhead of reviewing or approving every courseware-to-curriculum alignment.
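A minimal sketch of the crosswalk step, assuming each mapping is stored as a dictionary from an identifier to a set of IR concept identifiers. All identifiers, the `ir:` prefix, and the function name are hypothetical; the actual Curriculum IR specification is a Phase 1 deliverable and may represent mappings quite differently.

```python
from collections import defaultdict


def compute_crosswalk(standard_to_ir: dict, lesson_to_ir: dict) -> dict:
    """Join two independent mappings on shared IR concept ids.

    Returns {standard_id: sorted list of lessons sharing at least one IR concept}.
    """
    # Index lessons by the IR concepts they teach.
    lessons_by_concept = defaultdict(set)
    for lesson_id, concepts in lesson_to_ir.items():
        for concept in concepts:
            lessons_by_concept[concept].add(lesson_id)

    # For each standard, collect every lesson that touches one of its concepts.
    crosswalk = {}
    for standard_id, concepts in standard_to_ir.items():
        matched = set()
        for concept in concepts:
            matched |= lessons_by_concept.get(concept, set())
        crosswalk[standard_id] = sorted(matched)
    return crosswalk


# Hypothetical Ministry-side mapping (standards to IR concepts).
standard_to_ir = {
    "KE-CBC-G1-MATH-1.1": {"ir:count-to-20", "ir:numeral-recognition"},
    "KE-CBC-G1-MATH-1.2": {"ir:add-within-10"},
    "KE-CBC-G1-MATH-1.3": {"ir:measure-length-informal"},
}
# Hypothetical developer-side mapping (lessons to IR concepts).
lesson_to_ir = {
    "app-lesson-07": {"ir:count-to-20"},
    "app-lesson-12": {"ir:add-within-10", "ir:count-to-20"},
}

crosswalk = compute_crosswalk(standard_to_ir, lesson_to_ir)
for standard_id, lessons in crosswalk.items():
    print(standard_id, lessons)
```

In this toy example the third standard maps to no lessons, which is how a coverage gap would surface. Neither party needs to see the other's mapping to produce its own.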
The program's ultimate goal is to produce mapping tools that are sufficiently intuitive and well-documented that Ministries choose to produce and maintain their own curriculum-to-IR mappings independently, perhaps with consulting support, but under their own sovereign authority. The incentive is direct: a Ministry that maintains its own Curriculum IR mapping gives every courseware application in the network automatic access to its curriculum, which enables curriculum-aligned assessment and the cross-jurisdictional comparability that underpins Results-Based Finance for Education (RBF4Ed). The tools must be good enough that this value proposition is self-evident.
The ECM program is designed as a loosely-coupled system with explicit separation of roles (see Essay 07):
This separation prevents any single institution from controlling the system and enables trust to accumulate across institutions with different mandates. All curriculum data, mapping outputs, and validation datasets are governed under Malabo Convention-compliant data sovereignty protocols, with each Ministry retaining full authority over its national curriculum data.
If this program produces a validated Curriculum IR specification and mapping tools (outputs), then courseware developers can deploy across African and Indian jurisdictions by mapping once to the Curriculum IR (immediate outcome), which enables curriculum-aligned assessment at continental scale (intermediate outcome), which is the prerequisite for Results-Based Finance for Education at a projected benefit of approximately USD 35 per child per year (long-term impact; see Essay 07).
A comprehensive review of the curriculum-mapping, ontology-alignment, and comparative education literatures identified five substantive objections to an IR-based approach. Each objection has been incorporated as a design requirement for the Curriculum IR:
| Objection | Design Response |
|---|---|
| Interlingua problem (semantic drift in intermediate representations) | Continuous validation with formal feedback loops from Ministries; versioned Curriculum IR with built-in contestability |
| Pedagogical content knowledge (representation inseparable from concept) | Curriculum IR encodes concept relationships (prerequisites, co-requisites, and representational alternatives) alongside concept identifiers |
| Granularity problem (no universal "atomic" concept level) | Multi-granularity support; concepts can be leaf nodes in one curriculum's mapping and parent nodes in another's |
| Cultural embedding (curricula encode epistemological commitments) | Curriculum IR functions as mapping infrastructure, making limited, verifiable claims about concept overlap while preserving each curriculum's internal logic |
| Assessment validity (mapping does not preserve contextual weighting) | Curriculum IR encodes weighting and emphasis metadata alongside concept mappings |
Each objection identifies a real design constraint. Each has been incorporated as a specification requirement for Phase 1, Milestone 1. The operating environment (Sub-Saharan Africa, where functional illiteracy among ten-year-olds stands at approximately 90%) demands infrastructure that is substantially better than the status quo. The status quo is no mapping system at all.
The architectural response to the literature's strongest objection (that heterogeneous curricular traditions cannot be represented in a single canonical layer) is MLIR's dialect concept (Section 3.4), which enables each tradition to retain its organizational logic within a unified transformation framework.
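Two of the design responses in the table above, multi-granularity support and weighting metadata, can be illustrated with a small data-structure sketch. The class names, fields, and the coverage rule below are assumptions made for illustration; the Curriculum IR specification has not yet been written and may model these ideas differently.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str
    prerequisites: list = field(default_factory=list)  # ids of prerequisite concepts
    children: list = field(default_factory=list)       # ids of finer-grained sub-concepts


@dataclass
class StandardMapping:
    standard_id: str
    concept_id: str  # may point at a parent or a leaf concept
    weight: float    # relative emphasis the curriculum gives this standard


def leaf_concepts(concept_id: str, concepts: dict) -> set:
    """Expand a concept to its finest-grained descendants (itself if it is a leaf)."""
    node = concepts[concept_id]
    if not node.children:
        return {concept_id}
    leaves = set()
    for child in node.children:
        leaves |= leaf_concepts(child, concepts)
    return leaves


def weighted_coverage(mappings: list, covered_leaves: set, concepts: dict) -> float:
    """Share of curriculum emphasis for which every leaf concept is covered."""
    total = sum(m.weight for m in mappings)
    if total == 0:
        return 0.0
    covered = sum(m.weight for m in mappings
                  if leaf_concepts(m.concept_id, concepts) <= covered_leaves)
    return covered / total


# Hypothetical IR fragment: "addition" is a parent node with two leaf concepts.
concepts = {
    "ir:addition": Concept("ir:addition", children=["ir:add-within-10", "ir:add-within-20"]),
    "ir:add-within-10": Concept("ir:add-within-10", prerequisites=["ir:count-to-20"]),
    "ir:add-within-20": Concept("ir:add-within-20", prerequisites=["ir:add-within-10"]),
    "ir:count-to-20": Concept("ir:count-to-20"),
}
# A hypothetical curriculum that maps one standard at the parent level.
curriculum = [
    StandardMapping("A1", "ir:addition", weight=0.6),
    StandardMapping("A2", "ir:count-to-20", weight=0.4),
]

partial = weighted_coverage(curriculum, {"ir:add-within-10", "ir:count-to-20"}, concepts)
full = weighted_coverage(
    curriculum, {"ir:add-within-10", "ir:add-within-20", "ir:count-to-20"}, concepts)
print(partial, full)
```

Here a courseware application that teaches only addition within 10 covers the counting standard but not the coarser addition standard, so the weighted figure reflects the curriculum's own emphasis rather than a simple count of matched standards.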
The 48-month timeline aligns with the Breakthrough System's ecosystem design: Years 1–4 develop and validate ECM; Year 5 transitions to operational deployment with automated assessment and RBF4Ed integration.
Goal: Validate the Curriculum IR concept through a desk pilot, produce the IR v0.2 specification, digitize and map 12 curricula, and validate against real courseware.
Phase 1 requires no prior deliverables. This is the program's starting point.
Milestones:
Goal: Prepare ECM for operational deployment, achieve operational readiness across all 12 jurisdictions, prepare the scaling pathway to at least 44 countries (80% of AU Member States), and execute the sustainability transition.
Phase 2 requires the following Phase 1 deliverables as inputs: Curriculum IR v0.2; validated crosswalks for all 12 curricula; at least 3 courseware-to-IR mappings; formal validation study results; LLM-based concept extraction pipeline; open-source mapping tools.
Milestones:
| Component | Source | Cost | Notes |
|---|---|---|---|
| Standards database (US/intl reference) | EdGate | USD 13,100/yr | EdGate Pro annual subscription (USD 12,500/yr) plus international standards library (from USD 600/organization). Post-grant recurring cost assumed by the Spix Foundation. EdGate's foundational correlation patent (US9373264, priority date 2002) has likely reached its 20-year term, but EdGate announced additional patents in 2018; license terms should be reviewed at contracting to determine which patent claims, if any, remain in force and whether the database is independently protected by copyright or trade-secret law. |
| Standards identifiers | AB GUIDs (US), CASE identifiers (global) | Custom pricing | US-centric; Curriculum IR provides the African/Indian layer |
| Platform infrastructure | Sunbird ED (MIT licensed), taxonomy service | Free (open source) | Designed for India; adaptation required for multi-country use |
| Curriculum digitization tooling | OpenSALT (MIT licensed) | Free (open source) | CASE framework management; requires curriculum source documents |
| LLM-based mapping acceleration | GPT-4o / Claude / Llama | API costs (variable) | F1 scores of up to 89% reported for goal-to-skill matching; human validation at every stage |
ECM directly addresses three active Gates Foundation commitments, each representing a current funding stream:
Global Education Program (more than USD 240M over 4 years, announced April 2025): ECM directly enables the program's goal of helping children in Sub-Saharan Africa learn more effectively through evidence-based digital solutions. By collapsing the curriculum-mapping bottleneck, ECM allows effective courseware to reach learners across multiple countries, the prerequisite for the "regional exemplars" (the foundation's term) scaling strategy.
Digital Public Infrastructure (USD 200M+ commitment, announced September 2022): ECM is education DPI. It provides foundational, reusable digital infrastructure (the Curriculum IR, mapping tools, digitized curricula) designed for public benefit. The foundation's DPI investments, such as MOSIP for digital identity and Mojaloop for digital payments, establish the pattern. ECM is the curriculum interoperability layer within Africa's DPI-Ed, complementing the identity and payments layers the foundation already supports.
AI and EdTech for Africa (USD 40M ADQ partnership, announced December 2025): ECM uses LLMs to accelerate concept extraction and alignment suggestion, with human expert validation at every stage. The foundation's emphasis on responsible AI adoption (solutions that reflect local needs, empower teachers, and build capacity for sustained progress) aligns precisely with ECM's design: AI-accelerated, expert-validated, open-source, and governed by African institutions.
The program includes 6 Indian state curricula alongside 6 African national curricula for three reasons: (a) Indian state curricula are more digitized than African equivalents, providing a higher-fidelity validation environment for the Curriculum IR during early phases; (b) India's Sunbird/DIKSHA ecosystem is the primary open-source platform infrastructure the program adopts, and Indian state curriculum mapping enables direct integration testing; (c) cross-continental validation (Africa and India) provides stronger evidence of Curriculum IR generalizability than intra-continental validation alone. The primary beneficiaries of the program remain African children.
The Gates Foundation has invested in multiple EdTech platforms operating in Sub-Saharan Africa through its Global Education Program and ADQ partnership. ECM is infrastructure that amplifies the impact of all Gates-funded EdTech: any courseware application that maps to the Curriculum IR can deploy across all participating countries without additional mapping cost.
No existing system provides curriculum interoperability at the level the Curriculum IR proposes. The closest approaches are:
ECM is complementary to all three. It occupies a layer that does not currently exist: the canonical concept representation that connects standards databases, curriculum expertise, and platform-specific content through a single interoperable infrastructure.
Structural precedents for the IR approach itself are described in Section 3.3 (TCP/IP, LLVM, FHIR). ECM differs from all three precedents in one critical respect: it must operate across sovereign jurisdictions with different educational philosophies, not merely across technical systems with different formats. This jurisdictional dimension, addressed through the dialect mechanism (Section 3.4), the contestability framework (Section 4.3), and the Sovereignty Posture (Section 1A), is ECM's distinctive contribution to the IR pattern.
Three domains of expertise define the PI requirements for ECM:
Domain 1 β Intermediate Representation Architecture. Deep expertise in designing, implementing, and scaling canonical intermediate representations for complex, heterogeneous systems. The PI must understand why IRs succeed (formal semantics, compositionality, separation of concerns) and why they fail (semantic drift, granularity mismatch, cultural embedding). Direct experience with TCP/IP, LLVM, MLIR, FHIR, or analogous IR systems is strongly preferred.
Domain 2 β African and Indian Education Systems. Working knowledge of African curriculum structures, the differences among them, and the institutional landscape (Ministries of Education, ACQF, regional qualification frameworks). This expertise may reside in a co-PI or senior research partner.
Domain 3 β Computational Linguistics / Knowledge Representation. Expertise in ontology alignment, multilingual concept representation, and LLM-based information extraction. The Curriculum IR's multilingual and multi-granularity requirements demand computational linguistics sophistication.
The following PI candidates are proposed based on expertise fit. Formal expressions of interest will be secured following AUDA-NEPAD's institutional endorsement of the program.
Vikram Adve (University of Illinois at Urbana-Champaign)
- Co-creator of LLVM. Donald B. Gillies Professor of Computer Science. ACM Fellow. ACM Software System Award (2012).
- Directly responsible for the most successful IR in computing history. Post-LLVM trajectory demonstrates domain transfer: from compilers to security (SVA, SAFECode, ALLVM; USD 5.6M NSF/ONR) to agricultural AI (AIFARMS; USD 100M NIFA/NSF program). Proven ability to apply IR-based thinking to domains beyond compilers.
- IIT Bombay alumnus. Established track record of securing large-scale federal research funding.
- Complementary requirement: strong co-PI in curriculum and pedagogy.

Uday Bondhugula (Indian Institute of Science, Bangalore)
- Co-author of the foundational MLIR paper and contributor to its design. Professor in the Department of Computer Science and Automation, IISc.
- Deep expertise in multi-level IR design, the specific architectural innovation (dialects) most relevant to ECM's challenge of representing heterogeneous curricular traditions.
- Based at IISc Bangalore, in geographic and institutional proximity to EkStep Foundation (Sunbird/DIKSHA) and India's DPI architects. Provides a natural bridge between IR research and education infrastructure.
- Founder of Polymage Labs (compiler building blocks for AI). Understands research-to-deployment transition.
- Complementary requirement: strong co-PI in curriculum and pedagogy. India-based; AUDA-NEPAD coordination via program management.

Lesley Le Grange (Stellenbosch University, South Africa)
- Distinguished Professor, Department of Curriculum Studies. Vice-President of the International Association for the Advancement of Curriculum Studies (IAACS). Over 250 publications.
- The most internationally connected African curriculum scholar. Deep expertise in curriculum theory, decolonization of curriculum, and cross-cultural knowledge systems.
- Based in Africa. Understands the cultural and epistemological dimensions that the Curriculum IR must navigate.
- Complementary requirement: strong co-PI in IR architecture or computer science.
Research produces knowledge. Deployment requires code. The Spix Foundation provides the engineering and project management capacity that bridges the two. The Spix Foundation's development team implements the Curriculum IR specification, builds the mapping tools, integrates the Sunbird taxonomy service, develops the LLM-based extraction pipeline, and delivers the open-source tooling that Ministries and courseware developers will use. (Organizational details in Appendix H.)
Project management is led by Jim Plamondon, CEO of the Spix Foundation, who managed multi-million-dollar research budgets at Microsoft Research, notably the program to integrate third-party programming languages into Microsoft's .NET Common Language Runtime and Visual Studio. That program required the same kind of coordination ECM demands: academic researchers defining the specification (the Common Language Infrastructure), industry partners implementing against it, and a central project management function ensuring that research insights translated into shipping infrastructure on a fixed timeline. The .NET CLR is itself built around an intermediate representation (a virtual machine IR that lets multiple source languages compile to a single target), which makes this experience directly relevant to ECM's architecture.
Core team (approximately 18 FTE across 48 months, scaling by phase): 8 FTE researchers (IR architecture, computational linguistics, curriculum studies), 4 FTE curriculum experts (digitization, mapping validation, Ministry liaison), and 6 FTE software engineers (IR implementation, mapping tools, platform integration). Staffing levels vary by phase: Phase 1 emphasizes research and digitization; Phase 2 emphasizes tooling, training, and deployment.
Budget allocation model: Approximately 55-60% of the program budget flows to personnel and university partner subgrants (researcher salaries, curriculum expert fees, field work). The remaining 40-45% covers infrastructure, LLM costs, program management, travel, evaluation, and contingency.
Partner selection: University research partners are selected based on: demonstrated expertise in IR architecture, computational linguistics, or curriculum studies; presence in or partnership with African or Indian institutions; prior experience with applied (deployment-oriented) research; and ability to meet open-source and open-access requirements.
Principal Investigator responsibilities: The Lead PI is responsible for Curriculum IR design, validation methodology, and overall technical research direction. Each co-PI is responsible for their domain's research quality, deliverable acceptance criteria, and publication. All PIs report to the program's governance structure and to the Gates Foundation through quarterly reports.
Quality assurance: Each phase undergoes independent external evaluation (Month 24 and Month 48). Subgrant agreements include deliverable acceptance criteria, milestone-based disbursement, and financial audit provisions. The desk pilot at Months 1-6 provides an early feasibility checkpoint before the program's full investment.
The program should be led by a co-PI team:
AUDA-NEPAD (African Union Development Agency) serves as the program's institutional home:
| Institution | Country | Contribution |
|---|---|---|
| Kenya Institute of Curriculum Development (KICD) | Kenya | CBC curriculum source documents; curriculum expert validation; RESPECT pilot country |
| Ministry of Education, Liberia | Liberia | Liberian curriculum source documents; RESPECT pilot country; West African curriculum access |
| Ministry of Education, Eswatini | Eswatini | Eswatini curriculum source documents; RESPECT pilot country; Southern African curriculum access |
| Stellenbosch University | South Africa | Curriculum studies expertise; comparative curriculum analysis; validation research |
| University of Cape Town | South Africa | Education research; assessment validity |
| Makerere University | Uganda | Programming language and systems expertise (Bainomugisha); East African curriculum access |
| UNESCO-IBE Master's Programs | Senegal, Congo-Brazzaville, Mozambique | Trained curriculum specialists across West, Central, and Lusophone Africa |
| Institution | Contribution |
|---|---|
| Indian Institute of Science (IISc), Bangalore | MLIR/IR research expertise (Bondhugula); proximity to EkStep/Sunbird ecosystem; Indian state curriculum digitization and mapping |
| EkStep Foundation | Sunbird taxonomy service expertise; DIKSHA implementation experience; open-source platform support |
| IIIT Bangalore (Center for Digital Public Infrastructure) | DPI architecture expertise; CDPI co-chaired by Pramod Varma; Indian state curriculum standards access and DPI-Ed integration |
| Institution | Contribution |
|---|---|
| University of Illinois at Urbana-Champaign | LLVM/IR architecture expertise (Adve); large-scale research program management |
| The Spix Foundation | Project management and software development (see Section 10.3) |
ECM produces governance and standards bodies that outlive the research program:
SOCLE Board (Standard for Open Curriculum Logic in Education): An expert body that maintains and evolves the Curriculum IR specification. Based in the Gulf for political neutrality. The SOCLE Board's authority derives from the African Union's endorsement of the curriculum standard; it operates as a technical standards body analogous to W3C working groups. It does not set curriculum policy; that authority remains with national governments and the African Union. The SOCLE Board maintains the standard's technical integrity, manages versioning and evolution, and adjudicates contestability challenges from participating Ministries.
SOCLE Compliance Auditors' Professional Association: The SOCLE Board establishes a professional certification program for SOCLE Compliance Auditors, certified professionals who validate that a Ministry of Education's CuIR expression of its curriculum standards complies with SOCLE Board standards. The Body of Knowledge is derived from Tranche 1 Mapper experience. CuIR compliance certification enables the cross-jurisdictional comparability that underpins Results-Based Finance for Education (RBF4Ed), providing the SOCLE Board with a self-sustaining revenue model. First cohort certified by end of Year 4.
Mapper transition pathway: RESPECT Certified Mappers (the manual curriculum alignment professionals active during Years 1-4) are the natural first cohort of SOCLE Compliance Auditors. The profession is phase-limited by design; the SOCLE Compliance Auditor role is its permanent successor.
ECM's role is to produce the research and infrastructure that these institutions require; it does not govern them after handoff. Governance authority flows from the Breakthrough System's established structures.
All ECM research outputs will comply with or align to relevant standards. The distinction matters: "comply with" means ECM will implement the standard and test/certify against it; "align to" means ECM will follow the standard's design principles and interoperate with its interfaces, adapting where the standard does not fully address African educational contexts.
| Category | Amount (USD) |
|---|---|
| Personnel (PI team, researchers, curriculum experts, developers) | 4,000,000 |
| Curriculum digitization (6 African countries + 6 Indian states, K-3, math + literacy) | 800,000 |
| Desk pilot (Phase 1 proof-of-concept: 2 curricula, 50 concepts each) | 300,000 |
| Infrastructure (Sunbird adaptation, EdGate license, cloud computing, tools) | 650,000 |
| LLM costs (API usage for concept extraction and alignment, 48 months) | 450,000 |
| Partner institution subgrants (African and Indian universities) | 1,400,000 |
| Travel and convening (Ministry engagement, connectathons, workshops) | 750,000 |
| Program management and administration (AUDA-NEPAD + Spix Foundation) | 900,000 |
| Independent evaluation (external evaluator, 3 assessments) | 275,000 |
| Contingency (5%) | 475,000 |
| Total | 10,000,000 |
| Phase | Duration | Amount (USD) | Key Activities |
|---|---|---|---|
| Phase 1: Research + Validation | Months 1-24 | 5,500,000 | Desk pilot, IR v0.1-v0.2, 12 curricula digitized and mapped, courseware partnerships, validation study |
| Phase 2: Deployment + Operational Readiness | Months 25-48 | 4,500,000 | IR v1.0, tools delivered, Ministry training, governance framework, end-to-end deployment, scaling plan, sustainability transition |
Funding is structured as staged commitments with a go/no-go gate (see Section 13).
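The two budget tables can be cross-checked with a few lines of arithmetic. The sketch below copies the figures from the category and phase tables and confirms that each view sums to the requested USD 10 million; the variable and function names are illustrative only and are not part of any ECM tooling.

```python
# Figures copied from the category and phase tables (USD).
categories = {
    "Personnel": 4_000_000,
    "Curriculum digitization": 800_000,
    "Desk pilot": 300_000,
    "Infrastructure": 650_000,
    "LLM costs": 450_000,
    "Partner institution subgrants": 1_400_000,
    "Travel and convening": 750_000,
    "Program management and administration": 900_000,
    "Independent evaluation": 275_000,
    "Contingency": 475_000,
}
phases = {"Phase 1": 5_500_000, "Phase 2": 4_500_000}
REQUESTED_TOTAL = 10_000_000

category_total = sum(categories.values())
phase_total = sum(phases.values())


def share(amount: int, total: int = REQUESTED_TOTAL) -> float:
    """Return an amount as a percentage of the total, rounded to 2 places."""
    return round(100 * amount / total, 2)


# Print each category with its share of the requested total.
for name, amount in categories.items():
    print(f"{name:40s} {amount:>12,d} {share(amount):6.2f}%")

print(f"Category total: {category_total:,d}")
print(f"Phase total:    {phase_total:,d}")
```

On these figures the contingency line is 4.75% of the requested total, which is roughly 5% of the non-contingency subtotal.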
The personnel budget assumes approximately 8 FTE researchers, 4 FTE curriculum experts, and 6 FTE software engineers across all partner institutions over 48 months, with staffing levels varying by phase. A detailed staffing plan is provided in Appendix D, to be developed by the Spix Foundation during Phase 1.
Based on the project's Build vs. Buy analysis (Appendix A), the hybrid Build & Buy strategy reduces the estimated cost from USD 8-12 million for a pure-build approach to USD 10 million with adopted infrastructure. The 48-month timeline is within range of the TCP/IP precedent (7 years from Cerf and Kahn's 1974 paper to RFC 791 in 1981) and the LLVM precedent (5 years from first code to 1.0 release), and aggressive relative to the FHIR precedent (7 years from first proposal to normative standard). Four factors justify the pace: (a) the architectural pattern is well understood and the program builds on existing IR design knowledge from three proven open-source precedents whose production codebases are directly accessible; (b) AI-assisted research and development tooling, from LLM-based concept extraction to AI-accelerated software engineering, compresses technical milestones, enabling the team to model the Curriculum IR compiler and mapping tools directly on LLVM, FHIR, and TCP/IP architectures; (c) time saved on technical milestones is reinvested in institutional readiness (Ministry engagement, governance formation, and Mapper certification), which is the rate-limiting factor for adoption; (d) Africa's education crisis demands urgency: the children currently in K-3 will age out of foundational learning within 36 months.
The budget and timeline rest on the following assumptions. If an assumption proves false, the corresponding bound applies.
| Assumption | Bound (what ECM is not promising) |
|---|---|
| LLM-based concept extraction achieves an F1 score of ≥85% with human validation | If accuracy is lower, human expert effort increases; budget absorbs this through the contingency allocation. ECM does not promise fully automated extraction. |
| At least 6 African countries' curriculum documents are accessible through AUDA-NEPAD's Ministry relationships | If fewer are accessible, ECM substitutes additional Indian state curricula or other available African curricula. The Curriculum IR's validity depends on typological diversity, not on specific countries. |
| The Sunbird taxonomy service is adaptable for multi-country use within the budgeted infrastructure allocation | If adaptation proves more complex, ECM builds a lightweight alternative using the same API specification. |
| University partner institutions can recruit and retain qualified researchers within project budgets | If recruitment proves difficult, the co-PI structure provides redundancy: the program can proceed with any two of the three domain leads. |
| 48 months is sufficient to reach Curriculum IR v1.0 with operational mapping tools | Phase 1 alone (24 months) produces 12 digitized curricula, a validated IR v0.2, and open-source mapping tools, which are valuable even if Phase 2 requires extension. |
| ECM does not promise that the Curriculum IR will replace all manual curriculum mapping by Month 48 | The IR reduces cost and enables automation; expert validation remains part of the process. The goal is O(Apps+Standards) cost structure, not zero human involvement. |
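The first assumption in the table is stated in terms of F1. As a minimal sketch of how that figure could be computed for one curriculum, the function below compares the set of concepts an extraction pipeline produced against an expert ground-truth set. The concept identifiers are invented for illustration, and exact set matching is an assumption; the proposal does not specify how extracted concepts are matched to ground truth.

```python
def f1_score(extracted: set, ground_truth: set) -> float:
    """F1 of extracted concepts against expert ground truth (exact match)."""
    true_positives = len(extracted & ground_truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)    # share of extractions that are correct
    recall = true_positives / len(ground_truth)    # share of true concepts that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical concept identifiers, for illustration only.
expert_concepts = {"count-to-20", "add-within-10", "letter-sounds", "read-cvc-words"}
llm_concepts = {"count-to-20", "add-within-10", "letter-sounds", "tell-time"}

score = f1_score(llm_concepts, expert_concepts)
print(f"F1 = {score:.2f}")
print("meets 85% threshold" if score >= 0.85 else "below 85% threshold")
```

In this toy case three of four extracted concepts are correct and three of four expert concepts are found, so precision and recall are both 0.75 and the extraction would fall below the 85% assumption, triggering additional human expert effort.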
The program will be independently evaluated at the end of Phase 1 and Phase 2 by an external evaluator nominated by the Gates Foundation during Phase 1. Evaluation criteria include: mapping accuracy against expert ground truth, time and cost per mapping, usability of mapping tools for Ministry personnel, effectiveness of contestability mechanisms, and courseware developer adoption rates.
Phase 1 → Phase 2 gate (Month 24): The formal validation study achieves ≥85% concept-level accuracy across 12 curricula. At least 3 courseware developers have mapped content to the Curriculum IR. The Curriculum IR v0.2 specification is published. All 12 curricula are digitized and mapped. Peer-reviewed results are submitted for publication. If accuracy falls below 50%, the program convenes a technical review to determine whether the IR architecture requires fundamental revision or the approach is non-viable. Between 50% and 85%, the program may extend Phase 1 by up to 6 months for architectural refinement before re-evaluation.
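The gate logic described above can be written down as a small decision function. This is a sketch under stated assumptions, not a governance instrument: the class, field, and outcome names are hypothetical, and the `joint_review` outcome for the case where accuracy reaches 85% but another gate condition is unmet is an assumption, since the text defines actions only for the accuracy bands.

```python
from dataclasses import dataclass


@dataclass
class GateEvidence:
    accuracy: float            # concept-level accuracy across curricula, 0.0 to 1.0
    developer_mappings: int    # courseware developers who mapped content to the IR
    ir_v02_published: bool     # Curriculum IR v0.2 specification is published
    curricula_mapped: int      # curricula digitized and mapped
    results_submitted: bool    # peer-reviewed results submitted for publication


def phase1_gate_decision(e: GateEvidence) -> str:
    """Return the action implied by the Month 24 gate conditions."""
    if e.accuracy < 0.50:
        return "technical_review"    # assess fundamental revision or non-viability
    if e.accuracy < 0.85:
        return "extend_phase_1"      # up to 6 months of architectural refinement
    deliverables_met = (
        e.developer_mappings >= 3
        and e.ir_v02_published
        and e.curricula_mapped >= 12
        and e.results_submitted
    )
    # ASSUMPTION: the proposal does not say what happens when accuracy passes
    # but another condition fails; "joint_review" is a placeholder outcome.
    return "proceed_to_phase_2" if deliverables_met else "joint_review"


print(phase1_gate_decision(GateEvidence(0.88, 3, True, 12, True)))
print(phase1_gate_decision(GateEvidence(0.72, 3, True, 12, True)))
```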
The program provides quarterly progress reports to the Gates Foundation, including: milestones achieved, accuracy metrics, budget expenditure, and risk register updates. A comprehensive mid-term review is conducted at the Phase 1 gate (Month 24).
If the Curriculum IR fails to achieve ≥50% accuracy at the Phase 1 gate, the program will have produced three outputs with independent value: (a) 6 digitized curricula in CASE-compliant format, the first machine-readable African curriculum canon; (b) a rigorous empirical assessment of the IR hypothesis, informing future research directions; (c) an LLM-based concept extraction pipeline with documented accuracy metrics. The digitized curricula and extraction pipeline serve the broader DPI-Ed ecosystem regardless of the IR's ultimate viability.
| Risk | Mitigation |
|---|---|
| IR architecture proves too lossy for foundational subjects | Phase 1 targets math and literacy (K-3), where cross-curricular concept overlap is highest and the IR approach is on strongest theoretical ground. Desk pilot validates before full investment. |
| Curriculum source documents unavailable or incomplete | AUDA-NEPAD's Ministry relationships provide direct access to African curricula; IISc and IIIT Bangalore provide access to Indian state curricula; KICD and other national/state curriculum bodies are named partners |
| Ministry engagement insufficient for validation | The program includes funded Ministry training and engagement activities; AUDA-NEPAD's existing relationships de-risk sovereign participation. Ministry adoption is incentivized by three outputs: free curriculum digitization, access to the full courseware network, and the eventual ability to produce their own curriculum-to-IR mappings, the foundation for cross-jurisdictional comparability that underpins RBF4Ed funding. Ministries are not burdened with approving individual courseware alignments. |
| PI recruitment contingent on institutional commitment | AUDA-NEPAD's endorsement is pursued first; PI recruitment follows. Co-PI structure with multiple candidates per domain provides redundancy; the program can proceed with any two of the three domain leads |
| LLM accuracy insufficient for production use | Human-in-the-loop validation is built into the design at every stage. LLMs accelerate expert judgment through automation of initial concept extraction and alignment suggestion. |
| Lock-in risk (premature standardization) | Explicit versioning from v0.1; sunset mechanisms for early mappings; open-source licensing (Apache 2.0) prevents single-institution control |
| 48-month timeline proves insufficient | Phase structure allows useful outputs at each stage; Phase 1 alone (24 months) produces 12 digitized curricula, a validated IR v0.2, and open-source mapping tools, which are valuable even if Phase 2 requires extension |
| LLMs become accurate enough to map curricula without an IR | The Curriculum IR provides three capabilities that direct LLM mapping lacks: (a) governance and contestability (Ministries can audit mappings against a published specification); (b) compositionality (new curricula and courseware connect to the full network); (c) institutional permanence (the IR persists across LLM model generations). The IR and LLMs are complementary. |
| Ministries or courseware developers do not adopt the tools | Phase 1 includes partnership with 3-5 courseware developers and direct Ministry engagement in 12 jurisdictions. Phase 2 measures adoption rates. AUDA-NEPAD's relationships with all 55 AU Ministries provide the sovereign engagement pathway. |
ECM's principal external dependencies are AUDA-NEPAD's Ministry access (for curriculum documents) and the Sunbird taxonomy service (for platform infrastructure). Neither is a hard blocker.
If Ministry access is delayed in specific countries:
If Sunbird adaptation proves more complex than expected:
If LLM accuracy falls short:
| Gate | Timing | Condition | Action if Not Met |
|---|---|---|---|
| Phase 1 → Phase 2 release | Month 24 | ≥85% concept-level accuracy across 12 curricula; at least 3 courseware-to-IR mappings; IR v0.2 published; all 12 curricula digitized and mapped | If accuracy ≥50% but <85%: extend Phase 1 by up to 6 months for architectural refinement. If accuracy <50%: convene technical review to assess viability. |
| Desk pilot checkpoint | Month 6 | Prototype Curriculum IR constructed; concept-mapping accuracy measured against expert ground truth for 2 curricula | If results are negative, program pivots design before committing Phase 1's full investment |
| Courseware validation | Month 18 | At least 1 courseware developer has mapped content to the Curriculum IR with measurable results | If no developer adoption, program intensifies partnership efforts and adjusts tooling priorities |
| Phase 2 completion | Month 48 | IR v1.0 published; operational tools delivered; at least 3 courseware applications deploying across all 12 jurisdictions | Joint funder-program review; scope and timeline adjustment for any incomplete deliverables |
The 6 African countries and 6 Indian states targeted in this program collectively serve tens of millions of K-3 students. At USD 10 million for infrastructure serving this population, the per-student investment is negligible, and it amortizes toward zero as additional courseware applications and jurisdictions join the network. At full AU-wide deployment (approximately 170 million K-3 children across 55 member states), the per-student infrastructure cost would fall below USD 0.10.
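The per-student figure follows from simple division. The sketch below reproduces it using only the two numbers quoted in the paragraph above; it assumes the numerator is the USD 10 million requested here and excludes any per-country digitization costs incurred later during scaling.

```python
# Both inputs are taken from the paragraph above.
PROGRAM_COST_USD = 10_000_000      # requested infrastructure investment
AU_K3_CHILDREN = 170_000_000       # approximate K-3 population, 55 AU member states

# ASSUMPTION: later per-country digitization and mapping costs are not included.
cost_per_student = PROGRAM_COST_USD / AU_K3_CHILDREN

print(f"Per-student infrastructure cost: USD {cost_per_student:.3f}")
print("Below USD 0.10:", cost_per_student < 0.10)
```

On these inputs the cost is roughly USD 0.06 per child, consistent with the stated bound of USD 0.10.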
If the Curriculum IR achieves its target of ≥85% concept-level accuracy for foundational literacy and numeracy across 12 curricula (6 African, 6 Indian), it will:
Every year without curriculum interoperability infrastructure is another year in which the 90% illiteracy rate compounds. The children currently in K-3 across Sub-Saharan Africa will age out of foundational learning within this program's 48-month timeline. ECM is the structural prerequisite for deploying effective courseware across African countries at a cost that education financing can sustain. The Curriculum IR makes this possible.
If Phase 2 validation achieves ≥85% accuracy, the immediate scaling path is: (a) extend to all 55 AU member states over 3-5 years, at an estimated cost of approximately USD 500,000 per additional country for digitization and mapping; (b) extend to secondary subjects (science, social studies) and upper grades; (c) invite non-African, non-Indian countries to contribute dialects to the Curriculum IR. The total estimated cost to reach all AU member states at K-3 is USD 30-35 million, fundable through a combination of Gates follow-on grants, GPE allocations, and Ministry co-funding.
Following the 48-month program, Curriculum IR governance transfers to the proposed SOCLE Board (Standard for Open Curriculum Logic in Education), based in the Gulf for political neutrality. The Gulf is chosen because the Curriculum IR is designed as global infrastructure (serving African, Indian, and eventually Latin American, Southeast Asian, and other jurisdictions), and its governance body should be perceived as neutral by all participating regions. During the 48-month research phase, AUDA-NEPAD retains operational leadership; the transition to the Gulf-based SOCLE Board occurs as part of Phase 2's sustainability transition.
The GEOS Organization (also proposed for the Gulf) governs outcome certification for Results-Based Finance for Education. The GEOS Organization also assumes responsibility for training and certifying curriculum mapping auditors worldwide. This is a natural extension of its quality-assurance mandate, since the Curriculum IR mapping is upstream in the same pipeline as outcome certification.
The key post-grant challenge is scaling beyond the 12 research jurisdictions to the approximately 1,000 curriculum jurisdictions worldwide that will eventually need certified curriculum-to-IR mappings.
Infrastructure maintenance is funded through three channels: (a) the RESPECT Ecosystem Fund (see Essay 23), which allocates a percentage of platform-wide transaction fees to ecosystem maintenance; (b) successor grants or GPE allocations for curriculum update cycles; (c) AUDA-NEPAD's recurring continental education infrastructure budget for African-specific operations. Estimated annual maintenance cost: USD 500,000-800,000 for specification updates, curriculum re-digitization cycles, and tool maintenance.
Mapper training and certification is the larger scaling challenge. For every curriculum jurisdiction that participates in the Curriculum IR (ultimately numbering in the hundreds or thousands), Ministry personnel must be trained to produce certification-ready curriculum-to-IR mappings of their constantly evolving standards, especially as the Curriculum IR itself evolves across versions. The GEOS Organization governs this training and certification function, analogous to its role in certifying outcome assessors ("GEOSors"; see Essay 07). This is pipeline integration: GEOS certifies that a jurisdiction's curriculum-to-IR mapping meets quality standards (upstream) and separately certifies that learning outcomes measured through that mapping meet finance-grade evidence standards (downstream). The training and certification program is funded through certification fees, scaled to jurisdiction size, and cross-subsidized by RBF4Ed transaction fees flowing through the RESPECT Ecosystem.
Research outputs will be disseminated through: (a) peer-reviewed publications in education technology, computational linguistics, and standards interoperability venues; (b) presentation at AUDA-NEPAD's education technology convenings; (c) open-source release of all specifications, tools, and datasets on GitHub under Apache License 2.0; (d) a public-facing project website with documentation for Ministries and courseware developers.
The PREMIER Institute owns all intellectual property resulting from ECM research. Funding partners receive a worldwide, paid-up, royalty-free, sub-licensable, non-exclusive license to all such IP. Code and specifications (the Curriculum IR specification, mapping tools, validation infrastructure) are released under the Apache License 2.0. Creative works (documentation illustrations, training materials artwork) are released under the appropriate Creative Commons license. University research partners retain academic publication rights; all code and infrastructure deliverables are owned by the PREMIER Institute.
Digitized curricula are published under open licenses; the underlying curriculum content remains the intellectual property of the issuing government. EdGate's licensed database remains proprietary; the Curriculum IR specification is designed to function independently of any proprietary data source. The EdGate license should be structured to terminate when EdGate's relevant patents expire (expected during the program's timeline), unless EdGate can demonstrate independent copyright or other IP protection for its database content. Post-grant, the Spix Foundation assumes the recurring license cost for the duration of the license.
ECM is AI-era education infrastructure. The Curriculum IR applies a proven architectural pattern (demonstrated by TCP/IP for networking, LLVM for compilers, and FHIR for healthcare interoperability) to the structural barrier preventing digital courseware from reaching the children who need it most. The program's methodology reflects the same convergence: AI-assisted research and development tools compress the technical timeline, while the institutional architecture (AUDA-NEPAD coordination, Ministry engagement, SOCLE Board governance, and global Mapper certification) receives the sustained investment that determines whether infrastructure achieves adoption.
The 90% functional illiteracy rate among ten-year-olds in Sub-Saharan Africa is a structural failure requiring a structural solution. ECM provides that solution: a canonical intermediate layer that converts the prohibitive O(Apps×Standards) cost of curriculum mapping into sustainable O(Apps+Standards) infrastructure. The Curriculum IR makes it economically rational for courseware developers to serve every African country and, through automated curriculum-aligned assessment, creates the evidentiary foundation for Results-Based Finance for Education at continental scale.
This research program will determine whether the Curriculum IR works. If it does, the tools, specifications, and institutional framework produced over 48 months will enable deployment across the continent and beyond. The architectural pattern has been proven three times. The AI-accelerated tooling to build it exists. The institutional partners are ready. The children are waiting.