Easy FLN Localization Project Plan (Draft)

Easy FLN Localization: A Research Project Plan

Building the Writing Intermediate Representation for Africa's Digital Public Infrastructure for Education

Proposed as: A research project within the PREMIER Institute (Platform Research and Engineering for Modern Infrastructure in Education Readiness)

Proposed by: The Spix Foundation, for consideration by the African Union Development Agency (AUDA-NEPAD), in partnership with SIL International and university research institutions in Africa and the United States

Duration: 48 months (4 years, aligned with ECM research timeline)

Requested funding: USD 8 million (staged across two phases with a go/no-go gate)

1. Executive Summary

Nine out of ten children in Sub-Saharan Africa cannot read a simple sentence by age ten. Digital courseware that teaches foundational literacy and numeracy (FLN) exists, but it cannot be deployed across African languages, because localizing FLN courseware requires redesigning the entire pedagogical architecture: the localization surface spans all instructional sequences, all phonics scaffolding, and all decodable-text inventories. This FLN localization bottleneck is a structural barrier to mother-tongue foundational learning at continental scale.

FLN courseware teaches the act of reading. In any given written language, the concepts of symbol, sound, meaning, and mnemonic are deeply intertwined. The pedagogical sequence — which letters to introduce first, which grapheme-phoneme correspondences to teach, which decodable words to construct, which blending exercises to scaffold — depends on the frequency and regularity of grapheme-phoneme correspondences in the specific target language. Changing the language changes the pedagogy. onebillion, co-winner of the Global Learning XPRIZE, reports that each language localization of its onecourse app requires approximately 180,000 words of contextually adapted content. Seven years after the codebase became open-source, it is available in only five languages (as of 2024).

A structural solution is feasible. Perfetti and Verhoeven's 2022 study of seventeen orthographies across five writing system types identified two universals: the Universal Writing System Constraint (all writing systems encode language and reflect basic properties of the linguistic system they encode) and the Universal Phonological Principle (reading activates phonology across all writing systems). Ziegler and Goswami's Psycholinguistic Grain Size Theory provides the parametric framework: all writing systems map written symbols to linguistic units, differing in the grain size and consistency of the mapping.

Africa has a structural advantage: the vast majority of its written languages use Latin or Arabic script; Ethiopic (Ge'ez) serves the languages of the Horn, and a small number of indigenous scripts (N'Ko, Tifinagh, Adlam, Vai) serve specific language communities. Most African Latin-script orthographies are transparent — designed by linguists in the 20th century with consistent grapheme-phoneme correspondences.

This proposal requests USD 8 million over 48 months to build Easy FLN Localization: a Writing Intermediate Representation (Writing IR) that captures the deep structural invariants among written languages — graphemes, phonemes, grapheme-phoneme correspondences, syllable structures, morphological rules, letter-introduction sequences, decodable word inventories, and pedagogical scaffolding patterns. The architectural pattern is the same one that makes Easy Curriculum Mapping (ECM) possible at O(Apps+Standards) cost. Languages will map once to the Writing IR; FLN courseware apps will map once to the same Writing IR. The result: automated FLN localization through parameterization, replacing manual redesign per language.

1A. Decision Package

What a Funder Commits To

The funder commits USD 8M over 48 months, disbursed in two phases with a go/no-go gate at Month 24. Phase 1 (Months 1–24): USD 4.5M. Phase 2 (Months 25–48): USD 3.5M. Phase 2 funding is contingent on Phase 1 deliverables. Easy FLN Localization is a "Big Easy" housed within the PREMIER Institute but independently fundable, with its own Founder attribution.

What a Funder Gets

Founder attribution for Easy FLN Localization and naming/hosting rights for any institution the project produces (e.g., a Writing IR governance body if warranted).
Legacy Attribution at the Founder tier (see Essay 25, Section 7). The PREMIER Institute Founder has Right of First Refusal on this project.
Phase-gated accountability: Phase 2 funding is released only upon Phase 1 deliverable acceptance (see Section 1B).
Open-access research outputs: all specifications, tools, language parameter sets, and peer-reviewed publications produced by Easy FLN Localization are published under open licenses, extending the funder's impact beyond the Breakthrough System.
Direct amplification of XPRIZE investments: every XPRIZE finalist entering the RESPECT Ecosystem benefits from the Writing IR — FLN courseware localizes into any parameterized language automatically.

Success Criteria by Month 24

Phase 1 success is defined by the Month-24 Proof of Capability outcome set (Section 1B). If Phase 1 deliverables are met, Phase 2 funding is released. If they are not met, the project convenes a technical review to determine whether the IR architecture requires revision, the timeline requires extension, or the approach is non-viable (see Section 10.2).

Governance and Reporting

Easy FLN Localization reports through the PREMIER Institute's governance structure. The project's Lead PI reports quarterly to the funder and to the PREMIER Institute Director. Independent evaluation is conducted at the Phase 1 gate (Month 24) and at project completion (Month 48) by an external evaluator. AUDA-NEPAD provides institutional coordination.

Intellectual Property Posture

The PREMIER Institute owns all intellectual property resulting from Easy FLN Localization research. Funding partners receive a worldwide, paid-up, royalty-free, sub-licensable, non-exclusive license to all such IP. Code and specifications (the Writing IR specification, parameterization tools, computational phonology pipeline) are released under the Apache License 2.0. Creative works (illustrations, audio recordings, training materials artwork) are released under the appropriate Creative Commons license. University research partners retain academic publication rights; all code and infrastructure deliverables are owned by the PREMIER Institute.

Sovereignty Posture

Attribution is distinct from authority. Founder attribution and any institutional hosting rights are recognition mechanisms; they confer no governance authority over research agendas, language policy, or platform operations. Language policy authority remains with national governments. Continental coordination authority remains with AUDA-NEPAD. Technical infrastructure authority remains with the RESPECT Platform's technical steward. The Writing IR does not prescribe orthography or language policy — it parameterizes existing written languages as defined by their linguistic communities and national standards.

1B. Month-24 Proof of Capability

The following concrete outcomes define Phase 1 success and gate Phase 2 funding:

Desk pilot: Completed and documented. Prototype Writing IR validated against at least 2 African languages (e.g., Kiswahili and Hausa) with algorithmically generated letter-introduction sequences compared against expert-produced sequences.
Writing IR v0.2 specification: Published as open-source specification, incorporating desk pilot findings and early parameterization results.
Language parameterization: Complete parameter sets for 12 African languages across at least 4 language families, covering grapheme inventories, phoneme inventories, GPC tables, syllable structures, letter-introduction sequences, and decodable word inventories.
Courseware partnerships: At least 2 FLN courseware developers have mapped their pedagogical structure to the Writing IR.
Validation study: Formal validation achieving ≥80% pedagogical validity for Writing IR–generated FLN localizations across 12 languages. Pedagogical validity is assessed by literacy specialists using a structured rubric scoring letter-sequence quality, decodable-text appropriateness, and scaffolding coherence on a 5-point scale; ≥80% validity means at least 80% of rubric items are rated 4 or 5.
Foundational Numeracy: Number-word inventories, counting convention descriptions, and word-problem templates completed for all 12 languages.
Open-source tooling: Linguist-facing parameterization toolkit released for eliciting and encoding new language parameters.
Peer-reviewed submission: Validation methodology and empirical results submitted for publication.
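
The validity threshold in the validation-study outcome above can be made concrete with a short calculation. The following is a minimal sketch, assuming each rubric item receives a single integer rating from 1 to 5; the function names and the sample ratings are hypothetical and are not part of the proposed validation protocol.

```python
def pedagogical_validity(ratings):
    """Return the fraction of rubric items rated 4 or 5 on the 5-point scale."""
    if not ratings:
        raise ValueError("no rubric ratings supplied")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    passing = sum(1 for r in ratings if r >= 4)
    return passing / len(ratings)


def meets_gate(ratings, threshold=0.80):
    """True when at least `threshold` of rubric items are rated 4 or 5."""
    return pedagogical_validity(ratings) >= threshold


# Hypothetical ratings for one generated localization (illustration only).
sample = [5, 4, 4, 3, 5, 4, 2, 4, 5, 4]
print(pedagogical_validity(sample))  # 0.8
print(meets_gate(sample))            # True
```

How ratings from several specialists are pooled per item (for example, by median) is left open here and would be fixed by the validation methodology.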

2. The Problem: Africa's FLN Localization Bottleneck

2.1 The Structural Barrier

FLN courseware is categorically harder to localize than courseware for already-literate learners. A Grade 5 history lesson assumes its students are already literate; an FLN app treats language as the content itself. The localization surface is the entire pedagogical architecture.

The cost consequence is severe. onebillion reports approximately 180,000 words of content per language, each requiring contextual adaptation so that content is "culturally relevant and introduces letters in an order that makes sense." Each localization involves linguists, phonics specialists, audio engineers, illustrators, and pedagogy experts. Cisco-funded tooling improvements enabled multiple languages to be localized simultaneously — a throughput optimization — but the total effort per language remains substantial because each new language requires rebuilding the pedagogical scaffolding from scratch.

PREMIER's planned Easy Text Localization project addresses general-purpose translation and adaptation of educational content for already-literate learners. It is designed for content where language is the delivery medium. Easy FLN Localization addresses content where language is the pedagogical substance. The two projects are complementary: Easy Text Localization handles post-literacy courseware; Easy FLN Localization handles pre-literacy courseware.

2.2 The Combinatorial Challenge

Africa has approximately 2,000 languages and dozens of national curriculum standards. The Breakthrough Project's Phase 1 targets six countries, K-3, Foundational Literacy and Foundational Numeracy, in the AU languages of those countries. Phase 2 expands to ~21 countries. Phase 3 targets at least 44 countries (80% of AU Member States). At each expansion, the number of African languages requiring FLN localization grows. Without structural cost reduction, every pairing of a language with an FLN app is a year-long manual localization effort, which is simply too expensive to scale.

2.3 Foundational Numeracy

Foundational Numeracy presents a related but more tractable challenge. Arabic numerals (0–9) and positional notation are universal across African education systems. The mathematical concepts — counting, addition, subtraction, quantity comparison — are language-independent. What varies is the verbal layer: number words, counting conventions (some African languages use base-5 or base-20 counting alongside the base-10 system used in school), number-word irregularities, and word-problem contexts. Dehaene's Triple Code Model (1992) establishes that number representation involves three codes: a visual Arabic code (universal), an auditory verbal code (language-specific), and an analog magnitude code (universal). The verbal layer is a thinner localization surface than literacy's. Easy FLN Localization will treat Foundational Numeracy's verbal layer as a special case within the Writing IR — the same architectural framework, with a thinner parameter set. Specifically, the Numeracy verbal layer utilizes the Phoneme inventory, Morphological rules, and Pedagogical scaffolding patterns of the core Writing IR, adding a dedicated parameter set covering number words, counting conventions, and word-problem templates. This follows the Multi-Level IR (MLIR) design principle of accommodating domain-specific "dialects" within a unified IR infrastructure (Lattner et al., "MLIR: Scaling Compiler Infrastructure for Domain Specific Computation," IEEE/ACM CGO 2021).
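
The split between universal arithmetic structure and a language-specific verbal layer can be illustrated with a small sketch. It assumes a regular base-10 number-word system and uses Kiswahili number words for 0 to 99 as the example parameter set; the class and function names are hypothetical, and languages with base-5 or base-20 counting conventions would need a different decomposition rule than the one shown.

```python
from dataclasses import dataclass


@dataclass
class NumberWordParams:
    """Hypothetical verbal-layer parameters for a regular base-10 language."""
    zero: str
    units: dict      # 1..9 -> word
    tens: dict       # 10, 20, ..., 90 -> word
    connector: str   # word that joins the tens word and the units word


KISWAHILI = NumberWordParams(
    zero="sifuri",
    units={1: "moja", 2: "mbili", 3: "tatu", 4: "nne", 5: "tano",
           6: "sita", 7: "saba", 8: "nane", 9: "tisa"},
    tens={10: "kumi", 20: "ishirini", 30: "thelathini", 40: "arobaini",
          50: "hamsini", 60: "sitini", 70: "sabini", 80: "themanini",
          90: "tisini"},
    connector="na",
)


def number_word(n, params):
    """Spell out 0..99: the decomposition is shared, the words are parameters."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch covers 0..99 only")
    if n == 0:
        return params.zero
    tens, unit = divmod(n, 10)  # language-independent place-value step
    parts = []
    if tens:
        parts.append(params.tens[tens * 10])
    if unit:
        parts.append(params.units[unit])
    return f" {params.connector} ".join(parts)


print(number_word(11, KISWAHILI))  # kumi na moja
print(number_word(47, KISWAHILI))  # arobaini na saba
```

Supplying a different parameter object changes the spoken output while the place-value logic stays untouched, which is the sense in which the numeracy layer is a thinner localization surface.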

2.4 Why Now

Four developments have converged to make Easy FLN Localization feasible today:

The psycholinguistic evidence base has matured. Perfetti and Verhoeven (2022) demonstrated universals across 17 orthographies. Ziegler and Goswami (2005) provided the parametric framework. Makalela (2024) validated the Orthographic Depth Hypothesis and Morphological Transparency Hypothesis with longitudinal data from African bilingual children. The theoretical foundation for a formal Writing IR is established.

Computational grapheme-to-phoneme models cover hundreds of languages. Deri and Knight (2016, ACL) built computational G2P models covering 531 languages. Transformer-based models have extended this coverage further. The component technology for automated phonological analysis exists.

SIL International has built component tools. PrimerPrep analyzes language data to recommend optimal letter-teaching sequences. Bloom creates decodable readers in any language by separating pedagogical structure from language-specific parameters. These tools demonstrate that aspects of the Writing IR problem have been solved in isolation.

ECM proposes the architectural precedent. If the Curriculum IR succeeds, it will have demonstrated that a formal intermediate representation can collapse an Apps×Standards cost problem to O(Apps+Standards) in the education domain. Easy FLN Localization applies the same hypothesis to a sibling problem, within the same institute, using shared methodology.

3. The Proposed Solution: A "Writing Intermediate Representation" (Writing IR)

3.1 The Core Insight

Written languages, despite their surface diversity, share deep structural invariants. All writing systems encode language through systematic mappings between written symbols and linguistic units. The mappings differ in grain size (phoneme, syllable, morpheme) and consistency (transparent vs. opaque) — but these dimensions are parametric. A formal intermediate representation that captures these invariants enables FLN localization through parameterization: supply a new language's parameters, and the pedagogical scaffolding is generated from the shared framework.

3.2 The Writing IR

The Writing IR will encode the structural elements of FLN pedagogy at an abstract level:

Grapheme inventory — the set of written symbols used in the language.
Phoneme inventory — the set of distinctive sounds in the language.
Grapheme-phoneme correspondence (GPC) table — the mapping between graphemes and phonemes, including consistency ratings (one-to-one, one-to-many, many-to-one).
Syllable structure templates — the permissible syllable shapes (CV, CVC, CVCC, etc.) and their frequencies.
Morphological rules — prefixing, suffixing, compounding, and reduplication patterns relevant to early reading.
Letter-introduction sequence — the optimal order for introducing graphemes, based on frequency, GPC consistency, and pedagogical scaffolding constraints.
Decodable word inventory — the set of words decodable at each point in the letter-introduction sequence.
Pedagogical scaffolding patterns — blending exercises, segmentation exercises, sight-word introduction rules, and passage construction templates.
Numeracy verbal layer — number words, counting conventions, word-problem templates, and mathematical vocabulary in the target language.
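
As an illustration of how three of these elements relate (grapheme inventory, letter-introduction sequence, and decodable word inventory), the sketch below orders graphemes by frequency in a tiny lexicon and then lists the words decodable after a given number of graphemes has been taught. It is a simplified assumption, not the project's algorithm: the names are hypothetical, the lexicon is a toy sample of Kiswahili words, and frequency is the only ordering criterion, whereas a real sequence would also weigh GPC consistency and scaffolding constraints.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class LanguageParams:
    """Hypothetical, heavily reduced language parameter set."""
    name: str
    graphemes: list  # grapheme inventory; may include digraphs such as "ch"
    lexicon: list    # candidate early-reading words


def segment(word, graphemes):
    """Split a word into graphemes by longest match; None if it cannot be split."""
    by_length = sorted(graphemes, key=len, reverse=True)
    out, i = [], 0
    while i < len(word):
        match = next((g for g in by_length if word.startswith(g, i)), None)
        if match is None:
            return None
        out.append(match)
        i += len(match)
    return out


def letter_sequence(params):
    """Order graphemes by descending lexicon frequency, ties alphabetical."""
    counts = Counter()
    for word in params.lexicon:
        counts.update(segment(word, params.graphemes) or [])
    return sorted(params.graphemes, key=lambda g: (-counts[g], g))


def decodable_words(params, taught):
    """Words whose graphemes have all been taught so far."""
    taught = set(taught)
    result = []
    for word in params.lexicon:
        segs = segment(word, params.graphemes)
        if segs is not None and set(segs) <= taught:
            result.append(word)
    return result


TOY = LanguageParams(
    name="toy Kiswahili sample",
    graphemes=["a", "b", "ch", "i", "k", "m"],
    lexicon=["mama", "baba", "kaka", "chai", "mimi"],
)
order = letter_sequence(TOY)
print(order)                            # ['a', 'm', 'i', 'b', 'k', 'ch']
print(decodable_words(TOY, order[:3]))  # ['mama', 'mimi']
```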

National languages will map once to the Writing IR by supplying their language-specific parameters. FLN courseware apps will map once to the Writing IR by expressing their pedagogical structure in terms of the IR's abstract elements. The crosswalk between a language and a courseware app will be computed automatically from these two independent mappings.
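
One way to picture the automatic crosswalk is as a function of two independently authored mappings. The sketch below is an assumption made for illustration only: the lesson-template format, the activity names, and the class names are hypothetical and do not describe the Writing IR specification.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageMapping:
    """Hypothetical language-side mapping to the IR."""
    name: str
    letter_sequence: list                       # graphemes in teaching order
    decodable_by_step: dict = field(default_factory=dict)  # step -> words


@dataclass
class LessonTemplate:
    """Hypothetical app-side lesson, expressed only in abstract IR terms."""
    step: int       # position in the letter-introduction sequence
    activity: str   # "introduce" or "blend"


def crosswalk(app_templates, language):
    """Render abstract lesson templates into language-specific lessons."""
    lessons = []
    for t in app_templates:
        if t.step >= len(language.letter_sequence):
            continue  # the language has fewer graphemes than the app has steps
        if t.activity == "introduce":
            lessons.append(("introduce", language.letter_sequence[t.step]))
        elif t.activity == "blend":
            words = language.decodable_by_step.get(t.step, [])
            lessons.append(("blend", tuple(words)))
        # unknown activities are skipped in this sketch
    return lessons


app = [LessonTemplate(0, "introduce"),
       LessonTemplate(1, "introduce"),
       LessonTemplate(1, "blend")]
toy = LanguageMapping("toy-language", ["a", "m"], {1: ["mama"]})
print(crosswalk(app, toy))
# [('introduce', 'a'), ('introduce', 'm'), ('blend', ('mama',))]
```

Neither the app templates nor the language mapping refers to the other; the pairing happens only inside `crosswalk`, which is what keeps authoring effort additive.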

3.3 The IR Architectural Pattern

The IR pattern is proven in other domains. LLVM captures computation at a stable, representation-independent level: source languages compile once to the IR, target architectures map once from the IR, and the result is linear cost instead of quadratic. TCP/IP captures network communication at a stable, representation-independent level: applications map once to the protocol stack, physical networks map once to the protocol stack. Easy FLN Localization applies the identical pattern to FLN pedagogy: a canonical intermediate layer, two families of linear mappings, and automatic crosswalk computation. ECM proposes to apply this pattern to curriculum alignment; Easy FLN Localization proposes to apply it to FLN localization. The two projects share IR design methodology, validation approach, governance framework, and sustainability model.

3.4 The Dynamic Tonality Analogy

The Writing IR is analogous to the abstraction layer in Dynamic Tonality and the JIMS Isomorphic Music System (JIMS), to which the Spix Foundation's CEO, Jim Plamondon, was a contributor. In music, the intervals between notes in a major triad follow the same pattern regardless of the triad's root, not only in twelve-tone equal temperament but across the entire valid tuning range of the syntonic temperament, which includes the musical tunings of many non-Western cultures and eras. JIMS encodes the relationships between musical elements at a deep structural level, then renders them onto interfaces where the same physical gesture produces the same musical interval regardless of key or tuning. The Writing IR does the same for FLN pedagogy: it encodes the relationships among the elements of literacy at a deep structural level, then renders them into language-specific courseware where the same pedagogical pattern produces the same learning outcome regardless of target language.

3.5 Africa's Structural Advantage

The vast majority of Africa's written languages use Latin or Arabic script; Ethiopic (Ge'ez) serves the languages of the Horn, and a small number of indigenous scripts (N'Ko, Tifinagh, Adlam, Vai) serve specific language communities. Most African Latin-script orthographies were designed by linguists in the 20th century with consistent grapheme-phoneme correspondences — they are transparent. This constrains the parameter space for the Writing IR: transparent orthographies have regular GPC tables that map cleanly to the kind of formal representation the IR requires.

Mother-tongue literacy in a shared script transfers to colonial-language literacy at low cost. Research published in Economics of Education Review (2022) confirms that mother tongue reading materials serve as a bridge to second-language literacy. A child literate in Setswana has internalized the Latin alphabet's visual system and the cognitive operation of alphabetic decoding. Acquiring English literacy then requires learning English-specific grapheme-phoneme mappings and vocabulary — the concept of alphabetic reading itself does not need to be relearned.

This within-script transfer mechanism multiplies the Writing IR's value: localizing FLN courseware into a mother tongue using Latin script simultaneously prepares the child for literacy in (a) all of the African Union's official languages that share the same script (English, French, Portuguese, Spanish, Kiswahili), and (b) the vast majority of Sub-Saharan Africa's written languages.

4. Positioning Within the Breakthrough Ecosystem

4.1 Relationship to V&P_Core

Easy FLN Localization directly supports V&P_Core's scaling trajectory. Phase 1 targets six countries in AU official languages, requiring each FLN courseware app to be localized into at least one of those languages. Phase 2 expands to ~21 countries. Phase 3 targets at least 44 countries (80% of AU Member States). At each expansion, new languages require FLN localization. The Writing IR transforms this from a quadratic system cost (technically, O(Apps×Languages)) into a linear cost (O(Apps+Languages)).
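
The difference between the two cost regimes can be shown with simple arithmetic. In the sketch below, the first scenario uses the Phase 1 targets stated in this plan (2 courseware apps, 12 languages); the larger scenarios are hypothetical, and the counts are numbers of work items rather than dollar costs, since one manual localization and one IR mapping are not assumed to cost the same.

```python
def pairwise_localizations(apps, languages):
    """Work items when every app is localized separately into every language."""
    return apps * languages


def ir_mappings(apps, languages):
    """Work items when each app and each language maps once to the IR."""
    return apps + languages


scenarios = [(2, 12), (10, 50), (20, 200)]
for apps, languages in scenarios:
    print(apps, languages,
          pairwise_localizations(apps, languages),
          ir_mappings(apps, languages))
# 2 12 24 14
# 10 50 500 60
# 20 200 4000 220
```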

4.2 Relationship to ECM

ECM and Easy FLN Localization are siblings. Both build formal intermediate representations that capture deep structural invariants across a surface-diverse domain. Both collapse an N×M cost problem to O(N+M): from O(Apps×Standards) to O(Apps+Standards) for ECM, and from O(Apps×Languages) to O(Apps+Languages) for Easy FLN Localization. Both use manual expert work during Years 1–4 as the ground-truth foundation for the automated system that follows.

The two IRs are complementary. The Curriculum IR captures what must be taught — the learning objectives, concept sequences, and assessment expectations specified by a national curriculum. The Writing IR captures how literacy is taught in a given language — the grapheme-phoneme correspondences, letter-introduction sequences, and decodable-word inventories that constitute the pedagogy. Localizing an FLN app to a new language in a new country requires both: the Writing IR for the language-specific pedagogy, and the Curriculum IR for the curriculum-specific alignment.

Housing both projects in the PREMIER Institute enables internal handoff, researcher cross-pollination, and coordinated IR design.

4.3 Relationship to Easy Text Localization

Easy Text Localization facilitates localization of post-literacy courseware. Easy FLN Localization facilitates localization of pre-literacy courseware. A localized FLN app will use both: Easy Text Localization for UI text and instructional scaffolding addressed to literate users (e.g., teacher guides), and Easy FLN Localization for the phonics, decoding, and early reading content that constitutes the learner-facing pedagogy.

4.4 Relationship to XPRIZE

XPRIZE's Accelerate Learning Challenge ($10M, 2025–2029) will produce finalists (FLN courseware apps) that must be localized into the AU official languages of the Breakthrough Project's participating countries when they enter the RESPECT Ecosystem during Phase 2 (see Essay 29, XPRIZE & the Breakthrough Project). These finalists will arrive with content in one or a few languages; the RESPECT Ecosystem must localize them across dozens. During Phase 2, this localization will be performed manually (Track 1). If the Writing IR research succeeds, it will enable this localization at scale from Year 5 onward (Track 2).

4.5 The Two-Track FLN Localization Strategy

The Breakthrough System will employ a two-track strategy for FLN localization, mirroring the ECM two-track strategy:

Track 1 — Manual FLN Localization (Years 1–4). During the period while the Writing IR is under development, localizing FLN courseware is simply too expensive for the Breakthrough Project to undertake.

Track 2 — Writing IR (Year 5+). By the end of Year 4, the Writing IR is expected to reach operational readiness, enabling automated FLN localization through parameterization. From Year 5 onward, localizing an FLN app into a new language requires supplying the language-specific parameters into the Writing IR framework — a process measured in weeks, performed once per language, making that language available to all RESPECT Compatible FLN apps for free.

4.6 Theory of Change

If this project produces a validated Writing IR specification and parameterization tools (outputs), then FLN courseware developers can localize across African languages by parameterizing once to the Writing IR (immediate outcome), which enables mother-tongue FLN delivery at continental scale (intermediate outcome), which directly addresses the 89% functional illiteracy rate among ten-year-olds in Sub-Saharan Africa (long-term impact).

5. Literature-Informed Design Principles

A comprehensive review of reading science, computational linguistics, and FLN localization practice identified the empirical foundations and design constraints for the Writing IR:

Evidence Base	Design Implication
Perfetti & Verhoeven (2022) — Universal Writing System Constraint and Universal Phonological Principle across 17 orthographies	The Writing IR rests on empirically validated universals, applied to 3.7 billion speakers across 5 writing system types
Ziegler & Goswami (2005) — Psycholinguistic Grain Size Theory: grain size and consistency as parametric dimensions	The Writing IR parameterizes languages along these two dimensions, enabling systematic variation within a unified framework
Dehaene (1992) — Triple Code Model: visual, verbal, and magnitude codes for number representation	Foundational Numeracy localization reduces to the verbal code — a thinner parameter set handled as a special case within the Writing IR
Makalela (2024) — Orthographic Depth Hypothesis and Morphological Transparency Hypothesis, validated with African bilingual children	Africa's transparent orthographies constrain the parameter space and facilitate cross-language transfer
Deri & Knight (2016) — Computational G2P models covering 531 languages	Automated grapheme-to-phoneme analysis is available as a component technology for the Writing IR's GPC tables
SIL PrimerPrep and Bloom — letter-sequence optimization and decodable-reader generation	Component tools that solve individual Writing IR sub-problems; the integration into a unified abstraction is the missing step
Global Proficiency Framework (UNESCO/USAID/World Bank) — universal constructs for reading proficiency	Provides the assessment alignment layer for validating Writing IR–localized courseware against learning outcomes
onebillion localization data — ~180,000 words per language, 1+ year per localization	Quantifies the cost that the Writing IR is designed to collapse

Five design constraints emerge from the literature:

Constraint	Design Response
Orthographic depth variation (transparent vs. opaque scripts)	The Writing IR parameterizes orthographic depth; Africa's predominantly transparent orthographies are the primary target
Grain size variation (phoneme-level vs. syllable-level vs. morpheme-level mappings)	Multi-granularity support: the Writing IR represents GPC mappings at the grain size appropriate to each language
Cultural embedding of mnemonics (letter-sound associations are culturally specific)	Mnemonic associations are language-specific parameters, not IR-level structures; supplied per language alongside GPC data
Pedagogical sequencing constraints (letter-introduction order depends on language-specific frequency and regularity)	Letter-introduction sequences are computed from language-specific GPC data and frequency distributions, using algorithms validated against expert-produced sequences
FN verbal layer variation (number words, counting conventions)	Treated as a special case within the Writing IR: universal arithmetic structure + language-specific verbal parameters

6. Research Goals and Milestones

Program Timeline: 48 Months (2 Phases)

The 48-month timeline aligns with the ECM research timeline and the Breakthrough System's ecosystem design: Years 1–4 develop and validate the Writing IR; Year 5 transitions to operational deployment.

Phase 1: Research and Validation (Months 1–24) — USD 4.5 million

Goal: Validate the Writing IR concept through a desk pilot, produce the IR v0.2 specification, build language parameter sets for 12 African languages, and validate against real FLN courseware.

Phase 1 requires no prior deliverables. This is the project's starting point.

Milestones:

Desk pilot (Months 1–6): Select 2 pilot languages from Phase 1 countries (e.g., Kiswahili and Hausa — one Bantu, one Chadic, both Latin-script, both transparent). Extract complete grapheme inventories, phoneme inventories, GPC tables, syllable structure templates, and frequency distributions. Construct a prototype Writing IR. Generate letter-introduction sequences algorithmically and compare against expert-produced sequences from SIL PrimerPrep and from onebillion's existing localizations. Measure accuracy. Document results in a desk pilot report.
Complete formal specification of the Writing IR v0.1, incorporating desk pilot findings: grapheme-phoneme schema, syllable structure model, morphological rule representation, letter-sequence generation algorithm, decodable-word generation rules, and Foundational Numeracy verbal layer schema.
Build complete language parameter sets for 6 African languages across at least 2 language families (e.g., Bantu: Kiswahili, isiZulu, Chichewa; Chadic: Hausa; Volta-Niger: Yoruba; Atlantic: Wolof). Language selection will prioritize pilot country languages and maximize typological diversity within the Latin-script constraint.
Establish computational phonology pipeline: automated G2P extraction, frequency analysis, and GPC consistency scoring for new languages.
Recruit and onboard research teams at all partner institutions.
Initiate partnership with onebillion and other XPRIZE alumni for experiential validation of the IR against real FLN courseware localization challenges.
Build complete language parameter sets for 6 additional African languages, expanding typological coverage to include at least one Arabic-script language and at least one tone language. Total: 12 languages across at least 4 language families.
Partner with 2–3 FLN courseware developers (selected from RESPECT Ecosystem participants, onebillion, or XPRIZE entrants) to map their courseware's pedagogical structure to the Writing IR. Measure: does a single Writing IR mapping enable generation of pedagogically valid localization parameters for all 12 languages?
Conduct formal validation study: compare Writing IR–generated FLN localizations against expert-produced localizations (including manually localized versions from Track 1) for pedagogical validity, letter-sequence quality, decodable-text appropriateness, and — where classroom trials are possible — learning outcomes. Target: Writing IR–generated localizations achieve ≥80% pedagogical validity as rated by literacy specialists.
Develop the Foundational Numeracy verbal layer: build number-word inventories, counting convention descriptions, and word-problem templates for all 12 languages. Validate against existing FN courseware localizations.
Publish the Writing IR v0.2 specification, validation methodology, and empirical results as peer-reviewed research.
Release open-source parameterization tools: a linguist-facing toolkit for eliciting and encoding a new language's parameters into the Writing IR.
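
The GPC consistency scoring named in the computational phonology pipeline milestone above could take many forms. One simple candidate, sketched here as an assumption rather than the project's chosen metric, scores each grapheme by the share of its occurrences that map to its most common phoneme. The function name and the aligned sample pairs are hypothetical.

```python
from collections import Counter, defaultdict


def gpc_consistency(aligned_pairs):
    """Score each grapheme by the share of its most frequent phoneme.

    `aligned_pairs` is an iterable of (grapheme, phoneme) tuples taken from
    words whose spelling and pronunciation have already been aligned.
    A score of 1.0 means the grapheme always maps to the same phoneme.
    """
    by_grapheme = defaultdict(Counter)
    for grapheme, phoneme in aligned_pairs:
        by_grapheme[grapheme][phoneme] += 1
    return {
        g: max(counts.values()) / sum(counts.values())
        for g, counts in by_grapheme.items()
    }


# Hypothetical aligned data: "m" is fully consistent, "c" is not.
pairs = [("m", "m"), ("m", "m"), ("c", "k"), ("c", "k"), ("c", "s")]
scores = gpc_consistency(pairs)
print(scores["m"])            # 1.0
print(round(scores["c"], 2))  # 0.67
```

A transparent orthography would be expected to score at or near 1.0 for most graphemes under a measure of this kind, which is one way the desk pilot could quantify the transparency claim for each pilot language.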

Phase 2: Deployment and Operational Readiness (Months 25–48) — USD 3.5 million

Goal: Prepare the Writing IR for operational deployment, achieve operational readiness, prepare the scaling pathway to all AU languages, and execute the sustainability transition.

Phase 2 requires the following Phase 1 deliverables as inputs: Writing IR v0.2; validated language parameter sets for 12 languages; at least 2 courseware-to-IR mappings; formal validation study results; computational phonology pipeline; open-source parameterization tools.

Milestones:

Publish Writing IR v1.0 specification with full documentation, governance framework, and versioning protocol.
Complete language parameter sets for all 12 languages mapped to Writing IR v1.0. Produce validated FLN localization outputs.
Deliver operational parameterization tools to AUDA-NEPAD for integration into the RESPECT Platform.
Train literacy specialists and linguists in pilot countries on the parameter elicitation toolkit and process. The goal is for national teams to produce and maintain their own language parameter sets independently.
Coordinate with ECM research team: validate that Writing IR–localized FLN courseware aligns correctly with national curricula through the Curriculum IR. Test the combined pipeline: Curriculum IR (what to teach) + Writing IR (how to teach it in this language) = localized, curriculum-aligned FLN courseware.
Establish governance framework for the Writing IR: versioning policy, validation requirements, parameter quality certification, and new-language onboarding procedures.
Demonstrate operational end-to-end deployment: at least 2 FLN courseware applications localized through the Writing IR into all 12 languages, with validated learning outcomes in at least 4 languages.
Conduct independent evaluation of localization quality, literacy specialist satisfaction, and courseware developer adoption (see Section 10).
Publish comprehensive technical report and policy recommendations for scaling to all AU languages.
Execute sustainability transition: hand off Writing IR governance to the PREMIER Institute's ongoing standards process (or a dedicated Writing IR governance body if warranted); hand off maintenance to the RESPECT Platform engineering team.
Produce a scaling cost estimate: projected cost per additional language to extend the Writing IR to all AU languages used in formal education.
Transition Track 1 manual FLN localization specialists into Writing IR parameter elicitation, quality assurance, and validation roles.

7. Principal Investigator Profile and Team

7.1 Required Expertise

Four domains of expertise define the PI requirements for Easy FLN Localization:

Domain 1 — Intermediate Representation Architecture. Deep expertise in designing, implementing, and scaling canonical intermediate representations for complex, heterogeneous systems. Direct experience with LLVM, MLIR, FHIR, or analogous IR systems. This expertise is shared with the ECM research team — a PI who serves both projects provides architectural coherence.

Domain 2 — Reading Science and Psycholinguistics. Deep expertise in cross-linguistic reading acquisition, orthographic transparency, grapheme-phoneme correspondence theory, and the empirical basis for FLN pedagogy across writing systems.

Domain 3 — Computational Phonology and African Linguistics. Expertise in computational grapheme-to-phoneme modeling, African language phonology, orthography development, and digital language resources for low-resource African languages.

Domain 4 — FLN Courseware Development. Practical experience building and localizing Foundational Literacy and Numeracy courseware for African learners, including direct knowledge of the manual localization process and its costs.

7.2 Proposed Principal Investigators and Key Personnel

IR Architecture (shared with ECM):

Vikram Adve (University of Illinois at Urbana-Champaign) — Co-creator of LLVM. If serving as Lead PI for ECM, provides architectural coherence across both IRs. The Writing IR and Curriculum IR share the same structural pattern; a single architectural lead ensures design consistency.
Uday Bondhugula (Indian Institute of Science, Bangalore) — Co-author of the foundational MLIR paper. Deep expertise in multi-level IR design with dialect support — directly relevant to representing heterogeneous writing system families within a unified framework.

Reading Science and Psycholinguistics:

Ludo Verhoeven (Radboud University, Netherlands) — Co-author of the 2022 Perfetti & Verhoeven study establishing reading universals across 17 orthographies. Former editor of Written Language and Literacy. Internationally recognized authority on cross-linguistic literacy acquisition. His empirical work provides the theoretical foundation for the Writing IR.
Johannes Ziegler (Aix-Marseille University, CNRS, France) — Co-author of the Psycholinguistic Grain Size Theory. Director of the Laboratoire de Psychologie Cognitive. His parametric framework (grain size × consistency) directly informs the Writing IR's core parameterization dimensions.
Leketi Makalela (University of the Witwatersrand, South Africa) — Professor of Applied Linguistics. His 2024 NASCEE research on the Orthographic Depth Hypothesis and Morphological Transparency Hypothesis, validated with South African bilingual children, provides the Africa-specific evidence base.

Computational Phonology and African Linguistics:

Kathleen Siminyu (Mozilla Foundation / Masakhane) — A leading figure in African NLP, co-founder of Masakhane research community. Expertise in low-resource African language technology, with direct experience building datasets and models for languages with limited digital resources.
SIL International (organizational partner) — Decades of experience in African language orthography development, phonological analysis, and literacy program design. SIL's PrimerPrep and Bloom tools are direct component technologies for the Writing IR. SIL's field linguists have documented phonological systems for hundreds of African languages.

FLN Courseware Development:

Andrew Ashe (onebillion) — Director/CEO of onebillion, co-winner of the Global Learning XPRIZE. Direct experiential knowledge of the FLN localization problem: the 180,000-words-per-language cost, the Cisco-funded tooling improvements, and the challenges of scaling localization across African languages. onebillion's open-source codebase provides the primary validation target for the Writing IR.

Project Management and Software Development:

The PREMIER Institute provides the engineering and project management capacity that bridges research and deployment. The development team will implement the Writing IR specification, build the parameterization tools, develop the computational phonology pipeline, and deliver the open-source tooling that linguists and courseware developers will use. If the Institute manages both ECM and Easy FLN Localization, project management overhead is shared and architectural coherence is maintained. The Spix Foundation is expected to provide expertise and development support for the DPI-Ed/RESPECT Platform.

7.3 Operational Structure

Core team (approximately 10 FTE across 48 months, scaling by phase): 4 FTE researchers (reading science, computational phonology, African linguistics), 2 FTE linguists/phonologists for field parameterization, and 4 FTE software engineers for IR implementation and tooling. Staffing levels vary by phase: Phase 1 emphasizes research and language parameterization; Phase 2 emphasizes tooling, training, and deployment.

Budget allocation model: Approximately 55–60% of the project budget flows to personnel and university partner subgrants (researcher salaries, linguist fees, field work). The remaining 40–45% covers infrastructure, NLP model costs, program management, travel, evaluation, and contingency. Project management and platform integration costs are partially shared with ECM.

Partner selection: Research partners are selected based on: demonstrated expertise in reading science, computational phonology, or African linguistics; prior experience with applied research in FLN contexts; presence in or partnership with African institutions; and ability to meet open-source and open-access requirements.

Principal Investigator responsibilities: The Lead PI (shared with ECM) is responsible for Writing IR design and architectural coherence with the Curriculum IR. Each co-PI is responsible for their domain's research quality, deliverable acceptance criteria, and publication. The senior partner from onebillion provides experiential validation against real FLN courseware.

Quality assurance: Each phase undergoes independent external evaluation (Month 24 and Month 48). The desk pilot at Months 1–6 provides an early feasibility checkpoint. SIL International's field linguists provide independent validation of language parameter sets against established phonological analyses.

7.4 Recommended Structure: Co-PI Team

Lead PI (IR Architecture): Vikram Adve or Uday Bondhugula. Shared with ECM. Responsible for Writing IR design, validation methodology, and architectural coherence with the Curriculum IR.
Co-PI (Reading Science): Ludo Verhoeven or Johannes Ziegler. Responsible for the psycholinguistic foundations of the Writing IR: parameterization dimensions, validation criteria, and alignment with the empirical literature.
Co-PI (African Linguistics): Leketi Makalela or a senior researcher from the Masakhane community. Responsible for African language parameterization, orthographic analysis, and validation against African classroom realities.
Senior Partner (Computational Phonology): Kathleen Siminyu or a senior SIL computational linguist. Responsible for the computational phonology pipeline and language parameter extraction tooling.
Senior Partner (FLN Courseware): Andrew Ashe (onebillion). Responsible for experiential validation of the Writing IR against real FLN courseware and localization practice.
Project Management and Development: The Spix Foundation (shared with ECM where possible).

8. Institutional Partners

8.1 Lead Institution: AUDA-NEPAD

AUDA-NEPAD serves as the program's institutional home, providing continental legitimacy, existing relationships with all 55 AU member state Ministries of Education, and coordination with the broader Breakthrough System.

8.2 Proposed Research and Technical Partners

Institution	Country	Contribution
Radboud University (Verhoeven)	Netherlands	Cross-linguistic reading acquisition research; Writing IR validation framework; 17-orthography empirical base
Aix-Marseille University / CNRS (Ziegler)	France	Grain Size Theory expertise; parametric framework for the Writing IR's core dimensions
University of the Witwatersrand (Makalela)	South Africa	African bilingual literacy research; Orthographic Depth Hypothesis validation; South African language expertise
University of Illinois at Urbana-Champaign (Adve)	USA	LLVM/IR architecture expertise; shared with ECM
Indian Institute of Science (Bondhugula)	India	MLIR/dialect architecture expertise; shared with ECM
SIL International	Global	African language phonology; PrimerPrep and Bloom tools; decades of literacy program experience across hundreds of African languages
Masakhane NLP research community	Africa-wide	Low-resource African language NLP; computational G2P models; language data
onebillion	UK/Tanzania	FLN courseware localization experience; open-source codebase; XPRIZE alumni experiential knowledge
Ministries of Education (Phase 1 countries)	Africa	Literacy specialists; mother-tongue language expertise; classroom validation sites
Spix Foundation	USA	DPI-Ed/RESPECT Platform development expertise; project management
ECM research team (PREMIER)	—	Curriculum IR coordination; shared IR architecture; validation of combined pipeline

8A. Institutional Outputs

Easy FLN Localization may produce governance or standards outputs that outlive the research project:

Writing IR governance: If the Writing IR achieves adoption, its specification requires ongoing maintenance — versioning, new-language onboarding procedures, parameter quality certification. This governance function transfers to the PREMIER Institute's ongoing standards process (or a dedicated Writing IR governance body if the scope warrants it). The Writing IR does not prescribe orthography or language policy; it parameterizes existing written languages. The parameter lifecycle follows four stages: (1) language parameter submission by a field linguist or literacy specialist using the parameterization toolkit; (2) validation against IR constraints (GPC consistency, syllable structure completeness, decodable-word coverage); (3) certification by a qualified literacy specialist confirming pedagogical validity; (4) deployment to the RESPECT Platform, making the language available to all RESPECT Compatible FLN apps.
Parameterization specialist roles: Literacy specialists and field linguists trained during the project transition into ongoing Writing IR parameter elicitation, quality assurance, and validation roles — maintaining and extending language coverage as the Breakthrough System scales from 6 to 21+ countries.
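
The four-stage parameter lifecycle described above can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the project's design: the class, function, and check names (ParameterSet, validate, certify, deploy, and the keys in REQUIRED_CHECKS) are hypothetical, and the real validation logic is reduced to pass/fail flags.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Stage(Enum):
    SUBMITTED = 1  # stage 1: submitted via the parameterization toolkit
    VALIDATED = 2  # stage 2: passed the IR constraint checks
    CERTIFIED = 3  # stage 3: certified by a qualified literacy specialist
    DEPLOYED = 4   # stage 4: available on the RESPECT Platform


# The three constraint families named in the text; the key names are hypothetical.
REQUIRED_CHECKS = ("gpc_consistency", "syllable_structure", "decodable_word_coverage")


@dataclass
class ParameterSet:
    language: str
    stage: Stage = Stage.SUBMITTED
    checks: Dict[str, bool] = field(default_factory=dict)
    certified_by: Optional[str] = None


def _require(ps: ParameterSet, expected: Stage) -> None:
    """Reject any attempt to skip or repeat a lifecycle stage."""
    if ps.stage is not expected:
        raise ValueError(f"{ps.language}: expected {expected.name}, found {ps.stage.name}")


def validate(ps: ParameterSet, results: Dict[str, bool]) -> ParameterSet:
    """Stage 1 -> 2: every required constraint check must have passed."""
    _require(ps, Stage.SUBMITTED)
    failed = [name for name in REQUIRED_CHECKS if not results.get(name, False)]
    if failed:
        raise ValueError(f"{ps.language}: failed or missing checks: {failed}")
    ps.checks = dict(results)
    ps.stage = Stage.VALIDATED
    return ps


def certify(ps: ParameterSet, specialist: str) -> ParameterSet:
    """Stage 2 -> 3: record which literacy specialist confirmed pedagogical validity."""
    _require(ps, Stage.VALIDATED)
    if not specialist:
        raise ValueError("certification requires a named specialist")
    ps.certified_by = specialist
    ps.stage = Stage.CERTIFIED
    return ps


def deploy(ps: ParameterSet) -> ParameterSet:
    """Stage 3 -> 4: only certified parameter sets may be deployed."""
    _require(ps, Stage.CERTIFIED)
    ps.stage = Stage.DEPLOYED
    return ps
```

In this sketch a parameter set can only move forward one stage at a time, so an uncertified language can never reach deployment; calling deploy on a freshly submitted set raises ValueError.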

The project's role is to produce the research and infrastructure that these functions require; governance authority flows from the Breakthrough System's established structures.

8B. Standards-Based Interoperability

All Easy FLN Localization outputs will comply with or align to relevant standards. The distinction matters: "comply with" means the project will implement the standard and test or certify against it; "align to" means the project will follow the standard's design principles and interoperate with its interfaces, adapting where the standard does not fully address African contexts.

Global Proficiency Framework (UNESCO/USAID/World Bank) for reading proficiency constructs — align to. The Writing IR's validation framework will use GPF-aligned reading proficiency constructs for assessing localized courseware quality, adapting where GPF does not address specific African language features.
Unicode for grapheme representation — comply with. All grapheme inventories and GPC tables will use Unicode-encoded characters and comply with Unicode normalization standards.
IPA (International Phonetic Alphabet) for phoneme representation — comply with. All phoneme inventories will be encoded in IPA notation.
CASE (Competency and Academic Standards Exchange) for curriculum alignment (via ECM) — align to. The combined Writing IR + Curriculum IR pipeline will produce outputs compatible with CASE-encoded curriculum standards.
CRADLE's federated data governance framework for research data access — comply with. All research data access involving learner interaction data will go through CRADLE's tiered governance process.
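
As an illustration of why the Unicode normalization commitment matters for GPC tables, the sketch below normalizes grapheme keys to NFC using Python's standard unicodedata module, so that a precomposed character and its decomposed equivalent are treated as the same grapheme. The choice of NFC, the function names, and the one-phoneme-per-grapheme table shape are assumptions made for the example; the Writing IR specification may choose differently.

```python
import unicodedata
from typing import Dict


def normalize_grapheme(grapheme: str) -> str:
    """Return the NFC (canonical composed) form of a grapheme string."""
    return unicodedata.normalize("NFC", grapheme)


def normalize_gpc_table(table: Dict[str, str]) -> Dict[str, str]:
    """Normalize every grapheme key and IPA value of a simplified GPC table.

    Raises ValueError if two differently encoded spellings of the same
    grapheme were mapped to different phonemes.
    """
    normalized: Dict[str, str] = {}
    for grapheme, phoneme in table.items():
        key = normalize_grapheme(grapheme)
        value = unicodedata.normalize("NFC", phoneme)
        if key in normalized and normalized[key] != value:
            raise ValueError(f"conflicting entries for grapheme {key!r}")
        normalized[key] = value
    return normalized


# "e" + combining acute accent (2 code points) becomes precomposed "é" (1 code point).
decomposed = "e\u0301"
composed = normalize_grapheme(decomposed)
print(len(decomposed), len(composed))  # prints: 2 1
```

Without this step, two linguists typing the same accented letter on different keyboards could produce two distinct table entries for one grapheme.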

9. Budget Framework

9.1 Summary

Category	Amount (USD)
Personnel (PI team, researchers, linguists, phonologists, developers)	3,200,000
Language parameterization (12 languages: field linguistics, GPC analysis, frequency studies, expert validation)	600,000
Desk pilot (Phase 1 proof-of-concept: 2 languages, full parameter extraction)	200,000
Infrastructure (computational phonology pipeline, IR tooling, cloud computing)	400,000
LLM and NLP model costs (G2P extraction, phonological analysis, 48 months)	300,000
Partner institution subgrants (universities, SIL, Masakhane)	1,200,000
Travel and convening (Ministry engagement, literacy specialist workshops, onebillion collaboration)	600,000
Program management and administration (AUDA-NEPAD + Spix Foundation, shared with ECM where possible)	700,000
Independent evaluation (external evaluator, 3 assessments)	225,000
Contingency (~5%)	375,000
Total	7,800,000

Note: Budget rounds to USD 8 million for planning purposes.
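
The summary figures can be checked with a few lines of arithmetic. The sketch below copies the category amounts from the table in 9.1 (with shortened labels) and recomputes the total and two shares; the helper names are illustrative only.

```python
# Amounts in USD, copied from the Section 9.1 summary table (labels shortened).
BUDGET_USD = {
    "Personnel": 3_200_000,
    "Language parameterization": 600_000,
    "Desk pilot": 200_000,
    "Infrastructure": 400_000,
    "LLM and NLP model costs": 300_000,
    "Partner institution subgrants": 1_200_000,
    "Travel and convening": 600_000,
    "Program management and administration": 700_000,
    "Independent evaluation": 225_000,
    "Contingency": 375_000,
}


def total(budget: dict) -> int:
    """Sum of all budget categories."""
    return sum(budget.values())


def share(budget: dict, *categories: str) -> float:
    """Fraction of the total budget taken by the named categories."""
    return sum(budget[c] for c in categories) / total(budget)


print(f"Total: {total(BUDGET_USD):,}")  # prints: Total: 7,800,000
print(f"Personnel + subgrants: {share(BUDGET_USD, 'Personnel', 'Partner institution subgrants'):.1%}")  # prints 56.4%
print(f"Contingency: {share(BUDGET_USD, 'Contingency'):.1%}")  # prints 4.8%
```

The categories sum to USD 7.8 million, which is the figure that rounds to USD 8 million; personnel and partner subgrants together account for roughly 56% of it.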

9.2 Budget by Phase

Phase	Duration	Amount (USD)	Key Activities
Phase 1: Research + Validation	Months 1–24	4,500,000	Desk pilot, IR v0.1→v0.2, 12 languages parameterized, courseware partnerships, validation study
Phase 2: Deployment + Operational Readiness	Months 25–48	3,500,000	IR v1.0, tools delivered, national team training, governance framework, end-to-end deployment, scaling plan, sustainability transition

Funding is structured as staged commitments with a go/no-go gate (see Section 10).

9.3 Budget Rationale

The personnel budget assumes approximately 4 FTE researchers (reading science, computational phonology, African linguistics), 2 FTE linguists/phonologists for field parameterization, and 4 FTE software engineers for IR implementation and tooling, with staffing levels varying by phase.

The budget is lower than ECM's ($8M vs. $10M) for three reasons: (a) the Writing IR and the Curriculum IR share a common architectural methodology and co-develop shared infrastructure — the two projects run simultaneously (Years 1–4) with a single IR design team serving both, reducing duplicated engineering effort; (b) the computational phonology pipeline leverages existing G2P models and SIL tools, keeping the NLP research scope narrower than ECM's curriculum digitization effort; (c) project management and platform integration costs are partially shared with ECM.

9.4 Assumptions and Bounds

The budget and timeline rest on the following assumptions. If an assumption proves false, the corresponding bound applies.

Assumption	Bound (what Easy FLN Localization is not promising)
Africa's predominantly transparent orthographies constrain the Writing IR's parameter space to a tractable level	The Writing IR targets transparent Latin-script orthographies in Phase 1. Opaque orthographies (e.g., English) and non-Latin scripts (Arabic, Ethiopic) are explicitly bounded to Phase 2 or later.
Computational G2P models covering 500+ languages (Deri & Knight, 2016) are usable for automated phonological analysis	If G2P model quality is insufficient for specific languages, manual phonological analysis by SIL linguists substitutes. Budget absorbs this through contingency.
SIL International's PrimerPrep and Bloom tools are adaptable as component technologies for the Writing IR pipeline	If adaptation proves complex, the project builds equivalent components using the same specification. SIL's tools are a starting point, not a hard dependency.
At least 12 African languages have sufficient phonological documentation for complete parameter extraction	SIL International has documented phonological systems for hundreds of African languages. Language selection will prioritize well-documented languages for Phase 1.
48 months is sufficient to reach Writing IR v1.0 with operational parameterization tools	Phase 1 alone (24 months) produces 12 language parameter sets, a computational phonology pipeline, and a validated IR v0.2 — valuable even if Phase 2 requires extension.
Easy FLN Localization does not promise that the Writing IR will eliminate all manual FLN localization effort by Month 48	The IR reduces localization from months of expert redesign to weeks of parameterization and validation. Cultural content (mnemonics, illustrations, audio) still requires local teams.

10. Evaluation and Go/No-Go Criteria

10.1 Independent Evaluation

The project will be independently evaluated at the end of Phase 1 and Phase 2 by an external evaluator. Evaluation criteria include: letter-sequence accuracy against expert-produced sequences, decodable-text quality, phonological validity of GPC tables, literacy specialist satisfaction with generated localizations, and — where classroom trials are possible — learning outcomes compared to manually localized courseware.

10.2 Go/No-Go Gate

Phase 1 → Phase 2 gate (Month 24): Phase 2 funding is released when all of the following hold: the formal validation study achieves ≥80% pedagogical validity (as rated by literacy specialists) for Writing IR–generated FLN localizations across 12 languages; at least 2 FLN courseware developers have mapped their pedagogical structure to the Writing IR; the Writing IR v0.2 specification is published; all 12 language parameter sets are complete; and peer-reviewed results are submitted for publication. If pedagogical validity is at least 60% but below 80%, Phase 1 is extended for architectural refinement (see Section 11.2). If pedagogical validity falls below 60%, the project convenes a technical review to determine whether the IR architecture requires fundamental revision or the approach is non-viable.

10.3 What If the Writing IR Approach Proves Non-Viable?

If the Writing IR fails to achieve ≥60% pedagogical validity at the Phase 1 gate, the project will have produced three outputs with independent value: (a) complete phonological parameter sets for 12 African languages in machine-readable format, a digital language resource contribution; (b) a rigorous empirical assessment of the Writing IR hypothesis, informing future research; (c) a computational phonology pipeline with documented capabilities across African languages. These resources serve the broader African NLP and literacy communities regardless of the IR's ultimate viability.

11. Risk Mitigation

Risk	Mitigation
Writing IR too coarse for pedagogy — generated sequences prove pedagogically invalid	Phase 1 targets transparent orthographies (simplest case) where the IR is on strongest theoretical ground. Desk pilot validates before full investment. Opaque orthographies (e.g., English) are explicitly out of initial scope.
Language data scarcity — insufficient phonological data for target languages	SIL International has documented phonological systems for hundreds of African languages. Masakhane community provides computational G2P models. Partnership with Ministries provides access to mother-tongue literacy specialists.
Mnemonic and cultural content proves non-parameterizable — cultural embedding too deep for formal representation	Mnemonics and culturally specific content are treated as language-specific parameters, not IR-level abstractions. The IR does not attempt to generate cultural content; it generates pedagogical scaffolding into which culturally appropriate content is inserted by local teams.
ECM Curriculum IR delayed — blocks the combined pipeline validation	The Writing IR can be validated independently for language-specific pedagogical quality. Combined pipeline validation is a Phase 2 milestone; ECM delays shift this milestone but do not block Writing IR development.
FLN courseware developers do not adopt (tools go unused)	Phase 1 includes partnership with onebillion and 1–2 additional developers. onebillion's open-source codebase provides a validation target regardless of developer adoption. XPRIZE finalists provide a captive adoption audience during Phase 2.
PI recruitment contingent on institutional commitment — senior researchers unavailable	Co-PI structure with multiple candidates per domain provides redundancy. The project can proceed with any combination of one IR architect, one reading scientist, and one African linguist.
48-month timeline proves insufficient — IR development takes longer than planned	Phase structure ensures useful outputs at each stage; Phase 1 alone produces language parameter sets and a computational phonology pipeline with independent value.

11.1 Degraded Operations: What Ships If Dependencies Slip

Easy FLN Localization's principal external dependency is ECM's Curriculum IR (for combined pipeline validation). The Writing IR can be validated independently; the combined pipeline is a Phase 2 milestone.

If ECM's Curriculum IR is delayed:

The Writing IR develops and validates independently for language-specific pedagogical quality (letter-introduction sequences, decodable-text generation, GPC table accuracy). Combined pipeline validation — Writing IR + Curriculum IR = localized, curriculum-aligned FLN courseware — is deferred until the Curriculum IR is available. The Writing IR's standalone value (automated FLN localization across languages) is unaffected.

If onebillion or other FLN courseware partners are unavailable:

The project validates the Writing IR against publicly available FLN courseware structures and against SIL's Bloom reader specifications. onebillion's open-source codebase remains available as a validation target regardless of organizational partnership status.

If language data for specific target languages proves insufficient:

The project substitutes better-documented languages from the same language family, maintaining typological diversity. SIL International's phonological archives provide fallback data for hundreds of African languages. The 12-language target is a minimum; language selection is flexible within the typological diversity requirement.

11.2 Go/No-Go Gates

The following gates operationalize the Month-24 Proof of Capability (Section 1B) with specific action protocols for each contingency, and define additional checkpoints for operational risk management.

Gate	Timing	Condition	Action if Not Met
Phase 1 → Phase 2 release	Month 24	≥80% pedagogical validity across 12 languages; at least 2 FLN courseware-to-IR mappings; IR v0.2 published; all 12 language parameter sets complete	If validity ≥60% but <80%: extend Phase 1 for architectural refinement. If validity <60%: convene technical review to assess viability.
Desk pilot checkpoint	Month 6	Prototype Writing IR constructed; algorithmically generated letter-introduction sequences compared against expert-produced sequences for 2 languages	If results are negative, project pivots design before committing Phase 1's full investment
Courseware validation	Month 18	At least 1 FLN courseware developer has mapped pedagogical structure to the Writing IR with measurable results	If no developer adoption, project intensifies partnership efforts (onebillion, XPRIZE entrants) and adjusts tooling priorities
Phase 2 completion	Month 48	IR v1.0 published; operational tools delivered; at least 2 FLN courseware applications localized through the IR into all 12 languages	Joint funder-project review; scope and timeline adjustment for any incomplete deliverables
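
The Month-24 row of the table above amounts to a three-way decision on the pedagogical validity score plus a check on the other deliverables. A minimal sketch of that logic follows, assuming validity is expressed as a fraction between 0 and 1. The type, field, and outcome names are hypothetical, and the "deliverables_outstanding" outcome is an assumption: the table does not state what happens when validity is met but another deliverable is missing.

```python
from dataclasses import dataclass


@dataclass
class Phase1Evidence:
    pedagogical_validity: float   # fraction rated valid by literacy specialists, 0.0 to 1.0
    courseware_mappings: int      # FLN courseware-to-IR mappings completed
    ir_v02_published: bool        # Writing IR v0.2 specification published
    parameter_sets_complete: int  # language parameter sets completed


def phase1_gate(evidence: Phase1Evidence) -> str:
    """Apply the Month-24 gate thresholds from the table (80% and 60%)."""
    if evidence.pedagogical_validity < 0.60:
        return "technical_review"   # assess whether the approach is viable
    if evidence.pedagogical_validity < 0.80:
        return "extend_phase_1"     # architectural refinement before release
    deliverables_met = (
        evidence.courseware_mappings >= 2
        and evidence.ir_v02_published
        and evidence.parameter_sets_complete >= 12
    )
    # Assumption: missing deliverables block release even when validity is met.
    return "release_phase_2" if deliverables_met else "deliverables_outstanding"


print(phase1_gate(Phase1Evidence(0.72, 2, True, 12)))  # prints: extend_phase_1
```

Writing the gate down this way makes the boundary cases explicit: exactly 80% releases Phase 2 (if the deliverables are met), and exactly 60% extends Phase 1 rather than triggering the technical review.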

12. Expected Outcomes and Impact

12.1 Direct Outputs

Writing IR v1.0 specification — open-source, peer-reviewed, with governance framework.
12 African language parameter sets in machine-readable format, covering at least 4 language families and both Latin and Arabic scripts.
Validated FLN localizations for at least 2 courseware applications across 12 languages.
Open-source parameterization toolkit for linguists and literacy specialists to encode new languages into the Writing IR.
Computational phonology pipeline for automated G2P extraction, frequency analysis, and GPC consistency scoring.
Foundational Numeracy verbal layer parameter sets for all 12 languages.
Desk pilot report (Phase 1) providing the first empirical assessment of Writing IR viability.
Peer-reviewed publications establishing the Writing IR as a research contribution.
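
To make "frequency analysis and GPC consistency scoring" concrete, the sketch below computes both from a list of aligned (grapheme, phoneme) tokens and derives a candidate letter-introduction order. It rests on assumptions not taken from the project: consistency is defined here as the share of a grapheme's occurrences that map to its most frequent phoneme, ordering is by descending frequency with an optional consistency floor, and the sample data is invented for illustration and does not describe any real language.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (grapheme, phoneme) token from an aligned corpus


def gpc_consistency(pairs: List[Pair]) -> Dict[str, float]:
    """Share of each grapheme's occurrences that map to its most frequent phoneme."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for grapheme, phoneme in pairs:
        counts[grapheme][phoneme] += 1
    return {g: max(c.values()) / sum(c.values()) for g, c in counts.items()}


def introduction_order(pairs: List[Pair], min_consistency: float = 0.0) -> List[str]:
    """Graphemes sorted by descending frequency, keeping those above a consistency floor."""
    frequency = Counter(grapheme for grapheme, _ in pairs)
    consistency = gpc_consistency(pairs)
    eligible = [g for g in frequency if consistency[g] >= min_consistency]
    # Ties in frequency are broken alphabetically so the result is deterministic.
    return sorted(eligible, key=lambda g: (-frequency[g], g))


# Invented toy data: "c" maps to /k/ three times and /s/ once.
sample = [("a", "a")] * 6 + [("m", "m")] * 5 + [("c", "k")] * 3 + [("c", "s")]
print(gpc_consistency(sample)["c"])     # prints: 0.75
print(introduction_order(sample))       # prints: ['a', 'm', 'c']
print(introduction_order(sample, 0.9))  # prints: ['a', 'm']
```

A production pipeline would work from phonologically aligned corpora and would likely weigh additional factors such as syllable structure and visual confusability; the point of the sketch is only that both scores are simple, inspectable functions of aligned data.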

12.2 Beneficiary Population

The 6 Phase 1 countries collectively serve tens of millions of K-3 students across dozens of AU languages. At USD 8 million for infrastructure enabling FLN courseware to reach these students in their mother tongues, the per-student investment is negligible — and amortizes toward zero as additional languages and countries join the Writing IR framework.

12.3 Downstream Impact

If the Writing IR achieves its target of ≥80% pedagogical validity for FLN localizations across 12 languages, it will:

Enable any FLN courseware application to localize into any language for which Writing IR parameters have been supplied — reducing per-language localization effort from months of expert redesign to weeks of parameterization and validation.
Provide the foundation for mother-tongue FLN delivery at continental scale — the prerequisite for addressing the 89% functional illiteracy rate.
Multiply the impact of XPRIZE finalists by enabling rapid localization into the AU languages of participating countries.
Combine with ECM's Curriculum IR to enable fully automated, curriculum-aligned, mother-tongue FLN courseware delivery — the end-state capability of the RESPECT Platform for foundational learning.
Establish the first formal abstraction capturing the deep structural invariants among written languages for pedagogical purposes — a contribution to reading science and computational linguistics with applications beyond Africa.

13. Sustainability and Scaling

13.1 Post-Project Institutional Home

Following the 48-month research project, Writing IR governance and maintenance transfer to the PREMIER Institute's ongoing standards process. The Writing IR's operational infrastructure — parameterization tools and platform APIs — will be maintained by the RESPECT Platform engineering team, funded through V&P_Core's trademark and certification revenue.

13.2 Scaling Pathway

The primary scaling dimension is the number of parameterized languages. The Writing IR shifts the bulk of FLN localization cost away from app developers and onto a Development Partner, who pays the one-time cost per language to map that language into the Writing IR. Each new language requires: (a) phonological data collection and GPC table construction; (b) parameter encoding using the parameterization toolkit; (c) validation by literacy specialists. Once a language has been parameterized, every IR-compatible FLN application can localize into that language — the Development Partner pays once, and all apps benefit.

Extending the Writing IR to all AU languages used in formal education (estimated at 200–300 languages) is fundable through a combination of follow-on grants, GPE allocations, and Ministry co-funding, spread over the Phase 2–3 expansion period.

13.3 Intellectual Property

SIL's existing tools (PrimerPrep, Bloom) remain under their current licenses; the Writing IR specification is designed to interoperate with but not depend on any single proprietary or licensed tool.

14. AU Mandate Alignment

Easy FLN Localization — developing computational phonology pipelines and language parameter sets for African-language Foundational Literacy and Numeracy courseware — addresses the following AU provisions:

Dec.973, para 23 — DPI-Ed investment and teacher digital capacity: Easy FLN Localization builds the language-technology infrastructure that enables DPI-Ed to deliver mother-tongue FLN courseware — the single most impactful application of digital public infrastructure in African basic education.
AU DES, SO2 — Digital content and platforms: Easy FLN Localization enables fully automated, curriculum-aligned, mother-tongue FLN courseware delivery — content at scale in languages that commercial publishers do not serve.
AU DES, SO7 — Digital literacy and skills for teachers: The localized courseware produced by Easy FLN Localization equips teachers with structured pedagogy tools in their students' mother tongues.
CESA 26–35, SA1/Obj 2 — Upgrade curricula to reflect current challenges: CESA specifically notes that "ensuring when feasible that young children are taught in their native language is also a priority." Easy FLN Localization provides the technological capability to make this feasible at scale.
CESA 26–35, SA3/Obj 7 — Foundational learning: This is Easy FLN Localization's core purpose — expanding cost-effective approaches to improve foundational literacy and numeracy, as CESA's highest-priority learning objective demands.
CESA 26–35, SA5/Obj 14 — Adult literacy campaigns: Easy FLN Localization's computational phonology pipelines could be extended to produce adult literacy materials in African languages, supporting CESA's adult literacy objective.
STISA-2034, SP3 — Frontier and emerging technologies: Easy FLN Localization's computational phonology pipelines apply frontier AI and NLP technologies to African languages, many of which are under-resourced in global NLP research.

15. Conclusion

Easy FLN Localization addresses the most expensive localization problem in education technology. It is the precise problem that the Breakthrough Project's Phase 1 scope (six countries, K-3, FLN) and the XPRIZE Accelerate Learning Challenge require solving — first in six countries, then in 21, then continent-wide.

The evidence from reading science establishes that the expense is addressable. Written languages share deep structural universals (Perfetti & Verhoeven, 2022). The variation among them is parametric and systematically characterizable (Ziegler & Goswami, 2005). Africa's transparent orthographies constrain the parameter space (Makalela, 2024). The component technologies — computational G2P models, language-parameter elicitation tools, decodable-text generators — exist. The formal abstraction that integrates them does not.

Easy FLN Localization will build that abstraction. The Writing IR applies a proven architectural pattern — the same one that makes Easy Curriculum Mapping possible — to a sibling problem, within the same institute, using shared methodology and shared IR architecture. The project's 48-month timeline aligns with ECM, the Breakthrough System's phased deployment, and the XPRIZE competition cycle. Manual FLN localization during Years 1–4 produces the ground truth; the Writing IR during Years 3–4 produces the automation; Year 5 begins operational deployment at the moment when V&P_Core is scaling from six to 21+ countries and XPRIZE finalists are entering the Ecosystem.

The children currently in K-3 across Sub-Saharan Africa will age out of foundational learning within this project's 48-month timeline. Africa's best FLN courseware exists. The Writing IR will make it available — in every mother tongue, on the RESPECT Platform, at a cost that scales.