A New Rubric for “Script Readiness”

Anushah Hossain, 2024

One of the most significant updates to the SEI website is the addition of a Script Readiness rubric. This post delves into the motivation behind developing this tool and how we hope you’ll use it. 

Let’s start with a bit of background on how SEI operates. Our work on Unicode proposals is funded almost entirely by external grants. These grants usually run on two- to four-year cycles. We select a set of scripts to prioritize at the outset of a grant application; if the grant is successful, we then distribute those funds to collaborators to draft formal proposals. 

Some projects we take on ourselves—such as initiating research, funding font development, or organizing expert meetings—while others we help facilitate by connecting scholars, linguists, and community members with the resources they need. As a central hub for proposal development, we often have insight into ongoing or planned proposals elsewhere in the world. Over the years, we’ve also compiled research on the full range of extant scripts worldwide, whether or not anyone is working on proposing them to Unicode. 

All this work converges in one place: our Scripts to Encode page. While several other resources track the universe of writing systems, SEI has long specialized in offering details about Unicode proposal status—documenting who is working on what and explaining where a proposal currently stands in the development process.

The Scripts to Encode page continues to highlight that specialized information, but the latest update adds critical new information: a more explicit evaluation of how “ready” a script is for encoding.

Why do we need a rubric?

While it is often said that over 150 scripts remain to be encoded, not all are on equal footing. After conducting a systematic review of every script on our docket, we found meaningful distinctions. Some scripts are actively advancing. Others are viable candidates but stalled due to a lack of funding, documentation, or community input. Still others face structural hurdles such as a lack of active users or decipherment issues, which place them out of practical scope for now.

In the past, some of this information distinguishing script proposal stages lived in our quarterly liaison reports. But it was relatively buried and didn’t include updates on proposals outside our direct work. Now, we’ve consolidated our internal evaluations into a clear, easy-to-read rubric.

HIGH: Script is nearly ready for Unicode inclusion. Preliminary proposals exist. Script is considered established and unique; community contacts, experts, and encoding strategy are likely identified, but may require some further investigation. A proposal author may still need to be assigned.

High-readiness scripts should be targeted in the near future or are being actively advanced by SEI or others.

MEDIUM: Script is certainly established and viable for Unicode inclusion. More research is required to prepare a proposal. We may have yet to identify experts to provide review or someone to author a proposal. 

Medium-readiness scripts are excellent candidates for research projects, which can result in the advancement of a script proposal. But they may require sizeable effort.

LOW: Scripts are not currently considered viable for Unicode inclusion due to major barriers. Possible issues include: 

– limited evidence of adoption, durability, and/or stability
– issues with intellectual property rights
– limited understanding of the script and/or awaiting decipherment
– unclear status as a writing system 

Low-readiness scripts are unlikely to be funded for a full proposal by SEI until conditions drastically change. Scripts in this category are considered suitable for research reports, introductory proposals, or scholarship to contribute to general knowledge.

UNASSIGNED: SEI is aware of the script, but needs more information or research to determine its readiness for Unicode inclusion.

We should emphasize: these are discretionary assessments, based on the best information available to us, and they involve a degree of impressionistic normalization. We’ve tried to rate scripts comparably within and across script regions, which can be a difficult task. As a result, we’ve added some fuzzy categories, medium-high and low-medium, for scripts that fall between the larger categories. We know the picture is incomplete, but we hope that it is better to have some heuristics than none.

The rubric also relates to our broader reflections about what makes a script “ready.” Over time, Unicode maintainers’ thresholds for encoding new scripts have grown more stringent. Today, newly invented scripts often need to demonstrate five or more years of sustained usage, ideally across a range of media and use cases. This higher bar means that a great number of scripts with potential are currently categorized as low. That rating is not meant as a verdict. We are thinking hard about what types of tools could help a script gain greater traction in the absence of a Unicode encoding – a topic that we’ll explore more in future posts. 

Why this matters

First, the rubric helps paint a more realistic picture of the number of unencoded scripts. While the latest list may suggest there are over 170 unencoded scripts, the number of those that are viable for short-term proposal development is closer to one hundred. 

Second, it highlights which scripts could be encoded with the right injection of resources. Those in the medium category, in particular, could be excellent candidates for future funding and time. 

Internally, the rubric helps us identify high-priority scripts for funding applications or future proposal development. It also allows us to connect individuals working on similar issues or researching related scripts by sharing our latest knowledge transparently.

As for you, we hope this tool helps you to:

We’ll continue updating the rubric as new information becomes available, and welcome input from anyone working in this space. We invite you to head over to the live page and let us know what you think!