Interview with SEI Founder, Debbie Anderson - Script Encoding Initiative

In many ways, SEI is synonymous with Debbie Anderson. Over the years, Debbie has worn many hats, splitting her time as Chair of the Script Ad Hoc (now called the Script Encoding Working Group) in Unicode and as the Project Director of SEI at UC Berkeley. Across both roles, she’s become known for her careful, compassionate feedback and ability to humanize a complex technical world for newcomers. Although she’s officially retired, we’re very grateful to still have Debbie’s guidance in plotting the path for SEI’s future.

In the interview below, she reflects on more than two decades of work with Unicode and script communities. We hope you enjoy the conversation.

How did you first learn of Unicode and get the idea to start SEI?
Did you always imagine yourself working at the intersection of linguistics and technology?

My initial introduction to Unicode came about through a project I had to put the UCLA Indo-European Studies Bulletin online. At the time, UC Berkeley library staff member Kirk Hastings explained that I couldn’t get some letters from the Old Italic script to appear online because they weren’t in Unicode – which raised the question, “What is Unicode?” Digging into this question took me to the Sybase office of Ken Whistler, editor of the Unicode Standard, who calmly provided background on Unicode, encouraged me to read the Unicode Standard book, and explained the rather opaque world of character and script encoding.

Cover of the third Unicode manual, released in 2000

The idea of the Script Encoding Initiative project arose around 2000 as a suggestion from Rick McGowan, then Vice President of the Unicode Consortium, who encouraged me to start a project at UC Berkeley that would ensure academics’ voices would be heard in the Unicode Technical Committee. Since the UTC makes decisions on various scripts – including historic and modern minority scripts – it seemed vital that academic experts be consulted. At that time, no universities or colleges were voting members.

With seed funding from a foundation run by A. Richard Diebold Jr., a linguistic anthropologist who specialized in Indo-European linguistics, SEI was set up at UC Berkeley in the Department of Linguistics in 2002. Work initially focused on historic Indo-European scripts and characters relating to Indo-European, such as Old Italic and Linear B, but then later expanded to non-Indo-European scripts, such as Ol Chiki. Work on Indo-European scripts was aided by contacts made earlier when I was editor of the UCLA Indo-European Studies Bulletin (née Indo-European Studies Newsletter). For non-Indo-European scripts, I have relied on contacts suggested by UC Berkeley faculty Johanna Nichols and Jim Matisoff. The project proved very successful, leading to six NEH grants and the encoding of many scripts and character additions.

Feature of SEI’s “Universal Scripts Project” on the NEH website

I have often been interested in the intersection of computers and languages, having taken a course as a graduate student at UCLA on Assembly language. After receiving my Ph.D., I worked on various software projects for a spin-off of Houghton-Mifflin, though the work dealt with Latin-based material and wasn’t in different scripts. My introduction to problems of different writing systems arose when typing my dissertation on an early Apple computer, since it included words from various Indo-European historic languages. The same problems reappeared when dealing with articles for the Indo-European Studies Bulletin, where the authors sent their articles using many non-standard fonts. Little did I know that Unicode would be the answer to handling various characters and scripts.

Page from the Indo-European Studies Bulletin from 1998

How did your training in Indo-European languages prepare you for this work?
And what didn’t it prepare you for?

My Ph.D. work in Indo-European at UCLA entailed coursework on Sanskrit with the Devanagari script, early Greek written in Linear B, Hittite in cuneiform, Old Church Slavic in Glagolitic, and other languages and scripts, which all proved to be a valuable introduction to various writing systems – though for many languages generally via the lens of transliteration / transcription in the Latin script. In hindsight, I should have taken more classes in phonetics as well as a course in writing systems, though writing systems was not taught separately (or not that I was aware of). More background in programming could have been very helpful too.

What have been some of your favorite projects?

I suppose some of my favorite script encoding projects are those that started as a request from a user and have – after a long period – ended up with the script being approved and successfully implemented on computers, devices, and software. One example is N’Ko, which was originally requested by the ever-patient Mamady Doumbouya and his colleagues. Mamady called me a number of times on the phone to ask how to keep the script moving forward, and personally raised donations to SEI to support work on N’Ko, which also received a grant from UNESCO’s Initiative B@bel for the proposal by Michael Everson. It is very rewarding to see its use today on Facebook.

A more recent success story is Adlam, which again started as an email from one of the Barry brothers in 2007, asking about getting their script into Unicode. The Barry brothers then worked closely with the Unicode Technical Committee and, after approval and publication in Unicode 9.0 in 2016, they engaged with implementers and font designers to get the script supported on devices.

For historic scripts, the large new repertoire of Egyptian Hieroglyphs of nearly 4,000 characters is a big accomplishment. Work was done in close cooperation with Egyptologists in Europe and Michel Suignard, who worked on the proposal and code charts, building on initial work by Michael Everson and Bob Richmond. More recently, it has been very satisfying to get Cypro-Minoan encoded, after having an in-person meeting with experts in Paris and subsequent email communication to work through the questions.

What were some of the biggest or most unexpected challenges you faced along the way?
What do you most wish outside observers would understand about your work?

The biggest challenge has been working through issues with script users and experts, involving consultation with the technical specialists in Unicode. Good communication and trust are, in my opinion, key to being able to make progress.

What I would wish outside observers to know is that the process of encoding scripts and characters is slow and requires patience, careful work, and openness both to learning about Unicode and the encoding process and to working with Unicode specialists to resolve any issues. Encoding scripts and then getting them to work on computers involves many moving parts, and takes time.

How has public understanding of Unicode changed since you first got involved?

When I first started working on encoding scripts, it was a challenge to explain why it is important to get a script and characters into Unicode – why can’t I just use my own font? As computers, the Internet, and Unicode have progressed and evolved, however, this view has changed. Now requests for additional characters and scripts are received unsolicited; explaining why it is important for a script to be in Unicode is no longer difficult (or even necessary). For some newly created scripts, the model of N’Ko or Adlam serves as an example, but users of recently created scripts don’t see the long tail of work behind their success.

Looking back, what does it feel like to see SEI grow the way it has?
Could you have imagined the impact it’s had or the vast network it helped build?

I consider myself an interlocutor, connecting users with those in the technical standards committees, and providing encouragement where needed and explanation of the process. It is rewarding to see all the scripts encoded, but not quite fulfilling because I know many of these new characters and scripts are not fully implemented – there is more work to do!

As SEI moves into a new chapter, what are you most excited to see it take on?
And for you: what projects or passions (script-related or not!) are calling your name now that you’re stepping back?

I am very honored and pleased to have Anushah Hossain taking over as head of SEI. She brings lots of new ideas and enthusiasm to take the project to new levels and areas. I plan to remain in the background of SEI, providing any information on past work or guidance needed, and will still be involved in other work at Unicode as a volunteer – in the Editorial Working Group, the Script Encoding Working Group, and other groups where my skills can be put to work. In the meantime, I will fill my time with developing new skills, such as mahjong or tap, or branch out into other interests that may arise.

I also want to mention that the success of SEI is not based on any special ability of mine, but rather it reflects the joint work (and support) of many people including Ken Whistler, Roozbeh Pournader, Anshuman Pandey, and earlier Rick McGowan, Lisa Moore, and Michael Everson. The work would not have been done without funding from NEH, a very generous outside donor, and A. Richard Diebold Jr.