Divided by a common language

http://technology.guardian.co.uk/weekly/story/0,,1830481,00.html

The internet is a global revolution in communication - as long as you use
letters from the western alphabet. Kieren McCarthy on the growing pressure
for a net that recognises Asian, Arabic and Hindi characters, too

Thursday July 27, 2006
The Guardian

According to Khaled Fattal, "People say the net works, but it only works for
those communities whose native language is Latin-based. The rest of the
world is totally isolated." Fattal speaks perfect English but as chairman
and chief executive of the Multilingual Internet Names Consortium (MINC),
and an Arab, he knows that the majority of the world's population does not.

And he knows that this means the internet is a bewildering and often
incomprehensible place for the billions of people who live east of Greece.

Despite everything you may have heard, the global resource we all know as
the internet is not global at all. Since you are reading this article in
English you probably won't have noticed, but if your first language was
Chinese, Arabic, Hindi or Tamil, you would know very different. At most
websites you visit you will be scrabbling to find a link to a translated
version in your language, seemingly hidden amid tracts of baffling text.
Even getting to a website in the first place requires that you master the
western alphabet - have you ever tried to type ".com" in Chinese letters?

If you think this situation needn't worry you as an English speaker, think
again. At a meeting in the House of Commons this month, a number of
prominent MPs and industry experts listed internationalised domain names
(IDNs) as one of the internet's most pressing priorities. In June, at a
meeting of the Internet Corporation for Assigned Names and Numbers (Icann)
in Marrakech, the "father of the internet" himself, Vint Cerf, highlighted
the introduction of IDNs as vital for the future of the net.

Why the urgency? Because a number of companies - and even countries - that
are frustrated by years of delays have started offering the internet in
their own languages by working outside the existing domain name system
(DNS).

The DNS is the internet's global directory and links particular websites to
particular computers, so if you type in, say, "guardian.co.uk", no matter
where you are on the internet you always end up at the same website. The
problem is that, at the moment, the DNS works only with western languages.

The logic of maintaining a single global directory has so far prevented
people from building and using a different system that includes their
language, but in the past few years there has been such a build-up in demand
from millions of new internet users that the previous agreements are
starting to unravel and risk causing a split in the internet itself.

If that were to happen, the web address you type in could suddenly end up at
an entirely different website depending on where in the world you are, or
which ISP you use. You may want to buy a book from Amazon.com but find that
you end up at a Russian website all about the world's longest river. Email
sent to you could end up with someone you don't know in Korea.
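
As a rough illustration of that risk, the sketch below models two rival
directories that give different answers for the same name. Everything in it is
hypothetical: the directory names, the resolve() helper and the addresses
(taken from ranges reserved for documentation) are assumptions made for the
example, and a real DNS resolver is far more involved.

```python
# Toy model of a split directory. All names and addresses are hypothetical;
# the IP addresses come from ranges reserved for documentation.
from typing import Dict, Optional

GLOBAL_DIRECTORY: Dict[str, str] = {"example.com": "192.0.2.10"}
BREAKAWAY_DIRECTORY: Dict[str, str] = {"example.com": "198.51.100.20"}


def resolve(name: str, directory: Dict[str, str]) -> Optional[str]:
    """Look a name up in one directory; domain names are case-insensitive."""
    return directory.get(name.lower())


# With a single shared directory, every user gets the same answer.
print(resolve("Example.com", GLOBAL_DIRECTORY))

# With two competing directories, the answer depends on which one your
# ISP happens to consult.
print(resolve("example.com", BREAKAWAY_DIRECTORY))
```

The same name maps to two different machines, which is the fragmentation the
article describes.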

The internet community received a scare in February when China announced it
had created three new top-level domains that were the Chinese equivalents of
".com", ".net" and ".china". If China had decided to break away from the
global internet, others would certainly have soon followed. There was a huge
wave of relief when the Chinese government explained that it had made the
new domains available only within China itself. But the fact that experts
didn't doubt that China was capable of and willing to separate from the
global internet was a wake-up call in itself.

And it's not just China. Israel has set up its own internal system for
domains in Hebrew. Korea has done the same in its language - as have Iran,
Syria and Japan.

But as the world grows smaller, these countries are no longer prepared to
stick with their add-on systems, accessible only when they are in their own
country. They want to register a domain name that is accessible across the
world in the same way that western domains have been from day one.

At a May meeting of the International Telecommunication Union in Geneva,
however, the western world finally woke up. MINC's Fattal demonstrated a
prototype system that worked with the existing internet but also allowed new
languages to be added to the global system.

"We have found a way of connecting these islands [of different-language
networks] and also connecting to the global internet," Fattal explains.
"With this approach, we can leave the current DNS untouched and safe while
helping coordinate between other countries in the namespace. In other words,
now there's a choice."

In Fattal's presentation, suddenly the internet that we all understand as
the global internet today was represented as the "ASCII 'English' internet",
which took its place alongside the Arabic internet, Persian internet,
Chinese internet, Indian internet, Korean internet and so on.

To understand how we have reached the position where there is a real risk of
the internet fragmenting, you need only review the term ASCII itself. It
stands for American Standard Code for Information Interchange and it is the
code devised to enable computers to represent and process all the characters
in the English alphabet (a through to z, plus 0 to 9 and the various symbols
you get on your keyboard such as % and &).
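
To make the limitation concrete, here is a minimal Python sketch of what "fits
in ASCII" means in practice. The helper name fits_in_ascii() is an assumption
made for this example, not part of any standard; the underlying fact is that
ASCII defines only 128 codes, numbered 0 to 127.

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character in text has an ASCII code (0 to 127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# Each ASCII character is simply a small number.
print(ord("a"), ord("z"), ord("%"))      # 97 122 37

print(fits_in_ascii("guardian.co.uk"))   # True
print(fits_in_ascii("r\u00e9max"))       # False: the accented e is outside ASCII
print(fits_in_ascii("\u4e2d\u56fd"))     # False: Chinese characters are outside ASCII
```

Anything that returns False here cannot be written directly into a system that
accepts only ASCII.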

It was first developed in 1967 and written into the internet's foundations
by American scientists. It is now so hardwired into the net that the only
way to include other characters such as accents on letters, or Chinese or
Arabic script, is to use complex combinations of letters that don't exist in
English words in order to represent them.

Linguists have created long tables to represent all the possible
combinations and permutations of different languages. In the case of
internet domain names, the address is preceded by "xn--" and then an agreed
code. For example "www.rémax.com" is represented as "www.xn--rmax-bpa.com".
Using this method, it suddenly becomes possible to have internet domain
names containing foreign characters, and hence foreign language domain
names.
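
The conversion described above can be tried with the "idna" codec that ships
with Python, which follows the older IDNA 2003 rules. This is only a sketch:
the two helper names are assumptions made for the example, and software that
handles real registrations may apply newer or stricter rules.

```python
def to_ascii_domain(name: str) -> str:
    """Convert a domain with non-ASCII labels to its ASCII "xn--" form."""
    return name.encode("idna").decode("ascii")


def to_unicode_domain(name: str) -> str:
    """Convert an ASCII "xn--" domain back to its readable form."""
    return name.encode("ascii").decode("idna")


encoded = to_ascii_domain("www.r\u00e9max.com")   # "r\u00e9max" is "rémax"
print(encoded)                     # www.xn--rmax-bpa.com
print(to_unicode_domain(encoded))  # www.rémax.com
```

Each label between the dots is converted separately, so "www" and "com" pass
through unchanged and only the label containing the accented letter gains the
"xn--" prefix.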

From the western perspective this approach was sufficient for the rest of
the world to use the internet. But the problem is that each of these domains
still has to use the existing domain system with ".com" or ".net" - suffixes
that are virtually incomprehensible to non-Latin-derived language users.

The problem was initially overcome by keyboard manufacturers adding buttons
with ".com" printed on them that did nothing but add ".com" to the end of
what a user had typed. But as the number of new top-level domains has
expanded over time, this sticking plaster approach has proved unworkable.
People want their own domains in their own language, as was made clear by a
recent addition to Japan's own internal domain name system that advertised
itself: "At last - the domain name you can spell!"

There is only one organisation that can add new top-level domains to the
existing global internet, and it is a not-for-profit company based in
California and controlled by the US government: Icann.

Icann was first approached in the year it was created - 1998 - with the aim
of introducing "internationalised domain names" into its system. But it has
yet to introduce a single one. Many members of the global internet community
have cried foul at the endless delays from a company based in the least
linguistically diverse area of the world (the US has speakers of 170
different languages, compared to 364 in Europe and 2,390 in Africa).

These accusations have only been strengthened by the fact it is American
companies that own and run the existing global domains and so have the most
to lose from new foreign-language additions. These companies not only have
disproportionate influence over Icann but have also been insisting on being
given automatic ownership rights to any foreign versions of their domains -
an argument of such corrupt logic that the very fact it is even discussed is
a major cause of concern.

On top of that, the proud and ancient cultures of Asia, Africa and the
Middle East are offended by the very suggestion that they should need to
apply to a private US company in order to have their language accepted as
legitimate on the internet.

As overall coordinator of the domain name system, Icann is caught in a bind
in which it is desperate to avoid the political repercussions of approving
or not approving languages, whilst at the same time maintaining overall
charge of the domain name system to prevent everything falling apart.

Icann has successfully delayed the day it has to make such decisions by
pointing to the complex technical issues that have to be decided first.
However, with non-Latin-language networks becoming increasingly advanced,
China making it clear it is prepared to break away from the internet, MINC
touting a solution that could bypass its processes altogether and, perhaps
most crucially, Microsoft deciding to include IDN technology in the new
version of Internet Explorer, out later this year, Icann has been left with
no choice but to speed up the technical side of internationalised domain
names in a bid to keep the net together.

Once that technical side is completed, it will take a masterstroke of
international political will to keep the internet as we now know it together
in one piece.

The sore reality is that global internet politics mean nothing to users in
Korea, Syria or Egypt. They simply want to be able to use this remarkable
medium in their own language, in their own way.