Translating the standard (was: Re: Fonts and font sizes used in the Unicode)

Marcel Schneider via Unicode unicode at unicode.org
Wed Mar 7 19:27:06 CST 2018


On Mon, 5 Mar 2018 20:19:47 +0100, Philippe Verdy via Unicode wrote:
 
> There have been significant efforts to "translate" or more precisely "adapt" 
> significant parts of the standard with good presentations in Wikipedia and 
> various sites for scoped topics. So there are alternate charts, and instead 
> of translating all, the concepts are summarized, reexplained, but still 
> give links to the original version in English every time more info is needed. 

Indeed, one of the best uses we can make of efforts in Unicode education is
extending and improving the Wikipedia coverage, because this is the first
place almost everybody goes to. So if a government is considering an 
investment, donating to Wikimedia and motivating a vast community seems
a really good plan. And hiring staffers for this purpose would increase the
reliability of the data (given that some corporations misuse the infrastructure
for PR).

> All UCD files don't need to be translated, they can also be automatically 
> processed to generate alternate presentations or datatables in other 
> formats. There's no value in taking efforts to translate them manually, 
> it's better to develop a tool that will process them in the format users 
> can read. 

The only UCD file Iʼd advise translating in full is the NamesList, as it is the
source code of the Code Charts. These are indeed indispensable because of
the glyphic information they convey, which can be found nowhere else. Hence
all good secondary sources like Wikipedia link to the Unicode Charts.
The NamesList per se is also useful in that it provides a minimal amount of
information about the characters. But it lacks important hints about bidi‐mirroring,
which have to be compiled from yet another UCD file. The downside of generating
a holistic view is that it generally ends up as an atomic view on a per‐character
basis. Anyway, itʼs up to the user to gather an overview tailored to his/her
needs. This is catered for by Chinese and Japanese versions of sites such as
www.fileformat.info.
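
To illustrate compiling the mirroring hints from a second file, here is a
minimal sketch in Python. It assumes that the other UCD file is
BidiMirroring.txt, and it works on a few inline sample lines written in the
style of NamesList.txt and BidiMirroring.txt rather than on the real files.
The function names (parse_names, parse_mirroring, annotate) are made up for
this example and are not part of any Unicode tool.

```python
# Illustrative excerpts only; the real UCD files are far longer.
NAMESLIST_SAMPLE = "\n".join([
    "; charset=UTF-8",
    "0028\tLEFT PARENTHESIS",
    "\t= opening parenthesis (1.0)",
    "0029\tRIGHT PARENTHESIS",
    "\t= closing parenthesis (1.0)",
    "0041\tLATIN CAPITAL LETTER A",
])

BIDI_MIRRORING_SAMPLE = "\n".join([
    "# code point; mirrored code point # name",
    "0028; 0029 # LEFT PARENTHESIS",
    "0029; 0028 # RIGHT PARENTHESIS",
])


def parse_names(text):
    """Return {code point: name} from NamesList-style character lines."""
    names = {}
    for line in text.splitlines():
        # Skip blank lines, annotation lines (leading tab), block and
        # header directives (leading @) and file comments (leading ;).
        if not line or line[0] in "\t@;":
            continue
        code, _, name = line.partition("\t")
        try:
            names[int(code, 16)] = name
        except ValueError:
            continue  # not a character entry
    return names


def parse_mirroring(text):
    """Return {code point: mirrored code point} from BidiMirroring-style lines."""
    pairs = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        source, target = (int(field.strip(), 16) for field in data.split(";"))
        pairs[source] = target
    return pairs


def annotate(names, mirroring):
    """Attach the mirrored code point (or None) to every named character."""
    return {cp: (name, mirroring.get(cp)) for cp, name in names.items()}


merged = annotate(parse_names(NAMESLIST_SAMPLE),
                  parse_mirroring(BIDI_MIRRORING_SAMPLE))
for cp, (name, mirror) in sorted(merged.items()):
    hint = f" (mirrors to U+{mirror:04X})" if mirror is not None else ""
    print(f"U+{cp:04X} {name}{hint}")
```

A localized names list could be plugged in by replacing the name column,
while the mirroring data stays language‐independent and never needs
translating.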

[…]
> The only effort is in: 
> * naming characters (Wikipedia is great to distribute the effort and have 
> articles showing relevant collections of characters and document alternate 
> names or disambiguate synonyms). 

Naming characters is a real challenge and often runs into multiple issues.
First we need to make clear for whom the localization is intended: technical people
or UIs. It has happened that a literal translation tuned in accordance with specialists
was then handed out to the industry to show up on everyoneʼs computer,
while some core characters of the intended locale are named differently in real
life, so that students donʼt encounter what they have learned at school. 
And the worst thing is that once a translation is released, image considerations
lead people to seek stability even where no Unicode (ISO) policy prevents updates.

> * the core text of the standard (section 3 about conformance and 
> requirements is the first thing to adapt). There's absolutely no need 
> however to do that as a pure translation, it can be rewritten and presented 
> with the goals wanted by users. Here again Wikipedia has made significant 
> efforts there, in various languages 
> * keeping the tools developed in the previous paragraph in sync and 
> conformity with the standard (sync the UCD files they use).  

Yes, the biggest issue over time, as Ken wrote, is to *maintain* a translation, 
be it only the NamesList.


Marcel


