CSUR Tonal

Sun Mar 15 17:50:13 CDT 2015

Luke Dashjr <luke at dashjr dot org> wrote:

> That is, 100 decimal is "one hundred" with a binary value of 110 0100.
> But the same "100" in tonal would be "san" with a binary value of
> 1 0000 0000.

"100" with the meaning of "one hundred" is spoken as "ciento" in 
Spanish, "ekatón" in Greek, "sto" in Russian, etc. So pronunciation by 
itself doesn't necessarily justify separate encoding.

Within English-speaking contexts, "100" can also be a binary number, or 
an octal number with a binary value of 100 0000. In my world as a 
developer, it's often a hex number, as in tonal. In most of these cases 
it's typically pronounced "one zero zero" or "one oh oh." So the numeric 
value of a string of digits within a positional system also doesn't 
necessarily justify separate encoding.
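
To make the point about positional value concrete, here is a minimal
Python sketch that reads the same digit string "100" under several
radices and prints the resulting value. The helper name grouped_binary
is hypothetical and exists only for this illustration; it groups the
binary digits in fours, as in the examples quoted above.

```python
def grouped_binary(value: int) -> str:
    """Return the binary digits of value in groups of four from the right."""
    bits = format(value, "b")
    groups = []
    while bits:
        groups.insert(0, bits[-4:])  # peel off the last four digits
        bits = bits[:-4]
    return " ".join(groups)

digits = "100"

# Same characters, different radix, different number.
values = {base: int(digits, base) for base in (2, 8, 10, 16)}

for base, value in values.items():
    print(f'"{digits}" in base {base}: {value} decimal, {grouped_binary(value)} binary')
```

The characters fed to int() are identical in every case; only the
radix supplied from outside changes, which is the kind of contextual
hint a reader or a TTS system has to get from somewhere other than the
digits themselves.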

TTS systems always have to rely on environmental hints. Anyone who has 
worked on them will agree.

> And in the other example, one is "B with double lines" vs "bitcoins".

As David pointed out, currency symbols really aren’t analogous to
anything else. They are never built from combining characters, and
never decompose into them. This has nothing really to do with TTS or
pronunciation. One person in the Ubuntu thread mentioned pronunciation,
but that is not the primary reason.

--
Doug Ewell | http://ewellic.org | Thornton, CO