Interpreting t-h0- mechanism

Matthew Stuckwisch via CLDR-Users cldr-users at unicode.org
Sun Oct 20 16:28:31 CDT 2019


> On Oct 20, 2019, at 3:58 PM, Doug Ewell via CLDR-Users <cldr-users at unicode.org> wrote:
>>> I work occasionally with documents in Eonaviego which would best be
>>> coded as ast-t-gl-h0-hybrid, but then when translated to-from (which
>>> there are quite a few to/from Asturian or Spanish), there would be no
>>> valid encoding, so being able to represent a hybrid language as a
>>> source/destination of a transform is not a pure hypothetical for me.
>> 
>> The hybrids were originally designed for cases like Hinglish or
>> Denglish, where there are large numbers of borrowings of words from a
>> different language. Eonaviego sounds like a set of dialects on the
>> continuum between Asturian and Galician. That is, it doesn't appear to
>> be Asturian with a batch of loan words from Galician.
> 
> It sounds like the best course of action might be to investigate adding a BCP 47 variant, rather than trying to shoehorn this dialectal situation into the T extension.

That's certainly fair, but I only used it as a quick example.  In my head, 'hybrid' doesn't imply exclusively lexical borrowing, and perhaps we could add other values to describe the relationship more explicitly.  For example: 'hybrid' for lexical borrowing (kept more for backwards compatibility than precision), 'codeswap' for true codeswapping, 'diacont' for an intermediate variety in a dialect continuum, 'mixed' for an intermixing where both grammar and lexicon are taken, etc.
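
To make the idea concrete, here is a rough Python sketch of how a consumer might pull the h0 value out of a tag such as ast-t-gl-h0-hybrid. Everything in it is an assumption for illustration: parse_t_extension and PROPOSED_H0 are hypothetical names, only 'hybrid' exists today, and 'codeswap', 'diacont' and 'mixed' are just the values floated above. The parsing is deliberately simplified (it does not handle private-use 'x' sequences or validate subtags).

```python
import re

# Only "hybrid" is an existing h0 value; the other three are the
# hypothetical values suggested in this message, not part of CLDR.
PROPOSED_H0 = {"hybrid", "codeswap", "diacont", "mixed"}

# A T-extension field key is one letter followed by one digit (e.g. "h0").
TKEY = re.compile(r"^[a-z][0-9]$")


def parse_t_extension(tag):
    """Split a tag into (base, tlang, fields) around the -t- singleton.

    Simplified sketch: assumes a lowercase-insensitive, well-formed tag
    with no private-use section.
    """
    subtags = tag.lower().split("-")
    if "t" not in subtags:
        return "-".join(subtags), None, {}
    i = subtags.index("t")
    base = "-".join(subtags[:i])
    tlang, fields, key = [], {}, None
    for sub in subtags[i + 1:]:
        if len(sub) == 1:
            # Another singleton (e.g. "u") ends the T extension.
            break
        if TKEY.match(sub):
            key = sub
            fields[key] = []
        elif key is None:
            # Subtags before the first field key form the source language.
            tlang.append(sub)
        else:
            fields[key].append(sub)
    joined = {k: "-".join(v) for k, v in fields.items()}
    return base, "-".join(tlang) or None, joined


base, tlang, fields = parse_t_extension("ast-t-gl-h0-hybrid")
# base is "ast", tlang is "gl", fields is {"h0": "hybrid"}
is_known = fields.get("h0") in PROPOSED_H0
```

With something like this, a tag such as ast-t-gl-h0-diacont would parse the same way and differ only in the h0 value, which is the point of adding more values rather than overloading 'hybrid'.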

For fairly established mixes, I think variants (or even language codes, since many mixed languages already have them) are a good idea, especially since they can be used in places where the extension singletons can't.

Matéu

