Use of Unicode 6.3 bidi format chars in CLDR number formats?
Philippe Verdy
verdy_p at wanadoo.fr
Fri Apr 29 02:56:11 CDT 2016
Yes, but this is unnecessarily complex to edit in surveys, even if the XML
or JSON exports insert these characters themselves, and even if libraries
using the data can detect those characters and replace them with markup or
styles. That detection only works when the controls are properly paired: it
is not possible for RLM and LRM, it is overly complex for LRO/PDF and
RLO/PDF, and it is feasible for LRI/PDI, RLI/PDI and FSI/PDI. FSI/PDI is
probably the best match in HTML for the "bdi" element without dir="ltr" or
dir="rtl".
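
To make the isolate-to-markup replacement concrete, here is a minimal sketch
in Python. It is an illustration only, not something taken from CLDR or from
any existing library: the function name isolates_to_bdi is hypothetical, and
it assumes that an unmatched PDI can simply be dropped and that isolates
still open at the end of the string should be closed there.

```python
import html

# The four isolate controls added in Unicode 6.3.
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"

# Opening tag for each isolate initiator. FSI maps to a bare <bdi>,
# whose direction defaults to "auto" in HTML.
OPEN_TAG = {LRI: '<bdi dir="ltr">', RLI: '<bdi dir="rtl">', FSI: "<bdi>"}


def isolates_to_bdi(text: str) -> str:
    """Hypothetical helper: replace isolate controls with <bdi> markup."""
    out = []
    depth = 0  # number of isolates currently open
    for ch in text:
        if ch in OPEN_TAG:
            out.append(OPEN_TAG[ch])
            depth += 1
        elif ch == PDI:
            # Assumption: a PDI with no matching initiator is dropped.
            if depth > 0:
                out.append("</bdi>")
                depth -= 1
        else:
            # Escape ordinary text so the result is safe to embed in HTML.
            out.append(html.escape(ch))
    # Assumption: isolates left open are closed at the end of the string.
    out.append("</bdi>" * depth)
    return "".join(out)
```

A pattern wrapped as FSI ... PDI would come out wrapped in a plain <bdi>
element, and nested isolates become nested elements. The same approach does
not carry over to LRM and RLM, which have no closing counterpart to pair with.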
Do you expect that the survey will allow entering those controls easily?
Can't there be helpers?
2016-04-29 9:24 GMT+02:00 Mark Davis ☕️ <mark at macchiato.com>:
> The number and currency formats can be used in a variety of contexts and
> adjacent to a variety of text. The bidi isolate characters were designed
> *precisely* to address this kind of need, without forcing people to jump
> through hoops.
>
> The *only* question we have is whether the major platforms/systems that
> use CLDR are all up to speed in terms of supporting the "new" (2013)
> characters in their BIDI algorithms:
>
> U+2066 LEFT-TO-RIGHT ISOLATE
> U+2067 RIGHT-TO-LEFT ISOLATE
> U+2068 FIRST STRONG ISOLATE
> U+2069 POP DIRECTIONAL ISOLATE
>
> Of course, anyone who is using the number formats in a richer format (like
> HTML) is free to remap characters to markup when processing. That's their
> choice.
>
> Mark
>
> On Fri, Apr 29, 2016 at 6:59 AM, Asmus Freytag (c) <asmusf at ix.netcom.com>
> wrote:
>
>> On 4/28/2016 9:37 PM, Steven Loomis wrote:
>>
>>> Asmus:
>>>
>>> Given the correct choice of internal format for the database,
>>>>
>>>
>>> The internal format is a Unicode String, specifically, UTF-8.
>>>
>> That covers a lot of ground.
>>
>>>
>>> Given that CLDR data should be specifying the desired appearance
>>>>
>>> But CLDR is text, specifically, XML, and not glyphs…
>>>
>>
>> Sorry, I meant that CLDR should be specified in a way that the
>> user-expected "visual ordering" can be determined, not "appearance" as in
>> "glyphs".
>>
>> Just to sidestep a potential misunderstanding: I'm not suggesting that
>> the format be in visual order. Just that there are some assumptions made
>> about the context in which the Unicode string (when bidi processed) will
>> result in the correct visual appearance.
>>
>> For example, if you assume that a string as stored displays correctly when
>> it is part of a RTL paragraph, then you should be able to compute what you
>> need to do to get the correct visual order when the text is part of an LTR
>> paragraph, part of an isolated embedding, etc.
>>
>> I haven't looked into the actualities, but I know that while you can
>> convert uniquely between some formats in a given direction, there are some
>> conversions (or directions) that are not unique. So the challenge would be
>> for the database to find some format that allows conversions to all the
>> bidi contexts (and capabilities) that are typically encountered.
>>
>> Storing things in visual order is a bad idea, because in the general
>> case, conversion to logical order is not unique.
>>
>> But, instead of picking some "random" logical order (based on an
>> assumption of what "might" be most needed) my suggestion is to carefully
>> pick a "universal" format for the string, one that allows mechanical
>> conversion to all the actual formats that people need, based on what
>> environment they want to embed their strings into, and what sorts of
>> embedding / isolation controls are actually supported.
>>
>> A./
>>
>>
>>
>>> Steven
>>>
>>> On 4/28/16 7:30 PM, "CLDR-Users on behalf of Asmus Freytag (c)" <
>>> cldr-users-bounces at unicode.org on behalf of asmusf at ix.netcom.com>
>>> wrote:
>>>
>>> On 4/28/2016 3:44 PM, Peter Edberg wrote:
>>>>
>>>>> Dear CLDR users,
>>>>>
>>>> Peter,
>>>>
>>>> I think this is where a "one size fits all" solution isn't the answer.
>>>>
>>>> Ideally, I'll be able to use CLDR (and formatting tools depending on it)
>>>> to format date/time/number strings for a variety of consumers.
>>>>
>>>> These include plain text (pre 6.3), plain text with isolate support, and
>>>> plain text for embedding into markup (where I'll supply external markup
>>>> to isolate and otherwise prep the field).
>>>>
>>>> Given that CLDR data should be specifying the desired appearance (not
>>>> the bidi controls necessary to get to that) it should be possible to
>>>> provide mechanical conversion between these formats, rather than having
>>>> to make a single choice for the database.
>>>>
>>>> Not only will "pre 6.3" support be an issue for a long time to come, I
>>>> am confidently predicting that the need for multiple bidi flavors will
>>>> continue beyond the adoption of the isolates. Whether a string is part
>>>> of an (arbitrary) plain text stream or a separate data field (with its
>>>> scope determined by markup and with its own bidi styling) will continue
>>>> to call for somewhat different data.
>>>>
>>>> Given the correct choice of internal format for the database, it should
>>>> be possible to provide all of these flavors mechanically, thus avoiding
>>>> the full cost of duplication, while freeing users from having to make
>>>> those format translations themselves.
>>>>
>>>> A./
>>>> _______________________________________________
>>>> CLDR-Users mailing list
>>>> CLDR-Users at unicode.org
>>>> http://unicode.org/mailman/listinfo/cldr-users
>>>>