Time zones: the localized GMT formats

Rafael Xavier rxaviers at gmail.com
Thu Mar 19 10:07:36 CDT 2015


I took the liberty of filing two tickets:

   - http://unicode.org/cldr/trac/ticket/8293
   - http://unicode.org/cldr/trac/ticket/8294

Please feel free to correct them or add more information.

On Thu, Mar 19, 2015 at 11:45 AM, Rafael Xavier <rxaviers at gmail.com> wrote:

> I strongly encourage updating the documentation for clarification. I
> completely agree with Jon Zeppieri that many aspects of time zone
> formatting are nebulous.
>
> 1:
> Patterns O and OOOO are defined as "The *short localized GMT format*" and
> "The *long localized GMT format*", respectively. Both the short and long
> localized GMT formats are defined by:
>
> 7.1 Time Zone Format Terminology
>> Localized GMT format: A constant, specific offset from GMT (or UTC),
>> which may be in a translated form. There are two styles for this. The first
>> is used when there is an *explicit non-zero offset* from GMT; this style *is
>> specified by the <gmtFormat> element and <hourFormat> element*. The *long
>> format* always uses *2-digit hours* field and *minutes* field, with *optional
>> 2-digit seconds* field. The *short format* is intended for the shortest
>> representation and uses *hour* fields *without leading zero*, with *optional
>> 2-digit minutes and seconds* fields. The digits used for hours, minutes
>> and seconds fields in this format are the locale's default decimal digits:
>
>
>>    - "GMT+03:30" (long)
>>    - "GMT+3:30" (short)
>>    - "UTC-03.00" (long)
>>    - "UTC-3" (short)
>>    - "Гриинуич+03:30" (long)
>>
> At [http://www.unicode.org/reports/tr35/tr35-dates.html#Time_Zone_Format_Terminology].
>
> Q1: Which format does <hourFormat> define, the short or the long? For
> example, the "en" locale defines "+HH:mm;-HH:mm", which suggests, as Jon
> has pointed out, the long format. But "cs" (and similarly "fi") defines
> "+H:mm;-H:mm", which suggests the short format. If <hourFormat> defines one
> of them, where is the other defined? Should implementations (e.g., ICU)
> derive the other forms from the given <hourFormat>? If so, is that
> algorithm specified anywhere?
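
As a rough illustration of Q1 (a sketch only, not how ICU or CLDR does it): the hypothetical helper below applies an <hourFormat> value exactly as written, so the zero padding of the hours comes entirely from the locale data. The function name, the default gmt_format of "GMT{0}", and the use of ASCII digits instead of the locale's default digits are assumptions made for the example.

```python
def format_localized_gmt(offset_minutes, hour_format, gmt_format="GMT{0}"):
    """Format a non-zero GMT offset by applying hour_format as written."""
    # <hourFormat> carries two patterns separated by ';': positive;negative.
    positive, negative = hour_format.split(";")
    pattern = positive if offset_minutes >= 0 else negative
    hours, minutes = divmod(abs(offset_minutes), 60)
    # Check for the 2-letter field first so "HH" is not treated as two "H".
    if "HH" in pattern:
        text = pattern.replace("HH", f"{hours:02d}")
    else:
        text = pattern.replace("H", str(hours))
    text = text.replace("mm", f"{minutes:02d}")
    # Substitute the formatted offset into the <gmtFormat> placeholder.
    return gmt_format.replace("{0}", text)


print(format_localized_gmt(210, "+HH:mm;-HH:mm"))  # "en"-style value
print(format_localized_gmt(210, "+H:mm;-H:mm"))    # "cs"-style value
```

With the "en" value the result looks like the long format, and with the "cs" value it looks like the short one, which is exactly the ambiguity the question is about.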
>
> Q2: How should the optional seconds be generated? This is related to the
> question above, but it raises further ones, for example: which time
> separator should be used? The <timeSeparator> from the numbers data is not
> reliable here. Take the "am" locale: its timeSeparator is ":", but its
> hourFormat is "+HHmm;-HHmm", which suggests no time separator should be
> used.
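
One possible reading of Q2, sketched under a loud assumption: the separator for the seconds field is whatever literal text sits between the hours field and "mm" in the <hourFormat> itself, rather than the locale's <timeSeparator>. Nothing in the spec text quoted above confirms this, and both function names are hypothetical.

```python
def separator_in(hour_format):
    """Return the literal text between the hours field and 'mm'."""
    # Only the positive half is inspected; the halves are assumed to match.
    positive = hour_format.split(";")[0]
    start = positive.rindex("H") + 1
    end = positive.index("mm", start)
    return positive[start:end]


def append_seconds(formatted_offset, seconds, hour_format):
    """Append a 2-digit seconds field only when seconds is non-zero."""
    if seconds == 0:
        return formatted_offset
    return f"{formatted_offset}{separator_in(hour_format)}{seconds:02d}"


print(append_seconds("+03:30", 15, "+HH:mm;-HH:mm"))
print(append_seconds("+0330", 15, "+HHmm;-HHmm"))  # "am"-style value
```

For the "am"-style value the inferred separator is the empty string, so the seconds are appended with no separator at all.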
>
> Q3: How should the short format be generated? Again, this is related to
> the questions above, but it has its own complications. An algorithm would
> need to drop both the minutes field and the time separator. As Jon has
> pointed out, some locales use a time separator other than ":" in their
> hourFormats ("da", "id", and "am" are further examples). Also as Jon has
> pointed out, the <timeSeparator> is not always the one used in hourFormat
> ("ar" is another example: its timeSeparator is "،", but its hourFormat is
> "+HH:mm;-HH:mm").
>
>
> 2:
>
> Pattern x is defined by "The ISO8601 basic format with hours field and
> optional minutes field".
>
> ISO8601 is defined by:
>
>> ISO 8601 time zone formats: The formats based on the ISO 8601 local time
>> difference from UTC, or the UTC indicator ("Z" - only when the local time
>> offset is 0 and the specifier X* is used). The ISO 8601 basic format does
>> not use a separator character between hours and minutes field, while the
>> extended format uses colon (':') as the separator. The ISO 8601 basic
>> format with hours and minutes fields is equivalent to RFC 822 zone format.
>>
>>    - "-0800" (basic)
>>    - "-08" (basic - short)
>>    - "-08:00" (extended)
>>    - "Z" (UTC)
>>
>>  Note: This specification extends the original ISO 8601 formats and some
>> format specifiers append seconds field when necessary.
>>
> At [http://www.unicode.org/reports/tr35/tr35-dates.html#Time_Zone_Format_Terminology].
>
> Q1: How should an offset of zero be formatted: "+0000" or "-0000"?
> Wikipedia says to use "+0000", because "-0000" is forbidden by clause 3.4.2
> of the 2004 edition of the standard, although RFC 3339 allows it.
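
For reference, a small sketch of pattern x (basic format, hours with optional minutes) that takes the "+" side of this question. Treating zero as non-negative is an assumption based on the Wikipedia reading above, not on anything in the spec text quoted here, and the function name is hypothetical.

```python
def iso8601_basic_x(offset_minutes):
    """ISO 8601 basic offset: 2-digit hours, minutes only when non-zero."""
    # Zero is treated as non-negative, so it is never written with "-".
    sign = "-" if offset_minutes < 0 else "+"
    hours, minutes = divmod(abs(offset_minutes), 60)
    if minutes == 0:
        return f"{sign}{hours:02d}"
    return f"{sign}{hours:02d}{minutes:02d}"


print(iso8601_basic_x(-480))
print(iso8601_basic_x(330))
print(iso8601_basic_x(0))
```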
>
> Q2: Is there more information about ISO 8601 elsewhere in UTS #35? Or does
> UTS #35 recommend consulting external sources, e.g. the ISO_8601 Wikipedia
> entry <http://en.wikipedia.org/wiki/ISO_8601> or iso.org (available for
> purchase only) <http://www.iso.org/iso/home/standards/iso8601.htm>?
>
>
> On Sun, Mar 15, 2015 at 2:23 AM, Jon Zeppieri <zeppieri at gmail.com> wrote:
>
>> On Sun, Mar 15, 2015 at 12:09 AM, Philippe Verdy <verdy_p at wanadoo.fr>
>> wrote:
>> > I suppose that the "short" form will differentiate from the non short
>> > form, only by stripping zeroes
>> >
>>
>> Unless the value of <hourFormat> is syntactically constrained in ways
>> not mentioned in the documentation, this isn't enough, as my example
>> about possible literal strings in <hourFormat> demonstrates. Here's a
>> more realistic example:
>>
>> The pl locale's <hourFormat> is "+H.mm;-H.mm". Note that it uses a
>> literal '.' as the time separator, rather than the pattern variable
>> ':'. If you were going to strip out the mm field here, you'd also want
>> to strip out the '.'. But unless you know that '.' represents a
>> separator, rather than some literal portion of the pattern, you really
>> can't. And even the fact that '.' is the <timeSeparator> for pl
>> doesn't prove that it's being used that way in the pattern.
>>
>> My guess is that <hourFormat> *is* syntactically constrained -- that
>> it's not allowed to use the full pattern syntax -- because if that's
>> not true then it seems impossible to implement the short form as
>> specified. So, really, I'm just looking for some confirmation about
>> what can and cannot appear in <hourFormat>.
>>
>> -Jon
>
>
>
> --
> +55 (16) 98138-1582, +1 (415) 568-5854, skype: rxaviers
> http://rafael.xavier.blog.br
>


