LGR for unspecified language Selected-recommended-IdentifierType-in-MSR-but-not-in-RefLGR

This document is mechanically formatted from the above XML file for the LGR. It provides additional summary data and explanatory text. The XML file remains the sole normative specification of the LGR.

Date 2025-01-02
LGR Version 16.0.0
Unicode Version 16.0.0

Description

L2/25-034
Partially updates
L2/19-329R

Characters recommended in both UTS#39 and MSR but excluded from the Root Zone or Reference LGR

This document has been submitted as a UTC document. For convenience in documenting the character list it is presented using an LGR template format. A few minor details of the boilerplate in that template may not be applicable in this context and should be disregarded.

The collection comprises 274 characters from the [MSR] that are recommended in UTS#39 but are not part of the Reference LGR [RefLGR], as well as the uppercase equivalents for 88 of them (Latin), plus 70 decimal digits excluded from the RefLGR, for a total of 432 characters.

Recommendation

These 432 characters should be considered Uncommon_Use because the expert teams charged with reviewing them for the ICANN Root Zone LGR and the Reference LGR for the Second Level found no evidence that they are used in common everyday writing, even for minority languages in reasonably widespread use. Consequently, the teams declined to include them in the respective LGRs.

Background

There are about a thousand non-Han characters with Identifier_Type Recommended that should be reclassified because they appear to fail reasonable criteria for being needed in identifiers. They come in two sets. For one set, an independent analysis [MSR] has found indications that they should have been considered Uncommon_Use, Obsolete or Technical based on information available at the time of encoding. That set is discussed in another document. The second set contains characters that were tentatively retained as Recommended in the [MSR] but upon further review by local expert teams from the [RZ-LGR] project were found to not be needed for any language or minority language in reasonably widespread use. That determination started with the [EGIDS] classification as a proxy but made further adjustments in expert review.

This analysis was carried out for the purposes of defining the repertoire for IDN Top Level Domain names for the DNS Root Zone. There are some restrictions that are specific to the Root Zone, such as a prohibition on digits, so a follow-on effort determined how to relax these restrictions in a manner appropriate for the needs of Second-Level Domains. This resulted in the Second-Level Reference Label Generation Rules [RefLGR]. The characters listed in this document are those that were not added to the [RefLGR], for lack of evidence of their use in everyday common writing for any language or minority language in vigorous and reasonably widespread use. Also listed are their uppercase equivalents as well as any native digit sets that were not added to the [RefLGR].

The implication is that any character not included in the Reference LGR for lack of documented or identifiable usage should be considered Uncommon_Use for Unicode's default identifiers until independent evidence to the contrary is produced. Absent a demonstrated use case, it does not seem helpful to continue to suggest that these characters should be supported as Recommended. This also applies to some of the sets of native digits, where local experts considered them obsolete for the purpose of identifiers.

Arriving at a precise cutoff for Uncommon_Use is difficult because there is no single source or perfect information on the use of writing systems, and the details of such use are changing over time. Accordingly, this document suggests that the UTC should consider the published results of the cited research as one of the better sources of information available and only deviate from it on the basis of even better information.

All decisions for the classification of characters in [MSR], or for inclusion in [RZ-LGR] and [RefLGR], are documented and sourced on the character level. The same is not true for Unicode's classification, so the decisions that underlie the classification published in UTS#39 cannot easily be verified. By first making the alignment proposed here, and then carefully documenting deviations, a positive side effect might be that the classification overall becomes more transparent and reviewable.

Special Considerations

The [RZ-LGR] and [RefLGR] exclude the Bopomofo script, considering the entire script special use as it tends to be used almost exclusively in education. This could be addressed by either changing the status of the script in UAX31 to Limited_Use or by marking the entire set of Bopomofo characters as Technical. (This is not reflected in the list of characters in this document.)

No definite recommendations can be made for the Tibetan script. It is considered by ICANN as eligible for the Root Zone in principle, but work on defining the label generation rules has faced some difficulties and has not commenced. It might be reasonable to reflect that uncertainty by also not giving these characters Identifier_Type Recommended until some body, project, or group has created a definite analysis of this script for identifier purposes. (Tibetan characters have been excluded from the list of characters in this document.)

Arabic combining marks are categorically excluded from domain names; see also RFC 5564. Consequently, they should not be Recommended by Unicode. If Uncommon_Use is felt not to be the best classification, then Inclusion or Technical might be more appropriate.

Root Zone and Reference LGRs

For further background on the DNS Root Zone and Second-Level Reference LGR see the cited references and links therein.

Additional Notes

  • U+0931 DEVANAGARI LETTER RRA is part of the Root Zone and Reference LGR via sequence (does not occur standalone, but should be retained as Recommended)
  • U+09BC  ়  BENGALI SIGN NUKTA is part of the Root Zone and Reference LGR via sequence (does not occur standalone, but should be retained as Recommended)
  • U+0DA6 SINHALA LETTER SANYAKA JAYANNA is part of the Root Zone and Reference LGR via sequence (does not occur standalone, but should be retained as Recommended)
  • U+0E45 THAI CHARACTER LAKKHANGYAO is part of the Root Zone and Reference LGR via sequence (does not occur standalone, but should be retained as Recommended)
  • U+1063  ၣ  MYANMAR TONE MARK SGAW KAREN HATHI is part of the Root Zone and Reference LGR via sequence (does not occur standalone, but should be retained as Recommended)

Note: All characters have tags matching their Identifier_Type values, except that uppercase equivalents are tagged as Uppercase. A comment indicates the nature of the exclusion from the [RefLGR], in this case "Not in documented common use". The definition of IDNs excludes uppercase characters; however, for case pairs the analysis for the lowercase letter is treated as applicable. Native digit sets excluded from the [RefLGR] based on information that their use is not preferred in that context are listed with their Identifier_Type and a comment indicating their exclusion from the [RefLGR].
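The tagging convention for case pairs can be illustrated with a minimal sketch. It assumes that Python's built-in case mapping is an acceptable stand-in for however the uppercase equivalents were actually derived, which this document does not specify. The helper names `uppercase_equivalent` and `tag_entries` are hypothetical.

```python
def uppercase_equivalent(ch):
    """Return the single-code-point uppercase partner of ch, or None if there is none."""
    up = ch.upper()
    # Full case mapping may expand to several code points; such letters have no partner here.
    if len(up) == 1 and up != ch:
        return up
    return None


def tag_entries(lowercase_letters, identifier_type="Recommended"):
    """Yield (code point, tag) pairs: each letter, then its uppercase partner if one exists."""
    for ch in lowercase_letters:
        yield ord(ch), identifier_type
        up = uppercase_equivalent(ch)
        if up is not None:
            yield ord(up), "Uppercase"


# U+0115 has the partner U+0114; U+01F0 uppercases to a two-code-point sequence.
for cp, tag in tag_entries("\u0115\u01F0"):
    print(f"U+{cp:04X} {tag}")
```

In this sketch U+01F0 LATIN SMALL LETTER J WITH CARON yields no Uppercase entry, which is consistent with its appearing in the table without an uppercase counterpart.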

Discussion and Review

Domain names are an important and deliberately conservative set of identifiers. That said, there may be other classes of identifiers that don't require the same level of restrictions, so this proposal should not be understood to suggest that default Identifiers must be restricted to only those characters that are being recommended for IDNs. Rather, the purpose is to bring the facts discovered during the development of the IDN repertoire for the DNS Root Zone and the [RefLGR] to the attention of the Unicode Technical Committee, so that characters that were classified Recommended can be given additional scrutiny before confirming their status.

As review progresses, a number of characters have been identified that may well have documented use:

  • U+0671 ٱ ARABIC LETTER ALEF WASLA - this letter is considered to be "an important Quranic character" (which would make it Technical, but not Uncommon_Use). It is also claimed to be used in a newly invented orthography for the Luri language in Iran.

Other issues

Combining marks: where combining marks are excluded, but needed for decompositions (such as U+0654  ٔ ), it was proposed to focus on the NFC format for Identifier_Type, documenting that combining characters may be marked as Uncommon_Use even when they are in the NFD version of a modern language's exemplar characters.
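The NFC/NFD distinction can be checked with Python's standard `unicodedata` module. The sketch below is illustrative only; the helper name `uses_mark_only_in_nfd` is hypothetical, and U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE is chosen here as an example of a precomposed letter whose canonical decomposition contains U+0654.

```python
import unicodedata

HAMZA_ABOVE = "\u0654"  # ARABIC HAMZA ABOVE


def uses_mark_only_in_nfd(ch, mark=HAMZA_ABOVE):
    """True if `mark` is absent from the NFC form of ch but present in its NFD form."""
    nfc = unicodedata.normalize("NFC", ch)
    nfd = unicodedata.normalize("NFD", ch)
    return mark not in nfc and mark in nfd


# U+0626 decomposes canonically to U+064A U+0654, and NFC recomposes it.
letter = "\u0626"
decomposed = unicodedata.normalize("NFD", letter)
print([f"U+{ord(c):04X}" for c in decomposed])
print(uses_mark_only_in_nfd(letter))
```

An identifier profile defined on NFC text would see only the precomposed letter, so the combining mark could be classified separately from the letters that decompose to it.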

Combining marks and Arabic Script: the Internet Architecture Board [IAB] has issued a statement referencing this issue. Please also see the “Proposal for Arabic Script Root Zone LGR”, [Proposal-Arabic].

Contributors

This excerpt was prepared by Asmus Freytag, based on published data found in [RefLGR] and reference information from [MSR]. For details on the process and contributors to those projects, see [RefLGR-Overview], in particular, Section 1, “Overview” and Section 6, “Contributors”. Michel Suignard and Roozbeh Pournader have contributed feedback.

Repertoire

Repertoire Summary

Number of elements in repertoire 432
Longest code point sequence 1

Repertoire by Code Point

The following table lists the repertoire by code point (or code point sequence). The data in the Script and Name column are extracted from the Unicode character database. Where a comment in the original LGR is equal to the character name, it has been suppressed.

See also the legend provided below the table.
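For readers who want to process the listing programmatically, the following sketch parses one flattened row. It assumes the layout `U+XXXX [glyph] Script NAME [ref] Tag [comment]` seen in this rendering, and it assumes the local Python build's Unicode database knows the character. The normative data remains the XML file; `parse_row` is a hypothetical helper.

```python
import re
import unicodedata

# Assumed row layout: U+XXXX [glyph] Script NAME [ref] Tag [comment]
ROW = re.compile(r"^U\+([0-9A-F]{4,6})\s+(.*?)\s*\[(\d+)\]\s+(\S+)\s*(.*)$")


def parse_row(line):
    """Split one flattened table row into a dict; return None if it does not match."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    cp_hex, middle, ref, tag, comment = m.groups()
    cp = int(cp_hex, 16)
    name = unicodedata.name(chr(cp), "")
    # The character name is the tail of the middle part; the script is the word before it.
    if not name or not middle.endswith(name):
        return None
    prefix = middle[: -len(name)].split()
    script = prefix[-1] if prefix else ""
    return {"cp": cp, "script": script, "name": name,
            "ref": ref, "tag": tag, "comment": comment.strip()}


row = parse_row("U+0115 \u0115 Latin LATIN SMALL LETTER E WITH BREVE "
                "[100] Recommended Not in documented common use")
print(row)
```

Anchoring on the character name from the Unicode database lets the parser tolerate rows where the glyph column is missing.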

Code Point Glyph Script Name Ref Tags Comment
U+0114 Ĕ Latin LATIN CAPITAL LETTER E WITH BREVE [100] Uppercase  
U+0115 ĕ Latin LATIN SMALL LETTER E WITH BREVE [100] Recommended Not in documented common use
U+012C Ĭ Latin LATIN CAPITAL LETTER I WITH BREVE [100] Uppercase  
U+012D ĭ Latin LATIN SMALL LETTER I WITH BREVE [100] Recommended Not in documented common use
U+014E Ŏ Latin LATIN CAPITAL LETTER O WITH BREVE [100] Uppercase  
U+014F ŏ Latin LATIN SMALL LETTER O WITH BREVE [100] Recommended Not in documented common use
U+0156 Ŗ Latin LATIN CAPITAL LETTER R WITH CEDILLA [100] Uppercase  
U+0157 ŗ Latin LATIN SMALL LETTER R WITH CEDILLA [100] Recommended Not in documented common use
U+0162 Ţ Latin LATIN CAPITAL LETTER T WITH CEDILLA [100] Uppercase  
U+0163 ţ Latin LATIN SMALL LETTER T WITH CEDILLA [100] Recommended Not in documented common use
U+01D5 Ǖ Latin LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON [100] Uppercase  
U+01D6 ǖ Latin LATIN SMALL LETTER U WITH DIAERESIS AND MACRON [100] Recommended Not in documented common use
U+01D7 Ǘ Latin LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE [100] Uppercase  
U+01D8 ǘ Latin LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE [100] Recommended Not in documented common use
U+01D9 Ǚ Latin LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON [100] Uppercase  
U+01DA ǚ Latin LATIN SMALL LETTER U WITH DIAERESIS AND CARON [100] Recommended Not in documented common use
U+01DB Ǜ Latin LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE [100] Uppercase  
U+01DC ǜ Latin LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE [100] Recommended Not in documented common use
U+01DE Ǟ Latin LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON [100] Uppercase  
U+01DF ǟ Latin LATIN SMALL LETTER A WITH DIAERESIS AND MACRON [100] Recommended Not in documented common use
U+01E0 Ǡ Latin LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON [100] Uppercase  
U+01E1 ǡ Latin LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON [100] Recommended Not in documented common use
U+01E2 Ǣ Latin LATIN CAPITAL LETTER AE WITH MACRON [100] Uppercase  
U+01E3 ǣ Latin LATIN SMALL LETTER AE WITH MACRON [100] Recommended Not in documented common use
U+01EA Ǫ Latin LATIN CAPITAL LETTER O WITH OGONEK [100] Uppercase  
U+01EB ǫ Latin LATIN SMALL LETTER O WITH OGONEK [100] Recommended Not in documented common use
U+01EC Ǭ Latin LATIN CAPITAL LETTER O WITH OGONEK AND MACRON [100] Uppercase  
U+01ED ǭ Latin LATIN SMALL LETTER O WITH OGONEK AND MACRON [100] Recommended Not in documented common use
U+01F0 ǰ Latin LATIN SMALL LETTER J WITH CARON [100] Recommended Not in documented common use
U+01F4 Ǵ Latin LATIN CAPITAL LETTER G WITH ACUTE [100] Uppercase  
U+01F5 ǵ Latin LATIN SMALL LETTER G WITH ACUTE [100] Recommended Not in documented common use
U+01F8 Ǹ Latin LATIN CAPITAL LETTER N WITH GRAVE [100] Uppercase  
U+01F9 ǹ Latin LATIN SMALL LETTER N WITH GRAVE [100] Recommended Not in documented common use
U+01FA Ǻ Latin LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE [100] Uppercase  
U+01FB ǻ Latin LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE [100] Recommended Not in documented common use
U+01FC Ǽ Latin LATIN CAPITAL LETTER AE WITH ACUTE [100] Uppercase  
U+01FD ǽ Latin LATIN SMALL LETTER AE WITH ACUTE [100] Recommended Not in documented common use
U+01FE Ǿ Latin LATIN CAPITAL LETTER O WITH STROKE AND ACUTE [100] Uppercase  
U+01FF ǿ Latin LATIN SMALL LETTER O WITH STROKE AND ACUTE [100] Recommended Not in documented common use
U+021E Ȟ Latin LATIN CAPITAL LETTER H WITH CARON [100] Uppercase  
U+021F ȟ Latin LATIN SMALL LETTER H WITH CARON [100] Recommended Not in documented common use
U+0226 Ȧ Latin LATIN CAPITAL LETTER A WITH DOT ABOVE [100] Uppercase  
U+0227 ȧ Latin LATIN SMALL LETTER A WITH DOT ABOVE [100] Recommended Not in documented common use
U+0228 Ȩ Latin LATIN CAPITAL LETTER E WITH CEDILLA [100] Uppercase  
U+0229 ȩ Latin LATIN SMALL LETTER E WITH CEDILLA [100] Recommended Not in documented common use
U+022A Ȫ Latin LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRON [100] Uppercase  
U+022B ȫ Latin LATIN SMALL LETTER O WITH DIAERESIS AND MACRON [100] Recommended Not in documented common use
U+022C Ȭ Latin LATIN CAPITAL LETTER O WITH TILDE AND MACRON [100] Uppercase  
U+022D ȭ Latin LATIN SMALL LETTER O WITH TILDE AND MACRON [100] Recommended Not in documented common use
U+022E Ȯ Latin LATIN CAPITAL LETTER O WITH DOT ABOVE [100] Uppercase  
U+022F ȯ Latin LATIN SMALL LETTER O WITH DOT ABOVE [100] Recommended Not in documented common use
U+0230 Ȱ Latin LATIN CAPITAL LETTER O WITH DOT ABOVE AND MACRON [100] Uppercase  
U+0231 ȱ Latin LATIN SMALL LETTER O WITH DOT ABOVE AND MACRON [100] Recommended Not in documented common use
U+0232 Ȳ Latin LATIN CAPITAL LETTER Y WITH MACRON [100] Uppercase  
U+0233 ȳ Latin LATIN SMALL LETTER Y WITH MACRON [100] Recommended Not in documented common use
U+0400 Ѐ Cyrillic CYRILLIC CAPITAL LETTER IE WITH GRAVE [100] Uppercase  
U+040D Ѝ Cyrillic CYRILLIC CAPITAL LETTER I WITH GRAVE [100] Uppercase  
U+0450 ѐ Cyrillic CYRILLIC SMALL LETTER IE WITH GRAVE [100] Recommended Not in documented common use
U+045D ѝ Cyrillic CYRILLIC SMALL LETTER I WITH GRAVE [100] Recommended Not in documented common use
U+04C1 Ӂ Cyrillic CYRILLIC CAPITAL LETTER ZHE WITH BREVE [100] Uppercase  
U+04C2 ӂ Cyrillic CYRILLIC SMALL LETTER ZHE WITH BREVE [100] Recommended Not in documented common use
U+04CB Ӌ Cyrillic CYRILLIC CAPITAL LETTER KHAKASSIAN CHE [100] Uppercase  
U+04CC ӌ Cyrillic CYRILLIC SMALL LETTER KHAKASSIAN CHE [100] Recommended Not in documented common use
U+04DA Ӛ Cyrillic CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS [100] Uppercase  
U+04DB ӛ Cyrillic CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS [100] Recommended Not in documented common use
U+04EA Ӫ Cyrillic CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS [100] Uppercase  
U+04EB ӫ Cyrillic CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS [100] Recommended Not in documented common use
U+04EC Ӭ Cyrillic CYRILLIC CAPITAL LETTER E WITH DIAERESIS [100] Uppercase  
U+04ED ӭ Cyrillic CYRILLIC SMALL LETTER E WITH DIAERESIS [100] Recommended Not in documented common use
U+05B4  ִ Hebrew HEBREW POINT HIRIQ [100] Recommended Not in documented common use
U+05F0 װ Hebrew HEBREW LIGATURE YIDDISH DOUBLE VAV [100] Recommended Not in documented common use
U+05F1 ױ Hebrew HEBREW LIGATURE YIDDISH VAV YOD [100] Recommended Not in documented common use
U+05F2 ײ Hebrew HEBREW LIGATURE YIDDISH DOUBLE YOD [100] Recommended Not in documented common use
U+064B  ً Inherited ARABIC FATHATAN [100] Recommended Arabic combining marks are categorically excluded from domain names
U+064C  ٌ Inherited ARABIC DAMMATAN [100] Recommended Arabic combining marks are categorically excluded from domain names
U+064D  ٍ Inherited ARABIC KASRATAN [100] Recommended Arabic combining marks are categorically excluded from domain names
U+064E  َ Inherited ARABIC FATHA [100] Recommended Arabic combining marks are categorically excluded from domain names
U+064F  ُ Inherited ARABIC DAMMA [100] Recommended Arabic combining marks are categorically excluded from domain names
U+0650  ِ Inherited ARABIC KASRA [100] Recommended Arabic combining marks are categorically excluded from domain names
U+0651  ّ Inherited ARABIC SHADDA [100] Recommended Arabic combining marks are categorically excluded from domain names
U+0652  ْ Inherited ARABIC SUKUN [100] Recommended Arabic combining marks are categorically excluded from domain names
U+0654  ٔ Inherited ARABIC HAMZA ABOVE [100] Recommended Arabic combining marks are categorically excluded from domain names
U+0655  ٕ Inherited ARABIC HAMZA BELOW [100] Recommended Arabic combining marks are categorically excluded from domain names
U+0670  ٰ Inherited ARABIC LETTER SUPERSCRIPT ALEF [100] Recommended Arabic combining marks are categorically excluded from domain names
U+0671 ٱ Arabic ARABIC LETTER ALEF WASLA [100] Recommended Not in documented common use
U+0674 ٴ Arabic ARABIC LETTER HIGH HAMZA [100] Recommended Not in documented common use
U+0682 ڂ Arabic ARABIC LETTER HAH WITH TWO DOTS VERTICAL ABOVE [100] Recommended Not in documented common use
U+0690 ڐ Arabic ARABIC LETTER DAL WITH FOUR DOTS ABOVE [100] Recommended Not in documented common use
U+0692 ڒ Arabic ARABIC LETTER REH WITH SMALL V [100] Recommended Not in documented common use
U+0694 ڔ Arabic ARABIC LETTER REH WITH DOT BELOW [100] Recommended Not in documented common use
U+069B ڛ Arabic ARABIC LETTER SEEN WITH THREE DOTS BELOW [100] Recommended Not in documented common use
U+069C ڜ Arabic ARABIC LETTER SEEN WITH THREE DOTS BELOW AND THREE DOTS ABOVE [100] Recommended Not in documented common use
U+069D ڝ Arabic ARABIC LETTER SAD WITH TWO DOTS BELOW [100] Recommended Not in documented common use
U+069E ڞ Arabic ARABIC LETTER SAD WITH THREE DOTS ABOVE [100] Recommended Not in documented common use
U+06A1 ڡ Arabic ARABIC LETTER DOTLESS FEH [100] Recommended Not in documented common use
U+06A3 ڣ Arabic ARABIC LETTER FEH WITH DOT BELOW [100] Recommended Not in documented common use
U+06A5 ڥ Arabic ARABIC LETTER FEH WITH THREE DOTS BELOW [100] Recommended Not in documented common use
U+06B2 ڲ Arabic ARABIC LETTER GAF WITH TWO DOTS BELOW [100] Recommended Not in documented common use
U+06B4 ڴ Arabic ARABIC LETTER GAF WITH THREE DOTS ABOVE [100] Recommended Not in documented common use
U+06B6 ڶ Arabic ARABIC LETTER LAM WITH DOT ABOVE [100] Recommended Not in documented common use
U+06B7 ڷ Arabic ARABIC LETTER LAM WITH THREE DOTS ABOVE [100] Recommended Not in documented common use
U+06B8 ڸ Arabic ARABIC LETTER LAM WITH THREE DOTS BELOW [100] Recommended Not in documented common use
U+06B9 ڹ Arabic ARABIC LETTER NOON WITH DOT BELOW [100] Recommended Not in documented common use
U+06BF ڿ Arabic ARABIC LETTER TCHEH WITH DOT ABOVE [100] Recommended Not in documented common use
U+06C5 ۅ Arabic ARABIC LETTER KIRGHIZ OE [100] Recommended Not in documented common use
U+06C7 ۇ Arabic ARABIC LETTER U [100] Recommended Not in documented common use
U+06C8 ۈ Arabic ARABIC LETTER YU [100] Recommended Not in documented common use
U+06C9 ۉ Arabic ARABIC LETTER KIRGHIZ YU [100] Recommended Not in documented common use
U+06CA ۊ Arabic ARABIC LETTER WAW WITH TWO DOTS ABOVE [100] Recommended Not in documented common use
U+06D3 ۓ Arabic ARABIC LETTER YEH BARREE WITH HAMZA ABOVE [100] Recommended Not in documented common use
U+06EE ۮ Arabic ARABIC LETTER DAL WITH INVERTED V [100] Recommended Not in documented common use
U+06EF ۯ Arabic ARABIC LETTER REH WITH INVERTED V [100] Recommended Not in documented common use
U+06FA ۺ Arabic ARABIC LETTER SHEEN WITH DOT BELOW [100] Recommended Not in documented common use
U+06FB ۻ Arabic ARABIC LETTER DAD WITH DOT BELOW [100] Recommended Not in documented common use
U+06FC ۼ Arabic ARABIC LETTER GHAIN WITH DOT BELOW [100] Recommended Not in documented common use
U+06FF ۿ Arabic ARABIC LETTER HEH WITH INVERTED V [100] Recommended Not in documented common use
U+0750 ݐ Arabic ARABIC LETTER BEH WITH THREE DOTS HORIZONTALLY BELOW [100] Recommended Not in documented common use
U+0753 ݓ Arabic ARABIC LETTER BEH WITH THREE DOTS POINTING UPWARDS BELOW AND TWO DOTS ABOVE [100] Recommended Not in documented common use
U+0754 ݔ Arabic ARABIC LETTER BEH WITH TWO DOTS BELOW AND DOT ABOVE [100] Recommended Not in documented common use
U+0755 ݕ Arabic ARABIC LETTER BEH WITH INVERTED SMALL V BELOW [100] Recommended Not in documented common use
U+0757 ݗ Arabic ARABIC LETTER HAH WITH TWO DOTS ABOVE [100] Recommended Not in documented common use
U+0758 ݘ Arabic ARABIC LETTER HAH WITH THREE DOTS POINTING UPWARDS BELOW [100] Recommended Not in documented common use
U+0759 ݙ Arabic ARABIC LETTER DAL WITH TWO DOTS VERTICALLY BELOW AND SMALL TAH [100] Recommended Not in documented common use
U+075A ݚ Arabic ARABIC LETTER DAL WITH INVERTED SMALL V BELOW [100] Recommended Not in documented common use
U+075B ݛ Arabic ARABIC LETTER REH WITH STROKE [100] Recommended Not in documented common use
U+075C ݜ Arabic ARABIC LETTER SEEN WITH FOUR DOTS ABOVE [100] Recommended Not in documented common use
U+075D ݝ Arabic ARABIC LETTER AIN WITH TWO DOTS ABOVE [100] Recommended Not in documented common use
U+075E ݞ Arabic ARABIC LETTER AIN WITH THREE DOTS POINTING DOWNWARDS ABOVE [100] Recommended Not in documented common use
U+075F ݟ Arabic ARABIC LETTER AIN WITH TWO DOTS VERTICALLY ABOVE [100] Recommended Not in documented common use
U+0761 ݡ Arabic ARABIC LETTER FEH WITH THREE DOTS POINTING UPWARDS BELOW [100] Recommended Not in documented common use
U+0764 ݤ Arabic ARABIC LETTER KEHEH WITH THREE DOTS POINTING UPWARDS BELOW [100] Recommended Not in documented common use
U+0765 ݥ Arabic ARABIC LETTER MEEM WITH DOT ABOVE [100] Recommended Not in documented common use
U+0769 ݩ Arabic ARABIC LETTER NOON WITH SMALL V [100] Recommended Not in documented common use
U+076B ݫ Arabic ARABIC LETTER REH WITH TWO DOTS VERTICALLY ABOVE [100] Recommended Not in documented common use
U+076C ݬ Arabic ARABIC LETTER REH WITH HAMZA ABOVE [100] Recommended Not in documented common use
U+076D ݭ Arabic ARABIC LETTER SEEN WITH TWO DOTS VERTICALLY ABOVE [100] Recommended Not in documented common use
U+0772 ݲ Arabic ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH ABOVE [100] Recommended Not in documented common use
U+0773 ݳ Arabic ARABIC LETTER ALEF WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE [100] Recommended Not in documented common use
U+0774 ݴ Arabic ARABIC LETTER ALEF WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE [100] Recommended Not in documented common use
U+0775 ݵ Arabic ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE [100] Recommended Not in documented common use
U+0776 ݶ Arabic ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE [100] Recommended Not in documented common use
U+0777 ݷ Arabic ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW [100] Recommended Not in documented common use
U+0778 ݸ Arabic ARABIC LETTER WAW WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE [100] Recommended Not in documented common use
U+0779 ݹ Arabic ARABIC LETTER WAW WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE [100] Recommended Not in documented common use
U+077A ݺ Arabic ARABIC LETTER YEH BARREE WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE [100] Recommended Not in documented common use
U+077B ݻ Arabic ARABIC LETTER YEH BARREE WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE [100] Recommended Not in documented common use
U+077C ݼ Arabic ARABIC LETTER HAH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW [100] Recommended Not in documented common use
U+077D ݽ Arabic ARABIC LETTER SEEN WITH EXTENDED ARABIC-INDIC DIGIT FOUR ABOVE [100] Recommended Not in documented common use
U+08A1 Arabic ARABIC LETTER BEH WITH HAMZA ABOVE [100] Recommended Not in documented common use
U+08AA Arabic ARABIC LETTER REH WITH LOOP [100] Recommended Not in documented common use
U+08AB Arabic ARABIC LETTER WAW WITH DOT WITHIN [100] Recommended Not in documented common use
U+08AC Arabic ARABIC LETTER ROHINGYA YEH [100] Recommended Not in documented common use
U+0904 Devanagari DEVANAGARI LETTER SHORT A [100] Recommended Not in documented common use
U+090C Devanagari DEVANAGARI LETTER VOCALIC L [100] Recommended Not in documented common use
U+0929 Devanagari DEVANAGARI LETTER NNNA [100] Recommended Not in documented common use
U+0934 Devanagari DEVANAGARI LETTER LLLA [100] Recommended Not in documented common use
U+0944  ॄ Devanagari DEVANAGARI VOWEL SIGN VOCALIC RR [100] Recommended Not in documented common use
U+0979 Devanagari DEVANAGARI LETTER ZHA [100] Recommended Not in documented common use
U+097A Devanagari DEVANAGARI LETTER HEAVY YA [100] Recommended Not in documented common use
U+098C Bengali BENGALI LETTER VOCALIC L [100] Recommended Not in documented common use
U+09D7  ৗ Bengali BENGALI AU LENGTH MARK [100] Recommended Not in documented common use
U+0A03  ਃ Gurmukhi GURMUKHI SIGN VISARGA [100] Recommended Not in documented common use
U+0A66 Gurmukhi GURMUKHI DIGIT ZERO [100] Recommended Native digits not in common use
U+0A67 Gurmukhi GURMUKHI DIGIT ONE [100] Recommended Native digits not in common use
U+0A68 Gurmukhi GURMUKHI DIGIT TWO [100] Recommended Native digits not in common use
U+0A69 Gurmukhi GURMUKHI DIGIT THREE [100] Recommended Native digits not in common use
U+0A6A Gurmukhi GURMUKHI DIGIT FOUR [100] Recommended Native digits not in common use
U+0A6B Gurmukhi GURMUKHI DIGIT FIVE [100] Recommended Native digits not in common use
U+0A6C Gurmukhi GURMUKHI DIGIT SIX [100] Recommended Native digits not in common use
U+0A6D Gurmukhi GURMUKHI DIGIT SEVEN [100] Recommended Native digits not in common use
U+0A6E Gurmukhi GURMUKHI DIGIT EIGHT [100] Recommended Native digits not in common use
U+0A6F Gurmukhi GURMUKHI DIGIT NINE [100] Recommended Native digits not in common use
U+0A72 Gurmukhi GURMUKHI IRI [100] Recommended Not in documented common use
U+0A73 Gurmukhi GURMUKHI URA [100] Recommended Not in documented common use
U+0A81  ઁ Gujarati GUJARATI SIGN CANDRABINDU [100] Recommended Not in documented common use
U+0B0C Oriya ORIYA LETTER VOCALIC L [100] Recommended Not in documented common use
U+0B35 Oriya ORIYA LETTER VA [100] Recommended Not in documented common use
U+0B57  ୗ Oriya ORIYA AU LENGTH MARK [100] Recommended Not in documented common use
U+0B66 Oriya ORIYA DIGIT ZERO [100] Recommended Native digits not in common use
U+0B67 Oriya ORIYA DIGIT ONE [100] Recommended Native digits not in common use
U+0B68 Oriya ORIYA DIGIT TWO [100] Recommended Native digits not in common use
U+0B69 Oriya ORIYA DIGIT THREE [100] Recommended Native digits not in common use
U+0B6A Oriya ORIYA DIGIT FOUR [100] Recommended Native digits not in common use
U+0B6B Oriya ORIYA DIGIT FIVE [100] Recommended Native digits not in common use
U+0B6C Oriya ORIYA DIGIT SIX [100] Recommended Native digits not in common use
U+0B6D Oriya ORIYA DIGIT SEVEN [100] Recommended Native digits not in common use
U+0B6E Oriya ORIYA DIGIT EIGHT [100] Recommended Native digits not in common use
U+0B6F Oriya ORIYA DIGIT NINE [100] Recommended Native digits not in common use
U+0BD7  ௗ Tamil TAMIL AU LENGTH MARK [100] Recommended Not in documented common use
U+0BE6 Tamil TAMIL DIGIT ZERO [100] Recommended Native digits not in common use
U+0BE7 Tamil TAMIL DIGIT ONE [100] Recommended Native digits not in common use
U+0BE8 Tamil TAMIL DIGIT TWO [100] Recommended Native digits not in common use
U+0BE9 Tamil TAMIL DIGIT THREE [100] Recommended Native digits not in common use
U+0BEA Tamil TAMIL DIGIT FOUR [100] Recommended Native digits not in common use
U+0BEB Tamil TAMIL DIGIT FIVE [100] Recommended Native digits not in common use
U+0BEC Tamil TAMIL DIGIT SIX [100] Recommended Native digits not in common use
U+0BED Tamil TAMIL DIGIT SEVEN [100] Recommended Native digits not in common use
U+0BEE Tamil TAMIL DIGIT EIGHT [100] Recommended Native digits not in common use
U+0BEF Tamil TAMIL DIGIT NINE [100] Recommended Native digits not in common use
U+0C0C Telugu TELUGU LETTER VOCALIC L [100] Recommended Not in documented common use
U+0C31 Telugu TELUGU LETTER RRA [100] Recommended Not in documented common use
U+0C55  ౕ Telugu TELUGU LENGTH MARK [100] Recommended Not in documented common use
U+0C56  ౖ Telugu TELUGU AI LENGTH MARK [100] Recommended Not in documented common use
U+0C66 Telugu TELUGU DIGIT ZERO [100] Recommended Native digits not in common use
U+0C67 Telugu TELUGU DIGIT ONE [100] Recommended Native digits not in common use
U+0C68 Telugu TELUGU DIGIT TWO [100] Recommended Native digits not in common use
U+0C69 Telugu TELUGU DIGIT THREE [100] Recommended Native digits not in common use
U+0C6A Telugu TELUGU DIGIT FOUR [100] Recommended Native digits not in common use
U+0C6B Telugu TELUGU DIGIT FIVE [100] Recommended Native digits not in common use
U+0C6C Telugu TELUGU DIGIT SIX [100] Recommended Native digits not in common use
U+0C6D Telugu TELUGU DIGIT SEVEN [100] Recommended Native digits not in common use
U+0C6E Telugu TELUGU DIGIT EIGHT [100] Recommended Native digits not in common use
U+0C6F Telugu TELUGU DIGIT NINE [100] Recommended Native digits not in common use
U+0C8C Kannada KANNADA LETTER VOCALIC L [100] Recommended Not in documented common use
U+0CB1 Kannada KANNADA LETTER RRA [100] Recommended Not in documented common use
U+0CBC  ಼ Kannada KANNADA SIGN NUKTA [100] Recommended Not in documented common use
U+0CC4  ೄ Kannada KANNADA VOWEL SIGN VOCALIC RR [100] Recommended Not in documented common use
U+0CD5  ೕ Kannada KANNADA LENGTH MARK [100] Recommended Not in documented common use
U+0CD6  ೖ Kannada KANNADA AI LENGTH MARK [100] Recommended Not in documented common use
U+0D0C Malayalam MALAYALAM LETTER VOCALIC L [100] Recommended Not in documented common use
U+0D29 Malayalam MALAYALAM LETTER NNNA [100] Recommended Not in documented common use
U+0D66 Malayalam MALAYALAM DIGIT ZERO [100] Recommended Native digits not in common use
U+0D67 Malayalam MALAYALAM DIGIT ONE [100] Recommended Native digits not in common use
U+0D68 Malayalam MALAYALAM DIGIT TWO [100] Recommended Native digits not in common use
U+0D69 Malayalam MALAYALAM DIGIT THREE [100] Recommended Native digits not in common use
U+0D6A Malayalam MALAYALAM DIGIT FOUR [100] Recommended Native digits not in common use
U+0D6B Malayalam MALAYALAM DIGIT FIVE [100] Recommended Native digits not in common use
U+0D6C Malayalam MALAYALAM DIGIT SIX [100] Recommended Native digits not in common use
U+0D6D Malayalam MALAYALAM DIGIT SEVEN [100] Recommended Native digits not in common use
U+0D6E Malayalam MALAYALAM DIGIT EIGHT [100] Recommended Native digits not in common use
U+0D6F Malayalam MALAYALAM DIGIT NINE [100] Recommended Native digits not in common use
U+0D8E Sinhala SINHALA LETTER IRUUYANNA [100] Recommended Not in documented common use
U+0D9E Sinhala SINHALA LETTER KANTAJA NAASIKYAYA [100] Recommended Not in documented common use
U+0DE6 Sinhala SINHALA LITH DIGIT ZERO [100] Recommended Native digits not in common use
U+0DE7 Sinhala SINHALA LITH DIGIT ONE [100] Recommended Native digits not in common use
U+0DE8 Sinhala SINHALA LITH DIGIT TWO [100] Recommended Native digits not in common use
U+0DE9 Sinhala SINHALA LITH DIGIT THREE [100] Recommended Native digits not in common use
U+0DEA Sinhala SINHALA LITH DIGIT FOUR [100] Recommended Native digits not in common use
U+0DEB Sinhala SINHALA LITH DIGIT FIVE [100] Recommended Native digits not in common use
U+0DEC Sinhala SINHALA LITH DIGIT SIX [100] Recommended Native digits not in common use
U+0DED Sinhala SINHALA LITH DIGIT SEVEN [100] Recommended Native digits not in common use
U+0DEE Sinhala SINHALA LITH DIGIT EIGHT [100] Recommended Native digits not in common use
U+0DEF Sinhala SINHALA LITH DIGIT NINE [100] Recommended Native digits not in common use
U+0E4E  ๎ Thai THAI CHARACTER YAMAKKAN [100] Recommended Not in documented common use
U+0EDE Lao LAO LETTER KHMU GO [100] Recommended Not in documented common use
U+0EDF Lao LAO LETTER KHMU NYO [100] Recommended Not in documented common use
U+108B  ႋ Myanmar MYANMAR SIGN SHAN COUNCIL TONE-2 [100] Recommended Not in documented common use
U+108C  ႌ Myanmar MYANMAR SIGN SHAN COUNCIL TONE-3 [100] Recommended Not in documented common use
U+108D  ႍ Myanmar MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE [100] Recommended Not in documented common use
U+1090 Myanmar MYANMAR SHAN DIGIT ZERO [100] Recommended Native digits not in common use
U+1091 Myanmar MYANMAR SHAN DIGIT ONE [100] Recommended Native digits not in common use
U+1092 Myanmar MYANMAR SHAN DIGIT TWO [100] Recommended Native digits not in common use
U+1093 Myanmar MYANMAR SHAN DIGIT THREE [100] Recommended Native digits not in common use
U+1094 Myanmar MYANMAR SHAN DIGIT FOUR [100] Recommended Native digits not in common use
U+1095 Myanmar MYANMAR SHAN DIGIT FIVE [100] Recommended Native digits not in common use
U+1096 Myanmar MYANMAR SHAN DIGIT SIX [100] Recommended Native digits not in common use
U+1097 Myanmar MYANMAR SHAN DIGIT SEVEN [100] Recommended Native digits not in common use
U+1098 Myanmar MYANMAR SHAN DIGIT EIGHT [100] Recommended Native digits not in common use
U+1099 Myanmar MYANMAR SHAN DIGIT NINE [100] Recommended Native digits not in common use
U+10F7 Georgian GEORGIAN LETTER YN [100] Recommended Not in documented common use
U+10F8 Georgian GEORGIAN LETTER ELIFI [100] Recommended Not in documented common use
U+1207 Ethiopic ETHIOPIC SYLLABLE HOA [100] Recommended Not in documented common use
U+1287 Ethiopic ETHIOPIC SYLLABLE XOA [100] Recommended Not in documented common use
U+12AF Ethiopic ETHIOPIC SYLLABLE KOA [100] Recommended Not in documented common use
U+12F8 Ethiopic ETHIOPIC SYLLABLE DDA [100] Recommended Not in documented common use
U+12F9 Ethiopic ETHIOPIC SYLLABLE DDU [100] Recommended Not in documented common use
U+12FA Ethiopic ETHIOPIC SYLLABLE DDI [100] Recommended Not in documented common use
U+12FB Ethiopic ETHIOPIC SYLLABLE DDAA [100] Recommended Not in documented common use
U+12FC Ethiopic ETHIOPIC SYLLABLE DDEE [100] Recommended Not in documented common use
U+12FD Ethiopic ETHIOPIC SYLLABLE DDE [100] Recommended Not in documented common use
U+12FE Ethiopic ETHIOPIC SYLLABLE DDO [100] Recommended Not in documented common use
U+12FF Ethiopic ETHIOPIC SYLLABLE DDWA [100] Recommended Not in documented common use
U+130F Ethiopic ETHIOPIC SYLLABLE GOA [100] Recommended Not in documented common use
U+131F Ethiopic ETHIOPIC SYLLABLE GGWAA [100] Recommended Not in documented common use
U+1347 Ethiopic ETHIOPIC SYLLABLE TZOA [100] Recommended Not in documented common use
U+135A Ethiopic ETHIOPIC SYLLABLE FYA [100] Recommended Not in documented common use
U+135D  ፝ Ethiopic ETHIOPIC COMBINING GEMINATION AND VOWEL LENGTH MARK [100] Recommended Not in documented common use
U+135E  ፞ Ethiopic ETHIOPIC COMBINING VOWEL LENGTH MARK [100] Recommended Not in documented common use
U+135F  ፟ Ethiopic ETHIOPIC COMBINING GEMINATION MARK [100] Recommended Not in documented common use
U+179D Khmer KHMER LETTER SHA [100] Recommended Not in documented common use
U+179E Khmer KHMER LETTER SSO [100] Recommended Not in documented common use
U+17A9 Khmer KHMER INDEPENDENT VOWEL QUU [100] Recommended Not in documented common use
U+17B2 Khmer KHMER INDEPENDENT VOWEL QOO TYPE TWO [100] Recommended Not in documented common use
U+17D7 Khmer KHMER SIGN LEK TOO [100] Recommended Not in documented common use
U+1E02 Latin LATIN CAPITAL LETTER B WITH DOT ABOVE [100] Uppercase  
U+1E03 Latin LATIN SMALL LETTER B WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E04 Latin LATIN CAPITAL LETTER B WITH DOT BELOW [100] Uppercase  
U+1E05 Latin LATIN SMALL LETTER B WITH DOT BELOW [100] Recommended Not in documented common use
U+1E06 Latin LATIN CAPITAL LETTER B WITH LINE BELOW [100] Uppercase  
U+1E07 Latin LATIN SMALL LETTER B WITH LINE BELOW [100] Recommended Not in documented common use
U+1E08 Latin LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE [100] Uppercase  
U+1E09 Latin LATIN SMALL LETTER C WITH CEDILLA AND ACUTE [100] Recommended Not in documented common use
U+1E0A Latin LATIN CAPITAL LETTER D WITH DOT ABOVE [100] Uppercase  
U+1E0B Latin LATIN SMALL LETTER D WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E0C Latin LATIN CAPITAL LETTER D WITH DOT BELOW [100] Uppercase  
U+1E0D Latin LATIN SMALL LETTER D WITH DOT BELOW [100] Recommended Not in documented common use
U+1E0E Latin LATIN CAPITAL LETTER D WITH LINE BELOW [100] Uppercase  
U+1E0F Latin LATIN SMALL LETTER D WITH LINE BELOW [100] Recommended Not in documented common use
U+1E10 Latin LATIN CAPITAL LETTER D WITH CEDILLA [100] Uppercase  
U+1E11 Latin LATIN SMALL LETTER D WITH CEDILLA [100] Recommended Not in documented common use
U+1E14 Latin LATIN CAPITAL LETTER E WITH MACRON AND GRAVE [100] Uppercase  
U+1E15 Latin LATIN SMALL LETTER E WITH MACRON AND GRAVE [100] Recommended Not in documented common use
U+1E16 Latin LATIN CAPITAL LETTER E WITH MACRON AND ACUTE [100] Uppercase  
U+1E17 Latin LATIN SMALL LETTER E WITH MACRON AND ACUTE [100] Recommended Not in documented common use
U+1E1C Latin LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE [100] Uppercase  
U+1E1D Latin LATIN SMALL LETTER E WITH CEDILLA AND BREVE [100] Recommended Not in documented common use
U+1E1E Latin LATIN CAPITAL LETTER F WITH DOT ABOVE [100] Uppercase  
U+1E1F Latin LATIN SMALL LETTER F WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E22 Latin LATIN CAPITAL LETTER H WITH DOT ABOVE [100] Uppercase  
U+1E23 Latin LATIN SMALL LETTER H WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E24 Latin LATIN CAPITAL LETTER H WITH DOT BELOW [100] Uppercase  
U+1E25 Latin LATIN SMALL LETTER H WITH DOT BELOW [100] Recommended Not in documented common use
U+1E26 Latin LATIN CAPITAL LETTER H WITH DIAERESIS [100] Uppercase  
U+1E27 Latin LATIN SMALL LETTER H WITH DIAERESIS [100] Recommended Not in documented common use
U+1E28 Latin LATIN CAPITAL LETTER H WITH CEDILLA [100] Uppercase  
U+1E29 Latin LATIN SMALL LETTER H WITH CEDILLA [100] Recommended Not in documented common use
U+1E2E Latin LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE [100] Uppercase  
U+1E2F Latin LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE [100] Recommended Not in documented common use
U+1E30 Latin LATIN CAPITAL LETTER K WITH ACUTE [100] Uppercase  
U+1E31 Latin LATIN SMALL LETTER K WITH ACUTE [100] Recommended Not in documented common use
U+1E32 Latin LATIN CAPITAL LETTER K WITH DOT BELOW [100] Uppercase  
U+1E33 Latin LATIN SMALL LETTER K WITH DOT BELOW [100] Recommended Not in documented common use
U+1E34 Latin LATIN CAPITAL LETTER K WITH LINE BELOW [100] Uppercase  
U+1E35 Latin LATIN SMALL LETTER K WITH LINE BELOW [100] Recommended Not in documented common use
U+1E38 Latin LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON [100] Uppercase  
U+1E39 Latin LATIN SMALL LETTER L WITH DOT BELOW AND MACRON [100] Recommended Not in documented common use
U+1E3A Latin LATIN CAPITAL LETTER L WITH LINE BELOW [100] Uppercase  
U+1E3B Latin LATIN SMALL LETTER L WITH LINE BELOW [100] Recommended Not in documented common use
U+1E3E Latin LATIN CAPITAL LETTER M WITH ACUTE [100] Uppercase  
U+1E3F ḿ Latin LATIN SMALL LETTER M WITH ACUTE [100] Recommended Not in documented common use
U+1E40 Latin LATIN CAPITAL LETTER M WITH DOT ABOVE [100] Uppercase  
U+1E41 Latin LATIN SMALL LETTER M WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E4C Latin LATIN CAPITAL LETTER O WITH TILDE AND ACUTE [100] Uppercase  
U+1E4D Latin LATIN SMALL LETTER O WITH TILDE AND ACUTE [100] Recommended Not in documented common use
U+1E4E Latin LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS [100] Uppercase  
U+1E4F Latin LATIN SMALL LETTER O WITH TILDE AND DIAERESIS [100] Recommended Not in documented common use
U+1E50 Latin LATIN CAPITAL LETTER O WITH MACRON AND GRAVE [100] Uppercase  
U+1E51 Latin LATIN SMALL LETTER O WITH MACRON AND GRAVE [100] Recommended Not in documented common use
U+1E52 Latin LATIN CAPITAL LETTER O WITH MACRON AND ACUTE [100] Uppercase  
U+1E53 Latin LATIN SMALL LETTER O WITH MACRON AND ACUTE [100] Recommended Not in documented common use
U+1E54 Latin LATIN CAPITAL LETTER P WITH ACUTE [100] Uppercase  
U+1E55 Latin LATIN SMALL LETTER P WITH ACUTE [100] Recommended Not in documented common use
U+1E56 Latin LATIN CAPITAL LETTER P WITH DOT ABOVE [100] Uppercase  
U+1E57 Latin LATIN SMALL LETTER P WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E58 Latin LATIN CAPITAL LETTER R WITH DOT ABOVE [100] Uppercase  
U+1E59 Latin LATIN SMALL LETTER R WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E5A Latin LATIN CAPITAL LETTER R WITH DOT BELOW [100] Uppercase  
U+1E5B Latin LATIN SMALL LETTER R WITH DOT BELOW [100] Recommended Not in documented common use
U+1E5C Latin LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON [100] Uppercase  
U+1E5D Latin LATIN SMALL LETTER R WITH DOT BELOW AND MACRON [100] Recommended Not in documented common use
U+1E5E Latin LATIN CAPITAL LETTER R WITH LINE BELOW [100] Uppercase  
U+1E5F Latin LATIN SMALL LETTER R WITH LINE BELOW [100] Recommended Not in documented common use
U+1E60 Latin LATIN CAPITAL LETTER S WITH DOT ABOVE [100] Uppercase  
U+1E61 Latin LATIN SMALL LETTER S WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E64 Latin LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE [100] Uppercase  
U+1E65 Latin LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE [100] Recommended Not in documented common use
U+1E66 Latin LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE [100] Uppercase  
U+1E67 Latin LATIN SMALL LETTER S WITH CARON AND DOT ABOVE [100] Recommended Not in documented common use
U+1E68 Latin LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE [100] Uppercase  
U+1E69 Latin LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE [100] Recommended Not in documented common use
U+1E6A Latin LATIN CAPITAL LETTER T WITH DOT ABOVE [100] Uppercase  
U+1E6B Latin LATIN SMALL LETTER T WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E6E Latin LATIN CAPITAL LETTER T WITH LINE BELOW [100] Uppercase  
U+1E6F Latin LATIN SMALL LETTER T WITH LINE BELOW [100] Recommended Not in documented common use
U+1E78 Latin LATIN CAPITAL LETTER U WITH TILDE AND ACUTE [100] Uppercase  
U+1E79 Latin LATIN SMALL LETTER U WITH TILDE AND ACUTE [100] Recommended Not in documented common use
U+1E7A Latin LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS [100] Uppercase  
U+1E7B Latin LATIN SMALL LETTER U WITH MACRON AND DIAERESIS [100] Recommended Not in documented common use
U+1E7C Latin LATIN CAPITAL LETTER V WITH TILDE [100] Uppercase  
U+1E7D Latin LATIN SMALL LETTER V WITH TILDE [100] Recommended Not in documented common use
U+1E7E Latin LATIN CAPITAL LETTER V WITH DOT BELOW [100] Uppercase  
U+1E7F ṿ Latin LATIN SMALL LETTER V WITH DOT BELOW [100] Recommended Not in documented common use
U+1E80 Latin LATIN CAPITAL LETTER W WITH GRAVE [100] Uppercase  
U+1E81 Latin LATIN SMALL LETTER W WITH GRAVE [100] Recommended Not in documented common use
U+1E82 Latin LATIN CAPITAL LETTER W WITH ACUTE [100] Uppercase  
U+1E83 Latin LATIN SMALL LETTER W WITH ACUTE [100] Recommended Not in documented common use
U+1E84 Latin LATIN CAPITAL LETTER W WITH DIAERESIS [100] Uppercase  
U+1E85 Latin LATIN SMALL LETTER W WITH DIAERESIS [100] Recommended Not in documented common use
U+1E86 Latin LATIN CAPITAL LETTER W WITH DOT ABOVE [100] Uppercase  
U+1E87 Latin LATIN SMALL LETTER W WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E88 Latin LATIN CAPITAL LETTER W WITH DOT BELOW [100] Uppercase  
U+1E89 Latin LATIN SMALL LETTER W WITH DOT BELOW [100] Recommended Not in documented common use
U+1E8A Latin LATIN CAPITAL LETTER X WITH DOT ABOVE [100] Uppercase  
U+1E8B Latin LATIN SMALL LETTER X WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E8E Latin LATIN CAPITAL LETTER Y WITH DOT ABOVE [100] Uppercase  
U+1E8F Latin LATIN SMALL LETTER Y WITH DOT ABOVE [100] Recommended Not in documented common use
U+1E90 Latin LATIN CAPITAL LETTER Z WITH CIRCUMFLEX [100] Uppercase  
U+1E91 Latin LATIN SMALL LETTER Z WITH CIRCUMFLEX [100] Recommended Not in documented common use
U+1E92 Latin LATIN CAPITAL LETTER Z WITH DOT BELOW [100] Uppercase  
U+1E93 Latin LATIN SMALL LETTER Z WITH DOT BELOW [100] Recommended Not in documented common use
U+1E94 Latin LATIN CAPITAL LETTER Z WITH LINE BELOW [100] Uppercase  
U+1E95 Latin LATIN SMALL LETTER Z WITH LINE BELOW [100] Recommended Not in documented common use
U+1E96 Latin LATIN SMALL LETTER H WITH LINE BELOW [100] Recommended Not in documented common use
U+1E97 Latin LATIN SMALL LETTER T WITH DIAERESIS [100] Recommended Not in documented common use
U+1E98 Latin LATIN SMALL LETTER W WITH RING ABOVE [100] Recommended Not in documented common use
U+1E99 Latin LATIN SMALL LETTER Y WITH RING ABOVE [100] Recommended Not in documented common use
U+2D80 Ethiopic ETHIOPIC SYLLABLE LOA [100] Recommended Not in documented common use
U+2D81 Ethiopic ETHIOPIC SYLLABLE MOA [100] Recommended Not in documented common use
U+2D82 Ethiopic ETHIOPIC SYLLABLE ROA [100] Recommended Not in documented common use
U+2D83 Ethiopic ETHIOPIC SYLLABLE SOA [100] Recommended Not in documented common use
U+2D84 Ethiopic ETHIOPIC SYLLABLE SHOA [100] Recommended Not in documented common use
U+2D85 Ethiopic ETHIOPIC SYLLABLE BOA [100] Recommended Not in documented common use
U+2D86 Ethiopic ETHIOPIC SYLLABLE TOA [100] Recommended Not in documented common use
U+2D87 Ethiopic ETHIOPIC SYLLABLE COA [100] Recommended Not in documented common use
U+2D88 Ethiopic ETHIOPIC SYLLABLE NOA [100] Recommended Not in documented common use
U+2D89 Ethiopic ETHIOPIC SYLLABLE NYOA [100] Recommended Not in documented common use
U+2D8A Ethiopic ETHIOPIC SYLLABLE GLOTTAL OA [100] Recommended Not in documented common use
U+2D8B Ethiopic ETHIOPIC SYLLABLE ZOA [100] Recommended Not in documented common use
U+2D8C Ethiopic ETHIOPIC SYLLABLE DOA [100] Recommended Not in documented common use
U+2D8D Ethiopic ETHIOPIC SYLLABLE DDOA [100] Recommended Not in documented common use
U+2D8E Ethiopic ETHIOPIC SYLLABLE JOA [100] Recommended Not in documented common use
U+2D8F Ethiopic ETHIOPIC SYLLABLE THOA [100] Recommended Not in documented common use
U+2D90 Ethiopic ETHIOPIC SYLLABLE CHOA [100] Recommended Not in documented common use
U+2D91 Ethiopic ETHIOPIC SYLLABLE PHOA [100] Recommended Not in documented common use
U+2D92 Ethiopic ETHIOPIC SYLLABLE POA [100] Recommended Not in documented common use
U+2D93 Ethiopic ETHIOPIC SYLLABLE GGWA [100] Recommended Not in documented common use
U+2D94 Ethiopic ETHIOPIC SYLLABLE GGWI [100] Recommended Not in documented common use
U+2D95 Ethiopic ETHIOPIC SYLLABLE GGWEE [100] Recommended Not in documented common use
U+2D96 Ethiopic ETHIOPIC SYLLABLE GGWE [100] Recommended Not in documented common use
U+A7B9 Latin LATIN SMALL LETTER U WITH STROKE [100] Recommended Not in documented common use
U+AB01 Ethiopic ETHIOPIC SYLLABLE TTHU [100] Recommended Not in documented common use
U+AB02 Ethiopic ETHIOPIC SYLLABLE TTHI [100] Recommended Not in documented common use
U+AB03 Ethiopic ETHIOPIC SYLLABLE TTHAA [100] Recommended Not in documented common use
U+AB04 Ethiopic ETHIOPIC SYLLABLE TTHEE [100] Recommended Not in documented common use
U+AB05 Ethiopic ETHIOPIC SYLLABLE TTHE [100] Recommended Not in documented common use
U+AB06 Ethiopic ETHIOPIC SYLLABLE TTHO [100] Recommended Not in documented common use
U+AB09 Ethiopic ETHIOPIC SYLLABLE DDHU [100] Recommended Not in documented common use
U+AB0A Ethiopic ETHIOPIC SYLLABLE DDHI [100] Recommended Not in documented common use
U+AB0B Ethiopic ETHIOPIC SYLLABLE DDHAA [100] Recommended Not in documented common use
U+AB0C Ethiopic ETHIOPIC SYLLABLE DDHEE [100] Recommended Not in documented common use
U+AB0D Ethiopic ETHIOPIC SYLLABLE DDHE [100] Recommended Not in documented common use
U+AB0E Ethiopic ETHIOPIC SYLLABLE DDHO [100] Recommended Not in documented common use
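
The Latin rows above alternate between a capital tagged Uppercase and the small letter that carries the Recommended entry. The source does not say how the uppercase equivalents were derived. As a minimal sketch, assuming the case mappings in Python's standard library are an adequate stand-in, the following checks that pairing for a few code points copied from the table. The names SAMPLE_CAPITALS and lowercase_partner are illustrative only.

```python
# A few capitals copied from the Uppercase rows above (not the full list).
SAMPLE_CAPITALS = [0x1E02, 0x1E0C, 0x1E24, 0x1E5A, 0x1E92, 0x1E94]

def lowercase_partner(cp):
    """Return the code point of the lowercase form, or None if the
    lowercase mapping is not a single code point."""
    lowered = chr(cp).lower()
    return ord(lowered) if len(lowered) == 1 else None

# Map each sampled capital to its lowercase partner.
pairs = {cp: lowercase_partner(cp) for cp in SAMPLE_CAPITALS}

# U+1E96..U+1E99 appear above without an Uppercase row. Their uppercase
# mapping expands to a base letter plus a combining mark, so the length
# of the uppercased string is recorded here.
unpaired = {cp: len(chr(cp).upper()) for cp in range(0x1E96, 0x1E9A)}

for cp, low in pairs.items():
    print(f"U+{cp:04X} -> U+{low:04X}")
```

In this sample each capital maps to the code point immediately after it, matching the row order in the table. The four small letters U+1E96 to U+1E99 have no single-code-point capital, which is consistent with their appearing without an Uppercase row.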

Legend

Code Point
A code point or code point sequence.
Glyph
The shape displayed depends on the fonts available to your browser.
Script
Shows the script property value from the Unicode Character Database. Combining marks may have the value Inherited and code points used with more than one script may have the value Common.
Name
Shows the character or sequence name from the Unicode Character Database.
Ref
Links to the references associated with the code point or sequence, if any.
Tags
LGR-defined tag values. Any tags matching the Unicode script property are suppressed in this view.
Comment
The comment as given in the XML file. However, if the comment for this row consists only of the code point or sequence name, it is suppressed in this view. By convention, comments starting with “=” denote an alias. If present, the symbol ⍟ marks a default item shared among a set of LGRs.

Variants

This LGR does not specify any variants.

Classes, Rules and Actions

Character Classes

Number of named classes 2
Implicit (except script) 4

The following table lists all named and implicit classes with their definition and a list of their members intersected with the current repertoire (for larger classes, this list is elided).

Name Definition Count Members or Ranges Ref Comment
Digits Prop=gc:Nd 760→70 {0A66-0A6F 0B66-0B6F 0BE6-0BEF 0C66-0C6F 0D66-0D6F 0DE6-0DEF 1090-1099}   Any character matching Unicode property General_Category:Decimal_Number
Uppercase Prop=gc:Lu 1858→88 {0114 012C 014E 0156 0162 01D5 01D7 01D9 01DB 01DE 01E0 01E2 01EA 01EC 01F4 01F8 01FA 01FC 01FE 021E 0226 0228 022A 022C 022E 0230 0232 0400 040D 04C1 04CB ...}   Any character matching Unicode property General_Category:Uppercase_Letter
implicit Tag=Recommended 344 {0115 012D 014F 0157 0163 01D6 01D8 01DA 01DC 01DF 01E1 01E3 01EB 01ED 01F0 01F5 01F9 01FB 01FD 01FF 021F 0227 0229 022B 022D 022F 0231 0233 0450 045D 04C2 ...}   Any character tagged as Recommended
implicit Tag=RefLGR 2332→0 {}   Any character tagged as RefLGR
implicit Tag=RefLGRBySequence 13→0 {}   Any character tagged as RefLGRBySequence
implicit Tag=Uppercase 88 {0114 012C 014E 0156 0162 01D5 01D7 01D9 01DB 01DE 01E0 01E2 01EA 01EC 01F4 01F8 01FA 01FC 01FE 021E 0226 0228 022A 022C 022E 0230 0232 0400 040D 04C1 04CB ...}   Any character tagged as Uppercase

Legend

Members or Ranges
Lists the members of the class as code points (xxx) or as ranges of code points (xxx-yyy). Any class too numerous to list in full is elided with "...".
m→n
Indicates a set for which only n of its m members fall inside the repertoire.
Tag=ttt
A named or implicit class defined by all code points that share the given tag value (ttt).
Prop=ppp:vvv
A named class defined by reference to value vvv of Unicode property ppp.
Implicit
An anonymous class implicitly defined based on tag value and for which there is no named equivalent.

Note: The following named classes are defined but not used in this LGR: Digits, Uppercase.
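
The range syntax and the m→n notation can be illustrated with the Digits row. The following Python sketch expands the ranges listed for that class and counts the members. It also checks each member against General_Category Nd using the unicodedata module, whose Unicode version depends on the Python build and may differ from 16.0.0. The helper expand_members is hypothetical and not part of any LGR tooling.

```python
import unicodedata

# Ranges copied from the "Members or Ranges" column of the Digits row.
DIGIT_RANGES = "0A66-0A6F 0B66-0B6F 0BE6-0BEF 0C66-0C6F 0D66-0D6F 0DE6-0DEF 1090-1099"

def expand_members(spec):
    """Expand 'XXXX' and 'XXXX-YYYY' tokens into a list of code points."""
    code_points = []
    for token in spec.split():
        first, _, last = token.partition("-")
        start = int(first, 16)
        end = int(last, 16) if last else start
        code_points.extend(range(start, end + 1))
    return code_points

digits = expand_members(DIGIT_RANGES)

# Every listed member should have General_Category Nd (Decimal_Number).
all_decimal = all(unicodedata.category(chr(cp)) == "Nd" for cp in digits)

print(len(digits))  # prints 70
```

The count of 70 is the n in "760→70". The m of 760 is the size of the full property-defined set and is not recomputed here, since it depends on the Unicode version.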

Whole label evaluation and context rules

The LGR does not define any rules.

Actions

The LGR does not define any actions.

Table of References

The following lists the references cited for specific code points, variants, classes, rules or actions in this LGR.

[EGIDS] Lewis and Simons, “EGIDS: Expanded Graded Intergenerational Disruption Scale,” documented in [SIL-Ethnologue] and summarized here:
https://en.wikipedia.org/wiki/Expanded_Graded_Intergenerational_Disruption_Scale_(EGIDS)
[IAB] “IAB Statement on Identifiers and Unicode 7.0.0”,
https://datatracker.ietf.org/doc/statement-iab-statement-on-identifiers-and-unicode-7-0-0/01/pdf/
[MSR] ICANN, “Maximal Starting Repertoire”,
https://www.icann.org/resources/pages/msr-2015-06-21-en
[Proposal-Arabic] “Proposal for Arabic Script Root Zone LGR”,
https://www.icann.org/en/system/files/files/arabic-lgr-proposal-18nov15-en.pdf
[RefLGR] ICANN, “Second-Level Reference Label Generation Rules”,
https://www.icann.org/resources/pages/second-level-lgr-2015-06-21-en
[RefLGR-Overview] ICANN, “Reference Label Generation Rules (LGR) for the Second Level — Overview and Summary”,
https://www.icann.org/sites/default/files/packages/lgr/lgr-second-level-overview-summary-25oct24-en.pdf
[RZ-LGR] ICANN, “Root Zone Label Generation Rules”,
https://www.icann.org/resources/pages/root-zone-lgr-2015-06-21-en
[SIL-Ethnologue] David M. Eberhard, Gary F. Simons & Charles D. Fennig (eds.). 2021. Ethnologue: Languages of the World, Twenty-fourth edition. Dallas, Texas: SIL International. Online version available as
https://www.ethnologue.com
[100] The Unicode Consortium: Identifier_Type property for Unicode Version 16.0.0, available as
https://unicode.org/Public/security/16.0.0/IdentifierType.txt