UCD in XML or in CSV? (is: UCD in YAML)

Marcel Schneider via Unicode unicode at unicode.org
Sat Sep 1 07:16:03 CDT 2018


Thank you, Marius, for the example. Indeed, I now see that YAML is a powerful way
to keep a file intuitively readable while drastically reducing its size.

BTW, what I conjectured about the role of line breaks is true for CSV too: any
semicolon-separated file downloaded from the UCD becomes unusable when opened
directly in the built-in text editor of Windows, given that the UCD files use Unix EOL.
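
As a rough sketch of a workaround (assuming Python is at hand; the helper names
to_crlf and convert_file are made up for illustration), the line endings can be
converted before opening the file:

```python
def to_crlf(data: bytes) -> bytes:
    """Convert Unix (LF) line endings to Windows (CRLF)."""
    # Normalize any existing CRLF to LF first, so that no CRLF gets doubled,
    # then expand every LF to CRLF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def convert_file(src: str, dst: str) -> None:
    """Hypothetical helper: copy src to dst with CRLF line endings."""
    # Work on raw bytes so the encoded text itself is left untouched.
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(to_crlf(data))


# Two records in the semicolon-separated layout of UnicodeData.txt.
sample = (
    b"0000;<control>;Cc;0;BN;;;;;N;NULL;;;;\n"
    b"0001;<control>;Cc;0;BN;;;;;N;START OF HEADING;;;;\n"
)
converted = to_crlf(sample)
```

Only the line terminators change; the fields and separators stay as they are.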

Still, for use in spreadsheets, YAML needs to be converted to CSV, although a large
YAML file might not crash the browser as large XML does.
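
A minimal sketch of such a conversion, assuming the YAML has already been loaded
into nested Python dicts and lists (for instance with a YAML library; the data
below is written out by hand instead), and assuming the layout of the quoted
example. The function name chars_to_csv is hypothetical, and only three of the
properties are shown:

```python
import csv
import io


def chars_to_csv(doc: dict) -> str:
    """Flatten repertoire/char records into CSV text, one row per code point."""
    chars = doc["repertoire"]["char"]

    # Collect the property names in first-seen order to form the header.
    prop_names = []
    for ch in chars:
        for key in ch.get("props", {}):
            if key not in prop_names:
                prop_names.append(key)

    out = io.StringIO()
    writer = csv.writer(out)  # comma-separated, CRLF row terminator by default
    writer.writerow(["cp", "na1"] + prop_names)
    for ch in chars:
        props = ch.get("props", {})
        writer.writerow([ch["cp"], ch["na1"]] + [props.get(p, "") for p in prop_names])
    return out.getvalue()


# Hand-written stand-in for the loaded YAML. Both records share one mapping,
# which is what the &0000 anchor and *0000 alias express in the YAML source.
# The cp values are assumed to be strings here.
shared = {"age": "1.1", "gc": "Cc", "bc": "BN"}
doc = {"repertoire": {"char": [
    {"cp": "0000", "na1": "NULL", "props": shared},
    {"cp": "0001", "na1": "START OF HEADING", "props": shared},
]}}
csv_text = chars_to_csv(doc)
```

The shared properties are written out again on every row, so the CSV loses the
size advantage that the references give the YAML file.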

Regards,

Marcel

On 01/09/18 09:18 Marius Spix via Unicode wrote:
> 
> Hello Marcel,
> 
> YAML supports references, so you can refer to another character’s
> properties.
> 
> Example:
> 
> repertoire:
>   char:
>     -
>       name_alias:
>         - [NUL,abbreviation]
>         - ["NULL",control]
>       cp: 0000
>       na1: "NULL"
>       props: &0000
>         age: "1.1"
>         na: ""
>         JSN: ""
>         gc: Cc
>         ccc: 0
>         dt: none
>         dm: "#"
>         nt: None
>         nv: NaN
>         bc: BN
>         bpt: n
>         bpb: "#"
>         Bidi_M: N
>         bmg: ""
>         suc: "#"
>         slc: "#"
>         stc: "#"
>         uc: "#"
>         lc: "#"
>         tc: "#"
>         scf: "#"
>         cf: "#"
>         jt: U
>         jg: No_Joining_Group
>         ea: N
>         lb: CM
>         sc: Zyyy
>         scx: Zyyy
>         Dash: N
>         WSpace: N
>         Hyphen: N
>         QMark: N
>         Radical: N
>         Ideo: N
>         UIdeo: N
>         IDSB: N
>         IDST: N
>         hst: NA
>         DI: N
>         ODI: N
>         Alpha: N
>         OAlpha: N
>         Upper: N
>         OUpper: N
>         Lower: N
>         OLower: N
>         Math: N
>         OMath: N
>         Hex: N
>         AHex: N
>         NChar: N
>         VS: N
>         Bidi_C: N
>         Join_C: N
>         Gr_Base: N
>         Gr_Ext: N
>         OGr_Ext: N
>         Gr_Link: N
>         STerm: N
>         Ext: N
>         Term: N
>         Dia: N
>         Dep: N
>         IDS: N
>         OIDS: N
>         XIDS: N
>         IDC: N
>         OIDC: N
>         XIDC: N
>         SD: N
>         LOE: N
>         Pat_WS: N
>         Pat_Syn: N
>         GCB: CN
>         WB: XX
>         SB: XX
>         CE: N
>         Comp_Ex: N
>         NFC_QC: Y
>         NFD_QC: Y
>         NFKC_QC: Y
>         NFKD_QC: Y
>         XO_NFC: N
>         XO_NFD: N
>         XO_NFKC: N
>         XO_NFKD: N
>         FC_NFKC: "#"
>         CI: N
>         Cased: N
>         CWCF: N
>         CWCM: N
>         CWKCF: N
>         CWL: N
>         CWT: N
>         CWU: N
>         NFKC_CF: "#"
>         InSC: Other
>         InPC: NA
>         PCM: N
>         blk: ASCII
>         isc: ""
>
>     -
>       cp: 0001
>       na1: "START OF HEADING"
>       name_alias:
>         - [SOH,abbreviation]
>         - [START OF HEADING,control]
>       props: *0000
>
> Regards,
> 
> Marius Spix
> 
> 
> On Sat, 1 Sep 2018 08:00:02 +0200 (CEST)
> Marcel Schneider wrote:
> 
[…]


