Private Use areas

William_J_G Overington via Unicode unicode at unicode.org
Tue Aug 28 10:58:54 CDT 2018


Asmus Freytag wrote:

> There are situations where an ad-hoc markup language seems to fulfill a need that is not well served by the existing full-fledged markup languages. You find them in internet "bulletin boards" or services like GitHub, where pure plain text is too restrictive but the required text styles purposefully limited - which makes the syntactic overhead of a full-featured mark-up language burdensome.

I am thinking of such an ad-hoc special purpose markup language.

I am thinking of something like a special-purpose version of the FORTH computer language, but with no user definitions, no comparison operations, no loops and no compiler. It would be just a straight run-through, as if someone were typing commands into FORTH in interactive mode at a keyboard. There may be no need for spaces between commands. For example, circled R might mean use right-to-left text display.

I am thinking that there could be three stacks: one for code points, one for numbers, and one for external reference strings, such as for accessing a web page or a PDF (Portable Document Format) document or listing an International Standard Book Number and so on. Code points could be entered by circled H followed by circled hexadecimal characters followed by a circled character to indicate Push onto the code point stack. Numbers could be entered in base 10, followed by a circled character to mean Push onto the number stack. A later circled character could mean to take a certain number of code points (maybe just 1, or maybe 0) from the code point stack and a certain number of numbers (maybe just 1, or maybe 0) from the number stack and use them to set some property.

It could all be very lightweight software-wise: the software would just read the sequence of circled characters and obey them one by one, once only, in a single run-through. Just a few characters, such as the circled digits, would each have a meaning that depends on a state variable; for a circled digit, that variable would record whether data entry is currently hexadecimal or base 10.
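
As a rough illustration of how little software this might need, here is a minimal Python sketch of such a single-pass interpreter. Only circled H (start hexadecimal entry) and circled R (right-to-left display) come from the description above. The other command letters (circled P, N and S), the property names, the use of circled capitals A to F as hexadecimal digits, and the encode helper are all assumptions made for the example. The third stack is present but unused, since no entry syntax for reference strings has been suggested.

```python
from dataclasses import dataclass, field


def encode(ascii_text: str) -> str:
    """Hypothetical helper: map ASCII digits and capitals to circled characters."""
    out = []
    for c in ascii_text:
        if c == "0":
            out.append("\u24EA")                         # circled digit zero
        elif "1" <= c <= "9":
            out.append(chr(0x2460 + int(c) - 1))         # circled digits one to nine
        elif "A" <= c <= "Z":
            out.append(chr(0x24B6 + ord(c) - ord("A")))  # circled capital letters
        else:
            raise ValueError(f"no circled form used here for {c!r}")
    return "".join(out)


# Command characters. H and R follow the post; P, N and S are assumed.
H, P, N, R, S = (encode(c) for c in "HPNRS")


def digit_value(ch: str, base: int):
    """Return the value of a circled digit in the current base, or None."""
    if ch == "\u24EA":
        return 0
    if "\u2460" <= ch <= "\u2468":
        return ord(ch) - 0x2460 + 1
    # Circled A to F count as digits only while entry is hexadecimal.
    if base == 16 and "\u24B6" <= ch <= "\u24BB":
        return ord(ch) - 0x24B6 + 10
    return None


@dataclass
class State:
    code_points: list = field(default_factory=list)
    numbers: list = field(default_factory=list)
    references: list = field(default_factory=list)  # third stack, unused in this sketch
    properties: dict = field(default_factory=dict)


def run(sequence: str) -> State:
    """Obey each circled character once, left to right, with no loops or branches."""
    state = State()
    base, acc = 10, 0          # the state variable: base 10 unless circled H was seen
    for ch in sequence:
        value = digit_value(ch, base)
        if value is not None:
            acc = acc * base + value
        elif ch == H:
            base, acc = 16, 0
        elif ch == P:          # assumed: push accumulated value as a code point
            state.code_points.append(acc)
            base, acc = 10, 0
        elif ch == N:          # assumed: push accumulated value as a number
            state.numbers.append(acc)
            base, acc = 10, 0
        elif ch == R:
            state.properties["direction"] = "rtl"
        elif ch == S:          # assumed: take one code point and one number
            state.properties["example"] = (state.code_points.pop(), state.numbers.pop())
        else:
            raise ValueError(f"unknown command U+{ord(ch):04X}")
    return state
```

With these assumed commands, run(encode("H201FP42NR")) leaves the code point U+201F on the code point stack and 42 on the number stack, and sets the direction property to right-to-left.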

I am wondering how many PUA property variables would need to be settable for the system to be useful.

The sequence could start with all of those PUA property values set at their default values so only those that needed changing need be explicitly set, though others could be explicitly set to the default values if a record were desired. 
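
The default-value idea could be sketched in Python as follows. The property names and default values here are purely hypothetical placeholders, since no list of properties has been decided.

```python
# Hypothetical property table; the names and defaults are assumptions for illustration.
DEFAULTS = {"direction": "ltr", "scale": 100}


def effective_properties(explicit: dict) -> dict:
    """Start from the defaults and apply only the explicitly set properties."""
    unknown = set(explicit) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown properties: {sorted(unknown)}")
    return {**DEFAULTS, **explicit}
```

Setting a property explicitly to its default value, for the sake of a record, gives the same result as leaving it out: effective_properties({"scale": 100}) equals effective_properties({}).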

William Overington
 
Tuesday 28 August 2018


