Unicode block for programming related symbols and codepoints?

Andre Schappo A.Schappo at lboro.ac.uk
Mon Feb 9 10:41:14 CST 2015


I think this is a very good idea. So many ASCII characters serve multiple purposes in programming languages, and that really does need sorting out.

The fundamental separation of character semantics and glyph visual representation works really well for this proposal.

Let me take as an example the use of = in programming. The = character is used both to test for equality and to assign a value, depending on the programming language. The equality and assignment operations should have different characters, e.g.

U+XXX1 TEST FOR EQUALITY
U+XXX2 ASSIGNMENT OPERATOR

Initially the glyph used for both of these characters could be =, but the same mechanism could then be used to transition to a new and less ambiguous visual representation. The new visual representation could be something like

U+XXX1 TEST FOR EQUALITY =
U+XXX2 ASSIGNMENT OPERATOR ⬅

Such a distinction, both visual and at the character level, between the 2 functions must surely make things easier for those learning to program and for interpreter and compiler writers. I think it would also make program code easier to read and understand.
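
As a rough illustration of the compiler-writer side of this, here is a minimal Python sketch of a tokenizer. U+XXX1 and U+XXX2 are placeholders and not assigned characters, so the sketch borrows two Private Use Area codepoints (U+E001 and U+E002) as stand-ins. Those codepoints, and every name in the code, are assumptions made purely for illustration.

```python
# Hypothetical stand-ins: the proposed U+XXX1 / U+XXX2 do not exist,
# so two Private Use Area codepoints are borrowed for illustration.
TEST_FOR_EQUALITY = "\uE001"
ASSIGNMENT_OPERATOR = "\uE002"

OPERATORS = {
    TEST_FOR_EQUALITY: "EQ",
    ASSIGNMENT_OPERATOR: "ASSIGN",
}


def tokenize(source):
    """Split source into (kind, text) pairs.

    Each operator has its own codepoint, so a single character decides
    the token kind; no lookahead to tell '=' from '==' is needed.
    """
    tokens = []
    word = ""
    for ch in source:
        if ch in OPERATORS or ch.isspace():
            if word:
                tokens.append(("WORD", word))
                word = ""
            if ch in OPERATORS:
                tokens.append((OPERATORS[ch], ch))
        else:
            word += ch
    if word:
        tokens.append(("WORD", word))
    return tokens


def display(source, eq_glyph="=", assign_glyph="⬅"):
    """Render the source with whichever glyphs an editor or font prefers."""
    return (source.replace(TEST_FOR_EQUALITY, eq_glyph)
                  .replace(ASSIGNMENT_OPERATOR, assign_glyph))
```

The stored text never changes; only display() decides what the reader sees, which is the separation of character semantics from glyph that the proposal relies on.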

André

On 8 Feb 2015, at 20:15, Alfred Zett wrote:

Hello everyone,

is there a Unicode block for programming-related codepoints?

Conventional search engines, as well as Wolfram Alpha, can't answer that; the former only turn up all the programming problems that occur with Unicode...

If such a block doesn't exist, I'd like to make a proposal - if possible - to add one with at least the following codepoints/characters:

- Indentation codepoint, with no fixed graphical representation, for indentation-based programming languages.
Because:
-- specific clients may want to show it differently (for example as arrows, lines etc., or in another color):
--- browsers could let the web page creator decide the visual representation (character and size) via CSS
--- the same goes for editors, independent of the actual font
--- in case of visual impairment, the user could even change the acoustic representation if the editor allows it
-- unlike a space character, it wouldn't need more than one character per indentation level
-- unlike tabs or spaces, it wouldn't be whitespace
-- unlike normal arrow characters, one could customize the length in an editor and wouldn't have to insert extra spaces for better visual alignment

- A codepoint for string literal quotes, which would spare one the escaping.
- A statement separator symbol.
- Other ideas?
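
A minimal Python sketch of how a tool might handle the first two items in the list above. No such codepoints exist, so the Private Use Area codepoints U+E010 and U+E011 are used as stand-ins. The codepoints and all function names are assumptions for illustration only.

```python
# Hypothetical stand-ins for the proposed characters (Private Use Area).
INDENT = "\uE010"        # one character per indentation level
STRING_QUOTE = "\uE011"  # dedicated string literal delimiter


def split_indent(line):
    """Return (indentation level, remaining text) for one source line."""
    rest = line.lstrip(INDENT)
    return len(line) - len(rest), rest


def render(line, marker="    "):
    """Show each indentation character as whatever the client prefers."""
    level, rest = split_indent(line)
    return marker * level + rest


def read_string(text, start):
    """Read a string literal starting at text[start].

    ASCII quotes and backslashes inside the literal are ordinary
    characters, so nothing has to be escaped.
    Returns (content, index just past the closing delimiter).
    """
    if text[start] != STRING_QUOTE:
        raise ValueError("no string literal at this position")
    end = text.index(STRING_QUOTE, start + 1)
    return text[start + 1:end], end + 1
```

For example, render(line, "| ") draws each level as a vertical bar, while another client could pass an arrow or a wider run of spaces without the stored text changing.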

You may think this is highly specific, and you are right.
However, so are emoji, in particular ones like PINE DECORATION.

These days, there are a lot of tools for creating small embedded scripting languages and DSLs, which are used in-program in special editors. And there are a lot of people using them.
Exactly these could really profit from such a block instead of using characters from the conventional ASCII subset.
Also, there is a lot of potential in really good text editors and IDEs, where semantics may matter a lot.

Excuse my English; I hope this was understandable.

Best regards,

A. Z.
_______________________________________________
Unicode mailing list
Unicode at unicode.org
http://unicode.org/mailman/listinfo/unicode

马馬骉驫马馬骉驫马馬骉驫马馬骉驫
http://twitter.com/andreschappo
http://schappo.blogspot.co.uk
http://weibo.com/andreschappo
http://blog.sina.com.cn/andreschappo


