Emoji and Annotation data

Mark Davis ☕️ mark at macchiato.com
Fri Jun 24 11:04:40 CDT 2016

You should never be scraping *any* Unicode HTML files. They are not made
for that, and there is no guarantee of stability.

The emoji files are built from data which is described in
(plus CLDR annotations and collation)
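
As a rough illustration of reading the underlying data instead of scraping the HTML charts, here is a minimal Python sketch that builds a keyword-to-emoji dictionary from CLDR-style annotation XML. It is only a sketch under stated assumptions: the inline SAMPLE is hypothetical, its <annotation cp="..."> layout (keywords separated by "|", with type="tts" marking the spoken name) follows the style of later CLDR releases and may not match every version, and build_keyword_index is a name invented here, not part of any Unicode or CLDR tool.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample in the style of a CLDR annotations file.
# Assumption: keywords are "|"-separated and type="tts" marks the name;
# check the layout of the CLDR release you actually ship against.
SAMPLE = """<ldml>
  <annotations>
    <annotation cp="😀">face | grin</annotation>
    <annotation cp="😀" type="tts">grinning face</annotation>
    <annotation cp="☕">beverage | coffee | hot</annotation>
  </annotations>
</ldml>
"""


def build_keyword_index(xml_text):
    """Map each annotation keyword to the set of characters that carry it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(set)
    for node in root.iter("annotation"):
        # Skip the text-to-speech name; only keyword lists are indexed here.
        if node.get("type") == "tts":
            continue
        chars = node.get("cp")
        for keyword in (node.text or "").split("|"):
            keyword = keyword.strip()
            if keyword:
                index[keyword].add(chars)
    return dict(index)


if __name__ == "__main__":
    for keyword, chars in sorted(build_keyword_index(SAMPLE).items()):
        print(keyword, " ".join(sorted(chars)))
```

For a real build, the same function could be fed the contents of an annotations file from a pinned CLDR release, which avoids depending on the layout of the HTML charts.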


On Fri, Jun 24, 2016 at 7:21 AM, Takao Fujiwara <tfujiwar at redhat.com> wrote:

> Hi,
> I'm working on IBus - the input method framework for Linux.
> I parse http://unicode.org/emoji/charts/emoji-list.html and create a
> dictionary between the annotations and the Emoji characters.
> Since the file is large and often updated, I'm wondering how best to
> maintain it.
> I copied the file as http://ibus.github.io/files/ibus/emoji-list.html for
> the build at the moment.
> I have two questions:
>  - Does unicode.org provide a tarball of the stable HTML files or other
> data?
>  - What is the license of the HTML files?
> Do you have any ideas?
> Thanks,
> Fujiwara