A cumbersome but easily extensible framework for encoding emoji is to index them (at least, new ones) by arbitrary text strings instead of numeric code points. The font rendering agent has access to the text string and can render it appropriately, perhaps using an NLP model to generate an emoji image for text it has never seen before. Such emoji probably cannot have interesting properties like a defined alphabetization order, but that is rarely needed for emoji. It is a little strange that many characters of data go into producing a single displayed character, but this is similar to characters composed of multiple combining characters or joined by zero-width joiners.
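The precedent already exists in Unicode: a single displayed emoji can be built from several code points joined by zero-width joiners (U+200D). A quick Python check makes the point:

```python
# The "family: man, woman, girl" emoji is three emoji code points
# joined by two zero-width joiners -- five code points, one glyph.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
print(len(family))  # 5
```

So rendering agents already cope with "many characters in, one character out."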
This is similar to heraldry, in which the text describing a coat of arms (the blazon) is canonical; different artists may render the text in different ways.
The Unicode Consortium can establish standard characters marking the beginning and end of the descriptive text strings, and standards for what goes in them (what language? what encoding? all caps? are emoji recursively permitted within the descriptive strings?). Then a decentralized process happens: people invent text strings describing emoji, and font designers invent how to render them. Standardization can then do things like the following:

- Identify emoji that are supported by many fonts and used commonly.
- Assign code points to popular emoji to conserve data usage.
- Combine multiple different text strings describing the same thing into one code point.
- Separate identical text strings describing different things (homonyms?) into different code points.
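A minimal sketch of the encoding side, assuming the Consortium had assigned begin/end marker characters (it has not; the two Private Use Area code points below are stand-ins invented for illustration):

```python
# Hypothetical marker characters -- Unicode has not assigned these.
# We borrow two Private Use Area code points as placeholders.
EMOJI_BEGIN = "\uE000"  # assumed "begin descriptive emoji" marker
EMOJI_END = "\uE001"    # assumed "end descriptive emoji" marker

def encode_descriptive_emoji(description: str) -> str:
    """Wrap a plain-text description in the assumed marker characters."""
    return f"{EMOJI_BEGIN}{description}{EMOJI_END}"

def extract_descriptions(text: str) -> list[str]:
    """Return every descriptive string found between marker pairs.

    A rendering agent would hand each extracted string to its
    image-generation or font-lookup machinery.
    """
    found, i = [], 0
    while True:
        start = text.find(EMOJI_BEGIN, i)
        if start == -1:
            return found
        end = text.find(EMOJI_END, start + 1)
        if end == -1:  # unterminated marker: ignore the tail
            return found
        found.append(text[start + 1:end])
        i = end + 1

msg = "I feel " + encode_descriptive_emoji("cat riding a skateboard") + " today"
print(extract_descriptions(msg))  # ['cat riding a skateboard']
```

A renderer that has never seen "cat riding a skateboard" could fall back to generating an image from the string; a later standardization pass could promote the string to a real code point if it catches on.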