Saturday, July 11, 2020

[lxjmqymd] Identifiers

An identifier may consist of the characters a-z, 0-9, underbar, and parentheses.  (39 possible characters.)

It must be at least one character long.

Parentheses must match.
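
The three rules above can be sketched as a small checker. This is only an illustration, not a reference implementation: the names `ALLOWED` and `is_valid_identifier` are made up here, and "must match" is assumed to mean the usual balanced nesting, with no close parenthesis appearing before its open.

```python
import string

# a-z, 0-9, underbar, and the two parentheses: 26 + 10 + 1 + 2 = 39 characters.
ALLOWED = set(string.ascii_lowercase + string.digits + "_()")


def is_valid_identifier(s: str) -> bool:
    """Return True if s uses only allowed characters, is non-empty,
    and has balanced parentheses (assumed meaning of "must match")."""
    if len(s) == 0:
        return False
    depth = 0  # number of currently open parentheses
    for ch in s:
        if ch not in ALLOWED:
            return False
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a close parenthesis with nothing to close
                return False
    return depth == 0
```

Under this reading, "24601", "__a(b)_", and even "()" all pass, while "a(b", ")(", and "Abc" do not.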

Provocative design decisions (which might come back to bite us):

Unlimited length.

No additional escape character.  We imagine conventions that use parentheses as an escape character will come to exist, e.g., "(unicode_128169)" to denote the "pile of poo" emoji.  Compare with HTML's &#128169;.

No capital letters.  The user agent retains the flexibility to capitalize wherever, perhaps to signify names instead of regular words.  Because underbar is available, there is no need for camelCase.

It may consist of all digits.  Context is needed to know if something is an identifier or a plain number: 24601.

Underbars may go at the beginning, at the end, right after an open parenthesis, or right before a close parenthesis.  There may be multiple underbars in a row.

The first and last characters can be matching parentheses, so there needs to be context to know if parentheses are part of the identifier or part of the surrounding text.  Perhaps the user agent has the flexibility of displaying identifier parentheses with a different pair of matching characters.
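
As an illustration of how a user agent might handle the imagined "(unicode_128169)" convention from the escape-character item above, here is a minimal sketch. The convention itself is only hypothetical, and so is everything in the code: the function name `expand_unicode_escapes`, the exact spelling "(unicode_N)" with N in decimal, and the choice to leave out-of-range numbers untouched.

```python
import re

# Hypothetical escape convention: "(unicode_N)" with N a decimal code point.
_ESCAPE = re.compile(r"\(unicode_([0-9]+)\)")


def expand_unicode_escapes(identifier: str) -> str:
    """Replace each "(unicode_N)" group with the character at code point N,
    for display purposes only; the stored identifier stays unchanged."""

    def replace(match):
        code_point = int(match.group(1))
        if code_point > 0x10FFFF:
            # Not a Unicode code point: leave the text as written (an assumption).
            return match.group(0)
        return chr(code_point)

    return _ESCAPE.sub(replace, identifier)
```

For example, `expand_unicode_escapes("(unicode_128169)")` yields the single character U+1F4A9, the same one HTML's numeric reference names.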

The original motivation was selecting usernames.  What constraints are narrow enough to avoid madness, but expressive enough to satisfy people?

Underbars permit chaining multiple components.  Digits encode arbitrary data.  Parentheses permit hierarchical (and other) structures.
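
One way to read an identifier as a structure is sketched below, under assumptions that go beyond the post: underbars are treated purely as separators (so runs of underbars and leading or trailing underbars produce no empty components), and each parenthesized group becomes a nested list. The name `parse_structure` is made up for this example.

```python
def parse_structure(identifier: str) -> list:
    """Parse an identifier into nested lists: components are strings,
    parenthesized groups are sub-lists. Underbars only separate components."""
    root = []
    stack = [root]  # stack[-1] is the group currently being filled
    word = []       # characters of the component being read

    def flush():
        # Finish the current component, if any, and add it to the open group.
        if word:
            stack[-1].append("".join(word))
            word.clear()

    for ch in identifier:
        if ch == "(":
            flush()
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif ch == ")":
            flush()
            if len(stack) == 1:
                raise ValueError("unmatched close parenthesis")
            stack.pop()
        elif ch == "_":
            flush()
        else:
            word.append(ch)
    flush()
    if len(stack) != 1:
        raise ValueError("unmatched open parenthesis")
    return root
```

With these choices, "alice_(team_7)" parses to `["alice", ["team", "7"]]`. A convention that wanted empty components to be meaningful would need a different treatment of repeated underbars.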

Should we have periods instead of (or in addition to) underbars?  Perhaps we leave it up to the user agent how to display, and how to input, the one abstract separator character.

How do we chain multiple identifiers?  Lots of options, including comma-space, or periods.  There needs to be context about what the chaining is for.

In addition to an identifier which remains (fairly) constant, also permit additional (Unicode) text as flair, and a graphical icon.  These can change or be changed easily.

We can additionally have a purely numeric UID.

Perhaps the user agent can help distinguish similar-looking identifiers.

Many previous posts: Data as a number, With escape char but no digits, Multi word identifiers, No space.
