Thursday, June 30, 2011

[bwnjdfni] Unicode for machine block of text

We need a control character that the text renderer interprets as a suggestion: "elide the following consecutive characters until the next whitespace".  The intended application is a machine-readable string of text, for example a URL, that a human probably doesn't want to read.  The renderer can, of course, display it, or part of it (just the hostname), if it wants, perhaps with a "click to see the whole thing" option.  The elided characters get replaced by a placeholder, e.g., <…>.  Cut and paste should capture the full text.
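A minimal sketch of what such a renderer might do, in Python. No such control character exists in Unicode, so the private-use code point U+E000 stands in for it here; the names ELIDE, PLACEHOLDER, and render are likewise made up for illustration.

```python
import re

# Assumption: U+E000 (a private-use code point) stands in for the proposed
# control character, which does not actually exist in Unicode.
ELIDE = "\uE000"
PLACEHOLDER = "<…>"

# The marker followed by every non-whitespace character up to the next whitespace.
_ELIDED_RUN = re.compile(re.escape(ELIDE) + r"\S*")


def render(text: str) -> str:
    """Return the display form: each marked run collapses to one placeholder."""
    return _ELIDED_RUN.sub(PLACEHOLDER, text)


message = "see " + ELIDE + "http://example.com/a/b now"
print(render(message))  # see <…> now
```

Only the display form changes. The stored string still holds the whole URL, so a cut-and-paste implementation can copy from the stored string rather than from what is shown.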

The inspiration was Twitter, where elided characters ought not count against the character limit, eliminating the need for URL shorteners, which many others have noted are a bad idea.  But there are other applications for including machine-readable data inline with text: images, movies, programs, which the renderer interprets in an appropriate way.
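The character count could be sketched like this in Python. It assumes a marked run counts as zero characters (one could instead charge it the length of the placeholder), and it again uses the private-use code point U+E000 as a stand-in for the control character, which does not exist. The name counted_length is hypothetical.

```python
import re

# Assumption: U+E000 stands in for the proposed (nonexistent) control character.
ELIDE = "\uE000"


def counted_length(text: str) -> int:
    """Count characters, skipping the marker and the non-whitespace run after it."""
    visible = re.sub(re.escape(ELIDE) + r"\S*", "", text)
    return len(visible)


message = "see " + ELIDE + "http://example.com/a/very/long/path now"
print(counted_length(message))  # 8: "see " plus " now"
```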

A method which I'm sure already exists is, "delete previous character".  The renderer could detect a string of consecutive strikeouts and render them as a single placeholder for an elided string.

Of course, higher-level formats (e.g., HTML) can already accomplish this, but they are not so amenable to cut and paste.  (Arguably Unicode is already an excessively high-level format.)
