that 14 bytes are wasted for each character in the BMP, only store
characters of three bytes or less in the cell itself and store others
(outside the BMP or with combining characters) in a separate global
tree. Can reduce grid memory use for heavy Unicode users by around 30%.
changes to allow the status line to be entirely configured with a single
option.
Now that it is possible to configure their content, enable the existing
code that lets the status line be multiple lines in height. The status
option can now take a value of 2, 3, 4 or 5 (as well as the previous on
or off) to configure more than one line. The new status-format array
option configures the format of each line, the default just references
the existing status-* options, although some of the more obscure status
options may be eliminated in time.
Additions to the #[] syntax are: "align" to specify alignment (left,
centre, right), "list" for the window list and "range" to configure
ranges of text for the mouse bindings.
The "align" keyword can also be used to specify alignment of entries in
tree mode and the pane status lines.
information and are missing widths for relatively common Unicode
characters (so mbtowc() works, but wcwidth() fails). So if wcwidth()
returns -1, assume a width of 1 instead of ignoring the character.
of storing a full grid_cell with UTF-8 data and everything, store a new
type grid_cell_entry. This can either be the cell itself (for ASCII
cells), or an offset into an extended array (per line) for UTF-8
data.
This avoid a large (8 byte) overhead on non-UTF-8 cells (by far the
majority for most users) without the complexity of the shadow array we
had before. Grid memory without any UTF-8 is about half.
The disadvantage that cells can no longer be modified in place and need
to be copied out of the grid and back but it turned out to be lot less
complicated than I expected.
uint64_t and converting UTF-8 to Unicode on input and the reverse on
output. (This allows key bindings, there are still omissions - the
largest being that the various prompts do not accept UTF-8.)
in /usr/src/share/locale/ctype/en_US.UTF-8.src, with one single
exception: Keep U+00AD SOFT HYPHEN at width 1 rather than moving
it to width 0, a tradition already observed in the old
https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c .
While here, manually rebalance the btree for optimal lookup speed.
OK nicm@
and supports larger terminals than the older way.
If the new mouse-utf8 option is on, UTF-8 mouse input is enabled for all
UTF-8 terminals. The option defaults to on if LANG etc are set in the
same manner as the utf8 option.
With help and based on code from hsim at gmx.li.
Get rid of passing around u_char[4]s and define a struct utf8_data which has
character data, size (sequence length) and width. Move UTF-8 character
collection into two functions utf8_open/utf8_append in utf8.c which fill in
this struct and use these functions from input.c and the various functions in
screen-write.c.
Space for rather more data than is necessary for one UTF-8 sequence is in the
utf8_data struct because screen_write_copy is still nasty and needs to reinject
the character (after combining) into screen_write_cell.