Commit Graph

65 Commits

Author SHA1 Message Date
nicm
692ce59bce Do not escape $ unless DQ is set, that is the only case where we need to
escape it.
2024-05-24 12:41:24 +00:00
nicm
f09cde2542 Change UTF-8 combining to inspect the previous character at the cursor
position rather than keeping the last character from the input stream,
this is how most terminals work and fixes problems with displaying these
characters in vim. GitHub issue 3600.
2023-09-15 15:49:05 +00:00
nicm
9456258ccc Rewrite combined character handling to be more consistent and to support
newer Unicode combined characters (which we have to "know" are combined
since they are not width zero). GitHub issue 3600.
2023-09-01 14:29:11 +00:00
nicm
e79fb214f8 Another warning fix for GCC from Thomas Klausner. 2023-07-03 08:37:14 +00:00
nicm
a2a02fd7d7 Change a few types to fix warnings, from Thomas Klausner. 2023-06-30 21:55:08 +00:00
nicm
7ced0a03d2 Restore code to handle wcwidth failure so that unknown codepoints still
do the most likely right thing. GitHub issue 3427, patch based on an
diff from Jesse Luehrs in GitHub issue 3003.
2023-01-08 22:15:30 +00:00
nicm
8bd17bff49 Make U+FE0F VARIATION SELECTOR-16 change the width from 1 to 2. GitHub
issue 3409.
2022-12-16 08:19:58 +00:00
nicm
77b1290698 More accurate vi(1) word navigation in copy mode and on the status line.
This changes the meaning of the word-separators option - setting it to
the empty string is equivalent to the previous behavior. From Will Noble
in GitHub issue 2693.
2021-06-10 07:56:47 +00:00
nicm
869c0e860f Fix some warnings, GitHub issue 2382. 2020-09-16 18:37:55 +00:00
nicm
743ab5728d Fix show-buffer when run from inside tmux, GitHub issue 2314. 2020-07-21 05:24:33 +00:00
nicm
fee585ea14 Include width in error message. 2020-06-09 10:37:00 +00:00
nicm
c60389acbf It is not sensible to store pointers into an array we are going to
realloc (duh), use two trees instead.
2020-06-09 08:34:33 +00:00
nicm
a4a3d89598 Use bitshifts instead of a union for encoding UTF-8 into 32 bits, which
is more friendly to GCC3.

Reported by and ok aoyama@.
2020-06-06 12:38:32 +00:00
nicm
2a4d4bda2b Allow UTF-8 characters of width 0 to be stored, it is useful to be able
to put padding cells in as width 0.
2020-06-02 20:10:23 +00:00
nicm
7e501f1993 UTF-8 keys need to be big endian so the size bits are at the top. 2020-06-02 17:17:44 +00:00
nicm
822ee4e0a6 Fail rather than fatal on UTF-8 width 0. 2020-06-02 11:29:00 +00:00
nicm
ff6f2ff6d9 Return new character properly when converting to data. 2020-05-26 12:50:03 +00:00
nicm
6f03e49e68 Use the internal representation for UTF-8 keys instead of wchar_t and
drop some code only needed for that.
2020-05-25 18:57:24 +00:00
nicm
49ec074271 Tidy up new UTF-8 code and make it more generic. 2020-05-25 18:19:29 +00:00
nicm
bbfb44e9b2 Make some data types consistent. 2020-05-25 15:02:25 +00:00
nicm
3a5219c6d0 Instead of storing all UTF-8 characters in the extended cell which means
that 14 bytes are wasted for each character in the BMP, only store
characters of three bytes or less in the cell itself and store others
(outside the BMP or with combining characters) in a separate global
tree. Can reduce grid memory use for heavy Unicode users by around 30%.
2020-05-25 09:32:10 +00:00
nicm
1ebd8c1234 Add p format modifier for padding to width. 2019-11-25 15:04:15 +00:00
nicm
e90d4a6021 Add formats for word and line under the mouse and use them to add some
items to the pane menu.
2019-05-26 17:34:45 +00:00
nicm
f006116bac Environment variables can start with { also. 2019-05-23 18:22:13 +00:00
nicm
27bfb56ad5 Break the argument escaping code into a separate function and use it to
escape key bindings in list-keys. Also escape ~ and ; and $ properly.
2019-05-23 14:03:44 +00:00
nicm
979313832c Extend the #[] style syntax and use that together with previous format
changes to allow the status line to be entirely configured with a single
option.

Now that it is possible to configure their content, enable the existing
code that lets the status line be multiple lines in height. The status
option can now take a value of 2, 3, 4 or 5 (as well as the previous on
or off) to configure more than one line. The new status-format array
option configures the format of each line, the default just references
the existing status-* options, although some of the more obscure status
options may be eliminated in time.

Additions to the #[] syntax are: "align" to specify alignment (left,
centre, right), "list" for the window list and "range" to configure
ranges of text for the mouse bindings.

The "align" keyword can also be used to specify alignment of entries in
tree mode and the pane status lines.
2019-03-18 20:53:33 +00:00
nicm
467ece53e6 Remove unused variable. 2017-06-04 09:02:57 +00:00
nicm
8149bc3fa6 Be more strict about escape sequences that rename windows or set titles:
ignore any that not valid UTF-8 outright, and for good measure pass the
result through our UTF-8-aware vis(3).
2017-06-04 09:02:36 +00:00
nicm
248aa54bfd Style and spacing nits. 2017-05-31 17:56:48 +00:00
nicm
67d2335130 Fix a couple of argument types. 2017-03-17 14:51:41 +00:00
nicm
faa0570309 Plain stravis() because it will mangle UTF-8 characters, so add
utf8_stravis() which calls our existing utf8_strvis() and use it instead
2017-01-18 10:08:05 +00:00
nicm
8b804fb589 Support UTF-8 entry into the command prompt. 2016-10-11 07:11:40 +00:00
nicm
9892d80d6f Most of the utf8_data is fixed so simplify utf8_set to use a memcpy. 2016-05-27 22:57:27 +00:00
nicm
7abdfbe20e OpenBSD wcwidth() is sensible and complete so if it returns -1 it means
that a character is not printable, so return to ignoring such
characters.
2016-04-29 09:11:19 +00:00
nicm
23fdbc9ea6 Loads of platforms appear to have old or broken Unicode character type
information and are missing widths for relatively common Unicode
characters (so mbtowc() works, but wcwidth() fails). So if wcwidth()
returns -1, assume a width of 1 instead of ignoring the character.
2016-04-27 09:36:25 +00:00
nicm
d303e55258 Log wcwidth() and mbtowc() failure to make it easier to debug a Unicode
codepoint not appearing.
2016-04-26 07:33:36 +00:00
nicm
b8a102d26f Handle wcwidth() and mbtowc() failures in better style and drop
characters where we can't find the width (wcwidth() fails) on input, the
same as we drop invalid UTF-8. Suggested by schwarze@.
2016-03-02 15:36:02 +00:00
nicm
26945d7956 Use system wcwidth() instead of carrying around UTF-8 width tables. 2016-03-01 12:02:08 +00:00
nicm
fa64b89ad7 Whoops, need this for the previous reverse trim commit too. 2016-01-31 09:57:41 +00:00
nicm
995af0e2b7 I no longer use my SourceForge address so replace it. 2016-01-19 15:59:12 +00:00
nicm
933929cd62 Memory leaks and an uninitialized part of utf8_data, from Patrick Palka. 2015-11-20 22:02:54 +00:00
nicm
3db0d50df4 The private use area at U+E000 to U+F8FF is not very useful if it is
width 0, make it width 1 instead.
2015-11-14 12:03:23 +00:00
nicm
205d15e82d All these return values from utf8_* are confusing, use an enum. 2015-11-14 11:45:43 +00:00
nicm
f401791a56 Rename a variable in utf8_combine for consistency and use 0xfffd for
unknown Unicode.
2015-11-14 11:13:44 +00:00
nicm
64333e3ef8 Be more strict about invalid UTF-8. 2015-11-14 10:56:31 +00:00
nicm
c5689a5a40 Long overdue change to the way we store cells in the grid: now, instead
of storing a full grid_cell with UTF-8 data and everything, store a new
type grid_cell_entry. This can either be the cell itself (for ASCII
cells), or an offset into an extended array (per line) for UTF-8
data.

This avoid a large (8 byte) overhead on non-UTF-8 cells (by far the
majority for most users) without the complexity of the shadow array we
had before. Grid memory without any UTF-8 is about half.

The disadvantage that cells can no longer be modified in place and need
to be copied out of the grid and back but it turned out to be lot less
complicated than I expected.
2015-11-13 08:09:28 +00:00
nicm
e71a915412 Rename overly-long utf8data to ud throughout. 2015-11-12 22:04:37 +00:00
nicm
a209ea3953 Add utf8_padcstr and use it to align columns in list-keys. 2015-11-12 12:43:36 +00:00
nicm
d6daf37df4 Tidy utf8.c a little: build table on first use, and make utf8_width take
a u_int rather than splitting and then combining again in utf8_split.
2015-11-12 12:19:57 +00:00
nicm
c41673f3fa If we know the terminal outside tmux is not UTF-8, replace UTF-8 in
error messages and whatnot with underscores the same as we do when we
draw UTF-8 characters as part of the screen.
2015-11-12 11:10:50 +00:00