Wide-character (CJK / emoji) input handling
CJK ideographs and most emoji take two terminal cells. Combining
marks like the acute accent in é = e + ́ take zero. Naïve
String.length is wrong about both. Render and edit code that
ignores this clips characters, miscounts cursor positions, and
leaves stray cells when text shrinks.
TermFlow's prompt and multi-line widgets handle this for you out of the box. If you're building your own text widget, the rules are straightforward.
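To make the mismatch concrete, here is a minimal JDK-only sketch (no TermFlow calls) comparing UTF-16 length with code point count. The helper names `utf16Units` and `codePoints` are made up for this illustration. Neither number is a cell width or a grapheme count, which is why the primitives below exist.

```scala
// Hypothetical helpers for illustration only; not part of TermFlow.
def utf16Units(s: String): Int = s.length
def codePoints(s: String): Int = s.codePointCount(0, s.length)

val cjk    = "こんにちは"                  // 5 ideographs, each 2 cells wide
val accent = "e\u0301"                     // e + combining acute accent
val wave   = "\uD83D\uDC4B\uD83C\uDFFD"    // waving hand + skin-tone modifier

// utf16Units(cjk)    == 5   (but the text occupies 10 cells)
// utf16Units(accent) == 2   (but the user sees one character, one cell)
// utf16Units(wave)   == 4   (two surrogate pairs)
// codePoints(wave)   == 2   (still one user-visible character)
```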
Pattern
Three primitives:
| Need | Use |
|---|---|
| How wide is this char/string? | WCWidth.charWidth(c) / stringWidth(s) |
| Where's the previous grapheme boundary? | Grapheme.previousBoundary(s, idx) |
| Where's the next? | Grapheme.nextBoundary(s, idx) |
Apply them in two places:
- Cursor column math. When rendering a text input, the cursor's
  visual column is
  WCWidth.stringWidth(text.take(charIndex)), not charIndex itself.
- Backspace / Delete. Move by graphemes, not by chars or code points.
Code: cursor column
import termflow.tui.WCWidth
def cursorColumn(buffer: String, charIndex: Int): Int =
  WCWidth.stringWidth(buffer.take(charIndex))
cursorColumn("hello", 3) // 3
cursorColumn("こんにちは", 3) // 6 (3 wide chars × 2)
cursorColumn("👋🏽 hi", 5) // 3 (👋🏽 = 4 chars but 2 cells, space = 1)
Prompt.cursorColumn(state) does exactly this — see
modules/termflow-app/src/main/scala/termflow/tui/Prompt.scala.
Code: grapheme-aware Backspace
import termflow.tui.Grapheme
def backspace(buffer: String, cursor: Int): (String, Int) =
  if cursor == 0 then (buffer, 0)
  else
    val prev = Grapheme.previousBoundary(buffer, cursor)
    val without = buffer.substring(0, prev) + buffer.substring(cursor)
    (without, prev)
backspace("cafe\u0301", 5) // ("caf", 3): drops the decomposed "é" (e + U+0301) cleanly even though it's 2 chars
backspace("👋🏽hi", 4) // ("hi", 0) — drops the whole emoji+modifier
Prompt.handleKey already does this internally for Backspace,
Delete, ArrowLeft, ArrowRight. If you reach for it, you don't
have to.
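Forward Delete is the mirror image of Backspace: remove everything from the cursor up to the next grapheme boundary and leave the cursor where it is. The sketch below is an illustration, not TermFlow's implementation. So that it runs without TermFlow, it uses `java.text.BreakIterator` as a stand-in for `Grapheme.nextBoundary`; the local `nextBoundary` and `deleteForward` names are hypothetical, and the JDK iterator may segment emoji sequences differently depending on the JDK version.

```scala
import java.text.BreakIterator

// Stand-in for Grapheme.nextBoundary, built on the JDK (an assumption for
// this sketch; in a TermFlow app you would call the library function).
def nextBoundary(s: String, idx: Int): Int =
  if idx >= s.length then s.length
  else
    val it = BreakIterator.getCharacterInstance()
    it.setText(s)
    val next = it.following(idx)
    if next == BreakIterator.DONE then s.length else next

// Delete the grapheme to the right of the cursor; the cursor does not move.
def deleteForward(buffer: String, cursor: Int): (String, Int) =
  if cursor >= buffer.length then (buffer, cursor)
  else
    val next = nextBoundary(buffer, cursor)
    (buffer.substring(0, cursor) + buffer.substring(next), cursor)
```

With the cursor at index 3 of `"cafe\u0301"`, this removes both the `e` and its combining accent in one step rather than leaving a stray accent behind.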
Code: rendering wide cells in a custom widget
import termflow.tui.WCWidth
// When laying out one row of text into cells, advance by width(c):
def stringToCells(text: String, style: Style): List[RenderCell] =
  val out = List.newBuilder[RenderCell]
  text.foreach { ch =>
    WCWidth.charWidth(ch) match
      case 1 => out += RenderCell(ch, style, width = 1)
      case 2 => out += RenderCell(ch, style, width = 2) // takes two cells
      case 0 => /* combining; skip (should fold into prior cell) */
      case _ => /* control; skip */
  }
  out.result()
The RenderCell.width = 2 flag is what tells AnsiRenderer to
skip the next cell — it knows the wide glyph already covers it.
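One possible way to do the "fold into prior cell" step is sketched below. It also walks the string by code point rather than by Char, so characters outside the BMP (most emoji) are not split across surrogate halves. Everything here is an assumption for illustration: `Cell` is a simplified stand-in for RenderCell, `toCells` is a made-up name, and `toyWidth` is a deliberately crude width function, not a substitute for WCWidth.

```scala
import scala.collection.mutable.ListBuffer

// Simplified stand-in for RenderCell (hypothetical; holds text, not one Char).
final case class Cell(text: String, width: Int)

// Toy width function for the example only: controls -1, combining marks 0,
// everything from U+1100 upward 2, the rest 1. Real code should use WCWidth.
def toyWidth(cp: Int): Int =
  if Character.isISOControl(cp) then -1
  else if Character.getType(cp) == Character.NON_SPACING_MARK.toInt then 0
  else if cp >= 0x1100 then 2
  else 1

def toCells(text: String, widthOf: Int => Int): List[Cell] =
  val out = ListBuffer.empty[Cell]
  var i = 0
  while i < text.length do
    val cp = text.codePointAt(i)              // full code point, not a Char
    val s  = new String(Character.toChars(cp))
    val w  = widthOf(cp)
    if w == 0 && out.nonEmpty then
      // Zero-width mark: append it to the previous cell's text.
      val prev = out.remove(out.length - 1)
      out += prev.copy(text = prev.text + s)
    else if w > 0 then
      out += Cell(s, w)
    // Controls (w < 0) and a leading zero-width mark are dropped.
    i += Character.charCount(cp)
  out.toList
```

Under these assumptions, `toCells("e\u0301x", toyWidth)` yields two cells, with the accent carried inside the first one instead of being discarded.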
Notes
- Don't mix String.length with cell math. They aren't the same. If you find yourself comparing one to the other, you almost certainly have a bug.
- Grapheme.count(s) returns the number of user-visible characters, not cells. It's the right thing for "how many characters has the user typed?", which is rarely what you actually want for layout.
- NO_COLOR and LANG=C. Some terminals are configured to refuse Unicode. The Capabilities.unicode flag tells you whether to use ASCII fallbacks (Theme already does this for box-drawing glyphs).
- Combining sequences. Grapheme.previousBoundary walks across full graphemes including ZWJ sequences (👨‍👩‍👧‍👦), skin-tone modifiers (👋🏽), and regional indicators (🇯🇵🇺🇸).
For a working example, see the showcase's Inputs tab — type CJK
and emoji into the MultiLineInput and watch the cursor track
correctly.