2.6. uniseg.wrap
— Text Wrapping
Unicode-aware text wrapping.
- class uniseg.wrap.Formatter
The abstract base class for formatters invoked by a Wrapper object.
This class exists only for convenience and does nothing itself. You don’t have to design your own formatter as a subclass of it, although subclassing is not deprecated either.
Your formatters should have the methods and properties this class has. They are invoked by a Wrapper object to determine the logical widths of texts and to give you ways to handle them, such as rendering them.
- handle_new_line() None
The handler method which is invoked when the current line is over and a new line begins
- handle_text(text: str, extents: List[int]) None
The handler method which is invoked when text should be put on the current position with extents.
- reset() None
Reset all states of the formatter.
- property tab_width: int
The logical width of tab forwarding.
This property value is used by a Wrapper object to determine the actual forwarding extents of tabs at each position.
- text_extents(s: str, /) List[int]
Return a list of logical lengths from the start of the string to each character in s.
- property wrap_width: int | None
The logical width of text wrapping.
Note that returning None (which is the default) means “do not wrap” while returning 0 means “wrap as narrowly as possible.”
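Because a formatter only needs to expose the methods and properties listed above, a duck-typed class is enough. The following is a minimal sketch under stated assumptions, not part of uniseg: the class name CollectingFormatter and its lines() helper are hypothetical, every code point is assumed to be one column wide, and the handlers are driven by hand rather than by a Wrapper object.

```python
from typing import List, Optional


class CollectingFormatter:
    """Hypothetical formatter that gathers handled text into a list of lines."""

    def __init__(self, wrap_width: Optional[int] = None, tab_width: int = 8) -> None:
        self._wrap_width = wrap_width
        self._tab_width = tab_width
        self.reset()

    @property
    def wrap_width(self) -> Optional[int]:
        # None means "do not wrap"; 0 means "wrap as narrowly as possible".
        return self._wrap_width

    @property
    def tab_width(self) -> int:
        return self._tab_width

    def reset(self) -> None:
        # Start over with a single empty line.
        self._lines: List[str] = [""]

    def text_extents(self, s: str, /) -> List[int]:
        # Simplifying assumption: each code point occupies one column.
        return list(range(1, len(s) + 1))

    def handle_text(self, text: str, extents: List[int]) -> None:
        # Append the text to the line currently being built.
        self._lines[-1] += text

    def handle_new_line(self) -> None:
        # Close the current line and open a fresh one.
        self._lines.append("")

    def lines(self) -> List[str]:
        # Hypothetical helper: return a copy of the collected lines.
        return list(self._lines)


# Drive the handlers by hand, in the order a wrapping engine might call them.
fmt = CollectingFormatter(wrap_width=5)
fmt.handle_text("hello", fmt.text_extents("hello"))
fmt.handle_new_line()
fmt.handle_text("world", fmt.text_extents("world"))
wrapped = fmt.lines()
```

In real use, an instance like this would be passed to wrap() instead of being called directly; the manual calls here only show what each handler is responsible for.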
- class uniseg.wrap.TTFormatter(*, wrap_width: int, tab_width: int = 8, tab_char: str = ' ', ambiguous_as_wide: bool = False)
A fixed-width text wrapping formatter.
- property ambiguous_as_wide: bool
Treat code points whose East_Asian_Width property is ‘A’ as those with ‘W’, that is, as having double the width of alphanumerics
- handle_new_line() None
The handler which is invoked when the current line is over and a new line begins
- handle_text(text: str, extents: Sequence[int], /) None
The handler which is invoked when a text should be put on the current position
- lines() Iterator[str]
Iterate over every wrapped line string
- reset() None
Reset all states of the formatter
- property tab_char: str
Character to fill tab spaces with
- property tab_width: int
Forwarding size of tabs
- text_extents(s: str, /) List[int]
Return a list of logical lengths from the start of the string to each character in s
- property wrap_width: int
Wrapping width
- class uniseg.wrap.Wrapper
Text wrapping engine.
Usually, you don’t need to create an instance of the class directly. Use wrap() instead.
- wrap(formatter: Formatter, s: str, cur: int = 0, offset: int = 0, *, char_wrap: bool = False) int
Wrap string s with formatter and invoke its handlers.
The optional argument cur is the starting position of the string in logical length, and offset is the left-side offset of the wrapping area in logical length. For now, offset is only used for calculating tab-stop positions.
If char_wrap is set to True, the text will be wrapped at its grapheme cluster boundaries instead of its line break boundaries. This may be helpful when you don’t want the word wrapping feature in your application.
This function returns the total count of wrapped lines.
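To illustrate how cur and offset could feed into tab-stop positions, here is a small sketch. It assumes the conventional rule that tab stops sit at every multiple of tab_width, counted from the left edge shifted by offset; the text above does not spell out the exact formula, so treat the function sketch_tab_extent as a hypothetical illustration rather than the library's implementation.

```python
def sketch_tab_extent(cur: int, offset: int = 0, tab_width: int = 8) -> int:
    """Columns a tab would advance to reach the next tab stop (assumed rule)."""
    if tab_width <= 0:
        raise ValueError("tab_width must be positive")
    # Distance from the absolute column (offset + cur) to the next multiple
    # of tab_width; a position already on a stop advances a full tab.
    return tab_width - (offset + cur) % tab_width


advances = [
    sketch_tab_extent(0),            # on a tab stop: advance a full tab
    sketch_tab_extent(3),            # mid-cell: advance up to column 8
    sketch_tab_extent(3, offset=2),  # the offset shifts the absolute column
]
```

Under this assumed rule, the same cur can yield different tab extents depending on offset, which is why the wrapping area's left-side offset matters even though it does not change the wrap width.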
- uniseg.wrap.tt_text_extents(s: str, *, ambiguous_as_wide: bool = False) List[int]
Return a list of logical widths from the start of s to each of its characters (not code points) in fixed-width typography
>>> tt_text_extents('')
[]
>>> tt_text_extents('abc')
[1, 2, 3]
>>> tt_text_extents('\u3042\u3044\u3046')
[2, 4, 6]
>>> import sys
>>> s = '\U00029e3d'   # test a code point out of BMP
>>> actual = tt_text_extents(s)
>>> expect = [2] if sys.maxunicode > 0xffff else [2, 2]
>>> len(s) == len(expect)
True
>>> actual == expect
True
The meaning of ambiguous_as_wide is the same as that of tt_width().
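A list of cumulative extents makes it cheap to ask how much of a string fits in a given width. The sketch below is a stdlib-only approximation, not uniseg's code: sketch_text_extents and fit_count are hypothetical names, and the sketch treats each code point as one character instead of segmenting grapheme clusters.

```python
import unicodedata
from bisect import bisect_right
from typing import List


def sketch_text_extents(s: str, ambiguous_as_wide: bool = False) -> List[int]:
    """Cumulative column widths, one entry per code point (simplified)."""
    wide = ("W", "F", "A") if ambiguous_as_wide else ("W", "F")
    extents: List[int] = []
    total = 0
    for ch in s:
        # Wide and fullwidth code points take two columns, the rest one.
        total += 2 if unicodedata.east_asian_width(ch) in wide else 1
        extents.append(total)
    return extents


def fit_count(extents: List[int], width: int) -> int:
    """Number of leading characters whose cumulative extent is <= width."""
    return bisect_right(extents, width)


hiragana = sketch_text_extents("\u3042\u3044\u3046")
fits_in_five = fit_count(hiragana, 5)
```

Because the extents are non-decreasing, a binary search is enough to find the last character that still fits, which is one way a wrapping engine could use the values returned by a formatter.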
- uniseg.wrap.tt_width(s: str, index: int = 0, ambiguous_as_wide: bool = False) Literal[1, 2]
Return the logical width of the grapheme cluster at s[index] in fixed-width typography
The return value will be 1 (halfwidth) or 2 (fullwidth).
Generally, the width of a grapheme cluster is determined by its leading code point.
>>> tt_width('A')
1
>>> tt_width('\u8240')     # U+8240: CJK UNIFIED IDEOGRAPH-8240
2
>>> tt_width('g\u0308')    # U+0308: COMBINING DIAERESIS
1
>>> tt_width('\U00029e3d') # U+29E3D: CJK UNIFIED IDEOGRAPH-29E3D
2
If ambiguous_as_wide is set to True, some characters such as Greek letters are treated as fullwidth, just as ideographs are.
>>> tt_width('\u03b1') # U+03B1: GREEK SMALL LETTER ALPHA
1
>>> tt_width('\u03b1', ambiguous_as_wide=True)
2
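The effect of ambiguous_as_wide can be reproduced approximately with the standard library's East_Asian_Width data. This is a sketch for a single code point only, assuming that ‘W’ and ‘F’ count as fullwidth; the function name sketch_width is hypothetical and the code does not handle grapheme clusters the way tt_width() does.

```python
import unicodedata


def sketch_width(ch: str, ambiguous_as_wide: bool = False) -> int:
    """Return 1 or 2 for a single code point, based on East_Asian_Width."""
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        # Wide and fullwidth code points always take two columns.
        return 2
    if eaw == "A" and ambiguous_as_wide:
        # Ambiguous code points are promoted only when requested.
        return 2
    return 1


widths = [
    sketch_width("A"),                               # Latin capital letter
    sketch_width("\u8240"),                          # CJK unified ideograph
    sketch_width("\u03b1"),                          # Greek small letter alpha
    sketch_width("\u03b1", ambiguous_as_wide=True),  # alpha, treated as wide
]
```

The flag matters mostly for text shown in East Asian terminal or font environments, where ambiguous characters are often rendered two columns wide.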
- uniseg.wrap.tt_wrap(s: str, wrap_width: int, /, *, tab_width: int = 8, tab_char: str = ' ', ambiguous_as_wide: bool = False, cur: int = 0, offset: int = 0, char_wrap: bool = False) Iterator[str]
Wrap s with the given parameters and return an iterator over the wrapped lines
See TTFormatter for wrap_width, tab_width and tab_char, and Wrapper.wrap() for cur, offset and char_wrap.
- uniseg.wrap.wrap(formatter: Formatter, s: str, cur: int = 0, offset: int = 0, *, char_wrap: bool = False) int
Wrap string s with formatter using the module’s static Wrapper instance.
See Wrapper.wrap() for further details of the parameters.
Changed in version 0.7.1: It returns the count of lines now.