3.8. uniseg.wrap — Text Wrapping
Wrap text based on the Unicode line breaking algorithm.
- class uniseg.wrap.Formatter(*args, **kwargs)
Protocol methods and properties for formatters invoked by the Wrapper instance.
Your formatter should have the same methods and properties as this class. The Wrapper instance invokes them to determine the logical widths of texts and to let you handle those texts, for example to render them.
- handle_new_line() None
Handler method which is invoked when a new line begins.
- handle_text(text: str, extents: list[int], /) None
Handler method which is invoked when the text should be put at the current position with the given extents.
- property tab_width: int
Logical width of tab forwarding.
The Wrapper instance uses this value to determine the actual forwarding extent of a tab at each position.
- text_extents(s: str, /) list[int]
Return a list of logical lengths from the start of the string to the end of each code point in s.
- property wrap_width: int | None
Logical width of text wrapping.
Note that returning None (which is the default) means “do not wrap” while returning 0 means “wrap as narrowly as possible.”
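As a minimal sketch of the protocol above, the hypothetical ListFormatter class below (not part of uniseg) provides the same methods and properties using only the standard library. It assumes one column per code point. The handlers are called by hand here to stand in for the Wrapper instance, so the order in which a real Wrapper invokes them is an assumption.

```python
class ListFormatter:
    """Hypothetical formatter that collects wrapped lines into a list."""

    def __init__(self, wrap_width=None, tab_width=8):
        self._wrap_width = wrap_width
        self._tab_width = tab_width
        self.lines = ['']  # start with one empty line to append text to

    @property
    def wrap_width(self):
        # None means "do not wrap"; 0 means "wrap as narrowly as possible".
        return self._wrap_width

    @property
    def tab_width(self):
        return self._tab_width

    def text_extents(self, s, /):
        # Assumption of this sketch: every code point is one column wide.
        return list(range(1, len(s) + 1))

    def handle_text(self, text, extents, /):
        # Put the text at the current position: append to the current line.
        self.lines[-1] += text

    def handle_new_line(self):
        # A new line begins: start collecting into a fresh string.
        self.lines.append('')


# Drive the handlers manually, as a stand-in for a wrapping engine.
fmt = ListFormatter(wrap_width=10)
fmt.handle_text('hello ', fmt.text_extents('hello '))
fmt.handle_new_line()
fmt.handle_text('world', fmt.text_extents('world'))
print(fmt.lines)
```

A formatter for proportional fonts would differ mainly in text_extents, which would return measured widths instead of code point counts.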
- class uniseg.wrap.TTFormatter(width: int, *, tab_width: int = 8, tab_char: str = ' ', ambiguous_as_wide: bool = False)
Fixed-width text wrapping formatter.
- property ambiguous_as_wide: bool
Treat code points whose East_Asian_Width property is ‘A’ (ambiguous) like those with ‘W’ (wide), that is, as twice the width of alphanumerics.
- handle_text(text: str, extents: Sequence[int], /) None
Handler which is invoked when a text should be put on the current position.
- lines() Iterator[str]
Iterate over the wrapped line strings.
- property tab_char: str
Character to fill tab spaces with.
- property tab_width: int
Forwarding size of tabs.
- text_extents(s: str, /) list[int]
Return a list of logical lengths from the start of the string to the end of each code point for s.
- property wrap_width: int
Wrapping width.
- class uniseg.wrap.Wrapper
Text wrapping engine.
Usually, you don’t need to create an instance of the class directly. Use wrap() instead.
- wrap(formatter: Formatter, s: str, /, cur: int = 0, offset: int = 0, *, char_wrap: bool = False, tailor: Callable[[str, Iterable[Literal[0, 1]]], Iterable[Literal[0, 1]]] | None = None) int
Wrap string s with formatter and invoke its handlers.
The optional argument cur is the starting position of the string in logical length, and offset is the left-side offset of the wrapping area in logical length. For now, offset is only used for calculating tab-stop positions.
If char_wrap is set to True, the text will be wrapped at its grapheme cluster boundaries instead of its line break boundaries. This may be helpful when you don’t want the word wrapping feature in your application.
The method returns the total count of wrapped lines.
- uniseg.wrap.tt_text_extents(s: str, /, *, ambiguous_as_wide: bool = False) list[int]
Return a list of logical lengths from the start of the string to the end of each code point for s.
>>> tt_text_extents('abc')
[1, 2, 3]
>>> tt_text_extents('あいう')
[2, 4, 6]
>>> tt_text_extents('𩸽') # test a code point out of BMP
[2]
Calling with an empty string will return an empty list:
>>> tt_text_extents('')
[]
The meaning of ambiguous_as_wide is the same as that of tt_width():
>>> tt_text_extents('αβ')
[1, 2]
>>> tt_text_extents('αβ', ambiguous_as_wide=True)
[2, 4]
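The cumulative extents described above can be approximated with the standard library alone. The sketch below is not uniseg's implementation: sketch_width and sketch_text_extents are hypothetical names, widths come from unicodedata.east_asian_width, and combining marks are given zero width as a simplifying assumption, whereas uniseg measures grapheme clusters.

```python
import unicodedata


def sketch_width(ch, ambiguous_as_wide=False):
    """Approximate column width of a single code point (hypothetical helper)."""
    if unicodedata.combining(ch):
        # Assumption of this sketch: combining marks occupy no column.
        return 0
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ('W', 'F'):
        return 2  # wide or fullwidth
    if eaw == 'A' and ambiguous_as_wide:
        return 2  # ambiguous, treated as wide on request
    return 1


def sketch_text_extents(s, ambiguous_as_wide=False):
    """Running total of widths, one entry per code point in s."""
    extents = []
    total = 0
    for ch in s:
        total += sketch_width(ch, ambiguous_as_wide)
        extents.append(total)
    return extents


print(sketch_text_extents('abc'))
print(sketch_text_extents('あいう'))
print(sketch_text_extents('αβ', ambiguous_as_wide=True))
```

For the inputs shown in the doctests above, this sketch produces the same lists; it may differ from the library for strings with more complex grapheme clusters.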
- uniseg.wrap.tt_width(s: str, /, index: int = 0, *, ambiguous_as_wide: bool = False) int
Return the logical width of the grapheme cluster at s[index] in fixed-width typography.
The return value will be 1 (halfwidth) or 2 (fullwidth).
Generally, the width of a grapheme cluster is determined by its leading code point.
>>> tt_width('A')
1
>>> tt_width('\u8240') # U+8240: CJK UNIFIED IDEOGRAPH-8240
2
>>> tt_width('g\u0308') # U+0308: COMBINING DIAERESIS
1
>>> tt_width('\U00029e3d') # U+29E3D: CJK UNIFIED IDEOGRAPH-29E3D
2
If ambiguous_as_wide is set to True, some characters such as Greek letters are treated as fullwidth, like ideographs.
>>> tt_width('α') # U+03B1: GREEK SMALL LETTER ALPHA
1
>>> tt_width('α', ambiguous_as_wide=True)
2
- uniseg.wrap.tt_wrap(s: str, /, wrap_width: int, *, tab_width: int = 8, tab_char: str = ' ', ambiguous_as_wide: bool = False, cur: int = 0, offset: int = 0, char_wrap: bool = False, tailor: Callable[[str, Iterable[Literal[0, 1]]], Iterable[Literal[0, 1]]] | None = None) Iterator[str]
Wrap string s based on a fixed-width typography algorithm and return an iterator over the wrapped lines.
>>> s1 = 'A quick brown fox jumped over the lazy dog.'
>>> list(tt_wrap(s1, 24))
['A quick brown fox ', 'jumped over the lazy ', 'dog.']
>>> s2 = '和歌は、人の心を種として、万の言の葉とぞなれりける。'
>>> list(tt_wrap(s2, 24))
['和歌は、人の心を種とし', 'て、万の言の葉とぞなれり', 'ける。']
If a word is longer than wrap_width, it is not split; every line keeps at least one word:
>>> list(tt_wrap('supercalifragilisticexpialidocious', 24))
['supercalifragilisticexpialidocious']
>>> list(tt_wrap('wrap supercalifragilisticexpialidocious long words', 24))
['wrap ', 'supercalifragilisticexpialidocious ', 'long words']
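The behavior above, where trailing spaces stay on the line and an overlong word is kept whole, can be illustrated with a simplified greedy wrapper. This is only a sketch under stated assumptions: greedy_wrap is a hypothetical function, it breaks at ASCII whitespace only (not at Unicode line break boundaries), and it counts one column per character.

```python
import re


def greedy_wrap(s, width):
    """Greedy wrap at whitespace; a simplified stand-in, not uniseg's algorithm."""
    # Each chunk is a word together with the whitespace that follows it.
    chunks = re.findall(r'\S+\s*|\s+', s)
    lines = []
    line = ''
    cur = 0  # current column on the line being built
    for chunk in chunks:
        # Start a new line only if the current one already holds something,
        # so an overlong word still ends up on a line of its own.
        if line and cur + len(chunk) > width:
            lines.append(line)
            line = ''
            cur = 0
        line += chunk
        cur += len(chunk)
    if line:
        lines.append(line)
    return lines


print(greedy_wrap('A quick brown fox jumped over the lazy dog.', 24))
print(greedy_wrap('wrap supercalifragilisticexpialidocious long words', 24))
```

Because a chunk includes its trailing space, 'jumped ' counts as 7 columns and no longer fits after 'A quick brown fox ', which is consistent with the first doctest above.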
Tab options:
>>> s3 = 'A\tquick\tbrown fox jumped\tover\tthe lazy dog.'
>>> print('\n'.join(s.rstrip() for s in tt_wrap(s3, 32)))
A       quick   brown fox
jumped  over    the lazy dog.
>>> print('\n'.join(s.rstrip() for s in tt_wrap(s3, 32, tab_width=10)))
A         quick     brown fox
jumped    over      the lazy
dog.
>>> print('\n'.join(s.rstrip() for s in tt_wrap(s3, 32, tab_char='+')))
A+++++++quick+++brown fox
jumped++over++++the lazy dog.
(We use s.rstrip() for every line because trailing spaces will be removed in the docstring here while every wrapped line returned may keep them.)
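The tab expansion seen in the tab_char='+' output can be reproduced with simple tab-stop arithmetic. The sketch below is an assumption about how such forwarding might be computed, not uniseg's code: tab_fill and expand_tabs are hypothetical names, every character counts as one column, and offset is assumed to shift the column used to find the next tab stop.

```python
def tab_fill(cur, tab_width=8, offset=0):
    """Number of columns from column cur to the next tab stop (assumed model)."""
    return tab_width - (offset + cur) % tab_width


def expand_tabs(line, tab_width=8, tab_char=' ', cur=0, offset=0):
    """Replace each tab in line with fill characters up to the next tab stop."""
    out = []
    for ch in line:
        if ch == '\t':
            n = tab_fill(cur, tab_width, offset)
            out.append(tab_char * n)
            cur += n
        else:
            out.append(ch)
            cur += 1  # assumption: one column per character
    return ''.join(out)


print(expand_tabs('A\tquick\tbrown fox', tab_char='+'))
print(expand_tabs('jumped\tover\tthe lazy dog.', tab_char='+'))
```

With the default tab width of 8, the tab after 'A' fills 7 columns and the tab after 'quick' fills 3, matching the runs of '+' in the doctest output.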
An option for treating code points whose East_Asian_Width property is ‘A’ (ambiguous):
>>> s4 = 'μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος'
>>> list(tt_wrap(s4, 24, ambiguous_as_wide=True))
['μῆνιν ἄειδε ', 'θεὰ Πηληϊάδεω ', 'Ἀχιλῆος']
>>> list(tt_wrap(s4, 24, ambiguous_as_wide=False))
['μῆνιν ἄειδε θεὰ ', 'Πηληϊάδεω Ἀχιλῆος']
The cur option controls the indentation of the first line of the result:
>>> print('*** ' + '\n'.join(s.rstrip() for s in tt_wrap(s3, 32, cur=4)))
*** A   quick   brown fox
jumped  over    the lazy dog.
The offset affects indent level for every line:
>>> print('\n'.join(('||' + s.rstrip()) for s in tt_wrap(s3, 32, offset=2)))
||A     quick   brown fox
||jumped        over    the lazy
||dog.
- uniseg.wrap.wrap(formatter: Formatter, s: str, /, cur: int = 0, offset: int = 0, *, char_wrap: bool = False, tailor: Callable[[str, Iterable[Literal[0, 1]]], Iterable[Literal[0, 1]]] | None = None) int
Wrap string s with formatter using the module’s static Wrapper instance.
See Wrapper.wrap() for further details of the parameters.