Intelligently pretty-print HTML/XML with inline tags.
While I love Beautiful Soup as a parser, BeautifulSoup.prettify()
adds a linebreak between every tag.
This results in unwanted white space between tags that should be inline, like <sup>
, <a>
, <span>
, etc:
<p>Introducing GitHub<sup>®</sup></p>
Introducing GitHub®
vs.
<p>
Introducing GitHub
<sup>
®
</sup>
</p>
Introducing GitHub ®
This module parses HTML/XML as a raw string to more intelligently format tags.
You have two options:
pip install prettierfier
in your command lineThis module is built with just the Python Standard Library and contains no external third-party dependencies.
prettify_xml(xml_string, indent=2, debug=False)
Args:
xml_string (str): XML text to prettify.
indent (int, optional): Set size of XML tag indents.
Test-only args:
debug (bool, optional): Show results of each regexp application.
Returns:
str: Prettified XML.
prettify_html(html_string, debug=False)
BeautifulSoup.prettify()
output. Args:
html_string (str): HTML string to parse.
Test-only args:
debug (bool, optional): Show results of each regexp application.
Returns:
str: Prettified HTML.
import prettierfier
ugly_html = """<p>
Introducing GitHub
<sup>
®
</sup>
</p>"""
pretty_html = prettierfier.prettify_html(ugly_html)
print(pretty_html)
# Output
>>> <p>Introducing GitHub<sup>®</sup></p>