Dompa

A zero-dependency HTML5 document parser. It takes an HTML string, parses it into a node tree, and provides an API for querying and manipulating that tree.

Install

pip install dompa

Requires Python 3.10 or higher.

Usage

The most basic usage looks like this:

from dompa import Dompa
from dompa.actions import ToHtml

dom = Dompa("<div>Hello, World</div>")

# Get the tree of nodes
nodes = dom.get_nodes()

# Turn the node tree into HTML
html = dom.action(ToHtml)

DOM manipulation

You can query the node tree to retrieve nodes, or traverse it to modify them.

`query`

You can find nodes with the query method. It takes a Callable that receives a Node and must return a boolean: True if the node matches, False otherwise. For example:

from dompa import Dompa

dom = Dompa("<h1>Site Title</h1><ul><li>...</li><li>...</li></ul>")
list_items = dom.query(lambda n: n.name == "li")

All nodes returned by query are deep copies, so mutating them has no effect on Dompa's state.
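
To make the deep-copy behaviour concrete, here is a minimal standalone sketch. It does not use Dompa: SketchNode and sketch_query are hypothetical stand-ins that only model the contract described above, and the depth-first search order is an assumption.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Hypothetical stand-in for an element node; not Dompa's Node class."""
    name: str
    children: list = field(default_factory=list)


def sketch_query(nodes, predicate):
    """Collect deep copies of every node, at any depth, that matches predicate."""
    matches = []
    for node in nodes:
        if predicate(node):
            matches.append(copy.deepcopy(node))
        matches.extend(sketch_query(node.children, predicate))
    return matches


tree = [SketchNode("h1"), SketchNode("ul", [SketchNode("li"), SketchNode("li")])]
items = sketch_query(tree, lambda n: n.name == "li")

# Mutating a result only changes the copy, never the original tree.
items[0].name = "changed"
original_names = [child.name for child in tree[1].children]
```

After the mutation, original_names still contains the two original li names, which is the property the real query method is documented to have.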

`traverse`

The traverse method is similar to query, but instead of returning deep copies it gives you direct references to the nodes, which makes it the right tool for updating the node tree inside Dompa. It takes a Callable that receives a Node and must return the updated node, like so:

from typing import Optional
from dompa import Dompa
from dompa.nodes import Node, TextNode

dom = Dompa("<h1>Site Title</h1><ul><li>...</li><li>...</li></ul>")


def update_title(node: Node) -> Optional[Node]:
    if node.name == "h1":
        node.children = [TextNode(value="New Title")]

    return node


dom.traverse(update_title)

To remove a node, return None instead of the node. To replace a single node with multiple nodes, return a FragmentNode.
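
The three possible return values (the node, None, or a fragment) can be modelled with a short standalone sketch. It does not use Dompa: SketchNode, SketchFragment and apply_callback are hypothetical names, and the details (the callback runs before descending into children, and nodes supplied by a fragment are not visited again) are assumptions rather than documented behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Hypothetical stand-in for an element node; not Dompa's Node class."""
    name: str
    children: list = field(default_factory=list)


@dataclass
class SketchFragment:
    """Hypothetical stand-in for FragmentNode: only its children survive."""
    children: list = field(default_factory=list)


def apply_callback(nodes, callback):
    """Rebuild a list of nodes according to what the callback returns."""
    result = []
    for node in nodes:
        updated = callback(node)
        if updated is None:
            continue  # returning None removes the node
        if isinstance(updated, SketchFragment):
            result.extend(updated.children)  # a fragment is replaced by its children
            continue
        updated.children = apply_callback(updated.children, callback)
        result.append(updated)
    return result


def rewrite(node):
    if node.name == "aside":
        return None  # drop the node
    if node.name == "section":
        return SketchFragment(children=[SketchNode("h2"), SketchNode("p")])
    return node  # keep everything else as it is


tree = [SketchNode("h1"), SketchNode("aside"), SketchNode("section")]
names = [node.name for node in apply_callback(tree, rewrite)]
```

Here names ends up as h1, h2, p: the aside is gone and the section has been replaced by two sibling nodes.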

Types of nodes

There are four types of nodes that you can use in Dompa to manipulate the node tree.

`Node`

The most common node type is Node. Use it for any element that may have children.

from dompa.nodes import Node

Node(name="name-goes-here", attributes={}, children=[])

Would render:

<name-goes-here></name-goes-here>

`VoidNode`

A void node (a void element in the HTML standard) is self-closing, so it cannot have children.

from dompa.nodes import VoidNode

VoidNode(name="name-goes-here", attributes={})

Would render:

<name-goes-here>

Use this to create elements such as img, input and br. You can also create custom elements, since Dompa does not restrict you to known element names.

`TextNode`

A text node only renders text. It has no tag of its own and cannot have attributes or children.

from dompa.nodes import TextNode

TextNode(value="Hello, World!")

Would render:

Hello, World!

`FragmentNode`

A fragment node is a node whose children replace the node itself. It is transient in the sense that it never appears in the output. Use it inside a traverse callback to replace a single node with multiple nodes on the same level.

from dompa.nodes import TextNode, FragmentNode, Node

FragmentNode(children=[
    Node(name="h2", children=[TextNode(value="Hello, World!")]),
    Node(name="p", children=[TextNode(value="Some content ...")])
])

Would render:

<h2>Hello, World!</h2>
<p>Some content ...</p>
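
The rendering rules for the four node types can be summarized in a standalone sketch. It does not use Dompa: the Sketch* classes and the render function are hypothetical, attributes are left out for brevity, and the absence of whitespace between rendered nodes is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Element with an opening tag, children and a closing tag."""
    name: str
    children: list = field(default_factory=list)


@dataclass
class SketchVoid:
    """Element with an opening tag only."""
    name: str


@dataclass
class SketchText:
    """Plain text without a tag."""
    value: str


@dataclass
class SketchFragment:
    """Transient container: renders as its children and nothing else."""
    children: list = field(default_factory=list)


def render(node) -> str:
    if isinstance(node, SketchText):
        return node.value
    if isinstance(node, SketchVoid):
        return f"<{node.name}>"
    inner = "".join(render(child) for child in node.children)
    if isinstance(node, SketchFragment):
        return inner  # the fragment itself leaves no trace
    return f"<{node.name}>{inner}</{node.name}>"


html = render(SketchFragment(children=[
    SketchNode("h2", [SketchText("Hello, World!")]),
    SketchVoid("br"),
    SketchNode("p", [SketchText("Some content ...")]),
]))
```

The resulting string contains the h2, br and p elements side by side, with no wrapper element for the fragment.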

Actions

Both Dompa and its nodes support actions: a way to extend built-in functionality, for example to convert the node tree into some desired result or to manipulate inner state. Use your imagination.

Dompa Actions

You can create a Dompa action by subclassing the abstract class dompa.DompaAction, for example:

from dompa import Dompa, DompaAction

class MyAction(DompaAction):
    def __init__(self, instance: Dompa):
        self.instance = instance

    def make(self):
        pass

An action receives an instance of the Dompa class and has a make method that does something with it.
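
As a rough mental model of the action pattern, the following standalone sketch shows one way such a dispatch could work. It does not use Dompa: SketchAction, SketchDocument and CountNodes are hypothetical, and the idea that action instantiates the class with the instance and returns the result of make is an assumption based on the usage shown in this document.

```python
from abc import ABC, abstractmethod


class SketchAction(ABC):
    """Hypothetical stand-in for an abstract action base class."""

    def __init__(self, instance):
        self.instance = instance

    @abstractmethod
    def make(self):
        """Do something with self.instance and return the result."""


class SketchDocument:
    """Hypothetical stand-in for the object that actions operate on."""

    def __init__(self, nodes):
        self.nodes = nodes

    def action(self, action_class):
        # Assumed dispatch: build the action around this instance, then run it.
        return action_class(self).make()


class CountNodes(SketchAction):
    """Example action: report how many top-level nodes there are."""

    def make(self):
        return len(self.instance.nodes)


count = SketchDocument(["h1", "p"]).action(CountNodes)
```

Passing the class rather than an instance keeps the call site short, and the action can reach any state it needs through self.instance.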

`ToHtml`

To convert the Dompa node tree into an HTML string, you can make use of the ToHtml action.

from dompa import Dompa
from dompa.actions import ToHtml

template = Dompa("<h1>Hello World</h1>")
html = template.action(ToHtml)

Node Actions

Node actions are almost identical to Dompa actions, except that they live in a different namespace and only work on the Node class (and its child classes). You can create a Node action by subclassing the abstract class dompa.nodes.NodeAction, like so:

from dompa.nodes import Node, NodeAction

class MyAction(NodeAction):
    def __init__(self, instance: Node):
        self.instance = instance

    def make(self):
        pass

Like a DompaAction, a NodeAction receives an instance, in this case of the Node class, and has a make method that does something with it.

`ToHtml`

To convert a Node into an HTML string, you can make use of the ToHtml action.

from dompa import Dompa
from dompa.nodes.actions import ToHtml

template = Dompa("<h1>Hello World</h1>")
h1_node = template.query(lambda x: x.name == "h1")[0]
html = h1_node.action(ToHtml)
