lib.sanitize
See also
Registry-aware entry: SanitizeTransform in
Docutils components.
Doctree sanitization for Django Docutils web-facing rendering.
These helpers strip HTML-unsafe nodes and attributes from a docutils
document before it is written to HTML. SanitizeTransform exposes the
same pass as a docutils transform so it can run inside a custom writer
pipeline; sanitize_doctree() is the reusable entry point.
C0 / DEL control characters that disqualify a URI outright.
django_docutils.lib.sanitize._uri_is_allowed(uri, allowed_uri_schemes)
Return whether a URI can be emitted into HTML attributes.
Control characters are rejected before parsing: scheme-invalid bytes such as a vertical tab make urlsplit report an empty scheme, which would otherwise pass as a relative link. URIs that urlsplit refuses to parse (e.g. malformed IPv6 brackets) are treated as disallowed.
Examples
>>> _uri_is_allowed("https://example.com", frozenset({"https"}))
True
>>> _uri_is_allowed("#section", frozenset())
True
>>> _uri_is_allowed("javascript:alert(1)", frozenset({"https"}))
False
>>> _uri_is_allowed("java\x0bscript:alert(1)", frozenset({"https"}))
False
>>> _uri_is_allowed("http://[::1", frozenset({"http"}))
False
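The check described above can be approximated with the standard library alone. The following is a minimal sketch, not the module's actual implementation: the name `uri_is_allowed_sketch` and the `_CONTROL_CHARS` pattern are hypothetical, and the pattern assumes "C0 / DEL" means the code points U+0000 to U+001F plus U+007F.

```python
import re
from urllib.parse import urlsplit

# Assumed stand-in for the module's control-character pattern:
# the C0 range (U+0000-U+001F) plus DEL (U+007F).
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def uri_is_allowed_sketch(uri, allowed_uri_schemes):
    """Hypothetical re-creation of the documented URI policy."""
    # Reject control characters first: urlsplit would otherwise report an
    # empty scheme for "java\x0bscript:..." and it would look relative.
    if _CONTROL_CHARS.search(uri):
        return False
    try:
        scheme = urlsplit(uri).scheme  # urlsplit lowercases the scheme
    except ValueError:
        # urlsplit refuses some inputs, e.g. an unbalanced IPv6 bracket.
        return False
    if not scheme:
        # No scheme: a relative link or a bare fragment such as "#section".
        return True
    return scheme in allowed_uri_schemes


# Why the early rejection matters: the vertical tab is not a valid scheme
# character, so urlsplit reports no scheme at all for this input.
vertical_tab_scheme = urlsplit("java\x0bscript:alert(1)").scheme
```

Running the control-character test before `urlsplit` is the important ordering here: once the scheme comes back empty, the URI is indistinguishable from a harmless relative link.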
django_docutils.lib.sanitize._replace_node_with_text(node)
Replace a node with its rendered text content.
Examples
>>> paragraph = nodes.paragraph()
>>> reference = nodes.reference("", "", nodes.Text("link"))
>>> paragraph += reference
>>> _replace_node_with_text(reference)
>>> paragraph.astext()
'link'
Parameters: node (Element)
Return type: None
django_docutils.lib.sanitize._remove_node(node)
Remove a node from its parent if it is attached.
Examples
>>> paragraph = nodes.paragraph()
>>> raw = nodes.raw("", "<script></script>", format="html")
>>> paragraph += raw
>>> _remove_node(raw)
>>> len(paragraph.children)
0
Parameters: node (Element)
Return type: None
django_docutils.lib.sanitize.sanitize_doctree(document, docutils_settings=None)
Remove unsafe HTML-producing nodes and attributes from a doctree.
Parameters:
document (docutils.nodes.document) – Doctree to sanitize in place.
docutils_settings (mapping, optional) – Already-resolved Docutils settings, consumed as-is. None resolves project defaults via get_docutils_settings(). raw_enabled only skips raw-node removal when the project also sets allow_unsafe_docutils_settings. URI scheme policy is project-level via get_allowed_uri_schemes(), not per-call.
Return type: None
Examples
>>> document = nodes.document("", "")
>>> document += nodes.raw("", "<script></script>", format="html")
>>> sanitize_doctree(document)
>>> len(document.children)
0
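The interaction between raw_enabled and allow_unsafe_docutils_settings is easiest to read as a truth table. The sketch below only illustrates the rule stated in the parameter description; `should_remove_raw_nodes` and its `allow_unsafe` argument are hypothetical names, and how the library actually reads the project-level flag is not shown here.

```python
def should_remove_raw_nodes(docutils_settings, allow_unsafe):
    """Return True when raw nodes should be stripped from the doctree.

    docutils_settings: mapping of already-resolved Docutils settings.
    allow_unsafe: assumed boolean view of the project-level
        allow_unsafe_docutils_settings opt-in.
    """
    raw_requested = bool(docutils_settings.get("raw_enabled", False))
    # Raw nodes survive only when BOTH the per-call setting and the
    # project-level opt-in agree; either one alone is not enough.
    return not (raw_requested and allow_unsafe)


# Enumerate all four combinations of the two switches.
truth_table = {
    (raw, unsafe): should_remove_raw_nodes({"raw_enabled": raw}, unsafe)
    for raw in (False, True)
    for unsafe in (False, True)
}
```

Under this reading, passing raw_enabled in a per-call settings mapping cannot by itself re-enable raw HTML; the project has to opt in as well.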
class django_docutils.lib.sanitize.SanitizeTransform
Bases: Transform
Run sanitize_doctree() as a docutils transform.
DjangoDocutilsWriter sanitizes in translate() so the pass always runs after every transform. This transform makes the same pass available to custom docutils pipelines that do not use that writer. default_priority is high so it runs late when added to a writer’s transform list.
Examples
>>> from django_docutils.lib.publisher import publish_doctree
>>> document = publish_doctree("Hello")
>>> document += nodes.raw("", "<script></script>", format="html")
>>> SanitizeTransform(document, document).apply()
>>> list(document.findall(nodes.raw))
[]
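Why a high default_priority means "runs late" can be shown with a small model of priority-ordered application. This is a docutils-free sketch: docutils applies transforms in ascending priority order, and the classes and priority values below (100 and 990) are hypothetical placeholders, not the values used by SanitizeTransform.

```python
class FakeTransform:
    """Hypothetical stand-in for a docutils Transform subclass."""

    default_priority = 0

    def __init__(self, log):
        self.log = log

    def apply(self):
        # Record the order of application instead of touching a doctree.
        self.log.append(type(self).__name__)


class EarlyTransform(FakeTransform):
    default_priority = 100  # assumed low value: applied first


class LateSanitize(FakeTransform):
    default_priority = 990  # assumed high value: applied last


def apply_in_priority_order(transform_classes, log):
    """Apply transforms lowest priority number first."""
    for cls in sorted(transform_classes, key=lambda c: c.default_priority):
        cls(log).apply()


applied = []
# Registration order is deliberately reversed; priority decides the order.
apply_in_priority_order([LateSanitize, EarlyTransform], applied)
```

A late slot matters for a sanitizing pass because any transform that runs after it could reintroduce nodes the pass had already removed.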