Closes awslabs#154
Closes awslabs#167
The 4 post-processing fixer scripts in `plugins/deploy-on-aws/scripts/lib/`
all do `import defusedxml.ElementTree as ET` and then reach for
`ET.Element`, `ET.ElementTree`, and `ET.indent`. None of these names
are exported by `defusedxml.ElementTree`:
- `Element` and `ElementTree` are stdlib classes, used here only as
types. defusedxml's module is a thin security wrapper that re-exports
the parsing helpers (`parse`, `fromstring`) but not these classes.
Where annotations are evaluated eagerly at function definition
(reported here for Python 3.10+), `def f(cell: ET.Element)` crashes
at module-import time:
    AttributeError: module 'defusedxml.ElementTree' has no attribute 'Element'
- `indent()` is a write-side pretty-printer added to the stdlib in
Python 3.9. Older defusedxml releases (still common via system
Python and distro packages) never re-exported it, so newer scripts
that call `ET.indent(tree, ...)` crash with the same kind of
AttributeError, this time for `indent`.
Both bugs are filed:
- awslabs#154 (Element / ElementTree) lists all 4 affected files.
- awslabs#167 (indent) reproduces the issue on `fix_step_badges.py:486`.
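The first failure mode can be illustrated without defusedxml installed. The sketch below is not taken from the affected scripts: `stand_in_et` is a hypothetical stand-in module that, like the wrapper described above, offers `parse` and `fromstring` but no `Element`, and `define_fixer` is an illustrative name. It only shows that an annotation referring to a missing module attribute fails as soon as the annotation is resolved.

```python
import types

# Hypothetical stand-in for the wrapper module (NOT the real defusedxml):
# it offers parsing helpers but no Element / ElementTree / indent.
stand_in_et = types.ModuleType("stand_in_et")
stand_in_et.parse = lambda source: None
stand_in_et.fromstring = lambda text: None


def define_fixer(et_module):
    """Define a function annotated with et_module.Element and return it."""

    def fix(cell: et_module.Element) -> None:
        return None

    # Touch the annotations so interpreters that defer annotation
    # evaluation also resolve et_module.Element here.
    fix.__annotations__
    return fix


try:
    define_fixer(stand_in_et)
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

In the real scripts the annotated functions sit at module level, which is why the error surfaces at import time rather than at call time.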
Fix
----
Pull the three names directly from the stdlib at import time and
attach them to the `ET` module object. defusedxml's secure `parse()`
remains the actual XML entry point — none of the symbols added here
do any parsing:
- `Element` / `ElementTree` are used purely as types, in annotations
and `dict[str, ET.Element]`-shaped indexes.
- `indent()` only mutates the in-memory tree's `text`/`tail` whitespace
before serialization; no network or external input is involved.
Bandit B405 ("Using Element to parse untrusted XML data is known to be
vulnerable to XML attacks") flags the `from xml.etree.ElementTree import`
line. The flagged names do no parsing (two types plus `indent()`), so
each import is annotated with a `# nosec B405` comment explaining that
defusedxml remains the parser. Verified with `bandit` on all 4 files:
0 issues post-fix.
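A minimal sketch of the approach described in this section, under stated assumptions: the helper name `attach_stdlib_names` and the `hasattr` guard are illustrative choices, not necessarily what the patched scripts do, and a stand-in module replaces defusedxml so the example runs anywhere. In the fixer scripts the module passed in would be the `ET` object obtained from `import defusedxml.ElementTree as ET`.

```python
import types

# Type and formatting helpers only; no parsing goes through these names.
from xml.etree.ElementTree import Element, ElementTree, indent  # nosec B405


def attach_stdlib_names(et_module: types.ModuleType) -> types.ModuleType:
    """Attach Element, ElementTree and indent to et_module where missing.

    Hypothetical helper for illustration; the hasattr guard is an assumption.
    """
    for name, obj in (
        ("Element", Element),
        ("ElementTree", ElementTree),
        ("indent", indent),
    ):
        if not hasattr(et_module, name):
            setattr(et_module, name, obj)
    return et_module


# Stand-in module so the sketch runs without defusedxml installed.
# In the fixer scripts the argument would be the imported `ET` module.
ET = attach_stdlib_names(types.ModuleType("stand_in_et"))

root = ET.Element("root")
root.append(ET.Element("cell"))
ET.indent(root, space="  ")  # only rewrites text/tail whitespace in memory

print(repr(root.text), repr(root[0].tail))
```

The guard leaves any name the wrapper already provides untouched, so the attachment only fills gaps. The last lines show that `indent()` changes nothing but the `text` and `tail` whitespace of the in-memory tree.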
Verification
------------
- All 4 modules import cleanly: `python3 -c "import importlib.util;
spec=importlib.util.spec_from_file_location('m', '<file>'); m =
importlib.util.module_from_spec(spec); spec.loader.exec_module(m)"`
- All 4 fixers run end-to-end on a real diagram:
`python3 fix_nesting.py test.drawio` (and same for the other 3).
- `post_process_drawio.py test.drawio` chains all fixers and completes
successfully; the output XML re-parses with stdlib ElementTree.
- `bandit` reports 0 issues across all 4 files (4 nosec annotations
applied; no other findings introduced).
- Tested against `defusedxml==0.7.1` on Python 3.12.
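The import check from the first bullet can also be scripted. The sketch below is an assumption-laden illustration, not part of this change: `imports_cleanly` is a hypothetical helper, and the two temporary files stand in for a healthy fixer and a broken one so the example runs on its own.

```python
import importlib.util
import tempfile
from pathlib import Path


def imports_cleanly(path):
    """Return True if the file at `path` executes as a module without raising."""
    spec = importlib.util.spec_from_file_location("m", str(path))
    if spec is None or spec.loader is None:
        return False
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
    except Exception:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good_fixer.py"
    bad = Path(tmp) / "bad_fixer.py"
    # Healthy module: the annotation resolves against the stdlib.
    good.write_text(
        "import xml.etree.ElementTree as ET\n\n"
        "def f(cell: ET.Element): ...\n"
    )
    # Broken module: looks up a name the stand-in module does not have.
    bad.write_text(
        "import types\n"
        "ET = types.ModuleType('stand_in')\n"
        "annotation = ET.Element\n"
    )
    results = {p.name: imports_cleanly(p) for p in (good, bad)}

print(results)
```

Pointing `imports_cleanly` at the four files listed under "Files touched" would give the same pass/fail signal as the one-liner above.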
Files touched
-------------
- plugins/deploy-on-aws/scripts/lib/fix_step_badges.py
- plugins/deploy-on-aws/scripts/lib/fix_nesting.py
- plugins/deploy-on-aws/scripts/lib/fix_icon_colors.py
- plugins/deploy-on-aws/scripts/lib/post_process_drawio.py
----
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
Summary
-------
Closes #154
Closes #167
The 4 post-processing fixer scripts in
`plugins/deploy-on-aws/scripts/lib/` all do
`import defusedxml.ElementTree as ET` and then reach for `ET.Element`,
`ET.ElementTree`, and `ET.indent`. None of these names are re-exported
by `defusedxml.ElementTree`, so the scripts crash at import time on
Python 3.10+ (eager annotation evaluation) and on systems with older
`defusedxml` (no `indent` re-export).
- #154, "deploy-on-aws: defusedxml.ElementTree does not expose
Element/ElementTree types" (Element / ElementTree):
`defusedxml.ElementTree` is a thin security wrapper that re-exports
the parsing helpers (`parse`, `fromstring`) but not the type aliases.
With eager annotations, `def f(cell: ET.Element)` raises:
    AttributeError: module 'defusedxml.ElementTree' has no attribute 'Element'
All 4 files in `plugins/deploy-on-aws/scripts/lib/` are affected (per
the comment on #154).
- #167, "fix_step_badges.py crashes: defusedxml.ElementTree has no
indent attribute on older defusedxml" (indent): `ET.indent()` is the
stdlib pretty-printer added in Python 3.9. Older `defusedxml` releases
(still common via system Python and distro packages) never re-exported
it, so `ET.indent(tree, ...)` raises an AttributeError.
Approach
--------
Pull the three names directly from the stdlib at import time and attach
them to the `ET` module object. defusedxml's secure `parse()` remains
the actual XML entry point; none of the symbols added here do any
parsing:
- `Element` / `ElementTree` are pure type aliases used in annotations
and `dict[str, ET.Element]`-shaped indexes.
- `indent()` only mutates the in-memory tree's `text`/`tail` whitespace
before serialization; no network or external input is involved.
Bandit's B405 ("Using Element to parse untrusted XML data is known to
be vulnerable to XML attacks") flags the
`from xml.etree.ElementTree import` line. The flagged usage is
type-only (no parsing), so each import is annotated with a
`# nosec B405` comment explaining that defusedxml remains the parser.
Verification
------------
- All 4 modules import cleanly:
`python3 -c "import importlib.util; spec=importlib.util.spec_from_file_location('m','<file>'); m=importlib.util.module_from_spec(spec); spec.loader.exec_module(m); print('OK')"`
- All 4 fixers run end-to-end (`fix_nesting.py`, `fix_icon_colors.py`,
`fix_step_badges.py` individually, plus `post_process_drawio.py`
chaining them).
- `bandit` (the `mise run security:bandit` task) reports 0 issues
across all 4 files post-fix.
- Tested against `defusedxml==0.7.1` on Python 3.12.12.
Files changed
-------------
- plugins/deploy-on-aws/scripts/lib/fix_step_badges.py
- plugins/deploy-on-aws/scripts/lib/fix_nesting.py
- plugins/deploy-on-aws/scripts/lib/fix_icon_colors.py
- plugins/deploy-on-aws/scripts/lib/post_process_drawio.py
----
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of the
project license.