@tomingtoming
Contributor

Purpose / Goal

To improve XML parsing performance by optimizing leaf node detection in OrderedObjParser.

Background

When parsing XML documents, the parser frequently needs to check if a node is a leaf node. The current implementation creates unnecessary temporary arrays during this check, which impacts performance especially for large XML files.

Technical Details

The optimization changes how we check for leaf nodes:

- isLeafNode = Object.keys(currentNode.child).length === 0
+ isLeafNode = currentNode.child.length === 0

This is safe because:

  • child is explicitly initialized as an array in XmlNode constructor
  • Array modifications only happen through controlled methods (add and addChild)
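  • Reading an array's length allocates nothing, and for a dense array filled only via push() it always equals the number of elements, so the result matches the old check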
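
To illustrate why the two checks agree for arrays built this way, here is a minimal sketch using plain arrays rather than the parser's own node objects. The helper names `viaKeys` and `viaLength` and the `#text` key are illustrative assumptions, not identifiers taken from the PR.

```javascript
// Illustrative helpers (hypothetical names): the old and new leaf checks.
// Object.keys() builds a new array of index strings on every call.
const viaKeys = (arr) => Object.keys(arr).length === 0;
// Reading .length uses the existing property and allocates nothing.
const viaLength = (arr) => arr.length === 0;

const empty = [];
const filled = [{ "#text": "x" }];

console.log(viaKeys(empty), viaLength(empty));   // true true
console.log(viaKeys(filled), viaLength(filled)); // false false

// The only divergence is a sparse array: it has a length but no own index keys.
const sparse = new Array(3);
console.log(viaKeys(sparse), viaLength(sparse)); // true false
```

The sparse case cannot arise as long as `child` is only ever extended with `push()`, which is the assumption the list above depends on.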

Performance Impact

Test Environment

  • Hardware: MacBook Pro M1
  • Node.js: v19.6.1
  • OS: macOS
  • Memory: 16GB

Benchmark Results

Small File (1.5KB, sample.xml)

Before optimization:

  • fxp v3: 31,183 req/sec
  • fxp: 18,782 req/sec
  • fxp - preserve order: 21,107 req/sec
  • xmlbuilder2: 11,511 req/sec
  • xml2js: 18,172 req/sec

After optimization:

  • fxp v3: 31,295 req/sec (+0.36%)
  • fxp: 18,957 req/sec (+0.93%)
  • fxp - preserve order: 21,628 req/sec (+2.47%)
  • xmlbuilder2: 11,610 req/sec
  • xml2js: 17,829 req/sec

Large File (98MB, large.xml)

Before optimization:

  • fxp v3: 0.284 req/sec
  • fxp: 0.197 req/sec
  • fxp - preserve order: 0.234 req/sec
  • xmlbuilder2: 0.094 req/sec
  • xml2js: 0.210 req/sec

After optimization:

  • fxp v3: 0.277 req/sec (-2.46%)
  • fxp: 0.196 req/sec (-0.51%)
  • fxp - preserve order: 0.232 req/sec (-0.85%)
  • xmlbuilder2: 0.095 req/sec
  • xml2js: 0.211 req/sec
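
For reference, the percentages in these results follow from a plain relative-change calculation. The sketch below reproduces a few of them from the req/sec figures listed above; the function name `percentChange` is a hypothetical helper, not part of the benchmark suite.

```javascript
// Hypothetical helper: relative change from `before` to `after`, in percent.
function percentChange(before, after) {
  return ((after - before) / before) * 100;
}

// Small file, fxp - preserve order: 21,107 -> 21,628 req/sec
console.log(percentChange(21107, 21628).toFixed(2)); // 2.47
// Large file, fxp: 0.197 -> 0.196 req/sec
console.log(percentChange(0.197, 0.196).toFixed(2)); // -0.51
// Small file, xml2js (not touched by this change): 18,172 -> 17,829 req/sec
console.log(percentChange(18172, 17829).toFixed(2)); // -1.89
```

The last figure is a useful yardstick: a library this PR does not modify still moved by almost 2% between runs, so differences of that size are hard to distinguish from run-to-run noise.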

Analysis

  1. Small File Performance:

    • Small gains in all three fxp modes (+0.36% to +2.47%)
    • Most significant in preserve-order mode (+2.47%)
    • Memory allocation reduced by eliminating unnecessary array creation
  2. Large File Performance:

    • Throughput effectively unchanged (all fxp modes within about 2.5% of the baseline)
    • Differences are within the margin of error
    • The expected benefit here is lower memory pressure rather than higher throughput
  3. Key Benefits:

    • Reduced memory allocation overhead
    • More efficient in memory-constrained environments
    • Particularly effective for high-frequency parsing of small to medium files
    • Reduced GC pressure in long-running applications
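
A rough way to compare the two checks in isolation is a micro-benchmark like the sketch below. This is not the benchmark suite that produced the numbers above; the `timeIt` helper, the synthetic `nodes` data, and the iteration count are all assumptions made for illustration, and absolute timings will vary by machine and Node.js version.

```javascript
// Hypothetical micro-benchmark harness; timings are indicative only.
function timeIt(label, check, iterations) {
  let hits = 0; // count results so the loop body cannot be optimized away
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    if (check(i)) hits++;
  }
  return { label, ms: performance.now() - start, hits };
}

// Synthetic nodes: even indexes are leaves, odd indexes have one child.
const nodes = Array.from({ length: 1000 }, (_, i) => ({
  child: i % 2 === 0 ? [] : [{ "#text": "x" }],
}));

const iterations = 1_000_000;
const keysResult = timeIt(
  "Object.keys(child).length",
  (i) => Object.keys(nodes[i % nodes.length].child).length === 0,
  iterations
);
const lengthResult = timeIt(
  "child.length",
  (i) => nodes[i % nodes.length].child.length === 0,
  iterations
);

console.log(keysResult, lengthResult);
```

Both runs should report the same `hits` count, confirming that the checks agree, while the `ms` values give a rough sense of the per-call cost difference.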

Type

  • Bug Fix
  • Refactoring / Technology upgrade
  • New Feature

Testing

  • All existing tests pass without modification
  • No new tests needed (pure performance optimization)
  • Performance testing completed with multiple file sizes
  • Memory allocation improvement verified

Notes

  • No breaking changes
  • No API modifications
  • Fully backward compatible
  • Low-risk modification (only the leaf check changes; array operations remain unchanged)
  • Reduces garbage collection pressure in heavy parsing scenarios

Technical Implementation Details

The safety of this optimization relies on the XmlNode implementation:

  1. child array is initialized in constructor: this.child = []
  2. All modifications are performed through controlled methods:
    • add(): Adds text nodes, CDATA, comments
    • addChild(): Adds nested XML nodes
  3. No direct array manipulation elsewhere in the codebase
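
The invariants listed above can be sketched as a small class. This is a simplified, hypothetical stand-in written for illustration (`XmlNodeSketch` is not the library's class and omits attribute handling and any other details); it only shows that `child` starts as an array and grows exclusively through `push()`.

```javascript
// Hypothetical, simplified node type; not copied from the library.
class XmlNodeSketch {
  constructor(tagname) {
    this.tagname = tagname;
    this.child = []; // initialized once, always an array
  }

  // Appends a text, CDATA, or comment entry.
  add(key, value) {
    this.child.push({ [key]: value });
  }

  // Appends a nested node as { tagname: children }.
  addChild(node) {
    this.child.push({ [node.tagname]: node.child });
  }
}

const root = new XmlNodeSketch("root");
const item = new XmlNodeSketch("item");

console.log(root.child.length === 0); // true: no children yet, so a leaf

item.add("#text", "hello"); // "#text" is an illustrative key
root.addChild(item);

console.log(root.child.length === 0);    // false: one nested node
console.log(JSON.stringify(root.child)); // [{"item":[{"#text":"hello"}]}]
```

Because both methods only append, `child` stays dense, which is the condition under which `child.length === 0` and `Object.keys(child).length === 0` give the same answer.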

Note: I have read the contribution guidelines before raising this PR.

@coveralls

Coverage Status

coverage: 98.217%. remained the same
when pulling fc89754 on tomingtoming:master
into 682066c on NaturalIntelligence:master.

@amitguptagwl amitguptagwl merged commit eadeb7e into NaturalIntelligence:master Feb 9, 2025
9 checks passed