A lightweight C++ XML parser with support for XPath expressions.
CXML is a simple and efficient XML parser written in C++ with built-in support for XPath queries.
The project combines fundamental C++ techniques with classic data structures to deliver a practical tool for XML data processing.
git clone https://github.com/CodeCat-maker/cxml.git
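#include <cstdio>    // std::puts
#include <ctime>     // clock, CLOCKS_PER_SEC
#include <iostream>  // std::cout, std::endl

#include "src/parser.hpp"
#include "src/xpath.hpp"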
extern int CXML_PARSER_STATUS;
extern int XPATH_PARSE_STATUE;
int main() {
    using std::cout;
    using std::endl;

    clock_t start, end;
    start = clock();

    CXMLNode *root = parse_from_string(
        "<bookstore company=\"codecat\" boss=\"man\">"
        " <book category=\"CHILDREN\">"
        " <title>Harry Potter</title>"
        " <author>J K.Rowlingk</author>"
        " <year>2005</year><br>"
        " <price>29.99 </price>"
        " </book>"
        " <book category=\"WEB\">"
        " <title>Learning XML</title>"
        " <author>Erik T.Ray</author>"
        " <year>2003 </year>"
        " <price>39.95 </price>"
        " </book>"
        "</bookstore>"
    );

    if (CXML_PARSER_STATUS == CXML_SYNTAX_ERROR) {
        std::puts("> XML parsing failed");
        return 0;
    } else {
        std::puts("> XML parsing succeeded");
    }

    const CXMLNode_result *result1 =
        xpath("/bookstore/book[@category=CHILDREN]/@category//text()", root);
    const CXMLNode_result *result2 =
        xpath("/bookstore/book/title/../price/text()", root);

    if (XPATH_PARSE_STATUE == XPATH_SYNTAX_ERROR) {
        std::puts("> XPath parsing failed");
        return 0;
    } else {
        std::puts("> XPath parsing succeeded");
    }

    cout << "Example 1: " << result1->text << endl;
    cout << "Example 2: " << result2->text << endl;

    end = clock();
    cout << "\nExecution time: "
         << (double)(end - start) / CLOCKS_PER_SEC << " seconds";
    return 0;
}
- `/name` – Selects child element `name`
- `//name` – Selects descendant element `name`
- `/.` – Selects the current element
- `/..` – Selects the parent element
- `/name[@attr=value]` – Selects elements with attribute `attr=value`
- `/name[@attr]` – Selects elements with attribute `attr`
- `/name[n]` – Selects the nth occurrence of `name`
- `/text()` – Returns text content of the current element
- `/@attr` – Returns the value of attribute `attr`
- `//text()` – Returns all descendant text nodes
- `//@attr` – Returns all descendant attributes
cmake_minimum_required(VERSION 3.13)
project(cxml)
set(CMAKE_CXX_STANDARD 11)
add_subdirectory(src)
add_executable(cxml main.cpp)
target_link_libraries(cxml CxmlFunction)
mkdir -p build
cd build
cmake ..
make -j
> XML parsing succeeded
> XPath parsing succeeded
Example 1: Harry Potter J K.Rowlingk 2005 29.99
Example 2: 29.99
Execution time: 0.000135 seconds
- From file
- From string
- Parse element name
- Parse attributes
- Parse element text
- Construct the XML tree
- Stack-based construction
  - Push `<tag>` on open
  - Pop on `</tag>`
  - Naturally maintains parent–child relationships
- Recursive descent for nested content
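
The stack-based construction listed above (push on an opening tag, pop on the matching closing tag) can be sketched roughly as follows. This is an illustration only, not CXML's actual parser: the `Node` type, the `build_tree` function, and the assumption that the input has already been split into bare tags are all hypothetical, and attributes and text are left out.

```cpp
#include <cassert>
#include <memory>
#include <stack>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type for illustration; CXML's own CXMLNode differs.
struct Node {
    std::string name;
    Node *parent = nullptr;
    std::vector<std::unique_ptr<Node>> children;
};

// Builds a tree from a pre-split list of tags such as
// {"<a>", "<b>", "</b>", "</a>"}. Returns nullptr when the tags are
// malformed, mismatched, or left unclosed.
std::unique_ptr<Node> build_tree(const std::vector<std::string> &tags) {
    std::unique_ptr<Node> root;
    std::stack<Node *> open;  // currently open elements, innermost on top

    for (const std::string &tag : tags) {
        if (tag.size() < 3 || tag.front() != '<' || tag.back() != '>') return nullptr;

        if (tag[1] == '/') {
            // Closing tag: it must match the innermost open element.
            const std::string name = tag.substr(2, tag.size() - 3);
            if (open.empty() || open.top()->name != name) return nullptr;
            open.pop();
        } else {
            // Opening tag: create the node, attach it to the stack top, push it.
            std::unique_ptr<Node> node(new Node);
            node->name = tag.substr(1, tag.size() - 2);
            Node *raw = node.get();
            if (open.empty()) {
                if (root) return nullptr;  // a second top-level element
                root = std::move(node);
            } else {
                raw->parent = open.top();
                open.top()->children.push_back(std::move(node));
            }
            // Invariant: a non-root node hangs off the innermost open element.
            assert(open.empty() || raw->parent == open.top());
            open.push(raw);
        }
    }

    if (!open.empty()) return nullptr;  // some element was never closed
    return root;
}
```

Because the top of the stack is always the innermost open element, each new node finds its parent without any search, which is why the parent–child links fall out of the push/pop discipline directly.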
Tokenize the expression with a two-pointer scan, enqueue operations, and evaluate in FIFO order.
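
As a rough sketch of the two-pointer scan, the snippet below splits an expression into steps and enqueues them so they can later be consumed in FIFO order. The `Step` type and `tokenize` function are hypothetical names chosen for this example, and the handling of `[...]` predicates is an assumption about how a step boundary might be detected; CXML's real tokenizer may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>

// Hypothetical token type: one location step, plus whether it was
// introduced by "//" (descendant) or "/" (child).
struct Step {
    bool descendant;
    std::string text;
};

// Splits an XPath-like expression into steps using two indices:
// `left` marks where the current step starts, and `right` scans ahead
// to the next '/' that is not inside a [...] predicate.
// Returns an empty queue when the expression is malformed.
std::queue<Step> tokenize(const std::string &expr) {
    std::queue<Step> steps;
    std::size_t left = 0;

    while (left < expr.size()) {
        if (expr[left] != '/') return std::queue<Step>();  // steps start with '/'
        ++left;
        bool descendant = false;
        if (left < expr.size() && expr[left] == '/') {
            descendant = true;
            ++left;
        }

        std::size_t right = left;
        int depth = 0;  // bracket nesting, so '/' inside a predicate is skipped
        while (right < expr.size() && (depth > 0 || expr[right] != '/')) {
            if (expr[right] == '[') ++depth;
            else if (expr[right] == ']' && depth > 0) --depth;
            ++right;
        }
        assert(left <= right && right <= expr.size());  // scan stays in bounds

        if (right == left) return std::queue<Step>();  // empty step, e.g. "///"
        steps.push(Step{descendant, expr.substr(left, right - left)});
        left = right;  // the next step starts at the delimiter just found
    }
    return steps;
}
```

With this sketch, `/bookstore/book[@category=CHILDREN]//text()` yields three steps in order: `bookstore` and `book[@category=CHILDREN]` as child steps, then `text()` as a descendant step. An evaluator would pop them from the front one at a time and hand each to the matching handler.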
- `get_parent_node` – `/..`
- `get_this_node` – `/.`
- `get_node_from_child_by_name` – `/name`
- `get_node_from_genera_by_name` – `//name`
- `get_node_by_array_and_name` – `/name[n]`
- `get_node_by_attr_and_name` – `/name[@attr]`
- `get_node_by_attrValue_and_name` – `/name[@attr=value]`
- `get_text_from_this` – `/text()`
- `get_texts_from_genera` – `//text()`
- `get_attr_from_this` – `/@attr`
Split the XPath string into tokens and enqueue them.
Dequeue operations and dispatch to the corresponding handlers.
Collect text nodes from all descendants.
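
One way to collect descendant text is a pre-order depth-first walk, sketched below. The `Element` type and the `add_child` and `collect_texts` helpers are hypothetical and exist only for this example; the README does not say which traversal CXML uses here, so depth-first order is an assumption.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type for illustration; CXML's own CXMLNode differs.
struct Element {
    std::string name;
    std::string text;  // text directly inside this element, may be empty
    std::vector<std::unique_ptr<Element>> children;
};

// Appends a new child to `parent` and returns a pointer to it.
Element *add_child(Element &parent, const std::string &name, const std::string &text) {
    std::unique_ptr<Element> child(new Element);
    child->name = name;
    child->text = text;
    parent.children.push_back(std::move(child));
    return parent.children.back().get();
}

// Pre-order depth-first walk: record this element's text (if any), then
// recurse into each child from left to right, so `out` ends up in
// document order.
void collect_texts(const Element &node, std::vector<std::string> &out) {
    if (!node.text.empty()) out.push_back(node.text);
    for (const auto &child : node.children) {
        assert(child != nullptr);  // children are always owned, never null
        collect_texts(*child, out);
    }
}
```

Called on a `book` element whose children carry a title, an author, and a year, `collect_texts` appends those texts to the output vector in the order they appear in the document.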
Search for the first matching descendant by name or predicate.
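
The search for a first matching descendant can be sketched with a breadth-first traversal, which returns the shallowest match. The `Tag` type and `find_first` function are hypothetical, only matching by name is shown, and the choice of breadth-first over depth-first order is an assumption rather than something the README states.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical node type; children are non-owning pointers to keep the
// sketch short.
struct Tag {
    std::string name;
    std::vector<Tag *> children;
};

// Breadth-first search over the descendants of `root` (the root itself is
// not a candidate). Returns the first node whose name matches, or nullptr
// when there is none.
const Tag *find_first(const Tag &root, const std::string &name) {
    std::queue<const Tag *> pending;
    for (const Tag *child : root.children) pending.push(child);

    while (!pending.empty()) {
        const Tag *current = pending.front();
        pending.pop();
        assert(current != nullptr);  // the sketch assumes no null children
        if (current->name == name) return current;
        for (const Tag *child : current->children) pending.push(child);
    }
    return nullptr;
}
```

A predicate-based variant would follow the same shape, replacing the name comparison with a check on the node's attributes.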
- Stack, linked list, tree, queue, pair/tuple
- `vector`, `map`, `string`, `pair`, `stack`, `queue`
- DFS, BFS, two-pointer parsing
- HTML parsing support
- Extended XPath coverage
- Performance optimizations
Contributions are welcome. Feel free to fork the repository and open a PR.