Merged
22 changes: 13 additions & 9 deletions README.md
@@ -1,14 +1,18 @@
# Introduction

`uproot-custom` is an extension of [`uproot`](https://uproot.readthedocs.io/en/latest/basic.html) that provides an enhanced way to read custom classes stored in `TTree`.
Uproot-custom is an extension of [Uproot](https://uproot.readthedocs.io/en/latest/basic.html) that provides an enhanced way to read custom classes stored in `TTree`.

## When to use `uproot-custom`
## What uproot-custom can do

`uproot-custom` aims to handle cases where classes are too complex for `uproot` to read, such as when their `Streamer` methods are overridden or when some data members are not supported by `uproot`.
Uproot-custom can natively read complicated combinations of nested classes and C-style arrays (e.g. `map<int, map<int, map<int, string>>>`, `vector<TString>[3]`, etc.), as well as classes stored member-wise. It also lets users implement their own readers for custom classes that neither Uproot nor the built-in uproot-custom readers support, so that these classes can be read seamlessly.

## How `uproot-custom` works
## When to use uproot-custom

`uproot-custom` uses a `reader`/`factory` mechanism to read classes:
Uproot-custom aims to handle cases where classes are too complex for Uproot to read, such as when their `Streamer` methods are overridden or when some data members are not supported by Uproot.

## How uproot-custom works

Uproot-custom uses a `reader`/`factory` mechanism to read classes:

```mermaid
flowchart TD
@@ -36,14 +40,14 @@ flowchart TD
- `reader` is a C++ class that implements the logic to read data from binary buffers.
- `factory` is a Python class that creates, combines `reader`s, and post-processes the data read by `reader`s.

This mechanism is implemented based on the `uproot_custom.AsCustom` interpretation, which makes `uproot-custom` compatible with `uproot`.
This mechanism is implemented based on the `uproot_custom.AsCustom` interpretation, which makes uproot-custom compatible with Uproot.

> [!TIP]
> Users can implement their own `factory` and `reader` and register them with `uproot-custom`. An example of implementing a custom `factory`/`reader` can be found in [the example repository](https://github.com/mrzimu/uproot-custom-example).
> Users can implement their own `factory` and `reader` and register them with uproot-custom. An example of implementing a custom `factory`/`reader` can be found in [the example repository](https://github.com/mrzimu/uproot-custom-example).

> [!NOTE]
> `uproot-custom` does not provide a full reimplementation of `ROOT`'s I/O system. Users are expected to implement their own `factory`/`reader` for their custom classes that built-in factories cannot handle.
> Uproot-custom does not provide a full reimplementation of `ROOT`'s I/O system. Users are expected to implement their own `factory`/`reader` for their custom classes that built-in factories cannot handle.

## Documentation

View the [documentation](https://mrzimu.github.io/uproot-custom/) for more details about customizing your own `reader`/`factory`, and the architecture of `uproot-custom`.
View the [documentation](https://mrzimu.github.io/uproot-custom/) for more details about customizing your own `reader`/`factory`, and the architecture of uproot-custom.
8 changes: 4 additions & 4 deletions docs/example/override-streamer.md
@@ -4,7 +4,7 @@
A full example can be found in the [example repository](https://github.com/mrzimu/uproot-custom-example).
```

We define a demo class `TOverrideStreamer` whose `Streamer` method is overridden to show how to read such classes using `uproot-custom`.
We define a demo class `TOverrideStreamer` whose `Streamer` method is overridden to show how to read such classes using uproot-custom.

There are 2 member variables in `TOverrideStreamer`: `m_int` and `m_double`:

@@ -281,7 +281,7 @@ Refer to [awkward forms](https://awkward-array.org/doc/main/reference/generated/

## Step 4: Register target branch and the `factory`

Finally, we need to register the branch we want to read with `uproot-custom`, and also register the `OverrideStreamerFactory` so that it can be used by `uproot-custom`.
Finally, we need to register the branch we want to read with uproot-custom, and also register the `OverrideStreamerFactory` so that it can be used by uproot-custom.

We can do this by adding the following code in the `__init__.py` of your package:

@@ -295,9 +295,9 @@ AsCustom.target_branches |= {
registered_factories.add(OverrideStreamerFactory)
```

## Step 5: Read data with `uproot`
## Step 5: Read data with Uproot

Now we can read the data using `uproot` as usual:
Now we can read the data using Uproot as usual:

```python
>>> b = uproot.open("demo_data.root")["my_tree:override_streamer"]
12 changes: 6 additions & 6 deletions docs/example/read-tobjarray.md
@@ -184,7 +184,7 @@ According to the `Streamer` method, the binary data contains:

4. `Line 15`: The next 4 bytes are `fLowerBound` (int32), which is `[0, 0, 0, 0]`, i.e. `0`.

5. `Line 19-25`: Loop over `nobjects` to read each object. Note that the `[255, 255, 255, 255]` indicates that the object's binary layout follows [this rule](https://root.cern/doc/v636/dobject.html). In `uproot-custom`, it can be handled by `ObjectHeaderFactory`.
5. `Line 19-25`: Loop over `nobjects` to read each object. Note that the `[255, 255, 255, 255]` indicates that the object's binary layout follows [this rule](https://root.cern/doc/v636/dobject.html). In uproot-custom, it can be handled by `ObjectHeaderFactory`.

```{tip}
For other `ROOT` built-in classes, it is suggested to check both the streamer information and the source code. If the `Streamer` method is not overridden, the streamer information is usually enough.
@@ -200,8 +200,8 @@ In summary, the binary data contains:
So we need the following factories/readers to read the data:

- `TObjArrayFactory`/`TObjArrayReader` to read `TObjArray` header and loop over `nobjects`.
- `ObjectHeaderFactory`/`ObjectHeaderReader` to read `ObjectHeader`, which are already implemented in `uproot-custom`.
- `AnyClassFactory`/`AnyClassReader` to read `TObjInObjArray` object, which are already implemented in `uproot-custom`.
- `ObjectHeaderFactory`/`ObjectHeaderReader` to read `ObjectHeader`, which are already implemented in uproot-custom.
- `AnyClassFactory`/`AnyClassReader` to read `TObjInObjArray` object, which are already implemented in uproot-custom.

The `TObjArrayFactory`/`TObjArrayReader` should be implemented by ourselves. Note that since we know the type of objects in the `TObjArray` is always `TObjInObjArray`, we can use just 1 `AnyClassFactory`/`AnyClassReader` as sub-factory/sub-reader to read all objects. This is another way of embedding user-known rules.

@@ -397,7 +397,7 @@ def make_awkward_form(self):

## Step 4: Register target branch and the `factory`

Finally, register the branch we want to read with `uproot-custom`, and also register the `TObjArrayFactory` so that it can be used by `uproot-custom`.
Finally, register the branch we want to read with uproot-custom, and also register the `TObjArrayFactory` so that it can be used by uproot-custom.

We can do this by adding the following code in the `__init__.py` of your package:

@@ -411,9 +411,9 @@ AsCustom.target_branches |= {
registered_factories.add(TObjArrayFactory)
```

## Step 5: Read data with `uproot`
## Step 5: Read data with Uproot

Now we can read the data using `uproot` as usual:
Now we can read the data using Uproot as usual:

```python
>>> b = uproot.open("demo_data.root")["my_tree:obj_with_obj_array/m_obj_array"]
20 changes: 12 additions & 8 deletions docs/index.md
@@ -1,14 +1,18 @@
# Introduction

`uproot-custom` is an extension of [`uproot`](https://uproot.readthedocs.io/en/latest/basic.html) that provides an enhanced way to read custom classes stored in `TTree`.
Uproot-custom is an extension of [Uproot](https://uproot.readthedocs.io/en/latest/basic.html) that provides an enhanced way to read custom classes stored in `TTree`.

## When to use `uproot-custom`
## What uproot-custom can do

`uproot-custom` aims to handle cases where classes are too complex for `uproot` to read, such as when their `Streamer` methods are overridden or when some data members are not supported by `uproot`.
Uproot-custom can natively read complicated combinations of nested classes and C-style arrays (e.g. `map<int, map<int, map<int, string>>>`, `vector<TString>[3]`, etc.), as well as classes stored member-wise. It also lets users implement their own readers for custom classes that neither Uproot nor the built-in uproot-custom readers support.

## How `uproot-custom` works
## When to use uproot-custom

`uproot-custom` uses a `reader`/`factory` mechanism to read classes:
Uproot-custom aims to handle cases where classes are too complex for Uproot to read, such as when their `Streamer` methods are overridden or when some data members are not supported by Uproot.

## How uproot-custom works

Uproot-custom uses a `reader`/`factory` mechanism to read classes:

```{mermaid}
flowchart TD
@@ -36,14 +40,14 @@ flowchart TD
- `reader` is a C++ class that implements the logic to read data from binary buffers.
- `factory` is a Python class that creates, combines `reader`s, and post-processes the data read by `reader`s.

This mechanism is implemented as the `uproot_custom.AsCustom` interpretation, which makes `uproot-custom` compatible with `uproot`.
This mechanism is implemented as the `uproot_custom.AsCustom` interpretation, which makes uproot-custom compatible with Uproot.

```{tip}
Users can implement their own `factory` and `reader` and register them with `uproot-custom`. An example of implementing a custom `factory`/`reader` can be found in [the example repository](https://github.com/mrzimu/uproot-custom-example).
Users can implement their own `factory` and `reader` and register them with uproot-custom. An example of implementing a custom `factory`/`reader` can be found in [the example repository](https://github.com/mrzimu/uproot-custom-example).
```

```{note}
`uproot-custom` does not provide a full reimplementation of `ROOT`'s I/O system. Users are expected to implement their own `factory`/`reader` for their custom classes that built-in factories cannot handle.
Uproot-custom does not provide a full reimplementation of `ROOT`'s I/O system. Users are expected to implement their own `factory`/`reader` for their custom classes that built-in factories cannot handle.
```

```{toctree}
2 changes: 1 addition & 1 deletion docs/reference/api/uproot-custom-ref.md
@@ -1,6 +1,6 @@
# Module reference

This chapter documents the `uproot-custom` module, including both the Python factories and the C++ readers.
This chapter documents the uproot-custom module, including both the Python factories and the C++ readers.

```{toctree}
---
10 changes: 5 additions & 5 deletions docs/reference/version-requirements.md
@@ -1,22 +1,22 @@
# Version requirements

## uproot-custom versioning
## Uproot-custom versioning

`uproot-custom` guarantees C++ header compatibility within the same minor version (e.g. C++ headers in `2.0.0` and `2.0.1` are compatible). Therefore, users' projects should pin a specific minor version of `uproot-custom` in their `pyproject.toml` file to avoid unexpected incompatibility issues.
Uproot-custom guarantees C++ header compatibility within the same minor version (e.g. C++ headers in `2.0.0` and `2.0.1` are compatible). Therefore, users' projects should pin a specific minor version of uproot-custom in their `pyproject.toml` file to avoid unexpected incompatibility issues.

## pybind11 version requirement

If the version of `pybind11` used to build `uproot-custom` differs from the one used to build the user's C++ readers, an exception like the following may be raised when importing the user's C++ extension module:
If the version of `pybind11` used to build uproot-custom differs from the one used to build the user's C++ readers, an exception like the following may be raised when importing the user's C++ extension module:

```
ImportError: generic_type: type "xxx" referenced unknown base type "uproot::IReader"
```

To avoid this issue, every version of `uproot-custom` requires users to build their C++ readers with the same minor version of `pybind11` as the one used to build `uproot-custom`. Users are expected to specify the exact version of `pybind11` manually in their `pyproject.toml` file.
To avoid this issue, every version of uproot-custom requires users to build their C++ readers with the same minor version of `pybind11` as the one used to build uproot-custom. Users are expected to specify the exact version of `pybind11` manually in their `pyproject.toml` file.

## Summary table

This table summarizes the required `pybind11` versions for each `uproot-custom` version:
This table summarizes the required `pybind11` versions for each uproot-custom version:

| uproot-custom | pybind11 |
| :-----------: | :------: |
2 changes: 1 addition & 1 deletion docs/tutorial/customize-factory-reader.md
@@ -2,7 +2,7 @@

If the built-in factories cannot meet your needs, you can implement your own `factory` and/or `reader`.

<!-- This requires some knowledge of `ROOT`'s streaming mechanism and `uproot-custom`'s design. -->
<!-- This requires some knowledge of `ROOT`'s streaming mechanism and uproot-custom's design. -->

```{admonition} Prerequisites
---
4 changes: 2 additions & 2 deletions docs/tutorial/customize-factory-reader/binary-data.md
@@ -36,7 +36,7 @@ m_tstr | TString | AsStrings(
m_tarr_int | TArrayI | AsObjects(Model_TArrayI)
```

This feature makes it easy for `uproot` to read most custom classes.
This feature makes it easy for Uproot to read most custom classes.

However, if we store the `TCStyleArray` defined in the [streamer information page](streamer-info.md) in a `TTree`, the data members of `TSimpleObject` will not be split:

@@ -52,7 +52,7 @@ name | typename | interpretation
m_simple_obj[3] | TSimpleObject[][3] | AsObjects(AsArray(False, False
```

This case is more common when you are trying to use `uproot-custom`.
This case is more common when you are trying to use uproot-custom.

(obtain-binary-data)=
## Obtain branch binary data
8 changes: 4 additions & 4 deletions docs/tutorial/customize-factory-reader/bootstrap.md
@@ -1,16 +1,16 @@
# Bootstrap of custom classes reading in `uproot-custom`
# Bootstrap of custom classes reading in uproot-custom

When reading a `TBranch` with `uproot-custom`, the following steps are performed:
When reading a `TBranch` with uproot-custom, the following steps are performed:

1. `uproot-custom` reads the streamer information (which contains information about data members and their types) of the branch.
1. Uproot-custom reads the streamer information (which contains information about data members and their types) of the branch.
2. `factory`s recursively instantiate themselves and combine together into a tree-like structure according to the streamer information.
3. `factory`s recursively create and combine `reader`s.
4. The combined `reader` reads the binary data and returns the results to the `factory`.
5. `factory`s recursively convert raw `numpy` arrays to `awkward` contents, and combine them into the final `awkward` array.

## Build factory instances

When reading a branch through `uproot-custom`, `uproot-custom` first builds a `factory` instance according to the streamer information of the class stored in the branch. During the building process, the factory should also recursively build `factory` instances for all data members of the class.
When reading a branch through uproot-custom, uproot-custom first builds a `factory` instance according to the streamer information of the class stored in the branch. During the building process, the factory should also recursively build `factory` instances for all data members of the class.

For example, the streamer information of `TSimpleObject` is as follows (as illustrated in [streamer information](simple-obj-streamer-info)):

10 changes: 5 additions & 5 deletions docs/tutorial/customize-factory-reader/reader-and-factory.md
@@ -1,6 +1,6 @@
# Reader and factory interface

`uproot-custom` uses a `reader`/`factory` mechanism to balance performance and flexibility. `reader`s are implemented in C++ and do the actual reading from the binary stream. `factory`s are implemented in Python; they manage `reader`s and reconstruct the final `awkward` array.
Uproot-custom uses a `reader`/`factory` mechanism to balance performance and flexibility. `reader`s are implemented in C++ and do the actual reading from the binary stream. `factory`s are implemented in Python; they manage `reader`s and reconstruct the final `awkward` array.

## Reader interface

@@ -122,7 +122,7 @@ py::array_t<int> np_array = make_array(data);

### Declaring `reader` to Python

`uproot-custom` uses `pybind11` to declare C++ `reader`s to Python. A helper function `declare_reader` is provided to simplify the declaration. When implementing your own `reader`, you should declare it to Python like this:
Uproot-custom uses `pybind11` to declare C++ `reader`s to Python. A helper function `declare_reader` is provided to simplify the declaration. When implementing your own `reader`, you should declare it to Python like this:

```cpp
PYBIND11_MODULE( my_cpp_reader, m) {
@@ -141,7 +141,7 @@ from my_cpp_reader import MyReaderClass

### Debugging message

`uproot-custom` provides a `debug_print` method to print debugging messages. Messages are printed only when the `UPROOT_DEBUG` macro is defined or the `UPROOT_DEBUG` environment variable is set:
Uproot-custom provides a `debug_print` method to print debugging messages. Messages are printed only when the `UPROOT_DEBUG` macro is defined or the `UPROOT_DEBUG` environment variable is set:

```cpp
// Will print "The reader name is Bob"
@@ -160,7 +160,7 @@ debug_print( buffer, 50 )
- `make_awkward_content`: Called to reconstruct the final `awkward` content with the raw data read by the C++ `reader`.
- `make_awkward_form`: Called to generate the `awkward` form.

To select the appropriate `factory` for a data member, `uproot-custom` loops over all registered `factory` classes, and calls their `build_factory` method. The first non-`None` return value will be used.
To select the appropriate `factory` for a data member, uproot-custom loops over all registered `factory` classes, and calls their `build_factory` method. The first non-`None` return value will be used.

### Constructor

@@ -247,7 +247,7 @@ It receives following parameters:

- `**kwargs`: Any extra keyword arguments that might be needed.

When the current data member is not suitable for the `factory`, it should return `None`, so that `uproot-custom` will try the next `factory`, until one returns an instance of itself.
When the current data member is not suitable for the `factory`, it should return `None`, so that uproot-custom will try the next `factory`, until one returns an instance of itself.

When the current data member is suitable for the `factory`, it should return an instance of itself, with all necessary parameters passed to the constructor.
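
The "return `None` or an instance of itself" contract can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical and is not the uproot-custom API: the `Factory` base class, the two example factories, the `registered` list, and the `select_factory` helper are invented for illustration, and `build_factory` is reduced to a single `type_name` argument plus `**kwargs`, whereas the real method receives more parameters. An ordered list is assumed so that "first match wins" is well defined.

```python
from typing import List, Optional, Type


class Factory:
    """Hypothetical stand-in for a factory base class (not the real uproot-custom API)."""

    @classmethod
    def build_factory(cls, type_name: str, **kwargs) -> Optional["Factory"]:
        # Returning None signals "this factory cannot handle the data member".
        return None


class Int32Factory(Factory):
    @classmethod
    def build_factory(cls, type_name: str, **kwargs) -> Optional["Factory"]:
        # Accept only the type this factory knows how to read.
        return cls() if type_name == "int" else None


class StringFactory(Factory):
    @classmethod
    def build_factory(cls, type_name: str, **kwargs) -> Optional["Factory"]:
        return cls() if type_name in ("string", "TString") else None


# Assumption: an ordered list, so the first matching factory is well defined.
registered: List[Type[Factory]] = [Int32Factory, StringFactory]


def select_factory(type_name: str, **kwargs) -> Factory:
    """Try each registered factory in turn; the first non-None result is used."""
    for factory_cls in registered:
        instance = factory_cls.build_factory(type_name, **kwargs)
        if instance is not None:
            return instance
    raise ValueError(f"no registered factory can handle {type_name!r}")
```

In this sketch, `select_factory("TString")` skips `Int32Factory` (which returns `None`) and yields a `StringFactory` instance. Raising an error when nothing matches is a choice made for the sketch only.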
