columnar, ragged data with a dynamic, run-time defined schema
A TTree-like data organization schema stored within the HDF5 data format.
Table of Contents
dir | description | tests |
---|---|---|
cpp | C++ API for serial read/write | |
python | planned Python API for parallel/serial read | |
HDTree Meta-Format Definition
HDTree is a specific meta-format on top of an HDF5 Group.
The tree has a few HDF5 attributes of its own to help interface.
- __size__: the number of entries in the tree
- __version__: the version of the HDTree meta-format
- __api__: the API that was used to write the HDTree
- __api_version__: the version of that API
Besides these attributes, all child groups of the tree are the branches. Each branch can have an arbitrary number of child branches itself.
Each branch has two attributes:
- __type__: the name of the type of this branch
- __version__: the version of the type
A variable-length container is "flattened" into two sub-branches:
- __size__: a branch storing the successive sizes of the containers
- data: a branch storing the entries in the containers
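The flattening of a variable-length container can be sketched in plain C++ without any HDF5 involved. This is only an in-memory illustration: the `FlatContainer` struct and the `flatten_containers` and `container_at` functions are hypothetical names invented for this sketch and are not part of any HDTree API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical in-memory stand-in for the two sub-branches of a
// variable-length container (not an HDTree type).
struct FlatContainer {
  std::vector<std::size_t> size;  // mirrors the __size__ sub-branch
  std::vector<double> data;       // mirrors the data sub-branch
};

// Flatten one container per entry into the two columns.
FlatContainer flatten_containers(
    const std::vector<std::vector<double>>& entries) {
  FlatContainer flat;
  for (const auto& entry : entries) {
    flat.size.push_back(entry.size());
    flat.data.insert(flat.data.end(), entry.begin(), entry.end());
  }
  return flat;
}

// Rebuild the container of one entry by summing the preceding sizes
// to find where its elements start in the data column.
std::vector<double> container_at(const FlatContainer& flat,
                                 std::size_t i_entry) {
  std::size_t offset{0};
  for (std::size_t i{0}; i < i_entry; ++i) offset += flat.size[i];
  const auto first = flat.data.begin() + static_cast<std::ptrdiff_t>(offset);
  const auto last = first + static_cast<std::ptrdiff_t>(flat.size[i_entry]);
  return std::vector<double>(first, last);
}
```

For three entries holding two, zero, and one values, the size column would hold 2, 0, 1 and the data column would hold the three values back to back. An empty container costs only a zero in the size column.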
Similarly, a variable-length mapping is flattened into three sub-branches:
- __size__: a branch storing the successive sizes of the mappings
- keys: a branch storing the keys in the mappings
- vals: a branch storing the values in the mappings
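The three-column layout for mappings can be sketched the same way, again purely in memory. The `FlatMapping` struct and `flatten_mappings` function are hypothetical names for this illustration, and the choice of `std::string` keys with `int` values is an arbitrary assumption.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for the three sub-branches of a
// variable-length mapping (not an HDTree type).
struct FlatMapping {
  std::vector<std::size_t> size;  // mirrors the __size__ sub-branch
  std::vector<std::string> keys;  // mirrors the keys sub-branch
  std::vector<int> vals;          // mirrors the vals sub-branch
};

// Flatten one mapping per entry; keys and vals stay aligned so the
// n-th key always belongs with the n-th value.
FlatMapping flatten_mappings(
    const std::vector<std::map<std::string, int>>& entries) {
  FlatMapping flat;
  for (const auto& entry : entries) {
    flat.size.push_back(entry.size());
    for (const auto& [key, val] : entry) {
      flat.keys.push_back(key);
      flat.vals.push_back(val);
    }
  }
  return flat;
}
```

The keys and vals columns always have the same length, which is the sum of the size column.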
The recursive process of sub-branching continues until atomic data types (booleans, integers, floats, and strings) are reached. These are the only actual HDF5 DataSets, and they are stored in chunked and compressed one-dimensional DataSets.
Design Principles
This page outlines the qualitative goals of the HDTree meta-format and thus also any API interacting with it. These are not concrete requirements of any API; however, they are helpful to keep in mind when diving into deep development associated with HDTree.
Accessible Along Both Axes
In the "DataFrame" vocabulary (popularized by the R language and the Python package pandas), each table of data has two axes: along the rows and along the columns. HDTree is focused on making sure the data is accessible along both of these axes because both access patterns are useful in different situations.
In HDTree vocabulary, each "branch" is a "column" and each "event" is a "row". While HDTree allows for entries within one "cell" (an intersection between a single row and column) to be an arbitrarily complex data type, this organization is still at the foundation of its development.
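The two access patterns can be illustrated with a small in-memory table. This is a sketch only: the `Table` struct, its two branches, and the helper functions are hypothetical and do not correspond to any HDTree API.

```cpp
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical table with two branches; each member is one column.
struct Table {
  std::vector<int> i_entry;
  std::vector<double> energy;
};

// Column access: one branch across all events.
double column_sum(const std::vector<double>& column) {
  return std::accumulate(column.begin(), column.end(), 0.0);
}

// Row access: every branch for a single event.
std::pair<int, double> row(const Table& table, std::size_t i_event) {
  return {table.i_entry.at(i_event), table.energy.at(i_event)};
}
```

Column access suits whole-dataset summaries such as histograms, while row access suits per-event processing where several branches are needed together.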
In-File Version Control
Not only will HDTree store the version of the meta-format in the objects it writes to HDF5 files, it will also store the version of the API and the version of any user-defined data structures. This allows users to receive the benefits of a flexible schema without losing track of how their schema has evolved.
Concrete Acknowledgement of Data Organization
When defining user data types for serialization, HDTree APIs require acknowledgement of where the data is going in the HDTree in-file structure. This may not be literally required in some dynamic language APIs; however, it is important for the user to be aware of how their data will be organized and how it will end up on disk.
Not Limited to Specific Language
HDTree is the meta-format. While originating in C++, the format of data-on-disk should not be limited to a specific language, since that would hinder evolution of the meta-format and its APIs. This informs development of the meta-format itself by requiring any new features implemented in one language API to have plausible equivalents in other languages.
Coming From ROOT
Page is Work in Progress
Since the HDTree meta-format is directly comparable to (and inspired by) ROOT's TTree class, many users of HDTree are expected to be familiar with the ROOT ecosystem. This page is focused on providing guidance towards HDF5-related tools that allow for interaction with HDTrees similar to what ROOT's ecosystem provides for TTrees.
Graphical Browsing
- TBrowser -> HDFView and/or JupyterLab extension
Plotting Branches
- TTree::Draw -> h5py and matplotlib.hist
- scikit-hep.hist
Serialization of Histogram Objects
- pickle/h5py in Python
- HighFive in C++
Merging HDTrees
- Simple, small example using h5py
- Reference open issue for writing a C++ program
awkward and pandas interface
- Issue #11 is aiming to define an HDTree Python API modeled after uproot's interface for ROOT TTrees
Contributing to HDTree
All contributions are helpful! Anything from correcting a spelling mistake in the documentation, adding a new example, patching bugs, or adding features, to something as big as starting an API in a new language, is highly encouraged. Below, I've collected some notes on these various levels of contribution.
Documentation Updates
If you are writing a more detailed explanation or adding a new example, please git clone the repository and make sure the updated documentation can be built into a website by jekyll and has the format you expect. You can build and view the documentation locally with the help of a container runner like docker to aid in this development.
New Examples
As far as I'm concerned, the more the merrier! If you are writing an example, please be detailed about which API and which version of that API you are using so that future readers can check if anything has changed since the example was written.
Patching Bugs or Adding Features
If you find a bug or think of a new feature to add, please open a GitHub Issue to start the discussion. This allows all collaborators to see what you plan to work on as well as potentially offer some insight on how to get going.
New API
If your favorite language does not have an API represented, feel free to start writing one! A first API does not have to be super powered. Even a simple one only focused on reading without parallelization can be a good start and open the door to other contributors to expand on it.
Again, similar to patching bugs or adding features, please create a GitHub issue to start a discussion and outline a plan for what you want to implement.
As you get closer to a functional API, integration tests will also be requested. So keep in mind that you may need to be able to run one of the other APIs to help make sure your API is reading and/or writing a correct form of the HDTree meta-format.
Building the Docs
Offering documentation edits while using HDTree is incredibly helpful.
The documentation automatically generated from source code is usually more detailed and is produced differently depending on the language, so each "API Reference" is kept in a separate site for its API. Manually written documentation is also separated by API, but it is all written here and processed by mdbook.
Launching Local Version of Docs
After installing mdbook, you can use it to build and serve the doc website locally on your computer while you are writing documentation. It will then automatically refresh the website when it detects that files have changed.
Note: Some of the links in the mdbook point to the reference manuals that are generated differently in order to conform to language-specific standards, so without generating those manuals those links will be broken.
Reference Manuals
The different APIs have different methods of generating reference manuals
from the comments in the code. Besides editing the manual documentation
and having it processed by jekyll, these files are copied onto the gh-pages
branch into a subdirectory so that they are not modified by jekyll but still
hosted at the same website.
C++
The C++ reference is generated using doxygen and the doxygen-awesome theme. The theme is kept in a git submodule, so you will need to make sure the submodules are downloaded for the local version to be the same theme as the online version.
git submodule update --init
After installing doxygen, it is expected to be run from the root of this repository.
doxygen cpp/docs/doxyfile
This produces the HTML documentation in the cpp/docs/html directory.
You can view the HTML files generated by doxygen by opening them
in your favorite browser. For example,
firefox cpp/docs/html/index.html
Structure of the Mono-Repo
In order to ensure uniformity of the HDTree meta-format, the various API implementations are kept within this mono-repo. Each implementation can cater to its language's strengths; nevertheless, the meta-format itself should unite all of the APIs.
For this reason, the different APIs will also be tested to make sure files written by one API can be read by others. Each API has its own subdirectory and it has full control over the organization within that subdirectory to conform to language conventions. So, in general, the structure of the mono-repo's root directory is very simple:
- .github/: GitHub workflows, templates, and other GitHub-related files
- test/: integration tests to make sure files from one API can be read by another
- cpp/: C++ API implementation
- Xlang/: some language X API implementation
- metaformat/: documentation about the meta-format/schema itself, not specific to any language
- docs/: general documentation about the HDTree project
- README.md: GitHub README
- SUMMARY.md: outline of mdBook-based documentation website
- book.toml: configuration file for mdBook-based site
hdtree-cpp
C++ API for the HDTree data organization structure.
hdtree-cpp is a C++17 library with support for
- serial read/write of an HDTree
- schema evolution of user-defined structures stored in branches of the HDTree
Installation
Dependencies
- HDF5
- HighFive
- Boost (for demangling, plans to make optional)
cmake -B build -S . \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=<prefix>
cmake --build build --target test
cmake --build build --target install
Usage
Below are just code snippets; please look into the examples for complete programs that compile and run with HDTree.
Write
examples/save.cxx
auto tree = hdtree::Tree::save("my-file.hdf5","/path/to/tree");
auto& i_entry = tree.branch<int>("i_entry");
for (std::size_t i{0}; i < 5; i++) {
  *i_entry = i;
  // same as
  //i_entry.update(i);
  tree.save();
}
Read and Write
examples/transform.cxx
{ // read and write (separate source/dest)
  auto tree = hdtree::Tree::transform("one.hdf5","/tree1","two.hdf5","/tree2");
  // read this branch
  auto& i_entry = tree.get<int>("i_entry");
  // write this branch
  auto& my_cool_new_var = tree.branch<double>("coolio");
  for (std::size_t i{0}; i < tree.entries(); i++) {
    tree.load();
    // dereference the branch handle to access the loaded value
    *my_cool_new_var = *i_entry * 4.2;
    tree.save();
  }
}
{ // read and write (same source/dest)
  auto tree = hdtree::Tree::inplace("one.hdf5","/tree1");
  // read this branch
  auto& i_entry = tree.get<int>("i_entry");
  // write this branch
  auto& my_cool_new_var = tree.branch<double>("coolio");
  for (std::size_t i{0}; i < tree.entries(); i++) {
    tree.load();
    // dereference the branch handle to access the loaded value
    *my_cool_new_var = *i_entry * 4.2;
    tree.save();
  }
}
Read
examples/load.cxx
{ // read
  auto tree = hdtree::Tree::load("my-file.hdf5","/path/to/tree");
  // & required
  auto& i_entry = tree.get<int>("i_entry");
  for (std::size_t i{0}; i < tree.entries(); i++) {
    tree.load();
    assert(i == *i_entry);
  }
}
Benefits
- resets value in a Branch to an empty state after each save (i.e. treat the Branch reference as a for-loop-local variable)
- hdtree juggles the memory addresses
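The reset-after-save behavior can be pictured with a toy class. This is a hedged sketch and not the hdtree::Branch implementation: `ToyBranch` and its members are hypothetical, and it only records entry sizes in memory instead of writing to a file.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a branch handle holding a std::vector<double>.
// NOT the hdtree::Branch implementation, only an illustration.
class ToyBranch {
  std::vector<double> value_;             // value for the current entry
  std::vector<std::size_t> saved_sizes_;  // stands in for what goes to disk
 public:
  // pointer-like access to the current value
  std::vector<double>* operator->() { return &value_; }
  std::vector<double>& operator*() { return value_; }
  // record the current entry, then reset the value to an empty state
  void save() {
    saved_sizes_.push_back(value_.size());
    value_.clear();
  }
  const std::vector<std::size_t>& saved_sizes() const { return saved_sizes_; }
};
```

Because `save` clears the value, a loop body can keep calling `push_back` on the handle without emptying it first, much as it would with a variable declared inside the loop.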
Table of Contents
- include: the headers for the HDTree C++ API
- src: source files compiled into the hdtree-cpp library
- test: source files for testing hdtree-cpp
- examples: simple, example programs showing hdtree-cpp's various abilities
- compilation and running of examples are always included in the build so that developers of hdtree-cpp can keep them up-to-date
Getting Started with HDTree C++
Installing the HDTree C++ API
CMake and a C++17-compatible compiler are required. Both are readily available via your system's software repositories.
- On Ubuntu derivatives:
sudo apt update && sudo apt install cmake gcc g++
The HDF5 library exists in many Unix repositories, so look there to install it. As always, you can fall back to building the latest release from source.
- On Ubuntu derivatives:
sudo apt update && sudo apt install libhdf5-dev
- On macOS:
brew install hdf5
The HighFive C++ Wrapper is used by HDTree so it is also required.
- Download a HighFive release
wget https://github.com/BlueBrain/HighFive/archive/refs/tags/v2.7.0.tar.gz
- Unpack the source
tar xzf v2.7.0.tar.gz
cd HighFive-2.7.0
- Configure the build. Use CMAKE_INSTALL_PREFIX if you wish to install HighFive somewhere besides /usr/local.
cmake -DHIGHFIVE_EXAMPLES=OFF -DHIGHFIVE_UNIT_TESTS=OFF -B build -S .
- Install the interface. May require administrative (sudo) privileges if installing to /usr/local.
cmake --build build --target install
Building HDTree is similar to HighFive but a separate compilation step will be helpful since, unlike HighFive, HDTree is not a header-only C++ library.
- Download a release.
wget https://github.com/tomeichlersmith/hdtree/archive/refs/tags/cpp/v0.4.5.tar.gz
- Unpack the source
tar xzf v0.4.5.tar.gz
cd hdtree-v0.4.5/cpp
- Configure the build (again, use CMAKE_INSTALL_PREFIX if you wish to change the install location).
cmake -B build -S .
- Build the library.
cd build
make
- Install
make install
First Steps
There are four ways to access an HDTree with the C++ API. They are mainly separated by different stages of processing the data. We start with save since you will first need to write an HDF5 file with an HDTree in it in order to be able to go further.
The code below is copied in from the examples directory within the C++ API source. This means the code "snippets" are pretty long, but I've tried to include explanatory comments within them. The examples are compiled alongside the HDTree C++ API so you can try it out immediately after building it.
write-only (save)
First, we are just going to write some example data to a file. This shows an example of a write-only process. After compilation, run by providing a name for the file and the tree.
hdtree-eg-save my-first-hdtree.h5 the-tree
/**
* @file save.cxx
* Example of saving a new HDTree into a file
*/
// for printing error messages
#include <iostream>
// for generating random data
#include <random>
// for interacting with HDTrees
#include "hdtree/Tree.h"
// utility functions for example programs
#include "examples.h"
int main(int argc, char** argv) try {
/**
* parse command line for arguments
*/
std::string filename, treename;
int rc = hdtree::examples::parse_single_file_args(argc, argv, filename, treename);
if (rc != 0) return rc;
/**
* Create a tree by defining what file it is in
* and where it resides within that file
*/
auto tree = hdtree::Tree::save(filename, treename);
/**
* Create branches to define what type of information will
* go into the HDTree. The hdtree::Tree::branch function
* returns a handle to the created hdtree::Branch object.
* This object can (and should) be used to interface with
* the values that will be stored in the HDTree on disk
* in order to reduce the number of in-memory copies that
* need to happen. Here, we use `auto&` to avoid typing
* out all the C++ template nonsense that hdtree::Branch
* does under-the-hood.
*
* Each branch handle can be treated as a pointer
* to the underlying type.
*
* **Note**: Branch handles are invalid after the tree they
* were created from is deleted.
*/
auto& i_entry = tree.branch<std::size_t>("i_entry");
auto& rand_nums = tree.branch<std::vector<double>>("rand_nums");
/**
* Initialization of random number generation.
* Not really applicable to HDTree, just used here to
* show that varying length vectors can be serialized
* with ease
*/
std::mt19937 rng; // no argument -> default seed
std::uniform_real_distribution<double> norm(0., 1.);
std::uniform_int_distribution<std::size_t> uniform(1, 100);
/**
* Actual update and filling of the HDTree.
*
* You can see here how we can treat `i_entry`
* as if it was a properly initialized `std::size_t *`
* and `rand_nums` as if it was a properly
* initialized `std::vector<double> *`.
*/
for (std::size_t i{0}; i < 100; ++i) {
*i_entry = i;
std::size_t size = uniform(rng);
for (std::size_t j{0}; j < size; j++) {
rand_nums->push_back(norm(rng));
}
/**
* We choose to save each value of the loop into the tree.
*/
tree.save();
}
/**
* The final flushing of the data to disk as well as handle
* cleanup procedures will all be handled automatically by
* destruction.
*/
return 0;
} catch (const hdtree::HDTreeException& e) {
std::cerr << "ERROR " << e << std::endl;
return 1;
}
read and write (transform or inplace)
Another common task is to perform calculations on some input data and save those calculations into the tree as well. This does not answer the question of what should be done with the original data. Should we (a) copy the original data and write it to a new file with the new data or (b) write the new data into the input file alongside the original data? In the HDTree C++ API, option (a) is achieved with transform and option (b) is done with inplace. Both can be run from the same executable, and the choice is made depending on whether you give a new file and tree name or not.
# this will use hdtree::Tree::transform
hdtree-eg-transform my-first-hdtree.h5 the-tree my-second-hdtree.h5 the-second-tree
# this will use hdtree::Tree::inplace
hdtree-eg-transform my-first-hdtree.h5 the-tree
/**
* @file transform.cxx
* Example of transforming an HDTree by adding more branches
*
* This example determines whether a tree should be copied into
* a new file or simply transformed in its current file by what
* arguments are provided to the program. We assume the input
* tree was generated by the hdtree-eg-save example program
* defined in @ref save.cxx (i.e. we look for specific branches).
*/
// for printing error messages
#include <iostream>
// for std::reduce
#include <numeric>
// for interacting with HDTrees
#include "hdtree/Tree.h"
// utility functions for example programs
#include "examples.h"
int main(int argc, char** argv) try {
/**
* parse command line for arguments
*/
std::pair<std::string,std::string> src, dest;
int rc = hdtree::examples::parse_two_file_args(argc, argv, src, dest);
if (rc != 0) return rc;
/**
* Wrap an existing on-disk HDTree
*
* Here is where we make the decision on whether to copy a tree
* into a new file or not. We choose to copy the tree into
* a new file if a destination file and tree are provided on
* the command line. We use the slightly-ugly ternary operator
* in order to avoid unnecessary copying from an if-else tree.
*/
auto tree = dest.first.empty() ?
hdtree::Tree::inplace(src.first, src.second) :
hdtree::Tree::transform(src, dest);
/**
* We are going to calculate the average of the random
* numbers within each tree entry, so we create a new
* branch to store that result as well as retrieve
* the branch with the numbers we will use.
*/
auto& rand_nums = tree.get<std::vector<double>>("rand_nums");
auto& avg = tree.branch<double>("avg");
/**
* Actual update and filling of the HDTree.
*
* We use a tree helper that will make sure we go through
* each entry in the tree, calling the hdtree::Tree::load
* at the beginning and hdtree::Tree::save at the end of
* each run in the loop. This code is essentially equivalent to
* ```cpp
* for (std::size_t i{0}; i < tree.entries(); ++i) {
* tree.load();
* // the code inside the lambda function below
* if (rand_nums->size() > 0) {
* *avg = (std::reduce(rand_nums->begin(), rand_nums->end()))/rand_nums->size();
* } else {
* *avg = -1;
* }
* //
* tree.save();
* }
* ```
* Just using this example to show off some potentially-helpful
* features - if lambda functions are causing you difficulty,
* feel free to avoid them. Just make sure to remember to call
* the load and save functions!
*/
tree.for_each([&]() {
if (rand_nums->size() > 0) {
*avg = (std::reduce(rand_nums->begin(), rand_nums->end()))/rand_nums->size();
} else {
*avg = -1;
}
});
/**
* The final flushing of the data to disk as well as handle
* cleanup procedures will all be handled automatically by
* destruction.
*/
return 0;
} catch (const hdtree::HDTreeException& e) {
std::cerr << "ERROR " << e << std::endl;
return 1;
}
read-only (load)
Finally, the last common task is reading in the data from the tree and using it to do some other task (e.g. making a plot or fitting the data with some model). In this API, that is called loading and the example program included prints a simple histogram of the averages of the original data generated earlier.
Fun Fact: This is an example of the central limit theorem!
# this will error-out if you didn't run step two!
hdtree-eg-load my-first-hdtree.h5 the-tree
# the below is example output, it may change since the random data may change!
0.X | Num Entries
< 0 |
0.0 |
0.1 |*
0.2 |
0.3 |***
0.4 |********************************************
0.5 |*************************************************
0.6 |**
0.7 |
0.8 |*
0.9 |
> 1 |
/**
* @file load.cxx
* Example of loading an HDTree and reading its branches
*
* This example prints a simple histogram of the averages of
* the random data in each entry. We assume the input tree was
* generated by the hdtree-eg-save example program and then
* updated by the hdtree-eg-transform example program
* (i.e. we look for specific branches).
*/
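// for floor
#include <cmath>
// for printf
#include <cstdio>
// for printing messages
#include <iostream>
// for interacting with HDTrees
#include "hdtree/Tree.h"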
// utility functions for example programs
#include "examples.h"
int main(int argc, char** argv) try {
/**
* parse command line for arguments
*/
std::string file_name, tree_name;
int rc = hdtree::examples::parse_single_file_args(argc, argv, file_name, tree_name);
if (rc != 0) return rc;
/**
* Wrap an existing on-disk HDTree
*/
auto tree = hdtree::Tree::load(file_name, tree_name);
std::cout << "This is what a missing branch exception looks like:" << std::endl;
try {
tree.get<double>("dne");
} catch (const hdtree::HDTreeException& e) {
// demonstrate what exceptions look like.
std::cout << e << std::endl;
}
std::cout << "--- end of example exception ---" << std::endl;
/**
* We want to study the average of the random data
* in each entry. This average was calculated in
* the examples/transform.cxx program so this part
* will fail if running on a file that wasn't updated
* by transform!
*/
const auto& avg = tree.get<double>("avg");
/**
* Our very simple histogram is going to be 10 bins with
* an underflow bin (everything below 0) and an overflow bin
* (everything above 1).
*
* Since the random data is between 0 and 1, we can calculate
* the bin index very quickly
*
* floor(avg * 10)+1
*
* We will include the value of exactly 1 in the last bin
* and have a special bin for the entries without any data
* from which to calculate an average.
*/
std::vector<unsigned int> hist_bins(12, 0);
/**
* Actual loop over the tree.
*
* We use a tree helper that will make sure we go through
* each entry in the tree, calling the hdtree::Tree::load
* at the beginning of each run in the loop.
* This code is essentially equivalent to
* ```cpp
* for (std::size_t i{0}; i < tree.entries(); ++i) {
* tree.load();
* // the code in the lambda function below
* }
* ```
* Just using this example to show off some potentially-helpful
* features - if lambda functions are causing you difficulty,
* feel free to avoid them. Just make sure to remember to call
* the load function!
*/
tree.for_each([&]() {
std::size_t i_bin{0};
if (*avg < 0) {
i_bin = 0;
} else if (*avg > 1) {
i_bin = 11;
} else {
i_bin = floor(*avg * 10) + 1;
}
++hist_bins[i_bin];
});
printf("0.X | Num Entries\n");
for (std::size_t i_bin{0}; i_bin < 12; ++i_bin) {
std::string x;
if (i_bin == 0) {
x = "< 0";
} else if (i_bin == 11) {
x = "> 1";
} else {
x = "0."+std::to_string(i_bin-1);
}
printf("%s |", x.c_str());
for (std::size_t c{0}; c < hist_bins.at(i_bin); ++c) printf("*");
printf("\n");
}
/**
* The final flushing of the data to disk as well as handle
* cleanup procedures will all be handled automatically by
* destruction.
*/
return 0;
} catch (const hdtree::HDTreeException& e) {
std::cerr << "ERROR " << e << std::endl;
return 1;
}
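The binning rule used by the load example can be isolated into a small function for experimentation. This sketch mirrors the logic of the lambda in the example; the `bin_index` name is a hypothetical helper and not part of the HDTree examples.

```cpp
#include <cmath>
#include <cstddef>

// Map an average onto one of 12 bins: index 0 is the underflow bin,
// indices 1 through 10 cover [0, 1) in steps of 0.1, and index 11 is
// the overflow bin.
std::size_t bin_index(double avg) {
  if (avg < 0) return 0;
  if (avg > 1) return 11;
  // a value of exactly 1 evaluates to floor(10) + 1 = 11,
  // so it lands in the last bin
  return static_cast<std::size_t>(std::floor(avg * 10)) + 1;
}
```

The sentinel value of -1, which the transform example stores for entries with no data, falls into index 0.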
User-Defined Data Structures
User-defined objects can also be serialized within HDTree. Simplified schema evolution (a la ROOT's ClassDef macro) is also available; however, this example merely shows the required boilerplate.
HDTree's C++ API has chosen to avoid automatically deducing the on-disk naming from the in-memory class member names. This introduces more boilerplate, but, in my opinion, is helpful for essentially documenting how on-disk data was generated.
/**
* @file user_classes.cxx
* Example of saving and loading user-defined C++ classes
*/
// for sqrt
#include <cmath>
// for printing error messages
#include <iostream>
// for generating random data
#include <random>
// for interacting with HDTrees
#include "hdtree/Tree.h"
// utility functions for example programs
#include "examples.h"
/**
* Example user class
*/
class MyData {
float x_, y_, z_;
// grant hdtree access so we can keep the `attach` method private
friend class hdtree::access;
// this is where the name of data on disk is assigned to the
// variable name of data in memory
template <typename Branch>
void attach(Branch& b) {
b.attach("x", x_);
b.attach("y", y_);
b.attach("z", z_);
}
public:
MyData() = default;
MyData(float x, float y, float z)
: x_{x}, y_{y}, z_{z} {}
// HDTree also requires classes to have a `clear` method
// for resetting the instance to a "non-assigned" state
void clear() {
x_ = 0.;
y_ = 0.;
z_ = 0.;
}
// helper function since we know what this data means
float mag() const {
return sqrt(x_*x_+y_*y_+z_*z_);
}
};
int main(int argc, char** argv) try {
/**
* parse command line for arguments
*/
std::string filename, treename;
int rc = hdtree::examples::parse_single_file_args(argc, argv, filename, treename);
if (rc != 0) return rc;
{ // write a simple file with some random data points
auto tree = hdtree::Tree::save(filename, treename);
/**
* Once the MyData::attach method is written, MyData can be put
* into STL containers (or be a member of other user classes)
* like any other serializable class
*/
auto& my_data = tree.branch<std::vector<MyData>>("my_data");
// initialization of random number generator
std::mt19937 rng; // no argument -> default seed
std::uniform_real_distribution<double> norm(0., 1.);
std::uniform_int_distribution<std::size_t> uniform(1, 100);
for (std::size_t i{0}; i < 100; ++i) {
std::size_t size = uniform(rng);
for (std::size_t j{0}; j < size; ++j) {
my_data->emplace_back(norm(rng), norm(rng), norm(rng));
}
tree.save();
}
// final flushing accomplished when tree and its branches
// go out of scope and are destructed
}
{ // load back from same file and write the average mag as a new branch
auto tree = hdtree::Tree::inplace(filename, treename);
auto& my_data = tree.get<std::vector<MyData>>("my_data");
auto& avg_mag = tree.branch<float>("avg_mag");
tree.for_each([&]() {
if (my_data->size() > 0) {
float tot_mag = 0.;
for (const MyData& d : *my_data) {
tot_mag += d.mag();
}
*avg_mag = tot_mag/my_data->size();
} else {
*avg_mag = -1;
}
});
// final flushing accomplished when tree and its branches
// go out of scope and are destructed
}
return 0;
} catch (const hdtree::HDTreeException& e) {
std::cerr << "ERROR " << e << std::endl;
return 1;
}
More Intense Use Case
The C++ HDTree API is mainly implemented through its various Branch classes. The Tree class is mainly there to be a helpful interface for handling a set of Branches.
I point this out because if you are interested in building a larger data processing framework around the C++ HDTree API, I would suggest focusing on writing your own version of Tree to accommodate your needs rather than attempting to use the Tree that is a part of this repository.
Performance
Since ROOT is written in C++, using the C++ API for HDTree is the closest to an apples-to-apples comparison we can have between the two formats.
This page details a comparison between the two attempting to isolate the serialization performance of the two libraries.
Writing
Reading
Generating hdtree-cpp docs
The hdtree-cpp documentation is generated with Doxygen using the fancy doxygen-awesome theme. In order to obtain the same styling as the online documentation, you must make sure the doxygen-awesome submodule is downloaded. You can do this with
git submodule update --init
You can generate a local copy of the documentation after installing doxygen and sphinx. We assume that doxygen is run from the root directory of the fire repository.
doxygen docs/doxyfile
The online documentation includes hyperlinks that jump between the C++ documentation generated by doxygen and the Python documentation generated by sphinx. These hyperlinks refer to the root directory of the destination github site and so they will not function when building the documentation locally.
Diagrams
Specialized diagrams were created with diagrams.net and then exported to an SVG file for inclusion in the generated HTML. Files ending in .drawio are versions of these diagrams that can be loaded by diagrams.net in order to continue with a current version of the diagram. Files of the same name but ending in .svg are the images actually included in the docs.