10 changes: 10 additions & 0 deletions core/foundation/inc/ROOT/StringUtils.hxx
@@ -43,6 +43,16 @@ std::string Join(const std::string &sep, StringCollection_t &&strings)

std::string Round(double value, double error, unsigned int cutoff = 1, std::string_view delim = "#pm");

inline bool StartsWith(std::string_view string, std::string_view prefix)
{
return string.size() >= prefix.size() && string.substr(0, prefix.size()) == prefix;
}

inline bool EndsWith(std::string_view string, std::string_view suffix)
{
return string.size() >= suffix.size() && string.substr(string.size() - suffix.size(), suffix.size()) == suffix;
}

} // namespace ROOT

#endif
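
As a quick illustration of the two helpers added above, here is a minimal sketch. The functions are copied into a local `sketch` namespace so the snippet compiles without ROOT, and `LooksLikeRemoteRootFile` is a hypothetical helper invented for this example, not part of the PR.

```cpp
#include <string_view>

namespace sketch {

// Local copies of the helpers from the diff above.
inline bool StartsWith(std::string_view string, std::string_view prefix)
{
   return string.size() >= prefix.size() && string.substr(0, prefix.size()) == prefix;
}

inline bool EndsWith(std::string_view string, std::string_view suffix)
{
   return string.size() >= suffix.size() && string.substr(string.size() - suffix.size(), suffix.size()) == suffix;
}

// Hypothetical helper (not in the PR): combines both checks.
inline bool LooksLikeRemoteRootFile(std::string_view path)
{
   return StartsWith(path, "root://") && EndsWith(path, ".root");
}

} // namespace sketch
```

The size check in front of each `substr` call means a prefix or suffix longer than the string simply yields `false`, and an empty prefix or suffix always matches.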
2 changes: 2 additions & 0 deletions io/io/CMakeLists.txt
@@ -55,6 +55,7 @@ ROOT_LINKER_LIBRARY(RIO
src/TStreamerInfoReadBuffer.cxx
src/TStreamerInfoWriteBuffer.cxx
src/TZIPFile.cxx
src/RFile.cxx
$<TARGET_OBJECTS:RootPcmObjs>
LIBRARIES
${CMAKE_DL_LIBS}
@@ -73,6 +74,7 @@ if(uring)
endif()

ROOT_GENERATE_DICTIONARY(G__RIO
ROOT/RFile.hxx
ROOT/RRawFile.hxx
ROOT/RRawFileTFile.hxx
${rawfile_local_headers}
204 changes: 204 additions & 0 deletions io/io/inc/ROOT/RFile.hxx
@@ -0,0 +1,204 @@
/// \file ROOT/RFile.hxx
/// \ingroup Base ROOT7
/// \author Giacomo Parolini <[email protected]>
/// \date 2025-03-19
/// \warning This is part of the ROOT 7 prototype! It will change without notice. It might trigger earthquakes. Feedback
/// is welcome!

#ifndef ROOT7_RFile
#define ROOT7_RFile

#include <ROOT/RError.hxx>

#include <cstddef>
#include <cstdint>
#include <memory>
#include <string_view>
#include <typeinfo>

class TFile;
class TKey;

namespace ROOT {
namespace Experimental {

class RFile;
struct RFileKeyInfo;

namespace Internal {

ROOT::RLogChannel &RFileLog();

} // namespace Internal

/**
\class ROOT::Experimental::RFile
\ingroup RFile
\brief An interface to read from, or write to, a ROOT file, as well as performing other common operations.
## When and why should you use RFile
RFile is a modern and minimalistic interface to ROOT files, both local and remote, that can be used instead of TFile
when the following conditions are met:
- you want a simple interface that makes it easy to do things right and hard to do things wrong;
- you only need basic Put/Get operations and don't need the more advanced TFile/TDirectory functionalities;
- you want more robustness and better error reporting for those operations;
- you want clearer ownership semantics expressed through the type system rather than having objects "automagically"
handled for you via implicit ownership of raw pointers.
RFile doesn't try to cover the entirety of use cases covered by TFile/TDirectory/TDirectoryFile and is not
a 1:1 replacement for them. It is meant to simplify the most common use cases and make them easier to handle by
minimizing the amount of ROOT-specific quirks and conforming to more standard C++ practices.
Comment on lines +39 to +49
Member

I find the tone a bit harsh. Consider:

RFile is a modern and minimalistic interface to ROOT files, both local and remote, that can be used instead of TFile
when you only need basic Put/Get operations and don't need the more advanced TFile/TDirectory functionalities.  It provides:
- a simple interface that makes it easy to do things right and hard to do things wrong,
- more robustness and better error reporting for those operations,
- clearer ownership semantics expressed through the type system.

RFile doesn't cover the entirety of use cases covered by TFile/TDirectory/TDirectoryFile and is not
a 1:1 replacement for them.  It is meant to simplify the most common use cases by following newer standard C++ practices.

## Ownership model
RFile handles ownership via smart pointers, typically std::unique_ptr.
When getting an object from the file (via RFile::Get) you get back a unique copy of the object. Calling `Get` on the
same object twice produces two independent clones of the object. The ownership over that object is solely on the caller
and not shared with the RFile. Therefore, the object will remain valid after closing or destroying the RFile that
generated it. This also means that any modifications made to the object are **not** reflected in the file automatically:
to update the object in the file you need to write it again (via RFile::Overwrite).
RFile::Put and RFile::Overwrite are the way to write objects to the file. Both methods take a const reference to the
object to write and don't change the ownership of the object in any way. Calling Put or Overwrite doesn't guarantee that
the object is immediately written to the underlying storage: to ensure that, you need to call RFile::Flush (or close the
file).
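
The ownership rules above can be sketched without ROOT. The `ToyFile` class below is a hypothetical in-memory stand-in, not the real RFile: it stores plain `int` values and only mimics the documented behaviour that `Get` hands out an independent copy, `Put` refuses to replace an existing object, and `Overwrite` replaces it.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for RFile (assumption: in-memory, ints only).
class ToyFile {
   std::map<std::string, int> fStored;

public:
   // Mimics Put: takes a const reference and refuses to replace an existing entry.
   void Put(const std::string &path, const int &obj)
   {
      if (!fStored.emplace(path, obj).second)
         throw std::runtime_error("object already exists at " + path);
   }

   // Mimics Overwrite: replaces whatever is stored at `path`.
   void Overwrite(const std::string &path, const int &obj) { fStored[path] = obj; }

   // Mimics Get: returns a fresh copy owned by the caller, or nullptr if absent.
   std::unique_ptr<int> Get(const std::string &path) const
   {
      auto it = fStored.find(path);
      if (it == fStored.end())
         return nullptr;
      return std::make_unique<int>(it->second);
   }
};
```

With this toy, two `Get("a")` calls return distinct pointers, changing one copy leaves the stored value untouched until `Overwrite` is called, and the copies stay valid after the `ToyFile` itself is destroyed.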
## Directories
Differently from TFile, the RFile class itself is not also a "directory". In fact, there is no RDirectory class at all.
Member

Suggested change
- Differently from TFile, the RFile class itself is not also a "directory". In fact, there is no RDirectory class at all.
+ Unlike TFile, the RFile class does not function as a directory. Moreover, there is no RDirectory class provided.

Contributor Author

Why do you think this is better than the current version?

Member

@pcanal, Oct 8, 2025

It reads better to me English-wise. "Unlike" is more idiomatic (American) English than "Differently from". The rest of that sentence is rephrased in a functional way rather than indirectly focusing on the C++ semantics of inheritance. The "In fact ... at all" made the second sentence read (too much) as a reinforcement of the first rather than as its own message. Instead of saying 'we also do not plan to offer a standalone RDirectory facility', the "In fact ... at all" can be (mis?)read as 'RFile is not a directory solely because we are not implementing RDirectory [but if we did implement it, RFile would inherit from it]' (the last part is a slight exaggeration to clarify the drift).

Contributor Author

Ok for the "unlike", I agree.
The point of the sentence for me was specifically to focus on the C++ semantic rather than the functionality (see here), as I think it's a relatively big change from both TFile and the older RFile prototype. However, rereading the sentence, maybe the intention was not that clear.
I have a bit of a problem with the "function as a directory" wording as it's not very clear in my opinion.
Perhaps I should just drop the paragraph altogether, I don't think it adds much value to the documentation after all.

Member

It can indeed make sense to drop the first 2 sentences and start at Directories are still ... maybe rephrased now as (for example)

Directories are semantically supported by the `RFile` interfaces as they are a concrete part of the ROOT binary format.
Directories are interacted with solely via the use of filesystem-like string-based paths. 

Contributor Author

See if the updated phrasing works for you :)

Directories are still an existing concept in RFile (since they are a concept in the ROOT binary format),
but they are usually interacted with indirectly, via the use of filesystem-like string-based paths. If you Put an object
Member

but they are usually interacted with indirectly, via the use of filesystem-like string-based paths. If you Put an object

Is the "usually" intentional? It implies that there is (or will be) another way to interact with the directories. Is that the intent?

Contributor Author

Yes, it implies that we "reserve the right" to also provide directory querying functionality at some point (which we know we'll probably need in some form)

Member

@pcanal, Oct 8, 2025

I don't think that 'usually' effectively expresses "reserve the right". On the contrary, it reads as if there are already other ways. A closer wording might be 'for now only interacted with indirectly'.

in an RFile under the path "path/to/object", "object" will be stored under directory "to" which is in turn stored under
directory "path". This hierarchy is encoded in the ROOT file itself and it can provide some optimization and/or
conveniences when querying objects.
For the most part, it is convenient to think about RFile in terms of a key-value storage where string-based paths are
used to refer to arbitrary objects. However, given the hierarchical nature of ROOT files, certain filesystem-like
properties are applied to paths, for ease of use: the '/' character is treated specially as the directory separator;
multiple '/' in a row are collapsed into one (since RFile doesn't allow directories with empty names).
At the moment, RFile doesn't allow getting directories via Get, nor writing ones via Put (this may change in the
future).
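
The path rules described above (treating '/' as the separator and collapsing repeated '/') can be sketched as follows. This is not the PR's `ValidateAndNormalizePath()`, whose implementation lives in RFile.cxx and is not shown here. `CollapseSlashes` and `SplitPath` are hypothetical names, and skipping empty components at the start or end of a path is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: collapses every run of '/' into a single '/'.
std::string CollapseSlashes(std::string_view path)
{
   std::string out;
   out.reserve(path.size());
   for (char c : path) {
      if (c == '/' && !out.empty() && out.back() == '/')
         continue; // skip the repeated separator
      out.push_back(c);
   }
   return out;
}

// Hypothetical helper: splits a path into its components, ignoring empty ones
// (assumption: the real code may treat leading or trailing '/' differently).
std::vector<std::string> SplitPath(std::string_view path)
{
   std::vector<std::string> parts;
   std::size_t begin = 0;
   while (begin <= path.size()) {
      std::size_t end = path.find('/', begin);
      if (end == std::string_view::npos)
         end = path.size();
      if (end > begin)
         parts.emplace_back(path.substr(begin, end - begin));
      begin = end + 1;
   }
   return parts;
}
```

For "path/to/object", `SplitPath` yields the components "path", "to" and "object": the last one names the object and the preceding ones name the nested directories.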
## Sample usage
Opening an RFile (for writing) and writing an object to it:
~~~{.cpp}
auto rfile = ROOT::Experimental::RFile::Recreate("my_file.root");
auto myObj = TH1D("h", "h", 10, 0, 1);
rfile->Put(myObj.GetName(), myObj);
~~~
Opening an RFile (for reading) and reading an object from it:
~~~{.cpp}
auto rfile = ROOT::Experimental::RFile::Open("my_file.root");
auto myObj = rfile->Get<TH1D>("h");
~~~
*/
class RFile final {
enum PutFlags {
kPutAllowOverwrite = 0x1,
kPutOverwriteKeepCycle = 0x2,
Comment on lines +100 to +101
Member

@pcanal, Oct 7, 2025

Did we document these flags?

Contributor Author

They are internal and pretty self-explanatory, but I can add doc comments to them

Member

Actually, I had to look at the code to understand the 2nd one. I first read it as:
"kPut Overwrite[/change] [the default behavior] Keep Cycle" [aka 'probably' not keep the cycle]
while I guess it is meant to be read as
"[The] kPut Overwrite [operation will] Keep [the existing last] Cycle"

The misunderstanding comes from whether "Overwrite" is read, in that context, as a noun or as a verb. The fact that in kPutAllowOverwrite the 2nd word is treated as a verb led my brain to infer that the 2nd word of the other enum was also to be treated as a verb :).

Contributor Author

Fair, I added documentation to them in the followup PR.

};

std::unique_ptr<TFile> fFile;

// Outlined to avoid including TFile.h
explicit RFile(std::unique_ptr<TFile> file);

/// Gets object `path` from the file and returns an **owning** pointer to it.
/// The caller should immediately wrap it into a unique_ptr of the type described by `type`.
[[nodiscard]] void *GetUntyped(std::string_view path, const std::type_info &type) const;

/// Writes `obj` to file, without taking its ownership.
void PutUntyped(std::string_view path, const std::type_info &type, const void *obj, std::uint32_t flags);

/// \see Put
template <typename T>
void PutInternal(std::string_view path, const T &obj, std::uint32_t flags)
{
PutUntyped(path, typeid(T), &obj, flags);
}

/// Given `path`, returns the TKey corresponding to the object at that path (assuming the path is fully split, i.e.
/// "a/b/c" always means "object 'c' inside directory 'b' inside directory 'a'").
/// IMPORTANT: `path` must have been validated/normalized via ValidateAndNormalizePath() (see RFile.cxx).
TKey *GetTKey(std::string_view path) const;
Member

Consider:

Suggested change
- TKey *GetTKey(std::string_view path) const;
+ TKey *GetKey(std::string_view path) const;

Contributor Author

I use TKey in RFile whenever I want to point out that it's specifically a TKey - which should only be used internally and never exposed in the public API.

Member

Since it is internal, it does not really matter that much. I still find it a bit awkward unless we envision the possibility that there would eventually be an RKey.


public:
// This is arbitrary, but it's useful to avoid pathological cases
static constexpr int kMaxPathNesting = 1000;

///// Factory methods /////

/// Opens the file for reading. `path` may be a regular file path or a remote URL.
/// \throw ROOT::RException if the file at `path` could not be opened.
static std::unique_ptr<RFile> Open(std::string_view path);

/// Opens the file for reading/writing, overwriting it if it already exists.
/// \throw ROOT::RException if a file could not be created at `path` (e.g. if the specified
/// directory tree does not exist).
static std::unique_ptr<RFile> Recreate(std::string_view path);

/// Opens the file for updating, creating a new one if it doesn't exist.
/// \throw ROOT::RException if the file at `path` could neither be read nor created
/// (e.g. if the specified directory tree does not exist).
static std::unique_ptr<RFile> Update(std::string_view path);

///// Instance methods /////

// Outlined to avoid including TFile.h
~RFile();

/// Retrieves an object from the file.
/// `path` should be a string such that `IsValidPath(path) == true`, otherwise an exception will be thrown.
/// See \ref ValidateAndNormalizePath() for info about valid path names.
/// If the object is not there, returns a null pointer.
template <typename T>
std::unique_ptr<T> Get(std::string_view path) const
{
void *obj = GetUntyped(path, typeid(T));
return std::unique_ptr<T>(static_cast<T *>(obj));
}

/// Puts an object into the file.
/// The application retains ownership of the object.
/// `path` should be a string such that `IsValidPath(path) == true`, otherwise an exception will be thrown.
/// See \ref ValidateAndNormalizePath() for info about valid path names.
///
/// Throws an RException if `path` already identifies a valid object or directory.
/// Throws an RException if the file was opened in read-only mode.
template <typename T>
void Put(std::string_view path, const T &obj)
{
PutInternal(path, obj, /* flags = */ 0);
}

/// Puts an object into the file, overwriting any previously-existing object at that path.
/// The application retains ownership of the object.
///
/// If an object already exists at that path, it is kept as a backup cycle unless `backupPrevious` is false.
/// Note that even if `backupPrevious` is false, any existing cycle except the latest will be preserved.
///
/// Throws an RException if `path` is already the path of a directory.
/// Throws an RException if the file was opened in read-only mode.
template <typename T>
void Overwrite(std::string_view path, const T &obj, bool backupPrevious = true)
{
std::uint32_t flags = kPutAllowOverwrite;
flags |= backupPrevious * kPutOverwriteKeepCycle;
PutInternal(path, obj, flags);
}

/// Writes all objects and the file structure to disk.
/// Returns the number of bytes written.
size_t Flush();

/// Flushes the RFile if needed and closes it, disallowing any further reading or writing.
void Close();
};

} // namespace Experimental
} // namespace ROOT

#endif
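
The header above routes the templated `Get`/`Put`/`Overwrite` through untyped helpers that take a `std::type_info` and a flags bitmask. The following is a minimal, ROOT-free sketch of that pattern. `ToyStore`, its `Entry` struct and its clone callback are all hypothetical, and throwing on a type mismatch is an assumption of the sketch, since the header does not state what RFile does in that case.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical in-memory store illustrating typed wrappers over untyped helpers.
class ToyStore {
   enum PutFlags : std::uint32_t { kPutAllowOverwrite = 0x1 };

   struct Entry {
      std::type_index fType;          // type the object was stored with
      std::shared_ptr<void> fObj;     // type-erased stored copy
      void *(*fClone)(const void *);  // copies the concrete type
   };

   std::map<std::string, Entry> fEntries;

   template <typename T>
   static Entry MakeEntry(const T &obj)
   {
      return Entry{std::type_index(typeid(T)), std::make_shared<T>(obj),
                   [](const void *p) -> void * { return new T(*static_cast<const T *>(p)); }};
   }

   // Returns an owning raw pointer to a fresh copy, or nullptr if `path` is absent.
   void *GetUntyped(const std::string &path, const std::type_info &type) const
   {
      auto it = fEntries.find(path);
      if (it == fEntries.end())
         return nullptr;
      if (it->second.fType != std::type_index(type))
         throw std::runtime_error("type mismatch at " + path); // assumption of this sketch
      return it->second.fClone(it->second.fObj.get());
   }

   void PutUntyped(const std::string &path, Entry entry, std::uint32_t flags)
   {
      if (fEntries.count(path) != 0 && !(flags & kPutAllowOverwrite))
         throw std::runtime_error("object already exists at " + path);
      fEntries.erase(path);
      fEntries.emplace(path, std::move(entry));
   }

public:
   template <typename T>
   void Put(const std::string &path, const T &obj)
   {
      PutUntyped(path, MakeEntry(obj), /* flags = */ 0);
   }

   template <typename T>
   void Overwrite(const std::string &path, const T &obj)
   {
      PutUntyped(path, MakeEntry(obj), kPutAllowOverwrite);
   }

   // The raw pointer is wrapped into a unique_ptr immediately, as the header recommends.
   template <typename T>
   std::unique_ptr<T> Get(const std::string &path) const
   {
      return std::unique_ptr<T>(static_cast<T *>(GetUntyped(path, typeid(T))));
   }
};
```

Keeping the untyped helpers private and non-template is what lets a header like the one above avoid including the heavy implementation headers, while the thin templates restore type safety at the call site.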