This repository was archived by the owner on Jul 10, 2025. It is now read-only.
Merged
180 changes: 180 additions & 0 deletions rfcs/20200804-configurable-filesystems.md
# Configurable File Systems

| Status | Proposed |
| :------------ | :-------------------------------------------------------------------------------------------- |
| **RFC #** | [NNN](https://github.com/tensorflow/community/pull/NNN) (update when you have community PR #) |
| **Author(s)** | Sami Kama ([email protected]) |
| **Sponsor** | Mihai Maruseac ([email protected]) |
| **Updated** | 2020-08-04 |

## Objective

The aim of this RFC is to extend the filesystem API so that users can pass configuration parameters that tune the behavior of an implementation to their use cases.

## Motivation

There are many FileSystem implementations in TensorFlow that enable interaction with various storage solutions. Most of these implementations have internal parameters that suit the generic use case but are not necessarily optimal for every case. For example, accessing a remote filesystem through multiple threads can improve throughput when there is a high-bandwidth connection to the remote, so increasing the number of connections might be beneficial. On the other hand, if the connection is slow, a higher number of threads will just waste resources and may even reduce throughput. Depending on the resources available during execution, users should be able to alter some of the parameters of the filesystems to improve performance. This can be especially useful in cases where execution is bound by data I/O.

Member

What's the relation between this and SIG IO's filesystem extensions? @mihaimaruseac @samikama @tensorflow/sig-io-maintainers

Contributor Author

@terrytangyuan Are you talking about plugin based filesystem or something else? Could you please be more specific?

Member

I mean the existing file system extensions in TF IO kernels, e.g. gstpu, oss, azure, etc. For example, we can use Azure like the following (full tutorial):

pathname = 'az://{}/aztest'.format(account_name)
tf.io.gfile.mkdir(pathname)

Contributor Author

Hi @terrytangyuan,

Sorry if the document was not clear. This proposal does not change existing behavior. It extends the existing API so that you can do things like @martinwicke mentioned above.

Member

Got it. Perhaps mentioning the relationship with TF IO would avoid some confusion. Thanks.


## User Benefit

With this proposal, users will be able to fine-tune the parameters that developers expose through the configuration API and get improved performance for file I/O.

## Design Proposal

This proposal introduces two new methods to the plugin API structure `TF_FilesystemOps`, as shown below.

```cpp
struct TF_FilesystemOps {
  // Other members are omitted for brevity.
  void (*const get_filesystem_configuration)(char** serialized_config, int* serialized_length, TF_Status* status);
  void (*const set_filesystem_configuration)(const char* serialized_config, int serialized_length, TF_Status* status);
};
```

where `serialized_config` is a pointer to a buffer containing the serialized, human-readable form of the protobuf object described below, and `serialized_length` is the length of that buffer.

For non-plugin based filesystems, the FileSystem API can be extended similarly.

```cpp
class FileSystem {
 public:
  // Existing methods are not shown.
  Status GetConfiguration(std::unique_ptr<FilesystemConfig>* config);
  Status SetConfiguration(std::unique_ptr<FilesystemConfig> new_configuration);
};
```

Since each filesystem is likely to have a different set of tunable parameters, a `FilesystemConfig` object can be used to unify the API and allow discovery of the existing tunable parameters at runtime. We propose a protobuf object with the following schema

Member

I'd strongly prefer not to add protos to the API. Let's go with a plain struct, which is easier to make ABI safe?

Contributor Author (@samikama, Aug 6, 2020)

Hi Martin, thanks for the first review. The main use case of the protobuf is to handle serialization and deserialization, to allow structured and simple editing from the user side in both C++ and Python, to potentially expose the documentation, and to have a free schema. The ABI boundary will be crossed by a protobuf serialized to prototxt, so it will be human-readable plain text. This ensures that there are no ABI issues or problems due to small differences between the protobuf libraries used in the plugin and in TensorFlow. A plain struct cannot do this easily unless we restrict the data types and arguments and define how the data is serialized and deserialized in an ABI-compatible way. Does this alleviate some of your concerns about protobuf?

Contributor

How about

typedef struct key_val {
  char *key; // null terminated
  int version;
  int type_tag;
  union {
    int64 inv_val;
    double real_val;
    struct {
      char* buf;
      int buf_length;
    } buffer_val;
  } value; 
} key_val;

And then we can send an array of key_val elements.

It seems the option values can be either integer, reals or some buffers (always null terminated?). In that case, we can also have this API:

int64 get_int_option(const char* key);
double get_double_option(const char* key);
const char* get_char_option(const char* key);

void set_int_option(const char* key, int64 val);
void set_double_option(const char* key, double val);
void set_char_option(const char* key, const char* val);

int get_plugin_options(char** keys);

(modulo status codes). The second API proposal has the downside that the user first needs to ask the plugin which options are available and then iterate over the returned array to set them. However, it is much more extensible (both API and ABI compatible by default -- already handled by the filesystem ABI/API compatibility layer). In contrast, the struct-union variant from the beginning of this comment allows getting all the config options at once but requires adding a new version parameter.
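
A rough Python sketch of the per-option API may help show how option discovery and typed access fit together. `OptionStore`, the option names and the `set_int_option` setter name are assumptions made for illustration; the real interface would be C function pointers in the plugin table.

```python
class OptionStore:
    """Hypothetical per-option store with typed accessors."""

    def __init__(self):
        # Assumed example options; a real plugin would define its own.
        self._values = {"ThreadPoolSize": 4, "Timeout": 2.5, "CachePolicy": "LRU"}

    def get_plugin_options(self):
        # Discovery step: the caller asks which keys exist.
        return sorted(self._values)

    def _get(self, key, expected_type):
        value = self._values[key]
        if type(value) is not expected_type:
            raise TypeError(f"{key} is not of type {expected_type.__name__}")
        return value

    def get_int_option(self, key):
        return self._get(key, int)

    def get_double_option(self, key):
        return self._get(key, float)

    def get_char_option(self, key):
        return self._get(key, str)

    def set_int_option(self, key, value):
        self._get(key, int)  # checks that the key exists and holds an int
        self._values[key] = int(value)


store = OptionStore()
keys = store.get_plugin_options()  # discover the options first
store.set_int_option("ThreadPoolSize", 8)
```

Note that a key name alone does not tell the caller which typed getter to use, so a real API would likely also need a way to query each option's type.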

Contributor

Actually, the second API allows setting options per path by adding a path argument to the methods

Contributor

Bools can be passed in via int options, lists can be passed via multiple calls or by adding 3 more API entries

Contributor

We can have the function API proposed above between plugins and core TF (the C layer). Then, inside TF we can use a class/struct to hold them together and present the info to user in a class wrapper too (Python).

The C++/Python classes don't need to be backwards compatible from the point of view of the plugin, so we can change them as needed. But for the plugin interface, we need compatibility (both API and ABI).

The only people seeing the functional API above are plugin implementers and people who contribute to the filesystem layer in core TF.

Member

Using protos forces users to use protobuf, which has been a source of issues (it's not exactly a light dependency). Passing only serialized protos removes the ABI concerns, but using proto for configuration isn't something we love to do (maybe surprisingly, since it's all over our APIs, but that's just why we decided it might not be the best idea).

I agree that a set/get API is a bit of an overhead, but a struct would be fairly straightforward, I believe?

Contributor Author

@mihaimaruseac Yes I agree. Though your get_* api needs extension since having the name of the key doesn't tell you its type. With the struct approach, memory management might require some work but it is doable. I just believe some existing infrastructure like protobuf or alternatively json/yaml/xml serialized format would simplify things and reduce amount of code needed when developing plugins.

Contributor Author (@samikama, Aug 7, 2020)

@martinwicke I just believe using a library and human-readable serialized data would be best. It doesn't have to be protobuf. The only reason protobuf was suggested is that it is already used in TF. I am perfectly happy with other standard formats such as json/yaml/xml. The only advantages of protobuf over these are binary data and type information, which could easily be implemented in other formats as well.

I agree we can also use struct. But we need to address

  • Memory ownership. Since there needs to be a query mechanism for existing values, we will get the structs from plugins and need to manage the memory. There are methods in the existing plugin API, although I believe they are not documented in the plugin RFC. We can use these methods to manage memory and define lifetimes and ownership clearly.

  • Multiple entries. We need to support passing a vector/list of values. It is possible to encode maps and other structures with lists.

  • The documentation. We all know that developers hardly write documentation. Having a key and its value type is usually not sufficient to figure out what it means. We need to have a description field in the struct that documents the key.

  • It would have been nice if everybody didn't have to implement Struct generation/parsing code from scratch but I can't see a way around it right now.

Contributor

We can use the following in the plugin interface

typedef struct TF_Filesystem_Option_Value {
  int type_tag;
  int num_values;
  union {
    int64 inv_val;
    double real_val;
    struct {
      char* buf;
      int buf_length;
    } buffer_val;
  } *values;  // owned
} TF_Filesystem_Option_Value;

typedef struct TF_Filesystem_Option {
  char* name; // null terminated, owned
  char* description; // null terminated, owned
  int per_file;  // bool actually, but bool is not a C type
  TF_Filesystem_Option_Value *value;  // owned
} TF_Filesystem_Option;

This handles both documentation (provided by plugin) and list values (I'm not convinced we need maps here, but it could be possible to extend the union to add them). Plus, since the above is in the filesystem interface, it is used by core TF and we can provide functions in the filesystem interface to manage these structs. This way, plugins only need to know the layout of the structs and read/write to them directly.

Memory allocation will have to come from the plugin's side, using the same routines plugins currently use for the other filesystem operations. It's the only way this can work on Windows, so we cannot choose anything else.

We probably need a new ABI number but adding it to the filesystem registration function would be an ABI breakage. We can handle this by increasing existing ABI numbers from 0 to 1 or we can use the options ABI number only in the function that core TF uses to get filesystem's default options.

Plugins will need to fill these structs in with all the options they support, change them (if plugin allows changing them), and read from them when implementing operations that depend on these options. Core TF only needs to propagate these options up to the C++/Python layers and display them to users.

I would expect that a user trying a new filesystem plugin will do something like

options = tf.io.get_filesystem_options_for_scheme("new_uri_scheme://")
tf.io.display_filesystem_options(options)  # will print current options and the help text too
tf.io.set_option(options, key, value)
...

We have a lot of pointer chasing in this structure but since this is IO I doubt it will result in significant performance degradation.


```proto
message FilesystemAttr {
  message ListValue {
    repeated bytes s = 2;                  // "list(string)"
    repeated int64 i = 3 [packed = true];  // "list(int)"
    repeated float f = 4 [packed = true];  // "list(float)"
    repeated bool b = 5 [packed = true];   // "list(bool)"
  }
  oneof value {
    bytes s = 2;         // "string"
    int64 i = 3;         // "int"
    float f = 4;         // "float"
    bool b = 5;          // "bool"
    ListValue list = 1;  // any "list(...)"
  }
  optional string description = 6;
}

message FilesystemConfig {
  string owner = 1;
  string version = 2;
  map<string, FilesystemAttr> options = 3;
}
```

It is possible to choose `FilesystemConfig` to be another human-readable key-value store format with a similar structure, such as `json` or `yaml`, though this may limit the data types that can be used for configuration.
Filesystems that do not have user-configurable parameters can leave these methods unimplemented. In that case, the default implementations will return `nullptr` wherever applicable. The `FilesystemConfig` object can be exposed to the Python layer for modification at the Python level.
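
As a rough illustration of the JSON alternative, the sketch below round-trips a configuration through a JSON string. The field names (`owner`, `version`, `options`, `description`) mirror the proposed `FilesystemConfig` schema, but the option names and values (`ThreadPoolSize`, `ReadAheadBytes`), the `value` key and the helper functions are hypothetical and not part of TensorFlow.

```python
import json

# Hypothetical configuration mirroring the proposed FilesystemConfig schema.
config = {
    "owner": "remote_fs",  # assumed plugin name
    "version": "1",
    "options": {
        "ThreadPoolSize": {"value": 4, "description": "Number of download threads."},
        "ReadAheadBytes": {"value": 1048576, "description": "Read-ahead buffer size in bytes."},
    },
}


def serialize_config(cfg):
    """Encode the configuration as human-readable JSON text."""
    return json.dumps(cfg, indent=2, sort_keys=True)


def deserialize_config(text):
    """Decode JSON text back into a configuration dictionary."""
    return json.loads(text)


serialized = serialize_config(config)
restored = deserialize_config(serialized)
```

JSON has no native binary type, so a `bytes` option would need an encoding such as base64, which is one instance of the data type limitation mentioned above.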

A typical use pattern would be that the user queries the filesystem implementation for its current configuration. The filesystem returns an object populated with all configurable parameters and their existing or default values, which also serves as a schema. The user creates a copy of the configuration, modifies the desired parameters in the protobuf object and passes it back to the filesystem through a `SetConfiguration()` call. The filesystem then alters its operational parameters if the modifications are within acceptable limits, or returns an error with an appropriate message describing the issue.
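
The query, copy, modify and set cycle described above could look roughly like the following Python sketch. `ToyFileSystem`, its single `ThreadPoolSize` option and the limit of 64 threads are assumptions made for illustration, and exceptions stand in for the `Status` values the real API would return.

```python
import copy


class ToyFileSystem:
    """Hypothetical filesystem with one tunable option, for illustration only."""

    MAX_THREADS = 64  # assumed upper limit

    def __init__(self):
        self._config = {"ThreadPoolSize": 4}

    def get_configuration(self):
        # Return a copy so callers cannot mutate internal state directly.
        return copy.deepcopy(self._config)

    def set_configuration(self, new_config):
        # Reject unknown keys and out-of-range values, leaving state unchanged.
        for key in new_config:
            if key not in self._config:
                raise KeyError(f"unknown option: {key}")
        threads = new_config.get("ThreadPoolSize", self._config["ThreadPoolSize"])
        if not isinstance(threads, int) or not 1 <= threads <= self.MAX_THREADS:
            raise ValueError(f"ThreadPoolSize must be in [1, {self.MAX_THREADS}]")
        self._config.update(new_config)


fs = ToyFileSystem()
config = fs.get_configuration()  # 1. query the current values
config["ThreadPoolSize"] = 8     # 2. modify the copy
fs.set_configuration(config)     # 3. pass it back

try:
    fs.set_configuration({"ThreadPoolSize": 0})  # outside the accepted range
    rejected = False
except ValueError:
    rejected = True
```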

### Alternatives Considered

An alternative to this proposal is to use a side channel such as an environment variable to modify the internal parameters. However, this is cumbersome and error-prone, and may not be possible at all under certain circumstances.
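
To make the error-prone nature of the environment variable approach concrete, here is a small Python sketch. The variable name `REMOTE_FS_THREADS` and the default of 4 are hypothetical; the point is only that malformed values and misspelled names fail silently.

```python
import os


def thread_pool_size_from_env(default=4):
    """Read a hypothetical REMOTE_FS_THREADS variable, falling back on errors."""
    raw = os.environ.get("REMOTE_FS_THREADS")
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        # A malformed value is silently ignored; the user gets no feedback.
        return default


os.environ["REMOTE_FS_THREADS"] = "sixteen"  # malformed on purpose
malformed_result = thread_pool_size_from_env()

os.environ["REMOTE_FS_THREAD"] = "16"  # misspelled name, never read
os.environ["REMOTE_FS_THREADS"] = "16"
valid_result = thread_pool_size_from_env()
```

An explicit configuration call can instead report an error for an unknown key or an invalid value.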

Member

There's another alternative I can imagine: some of these parameters might be useful to set per file (e.g., caching policies). Should this be possible? Or is a global per filesystem switch enough? What are the use cases?

Contributor Author

We leave this to the filesystem implementations. This proposal is to expose such information. If a filesystem can support per-file configuration, it can expose it and the user can then make use of it. For example, for networked file systems I was thinking of exposing file size and thread count thresholds, such that below the file size threshold the number of parallel download requests is different than above the threshold. A specific example could be a single thread if <1MB and 10 threads if >1GB. How does this sound?
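
A minimal Python sketch of such a threshold policy is shown below. The function name and parameters are hypothetical, and the thread count for files between the two thresholds is an assumption, since the comment only gives the two end cases.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def pick_thread_count(file_size, small_threshold=1 * MB, large_threshold=1 * GB,
                      small_threads=1, medium_threads=4, large_threads=10):
    """Choose a download thread count from the file size in bytes."""
    if file_size < small_threshold:
        return small_threads
    if file_size > large_threshold:
        return large_threads
    # The behavior between the thresholds is not specified in the discussion;
    # medium_threads is an assumption made for this sketch.
    return medium_threads
```

With this shape, the thresholds and thread counts are exactly the kind of values a filesystem could expose as configurable options.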

Member

I'm thinking of a user-facing API of something like:

f1 = tf.io.gfile.Open("crazyfs://rw_file", {'cache_policy': 'LRU', 'cache_size': 4096})
f2 = tf.io.gfile.Open("crazyfs://append_only_file", {'cache_policy': 'off'})

How would you implement something like this using only the end-points proposed here?

Contributor Author (@samikama, Aug 6, 2020)

It is possible, but having uniform behavior would be tough unless all filesystems agree on some convention. Assuming that none of these are an issue, one way to implement it would be, in pseudo code:

def Open(fname, mode, conf_dict):
  FS = env.GetFileSystemForFile(fname)
  fsconfig = FS.GetConfiguration()
  newconfig = FilesystemConfig()
  newconfig.CopyFrom(fsconfig)
  # Override only the options that the filesystem already exposes.
  for o, v in conf_dict.items():
    if o in newconfig.options:
      newconfig.options[o] = v
  FS.SetConfiguration(newconfig)
  if mode == read_write:
    return FS.NewWritableFile(fname)
  elif mode == appendable_file:
    return FS.NewAppendOnlyFile(fname)

Didn't check the actual method names and signatures but this could be a way to implement per file arguments if Filesystem implementation supports it.


### Performance Implications

This proposal should help improve persistent storage i/o performance.

### Dependencies

This proposal does not introduce any new dependencies, though plugin-based filesystems may have to link against protobuf (and hide its symbols), or against the respective library if an alternative form for `FilesystemConfig` is chosen.

### Engineering Impact

The engineering impact of this change is negligible. The amount of change needed is proportional to the configurability that developers choose to expose to users.

### Platforms and Environments

This proposal is applicable to all Filesystems on all supported platforms.

### Best Practices

This proposal gives users handles for tuning I/O performance. These can be documented in performance guides, in filesystem implementations or in the `FilesystemAttr.description` field of the configuration object.

### Tutorials and Examples

An example use of the new API could be as follows.

```cpp
Status SetFilesystemThreads(int thread_count) {
  FileSystem* fs = nullptr;
  Status s = Env::Default()->GetFileSystemForFile(
      "remote://some_configurable_remote_filesystem", &fs);
  if (!s.ok()) return s;
  std::unique_ptr<FilesystemConfig> config;
  s = fs->GetConfiguration(&config);
  if (!s.ok()) return s;
  if (!config) return Status::OK();  // No configuration support.
  // Work on a copy so the queried configuration stays untouched.
  auto new_config = std::make_unique<FilesystemConfig>();
  new_config->CopyFrom(*config);
  if (config->options().contains("ThreadPoolSize")) {
    (*new_config->mutable_options())["ThreadPoolSize"].set_i(thread_count);
  }
  // Propagate any validation error reported by the filesystem.
  return fs->SetConfiguration(std::move(new_config));
}
```

### Compatibility

This proposal has no effect on the compatibility of existing code.

### User Impact

This proposal will expose new methods that let users query and modify the operational parameters of filesystems. Users wishing to tune their filesystem access will be able to do so.

## Questions and Discussion Topics

### Comments and Alternatives Raised During the Posting Period

During the posting period, some concerns were raised about passing protobuf across the C/C++ boundary, and alternative approaches were discussed. @mihaimaruseac suggested the following structure for crossing the plugin-framework boundary.

```cpp
typedef struct TF_Filesystem_Option_Value {
  int type_tag;
  int num_values;
  union {
    int64 inv_val;
    double real_val;
    struct {
      char* buf;
      int buf_length;
    } buffer_val;
  } *values;  // owned
} TF_Filesystem_Option_Value;

typedef struct TF_Filesystem_Option {
  char* name;         // null terminated, owned
  char* description;  // null terminated, owned
  int per_file;       // bool actually, but bool is not a C type
  TF_Filesystem_Option_Value* value;  // owned
} TF_Filesystem_Option;
```

On the framework side, these options can be translated to user-friendly C++ and Python data structures, and helper functions for getting and setting options can be provided in the filesystem header file for plugins to use. With this approach, all buffer allocations and deallocations will be done through the allocator functions provided by plugins.
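
One possible shape for the Python-side translation is sketched below. The dataclasses mirror the two C structs above, but the concrete type tag values, the `to_user_dict` helper and the example options are assumptions, since the discussion does not define them.

```python
from dataclasses import dataclass
from typing import List, Union

# Hypothetical type tags; the RFC does not define concrete values.
TYPE_INT, TYPE_REAL, TYPE_BUFFER = 0, 1, 2

_PYTHON_TYPES = {TYPE_INT: int, TYPE_REAL: float, TYPE_BUFFER: bytes}


@dataclass
class OptionValue:
    type_tag: int
    values: List[Union[int, float, bytes]]  # len(values) plays the role of num_values


@dataclass
class Option:
    name: str
    description: str
    per_file: bool
    value: OptionValue


def to_user_dict(options):
    """Translate plugin-style options into a plain {name: value} dictionary."""
    result = {}
    for opt in options:
        expected = _PYTHON_TYPES[opt.value.type_tag]
        if not all(isinstance(v, expected) for v in opt.value.values):
            raise TypeError(f"{opt.name}: values do not match type tag")
        vals = opt.value.values
        # Single-element lists are presented as scalars.
        result[opt.name] = vals[0] if len(vals) == 1 else list(vals)
    return result


options = [
    Option("ThreadPoolSize", "Number of download threads.", False,
           OptionValue(TYPE_INT, [8])),
    Option("RetryDelays", "Back-off delays in seconds.", False,
           OptionValue(TYPE_REAL, [0.5, 1.0, 2.0])),
]
user_view = to_user_dict(options)
```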

If this schema is preferred, the C layer methods become

```cpp
void (*const get_filesystem_configuration)(TF_Filesystem_Option** options, int* num_options, TF_Status* status);
void (*const set_filesystem_configuration)(const TF_Filesystem_Option** options, int num_options, TF_Status* status);
```

Alternatively, the API can be expanded with per-option getters and setters, in which case methods similar to the following would be added to the filesystem API

```cpp
void (*const get_filesystem_configuration_option)(const char* key, TF_Filesystem_Option *option, TF_Status* status);
void (*const set_filesystem_configuration_option)(const TF_Filesystem_Option* option, TF_Status* status);
void (*const get_filesystem_configuration_keys)(char** keys, int *num_keys, TF_Status* status);
```

The first option has the advantage of a smaller API surface for plugin developers to implement, at the expense of more data crossing the framework-plugin boundary. Adding per-option methods to the API can simplify data preparation for boundary crossing for filesystems that have a very large number of configuration options.
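
The trade-off can be illustrated with a toy Python model that offers both styles over the same option store. `ToyPlugin`, its method names and the count of 1000 options are hypothetical; the entry counts only stand in for the amount of data crossing the boundary.

```python
class ToyPlugin:
    """Hypothetical plugin exposing both API styles over the same options."""

    def __init__(self, options):
        self._options = dict(options)

    # Bulk style: two entry points, every option crosses the boundary.
    def get_configuration(self):
        return dict(self._options)

    def set_configuration(self, options):
        self._options = dict(options)

    # Per-option style: three entry points, one option crosses per call.
    def get_keys(self):
        return sorted(self._options)

    def get_option(self, key):
        return self._options[key]

    def set_option(self, key, value):
        if key not in self._options:
            raise KeyError(key)
        self._options[key] = value


plugin = ToyPlugin({f"opt{i}": i for i in range(1000)})

# Changing one value with the bulk API moves all 1000 entries twice.
bulk = plugin.get_configuration()
bulk["opt7"] = 70
plugin.set_configuration(bulk)
entries_moved_bulk = 2 * len(bulk)

# The per-option API moves a single entry.
plugin.set_option("opt8", 80)
entries_moved_single = 1
```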