- Add interface
- Add implementation for local storage: tf.io.browserLocalStorageManager()
- Add implementation for IndexedDB: tf.io.browserIndexedDB()
Each of the two implementations supports:
- Listing all models: their paths and some meta data (ModelArtifactsInfo), such as topology type, byte sizes of topology and weights, etc.
- Deleting a model by path.
- Copying a model by path.
Change the IndexedDB to two tables (object stores) for efficient lookup of ModelArtifactsInfo:
- One for ModelArtifactsInfo (faster access)
- One for the actual artifacts (slower access)
Also, change "KerasJSON" to "JSON"
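For readers who have not opened src/io/types.ts, the metadata described above can be pictured roughly as follows. This is only a sketch under assumptions: the interface names, field names, and the `describeArtifacts` helper are hypothetical and are not taken from the PR; only the idea (topology type plus byte sizes of topology and weights) comes from the description.

```typescript
// Hypothetical shapes for illustration; the real definitions live in src/io/types.ts.
interface SketchArtifacts {
  modelTopology: object;    // a JSON object, or an ArrayBuffer for binary topology
  weightData: ArrayBuffer;  // concatenated weight values
}

interface SketchArtifactsInfo {
  modelTopologyType: 'JSON' | 'Binary';
  modelTopologyBytes: number;
  weightDataBytes: number;
}

// Number of bytes a string occupies when encoded as UTF-8.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4;  // high surrogate: the pair encodes one 4-byte code point
      i++;         // skip the low surrogate
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Derive the lightweight metadata record from the full artifacts.
function describeArtifacts(artifacts: SketchArtifacts): SketchArtifactsInfo {
  const topology = artifacts.modelTopology;
  if (topology instanceof ArrayBuffer) {
    return {
      modelTopologyType: 'Binary',
      modelTopologyBytes: topology.byteLength,
      weightDataBytes: artifacts.weightData.byteLength,
    };
  }
  return {
    modelTopologyType: 'JSON',
    modelTopologyBytes: utf8ByteLength(JSON.stringify(topology)),
    weightDataBytes: artifacts.weightData.byteLength,
  };
}
```

Keeping this record small is what makes a separate, faster metadata table worthwhile: listing models only needs these few fields, not the weights themselves.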
1. Simplify copyModel() code for local storage and IndexedDB by reusing the load() and save() logic.
2. In IndexedDB.save(), roll back incomplete results on failure at the second step.
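The first point can be illustrated with a small sketch. It assumes a synchronous, in-memory stand-in for a storage medium; the real handlers in this PR are asynchronous, and `InMemoryStoreSketch` and `SketchModel` are hypothetical names used only here.

```typescript
interface SketchModel {
  topology: string;
  weights: number[];
}

// Hypothetical in-memory stand-in for one storage medium.
class InMemoryStoreSketch {
  private entries: {[path: string]: string} = {};

  // Serializes the model under `path` and returns the stored size in characters.
  save(path: string, model: SketchModel): number {
    const serialized = JSON.stringify(model);
    this.entries[path] = serialized;
    return serialized.length;
  }

  // Deserializes the model at `path`, or throws if nothing is stored there.
  load(path: string): SketchModel {
    if (!Object.prototype.hasOwnProperty.call(this.entries, path)) {
      throw new Error(`Cannot find model at path '${path}'.`);
    }
    return JSON.parse(this.entries[path]) as SketchModel;
  }

  // Copy is load followed by save, so no serialization logic is duplicated
  // and a missing source fails before anything is written.
  copy(oldPath: string, newPath: string): number {
    return this.save(newPath, this.load(oldPath));
  }
}
```

The trade-off in this arrangement is an extra deserialize/serialize round trip per copy, in exchange for a single code path to maintain.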
Really nice work, Shanqing. A few small comments, and one more on the design side regarding having a registry of ModelStores instead of API-specific methods for each type of ModelStore.

Reviewed 7 of 9 files at r1.

src/io/indexed_db.ts, line 34 at r1 (raw file):
typo:

src/io/io.ts, line 41 at r1 (raw file):
no need to expose these to the API. Instead expose global

src/io/local_storage.ts, line 335 at r1 (raw file):
A great benefit to having an interface ModelStorageManager is to hide the need for having storage-specific API (e.g.

src/io/types.ts, line 223 at r1 (raw file):
nice job on this clean interface

src/io/types.ts, line 253 at r1 (raw file):
just curious, what's the use-case for copying models?
Thanks! Some nits & some thoughts on the API.
src/io/indexed_db.ts
Outdated
// 2. The object store for ModelArtifactsInfo, including meta-information such as
//    the type of topology (JSON vs binary), byte size of the topology, byte
//    size of the weights, etc.
const INFO_STORE_NAME = 'model_info_store';
Apologies for bikeshedding style naming suggestions, but:
MODEL_METADATA_STORE_NAME perhaps?
src/io/indexed_db.ts
Outdated
const getRequest = store.get(this.modelPath);
const modelTx = db.transaction(
    MODEL_STORE_NAME,
    modelArtifacts == null ? 'readonly' : 'readwrite');
modelArtifacts is guaranteed to be null here ...
src/io/indexed_db.ts
Outdated
// First, put ModelArtifactsInfo into info store.
const infoTx = db.transaction(
    INFO_STORE_NAME,
    modelArtifacts == null ? 'readonly' : 'readwrite');
modelArtifacts is guaranteed to not be null here
src/io/indexed_db.ts
Outdated
// Second, put model data into model store.
modelTx = db.transaction(
    MODEL_STORE_NAME,
    modelArtifacts == null ? 'readonly' : 'readwrite');
here also
src/io/indexed_db.ts
Outdated
  resolve({modelArtifactsInfo});
};
putModelRequest.onerror = error => {
  // If the put-model request fails, roll back the info entry as
This logic makes sense, but requires significant mental effort to follow. If it gets more complicated than this I could see it becoming difficult to manage. Is there a way to modularize or pull some of this out for greater readability?
If not, or if it would be even worse, no big deal.
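One possible shape for pulling the rollback out into its own helper is sketched below. It is a simplification under stated assumptions: plain objects stand in for the two object stores, everything is synchronous (real IndexedDB requests are asynchronous and report failure through onerror callbacks), and `saveWithRollback`, `SketchInfo`, and `SketchData` are hypothetical names that do not appear in the PR. Restoring a previous info entry, rather than only deleting, is a choice made for this sketch.

```typescript
type TableSketch<T> = {[key: string]: T};

interface SketchInfo {
  weightBytes: number;
}

interface SketchData {
  weights: number[];
}

// Writes metadata first, then the model data. If the second write fails,
// the metadata table is restored to the state it had before the call.
function saveWithRollback(
    infoTable: TableSketch<SketchInfo>,
    modelTable: TableSketch<SketchData>,
    path: string,
    data: SketchData,
    putModel: (table: TableSketch<SketchData>, key: string, value: SketchData) => void
): SketchInfo {
  const info: SketchInfo = {weightBytes: data.weights.length * 4};
  const hadPrevious = Object.prototype.hasOwnProperty.call(infoTable, path);
  const previous = infoTable[path];

  // Step 1: put the info entry.
  infoTable[path] = info;
  try {
    // Step 2: put the model data; this is the step that may fail.
    putModel(modelTable, path, data);
  } catch (err) {
    // Roll back step 1 so the two tables never disagree.
    if (hadPrevious) {
      infoTable[path] = previous;
    } else {
      delete infoTable[path];
    }
    throw err;
  }
  return info;
}
```

Passing the second write in as a function keeps the rollback logic in one place and makes the failure path easy to exercise in a test.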
src/io/types.ts
Outdated
 * Delete a model specified by `path`.
 *
 * @param path
 * @returns ModelArtifactsInfo of the deleted model (if and only if deletion
Why does deleting return the ModelArtifactsInfo? Is this a common pattern for the deletion of assets in JS? Are we creating a difficult-to-fix problem if some storage medium can sense presence & delete much faster than access the ModelArtifactsInfo?
src/io/types.ts
Outdated
 *
 * @param oldPath
 * @param newPath
 * @returns ModelArtifactsInfo of the copied model (if and only if copying
Same question as above, should these methods return some sort of status object different from the ModelArtifactsInfo themselves?
Review status: 7 of 9 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.

src/io/indexed_db.ts, line 185 at r1 (raw file):
you can one-line this

src/io/indexed_db.ts, line 406 at r1 (raw file):
is the idea here that you're going to expand the API? Why not just have a top-level list method that takes a storage mechanism and lists the model names. Then you can have another "remove" method that takes the string path like

src/io/io.ts, line 41 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
+1, see my comment above
Review status: 7 of 9 files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.

src/io/indexed_db.ts, line 34 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Done.

src/io/indexed_db.ts, line 37 at r1 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
The names here don't really matter that much. They are private to this file. The current name is shorter, so I vote to keep it.

src/io/indexed_db.ts, line 143 at r1 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
Done.

src/io/indexed_db.ts, line 168 at r1 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
Done.

src/io/indexed_db.ts, line 177 at r1 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
Done.

src/io/indexed_db.ts, line 185 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
Done.

src/io/indexed_db.ts, line 188 at r1 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
I agree it's a little unreadable. But this is the price we have to pay for keeping two tables. I don't see a way to combine two transactions into an atomic one in the IndexedDB API.

src/io/indexed_db.ts, line 406 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
I've made changes to adopt the suggestion. See my reply to Daniel's comments.

src/io/io.ts, line 41 at r1 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
Great point. In this revision I removed these exports and added new methods such as:
console.log(await tf.io.listModels());
await tf.io.removeModel('indexeddb://foo');
await tf.io.copyModel('indexeddb://foo', 'localstorage://bar'); // Copying between mediums is supported.
await tf.io.moveModel('indexeddb://foo', 'localstorage://bar'); // Moving between mediums is supported.
Thanks for the great suggestion.

src/io/local_storage.ts, line 335 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
See my reply to your comment above.

src/io/types.ts, line 237 at r1 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
We could also return nothing. Returning a ModelArtifactsInfo to the caller is useful (e.g., for the caller to keep track of how much space has been freed from IndexedDB or Local Storage).

src/io/types.ts, line 248 at r1 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
I think returning ModelArtifactsInfo should be sufficient. The presence of a return value indicates that the copy or move succeeded. (In case of failure, an Error will be thrown.) It is unclear what other pieces of info would be helpful. The source and destination URLs are known to the caller.

src/io/types.ts, line 253 at r1 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
Good question. I discussed with others offline. We agree that both moving (i.e., renaming) and copying models will be useful, within and between storage mediums. The new API now supports both.
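To make the scheme-based dispatch behind these top-level methods concrete, here is a minimal sketch. It is not the code in src/io/model_management.ts: the manager interface, the registry, and every function name below are hypothetical, the stores are synchronous in-memory stand-ins, and moving is modeled simply as a copy followed by removal of the source.

```typescript
// Hypothetical per-medium manager; real managers are asynchronous.
interface SketchManager {
  list(): string[];
  load(path: string): string;
  save(path: string, payload: string): void;
  remove(path: string): void;
}

const SCHEME_SEPARATOR = '://';
const managerRegistry: {[scheme: string]: SketchManager} = {};

// Registers a manager for a URL scheme, refusing duplicates.
function registerManager(scheme: string, manager: SketchManager): void {
  if (Object.prototype.hasOwnProperty.call(managerRegistry, scheme)) {
    throw new Error(`A manager is already registered for scheme '${scheme}'.`);
  }
  managerRegistry[scheme] = manager;
}

// Splits 'indexeddb://foo' into {scheme: 'indexeddb', path: 'foo'}.
function parseModelUrl(url: string): {scheme: string, path: string} {
  const index = url.indexOf(SCHEME_SEPARATOR);
  if (index <= 0) {
    throw new Error(`URL '${url}' does not start with a scheme.`);
  }
  return {scheme: url.slice(0, index), path: url.slice(index + SCHEME_SEPARATOR.length)};
}

function managerFor(scheme: string): SketchManager {
  if (!Object.prototype.hasOwnProperty.call(managerRegistry, scheme)) {
    throw new Error(`No manager registered for scheme '${scheme}'.`);
  }
  return managerRegistry[scheme];
}

function cloneModelSketch(sourceUrl: string, destUrl: string, deleteSource: boolean): void {
  if (sourceUrl === destUrl) {
    throw new Error(`Source and destination are identical: '${sourceUrl}'.`);
  }
  const source = parseModelUrl(sourceUrl);
  const dest = parseModelUrl(destUrl);
  // Resolve both managers up front so a bad scheme fails before any write.
  const sourceManager = managerFor(source.scheme);
  const destManager = managerFor(dest.scheme);
  destManager.save(dest.path, sourceManager.load(source.path));
  // The source is removed only after the save has succeeded.
  if (deleteSource) {
    sourceManager.remove(source.path);
  }
}

function copyModelSketch(sourceUrl: string, destUrl: string): void {
  cloneModelSketch(sourceUrl, destUrl, false);
}

function moveModelSketch(sourceUrl: string, destUrl: string): void {
  cloneModelSketch(sourceUrl, destUrl, true);
}

// In-memory manager used only to exercise the dispatch logic.
function makeMemoryManager(): SketchManager {
  const table: {[path: string]: string} = {};
  const has = (path: string) => Object.prototype.hasOwnProperty.call(table, path);
  return {
    list: () => Object.keys(table),
    load: (path) => {
      if (!has(path)) throw new Error(`Cannot find model at path '${path}'.`);
      return table[path];
    },
    save: (path, payload) => { table[path] = payload; },
    remove: (path) => {
      if (!has(path)) throw new Error(`Cannot find model at path '${path}'.`);
      delete table[path];
    },
  };
}
```

Because callers only ever pass URLs, adding a new storage medium in this arrangement means registering one more manager rather than widening the public API.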
Review status: 3 of 11 files reviewed at latest revision, 8 unresolved discussions.

src/io/model_management.ts, line 65 at r2 (raw file):
protect against multiple managers attempting to use the same scheme?

src/io/model_management_test.ts, line 268 at r2 (raw file):
moveModel with identical src/dest medium has a special code path. Is there a test that this works? I could have missed it.

src/io/model_management_test.ts, line 316 at r2 (raw file):
nit: copying from invalid URL ...

src/io/model_management_test.ts, line 367 at r2 (raw file):
suggestion: test that the source has not been deleted in this case.
Great! Some small comments.
Review status: 3 of 11 files reviewed at latest revision, 12 unresolved discussions, some commit checks pending.

src/io/model_management.ts, line 65 at r2 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
Done.

src/io/model_management_test.ts, line 268 at r2 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
Moving / copying a model within the same medium is covered in the unit tests in local_storage_test.ts and indexed_db_test.ts.

src/io/model_management_test.ts, line 316 at r2 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
Done.

src/io/model_management_test.ts, line 367 at r2 (raw file):
Previously, bileschi (Stanley Bileschi) wrote…
Good suggestion! Done.
Review status: 3 of 11 files reviewed at latest revision, 9 unresolved discussions, all commit checks successful.
tf.io.browserLocalStorageManager()
tf.io.browserIndexedDBManager()
tf.io.listModels();
tf.io.copyModel('indexeddb://foo', 'localstorage://bar');
tf.io.moveModel('indexeddb://foo', 'localstorage://bar');
tf.io.removeModel('indexeddb://foo');
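Each of the two implementations supports:
- Listing all models: their paths and some meta data (ModelArtifactsInfo), such as topology type, byte sizes of topology and weights, etc.
- Deleting a model by path.
- Copying a model by path.
Change the IndexedDB to two tables (object stores) for efficient lookup of ModelArtifactsInfo:
- One for ModelArtifactsInfo (faster access)
- One for the actual artifacts (slower access)
Also, change "KerasJSON" to "JSON"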
Towards: tensorflow/tfjs#13