[DONATION] of new datasets #93
Hi, we would welcome these data. Are they all labelled classification problems? If so, the next stage is to get them into our format. If the series are equal length, load the data into memory so that X is an np.ndarray of shape (n_cases, n_channels, n_timepoints). If they are unequal length, make X a list of np.ndarray, each of shape (n_channels, n_timepoints_i), where n_timepoints_i is the length of the ith case. Then you should be able to write them to aeon-compatible format:

    from aeon.datasets import write_to_tsfile
    write_to_tsfile(X, path="your_directory", y=y, problem_name="your_filename.ts")

If there is a provided train/test split, create trainX, trainy, testX, testy:

    from aeon.datasets import write_to_tsfile
    write_to_tsfile(trainX, path="your_directory", y=trainy, problem_name="your_filename_TRAIN.ts")
    write_to_tsfile(testX, path="your_directory", y=testy, problem_name="your_filename_TEST.ts")

You can check it works with this:

    from aeon.datasets import load_from_tsfile
    X, y, meta = load_from_tsfile(full_file_path_and_name="your_directory/your_filename.ts", return_meta_data=True)

If there is no provided train/test split we create one, but you need to be careful: if there are repetitions from the same subject (e.g. one person repeats a HAR task many times), you need to be clear whether or not you are splitting so that train and test do not contain the same person. Any problems, let us know.
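To make the point about repeated subjects concrete, here is a minimal sketch of a subject-wise split using only NumPy. It assumes you have one subject identifier per case; the helper name `subject_wise_split`, the `subjects` array and the toy data are hypothetical and not part of aeon. It is an illustration of the idea, not the procedure used for the archive.

```python
import numpy as np


def subject_wise_split(subjects, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) such that no subject appears in both."""
    subjects = np.asarray(subjects)
    rng = np.random.default_rng(seed)
    # Shuffle the distinct subject ids, then hold out a fraction of them.
    unique_subjects = np.unique(subjects)
    rng.shuffle(unique_subjects)
    n_test = max(1, int(round(test_fraction * len(unique_subjects))))
    test_subjects = unique_subjects[:n_test]
    # Every case belonging to a held-out subject goes to the test set.
    test_mask = np.isin(subjects, test_subjects)
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Toy data (made up): 12 cases, 1 channel, 50 time points, 4 subjects.
X = np.random.default_rng(1).normal(size=(12, 1, 50))
y = np.array(["walk", "run"] * 6)
subjects = np.repeat(["s1", "s2", "s3", "s4"], 3)

train_idx, test_idx = subject_wise_split(subjects, test_fraction=0.25)
trainX, trainy = X[train_idx], y[train_idx]
testX, testy = X[test_idx], y[test_idx]
```

The resulting trainX, trainy, testX and testy can then be passed to write_to_tsfile as in the comment above. Splitting on subject ids rather than on individual cases is what keeps one person's repetitions from leaking across train and test.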
Hello, the data comes with pre-split train, validation and test sets. I have converted them all into What is the next step?
Fantastic, next stage is to get them to us. How big are they? You can email to [email protected] or we can find another way. I will then list them on the site. Is there a text description we could use? And preferably an image? I set up the pages something like this: https://timeseriesclassification.com/description.php?Dataset=AsphaltObstacles Not sure how to handle the validation set; I would be tempted to merge it into train, since it's really part of the training. We will try out our standard suite of classifiers and they can go into the next batch release. Hoping to improve the website this year, will try to get an intern as my web skills are not the best :)
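If the validation set is merged into train as suggested, the merge could look roughly like the sketch below. The helper `merge_splits` and the toy arrays are hypothetical; the sketch assumes the data is already in the shapes described earlier in the thread (a 3D ndarray for equal length, a list of 2D arrays for unequal length).

```python
import numpy as np


def merge_splits(X_a, y_a, X_b, y_b):
    """Concatenate two splits along the case axis."""
    if isinstance(X_a, np.ndarray) and isinstance(X_b, np.ndarray):
        # Equal length: both are (n_cases, n_channels, n_timepoints).
        X = np.concatenate([X_a, X_b], axis=0)
    else:
        # Unequal length: keep a list of (n_channels, n_timepoints_i) arrays.
        X = list(X_a) + list(X_b)
    y = np.concatenate([np.asarray(y_a), np.asarray(y_b)])
    return X, y


# Toy data (made up): 8 training and 4 validation cases, 2 channels, 30 points.
rng = np.random.default_rng(0)
trainX, trainy = rng.normal(size=(8, 2, 30)), np.array(["a", "b"] * 4)
valX, valy = rng.normal(size=(4, 2, 30)), np.array(["a", "b"] * 2)

mergedX, mergedy = merge_splits(trainX, trainy, valX, valy)
```

The merged arrays would then be written out as the _TRAIN.ts file, with the test set left untouched.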
Got the data, thanks, will process it all next week.
If all goes OK, and there is nothing more to do from my side, I would have another set of datasets
Will post here as I do them; if you could check, that would be great. I've changed the names to conform to our standards but hopefully the links make it clear.
and lastly Not putting Gesture in as it's really UWave. Happy to put more in if you have them @PaulRabich
Hi
In the paper "Self-Supervised Contrastive Pre-Training For Time Series via Time-Frequency Consistency", found here https://arxiv.org/abs/2206.08496, they use the following datasets:
All of these datasets are published under the https://creativecommons.org/licenses/by/4.0/ licence.
Would it be possible to add them? And if yes, what are the next steps for uploading them?