[ENH] Series search for similarity search module #2010
…ttps://github.com/aeon-toolkit/aeon into 1900-enh-series-search-for-similarity-search-module
Will update the notebooks tomorrow, and then this will be ready for review! (And fix the failing test while I sleep.)
Switching this back to draft as there is a huge performance issue with STOMP.
Going to approve the module changes in this given the upcoming ECML tutorial, the fact the module is experimental and the next release after this is v1.0.0 with major changes.
We should have a more thorough review prior to v1.0.0 I feel, however.
Couple of comments for other changes though.
I agree that some changes will be needed before 1.0 to make things cleaner; multiple issues are already open for this purpose.
Reference Issues/PRs
Fixes #1900
What does this implement/fix? Explain your changes.
After the query search, this PR introduces the series search, which is linked to the task of computing matrix profiles.
To make this task compatible with query search (for the naive approach), I changed the predict function for query search so that it returns both the indexes of the best matches and the distances to them.
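To illustrate the "indexes and distances" return shape described above, here is a minimal NumPy sketch. It is not the aeon implementation: the function name `query_search`, its parameters, and the choice of squared Euclidean distance are assumptions made for the example only.

```python
import numpy as np


def query_search(series, query, k=1):
    """Hypothetical helper: return (indexes, distances) of the k best matches."""
    series = np.asarray(series, dtype=float)
    query = np.asarray(query, dtype=float)
    # All candidate subsequences with the same length as the query.
    windows = np.lib.stride_tricks.sliding_window_view(series, query.shape[0])
    # Squared Euclidean distance from the query to each candidate.
    distances = ((windows - query) ** 2).sum(axis=1)
    # Stable sort so that ties resolve to the earliest position.
    indexes = np.argsort(distances, kind="stable")[:k]
    return indexes, distances[indexes]


series = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
query = np.array([3.0, 2.0, 1.0])
indexes, distances = query_search(series, query, k=2)
```

Returning the pair lets a caller that loops over many queries keep both the position and the distance of each best match, which is what a matrix profile needs.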
Modified docstrings accordingly.
For now, only two algorithms are implemented, both for the squared and Euclidean distances: the naive version (which loops over query search and is equivalent to STAMP from matrix profile paper 1) and STOMP (from matrix profile paper 2). I'll add other matrix profile methods (e.g. SWAMP with DTW, and more efficient Euclidean ones) in a later PR.
Only minimal functionality was tested given the time frame for ECML; more complete tests are planned for the next PR.
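As a rough sketch of the naive approach (one query search per subsequence), the loop below computes a matrix profile of one series against another. This is an illustration under stated assumptions, not the code in this PR: `naive_series_search` and its arguments are hypothetical names, only squared Euclidean distance is used, and no normalisation or exclusion zone is applied.

```python
import numpy as np


def naive_series_search(series_a, series_b, length):
    """Hypothetical sketch: best match in series_b for each subsequence of series_a."""
    a = np.asarray(series_a, dtype=float)
    b = np.asarray(series_b, dtype=float)
    # Candidate subsequences of series_b, one per row.
    candidates = np.lib.stride_tricks.sliding_window_view(b, length)
    n_queries = a.shape[0] - length + 1
    profile = np.empty(n_queries)
    profile_index = np.empty(n_queries, dtype=np.int64)
    for i in range(n_queries):
        # Each subsequence of series_a is treated as a query.
        query = a[i : i + length]
        distances = ((candidates - query) ** 2).sum(axis=1)
        profile_index[i] = np.argmin(distances)
        profile[i] = distances[profile_index[i]]
    return profile_index, profile


a = np.array([0.0, 1.0, 0.0, 5.0])
b = np.array([5.0, 0.0, 1.0, 0.0])
profile_index, profile = naive_series_search(a, b, length=2)
```

Every query recomputes all distances from scratch here. STOMP avoids much of that work by reusing the dot products computed for the previous subsequence.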
Comments
I'm still not entirely satisfied with the naming of query search / series search, so that's open to discussion.
PR checklist
For all contributions
For new estimators and functions
__maintainer__ at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.
For developers with write access