Add benchmarks for KedroDataCatalog and fix tests for DataCatalog #4246
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Thank you, @ankatiyar!
Added some suggestions on extending tests.
def time_setitem(self):
    """Benchmark the time to set a dataset"""
    for i in range(1, 1001):
        self.catalog[f"dataset_new_{i}"] = CSVDataset(filepath="data.csv")
Can we please also add a benchmark for setting raw data, so that this part of the setter is covered:
kedro/kedro/io/kedro_data_catalog.py, line 137 in 3818a2a:
    else:
Added a separate test for this
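For readers following the thread, here is a rough sketch of what a separate raw-data benchmark could look like next to time_setitem. It does not use Kedro: ToySetterCatalog, ToyDataset, and ToyMemoryDataset are hypothetical stand-ins, and wrapping raw values in an in-memory dataset is an assumption about what the else branch does, since only the `else:` line is quoted above.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset class such as CSVDataset."""

    def __init__(self, filepath):
        self.filepath = filepath


class ToyMemoryDataset:
    """Hypothetical stand-in for an in-memory wrapper around raw data."""

    def __init__(self, data):
        self.data = data


class ToySetterCatalog:
    """Hypothetical catalog whose setter has a dataset branch and a raw-data branch."""

    def __init__(self):
        self._datasets = {}

    def __setitem__(self, key, value):
        if isinstance(value, ToyDataset):
            self._datasets[key] = value
        else:
            # Raw-data branch (assumed behaviour): wrap the value in memory.
            self._datasets[key] = ToyMemoryDataset(data=value)

    def __getitem__(self, key):
        return self._datasets[key]


class TimeToySetitem:
    """Benchmark class in the asv layout: setup() plus time_* methods."""

    def setup(self):
        self.catalog = ToySetterCatalog()

    def time_setitem(self):
        """Exercise the dataset branch of the setter."""
        for i in range(1, 1001):
            self.catalog[f"dataset_new_{i}"] = ToyDataset(filepath="data.csv")

    def time_setitem_raw(self):
        """Exercise the raw-data branch of the setter."""
        for i in range(1, 1001):
            self.catalog[f"param_{i}"] = i
```

Keeping the two cases in separate time_* methods means each setter branch gets its own timing, so a regression in one branch is not averaged away by the other.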
Thank you 🚀
Description
Close #4125
Development notes
KedroDataCatalog
DataCatalog: The tests were failing for DataCatalog.add_feed_dict() and DataCatalog.add_all() because asv runs setup(), then repeats the tests a number of times, and then runs teardown(), but trying to add the same datasets multiple times results in DatasetAlreadyExistsError.
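The failure mode described above can be reproduced without Kedro or asv. The following is a minimal sketch under stated assumptions: ToyCatalog, the local DatasetAlreadyExistsError, and run_like_asv are hypothetical stand-ins that only imitate the "setup, repeat, teardown" pattern, and the "fresh catalog per call" variant is one possible workaround, not necessarily the fix applied in this PR.

```python
class DatasetAlreadyExistsError(Exception):
    """Local stand-in for the Kedro error of the same name."""


class ToyCatalog:
    """Hypothetical minimal catalog that rejects duplicate dataset names."""

    def __init__(self):
        self._datasets = {}

    def add(self, name, dataset):
        if name in self._datasets:
            raise DatasetAlreadyExistsError(f"Dataset '{name}' is already registered")
        self._datasets[name] = dataset


class TimeToyCatalog:
    """Benchmark class in the asv layout: setup() plus time_* methods."""

    def setup(self):
        self.catalog = ToyCatalog()
        self.datasets = {f"dataset_{i}": object() for i in range(1, 1001)}

    def time_add_all_naive(self):
        # Fails on the second call: the names are already in self.catalog.
        for name, dataset in self.datasets.items():
            self.catalog.add(name, dataset)

    def time_add_all_fresh(self):
        # Possible workaround: a new catalog per call keeps repeats independent.
        catalog = ToyCatalog()
        for name, dataset in self.datasets.items():
            catalog.add(name, dataset)


def run_like_asv(bench, method_name, repeats=3):
    """Imitate the pattern described above: one setup(), then repeated calls."""
    bench.setup()
    method = getattr(bench, method_name)
    for _ in range(repeats):
        method()
    return True
```

Calling run_like_asv(TimeToyCatalog(), "time_add_all_naive") raises DatasetAlreadyExistsError on the second repeat, while the "fresh" variant completes. The trade-off is that the fresh variant also times the catalog construction, so generating unique dataset names per call is another option worth weighing.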
https://github.com/kedro-org/kedro/actions/workflows/benchmark-performance.yml
To test locally:
asv run
Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.
If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
RELEASE.md file