-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add wait param for all snapshot API #1571
Conversation
9973712
to
34a6128
Compare
Ok(true) | ||
} | ||
|
||
pub async fn do_list_full_snapshots( | ||
toc: &TableOfContent, | ||
dispatcher: &Dispatcher, | ||
// toc: &TableOfContent, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left over to remove :)
|
||
archiving.await??; | ||
archiving.await.unwrap().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the unwrap
here and below in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to use ?
firstly, but it will be error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.unwrap().unwrap()
will kill the whole service
src/actix/api/snapshot_api.rs
Outdated
}); | ||
|
||
if wait { | ||
let result = task.await.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment about unwrap
Thank you for the contribution 👍 There are a few things missing but it is looking promising otherwise. When changing the REST API, it is required to also update the openapi schema definition. This schema is used for integration testing and generating clients down the line. The following documentation will guide you through it. At the end, it will be great to add an openapi integration test using the new |
c81149a
to
f23eb18
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would nice to test the usage of wait=false
as well.
Having rock solid test assertions there is not clear to me, maybe:
1 - count number of snapshots on collection
2 - create snapshot async
3 - assert count is the same
4 - wait until count goes up by one
Not ideal as prone to timing issues if the snapshot creating is very fast.
Got other ideas maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following my commit, you could also assert on the status code 202
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to test them locally but met some errors when running ./tests/openapi_integration_test.sh
. My env is MacOS 12.5.1
/usr/local/lib/python3.9/site-packages/requests/adapters.py:519: ConnectionError
=========================== short test summary info ============================
ERROR openapi_integration/test_alias.py::test_cant_create_alias_if_collection_exists
ERROR openapi_integration/test_alias.py::test_cant_create_collection_if_alias_exists
ERROR openapi_integration/test_alias.py::test_alias_operations - requests.exc...
ERROR openapi_integration/test_basic_retrieve_api.py::test_points_retrieve - ...
ERROR openapi_integration/test_basic_retrieve_api.py::test_exclude_payload - ...
ERROR openapi_integration/test_basic_retrieve_api.py::test_is_empty_condition
ERROR openapi_integration/test_basic_retrieve_api.py::test_recommendation - r...
ERROR openapi_integration/test_basic_retrieve_api.py::test_query_nested - req...
ERROR openapi_integration/test_basic_retrieve_api_on_disk.py::test_points_retrieve_on_disk
ERROR openapi_integration/test_basic_retrieve_api_on_disk.py::test_exclude_payload_on_disk
ERROR openapi_integration/test_basic_retrieve_api_on_disk.py::test_is_empty_condition_on_disk
ERROR openapi_integration/test_basic_retrieve_api_on_disk.py::test_recommendation_on_disk
ERROR openapi_integration/test_basic_retrieve_api_on_disk.py::test_query_nested_on_disk
ERROR openapi_integration/test_basic_retrieve_multivec_api.py::test_points_retrieve
ERROR openapi_integration/test_basic_retrieve_multivec_api.py::test_exclude_payload
ERROR openapi_integration/test_basic_retrieve_multivec_api.py::test_is_empty_condition
ERROR openapi_integration/test_basic_retrieve_multivec_api.py::test_recommendation
ERROR openapi_integration/test_basic_retrieve_multivec_api.py::test_query_nested
ERROR openapi_integration/test_collection_update.py::test_collection_update
ERROR openapi_integration/test_count.py::test_exact_count_search - requests.e...
ERROR openapi_integration/test_count.py::test_approx_count_search - requests....
ERROR openapi_integration/test_db_lock.py::test_lock_db_for_writes - requests...
ERROR openapi_integration/test_delete_points.py::test_delete_points - request...
ERROR openapi_integration/test_euclid_dist.py::test_search_with_threshold - r...
ERROR openapi_integration/test_filter.py::test_match_any - requests.exception...
ERROR openapi_integration/test_filter_values_count.py::test_filter_values_count
ERROR openapi_integration/test_fts.py::test_scroll_with_prefix - requests.exc...
ERROR openapi_integration/test_geo_payload_index.py::test_payload_operations
ERROR openapi_integration/test_multicollection_reco.py::test_recommend_with_wrong_vector_size
ERROR openapi_integration/test_multicollection_reco.py::test_recommend_from_another_collection
ERROR openapi_integration/test_nested_payload_indexing.py::test_payload_indexing_operations
ERROR openapi_integration/test_payload_indexing.py::test_payload_indexing_operations
ERROR openapi_integration/test_payload_operations.py::test_payload_operations
ERROR openapi_integration/test_score_threshold.py::test_search_with_threshold
ERROR openapi_integration/test_snapshot.py::test_snapshot_operations - reques...
ERROR openapi_integration/test_uuid_ops.py::test_uuid_operations - requests.e...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error also happened on original dev
branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests are working fine, they just require you to start the server manually beforehand.
cargo run --bin qdrant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, definitely a docker issue.
Not currently on Mac so it is hard to help concretely sorry.
Have you had a look at this doc?
For debugging purpose, you could try to to replace
QDRANT_HOST='localhost:6334'
in ./tests/basic_grpc_test.sh
by
QDRANT_HOST='host.docker.internal:6334'
See if it helps :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to QDRANT_HOST='host.docker.internal:6334
can work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It it works, you could apply the same trick to openapi/tests/openapi_integration/helpers/settings.py
for the time being.
Setting QDRANT_HOST
should work as well.
This way you can make progress on your test 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should open a new ticket for this issue? If so, I am willing to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think an issue is necessary as we have other devs on Mac who have no issues running those tests.
It seems to be specific to your system :)
src/actix/api/snapshot_api.rs
Outdated
if let Err(e) = filename { | ||
return Err(ErrorInternalServerError(e.to_string())); | ||
} | ||
let filename = filename.unwrap().map_err(storage_into_actix_error)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for cleaning the unwraps, there is one left here :)
src/actix/api/snapshot_api.rs
Outdated
let filename = filename.unwrap().map_err(storage_into_actix_error)?; | ||
Ok(NamedFile::open(filename)?) | ||
} else { | ||
Ok(NamedFile::open("not_found")?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the snapshot does not exist, we need a proper 404
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should return 202
here too, because it will not wait the background process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this function can be non waiting
.
The user wants to retrieve the path via a single API call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it makes sense. Not all of API in snapshot should have wait
parm.
Ok(snapshot.await??) | ||
} else { | ||
Ok(SnapshotDescription { | ||
name: "".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unsure about using the snapshot description, empty or with message, to convey the semantic of the operation being in progress.
Ideally we should return a HTTP 202 I believe https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/202
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, return 202 is a better solution. But I am not sure how to return StatusCode here because of the function signature.
@Weijun-H I took the liberty to push to your branch to streamline the changes here. My commit introduces a way to return HTTP 202 accepted when not waiting on the snapshot creation. This takes place at the API level, meaning that we still need to create a dummy WDYT? |
d605ebf
to
a10b8b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the integration tests 👍
I have added a few comments that need to be addressed to move forward.
This is slowly getting closer to completion.
openapi/openapi-snapshots.ytt.yaml
Outdated
result: | ||
$ref: "#/components/schemas/SnapshotDescription" | ||
'202': | ||
description: operation in prohress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: operation in prohress | |
description: operation in progress |
openapi/openapi-snapshots.ytt.yaml
Outdated
result: | ||
$ref: "#/components/schemas/SnapshotDescription" | ||
'202': | ||
description: operation in prohress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: operation in prohress | |
description: operation in progress |
openapi/openapi-snapshots.ytt.yaml
Outdated
@@ -83,6 +83,12 @@ paths: | |||
required: true | |||
schema: | |||
type: string | |||
- name: wait |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this block needs to be removed given that there is no async. behavior possible on GET
) | ||
assert response.ok | ||
# wait for snapshot to be deleted | ||
sleep(5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 seconds seems a bit excessive IMHO
Was it not working with lower values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must set the values to a minimum of 2 in order to pass the tests. If the values are any lower than 2, I will not meet the requirements for the tests.
) | ||
assert response.status_code == 202 | ||
# wait for snapshot to be created | ||
sleep(5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same concern about duration as above
src/actix/api/snapshot_api.rs
Outdated
@@ -171,13 +182,12 @@ async fn recover_from_snapshot( | |||
|
|||
#[get("/collections/{name}/snapshots/{snapshot_name}")] | |||
async fn get_snapshot( | |||
toc: web::Data<TableOfContent>, | |||
dispatcher: web::Data<Dispatcher>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this one could rollback fully to using toc
instead of dispatcher
src/actix/api/snapshot_api.rs
Outdated
} | ||
|
||
#[get("/snapshots/{snapshot_name}")] | ||
async fn get_full_snapshot( | ||
toc: web::Data<TableOfContent>, | ||
dispatcher: web::Data<Dispatcher>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment about rollback to toc
src/common/collections.rs
Outdated
@@ -63,21 +63,49 @@ pub async fn do_list_aliases( | |||
} | |||
|
|||
pub async fn do_list_snapshots( | |||
toc: &TableOfContent, | |||
dispatcher: &Dispatcher, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listing snapshot cannot be performed in an async fashion.
The caller has no mean to get the collection afterwards.
src/common/collections.rs
Outdated
collection_name: &str, | ||
) -> Result<Vec<SnapshotDescription>, StorageError> { | ||
Ok(toc | ||
Ok(dispatcher |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be rollback to toc
as a side effect
) -> impl Responder { | ||
let snapshot_name = path.into_inner(); | ||
let timing = Instant::now(); | ||
let response = do_delete_full_snapshot(toc.get_ref(), &snapshot_name).await; | ||
let wait = params.wait.unwrap_or(true); | ||
let response = do_delete_full_snapshot(dispatcher.get_ref(), &snapshot_name, wait).await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it also possible to return 202 for the delete operations as well?
This would make the API more consistent.
src/common/collections.rs
Outdated
@@ -73,11 +73,23 @@ pub async fn do_list_snapshots( | |||
.await?) | |||
} | |||
|
|||
pub async fn do_create_snapshot( | |||
toc: &TableOfContent, | |||
pub async fn do_create_snapshot<'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unsure why this lifetime was made explicit. Why is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is redundant. I will remove it.
Hi @agourlay, I ran |
@Weijun-H This is indeed an issue with CI, I will fix it 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you for your patience throughout the review process 👍
@Weijun-H A PR impacting the REST APIs has been merged. |
85117c2
to
9a479bf
Compare
It seems that I accidentally overwrote the changes by others. |
i might be able to recover from a local copy |
* feat: add wait parm for all snapshot API * update openAPI * avoid unwrap * remove all unwarp * return HTTP 202 when no waiting on snapshot creation * return 202 when non wait * update open API * update param on python test * update and test api * optimize the test * return 202 when non-wait delete * recover fixes --------- Co-authored-by: Arnaud Gourlay <arnaud.gourlay@gmail.com> Co-authored-by: Andrey Vasnetsov <andrey@vasnetsov.com>
This pr is related to #1532
All Submissions:
New Feature Submissions:
cargo fmt
command prior to submission?cargo clippy
command?Changes to Core Features: