leverage ibis expression for getting readablerelations #2046

Draft
wants to merge 3 commits into
base: devel

Conversation

sh-rp
Collaborator
@sh-rp sh-rp commented Nov 10, 2024

Description

This is a quick and dirty implementation of the idea to use ibis expressions for selecting from our native relations. Basically I use a ReadableRelation as a Wrapper/Proxy around an ibis expression and convert it to sql at the moment of execution, which gives us a very powerful interface for free.

Please read the last test in test_read_interfaces. This test passes against duckdb (which is the hardcoded dialect for this example). It shows how you can limit, join, select, order and aggregate and also chain all of these things. More stuff is also available on the ibis expression side.
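A minimal sketch of the wrapper/proxy idea, for illustration only. It does not use ibis: `StandInExpr` is a hypothetical stand-in for an ibis expression, and `ProxyRelation` and `query()` are made-up names, not the classes from this PR. The point is only that chained calls are forwarded to the wrapped expression and SQL is rendered at the moment of execution.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class StandInExpr:
    """Hypothetical stand-in for an ibis table expression (not the ibis API)."""

    table: str
    columns: Tuple[str, ...] = ("*",)
    row_limit: Optional[int] = None

    def select(self, *columns: str) -> "StandInExpr":
        return replace(self, columns=columns)

    def limit(self, n: int) -> "StandInExpr":
        return replace(self, row_limit=n)

    def to_sql(self) -> str:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.row_limit is not None:
            sql += f" LIMIT {self.row_limit}"
        return sql


class ProxyRelation:
    """Wraps an expression; SQL is only rendered when the relation is executed."""

    def __init__(self, expr: Any) -> None:
        self._expr = expr

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes that ProxyRelation does not define itself.
        attr = getattr(self._expr, name)
        if not callable(attr):
            return attr

        def forward(*args: Any, **kwargs: Any) -> Any:
            result = attr(*args, **kwargs)
            # Re-wrap expression results so that chaining stays inside the proxy.
            if isinstance(result, type(self._expr)):
                return ProxyRelation(result)
            return result

        return forward

    def query(self) -> str:
        # Deferred compilation: nothing is rendered before this call.
        return self._expr.to_sql()


items = ProxyRelation(StandInExpr("items"))
limited = items.select("id", "name").limit(10)
print(limited.query())
```

With a real ibis expression as the wrapped object, `query()` would presumably delegate to ibis' own SQL compilation for the destination dialect instead of the toy `to_sql()` above.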

Maybe an ibis expression will actually give us the schema of the result when executed, so if we find a way to convert our schemas into the ibis schema, including all hints with precision etc., we will be able to discover the result schema and convert that back into a dlt schema. That would be really cool.
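A rough sketch of what the dlt-to-ibis direction of that conversion could look like. The type mapping, the default precision/scale, and the helper names `to_ibis_type` and `to_ibis_schema` are assumptions made for illustration, not something this PR defines. The code only builds a plain dict of column name to ibis type string, which is the kind of mapping `ibis.schema(...)` accepts.

```python
from typing import Any, Dict, Mapping

# Assumed mapping from dlt data types to ibis type strings (illustrative, incomplete).
DLT_TO_IBIS = {
    "text": "string",
    "double": "float64",
    "bool": "boolean",
    "bigint": "int64",
    "timestamp": "timestamp",
    "date": "date",
    "time": "time",
    "binary": "binary",
    "json": "json",
}


def to_ibis_type(column: Mapping[str, Any]) -> str:
    """Translate one dlt-style column hint dict into an ibis type string."""
    data_type = column["data_type"]
    if data_type == "decimal":
        # Carry the precision and scale hints over; the defaults are placeholders.
        precision = column.get("precision", 38)
        scale = column.get("scale", 9)
        return f"decimal({precision}, {scale})"
    return DLT_TO_IBIS[data_type]


def to_ibis_schema(columns: Mapping[str, Mapping[str, Any]]) -> Dict[str, str]:
    """Build a name -> ibis type string mapping from dlt-style column hints."""
    return {name: to_ibis_type(column) for name, column in columns.items()}


columns = {
    "id": {"name": "id", "data_type": "bigint", "nullable": False},
    "price": {"name": "price", "data_type": "decimal", "precision": 10, "scale": 2},
}
ibis_columns = to_ibis_schema(columns)
print(ibis_columns)
```

The reverse direction (ibis result schema back to dlt hints) would need the inverse mapping plus a decision on how to recover hints that ibis does not track.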

Stuff to figure out:

  • Will it work for all of our sql destinations? This depends on whether there is a sqlglot dialect available for them. It looks like dremio is the only one that is not supported.
  • Should we do this and if so, should this be a different relation type than the one we have, or should it replace it?
  • I added a few ToDos and Notes in the example implementation that point out a few things.

netlify bot commented Nov 10, 2024

Deploy Preview for dlt-hub-docs canceled.

Name Link
🔨 Latest commit 390f9a0
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/67350d363944f000099c8136

@sh-rp sh-rp self-assigned this Nov 10, 2024
@sh-rp sh-rp linked an issue Nov 10, 2024 that may be closed by this pull request
@sh-rp sh-rp changed the title [Experiment] Leverage ibis expressions & sqlot do to the query building in our Readable Relations [Experiment] Leverage ibis expressions & sqlglot do to the query building in our Readable Relations Nov 10, 2024
@sh-rp
Collaborator Author
sh-rp commented Nov 11, 2024

Another note to self: we probably need to run column names through the normalizer. Or we assume the user will use normalized names as they are present in the schema when building these expressions.
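To make the first option concrete, here is a small sketch of resolving user-supplied column names against the schema before building an expression. `normalize_identifier` is a simplified snake_case stand-in written for this example, not dlt's actual naming convention, and `resolve_columns` is a hypothetical helper.

```python
import re
from typing import Iterable, List, Set


def normalize_identifier(name: str) -> str:
    """Simplified snake_case normalization (stand-in for the real normalizer)."""
    # Insert "_" at lower-to-upper boundaries, e.g. "unitPrice" -> "unit_Price".
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Collapse any run of non-alphanumeric characters into a single "_".
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)
    return name.strip("_").lower()


def resolve_columns(requested: Iterable[str], schema_columns: Set[str]) -> List[str]:
    """Normalize requested names and check that they exist in the schema."""
    resolved = []
    for name in requested:
        normalized = normalize_identifier(name)
        if normalized not in schema_columns:
            raise KeyError(f"column {name!r} (normalized: {normalized!r}) not in schema")
        resolved.append(normalized)
    return resolved


schema_columns = {"customer_id", "order_date", "unit_price"}
resolved = resolve_columns(["CustomerID", "Order Date"], schema_columns)
print(resolved)
```

The second option (requiring already-normalized names) would skip `normalize_identifier` and keep only the membership check.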

df = items_table.df()
assert len(df.index) == total_records

df = double_items_table.df()
Collaborator Author

regular dlt dataset execution methods (df, arrow, iter_arrow...) work everywhere

Collaborator
@rudolfix rudolfix left a comment

this is really cool! IMO we should keep ibis as optional dependency (it only works with python 3.10+). so we have two options:

  • separate relation
  • enable the proxy behavior if ibis is found; if not, we fall back to the current behavior

i'd probably go for the second option. I'm just a little bit worried about the typing
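The second option could be sketched roughly like this, using only the standard library to detect whether ibis is importable. `module_available` and `pick_relation_mode` are hypothetical helper names and the mode strings are placeholders; a real implementation would return or configure the relation class instead.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_relation_mode(has_ibis: bool) -> str:
    """Choose the proxy behavior when ibis is present, else the current behavior."""
    return "ibis_proxy" if has_ibis else "plain_sql"


mode = pick_relation_mode(module_available("ibis"))
print(mode)
```

Checking with `find_spec` avoids paying the import cost of ibis just to decide which behavior to enable; the typing concern remains, since the proxy's attribute surface would differ depending on what is installed.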

in both cases we should implement a few common expressions we already have in our existing relation (limit, head, column selection).

regarding the schema: column lineage you can do with sqlglot. it makes sense to invest a little bit of time to understand how it is done:

we can add sqlglot as a regular dependency. and use it everywhere we have sql SELECT statements.

@sh-rp sh-rp changed the title [Experiment] Leverage ibis expressions & sqlglot do to the query building in our Readable Relations leverage ibis expression for getting readablerelations Nov 13, 2024
2 participants