graph, store: Avoid using to_jsonb when looking up a single entity #5372

Merged
merged 7 commits into from
Oct 25, 2024

Conversation

lutter
Collaborator

@lutter lutter commented Apr 23, 2024

All of our queries ultimately get their data by running something like `select to_jsonb(c.*) from ( ... complicated query ... ) c`. When these queries were written, it was far from obvious how to generate queries with Diesel that select columns whose number and types aren't known at compile time.

The call to `to_jsonb` forces Postgres to encode all data as JSON, which graph-node then has to deserialize. That is wasteful in terms of both memory and CPU.

This commit lays the groundwork for getting rid of these JSON conversions and querying data in a more compact, native form with fewer conversions. It only uses the new machinery in the fairly simple case of `Layout.find`, but future changes will expand that use.
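
To make the idea of "columns whose number and types aren't known at compile time" concrete, here is a minimal, dependency-free sketch. It is not the code in this PR and does not use Diesel: `ColumnType`, `Value`, `decode_field`, and `decode_row` are hypothetical names, and plain text fields stand in for whatever Postgres actually sends over the wire. The point is only that each column is decoded directly into a typed value based on a runtime description of the table, rather than going through one JSON document per row.

```rust
/// Column types that are only known at runtime (hypothetical, for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnType {
    Int,
    Text,
    Bool,
}

/// A dynamically typed value read from a single column.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
    Bool(bool),
    Null,
}

/// Decode one raw field according to the column type discovered at runtime.
/// `None` models a SQL NULL.
fn decode_field(ty: ColumnType, raw: Option<&str>) -> Result<Value, String> {
    let raw = match raw {
        Some(raw) => raw,
        None => return Ok(Value::Null),
    };
    match ty {
        ColumnType::Int => raw
            .parse::<i64>()
            .map(Value::Int)
            .map_err(|e| format!("invalid int {:?}: {}", raw, e)),
        ColumnType::Text => Ok(Value::Text(raw.to_string())),
        ColumnType::Bool => match raw {
            "t" | "true" => Ok(Value::Bool(true)),
            "f" | "false" => Ok(Value::Bool(false)),
            other => Err(format!("invalid bool {:?}", other)),
        },
    }
}

/// Decode a whole row. The number of columns is checked at runtime because
/// it is not known when this code is compiled.
fn decode_row(types: &[ColumnType], raw: &[Option<&str>]) -> Result<Vec<Value>, String> {
    if types.len() != raw.len() {
        return Err(format!(
            "expected {} columns, got {}",
            types.len(),
            raw.len()
        ));
    }
    types
        .iter()
        .zip(raw)
        .map(|(ty, field)| decode_field(*ty, *field))
        .collect()
}
```

Calling `decode_row` with a type list built from a table description yields one `Value` per selected column, with no intermediate JSON text to produce or parse.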

@mangas
Contributor

mangas commented May 28, 2024

@lutter is this ready for review?

@lutter lutter marked this pull request as ready for review June 6, 2024 00:20
@lutter
Collaborator Author

lutter commented Jun 6, 2024

It is ready for review in the sense that it does what it claims to do, but it introduces a bunch of machinery whose full value we'd only reap if we used it more widely. But maybe it's enough to do this as a first step in that direction.

@fordN fordN requested a review from mangas June 6, 2024 15:51
This pull request hasn't had any activity for the last 90 days. If there's no more activity over the course of the next 14 days, it will automatically be closed.

@github-actions github-actions bot added the Stale label Sep 13, 2024
@fordN fordN requested review from zorancv and removed request for mangas September 16, 2024 15:44
@lutter lutter force-pushed the lutter/dsl-dyn branch 2 times, most recently from 5794366 to c514b58 Compare September 16, 2024 17:42
@github-actions github-actions bot removed the Stale label Oct 2, 2024
Contributor

@zorancv zorancv left a comment

Nice, more elegant approach. Curious to see the performance impact.

let bi = self
    .0
    .to_bigint()
    .expect("The implementation of `to_bigint` for `OldBigDecimal` always returns `Some`");
Contributor

I guess the intention here was to return an error; the comment also hints at it.

Collaborator Author

It's one of those 'should never happen' cases, but you are right: since this already returns an error, we might as well not panic.
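
For readers following this thread, the general pattern of replacing `expect` on an `Option` with error propagation can be sketched as below. This is an illustration under stated assumptions, not the code that landed in the PR: `to_whole` is a hypothetical stand-in for a conversion like `to_bigint` that returns an `Option`, and `ConversionError` is a made-up error type.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct ConversionError(String);

/// Hypothetical stand-in for a conversion that returns `Option`:
/// succeeds only for finite values without a fractional part.
fn to_whole(value: f64) -> Option<i64> {
    if value.is_finite() && value.fract() == 0.0 && value.abs() < 9.0e15 {
        Some(value as i64)
    } else {
        None
    }
}

/// Instead of `to_whole(value).expect("always returns Some")`, turn the
/// `None` case into an error that the caller can handle.
fn to_whole_checked(value: f64) -> Result<i64, ConversionError> {
    to_whole(value).ok_or_else(|| {
        ConversionError(format!("cannot convert {} to an integer", value))
    })
}
```

Because the enclosing function already returns a `Result`, a caller can simply write `to_whole_checked(v)?`, so the 'should never happen' case costs nothing in readability and no longer brings down the process if it ever does happen.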

.select_cols(&columns)
.filter(table.id_eq(&key.entity_id))
.filter(table.at_block(block))
.filter(table.belongs_to_causality_region(key.causality_region));
Contributor

Beautiful!

let child_table = layout.table_for_entity(first_entity)?;
let sort_by_column = child_table.column_for_field(&child.sort_by_attribute)?;
if entity_types.is_empty() {
    return Err(QueryExecutionError::ConstraintViolation(
Contributor

Nice flattening of the nested ifs, and nice to see one more correctness check added.

@lutter
Collaborator Author

lutter commented Oct 22, 2024

Rebased onto latest master and addressed the review comment.

@lutter lutter merged commit 605c6d2 into master Oct 25, 2024
6 checks passed
@lutter lutter deleted the lutter/dsl-dyn branch October 25, 2024 17:20
encalypto added a commit that referenced this pull request Jan 9, 2025
lutter added a commit that referenced this pull request Jan 9, 2025
Roundtripping arrays of enums would fail because we would read an array of
enums back as a single string "{yellow,red,BLUE}" instead of the array
["yellow", "red", "BLUE"]. Storing an update to such an entity, even if
users make no changes to that field, would fail because Postgres expects an
array and we were sending a scalar value.

This fixes a bug introduced in PR #5372.
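
The difference between the scalar string and the array described in this commit message can be shown with a small sketch. It is not the actual fix: `parse_enum_array` is a hypothetical helper, and it assumes enum labels that contain no commas, quotes, braces, or NULL elements, which a real Postgres array literal parser would have to handle.

```rust
/// Split a Postgres-style array literal such as "{yellow,red,BLUE}" into
/// its elements. Simplified: assumes unquoted labels without commas,
/// braces, or NULLs.
fn parse_enum_array(raw: &str) -> Result<Vec<String>, String> {
    let inner = raw
        .strip_prefix('{')
        .and_then(|s| s.strip_suffix('}'))
        .ok_or_else(|| format!("not an array literal: {:?}", raw))?;
    if inner.is_empty() {
        // "{}" is the empty array, not an array with one empty label.
        return Ok(Vec::new());
    }
    Ok(inner.split(',').map(|s| s.to_string()).collect())
}
```

With this helper, `parse_enum_array("{yellow,red,BLUE}")` yields the three separate labels, which is the shape Postgres expects when the entity is written back, whereas passing the raw string through unchanged sends a scalar where an array is required.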