Skip to content

canonicalize/normalize document utility #2342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
micimize opened this issue Jan 10, 2020 · 6 comments
Open

canonicalize/normalize document utility #2342

micimize opened this issue Jan 10, 2020 · 6 comments
Labels
PR: feature 🚀 requires increase of "minor" version number

Comments

@micimize
Copy link

Feature requests

Would a canonicalize(node) or normalize(node) utility be a desirable feature? I started working on one to simplify some aspects of ast-based code generation, though all it does right now is merge duplicate fields

I don't think it should inline fragments or anything like that (though maybe conditionally). Maybe we'd need to spec out exactly what a canonical document form would actually look like though

@IvanGoncharov
Copy link
Member

@micimize Thanks for opening this issue 👍
Sadly we can't merge fields with directives, simplest example would be:

query ($oneCondition: Boolean, $anotherCondition: Boolean) {
  someField @include(if: $oneCondition)
  someField @include(if: $anotherCondition)
}

After merge you will get:

query ($oneCondition: Boolean, $anotherCondition: Boolean) {
  someField @include(if: $oneCondition) @include(if: $anotherCondition)
}

Which is invalid since @include is not a repeatable directive and can have only one instance per location.
We can make standard query directives repeatable (skip & include) but it will not solve the same problem with custom directives.

What is your use case for the canonicalize function?

@micimize
Copy link
Author

hmm, true. We could just add directives that to the unique condition. My use case is data class generation, but there are other potential use-cases, like getting the correct cache hit on the client for foo { a b } and foo { b a a }, or .graphql file formatting.

As an aside, it feels a bit counter intuitive that

{
    name @include(if: false)
    name @include(if: true)
}

is valid, but not

{ name @include(if: false) @include(if: true) }

@IvanGoncharov
Copy link
Member

or .graphql file formatting.

@micimize There is stripIgnoredCharacters function for this:
https://github.com/graphql/graphql-js/blob/master/src/utilities/stripIgnoredCharacters.js

I don't think it should inline fragments or anything like that (though maybe conditionally). Maybe we'd need to spec out exactly what a canonical document form would actually look like though

So without inlining fragments, we can only remove fields at the same level that don't have directives applied to them.
I don't think there are a lot of real-world queries that have duplicate fields at the same level:

{
  foo
  bar
  foo
}

Can you think of any other normalization step that doesn't affect the query execution?

@micimize
Copy link
Author

micimize commented Jan 21, 2020

@IvanGoncharov I was more talking about developer-facing formatting like prettierjs (though this would only be for ASTs).

I guess this makes more sense as a collection of smaller utilities:

  • deduplicateFields(ast, directiveHandler), where directiveHandler is maybe a flatMap-like callback for name @directive name @directive cases
  • sortFields(ast, comparatorFn)
  • maybe inlineFragments(ast, variables)

This way implementors/users can decide for themselves if things like field order, distinct fragment definition, etc. are part of "query identity", eschewing the issue of whether or not the utilities are fully spec compliant.

To distill my position:

  • many use-cases want to treat foo { a b } and foo { b a a } as equivalent (order and duplicate independence)
  • some use-cases want to treat fragment f { a } { ...f } and { a } as equivalent
  • offering (reference) utilities to do this robustly would be beneficial

I think the real validation of this idea is if it would be desirable for persisted query implementors to offer some subset of hash(deduplicateFields(sortFields(inlineFragments(node)))) as an alternative to using hash(print(node)) as an identity. @benjamn, you're the most recent committer on apollo-link-persisted-queries - do you know if the apollo team is or might be interested this idea?

@Cito
Copy link
Member

Cito commented Mar 24, 2020

A user of GraphQL-core had a similar request. In his case, he wanted to normalize the order of arguments in particular by sorting them. So we are also thinking about something similar to lexicographic_sort_schema, but for executable documents. It's still unclear to me though how much should be normalized. The order of arguments does clearly not matter, but the order of fields may be relevant in some cases. GraphQL follows that order in the result at least if the transport protocol allows it - JSON is officially unordered.

@IvanGoncharov
Copy link
Member

@Cito Thanks for excellent example with an order of args.
It totally makes sense to have function that does that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: feature 🚀 requires increase of "minor" version number
Projects
None yet
Development

No branches or pull requests

3 participants