-
Notifications
You must be signed in to change notification settings - Fork 20
feat(#242): add pulldata function #488
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🦋 Changeset detectedLatest commit: 03f2a9e The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
{ | ||
testName: 'returns empty string when uses not existing property with no value', | ||
property: '@doesnotexist', | ||
inputValue: '', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth adding a couple of cases where the inputValue
has single quotes or double quotes?
For example: O'Reilly
and Hello "world"
Maybe related post: https://forum.getodk.org/t/different-behaviour-between-enketo-and-odk-collect-with-straight-quotes-in-select-from-file-questions/49973/7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right, good catch. In this version, if the input value was "O'Reilly" then the parser would Error.
I took the step of escaping the apostrophe but no matter which escape method I used (\'
, '
, '''
) the parser couldn't understand it. It turned out the tree sitter xpath package couldn't handle escaping, so I extended it as well, and added the necessary unescaping to the string literal evaluator. Now everything works but I'm way outside of my comfort zone on this one. It's possible I've now enabled forms to have escaped quotes in all sorts of places, or it's possible I've broken something somewhere...
A much easier solution I thought of is to use double quotes if the value has single quotes, and vice versa. However this wouldn't work if a string had both. Seems unlikely but you never know.
Anyway, I'd appreciate your take. This is a much bigger change than I expected but maybe an improvement to the parser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tricky! I don't have experience with the Tree Sitter code, and I share your concern.
I'm unsure if any of the Tree Sitter is autogenerated or manually ported (commit). I want to dig more into the history tomorrow (docs, slack, etc)
It's possible I've now enabled forms to have escaped quotes in all sorts of places, or it's possible I've broken something somewhere...
+1. How can we increase confidence? We have a solid set of scenario tests. Do you think it's worth adding more where things might break, like calculations?
Side note:
Tree Sitter is something I've been curious about. I'm wondering if it's really the best tool for us, but we haven't had the time to reevaluate it (ticket) (for example: Is there a smaller library available (bundle size ticket) that doesn't require the Docker daemon to run and is easier to upgrade (ticket)?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. So openrosa-xpath-evaluator doesn't appear to handle any escaping either which made me wonder why web-forms/xpath had to. I think the cause is that the openrosa-xpath-evaluator implementation doesn't strip the quotes out, so the form builder had to pick the right quotes to not have an escaping issue.
Note the difference between: https://github.com/enketo/enketo/blob/5ce794321a85d1c0ffb90284581b2edffa8b3933/packages/enketo-core/src/js/form-model.js#L1405
`instance(${params[0]})/root/item[${params[2]} = ${searchValue}]/${params[1]}`;
and the version in this PR:
`instance('${instanceId}')/root/item[${queryElement}='${escapedQuery}']/${desiredElement}`;
I think this is because of the way the string literals are passed. In openrosa-xpath-evaluator the parameters still have the quotes, and must be manually stripped, whereas in web-forms the quotes are automatically stripped:
web-forms/packages/xpath/src/evaluator/expression/StringLiteralExpressionEvaluator.ts
Line 11 in 6015ab6
const constValue = text.substring(1, text.length - 1); |
Therefore in my implementation I add them back, but I don't know whether to use '
or "
so sometimes it's wrong.
Options:
- Leave it the way it is in this PR, which may have unintended consequences, both good and bad.
- Update the StringLiteralExpressionEvaluator to not strip quotes, or at least expose the original string, so I can use that (compatible with openrosa-xpath-evaluator).
- (mentioned above) Use double quotes if the value has single quotes, and vice versa. However this wouldn't work if a string had both.
2 and 3 would not require any changes to webforms/tree-sitter-xpath which feels safer, and more compliant with openrosa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm more inclined to Option 2. It is slightly faster due to less string manipulation and a lower risk of breaking something. We can document that quote-handling behavior in the pulldata documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on thinking about quotes, @latin-panda. I think there are two cases to consider here:
query
is a literal value with a quote and/or apostrophe -- this is likely to be quite rarequery
is a reference to another field (often a select) and that field can contain a value with a quote and/or apostrophe -- this is somewhat likely and is what the forum post illustrates
I'm not totally sure whether these would be handled differently but want to point out that the second one is most important to support.
I agree with the discomfort with the current approach. 3 feels too hacky to me. I'm having trouble with thinking through the implications for 2. Would it also have the problem that Enketo seems to have as highlighted by the forum post linked above?
If so, my preference would be to remove anything special for apostrophe/quote handling for now and file it as a separate issue since it is an edge case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think the literal vs reference cases are different, thinking about it more. In the literal case, the form designer can wrap the value with quotes but in the reference case the values are whatever is in that field. Though I'm not sure what the escaping story is there currently!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing to note is that more typical pulldata
usage would be pulling from a CSV rather than from an internal instance. I also don't think pulldata
can get attribute values in Collect nor Enketo so filtering by the value
child would be more realistic. I also imagine values with quotes/apostrophe from CSV might have special handling so it would be good to have a scenario test on that end-to-end pattern. I'm more and more inclined to do quote/apostrophe handling as its own mini project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okie dokie...
Upon another look at the Enketo code, it's not as clever as I suggested it was - it literally wraps the evaluation of the query with single quotes just like my naive first attempt did. So, I've removed the escaping code and raised #492 to cover it more thoroughly, and commented out the scenario test that was asserting quotes worked.
I'm not terribly happy with where this ended up but I think it's completely compatible with Enketo. I'll leave it up to those with more experience to decide whether we want to do better than Enketo in the first pass.
filtering by the value child would be more realistic
Thanks - I've added a test for value
so that's covered, not just attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m happy to match Enketo for now and have the issue but @latin-panda you can have the final say!
@latin-panda I think this is ready for a review again. It still doesn't handle apostrophes which is covered by #492 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merging this now ✨ Thanks, Gareth, for exploring and sharing your ideas. It was interesting to learn more about string handling.
Closes #242
I have verified this PR works in these browsers (latest versions):
What else has been done to verify that this works as intended?
Ported scenario tests from javarosa.
Why is this the best possible solution? Were any other approaches considered?
This matches the other similar function implementations.
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
The changes are be isolated to the previously unsupported function call so it should not impact any existing forms.
Do we need any specific form for testing your changes? If so, please attach one.
This is the one I used...
242-pulldata.xml
Typing in "south" returns "Texas" which is the first result that matches.
Typing in "north" returns "Alaska" which is the only result that matches.
Typing in anything else will return empty string.
What's changed
A brand new function
pulldata
has been implemented compliant with the ODK spec: https://getodk.github.io/xforms-spec/#fn:pulldata