Additional control over parsing of repeated elements #495

yairlenga opened this issue Oct 5, 2021 · 0 comments

yairlenga commented Oct 5, 2021

With 2.12, Jackson can parse XML with repeated elements into a tree (the ignore-duplicates feature). This makes such documents possible to parse, but creates a challenge when extracting data.

The resulting document structure will sometimes hold an array and sometimes a scalar for the same element, depending on whether the element occurs once or multiple times.
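To illustrate the extra work this puts on the caller, here is a minimal sketch of the kind of normalizing helper that ends up being needed. It uses plain `Object` and `List` values as a stand-in for tree nodes; `ScalarOrArray` and `asList` are hypothetical names for illustration, not part of Jackson.

```java
import java.util.List;

public class ScalarOrArray {
    // Normalizes a tree value that may be a single scalar or a list of scalars,
    // so that callers can always iterate. Stand-in for tree nodes, not Jackson API.
    static List<?> asList(Object value) {
        if (value == null) {
            return List.of();            // element absent
        }
        if (value instanceof List<?>) {
            return (List<?>) value;      // element occurred several times
        }
        return List.of(value);           // element occurred exactly once
    }

    public static void main(String[] args) {
        Object once = "1";                   // shape for <doc> <x> 1 </x> </doc>
        Object twice = List.of("1", "2");    // shape for two <x> elements
        System.out.println(asList(once));    // prints [1]
        System.out.println(asList(twice));   // prints [1, 2]
    }
}
```

Every access to a possibly repeated element needs a check like this, which is the overhead the suggestion below tries to remove.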

I wanted to suggest borrowing a feature from Perl's XML::Simple, which makes it possible to specify a list of tags with the “forceArray” flag [https://metacpan.org/pod/XML::Simple#ForceArray-=%3E-%5B-names-%5D-%23-in-important]

From my experience, this approach makes it significantly easier to tree-parse most documents with repeated elements.

With this feature, the developer can safely access the nodes associated with those tags using array notation.

As a further improvement, allowing specification of tags in a hierarchy will make it

Consider: <doc> <x> 1 </x> </doc>

That will parse to { x: 1 }
With the proposed change, the above will parse into { x: [ 1 ] }

and

<doc> <x> 1 </x> <x> 2 </x> </doc>

That will parse to { x : [ 1, 2 ] }

In the first case, ‘x’ has a scalar value. By specifying an attribute on the mapper, forceArray = [ ‘x’ ], the mapper will always generate an array for ‘x’ even if there is only one element, simplifying access to those arrays.
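The intended behaviour can be sketched outside Jackson with the JDK's DOM parser. This is only an illustration of the semantics, assuming a flat document whose children are leaf elements; `ForceArraySketch`, `parse`, and the `forceArray` parameter are hypothetical names, not an existing mapper API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ForceArraySketch {
    // Parses the XML and returns its root element; wraps checked exceptions.
    static Element rootOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Maps each child element of the root to its trimmed text.
    // Names in forceArray always map to a list, even with a single occurrence.
    @SuppressWarnings("unchecked")
    static Map<String, Object> parse(String xml, Set<String> forceArray) {
        Map<String, Object> out = new LinkedHashMap<>();
        NodeList children = rootOf(xml).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace-only text between elements
            }
            String name = child.getNodeName();
            String text = child.getTextContent().trim();
            Object existing = out.get(name);
            if (existing instanceof List) {
                ((List<String>) existing).add(text);      // third or later occurrence
            } else if (existing != null) {
                List<String> list = new ArrayList<>();    // second occurrence: promote to list
                list.add((String) existing);
                list.add(text);
                out.put(name, list);
            } else if (forceArray.contains(name)) {
                List<String> list = new ArrayList<>();    // first occurrence, forced to list
                list.add(text);
                out.put(name, list);
            } else {
                out.put(name, text);                      // first occurrence, plain scalar
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("<doc> <x> 1 </x> </doc>", Set.of()));              // {x=1}
        System.out.println(parse("<doc> <x> 1 </x> </doc>", Set.of("x")));           // {x=[1]}
        System.out.println(parse("<doc> <x> 1 </x> <x> 2 </x> </doc>", Set.of("x"))); // {x=[1, 2]}
    }
}
```

With ‘x’ listed, both documents produce a list, so the caller can index into it without first checking the node type.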

As an extension, if forceArray is provided and an element that is not in the forceArray list occurs multiple times, this should be considered a parsing error.
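A rough sketch of that stricter rule, again using the JDK's DOM parser rather than Jackson and assuming a flat document: `StrictRepeatCheck` and `check` are hypothetical names, a `null` set stands for "forceArray not provided", and the choice of `IllegalStateException` is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class StrictRepeatCheck {
    // Parses the XML and returns its root element; wraps checked exceptions.
    static Element rootOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Throws if a child element of the root repeats without being listed in forceArray.
    // A null forceArray means the option was not provided, so nothing is checked.
    static void check(String xml, Set<String> forceArray) {
        if (forceArray == null) {
            return;
        }
        Set<String> seen = new HashSet<>();
        NodeList children = rootOf(xml).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // ignore text nodes between elements
            }
            String name = child.getNodeName();
            boolean repeated = !seen.add(name); // add() returns false if already present
            if (repeated && !forceArray.contains(name)) {
                throw new IllegalStateException(
                        "Repeated element <" + name + "> is not listed in forceArray");
            }
        }
    }

    public static void main(String[] args) {
        check("<doc> <x> 1 </x> <x> 2 </x> </doc>", Set.of("x")); // passes: x is listed
        check("<doc> <x> 1 </x> <x> 2 </x> </doc>", null);        // passes: option not provided
        try {
            check("<doc> <x> 1 </x> <x> 2 </x> </doc>", Set.of()); // x repeats but is not listed
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing early in this way would turn an unexpected repeat into an explicit error instead of a silent change in the shape of the tree.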

yairlenga changed the title from “Additional control over array fir xml parsing” to “Additional control over parsing of repeated elements” on Oct 5, 2021