Non-US-ASCII URIs are mangled when reading a multipart response from v1/documents #1688

Closed
@rjrudin

Description

See #1687 for a failing test that demonstrates this from the user perspective.

Reasons why this is happening:

  1. Java Mail (javax.mail and jakarta.mail behave the same way) has an InternetHeaders class that requires multipart header fields to adhere to RFC 822, which requires US-ASCII characters. See https://docs.oracle.com/javaee/7/api/javax/mail/internet/InternetHeaders.html (not the latest link, but the docs are the same in the latest jakarta.mail version of this class).
  2. MarkLogic URIs of course do not require US-ASCII characters.
  3. If a MarkLogic URI does have non-US-ASCII characters, they get mangled when the Java Client reads a multipart response from v1/documents, because each body part carries the URI in a header.
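
To illustrate the kind of mangling described above, here is a minimal stdlib-only sketch. It assumes the URI travels on the wire as UTF-8 bytes and that the header parser turns each byte into one character, which is modeled here as ISO-8859-1 decoding. That is an assumption about the mechanism for illustration only, not a copy of Java Mail's internals, and `HeaderManglingSketch` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class HeaderManglingSketch {

    // Hypothetical helper: encode the URI as UTF-8 (as it would appear in the
    // multipart header bytes), then decode one byte per character, the way a
    // parser that assumes single-byte header text would.
    static String mangle(String uri) {
        byte[] wireBytes = uri.getBytes(StandardCharsets.UTF_8);
        return new String(wireBytes, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: the same bytes decoded as UTF-8 round-trip cleanly.
    static String roundTripUtf8(String uri) {
        byte[] wireBytes = uri.getBytes(StandardCharsets.UTF_8);
        return new String(wireBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String uri = "/docs/caf\u00e9.json"; // U+00E9 is outside US-ASCII

        // U+00E9 is two bytes in UTF-8 (C3 A9), so the single-byte decode
        // yields two characters (U+00C3, U+00A9) and the URI no longer matches.
        System.out.println("mangled equals original: " + mangle(uri).equals(uri));
        System.out.println("UTF-8 equals original:   " + roundTripUtf8(uri).equals(uri));
    }
}
```

A pure US-ASCII URI passes through `mangle` unchanged, which is why the problem only shows up for URIs like the one in #1687.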

I verified that if we switch to OkHttp's new MultipartReader (see https://square.github.io/okhttp/5.x/okhttp/okhttp3/-multipart-reader/index.html), we don't run into this issue, because MultipartReader does not enforce RFC 822.
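
A rough sketch of reading a part header with MultipartReader is below. It is not the Java Client's actual code: the class name, the boundary, and the sample body are made up, and the assumption that the URI appears in a `Content-Disposition` `filename` parameter is mine, not something stated in this issue. It needs OkHttp and Okio on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import okhttp3.MultipartReader;
import okio.Buffer;

public class MultipartReaderSketch {

    // Hypothetical helper: returns the named header of the first body part,
    // or null if the multipart body has no parts.
    static String firstPartHeader(String rawMultipart, String boundary, String headerName) {
        // An in-memory Buffer stands in for the HTTP response body here.
        Buffer source = new Buffer().writeUtf8(rawMultipart);
        try (MultipartReader reader = new MultipartReader(source, boundary)) {
            MultipartReader.Part part = reader.nextPart();
            return part == null ? null : part.headers().get(headerName);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample body; the header layout is an assumption for illustration.
        String raw = "--BOUNDARY\r\n"
            + "Content-Type: application/json\r\n"
            + "Content-Disposition: attachment; filename=\"/docs/caf\u00e9.json\"\r\n"
            + "\r\n"
            + "{}\r\n"
            + "--BOUNDARY--\r\n";

        String header = firstPartHeader(raw, "BOUNDARY", "Content-Disposition");
        System.out.println("URI preserved: " + header.contains("/docs/caf\u00e9.json"));
    }
}
```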
