Skip to content

Conversation

FrankChen021
Copy link
Member

@FrankChen021 FrankChen021 commented Apr 18, 2025

A part of #17769.

Description

Allow raw sql in the HTTP body on http endpoints. This helps reduce the client side complexity which requires users to do character escaping when the SQL is placed in the JSON document. The following example demonstrates the usage:

echo 'SELECT 1' | curl -X POST "http://localhost:8082/druid/v2/sql" --data @-

Release note

Druid now supports raw text format SQL on /druid/v2/sql endpoint. The content in the HTTP body of a HTTP request which does not set Content-Type to application/json will be treated as a raw sql text.

Key changed/added classes in this PR
  1. Only HTTP requests with Content-Type: application/json will be parsed as traditional JSON format SQL
  2. All other HTTP requests including those with empty Content-Type will be parsed as raw SQL
  3. Router also supports raw text SQL, but when it forwards queries to brokers, it uses the JSON format for simplicity

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • [] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for accepting raw SQL directly in the HTTP request body on the /druid/v2/sql endpoint while preserving the traditional JSON-based SQL submission when the Content-Type is application/json. Key changes include:

  • Removing the @consumes(MediaType.APPLICATION_JSON) annotation to allow raw SQL for non-JSON content types.
  • Introducing overloaded endpoints that use HttpContext to distinguish raw SQL from JSON SQL.
  • Updating integration tests and documentation to cover the new behavior.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
sql/src/main/java/org/apache/druid/sql/http/SqlResource.java Updated endpoint to derive SqlQuery from HttpContext instead of relying solely on JSON input.
sql/src/main/java/org/apache/druid/sql/http/SqlQuery.java Added static helper methods to construct SqlQuery from HttpContext or HttpServletRequest, handling both JSON and raw SQL formats.
services/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java Changed SQL query deserialization to use the new SqlQuery.from(request, objectMapper) method and ensured proper header setting when forwarding.
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/sql/resources/SqlStatementResource.java Modified endpoint to leverage HttpContext-based SqlQuery creation similar to the SQL resource.
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/dart/controller/http/DartSqlResource.java Updated HTTP POST endpoint to use HttpContext for retrieving raw SQL queries.
docs/api-reference/sql-api.md Revised API documentation to explain the raw text SQL input option and clarify handling of Content-Type headers.
integration-tests/src/test/java/org/apache/druid/tests/query/ITSqlQueryTest.java Added integration tests validating different Content-Type header scenarios for SQL queries.

@FrankChen021
Copy link
Member Author

The failed test is about coverage of UT, which is expected because I didn't add UT, but add IT for end-to-end test.

@cryptoe
Copy link
Contributor

cryptoe commented Apr 22, 2025

@FrankChen021 I am currently unsure about the usecase here. Adding different flavors of the same API has a bunch of issues.

  1. Confuses users.
  2. Currently we have DART/SQL endpoints which are similar. The exact payload works in both endpoints. If we support this, we need to add support for it on DART too.

For all the above reasons I have my reservations against this PR. More 1 than 2 tbh.

@FrankChen021
Copy link
Member Author

@FrankChen021 I am currently unsure about the usecase here. Adding different flavors of the same API has a bunch of issues.

  1. Confuses users.
  2. Currently we have DART/SQL endpoints which are similar. The exact payload works in both endpoints. If we support this, we need to add support for it on DART too.

For all the above reasons I have my reservations against this PR. More 1 than 2 tbh.

  1. Please see the linked issue for the background: Enhancing Query Context Handling: Introducing SQL-Level SETTINGS, Raw SQL Support, System Table and Context Validation #17769
  2. DART and sql statement have been modified to support this

@FrankChen021 FrankChen021 requested a review from Copilot May 1, 2025 09:24
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for raw SQL queries on HTTP endpoints by accepting non‑JSON content types, and updates both the query handling logic and corresponding tests.

  • Updated SQL resource classes to support raw text queries via HTTP body with different Content-Type values
  • Extended error handling and exception mapping for malformed or unsupported queries
  • Updated integration tests and documentation to cover new raw SQL behaviors

Reviewed Changes

Copilot reviewed 11 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
sql/src/main/java/org/apache/druid/sql/http/SqlResource.java Updated doPost signature to use HttpContext for raw SQL support
sql/src/main/java/org/apache/druid/sql/http/SqlQuery.java Enhanced extraction logic to support JSON, text/plain and form URL-encoded SQL queries
services/src/test/java/org/apache/druid/server/AsyncQueryForwardingServletTest.java Modified test expectations for Content-Type header usage
services/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java Updated SQL query creation and error handling with HttpException
server/src/main/java/org/apache/druid/server/initialization/jetty/*.java Introduced new HttpException and exception mapper for uniform error responses
integration-tests/src/test/java/org/apache/druid/tests/query/ITSqlQueryTest.java Added tests for various Content-Type values and raw SQL endpoint behaviors
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/* Adjusted SQL resource endpoints for raw SQL support in multi-stage query modules
docs/api-reference/sql-api.md Revised documentation to include raw text SQL format instructions
Files not reviewed (2)
  • integration-tests/pom.xml: Language not supported
  • sql/pom.xml: Language not supported

@FrankChen021 FrankChen021 requested a review from gianm May 16, 2025 02:39
@FrankChen021
Copy link
Member Author

@cryptoe @gianm pls let me if you have further comments

Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I just tried it locally and the examples and error paths seem to work well. TY!

I'll ignore the coverage checker when merging, because coverage is achieved through integration tests.

@gianm gianm merged commit d72b3bf into apache:master May 30, 2025
30 of 31 checks passed
@FrankChen021 FrankChen021 deleted the text branch May 31, 2025 03:06
@adarshsanjeev
Copy link
Contributor

@FrankChen021 The IT tests added seem to be failing on new PRs.

@FrankChen021
Copy link
Member Author

@adarshsanjeev interesting, why didn't the IT run against this PR as well as the main branch after it was merged? let me take a look at the failed IT on new PRs

@kgyrtkirk
Copy link
Member

The IT failures were not seen because the migration to a different CI approach was not fully finished - which as a sideffect blocks ITs even if jacoco fails like in this case

@FrankChen021 @gianm: adding jacoco:skip would have been better (an open/close of the PR or a commit must retrigger a new run as labels are copied into the event at trigger time).

@kgyrtkirk
Copy link
Member

I'm unsure if this PR was the main problem or some other things which were at play....there were multiple PRs merged even after this...
and the test TaskQueueConcurrencyTest became pretty unstable recently which complicates things;
I'll try to get to the bottom of these issues

@kfaraz
Copy link
Contributor

kfaraz commented Jun 3, 2025

@kgyrtkirk , there is a PR in-progress trying to address the race conditions causing the flakiness in TaskQueueConcurrencyTest.

#18060

@FrankChen021
Copy link
Member Author

@kgyrtkirk the change in this PR should have nothing to do with the failure of TaskQueueConcurrencyTest
Newly added IT cases in this PR all pass on my local which are running against manually started cluster.

I'm still trying to run the IT against the auto brought up docker containers, but still have not made it because coordinator always exit abnormally.

@kgyrtkirk
Copy link
Member

Newly added IT cases in this PR all pass on my local which are running against manually started cluster.

that's what I was suspecting - that's why I've left the second comment: there seems to be multiple independent things causing the current failures

sorry for the noise - and thanks for checking it out!

@cryptoe
Copy link
Contributor

cryptoe commented Jun 3, 2025

Since this is causing active failures in all PR's lets disable both the
TaskQueueConcurrentyTest and IT's added in this PR till they are resolved.

This will give time to @FrankChen021 to figure the failure out.

@FrankChen021
Copy link
Member Author

The error log shows that:

Expected [400] but got [401]

It turns out that integration tests configure the basic http authenticator in the docker/environment-configs/common.
Let me update the test cases to add user/password to the http requests

@FrankChen021 FrankChen021 mentioned this pull request Jun 3, 2025
@gianm
Copy link
Contributor

gianm commented Jun 3, 2025

@FrankChen021 @gianm: adding jacoco:skip would have been better (an open/close of the PR or a commit must retrigger a new run as labels are copied into the event at trigger time).

Ah, I didn't know about the label jacoco:skip. Next time there's a PR where coverage is not needed or not feasible, I'll add that instead of merging it.

@FrankChen021
Copy link
Member Author

@FrankChen021 @gianm: adding jacoco:skip would have been better (an open/close of the PR or a commit must retrigger a new run as labels are copied into the event at trigger time).

Ah, I didn't know about the label jacoco:skip. Next time there's a PR where coverage is not needed or not feasible, I'll add that instead of merging it.

I didn't notice that the IT didn't run, nor did I know there's such label to skip the coverage.

@kgyrtkirk
Copy link
Member

sorry guys - I did mentioned it in a mail to dev list here
however I think that might not be the best place; maybe the job when it fails with should mention the poissibility to use jacoco:skip

@gianm
Copy link
Contributor

gianm commented Jun 4, 2025

sorry guys - I did mentioned it in a mail to dev list here however I think that might not be the best place; maybe the job when it fails with should mention the poissibility to use jacoco:skip

Thanks, I remember seeing the mail but of course I forgot about the jacoco:skip label since reading it. It would be great to include the instructions in the failure message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants