ganeshashree

What changes were proposed in this pull request?

Refactor MemoryStream to use SparkSession instead of SQLContext.

Why are the changes needed?

SQLContext is deprecated in newer versions of Spark.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Verified that the affected tests pass.

Was this patch authored or co-authored using generative AI tooling?

No


test("three hop pipeline") {
val session = spark
implicit val sparkSession: SparkSession = spark
Contributor

Where was the previous implicit SQLContext defined?

Author

It seems it was picking up the implicit sqlContext defined in SharedSparkSession.

@cloud-fan
Contributor

cc @HeartSaVioR

@HeartSaVioR
Contributor

@ganeshashree
Thanks for the proposal. The change looks OK to me.

Have we checked the warning message (build/log) shown when SQLContext is used here? If we haven't been providing a message that makes migration easy, it might be beneficial to defer replacing apply() and add an intermediate migration step (deprecate the existing methods, then remove them in Spark 5.0.0).

@HeartSaVioR HeartSaVioR changed the title from "[SPARK-53656][SQL] Refactor MemoryStream to use SparkSession instead of SQLContext" to "[SPARK-53656][SS] Refactor MemoryStream to use SparkSession instead of SQLContext" on Sep 22, 2025
@ganeshashree
Author

ganeshashree commented Sep 29, 2025

> @ganeshashree Thanks for the proposal. The change looks OK to me.
>
> Have we checked the warning message (build/log) shown when SQLContext is used here? If we haven't been providing a message that makes migration easy, it might be beneficial to defer replacing apply() and add an intermediate migration step (deprecate the existing methods, then remove them in Spark 5.0.0).

@HeartSaVioR Thanks for reviewing. Currently, no warning appears in the build log when we use SQLContext. Creating two versions of MemoryStream.apply, one for SparkSession and one for SQLContext, and showing a warning for the SQLContext version would require resolving the ambiguity that arises when both sparkSession and sqlContext are in scope as implicit values. Since this is an internal API, please consider whether it is acceptable to make this change and update the callers to use MemoryStream with an implicit SparkSession instead of SQLContext, where applicable. I'm exploring further how to resolve the ambiguity by preferring SparkSession over SQLContext.
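For reference, one way to prefer one implicit over another is the low-priority-implicits pattern. The snippet below is only a Spark-free sketch of that idea, not the change in this PR: `Session` and `LegacyContext` are stand-ins for SparkSession and SQLContext, and `SessionSource`, `LowPrioritySessionSource`, and `Stream.describe` are hypothetical names invented for illustration. The `@deprecated` annotation on the legacy path is meant to surface a compiler warning at call sites that resolve through it.

```scala
object ImplicitPriorityDemo {
  // Stand-ins for SparkSession and SQLContext so the sketch compiles without Spark.
  final class Session(val name: String)
  final class LegacyContext(val session: Session)

  // Hypothetical evidence type: records which implicit supplied the session.
  final class SessionSource(val session: Session, val viaLegacy: Boolean)

  // Implicits inherited from a parent trait rank below those defined
  // directly in the companion object.
  trait LowPrioritySessionSource {
    @deprecated("Provide an implicit Session instead of a LegacyContext", "example")
    implicit def fromLegacy(implicit ctx: LegacyContext): SessionSource =
      new SessionSource(ctx.session, viaLegacy = true)
  }

  object SessionSource extends LowPrioritySessionSource {
    // Preferred whenever an implicit Session is in scope.
    implicit def fromSession(implicit s: Session): SessionSource =
      new SessionSource(s, viaLegacy = false)
  }

  // Hypothetical single entry point: one method, no overloads to disambiguate.
  object Stream {
    def describe(implicit src: SessionSource): String =
      if (src.viaLegacy) "legacy context" else "session"
  }

  // Both implicits in scope: the Session path is selected.
  def bothInScope(): String = {
    implicit val s: Session = new Session("s")
    implicit val ctx: LegacyContext = new LegacyContext(s)
    Stream.describe
  }

  // Only the legacy implicit in scope: resolution falls back to it.
  def legacyOnly(): String = {
    implicit val ctx: LegacyContext = new LegacyContext(new Session("s"))
    Stream.describe
  }

  def main(args: Array[String]): Unit = {
    println(bothInScope())
    println(legacyOnly())
  }
}
```

With this shape, callers that still have only the legacy implicit in scope keep compiling, while callers with both in scope resolve to the session path without an ambiguity error. Whether this fits MemoryStream's actual signatures (for example, the Encoder context bound) would need to be checked against the real code.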
