[SPARK-50767][SQL] Remove codegen of `from_json` #49411

panbingkun · 2025-01-08T06:27:10Z

What changes were proposed in this pull request?

The pr aims to remove codegen of from_json.

Why are the changes needed?

Based on the discussion and testing with SubExprEliminationBenchmark #48466 (comment),
after implementing codegen for from_json, there is a performance regression in the withFilter scenario with subExprElimination = true, codegen = true
Let's remove it first and will submit it after we solve the above issue.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GA & Manually test.

Was this patch authored or co-authored using generative AI tooling?

No.

LuciferYang · 2025-01-08T06:34:05Z

Could you update the test results for SubExprEliminationBenchmark in this pr to ensure that the changes are as expected? @panbingkun

panbingkun · 2025-01-08T06:35:54Z

Could you update the test results for SubExprEliminationBenchmark in this pr to ensure that the changes are as expected? @panbingkun

Well, okay.

JDK17: https://github.com/panbingkun/spark/actions/runs/12666974432
JDK21: https://github.com/panbingkun/spark/actions/runs/12666979089

dongjoon-hyun · 2025-01-08T06:42:22Z

+1 for including SubExprEliminationBenchmark here. All the others are covered in the following.

[SPARK-50766][TESTS] Regenerate benchmark results #49409

cloud-fan · 2025-01-08T06:46:58Z

Does it mean any time we add codegen support for some functions, there is a risk of perf regression? Are we sure from_json is the only one or it's simply because SubExprEliminationBenchmark tests from_json?

panbingkun · 2025-01-08T07:07:21Z

Does it mean any time we add codegen support for some functions, there is a risk of perf regression? Are we sure from_json is the only one or it's simply because SubExprEliminationBenchmark tests from_json?

In the two scenarios tested in SubExprEliminationBenchmark , it has been found that the filter scenario has perf regression.
I can test other functions, such as those that have existed for a long time in history.

LuciferYang · 2025-01-08T07:13:03Z

@panbingkun After giving it some more thought, we could try enabling -Xlog:compilation to test the corresponding case. Is it possible that the filter in this test case generated an extremely large method after enabling Codegen, which in turn affected the compilation optimization?

If it weren't a || filter involving all fields, would the impact be less severe?

LuciferYang · 2025-01-08T11:24:51Z

I found the following content in the log:

19:22:37.212 main INFO CodeGenerator: Generated method too long to be JIT compiled: org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext is 58762 bytes

panbingkun · 2025-01-08T11:26:11Z

case 1 (3 filtering conditions with a data size of 1000000)

object FromJsonBenchmark extends SqlBasedBenchmark {
  import spark.implicits._

  def withFilter(rowsNum: Int, numIters: Int): Unit = {
    val benchmark = new Benchmark("from_json in Filter", rowsNum, output = output)

    withTempPath { path =>
      prepareDataInfo(benchmark)
      val numCols = 500
      val schema = writeWideRow(path.getAbsolutePath, rowsNum, numCols)

      val jsonValue = from_json($"value", schema)
      val predicate = jsonValue.getField(s"col0") >= lit(100000) ||
        jsonValue.getField(s"col50") >= lit(100000) ||
        jsonValue.getField(s"col123") >= lit(100000)

      val caseName = s"from_object, codegen: no"
      benchmark.addCase(caseName, numIters) { _ =>
        val df = spark.read
          .text(path.getAbsolutePath)
          .where(predicate)
        df.write.mode("overwrite").format("noop").save()
      }
      benchmark.run()
    }
  }

  override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
    val numIters = 3
    runBenchmark("Benchmark for performance of from_json codegen") {
      withFilter(1_000_000, numIters)
    }
  }
}

codegen for from_json

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: yes                          61029          62195        1781          0.0       61028.8       1.0X

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: yes                          61391          66201        7157          0.0       61391.2       1.0X

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: yes                          60653          61195         481          0.0       60652.7       1.0X

non-codegen for from_json

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: no                         61289          62508        1155          0.0       61288.6       1.0X

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: no                          61219          61663         386          0.0       61218.9       1.0X

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: no                          61056          61362         287          0.0       61055.9       1.0X

panbingkun · 2025-01-08T11:41:25Z

case 2 (1 filtering conditions with a data size of 100000)

object FromJsonBenchmark extends SqlBasedBenchmark {
  import spark.implicits._

  def withFilter(rowsNum: Int, numIters: Int): Unit = {
    val benchmark = new Benchmark("from_json in Filter", rowsNum, output = output)

    withTempPath { path =>
      prepareDataInfo(benchmark)
      val numCols = 500
      val schema = writeWideRow(path.getAbsolutePath, rowsNum, numCols)

      val jsonValue = from_json($"value", schema)
      val predicate = jsonValue.getField(s"col0") >= lit(100000)

      val caseName = s"from_object, codegen: no"
      benchmark.addCase(caseName, numIters) { _ =>
        val df = spark.read
          .text(path.getAbsolutePath)
          .where(predicate)
        df.write.mode("overwrite").format("noop").save()
      }
      benchmark.run()
    }
  }

  override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
    val numIters = 3
    runBenchmark("Benchmark for performance of from_json codegen") {
      withFilter(1_000_00, numIters)
    }
  }
}

codegen for from_json

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: yes                          2325           2341          26          0.0       23249.6       1.0X

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: yes                          2237           2266          36          0.0       22373.5       1.0X

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: yes                          2317           2403          74          0.0       23172.3       1.0X

non-codegen for from_json

OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: no                           2264           2286          20          0.0       22639.3       1.0X


OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: no                           2475           3010         554          0.0       24752.2       1.0X


OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.2
Apple M2
from_json in Filter:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
from_object, codegen: no                           2315           2780         480          0.0       23150.6       1.0X

panbingkun · 2025-01-08T11:45:26Z

From the above scenario, it seems that there is no a performance regression, and I am investigating other reasons.

LuciferYang · 2025-01-08T12:10:24Z

before: code-gen-before.txt
after:after-code-gen.txt

It can be seen that after from_json enabling codegen, an huge processNext method is generated for filter. I suspect that this method is the reason why JIT cannot optimize it.

LuciferYang · 2025-01-09T04:00:57Z

@cloud-fan @dongjoon-hyun @panbingkun How do we proceed with this issue?

I think the risk of generating huge filter methods has always existed, but it was hidden in this benchmark because from_json did not support code generation previously. So I believe the support of code generation in from_json is not the root cause.

As more functions come to support code generation, the probability of generating huge methods will increase, It should apply to more than just filters, right?.

Perhaps we need to find a more universal approach to split the generated methods in order to avoid this risk?

panbingkun · 2025-01-09T05:23:01Z

Give me some time to look at the root cause.

panbingkun · 2025-01-09T08:33:14Z

In the withFilter scenario of SubExprEliminationBenchmark, the root cause as follows:

  val df = spark.read
              .text(path.getAbsolutePath)
              .where(predicate)
  df.write.mode("overwrite").format("noop").save()

When from_json does not implement codegen
FilterExec.doExecute -> Predicate.create -> CodeGeneratorWithInterpretedFallback.createObject -> Predicate.createCodeGeneratedObject -> CodegenContext.subexpressionElimination

spark/sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala

Line 281 in 0123a5e

val evaluator = evaluatorFactory.createEvaluator()

spark/sql/core/src/main/scala/org/apache/spark/sql/execution/FilterEvaluatorFactory.scala

Line 39 in 0123a5e

val predicate = Predicate.create(condition, childOutput)

spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala

Line 45 in 0123a5e

createCodeGeneratedObject(in)

spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratePredicate.scala

Line 41 in 0123a5e

val eval = ctx.generateExpressions(Seq(predicate), useSubexprElimination).head

spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala

Line 1270 in 0123a5e

val commonExprs = equivalentExpressions.getCommonSubexpressions

Ultimately, optimize the 500 calls to `from_json` to only 1 call

When from_json implement codegen
FilterExec.doConsume -> GeneratePredicateHelper.generatePredicateCode

spark/sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala

Line 252 in 0123a5e

val predicateCode = generatePredicateCode(

there is no `subexpressionElimination` optimization here, 500 calls will ultimately be applied to `JsonToStructs`.

panbingkun · 2025-01-09T08:36:50Z

If we can implement subexpressionElimination optimization in the method FilterExec.doConsume, like ProjectExec.doConsume, that would be great.

spark/sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala

Lines 68 to 80 in 0123a5e

    
           override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String = { 
        
             val exprs = bindReferences[Expression](projectList, child.output) 
        
             val (subExprsCode, resultVars, localValInputs) = if (conf.subexpressionEliminationEnabled) { 
        
               // subexpression elimination 
        
               val subExprs = ctx.subexpressionEliminationForWholeStageCodegen(exprs) 
        
               val genVars = ctx.withSubExprEliminationExprs(subExprs.states) { 
        
                 exprs.map(_.genCode(ctx)) 
        
               } 
        
               (ctx.evaluateSubExprEliminationState(subExprs.states.values), genVars, 
        
                 subExprs.exprCodesNeedEvaluate) 
        
             } else { 
        
               ("", exprs.map(_.genCode(ctx)), Seq.empty) 
        
             }

cc @cloud-fan

panbingkun · 2025-01-09T08:43:46Z

Does it mean any time we add codegen support for some functions, there is a risk of perf regression? Are we sure from_json is the only one or it's simply because SubExprEliminationBenchmark tests from_json?

So the answer to this question is:

In the SubExprEliminationBenchmark testing scenario, any expression that implements Codegen will have this issue, not just from_json, because our FilterExec is inconsistent in codegen and interpreted.

cloud-fan · 2025-01-09T09:38:43Z

@panbingkun great investigation! +1 to implement subexpression elimination for FilterExec

LuciferYang · 2025-01-09T10:54:41Z

@panbingkun great investigation! +1 to implement subexpression elimination for FilterExec

So this feature doesn't need to be revert, right?

LuciferYang · 2025-01-13T02:39:35Z

any progress? @panbingkun

panbingkun · 2025-01-13T02:45:51Z

any progress? @panbingkun

It is being implemented and will take some time.

panbingkun · 2025-01-26T07:13:05Z

Based on that the root cause for the performance regression of SubExprEliminationBenchmark is not due to the codegen implementation of from_json(the root cause is that the operator FilterExEc did not implement SubExprElimination optimization in WSC), this PR will close here.
The operator FilterExec implementing SubExprElimination optimization in WSC is still being explored for implementation: [Only Test][Don't Review] Make FilterExec to support subexpressionElimination in the codegen scenario #49573

LuciferYang · 2025-01-26T07:18:08Z

Based on that the root cause for the performance regression of SubExprEliminationBenchmark is not due to the codegen implementation of from_json(the root cause is that the operator FilterExEc did not implement SubExprElimination optimization in WSC), this PR will close here.

The operator FilterExec implementing SubExprElimination optimization in WSC is still being explored for implementation: [Only Test][Don't Review] Make FilterExec to support subexpressionElimination in the codegen scenario #49573

Got, although the performance result may degrade, I suggest submitting a pr to update the benchmark results of SubExprEliminationBenchmark for Apache Spark 4.0 and leave a TODO.

dongjoon-hyun · 2025-01-31T02:37:17Z

Hi, @panbingkun , @cloud-fan , @LuciferYang . I ran the benchmark again as a part of regression check Today and I hit this still.

branch-4.0, Java 17: https://github.com/dongjoon-hyun/spark/actions/runs/13062008857
branch-4.0, Java 21: https://github.com/dongjoon-hyun/spark/actions/runs/13062012809

When I do the bisect, it also reached the original PR. So, although we did some follow-ups, is this still required?

2a1301133138ba0d5e2d969fc6428153903ffff1 is the first bad commit
commit 2a1301133138ba0d5e2d969fc6428153903ffff1
Author: panbingkun <[email protected]>
Date:   Wed Oct 16 09:10:06 2024 +0200

    [SPARK-49966][SQL] Codegen Support for JsonToStructs(`from_json`)

LuciferYang · 2025-01-31T04:49:49Z

Based on that the root cause for the performance regression of SubExprEliminationBenchmark is not due to the codegen implementation of from_json(the root cause is that the operator FilterExEc did not implement SubExprElimination optimization in WSC), this PR will close here.

The operator FilterExec implementing SubExprElimination optimization in WSC is still being explored for implementation: [Only Test][Don't Review] Make FilterExec to support subexpressionElimination in the codegen scenario #49573

@panbingkun is currently on vacation. According to @panbingkun's intention, if FilterExec supports subexpressionElimination in the codegen scenario, this pr would not be needed. BUt, #49573 is still under development.

dongjoon-hyun · 2025-01-31T05:16:55Z

Got it. Thank you for the info, @LuciferYang ~

…Scala 2.13.16 ### What changes were proposed in this pull request? This PR aims to regenerate benchmark results of `master` after upgrading to Scala 2.13.16. - #49478 ### Why are the changes needed? To check a regression again. Currently, there is only one known benchmark suite failure. - #49411 (comment) ### Does this PR introduce _any_ user-facing change? No, this updates only test results. ### How was this patch tested? Manual review. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #49744 from dongjoon-hyun/bm_41. Authored-by: Dongjoon Hyun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>

…g to Scala 2.13.16 ### What changes were proposed in this pull request? This PR aims to regenerate benchmark results of `branch-4.0` after upgrading to Scala 2.13.16. - #49478 ### Why are the changes needed? To check a regression again - #49411 (comment) ### Does this PR introduce _any_ user-facing change? No, this updates only test results. ### How was this patch tested? Manual review. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #49741 from dongjoon-hyun/bm_40. Authored-by: Dongjoon Hyun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>

panbingkun · 2025-02-05T00:17:21Z

Based on that the root cause for the performance regression of SubExprEliminationBenchmark is not due to the codegen implementation of from_json(the root cause is that the operator FilterExEc did not implement SubExprElimination optimization in WSC), this PR will close here.

The operator FilterExec implementing SubExprElimination optimization in WSC is still being explored for implementation: [Only Test][Don't Review] Make FilterExec to support subexpressionElimination in the codegen scenario #49573

@panbingkun is currently on vacation. According to @panbingkun's intention, if FilterExec supports subexpressionElimination in the codegen scenario, this pr would not be needed. BUt, #49573 is still under development.

Yeah, that's right.
Currently, I want to focus on implementing subexpressionElimination for FilterExec in the WSC scenario.

Thanks @dongjoon-hyun @LuciferYang ❤️

### What changes were proposed in this pull request? This reopens #49411 to fix the performance regression in 4.0. ### Why are the changes needed? It's non-trivial to support CSE for Filter in whole stage codegen. We should not rush but revert the codegen support in 4.0 so that we have more time to get it right in 4.1. Note: 4.0 also adds codegen support for a few other expressions, but `from_json` is special as it's quite expensive and the performance regression is very significant with it. ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? N/A ### Was this patch authored or co-authored using generative AI tooling? no Closes #49992 from cloud-fan/json. Authored-by: Wenchen Fan <[email protected]> Signed-off-by: Max Gekk <[email protected]>

### What changes were proposed in this pull request? This reopens #49411 to fix the performance regression in 4.0. ### Why are the changes needed? It's non-trivial to support CSE for Filter in whole stage codegen. We should not rush but revert the codegen support in 4.0 so that we have more time to get it right in 4.1. Note: 4.0 also adds codegen support for a few other expressions, but `from_json` is special as it's quite expensive and the performance regression is very significant with it. ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? N/A ### Was this patch authored or co-authored using generative AI tooling? no Closes #49992 from cloud-fan/json. Authored-by: Wenchen Fan <[email protected]> Signed-off-by: Max Gekk <[email protected]> (cherry picked from commit a8b694f) Signed-off-by: Max Gekk <[email protected]>

### What changes were proposed in this pull request? This reopens apache#49411 to fix the performance regression in 4.0. ### Why are the changes needed? It's non-trivial to support CSE for Filter in whole stage codegen. We should not rush but revert the codegen support in 4.0 so that we have more time to get it right in 4.1. Note: 4.0 also adds codegen support for a few other expressions, but `from_json` is special as it's quite expensive and the performance regression is very significant with it. ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? N/A ### Was this patch authored or co-authored using generative AI tooling? no Closes apache#49992 from cloud-fan/json. Authored-by: Wenchen Fan <[email protected]> Signed-off-by: Max Gekk <[email protected]>

[SPARK-50767][SQL] Remove codegen of from_json

51a5549

github-actions bot added the SQL label Jan 8, 2025

panbingkun mentioned this pull request Jan 8, 2025

[SPARK-49966][SQL] Codegen Support for JsonToStructs(from_json) #48466

Closed

panbingkun added 3 commits January 8, 2025 16:39

fix code style

bee3e84

add SubExprEliminationBenchmark

89dd1ab

Merge branch 'master' into SPARK-50767

14307a9

dongjoon-hyun mentioned this pull request Jan 31, 2025

[SPARK-51045][TESTS][4.0] Regenerate benchmark results after upgrading to Scala 2.13.16 #49741

Closed

dongjoon-hyun mentioned this pull request Jan 31, 2025

[SPARK-51045][TESTS] Regenerate benchmark results after upgrading to Scala 2.13.16 #49744

Closed

panbingkun closed this Feb 5, 2025

dongjoon-hyun mentioned this pull request Feb 13, 2025

[SPARK-51046][SQL][TEST] Reduce numCols in withFilter() to prevent SubExprEliminationBenchmark from failing due to a Codegen error #49938

Closed

cloud-fan mentioned this pull request Feb 18, 2025

[SPARK-50767][SQL] Remove codegen of from_json #49992

Closed

[SPARK-50767][SQL] Remove codegen of from_json #49411

[SPARK-50767][SQL] Remove codegen of from_json #49411

Uh oh!

Conversation

panbingkun commented Jan 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Uh oh!

LuciferYang commented Jan 8, 2025

Uh oh!

panbingkun commented Jan 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dongjoon-hyun commented Jan 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cloud-fan commented Jan 8, 2025

Uh oh!

panbingkun commented Jan 8, 2025

Uh oh!

LuciferYang commented Jan 8, 2025

Uh oh!

LuciferYang commented Jan 8, 2025

Uh oh!

panbingkun commented Jan 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

panbingkun commented Jan 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

panbingkun commented Jan 8, 2025

Uh oh!

LuciferYang commented Jan 8, 2025

Uh oh!

LuciferYang commented Jan 9, 2025

Uh oh!

panbingkun commented Jan 9, 2025

Uh oh!

panbingkun commented Jan 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Ultimately, optimize the 500 calls to from_json to only 1 call

there is no subexpressionElimination optimization here, 500 calls will ultimately be applied to JsonToStructs.

Uh oh!

panbingkun commented Jan 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

panbingkun commented Jan 9, 2025

Uh oh!

cloud-fan commented Jan 9, 2025

Uh oh!

LuciferYang commented Jan 9, 2025

Uh oh!

LuciferYang commented Jan 13, 2025

Uh oh!

panbingkun commented Jan 13, 2025

Uh oh!

panbingkun commented Jan 26, 2025

Uh oh!

LuciferYang commented Jan 26, 2025

Uh oh!

dongjoon-hyun commented Jan 31, 2025

Uh oh!

LuciferYang commented Jan 31, 2025

Uh oh!

dongjoon-hyun commented Jan 31, 2025

Uh oh!

panbingkun commented Feb 5, 2025

Uh oh!

Uh oh!

[SPARK-50767][SQL] Remove codegen of `from_json` #49411

[SPARK-50767][SQL] Remove codegen of `from_json` #49411

panbingkun commented Jan 8, 2025 •

edited

Loading

panbingkun commented Jan 8, 2025 •

edited

Loading

dongjoon-hyun commented Jan 8, 2025 •

edited

Loading

panbingkun commented Jan 8, 2025 •

edited

Loading

panbingkun commented Jan 8, 2025 •

edited

Loading

panbingkun commented Jan 9, 2025 •

edited

Loading

Ultimately, optimize the 500 calls to `from_json` to only 1 call

there is no `subexpressionElimination` optimization here, 500 calls will ultimately be applied to `JsonToStructs`.

panbingkun commented Jan 9, 2025 •

edited

Loading