Closed
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Add a function akin to DataFrame.transform from PySpark. This gives an easy-to-use way to chain DataFrame transformations.
Describe the solution you'd like
It is common to write a Python function that takes a DataFrame plus zero or more arguments as its input and returns a DataFrame. It is convenient to be able to write functions this way and to chain them. For example:
from datafusion import DataFrame, lit

def add_something_cool(df: DataFrame) -> DataFrame:
    return df.with_column("the_answer", lit(42))

def add_another(df: DataFrame, col_name: str) -> DataFrame:
    return df.with_column(col_name, lit("another"))

df_original.transform(add_something_cool).transform(add_another, "second_col").show()
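To make the requested semantics concrete, here is a minimal sketch of what transform could do: call the given function with the DataFrame as the first argument, followed by any extra positional arguments. The Frame class below is a hypothetical stand-in used only so the snippet runs on its own. It is not the real DataFrame class, and the actual implementation may well look different.

```python
from __future__ import annotations

from typing import Any, Callable


class Frame:
    """Hypothetical stand-in for a DataFrame: maps column names to constant values."""

    def __init__(self, columns: dict[str, Any] | None = None) -> None:
        self.columns = dict(columns or {})

    def with_column(self, name: str, value: Any) -> Frame:
        # Return a new Frame instead of mutating this one.
        return Frame({**self.columns, name: value})

    def transform(self, func: Callable[..., Frame], *args: Any) -> Frame:
        # Pass this frame as the first argument, then any extra positional arguments.
        return func(self, *args)


def add_something_cool(df: Frame) -> Frame:
    return df.with_column("the_answer", 42)


def add_another(df: Frame, col_name: str) -> Frame:
    return df.with_column(col_name, "another")


# Chained calls read left to right, in the order the steps are applied.
result = Frame().transform(add_something_cool).transform(add_another, "second_col")
```

Because transform only forwards to the function it is given, each step stays an ordinary function that can be tested without the chaining method.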
Describe alternatives you've considered
Without transform, I would probably write the same operation as a series of reassignments:
df = add_something_cool(df_original)
df = add_another(df, "second_col")
df.show()
Additional context