Feb 14, 2022
Ehm… RDDs in 2022?
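The DataFrame API was made for a reason. Use it and you’ll see that you get basically the same performance in both languages, since the Catalyst optimizer resolves the query to the same physical plan, and ultimately the same RDD DAG, whichever language it was written in. The only reasons to go for Scala nowadays are needing lots of UDFs (Python UDFs pay a serialization cost moving rows between the JVM and the Python workers) or wanting the Dataset API for compile-time type safety…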