Updated April 21st, 2023 by sergios.lalas

Decreased performance when using DELETE with a subquery on Databricks Runtime 10.4 LTS

Problem: Auto optimize on Databricks (AWS | Azure | GCP) is an optional set of features that automatically compact small files during individual writes to a Delta table. Paying a small cost during writes offers significant benefits for tables that are queried actively. Although auto optimize can be beneficial in many situations, you can see decreased...

Updated April 21st, 2023 by sergios.lalas

Field name sorting changes in Apache Spark 3.x

Problem: When you apply a map transformation to an RDD on Databricks Runtime 9.1 LTS and above, the resulting schema order differs from the order produced by the same map transformation on Databricks Runtime 7.3 LTS. Cause: Databricks Runtime 9.1 LTS and above incorporate Apache Spark 3.x. Starting with Spark 3.0.0, rows created from named arguments...
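
The ordering difference described in this summary can be imitated in plain Python. The sketch below is only an illustration and does not call any Spark API: the helper `make_row_fields` and its `sort_fields` flag are hypothetical names invented here. It assumes that the older behavior sorted field names alphabetically, while Spark 3.x keeps named arguments in the order they were entered.

```python
# Plain-Python illustration only; no Spark API is called here.
# make_row_fields and sort_fields are hypothetical names for this sketch.
def make_row_fields(sort_fields, **named_args):
    """Return field names in the order a row would store them."""
    names = list(named_args)  # keyword arguments keep their entry order
    # Assumed older behavior: alphabetical sorting of field names.
    # Assumed Spark 3.x behavior: order as entered.
    return sorted(names) if sort_fields else names


legacy = make_row_fields(True, name="Ada", id=1, city="Paris")
current = make_row_fields(False, name="Ada", id=1, city="Paris")

print(legacy)   # ['city', 'id', 'name']
print(current)  # ['name', 'id', 'city']
```

Code that reads row values by position rather than by field name is the kind most likely to be affected by such a change, so accessing fields by name is the safer habit when moving between runtimes.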
