Updated October 14th, 2022 by deepak.bhutada
Different tables with same data generate different plans when used in same query
Problem: Assume you have two Delta tables, test_table_1 and test_table_2. Both tables have the same schema, data volume, and partitions, and contain the same number of files. You are joining them with another Delta table, test_table_join, which has a million records. When you run the join queries below using test_table_1 and test_...
1 min reading time
Updated October 26th, 2022 by deepak.bhutada
Using datetime values in Spark 3.0 and above
Problem: You are migrating jobs from unsupported clusters running Databricks Runtime 6.6 and below (with Apache Spark 2.4.5 and below) to clusters running a current version of the Databricks Runtime. If your jobs or notebooks perform date conversions, they may fail with a SparkUpgradeException error message when run on the upgraded clusters. ...
1 min reading time