Problem
When working in a notebook, timezone-aware datetime64[ns, tz] columns in a pandas DataFrame do not reflect the expected timezone after conversion when rendered using the display() function. Although the conversion is correctly applied in memory, the displayed output remains in UTC.
Cause
Databricks' display() function leverages Apache Spark’s rendering behavior, which by default uses the session time zone (usually UTC unless explicitly configured).
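As a minimal sketch of the mismatch (assuming a Databricks notebook where display() is defined, and using “Australia/Sydney” purely as an example target time zone), the conversion below is visible in memory but not in the rendered output.
Python
import pandas as pd

df = pd.DataFrame({'ts': pd.to_datetime(['2025-05-28 08:00:00'])})
df['ts'] = df['ts'].dt.tz_localize('UTC').dt.tz_convert('Australia/Sydney')

print(df['ts'].iloc[0])  # 2025-05-28 18:00:00+10:00 -- conversion applied in memory
display(df)              # rendered timestamp falls back to the session time zone (UTC)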
Solution
Explicitly format timezone-aware datetime columns as strings using .strftime('%Y-%m-%d %H:%M:%S%z') before calling display(). This ensures the output includes the numeric offset.
Python
import pandas as pd

# Sample data
data = {
    'datetime': [
        '2025-05-28 08:00:00',
        '2025-05-28 12:30:00',
        '2025-05-28 16:45:00',
        '2025-05-29 00:15:00',
        '2025-05-29 04:00:00',
    ]
}
df = pd.DataFrame(data)

# Convert to datetime, localize to the base timezone (such as UTC), and convert to the desired timezone
dt_base = pd.to_datetime(df['datetime']).dt.tz_localize('UTC')
dt_converted = dt_base.dt.tz_convert('<your-target-timezone>')  # Replace with the desired time zone

# Format to string with the numeric offset
df['start_time_base'] = dt_base.dt.strftime('%Y-%m-%d %H:%M:%S%z')
df['start_time_converted'] = dt_converted.dt.strftime('%Y-%m-%d %H:%M:%S%z')

# Display the formatted output
display(df[['datetime', 'start_time_base', 'start_time_converted']])
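With this formatting, the rendered values carry their numeric offsets regardless of the session time zone. As a rough illustration, assuming “Australia/Sydney” as the target time zone, the first row would render as 2025-05-28 08:00:00+0000 in start_time_base and 2025-05-28 18:00:00+1000 in start_time_converted.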
If you’re using Spark DataFrames or SQL and want to consistently render all times in a specific timezone, you can optionally configure the session timezone explicitly with its TZ identifier, for example “Australia/Sydney”.
For the complete list of TZ identifiers, refer to the Time Zone Database.
Python
spark.conf.set("spark.sql.session.timeZone", "<your-target-timezone>")
SQL
SET TIME ZONE '<your-target-timezone>';
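To verify the setting took effect, you can read the session time zone back. This is a quick check, not part of the fix; the current_timezone() SQL function is available in Spark 3.1 and above.
Python
spark.conf.get("spark.sql.session.timeZone")  # returns the active session time zone, e.g. 'Australia/Sydney'
SQL
SELECT current_timezone();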