PySpark Date Function Cheat Sheet (with Input-Output Types & Examples)

This one-pager covers the core PySpark date and timestamp functions, their input/output types, and example usage. It is suitable for data engineers and interview prep.

Date Conversion & Parsing

| Function | Input | Output | Example |
|---|---|---|---|
| to_date(col, fmt) | String | Date | to_date('2025-06-14', 'yyyy-MM-dd') → 2025-06-14 |
| to_timestamp(col, fmt) | String | Timestamp | to_timestamp('2025-06-14 12:01', 'yyyy-MM-dd HH:mm') |
| unix_timestamp(col, fmt) | String | Long (seconds since epoch) | unix_timestamp('2025-06-14', 'yyyy-MM-dd') |
| from_unixtime(col) | Long | String (formatted time) | from_unixtime(1718342400) |

Date Extraction

| Function | Input | Output | Example |
|---|---|---|---|
| year(col) | Date/Timestamp | Int | year('2025-06-14') → 2025 |
| month(col) | Date/Timestamp | Int | month('2025-06-14') → 6 |
| dayofmonth(col) | Date/Timestamp | Int | dayofmonth('2025-06-14') → 14 |
| dayofweek(col) | Date/Timestamp | Int (1=Sun, 7=Sat) | dayofweek('2025-06-14') → 7 |
| dayofyear(col) | Date/Timestamp | Int | dayofyear('2025-06-14') → 165 |
| weekofyear(col) | Date/Timestamp | Int | weekofyear('2025-06-14') → 24 |
| quarter(col) | Date/Timestamp | Int | quarter('2025-06-14') → 2 |
| hour(col) | Timestamp | Int | hour('2025-06-14 09:30:00') → 9 |
| minute(col) | Timestamp | Int | minute('2025-06-14 09:30:00') → 30 |
| second(col) | Timestamp | Int | second('2025-06-14 09:30:25') → 25 |

Date Arithmetic

| Function | Input | Output | Example |
|---|---|---|---|
| date_add(date, days) | Date | Date | date_add('2025-06-14', 10) → 2025-06-24 |
| date_sub(date, days) | Date | Date | date_sub('2025-06-14', 7) → 2025-06-07 |
| add_months(date, n) | Date | Date | add_months('2025-06-14', -1) → 2025-05-14 |
| months_between(date1, date2) | Dates | Double | months_between('2025-06-14', '2025-05-14') → 1.0 |
| datediff(end, start) | Dates | Int | datediff('2025-06-14', '2025-06-01') → 13 |
| next_day(date, 'day') | Date | Date | next_day('2025-06-14', 'Sunday') → 2025-06-15 |

Truncation & Formatting

| Function | Input | Output | Example |
|---|---|---|---|
| trunc(date, 'MM' or 'YYYY') | Date | Date (truncated) | |
| date_trunc('unit', ts) | Timestamp | Timestamp | date_trunc('hour', '2025-06-14 12:34:56') → 2025-06-14 12:00:00 |
| last_day(date) | Date | Date | last_day('2025-06-14') → 2025-06-30 |
| date_format(date, fmt) | Date/Timestamp | String | date_format('2025-06-14', 'MMM-yyyy') → 'Jun-2025' |

Miscellaneous

| Function | Input | Output | Example |
|---|---|---|---|
| current_date() | None | Date | Returns today (e.g., 2025-06-14) |
| current_timestamp() | None | Timestamp | Returns now (e.g., 2025-06-14 12:34:56) |
| now() | None | Timestamp | Alias; same as current_timestamp() |
| from_utc_timestamp(ts, tz) | Timestamp, TZ | Timestamp | from_utc_timestamp('2025-06-14 12:00', 'Asia/Kolkata') |
| to_utc_timestamp(ts, tz) | Timestamp, TZ | Timestamp | to_utc_timestamp('2025-06-14 17:30', 'Asia/Kolkata') |

Notes

Most functions expect a DateType or TimestampType column, not a String. Use to_date() / to_timestamp() to convert string columns before applying date functions. In PySpark, a plain Python string passed as the column argument is read as a column name, so wrap literal values in lit(), e.g. to_date(lit("2025-06-14")).

Practical Use Case: Get First and Last Day of Previous Month

    from pyspark.sql import functions as F

    # Step back one month from today, then take that month's last day
    prev_month_last = F.last_day(F.add_months(F.current_date(), -1))
    # Truncate that date to the month to get its first day
    prev_month_first = F.trunc(prev_month_last, 'MM')