Python | Working with date and time using Pandas

Time series data comes up often when working with data, and Pandas is a very useful tool for handling it.

Pandas provides a set of tools for the common tasks on date-time data. The examples below walk through them.

Working with Dates in Pandas

The date class in Python's datetime module deals with dates in the Gregorian calendar. It accepts three integer arguments: year, month, and day.

Python3

from datetime import date

# Build a date from year, month, and day
d = date(2000, 9, 17)
print(d)
print(type(d))

**Output:**

2000-09-17
<class 'datetime.date'>
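A date object also exposes its components as attributes, and pandas can turn it into a Timestamp. The following is a minimal sketch; the variable names are illustrative only.

```python
from datetime import date

import pandas as pd

d = date(2000, 9, 17)

# The values passed to the constructor are available as attributes
print(d.year, d.month, d.day)   # 2000 9 17

# pd.Timestamp accepts a datetime.date; the time part defaults to midnight
ts = pd.Timestamp(d)
print(ts)                       # 2000-09-17 00:00:00
```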

**Year, month, and day extraction**

Retrieve the year, month, and day components from a Timestamp object.

Python3

import pandas as pd

timestamp = pd.Timestamp('2023-10-04 15:30:00')

# Each component is available as an attribute
year = timestamp.year
print(year)

month = timestamp.month
print(month)

day = timestamp.day
print(day)

**Output:**

2023
10
4

**Weekdays and quarters**

Extract the hour and minute, and determine the weekday and quarter associated with a Timestamp.

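Python3

# Reuses the timestamp created in the previous example
hour = timestamp.hour
print(hour)

minute = timestamp.minute
print(minute)

# weekday() counts from Monday = 0, so 2 means Wednesday
weekday = timestamp.weekday()
print(weekday)

quarter = timestamp.quarter
print(quarter)

**Output:**

15
30
2
4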
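The numeric weekday is easy to misread, so it can help to print the name next to it. This short sketch uses the same date as above and only standard Timestamp members.

```python
import pandas as pd

timestamp = pd.Timestamp('2023-10-04 15:30:00')

# weekday() is zero-based and starts on Monday
print(timestamp.weekday())     # 2

# day_name() returns the readable name for the same day
print(timestamp.day_name())    # Wednesday

# quarter runs from 1 to 4; October falls in the fourth quarter
print(timestamp.quarter)       # 4
```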

Working with Time in Pandas

Another class in the datetime module is time. It returns a time object and takes integer arguments for the hour, minute, second, and microsecond:

Python3

from datetime import time

# hour, minute, second, microsecond
t = time(12, 50, 12, 40)
print(t)
print(type(t))

**Output:**

12:50:12.000040
<class 'datetime.time'>
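A date and a time can be joined into one value before handing it to pandas. The sketch below assumes the same date and time values used in the examples above; datetime.combine is part of the standard library.

```python
from datetime import date, datetime, time

import pandas as pd

d = date(2000, 9, 17)
t = time(12, 50, 12, 40)   # hour, minute, second, microsecond

# datetime.combine joins a date and a time into one datetime object
dt = datetime.combine(d, t)

# pd.Timestamp accepts the combined datetime
ts = pd.Timestamp(dt)
print(ts)                  # 2000-09-17 12:50:12.000040

# The time-of-day part can be recovered with .time()
print(ts.time() == t)      # True
```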

**Time periods and date offsets**

Create custom time periods and date offsets for flexible date manipulation.

Python3

# A Period with monthly frequency represents the whole of October 2023
time_period = pd.Period('2023-10-04', freq='M')

year = time_period.year
print(year)

month = time_period.month
print(month)

quarter = time_period.quarter
print(quarter)

# Shift the timestamp from the earlier example by a calendar-aware offset
date_offset = pd.DateOffset(years=2, months=3, days=10)
new_timestamp = timestamp + date_offset
print(new_timestamp)

**Output:**

2023
10
4
2026-01-14 15:30:00
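Two details are worth seeing in isolation: a Period covers a span with a start and an end, and a DateOffset in months follows the calendar rather than a fixed number of days. A minimal sketch with illustrative dates:

```python
import pandas as pd

period = pd.Period('2023-10-04', freq='M')
print(period)                    # 2023-10

# A monthly Period spans the whole month
print(period.start_time)         # 2023-10-01 00:00:00
print(period.end_time.date())    # 2023-10-31

# Adding an integer moves by whole periods
print(period + 1)                # 2023-11

# A month offset clips to the last valid day of the target month
end_of_jan = pd.Timestamp('2023-01-31')
shifted = end_of_jan + pd.DateOffset(months=1)
print(shifted)                   # 2023-02-28 00:00:00
```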

**Handling Time Zones**

Time zones play a crucial role in date and time data. Pandas provides mechanisms to handle time zones effectively:

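Python3

import pandas as pd

# Create a time zone-aware Timestamp
timestamp = pd.Timestamp('2023-10-04 15:30:00', tz='America/New_York')
print("Original Timestamp:", timestamp)

# Convert to UTC; tz_convert changes the zone of an aware Timestamp
utc_timestamp = timestamp.tz_convert('UTC')
print("UTC Timestamp:", utc_timestamp)

# Convert back to the original time zone
original_timestamp = utc_timestamp.tz_convert('America/New_York')
print("Original Timestamp (Back to America/New_York):", original_timestamp)

# The same conversions work on a whole DatetimeIndex
datetime_index = pd.DatetimeIndex(['2023-10-04', '2023-10-11', '2023-10-18'],
                                  tz='Asia/Shanghai')
print("Original DatetimeIndex:", datetime_index)

utc_datetime_index = datetime_index.tz_convert('UTC')
print("UTC DatetimeIndex:", utc_datetime_index)

original_datetime_index = utc_datetime_index.tz_convert('Asia/Shanghai')
print("Original DatetimeIndex (Back to Asia/Shanghai):", original_datetime_index)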

**Output:**

Original Timestamp: 2023-10-04 15:30:00-04:00
UTC Timestamp: 2023-10-04 19:30:00+00:00
Original Timestamp (Back to America/New_York): 2023-10-04 15:30:00-04:00
Original DatetimeIndex: DatetimeIndex(['2023-10-04 00:00:00+08:00', '2023-10-11 00:00:00+08:00',
               '2023-10-18 00:00:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq=None)
UTC DatetimeIndex: DatetimeIndex(['2023-10-03 16:00:00+00:00', '2023-10-10 16:00:00+00:00',
               '2023-10-17 16:00:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)
Original DatetimeIndex (Back to Asia/Shanghai): DatetimeIndex(['2023-10-04 00:00:00+08:00', '2023-10-11 00:00:00+08:00',
               '2023-10-18 00:00:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq=None)
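The difference between attaching a time zone and converting between time zones is a common source of confusion. The sketch below starts from a naive Timestamp (one with no zone) and uses tz_localize and tz_convert; the variable names are illustrative.

```python
import pandas as pd

naive = pd.Timestamp('2023-10-04 15:30:00')   # no time zone attached

# tz_localize attaches a zone to a naive timestamp; the wall-clock time is unchanged
ny = naive.tz_localize('America/New_York')
print(ny)          # 2023-10-04 15:30:00-04:00

# tz_convert moves an aware timestamp to another zone; the instant is unchanged
utc = ny.tz_convert('UTC')
print(utc)         # 2023-10-04 19:30:00+00:00

# Both values describe the same moment in time
print(ny == utc)   # True
```

Calling tz_localize on a timestamp that already has a zone raises an error, which is why conversions between zones go through tz_convert.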

**Working with Date and Time in Pandas**

Pandas provides convenient attributes for extracting specific date and time components from datetime data. The steps below walk through them.

**Step-1: Create a range of dates**

Python3

import pandas as pd

# Ten hourly timestamps starting on 1 January 2011
# ('H' is the hourly alias; pandas 2.2 and later prefer lowercase 'h')
data = pd.date_range('1/1/2011', periods=10, freq='H')
data

**Output:**

DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
               '2011-01-01 02:00:00', '2011-01-01 03:00:00',
               '2011-01-01 04:00:00', '2011-01-01 05:00:00',
               '2011-01-01 06:00:00', '2011-01-01 07:00:00',
               '2011-01-01 08:00:00', '2011-01-01 09:00:00'],
              dtype='datetime64[ns]', freq='H')

**Step-2: Create a range of dates and show basic features**

Python3

data = pd.date_range('1/1/2011', periods=10, freq='H')

# pd.datetime was removed in pandas 2.0; pd.Timestamp.now() gives the current time
x = pd.Timestamp.now()
x.month, x.year

**Output:**

(9, 2018)

Datetime features fall into two categories: moments within a period (such as the hour of the day) and the time elapsed since a particular point. Both can be very useful for understanding patterns in the data.
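The two categories can be sketched side by side. The three timestamps below are illustrative values, and the reference point for elapsed time is assumed to be the first row.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime([
    '2011-01-01 00:00',
    '2011-01-01 06:00',
    '2011-01-02 12:00',
]))

# Category 1: a moment within a period (here, the hour within the day)
hour_of_day = dates.dt.hour

# Category 2: time elapsed since a reference point (here, the first row)
elapsed = dates - dates.iloc[0]
hours_elapsed = elapsed.dt.total_seconds() / 3600

print(hour_of_day.tolist())     # [0, 6, 12]
print(hours_elapsed.tolist())   # [0.0, 6.0, 36.0]
```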

**Step-3: Divide a given date into features**

**pandas.Series.dt.year** returns the year of the datetime.
**pandas.Series.dt.month** returns the month of the datetime.
**pandas.Series.dt.day** returns the day of the datetime.
**pandas.Series.dt.hour** returns the hour of the datetime.
**pandas.Series.dt.minute** returns the minute of the datetime.
For the full list of datetime properties, refer to the pandas documentation for the Series.dt accessor.

Break date and time into separate features

Python3

rng = pd.DataFrame()
rng['date'] = pd.date_range('1/1/2011', periods=72, freq='H')

# Preview the first five rows (displayed only in an interactive session)
rng[:5]

# Split the datetime column into separate feature columns
rng['year'] = rng['date'].dt.year
rng['month'] = rng['date'].dt.month
rng['day'] = rng['date'].dt.day
rng['hour'] = rng['date'].dt.hour
rng['minute'] = rng['date'].dt.minute

rng.head(3)

**Output:**

                 date  year  month  day  hour  minute
0 2011-01-01 00:00:00  2011      1    1     0       0
1 2011-01-01 01:00:00  2011      1    1     1       0
2 2011-01-01 02:00:00  2011      1    1     2       0
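Calendar-based flags can be derived the same way. The sketch below assumes a small hypothetical frame of three daily dates and adds a weekend indicator, which is not part of the original example.

```python
import pandas as pd

# Hypothetical three-row frame; 1 January 2011 was a Saturday
frame = pd.DataFrame({
    'date': pd.to_datetime(['2011-01-01', '2011-01-02', '2011-01-03'])
})

# dayofweek is zero-based and starts on Monday, so 5 and 6 are the weekend
frame['dayofweek'] = frame['date'].dt.dayofweek
frame['is_weekend'] = frame['dayofweek'] >= 5

print(frame)
```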

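**Step-4: To get the present time, use Timestamp.now(), then convert the Timestamp to a datetime object and directly access the year, month, or day.**

Python3

# pandas.tslib has been removed from pandas; use pd.Timestamp directly
t = pd.Timestamp.now()
t

**Output:**

Timestamp('2018-09-18 17:18:49.101496')

Python3

# Convert the pandas Timestamp to a standard-library datetime object
t.to_pydatetime()

**Output:**

datetime.datetime(2018, 9, 18, 17, 18, 49, 101496)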

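**Step-5: Extract specific components of a datetime value, such as the year, month, day, hour, minute, and second, for further analysis.**

Python3

# Each component of the Timestamp from Step-4 is available as an attribute
print(t.year, t.month, t.day, t.hour, t.minute, t.second)

**Output:**

2018 9 18 17 18 49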

Exploring UFO Sightings Over Time

Let's apply these techniques to a real dataset, uforeports, loaded into a DataFrame named df.

Python3

df.head()

**Output:**

                   City Colors Reported Shape Reported State             Time
0                Ithaca             NaN       TRIANGLE    NY   6/1/1930 22:00
1           Willingboro             NaN          OTHER    NJ  6/30/1930 20:00
2               Holyoke             NaN           OVAL    CO  2/15/1931 14:00
3               Abilene             NaN           DISK    KS   6/1/1931 13:00
4  New York Worlds Fair             NaN          LIGHT    NY  4/18/1933 19:00
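To follow along without the dataset file, a stand-in DataFrame can be built from the five rows shown above. This is only a sketch: it is not the full dataset, and the dtype of the all-NaN column may differ from what reading the real file produces.

```python
import numpy as np
import pandas as pd

# Stand-in for the full dataset: only the five rows shown above
df = pd.DataFrame({
    'City': ['Ithaca', 'Willingboro', 'Holyoke', 'Abilene',
             'New York Worlds Fair'],
    'Colors Reported': [np.nan] * 5,
    'Shape Reported': ['TRIANGLE', 'OTHER', 'OVAL', 'DISK', 'LIGHT'],
    'State': ['NY', 'NJ', 'CO', 'KS', 'NY'],
    # Time is kept as text here, as it is before any conversion
    'Time': ['6/1/1930 22:00', '6/30/1930 20:00', '2/15/1931 14:00',
             '6/1/1931 13:00', '4/18/1933 19:00'],
})

print(df.head())
```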

The Time column holds strings, so convert it to the datetime format with pd.to_datetime:

Python3

# Parse the text values in the Time column into datetime64 values
df['Time'] = pd.to_datetime(df.Time)
df.head()

**Output:**

                   City Colors Reported Shape Reported State                Time
0                Ithaca             NaN       TRIANGLE    NY 1930-06-01 22:00:00
1           Willingboro             NaN          OTHER    NJ 1930-06-30 20:00:00
2               Holyoke             NaN           OVAL    CO 1931-02-15 14:00:00
3               Abilene             NaN           DISK    KS 1931-06-01 13:00:00
4  New York Worlds Fair             NaN          LIGHT    NY 1933-04-18 19:00:00
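When the layout of the text is known, passing it explicitly removes any ambiguity between month-first and day-first dates. The sketch below assumes the month/day/year layout visible in the sample rows and adds one deliberately invalid entry to show errors='coerce'.

```python
import pandas as pd

# Two values from the sample rows plus one invalid entry for illustration
raw = pd.Series(['6/1/1930 22:00', '6/30/1930 20:00', 'not a date'])

# format spells out the expected layout;
# errors='coerce' turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, format='%m/%d/%Y %H:%M', errors='coerce')

print(parsed)
```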

Display the data type of each column to confirm that Time is now a datetime column:

Python3

df.dtypes

**Output:**

City                       object
Colors Reported            object
Shape Reported             object
State                      object
Time               datetime64[ns]
dtype: object

Extract the hour from each value in the Time column:

Python3

df.Time.dt.hour.head()

**Output:**

0    22
1    20
2    14
3    13
4    19
Name: Time, dtype: int64

Retrieve the weekday name for each value in the Time column:

Python3

# dt.weekday_name was removed in pandas 1.0; dt.day_name() replaces it
df.Time.dt.day_name().head()

**Output:**

0     Sunday
1     Monday
2     Sunday
3     Monday
4    Tuesday
Name: Time, dtype: object

Retrieve the ordinal day of the year for each value in the Time column:

Python3

df.Time.dt.dayofyear.head()

**Output:**

0    152
1    181
2     46
3    152
4    108
Name: Time, dtype: int64
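The hour values can also be tallied numerically, which gives the counts that a chart of sightings by hour would display. The sketch below reuses only the five Time values shown earlier, so every hour appears once.

```python
import pandas as pd

# The five Time values from the sample rows
times = pd.to_datetime(
    pd.Series(['6/1/1930 22:00', '6/30/1930 20:00', '2/15/1931 14:00',
               '6/1/1931 13:00', '4/18/1933 19:00']),
    format='%m/%d/%Y %H:%M',
)

# Count rows per hour of the day, ordered by hour
counts = times.dt.hour.value_counts().sort_index()

print(counts)
```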

Create a visualization to explore the frequency of UFO sightings by hour of the day.

Python3

import matplotlib.pyplot as plt

df['Time'] = pd.to_datetime(df.Time)
df['Hour'] = df['Time'].dt.hour

# Histogram with one bin per hour of the day
plt.figure(figsize=(10, 6))
plt.hist(df['Hour'], bins=24, range=(0, 24), edgecolor='black', alpha=0.7)
plt.xlabel('Hour of the Day')
plt.ylabel('Number of UFO Sightings')
plt.title('UFO Sightings by Hour of the Day')
plt.xticks(range(0, 25))
plt.grid(True)
plt.show()

**Output:**

[Figure: histogram of UFO sightings by hour of the day]

Conclusion

Working with date and time data is an essential skill for data analysts and scientists. Pandas provides a comprehensive set of tools and techniques for effectively handling date and time information, enabling insightful analysis of time-dependent data. By mastering these techniques, you can gain valuable insights from time series data and make informed decisions in various domains.