Convert column type from string to datetime format in Pandas dataframe (original) (raw)

Last Updated : 27 Nov, 2024

To perform time-series operations, dates should be in the correct format. Let’s learn how to convert a Pandas DataFrame column of strings to datetime format.

Pandas Convert Column To DateTime using pd.to_datetime()

**pd.to_datetime() function in Pandas is the most effective way to handle this conversion. This is used to handle different formats and to convert string columns to datetime objects.

Python `

Importing pandas

import pandas as pd

Creating a DataFrame

df = pd.DataFrame({ 'Date': ['11/8/2011', '04/23/2008', '10/2/2019'], 'Event': ['Music', 'Poetry', 'Theatre'], 'Cost': [10000, 5000, 15000] })

Display initial data types

print("Before Conversion:") print(df.info())

Converting 'Date' column to datetime

df['Date'] = pd.to_datetime(df['Date'])

Display data types after conversion

print("\nAfter Conversion:") print(df.info())

`

**Output:

**Before Conversion:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):

Column Non-Null Count Dtype


0 Date 3 non-null object
1 Event 3 non-null object
2 Cost 3 non-null int64
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes
None

**After Conversion:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):

Column Non-Null Count Dtype


0 Date 3 non-null datetime64[ns]
1 Event 3 non-null object
2 Cost 3 non-null int64
dtypes: datetime64ns, int64(1), object(1)
memory usage: 200.0+ bytes
None

Before conversion, the Date column has the data type **object (string). After using **pd.to_datetime(), it changes to **datetime64[ns]. Now, let’ s explore how can we can convert columns from string-based formats like **yymmdd to **yyyymmdd using Pandas.

Converting from ‘yymmdd’ Format

If a column contains dates in the **yymmdd format, you can use **pd.to_datetime() with the appropriate format to convert it to **datetime64[ns].

Python `

Importing pandas library

import pandas as pd

Initializing the dataset

player_list = [ ['200712', 50000], ['200714', 51000], ['200716', 51500], ['200719', 53000], ['200721', 54000], ['200724', 55000], ['200729', 57000] ]

Creating a pandas DataFrame

df = pd.DataFrame(player_list, columns=['Dates', 'Patients'])

Displaying the DataFrame and its data types before conversion

print("Before Conversion:") print(df) print(df.dtypes)

Converting 'Dates' from 'yymmdd' to datetime format

df['Dates'] = pd.to_datetime(df['Dates'], format='%y%m%d')

Displaying the DataFrame and its data types after conversion

print("\nAfter Conversion:") print(df) print(df.dtypes)

`

**Output:

**Before Conversion:
Dates Patients
0 200712 50000
1 200714 51000
2 200716 51500
3 200719 53000
4 200721 54000
5 200724 55000
6 200729 57000
**Dates object
Patients int64
dtype: object

**After Conversion:
Dates Patients
0 2020-07-12 50000
1 2020-07-14 51000
2 2020-07-16 51500
3 2020-07-19 53000
4 2020-07-21 54000
5 2020-07-24 55000
6 2020-07-29 57000
**Dates datetime64[ns]
Patients int64
dtype: object

Converting from ‘yyyymmdd’ Format

When the column is already in yyyymmdd format, use **pd.to_datetime() with the ****%Y%m%d** format to convert it directly to **datetime64[ns].

Python `

import pandas as pd

Initializing the dataset

player_list = [ ['20200712', 50000, '20200812'], ['20200714', 51000, '20200814'], ['20200716', 51500, '20200816'], ['20200719', 53000, '20200819'], ['20200721', 54000, '20200821'], ['20200724', 55000, '20200824'], ['20200729', 57000, '20200824'] ]

Creating a pandas DataFrame

df = pd.DataFrame(player_list, columns=['Treatment_start', 'No_of_Patients', 'Treatment_end'])

Displaying the DataFrame and its data types before conversion

print("Before Conversion:") print(df) print(df.dtypes)

Converting 'Treatment_start' and 'Treatment_end' to datetime format

df['Treatment_start'] = pd.to_datetime(df['Treatment_start'], format='%Y%m%d') df['Treatment_end'] = pd.to_datetime(df['Treatment_end'], format='%Y%m%d')

Displaying the DataFrame and its data types after conversion

print("\nAfter Conversion:") print(df) print(df.dtypes)

`

**Output:

converting-from-yyyymmdd-format

Convert Column To DateTime using DataFrame.astype() function

Here’s how you can create a DataFrame, inspect its data types, and convert the string-based date column to datetime using **DataFrame.astype().

Python `

Importing pandas

import pandas as pd

Creating the DataFrame

df = pd.DataFrame({ 'Date': ['11/8/2011', '04/23/2008', '10/2/2019'], 'Event': ['Music', 'Poetry', 'Theatre'], 'Cost': [10000, 5000, 15000] })

Checking the initial data type of the 'Date' column

print("Before Conversion:") df.info()

`

**Output:

The Date column initially has the data type **object, indicating it contains strings.

Python `

Converting the 'Date' column to datetime format using astype()

df['Date'] = df['Date'].astype('datetime64[ns]')

Checking the data type after conversion

print("\nAfter Conversion:") df.info()

`

**Output:

Using **astype(‘datetime64[ns]’)** successfully converts the **Date column to the **datetime64[ns]` format.