Pandas DataFrame interpolate() Method | Pandas Method

Last Updated : 02 Feb, 2024

Python is a great language for data analysis, primarily because of its ecosystem of data-centric packages. Pandas is one of those packages and makes importing and analyzing data much easier.

The Pandas interpolate() method fills NaN values in a DataFrame or Series using one of several interpolation techniques, rather than hard-coding replacement values.

Example:

Python3

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})

# interpolate() returns a new DataFrame, so assign the result
df = df.interpolate()
print(df)

Output:

     A    B   C
0  1.0  5.0   9
1  2.0  6.0  10
2  3.0  7.0  11
3  4.0  8.0  12

Syntax

Syntax: DataFrame.interpolate(method='linear', axis=0, limit=None, inplace=False, limit_direction=None, limit_area=None, downcast=None, **kwargs)

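Parameters :

method : interpolation technique to use, such as 'linear' (the default), 'time', 'index', 'nearest', 'polynomial' or 'spline'.
axis : axis to interpolate along, 0 (index) or 1 (columns).
limit : maximum number of consecutive NaNs to fill; must be greater than 0.
inplace : if True, update the data in place where possible.
limit_direction : 'forward', 'backward' or 'both'; the direction in which consecutive NaNs are filled.
limit_area : None, 'inside' or 'outside'; restricts filling to NaNs surrounded by valid values ('inside') or to NaNs outside them ('outside').
downcast : optional; downcast dtypes if possible.
**kwargs : keyword arguments passed on to the interpolating function.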

Returns : Series or DataFrame of same shape interpolated at the NaNs
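Because inplace defaults to False, the call returns a new object and leaves the original untouched. A minimal sketch of that behaviour, using a small made-up Series (the names s and filled are illustrative only):

```python
import numpy as np
import pandas as pd

# Illustrative data: one missing value between two known points
s = pd.Series([1.0, np.nan, 3.0])

# With the default inplace=False, interpolate() returns a new Series
filled = s.interpolate()

print(s.isna().sum())    # 1 -> the original still contains its NaN
print(filled.tolist())   # [1.0, 2.0, 3.0]
```

Assigning the result back to a variable, as above, is usually the simplest way to keep the interpolated values.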

Examples

Let’s look at some examples of the interpolate method of the Pandas library to fill NaN values in DataFrame or Series:

Example 1:

Use the interpolate() function to fill in the missing values using the linear method.

Python3

import pandas as pd

# None entries become NaN in the numeric columns
df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                   "B": [None, 2, 54, 3, None],
                   "C": [20, 16, None, 3, 8],
                   "D": [14, 3, None, None, 6]})

df

The DataFrame before interpolate():

      A     B     C     D
0  12.0   NaN  20.0  14.0
1   4.0   2.0  16.0   3.0
2   5.0  54.0   NaN   NaN
3   NaN   3.0   3.0   NaN
4   1.0   NaN   8.0   6.0

Let's interpolate the missing values using the linear method. Note that the linear method ignores the index and treats the values as equally spaced.

Python3

df.interpolate(method='linear', limit_direction='forward')

Output :

      A     B     C     D
0  12.0   NaN  20.0  14.0
1   4.0   2.0  16.0   3.0
2   5.0  54.0   9.5   4.0
3   3.0   3.0   3.0   5.0
4   1.0   3.0   8.0   6.0

As the output shows, the missing value in the first row could not be filled: the fill direction is forward, and there is no previous value to use in the interpolation.
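The earlier note that the linear method treats values as equally spaced is easiest to see with an unevenly spaced index. The sketch below uses a made-up Series and compares method='linear' with method='index', which uses the numeric index values as positions:

```python
import numpy as np
import pandas as pd

# Illustrative data: the missing point sits at index 1,
# much closer to index 0 than to index 10
s = pd.Series([0.0, np.nan, 10.0], index=[0, 1, 10])

# 'linear' ignores the index: the NaN is treated as the midpoint
linear = s.interpolate(method='linear')

# 'index' weights the fill by the actual index positions
by_index = s.interpolate(method='index')

print(linear.loc[1])    # 5.0
print(by_index.loc[1])  # 1.0
```

When the index carries real spacing information, such as measurement positions, method='index' may be the better fit; for evenly spaced rows the two methods give the same result.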

Example 2:

Use the interpolate() function to fill missing values in the backward direction using the linear method, with a limit on the maximum number of consecutive NaN values that can be filled.

Python3

import pandas as pd

df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                   "B": [None, 2, 54, 3, None],
                   "C": [20, 16, None, 3, 8],
                   "D": [14, 3, None, None, 6]})

df.interpolate(method='linear', limit_direction='backward', limit=1)

Output :

      A     B     C     D
0  12.0   2.0  20.0  14.0
1   4.0   2.0  16.0   3.0
2   5.0  54.0   9.5   NaN
3   3.0   3.0   3.0   5.0
4   1.0   NaN   8.0   6.0

Notice the fourth column: only one of its two missing values has been filled, because the limit is set to 1. The missing value in the last row could not be filled, as no row exists after it from which the value could be interpolated.
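The forward and backward examples each leave one edge unfilled. As a rough sketch of two other options in the signature, the snippet below tries limit_direction='both' and limit_area='inside' on a small made-up Series (the variable names are illustrative only):

```python
import numpy as np
import pandas as pd

# Illustrative data: NaNs at both edges and a run of three in the middle
s = pd.Series([np.nan, 1.0, np.nan, np.nan, np.nan, 5.0, np.nan])

# Fill at most one NaN on each side of every valid value
both = s.interpolate(method='linear', limit_direction='both', limit=1)

# Fill only NaNs that are surrounded by valid values
inside = s.interpolate(method='linear', limit_area='inside')

print(both.tolist())    # [1.0, 1.0, 2.0, nan, 4.0, 5.0, 5.0]
print(inside.tolist())  # [nan, 1.0, 2.0, 3.0, 4.0, 5.0, nan]
```

With limit_direction='both', the edge NaNs take the nearest valid value, while the middle of the three-NaN run stays empty because it is more than one step from any valid value. With limit_area='inside', the edges are left alone and only the interior gap is interpolated.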