Creating a dataframe from Pandas series

Last Updated : 29 Sep, 2023

A Series is a one-dimensional labeled array in Pandas that can hold integers, strings, floating-point numbers, and other types. Printing a Series shows its values alongside an index that by default runs from 0 to n-1, where n is the number of values in the Series. Later in this article we will discuss DataFrames in Pandas, but we first need to understand the main difference between a Series and a DataFrame.

A Series holds a single column of values with an index, whereas a DataFrame can be made of more than one Series. In other words, a DataFrame is a collection of Series that share an index and can be analyzed together.
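To make this relationship concrete, here is a minimal sketch. The names `names`, `counts`, `df`, and `column` are illustrative and not part of the examples that follow. It shows that selecting one column of a DataFrame gives back a Series.

```python
import pandas as pd

# Two Series with the same default index (0, 1)
names = pd.Series(['Jitender', 'Purnima'])
counts = pd.Series([210, 211])

# A DataFrame built from both Series; dictionary keys become column names
df = pd.DataFrame({'Author': names, 'Article': counts})

# Selecting a single column returns a Series again
column = df['Author']
print(type(column))
print(df.shape)
```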

**Creating Pandas DataFrames from Series**

Python3

    import pandas as pd

    # List of author names used as the Series values
    author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']

    # Build a Series; Pandas assigns the default index 0..3
    auth_series = pd.Series(author)
    print(auth_series)

**Output:**

    0    Jitender
    1     Purnima
    2       Arpit
    3       Jyoti
    dtype: object

Let’s check the type of the Series:

Python3

    print(type(auth_series))

**Output:**

    <class 'pandas.core.series.Series'>
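A single Series can also be turned into a one-column DataFrame directly. The sketch below uses `Series.to_frame()` for this; the variable name `auth_frame` and the column label `'Author'` are chosen here for illustration.

```python
import pandas as pd

auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])

# to_frame() wraps the Series in a one-column DataFrame;
# the name argument sets the column label
auth_frame = auth_series.to_frame(name='Author')

print(type(auth_frame))
print(auth_frame.columns.tolist())
```

The index of the Series is carried over unchanged as the row index of the new DataFrame.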

**Create DataFrame From Multiple Series**

We create two lists, 'author' and 'article', and pass them to pd.Series() to create two Series. We then build a dictionary with the Series objects as its values; the keys of the dictionary become the column names of the DataFrame.

Python3

    import pandas as pd

    author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']
    article = [210, 211, 114, 178]

    # One Series per future column
    auth_series = pd.Series(author)
    article_series = pd.Series(article)

    # Dictionary keys become the column names
    frame = {'Author': auth_series,
             'Article': article_series}

    result = pd.DataFrame(frame)
    print(result)

**Output:**

         Author  Article
    0  Jitender      210
    1   Purnima      211
    2     Arpit      114
    3     Jyoti      178
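As an alternative to the dictionary approach, several Series can be joined side by side with `pd.concat`. This is only a sketch of another option, not a replacement for the method above; the `keys` argument is used here to supply the column names.

```python
import pandas as pd

auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
article_series = pd.Series([210, 211, 114, 178])

# Concatenate along axis=1 so each Series becomes a column;
# keys supplies the column names
result = pd.concat([auth_series, article_series], axis=1,
                   keys=['Author', 'Article'])
print(result)
```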

**Add a Column in Pandas Dataframe**

We create one more Series holding the ages of the authors and assign it directly to a new column of the DataFrame.

Python3

    import pandas as pd

    auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
    article_series = pd.Series([210, 211, 114, 178])

    frame = {'Author': auth_series,
             'Article': article_series}
    result = pd.DataFrame(frame)

    # Assign a new Series to a new column name
    age = [21, 21, 24, 23]
    result['Age'] = pd.Series(age)
    print(result)

**Output:**

         Author  Article  Age
    0  Jitender      210   21
    1   Purnima      211   21
    2     Arpit      114   24
    3     Jyoti      178   23
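When a Series is assigned as a column, Pandas matches values to rows by index label rather than by position. The following sketch illustrates this with a deliberately reordered Series; the explicit `index` argument and the reversed ordering are assumptions made for the demonstration.

```python
import pandas as pd

result = pd.DataFrame({'Author': ['Jitender', 'Purnima', 'Arpit', 'Jyoti']})

# Ages listed in reverse order, but labeled with the matching row index
age = pd.Series([23, 24, 21, 21], index=[3, 2, 1, 0])

# Assignment aligns on index labels, so each age lands on its own row
result['Age'] = age
print(result['Age'].tolist())
```

This label-based matching is also why a shorter Series, or one with different labels, leaves gaps in the new column.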

**Missing value in Pandas Dataframe**

If a value is missing, Pandas fills it with NaN (its marker for a null value) by default. Here the age list has only three entries, so the fourth row gets NaN and the column is stored as floats.

Python3

    import pandas as pd

    auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
    article_series = pd.Series([210, 211, 114, 178])

    frame = {'Author': auth_series,
             'Article': article_series}
    result = pd.DataFrame(frame)

    # Only three ages for four rows; row 3 has no matching value
    age = [21, 21, 24]
    result['Age'] = pd.Series(age)
    print(result)

**Output:**

         Author  Article   Age
    0  Jitender      210  21.0
    1   Purnima      211  21.0
    2     Arpit      114  24.0
    3     Jyoti      178   NaN
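Once NaN values appear, it is often useful to count them and decide how to handle them. The sketch below uses `isna()` and `fillna()` for this. Filling with 0 is an arbitrary placeholder chosen for illustration, not a recommendation for real data.

```python
import pandas as pd

result = pd.DataFrame({
    'Author': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
    'Article': pd.Series([210, 211, 114, 178]),
})
# Only three ages for four rows, so index 3 has no match
result['Age'] = pd.Series([21, 21, 24])

# Count missing values per column
missing = result.isna().sum()
print(missing)

# The column is float64 because NaN is a floating-point value
print(result['Age'].dtype)

# One possible fix: replace NaN with a placeholder and convert back to int
filled = result['Age'].fillna(0).astype(int)
print(filled.tolist())
```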

**Creating a Dataframe using a dictionary of Series**

Here we build a dictionary whose values are Series and pass it to pd.DataFrame(). The keys of the dictionary become the column names, and each Series supplies the values of its column, one per row.

Python3

    import pandas as pd

    # Keys are column names; each Series holds that column's values
    dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
             'Author_Book_No': pd.Series([210, 211, 114, 178]),
             'Age': pd.Series([21, 21, 24, 23])}

    df = pd.DataFrame(dict1)
    print(df)

**Output:**

       Auth_Name  Author_Book_No  Age
    0   Jitender             210   21
    1    Purnima             211   21
    2      Arpit             114   24
    3      Jyoti             178   23

**Explicit Indexing in Pandas Dataframe**

Here, after providing an index to the DataFrame explicitly, every cell is filled with NaN. Each Series has its own default index (0, 1, 2, 3), and Pandas matches data to rows by index label. Because none of the labels 'SNo1' to 'SNo4' exist in the Series, no values are matched and all entries are NaN.

Python3

    import pandas as pd

    # Each Series keeps its default index 0..3
    dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
             'Author_Book_No': pd.Series([210, 211, 114, 178]),
             'Age': pd.Series([21, 21, 24, 23])}

    # The requested row labels do not exist in any Series
    df = pd.DataFrame(dict1, index=['SNo1', 'SNo2', 'SNo3', 'SNo4'])
    print(df)

**Output:**

          Auth_Name  Author_Book_No  Age
    SNo1        NaN             NaN  NaN
    SNo2        NaN             NaN  NaN
    SNo3        NaN             NaN  NaN
    SNo4        NaN             NaN  NaN

We can fix this by giving every Series the same index values as the DataFrame.

Python3

    import pandas as pd

    # Row labels shared by every Series and by the DataFrame
    labels = ['SNo1', 'SNo2', 'SNo3', 'SNo4']

    dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'],
                                    index=labels),
             'Author_Book_No': pd.Series([210, 211, 114, 178], index=labels),
             'Age': pd.Series([21, 21, 24, 23], index=labels)}

    df = pd.DataFrame(dict1, index=labels)
    print(df)

**Output:**

          Auth_Name  Author_Book_No  Age
    SNo1   Jitender             210   21
    SNo2    Purnima             211   21
    SNo3      Arpit             114   24
    SNo4      Jyoti             178   23
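Another way to reach the same result, sketched here as an alternative rather than as the approach used above, is to build the DataFrame with the default index and relabel the rows afterwards by assigning to `df.index`.

```python
import pandas as pd

dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
         'Author_Book_No': pd.Series([210, 211, 114, 178]),
         'Age': pd.Series([21, 21, 24, 23])}

# Build the DataFrame with the default index first
df = pd.DataFrame(dict1)

# Then relabel the rows; the list length must equal the number of rows
df.index = ['SNo1', 'SNo2', 'SNo3', 'SNo4']
print(df)
```

Because the labels are applied after the data is in place, no alignment happens and no NaN values are introduced.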