Creating a Pandas DataFrame using a list of lists

Last Updated : 29 Nov, 2023

In this article, we will explore how to create a Pandas DataFrame from a list of lists. A Pandas DataFrame is a 2-dimensional labeled data structure whose columns can hold different data types, and it is one of the most commonly used objects in the Pandas library. A DataFrame can be built in several ways; here we focus on passing a list of lists, where each inner list becomes one row.

Create Pandas Dataframe using list of lists

A DataFrame can be created from a list of lists in several ways. The commonly used approaches covered here are:

Using pd.DataFrame() Function
Handling Missing Values
DataFrame with Different Data Types
Defining column names using DataFrame.columns

Using pd.DataFrame() function

In this example, we create a list of lists and pass it to the Pandas DataFrame() constructor. We also pass the columns parameter, which supplies the column names.

Python3

import pandas as pd

# Each inner list is one row of the DataFrame
data = [['Geeks', 10], ['for', 15], ['geeks', 20]]

# Pass the rows and name the columns
df = pd.DataFrame(data, columns=['Name', 'Age'])

print(df)

Output:

    Name  Age
0  Geeks   10
1    for   15
2  geeks   20

Let’s see another example with the same implementation as above.

Python3

import pandas as pd

# Each inner list is one row: category, topic name, marks
data = [['DS', 'Linked_list', 10], ['DS', 'Stack', 9], ['DS', 'Queue', 7],
        ['Algo', 'Greedy', 8], ['Algo', 'DP', 6], ['Algo', 'BackTrack', 5]]

df = pd.DataFrame(data, columns=['Category', 'Name', 'Marks'])

print(df)

Output:

  Category         Name  Marks
0       DS  Linked_list     10
1       DS        Stack      9
2       DS        Queue      7
3     Algo       Greedy      8
4     Algo           DP      6
5     Algo    BackTrack      5
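
Two details of pd.DataFrame() are easy to check with a short sketch: what the column labels look like when the columns parameter is left out, and how each column gets its own inferred data type. The variable names df_default and df_named are illustrative only, and the data is a shortened copy of the rows used above.

```python
import pandas as pd

data = [['DS', 'Linked_list', 10], ['DS', 'Stack', 9], ['Algo', 'Greedy', 8]]

# Without the columns parameter, pandas labels the columns 0, 1, 2
df_default = pd.DataFrame(data)
print(list(df_default.columns))  # [0, 1, 2]

# With the columns parameter, the same rows get descriptive labels
df_named = pd.DataFrame(data, columns=['Category', 'Name', 'Marks'])
print(df_named.shape)   # (3, 3): three rows, three columns
print(df_named.dtypes)  # one inferred dtype per column
```

The 'Marks' column is inferred as an integer type, while the text columns are stored with a non-numeric dtype whose exact name can vary between pandas versions.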

Handling Missing Values

The code below creates a Pandas DataFrame named df from a list of lists in which missing values are written as None, then replaces any remaining None with NaN. Pandas already stores the None in the numeric 'Age' column as NaN, which is why that column is displayed as float. The resulting DataFrame holds the name, age, and occupation of each individual.

Python3

import pandas as pd
import numpy as np

# None marks a missing value
data = [['Geek1', 28, 'Engineer'],
        ['Geek2', None, 'Data Scientist'],
        ['Geek3', 32, None]]

columns = ['Name', 'Age', 'Occupation']

df = pd.DataFrame(data, columns=columns)

# Replace any remaining None with NaN
df = df.replace({None: np.nan})

print(df)

Output:

    Name   Age      Occupation
0  Geek1  28.0        Engineer
1  Geek2   NaN  Data Scientist
2  Geek3  32.0             NaN
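
Once the missing values are in place, a common next step is to count them or fill them in. The following is a minimal sketch, assuming the same data as above; the placeholder values 0 and 'Unknown' are illustrative choices, not recommendations.

```python
import pandas as pd

data = [['Geek1', 28, 'Engineer'],
        ['Geek2', None, 'Data Scientist'],
        ['Geek3', 32, None]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Occupation'])

# isna() treats both None and NaN as missing
missing_per_column = df.isna().sum()
print(missing_per_column)

# Fill the gaps column by column with illustrative placeholder values
filled = df.fillna({'Age': 0, 'Occupation': 'Unknown'})
print(filled)
```

Passing a dictionary to fillna() lets each column use a placeholder that suits its data type.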

DataFrame With Different Data Types

The code below creates a Pandas DataFrame from a list of lists in which the ‘Age’ column holds a mix of integers and a numeric string (‘32’), then converts that column with pd.to_numeric. With errors='coerce', any value that cannot be parsed as a number becomes NaN instead of raising an error. Here ‘32’ parses cleanly, so all three ages end up numeric.

Python3

import pandas as pd

# 'Age' mixes integers with the numeric string '32'
data = [['Geek1', 28, 'Engineer'],
        ['Geek2', 25, 'Data Scientist'],
        ['Geek3', '32', 'Manager']]

columns = ['Name', 'Age', 'Occupation']

df = pd.DataFrame(data, columns=columns)

# Convert 'Age' to a numeric dtype; unparseable values would become NaN
df['Age'] = pd.to_numeric(df['Age'], errors='coerce')

print(df)

Output:

    Name  Age      Occupation
0  Geek1   28        Engineer
1  Geek2   25  Data Scientist
2  Geek3   32         Manager
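
To see what errors='coerce' does with a value that cannot be parsed, the sketch below runs pd.to_numeric on a small Series. The entry 'unknown' is a hypothetical bad value added purely for illustration; it is not part of the data above.

```python
import pandas as pd

# 'unknown' is a hypothetical bad entry added for illustration
ages = pd.Series([28, '32', 'unknown'])

# '32' is parsed as a number; 'unknown' cannot be parsed and becomes NaN
converted = pd.to_numeric(ages, errors='coerce')
print(converted)
```

Because NaN is a floating-point value, the converted Series is stored as float64 whenever at least one entry had to be coerced.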

Defining column names using the DataFrame.columns attribute

A DataFrame can also be created without the columns parameter, with the names assigned afterwards through the df.columns attribute. Note that columns is an attribute, not a method, so it is assigned to rather than called. The example below also applies an operation to the result, a transpose.

The code below uses pandas to create a DataFrame from a list of lists, assigns the column names (‘Col_1’, ‘Col_2’, ‘Col_3’), prints the original DataFrame, transposes it, and prints the result. Transposing swaps the rows and columns of the DataFrame.

Python3

import pandas as pd

data = [[1, 5, 10], [2, 6, 9], [3, 7, 8]]

# Build the DataFrame first; the columns default to 0, 1, 2
df = pd.DataFrame(data)

# Assign the column names afterwards through the columns attribute
df.columns = ['Col_1', 'Col_2', 'Col_3']

print(df, "\n")

# Swap rows and columns
df = df.transpose()

print("Transpose of above dataframe is-")
print(df)

Output:

   Col_1  Col_2  Col_3
0      1      5     10
1      2      6      9
2      3      7      8

Transpose of above dataframe is-
        0  1  2
Col_1   1  2  3
Col_2   5  6  7
Col_3  10  9  8
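
Assigning to df.columns after construction should give the same result as passing the columns parameter up front. The sketch below checks this under the assumption that the same data and names are used in both cases; df_a, df_b, and names are illustrative variable names.

```python
import pandas as pd

data = [[1, 5, 10], [2, 6, 9], [3, 7, 8]]
names = ['Col_1', 'Col_2', 'Col_3']

# Approach 1: build first, then assign the column names
df_a = pd.DataFrame(data)
df_a.columns = names

# Approach 2: pass the column names to the constructor
df_b = pd.DataFrame(data, columns=names)

print(df_a.equals(df_b))  # True

# The assigned list must have one name per existing column
try:
    df_a.columns = ['only_one']
except ValueError as exc:
    print('ValueError:', exc)
```

The failed assignment leaves df_a unchanged, so a length mismatch surfaces immediately instead of silently mislabeling the data.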