Python Pandas pivot() – Be on the Right Side of Change (original) (raw)

Syntax

pandas.pivot(data, index=None, columns=None, values=None)

Return Value: The return value for the pivot() function is a reshaped DataFrame organized by index/column values.

Background

Direct quote from the Pandas Documentation website:

This function does not support data aggregation. If there are multiple values, they will result in a Multi-Index in the columns.

This article delves into each parameter for this function separately.

Preparation

Before any data manipulation can occur, one (1) new library will require installation.

The Pandas library enables access to/from a DataFrame.

To install this library, navigate to an IDE terminal. At the command prompt ($), execute the code below. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

If the installation was successful, a message displays in the terminal indicating the same.

Feel free to view the PyCharm installation guide for the required library.

How to install Pandas on PyCharm

Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

import pandas as pd

staff = {'FName': ['Clare', 'Micah', 'Ben', 'Mac', 'Emma'], 'EID': [100, 101, 102, 103, 104], 'Job': ['Designer I', 'Data Scientist', 'Developer', 'Designer II', 'Manager'], 'Age': [19, 23, 21, 27, 36], 'Salary': [87343, 94123, 96543, 65232, 102375]}

💡 Note: The data structure used in this article is a Dictionary of Lists made up of Keys (column names) and associated Values (list).

The “data” Parameter

The pivot() function data parameter is a DataFrame. This parameter can be one of the following data types or another data type that converts to a DataFrame:

CSV
dictionary of lists (used in this article)
dictionary of tuples, and more

If the DataFrame is empty, the following output will display:

df = pd.DataFrame() print(df)

Output

Empty DataFrame Columns: [] Index: []

If the DataFrame contains staff, the output will be similar to the table below.

df = pd.DataFrame(staff) print(df)

💡 Note: Formatting will vary depending on the IDE used to run the code.

Output

| | FName | EID | Job | Age | Salary | | | ------- | ----- | --- | -------------- | ------ | ------ | | 0 | Clare | 100 | Designer I | 19 | 87343 | | 1 | Micah | 101 | Data Scientist | 23 | 94123 | | 2 | Ben | 102 | Developer | 21 | 96543 | | 3 | Mac | 103 | Designer II | 27 | 65232 | | 4 | Emma | 104 | Manager | 36 | 102375 |

The “index” Parameter

The pivot() method’s index parameter is not required. If used, these column(s) create a new index. If empty, the default index numbering system will display. The parameter can be one of the following data types:

string
object
list of strings
list of index names

💡 Note: All data assigned to the index parameter must contain unique values. If not, a ValueError occurs.

In this example, the index parameter is the EID and Job columns. The column parameter is required. If empty, the entire DataFrame will display. If missing, a TypeError will occur.

df_index = df.pivot(index=['EID', 'Job'], columns=[]) print(df_index)

Output

For this code example, the index parameter contains a list of column names. They display on the left in the same order as entered in the list: not the original data structure order.

The remaining columns with the relevant data will display (FName, Age, Salary) on the right.

EID	Job	FName	Age	Salary
100	Designer I	Clare	19	87343
101	Data Scientist	Micah	23	94123
102	Developer	Ben	21	96543
103	Designer II	Mac	27	65232
104	Manager	Emma	36	102375

The “columns” Parameter

The Pandas pivot() method’s columns parameter is required. The parameter can be one of the following data types:

string
object
list of strings
list of index names

If empty (as shown above), the entire DataFrame displays. If missing, a TypeError occurs.

df_cols = df.pivot(index=['EID', 'Job', 'Salary'], columns=['FName']) print(df_cols)

Output

In this example, the index columns will display on the left side. The order is the same as that listed in the above parameter.

The list passed to the columns parameter (FNames) shows the text as a sub-heading. The remaining sub-headings are the values from that column.

The Age column is the only column omitted from both parameters. This column becomes the main table heading.

The Age data from the original data structure will display as floats.

The “values” Parameter

This function’s value parameter is not required. The parameter can be one of the following data types:

string
object
list of the previous

This parameter uses column(s) for populating DataFrame values. If this parameter is empty, all remaining columns will display.

In this example, the Salary data from the original data structure will display as floats.

df_values = df.pivot(index=['EID', 'Job', 'Age'], columns=['FName'], values=['Salary']) print(df_values)

The DataFrame