Exploratory Data Analysis

Last Updated : 30 Apr, 2026

Exploratory Data Analysis (EDA) is an important step in data analysis where we explore, summarize, and visualize data to understand its structure, detect patterns, identify anomalies, test assumptions, and check relationships between variables before applying any machine learning or statistical models.

Importance

Types of Exploratory Data Analysis

1. Univariate Analysis

Univariate analysis studies one variable at a time to understand its characteristics and distribution.
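As a minimal sketch of univariate analysis, the snippet below summarizes a single numeric variable with pandas. The `ages` values are a hypothetical sample invented for illustration, not data from this article.

```python
import pandas as pd

# Hypothetical sample: ages of eight survey respondents.
ages = pd.Series([23, 25, 25, 29, 31, 35, 41, 62], name="age")

# Summary statistics describing the centre, spread and shape of one variable.
summary = {
    "mean": ages.mean(),
    "median": ages.median(),
    "std": ages.std(),    # sample standard deviation (ddof=1)
    "min": ages.min(),
    "max": ages.max(),
    "skew": ages.skew(),  # a positive value suggests a longer right tail
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Here the mean sits above the median, which is a common sign that a few large values are pulling the distribution to the right.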

2. Bivariate Analysis

Bivariate analysis examines the relationship between two variables to understand how they interact or influence each other. Common techniques include scatter plots, correlation coefficients, and cross-tabulations.
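One way to quantify the relationship between two numeric variables is a correlation coefficient. The sketch below uses pandas with a small hypothetical dataset; the column names `hours_studied` and `exam_score` are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data: study time and exam results for six students.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 64, 70, 75],
})

# Series.corr computes the Pearson correlation coefficient by default.
r = df["hours_studied"].corr(df["exam_score"])
print(f"Pearson r = {r:.3f}")
```

A value close to 1 indicates a strong positive linear relationship. Correlation alone does not show that one variable causes the other.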

3. Multivariate Analysis

Multivariate analysis studies three or more variables together to understand complex relationships within the dataset. Common techniques include pair plots, correlation heatmaps, and principal component analysis (PCA).
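A simple starting point for looking at several variables at once is a pairwise correlation matrix. The following sketch assumes a hypothetical housing dataset; all column names and values are illustrative only.

```python
import pandas as pd

# Hypothetical housing data with four numeric variables.
df = pd.DataFrame({
    "area_sqft": [650, 800, 950, 1200, 1500, 1800],
    "bedrooms": [1, 2, 2, 3, 3, 4],
    "age_years": [30, 22, 18, 12, 8, 3],
    "price_k": [120, 150, 175, 230, 280, 340],
})

# DataFrame.corr returns the Pearson correlation for every pair of columns.
corr = df.corr()
print(corr.round(2))
```

Each cell shows how strongly two variables move together, so the matrix gives a quick overview of which pairs deserve a closer look.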

Steps for Performing Exploratory Data Analysis

EDA involves a set of steps that help us understand the data, find patterns, detect issues, and prepare the data for further analysis or modelling. It can be performed using tools such as Python or R.

Common steps included in EDA

Step 1: Understanding the Problem and the Data

The first step in any data analysis project is to fully understand the problem we're solving and the data we have. This includes asking what the goal of the analysis is, what each variable represents, and what types of data are present.

Step 2: Importing and Inspecting the Data

The next step is to load the dataset into tools like Python or R and inspect it, for example by checking its size, column names, data types, and first few rows. These checks give a basic understanding of the dataset.
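The sketch below shows what a first inspection might look like in Python with pandas. To keep it self-contained, it reads a small hypothetical CSV from an in-memory string; in practice a file path would be passed to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """name,age,city,salary
Asha,29,Pune,52000
Ben,35,Leeds,61000
Chen,,Shanghai,58000
Dana,41,Austin,
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)         # (number of rows, number of columns)
print(df.dtypes)        # data type inferred for each column
print(df.head())        # first rows of the dataset
df.info()               # column names, non-null counts and memory usage
print(df.isna().sum())  # number of missing values per column
```

Note that `age` and `salary` are read as floating-point columns here because each contains a missing value.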

Step 3: Handling Missing Data

Missing data is common in many datasets and can affect the quality of analysis. During EDA, it is important to identify and handle missing values properly to avoid incorrect results.
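Two common ways to deal with missing values are dropping incomplete rows and filling the gaps (imputation). The sketch below illustrates both with pandas on a hypothetical dataset; using the median for numbers and an `"Unknown"` label for text is one reasonable choice among several, not a rule.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "age": [29, 35, np.nan, 41],
    "city": ["Pune", None, "Shanghai", "Austin"],
})

# Identify: count missing values per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: fill numeric gaps with the median and text gaps with a placeholder.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["city"] = filled["city"].fillna("Unknown")
print(filled)
```

Dropping rows is simple but discards information, while imputation keeps every row at the cost of introducing estimated values.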

Step 4: Exploring Data Characteristics

After handling missing data, the next step is to examine the main characteristics of the dataset. This helps us understand how the data is distributed, detect unusual values and identify potential issues before further analysis.
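As a rough illustration, pandas can summarize numeric and categorical columns in a couple of calls. The dataset below is hypothetical, and the column names `income_k` and `segment` are assumptions for the example.

```python
import pandas as pd

# Hypothetical customer data: one numeric and one categorical column.
df = pd.DataFrame({
    "income_k": [32, 35, 38, 40, 41, 45, 47, 250],
    "segment": ["A", "B", "A", "A", "C", "B", "A", "C"],
})

# describe() reports count, mean, std, min, quartiles and max.
numeric_summary = df["income_k"].describe()
print(numeric_summary)

# value_counts() shows how often each category appears.
segment_counts = df["segment"].value_counts()
print(segment_counts)
```

In this sample the mean is well above the median (the `50%` row), which hints that an unusually large value is present and worth investigating.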

Step 5: Performing Data Transformation

Data transformation prepares the dataset for better analysis and modelling. Depending on the dataset, we may need to modify or convert the data so that it is in a suitable format for analysis.
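The sketch below shows a few typical transformations with pandas and NumPy: converting text to dates, rescaling a numeric column, applying a log transform, and one-hot encoding a categorical column. The dataset and column names are hypothetical, and which transformations are appropriate depends on the data and the model.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a date stored as text, a numeric and a categorical column.
df = pd.DataFrame({
    "signup_date": ["2024-01-05", "2024-02-10", "2024-03-15"],
    "income": [30000.0, 60000.0, 90000.0],
    "plan": ["basic", "pro", "basic"],
})

# Type conversion: parse date strings into datetime values.
df["signup_date"] = pd.to_datetime(df["signup_date"])

# Min-max scaling: map income onto the range [0, 1].
income_range = df["income"].max() - df["income"].min()
df["income_scaled"] = (df["income"] - df["income"].min()) / income_range

# Log transform: log(1 + x) compresses large values.
df["income_log"] = np.log1p(df["income"])

# One-hot encoding: turn the categorical column into indicator columns.
encoded = pd.get_dummies(df, columns=["plan"])
print(encoded.head())
```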

Step 6: Visualizing Relationships in the Data

Data visualization helps us understand patterns, trends, and relationships in the dataset that may not be clear from numbers alone.
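A minimal sketch with Matplotlib is shown below: a histogram for the distribution of one variable and a scatter plot for the relationship between two. The data is hypothetical, and the `Agg` backend is selected only so the example runs without a display; in a notebook that line can be dropped.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical data: study time and exam results for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 64, 70, 75]

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram: distribution of a single variable.
ax_hist.hist(scores, bins=4)
ax_hist.set_title("Distribution of exam scores")
ax_hist.set_xlabel("Score")

# Scatter plot: relationship between two variables.
ax_scatter.scatter(hours, scores)
ax_scatter.set_title("Hours studied vs. score")
ax_scatter.set_xlabel("Hours studied")
ax_scatter.set_ylabel("Score")

fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name, or `plt.show()` in an interactive session, would then render the figure.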

Step 7: Handling Outliers

Outliers are data points that differ significantly from other observations. They may arise due to errors or genuine variations in the data.
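One widely used rule of thumb flags values that lie more than 1.5 times the interquartile range (IQR) beyond the first or third quartile. The sketch below applies that rule with pandas to hypothetical values; the 1.5 multiplier is a convention, not a fixed requirement, and this is only one of several possible detection methods.

```python
import pandas as pd

# Hypothetical incomes (in thousands) with one unusually large value.
values = pd.Series([32, 35, 38, 40, 41, 45, 47, 250])

# Quartiles and interquartile range.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Fences: anything outside [lower, upper] is flagged as a potential outlier.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print(outliers)
```

Whether a flagged value should be removed, capped, or kept depends on whether it is an error or a genuine observation.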

Step 8: Communicating Findings and Insights

The final step in EDA is to clearly present the results of the analysis. This helps others understand the insights discovered and the conclusions drawn from the data.

Application