Construct a DataFrame in Pandas using string data
Last Updated : 20 Mar, 2025
Data comes in many formats, and plain strings are among the most common when working with sources such as CSV files, web scraping, or APIs. In this article, we will explore different ways to load string data into a Pandas DataFrame.
Using StringIO()
One way to create a DataFrame from string data is to use StringIO from the io module. StringIO wraps a string in an in-memory, file-like object, so pd.read_csv() can read it just as it would read a file on disk.
**Example**
```python
# Import necessary libraries
import pandas as pd
from io import StringIO

# Define string data
StringData = StringIO("""Date;Event;Cost
10/2/2011;Music;10000
11/2/2011;Poetry;12000
12/2/2011;Theatre;5000
13/2/2011;Comedy;8000
""")

# Read the data into a DataFrame using read_csv()
df = pd.read_csv(StringData, sep=';')

# Print the DataFrame
print(df)
```
Output
            Date    Event   Cost
    0  10/2/2011    Music  10000
    1  11/2/2011   Poetry  12000
    2  12/2/2011  Theatre   5000
    3  13/2/2011   Comedy   8000
**Explanation:**
- **StringIO()** wraps the multi-line string and exposes it as a file-like object.
- **pd.read_csv()** reads the string data using a semicolon (;) as the separator.
- The result is a structured Pandas DataFrame.
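As a small variation on the example above, the sketch below keeps the raw text in an ordinary variable (the name `raw_text` is hypothetical, standing in for something like an API response body) and wraps it in StringIO only at read time. It also checks the column types that read_csv() infers.

```python
import pandas as pd
from io import StringIO

# Hypothetical raw text, e.g. the body of an API response or a scraped block.
raw_text = """Date;Event;Cost
10/2/2011;Music;10000
11/2/2011;Poetry;12000
12/2/2011;Theatre;5000
13/2/2011;Comedy;8000
"""

# Wrap the string only at the point of reading; read_csv infers column types.
df = pd.read_csv(StringIO(raw_text), sep=";")

# Cost is parsed as an integer column; Date and Event remain text.
print(df.dtypes)

# Because Cost is numeric, arithmetic works without further conversion.
print(df["Cost"].sum())
```

Keeping the text and the StringIO wrapper separate is convenient because a StringIO object is consumed once it has been read, whereas the plain string can be wrapped again whenever needed.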
Using read_clipboard()
Another simple way to create a DataFrame from string data is the **pd.read_clipboard()** function. This method is particularly useful when copying and pasting data from an external source such as a webpage or document.
```python
# Import required library
import pandas as pd

# Copy the following data to the clipboard before running this script:
# Date;Event;Cost
# 10/2/2011;Music;10000
# 11/2/2011;Poetry;12000
# 12/2/2011;Theatre;5000
# 13/2/2011;Comedy;8000

# Read data from clipboard
df = pd.read_clipboard(sep=';')

# Print the DataFrame
print(df)
```
Output
            Date    Event   Cost
    0  10/2/2011    Music  10000
    1  11/2/2011   Poetry  12000
    2  12/2/2011  Theatre   5000
    3  13/2/2011   Comedy   8000
**Explanation:**
- The user manually copies the data to the clipboard before running the code.
- **pd.read_clipboard()** reads whatever text is currently on the clipboard and creates a DataFrame from it, so the output depends on what was copied.
Using a dictionary of strings
Another approach is to create a DataFrame using a dictionary where values are stored as lists of strings.
```python
import pandas as pd

# Creating a dictionary of string lists (one list per column)
string_data = {
    "Date": ["10/2/2011", "11/2/2011", "12/2/2011", "13/2/2011"],
    "Event": ["Music", "Poetry", "Theatre", "Comedy"],
    "Cost": ["10000", "12000", "5000", "8000"]
}

# Convert to DataFrame
df = pd.DataFrame(string_data)

# Print the DataFrame
print(df)
```
Output
            Date    Event   Cost
    0  10/2/2011    Music  10000
    1  11/2/2011   Poetry  12000
    2  12/2/2011  Theatre   5000
    3  13/2/2011   Comedy   8000
**Explanation:**
- A dictionary is created where each key represents a column name.
- The values are lists of strings, which become the column values. Because every value is a string, the Cost column holds text rather than numbers.
- **pd.DataFrame()** converts the dictionary into a DataFrame.
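Since the dictionary holds only strings, a common follow-up step is to convert columns to proper types. The sketch below is one possible way to do that with pd.to_datetime() and pd.to_numeric(); it assumes the dates are written day/month/year, which seems likely here because 13/2/2011 cannot be read month-first.

```python
import pandas as pd

string_data = {
    "Date": ["10/2/2011", "11/2/2011", "12/2/2011", "13/2/2011"],
    "Event": ["Music", "Poetry", "Theatre", "Comedy"],
    "Cost": ["10000", "12000", "5000", "8000"],
}
df = pd.DataFrame(string_data)

# Assumption: dates are day/month/year (13/2/2011 cannot be month-first).
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Convert the numeric strings to numbers so arithmetic works.
df["Cost"] = pd.to_numeric(df["Cost"])

print(df.dtypes)
print(df["Cost"].mean())
```

Passing an explicit `format` avoids ambiguity for dates such as 10/2/2011, which could otherwise be read as either 10 February or 2 October.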
Using a list of strings
We can also construct a DataFrame from a list of lists, where each inner list holds the string values for one row.
```python
import pandas as pd

# Creating a list of string records (one inner list per row)
string_data = [
    ["10/2/2011", "Music", "10000"],
    ["11/2/2011", "Poetry", "12000"],
    ["12/2/2011", "Theatre", "5000"],
    ["13/2/2011", "Comedy", "8000"]
]

# Define column names
columns = ["Date", "Event", "Cost"]

# Convert to DataFrame
df = pd.DataFrame(string_data, columns=columns)

# Print the DataFrame
print(df)
```
Output
            Date    Event   Cost
    0  10/2/2011    Music  10000
    1  11/2/2011   Poetry  12000
    2  12/2/2011  Theatre   5000
    3  13/2/2011   Comedy   8000
**Explanation:**
- A **list of lists** is created, where each sublist represents a row.
- Column names are defined separately.
- **pd.DataFrame()** converts the list into a DataFrame.
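When the rows arrive as delimited strings rather than ready-made lists, the list of lists can be built by splitting each string first. The following is a minimal sketch of that idea; the variable `raw_lines` is hypothetical and stands in for lines read from a log, a text file, or pasted text.

```python
import pandas as pd

# Hypothetical raw records, one semicolon-delimited string per row.
raw_lines = [
    "10/2/2011;Music;10000",
    "11/2/2011;Poetry;12000",
    "12/2/2011;Theatre;5000",
    "13/2/2011;Comedy;8000",
]

# Split each record on the delimiter to get one list of fields per row.
rows = [line.split(";") for line in raw_lines]

# Column names are supplied separately, as in the example above.
columns = ["Date", "Event", "Cost"]
df = pd.DataFrame(rows, columns=columns)

print(df)
```

Note that str.split() does no type inference, so every column, including Cost, is stored as text. For data with quoted fields or embedded delimiters, reading through StringIO and pd.read_csv() is likely the more robust choice.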