Eliminating repeated lines from a file using Python

Last Updated : 19 Dec, 2025

Given a text file that contains several duplicate lines, the task is to remove all repeated lines and produce an output file containing only unique lines, while keeping their original order.

**Example:** Input file (myfile.txt)

This is a sample line.
Python is a powerful language.
This is a sample line.

**Output:**
This is a sample line.
Python is a powerful language.

Below are several methods to eliminate repeated lines from a file:

Using a Set

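This method keeps track of the lines already written in a Python set. A line is written to the output only the first time it appears, so the original order is preserved, and each membership check on the set takes constant time on average.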

```python
seen = set()  # lines that have already been written

with open("myfile.txt", "r") as f_in, open("output.txt", "w") as f_out:
    for ln in f_in:
        # Write a line only the first time it is encountered
        if ln not in seen:
            f_out.write(ln)
            seen.add(ln)
```

**Output:**

This is a sample line.
Python is a powerful language.

**Explanation:**

**seen = set():** Stores all unique lines encountered.
**for ln in f_in:** Reads every line one by one.
**if ln not in seen:** Checks if the line is unique.
**f_out.write(ln):** Writes the unique line to the output file.
**seen.add(ln):** Marks the line as seen.
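
One edge case the loop above does not handle: if the last line of the file has no trailing newline, it compares as different from an otherwise identical earlier line. The sketch below is one possible way to deal with that, by comparing lines with the newline stripped. The helper names `unique_lines` and `dedupe_file` are hypothetical and not part of the original example.

```python
def unique_lines(lines):
    """Yield each line once, in first-seen order, without its newline."""
    seen = set()
    for ln in lines:
        key = ln.rstrip("\n")  # compare content only, not the line ending
        if key not in seen:
            seen.add(key)
            yield key


def dedupe_file(src, dst):
    """Copy src to dst, keeping only the first occurrence of each line."""
    with open(src, "r") as f_in, open(dst, "w") as f_out:
        for ln in unique_lines(f_in):
            f_out.write(ln + "\n")  # always end each output line with a newline
```

For example, `dedupe_file("myfile.txt", "output.txt")` would produce the same output as the loop above, except that a final line without a newline is still recognised as a repeat.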

Using a List

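This method works the same way but stores the lines already seen in a list. Each `ln not in seen` check scans the whole list, so the running time grows quadratically with the number of unique lines, which makes this approach best suited to small files.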

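```python
seen = []  # lines that have already been written, in order

with open("myfile.txt", "r") as f_in, open("output.txt", "w") as f_out:
    for ln in f_in:
        # The membership test scans the list from the start
        if ln not in seen:
            f_out.write(ln)
            seen.append(ln)
```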

**Output:**

This is a sample line.
Python is a powerful language.

**Explanation:**

**f_out.write(ln):** Writes only unique lines.
**seen.append(ln):** Saves the line for comparison.
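
When the whole file fits in memory, the same first-seen order can also be obtained without an explicit loop. The sketch below relies on the fact that dictionary keys are unique and keep insertion order (guaranteed since Python 3.7). The function name `dedupe_in_memory` is a hypothetical choice for illustration.

```python
def dedupe_in_memory(lines):
    """Return the lines with repeats removed, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(lines))


sample = [
    "This is a sample line.\n",
    "Python is a powerful language.\n",
    "This is a sample line.\n",
]
for ln in dedupe_in_memory(sample):
    print(ln, end="")
```

Unlike the list version, each lookup here is a hash lookup, so the cost per line does not grow with the number of unique lines.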

Using Pandas

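This method loads the file into a Pandas DataFrame and removes repeats with its built-in drop_duplicates() function, which keeps the first occurrence of each row. Because read_csv() parses the file as comma-separated data and skips blank lines by default, it is best suited to files whose lines contain no commas.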

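```python
import pandas as pd

# header=None treats the first line as data rather than column names
df = pd.read_csv("myfile.txt", header=None)
# Keep only the first occurrence of each row
df.drop_duplicates(inplace=True)
# Write the rows back without the index or a header row
df.to_csv("output.txt", index=False, header=False)
```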
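
For arbitrary text, where lines may contain commas or quotes, one possible variant is to read the lines with plain Python and use Pandas only for the de-duplication step. This is a sketch under that assumption rather than a replacement for the code above. The function name `dedupe_with_pandas` is hypothetical.

```python
import pandas as pd


def dedupe_with_pandas(src, dst):
    """Remove repeated lines from src and write the result to dst."""
    # Read the raw lines ourselves so commas and quotes are not parsed as CSV
    with open(src, "r") as f_in:
        lines = [ln.rstrip("\n") for ln in f_in]

    # Series.drop_duplicates() keeps the first occurrence by default
    unique = pd.Series(lines, dtype="object").drop_duplicates()

    with open(dst, "w") as f_out:
        f_out.writelines(ln + "\n" for ln in unique)
```

Calling `dedupe_with_pandas("myfile.txt", "output.txt")` should give the same result as the set-based method, at the cost of holding the whole file in memory.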