How to remove punctuation in NLTK

Last Updated : 27 May, 2026

Natural Language Processing (NLP) involves processing and analyzing human language using machines. Removing punctuation is a common text preprocessing step: it strips tokens that carry little lexical meaning, which shrinks the vocabulary and can help models and text analysis tasks that rely on word counts.

Implementation

Step 1: Install NLTK

To install NLTK, run the following command in your command prompt:

pip install nltk

Step 2: Import Required Libraries

Imports NLTK and the tokenizer used for splitting text into words.

Python `

import nltk
from nltk.tokenize import word_tokenize

`

Step 3: Download Tokenizer Resources

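Downloads the Punkt tokenizer data that word_tokenize() depends on. Recent NLTK releases load it from 'punkt_tab', while older releases use 'punkt', so fetching both covers either case.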

Python `

nltk.download('punkt')
nltk.download('punkt_tab')

`

Step 4: Define Input Text

Creates a sample sentence containing punctuation marks.

Python `

text = "Hello! Welcome to NLP, using NLTK."

`
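
As a quick sanity check before tokenizing, the sketch below lists the punctuation characters present in the sample sentence. It assumes Python's `string.punctuation` (ASCII punctuation only) as the definition of punctuation, which is a choice made here for illustration and not something the NLTK tokenizer relies on. The name `found` is hypothetical.

```python
import string

text = "Hello! Welcome to NLP, using NLTK."

# Collect every character that appears in string.punctuation (ASCII only).
found = [ch for ch in text if ch in string.punctuation]

print(found)  # ['!', ',', '.']
```

These three characters are the ones the next step is expected to remove.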

Step 5: Tokenize and Remove Punctuation

Tokenizes the text and keeps only the tokens for which isalnum() returns True, that is, tokens made up entirely of letters and digits. Punctuation tokens such as "!" and "," are dropped.

Python `

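# Split the sentence into word and punctuation tokens
tokens = word_tokenize(text)

# Keep only tokens made up entirely of letters and digits
clean_text = [word for word in tokens if word.isalnum()]

print(' '.join(clean_text))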

`

**Output:**

Hello Welcome to NLP using NLTK
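
One caveat worth knowing: isalnum() is strict, so it also discards tokens that mix letters with punctuation, such as hyphenated words or the `n't` piece that word_tokenize() splits off from contractions. The sketch below shows a gentler filter that drops only tokens made up entirely of punctuation. It assumes ASCII punctuation from `string.punctuation`; the helper name `drop_punct_tokens` and the hand-written token list (standing in for tokenizer output) are illustrative and not part of NLTK.

```python
import string

# Set of ASCII punctuation characters from the standard library.
PUNCT = set(string.punctuation)


def drop_punct_tokens(tokens):
    """Keep every token that has at least one non-punctuation character."""
    return [tok for tok in tokens if not all(ch in PUNCT for ch in tok)]


# Hand-written tokens standing in for word_tokenize() output (an assumption).
tokens = ["A", "state-of-the-art", "model", ",", "is", "n't", "it", "?"]

# Strict filter from Step 5: loses "state-of-the-art" and "n't".
print([tok for tok in tokens if tok.isalnum()])
# ['A', 'model', 'is', 'it']

# Gentler filter: removes only the pure-punctuation tokens.
print(drop_punct_tokens(tokens))
# ['A', 'state-of-the-art', 'model', 'is', "n't", 'it']
```

Which filter fits depends on the task: the strict isalnum() check gives a cleaner vocabulary, while the gentler one preserves negations and compound words that may matter for tasks such as sentiment analysis.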
