How to Remove Punctuation in NLTK
Last Updated : 27 May, 2026
Natural Language Processing (NLP) is the processing and analysis of human language by machines. Removing punctuation is a common text preprocessing step: it strips tokens that carry little meaning for tasks such as word-frequency counts or bag-of-words classification, which leaves cleaner input for NLP models and text analysis.
Implementation
Step 1: Install NLTK
To install NLTK, run the following command in your command prompt:
pip install nltk
Step 2: Import Required Libraries
Imports NLTK and the word_tokenize function used to split text into tokens.
Python `
import nltk
from nltk.tokenize import word_tokenize
`
Step 3: Download Tokenizer Resources
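Downloads the Punkt tokenizer data that word_tokenize relies on. Recent NLTK releases load punkt_tab while older ones load punkt, so fetching both covers either case.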
Python `
nltk.download('punkt')
nltk.download('punkt_tab')
`
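Calling nltk.download() on every run is harmless but repeats a network check each time. As a rough sketch, the download can be skipped when the data is already installed. The helper name ensure_tokenizer_data is hypothetical (it is not part of NLTK); it relies only on nltk.data.find(), which raises LookupError for a missing resource, and on the quiet flag of nltk.download().

```python
import nltk


def ensure_tokenizer_data(resources=("punkt", "punkt_tab")):
    """Download each tokenizer resource only if it is not already installed.

    Hypothetical helper for illustration; returns the names it had to fetch.
    """
    fetched = []
    for name in resources:
        try:
            # Raises LookupError when the resource is not found locally
            nltk.data.find(f"tokenizers/{name}")
        except LookupError:
            nltk.download(name, quiet=True)
            fetched.append(name)
    return fetched
```

Calling ensure_tokenizer_data() once at the start of a script should be enough; on later runs it is expected to return an empty list because both resources are already present.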
Step 4: Define Input Text
Creates a sample sentence containing punctuation marks.
Python `
text = "Hello! Welcome to NLP, using NLTK."
`
Step 5: Tokenize and Remove Punctuation
Tokenizes the text, then keeps only the tokens for which isalnum() returns True, that is, tokens made up entirely of letters and digits.
Python `
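# Split the text into word and punctuation tokens
tokens = word_tokenize(text)
# Keep only tokens made up entirely of letters and digits
clean_text = [word for word in tokens if word.isalnum()]
print(' '.join(clean_text))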
`
Output:
Hello Welcome to NLP using NLTK
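One thing to watch with the isalnum() filter: it also discards tokens that merely contain a punctuation character, such as the contraction piece "n't" or a hyphenated word like "state-of-the-art". If those should survive, a gentler filter drops only the tokens made up entirely of punctuation. The sketch below is one possible variant, not part of NLTK: drop_punctuation_tokens is a hypothetical name, the token list is written out by hand in roughly the shape word_tokenize returns, and string.punctuation covers ASCII punctuation only.

```python
import string


def drop_punctuation_tokens(tokens):
    """Remove tokens that consist only of ASCII punctuation characters."""
    punct = set(string.punctuation)
    return [tok for tok in tokens if not all(ch in punct for ch in tok)]


# Hand-written tokens (assumed to resemble word_tokenize output)
tokens = ["Do", "n't", "skip", "state-of-the-art", "models", "!"]
print(drop_punctuation_tokens(tokens))
# ['Do', "n't", 'skip', 'state-of-the-art', 'models']
```

With the same input, the isalnum() filter would keep only "Do", "skip" and "models", so the choice between the two depends on whether contractions and hyphenated terms matter for the task.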