Named Entity Recognition

Last Updated : 2 Feb, 2026

Named Entity Recognition (NER) is an NLP task that identifies and categorizes important pieces of information in text, known as entities. These entities can be names of people, places, organizations, dates, etc. NER turns unstructured text into structured information, which supports tasks like text summarization, knowledge graph creation and question answering.

[Figure: Working of NER]

NER detects specific pieces of information in text and sorts them into predefined categories. It can also support other NLP tasks like part-of-speech tagging and parsing. Common entity types include Person, Organization, Location and Date.

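NER handles ambiguity by analyzing the surrounding words, the structure of the sentence and the overall context to make the correct classification. In other words, the meaning of an entity can change based on its context.

Example 1: In "Amazon is expanding rapidly", Amazon is an Organization.

Example 2: In "The Amazon is the largest rainforest", Amazon is a Location.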

Working of Named Entity Recognition (NER)

The steps involved in NER are as follows:

  1. **Analyzing the Text:** The system processes the entire text to locate words or phrases that could represent entities.
  2. **Finding Sentence Boundaries:** It identifies where sentences start and end using punctuation and capitalization, which helps preserve the meaning and context of entities.
  3. **Tokenizing and Part-of-Speech Tagging:** The text is broken into tokens (words) and each token is tagged with its grammatical role, which provides important clues for identifying entities.
  4. **Entity Detection and Classification:** Tokens or groups of tokens that match patterns of known entities are recognized and classified into predefined categories like Person, Organization, Location etc.
  5. **Model Training and Refinement:** Machine learning models are trained on labeled datasets and improve as they learn patterns and relationships between words.
  6. **Adapting to New Contexts:** A well-trained model can often generalize to new writing styles and previously unseen entities by learning from context.
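The first four steps can be sketched in plain Python. This is a toy illustration under stated assumptions, not how spaCy or any trained model works: the function names `split_sentences`, `tokenize` and `candidate_entities` are made up for this sketch, the "run of capitalized words" heuristic stands in for a learned detector, and part-of-speech tagging and classification into categories are left out.

```python
import re

def split_sentences(text):
    # Naive boundary detection: split after ., ! or ? when followed by
    # whitespace and an uppercase letter.
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
    return [p.strip() for p in parts if p.strip()]

def tokenize(sentence):
    # Words (allowing internal hyphens or apostrophes) or single punctuation marks.
    return re.findall(r"[A-Za-z0-9]+(?:[-'][A-Za-z0-9]+)*|[^\w\s]", sentence)

def candidate_entities(tokens):
    # Group consecutive capitalized tokens into candidate entity spans.
    candidates, run, run_start = [], [], 0
    for i, tok in enumerate(tokens + [""]):  # empty sentinel flushes the last run
        if tok[:1].isupper():
            if not run:
                run_start = i
            run.append(tok)
        else:
            # A single capitalized word at the start of a sentence is ambiguous,
            # so it is dropped rather than reported as a candidate.
            if run and not (run_start == 0 and len(run) == 1):
                candidates.append(" ".join(run))
            run = []
    return candidates

text = "She met Darshan Hiranandani in Mumbai. He works at Acme Corp."
for sentence in split_sentences(text):
    print(candidate_entities(tokenize(sentence)))
```

A real system replaces the capitalization heuristic with a trained model, which is what steps 5 and 6 describe.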

Methods of Named Entity Recognition

The main methods used for NER are:

1. Lexicon Based Method

This method uses a dictionary of known entity names and checks whether any of them appear in a given text. However, this approach isn't commonly used on its own because the dictionary needs constant updating and careful maintenance to stay accurate and effective.
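A minimal sketch of a lexicon lookup is shown here. The `LEXICON` dictionary, its labels and the `lexicon_ner` function are hypothetical names chosen for illustration; a real gazetteer would be far larger.

```python
import re

# Hypothetical gazetteer: surface form -> entity type.
LEXICON = {
    "Supreme Court": "ORG",
    "Lok Sabha": "ORG",
    "Darshan Hiranandani": "PERSON",
}

def lexicon_ner(text, lexicon):
    """Return (entity, start, end, label) for every dictionary entry found in text."""
    matches = []
    for name, label in lexicon.items():
        # \b prevents a dictionary entry from matching inside a longer word.
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            matches.append((m.group(), m.start(), m.end(), label))
    # Report matches in the order they appear in the text.
    return sorted(matches, key=lambda t: t[1])

text = "Mahua Moitra moved the Supreme Court after the Lok Sabha vote."
for match in lexicon_ner(text, LEXICON):
    print(match)
```

"Mahua Moitra" is not found because it is missing from the dictionary, which illustrates the maintenance burden of this method.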

2. Rule Based Method

This method uses a set of predefined rules to extract information. The rules are based on patterns and context. Pattern-based rules look at the structure and form of words, i.e. their morphological patterns. Context-based rules look at the surrounding words, i.e. the context in which a word appears in the document. Combining pattern-based and context-based rules increases the accuracy of information extraction in NER.
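The two kinds of rule can be sketched with regular expressions. Both rules and the `rule_based_ner` function are illustrative assumptions, not a standard rule set: the date rule is pattern-based (it looks only at the form of the text), while the person rule is context-based (it relies on a preceding title).

```python
import re

# Pattern-based rule: day, month name (or abbreviation), optional comma, year.
DATE_RULE = re.compile(
    r"\b\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*,? \d{4}\b"
)

# Context-based rule: capitalized words that follow a title are treated as a person.
PERSON_RULE = re.compile(
    r"\b(?:Mr|Mrs|Ms|Dr)\.? ([A-Z][a-z]+(?: [A-Z][a-z]+)*)"
)

def rule_based_ner(text):
    """Apply both rules and return (entity, label) pairs."""
    entities = [(m.group(), "DATE") for m in DATE_RULE.finditer(text)]
    # group(1) keeps the name and leaves out the title that triggered the rule.
    entities += [(m.group(1), "PERSON") for m in PERSON_RULE.finditer(text)]
    return entities

print(rule_based_ner("Dr. Asha Rao joined the panel on 2 Feb, 2026."))
```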

3. Machine Learning-Based Method

There are two main categories of machine learning-based methods:

4. Deep Learning Based Method

Implementation of NER in Python

Step 1: Installing Libraries

First we need to install the necessary libraries. The commands below use notebook syntax; drop the leading `!` when running them in a command prompt.

```
!pip install spacy
!pip install nltk
!python -m spacy download en_core_web_sm
```

Step 2: Importing and Loading data

We will use the pandas and spaCy libraries to implement this.

```python
import pandas as pd
import spacy

# Load the small English pipeline downloaded in Step 1
nlp = spacy.load("en_core_web_sm")

# Show up to 200 rows when printing DataFrames
pd.set_option("display.max_rows", 200)
```

Step 3: Applying NER to a Sample Text

We use a short news excerpt as sample text; you can substitute any text of your choice.

```python
content = (
    "Trinamool Congress leader Mahua Moitra has moved the Supreme Court "
    "against her expulsion from the Lok Sabha over the cash-for-query "
    "allegations against her. Moitra was ousted from the Parliament last "
    "week after the Ethics Committee of the Lok Sabha found her guilty of "
    "jeopardising national security by sharing her parliamentary portal's "
    "login credentials with businessman Darshan Hiranandani."
)

# Run the pipeline; recognized entities are available in doc.ents
doc = nlp(content)

# Print each entity with its character offsets and predicted label
for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)
```

**Output:**

[Figure: Resulting document]

It displays the names of the entities, their start and end positions in the text and their predicted labels.
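The start and end positions are character offsets into the original string, so slicing the text with them recovers the entity. A minimal sketch of that relationship, using a hand-written span (located with `str.find`) rather than actual model output:

```python
text = "Moitra moved the Supreme Court last week."

# Hand-written span for illustration; a model would supply these values.
entity = "Supreme Court"
start = text.find(entity)
end = start + len(entity)

# Slicing with the character offsets recovers the entity text.
recovered = text[start:end]

# The same offsets make it easy to annotate the entity inline.
annotated = text[:start] + "[" + recovered + "]" + text[end:]
print(annotated)
```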

Step 4: Visualizing Entities

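To make the result easier to read, we highlight each entity in the text along with its category using spaCy's displaCy visualizer.

```python
from spacy import displacy

# In a Jupyter notebook this renders the highlighted text inline;
# elsewhere it returns the HTML markup as a string.
displacy.render(doc, style="ent")
```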

**Output:**

[Figure: Highlighted text with their categories]

Step 5: Creating a DataFrame for Entities

```python
# Collect each entity's text, label and lemma into a table
entities = [(ent.text, ent.label_, ent.lemma_) for ent in doc.ents]
df = pd.DataFrame(entities, columns=['text', 'type', 'lemma'])
print(df)
```

**Output:**

[Figure: Text after categorization]

Here the DataFrame provides a structured representation of the named entities, their types and their lemmatized forms. NER helps organize unstructured text into structured information, making it useful for a wide range of NLP applications.