Check for URL in a String - Python

Last Updated : 24 Dec, 2025

We are given a string that may contain one or more URLs and our task is to extract them efficiently. This is useful for web scraping, text processing, and data validation.

**For Example:**

**Input:** s = "My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/"

**Output:** ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']

The output is a list containing all the URLs found in the string.

Below are several methods to perform this task:

Using re.findall()

re.findall() function in Python is used to find all occurrences of a pattern in a given string and return them as a list.

```python
import re

s = "My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/"

# Match http:// or https:// URLs, or addresses that start with a literal "www."
pattern = r'https?://\S+|www\.\S+'
print("URLs:", re.findall(pattern, s))
```

Output

URLs: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']

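**Explanation:**

The first alternative in the pattern, https?://\S+, matches http:// or https:// followed by one or more non-whitespace characters. The second alternative catches addresses that begin with www and carry no scheme. re.findall() then returns every non-overlapping match as a list.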
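Because \S+ consumes everything up to the next whitespace, a URL at the end of a sentence picks up the trailing period or comma. The following is a minimal sketch of one way to trim that, and to answer the yes/no question of whether a string contains a URL at all. The helper names `extract_urls` and `contains_url`, the sample sentence, and the choice of punctuation to strip are illustrative assumptions, not part of the original example.

```python
import re

# The pattern from this section, with the dot after "www" escaped
# so that it matches a literal dot.
URL_PATTERN = re.compile(r'https?://\S+|www\.\S+')

def extract_urls(text):
    """Return URL-like substrings, trimming trailing sentence punctuation."""
    # rstrip() removes characters such as a closing period or comma
    # that \S+ swallowed along with the URL (hypothetical choice of characters).
    return [match.rstrip('.,;:!?)') for match in URL_PATTERN.findall(text)]

def contains_url(text):
    """Return True if the text contains at least one URL-like substring."""
    return URL_PATTERN.search(text) is not None

sample = "Docs live at https://www.geeksforgeeks.org/, see also www.example.com."
print(extract_urls(sample))           # ['https://www.geeksforgeeks.org/', 'www.example.com']
print(contains_url("no links here"))  # False
```

Stripping punctuation this way is a heuristic: a URL that legitimately ends in one of those characters would be shortened too.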

Using urlparse()

urlparse() function from urllib.parse breaks down a URL into components like scheme, domain, path, and query.

```python
from urllib.parse import urlparse

s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'
s1 = s.split()

urls = []
for word in s1:
    parsed = urlparse(word)
    # Keep only words that have both a scheme and a network location
    if parsed.scheme and parsed.netloc:
        urls.append(word)
print("URLs:", urls)
```

Output

URLs: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']

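**Explanation:**

s.split() breaks the string into whitespace-separated words, and urlparse() parses each one. A word is kept only when the result has both a scheme (such as https) and a netloc (such as www.geeksforgeeks.org), which filters out ordinary words.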
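To make the components concrete, here is a small sketch that prints what urlparse() returns for a single URL and wraps the scheme/netloc check in a helper. The `?ref=home` query string and the helper name `looks_like_url` are made up for illustration, and restricting the scheme to http and https is an added assumption that the original loop does not make.

```python
from urllib.parse import urlparse

# Hypothetical URL with a query string, used only to show the components.
parsed = urlparse("https://www.geeksforgeeks.org/404.html/?ref=home")

print(parsed.scheme)  # https
print(parsed.netloc)  # www.geeksforgeeks.org
print(parsed.path)    # /404.html/
print(parsed.query)   # ref=home

def looks_like_url(word):
    """Return True if word parses as an http(s) URL with a network location."""
    try:
        parts = urlparse(word)
    except ValueError:
        # urlparse() can raise ValueError on malformed input,
        # for example an unbalanced "[" in the host part.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)

print(looks_like_url("https://www.geeksforgeeks.org/"))  # True
print(looks_like_url("Profile:"))                        # False
```

Checking the scheme against a short allow-list is stricter than testing that it is non-empty, so a word such as `ftp://example.com` would be rejected by this helper.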

Using urlextract

urlextract is a third-party Python library that extracts URLs from text without requiring you to write regular expressions. Install it with the following pip command:

pip install urlextract

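```python
from urlextract import URLExtract

s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'

extractor = URLExtract()
urls = extractor.find_urls(s)
print("URLs:", urls)
```

Output

URLs: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']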

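**Explanation:**

URLExtract() creates an extractor object, and find_urls(s) scans the text and returns the URLs it finds as a list.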

Using startswith()

This approach splits the text into words and checks each word using the built-in startswith() method to see if it begins with "http://" or "https://". Matching words are treated as URLs and collected.

```python
s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'
x = s.split()

res = []
for i in x:
    # Treat a word as a URL if it begins with either scheme prefix
    if i.startswith("https://") or i.startswith("http://"):
        res.append(i)
print("URLs:", res)
```

Output

URLs: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']

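**Explanation:**

s.split() breaks the string into whitespace-separated words. Each word is tested with startswith(), and the words that begin with "http://" or "https://" are appended to res.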
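The same check can be written more compactly, since startswith() also accepts a tuple of prefixes. The sketch below additionally lowercases each word for the comparison, because URL schemes are case-insensitive. The function name `urls_by_prefix` and the sample sentence are assumptions made for illustration.

```python
def urls_by_prefix(text, prefixes=("http://", "https://")):
    """Return the whitespace-separated words that begin with one of the prefixes."""
    # str.startswith() accepts a tuple and returns True if any prefix matches.
    # Lowercasing only for the comparison keeps the original URL text intact.
    return [word for word in text.split() if word.lower().startswith(prefixes)]

s = "Visit HTTPS://www.geeksforgeeks.org/ or http://example.com today"
print(urls_by_prefix(s))
# ['HTTPS://www.geeksforgeeks.org/', 'http://example.com']
```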

Using find() method

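find() is a built-in string method that returns the lowest index at which a substring occurs, or -1 if the substring is absent. After splitting the text into words, a word is treated as a URL when find() returns 0 for "http:" or "https:", which means the word begins with that prefix.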

```python
s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'
s1 = s.split()

res = []
for i in s1:
    # find() returns 0 only when the word starts with the prefix
    if i.find("https:") == 0 or i.find("http:") == 0:
        res.append(i)
print("Urls: ", res)
```
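Since find() reports a position, it can also scan the whole string without splitting it first. The following is a rough sketch under the assumption that every URL ends at the next space character. The function name `find_urls` and its `prefixes` parameter are hypothetical, and this is a variation on the approach above rather than the same code.

```python
def find_urls(text, prefixes=("http://", "https://")):
    """Collect URLs by locating each prefix with str.find()."""
    urls = []
    pos = 0
    while pos < len(text):
        # Earliest occurrence of any prefix at or after pos;
        # find() returns -1 when the prefix is absent.
        starts = [i for i in (text.find(p, pos) for p in prefixes) if i != -1]
        if not starts:
            break
        start = min(starts)
        # Assume the URL runs until the next space, or to the end of the string.
        end = text.find(" ", start)
        if end == -1:
            end = len(text)
        urls.append(text[start:end])
        pos = end
    return urls

s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'
print(find_urls(s))
# ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']
```

Unlike the split-based version, this also picks up a URL that is glued to preceding text, such as one directly after an opening parenthesis.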