GitHub - scrapy/parsel: Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Parsel
Parsel is a BSD-licensed Python library to extract data from HTML, JSON, and XML documents.
It supports:
- CSS and XPath expressions for HTML and XML documents
- JMESPath expressions for JSON documents
- Regular expressions
Find the Parsel online documentation at https://parsel.readthedocs.org.
Example (open online demo):
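    >>> from parsel import Selector
    >>> text = """
    ... <html>
    ...     <body>
    ...         <h1>Hello, Parsel!</h1>
    ...         <ul>
    ...             <li><a href="http://example.com">Link 1</a></li>
    ...             <li><a href="http://scrapy.org">Link 2</a></li>
    ...         </ul>
    ...         <script type="application/json">{"a": ["b", "c"]}</script>
    ...     </body>
    ... </html>
    ... """
    >>> selector = Selector(text=text)
    >>> selector.css('h1::text').get()
    'Hello, Parsel!'
    >>> selector.xpath('//h1/text()').re(r'\w+')
    ['Hello', 'Parsel']
    >>> for li in selector.css('ul > li'):
    ...     print(li.xpath('.//@href').get())
    http://example.com
    http://scrapy.org
    >>> selector.css('script::text').jmespath("a").get()
    'b'
    >>> selector.css('script::text').jmespath("a").getall()
    ['b', 'c']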