Scraping • 12 projects
Scraping
HTML & text parsing
Accessibility
PDF
Sort: By total number of stars
Cheerio
26.9k
The fast, flexible, and elegant library for parsing and manipulating HTML a...
HTML & text parsing
Scraping
Pushed 2 years ago
134 contributors
Created 14 years ago
Crawlee
9.04k
The scalable web scraping and crawling library for JavaScript/Node.js. Enab...
Scraping
Pushed 2 years ago
70 contributors
Created 9 years ago
Readability
6.45k
Extract the Readable Content from an HTML Document
Scraping
Accessibility
Pushed 3 years ago
67 contributors
Created 11 years ago
X-ray
5.76k
The next web scraper. See through the <html> noise.
Scraping
Pushed 5 years ago
41 contributors
Created 11 years ago
Postlight Parser
4.93k
Extract meaningful content from the chaos of a web page
HTML & text parsing
Scraping
Pushed 3 years ago
57 contributors
Created 9 years ago
Percollate
3.95k
A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Mar...
Scraping
PDF
Pushed 3 years ago
19 contributors
Created 7 years ago
scrape-it
3.93k
A Node.js scraper for humans.
Scraping
Pushed 2 years ago
19 contributors
Created 10 years ago
Metascraper
2.11k
Get unified metadata from websites using Open Graph, Microdata, RDFa, Twitt...
Scraping
Pushed 2 years ago
32 contributors
Created 10 years ago
website-scraper
1.42k
Download website to local directory (including all css, images, js, etc.)
Scraping
Pushed 2 years ago
16 contributors
Created 11 years ago
Article Extractor
1.08k
Extract main article, main image and meta data from URL
Scraping
Pushed 2 years ago
15 contributors
Created 10 years ago
linkinator
836
A super simple site crawler and broken link checker
Scraping
Pushed 2 years ago
23 contributors
Created 7 years ago
Unfurl
412
Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Prot...
Scraping
Pushed 2 years ago
21 contributors
Created 9 years ago