JavaScript Web Scraping



This list collects JavaScript libraries for web scraping and data processing. It focuses on libraries that run in Node.js, without a real web browser.


Network

Web-Scraping Frameworks

HTML/XML Parsing

Text Processing

Libraries for parsing and manipulating plain text.

Specific Formats Processing

Libraries for parsing and manipulating specific text formats.

Natural Language Processing

Libraries for working with human languages.

Browser Automation and Emulation

Multiprocessing

Asynchronous

Libraries for asynchronous network programming.

Queue

Email

Libraries for parsing email.

URL and Network Address Manipulation

Libraries for parsing/modifying URLs and network addresses.

Web Content Extracting

Libraries for extracting web content.

WebSocket

Libraries for working with WebSocket.

DNS Resolving

Computer Vision

Proxy Server

Data Structure

Other JavaScript Lists