🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Updated Oct 21, 2024 - Python
🎭 Playwright integration for Scrapy
A package acting as a wrapper around the headless mode of existing web browsers to generate images from URLs and from HTML+CSS strings or files.
Run Selenium with Python via GitHub Actions using headless or non-headless browsers!
Example of username and password proxy authentication for use in Selenium
Pyppeteer integration for Scrapy
🗄 Save an archived copy of websites from Pocket/Pinboard/Bookmarks/RSS. Outputs HTML, PDFs, and more...
Scrapfly Python SDK for headless browsers and proxy rotation
Web crawler and scraper based on Scrapy and Playwright's headless browser.
An embeddable headless browser package for Python that provides a simplified interface for interacting with web pages using Selenium and Selenium Hub.
COVID-19 Apple Mobility Trends Reports
Automated Selenium-based scraper for extracting data from Myntra
Dare2024 Solver is a Python automation script for seamlessly solving Dare2024 quizzes. Impress your friends with correct answers effortlessly. Compatible with all dare2024 versions and future updates.
Automated Selenium-based scraper for extracting and analyzing job listings from Glassdoor
This repository contains a Python script that simulates views on a GitHub profile by repeatedly reloading the profile page. The script uses the selenium and requests libraries to fetch the content of the profile page and then reloads the page in a headless Firefox browser.