Apify

We're making the web more programmable.

Pinned

  1. crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

    Python 4.1k 264

  2. crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…

    TypeScript 15.2k 643

  3. proxy-chain Public

    Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.

    JavaScript 839 139

  4. apify-sdk-js Public

    Apify SDK monorepo

    TypeScript 119 32

  5. got-scraping Public

    HTTP client made for scraping based on got.

    TypeScript 529 40

  6. fingerprint-suite Public

    Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

    TypeScript 920 97

Repositories

Showing 10 of 128 repositories
  • crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

    TypeScript 15,236 Apache-2.0 643 112 (1 issue needs help) 13 Updated Oct 8, 2024
  • crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

    Python 4,059 Apache-2.0 264 69 11 Updated Oct 8, 2024
  • fingerprint-suite Public

    Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

    TypeScript 920 Apache-2.0 97 18 10 Updated Oct 8, 2024
  • apify-docs Public

    This project is the home of Apify's documentation.

    API Blueprint 26 Apache-2.0 74 64 26 Updated Oct 8, 2024
  • actor-whitepaper Public

    This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon. Actors are a reincarnation of the UNIX philosophy for programs running in the cloud.

    1 0 7 4 Updated Oct 8, 2024
  • apify-sdk-python Public

    The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like Actor lifecycle management, local storage emulation, and Actor event handling.

    Python 115 Apache-2.0 11 10 1 Updated Oct 8, 2024
  • apify-cli Public

    The Apify command-line interface helps you create, develop, build, and run Apify Actors, and manage the Apify cloud platform.

    TypeScript 121 18 34 (1 issue needs help) 8 Updated Oct 8, 2024
  • apify-sdk-js Public

    Apify SDK monorepo

    TypeScript 119 Apache-2.0 32 9 7 Updated Oct 8, 2024
  • apify-actor-docker Public

    Base Docker images for Apify Actors.

    Dockerfile 69 Apache-2.0 22 9 2 Updated Oct 8, 2024
  • apify-shared-js Public

    Utilities and constants shared across Apify projects.

    TypeScript 12 Apache-2.0 10 4 3 Updated Oct 8, 2024