This repository has been archived by the owner on May 8, 2020. It is now read-only.

Headless chrome/chromium automation library (unofficial port of puppeteer)

miyakogi/pyppeteer

Pyppeteer

Pyppeteer has moved to pyppeteer/pyppeteer


Unofficial Python port of puppeteer, the JavaScript (headless) Chrome/Chromium browser automation library.

Installation

Pyppeteer requires Python 3.6+ (Python 3.5 is supported experimentally).

Install with pip from PyPI:

python3 -m pip install pyppeteer

Or install the latest version from GitHub:

python3 -m pip install -U git+https://github.com/miyakogi/pyppeteer.git@dev

Usage

Note: When you run pyppeteer for the first time, it downloads a recent version of Chromium (~100MB). If you would rather not have this happen at run time, run the pyppeteer-install command before running scripts that use pyppeteer.

Example: open a web page and take a screenshot.

import asyncio
from pyppeteer import launch

async def main():
    # Start a headless browser and open a new tab
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    # Save a screenshot of the page to example.png
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

Example: evaluate a script on the page.

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})

    # The JavaScript function runs in the page; its return value
    # comes back as a Python dict
    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
    # >>> {'width': 800, 'height': 600, 'deviceScaleFactor': 1}
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
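
Both examples close the browser only if every step succeeds. The following is a minimal sketch, not part of pyppeteer, of one way to guarantee cleanup with try/finally. The helper name capture and its launcher parameter are hypothetical and exist only for this illustration; launcher defaults to pyppeteer's launch.

```python
import asyncio


async def capture(url, path, launcher=None):
    """Open `url`, save a screenshot to `path`, and always close the browser.

    `launcher` is a hypothetical hook for this sketch; when omitted,
    pyppeteer's `launch` is used.
    """
    if launcher is None:
        # Imported lazily so the helper can also be driven by a stand-in.
        from pyppeteer import launch as launcher
    browser = await launcher()
    try:
        page = await browser.newPage()
        await page.goto(url)
        await page.screenshot({'path': path})
    finally:
        # Runs even if goto() or screenshot() raises.
        await browser.close()
```

It can be run the same way as the examples above, for instance asyncio.get_event_loop().run_until_complete(capture('http://example.com', 'example.png')).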

Pyppeteer has almost the same API as puppeteer. More APIs are listed in the documentation.

Puppeteer's documentation and troubleshooting guide are also useful for pyppeteer users.

Differences between puppeteer and pyppeteer

Pyppeteer aims to be as similar to puppeteer as possible, but some differences between Python and JavaScript make that difficult.

These are the differences between puppeteer and pyppeteer.

Keyword arguments for options

Puppeteer uses an object (a dictionary in Python) to pass options to functions/methods. Pyppeteer accepts both a dictionary and keyword arguments for options.

Dictionary-style option (similar to puppeteer):

browser = await launch({'headless': True})

Keyword-argument-style option (more Pythonic, isn't it?):

browser = await launch(headless=True)
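
To illustrate how one function can accept both call styles, here is a minimal sketch. The helper merge_options is hypothetical and is not claimed to be pyppeteer's actual implementation; in particular, letting keyword arguments win when the same key appears in both places is an assumption of this sketch.

```python
from typing import Any, Dict, Optional


def merge_options(options: Optional[Dict[str, Any]] = None,
                  **kwargs: Any) -> Dict[str, Any]:
    """Combine a puppeteer-style options dict with Python keyword arguments."""
    # Copy first so the caller's dictionary is never modified.
    merged = dict(options or {})
    # Assumption of this sketch: keyword arguments override dictionary entries.
    merged.update(kwargs)
    return merged
```

With this helper, merge_options({'headless': True}) and merge_options(headless=True) produce the same dictionary, which is why the two launch() calls above can be treated as equivalent.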

Element selector method name ($ -> querySelector)

In Python, $ cannot be used in a method name. So pyppeteer uses Page.querySelector() / Page.querySelectorAll() / Page.xpath() instead of Page.$() / Page.$$() / Page.$x(). Pyppeteer also has shorthands for these methods: Page.J(), Page.JJ(), and Page.Jx().
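
The mapping can be sketched as follows. The function names collect_elements and collect_elements_short, and the selectors 'h1', 'a', and '//p', are made up for this illustration; page is assumed to be a pyppeteer Page that has already loaded a document.

```python
async def collect_elements(page):
    """Query a page with the long method names."""
    heading = await page.querySelector('h1')      # puppeteer: page.$('h1')
    links = await page.querySelectorAll('a')      # puppeteer: page.$$('a')
    paragraphs = await page.xpath('//p')          # puppeteer: page.$x('//p')
    return heading, links, paragraphs


async def collect_elements_short(page):
    """Same queries using the shorthand names J, JJ, and Jx."""
    heading = await page.J('h1')
    links = await page.JJ('a')
    paragraphs = await page.Jx('//p')
    return heading, links, paragraphs
```

Both functions issue the same three queries; the shorthands only save typing.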

Arguments of Page.evaluate() and Page.querySelectorEval()

Puppeteer's version of evaluate() takes a raw JavaScript function or a string containing a JavaScript expression, but pyppeteer takes only a string of JavaScript. The string can be either a function or an expression. Pyppeteer tries to detect automatically whether the string is a function or an expression, but the detection sometimes fails. If an expression string is treated as a function and an error is raised, add the force_expr=True option, which forces pyppeteer to treat the string as an expression.

Example to get page content:

content = await page.evaluate('document.body.textContent', force_expr=True)

Example to get element's inner text:

element = await page.querySelector('h1')
title = await page.evaluate('(element) => element.textContent', element)
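
The two-step lookup above can also be written with Page.querySelectorEval(), which takes a selector and a JavaScript function string. The following sketch combines it with a forced expression; the helper name read_title_and_heading is hypothetical, page is assumed to be a pyppeteer Page with a document loaded, and the document is assumed to contain an h1 element.

```python
async def read_title_and_heading(page):
    """Return (document title, text of the first <h1>) from a loaded page."""
    # A plain expression: force_expr=True skips function detection.
    title = await page.evaluate('document.title', force_expr=True)
    # A function string, called with the first element matching 'h1'.
    heading = await page.querySelectorEval(
        'h1', '(element) => element.textContent')
    return title, heading
```

Passing force_expr=True for every plain expression is a defensive choice in this sketch; it is only required when automatic detection gets the string wrong.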

Future Plan

  1. Catch up with puppeteer's development
    • No plan to add original APIs that puppeteer does not have

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
