Introduction
Chrome DevTools Protocol (CDP) is a set of APIs that allows developers to communicate with Chromium-based browsers, including Google Chrome. CDP was originally developed to power the Developer Tools features within Chrome, but since its introduction its usage has extended to much more than this initial use-case.
In this article, we’ll provide some practical examples for interacting with the Chrome DevTools Protocol, as well as cover how some popular testing libraries utilize CDP.
Prerequisites
The code examples in this article assume that the following programs are installed on your development machine:
- Python 3
- Node.js
- npm
- Chrome or another Chromium-based browser
Protocol Domains
Chrome DevTools Protocol is divided into domains. Each domain has a set of commands and events that it supports.
For example, the Network domain contains APIs for accessing the HTTP requests and responses made when rendering a page.
Another useful domain is the DOM (Document Object Model) domain. It exposes APIs for reading from and writing to the DOM. You can access query selectors, get element attributes, manipulate nodes and even scroll to selected nodes.
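On the wire, a command is a JSON object whose method field joins the domain and the command name with a dot, plus a client-chosen id. The following is a minimal sketch of building such messages with only the standard library. The helper name build_command, the node id, and the selector are illustrative assumptions; Network.enable and DOM.querySelector are commands from the domains described above.

```python
import itertools
import json

# Each command carries a client-chosen id so the reply can be matched to it.
_next_id = itertools.count(1)


def build_command(domain, command, **params):
    """Return a CDP command as a JSON string (hypothetical helper)."""
    message = {"id": next(_next_id), "method": f"{domain}.{command}"}
    if params:
        message["params"] = params
    return json.dumps(message)


# Enable network events for the page.
enable_network = build_command("Network", "enable")

# Look up the first <h1> below a node. The nodeId of 1 is an assumption;
# in practice it would come from an earlier DOM.getDocument reply.
find_heading = build_command("DOM", "querySelector", nodeId=1, selector="h1")

print(enable_network)
print(find_heading)
```

Sending these strings over the browser's WebSocket is left out here so that the sketch runs without a browser.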
Apart from powering Developer Tools in Chrome, Chrome DevTools Protocol provides some of the underlying functionality used in popular testing libraries like Playwright, Puppeteer, and Selenium.
Using Chrome DevTools Protocol APIs in Python
We can load a page in headless Chrome and use the Chrome DevTools Protocol APIs to debug it. Chrome DevTools Protocol works with any language that supports WebSockets.
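Because the protocol is JSON over a WebSocket, a client in any language mostly needs to tell three kinds of incoming messages apart: replies to its commands, error replies, and events. Here is a rough sketch of that distinction; the function name classify and the sample messages are made up for illustration.

```python
import json


def classify(raw):
    """Label a raw CDP message as 'response', 'error', or 'event'."""
    message = json.loads(raw)
    if "id" in message:
        # Replies echo the id of the command they answer.
        return "error" if "error" in message else "response"
    # Events carry no id, only the event name and its parameters.
    return "event"


# Sample messages in the shapes the protocol uses (values are made up).
samples = [
    '{"id": 1, "result": {}}',
    '{"id": 2, "error": {"code": -32601, "message": "method not found"}}',
    '{"method": "Page.loadEventFired", "params": {"timestamp": 1234.5}}',
]

for raw in samples:
    print(classify(raw))
```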
To get started, install PyChromeDevTools. PyChromeDevTools is a library that provides wrappers for the events, types, and commands specified in Chrome DevTools Protocol.
Next, run Chrome in headless mode with remote debugging enabled.
Once Chrome is running, you are ready to start interacting with the browser through CDP.
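One way to start the browser from Python is to build the command line and pass it to subprocess.Popen. The sketch below only builds the argument list and shows how a client might pick a WebSocket endpoint from the target list that Chrome serves at /json on the debugging port. The binary name google-chrome, port 9222, and the sample target are assumptions; adjust them for your system.

```python
import json


def chrome_command(binary="google-chrome", port=9222):
    """Build an argv list that starts headless Chrome with CDP exposed."""
    return [
        binary,
        "--headless",
        f"--remote-debugging-port={port}",
    ]


def pick_websocket_url(targets_json):
    """Return the WebSocket URL of the first page target, or None."""
    for target in json.loads(targets_json):
        if target.get("type") == "page":
            return target.get("webSocketDebuggerUrl")
    return None


# Shape of a reply from http://localhost:9222/json (values are made up).
sample_targets = json.dumps([
    {
        "id": "ABC123",
        "type": "page",
        "url": "about:blank",
        "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/ABC123",
    }
])

print(" ".join(chrome_command()))
print(pick_websocket_url(sample_targets))
```

Passing chrome_command() to subprocess.Popen starts the browser in the background; it keeps running until the process is terminated.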
Example: Loading a page via CDP
In this first example, we’ll write a Python script that navigates to a page and waits until it has loaded.
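A minimal sketch of such a script follows. It assumes the ChromeInterface API as described in the PyChromeDevTools README (domain attributes such as chrome.Page plus a wait_event method) and a Chrome instance listening on the library's default port. The function names measure_load_time and main are our own.

```python
import time


def measure_load_time(chrome, url, timeout=60):
    """Navigate to url and return the seconds until Page.loadEventFired."""
    # Domains must be enabled before they emit events.
    chrome.Network.enable()
    chrome.Page.enable()

    start = time.time()
    chrome.Page.navigate(url=url)
    chrome.wait_event("Page.loadEventFired", timeout=timeout)
    return time.time() - start


def main():
    # Third-party import kept local so the helper above can be used
    # (and tested) without the library installed.
    import PyChromeDevTools

    chrome = PyChromeDevTools.ChromeInterface()
    elapsed = measure_load_time(chrome, "https://example.com")
    print(f"Page loading time: {elapsed:.2f} s")
```

Call main() once headless Chrome is running to print the measured time.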
After enabling the Network and Page domains, we navigate to https://example.com, wait for the Page.loadEventFired event to be sent, and finally measure the time it took to load the page.
In our run, the page loaded in about 1.6 seconds.
Example: Retrieving cookies
In this next example, we’ll use the getCookies() command in the Network domain to extract the cookies associated with a page.
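The sketch below shows one way this could look. It assumes, following the PyChromeDevTools README, that a command call returns a pair of the reply and any other messages received in the meantime, and that the cookies sit under result.cookies in the reply. The helper name cookie_summary is hypothetical.

```python
def cookie_summary(chrome):
    """Return (name, domain) pairs for the cookies visible to the page."""
    # Assumed return shape: (reply, other_messages).
    reply, _messages = chrome.Network.getCookies()
    return [(c["name"], c["domain"]) for c in reply["result"]["cookies"]]


def main():
    # Third-party import kept local; needs Chrome with remote debugging on.
    import PyChromeDevTools

    chrome = PyChromeDevTools.ChromeInterface()
    chrome.Network.enable()
    chrome.Page.enable()
    chrome.Page.navigate(url="https://example.com")
    chrome.wait_event("Page.loadEventFired", timeout=60)

    for name, domain in cookie_summary(chrome):
        print(f"{name} ({domain})")
```

Each cookie object also carries fields such as value, path, and expires, which can be read the same way.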
There are many other useful commands in CDP that we won’t be able to cover in this article. Check the official documentation for more information.
Chrome DevTools Protocol and Puppeteer
Puppeteer is a browser automation tool that runs on Node.js. It provides an API for creating automated tests and scripts that use the Chrome DevTools Protocol under the covers.
Since interacting with the low-level Chrome DevTools Protocol can be tedious, using a higher-level library like Puppeteer to do most of the heavy lifting can be a great time-saver. Below is an architectural diagram describing how Puppeteer works (credit: https://devdocs.io/puppeteer/).
Puppeteer ships in two packages:
- puppeteer-core is the main library that handles all communications with Chrome DevTools Protocol APIs.
- puppeteer downloads and installs a version of Chromium and uses the puppeteer-core library to interact with the browser.
If you’re building a library or another end-user product where there’s no need to download another Chromium binary, it’s better to use puppeteer-core.
Uses of Puppeteer
Pretty much anything you can manually do in your browser can be automated with Puppeteer. This includes:
- Form field entry
- Clicks / Taps
- Page navigation
- Extracting text displayed on the page
In addition to replicating actions that a user can perform in the browser, Puppeteer can also perform actions that are included as part of Chrome Developer Tools. This includes:
- Generating screenshots and PDFs of pages.
- Recording load-time and runtime performance.
- Emulating various mobile devices, including their user agent, device dimensions, and pixel density.
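At the protocol level, DevTools-style actions like these correspond to commands in domains such as Page and Emulation. The sketch below is written in Python, like the earlier section of this article, and is not Puppeteer's own API. It builds the two emulation commands and decodes the base64 image that a Page.captureScreenshot reply carries. The helper names, viewport numbers, user-agent string, and sample reply are illustrative assumptions.

```python
import base64
import json


def emulate_device(width, height, scale, user_agent):
    """Build the commands that make a page render like a mobile device."""
    return [
        {"id": 1, "method": "Emulation.setDeviceMetricsOverride",
         "params": {"width": width, "height": height,
                    "deviceScaleFactor": scale, "mobile": True}},
        {"id": 2, "method": "Emulation.setUserAgentOverride",
         "params": {"userAgent": user_agent}},
    ]


def screenshot_bytes(reply):
    """Decode the base64 image carried by a Page.captureScreenshot reply."""
    return base64.b64decode(reply["result"]["data"])


# Made-up reply; a real one carries a complete PNG image.
fake_reply = {"id": 3, "result": {"data": base64.b64encode(b"\x89PNG").decode()}}

for command in emulate_device(390, 844, 3, "ExampleMobileAgent/1.0"):
    print(json.dumps(command))
print(screenshot_bytes(fake_reply))
```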
Another use of Puppeteer is running headless Chrome to prerender pages for server-side rendering (SSR). Many search engines rely on static HTML to index content, while more and more applications render their content with JavaScript. Prerendering pages using headless Chrome and Puppeteer is a great way to generate static HTML pages.
Debugging Web Pages with Puppeteer
Install the puppeteer and puppeteer-core packages with npm.
This script will use Puppeteer to extract all anchor tags on a page.
When run, the script outputs an array of all the hypertext references on the page.
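For comparison, a similar list of links can be requested at the protocol level with the Runtime.evaluate command, which runs a JavaScript expression in the page. The sketch below is in Python rather than Node.js, builds the command, and unpacks a reply. The helper names and the sample reply are assumptions for illustration, and it is not a description of how Puppeteer implements its own API.

```python
import json

# JavaScript evaluated inside the page: collect the href of every <a>.
EXPRESSION = "Array.from(document.querySelectorAll('a'), a => a.href)"


def links_command(message_id=1):
    """Build a Runtime.evaluate command that returns the links by value."""
    return json.dumps({
        "id": message_id,
        "method": "Runtime.evaluate",
        "params": {"expression": EXPRESSION, "returnByValue": True},
    })


def links_from_reply(raw):
    """Pull the list of hrefs out of the reply to links_command()."""
    return json.loads(raw)["result"]["result"]["value"]


# Made-up reply in the shape used for by-value results.
sample_reply = json.dumps({
    "id": 1,
    "result": {"result": {"type": "object",
                          "value": ["https://www.iana.org/domains/example"]}},
})

print(links_command())
print(links_from_reply(sample_reply))
```

Setting returnByValue asks the browser to serialize the array into the reply instead of returning a handle to a remote object.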
Chrome DevTools Protocol and Playwright
Microsoft released the public version of Playwright in July 2020. Playwright is similar to Puppeteer in many ways, and that’s not a coincidence: it was developed at Microsoft by the same team that initially developed Puppeteer at Google.
Playwright also uses Chrome DevTools Protocol to interact with Chromium-based browsers. One exciting feature in Playwright is the BrowserContext. Browser contexts let you operate many independent browser sessions. If a page opens another window, the new page is added to the parent page’s context. A browser context can have multiple pages (tabs).
If you don’t want to use the high-level methods provided by Playwright, you can use CDPSession to interact directly with Chrome DevTools Protocol.
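The sketch below combines both ideas using Playwright's Python sync API: two independent contexts, and a CDPSession attached to one page to call the Performance domain directly. The function name isolated_sessions and the choice of Performance.getMetrics are illustrative assumptions. CDP sessions are only available with Chromium-based browsers.

```python
def isolated_sessions(browser, url):
    """Open url in two independent contexts and query CDP from the first."""
    first = browser.new_context()
    second = browser.new_context()  # separate cookies and storage
    page_a = first.new_page()
    page_b = second.new_page()
    page_a.goto(url)
    page_b.goto(url)

    # Drop down to raw CDP for the first page.
    session = first.new_cdp_session(page_a)
    session.send("Performance.enable")
    metrics = session.send("Performance.getMetrics")

    first.close()
    second.close()
    return metrics


def main():
    # Third-party import kept local; requires Playwright and its browsers.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        print(isolated_sessions(browser, "https://example.com"))
        browser.close()
```

Call main() to run it against a real browser. The returned dictionary holds a metrics list of name and value pairs.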
Debugging Web Pages with Playwright
Install Playwright using pip, then download the browser binaries it drives with the playwright install command.
Let’s write a script that takes a screenshot of a website while emulating an iPhone 12 device.
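A minimal sketch of such a script, using Playwright's Python sync API, follows. The choice of Chromium, the URL, and the function name screenshot_on_device are assumptions. The device profile comes from Playwright's built-in devices registry, and the output file is example.png.

```python
def screenshot_on_device(playwright, url, device_name, path):
    """Save a screenshot of url rendered with a device profile."""
    # The profile bundles viewport, user agent, and pixel density.
    device = playwright.devices[device_name]
    browser = playwright.chromium.launch()
    context = browser.new_context(**device)
    page = context.new_page()
    page.goto(url)
    page.screenshot(path=path)
    browser.close()


def main():
    # Third-party import kept local; requires Playwright and its browsers.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        screenshot_on_device(p, "https://example.com", "iPhone 12", "example.png")
```

Call main() to produce the screenshot.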
Running the script creates a new example.png file in the current directory.
We can also monitor console output with Playwright.
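One way to do this with the Python sync API is to subscribe to the page's console event before navigating. In this sketch the function name collect_console and the URL are assumptions; page.on and the text property of console messages are part of Playwright's API.

```python
def collect_console(page, url):
    """Load url and return the text of every console message it emits."""
    messages = []
    # The "console" event fires for console.log, console.error, and so on.
    page.on("console", lambda msg: messages.append(msg.text))
    page.goto(url)
    return messages


def main():
    # Third-party import kept local; requires Playwright and its browsers.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        for line in collect_console(page, "https://example.com"):
            print(line)
        browser.close()
```

Registering the handler before page.goto matters, since messages logged during the initial load would otherwise be missed.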
Run the script to see the page’s console messages.
Chrome DevTools Protocol and Selenium
Selenium uses the WebDriver protocol. WebDriver provides a set of interfaces to discover and manipulate DOM elements in web documents and to control the behavior of user agents. For this to work, an intermediary server (the browser driver) is required. To test across different browsers, you’ll need separate drivers:
- ChromeDriver for Chrome
- GeckoDriver for Firefox
- SafariDriver for Safari
Version 4 of Selenium adds support for a new protocol called WebDriver BiDi. The WebDriver BiDi (short for “bi-directional”) interface is in its early stages, but its purpose is to provide a stable bidirectional API for cross-browser automation and testing. Right now, WebDriver BiDi is simply a wrapper around a subset of the Chrome DevTools Protocol; however, there is an effort underway to define a W3C spec for WebDriver BiDi and have other browser vendors implement that spec.
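Separately from WebDriver BiDi, Selenium's Python bindings for Chrome expose an execute_cdp_cmd method that forwards a single command to the Chrome DevTools Protocol. The sketch below uses it to read the page title through Runtime.evaluate. The function name page_title_via_cdp and the URL are assumptions, and the approach only applies to Chromium-based drivers.

```python
def page_title_via_cdp(driver):
    """Ask the browser for document.title through a raw CDP command."""
    reply = driver.execute_cdp_cmd(
        "Runtime.evaluate",
        {"expression": "document.title", "returnByValue": True},
    )
    return reply["result"]["value"]


def main():
    # Third-party import kept local; needs Selenium, Chrome, and its driver.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com")
        print(page_title_via_cdp(driver))
    finally:
        driver.quit()
```

Call main() to try it against a local Chrome installation.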
Conclusion
Chrome DevTools Protocol makes it possible to have the same set of tests work across Chromium-based browsers. Besides browser test automation, Chrome DevTools Protocol can also help with server-side rendering.
Although it’s powerful, the Chrome DevTools Protocol can be cumbersome to interact with directly. Before deciding to do so, consider tools like Puppeteer, Playwright, and Selenium, which provide higher-level abstractions over CDP.