BiDi Protocol support in Selenium 4
One of the new features in the recently released Selenium 4 is support for event-driven listeners, which are intended to be powered by the currently-in-draft BiDirectional (BiDi) protocol (the current Selenium implementation has some limitations, which we’ll discuss later). In this article we’ll discuss some of these new capabilities and demonstrate how to use them in your own regression tests to inspect console logs and network requests made from the browser.
Limitations of Console/Network Log Support in Selenium 3
In previous versions of Selenium, console and network log information was accessible by pulling it via methods such as WebDriver.manage.logs.get(...). While that model does provide log access, it has a few shortcomings:
- Logs need to be actively requested. No built-in interface is available for having logs pushed to your code as they occur.
- There is no control over the volume of logs retained, which raises memory concerns in a long-lived session unless the logs are pulled periodically.
- While network requests and responses can be recorded after they have been made, no built-in mechanism is available to modify or block the requests themselves.
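For comparison, the pull-based model can be sketched roughly as follows. This is a minimal sketch, not code from the original article: it assumes Scala 2.13 (for scala.jdk.CollectionConverters), Selenium's Java bindings on the classpath and a chromedriver binary on the PATH. The object name PullLogsSketch, the describe helper and the https://example.com/ URL are placeholders, and the goog:loggingPrefs capability is what Chrome typically needs before it reports browser logs.

```scala
import java.util.logging.Level
import org.openqa.selenium.chrome.{ChromeDriver, ChromeOptions}
import org.openqa.selenium.logging.{LogEntry, LogType, LoggingPreferences}
import scala.jdk.CollectionConverters._

object PullLogsSketch {
  // Hypothetical helper: renders one pulled entry as a single line.
  def describe(entry: LogEntry): String =
    s"${entry.getTimestamp} [${entry.getLevel}] ${entry.getMessage}"

  def main(args: Array[String]): Unit = {
    // Ask Chrome to collect browser (console) logs at every level.
    val prefs = new LoggingPreferences()
    prefs.enable(LogType.BROWSER, Level.ALL)
    val options = new ChromeOptions()
    options.setCapability("goog:loggingPrefs", prefs)

    val driver = new ChromeDriver(options)
    try {
      driver.get("https://example.com/") // placeholder URL
      // Nothing reaches our code until we explicitly ask for it.
      val entries = driver.manage().logs().get(LogType.BROWSER).getAll.asScala
      entries.foreach(entry => println(describe(entry)))
    } finally driver.quit()
  }
}
```

Every call to get(...) is a separate round trip, so a long-running test has to decide for itself how often to poll.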
Selenium 4 BiDi support
While the existing pull-based log methods remain available in Selenium 4, a new set of APIs has been added that allows users to subscribe to console logs and intercept network requests. We’ll now demonstrate these APIs using Scala and Selenium’s Java library.
BiDi support in action
We’ll start by instantiating our WebDriver and creating buffers to hold the console log and network request information:
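A minimal sketch of that setup might look like the following. It is a reconstruction under assumptions rather than the article's original listing: the RequestData and ResponseData shapes are inferred from the output printed later in the article, the Setup object and startChrome helper are hypothetical names, and a chromedriver binary is assumed to be on the PATH.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch}
import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.devtools.DevTools
import org.openqa.selenium.devtools.events.ConsoleEvent

// Assumed shapes, inferred from the printed output later in the article.
case class RequestData(method: String, url: String)
case class ResponseData(status: Int)

object Setup {
  // Thread-safe buffers, since listeners may be invoked on a different
  // thread than the one driving the test.
  val consoleMessages = new ConcurrentLinkedQueue[ConsoleEvent]()
  val networkRequests = new ConcurrentLinkedQueue[(RequestData, ResponseData)]()

  // Released once a network request has been observed.
  val networkRequestLatch = new CountDownLatch(1)

  // Launches Chrome and opens a DevTools session on it.
  def startChrome(): (ChromeDriver, DevTools) = {
    val driver = new ChromeDriver()
    val devTools = driver.getDevTools()
    devTools.createSession()
    (driver, devTools)
  }
}
```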
To begin, we’re launching Chrome and opening a connection to it using the Chrome DevTools Protocol (CDP); in a future release it is expected to use the BiDi protocol instead. Tracking all console messages that are logged during the browser session is as simple as calling:
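As a hedged sketch, the registration could be written like this. The ConsoleCapture object and the appendTo helper are hypothetical names introduced here; the lambda-to-Consumer conversion assumes Scala 2.12 or newer.

```scala
import java.util.function.Consumer
import org.openqa.selenium.devtools.DevTools
import org.openqa.selenium.devtools.events.ConsoleEvent

object ConsoleCapture {
  // Hypothetical helper: a listener that appends every event to the buffer.
  def appendTo[A](buffer: java.util.Queue[A]): Consumer[A] =
    (event: A) => {
      buffer.add(event)
      ()
    }

  // Subscribes the buffer to console messages on an open DevTools session.
  def recordConsole(devTools: DevTools, consoleMessages: java.util.Queue[ConsoleEvent]): Unit =
    devTools.getDomains().events().addConsoleListener(appendTo(consoleMessages))
}
```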
addConsoleListener registers a function that is invoked whenever a console message is logged by the browser; in our case we simply add it to our consoleMessages buffer.
See the Potential Applications section for a discussion of how these messages could be handled in other ways.
To record network request and response information we can use the following code:
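A recording filter along these lines is one way to do it. This is a sketch under assumptions: the TrafficCapture object and its method names are hypothetical, the RequestData and ResponseData case classes are inferred from the printed output, and the HTTP method is stored as a plain string.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch}
import org.openqa.selenium.devtools.DevTools
import org.openqa.selenium.remote.http.{Filter, HttpHandler, HttpRequest, HttpResponse}

// Assumed shapes, inferred from the printed output later in the article.
case class RequestData(method: String, url: String)
case class ResponseData(status: Int)

object TrafficCapture {
  // Builds a Filter that lets each request proceed, then records the
  // request/response pair and releases the latch.
  def recordingFilter(
      networkRequests: ConcurrentLinkedQueue[(RequestData, ResponseData)],
      networkRequestLatch: CountDownLatch): Filter =
    new Filter {
      override def apply(next: HttpHandler): HttpHandler = new HttpHandler {
        override def execute(request: HttpRequest): HttpResponse = {
          val response = next.execute(request) // perform the real request
          networkRequests.add(
            (RequestData(request.getMethod.toString, request.getUri),
             ResponseData(response.getStatus)))
          networkRequestLatch.countDown()
          response // hand the unmodified response back to the browser
        }
      }
    }

  // Registers the filter on an open DevTools session.
  def recordTraffic(devTools: DevTools, filter: Filter): Unit =
    devTools.getDomains().network().interceptTrafficWith(filter)
}
```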
Let’s break this down: interceptTrafficWith allows us to register a Filter that is executed for every network request made by the browser. In our example, we execute each request and add an entry to our networkRequests buffer containing the HTTP method and URL of the request (RequestData), as well as the status code of the response (ResponseData). networkRequestLatch is not something you would typically use in your normal code; we add it here as a hook to ensure a request has been executed before closing the browser.
To demonstrate this functionality against a web page, we can use the following example.html file:
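The page only needs to log one console message and trigger one network request. The sketch below generates such a file from Scala so that the example stays in one language. The markup is an assumption chosen to match the output printed later (a "Hello, Selenium 4!" console message and a request to https://www.google.com/), and the ExamplePage object is a hypothetical name.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object ExamplePage {
  // Assumed page content: one console message and one outbound request.
  val html: String =
    """<!DOCTYPE html>
      |<html>
      |  <head><title>Selenium 4 BiDi example</title></head>
      |  <body>
      |    <script>
      |      console.log("Hello, Selenium 4!");
      |      fetch("https://www.google.com/", { mode: "no-cors" });
      |    </script>
      |  </body>
      |</html>
      |""".stripMargin

  // Writes the page to the given location and returns that path.
  def write(target: Path): Path =
    Files.write(target, html.getBytes(StandardCharsets.UTF_8))
}
```

Calling ExamplePage.write(java.nio.file.Paths.get("example.html")) would place the file in the working directory.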
Putting together all the earlier code snippets, you can test against the example.html file with the following code:
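One way the pieces could fit together is sketched below. It assumes Scala 2.13, a chromedriver binary on the PATH, a Selenium DevTools artifact compatible with the installed Chrome, and an example.html file in the working directory. The BiDiExample object name, the ten-second timeout and the rule that only HTTP(S) requests release the latch are choices made for this sketch, not details taken from the original listing.

```scala
import java.nio.file.Paths
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, TimeUnit}
import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.devtools.events.ConsoleEvent
import org.openqa.selenium.remote.http.{Filter, HttpHandler, HttpRequest, HttpResponse}
import scala.jdk.CollectionConverters._

object BiDiExample {
  // Assumed shapes, inferred from the printed output.
  case class RequestData(method: String, url: String)
  case class ResponseData(status: Int)

  val consoleMessages = new ConcurrentLinkedQueue[ConsoleEvent]()
  val networkRequests = new ConcurrentLinkedQueue[(RequestData, ResponseData)]()
  val networkRequestLatch = new CountDownLatch(1)

  // Executes every request unchanged and records what was asked and answered.
  val recordTraffic: Filter = new Filter {
    override def apply(next: HttpHandler): HttpHandler = new HttpHandler {
      override def execute(request: HttpRequest): HttpResponse = {
        val response = next.execute(request)
        networkRequests.add(
          (RequestData(request.getMethod.toString, request.getUri),
           ResponseData(response.getStatus)))
        // Assumption: only HTTP(S) traffic releases the latch, so that an
        // intercepted load of the local file cannot release it early.
        if (request.getUri.startsWith("http")) networkRequestLatch.countDown()
        response
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val driver = new ChromeDriver()
    try {
      val devTools = driver.getDevTools()
      devTools.createSession()

      // Push every console message into the buffer as it happens.
      devTools.getDomains().events().addConsoleListener { (event: ConsoleEvent) =>
        consoleMessages.add(event)
        ()
      }
      devTools.getDomains().network().interceptTrafficWith(recordTraffic)

      // Load the local page, then wait (bounded) for the outbound request.
      driver.get(Paths.get("example.html").toUri.toString)
      networkRequestLatch.await(10, TimeUnit.SECONDS)

      consoleMessages.asScala.foreach(println)
      networkRequests.asScala.foreach(println)
    } finally driver.quit()
  }
}
```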
If you run this code against the example.html file, you should see the following information printed to console:
2021-11-01T02:07:57.309Z [log] [["Hello, Selenium 4!"]]
(RequestData(GET,https://www.google.com/),ResponseData(200))
Potential Applications
While the example code simply recorded and printed the console and network request information it collected, there is plenty of potential for using this new functionality in more practical applications.
Streaming into storage/ingestion
Rather than relying on pulling logs periodically, network and console logs can now be streamed directly into a file or remote storage system (such as Amazon S3). Alternatively, the logs could be forwarded directly into a data ingestion pipeline such as Amazon Kinesis for additional processing/filtering before storage.
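As a small illustration of the file-based variant, the sketch below appends each event to a file as soon as it arrives instead of holding it in memory. LineSink is a hypothetical class introduced for this example. A remote target such as S3 or Kinesis would need its own client library and is not shown.

```scala
import java.io.BufferedWriter
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardOpenOption}

// Hypothetical sink: one line per event, flushed immediately so the file
// stays current even if the test run is interrupted.
final class LineSink(path: Path) extends AutoCloseable {
  private val writer: BufferedWriter = Files.newBufferedWriter(
    path,
    StandardCharsets.UTF_8,
    StandardOpenOption.CREATE,
    StandardOpenOption.APPEND)

  // Synchronized because listeners may fire from more than one thread.
  def append(line: String): Unit = synchronized {
    writer.write(line)
    writer.newLine()
    writer.flush()
  }

  override def close(): Unit = synchronized {
    writer.close()
  }
}
```

A listener registered through addConsoleListener could then call sink.append(event.toString) rather than adding the event to an in-memory buffer.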
Network Request Modification
By having access to every outbound network request before it is actually executed, one could begin conditionally injecting new data - such as supplemental headers - into outgoing network requests based on the attributes (headers, path, etc.) of the request. You could even block requests if desired.
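A Filter that modifies or blocks requests might look like the sketch below. The policy is invented for illustration: the /ads/ path check, the X-Test-Run header and the 403 stub response are all assumptions, and RequestRewriting is a hypothetical name. Answering from the filter without calling the next handler is used here as one way to keep a request from going out.

```scala
import org.openqa.selenium.remote.http.{Contents, Filter, HttpHandler, HttpRequest, HttpResponse}

object RequestRewriting {
  // Hypothetical policy: block anything under /ads/, tag everything else.
  val tagOrBlock: Filter = new Filter {
    override def apply(next: HttpHandler): HttpHandler = new HttpHandler {
      override def execute(request: HttpRequest): HttpResponse =
        if (request.getUri.contains("/ads/")) {
          // Answer from the test itself instead of forwarding the request.
          new HttpResponse()
            .setStatus(403)
            .setContent(Contents.utf8String("blocked by test"))
        } else {
          // Inject a supplemental header, then let the request proceed.
          request.addHeader("X-Test-Run", "regression")
          next.execute(request)
        }
    }
  }
}
```

Such a filter would be registered the same way as a recording one, by passing it to interceptTrafficWith.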
Limitations
While the new BiDi APIs offer new and interesting patterns for interacting with the browser, they share a limitation with the older pull-based log APIs: they depend on each individual browser implementing the protocols/APIs that the WebDriver needs. Because the BiDi Protocol is still in a draft state, support across browsers is quite limited. In fact, these new APIs currently rely on the Chrome DevTools Protocol rather than BiDi.
As the BiDi Protocol is finalized, browser support can be expected to improve, but until then you may not be able to use these new APIs with every browser you’d like to.
Conclusion
Selenium 4 provides a new mechanism for interacting with and recording log and network request information in the browser. While it has limited support today, the potential applications for using it make it at minimum a feature to monitor as the BiDi Protocol is finalized.