Chromedriver - Under The Hood

- June 18, 2019

Selenium WebDriver has been our companion in our test automation adventures for some time now. Over the years, the Selenium open source project has achieved something I can fairly describe as global domination. With Selenium, we can interact with different browsers through their WebDrivers. But how much do we really know about them and the way they work?
In this article, we will take a closer look at "Chromedriver" and explore a little of how it works.

So, what is the role of this Webdriver?

Starting with the release of Selenium 2, we were introduced to the concept of the WebDriver.
"Chromedriver", for example, is a standalone executable (a .exe file on Windows) that acts as a server. Its role is to allow us to interact with the browser.
Eventually, the responsibility for developing and maintaining these drivers was transferred to the browser vendors. The idea is that they know their browsers best, so the drivers they produce should be much more stable and robust. In reality, that caused numerous problems: some methods were not properly invoked by the WebDrivers, and maintenance and bug fixes on the vendors' side were insufficient.

Let's start with a fairly primitive example

In the example below, we have a simple piece of Selenium code that does the following:

- Opening a Chrome browser
- Navigating to Google.com
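
A minimal sketch of such a script, assuming the Python bindings (the `selenium` package) and a ChromeDriver that Selenium can locate on this machine. The function name `open_google` is made up for illustration, and the original example may well have been written in another language:

```python
def open_google(url="https://www.google.com"):
    """Open Chrome, navigate to `url`, and return the page title."""
    # Imported lazily, so defining this function does not require Selenium.
    from selenium import webdriver

    # Starts a ChromeDriver server process and a Chrome browser.
    driver = webdriver.Chrome()
    try:
        driver.get(url)  # navigate to the page
        return driver.title
    finally:
        driver.quit()  # close the browser and shut the driver down
```

Calling `open_google()` is what actually launches the driver and the browser; nothing happens until then.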

What happens when we execute this code?

When we add the Selenium library dependencies to our project, we are able to use the language bindings that we see in the above snippet.

Once we execute this code, this is what we will see in our terminal:

In the above example, we are starting a session with ChromeDriver through a port (in this case 17494).
What happens next is that the WebDriver acts as a server and receives the commands that we want to perform on our browser (the bindings we discussed).
All commands follow a specific REST-style HTTP protocol called the WebDriver protocol, which is documented in the W3C WebDriver specification.
The WebDrivers themselves must implement all of the calls made to them for our commands to be performed successfully.
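
To make the idea of "commands over HTTP" concrete, here is a small sketch that maps a handful of commands to the HTTP method and path the W3C WebDriver specification assigns them. The dictionary keys and the `resolve` helper are illustrative names, not part of Selenium or ChromeDriver:

```python
# A few WebDriver commands and the HTTP method and path template
# each one maps to in the W3C WebDriver specification.
ENDPOINTS = {
    "newSession": ("POST", "/session"),
    "navigateTo": ("POST", "/session/{session_id}/url"),
    "getTitle": ("GET", "/session/{session_id}/title"),
    "deleteSession": ("DELETE", "/session/{session_id}"),
}


def resolve(command, base_url, **params):
    """Return (method, full_url) for a command name (hypothetical helper)."""
    method, template = ENDPOINTS[command]
    return method, base_url + template.format(**params)
```

With the port from the example, `resolve("newSession", "http://localhost:17494")` gives the method and URL a client would use to open a session.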

In this example...

- We start by sending a POST request that creates a session on the WebDriver server. The body can include our DesiredCapabilities, if given (the DesiredCapabilities class holds key-value pairs that describe the browser properties we want).
The call would look something like this:

- The WebDriver responds with a session ID, a status code for the operation, and so on.

- We then try to navigate to our desired URL
The call would look something like this:
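
The two calls described above can be sketched as plain JSON bodies. This is a hedged reconstruction rather than a capture of real traffic: the function names are hypothetical, and the new-session body carries both the W3C `capabilities` form and the older `desiredCapabilities` form, since a client may send either or both depending on its version. The session request goes to `POST /session`, the navigation request to `POST /session/{session id}/url`.

```python
import json


def new_session_body(chrome_args=None):
    """JSON body for POST /session (hypothetical helper)."""
    caps = {"browserName": "chrome"}
    if chrome_args:
        # Chrome-specific options live under the vendor-prefixed key.
        caps["goog:chromeOptions"] = {"args": list(chrome_args)}
    return json.dumps({
        "capabilities": {"alwaysMatch": caps},  # W3C form
        "desiredCapabilities": caps,            # legacy JSON Wire form
    })


def navigate_body(url):
    """JSON body for POST /session/{session id}/url."""
    return json.dumps({"url": url})


def session_id_from(response_text):
    """Extract the session ID from a new-session response in either dialect."""
    payload = json.loads(response_text)
    value = payload.get("value")
    if isinstance(value, dict) and "sessionId" in value:
        return value["sessionId"]    # W3C: {"value": {"sessionId": ...}}
    return payload.get("sessionId")  # legacy: {"sessionId": ..., "status": 0}
```

The session ID returned by the first call is what fills the `{session id}` part of the path in every later request.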

The server itself interacts with the Chrome browser via the Remote Debugging Protocol, today known as the Chrome DevTools Protocol.
(https://chromedevtools.github.io/debugger-protocol-viewer/1-2/).

It translates the API calls into messages that can be received by Chrome's debugging socket.
In our case, I guess the navigation call will be translated to:
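
As a sketch of what such a message might look like: DevTools protocol commands are JSON objects with an `id`, a `method`, and `params`, and the protocol defines a `Page.navigate` method that takes a `url`. Whether ChromeDriver issues exactly this command for a navigation is a guess, as noted above, and the `cdp_message` helper is a made-up name:

```python
import itertools
import json

# Each DevTools command carries a client-chosen id that the browser
# echoes back in its reply, so replies can be matched to commands.
_ids = itertools.count(1)


def cdp_message(method, **params):
    """Serialize one DevTools protocol command (hypothetical helper)."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


navigate = cdp_message("Page.navigate", url="https://www.google.com")
```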

On the other side of the socket, a dispatcher listens for these messages and executes the corresponding Chromium code.
I actually found the piece of code being invoked on Chrome's side:

This is a very simple introduction to "Chromedriver" and the way it works.
Basically, the WebDriver makes direct calls to the browser using each browser's native support for automation and debugging. How these calls are made, and which features they support, depends on the browser you are using.
