Selenium Architecture – Detailed Explanation

The goal of automation testing is to minimize the time and effort of testers and generate accurate test results. A tool combined with practical knowledge about the system is used to automate test execution. If you are an automation test engineer, Selenium is one tool that you would have heard about. If you want to know more about Selenium, then you have stumbled upon the right article! This article comprises crisp and concise information about Selenium Architecture, Selenium WebDriver, and its features, advantages, and disadvantages. Without further ado, let’s begin!

A Quick Walkthrough of Selenium’s Memory Lane!

Selenium was developed in 2004 by Jason Huggins, a ThoughtWorks engineer, to address the shortcomings of manual testing. He wrote a JavaScript program to automate the tests and initially named it JavaScript TestRunner. Later, when he realized that the program could do much more, he renamed it Selenium Core and open-sourced it.
Next, Selenium Remote Control (RC), better known as Selenium 1, was developed by Paul Hammant, another ThoughtWorks engineer. Selenium RC was developed to work around the cross-domain (same-origin policy) restrictions that Selenium Core ran into while testing web applications.

This led to the development of Selenium Grid for parallel testing. Next, Selenium IDE was developed to automate the browser through record and playback features (similar to UFT/QTP). In 2008, the Selenium team decided to merge WebDriver and Selenium RC into a tool called Selenium 2, better known as Selenium WebDriver, which later evolved into Selenium 3.
Selenium RC is now deprecated and has been moved to legacy packaging.

What is Selenium?

Selenium is an open-source framework used to automate the testing of web applications. If you want to automate functional and regression test cases, Selenium is a good choice. Test scripts can be written in different programming languages like Java, Python, C#, Ruby, and JavaScript.
Quick notes:
  • Selenium is a web test automation tool that supports cross-browser testing across various operating systems.
  • Selenium supports Java, Python, C#, Ruby, and JavaScript.

What is Selenium WebDriver?

Selenium WebDriver is currently the most widely used component in the Selenium tool suite. It was introduced in Selenium 2, which integrated the WebDriver API to provide a simple, understandable programming interface. Java and C# are the languages most commonly used with Selenium.

Selenium WebDriver Architecture

Before diving into Selenium WebDriver architecture, let us look at its components.
Selenium 3’s architecture consists of five layers.

Selenium Client Library

The Selenium Client Library, also called the language bindings, is a programming library that packages Selenium commands (for Java, as an external JAR file) that are compatible with the Selenium protocol/W3C WebDriver protocol. Selenium client libraries can be divided into two groups:

WebDriver protocol clients – These are thin wrappers around WebDriver protocol HTTP calls. Based on the user’s preferred programming language, the library can be downloaded from Selenium’s official repository. It can then be added to a new Java project or Maven project in Eclipse or IntelliJ.

WebDriver-based tools – These are higher-level libraries built on top of WebDriver automation. Testing frameworks like Selenide and webdriver.io, and AI-powered Selenium extensions like Healenium, come under this group. These tools rely on the lower-level WebDriver protocol clients to function.

Selenium API

The Selenium API is the set of rules and conventions that programs use to communicate with each other. APIs act as an interface between programs and let them interact without the user needing to know the underlying details.

JSON Wire protocol

JSON is widely used in REST web services and is a widely accepted format for communication between heterogeneous systems. Selenium WebDriver uses JSON to communicate between the client libraries and the browser drivers. Each command from the client is serialized to JSON and sent to the driver’s server as the body of an HTTP request; the driver’s response travels back as JSON in the HTTP response and is deserialized by the client. Because both sides only exchange JSON over HTTP, the internal logic of the browser is not revealed, and the server can communicate with the client libraries without knowing which programming language the test was written in.
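
To make this more concrete, here is a minimal sketch, in plain Java with no Selenium dependency, of how a few WebDriver commands could be expressed as HTTP requests with JSON bodies. The class WireProtocolSketch, its Command type, and the session and element IDs are hypothetical names made up for illustration. The endpoint paths follow the general shape of the JSON Wire Protocol; real client libraries do considerably more (session negotiation, error handling, complete JSON escaping).

```java
import java.util.List;

// Hypothetical sketch: maps a few WebDriver calls to (HTTP method, path, JSON body).
class WireProtocolSketch {

    // One command as it would travel over HTTP.
    static final class Command {
        final String method;
        final String path;
        final String jsonBody;

        Command(String method, String path, String jsonBody) {
            this.method = method;
            this.path = path;
            this.jsonBody = jsonBody;
        }

        @Override
        public String toString() {
            return method + " " + path + " " + jsonBody;
        }
    }

    // Minimal JSON string quoting: escapes only backslashes and double quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Roughly what driver.get(url) turns into.
    static Command navigate(String sessionId, String url) {
        return new Command("POST", "/session/" + sessionId + "/url",
                "{\"url\":" + quote(url) + "}");
    }

    // Roughly what driver.findElement(By...) turns into.
    static Command findElement(String sessionId, String strategy, String value) {
        return new Command("POST", "/session/" + sessionId + "/element",
                "{\"using\":" + quote(strategy) + ",\"value\":" + quote(value) + "}");
    }

    // Roughly what element.click() turns into.
    static Command click(String sessionId, String elementId) {
        return new Command("POST",
                "/session/" + sessionId + "/element/" + elementId + "/click", "{}");
    }

    public static void main(String[] args) {
        String session = "abc123"; // placeholder session ID (assumption)
        List<Command> commands = List.of(
                navigate(session, "https://www.samplelogin.com"),
                findElement(session, "link text", "signin"),
                click(session, "e1")); // "e1" is a placeholder element ID
        commands.forEach(System.out::println);
    }
}
```

Running main prints one line per command, showing the HTTP method, the path, and the JSON body that the driver’s server would receive.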

Browser Drivers

Browser drivers act as a bridge between the Selenium libraries and the browsers. They run Selenium commands on the browser. Each browser has its own driver, which can be downloaded from Selenium’s official repository. To use a browser driver, we need to import the respective Selenium package “org.openqa.selenium.[$browsername$];” in our code. We should also set the system property that points to the driver’s executable file, using the following syntax:

System.setProperty(key,value)

Here, key is the name of the driver’s system property (for example, webdriver.chrome.driver) and value is the path to the driver’s executable file on the user’s device.
Let us understand this better with this code snippet:

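package InterviewBitBlog;

import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.Test;

public class IBContent {
    @Test
    public void browser() {
        // Point Selenium at the ChromeDriver executable on this machine
        System.setProperty("webdriver.chrome.driver", "C:\\downloads\\chromedriver.exe");
        ChromeDriver driver = new ChromeDriver(); // launches a new Chrome window
    }
}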

When we execute this code, Selenium opens the Chrome browser.

Browser

All the browsers supported by Selenium come under this category. Selenium test scripts can be run across browsers like Chrome, Safari, Firefox, Opera, and Internet Explorer, and on operating systems like Windows, macOS, Linux, and Solaris.
Quick notes:
  • Selenium architecture comprises five components: Selenium Client Library, Selenium API, JSON Wire Protocol, Browser Drivers, and Browsers.
  • Selenium Client Library – Selenium commands in the desired programming language, in compliance with the W3C WebDriver protocol.
  • Selenium API – Facilitates software-to-software interaction.
  • JSON Wire Protocol – Communication method between client libraries and drivers.
  • Browser drivers – Support interaction between the Selenium library and the web browser.

Working of Selenium WebDriver

Think of the working of Selenium WebDriver as a conversation between you, a foreign tourist, and your friend. The tourist is asking for directions, but you are new in the city. You know the tourist’s language, but it is your friend who knows the directions inside out. So, you ask the tourist what they want in their language and translate it for your friend. Your friend tells you the directions in a jiffy, and you quickly explain them to the tourist in their language. Sounds simple, right? Now let us relate this scenario to the components in our Selenium architecture!
Here’s an example login script written in Java with Selenium (the element values used are for illustration purposes).

package InterviewBitBlog;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.Assert;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public class IBContent {
    // Shared by the setup method and the test method
    private WebDriver driver;

    @BeforeTest
    public void setUp() {
        // Point Selenium at the ChromeDriver executable, then launch Chrome
        System.setProperty("webdriver.chrome.driver", "C:\\downloads\\chromedriver.exe");
        driver = new ChromeDriver();
    }

    @Test
    public void loginAutomationTest() {
        // driver.get() needs a fully qualified URL, including the scheme
        driver.get("https://www.samplelogin.com");
        Assert.assertEquals(driver.getTitle(), "Home");

        // Open the sign-in page, fill in the credentials, and submit
        WebElement signInLink = driver.findElement(By.linkText("signin"));
        signInLink.click();
        WebElement user = driver.findElement(By.id("username"));
        user.sendKeys("test");
        WebElement pass = driver.findElement(By.id("password"));
        pass.sendKeys("test");
        WebElement log = driver.findElement(By.name("Login"));
        log.click();
    }
}

Once you have written your Selenium script in an IDE (say, Eclipse), you hit the run button to execute the program. Based on the above program, the Chrome browser is launched and navigates to “www.samplelogin.com”. Behind the scenes, the Selenium client library calls the Selenium API, which sends each command to the browser driver via the JSON Wire Protocol: the command is serialized as JSON and sent as an HTTP request. The browser driver runs an HTTP server that receives each request and works out which command needs to be executed on the browser. In this case, the driver identifies the sign-in link and performs a click operation on it. It then identifies the username and password fields, enters the given values, and finally clicks the “login” button. This is the execution part, which happens on the browser’s UI. Finally, the driver’s HTTP server sends the result of each command back as a JSON response, which the client library hands to the test script, and thus the results are recorded.
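
The request and response round trip in this flow can be sketched with the JDK alone. In the rough illustration below, a tiny in-process HTTP server stands in for the browser driver, and the JDK’s HttpClient stands in for the Selenium client library. The class DriverRoundTripSketch, the session ID abc123, and the simplified reply {"status":0,"value":null} are all assumptions made for the example; a real driver would actually control the browser and return richer responses.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class DriverRoundTripSketch {

    // Stub "browser driver": logs each command it receives and answers with a
    // fixed, simplified JSON reply. It does not control any browser.
    static HttpServer startStubDriver(List<String> log) throws IOException {
        // Port 0 asks the OS for any free port
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session", exchange -> {
            String body = new String(exchange.getRequestBody().readAllBytes(),
                    StandardCharsets.UTF_8);
            log.add(exchange.getRequestMethod() + " "
                    + exchange.getRequestURI().getPath() + " " + body);
            byte[] reply = "{\"status\":0,\"value\":null}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
        return server;
    }

    // Stand-in for the client library: sends one command as JSON over HTTP
    // and returns the JSON reply as a string.
    static String send(HttpClient client, int port, String path, String json)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + port + path))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Runs one "navigate" command through the stub driver and returns a log of
    // what the driver received, followed by the reply the client got back.
    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>();
        HttpServer driver = null;
        try {
            driver = startStubDriver(log);
            int port = driver.getAddress().getPort();
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            String reply = send(client, port, "/session/abc123/url",
                    "{\"url\":\"https://www.samplelogin.com\"}");
            log.add("reply " + reply);
            return log;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (driver != null) {
                driver.stop(0); // shut the stub driver down immediately
            }
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first log entry is the command as the stub driver saw it; the second is the JSON reply that came back to the client. The test script itself never talks to the browser directly: it only ever sees the reply relayed by the client side.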

Here, the Selenium client library is your friend who knows the directions, the test script is the tourist, and WebDriver is you. You interacted with the tourist using your multilingual skills (let’s say this is the Selenium API) and successfully executed the script; that is, our tourist got their directions and reached their destination! Sounds good, doesn’t it?

Advantages of Selenium WebDriver Architecture

  • It is open-source, supports many languages, and is compatible with many operating systems.
  • Selenium WebDriver architecture is designed to support cross-browser testing and parallel testing.
  • Selenium WebDriver integrates with build tools like Maven and Ant for code compilation.
  • It also supports integration with testing frameworks like TestNG to improve automation testing and reporting.
  • Selenium can be integrated with Jenkins for CI/CD purposes.
  • Selenium has strong community support, which makes troubleshooting easier.
  • Selenium architecture enables us to simulate user gestures, that is, mouse and keyboard actions such as click, double-click, drag and drop, and click and hold.
  • With Selenium, you can write your test scripts in the language with which the web application was coded, thus speeding up test cycles.
  • Unlike Selenium RC, Selenium WebDriver does not require us to start a separate server before testing; commands go directly to the browser driver.
  • The architecture of Selenium enables us to simulate advanced browser interactions like clicking the browser’s back and forward buttons.

Disadvantages of Selenium WebDriver

  • Selenium does not support testing of Windows desktop applications, as it works only on web applications.
  • Selenium depends on third-party frameworks like TestNG and Cucumber for reporting, as it does not have inbuilt reporting features.
  • Selenium does not handle dynamic web elements reliably out of the box, which can affect test results.
  • Selenium does not handle frames and pop-ups efficiently.
  • Selenium cannot automate CAPTCHAs, barcodes, or test cases that involve fingerprints.
  • Selenium does not support the automation of video and audio elements.
  • Selenium requires knowledge of a programming language, which makes test script authoring harder for non-programmers.
  • Test management tasks cannot be performed with Selenium, while tools like UFT/QTP support ALM integration.

Conclusion

I hope this article gave you a fair idea of Selenium Architecture. If you are preparing for interviews based on Selenium, then check out our Selenium Interview questions compilation. Visit InterviewBit for more resources and training programs to help you crack those tough interviews in your dream companies!
