Exploring Selenium Architecture: A Comprehensive Guide – IQCode

A Quick Walkthrough of Selenium’s Memory Lane!

Automation testing aims to reduce testers’ time and effort and provide accurate test results. To achieve this, testers use tools combined with practical knowledge of the system to automate test execution. One such tool that has gained popularity among automation test engineers is Selenium. In this article, we will provide concise information about Selenium Architecture, Selenium WebDriver, its features, advantages, and disadvantages.

Let’s start by understanding what Selenium is and how it works.

Selenium is an open-source automated testing tool used to test web applications. It supports various programming languages such as Java, Python, C#, and Ruby. It consists of several components, including Selenium WebDriver, Selenium RC, Selenium Grid, and Selenium IDE.

Selenium WebDriver is the primary component of the Selenium suite. It drives each browser through a browser-specific driver, which relays commands to the browser and automates user actions. Because it uses the browser’s native support for automation, it provides more control over browser actions and better browser compatibility.

The architecture of Selenium WebDriver consists of the Selenium client library, Selenium API, JSON Wire Protocol, browser drivers, and the browser itself. The Selenium client library is used to write test scripts, while the Selenium API provides the interface those scripts call to automate the browser. The JSON Wire Protocol is a REST-style protocol carried over HTTP that facilitates communication between the client and the browser driver; Selenium 4 replaces it with the W3C WebDriver protocol. The browser driver is the bridge between the client and the browser.

Now that you know the basics of Selenium, let’s look at the advantages and disadvantages of Selenium WebDriver architecture.

The advantages of Selenium WebDriver architecture include better browser compatibility, support for various programming languages, faster execution, and more precise control over browser actions. On the other hand, the disadvantages of Selenium WebDriver include limited mobile testing capability, difficulty in testing desktop applications, and the need for a stable Internet connection.

In short, Selenium WebDriver offers a robust and flexible solution for automated testing of web applications. The sections below cover its history, components, and architecture in more detail.

A BRIEF HISTORY OF SELENIUM

Selenium emerged in 2004 as a JavaScript-based program developed by Jason Huggins, a Thoughtworks engineer, to address the downsides of manual testing. The program was initially called JavaScript TestRunner but was later renamed Selenium Core. On realizing its capabilities, Huggins open-sourced the program.
Later, Paul Hammant developed Selenium Remote Control (RC) to work around the browser’s same-origin policy, which restricted what Selenium Core could test. Subsequently, Selenium Grid was developed for parallel testing, and Selenium IDE for browser automation through record and playback features. In 2008, the developers merged Selenium RC and WebDriver to produce Selenium 2, which later evolved into Selenium 3 (WebDriver). Today, Selenium RC is deprecated and has been moved to a legacy package.

What is Selenium?

Selenium is an open-source framework that automates testing for web applications. It’s ideal for automating functional and regression test cases. Selenium supports multiple programming languages, including Java, Python, C#, Ruby, and JavaScript. It also enables cross-browser testing across various operating systems.

What is Selenium WebDriver?

Selenium WebDriver is the most commonly used component in the Selenium tool suite. It offers an easy-to-understand programming interface by integrating with the WebDriver API. Java and C# are the preferred languages for working with Selenium.


# Sample Python Code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# create a new Firefox browser instance
driver = webdriver.Firefox()
driver.get("http://www.google.com")

# get the search box element, located by its name attribute
search_box = driver.find_element(By.NAME, "q")

# enter a search query and submit
search_box.send_keys("Selenium WebDriver")
search_box.send_keys(Keys.RETURN)

# close the browser and end the session
driver.quit()

Understanding Selenium WebDriver Architecture

Selenium WebDriver is the most widely used component of the Selenium tool suite. It offers an easy-to-understand programming interface through the WebDriver API, introduced in Selenium 2, and is mostly used with Java and C#. Let’s take a look at its components and its architecture, which consists of five essential layers in Selenium 3.


Selenium Language Bindings

The Selenium Language Bindings are client libraries that expose Selenium commands in a given programming language; for Java, they are distributed as an external JAR file. These libraries are compatible with both the legacy JSON Wire Protocol and the W3C WebDriver protocol.

There are two groups of Selenium language bindings:

1. WebDriver Protocol Clients: They are thin wrappers around WebDriver protocol HTTP calls and can be downloaded from Selenium’s official repository. In Java, for example, they can be added to a plain project or a Maven project in Eclipse or IntelliJ; the exact setup depends on the user’s preferred programming language.

2. WebDriver-based Tools: These are higher-level libraries built on top of WebDriver automation. Selenide, WebdriverIO, and Healenium, an AI-powered Selenium extension, belong to this group. These tools rely on the lower-level WebDriver protocol underneath.
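
To make the "thin wrapper" idea in the first group more concrete, here is a minimal sketch in Python. It is not Selenium's actual client code: the `ThinClient` and `HttpRequest` names and the session id are made up for illustration, and the sketch only describes the HTTP call each method would make instead of sending it. The endpoint paths follow the WebDriver protocol.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class HttpRequest:
    """A plain description of one HTTP call to a browser driver."""
    method: str
    path: str
    body: Optional[str] = None  # JSON text, or None when there is no body


class ThinClient:
    """Hypothetical binding: each method maps to one protocol endpoint."""

    def __init__(self, session_id: str):
        self.session_id = session_id

    def get(self, url: str) -> HttpRequest:
        # Navigate to a page: POST /session/{id}/url
        return HttpRequest("POST", f"/session/{self.session_id}/url",
                           json.dumps({"url": url}))

    def find_element(self, using: str, value: str) -> HttpRequest:
        # Locate an element: POST /session/{id}/element
        return HttpRequest("POST", f"/session/{self.session_id}/element",
                           json.dumps({"using": using, "value": value}))

    def quit(self) -> HttpRequest:
        # End the session: DELETE /session/{id}
        return HttpRequest("DELETE", f"/session/{self.session_id}")


client = ThinClient("abc123")  # made-up session id
request = client.get("http://www.google.com")
print(request.method, request.path, request.body)
```

A real binding would also send each request to the driver and parse the reply; higher-level tools from the second group then build friendlier APIs on top of calls like these.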

What is the Selenium API?

The Selenium API is the set of classes and methods that a test script calls to control the browser. Like any API, it serves as an interface between programs: it defines the rules by which the test code and the automation layer communicate, so they can interact without user intervention.

JSON Wire Protocol

JSON (JavaScript Object Notation) is a popular format for exchanging data between systems, especially in RESTful web services. In Selenium WebDriver, it is the communication format between the client libraries and the drivers. The client serializes each command to JSON and sends it in an HTTP request to the driver, which acts as the server; the driver’s response is also JSON and is deserialized on the client side. This approach hides the browser’s internal workings and lets the server communicate with client libraries written in any programming language.
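
The serialize and deserialize steps can be sketched with Python's standard `json` module. This is an illustration rather than Selenium's own implementation: the helper names `encode_find_element` and `decode_response` are hypothetical, the session and element ids are made up, and the payload shapes assume the legacy JSON Wire Protocol, where a `status` of 0 signals success.

```python
import json


def encode_find_element(strategy: str, value: str) -> bytes:
    """Serialize a 'find element' command body to JSON bytes for an HTTP POST."""
    return json.dumps({"using": strategy, "value": value}).encode("utf-8")


def decode_response(raw: bytes):
    """Deserialize a legacy JSON Wire Protocol response and return its value.

    In that protocol a status of 0 means success; anything else is an error.
    """
    payload = json.loads(raw.decode("utf-8"))
    if payload.get("status", 0) != 0:
        message = payload.get("value", {}).get("message", "unknown error")
        raise RuntimeError(f"driver error {payload['status']}: {message}")
    return payload["value"]


# Client side: the command becomes a JSON request body.
body = encode_find_element("name", "q")
print(body)

# A response the driver might send back (session and element ids are made up).
raw_ok = b'{"sessionId": "abc123", "status": 0, "value": {"ELEMENT": "0"}}'
print(decode_response(raw_ok))
```

Because both sides only exchange JSON text over HTTP, the same driver can serve a Java, Python, or C# client without knowing which one is calling.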

Browser Drivers

Browser drivers are essential intermediaries connecting the Selenium libraries to the browser. They run Selenium commands on the browser. Selenium’s repository offers browser-specific drivers as separate downloads. To use a browser driver in Java, we must import the relevant Selenium package, “org.openqa.selenium.[$browsername$]”, in our code. The code below sets the system property that points to the driver executable and opens the Chrome browser.

Code:

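```
package IQCodeBlog;

import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.Test; // assumes TestNG; for JUnit 4, import org.junit.Test

public class IBContent {
    @Test
    public void browser() {
        // Tell Selenium where the ChromeDriver executable lives
        System.setProperty("webdriver.chrome.driver", "C:\\downloads\\chromedriver.exe");
        // Launch a new Chrome browser session
        ChromeDriver driver = new ChromeDriver();
        driver.quit(); // close the browser when done
    }
}
```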

Selenium Browser Support

Selenium supports various browsers, including Chrome, Safari, Firefox, Opera, and Internet Explorer, and runs test scripts across operating systems such as Windows, macOS, Linux, and Solaris. To recap, the Selenium architecture comprises five components: Selenium Client Library, Selenium API, JSON Wire Protocol, Browser Drivers, and Browsers. The Selenium Client Library provides Selenium commands in each supported programming language, in compliance with the W3C WebDriver protocol. The Selenium API facilitates software-to-software interaction, while the JSON Wire Protocol is the communication method between client libraries and drivers. The browser drivers relay commands between the Selenium library and the web browsers.

Working of Selenium WebDriver

The working of Selenium WebDriver can be compared to a conversation between a foreign tourist and a local friend. Consider a login script written in Java with Selenium. When the script is executed, the browser is launched and navigates to www.samplelogin.com. The client library converts each command in the script, such as locating an element on the page, into a request and sends it to the browser driver via the JSON Wire Protocol. The browser driver performs the action in the browser and sends the response back to the test script. In this analogy, the test script is the tourist, the Selenium client library is the friend who knows the directions, and the WebDriver is the translator.
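
The round trip described above can be imitated without a real browser. The sketch below is a toy model only: `FakeBrowserDriver`, `send`, the command names (`get`, `type`, `read`), and the reply format are all invented for illustration and are not the Selenium API or the real wire format. The field name `username` and the value `demo_user` are likewise hypothetical.

```python
import json


class FakeBrowserDriver:
    """Stand-in for a browser driver: takes JSON commands, returns JSON replies."""

    def __init__(self):
        self.current_url = None
        self.fields = {}  # pretend page state: field name -> typed text

    def handle(self, command_json: str) -> str:
        command = json.loads(command_json)
        name, args = command["name"], command.get("args", {})
        if name == "get":
            self.current_url = args["url"]
            value = None
        elif name == "type":
            self.fields[args["field"]] = args["text"]
            value = None
        elif name == "read":
            value = self.fields.get(args["field"])
        else:
            return json.dumps({"status": "error", "value": f"unknown command: {name}"})
        return json.dumps({"status": "ok", "value": value})


def send(driver: FakeBrowserDriver, name: str, **args):
    """Client-library role: translate a call into JSON and decode the reply."""
    reply = json.loads(driver.handle(json.dumps({"name": name, "args": args})))
    if reply["status"] != "ok":
        raise RuntimeError(reply["value"])
    return reply["value"]


# The "test script": plain calls, unaware of the JSON exchanged underneath.
driver = FakeBrowserDriver()
send(driver, "get", url="http://www.samplelogin.com")
send(driver, "type", field="username", text="demo_user")
print(send(driver, "read", field="username"))
```

The test script never touches JSON directly. The `send` helper plays the part of the client library, and the fake driver plays the part of the browser driver that acts on the browser and reports back.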

Advantages of Selenium WebDriver Architecture

Selenium WebDriver offers various advantages over other frameworks for automated testing:

  • Open source, with support for multiple programming languages and operating systems.
  • Supports cross-browser testing, and parallel testing through Selenium Grid.
  • Integration with popular build and test tools like Maven, Ant, and TestNG.
  • Easy integration with CI/CD tools like Jenkins.
  • Active community support that simplifies troubleshooting.
  • Enables implementation of user gestures and browser interactions.
  • Lets teams write test scripts in the same programming language as the web application, where a binding exists.
  • No separate server startup required; the client talks to the browser driver directly.
  • Simulates advanced browser interactions like back/forward button clicks.

Disadvantages of Selenium WebDriver


- Limited to testing web applications; cannot test Windows desktop applications
- Requires third-party frameworks for reporting
- Dynamic web elements are hard to handle reliably without explicit waits and careful locators
- Frames and pop-ups need extra handling code
- Cannot automate captcha, barcodes, or fingerprint tests
- Does not support automation of video and audio
- Test script authoring requires programming language knowledge
- No built-in test management, unlike tools such as UFT/QTP that offer ALM integration

Selenium WebDriver has several drawbacks that users must consider before choosing it as their testing tool. While it is reliable for web applications, dynamic elements, frames, and pop-ups take extra effort to handle. Report generation, captcha and barcode automation, and audio/video automation are not supported out of the box. Users must also know a programming language to create test scripts, and test management requires integrating additional tools with Selenium.
