{"id":2218,"date":"2023-10-27T13:48:23","date_gmt":"2023-10-27T17:48:23","guid":{"rendered":"https:\/\/lowtek.ca\/roo\/?p=2218"},"modified":"2023-10-27T13:48:23","modified_gmt":"2023-10-27T17:48:23","slug":"running-selenium-testing-in-a-single-docker-container","status":"publish","type":"post","link":"https:\/\/lowtek.ca\/roo\/2023\/running-selenium-testing-in-a-single-docker-container\/","title":{"rendered":"Running Selenium testing in a single Docker container"},"content":{"rendered":"<p><a href=\"https:\/\/www.selenium.dev\/\">Selenium<\/a> is a pretty neat bit of kit; it is a framework that makes it easy to create browser automation for testing and web-scraping activities. Unfortunately it seems there is a dependency mess just to get going, and when I hit these types of problems I turn to <a href=\"https:\/\/en.wikipedia.org\/wiki\/Docker_(software)\">Docker<\/a> to contain the mess.<\/p>\n<p>While there are a number of &#8220;Selenium + Docker&#8221; posts out there, many use more complex multi-container setups. I wanted a very simple single container holding Chrome + Selenium + my code to go grab something off the web. <a href=\"https:\/\/reflect.run\/articles\/how-to-run-selenium-tests-inside-a-docker-container\/\">This article is close<\/a>, but doesn&#8217;t work out of the box due to various software updates. 
This blog post will cover the changes needed.<\/p>\n<p>First up is the Dockerfile.<\/p>\n<pre class=\"lang:default decode:true\">FROM --platform=linux\/amd64 python:3.9-buster\r\n\r\n# install google chrome\r\n\r\nRUN wget -q -O - https:\/\/dl-ssl.google.com\/linux\/linux_signing_key.pub | apt-key add -\r\n\r\nRUN sh -c 'echo \"deb [arch=amd64] http:\/\/dl.google.com\/linux\/chrome\/deb\/ stable main\" &gt;&gt; \/etc\/apt\/sources.list.d\/google-chrome.list'\r\n\r\nRUN apt-get -y update\r\n\r\nRUN apt-get install -y google-chrome-stable\r\n\r\n# install chromedriver\r\n\r\nRUN apt-get install -yqq unzip\r\n\r\nRUN wget -O \/tmp\/chromedriver.zip https:\/\/edgedl.me.gvt1.com\/edgedl\/chrome\/chrome-for-testing\/`curl -sS https:\/\/googlechromelabs.github.io\/chrome-for-testing\/LATEST_RELEASE_STABLE`\/linux64\/chromedriver-linux64.zip\r\n\r\nRUN unzip \/tmp\/chromedriver.zip chromedriver-linux64\/chromedriver -d \/tmp\r\n\r\nRUN cp \/tmp\/chromedriver-linux64\/chromedriver \/usr\/local\/bin\/\r\n\r\n# set display port to avoid crash\r\n\r\nENV DISPLAY=:99\r\n\r\n# install selenium\r\n\r\nRUN pip install selenium==4.14.0\r\n\r\nCOPY . .\r\n\r\nCMD python tests.py\r\n<\/pre>\n<p>The changes needed from the original article are minor. Since Chrome 115, the ChromeDriver downloads have <a href=\"https:\/\/sites.google.com\/chromium.org\/driver\/\">changed locations<\/a>, and the zip file layout is slightly different. I also updated it to pin a recent version of Selenium (4.14.0).<\/p>\n<p>ChromeDriver is a standalone server that implements the <a href=\"https:\/\/w3c.github.io\/webdriver\/webdriver-spec.html\">W3C WebDriver standard<\/a>. 
This is what Selenium will use to control the Chrome browser.<\/p>\n<p>The second part is the Python script <code>tests.py<\/code>:<\/p>\n<pre class=\"lang:default decode:true \">from selenium import webdriver\r\nfrom selenium.webdriver.chrome.options import Options\r\nfrom selenium.webdriver.common.by import By\r\n\r\n# Define options for running the chromedriver\r\nchrome_options = Options()\r\nchrome_options.add_argument(\"--no-sandbox\")\r\nchrome_options.add_argument(\"--headless\")\r\nchrome_options.add_argument(\"--disable-dev-shm-usage\")\r\n\r\n# Initialize a new chrome driver instance\r\ndriver = webdriver.Chrome(options=chrome_options)\r\n\r\ndriver.get('https:\/\/www.example.com\/')\r\nheader_text = driver.find_element(By.XPATH, '\/\/h1').text\r\n\r\nprint(\"Header text is:\")\r\nprint(header_text)\r\n\r\ndriver.quit()\r\n<\/pre>\n<p>Again, only minor changes were needed here to account for changes in the Selenium APIs. This script applies some of the key &#8216;tricks&#8217; needed for Chrome to run inside Docker (passing a few arguments to Chrome).<\/p>\n<p>This is a very basic &#8216;hello world&#8217; style test case, but it&#8217;s a starting point for writing a more complicated web scraper.<\/p>\n<p>Building is as simple as:<\/p>\n<pre class=\"lang:default decode:true\">docker build -t webscraper .<\/pre>\n<p>And then we run it and get output on stdout:<\/p>\n<pre class=\"lang:default decode:true\">$ docker run webscraper\r\nHeader text is:\r\nExample Domain\r\n<\/pre>\n<p>Armed with this simple Docker container, and using the <a href=\"https:\/\/selenium-python.readthedocs.io\/index.html\">Python Selenium documentation<\/a>, you can now scrape complex web pages with relative ease.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Selenium is a pretty neat bit of kit; it is a framework that makes it easy to create browser automation for testing and web-scraping activities. 
Unfortunately it seems there is a dependency mess just to get going, and when I hit these types of problems I turn to Docker to contain the mess. While &hellip; <a href=\"https:\/\/lowtek.ca\/roo\/2023\/running-selenium-testing-in-a-single-docker-container\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Running Selenium testing in a single Docker container&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,12],"tags":[],"class_list":["post-2218","post","type-post","status-publish","format-standard","hentry","category-computing","category-how-to"],"_links":{"self":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/2218","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/comments?post=2218"}],"version-history":[{"count":2,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/2218\/revisions"}],"predecessor-version":[{"id":2220,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/posts\/2218\/revisions\/2220"}],"wp:attachment":[{"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/media?parent=2218"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/categories?post=2218"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lowtek.ca\/roo\/wp-json\/wp\/v2\/tags?post=2218"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}