Difficulty deploying a Selenium script to AWS Lambda

My current challenge is running a script that uses Selenium, and specifically webdriver:

driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')

The problem I am facing is that the function requires geckodriver to be present in order to run. Geckodriver is stored in the zip file that I have uploaded to AWS, but I am unsure how to make the function access it on AWS. When running locally, everything works fine since geckodriver is in my directory.
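One thing worth checking is the path handed to Selenium. Inside Lambda the deployment package is unpacked to /var/task (exposed through the LAMBDA_TASK_ROOT environment variable), and that directory is read-only, so a relative path like numpy-test/geckodriver only resolves if it matches the layout of the zip and the binary kept its executable bit. Below is a rough sketch of one way to resolve the bundled binary and, if needed, make an executable copy in /tmp; the helper name get_geckodriver_path and the assumption that geckodriver sits at the root of the package are mine, so adjust them to your own zip layout.

import os
import shutil
import stat

from selenium import webdriver
from selenium.webdriver.firefox.options import Options


def get_geckodriver_path():
    # In Lambda the deployment package is unpacked to /var/task; the
    # LAMBDA_TASK_ROOT variable points there (it is unset when running locally).
    task_root = os.environ.get('LAMBDA_TASK_ROOT', os.getcwd())
    # Assumed location -- adjust to wherever geckodriver sits inside your zip.
    bundled = os.path.join(task_root, 'geckodriver')

    # /var/task is read-only, so if the executable bit was stripped while the
    # zip was built, copy the binary to /tmp and mark the copy executable.
    if not os.access(bundled, os.X_OK):
        tmp_copy = '/tmp/geckodriver'
        if not os.path.exists(tmp_copy):
            shutil.copyfile(bundled, tmp_copy)
            os.chmod(tmp_copy, os.stat(tmp_copy).st_mode | stat.S_IEXEC)
        return tmp_copy
    return bundled


options = Options()
options.add_argument('--headless')
driver = webdriver.Firefox(executable_path=get_geckodriver_path(),
                           options=options, service_log_path='/dev/null')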

Running the function via the Serverless Framework results in the following error message:

{ "errorMessage": "Message: 'geckodriver' executable needs to be in PATH. \n", "errorType": "WebDriverException", "stackTrace": [ [ "/var/task/handler.py", 66, "main", "print(TatamiClearanceScrape())" ], [ "/var/task/handler.py", 28, "TatamiClearanceScrape", "driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')" ], [ "/var/task/selenium/webdriver/firefox/webdriver.py", 164, "init", "self.service.start()" ], [ "/var/task/selenium/webdriver/common/service.py", 83, "start", "os.path.basename(self.path), self.start_error_message)" ] ] }

Error --------------------------------------------------

The invoked function has failed

Any assistance on this matter would be greatly appreciated.

EDIT:

import datetime
import time

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.firefox.options import Options


def TatamiClearanceScrape():
    options = Options()
    options.add_argument('--headless')

    page_link = 'https://www.tatamifightwear.com/collections/clearance'
    # this is the url that we've already determined is safe and legal to scrape from.
    page_response = requests.get(page_link, timeout=5)
    # here, we fetch the content from the url, using the requests library
    page_content = BeautifulSoup(page_response.content, "html.parser")

    driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')
    driver.get('https://www.tatamifightwear.com/collections/clearance')

    labtnx = driver.find_element_by_css_selector('a.btn.close')
    labtnx.click()
    time.sleep(10)
    labtn = driver.find_element_by_css_selector('div.padding')
    labtn.click()
    time.sleep(5)
    # wait(driver, 50).until(lambda x: len(driver.find_elements_by_css_selector("div.detailscontainer")) > 30)
    html = driver.page_source
    page_content = BeautifulSoup(html, "html.parser")
    # we use the html parser to parse the url content and store it in a variable.
    textContent = []

    tags = page_content.findAll("a", class_="product-title")

    product_title = page_content.findAll(attrs={'class': "product-title"})  # allocates all product titles from site

    old_price = page_content.findAll(attrs={'class': "old-price"})

    new_price = page_content.findAll(attrs={'class': "special-price"})

    products = []
    for i in range(len(product_title) - 2):
        # group all products together in a list of dictionaries, with name, old price and new price
        product = {"Product Name": product_title[i].get_text(strip=True),
                   "Old Price:": old_price[i].get_text(strip=True),
                   "New Price": new_price[i].get_text(), 'date': str(datetime.datetime.now())
                   }
        products.append(product)

    driver.quit()

    return products

Answer №1

Consider using AWS Lambda Layers. A layer lets you ship libraries and binaries separately from your deployment package, so you don't have to bundle and re-upload those dependencies every time you change your code: create one layer containing the required packages and attach it to the function.

To learn more about AWS Lambda Layers, check out this resource.
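For instance, if geckodriver (and the Python dependencies) are shipped in a layer rather than in the function package, the layer contents are extracted under /opt at runtime, so the handler can point Selenium at an absolute path there. A minimal sketch, assuming the binary sits at the root of the layer zip:

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

# Layer contents are mounted under /opt at runtime; the exact path depends on
# how the layer zip is laid out (assumed here: geckodriver at the zip root).
GECKODRIVER_PATH = '/opt/geckodriver'

options = Options()
options.add_argument('--headless')

driver = webdriver.Firefox(executable_path=GECKODRIVER_PATH,
                           options=options, service_log_path='/dev/null')

In the Serverless Framework the layer is typically attached to the function through the layers property of the function definition in serverless.yml.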
