I'm facing challenges with an API that provides natural gas data. The documentation for this API can be found at . It allows me to access JSON-formatted data by entering a URL in my web browser. However, in order to download the JSON data, I need t ...
I'm struggling with scraping the contents of a specific web page. Here is an example of my Python code: response = requests.post('http://a836-acris.nyc.gov/bblsearch/bblsearch.asp?borough=1&block=733&lot=66',{'User-Agent' ...
I am currently attempting to locate numerous elements using their individual ids. The elements have the following names: dcm-reservation-limit-multiple-input-generic-X where X represents the number of elements, such as: dcm-reservation-limit-multiple-in ...
I have the following XML that needs to be extracted: <div class="tab_product_details"> <table> <tbody> <tr>...</tr> <tr>...</tr> <tr>...</tr> <tr> ...
My Selenium script opening the Edge WebDriver is encountering issues with notifications popups that interfere with clicking on buttons on websites. I suspect these notifications are overlaying the website content, making it difficult for the code to locate ...
I have been assigned a web scraping project and I am required to scrape data from the following website for testing: The HTML: <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text& ...
I'm struggling to save the scraped data in a PostgreSQL database. I attempted to use psycopg2 without success, so now I'm considering using Django models instead. The scraper needs to collect data from every blog post on each page and store it in the ...
I am currently working on a project that involves scraping prices for specific food items based on different locations across the country. One of the features allows users to enter the name of a city in an input text box and then view a list of available i ...
I have developed a script to systematically catalog all URLs on a website. Currently, I am utilizing CrawlSpider with a rules handler to manage the scraped URLs. The "filter_links" function checks an existing table for each URL and writes a new entry if i ...
I am currently in the process of extracting href data from doctors' profiles on the website . To achieve this, my code employs Selenium to create a web server that accesses the website and retrieves the URLs, handling the heavy lifting for web scraping. Wh ...
I am currently on the hunt for a dropdown element in order to select an option from it. I know that Selenium has a built-in class specifically for handling select drop-downs, but I'm having trouble locating the actual element. Could someone point out ...
I'm currently utilizing bs4 and urllib2 to extract data from a website. The goal is to retrieve the remaining digits of the telephone number 3610...... but beforehand, it's necessary to click on a button to reveal th ...
After using Selenium in Chrome to gather usernames from a social media profile, I encountered an issue with the limited loading of the page and Chrome crashing due to running out of memory. The list of followers is extensive, reaching hundreds of thousands ...
This script retrieves data from the OddsPortal website: import pandas as pd from bs4 import BeautifulSoup as bs from selenium import webdriver import threading from multiprocessing.pool import ThreadPool import os import re from math import nan class Driver: ...
I am relatively new to Python and coding, so please bear with me as I explain my current project. Basically, what I'm trying to accomplish is creating a script that opens my monthly fire department training page, navigates to the video section where differ ...
I'm currently in the process of scraping a website that relies on dynamically loaded content through JavaScript. In my attempts to request the data source, I received a JSON response where a key 'results_html' holds all the HTML necessary for querying an ...
I am trying to extract data from a Google Scholar page that has a 'show more' button. After researching, I found out that this page is not in HTML format but rather in JavaScript. There are different methods to scrape such pages and I attempted to use Sele ...
https://i.stack.imgur.com/BBk53.png https://i.stack.imgur.com/WEa6i.png Trying to extract multiple reviews using cssSelector from a div element. public void getFacebookData() { driver.get("https://www.facebook.com/?stype=lo&jlou=AffEX_j6PH-b ...
When attempting to retrieve the availability and price for each day on , I navigate through the calendar by checking which days are booked or not, and then clicking the "next" button to move to the next month. In addition, I click on the arrival date and ...
Having an element with the following HTML: <span id="ContentPlaceHolder1_Label2" designtimedragdrop="1319" style="display:inline-block;color:Firebrick;font-size:Medium;font-weight:bold;width:510px;"></span> Upon clicking the Save button on ...
I have been attempting to utilize Selenium in order to locate the text within a span element that lacks any specific attributes such as class or id for identification. Here is the HTML structure: HTML snippet obtained from inspecting the element in Chrome ...
Trying to extract media files from a specific website with notes has been quite the challenge. Despite easily downloading the files, they are not in the correct order. It seems that the website makes an Ajax call after scrolling to page 30 and then loads ...
I'm currently working on a TikTok crawler project that uses both Selenium and Scrapy: start_urls = ['https://www.tiktok.com/trending'] .... def parse(self, response): options = webdriver.ChromeOptions() from fake_useragent import UserAgent ua = ...
As a newcomer to both PowerShell and HTML, I am venturing into the realm of extracting table data from a webpage using the powerful combination of PowerShell and Selenium webdriver. My approach involves automating the process of launching a specific webpag ...
I am currently facing a challenge while trying to extract a list from a website. The website has two separate lists, with the second one only loading after selecting an option from the first list. Unfortunately, I am having trouble selecting the first opti ...
Currently, I am engaged in web scraping to extract the "id" of all locations from complex JSON content. I attempted using the dict.items method, but it only extracted 2 values at the start of the dictionary followed by a li ...
Consider the following HTML structure: <div class="divSearchContainer"><input type="search" class="FL H100P" placeholder="Select"><div class="divSearchIconConatiner H100P CP FL" title="S ...
https://i.stack.imgur.com/klhCL.png I need to retrieve numbers like 14,401. I have attempted the following code: WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@id='wiz-iframe-intent']"))) WebDriverWait(drive ...
My goal is to retrieve the price of a product based on its size, as prices tend to change daily. While I succeeded in extracting data from a website that uses "a class," I am facing difficulties with websites that use div and span classes. Link: Price: $ ...
While attempting to extract data from a particular website using the requests.get method, I encountered an issue. The information retrieved from the website seems to be inconsistent and does not align with the actual data displayed on the site. For instan ...
I attempted to extract information from a table on wyscout.com, which appears to be constructed with Reactjs. Once logged in, the script selects the country (e.g. England), League (e.g. Premier League), and Team (e.g. Arsenal). From there, it navigates to ...
My friend asked me to scrape all the videos from 'TVFilthyFrank'. I have access to all the links for each video. I want to determine the size of each video in MB and then proceed with downloading them. However, using driver.get(VIDEO_URL) and ext ...
I need assistance with coding a function that can interact with the ellipsis icon on a specific webpage. I have provided details about its location. Due to security measures, I am unable to share the exact page where the ellipsis is loc ...
For my Amazon.es web scraper built with Selenium, I am using a CSS selector to determine the total number of pages it will iterate through. However, the selector name seems to change dynamically and I must update it daily. As someone not well-versed in H ...
Currently, I am working on a project for one of my courses that involves web scraping data from Bodybuilding.com. My main objective is to collect information regarding the members of the website. Initially, I was able to successfully scrape data from the 1 ...
My experience with webscraping is limited to the basics, so this task is a bit out of my comfort zone. What I'm hoping to achieve is a comprehensive list of farmers along with the markets they sell at. The website features a table where you can select ...
On this website , I am trying to extract the email. I attempted using requests and BeautifulSoup without success. I also wrote this code utilizing Selenium, but it did not work: from selenium import webdriver url = "https://aiwa.ae/company/arad-bui ...
Is there a way to scrape a web application and retrieve the values from a table as soon as new values are added? If not, what is the best method to scrape the website? The current code I have only allows manual scraping which result ...
I've encountered an issue while attempting to call a function from within a FOR loop. The error message I receive is: test() NameError: name 'test' is not defined Below is the code in question: from selenium import webdriver from selenium.common.excep ...
My objective is to compile a list of the names of all newly posted items on within a 24-hour period. After some research, I've discovered that Selenium is the ideal tool for this task as the website I am scraping is dynamic and loads more content as the ...
I'm working on implementing a while loop in Selenium, and I want to set a condition for the loop to stop when the scroll bar reaches the end of the page. How would I go about coding this type of condition within the while loop? Right now, my loop is set to ...
Attempting to extract data from an HTML page using the following code : driver = webdriver.Chrome() driver.get(url) try: element = WebDriverWait(driver, 20).until( EC.presence_of_element_located((By.CLASS_NAME, & ...
I am trying to extract around 7000 comments from this link. The challenge is that the website only displays 10 comments at a time, so I am using Selenium in Python to load all comments and then parse them with BeautifulSoup. Here is the HTML segment of th ...
Attempting to extract audio (under "Experience the sound") from 'https://www.akrapovic.com/en/car/product/16722/Ferrari/488-GTB-488-Spider/Slip-On-Line-Titanium?brandId=20&modelId=785&yearId=5447'. The code I have written is resulting in an ...
I am currently developing a web scraping tool that is designed to navigate through website pages in order to extract Excel files from a dropdown menu located at the bottom of each page. Unfortunately, the webpages only allow me to download the 50 location ...
Could I potentially analyze the attributes of a button element that I have selected using selenium? I am currently utilizing selenium to navigate through complex JavaScript-based web pages. My goal is to download certain files from these pages, but before ...
Currently working on a web scraping project, I have managed to gather some valuable data. However, I am now faced with the challenge of looping through multiple pages. Update: Using Node.js for this project. Knowing that there are 10 pages in total, I atte ...
My journey with Node.js and Express is just beginning as I dive into building a website that serves static files. Through my research, I discovered the potential of using Node.js with Express for this purpose. While I have successfully served some static HT ...
I have written a code to extract data for the values "Exam Code," "Exam Name," and "Total Question." However, I am encountering an issue where the "Exam Code" column in the CSV file is populating with the same value as "Exam Name" instead of the correct ...
I am just starting to learn Python and selenium, and I'm facing a challenge that I need help with. Currently, I am attempting to extract data from a particular website: "" The goal is to convert the table on this website into a dataframe similar to ...
As I scrape data from a website, the output I am receiving is as follows: ['1 tablespoon vegetable or coconut oil 1 tablespoon peeled and minced fresh ginger (from a 1-inch piece) 2 cloves garlic, minced 3 tablespoons vegan Thai red curry paste, su ...
I am currently developing a PHP web scraper with the following objectives: Retrieve content from less than 10 URLs using cURL, Add the HTML content of each URL to a DOMDocument, Search the DOM document for <a> elements that link to PDF files, ...
I am attempting to extract the headline snippets from a local newspaper's articles by utilizing rvest and RSelenium. To access pages beyond the initial one, I need to click on the 'next page' button. Strangely, when I perform this action through RSelenium, ...
I have encountered an issue while trying to scrape title URLs using my code. Can someone please help me troubleshoot it? Here is the code snippet: import requests from bs4 import BeautifulSoup # import pandas as pd # import pandas as pd import csv def ...
Being a beginner in using Selenium and Python, my main objective is to retrieve the revenue value for a specific company from the Hoovers website. Here's my current code: company = 'Trelleborg' page = 'https://hoovers.com/company-information/cs.html?term ...
Looking to extract data from the XPath provided below: /html/body/div[2]/div[2]/div/div/div[4]/ul[2]/li/div Currently testing this with Scrapy Shell using the following commands: scrapy shell "https://www.rentler.com/listing/520583" and running: hxs.s ...
Looking for a solution to the JSONDecodeError: Expecting value: line 1 column 1 (char 0) error? Check out the code snippet provided below: from urllib.request import urlopen api_url = "https://samples.openweathermap.org/data/2.5/weatherq=Lon ...
Having trouble scraping a webpage with a class name containing an underscore, specifically this element: <span class="s-item__time-left">30m</span> == $0 I attempted to locate it by class name: time = driver.find_elements_class_name("s-item_ ...
I recently came across a helpful post on how to use R to search for news articles on Google. The post provides a link to Scraping Google News with Rvest for Keywords. The example in the post demonstrates searching for a single term, such as: keyword <- ...
I am facing a challenge with extracting data from a paginated table using Selenium. Despite having code that can successfully retrieve data, it only grabs the first 50 results out of the entire table. I believe utilizing Selenium to iterate through all p ...
Is there a way to make my script wait for a manual click on a submit button similar to the website below? driver.get("http://www.propertyguru.com.sg/singapore-property-listing?listing_type=sale&search_type=district&property_id=&interest=&d ...
Is it possible to extract information from the Facebook About section using tools like the Facebook Graph API or Python web-scraping libraries such as Scrapy and Beautiful Soup? ...
For the past few years, I've been utilizing BeautifulSoup to extract TopCashBack website links. However, when I attempt to change the URL to a Screwfix link, I am not able to retrieve any data. s = requests.get("https://www.screwfix.com/p/128hf&q ...
My attempt to locate and interact with the first search box on the following website has been unsuccessful: This is the code I've used: for ii in testList2: varTitel = ii searchBox = driver.find_element_by_id('MainContent_SuchworteField') ...
While attempting to gather data from flipkart.com using scrapy, I successfully collected everything except for navigating to the next page. Initially, I attempted to use scrapy followed by selenium. Interestingly, a class contains two links - one for the p ...
After restructuring my code to correctly utilize promises, I encountered a challenge with ensuring that the lastStep function can access both the HTML and URL of each page. To overcome this issue, I'm attempting to return an object in nextStep(). Alt ...
I am attempting to extract the complete table data from the following website: Note that upon clicking the link, a public login button will need to be clicked first. I have already set up a bot to handle the login process and navigate through the site, so ...
Currently, I am utilizing Python along with Selenium and Chrome web drivers to conduct web scraping within Visual Studio Code. Upon sending a GET request like this: driver.get('https://my_test_website/customerRest/show/?id=123') I am curious ab ...
Trying to extract the latest stock price using Fidelity's screener. For instance, the current market value of AAPL stands at $165.02, accessible via this link. Upon inspecting the webpage, the price is displayed within this tag: <div _ngcontent-cx ...
Looking to extract data from this particular webpage: The information I'm interested in scraping includes Product Sku, Price, and List Price. I've successfully scraped the Price but I'm encountering issues with the other two, particularly t ...
Is it possible to programmatically locate a link to an audio pronunciation clip on a website? I am in the process of creating a personalized language learning Anki deck. The specific site I am referring to is: When clicking on "Framburður," the audio cli ...
After creating a Python script using Selenium to click through various categories on a website and reach the target page, I encountered an issue. The script works once but throws a 'stale element' error when trying to repeat the process. How can I address ...
Seeking insights on website similarity, I aim to extract data from the following link: Focusing on class='site', my goal is to retrieve information like: <a href="/siteinfo/ebay.com" class="truncation">ebay.com</a> ...
My goal is to extract specific information from a webpage by utilizing this code snippet to target an element and retrieve certain values within it: const puppeteer = require('puppeteer'); function run (numberOfPages) { return new Promise(async (reso ...
I have a set of custom PHP scripts that I use in my browser to scrape data from URLs and display it either as a table or download it as an Excel file. However, when I try to process more than 3 URLs at once, I keep encountering a network connection error ( ...