Scraping data from a Highcharts chart with Python

I have been trying to extract the data from the chart located at . I tried selecting the values with the XPaths of the corresponding elements, but it did not work.

This is what I tried with Scrapy:

date = response.xpath('//*[@id="highcharts-0"]/div/span/b[1]').get()
market_value = response.xpath('//*[@id="highcharts-0"]/div/span/b[1]').get()
club = response.xpath('//*[@id="highcharts-0"]/div/span/b[3]').get()
age = response.xpath('//*[@id="highcharts-0"]/div/span/b[4]').get()

Is there a way to scrape all of the data from the chart with either Scrapy or Selenium?

Answer №1

The chart is drawn in the browser by inline JavaScript in the HTML document. Scrapy does not run JavaScript, so the elements those XPaths point to do not exist in the response it receives.

With Scrapy, you need a regular expression to pull the data out of the inline script.

For example (not tested):

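import json
import re

# response.body is a bytes attribute, not a method; response.text is the decoded str
body = response.text
# Capture everything after the 'series': key up to the closing brackets of the array
data = re.findall(r"(?<='series':).*?}}]}]", body)

# Inside a Scrapy callback: give up if the pattern did not match
if not data:
    return None

# Only works if the captured text is valid JSON
data = json.loads(data[0])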
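A variation on the same idea that can be tried without Scrapy. This is only a sketch under assumptions: the sample HTML is made up, extract_series is a hypothetical helper, and the real page may format its script differently. Instead of encoding the closing brackets in the pattern, it uses the regex only to find the 'series' key and lets json's raw_decode work out where the array ends.

```python
import json
import re

# Made-up stand-in for the HTML Scrapy would receive (assumption: the chart
# config sits in an inline script and the series value is valid JSON).
SAMPLE_HTML = """
<script>
var chart = new Highcharts.Chart({'chart': {'renderTo': 'container'},
'series': [{"name": "Market value", "data": [{"x": 1, "y": 100, "club": "A"}, {"x": 2, "y": 150, "club": "B"}]}]});
</script>
"""


def extract_series(html):
    """Return the parsed 'series' list from an inline chart config, or None."""
    # Find the key and skip the colon and any whitespace after it
    match = re.search(r"'series'\s*:\s*", html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value starting at the given index
        # and ignores whatever follows it
        series, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return series


series = extract_series(SAMPLE_HTML)
print(series[0]["data"])
```

In a Scrapy callback you would pass response.text instead of SAMPLE_HTML. If the script uses a JavaScript object literal that is not valid JSON (single-quoted strings, unquoted keys), this returns None and the text has to be cleaned up first.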

Answer №2

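import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

chrome_options = Options()
chrome_options.add_argument("--headless")

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()),
                          options=chrome_options)
driver.get(url)  # url: address of the page that contains the chart
time.sleep(5)    # crude wait for the chart to finish rendering

# Highcharts keeps every chart on the page in window.Highcharts.charts
chart_data = driver.execute_script('return window.Highcharts.charts[0]'
                                   '.series[0].options.data')
driver.quit()
print(chart_data)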
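The data returned by execute_script arrives in Python as plain lists and dicts, so it can be reshaped directly. A rough sketch, assuming each point is a dict with an x timestamp in milliseconds and a y value. The sample points and the club and age keys are made up, and to_row is a hypothetical helper; check what the real chart returns before relying on these names.

```python
from datetime import datetime, timezone

# Made-up sample in the shape execute_script might return (assumption)
chart_data = [
    {"x": 1262304000000, "y": 500000, "club": "Club A", "age": 18},
    {"x": 1293840000000, "y": 750000, "club": "Club B", "age": 19},
]


def to_row(point):
    """Turn one chart point into a flat dict with a readable date."""
    # Highcharts datetime axes use milliseconds since the Unix epoch
    date = datetime.fromtimestamp(point["x"] / 1000, tz=timezone.utc).date()
    return {
        "date": date.isoformat(),
        "market_value": point["y"],
        # .get() leaves None when a point lacks the (assumed) extra keys
        "club": point.get("club"),
        "age": point.get("age"),
    }


rows = [to_row(p) for p in chart_data]
for row in rows:
    print(row)
```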
