Questions tagged [cheerio]

Cheerio is an implementation of core jQuery designed for the server. Browse questions and answers about it below.

Obtaining an array of text within <td> tags

Here is the HTML Source: <td bgcolor="#ffffbb" colspan=2><font face="Verdana" size=1>2644-3/4<br>QPSK<br><font color="darkgreen">&nbsp;&nbsp;301</font> - 4864</td> I am looking to extract text array wit ...

Retrieve the HTML content of all children except for a specific child element in jQuery

Is there a way to use jQuery/JavaScript to select the HTML content of the two <p> elements in the first <div class="description">? I'm open to using regex as well. This jQuery selection is being executed within Node.js on a cheeri ...

Extract content from an HTML form within a specific cell using Cheerio

A sample HTML table is displayed below: <tr class="row-class" role="row"> <td>Text1</td> <td> <form method='get' action='http://example.php'> <input type='hidden' ...

Extract a collection of articles

I am trying to parse out the date and full URL from articles. const cheerio = require('cheerio'); const request = require('request'); const resolveRelative = require('resolve-relative-url'); request('https://www.m ...
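For the "full URL" part of this question, one option is Node's built-in `URL` class rather than the `resolve-relative-url` package named in the excerpt. The sketch below is only an illustration under that assumption; `toAbsoluteUrl` and the example.com addresses are hypothetical and do not come from the question.

```javascript
// Resolve a possibly relative href against the page URL using Node's built-in URL class.
// The URLs below are placeholders, not taken from the question.
function toAbsoluteUrl(href, pageUrl) {
  try {
    return new URL(href, pageUrl).href;
  } catch (err) {
    // new URL throws a TypeError when the input cannot be parsed.
    return null;
  }
}

const pageUrl = 'https://example.com/news/index.html';
const links = ['/articles/1', 'articles/2', 'https://other.example/3'];
const absoluteLinks = links.map((href) => toAbsoluteUrl(href, pageUrl));
console.log(absoluteLinks);
```

In a scraper, each `href` would typically come from the `href` attribute of the matched anchor elements.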

What is the method for retrieving embedded JavaScript content?

In an attempt to scrape a website using Cheerio, I am facing the challenge of accessing dynamic content that is not present in the HTML but within a JS object (even after trying options like window and document). Here's my code snippet: let axios = ...
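When the data lives in an inline script as an object literal, one possible approach is to take the script's text and parse the literal out of it. This is a minimal sketch that assumes the literal is valid JSON and that the script contains a single assignment; the variable name `__DATA__` and the helper `extractAssignedJson` are hypothetical.

```javascript
// Pull a JSON-compatible object literal out of inline script text.
// `__DATA__` is a hypothetical variable name; real pages use their own.
function extractAssignedJson(scriptText, variableName) {
  const pattern = new RegExp(variableName + '\\s*=\\s*(\\{[\\s\\S]*\\})\\s*;');
  const match = scriptText.match(pattern);
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch (err) {
    return null; // the literal was not valid JSON (for example, unquoted keys)
  }
}

const scriptText = 'window.__DATA__ = {"title":"Hello","items":[1,2,3]};';
const data = extractAssignedJson(scriptText, '__DATA__');
console.log(data.title, data.items.length);
```

Content that is only produced after the page's scripts run in a browser will not be in the downloaded HTML at all, so this only helps when the object is literally present in the page source.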

parsing links from HTML tag

I am trying to extract the img src tag from an HTML request that includes the following element: <img src="https://pbs.twimg.com/media/...." alt="Embedded image permalink"</a> My goal is to retrieve only the URL. Currently, I might be going too ...

Cheerio - Ensure accurate text retrieval for selectors that produce multiple results

Visit this link for more information https://i.stack.imgur.com/FfYeg.png I am trying to extract specific market data from the given webpage. Specifically, I need to retrieve "Sábado, 14 de Abril de 2018" and "16:00". Here is how I did it using Kotlin an ...

Tips for extracting text from nested elements

I have a collection of available job listings stored in my temporary variable. I am interested in extracting specific text from these listings individually. How can I retrieve text from nested classes? In the provided code snippet, I encountered empty lin ...

Encountered an error in the React.js app where it cannot read the property 'Tag' of undefined from domhandler

I recently encountered an issue with my react.js project, which utilizes domhandler v4.2.0 through cheerio. Everything was running smoothly for months until one day, I started getting this error during the build process: domelementtype package includes a ...

Asynchronous requests in Node.js within an Array.forEach loop not finishing execution prior to writing a JSON file

I have developed a web scraping Node.js application that extracts job description text from multiple URLs. Currently, I am working with an array of job objects called jobObj. The code iterates through each URL, sends a request for HTML content, uses the Ch ...
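A common cause of this symptom is that `Array.forEach` does not wait for asynchronous callbacks, so the file is written before the requests finish. The sketch below shows one way around it with `map` and `Promise.all`; `fetchDescription` is a stand-in for the real request and Cheerio parsing step, and the `url` field on each job is an assumption.

```javascript
// forEach does not wait for async callbacks; map + Promise.all does.
// `fetchDescription` is a stand-in for the real request + Cheerio parsing step.
async function fetchDescription(job) {
  return `description for ${job.url}`;
}

async function collectDescriptions(jobs) {
  const descriptions = await Promise.all(jobs.map((job) => fetchDescription(job)));
  return jobs.map((job, i) => ({ ...job, description: descriptions[i] }));
}

const jobObj = [{ url: 'https://example.com/job/1' }, { url: 'https://example.com/job/2' }];
const done = collectDescriptions(jobObj).then((results) => {
  // Only at this point have all requests completed, so the JSON file can be written here.
  console.log(JSON.stringify(results, null, 2));
  return results;
});
```

`Promise.all` starts all requests at once, which may be too aggressive for some sites; a sequential loop is the alternative.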

There seems to be an error with cheeriojs regarding the initialization of exports.load

I am currently using cheeriojs for web scraping, but I am encountering an issue after loading the body into cheerio. Although the body appears to be well-formatted HTML code, I am receiving errors such as exports.load.initialize. This is preventing me fr ...

Having difficulty implementing pagination functionality when web scraping using NodeJS

Currently, I am creating a script that scrapes data from public directories and saves it to a CSV file. However, I am encountering difficulties when trying to automate the pagination process. The source code I am using includes: const rp = require('reque ...

Unable to retrieve link text following readFile function. Selector functions properly in Chrome console

My goal is to extract hyperlink text. Using the google chrome console with my selector, I am able to retrieve a list of 15 link texts as desired. However, when I execute my code with the same selector, the el.text returns undefined in console.log while th ...

Is there a way to retrieve the CSS of all elements on a webpage by providing a URL using ...

Currently, I am in the process of creating a script that can crawl through all links provided with a site's URL and verify if the font used on each page is helvetica. Below is the code snippet I have put together (partially obtained online). var requ ...

Tips for choosing one specific element among multiple elements in cheerio nodejs

Currently, I'm attempting to extract links from a webpage. However, the issue I'm encountering is that I need to extract href from anchor tags, but they contain multiple tags with no class within them. The structure appears as follows. <div c ...

I am receiving an undefined response from Cheerio when attempting to fetch JSON data

My goal is to create a web scraper and I have successfully downloaded the HTML. Now, with this code snippet, I am attempting to extract the title from my HTML: fs.readFile(__filename.json , function (err, data) { if(err) throw err; const $ = cheerio.load ...

Utilizing cheerio to set outerHTML in HTML

Could someone kindly assist me with setting the outerHTML of an element using cheerio? I seem to be encountering some issues with this process. For example, let's consider the following HTML structure: <div class="page-info"> <span&g ...

Navigating through a sequence of URLs in Node.js, one by one

I am new to node js and experimenting with creating a web scraping script. I've received permission from the site admin to scrape their products as long as I make less than 15 requests per minute. Initially, my script was requesting all URLs at once, ...
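One way to stay under a limit like 15 requests per minute is to await each request inside a plain `for...of` loop and pause between iterations. This is a sketch only; `visitSequentially`, `sleep`, and `fetchPage` are hypothetical names, and `fetchPage` stands in for whatever HTTP call the script really makes.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Visit URLs one at a time, pausing between requests.
// `fetchPage` is a placeholder for the real HTTP call.
async function visitSequentially(urls, fetchPage, delayMs) {
  const results = [];
  for (const [index, url] of urls.entries()) {
    if (index > 0) await sleep(delayMs);
    results.push(await fetchPage(url));
  }
  return results;
}

// Fewer than 15 requests per minute means more than 4000 ms between requests.
const MIN_DELAY_MS = Math.ceil(60000 / 14);

// Demo with a fake fetcher and no delay, so it finishes immediately.
const visited = visitSequentially(['a', 'b', 'c'], async (url) => url.toUpperCase(), 0);
visited.then((pages) => console.log(pages, MIN_DELAY_MS));
```

In real use, `MIN_DELAY_MS` would be passed as the third argument instead of `0`.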

"Uncaught Error: Unable to retrieve null properties" encountered while utilizing regex match in cheerio web scraping

Extracting text from brackets in HTML using regex: <dl class="ooa-1o0axny ev7e6t84"> <dd class="ooa-16w655c ev7e6t83"> <p class="ooa-gmxnzj">Cekcyn (Kujawsko-pomorskie)</p> </dd> <dd class="ooa-16w655c ev7e6t ...
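Errors about reading properties of null after a regex match usually mean `String.prototype.match` found nothing and returned `null`. A minimal sketch of guarding against that, using the parenthesized text from the excerpt as input; the helper name `textInParentheses` is hypothetical.

```javascript
// String.prototype.match returns null when nothing matches,
// so check the result before indexing into it.
function textInParentheses(text) {
  const match = text.match(/\(([^)]+)\)/);
  return match ? match[1] : null;
}

console.log(textInParentheses('Cekcyn (Kujawsko-pomorskie)'));
console.log(textInParentheses('Cekcyn'));
```

The first call returns the text between the parentheses; the second returns `null` instead of throwing.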

Learn how to extract IMG SRC using web scraping with cheerio in node.js

Implementing an event listener to FETCH this and using cheerio to extract the img src from: <div class="mainimage"> Existing script: var cheerio = require('cheerio'), $ = cheerio.load(this.responseText); console.log($('mainimage').attr('img')); U ...

How can Cheerio elegantly locate tags that meet several specific criteria?

I am attempting to scrape data from a webpage. Specifically, I am looking for all the <li> tags that are nested within a <ul> tag, which in turn is located inside a div with the class mw-parser-output and has a title attribute. Is there ...

Postponing requests with the use of request and cheerio libraries

Here is the script I wrote to scrape data from multiple pages using request and cheerio modules: for (let j = 1; j < nbRequest; j++) { const currentPromise = new Promise((resolve, reject) => { request( `https://www.url${j}`, (error ...

How to extract the followers of an Instagram account using Node.js, cheerio, and InstAuto/Puppeteer

Currently, I am attempting to develop a program that generates lists of users who follow specific profiles, and vice versa. Since the Instagram graph API is now inactive, this task has become quite challenging. Despite identifying the correct div element, ...

Having trouble accessing variable values within the nth-child selector in JavaScript

I am attempting to utilize the value of a variable within the element selector p:nth-child(0). Instead of hardcoding the number as 0, I want to dynamically assign the value of a variable. In this case, the variable is represented by i in a for loop. Howev ...
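A selector is just a string, so a loop variable can be interpolated into it with a template literal. Note that `:nth-child()` counts from 1, so `p:nth-child(0)` matches nothing. The helper `nthChildSelector` below is a hypothetical illustration.

```javascript
// Build the selector string with a template literal.
// :nth-child() counts from 1, so a zero-based loop index needs + 1.
function nthChildSelector(tag, zeroBasedIndex) {
  return `${tag}:nth-child(${zeroBasedIndex + 1})`;
}

const selectors = [];
for (let i = 0; i < 3; i++) {
  selectors.push(nthChildSelector('p', i));
}
console.log(selectors);
```

Each generated string can then be passed to the selector function in place of the hardcoded one.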

How to extract hrefs within <li> tags from a <ul> element with Cheerio

I'm facing some difficulties with this question, so I need your help. What I want to achieve is to extract the hrefs from the HTML provided below. <ul id="nav-products"> <li><a class="" href="/shop/hats/">yellow good looking ha ...

Is it better to set the language of Puppeteer's Chromium browser or utilize Apify proxy?

Looking to scrape a website for French results, but the site supports multiple languages. How can I achieve this? Is it best to configure Puppeteer Crawler launch options using args, like so: const pptr = require("puppeteer"); (async () => { const b ...

Eliminating a particular tag along with its corresponding text - cheeriojs

Is there a way for me to remove a particular tag and its content from the HTML file I am scraping? I need help in searching for and deleting this specific tag and text altogether. <p class="align-left">&#xA0; Scheduled Arrival Time</p> ...

Express.js causing Node.js error | "Cannot set headers after they are sent to the client"

As I embark on my journey to learn the fundamentals of API development, I am following a tutorial on YouTube by Ania Kubow. The tutorial utilizes three JavaScript libraries: ExpressJS, Cheerio, and Axios. While I have been able to grasp the concepts being ...

How to Implement Callback Functions with Cheerio in Node.js

Currently, I am developing a web scraper in Node.js that utilizes the libraries request and cheerio to fetch and parse website pages. I need to ensure that the callback function is executed only after both Request and Cheerio have completed loading the pa ...

Unable to locate the name 'Cheerio' in the @types/enzyme/index.d.ts file

When I try to run my Node application, I encounter the following error: C:/Me/MyApp/node_modules/@types/enzyme/index.d.ts (351,15): Cannot find name 'Cheerio'. I found a suggestion in a forum that recommends using cheerio instead of Cheerio. However, it ...