Exploring the possibilities of web scraping using PhantomJS and Node.js

Currently, I'm working through a tutorial found at the following link:

However, when I execute the code provided in the tutorial:

var host = 'http://www.shoutcast.com/?action=sub&cat=Hindi#134';
var phantom = require('phantom');

phantom.create(function(ph) {
    return ph.createPage(function(page) {
        return page.open(host, function(status) {
            console.log("opened site? ", status);

            page.injectJs('http://ajax.googleapis.com/ajax/libs/jquery/1.11.0/jquery.min.js', function() {
                // jQuery loaded.
                // Wait a bit for AJAX content to load on the page. Here, we are waiting 5 seconds.
                setTimeout(function() {
                    return page.evaluate(function() {
                        // Get what you want from the page using jQuery. A good way is to populate an object with all the jQuery commands that you need and then return the object.
                        console.log(document.getElementsByClassName('transition')[0]);

                        return document.getElementsByClassName('transition')[0];
                    }, function(result) {
                        console.log(result);
                        ph.exit();
                    });
                }, 5000);
            });
        });
    });
});

I encounter the following error message:

phantom stdout: ReferenceError: Can't find variable: $


phantom stdout:   phantomjs://webpage.evaluate():7
phantomjs://webpage.evaluate():10
phantomjs://webpage.evaluate():10

Unfortunately, I am unsure of the significance of this error and have been unable to find any guidance on resolving it. Can anyone offer assistance on how to address this issue?

The ultimate goal is to extract all the 'a' tags with the class 'transition' from the website I am scraping, keeping in mind that these tags are loaded asynchronously.

Answer №1

Using $ for jQuery in this scenario may lead to conflicts. It is not necessary to include jQuery just to fetch 'a' tags with the class transition. Instead, you can utilize document.querySelector or document.querySelectorAll.

var host = 'http://www.shoutcast.com/?action=sub&cat=Hindi#134';
var phantom = require('phantom');

phantom.create(function(ph) {
    ph.createPage(function(page) {

        page.open(host, function(status) {

            console.log("opened site? ", status);
            // Wait a few seconds for AJAX content to load on the page (in this case, 5 seconds).
            setTimeout(function() {

                page.evaluate(function() {
                    // Add additional code here to retrieve HTML/text;
                    // the return value must be JSON-serializable, so return strings rather than DOM nodes.
                    // More code may be needed if using querySelectorAll
                    return document.querySelector('a.transition');
                    //return document.querySelectorAll('a.transition');
                },

                function(result) {
                    console.log(result);
                    ph.exit();
                });

            }, 5000);

        });
    });
});

However, I am unsure about the function (result) { console.log(result); ... } part. Please check the documentation to verify whether page.evaluate accepts a callback function as its second parameter.
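
As a rough sketch of the extra code that querySelectorAll would need: the function below collects every 'a.transition' element and returns plain objects instead of DOM nodes, since only JSON-serializable values survive the trip out of page.evaluate. The names collectTransitionLinks and fakeDocument are made up for illustration, and the example.com URLs are placeholders; the stand-in document only exists so the helper can be tried in plain Node.js without PhantomJS.

```javascript
// Hypothetical helper (not part of any library): turn matching anchors into plain data.
function collectTransitionLinks(doc) {
    // Inside page.evaluate no argument is supplied, so fall back to the page's own document.
    var root = doc || document;
    var anchors = root.querySelectorAll('a.transition');
    // querySelectorAll returns a NodeList, so borrow Array.prototype.map to iterate it.
    return Array.prototype.map.call(anchors, function(a) {
        return { href: a.href, text: (a.textContent || '').trim() };
    });
}

// Stand-in for a real document, with placeholder data, for a quick local check.
var fakeDocument = {
    querySelectorAll: function(selector) {
        return [
            { href: 'http://example.com/one', textContent: ' Station One ' },
            { href: 'http://example.com/two', textContent: 'Station Two' }
        ];
    }
};

var links = collectTransitionLinks(fakeDocument);
console.log(links);
```

In the script above, collectTransitionLinks could be passed as the first argument to page.evaluate in place of the inline function. Assuming the bridge hands the return value back as JSON, result would then be an array of { href, text } objects.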
