Downloading Images with Puppeteer When the Image URL Carries an Authentication Query String

I am trying to save images stored in a private web-space account. Logging in with my credentials and retrieving the image link with Puppeteer works smoothly. However, the src attribute of the image includes additional authentication query-string parameters, and navigating to it with page.goto() throws an invalid URL error. I have tried several download methods without success. How can I download images with Puppeteer from such URLs?

For troubleshooting, I isolated the direct URL that grants access to the image without the login flow, but every attempt to download the image from that URL has failed.

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({executablePath: '/usr/bin/brave-browser', headless: false});
  const page = await browser.newPage();
  await page.goto('https://s3.amazonaws.com/spypoint-production-account-failover/5c9f7dc7267fc300f968bb01/60632085b8683500145706d1/20230213/PICT1853_202302131500FHth3.jpg?X-Amz-Expires=86400&X-Amz-Date=20230401T094532Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIATVANQEDJ5KPEZXK2%2F20230401%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-SignedHeaders=host&X-Amz-Signature=e9be25f01b577c1116b1c5ee00e0a1c1386b716cc7fe0ea19cd4aad77d61684d');
  await page.waitForNavigation();
  await screenshot();
})();

(Even asked ChatGPT, but no luck so far...!) Appreciate any help or insights. Thanks!

Answer №1

If the URL serves the image directly, with no additional steps required, there is no need for Puppeteer at all; a plain HTTPS request is enough.

const fs = require('fs');
const https = require('https');

const url = 'https://s3.amazonaws.com/spypoint-production-account-failover/5c9f7dc7267fc300f968bb01/60632085b8683500145706d1/20230213/PICT1853_202302131500FHth3.jpg?X-Amz-Expires=86400&X-Amz-Date=20230401T094532Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIATVANQEDJ5KPEZXK2%2F20230401%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-SignedHeaders=host&X-Amz-Signature=e9be25f01b577c1116b1c5ee00e0a1c1386b716cc7fe0ea19cd4aad77d61684d';

// Stream the response body straight to disk; no browser is involved.
function download(url, path) {
    https.get(url, (res) => {
        if (res.statusCode !== 200) {
            // An expired presigned S3 URL answers 403 instead of the image.
            res.resume(); // discard the body so the socket is released
            console.error(`Download failed with status ${res.statusCode}`);
            return;
        }
        res.pipe(fs.createWriteStream(path, {flags: 'w+'}));
    }).on('error', (err) => console.error(err));
}

download(url, './test.jpg');
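
The callback-style download above returns before the file has been written, so a caller cannot wait for it or react to a failure. One possible variation, sketched here under the assumption that only Node's built-in modules are available, wraps the same idea in a Promise. The helper name downloadTo is hypothetical and does not come from the original answer.

```javascript
const fs = require('fs');
const http = require('http');
const https = require('https');

// Hypothetical helper (not part of the original answer): downloads `url`
// to `destPath` and resolves with `destPath` once the file is fully written.
function downloadTo(url, destPath) {
  return new Promise((resolve, reject) => {
    // Pick the client module that matches the URL scheme.
    const client = new URL(url).protocol === 'https:' ? https : http;

    client.get(url, (res) => {
      if (res.statusCode !== 200) {
        // For example, an expired presigned URL answers 403.
        res.resume(); // discard the body so the socket is released
        reject(new Error(`Unexpected status ${res.statusCode}`));
        return;
      }

      const file = fs.createWriteStream(destPath);
      res.pipe(file);
      file.on('finish', () => resolve(destPath)); // all bytes flushed
      file.on('error', reject);
      res.on('error', reject);
    }).on('error', reject); // DNS failure, refused connection, etc.
  });
}
```

Inside an async function this could be called as await downloadTo(src, './test.jpg'), where src is the image URL that Puppeteer read from the page. Note that a plain request like this does not carry the browser's session cookies, so it only fits URLs that are self-authenticating, such as the signed URL in the question.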
