Python Script for Conducting a Google Search

Looking to develop a Python script that can take a user-entered question and fetch the answer using Google Custom Search API, Bing, or another search API. When attempting to use the Google Custom Search API, I encountered the following script:

<script>
  (function() {
    var cx = 'someurl';
    var gcse = document.createElement('script');
    gcse.type = 'text/javascript';
    gcse.async = true;
    gcse.src = 'someurl' + cx;
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(gcse, s);
  })();
</script>

As I am not working with an HTML page and simply need the answer in the Python console, is there an alternative method to achieve this without making an API call?

Answer №1

When it comes to conducting Google searches, there are several modules available such as google and Google-Search-API. However, if you need to perform numerous searches and send multiple requests, Google may block your access, resulting in error 503. In the past, alternative APIs like Bing and Yahoo were options, but they now come at a cost. The only free API for conducting internet searches is the FAROO API. Another option for executing Google searches is by utilizing the Selenium webdriver. Selenium is capable of mimicking browser actions and offers various webdrivers (such as Firefox, Chrome, Edge, or Safari) for use. Although Selenium will open a browser window during the search process, the annoyance can be resolved by using PhantomJS. Download PhantomJS from here, extract it, and refer to the example below on how to utilize it:

import time
from urllib.parse import quote_plus
from selenium import webdriver

class Browser:

    def __init__(self, path, initiate=True, implicit_wait_time=10, explicit_wait_time=2):
        self.path = path
        self.implicit_wait_time = implicit_wait_time
        self.explicit_wait_time = explicit_wait_time
        if initiate:
            self.start()
        return

    def start(self):
        self.driver = webdriver.PhantomJS(path)
        self.driver.implicitly_wait(self.implicit_wait_time)
        return

    def end(self):
        self.driver.quit()
        return

    def go_to_url(self, url, wait_time=None):
        if wait_time is None:
            wait_time = self.explicit_wait_time
        self.driver.get(url)
        print('[*] Fetching results from: {}'.format(url))
        time.sleep(wait_time)
        return

    def get_search_url(self, query, page_num=0, per_page=10, lang='en'):
        query = quote_plus(query)
        url = 'https://www.google.hr/search?q={}&num={}&start={}&nl={}'.format(query, per_page, page_num*per_page, lang)
        return url

    def scrape(self):
        links = self.driver.find_elements_by_xpath("//h3[@class='r']/a[@href]")
        results = []
        for link in links:
            d = {'url': link.get_attribute('href'),
                 'title': link.text}
            results.append(d)
        return results

    def search(self, query, page_num=0, per_page=10, lang='en', wait_time=None):
        if wait_time is None:
            wait_time = self.explicit_wait_time
        url = self.get_search_url(query, page_num, per_page, lang)
        self.go_to_url(url, wait_time)
        results = self.scrape()
        return results

path = '<YOUR PATH TO PHANTOMJS>/phantomjs-2.1.1-windows/bin/phantomjs.exe' 
br = Browser(path)
results = br.search('Python')
for r in results:
    print(r)

br.end()

Similar questions

If you have not found the answer to your question or you are interested in this topic, then look at other similar questions below or use the search

Code that achieves the same functionality but does not rely on the

I utilized a tutorial to obtain the ajax code below. The tutorial referenced the library jquery.form.js. Here is the code snippet provided: function onsuccess(response,status){ $("#onsuccessmsg").html(response); alert(response); } $("# ...

Logging in Python: A Guide to JSON Formatting

Our Python Django application requires logging using the code snippet below in order to allow ELK to index the log record as JSON logger.info(json.dumps({'level':'INFO','data':'some random data'}) We currently have ...

Tips for incorporating a label into an already existing colorbar

I've been utilizing a Python module known as kwant that generates a comprehensive plot for me with a colorbar included. However, I'm struggling to figure out how to add a label to this colorbar. Because I can't directly access the underlying ...

Updating an event listener during the $(document).ready() function and handling ajax requests

In my Rails 4 application, there is an event listener that I have repeated in three different places. The event listener looks like this: $('.remove-tag-button').on('click', function () { $.ajax({ type: "POST", data: {i ...

Python: Using Selenium to Handle Multiple Windows

Looking to enhance my web scraping capabilities, I am trying to use Python with Selenium to open multiple tabs in a single browser window and extract real-time betting odds from each tab simultaneously. The challenge arises when the website's home pa ...

Getting data from jquery.ajax() in PHP - A comprehensive guide

When transferring data from JavaScript to PHP, I use the following method: $.ajax({'url': 'my.php', 'type': 'POST', 'data': JSON.stringify(update_data), 'success': functio ...

Is there a way to extract information from an uploaded file in JavaScript without having to actually submit the file?

Looking for a way to extract data from a user uploaded file using Javascript without page submission? The goal is to process this data to provide additional options in a form on the same page. Any assistance with this would be highly appreciated. ...

What was the reason for Selenium's inability to locate any element on the page until time.sleep() was implemented?

While automating tasks with Selenium, I encountered this specific issue. What was the process of executing the code? Within seleniumtest.py, there are two functions: getGateway() and restartRouter(). Let's delve into the restartRouter() function wh ...

Accessing Textvariable from Screen in KivyMD ContentNavigationDrawer

I'm currently learning Python and Kivy as I work on developing an Android app. For my project, I have adapted the Navigation Drawer example from the following source: My challenge lies in retrieving the value of a Textfield named vorname_input from ...

ajax memory leakage

Encountering a gradual memory leak issue in Internet Explorer and Firefox while utilizing a mix of ASP.NET AJAX and jQuery. The situation mirrors the one portrayed on this post: Preventing AJAX memory leaks, but with jQuery and ASP.NET AJAX instead of prot ...

JSON Novice - persistently storing data in JSON during browser refreshes

My AJAX poll was set up following a guide found here: http://www.w3schools.com/php/php_ajax_poll.asp Instead of storing the poll entries from an HTML radio button in an array within a text file as demonstrated in the tutorial, I wanted to save them in a J ...

Issues with the HTML required attribute not functioning properly are encountered within the form when it is

I am encountering an issue with my modal form. When I click the button that has onclick="regpatient()", the required field validation works, but in the console, it shows that the data was submitted via POST due to my onclick function. How can I resolve thi ...

Press the "Load More" button on Jooble using Selenium in Python

I am currently attempting to perform web scraping on this website and seeking a method to activate the load more button using selenium in Python. I have experimented with the following code snippets: driver.find_element(By.LINK_TEXT, "Load more") ...

AJAX function failing to trigger for the second time

I'm currently facing an issue with my JQuery AJAX function. It's supposed to populate a dropdown in a partial view based on the user's selection from another dropdown menu. The first part of the function is working as expected but the secon ...

Writing the success function for a jQuery ajax call involves defining the actions to be taken once

Embarking on my journey to learn jQuery and web development, I am faced with the task of sending user input (username and password through a submit button) to a PHP page using .ajax and success function. Below is the HTML form code: <form id="form1"&g ...

The issue of losing session data in Laravel 4 due to multiple AJAX requests

There is an issue on my page where photos are lazy loaded via AJAX, and sometimes all session data gets lost while the photos are loading. This problem does not occur consistently every time the page is loaded. I have already checked for session timeout or ...

Converting a single dictionary into a list of multiple dictionaries in Python

I need to transform JSON data into a key-value format. Can someone guide me on how to accomplish this? Here is the data: data = { "type": "student", "age": "17", "sex": "male", } Desired ...

Is there a way to obtain a csv file from a website that requires clicking a button to access it?

I'm currently attempting to retrieve a CSV file from a website by clicking on a button that is not directly visible. The specific CSV I am targeting can be found on the far right side of the webpage and is indicated by a blue button labeled 'Down ...

Stopping the AJAX call once all requested data has been fetched can be accomplished by setting a flag variable

My issue involves fetching data from a database using ajax on window scroll. Every time I scroll, the ajax call is made and data is loaded repeatedly. For instance, if there are 5 rows in the database and I fetch 2 rows per call, it should stop after fetch ...

Is there a way to set the content to be hidden by default in Jquery?

Can anyone advise on how to modify the provided code snippet, sourced from (http://www.w3schools.com/jquery/tryit.asp?filename=tryjquery_hide_show), so that the element remains hidden by default? <!DOCTYPE html> <html> <head> <scrip ...