How to efficiently write to a CSV file and add to a list at the same time using Python

Scenario:
The code snippet below utilizes Selenium to locate a series of links from the Simply Recipe Index URL, then saves them in a list named linklist. The script proceeds to loop through each link in the linklist, fetching recipe text and storing it in a collection called recipe_list

from bs4 import BeautifulSoup
import requests
from splinter import Browser
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import selenium
import time
import csv

#set up chromedriver for WINDOWS
driver=webdriver.Chrome('chromedriver.exe')
url = "https://www.simplyrecipes.com/index/"
driver.get(url) 
response=requests.get(url)
soup=BeautifulSoup(response.text,'html.parser')

#set up chromedriver for MAC
driver=webdriver.Chrome("/Users/williamforsyth/Documents/uc_davis/Homework_Repos/group-project-2/Kathryn/chromedriver")
url = "https://www.simplyrecipes.com/index/"
driver.get(url) 
response=requests.get(url)
soup=BeautifulSoup(response.text,'html.parser')

linklist=[]
links=soup.find_all('a')
for link in links:
    linklist.append(link)
linklist_text=[]
for i in range(164,1068):
    linklist_text.append(linklist[i].text)

recipe_list=[]
for link in linklist_text:
    time.sleep(0.3)
    target=driver.find_element_by_partial_link_text(link)
    target.click()
    time.sleep(0.1)
    cards = driver.find_elements_by_class_name("grd-title-link")
    for i in range(0,len(cards)):
        try:
            newcards = driver.find_elements_by_class_name("grd-title-link")
            time.sleep(0.3)
            newcards[i].click()
            time.sleep(0.3)
            recipe=driver.find_element_by_id("sr-recipe-callout")
            recipe_list.append(recipe.text)
            driver.back()
            time.sleep(0.3)
        except:
            continue
    driver.get(url)

Dilemma:
I now wish to include a functionality that appends every instance of recipe.text from the loops to a CSV file. Is there an easy way to achieve this without having to overhaul the entire code?

The intended code snippet to append to a CSV alongside the current action of appending to recipe_list:

    recipe_list.append(recipe.text)

Answer №1

To achieve the goal of "appending the recipe.text from every loop to a csv file," you can modify your code as follows:

If you wish to write the data to the file during each iteration of your TRY loop, you can use the following approach by importing the necessary modules and making adjustments to your existing code:

import csv

with open('recipe_output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',')
    writer.writerow(recipe.text)

Ensure that the lines containing "with open()" and "writer =" are placed before the loop starts. Replace your current line of "recipe_list.append(recipe.text)" with the new "writer.writerow" line provided above. This change will result in a CSV file being updated with the relevant data in each loop iteration, rather than saving all the data stored in your appended list at the end.

Answer №2

If you're looking for a simple way to create a CSV file with just one column of recipe text, you can easily achieve this by utilizing the csv module after recipe_list has been created:

import csv
with open('recipies.csv', mode='w', newline='') as fileobj:
    csv.writer(fileobj).writerows([col] for col in recipe_list)

Running this code will replace the existing recipes file each time it is executed. To avoid overwriting data, you can use mode='a' to append new content.

Similar questions

If you have not found the answer to your question or you are interested in this topic, then look at other similar questions below or use the search

Recent ExtentReports Maven dependency for Selenium available now

I included the following in my POM file. <dependency> <groupId>com.aventstack</groupId> <artifactId>extentreports</artifactId> <version>3.0.5</version> </dependency> Unfortunately, it seems ...

Unleashing the power of the Django API Rest Framework's POST method for data extraction

I am currently working on developing an API that will run a script using specific variables passed in the POST message, for example {'command': 'start', 'name': 'var'}. I have been struggling to find a way to extract ...

Utilizing Selenium and JavaScript to discover and extract the subject fields within a table

In my sent message list, I am trying to display the subject fields of all messages using Selenium and JavaScript (not Java) for coding. Here is the code snippet: <table class="messages"> <tbody><tr class="section-heading"> < ...

Changing the default directory where profiles are saved in Selenium Firefox

Currently, I'm working with selenium using geckodriver. The main objective is to utilize a pre-existing session's profile stored in a specific path rather than the default directory tmp. Upon initiating a session with a new profile: from seleniu ...

New and improved IsVisible extension incorrectly showing true instead of false

In the process of enhancing my Automation skills, I have developed an extension method to verify the display of an element, complete with a wait feature. Upon inspection, there are no visible errors in the code, it compiles and executes smoothly. The exte ...

Guide to accessing a mobile version website with Selenium WebDriver

I wrote a code snippet to open the mobile version of Facebook on my Desktop using Firefox by tweaking the user-agent. @Test public void fb() { FirefoxProfile ffprofile = new FirefoxProfile(); ffprofile.setPreference("general.useragent.over ...

Ways to retrieve the user's ID upon clicking the list in OpenERP

As a newcomer to openerp, I am working with tree data and need to retrieve the employee ID when a specific row is clicked. To better understand, please refer to the image below: ...

Executing python scripts with php - functional in terminal but not in web browser

I am attempting to execute a python script from php on an ec2 instance. When I use the command php index.php in the ssh console, it works fine. However, when trying to run it in a web browser, there are issues. The python program I have takes input and cr ...

To conduct browser testing in Laravel 5.4, it is necessary to have an X display in order to input text into fields

Currently, I am diving into the Laravel Documentation (https://laravel.com/docs/5.4/dusk) in order to develop browser tests for my application. After some experimentation, I managed to get Dusk up and running successfully. It seems to only function when u ...

Python - Easiest method to add time stamp to subprocess.run stderr output?

I'm trying to add a timestamp prefix to the sdterror output that I get from subprocess.run, but I haven't been able to figure out how to do it. In this part of my shell script, I run FFMPEG and write the output to a logfile: try: conform_re ...

What could be causing the InvalidDocument exception to occur when attempting to save an object for the first time in MongoDB using Django?

Struggling to make MongoDB work with Django has been quite the challenge for me. After finally managing to install it successfully, I encountered errors when attempting to save an object for the first time. Following this tutorial, I carefully replicated t ...

Can someone explain why the output of this code is not the expected value?

I've created a selenium program using Python: from selenium import webdriver i=0 driver = webdriver.Chrome(executable_path="/usr/local/bin/chromedriver") driver.get("https://problogger.com/jobs/") elems = driver.find_elements_by_t ...

What is the best way to break apart the word "ActionAction-AdventureShooterStealth" into individual words?

There is a column called genres in the games dataset which contains all the genres without any spaces or special characters. The major genre is listed first followed by other genres. Refer to the table below for better understanding: ...

The xpath is functioning correctly in firebug, but for some reason the sendkey method is not

I've been working on a login code for my Hotmail account. A few days ago, I asked for help on finding the xpath and someone provided me with the correct one: //*[@id='CredentialsInputPane']//div[3]//div[2]/div That solution worked perfectl ...

Code transition from Windows to Linux in Python

I am encountering file encoding issues when trying to run code files formatted for Windows on a Linux machine. Can someone provide a solution to this problem? When I attempt to run the files on Windows, I receive the following error message: This was ret ...

Beginning your API journey with Django

Exploring the realm of APIs in Django has left me feeling a bit lost. I'm eager to dive in, but unsure where to begin. While I've grasped the basics of Django, navigating the world of REST APIs is still uncharted territory for me. I want to lear ...

Issues with uploading files in NodeJs using express-fileupload are causing errors

I created a REST API in NodeJs for File Upload which is functioning correctly, however I am facing an issue. When I upload more than 2 images, only 2 or 3 get uploaded and sometimes one gets corrupted. I suspect that the loop is running too fast causing th ...

What are the steps to creating a successful test with Browserstack, selenium, and mocha?

I'm currently working on testing my web applications with BrowserStack. I've been following their example from the website. var assert = require('assert'), fs = require('fs'); var webdriver = require('selenium-webdriv ...

Selecting data from a multi-indexed dataframe in pandas

Despite coming across numerous questions regarding this topic, I have not been able to find a solution specific to this particular question. I am currently exploring a CSV file that contains a subset of TBC data from the WHO: import pandas as pd df = pd ...

Is there a way to manipulate a website's HTML on my local machine using code?

I am currently working on a project to create a program that can scan through a website and censor inappropriate language. While I have been able to manually edit the text using Chrome Dev tools, I am unsure of how to automate this process with code. I ha ...