Python Script Running in Docker Encounters Errors with Missing Files, Failing to Execute

For the past year, I've been running a Python script in a Docker environment without any major issues.

However, after I updated to Docker version 4.25.0 about a month ago, the script stopped working properly.

I managed to fix several problems, but I am still stuck on a persistent file access issue.

Below is an excerpt of the problematic code:

for f in os.listdir(taskOutputFolder):
    if not f.startswith('.'):
        if f.endswith('.zip'):
            print("\n")
            print("***************************************")
            print(f"Zip File Found! {f}")
            print("***************************************")
            src_file = self.fileCheck(taskOutputFolder, f)
            dest_file = os.path.join(nextTaskInputFolder , f)

            # Copy file to the next task's input folder
            print(f"Copying file {f} from {taskOutputFolder} to {nextTaskInputFolder}")
            try:
                shutil.copyfile(src_file, dest_file)
                print(f"Successfully copied {f} to {nextTaskInputFolder}.")

            except Exception as ex:
                print(f"Failed to copy {src_file} to {dest_file} with error:")
                print(ex)

Before the code above runs, I generate the .zip file with:

with ZipFile(zip_out_filepath, mode="w", compression=ZIP_DEFLATED, compresslevel=9) as archive:
    for file_path in Path(zip_in_filepath).iterdir():
        archive.write(file_path, arcname=file_path.name)

During these steps, I verify that files exist and are accessible using the following function:

def fileCheck(self, fileFolder, fileName):
    file = os.path.abspath(os.path.join(fileFolder, fileName))
    print(f'Checking File Status for {file}...')

    # Check for bad file naming characters and remove any whitespace if found
    badFileNameChars = ['(', ')', ' ', ',', '\t']
    res = any(ele in fileName for ele in badFileNameChars)
    if res:
        print(f'Found Bad File Name Characters In: {fileName}')
        file = self.cleanFileName(file)

    # Check if the file copy was successful
    retries = 0
    file_found = False
    while not file_found:
        if not os.path.exists(file):
            retries += 1
            print(f'\nWaiting For File To Exist: {file}...')
            print(f'Retry: {retries}...\n')
            time.sleep(5)
            if retries > 6:
                # Give up after 7 retries; this flag only ends the loop, the file may still be missing
                file_found = True
        else:
            file_stats = os.stat(file)
            print(f'File Found: {file}')
            print(f'File Stats: {file_stats}')
            file_found = True

    return file

This is meant to ensure that the file exists and is accessible.
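
For comparison, here is a minimal sketch of a polling helper that tells the caller explicitly whether the file ever appeared, instead of ending the loop the same way in both cases. The name wait_for_file and its retries and delay parameters are hypothetical and not part of the original script. It relies on a single os.stat call, so there is no gap between the existence check and reading the stats.

```python
import os
import time


def wait_for_file(path, retries=6, delay=5.0):
    """Poll until `path` exists. Return True if found, False if we gave up.

    Hypothetical helper for illustration; not part of the original script.
    """
    for attempt in range(1, retries + 1):
        try:
            # One call covers both the existence check and the stats.
            stats = os.stat(path)
        except FileNotFoundError:
            print(f"Waiting for file to exist: {path} (retry {attempt}/{retries})")
            time.sleep(delay)
        else:
            print(f"File found: {path} ({stats.st_size} bytes)")
            return True
    # Any other OSError (for example an I/O error on the mount) propagates to the caller.
    return False
```

A caller can then branch on the return value rather than re-checking the path afterwards.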

Upon executing the script, I encounter the following error:

Checking File Status for /mnt/preprocessing-out/preprocess-pipe-local.zip...

File Found: /mnt/preprocessing-out/preprocess-pipe-local.zip
File Stats: os.stat_result(st_mode=33279, st_ino=60, st_dev=40, st_nlink=1, st_uid=0, st_gid=0, st_size=380537, st_atime=1700345226, st_mtime=1700345215, st_ctime=1700345215)

Checking Zip Output Again For Delayed Completion Check... 
Checking File Status for /mnt/preprocessing-out/preprocess-pipe-local.zip...


[Sat Nov 18 22:08:16 2023] [CRITICAL_FAILURE] - Missing output file after running ZIP Function



The code that leads to this error is:

with ZipFile(zip_out_filepath, mode="w", compression=ZIP_DEFLATED, compresslevel=9) as archive:
    for file_path in Path(zip_in_filepath).iterdir():
        archive.write(file_path, arcname=file_path.name)

zip_out_filepath = self.fileCheck(Path(zip_out_filepath).parent, os.path.basename(zip_out_filepath))

print('Checking Zip Output Again For Delayed Completion Check... ')
time.sleep(5)

zip_out_filepath = self.fileCheck(Path(zip_out_filepath).parent, os.path.basename(zip_out_filepath))

assert os.path.isfile(zip_out_filepath), 'Missing output file after running ZIP Function: '
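
One way to narrow down where the archive goes missing is to reopen it immediately after writing and let the zipfile module check it. The sketch below is an illustration only: the function name zip_folder_and_verify is hypothetical, and unlike the snippet above it skips subdirectories, which is an assumption about the input folder. ZipFile.testzip() returns the name of the first corrupt member, or None if all members read back cleanly.

```python
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile


def zip_folder_and_verify(zip_in_dir, zip_out_path):
    """Zip the files directly inside zip_in_dir, then reopen the archive to verify it.

    Hypothetical helper for illustration; returns the member names on success.
    """
    # Collect the inputs first so the listing is fixed before the archive is created.
    sources = sorted(p for p in Path(zip_in_dir).iterdir() if p.is_file())
    with ZipFile(zip_out_path, mode="w", compression=ZIP_DEFLATED, compresslevel=9) as archive:
        for file_path in sources:
            archive.write(file_path, arcname=file_path.name)

    # Reopening raises FileNotFoundError if the archive vanished,
    # or zipfile.BadZipFile if it is not a valid archive.
    with ZipFile(zip_out_path) as archive:
        bad_member = archive.testzip()  # first corrupt member name, or None
        if bad_member is not None:
            raise RuntimeError(f"Corrupt member in archive: {bad_member}")
        return archive.namelist()
```

If the reopen step fails right after a successful write, the problem lies in the storage layer rather than in the zip logic.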

I have tried various ways to resolve these file access issues, including switching from command-line zip tools to Python's ZipFile module and renaming files, all to no avail.

The problem seems linked to the Docker update to version 4.25.0, since that is what changed when the problem arose. My experience with Docker is limited, so I am unsure how to troubleshoot this effectively. Any guidance toward a fix, or insight into Docker changes that could cause random file access errors, would be greatly appreciated.

Answer №1

The issue was resolved by moving the mounted volume and all working files from an external hard drive to the internal one. It remains unclear how the file goes from present to missing, or why this setup worked on the external drive before but now fails. Either way, everything runs smoothly when the external drive is not used.
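
If moving the whole volume is not an option, one possible workaround (an untested assumption, not what was done here) is to build the archive on the container's own filesystem and copy it to the mounted folder only once it is complete. In the sketch below the function name build_zip_locally_then_copy is hypothetical, and it assumes the default temporary directory is not itself on the mounted volume.

```python
import shutil
import tempfile
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile


def build_zip_locally_then_copy(zip_in_dir, dest_path):
    """Build the archive in a temp dir, then copy the finished file to dest_path.

    Hypothetical helper for illustration; assumes the temp dir is container-local.
    """
    dest_path = Path(dest_path)
    with tempfile.TemporaryDirectory() as tmp_dir:
        staged = Path(tmp_dir) / dest_path.name
        with ZipFile(staged, mode="w", compression=ZIP_DEFLATED, compresslevel=9) as archive:
            for file_path in Path(zip_in_dir).iterdir():
                if file_path.is_file():  # subdirectories are skipped in this sketch
                    archive.write(file_path, arcname=file_path.name)
        shutil.copyfile(staged, dest_path)
        # Compare sizes so a truncated copy fails loudly; a missing copy raises from stat().
        if dest_path.stat().st_size != staged.stat().st_size:
            raise OSError(f"Size mismatch after copying to {dest_path}")
    return dest_path
```

This keeps the many small writes made while zipping off the mounted volume, leaving a single sequential copy at the end.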
