Leveraging SQL Server File Streaming with Python

Question

I want to use SQL Server 2017 FILESTREAM from Python. I rely heavily on SQLAlchemy, so I am looking for a way to integrate FILESTREAM support into my workflow. I have not found an implementation in SQLAlchemy or any other library (I may have missed something; if so, please point me to a proven solution).

I opted to work with the DLL approach based on https://github.com/VisionMark/django-mssql-filestream/blob/master/sql_filestream/win32_streaming_api.py. However, when attempting to call OpenSqlFilestream, it fails and returns -1 instead of a file handle. Troubleshooting this issue has left me unsure of what's causing it or how to resolve it.

from ctypes import c_char, sizeof, windll
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
import msvcrt
import os

msodbcsql = windll.LoadLibrary(r"C:\Windows\System32\msodbcsql17.dll")

engine = create_engine("mssql+pyodbc://user:pass@test/test?driver=ODBC+Driver+17+for+SQL+Server&Trusted_Connection=yes")
maker = sessionmaker(bind=engine)
session = maker()
## first query should begin the transaction
path = session.execute("SELECT file_stream.PathName() FROM test_filetable").fetchall()[0][0]
## this returns str like "\\\\test\\*"
context = session.execute("SELECT GET_FILESTREAM_TRANSACTION_CONTEXT()").fetchall()[0][0]
## returns bytes

_context = (c_char*len(context)).from_buffer_copy(context)

## This call fails
handle = msodbcsql.OpenSqlFilestream(
        path, # FilestreamPath
        0, # DesiredAccess
        0, # OpenOptions
        _context, # FilestreamTransactionContext
        sizeof(_context), # FilestreamTransactionContextLength
        0 # AllocationSize
    )
## this returns -1 instead of handle

## Never reached, but this should create a usable file object
desc = msvcrt.open_osfhandle(handle, os.O_RDONLY)
_file = os.fdopen(desc, 'rb')

All queries seem to be functioning correctly and producing expected outputs.

How can I establish filestream access to a file on SQL Server 2017 from Python (3.7)?

Edit: The objects I'm handling are gigabytes in size, and stream access is all that's required during processing.

python sqlalchemy pyodbc sql-server-2017 msodbcsql17

Answer 1

The issue you are facing could be caused by one or both of the following: (1) a SQLAlchemy Session is more than a plain DB API connection, so the queries may not run on the connection and transaction you expect, and/or (2) the transaction context passed to OpenSqlFilestream does not match the transaction that is actually open.
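
Independently of those two causes, it may be worth ruling out a ctypes-level problem in the original call. Without a declared prototype, ctypes treats the return value as a C int, which cannot hold a 64-bit HANDLE, and it passes plain Python integers as C ints. The following is an untested sketch, not a confirmed fix. The helper names declare_open_sql_filestream and is_valid_handle are made up for illustration, and the argument types are my reading of the documented OpenSqlFilestream signature.

```python
import ctypes

# INVALID_HANDLE_VALUE is (HANDLE)-1, expressed here as an unsigned
# pointer-sized integer so it compares equal to a c_void_p result.
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value


def declare_open_sql_filestream(func):
    """Attach an explicit prototype to an OpenSqlFilestream function object.

    The types below are an assumption based on the documented C signature.
    """
    func.restype = ctypes.c_void_p  # HANDLE: pointer-sized, not a C int
    func.argtypes = [
        ctypes.c_wchar_p,                # LPCWSTR FilestreamPath
        ctypes.c_uint32,                 # DesiredAccess (enum)
        ctypes.c_uint32,                 # ULONG OpenOptions
        ctypes.c_char_p,                 # LPBYTE FilestreamTransactionContext
        ctypes.c_ssize_t,                # SSIZE_T context length
        ctypes.POINTER(ctypes.c_int64),  # PLARGE_INTEGER AllocationSize
    ]
    return func


def is_valid_handle(handle):
    """Return True unless the call yielded NULL or INVALID_HANDLE_VALUE."""
    return handle is not None and handle != INVALID_HANDLE_VALUE
```

On Windows this would be applied as declare_open_sql_filestream(msodbcsql.OpenSqlFilestream), after which the bytes object returned by GET_FILESTREAM_TRANSACTION_CONTEXT() can be passed directly, with len(context) as the length and None for AllocationSize. Loading the DLL with ctypes.WinDLL(path, use_last_error=True) makes ctypes.get_last_error() available after a failed call, and the resulting Windows error code should help tell a bad path from a transaction context problem.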

Here's an example that uses .NET's SqlFileStream via pythonnet instead; I have tested it successfully with CPython 3.7.2 and pythonnet 2.4.0:

import clr
clr.AddReference("System.Data")
from System.Data import IsolationLevel
from System.Data.SqlClient import SqlCommand, SqlConnection
from System.Data.SqlTypes import SqlFileStream
from System.IO import File, FileAccess, FileOptions

# Code adapted from a C# example at
# https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/filestream-data
connection_string = r"Data Source=(local)\SQLEXPRESS;Initial Catalog=myDB;Integrated Security=True"
con = SqlConnection(connection_string)
con.Open()
sql = """\
SELECT Photo.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT()
FROM employees WHERE EmployeeID = 1"""
cmd = SqlCommand(sql, con)
tran = con.BeginTransaction(IsolationLevel.ReadCommitted)
cmd.Transaction = tran
rdr = cmd.ExecuteReader()
rdr.Read()
path = rdr.GetString(0)
transaction_context = rdr.GetSqlBytes(1).Buffer
rdr.Close()
allocation_size = 0
input_stream = SqlFileStream(path, transaction_context,
        FileAccess.Read, FileOptions.SequentialScan, allocation_size)
output_stream = File.Create(r"C:\Users\Gord\Desktop\photo.bmp")
input_stream.CopyTo(output_stream)
output_stream.Close()
input_stream.Close()
tran.Commit()
con.Close()
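
The question notes that the objects are gigabytes in size and only need to be streamed. The example above copies the data to a local file. If the data should instead be processed in Python as it arrives, any binary file object with a read(n) method (for instance the one os.fdopen would return in the question, once a valid handle is obtained) can be consumed in fixed-size chunks. The following is a minimal sketch under that assumption. The name iter_chunks is hypothetical, and io.BytesIO only stands in for the real stream.

```python
import io


def iter_chunks(stream, chunk_size=1024 * 1024):
    """Yield successive blocks of at most chunk_size bytes until EOF."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        yield chunk


# Stand-in for the FILESTREAM file object; any binary stream with read(n) works.
source = io.BytesIO(b"x" * 2_500_000)
total = sum(len(chunk) for chunk in iter_chunks(source))
print(total)  # 2500000
```

Only one chunk is held in memory at a time, so the chunk size can be tuned to the processing step without regard to the total size of the object.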
