Apache Airflow - Using Azure Blob Storage

Introduction
This is the third article in the series on Airflow on Kubernetes.
In this article, we’ll dive into using Azure Blob Storage with Apache Airflow.
We’ll create two DAGs to showcase different approaches:
- pandas-abfs-example.py: Demonstrates reading from and writing to Azure Blob Storage using Pandas DataFrames.
- duckdb-abfs-example.py: Highlights reading data from Azure Blob Storage with DuckDB.
Limitation of this article
This article does not cover tasks that run a custom Docker image through the KubernetesPodOperator. For those cases, it is recommended to use the libraries provided by the Azure SDK for your programming language.
Set up Azure Blob Storage Connection
To use Azure Blob Storage in Apache Airflow, we need to set up the connection to Azure Blob Storage.
WASB is deprecated
Some documents online may refer to the Windows Azure Storage Blob driver (WASB) for Azure Blob Storage. However, the wasb:// protocol is deprecated: Microsoft has deprecated WASB in favor of the Azure Blob Filesystem driver (ABFS).
The following protocols are available now:
- abfs
- abfss
- adl
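If you find an old wasb:// URL in existing documentation or code, the rewrite to the ABFS form is mostly mechanical. The sketch below assumes the fully qualified URL style (container@account.endpoint); the account name myaccount and the helper name wasb_to_abfs are hypothetical and not part of this article's setup.

```python
from urllib.parse import urlsplit, urlunsplit


def wasb_to_abfs(url: str) -> str:
    """Rewrite a wasb:// or wasbs:// URL into its abfs:// or abfss:// form."""
    parts = urlsplit(url)
    scheme_map = {"wasb": "abfs", "wasbs": "abfss"}
    if parts.scheme not in scheme_map:
        raise ValueError(f"not a WASB URL: {url}")
    # WASB URLs point at the blob endpoint; ABFS URLs point at the dfs endpoint.
    netloc = parts.netloc.replace(".blob.core.windows.net", ".dfs.core.windows.net")
    return urlunsplit((scheme_map[parts.scheme], netloc, parts.path, parts.query, parts.fragment))


# Hypothetical account name, used only for illustration.
print(wasb_to_abfs("wasbs://my-storage@myaccount.blob.core.windows.net/pandas/data.json"))
```

The DAGs in this article use the shorter abfs://my-storage form, where the account details come from the Airflow connection instead of the URL.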
Create a Connection
To create a connection to Azure Blob Storage, go to Admin → Connections in the main menu and create a new connection.

Fill in the following fields:
- Connection ID: az_blob_storage (or any name you want)
- Connection Type: Azure Blob Storage
- Blob Storage Connection String (optional): Connection string for Azure Blob Storage
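The same connection can also be supplied without the UI, through an AIRFLOW_CONN_<CONN_ID> environment variable that holds the connection as JSON. The sketch below only builds that value. It assumes that the internal type name behind "Azure Blob Storage" is wasb and that the connection string lives in an extra field named connection_string; check both against your installed provider version before relying on them.

```python
import json

# Placeholder only; never commit a real connection string.
connection_string = "<your-connection-string>"

conn = {
    "conn_type": "wasb",  # assumed type name behind "Azure Blob Storage"
    "extra": {"connection_string": connection_string},  # assumed extra field name
}

# Airflow looks up AIRFLOW_CONN_ followed by the upper-cased Connection ID.
env_name = "AIRFLOW_CONN_" + "az_blob_storage".upper()
env_value = json.dumps(conn)

print(f"{env_name}='{env_value}'")
```

A connection defined this way does not show up in the Admin → Connections list, so the UI route above is easier to follow while learning.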


Connection ID
In most cases, the Connection ID is just a name for the connection and has nothing to do with the storage itself. In the DuckDB example, however, the Connection ID is also used to resolve the container name of Azure Blob Storage. So, later we are going to create a second Azure Blob Storage connection with the Connection ID my-storage.
Connection String
For some reason, among the security options, only the connection string worked for the Azure Blob Storage connection in my setup. There is no need to set Blob Storage Login (account name), because the connection string already includes the account name.
You can generate the connection string from the Azure Portal.

Please note that a connection string based on a shared access signature (SAS) has an expiration date. So, you need to regenerate the connection string when it expires.
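To see what a connection string contains, and when a SAS-based one expires, it can be split into its key/value pairs. This is a minimal sketch: the sample strings, the account name myaccount, and both helper names are made up for illustration, and the expiry parser assumes the se parameter uses the full YYYY-MM-DDTHH:MM:SSZ form.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs


def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs; values may themselves contain '='."""
    pairs = (item.partition("=") for item in conn_str.split(";") if item)
    return {key: value for key, _, value in pairs}


def sas_expiry(conn_str: str) -> Optional[datetime]:
    """Return the SAS expiry (the 'se' parameter), or None if there is no SAS."""
    sas = parse_connection_string(conn_str).get("SharedAccessSignature")
    if not sas:
        return None
    expiry = parse_qs(sas.lstrip("?")).get("se")
    if not expiry:
        return None
    return datetime.strptime(expiry[0], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# Made-up sample values; real keys and signatures must stay secret.
key_based = "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=REDACTED==;EndpointSuffix=core.windows.net"
sas_based = "BlobEndpoint=https://myaccount.blob.core.windows.net/;SharedAccessSignature=sv=2022-11-02&ss=b&sp=rwl&se=2025-01-31T00:00:00Z&sig=REDACTED"

print(parse_connection_string(key_based)["AccountName"])
print(sas_expiry(sas_based))
```

A check like this could run before a deployment to warn about a connection string that is about to expire.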
pandas-abfs-example.py
We are going to create a simple DAG with two tasks:
- task1 - read hard-coded JSON data into a Pandas DataFrame and write it to Azure Blob Storage.
- task2 - read the JSON data back from Azure Blob Storage and write it to the log.
Even though the two tasks run on different pods, they can share the path of the file using XCom.
import io

import pendulum
from airflow.decorators import dag, task
from airflow.io.path import ObjectStoragePath


@dag(
    schedule=None,
    start_date=pendulum.datetime(2024, 11, 1, tz="UTC"),
    catchup=False,
    tags=["pandas", "abfs", "azure_blob_storage"],
)
def pandas_abfs_example():
    @task
    def save_file() -> ObjectStoragePath:
        import pandas as pd

        # "my-storage" is the container; conn_id names the Airflow connection.
        base = ObjectStoragePath("abfs://my-storage", conn_id="az_blob_storage")
        print("Saving file to Azure Blob Storage")
        print("base: ", base)
        data_string = '{"Courses":{"r1":"Spark"},"Fee":{"r1":"25000"},"Duration":{"r1":"50 Days"}}'
        # Wrap the literal in StringIO: passing raw JSON text to read_json is deprecated.
        df = pd.read_json(io.StringIO(data_string))
        print("df: ", df)
        path = base / "pandas/data.json"
        with path.open("w") as file:
            # Write JSON so the content matches the .json file name.
            df.to_json(file)
        # The returned path is handed to the next task through XCom.
        return path

    @task
    def read_file(path: ObjectStoragePath):
        import pandas as pd

        print("Reading file from Azure Blob Storage")
        print("path: ", path)
        with path.open("r") as file:
            df = pd.read_json(file)
        print("df: ", df)

    path = save_file()
    read_file(path)


pandas_abfs_example()
save_file function
In the DAG, we are going to use the ObjectStoragePath class to create the path for Azure Blob Storage.
base = ObjectStoragePath("abfs://my-storage", conn_id="az_blob_storage")
'my-storage' is the container name of Azure Blob Storage. You can get the container name from the Azure Portal.
The path is then built by appending the file's relative path to the base path, which already holds the container name.
path = base / "pandas/data.json"
The save_file function saves the file to Azure Blob Storage. After the task runs, we can see the data.json file in Azure Blob Storage.
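ObjectStoragePath follows the pathlib conventions, so the / operator behaves the way it does for ordinary paths. The sketch below uses PurePosixPath from the standard library as a stand-in, purely so that it runs without Airflow or an Azure connection; only the path-joining behaviour is illustrated.

```python
from pathlib import PurePosixPath

# PurePosixPath stands in for ObjectStoragePath here (no Airflow needed).
base = PurePosixPath("my-storage")
path = base / "pandas/data.json"

print(path)         # my-storage/pandas/data.json
print(path.parent)  # my-storage/pandas
print(path.name)    # data.json
print(path.suffix)  # .json
```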

read_file function
The read_file function takes as its parameter the path of the file that was saved in the save_file function. The path is shared using XCom, which lets tasks running on different pods exchange data. The size of the data shared using XCom is limited, so it is not recommended to share large data like a Pandas DataFrame or a Spark DataFrame. Sharing only the path keeps the XCom payload small.
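The difference in size is easy to see with plain JSON. The sketch below compares a payload that carries only the file location with one that carries the rows themselves. The 10,000 rows are made up for illustration, and JSON is used here only as a rough proxy for how XCom values are serialized.

```python
import json

# What the DAG passes between tasks: just the location of the file.
path_payload = json.dumps({"path": "abfs://my-storage/pandas/data.json"})

# What it avoids passing: the records themselves (10,000 made-up rows).
rows = [{"Courses": "Spark", "Fee": 25000, "Duration": "50 Days"} for _ in range(10_000)]
data_payload = json.dumps(rows)

print(f"path payload: {len(path_payload)} characters")
print(f"data payload: {len(data_payload)} characters")
```

The path payload stays the same size no matter how large the file in Azure Blob Storage grows.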
XComs
We can see the data shared using XCom in the Airflow UI.

Since the 'pandas' package is included in the default Docker image of Apache Airflow, we can use the 'pandas' package in the Airflow task without any additional installation. However, if you need to use other packages, you need to create a custom Docker image.
duckdb-abfs-example.py
The duckdb package is not included in the default Docker image of Apache Airflow. So, we need to create a custom Docker image.
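A quick way to confirm which packages are missing from an image is to ask Python whether it can locate them. This is a small sketch with a hypothetical helper name (missing_packages); run it inside the container you want to inspect.

```python
from importlib.util import find_spec


def missing_packages(names: list) -> list:
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if find_spec(name) is None]


# The result depends on the environment the snippet runs in.
print(missing_packages(["pandas", "duckdb"]))
```

In the default Airflow image described in this article, the expectation is that only duckdb is reported as missing.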
Customized Base Docker Image for DuckDB
This Docker image is the base image that is used for all Airflow tasks.
For more information on the Python libraries included in the Docker image of Apache Airflow, refer to the appendix.
FROM apache/airflow:2.10.3
RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir duckdb
This Dockerfile installs the 'duckdb' package. You can add any other packages you need in the same way.
To use this Docker image as the base image for the Airflow tasks, we need to build it and push it to the container registry.
$ az acr build --image airflow-custom:2.10.3 --registry {your-acr-name} ./docker/custom
Then we need to update the values.yaml file to use this Docker image.
images:
  airflow:
    repository: {your-acr-name}.azurecr.io/airflow-custom
    tag: 2.10.3
Now we can use DuckDB in the Airflow task.
duckdb-abfs-example.py
We are going to create a simple DAG with one task:
- task1 - read Parquet data from Azure Blob Storage and analyze the data using DuckDB.
The Parquet files used in this example are the same as the Parquet files used in the previous article. The Sling ETL task migrates data from the source database to Azure Blob Storage in Parquet format. In the previous article, we used a separate Docker image and a Sling configuration file to access Azure Blob Storage. In this article, we are going to use the Azure Blob Storage connection in Apache Airflow.
import pendulum
from airflow.decorators import dag, task
from airflow.io.path import ObjectStoragePath


@dag(
    schedule=None,
    start_date=pendulum.datetime(2024, 11, 1, tz="UTC"),
    catchup=False,
    tags=["abfs", "duckdb", "azure_blob_storage"],
)
def duckdb_abfs_example():
    @task
    def analyze_data():
        """Analyze

        This task will analyze the data from Azure Blob Storage
        """
        import duckdb

        print("Analyzing data from Azure Blob Storage")
        # The Connection ID is deliberately the same as the container name.
        base = ObjectStoragePath("abfs://my-storage", conn_id="my-storage")
        path = base / "sling/2024-11-29/division/*.parquet"
        conn = duckdb.connect(database=":memory:")
        # Let DuckDB read through the fsspec filesystem behind the path.
        conn.register_filesystem(path.fs)
        conn.execute(f"CREATE OR REPLACE VIEW division AS SELECT * FROM read_parquet('{path}')")
        df = conn.execute("SELECT COUNT(*) AS COUNT FROM division").fetchdf()
        print(df)
        message = "===> The number of records in the division table is: " + str(df["COUNT"][0])
        print(message)

    analyze_data()


duckdb_abfs_example()
In the analyze_data function, we read Parquet files from Azure Blob Storage and analyze the data using DuckDB.
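The DAG hard-codes the date folder in the Parquet path. If the Sling job writes one folder per day, the glob pattern could be built from a date instead. The helper below is a hypothetical sketch (division_glob is not part of the DAG) and assumes the sling/<date>/division layout used in this example.

```python
from datetime import date


def division_glob(run_date: date) -> str:
    """Build the relative glob for one day's Parquet files of the division table."""
    return f"sling/{run_date.isoformat()}/division/*.parquet"


print(division_glob(date(2024, 11, 29)))  # sling/2024-11-29/division/*.parquet
```

The result can be appended to the base path with the / operator in the same way as the hard-coded string.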
But DuckDB treats ObjectStoragePath in a different way. While Pandas gets the container name from the ObjectStoragePath, DuckDB gets the container name from the Connection ID.
So we need to create a new Azure Blob Storage connection with the Connection ID my-storage.
I used the same information as the Pandas example but with a different Connection ID.
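Because this requirement is easy to forget, a small guard can make the mismatch visible before DuckDB fails with a less obvious error. This is a sketch with a hypothetical helper name; it only compares strings and does not touch Airflow or Azure.

```python
from urllib.parse import urlsplit


def check_conn_id_matches_container(url: str, conn_id: str) -> None:
    """Raise if the Connection ID differs from the container in an abfs:// URL."""
    container = urlsplit(url).netloc
    if container != conn_id:
        raise ValueError(
            f"Connection ID {conn_id!r} does not match container {container!r}; "
            "the DuckDB example expects them to be identical."
        )


# Passes silently: both are "my-storage".
check_conn_id_matches_container("abfs://my-storage", "my-storage")
```

Calling it with the az_blob_storage Connection ID from the Pandas example raises a ValueError.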
After executing the DAG, we can see the result of the DuckDB query in the log.

Conclusion
In this article, we explored how to work with Azure Blob Storage in Apache Airflow. We created two DAGs to demonstrate the use of Pandas DataFrames and DuckDB with Azure Blob Storage. Additionally, we learned how to share data between tasks using XCom and how to build a custom Docker image to include the DuckDB package for use in Airflow tasks.
All my LinkedIn articles are available at My LinkedIn Article Library.
Appendix
Python libraries included in the Docker image of Apache Airflow 2.9.3
$ pip list
Package Version
---------------------------------------- ---------------
adal 1.2.7
adlfs 2024.4.1
aiobotocore 2.13.1
aiofiles 23.2.1
aiohttp 3.9.5
aioitertools 0.11.0
aiosignal 1.3.1
alembic 1.13.2
amqp 5.2.0
annotated-types 0.7.0
anyio 4.4.0
apache-airflow 2.9.3
apache-airflow-providers-amazon 8.25.0
apache-airflow-providers-celery 3.7.2
apache-airflow-providers-cncf-kubernetes 8.3.3
apache-airflow-providers-common-io 1.3.2
apache-airflow-providers-common-sql 1.14.2
apache-airflow-providers-docker 3.12.2
apache-airflow-providers-elasticsearch 5.4.1
apache-airflow-providers-fab 1.2.2
apache-airflow-providers-ftp 3.10.0
apache-airflow-providers-google 10.21.0
apache-airflow-providers-grpc 3.5.2
apache-airflow-providers-hashicorp 3.7.1
apache-airflow-providers-http 4.12.0
apache-airflow-providers-imap 3.6.1
apache-airflow-providers-microsoft-azure 10.2.0
apache-airflow-providers-mysql 5.6.2
apache-airflow-providers-odbc 4.6.2
apache-airflow-providers-openlineage 1.9.1
apache-airflow-providers-postgres 5.11.2
apache-airflow-providers-redis 3.7.1
apache-airflow-providers-sendgrid 3.5.1
apache-airflow-providers-sftp 4.10.2
apache-airflow-providers-slack 8.7.1
apache-airflow-providers-smtp 1.7.1
apache-airflow-providers-snowflake 5.6.0
apache-airflow-providers-sqlite 3.8.1
apache-airflow-providers-ssh 3.11.2
apispec 6.6.1
argcomplete 3.4.0
asgiref 3.8.1
asn1crypto 1.5.1
asyncssh 2.15.0
attrs 23.2.0
Authlib 1.3.1
azure-batch 14.2.0
azure-common 1.1.28
azure-core 1.30.2
azure-cosmos 4.7.0
azure-datalake-store 0.0.53
azure-identity 1.17.1
azure-keyvault-secrets 4.8.0
azure-kusto-data 4.5.1
azure-mgmt-containerinstance 10.1.0
azure-mgmt-containerregistry 10.3.0
azure-mgmt-core 1.4.0
azure-mgmt-cosmosdb 9.5.1
azure-mgmt-datafactory 8.0.0
azure-mgmt-datalake-nspkg 3.0.1
azure-mgmt-datalake-store 0.5.0
azure-mgmt-nspkg 3.0.2
azure-mgmt-resource 23.1.1
azure-mgmt-storage 21.2.1
azure-nspkg 3.0.2
azure-servicebus 7.12.2
azure-storage-blob 12.20.0
azure-storage-file-datalake 12.15.0
azure-storage-file-share 12.16.0
azure-synapse-artifacts 0.19.0
azure-synapse-spark 0.7.0
Babel 2.15.0
backoff 2.2.1
bcrypt 4.1.3
beautifulsoup4 4.12.3
billiard 4.2.0
blinker 1.8.2
boto3 1.34.131
botocore 1.34.131
cachelib 0.9.0
cachetools 5.3.3
cattrs 23.2.3
celery 5.4.0
certifi 2024.7.4
cffi 1.16.0
chardet 5.2.0
charset-normalizer 3.3.2
click 8.1.7
click-didyoumean 0.3.1
click-plugins 1.1.1
click-repl 0.3.0
clickclick 20.10.2
colorama 0.4.6
colorlog 4.8.0
ConfigUpdater 3.2
connexion 2.14.2
cron-descriptor 1.4.3
croniter 2.0.5
cryptography 41.0.7
db-dtypes 1.2.0
decorator 5.1.1
Deprecated 1.2.14
dill 0.3.8
distlib 0.3.8
dnspython 2.6.1
docker 7.1.0
docstring_parser 0.16
docutils 0.16
elastic-transport 8.13.1
elasticsearch 8.14.0
email_validator 2.2.0
eventlet 0.36.1
filelock 3.15.4
Flask 2.2.5
Flask-AppBuilder 4.5.0
Flask-Babel 2.0.0
Flask-Caching 2.3.0
Flask-JWT-Extended 4.6.0
Flask-Limiter 3.7.0
Flask-Login 0.6.3
Flask-Session 0.5.0
Flask-SQLAlchemy 2.5.1
Flask-WTF 1.2.1
flower 2.0.1
frozenlist 1.4.1
fsspec 2023.12.2
gcloud-aio-auth 4.2.3
gcloud-aio-bigquery 7.1.0
gcloud-aio-storage 9.2.0
gcsfs 2023.12.2.post1
gevent 24.2.1
google-ads 24.1.0
google-analytics-admin 0.22.8
google-api-core 2.19.1
google-api-python-client 2.137.0
google-auth 2.32.0
google-auth-httplib2 0.2.0
google-auth-oauthlib 1.2.1
google-cloud-aiplatform 1.59.0
google-cloud-appengine-logging 1.4.4
google-cloud-audit-log 0.2.5
google-cloud-automl 2.13.4
google-cloud-batch 0.17.22
google-cloud-bigquery 3.20.1
google-cloud-bigquery-datatransfer 3.15.4
google-cloud-bigtable 2.24.0
google-cloud-build 3.24.1
google-cloud-compute 1.19.1
google-cloud-container 2.49.0
google-cloud-core 2.4.1
google-cloud-datacatalog 3.19.1
google-cloud-dataflow-client 0.8.11
google-cloud-dataform 0.5.10
google-cloud-dataplex 2.2.1
google-cloud-dataproc 5.10.1
google-cloud-dataproc-metastore 1.15.4
google-cloud-dlp 3.18.1
google-cloud-kms 2.24.1
google-cloud-language 2.13.4
google-cloud-logging 3.10.0
google-cloud-memcache 1.9.4
google-cloud-monitoring 2.22.1
google-cloud-orchestration-airflow 1.13.0
google-cloud-os-login 2.14.5
google-cloud-pubsub 2.22.0
google-cloud-redis 2.15.4
google-cloud-resource-manager 1.12.4
google-cloud-run 0.10.7
google-cloud-secret-manager 2.20.1
google-cloud-spanner 3.47.0
google-cloud-speech 2.26.1
google-cloud-storage 2.17.0
google-cloud-storage-transfer 1.11.4
google-cloud-tasks 2.16.4
google-cloud-texttospeech 2.16.4
google-cloud-translate 3.15.4
google-cloud-videointelligence 2.13.4
google-cloud-vision 3.7.3
google-cloud-workflows 1.14.4
google-crc32c 1.5.0
google-re2 1.1.20240702
google-resumable-media 2.7.1
googleapis-common-protos 1.63.2
graphviz 0.20.3
greenlet 3.0.3
grpc-google-iam-v1 0.13.1
grpc-interceptor 0.15.4
grpcio 1.64.1
grpcio-gcp 0.2.2
grpcio-status 1.62.2
gunicorn 22.0.0
h11 0.14.0
h2 4.1.0
hpack 4.0.0
httpcore 1.0.5
httplib2 0.22.0
httpx 0.27.0
humanize 4.10.0
hvac 2.3.0
hyperframe 6.0.1
idna 3.7
ijson 3.3.0
importlib-metadata 6.11.0
importlib_resources 6.4.0
inflection 0.5.1
isodate 0.6.1
itsdangerous 2.2.0
Jinja2 3.1.4
jmespath 0.10.0
json-merge-patch 0.2
jsonpath-ng 1.6.1
jsonschema 4.23.0
jsonschema-specifications 2023.12.1
kombu 5.3.7
kubernetes 29.0.0
kubernetes_asyncio 29.0.0
lazy-object-proxy 1.10.0
ldap3 2.9.1
limits 3.13.0
linkify-it-py 2.0.3
lockfile 0.12.2
looker-sdk 24.10.0
lxml 5.2.2
Mako 1.3.5
markdown-it-py 3.0.0
MarkupSafe 2.1.5
marshmallow 3.21.3
marshmallow-oneofschema 3.1.1
marshmallow-sqlalchemy 0.28.2
mdit-py-plugins 0.4.1
mdurl 0.1.2
methodtools 0.4.7
microsoft-kiota-abstractions 1.3.3
microsoft-kiota-authentication-azure 1.0.0
microsoft-kiota-http 1.3.2
more-itertools 10.3.0
msal 1.29.0
msal-extensions 1.2.0
msgraph-core 1.1.1
msrest 0.7.1
msrestazure 0.6.4.post1
multidict 6.0.5
mysql-connector-python 9.0.0
mysqlclient 2.2.4
numpy 1.26.4
oauthlib 3.2.2
openlineage-integration-common 1.18.0
openlineage-python 1.18.0
openlineage_sql 1.18.0
opentelemetry-api 1.25.0
opentelemetry-exporter-otlp 1.25.0
opentelemetry-exporter-otlp-proto-common 1.25.0
opentelemetry-exporter-otlp-proto-grpc 1.25.0
opentelemetry-exporter-otlp-proto-http 1.25.0
opentelemetry-proto 1.25.0
opentelemetry-sdk 1.25.0
opentelemetry-semantic-conventions 0.46b0
ordered-set 4.1.0
packaging 24.1
pandas 2.1.4
pandas-gbq 0.23.1
paramiko 3.4.0
pathspec 0.12.1
pendulum 3.0.0
pip 24.3.1
platformdirs 4.2.2
pluggy 1.5.0
ply 3.11
portalocker 2.10.0
prison 0.2.1
prometheus_client 0.20.0
prompt_toolkit 3.0.47
proto-plus 1.24.0
protobuf 4.25.3
psutil 6.0.0
psycopg2-binary 2.9.9
pyarrow 16.1.0
pyasn1 0.5.1
pyasn1-modules 0.3.0
PyAthena 3.8.3
pycparser 2.22
pydantic 2.8.2
pydantic_core 2.20.1
pydata-google-auth 1.8.2
Pygments 2.18.0
PyJWT 2.8.0
PyNaCl 1.5.0
pyodbc 5.1.0
pyOpenSSL 24.1.0
pyparsing 3.1.2
python-daemon 3.0.1
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-http-client 3.3.7
python-ldap 3.4.4
python-nvd3 0.16.0
python-slugify 8.0.4
pytz 2024.1
PyYAML 6.0.1
redis 5.0.7
redshift-connector 2.1.2
referencing 0.35.1
requests 2.32.3
requests-oauthlib 1.3.1
requests-toolbelt 1.0.0
rfc3339-validator 0.1.4
rich 13.7.1
rich-argparse 1.5.2
rpds-py 0.19.0
rsa 4.9
s3transfer 0.10.2
scramp 1.4.5
sendgrid 6.11.0
setproctitle 1.3.3
setuptools 66.1.1
shapely 2.0.4
six 1.16.0
slack_sdk 3.31.0
sniffio 1.3.1
snowflake-connector-python 3.11.0
snowflake-sqlalchemy 1.6.1
sortedcontainers 2.4.0
soupsieve 2.5
SQLAlchemy 1.4.52
sqlalchemy-bigquery 1.11.0
SQLAlchemy-JSONField 1.0.2
sqlalchemy-redshift 0.8.14
sqlalchemy-spanner 1.7.0
SQLAlchemy-Utils 0.41.2
sqlparse 0.5.0
sshtunnel 0.4.0
starkbank-ecdsa 2.2.0
statsd 4.0.1
std-uritemplate 1.0.3
tabulate 0.9.0
tenacity 8.5.0
termcolor 2.4.0
text-unidecode 1.3
time-machine 2.14.2
tomlkit 0.13.0
tornado 6.4.1
typing_extensions 4.12.2
tzdata 2024.1
uc-micro-py 1.0.3
unicodecsv 0.14.1
universal_pathlib 0.2.2
uritemplate 4.1.1
urllib3 2.0.7
uv 0.2.31
vine 5.1.0
virtualenv 20.26.3
watchtower 3.2.0
wcwidth 0.2.13
websocket-client 1.8.0
Werkzeug 2.2.3
wirerope 0.4.7
wrapt 1.16.0
WTForms 3.1.2
yarl 1.9.4
zipp 3.19.2
zope.event 5.0
zope.interface 6.4.post2
Python libraries included in the Docker image of Apache Airflow 2.10.3
$ pip list
Package Version
---------------------------------------- ------------
adal 1.2.7
adlfs 2024.7.0
aiobotocore 2.15.2
aiofiles 23.2.1
aiohappyeyeballs 2.4.3
aiohttp 3.10.10
aioitertools 0.12.0
aiosignal 1.3.1
alembic 1.14.0
amqp 5.2.0
annotated-types 0.7.0
anyio 4.6.2.post1
apache-airflow 2.10.3
apache-airflow-providers-amazon 9.0.0
apache-airflow-providers-celery 3.8.3
apache-airflow-providers-cncf-kubernetes 9.0.1
apache-airflow-providers-common-compat 1.2.1
apache-airflow-providers-common-io 1.4.2
apache-airflow-providers-common-sql 1.19.0
apache-airflow-providers-docker 3.14.0
apache-airflow-providers-elasticsearch 5.5.2
apache-airflow-providers-fab 1.5.0
apache-airflow-providers-ftp 3.11.1
apache-airflow-providers-google 10.25.0
apache-airflow-providers-grpc 3.6.0
apache-airflow-providers-hashicorp 3.8.0
apache-airflow-providers-http 4.13.2
apache-airflow-providers-imap 3.7.0
apache-airflow-providers-microsoft-azure 11.0.0
apache-airflow-providers-mysql 5.7.3
apache-airflow-providers-odbc 4.8.0
apache-airflow-providers-openlineage 1.13.0
apache-airflow-providers-postgres 5.13.1
apache-airflow-providers-redis 3.8.0
apache-airflow-providers-sendgrid 3.6.0
apache-airflow-providers-sftp 4.11.1
apache-airflow-providers-slack 8.9.1
apache-airflow-providers-smtp 1.8.0
apache-airflow-providers-snowflake 5.8.0
apache-airflow-providers-sqlite 3.9.0
apache-airflow-providers-ssh 3.14.0
apispec 6.7.1
argcomplete 3.5.1
asgiref 3.8.1
asn1crypto 1.5.1
asyncssh 2.18.0
attrs 24.2.0
Authlib 1.3.2
azure-batch 14.2.0
azure-common 1.1.28
azure-core 1.32.0
azure-cosmos 4.7.0
azure-datalake-store 0.0.53
azure-identity 1.19.0
azure-keyvault-secrets 4.9.0
azure-kusto-data 4.6.1
azure-mgmt-containerinstance 10.1.0
azure-mgmt-containerregistry 10.3.0
azure-mgmt-core 1.5.0
azure-mgmt-cosmosdb 9.6.0
azure-mgmt-datafactory 9.0.0
azure-mgmt-datalake-nspkg 3.0.1
azure-mgmt-datalake-store 0.5.0
azure-mgmt-nspkg 3.0.2
azure-mgmt-resource 23.2.0
azure-mgmt-storage 21.2.1
azure-nspkg 3.0.2
azure-servicebus 7.12.3
azure-storage-blob 12.23.1
azure-storage-file-datalake 12.17.0
azure-storage-file-share 12.19.0
azure-synapse-artifacts 0.19.0
azure-synapse-spark 0.7.0
babel 2.16.0
backoff 2.2.1
bcrypt 4.2.0
beautifulsoup4 4.12.3
billiard 4.2.1
blinker 1.8.2
boto3 1.35.36
botocore 1.35.36
cachelib 0.9.0
cachetools 5.5.0
cattrs 24.1.2
celery 5.4.0
certifi 2024.8.30
cffi 1.17.1
chardet 5.2.0
charset-normalizer 3.4.0
click 8.1.7
click-didyoumean 0.3.1
click-plugins 1.1.1
click-repl 0.3.0
clickclick 20.10.2
colorama 0.4.6
colorlog 6.9.0
ConfigUpdater 3.2
connexion 2.14.2
cron-descriptor 1.4.5
croniter 5.0.1
cryptography 42.0.8
db-dtypes 1.3.0
decorator 5.1.1
Deprecated 1.2.14
dill 0.3.9
distlib 0.3.9
dnspython 2.7.0
docker 7.1.0
docstring_parser 0.16
elastic-transport 8.15.1
elasticsearch 8.15.1
email_validator 2.2.0
eventlet 0.37.0
filelock 3.16.1
Flask 2.2.5
Flask-AppBuilder 4.5.2
Flask-Babel 2.0.0
Flask-Caching 2.3.0
Flask-JWT-Extended 4.6.0
Flask-Limiter 3.8.0
Flask-Login 0.6.3
Flask-Session 0.5.0
Flask-SQLAlchemy 2.5.1
Flask-WTF 1.2.2
flower 2.0.1
frozenlist 1.5.0
fsspec 2024.10.0
gcloud-aio-auth 5.3.2
gcloud-aio-bigquery 7.1.0
gcloud-aio-storage 9.3.0
gcsfs 2024.10.0
gevent 24.10.3
google-ads 25.1.0
google-analytics-admin 0.23.2
google-api-core 2.22.0
google-api-python-client 2.151.0
google-auth 2.35.0
google-auth-httplib2 0.2.0
google-auth-oauthlib 1.2.1
google-cloud-aiplatform 1.71.1
google-cloud-appengine-logging 1.5.0
google-cloud-audit-log 0.3.0
google-cloud-automl 2.14.1
google-cloud-batch 0.17.31
google-cloud-bigquery 3.20.1
google-cloud-bigquery-datatransfer 3.17.1
google-cloud-bigtable 2.26.0
google-cloud-build 3.27.0
google-cloud-compute 1.20.1
google-cloud-container 2.53.0
google-cloud-core 2.4.1
google-cloud-datacatalog 3.21.1
google-cloud-dataflow-client 0.8.13
google-cloud-dataform 0.5.13
google-cloud-dataplex 2.3.1
google-cloud-dataproc 5.15.1
google-cloud-dataproc-metastore 1.16.0
google-cloud-dlp 3.25.0
google-cloud-kms 3.1.0
google-cloud-language 2.15.0
google-cloud-logging 3.11.3
google-cloud-memcache 1.10.0
google-cloud-monitoring 2.23.0
google-cloud-orchestration-airflow 1.15.0
google-cloud-os-login 2.15.0
google-cloud-pubsub 2.26.1
google-cloud-redis 2.16.0
google-cloud-resource-manager 1.13.0
google-cloud-run 0.10.10
google-cloud-secret-manager 2.21.0
google-cloud-spanner 3.49.1
google-cloud-speech 2.28.0
google-cloud-storage 2.18.2
google-cloud-storage-transfer 1.13.0
google-cloud-tasks 2.17.0
google-cloud-texttospeech 2.21.0
google-cloud-translate 3.17.0
google-cloud-videointelligence 2.14.0
google-cloud-vision 3.8.0
google-cloud-workflows 1.15.0
google-crc32c 1.6.0
google-re2 1.1.20240702
google-resumable-media 2.7.2
googleapis-common-protos 1.65.0
graphviz 0.20.3
greenlet 3.1.1
grpc-google-iam-v1 0.13.1
grpc-interceptor 0.15.4
grpcio 1.67.1
grpcio-gcp 0.2.2
grpcio-status 1.62.3
gunicorn 23.0.0
h11 0.14.0
h2 4.1.0
hpack 4.0.0
httpcore 1.0.6
httplib2 0.22.0
httpx 0.27.0
humanize 4.11.0
hvac 2.3.0
hyperframe 6.0.1
idna 3.10
ijson 3.3.0
immutabledict 4.2.0
importlib-metadata 6.11.0
importlib_resources 6.4.5
inflection 0.5.1
isodate 0.7.2
itsdangerous 2.2.0
Jinja2 3.1.4
jmespath 0.10.0
json-merge-patch 0.2
jsonpath-ng 1.7.0
jsonschema 4.23.0
jsonschema-specifications 2023.12.1
kombu 5.4.2
kubernetes 30.1.0
kubernetes_asyncio 30.1.0
lazy-object-proxy 1.10.0
ldap3 2.9.1
limits 3.13.0
linkify-it-py 2.0.3
lockfile 0.12.2
looker-sdk 24.18.1
lxml 5.3.0
Mako 1.3.6
markdown-it-py 3.0.0
MarkupSafe 3.0.2
marshmallow 3.23.1
marshmallow-oneofschema 3.1.1
marshmallow-sqlalchemy 0.28.2
mdit-py-plugins 0.4.2
mdurl 0.1.2
methodtools 0.4.7
microsoft-kiota-abstractions 1.3.3
microsoft-kiota-authentication-azure 1.1.0
microsoft-kiota-http 1.3.3
microsoft-kiota-serialization-json 1.0.0
microsoft-kiota-serialization-text 1.0.0
more-itertools 10.5.0
msal 1.31.0
msal-extensions 1.2.0
msgraph-core 1.1.6
msrest 0.7.1
msrestazure 0.6.4.post1
multidict 6.1.0
mysql-connector-python 9.1.0
mysqlclient 2.2.5
numpy 1.26.4
oauthlib 3.2.2
openlineage-integration-common 1.23.0
openlineage-python 1.23.0
openlineage_sql 1.23.0
opentelemetry-api 1.27.0
opentelemetry-exporter-otlp 1.27.0
opentelemetry-exporter-otlp-proto-common 1.27.0
opentelemetry-exporter-otlp-proto-grpc 1.27.0
opentelemetry-exporter-otlp-proto-http 1.27.0
opentelemetry-proto 1.27.0
opentelemetry-sdk 1.27.0
opentelemetry-semantic-conventions 0.48b0
ordered-set 4.1.0
packaging 24.1
pandas 2.1.4
pandas-gbq 0.24.0
paramiko 3.5.0
pathspec 0.12.1
pendulum 3.0.0
pip 24.2
platformdirs 4.3.6
pluggy 1.5.0
ply 3.11
portalocker 2.10.1
prison 0.2.1
prometheus_client 0.21.0
prompt_toolkit 3.0.48
propcache 0.2.0
proto-plus 1.25.0
protobuf 4.25.5
psutil 6.1.0
psycopg2-binary 2.9.10
pyarrow 18.0.0
pyasn1 0.6.1
pyasn1_modules 0.4.0
PyAthena 3.9.0
pycparser 2.22
pydantic 2.9.2
pydantic_core 2.23.4
pydata-google-auth 1.8.2
Pygments 2.18.0
PyJWT 2.9.0
PyNaCl 1.5.0
pyodbc 5.2.0
pyOpenSSL 24.2.1
pyparsing 3.2.0
python-daemon 3.1.0
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-http-client 3.3.7
python-ldap 3.4.4
python-nvd3 0.16.0
python-slugify 8.0.4
python3-saml 1.16.0
pytz 2024.2
PyYAML 6.0.2
redis 5.2.0
redshift-connector 2.1.3
referencing 0.35.1
requests 2.32.3
requests-oauthlib 1.3.1
requests-toolbelt 1.0.0
rfc3339-validator 0.1.4
rich 13.9.4
rich-argparse 1.6.0
rpds-py 0.20.1
rsa 4.9
s3transfer 0.10.3
scramp 1.4.5
sendgrid 6.11.0
setproctitle 1.3.3
setuptools 75.3.0
shapely 2.0.6
six 1.16.0
slack_sdk 3.33.3
sniffio 1.3.1
snowflake-connector-python 3.12.3
snowflake-sqlalchemy 1.6.1
sortedcontainers 2.4.0
soupsieve 2.6
SQLAlchemy 1.4.54
sqlalchemy-bigquery 1.12.0
SQLAlchemy-JSONField 1.0.2
sqlalchemy-redshift 0.8.14
sqlalchemy-spanner 1.7.0
SQLAlchemy-Utils 0.41.2
sqlparse 0.5.1
sshtunnel 0.4.0
starkbank-ecdsa 2.2.0
statsd 4.0.1
std-uritemplate 2.0.0
tabulate 0.9.0
tenacity 8.5.0
termcolor 2.5.0
text-unidecode 1.3
time-machine 2.16.0
tomlkit 0.13.2
tornado 6.4.1
typing_extensions 4.12.2
tzdata 2024.2
uc-micro-py 1.0.3
universal_pathlib 0.2.5
uritemplate 4.1.1
urllib3 2.2.3
uv 0.4.1
vine 5.1.0
virtualenv 20.27.1
watchtower 3.3.1
wcwidth 0.2.13
websocket-client 1.8.0
Werkzeug 2.2.3
wirerope 0.4.7
wrapt 1.16.0
WTForms 3.2.1
xmlsec 1.3.14
yarl 1.17.1
zipp 3.20.2
zope.event 5.0
zope.interface 7.1.1