I’m having problems importing the azure-storage-blob package inside an Apache Airflow container.
My image: [screenshot]
My running container: [screenshot]
I installed azure-storage-blob, but when I execute a script I get the error ModuleNotFoundError: No module named 'azure.storage'; 'azure' is not a package.
airflow@707ab2142426:~$ python3 /usr/local/airflow/dags/azure.py
This is the script, azure.py, that I am trying to run:
from azure.storage.blob import BlobServiceClient
print("oiii")
Then I open a bash shell in the running container with the following command:
docker exec -it 707ab2142426 bash
Inside that shell I start Python, and the same import works without any error:
from azure.storage.blob import BlobServiceClient
This is the container's shell output after running the script (showing the error message) and after importing the package manually in the container's Python: [screenshot]
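To compare the two environments, I put together this small diagnostic (just a sketch, nothing specific to my setup); running it both as a script and from the REPL shows which interpreter is used and what import azure actually resolves to:
import importlib.util
import sys

# Which interpreter is running this code?
print(sys.executable)
# When Python runs a file, that file's own directory is put first on sys.path.
print(sys.path[0])

# Resolve 'azure' without importing it.
spec = importlib.util.find_spec("azure")
if spec is None:
    print("azure not found")
else:
    # For the real SDK, 'azure' is a namespace package, so origin is None
    # and the search locations point into site-packages; a path ending in
    # azure.py would mean some local file is being picked up instead.
    print(spec.origin)
    print(list(spec.submodule_search_locations or []))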
This is my Dockerfile:
# Base Image
FROM python:3.8
LABEL maintainer="precredito"
# Arguments that can be set with docker build
ARG AIRFLOW_VERSION=1.10.12
ARG AIRFLOW_HOME=/usr/local/airflow
# Export the environment variable AIRFLOW_HOME where airflow will be installed
ENV AIRFLOW_HOME=${AIRFLOW_HOME}
# Install dependencies and tools
RUN apt-get update -yqq && \
apt-get upgrade -yqq && \
apt-get install -yqq --no-install-recommends \
build-essential r-base \
wget \
libczmq-dev \
curl \
libssl-dev \
git \
inetutils-telnet \
bind9utils freetds-dev \
libkrb5-dev \
libsasl2-dev \
libffi-dev libpq-dev \
freetds-bin build-essential \
default-libmysqlclient-dev \
apt-utils \
rsync \
zip \
unzip \
gcc \
vim \
locales \
unixodbc-dev \
unixodbc \
&& apt-get clean
RUN apt-get update -y
#RUN pip install deployv==0.9.173
# RUN pip install build-essential==12.4ubuntu1
# RUN pip install install manpages-dev==5.09-2
RUN pip install psutil==5.7.3
RUN pip install python-dev-tools==2020.9.10
## Driver odbc
RUN apt-get update \
    && apt-get install -y \
        unixodbc \
        unixodbc-dev \
        unixodbc-bin \
        freetds-dev \
        freetds-bin \
        freetds-common \
        tdsodbc \
        libdbd-odbc-perl \
        liblocal-lib-perl \
    && apt-get install -y --reinstall build-essential
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
#Ubuntu 18.04
RUN curl https://packages.microsoft.com/config/ubuntu/18.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt-get update
RUN ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
RUN ACCEPT_EULA=Y apt-get install -y mssql-tools
RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
### End of the Microsoft ODBC driver installation
RUN python3.8 -m pip install pyodbc==4.0.30
COPY ./requirements-python3.8.txt /requirements-python3.8.txt
# Upgrade pip
# Create airflow user
# Install apache airflow with subpackages
RUN pip install --upgrade pip && \
useradd -ms /bin/bash -d ${AIRFLOW_HOME} airflow && \
pip install apache-airflow[microsoft.azure,azure_blob_storage,azure_cosmos,azure_data_lake,all_dbs,crypto,celery,postgres,kubernetes,docker]==${AIRFLOW_VERSION} --constraint /requirements-python3.8.txt
RUN python3.8 -m pip install azure-storage-blob --upgrade --force-reinstall
RUN python3.8 -m pip install azure-storage-blob --user
# Install R packages
RUN Rscript -e "install.packages('data.table')"
RUN Rscript -e "install.packages('tidyverse',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN Rscript -e "install.packages('caret',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN Rscript -e "install.packages('readxl',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN Rscript -e "install.packages('xgboost',dependencies=TRUE, repos='http://cran.rstudio.com/')"
# Copy the entrypoint.sh from host to container (at /, since WORKDIR is not set yet)
COPY ./entrypoint.sh ./entrypoint.sh
# COPY config/airflow.cfg ${AIRFLOW_USER_HOME}/airflow.cfg
# Set the entrypoint.sh file to be executable
RUN chmod +x ./entrypoint.sh
# Set the owner of the files in AIRFLOW_HOME to the user airflow
RUN chown -R airflow: ${AIRFLOW_HOME}
# Set the username to use
USER airflow
# Set workdir (it's like a cd inside the container)
WORKDIR ${AIRFLOW_HOME}
# Create the dags folder which will contain the DAGs
RUN mkdir dags
# Expose ports (just to indicate that this container needs to map port)
EXPOSE 8080
# Execute the entrypoint.sh
ENTRYPOINT [ "/entrypoint.sh" ]
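After the build, I can also check from inside the container's Python whether the package is visible to that interpreter; a quick sketch (note that find_spec imports parent packages, so it should be run from a directory that does not contain a file named azure.py):
import importlib.util

# Check each level of the package path; namespace packages report
# origin=None, which is fine; "NOT FOUND" is the real failure case.
for name in ("azure", "azure.storage", "azure.storage.blob"):
    spec = importlib.util.find_spec(name)
    print(name, "->", spec.origin if spec else "NOT FOUND")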
Why can’t I import a package that apparently gets installed without any errors? I’m new to this.
Thank you!