Channel: Active questions tagged ubuntu - Stack Overflow

Deploying Airflow Locally via Helm with Custom Dockerfile

I'm facing difficulties trying to deploy Apache Airflow locally using Helm charts with a custom Dockerfile. Here is a detailed description of my problem:

Context
Environment: Local Kubernetes Cluster (Kind Cluster)
Orchestration Tool: Helm charts
Airflow Executor: KubernetesExecutor
Customization: Using a custom Dockerfile for Airflow

1) I created a Dockerfile:

FROM apache/airflow:2.8.0
USER root
USER airflow
RUN pip install apache-airflow-providers-apache-spark

2) Then I built the image:

sudo docker build -t my-custom/airflow:latest .

3) Loaded the image into kind:

sudo kind load docker-image my-custom/airflow:latest

4) Installed the chart with my values.yaml:

helm install airflow apache-airflow/airflow -f values.yaml

My values.yaml:

executor: KubernetesExecutor
airflow:
  image:
    repository: my-custom/airflow
    tag: latest
    pullPolicy: IfNotPresent
    pullSecret: ""
    uid: 50000
    gid: 0
config:
  core:
    load_examples: 'False'
    load_default_connections: 'False'
  webserver:
    expose_config: 'False'
  logging:
    remote_logging: 'True'
    remote_log_conn_id: "google_cloud_default"
    remote_base_log_folder: "gs://my-gcs-bucket"
dags:
  gitSync:
    enabled: true
    repo: #my external repo
    branch: main
    rev: HEAD
    subPath: dags
    depth: 1
    wait: 60
scheduler:
  replicas: 1
web:
  replicas: 1
  service:
    type: LoadBalancer
  resources:
    requests:
      cpu: 500m
      memory: 7Gi
    limits:
      cpu: 500m
      memory: 16Gi
  initContainers:
    - name: wait-for-scheduler
      image: busybox
      command: ['sh', '-c', 'until nslookup airflow-scheduler; do echo waiting for scheduler; sleep 2; done;']
  livenessProbe:
    initialDelaySeconds: 1800
    periodSeconds: 20
    timeoutSeconds: 5
    failureThreshold: 5
  readinessProbe:
    initialDelaySeconds: 1800
    periodSeconds: 20
    timeoutSeconds: 5
    failureThreshold: 5
  startupProbe:
    initialDelaySeconds: 300
    periodSeconds: 30
    timeoutSeconds: 5
    failureThreshold: 10
postgresql:
  enabled: false
data:
  metadataConnection:
    protocol: postgres
    host: 192.168.15.8
    port: 5432
    db: airflow
    user: postgres
    pass: airflow
flower:
  enabled: false
redis:
  enabled: false
triggerer:
  enabled: true
ingress:
  enabled: false
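Steps 2 and 3 can be sanity-checked independently of Helm. The following is only a sketch: it assumes the kind cluster uses the default name, so the node container is called kind-control-plane, and that the image entrypoint passes `airflow` subcommands through, as the official apache/airflow image does. The function names are made up for illustration.

```shell
# Assumptions (not from the original post): default kind cluster name,
# hence the node container "kind-control-plane".
IMAGE="my-custom/airflow:latest"
NODE="kind-control-plane"

# Print the Airflow version baked into the locally built image.
image_airflow_version() {
  docker run --rm "$IMAGE" airflow version
}

# Succeed only if the image shows up in the kind node's container runtime.
image_on_kind_node() {
  docker exec "$NODE" crictl images | grep -q "my-custom/airflow"
}
```

Running `image_airflow_version` should print the version from the Dockerfile's base image, and `image_on_kind_node && echo loaded` indicates whether `kind load docker-image` reached the node. For a cluster created with a non-default name, `kind load docker-image` needs the `--name` flag and the node container name changes accordingly.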

The chart installs successfully, but the install output reports a different Airflow version (2.9.3) than the one in my Docker image (2.8.0), and the Spark provider modules are not available when I open the Airflow UI.

NAME: airflow
LAST DEPLOYED: Thu Aug  1 17:33:21 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
**Thank you for installing Apache Airflow 2.9.3!**
Your release is named airflow.
You can now access your dashboard(s) by executing the following command(s) and visiting the corresponding port at localhost in your browser:
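The version in the install notes may come from the chart's defaults and not from the running image, so it is probably worth checking the pods directly. A minimal sketch, assuming the release name airflow and the default namespace shown above, and assuming the chart names the scheduler Deployment airflow-scheduler with a container called scheduler (both are assumptions and may differ between chart versions). The function names are hypothetical.

```shell
# Release name and namespace are taken from the helm output above.
NAMESPACE="default"
RELEASE="airflow"

# List each pod together with the image(s) its containers actually use.
pod_images() {
  kubectl get pods -n "$NAMESPACE" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .spec.containers[*]}{.image}{" "}{end}{"\n"}{end}'
}

# Show the user-supplied values Helm recorded for the release.
release_values() {
  helm get values "$RELEASE" -n "$NAMESPACE"
}

# Ask the scheduler for its Airflow version and installed providers.
# Assumption: Deployment "<release>-scheduler" with container "scheduler".
scheduler_report() {
  kubectl exec -n "$NAMESPACE" "deploy/${RELEASE}-scheduler" -c scheduler -- airflow version
  kubectl exec -n "$NAMESPACE" "deploy/${RELEASE}-scheduler" -c scheduler -- airflow providers list
}
```

If `pod_images` lists an apache/airflow image where my-custom/airflow:latest was expected, the image settings in values.yaml are likely not under the keys this chart reads. Comparing `release_values` with the output of `helm show values apache-airflow/airflow` would show which keys the chart actually defines.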

Has anyone faced a similar issue or has any idea what might be causing this behavior? Any tips on how to debug and resolve this issue would be greatly appreciated.

Thank you in advance for your help!

