Description
While reading /docs/source/development/testing.rst and trying to run the tests, I ran into the following problem and decided to fix it.

24/12/16 20:44:42 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
24/12/16 20:44:42 ERROR SparkContext: Error initializing SparkContext.

As the error message suggests, I took this to mean that the port was already in use, and changing the port resolved the error.
I updated the docs with the fix.
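If the failure really is a port conflict, one way to avoid it is to ask the operating system for a free port and hand it to Spark explicitly. The sketch below is only an illustration under that assumption, not the change made to the docs: `find_free_port` and `driver_port_conf` are hypothetical helper names, and `spark.driver.port` and `spark.driver.bindAddress` are the Spark properties assumed to matter here.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def driver_port_conf(host: str = "127.0.0.1") -> dict:
    """Build Spark settings that pin the driver to a free port (hypothetical helper)."""
    return {
        "spark.driver.bindAddress": host,
        "spark.driver.port": str(find_free_port(host)),
    }


if __name__ == "__main__":
    # Print the settings in the form accepted by spark-submit / pyspark.
    for key, value in driver_port_conf().items():
        print(f"--conf {key}={value}")
```

The port is released before Spark starts, so another process could in principle grab it in between; for a local test run that risk is usually acceptable.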
Kafka with Zookeeper
Install
brew install kafka
Homebrew’s default installation path differs by chip: on Macs with Apple Silicon, Kafka is installed under /opt/homebrew/Cellar.
Binaries and scripts will be in /opt/homebrew/bin
Kafka configurations will be in /opt/homebrew/etc/kafka
Zookeeper configurations will be in /opt/homebrew/etc/zookeeper
The log.dirs config (the location of Kafka data) will be set to /opt/homebrew/var/lib/kafka-logs
Set up the $PATH environment variable
To access the Kafka binaries easily, you can edit your PATH variable by adding the following line (adjusted for your system) to your system run commands (~/.
Configurations
Docker Image
We use the official Airflow image. The necessary libraries and packages have to be installed into the Airflow container, so we create a Dockerfile:

FROM apache/airflow:2.10.2
USER airflow
COPY requirements.txt /requirements.txt
RUN pip install -r /requirements.txt

confluent-kafka
cassandra-driver
pymongo

This Dockerfile starts from the airflow:2.10.2 image. Then, it will install all necessary libraries in the requirements.
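Once the image is built, it can be useful to confirm that the listed packages are actually importable inside the container. The following is a minimal sketch of such a check, not part of the original setup: `REQUIRED_MODULES` and `missing_packages` are hypothetical names, and the mapping from pip package names to import names is an assumption based on how these libraries are normally imported.

```python
from importlib.util import find_spec

# Assumed mapping from the pip package names in the requirements file
# to the top-level modules they install.
REQUIRED_MODULES = {
    "confluent-kafka": "confluent_kafka",
    "cassandra-driver": "cassandra",
    "pymongo": "pymongo",
}


def missing_packages(modules: dict) -> list:
    """Return the pip package names whose top-level module cannot be found."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

Running the script with the Python interpreter inside the Airflow container should report no missing packages if the `pip install` step succeeded.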
Chapter 9: Design a web crawler
A web crawler is also known as a robot or spider. It is widely used by search engines to discover new or updated content on the web. Content can be a web page, an image, a video, a PDF file, etc. A web crawler starts by collecting a few web pages and then follows links on those pages to collect new content.
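The "start from a few pages, then follow links" loop can be sketched as a breadth-first traversal. This is a simplified illustration rather than a production design: `FAKE_WEB` is a hypothetical in-memory stand-in for real pages, and `fetch_links` stands for whatever component downloads a page and extracts its links.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
FAKE_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}


def crawl(seeds, fetch_links, max_pages=100):
    """Breadth-first crawl: visit the seeds, then follow links to unseen pages."""
    frontier = deque(dict.fromkeys(seeds))  # de-duplicated seed URLs, in order
    seen = set(frontier)
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:  # never queue the same URL twice
                seen.add(link)
                frontier.append(link)
    return visited


if __name__ == "__main__":
    print(crawl(["a.example"], lambda url: FAKE_WEB.get(url, [])))
```

The `seen` set is what keeps the crawler from looping forever on pages that link back to each other, and `max_pages` bounds the crawl when the link graph is large.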
Install Helm chart
brew install helm
Install the Chart
{seilylook} 💎 minikube start
{seilylook} 💎 helm repo add apache-airflow https://airflow.apache.org
"apache-airflow" has been added to your repositories
{seilylook} 💎 helm repo list
NAME URL
apache-airflow https://airflow.apache.org
Upgrade the Chart
{seilylook} 💎 helm upgrade --install airflow apache-airflow/airflow --namespace airflow --create-namespace
{seilylook} 💎 ~/Development/Devlog main ± kubectl get pods -n airflow -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
airflow-postgresql-0 1/1 Running 0 9m10s 10.
Install and start Minikube
Install Minikube
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-darwin-amd64
sudo install minikube-darwin-amd64 /usr/local/bin/minikube
Start the Minikube cluster and check the status
{seilylook} 🚀 minikube start
😄 minikube v1.33.0 on Darwin 14.6.1 (arm64)
✨ Using the docker driver based on existing profile
👍 Starting "minikube" primary control-plane node in "minikube" cluster
🚜 Pulling base image v0.