Projects with this topic
End-to-end solution for data migration and analysis using Python, FastAPI, Kafka, and PostgreSQL. It implements an asynchronous data pipeline and a RESTful analytics API, all fully containerized with Docker Compose for easy, reproducible deployment.
Unified project demonstrating both batch analytics and real-time streaming pipelines with Apache Spark:
Batch (PySpark/Jupyter): Processed S&P 500 stock data, applied transformations, and ran distributed computations.
Streaming (Spark + Kafka): Built a streaming pipeline that consumes Kafka topics, processes messages in real time, and visualizes the outputs.
Deployed using Docker and Jupyter for reproducibility.
Airflow pipeline, built mainly for exploration and self-learning, that scrapes LINE Webtoon data and stores it in an external MySQL database. The ingested data is used to build a dashboard at https://ammarchalifah.com/webtoon-insights