Caravelo
Sector: Travel subscriptions
Senior Data Engineer
August 2023 - Present
Designed and implemented a data lake with Redshift, Kinesis and dbt to compute aggregations over data from subscription and financial sources (Stripe, Airwallex, Bluefin, etc.).
Implemented an automated ingestion and reporting pipeline for modeling, transforming, normalizing and serving data using Python and AWS services.
Data extraction from external sources (REST APIs, Hotjar integrations, Airwallex, SQL databases, etc.) using Python on AWS Lambda functions or ECS tasks.
Cross-team collaboration (ad hoc reports for fraud and insights, data forensics for outages, alignment between teams, roadmaps, etc.).
Data quality: monitoring and alerting on data anomalies, computing and serving business metrics, and detecting possible outages using automated tests and monitoring tools.
Data catalog: comprehensive documentation of our pipelines in an internal wiki (Confluence and MkDocs), plus metadata management.
DevOps practices: CI/CD with GitLab CI, unit and end-to-end tests with pytest, infrastructure as code with Terraform.