CV – Álvaro González Sánchez

Profile

Backend and Data Engineer with a strong background in Python, API development and cloud-based data architectures. Experienced in building scalable backend applications, automation tools and CI/CD pipelines (GitHub Actions, Jenkins/Groovy), processing time-series data and automating complex workflows. Combines software engineering with an analytical engineering mindset and years of experience in modelling and data-intensive projects. Driven to build robust, secure and performant systems within a dynamic team.

Work Experience

Python Developer & Data Engineer 08/2025 – present

Self-employed

Designed, built and operate a production IoT/data platform — seven projects covering serverless and containerised ingestion, a secure Kafka device gateway, analytics pipelines on AWS and Azure Databricks, and an AI assistant (Claude API + MCP) — live at iot.gonzalezsanchez.dev (see Projects).
Run as a real product: CI/CD, infrastructure as code, observability (Datadog, Grafana Cloud) and security hardening on a self-hosted server; open-source fixes merged into Pydantic and PynamoDB.

Freelance Data Engineer 02/2025 – 07/2025

Self-employed — Peru & Bolivia

Data analysis and software development for projects in South America (environmental and water sector).
Processing and analysis of discrete and continuous datasets using Python.

Programmer 03/2023 – 01/2025

Link / Manage Count-e — Leuven

Backend development and maintenance of a school management system with a complex relational database (350+ tables).
Automation of administrative business processes in collaboration with multidisciplinary teams.
Worked with large structured datasets and contributed to the stability and extensibility of the system.

Groundwater & Surface Water Advisor 03/2022 – 12/2022

Antea Group Belgium

Analysis of hydrological data and groundwater flow modelling.
Spatial analyses using ArcGIS Pro and QGIS.
Advisory reports for policymakers and companies on water management.

Family care 2019 – 2021

Researcher – Hydrodynamic Modelling 05/2013 – 08/2019

Vrije Universiteit Brussel

Hydrodynamic modelling of the Zeeschelde estuary.
Development of numerical models in Fortran.

Special Academic Staff (BAP) 11/2009 – 04/2013

Vrije Universiteit Brussel — Brussels

Assistant Hydrology & Hydraulic Engineering 06/2009 – 07/2009

Soresma — Belgium

Hydraulic Engineer 08/2006 – 08/2007

Zorrilla Construcciones — Bolivia

Research Assistant & Programming Assistant 07/2004 – 07/2006

Universidad Mayor de San Simón — Bolivia

Projects

IoT Monitoring Platform — github.com/GonzalezSanchez/iot-monitoring-platform · iot.gonzalezsanchez.dev

Project 1a: Serverless Ingestion · 12/2025 – 01/2026

Python · AWS Lambda · API Gateway · DynamoDB · CloudFormation · GitHub Actions

Serverless REST API for real-time ingestion of sensor events (temperature, humidity, occupancy, motion) with threshold-based anomaly detection.
Clean layered architecture (models → services → repositories). Full infrastructure provisioned with CloudFormation. Deployed to AWS via CI/CD.

Project 1b: Containerised Ingestion · 01/2026 – 02/2026

Python · FastAPI · Docker · nginx · AWS DynamoDB · React · Cloudflare · OpenTelemetry · Datadog · Grafana Faro · GitHub Actions

Same business logic as 1a, redeployed as a containerised FastAPI application — demonstrating the separation of domain logic from infrastructure.
Live at iot.gonzalezsanchez.dev on a home server via Docker Compose + Cloudflare tunnel. React dashboard with 30s auto-refresh and live event submission.
Implemented end-to-end observability with OpenTelemetry auto-instrumentation and OTel Collector → Datadog APM: distributed traces with automatic DynamoDB child span detection, log-trace correlation, and Watchdog anomaly detection — zero manual instrumentation.
Frontend observability with the Grafana Faro Web SDK (real-user monitoring: Web Vitals, errors, sessions) — session data surfaced a production serialization bug that was then reproduced, fixed and regression-tested.
Docker images built and pushed to GitHub Container Registry on every merge to main.

Project 2a: Behavior Analyzer (AWS Serverless) · 02/2026 – 04/2026

Python · AWS Step Functions · Lambda · Aurora Serverless v2 · EventBridge · Terraform · Secrets Manager · GitHub Actions

Serverless ETL pipeline: extracts historical sensor data from DynamoDB, detects occupancy schedules, temperature trends and anomalies, stores results in Aurora Serverless v2 (PostgreSQL).
Full infrastructure provisioned with Terraform. Runs on-demand to minimise AWS costs (~$15/month while deployed, scales to zero when idle).
Unit, integration and regression tests (pytest + moto); 80%+ coverage enforced on every push via GitHub Actions.

Project 2b: Behavior Analyzer (Data Engineering) · 04/2026 – 05/2026

Python · Apache Airflow · PySpark · dbt · AWS S3 · PostgreSQL · GeoPandas · Power BI · OpenTelemetry · Grafana Cloud · Jenkins (Groovy) · Terraform · Docker

Data engineering pipeline with Medallion architecture (Bronze → Silver → Gold): raw Parquet (DynamoDB scan) → processed Parquet (validated, cleaned) → PostgreSQL serving layer via dbt (staging views + materialised marts with source tests).
PySpark analytics: occupancy schedule detection (window aggregation), temperature trend regression (Spark SQL regr_slope), z-score anomaly detection (stddev_pop — medium ≥ 3, high ≥ 5), spatial hotspot aggregation per building (GeoPandas, EPSG:4326).
Observability: OTel Collector → Grafana Cloud (Mimir metrics + Loki logs) — custom counters per pipeline stage, Airflow StatsD metrics, PostgreSQL metrics via postgres-exporter; pipeline dashboard versioned as code in the repo.
Orchestrated by Apache Airflow (weekly schedule). Deployed via a 9-stage Jenkins declarative CD pipeline written as a Groovy Jenkinsfile (deploy + destroy paths, smoke tests). Infrastructure provisioned with Terraform (S3 + IAM). Power BI report surfaced in the portfolio frontend at iot.gonzalezsanchez.dev.
94 unit tests (pytest + PySpark in-process). CI pipeline with PySpark tests (Java 17), Airflow DAG tests, dbt parse, and terraform validate — all green on every push.

Project 2c: Behavior Analyzer (Azure Databricks Lakehouse) · 05/2026 – 06/2026

Python · PySpark · Azure Databricks · Delta Lake · Unity Catalog · dbt-databricks · ADLS Gen2 · Terraform · Databricks Asset Bundles · GitHub Actions

Lakehouse pipeline on Azure Databricks: Bronze ingestion via Auto Loader, Silver transformation with the Write-Audit-Publish pattern (good records MERGEd idempotent, invalid records appended to quarantine — never deleted), Gold layer built with dbt-databricks incremental models: z-score anomaly detection (|z| > 2.5), hourly aggregations, dimensional models for rooms and buildings.
Full infrastructure as code with Terraform: ADLS Gen2, Databricks workspace (Premium), Access Connector (Managed Identity), Key Vault, Unity Catalog metastore, SQL Warehouse, budget alert. Orchestrated via Databricks Asset Bundles (DABs) — monthly job, 1st of every month 06:00 Brussels time.
Gold layer served live via FastAPI /lakehouse/* endpoints, visible in the portfolio dashboard at iot.gonzalezsanchez.dev.
43 tests, 92% coverage. CI: ruff + mypy + pytest + dbt parse + bundle validate + terraform validate — all green on every push.

Project 3: IoT Device Gateway (Kafka) · 07/2026

Python · FastAPI · Kafka (Redpanda) · aiokafka · DynamoDB · CloudFormation · Docker · Locust

Secure device gateway, live in production: device registration with bcrypt-hashed API keys, short-lived JWT session tokens, and per-device sliding-window rate limiting (configurable per device type).
Async FastAPI producer publishes sensor events to Kafka (Redpanda, keyed by device); a consumer group normalises messages to the platform's shared DynamoDB contract with idempotent writes — poison messages routed to a dead-letter queue with full error context.
Automation tooling: CLI device simulator (self-registering devices, JWT refresh, per-device stats) and a Locust load-test suite (steady vs. aggressive device scenarios) integrated into the CI pipeline. Validated against the production gateway: steady traffic clean (0 failures), an aggressive device throttled at the 60/min threshold with zero cross-device impact, zero dead-letter entries under load.
Infrastructure as code with CloudFormation (Devices table + least-privilege IAM), deployed via GitHub Actions. 36 tests, mypy clean.

Project 4: AI Assistant (Claude + MCP) · 06/2026 – 07/2026

Python · Claude API (Anthropic SDK) · MCP · fastapi-mcp · FastAPI · SSE · slowapi · Docker · React

Conversational AI layer over the live platform: Claude answers questions about the real sensor data by calling the platform's own REST APIs, exposed as 7 read-only MCP tools (fastapi-mcp, internal Docker network only).
Separate FastAPI service running a bounded agent loop (max 8 steps) on the streaming Claude API (Haiku 4.5) with the official MCP client; answers streamed token-by-token to the React chat tab via Server-Sent Events.
Security by design: least-privilege container (only the Anthropic key — no AWS/Databricks credentials), read-only tool allowlist, per-IP rate limiting behind Cloudflare, capped tokens and history, generic error events.
12 tests (agent loop, MCP client, API) against fakes — CI needs no network and no API key. Live at iot.gonzalezsanchez.dev.

Open Source

Pydantic · 10B+ downloads

Python · mypy plugin

Fixed silent acceptance of unknown mypy plugin config keys; added validation and clear warnings for both pyproject.toml and .ini formats, with test coverage. · PR #13149

PynamoDB · 2k+ stars

Python · AWS DynamoDB

Fixed a bug where Enum values were incorrectly rejected as attribute defaults due to an incomplete type validation list. · PR #1302

Education

MSc Water Resources Engineering

KU Leuven

2009 · Magna cum laude

Enterprise Java Developer

VDAB

2021–2022

BSc Engineering

Univ. Mayor de San Simón (Bolivia)

2004

Specialisation Environmental Management

Universidad de Beni (Bolivia)

2007

Effective Scientific Communication (6 ECTS)

Vrije Universiteit Brussel

2014

Statistics for PhD Students (6 ECTS)

Vrije Universiteit Brussel

2012

Certifications (2025 –)

LinkedIn · Databricks Certified Data Engineer Associate Cert Prep

LinkedIn · Data Engineering with dbt

LinkedIn · PySpark

LinkedIn · Mastering Observability with OpenTelemetry

LinkedIn · Monitoring and Observability with Datadog

LinkedIn · Apache Airflow

LinkedIn · Claude AI: Data Analysis, Programming, MCP

LinkedIn · Build with AI: API with CI/CD in Claude Code

IBM · Containers, Kubernetes & OpenShift

IBM · Cloud Native, DevOps, Agile & NoSQL

LinkedIn · Build REST APIs with FastAPI

MITx · Intro to CS & Programming – Python

MITx · Computational Thinking & Data Science

HarvardX · Intro to Data Science with Python

AWS · Developing Apps in Python on AWS

AWS · Getting Started with Data Analytics on AWS

IBM · Python for Data Engineering

Microsoft · Power BI: Data Models & Reports

Esri · Complete ArcGIS Pro Mastery

Full list on LinkedIn.

Skills

Backend & Development

Python, FastAPI, Flask, Groovy (Jenkins pipeline scripting), Java (Spring Boot), REST APIs, JSON, YAML, ETL pipelines, CLI tooling, OOP, TDD

AI / LLM Engineering

Claude API (streaming, tool use), MCP (server + client), fastapi-mcp, agent loops, SSE, LLM cost & abuse controls, responsible AI-assisted development (Claude Code)

Cloud & DevOps

AWS (Lambda, API Gateway, S3, DynamoDB, Aurora Serverless v2, CloudFormation, Step Functions, EventBridge, Secrets Manager, CloudWatch, SNS) · Azure (Databricks, ADLS Gen2, Key Vault, Event Hubs, Monitor) · Terraform, Docker, Docker Compose, Kubernetes, Nginx, GitHub Actions, Jenkins (Groovy declarative pipelines), Cloudflare, Linux (Ubuntu server administration), CI/CD, IaC

Data Engineering

SQL (PostgreSQL, MySQL), MongoDB · PySpark, Databricks, Delta Lake, Unity Catalog, Pandas, NumPy, Scikit-learn, GeoPandas · Apache Airflow, dbt, Auto Loader, Databricks Asset Bundles, ETL/ELT · Grafana Cloud, Power BI, Tableau

GIS

ArcGIS Pro, ArcGIS API for Python, QGIS, FME

Other

Git, Scrum/Agile, JUnit, Maven, HEC-RAS, MODFLOW, Fortran

Languages

Spanish — Native

English — C1

Dutch — C1

German — A2

French — A2

Álvaro GONZÁLEZ SÁNCHEZ

Profile

Work Experience

Projects

Open Source

Education

Certifications (2025 –)

Skills

Languages