Real-time Processing of Machine Data
Migrated machine-data pipeline from Databricks Connect to PySpark Streaming on Azure Databricks with Unity Catalog, enabling near-real-time processing of equipment telemetry.
Additional contributions:
- Stabilised pipeline under production load and introduced end-to-end observability.
- Configured pytest-based unit and integration testing with pre-commit hooks and mandatory PR quality gates.
- Delivered documentation-as-code for the project using MkDocs.
- Migrated development environment from Windows to Ubuntu via WSL2.