CINTIA I.
Data scientist working at the intersection of applied machine learning, database systems, and AI agents
I'm a data scientist based in São Paulo, Brazil. My work centres on network models of complex systems, machine learning, and, increasingly, computer vision and retrieval-augmented AI agents.
I build end-to-end, reproducible data pipelines (ingestion → transformation → analysis → interactive apps) and reapply the same architectural patterns across domains: public health, financial derivatives, real estate, and astronomy.
I am currently completing a BSc in Data Science alongside specialisations in robotics and applied mathematics. I am open to remote roles in ML engineering and applied AI.
Each project is an end-to-end pipeline with a live demo, dataset, or reproducible code. Click through for the full case study.
Graph-theory analysis of São Paulo's transit network on SPTrans GTFS data — 22K+ nodes, 29K+ edges. Finds critical bottlenecks and tests robustness under targeted vs random failure.
Sentiment analysis of São Paulo city-council speeches using transformer models over a full NLP pipeline — ingestion, cleaning, inference, and visual reporting.
A retrieval-augmented FAQ system for observational astronomy — modular Mistral-7B + FAISS pipeline over a curated Portuguese corpus, with a full evaluation report.
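As a rough illustration of the targeted-versus-random failure test mentioned for the transit project, here is a minimal sketch in plain Python. It is not the project's code: the toy hub-and-spoke graph, the function names (`largest_component_size`, `robustness_curve`, `build_toy_network`), the static degree ranking used for the targeted attack, and the largest-component fraction used as the robustness measure are all assumptions made for the example.

```python
import random
from collections import deque


def largest_component_size(adj, removed):
    """Size of the largest connected component, ignoring removed nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def robustness_curve(adj, order):
    """Fraction of all nodes left in the largest component after each removal."""
    total = len(adj)
    removed = set()
    curve = []
    for node in order:
        removed.add(node)
        curve.append(largest_component_size(adj, removed) / total)
    return curve


def build_toy_network():
    """Hypothetical toy graph: 3 hubs in a triangle, 5 stops attached to each hub."""
    adj = {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    hubs = ["hub0", "hub1", "hub2"]
    for i, hub in enumerate(hubs):
        add_edge(hub, hubs[(i + 1) % 3])
        for j in range(5):
            add_edge(hub, f"stop{i}_{j}")
    return adj


adj = build_toy_network()

# Targeted failure: remove the highest-degree nodes first (ties broken by name).
targeted_order = sorted(adj, key=lambda n: (-len(adj[n]), n))[:3]
# Random failure: remove the same number of nodes, chosen with a fixed seed.
random_order = random.Random(0).sample(sorted(adj), 3)

targeted = robustness_curve(adj, targeted_order)
rand = robustness_curve(adj, random_order)
print("targeted:", targeted)
print("random:  ", rand)
```

In this toy graph the three targeted removals hit the three hubs, so the largest component shrinks from 18 nodes to 12, then 6, then 1. A random removal of the same size can never do worse than that here, which is the contrast the robustness test is meant to expose.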
Where each project leans, at a glance. Cells are shaded against the per-row maximum: greener means stronger emphasis, pinker a lighter touch. It is the same idea as the autolab detail-score grid.
| Project | Data Engineering | ML / Modeling | Graphs / Networks | NLP | LLM / Agents | Vision / Perception | Robotics / Control | Applied Math | Deploy / Infra | Viz / App | Testing / Evals |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Networks & Graphs | | | | | | | | | | | |
| sp_public_transit | | | | | | | | | | | |
| NLP & Language | | | | | | | | | | | |
| legislative_nlp | | | | | | | | | | | |
| LLM & Agents | | | | | | | | | | | |
| astronomia_rag | | | | | | | | | | | |
Values are self-rated, qualitative estimates of where each project's effort concentrated; they are illustrative, not a measured benchmark. The shading engine (data-v → pink↔green) is the reusable part: drop real metrics in on any project page and the colours follow.
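The per-row shading idea can be sketched in a few lines of Python. This is only an illustration of the mapping, not the engine used on the page: the two endpoint colours, the linear interpolation, and the names `lerp_colour` and `shade_row` are assumptions, and the input list stands in for whatever a row's `data-v` values hold.

```python
# Hypothetical endpoint colours; the real palette is not given in the text.
PINK = (255, 192, 203)
GREEN = (46, 160, 67)


def lerp_colour(t):
    """Linearly blend from PINK (t = 0) to GREEN (t = 1), clamping t to [0, 1]."""
    t = min(1.0, max(0.0, t))
    return tuple(round(p + (g - p) * t) for p, g in zip(PINK, GREEN))


def shade_row(values):
    """Return one hex colour per value, scaled against the row's own maximum."""
    row_max = max(values, default=0)
    if row_max <= 0:
        # Nothing to scale against: shade every cell at the pink end.
        return ["#%02x%02x%02x" % PINK for _ in values]
    return ["#%02x%02x%02x" % lerp_colour(v / row_max) for v in values]


# Example row with made-up scores: the largest value gets the greenest cell.
print(shade_row([0, 5, 10]))
```

Because each row is normalised by its own maximum, a project with uniformly low scores still shows a clear strongest column, which fits a grid that compares emphasis within a project rather than across projects.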
Open to remote opportunities in ML engineering and applied AI. The fastest way to reach me is GitHub or LinkedIn.