Overview
Continuously refined list.
Vignettes
Death by a Thousand Microservices
You're not Google. Keep it simple: an argument for macroservices.
Benchmarking xethub, dvc, lfs, lakefs
A somewhat biased view from XetHub, but data version control is a fascinating topic. The primary use case highlighted is versioning data alongside ML model development. The article compares several approaches and describes XetHub's attempt to simplify and scale data version control.
2024-01
- Data Engineering Handbook
- Testing in MLOps
- URL Design
- paradedb: ...ParadeDB introduces a new kind of table called the deltalake table. deltalake tables behave like regular Postgres tables but use a column-oriented layout via Apache Arrow and leverage Apache DataFusion, a query engine optimized for column-oriented data. This means that users can choose between row and column-oriented storage at table creation time...
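
To make the row versus column distinction in the paradedb note concrete, here is a minimal sketch in plain Python. It is not ParadeDB, Arrow, or DataFusion code; the sample data and the helper names (`to_columns`, `total_row_oriented`, `total_column_oriented`) are hypothetical and exist only for illustration. It shows the same records stored both ways and why an aggregate over one field touches less data in the column layout.

```python
# Row-oriented layout: one record per entry, all fields stored together.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Lima", "amount": 20.5},
    {"id": 3, "city": "Pune", "amount": 5.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def total_row_oriented(records, field):
    """Sum one field by visiting every full record."""
    return sum(record[field] for record in records)


def total_column_oriented(columns, field):
    """Sum one field by reading only that column's values."""
    return sum(columns[field])


if __name__ == "__main__":
    columns = to_columns(rows)
    print(columns["amount"])
    print(total_row_oriented(rows, "amount"))
    print(total_column_oriented(columns, "amount"))
```

Both totals agree; the difference is that the column version never reads `id` or `city`, which is the usual motivation for column-oriented storage in analytical queries.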
2024-02
- Makefile Projects
- Cloud Run: a sidecar is a separate container that runs alongside your main application container within the same pod. It acts like a helper application, providing additional functionalities without cluttering up your main application's code.
- CI-CD for ML
- AI explains repos
- Data Philosophy Blueprint for Data Architecture
- Makefile
2024-03
- Open Source DE Landscape
- debug vscode
- mojo open sourced
- bandit: Bandit is a tool designed to find common security issues in Python code.
- comprehensive list code quality
- dlt - data load tool: source to destination abstracted away
- scraper - botasaurus
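
As a small illustration of the kind of issue the bandit entry refers to: Bandit's B307 check flags calls to `eval` and points to `ast.literal_eval` as a safer choice. The sketch below shows both side by side; the function names are hypothetical and the example is only meant to show the pattern, not a complete input-handling strategy.

```python
import ast


def parse_setting_unsafe(text):
    # Bandit reports this call (check B307): eval runs arbitrary expressions.
    return eval(text)


def parse_setting_safe(text):
    # ast.literal_eval accepts only Python literals (numbers, strings,
    # lists, dicts, and similar) and raises on anything else.
    return ast.literal_eval(text)


if __name__ == "__main__":
    print(parse_setting_safe("[1, 2, 3]"))
    try:
        parse_setting_safe("__import__('os').getcwd()")
    except ValueError as exc:
        print("rejected:", exc)
```

Scanning a directory containing this file with `bandit -r .` would be expected to report the `eval` call in `parse_setting_unsafe`.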
2024-04
- mlops
- Code Examples from Professional Service Team
- Cloud Architecture Center
- Lakehouse
- CI/CD - Flask to Kub
2024-05
- Future of work: It means a transition from a knowledge economy to an allocation economy. You won’t be judged on how much you know, but instead on how well you can allocate and manage the resources to get work done.
- HTMX - Talk Python