Big-Data

The Python Paradox: How Python Dominates Big Data Despite the GIL
reading time: 13 minutes
Discover why Python dominates big data despite the GIL: Python coordinates, C/Rust/JVM executes. Learn how NumPy, pandas, Polars, and PySpark bypass the GIL for true parallelism.
You Don't Know JSON: Part 6 - JSON Lines: Processing Gigabytes Without Running Out of Memory
reading time: 30 minutes
Standard JSON can’t stream - you must parse the entire document. JSON Lines solves this with one JSON object per line, enabling streaming processing, log aggregation, Unix pipelines, and handling gigabyte-scale datasets with constant memory usage.