Links
LakeSail rewrote Apache Spark in Rust, removing the JVM layer. The new implementation reportedly runs eight times faster and cuts infrastructure costs by 94%.
This article presents jsongrep, a tool for querying JSON documents efficiently using a DFA-based approach. It explains the tool's features, how it processes queries, and benchmarks its performance against other JSON querying tools.
chDB packages ClickHouse as a Python library for DataFrame operations, eliminating serialization overhead and running fast SQL queries directly on pandas DataFrames. The latest version is 87 times faster than its predecessor, thanks to zero-copy data handling and optimized processing.
DuckDB outperforms Polars on large datasets, particularly at 1 TB of data. DuckDB manages memory and execution effectively thanks to its robust design, while Polars struggles at that scale and runs into out-of-memory errors.