DataPelago Nucleus Outperforms cuDF, Nvidia’s Data Processing Library, Raising The Roofline of GPU-Accelerated Data Processing

  • August 22, 2025

MOUNTAIN VIEW, Calif., Aug. 22, 2025 (GLOBE NEWSWIRE) — DataPelago today released new benchmarking results that show DataPelago Nucleus significantly outperforms Nvidia’s cuDF — a widely used open-source software library that runs on CUDA to speed up data processing — for compute-intensive operations on top of Nvidia GPUs. Nucleus, DataPelago’s universal data processing engine, seamlessly executes data processing tasks across heterogeneous hardware (from CPUs to GPUs), dramatically improving price/performance for key data processing workloads without requiring code or infrastructure changes.

As businesses manage growing volumes of complex data for ETL, business intelligence and GenAI workloads, CPU-based data processing alone can no longer keep pace. Nvidia GPUs offer massive parallelism and throughput advantages that make them ideal for accelerating these workloads. However, they also present unique challenges — such as I/O bottlenecks and limited GPU memory — that limit the amount of data that can be processed at once. To fully realize the benefits of GPUs and deliver better performance-per-dollar to accelerate adoption, data processing engines must be designed to leverage GPU strengths while compensating for their limitations.

Nucleus’ GPU-optimized execution layer was designed with this objective in mind. cuDF has long set the performance ceiling for data processing on these GPUs, but it does not efficiently handle complex, real-world workloads such as multi-key and variable-length string sorts. The benchmarking results show larger gains for Nucleus on these workloads than on simple, fixed-length data operations.

Nucleus overcomes these challenges with capabilities such as better parallel algorithms, fast flows for common workloads, optimized multi-column support, kernel fusion to accelerate complex expressions, and end-to-end string optimization with zero-copy shared memory management. This enables Nucleus to raise the roofline for performance on GPUs, unlocking greater value from existing accelerated infrastructure.

Initial benchmark results for real-world workloads include:

  • Complex Expressions: Nucleus is up to 10.5x faster for project operations, up to 10.1x faster for filter operations, and up to 4.3x faster for aggregate operations compared to cuDF.
  • Variable-Length Strings as a Data Type: For hash join operations, Nucleus achieves up to 38.6x higher throughput than cuDF for smaller strings and up to 4x higher throughput for larger strings. Nucleus also shows gains of up to 3.8x for hash aggregate operations and up to 5.9x for Top-K.
  • Multi-Column Support: Nucleus delivers up to 8.2x faster performance than cuDF for Top-K operations with multi-column keys.
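For readers unfamiliar with the operation, a Top-K over a multi-column key that includes a variable-length string can be sketched in plain Python as follows. This is only a CPU illustration of what the operation computes. It is not Nucleus or cuDF code, and the `Order` type, the `top_k` helper, and the sample rows are hypothetical names and data invented for this example.

```python
import heapq
from typing import NamedTuple


class Order(NamedTuple):
    # Hypothetical row layout, invented for illustration only.
    customer: str  # variable-length string column
    region: str
    amount: int    # fixed-length numeric column


def top_k(rows, k):
    """Return the k rows with the largest amount.

    Ties on amount are broken by customer name in ascending order,
    so the sort key spans two columns, one of them a string.
    """
    return heapq.nsmallest(k, rows, key=lambda r: (-r.amount, r.customer))


# Made-up sample data; not taken from the benchmark.
ORDERS = [
    Order("alice", "west", 120),
    Order("bob", "east", 300),
    Order("christopher-long-name", "west", 300),
    Order("dee", "north", 45),
]

if __name__ == "__main__":
    for row in top_k(ORDERS, 2):
        print(row.customer, row.amount)
```

Because two rows tie on `amount`, the string column decides their order, which is why multi-column keys with variable-length strings cost more to compare than a single fixed-length key.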

“While organizations deal with a tsunami of complex data, fortunately accelerated hardware like GPUs have become more readily available in today’s cloud environments. To take full advantage of the performance benefits possible with accelerated hardware, new approaches and non-linear thinking are required,” said Rajan Goyal, CEO of DataPelago. “We founded DataPelago to apply this non-linear thinking and create a new data processing standard for the accelerated computing era so that companies can overcome performance, cost and scalability limitations. These latest benchmark results are an example of how DataPelago is continuing to push this new standard forward.”

To learn more about the benchmarking results, read the full blog post: https://www.datapelago.io/resources/DataPelago-Nucleus-Vs-Nvidia-cuDF. To contact the DataPelago team, visit datapelago.ai.

About DataPelago
DataPelago is unleashing the data acceleration revolution that AI demands. Today, AI’s relentless hunger for data acceleration at massive scale has created the ultimate chokepoint — without economically scaled data processing, AI innovation itself will be throttled. At DataPelago, we’re unleashing breakthrough thinking to transform data processing economics and ignite the next wave of AI-powered revolution.

DataPelago Nucleus is the world’s first universal data processing engine built for accelerated computing, purpose-built to process any type of data, operate across any hardware, and support any query engine, delivering new price/performance benefits that make it viable to extract value from all the data in the world, igniting an AI-powered revolution.

DataPelago is backed by Eclipse, Taiwania Capital, Qualcomm Ventures, Alter Venture Partners, Nautilus Venture Partners, and Silicon Valley Bank, a division of First Citizens Bank. To learn more, visit datapelago.ai.

Media Contact
LaunchSquad for DataPelago
[email protected]

