WORKSHOP – Polars: Faster, Lighter, Smarter

Workshop introduction:

Data processing is a key part of a data scientist’s day-to-day job. It is commonly held that most data scientists spend more time processing and visualizing data than building models from it. Better downstream performance also often comes from data quality and robust data pipelines rather than from architectural improvements.

For several years, pandas has been the go-to open-source Python library for single-node data processing. However, in 2017 its creator, Wes McKinney, published a blog post entitled “Apache Arrow and the 10 things I hate about pandas”, in which he goes through several design choices made during the development of pandas and how he would approach them differently given the chance.

From this idea, polars was born.

Polars started out as a hobby project in 2020, but quickly gained traction within the open source community. Many developers were searching for an easy-to-use DataFrame library that was performant at the same time, and Polars set out to fill this void. The community grew fast as many contributors came in from various backgrounds and programming languages.

Today, polars is evolving rapidly and community adoption is very strong. At its current pace, it could overtake pandas in GitHub stars within several years.

Workshop summary:

The goal of this workshop is to introduce data scientists to the polars library and provide first examples to help them become familiar with it.

In this workshop, we will:

  1. Briefly introduce polars and the design choices associated with the library.
  2. Work our way through the documentation and basic functions / objects as a starter.
  3. Translate complex pandas pipelines to polars.
  4. Evaluate the performance gains on various tasks that a data scientist typically works on.

Whether or not you have heard of polars, let us convince you that this library is not something you want to miss.

Come and benefit from the experience of our team on this library.
