Abstract—Process mining, a sub-field of data science
focusing on the analysis of event data generated during the
execution of (business) processes, has seen a tremendous change
over the past two decades. In the early 2000s, tool support
was limited or nonexistent; nowadays, several software tools
exist, both open-source, e.g., ProM and Apromore, and
commercial, e.g., Disco, Celonis, and ProcessGold. The
commercial process mining tools provide limited support for
implementing custom algorithms. Moreover, both commercial
and open-source process mining tools are often only accessible
through a graphical user interface, which hampers their usage in
large-scale experimental settings. Initiatives such as RapidProM
provide process mining support in the scientific workflow-based
data science suite RapidMiner. However, such initiatives offer
limited to no support for algorithmic customization. In this
paper, we present Process Mining for Python (PM4Py), a
novel process mining library that aims to
bridge this gap, providing integration with state-of-the-art data
science libraries, e.g., pandas, numpy, scipy and scikit-learn. We
provide a global overview of the architecture and functionality
of PM4Py, accompanied by some representative examples of its
usage.