Position:
Associate Professor/Senior Lecturer in Computing
Imperial College London
180 Queen’s Gate
London SW7 2RH, United Kingdom
Contact:
Email: hlgr@imperial.ac.uk
www: http://holger.pirk.name
Bio:
Before joining Imperial, I was a Postdoc in the Database group at MIT CSAIL. I spent my PhD years in the Database Architectures group at CWI in Amsterdam, resulting in a PhD from the University of Amsterdam in 2015. I received my master’s degree (Diplom) in computer science at Humboldt-Universität zu Berlin in 2010.
Research Interests
I am interested in all things data: analytics, transactions, systems, algorithms, data structures, processing models and everything in between. While some of my work targets “traditional” relational databases, my objective is to broaden the applicability of data management techniques. This naturally leads to research at the intersection of data management, compilers and computer architecture: I study the effective use of current and emerging hardware to improve the performance of data-intensive applications and abstractions to make them easier to program. This means targeting new applications like visualization, games, IoT and AI as well as new platforms like compilers, GPUs or FPGAs as well as hardware-conscious algorithms, new data processing paradigms, algebraic optimizations, cost models and code generation techniques.
Education
Ph.D. at the Database Architectures group at CWI/University of Amsterdam (2011 - 2014)
Thesis: Waste Not, Want Not! Managing Relational Data In Asymmetric Memories
· Advisors: Martin Kersten & Stefan Manegold
MSc. (Diplom) in computer science and psychology at Humboldt-Universität zu Berlin (2003 - 2010)
Thesis: Cache Conscious Data Layouting For In-Memory Databases
· Advisor: Ulf Leser
Professional Experience
06/2021 - today
Senior Lecturer at Imperial College, London, UK
· Faculty member in the Large-Scale Data & Systems Group
09/2017 - 05/2021
Lecturer at Imperial College, London, UK
· Faculty member of the Large-Scale Data & Systems Group
02/2018 - 09/2019
Consulting Researcher at Microsoft Research, Cambridge, UK
05/2016 - 08/2017
Visiting Researcher at the DMX group at Microsoft Research, Redmond, USA
· Studying the use of compression in in-memory Database Management Systems
12/2014 - 04/2017
Postdoctoral researcher at the Database group at MIT, Cambridge, USA
· Developed Voodoo, a processing algebra that enables the implementation of portable programs for data-intensive applications
· Contributed to Weld/Solder, A Common Runtime for High Performance Data Analytics
· Developed Loa, a domain-specific language for the optimization of dataflow programs
10/2010 - 09/2014
Ph.D. candidate at the Database Architectures group at CWI, Amsterdam
· Developed the Bitwise Decomposed Storage and Processing Model, a processing strategy for efficient CPU/GPU co-processing
· Developed and evaluated a cost-based storage optimizer for a high-performance in-memory database system (HyPeR)
· Contributed to the Data Vaults Framework, an on-demand data ingestion framework
· Contributed to SciQL, a query language for scientific applications that unifies relational and array data management (managed ERC FP7 reporting requirements)
· Developed the (at the time) fastest implementation of Database Cracking, an adaptive indexing strategy
10/2009 - 09/2010
Software architect at Kontacts IT-Solutions GmbH, Potsdam, Germany
· Developed a data ingestion system for quality control at the Mercedes Benz Commercial Vehicles Subdivision of Daimler AG
· Developed the mobile application for a German social network service
10/2009 - 02/2010
Teaching assistant at the Hasso-Plattner-Institute, Potsdam, Germany
· Mentored undergraduate students in the Advanced Software Engineering class
01/2009 - 09/2009
Research student at the EPIC group at the Hasso-Plattner-Institute, Potsdam, Germany
· Developed and implemented a cost-based storage optimization framework for an in-memory database system (HyRise)
03/2008 - 08/2008
Research assistant at the EPIC group at the Hasso-Plattner-Institute, Potsdam, Germany
· Contributed to the implementation of CestBON, a context aware, push-based data retrieval system
12/2007 - 10/2008
Research assistant at the Knowledge Management in Bio-Informatics group at Humboldt-Universität zu Berlin
· Developed an application for the management and visualization of protein interaction networks
12/2006 - 05/2007
Research assistant (remote) for the IBM Data Warehousing group (Berlin, Germany)
· Continued my contributions to the Cube Faceted Warehousing project
04/2006 - 10/2006
Research Intern at IBM Silicon Valley Labs, San Jose, CA
· Contributed to Cube Faceted Warehousing, a project to integrate faceted search capabilities into data warehousing (resulting in two patents)
10/2004 - 03/2006
Research assistant at the Fraunhofer FIRST, Berlin
· Developed an interactive tool to formalize textual use case descriptions
Publications
Conferences:
- Georgios Theodorakis, Peter Pietzuch and Holger Pirk. SCABBARD: Single Node Fault-Tolerant Stream Processing. In VLDB, 2021
- Georgios Theodorakis, Alexandros Koliousis, Peter Pietzuch and Holger Pirk. LightSaber: Efficient Window Aggregation on Multi-core Processors. In SIGMOD, 2020
- Georgios Theodorakis, Peter R. Pietzuch and Holger Pirk. SlideSide: A fast Incremental Stream Processing Algorithm for Multiple Queries. In EDBT, 2020
- Matheus Nerone, Stefan Manegold and Holger Pirk. Efficient Hard Real-Time Transaction Processing-Unachievable in Software. In CIDR, 2020
- Philippos Papaphilippou, Holger Pirk and Wayne Luk. Accelerating the merge phase of sort-merge join. In FPL, 2019
- Holger Pirk, Jana Giceva and Peter Pietzuch. Thriving in the No Man’s Land between Compilers and Databases. In CIDR, 2019
- Shoumik Palkar, James Thomas, Deepak Narayanan, Pratiksha Thaker, Rahul Palamuttam, Parimarjan Negi, Anil Shanbhag, Malte Schwarzkopf, Holger Pirk, Saman Amarasinghe, Samuel Madden, Matei Zaharia. Evaluating End-to-End Optimization for Data Analytics Applications in Weld. In VLDB, 2018
- Anil Shanbhag, Holger Pirk and Sam Madden. Efficient Top-K Query Processing on Massively Parallel Hardware. In SIGMOD, 2018
- Shoumik Palkar, James Thomas, Anil Shanbhag, Holger Pirk, Malte Schwarzkopf, Saman Amarasinghe and Matei Zaharia. Weld: A Common Runtime for High Performance Data Analytics. In CIDR, 2017
- Holger Pirk, Oscar Moll, Matei Zaharia and Sam Madden. Voodoo–A Vector Algebra for Portable Database Performance on Modern Hardware. In PVLDB, 2016
- Holger Pirk, Oscar Moll and Sam Madden. What Makes a Good Physical plan? Experiencing Hardware-Conscious Query Optimization with Candomble. In SIGMOD, 2016
- Steffen Zeuch, Holger Pirk and Johann-Christoph Freytag. Non-Invasive Progressive Optimization for In-Memory Databases. In PVLDB, 2016
- Yagiz Kargin, Martin Kersten, Stefan Manegold and Holger Pirk. The DBMS–Your Big Data Sommelier. In ICDE, 2015
- Holger Pirk. …like Commanding an Anthill: A Case for Micro-Distributed (Data) Management Systems. In SIGMOD Record Special Issue, 2015
- Holger Pirk, Stefan Manegold and Martin Kersten. Waste Not… Efficient Co-Processing Of Relational Data. In ICDE, 2014
- Max Heimel, Michael Saecker, Holger Pirk, Stefan Manegold and Volker Markl. Hardware-Oblivious Parallelism For In-Memory Column-Stores. In PVLDB, 2013
- Holger Pirk, Florian Funke, Martin Grund, Thomas Neumann, Ulf Leser, Stefan Manegold, Alfons Kemper, Martin Kersten. CPU And Cache Efficient Management Of Memory-Resident Databases. In ICDE, 2013
Workshops:
- George Theodorakis, Alex Koliousis, Peter Pietzuch and Holger Pirk. Hammer Slide: Work- and CPU-efficient Streaming Window Aggregation. In ADMS@VLDB, 2018
- Holger Pirk, Sam Madden and Mike Stonebraker. By their fruits shall ye know them: A Data Analyst’s Perspective on Massively Parallel System Design. In DaMoN@SIGMOD, 2015
- Anil Shanbhag, Holger Pirk and Sam Madden. Locality-Adaptive Parallel Hash Joins using Hardware Transactional Memory. In IMDM@VLDB, 2016
- Holger Pirk, Eleni Petraki, Stratos Idreos, Stefan Manegold and Martin Kersten. Database Cracking: Fancy Scan, Not Poor Man’s Sort! In DaMoN@SIGMOD, 2014
- Manolis Koubarakis et al. Building Virtual Earth Observatories Using Ontologies and Linked Geospatial Data. In WebRR, 2012
- Holger Pirk, Thibault Sellam, Stefan Manegold and Martin Kersten. X-Device Query Processing By Bitwise Distribution. In DaMoN@SIGMOD, 2012
- Konrad Bösche, Thibault Sellam, Holger Pirk, René Beier, Peter Mieth and Stefan Manegold. Scalable Generation Of Synthetic GPS Traces With Real-Life Data Characteristics. In TPCTC@VLDB, 2012
- Yagiz Kargin, Holger Pirk, Milena Ivanova, Stefan Manegold and Martin Kersten. Instant-On Scientific Data Warehouses - Lazy ETL for Data-Intensive Research. In BIRTE@VLDB, 2012
- Holger Pirk. Efficient Cross-Device Query Processing. In VLDB PhD Symposium, 2012
- Charalampos Kontoes et al. Operational wildfire monitoring and disaster management support using state-of-the-art EO and Information Technologies. In EORSA, 2012
- Holger Pirk, Stefan Manegold and Martin Kersten. Accelerating Foreign-Key Joins Using Asymmetric Memory Channels. In ADMS@VLDB, 2011
Patents
- Marion Behnen, Richard Cole, Qi Jin, Timo Pfahl and Holger Pirk: Data Analysis using Facet Attributes, US Patent, 2011
- Marion Behnen, Qi Jin, Timo Pfahl and Holger Pirk: Cube Faceted Data Analysis, US Patent, 2010
Awards & Funding
- 2021, Principal Investigator, EPSRC New Investigator Award “Bespoke Compression for General-Purpose Programming Languages”. Duration: 2 years
- 2020, Co-Investigator, Innovate UK Project “Energy Catalyst 7”. Duration: 2 years
- 2020, Co-Investigator, Innovate UK Project “HyAI - Hydrogen AI”. Duration: 1 year
- 2019, Principal Investigator, Oracle Labs-funded Project “Compression in GraalVM/Truffle”. Duration: 3.5 years
- 2018, Hardware grant from Intel/Altera
- 2017, Hardware grant from NVIDIA
Invited Talks
- High-performance multi-paradigm database systems, Huawei Science & Technology Day, Edinburgh, 2021
- Dark Silicon–A currency we do not control, KTH, Stockholm, 2020
- Dark Silicon–A currency we do not control, Invited Fresh Thinking Keynote at SIGMOD DaMoN Workshop, 2019
- Invited Participation, Microsoft Faculty Research Summit, 2018
- Hardware-Conscious Data Processing Systems, Universität des Saarlandes, 2018
- Hardware-Conscious Data Processing Systems, Technische Universität Dresden, 2018
- Hardware-Conscious Data Processing Systems, Technische Universität Dortmund, 2018
- Hardware-Conscious Data Processing Systems, Universität Tübingen, 2018
- Hardware-Conscious Data Processing Systems, University of Washington, 2018
- Hardware-Conscious Data Processing Systems, Oxford University, 2018
- Hardware-Conscious Data Processing Systems, SAP HANA Tech Days, 2018
- Voodoo - A Kernel For Database Performance Engineering, Harvard University, 2016
- Voodoo - A Kernel For Database Performance Engineering, Yale University, 2016
- Voodoo - A Kernel For Database Performance Engineering, Brown University, 2015
- A mind like water - Increasing DBMS Resilience without Sacrificing Performance, Imperial College London, 2015
- A mind like water - Increasing DBMS Resilience without Sacrificing Performance, EPFL, 2015
- Waste Not, Want Not - Efficient Co-Processing of Relational Data, ETH Zürich, 2014
- Waste Not, Want Not - Efficient Co-Processing of Relational Data, Oracle Labs, 2013
- Waste Not, Want Not - Efficient Co-Processing of Relational Data, IBM Almaden, 2013
- Hardware-Conscious Cost Modelling through the Ages, ETH Zürich, 2013
- Cache Conscious Data Layouting for In-Memory Databases, Humboldt-Universität zu Berlin, 2012
- Cache Conscious Data Layouting for In-Memory Databases, Technische Universität München, 2012
Teaching
- Nominated for the Student Choice Award for Outstanding Teaching in 2018, 2019 & 2020
- Advanced Databases at Imperial College, 2017 - 2022
- Performance Engineering at Imperial College, 2018 - 2022
- Introduction to Object-Oriented Programming at Imperial College, 2018 - 2021
- Contributed lectures to Database Systems at MIT, 2015 & 2016
- Teaching assistant for Advanced Software Engineering at HPI, 2009
- Software Engineering Best Practices at Humboldt-Universität zu Berlin, 2008
Academic Service
- ICDE Demonstration Chair 2022
- SIGMOD Reproducibility Co-Chair 2022
- General Chair of BICOD 2021 (Co-Chaired with Thomas Heinis)
- Area Editor for Information Systems, 2020-today
- Associate Chair for ICDE 2022
- Webchair of SIGMOD 2019
- Core Member of SIGMOD Program Committee 2019
- Member of VLDB Program Committee 2016, 2017, 2018, 2020, 2021
- Member of SIGMOD Program Committee 2016, 2017, 2018, 2020, 2021
- Member of ICDCS Program Committee 2018
- Member of ICDE Program Committee 2016, 2020
- Member of EDBT Program Committee 2020
- Member of ICDE Industrial Track Program Committee 2018
- Member of SIGMOD/DaMoN Program Committee 2015 & 2018
- Member of VLDB PhD Workshop Program Committee 2016
- Member of VLDB Demo Program Committee 2016
- Member of ICDE PhD Workshop Program Committee 2017
- Member of ICDE Demo Program Committee 2016
Supervision
PhD Students
- Hubert Mohr-Daurat: Homoiconic Data as a Basis for Data Cleaning – ongoing
- Fotios Kounelis: Transparent Compression in General Purpose Programming Languages – ongoing
- Ahmad Khazaie: Index Structures for Worst-case Optimal Joins – ongoing
- Giorgos Theodorakis: High-performance Stream Processing (jointly with Peter Pietzuch) – ongoing
Postdocs
- Andrea Piermarteri: Data Visualization using a Homoiconic Data Representation – ongoing
Master’s Students
- Hannes Hertach, MEng Computing: Query Compiler for a Symbolic Database Management System
- Tiger Wang, MEng Computing: Homoiconic Symbolically Distributed Processing
- Abel Shields, MEng Computing: Evaluating Symbolic Programs on GPUs
- Alexandru-Petre Cazan, MEng Computing: Symbolic Optimization of Database Queries
- Christopher Battarbee, MEng Computing: Profile-Guided Optimization using Database techniques
- Liam Pilot, MEng Computing: Accelerating Stream Processing with RDMA
- Mayank Surana, EE: BW-Trees on FPGAs and GPUs
- Marek Beseda, MEng Computing: NUMA-Aware Stream Processing
- William Woodacre, MEng: Deterministic concurrency control for transaction processing systems on FPGAs
- Emma Gospodinova, MEng: BW-Trees on FPGAs
- Zicong Ma, JMC: GamesBench: A Benchmark for Streaming Analytics of Strategy Games
- Oliver Brown, MEng Computing: Powerpipes
- Charith Amarasinghe, MSc: CLOPS: A Proxy Testbed for Cloud Storage
- Celie Valentiny, MSc: Personal Tracking Data Recommender/Awareness Demonstration for Imperial Festival
- Yao Chen, MSc: Predicting access latencies of modern storage devices
- Jeng Wong, MSc: Massively Parallel Stream Ingestion
- Andrew Chow, MSc: Personal Tracking Data Recommender/Awareness Demonstration for Imperial Festival
- Armand Cadet, MSc: Personal Tracking Data Recommender/Awareness Demonstration for Imperial Festival
Master’s-Level Group Projects
- Alexander Harkness et al., MEng Group Project: Developing a collaborative drawing app using CRDTs
- Kapilan Cholan et al., MEng Group Project: Developing a collaborative drawing app using CRDTs
- Jordan Spooner et al., MEng Group Project: Building an Efficient Query Processor by Generating OpenCL Code from Voodoo Vector Algebra
Bachelor’s Students
- Robert Moore, BEng Computing: Adaptive Compression for Graph Processing
- Ki Cheuk, BEng: Developing a hardware-conscious cost model for parallel, data-intensive applications
Ph.D. Assessment
- Timo Kersten, TU Muenchen, 2021
- Matthew Pugh, University of Edinburgh, 2021 (thesis currently in corrections phase)
- Christian Priebe, Imperial College London, 2020
University Service
- Coordinator for the departmental colloquium at the Department of Computing
- Co-chair of the Athena SWAN Committee (Imperial Computing currently holds Athena SWAN Bronze status and is working to upgrade to Silver)
- Member of the undergraduate admissions panel (until 2021)
- Departmental Knowledge Management Officer (since 2021)