Introduction

CDPG Anonkit is a toolkit for preprocessing, anonymising, and post-processing data. It was originally written as an application to run inside a Trusted Execution Environment (TEE) and was later developed into a Python package so that anyone can use it with any dataset.

Using the toolkit, you can perform the following operations:

  • Sanitisation
    • Clipping
    • Hashing
    • Suppression

  • Generalisation
    • Spatial Generalisation
    • Temporal Generalisation
    • Categorical Generalisation

  • Aggregation
    • Query Building
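
To make the operation names above concrete, here is a minimal sketch of what clipping, hashing, suppression, temporal generalisation, and a simple aggregation mean in practice. It uses only the Python standard library and does not call CDPG Anonkit: the sample records, the field names, the salt, and every helper function below are hypothetical and chosen purely for illustration. Spatial generalisation (which relies on H3) and categorical generalisation are left out to keep the example short.

```python
import hashlib
from collections import Counter
from datetime import datetime

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"user": "alice", "speed": 142.0, "ts": datetime(2024, 1, 5, 9, 47), "phone": "555-0101"},
    {"user": "bob", "speed": 38.5, "ts": datetime(2024, 1, 5, 9, 12), "phone": "555-0102"},
    {"user": "carol", "speed": -4.0, "ts": datetime(2024, 1, 5, 10, 30), "phone": "555-0103"},
]


def clip(value, lower, upper):
    """Clipping: bound a numeric value to the range [lower, upper]."""
    return max(lower, min(upper, value))


def hash_value(value, salt):
    """Hashing: replace an identifier with a salted SHA-256 hex digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def suppress(record, fields):
    """Suppression: return a copy of the record without the listed fields."""
    return {key: val for key, val in record.items() if key not in fields}


def generalise_time(ts):
    """Temporal generalisation: truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


sanitised = []
for rec in records:
    out = suppress(rec, {"phone"})
    out["user"] = hash_value(out["user"], salt="example-salt")
    out["speed"] = clip(out["speed"], 0.0, 120.0)
    out["ts"] = generalise_time(out["ts"])
    sanitised.append(out)

# Aggregation: count how many records fall into each generalised hour.
counts_per_hour = Counter(rec["ts"] for rec in sanitised)
```

Each helper handles one operation, so the loop reads as a small pipeline: suppress first, then hash and clip, then generalise, and aggregate only over the generalised values.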

Installation

CDPG Anonkit is available on PyPI. We recommend using the latest version of Python; CDPG Anonkit supports Python 3.10 and newer. We also recommend using a virtual environment to isolate your project's dependencies from other projects and from the system.

Install the most recent cdpg-anonkit version using pip:

$ pip install cdpg-anonkit
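
After installing, one way to confirm the setup is to check the interpreter version and look up the installed distribution from Python. The sketch below uses only the standard library (`sys` and `importlib.metadata`); the helper name `installed_version` is hypothetical, and the lookup assumes the distribution is registered under the name `cdpg-anonkit` used in the pip command above.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# The package supports Python 3.10 and newer.
python_ok = sys.version_info >= (3, 10)
anonkit_version = installed_version("cdpg-anonkit")

print(f"Python >= 3.10: {python_ok}")
print(f"cdpg-anonkit: {anonkit_version or 'not installed'}")
```

Running this inside the virtual environment you installed into shows whether that environment, rather than the system Python, has the package.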

Dependencies

These will be installed automatically when installing the package.

  • H3: A system to partition geographical areas into uniquely identifiable, hexagonal, hierarchical cells.

  • Pandas: A library for high-performance, easy-to-use data structures and data analysis tools.

  • NumPy: A package for scientific computing in Python.

  • typing_extensions: A complementary library to the standard typing module. Enables run-time support for type hints.

Developer Dependencies

These distributions are not installed automatically; they are installed only when you install the dev version of cdpg-anonkit.

  • Pytest: A framework for writing and running tests.