What is the Konfuzio SDK?

Overview of the SDK

The Open Source Konfuzio Software Development Kit (Konfuzio SDK) provides a Python API to build custom document processes. For a quick introduction to the SDK, check out the Get Started section. Review the release notes and the source code on GitHub.



Get Started

Learn more about the Konfuzio SDK and how it works.


Learn how to build your first document extraction pipeline, speed up your annotation process, and much more.


Here are links to teaching material about the Konfuzio SDK.

API Reference

Get to know all major Data Layer concepts of the Konfuzio SDK.

Contribution Guide

Learn how to contribute, run the tests locally, and submit a Pull Request.


Review the release notes and the source code of the Konfuzio SDK.

Customizing document processes with the Konfuzio SDK

For documentation on how to train and evaluate document understanding AIs, and how to use a trained AI in the Konfuzio Server web interface, please see our Konfuzio Guide.

If you need to add custom functionality to the document processes of the Konfuzio Server, the Konfuzio SDK is the tool for you. You can customize pipelines for automatic document Categorization, File Splitting, and Extraction. These processes let you split stacked scans into separate Documents, categorize them, and extract information from them.
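To make the three stages more concrete, here is a minimal, purely conceptual sketch in plain Python. It does not use the Konfuzio SDK: `ToyDocument`, `split_stack`, `categorize`, `extract`, and `run_pipeline` are hypothetical names invented for illustration, and the splitting and extraction rules are toy assumptions. For the real classes and methods, see the API Reference.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document; not an SDK class."""
    pages: List[str]
    category: Optional[str] = None
    extracted: Dict[str, str] = field(default_factory=dict)


def split_stack(pages: List[str]) -> List[ToyDocument]:
    """File Splitting: start a new document at each page that opens with a title keyword."""
    documents: List[ToyDocument] = []
    for page in pages:
        starts_new = page.startswith(("INVOICE", "RECEIPT"))
        if starts_new or not documents:
            documents.append(ToyDocument(pages=[page]))
        else:
            documents[-1].pages.append(page)
    return documents


def categorize(document: ToyDocument) -> None:
    """Categorization: assign a category from the first page's title keyword."""
    first_page = document.pages[0]
    if first_page.startswith("INVOICE"):
        document.category = "invoice"
    elif first_page.startswith("RECEIPT"):
        document.category = "receipt"
    else:
        document.category = "unknown"


def extract(document: ToyDocument) -> None:
    """Extraction: pull a 'total' value out of the document text with a toy regex."""
    text = "\n".join(document.pages)
    match = re.search(r"Total:\s*([\d.]+)", text)
    if match:
        document.extracted["total"] = match.group(1)


def run_pipeline(pages: List[str]) -> List[ToyDocument]:
    """Run the stages in order: split the stack, then categorize and extract each document."""
    documents = split_stack(pages)
    for document in documents:
        categorize(document)
        extract(document)
    return documents


# A scanned stack of three pages that contains two documents.
stack = ["INVOICE 1\nItem A", "Total: 12.50", "RECEIPT 7\nTotal: 3.00"]
results = run_pipeline(stack)
for doc in results:
    print(doc.category, len(doc.pages), doc.extracted)
```

In a real setup, each stage would be backed by a trained AI rather than the keyword and regex rules assumed here; the sketch only shows how the stages feed into one another.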


Customizing document AI pipelines with the Konfuzio SDK requires a self-hosted installation of the Konfuzio Server.

Other use cases around documents

You can also use the Konfuzio SDK for other Document-related purposes, such as filling in PDF forms with a generator. For further information, check out our tutorial on how to create a PDF form generator.