Tutorials

Welcome to the Developer’s Guide Tutorials section, where you’ll find a comprehensive set of tutorials to help you make the most of our powerful AI tools. These tutorials are designed to guide you through various aspects of Document processing, from data preparation to advanced techniques like Named Entity Recognition (NER) and barcode scanning.

Getting Started

  1. Example usage of main Konfuzio concepts

Learn how to work with the main Konfuzio concepts, such as Documents and Projects, and how a Project folder is structured.

  1. Data Preparation

Learn how to efficiently prepare your data for optimal processing. This tutorial covers data organization, cleaning, and formatting to ensure seamless integration with our AI models.

  1. Create, change and delete Annotations via API

Learn how to create, change, and delete different types of Annotations using methods from konfuzio_sdk.api.

Document Processing Essentials

  1. Categorize a Document manually

Learn how to manually assign and change the Category of a Document and of its Pages.

  1. Categorize a Document using Categorization AI

Master the art of categorizing Documents within your projects automatically. This tutorial provides step-by-step guidance on labeling and organizing Documents based on their content.

  1. Create your own Categorization AI

Build your own Categorization AI and define a custom architecture or reuse any external model. This tutorial explains how to construct a model class for Document Categorization that can later be reused in the Konfuzio app or in an on-prem installation.

  1. Build a Context-Aware File Splitting Model

Learn how to build a lightweight model for splitting a multi-Document stream of Pages. This tutorial is a step-by-step guide to the existing class in the Konfuzio SDK.

  1. Train and use a Context-Aware File Splitting Model

Familiarize yourself with a simple fallback logic for splitting a stream of Pages into separate sub-Documents.

  1. Create your own File Splitting AI

Build a custom File Splitting AI and define your own architecture or reuse any external model. This tutorial explains how to construct a model class for File Splitting that can later be reused in the Konfuzio app or in an on-prem installation.

  1. Evaluate the performance of a File Splitting AI

Learn how to work with the FileSplittingEvaluation class, assess the performance of Splitting AIs, and interpret the results.
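
To give a rough idea of what such an evaluation can measure, here is a minimal sketch in plain Python that compares true and predicted "first Page of a sub-Document" flags and derives precision, recall, and F1. It does not use the FileSplittingEvaluation class; the function name `split_metrics` and the choice of these three metrics are assumptions made for illustration only.

```python
from typing import Dict, List


def split_metrics(true_first_pages: List[bool], predicted_first_pages: List[bool]) -> Dict[str, float]:
    """Hypothetical helper: score predicted split points against the ground truth.

    Each list holds one flag per Page; True means "a new sub-Document starts here".
    """
    if len(true_first_pages) != len(predicted_first_pages):
        raise ValueError("Both lists must contain one flag per Page.")

    pairs = list(zip(true_first_pages, predicted_first_pages))
    tp = sum(1 for truth, pred in pairs if truth and pred)      # correctly predicted splits
    fp = sum(1 for truth, pred in pairs if not truth and pred)  # splits predicted where none exist
    fn = sum(1 for truth, pred in pairs if truth and not pred)  # real splits that were missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall, "f1": f1}


# Five Pages: the truth has splits at Pages 1 and 3, the prediction at Pages 1 and 4.
metrics = split_metrics(
    [True, False, True, False, False],
    [True, False, False, True, False],
)
```

A low precision here would mean the model splits too eagerly, while a low recall would mean it misses real Document boundaries.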

  1. Tokenization

Delve into the world of Document tokenization, a crucial step in text analysis. This tutorial explores various tokenization techniques and their applications.
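
As a small illustration of the idea, the sketch below splits a text on whitespace and records the character offsets of each token, using only Python's standard `re` module. It is not the SDK's tokenizer; the function name `whitespace_tokenize` is hypothetical.

```python
import re
from typing import List, Tuple


def whitespace_tokenize(text: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, token) for every run of non-whitespace characters."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


tokens = whitespace_tokenize("Invoice No. 4711")
```

Keeping the start and end offsets alongside each token makes it possible to map a token back to its exact position in the original text.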

  1. Information Extraction

Unlock the potential of extracting valuable information from unstructured text. This tutorial guides you through the process of identifying and labeling key details.

  1. Upload your AI model to use on Konfuzio app or an on-prem installation

Learn what to do with your model after you have built and trained it, and how to upload it via the API for use in production.

Advanced Techniques

  1. Named Entity Recognition

Take your text analysis to the next level with fast and accurate Named Entity Recognition using OntoNotes. This tutorial provides in-depth insights into NER techniques.

  1. Annual Reports Analysis

Learn how to extract critical insights from annual reports using our advanced AI models. This tutorial is ideal for financial analysts and researchers.

Specialized Applications

  1. Barcode Scanner

Explore the capabilities of our barcode scanning tool. This tutorial demonstrates how to effortlessly extract information from barcodes in Documents.

  1. PDF Form Generator

Learn how to dynamically generate PDF forms using our AI-powered tools. This tutorial is perfect for streamlining Document creation processes.

  1. Regex-based Annotations

Harness the power of regular expressions for precise Document Annotations. This tutorial guides you through the process of using regex patterns effectively.
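
A minimal sketch of the underlying idea, assuming a simple ISO-style date pattern: Python's standard `re` module finds every match and reports its character offsets, which is the information a text Annotation needs. The helper `find_spans`, the pattern, and the sample text are illustrative assumptions and not part of the Konfuzio SDK.

```python
import re
from typing import Dict, List, Union

# Assumed pattern for dates written as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def find_spans(text: str, pattern: "re.Pattern[str]") -> List[Dict[str, Union[int, str]]]:
    """Hypothetical helper: return start offset, end offset, and text of each match."""
    return [
        {"start": m.start(), "end": m.end(), "text": m.group()}
        for m in pattern.finditer(text)
    ]


spans = find_spans("Invoice date: 2023-05-17, due date: 2023-06-16.", DATE_PATTERN)
```

The word boundaries (`\b`) keep the pattern from matching digits embedded in longer numbers, which is usually the first thing to tighten when a regex produces false positives.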

Streamlined Operations

  1. Data Validation

Ensure the accuracy and integrity of your data with effective validation techniques. This tutorial provides best practices for maintaining high-quality data sets.

  1. Outlier Annotations

Discover how to identify and handle outliers in your Document processing pipeline. This tutorial offers strategies for accurate Annotations.

  1. Async Upload with Callback

Optimize your Document processing workflow with asynchronous upload and callback functionality. This tutorial shows how to use both to make large-scale operations more efficient.

Dive into these tutorials and elevate your Document processing capabilities. Whether you’re a beginner or an experienced developer, you’ll find valuable insights and practical techniques to enhance your projects. Happy coding!