Changelog

All notable changes in the Konfuzio Server will be documented according to the principles defined by Keep a Changelog.

The changelog adheres to Calendar Versioning: each release tag reflects the date and time at which the changes were released to app.konfuzio.com.
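
As an illustration of the tag format, a release tag such as released-2024-09-05_06-54-53 can be parsed into a timestamp. The sketch below uses hypothetical helpers (parse_release_tag and is_older are not part of Konfuzio Server or the SDK) and assumes that every tag follows the released-YYYY-MM-DD_HH-MM-SS pattern seen in this changelog.

```python
from datetime import datetime

# Assumed tag layout, inferred from the tags listed in this changelog.
TAG_FORMAT = "released-%Y-%m-%d_%H-%M-%S"


def parse_release_tag(tag: str) -> datetime:
    """Return the release timestamp encoded in a tag (hypothetical helper)."""
    return datetime.strptime(tag, TAG_FORMAT)


def is_older(tag_a: str, tag_b: str) -> bool:
    """Return True if tag_a was released before tag_b."""
    return parse_release_tag(tag_a) < parse_release_tag(tag_b)
```

A comparison like this may help when checking whether a self-hosted installation predates a release with mandatory migration steps, for example released-2023-04-23_18-48-59.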

Self-hosted Konfuzio Server can be upgraded according to the documentation.

Planned

You can think of the Planned section as a Roadmap that lists Konfuzio Server features our team is actively working on.

  • Delta Training (Partial Fit) of an existing classifier, so that training documents used previously can be deleted (Internal Ticket).

  • Allow administrators of Konfuzio on-premise installations to run a speedtest (Internal Ticket).

  • Start automatic AI retraining after a User confirms that they have finished an Annotation review (Internal Ticket).

  • Add the original file URL to the Document endpoint of the API, so that the original PDF can be downloaded (Internal Ticket).

Next release

Estimated release date 18th September 2024.

released-2024-09-05_06-54-53

This version uses the Konfuzio Python SDK in version v.0.3.15 and Konfuzio Document Validation UI (DVUI) in version v.0.1.38.

Added

Changed

  • The API documentation now uses empty defaults for ‘Assignee’ and ‘Callback URL’ when creating a Document (Internal Ticket).

  • The maximum length of the Label and Label-Set names is now 1000 characters (Internal Ticket).

  • For new Projects, Extraction AIs will now be dockerized (Internal Ticket).

released-2024-08-11_09-56-15

This version uses the Konfuzio Python SDK in version v.0.3.14 and Konfuzio Document Validation UI (DVUI) in version v.0.1.36.

Added

Fixed

  • An issue in the Document Validation UI (DVUI) with displaying missing Annotations (Internal Ticket).

  • An issue that prevented Extraction AIs from running properly (Internal Ticket).

released-2024-07-01_18-11-29

This version uses the Konfuzio Python SDK in version v.0.3.11 and Konfuzio Document Validation UI (DVUI) in version v.0.1.35.

Fixed

  • From the SDK: Context Aware Splitting AI returning Splits with no Category Annotations (Internal Ticket).

  • Bug triggered when category names contain non-ASCII characters (Internal Ticket).

Added

  • In the DVUI: lasso selection in Document with moving box to select text entities (Internal Ticket).

released-2024-06-10_08-44-22

This version uses the Konfuzio Python SDK in version v.0.3.9 and Konfuzio Document Validation UI (DVUI) in version v.0.1.33.

Fixed

  • From the SDK: Missing documents when SDK project is initialized (Internal Ticket).

  • In the DVUI: no Documents to navigate through in the DVUI (Internal Ticket).

  • Read only mode issue preventing Splitting AI training to complete successfully (Internal Ticket).

  • Member with the role “Reader” can view other members of the project (Internal Ticket).

  • “/” character in Project name leads to training error (Internal Ticket).

Added

released-2024-05-19_14-58-11

This version uses the Konfuzio Python SDK in version v.0.3.6 and Konfuzio Document Validation UI in version v.0.1.32.

Fixed

  • When clicking on an Annotation link, the user was always redirected to the SmartView Document Viewer instead of the Document Viewer set as default on the Project level (Internal Ticket).

  • From the SDK: normalization of floats with 4+ digits after decimal separators (Internal Ticket).

  • From the SDK: backward compatibility when loading a project from a folder (Internal Ticket).

Changed

released-2024-05-06_06-47-41

This version uses the Konfuzio Python SDK in version v.0.3.4 and Konfuzio Document Validation UI in version v.0.1.31.

Fixed

  • Bug for creating a snapshot due to the data type used for saving the snapshot size (Internal Ticket).

  • Return 404 instead of 500 for non-existing documents (Internal Ticket).

Added

  • In the DVUI: scroll to and permanently highlight annotation when opening the DVUI through an annotation link (Internal Ticket).

Changed

released-2024-04-18_20-18-02

This version uses the Konfuzio Python SDK in version v.0.3.3 and Konfuzio Document Validation UI in version v.0.1.30.

Fixed

Added

released-2024-04-09_08-05-05

This version uses the Konfuzio Python SDK in version v.0.3.1 and Konfuzio Document Validation UI in version v.0.1.29.

Added

  • Make the Document details page accessible in the DVUI (Internal Ticket).

  • Define the file type for files that the user can select at Document upload (Internal Ticket).

  • Error message for an unfilled Captcha when submitting the sign-up form (Internal Ticket).

Fixed

Changed

  • Remove Categorization and Splitting AI sections from the Project settings for pretrained Projects (Internal Ticket).

  • Refactor the design and text of error messages for the registration form (Internal Ticket).

  • Adapt Server to the project’s new structure (Internal Ticket).

released-2024-03-22_11-51-15

This version uses the Konfuzio Python SDK in version v.0.2.47 and Konfuzio Document Validation UI in version v.0.1.27.

Fixed

Changed

  • The Original file instead of the Sandwich is used for uploading the Sample Document in a pretrained Project (Internal Ticket).

  • The search in the Marketplace is performed also on the name of the Listing (Internal Ticket).

  • Links for the Annotations in the Document List view redirect to the default Document Viewer, as set in the Project Settings (Internal Ticket).

  • Supporting the Project’s new file structure, coming from upgrading to API v3 (Internal Ticket).

released-2024-03-12_09-39-12

This version uses the Konfuzio Python SDK in version v.0.2.46 and Konfuzio Document Validation UI in version v.0.1.27.

Added

  • Support for uploading PDFs with broken xref attribute (Internal Ticket).

  • Show evaluations of the training Data Set for Extraction AIs (Internal Ticket).

Fixed

  • Missing character in the email address for sending Documents per email - using the email link in the Document View (Internal Ticket).

  • In the DVUI: cannot add Annotations with overlapping bounding boxes (Internal Ticket).

  • In the DVUI: fix the height of the Annotation details tooltip when the Annotation does not fill the full window height (Internal Ticket).

  • User with Reviewer Role is not able to see Documents assigned to him/her (Internal Ticket).

  • Labels cannot be sorted in Label Sets through the UI (Internal Ticket).

Changed

released-2024-02-22_06-47-40

This version uses the Konfuzio Python SDK in version v.0.2.45 and Konfuzio Document Validation UI in version v.0.1.26.

Added

Fixed

Changed

  • Include support for “thick client” when using Oracle as DB (Documentation).

released-2024-02-15_21-37-11

This version uses the Konfuzio Python SDK in version v.0.2.44 and Konfuzio Document Validation UI in version v.0.1.26.

Added

  • Add an “AI Model” filter to the “Smart View” Document Viewer & the API, so that annotators can compare the results of different AI Models more easily (Internal Ticket).

Fixed

  • Make it possible to set assignee to None via the API v3 (Internal Ticket).

  • Fixed “Document could not be processed. min() arg is an empty sequence” error message for training Extraction AI (Internal Ticket).

  • In the DVUI: the user should be able to complete the Review even though there is a Label Set without a Label (Internal Ticket).

released-2024-02-07_09-05-06

This version uses the Konfuzio Python SDK in version v.0.2.43 and Konfuzio Document Validation UI in version v.0.1.25.

Added

  • Showing the number of pages per Document in the Document View (Internal Ticket).

  • Billing: the user can enable/disable exceeding the page and rate limits. The page limit is a limit for the total amount of pages processed in a month. The rate limit is a limit for the total amount of pages processed per hour. Exceeding the limits adds additional costs (Internal Ticket).

  • In the DVUI: Show the Label name when hovering over an Annotation in the DVUI (Internal Ticket).

  • In the DVUI: added a search tooltip for the labels in the right navigation panel. Through the tooltip the user can directly search the Document with the label as a keyword (Internal Ticket).

  • In the DVUI: search option in the Document Container for searching the Document for keywords (Internal Ticket).

  • Users have to request access to the API. Users with permissions can grant API access. A notification email is sent upon token creation (Internal Ticket).
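
To make the billing entry above more concrete, the interplay between the monthly page limit, the hourly rate limit and the switch for exceeding them can be sketched as below. This is only an illustration under assumptions: PlanLimits and may_process are hypothetical names, and the changelog does not describe how the Server actually enforces the limits.

```python
from dataclasses import dataclass


@dataclass
class PlanLimits:
    """Hypothetical container for the two limits; not a Konfuzio API."""

    pages_per_month: int   # page limit: total pages processed in a month
    pages_per_hour: int    # rate limit: total pages processed per hour
    allow_exceeding: bool  # user setting to enable/disable exceeding the limits


def may_process(limits: PlanLimits, pages_this_month: int,
                pages_this_hour: int, new_pages: int) -> bool:
    """Decide whether a Document with new_pages pages may be processed."""
    over_page_limit = pages_this_month + new_pages > limits.pages_per_month
    over_rate_limit = pages_this_hour + new_pages > limits.pages_per_hour
    if over_page_limit or over_rate_limit:
        # Exceeding a limit is only possible when enabled (and adds costs).
        return limits.allow_exceeding
    return True
```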

Fixed

  • Training of Extraction AI failed (Internal Ticket).

  • ValueError during Extraction, where an attempt to convert the string ‘1111..’ to a float fails (Internal Ticket).

  • Fixed issue with Categorization AI Training failing - changed default training parameters (Internal Ticket).
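
The ValueError mentioned above is what Python's built-in float() raises for a string such as '1111..'. A minimal sketch of a defensive conversion follows; to_float_or_none is a hypothetical name, and this is not the normalization code used by the Server or the SDK.

```python
from typing import Optional


def to_float_or_none(text: str) -> Optional[float]:
    """Convert text to a float, returning None instead of raising ValueError."""
    try:
        return float(text)
    except ValueError:
        # Malformed numbers such as '1111..' end up here.
        return None
```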

released-2024-01-10_13-24-43

Added

Fixed

released-2023-12-06_18-32-50

This version uses the Konfuzio Python SDK in version v.0.2.42 and Konfuzio Document Validation UI in version v.0.1.21.

Added

  • In the DVUI: annotations can be easily moved from one label set to another (Internal Ticket).

  • Allow usage of transformers.AutoTokenizer in combination with BERT models for Categorization (Internal Ticket).

  • In the Project settings: allow the user to choose the Document viewer for all Documents in the project. Possible values: SmartView, TextView and Document Validation UI. Default value: SmartView (Internal Ticket).

Fixed

  • Reduce loading time for opening marketplace listings (Internal Ticket).

  • Switching between accounts: logging out the currently logged-in user when accepting an invitation for another account (Internal Ticket).

  • Error appearing when changing the category for Documents with existing annotations (Internal Ticket).

  • Inconsistency for the accuracy of Categorization AIs in the UI and the training logs (Internal Ticket).

  • Unable to delete a Project: prompted to delete not existing Document (Internal Ticket).

  • Solve document classification error for BERT-based text-only models (Internal Ticket).

released-2023-11-20_11-34-47

This version uses the Konfuzio Python SDK in version v.0.2.39 and Konfuzio Document Validation UI in version v.0.1.20.

Added

  • Extraction quality per Document: each Document has an extraction score (Internal Ticket).

  • Add a filter to the list of Documents to find Documents that need to be revised by humans. The Documents are filtered based on their extraction quality (e.g. see bottom 10% of documents based on the extraction quality) (Internal Ticket).

  • For Splitting AI: added a model that works solely with textual information (Internal Ticket).

  • For self-hosted installations: introduce splitting documents based on blank pages. Consecutive blank pages are grouped with the consecutive non-blank pages preceding them in one document (Internal Ticket).
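
The blank-page grouping rule above can be sketched as follows. The function name and the input format (one boolean per page) are assumptions made for illustration; how the Server detects blank pages is not covered here. Leading blank pages with no preceding content are assumed to form a group of their own.

```python
from typing import List


def split_on_blank_pages(is_blank: List[bool]) -> List[List[int]]:
    """Group page indices so blank pages stay with the pages before them.

    A new group starts whenever a non-blank page follows a blank page.
    """
    groups: List[List[int]] = []
    for index, blank in enumerate(is_blank):
        starts_new_group = (
            not groups  # the first page always opens a group
            or (not blank and is_blank[index - 1])
        )
        if starts_new_group:
            groups.append([index])
        else:
            groups[-1].append(index)
    return groups
```

For a seven-page upload where pages 2, 3 and 5 (zero-based) are blank, the sketch yields three documents: pages 0 to 3, pages 4 to 5, and page 6.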

Fixed

Changed

  • Async Workflows for Superusers: starting Superuser Workflows does not block the UI, so further actions can be taken (Internal Ticket).

  • Extraction logs are added when an extraction throws an exception (Internal Ticket).

  • UI changes to the Accept, Decline & Reject action buttons in the DVUI for a clearer and more intuitive understanding of what each action does (Internal Ticket).

  • Increase the maximum PDF dimensions for uploading a file (Internal Ticket).

  • Using ‘cpu’ as a device by default in Splitting AI (Internal Ticket).

  • For documents split with the Splitting AI: the PDF has the content of the split file and not of the one used for the splitting (Internal Ticket).

released-2023-11-01_12-12-49

This version uses the Konfuzio Python SDK in version v.0.2.37 and Konfuzio Document Validation UI in version v.0.1.18.

Fixed

  • Problem when creating a single annotation for a long text block (Internal Ticket).

  • Fixed unbound error: referencing local variable before assignment (Internal Ticket).

  • Fixed normalization problem for numbers with 4 digits after the decimal separator (Internal Ticket).

  • Fixed error occurring when uploading a document (Internal Ticket).

Updated Documentation

released-2023-10-19_09-39-24

This version uses the Konfuzio Python SDK in version v.0.2.36 and Konfuzio Document Validation UI in version v.0.1.17.

Added

Changed

  • Upgraded the base image from python:3.8.18-slim-bullseye to the more recent Debian version python:3.8.18-slim-bookworm for improving security: this version has a reduced number of known Common Vulnerabilities and Exposures (CVEs) (Internal Ticket).

  • Tesseract 5 is now used, instead of tesseract 4: enhances OCR capabilities and ensures you get the best text recognition experience (Internal Ticket).

Fixed

  • Error messages are properly shown in the language set for the server app (Internal Ticket).

  • IntegrityErrors are not visible to the user, as expected (Internal Ticket).

  • Annotations created by the user in the SmartView appear as accepted in the DVUI (Internal Ticket).

  • When an Annotation is rejected/deleted from the SmartView, it also disappears from the DVUI (Internal Ticket).

  • Coming-soon error on the Swagger API (Internal Ticket).

released-2023-10-04_18-41-03

This version uses the Konfuzio Python SDK in version v.0.2.35 and Konfuzio Document Validation UI in version v.0.1.16.

Added

Fixed

released-2023-09-19_21-25-05

This version uses the Konfuzio Python SDK in version v.0.2.33 and Konfuzio Document Validation UI in version v.0.1.15.

Fixed

  • The Document Callback in API V3 now uses the Document representation of API V3 (Internal Ticket).

released-2023-09-07_13-35-31

This version uses the Konfuzio Python SDK in version v.0.2.32 and Konfuzio Document Validation UI in version v.0.1.13.

Added

Changed

released-2023-08-28_18-13-01

This version uses the Konfuzio Python SDK in version v.0.2.29 and Konfuzio Document Validation UI in version v.0.1.12.

Added

Changed

  • Deactivate “Unfilled Labels” on the Category detail page to improve page load performance (Internal Ticket).

released-2023-08-24_05-48-16

This version uses the Konfuzio Python SDK in version v.0.2.29 and Konfuzio Document Validation UI in version v.0.1.12.

Added

Changed

  • Autoconfirm the Category of a Document in case only one Category is available (Internal Ticket).

  • Send the Plan limit reached email only once per day (Internal Ticket).

Fixed

  • Allow Superusers without a Project to access Flower, Usage, and Queue views (Internal Ticket).

  • When using Google Kubernetes Engine (GKE) the log level is now detected correctly (Internal Ticket).

  • When creating a User via the web interface as Superuser, not all Permissions were applied (Internal Ticket).

released-2023-08-10_21-33-41

This version uses the Konfuzio Python SDK in version v.0.2.28 and Konfuzio Document Validation UI in version v.0.1.11.

Fixed

  • An issue that prevented non-standard-compliant PDFs from being uploaded (Internal Ticket).

released-2023-08-07_13-49-58

This version uses the Konfuzio Python SDK in version v.0.2.28 and Konfuzio Document Validation UI in version v.0.1.11.

Added

Changed

Fixed

  • Correct a typo in file splitting notifications (Internal Ticket).

  • Allow opening a link to a Document that does not belong to the currently selected Project (Internal Ticket).

released-2023-07-22_17-14-51

This version uses the Konfuzio Python SDK in version v.0.2.26 and Konfuzio Document Validation UI in version v.0.1.10.

Added

Fixed

released-2023-07-11_16-07-40

This version uses the Konfuzio Python SDK in version v.0.2.25 and Konfuzio Document Validation UI in version v.0.1.9.

Fixed

released-2023-07-10_06-37-03

This version uses the Konfuzio Python SDK in version v.0.2.24 and Konfuzio Document Validation UI in version v.0.1.9.

Added

Changed

Fixed

  • In some cases Label Sets could not be deleted (Internal Ticket).

  • The sorting by number of training and test Documents on the Extraction AI list page (Internal Ticket).

  • Documents could not be processed for specific OCR settings (Internal Ticket).

  • When using the Sentence or Paragraph Tokenizer, the Training process did not complete (Internal Ticket).

released-2023-06-27_21-39-25

This version uses the Konfuzio Python SDK in version v.0.2.23 and Konfuzio Document Validation UI in version v.0.1.9.

Changed

released-2023-06-25_15-20-35

This version uses the Konfuzio Python SDK in version v.0.2.22 and Konfuzio Document Validation UI in version v.0.1.9.

Fixed

  • In some cases AI trainings stopped with the status “Contact Support” (Internal Ticket).

released-2023-06-15_18-32-26

This version uses the Konfuzio Python SDK in version v.0.2.20 and Konfuzio Document Validation UI in version v.0.1.8.

Changed

  • The Non-Strict Evaluation is now used for Extraction AIs by default (Internal Ticket).

  • For self-hosted environments, the Konfuzio Server can now run for 24 hours without contacting the License Server (Internal Ticket).

Fixed

  • Large TIFF files (> 100 Pages) can now be processed (Internal Ticket).

  • Applying a filter on the Label List view now provides correct results (Internal Ticket).

  • An error which prevented the manual rotation of a Page from working correctly (Internal Ticket).

released-2023-05-30_11-01-48

This version uses the Konfuzio Python SDK in version v.0.2.19 and Konfuzio Document Validation UI in version v.0.1.7.

Added

Fixed

  • Show the exact Page number in case a PDF has invalid dimensions (Internal Ticket).

  • Subscription updates are now applied to previously created API Tokens (Internal Ticket).

  • The Evaluation could not be displayed, if a Label in the training or test data did not have at least one Annotation (Internal Ticket).

  • If a Document was created via API V3 and the “sync” option, not all extractions were returned (Internal Ticket).

  • If a Document was created via API V3, the default extraction URL was pointing to API V1 instead of API V3 (Internal Ticket).

released-2023-05-22_12-48-00

This version uses the Konfuzio Python SDK in version v.0.2.18 and Konfuzio Document Validation UI in version v.0.1.6.

Fixed

  • Improved performance of the Annotations list webpage (Internal Ticket).

  • Improved performance of Document Search in the SmartView (Internal Ticket).

released-2023-05-17_20-51-37

This version uses the Konfuzio Python SDK in version v.0.2.18 and Konfuzio Document Validation UI in version v.0.1.6.

Fixed

  • Extraction AI could not be migrated because the Category was not associated automatically (Internal Ticket).

  • Improved loading time of the Extraction AI list (Internal Ticket).

released-2023-05-13_19-27-00

This version uses the Konfuzio Python SDK in version v.0.2.18 and Konfuzio Document Validation UI in version v.0.1.6.

Added

Changed

Fixed

  • Fixed the Bbox retrieval for blank Documents (Internal Ticket).

  • Opening the Task Log of an ongoing AI training caused an Error (Internal Ticket).

  • Failed Quality Assurance during AI training showed the wrong status (Internal Ticket).

  • In rare cases, some PDF files were wrongly identified as corrupted during upload (Internal Ticket).

  • The Project export was not including the API name of Labels (Internal Ticket).

released-2023-05-02_12-09-37

This version uses the Konfuzio Python SDK in version v.0.2.17 and Konfuzio Document Validation UI in version v.0.1.5.

Please Note: If you upgrade from a version before ‘released-2023-04-23_18-48-59’ you must conduct the migration steps described in the release notes of released-2023-04-23_18-48-59.

Added

Changed

Fixed

  • In some cases, Documents got stuck in a “Queing for..” status when restarting Konfuzio Server (Internal Ticket).

  • In some cases, new Annotation could not be created (Internal Ticket).

  • The deletion of a Project did not complete (Internal Ticket).

  • When creating Annotations, Label Sets from other Categories have been suggested (Internal Ticket).

released-2023-04-23_18-48-59

This version uses the Konfuzio Python SDK in version v.0.2.16 and Konfuzio Document Validation UI in version v.0.1.5.

Important note: This release changes the internal format of saved AIs. Therefore, you need to migrate existing AIs before updating to this Konfuzio Server version. Please run “python manage.py resave_all_with_cloudpickle” to do so. If this command is not available on your Konfuzio Server installation, please upgrade to released-2023-03-18_13-32-19 first. In case you need help or experience an issue with the migration, please contact us via https://konfuzio.com/support. This Konfuzio Server version will not start if unmigrated AIs are present. Finally, the usual update actions need to be run. For more information on how to run a “manage.py” command, please refer to the self-hosted guide.

Added

  • Calculate and access Tokenizers via the web interface (Internal Ticket).

  • Sort Labels in Label-Sets to allow Users to customize the UI per Category (Internal Ticket).

  • Improved training time of Extraction AIs when using the word detection mode (reduced up to 50%) (Internal Ticket).

Fixed

  • A bug when training with character detection mode, which was tokenizing some labels incorrectly, causing them to be skipped during extraction (Internal Ticket).

  • A bug during the extraction post-processing steps, which was causing the first line items of each page to be skipped (Internal Ticket).

released-2023-03-18_13-32-19

This version uses Konfuzio AbstractExtractionAI in version v.0.3.23, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Document Validation UI in version 0.1.3.

Added

Fixed

  • AI-Guests can now re-categorize Document using the API (Internal Ticket).

released-2023-03-06_21-09-18

This version uses Konfuzio AbstractExtractionAI in version v.0.3.23, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Document Validation UI in version 0.1.2.

Added

released-2023-02-17_14-27-57

This version uses Konfuzio AbstractExtractionAI in version v.0.3.23, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Document Validation UI in version 0.1.1.

Added

Fixed

  • When using Keycloak, logging out now also terminates the Keycloak session (Internal Ticket).

  • Handle the upload of corrupted Documents with a proper error message (Internal Ticket).

  • When using an AI in a Project other than the one it was trained on, a potential conflict with existing Labels is now avoided (Internal Ticket).

released-2023-02-08_10-58-15

This version uses Konfuzio AbstractExtractionAI in version v.0.3.23, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Capture Vue in version 0.1.1.

Added

Fixed

released-2023-01-23_22-14-45

This version uses Konfuzio AbstractExtractionAI in version v.0.3.23, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Capture Vue in version 0.1.0.

Added

Changed

Fixed

  • Proper error handling when a Page number is used in the API which does not exist (Internal Ticket).

released-2023-01-23_14-32-08

This version uses Konfuzio AbstractExtractionAI in version v.0.3.23, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Capture Vue in version 0.1.0.

Added

Changed

released-2023-01-15_19-35-28

This version uses Konfuzio AbstractExtractionAI in version v.0.3.23, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Document Validation UI in version 0.1.0.

Added

Fixed

  • Restrict the maximum auto-deletion time of a Document to a maximum of 5 years (Internal Ticket).

  • The ordering of Annotations Sets in the API V3 (Internal Ticket).

released-2022-12-22_11-03-21

This version uses Konfuzio AbstractExtractionAI in version v.0.3.21, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Document Validation UI in version 0.0.11-pre-release-5.

Fixed

  • Re-running the categorization now also re-runs the extraction (Internal Ticket).

released-2022-12-20_11-23-04

This version uses Konfuzio AbstractExtractionAI in version v.0.3.21, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Document Validation UI in version 0.0.11-pre-release-5.

Added

Fixed

  • Show a message that informs a User if their account was deactivated (Internal Ticket).

  • Failed login attempts were not shown in the web interface (Internal Ticket).

  • Missing German translation on the Member list page (Internal Ticket).

  • Improve performance of Document List Endpoint in API V3 by excluding large Document attributes (Internal Ticket).

released-2022-12-05_19-18-47

This version uses Konfuzio AbstractExtractionAI in version v.0.3.21, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Document Validation UI in version 0.0.11-pre-release-1.

Please note: When you upgrade to this version (or a newer one), we recommend running “python manage.py init_email_templates”, as the email templates have been updated. This needs to be run after the usual update actions.

Added

Changed

Fixed

  • In a very rare case text embeddings could not be extracted from Documents (Internal Ticket).

  • The error handling for invalid PDF Documents (Internal Ticket).

  • The notification email template for AI trainings was not considering errors in the training process (Internal Ticket).

  • callback_url is now called if re-extraction is triggered on a Document (for example, when the Category changes) (Internal Ticket).

  • Fix an issue that prevented the full deletion of one line of text in multiline Annotations (Internal Ticket).

  • Fix a missing placeholder in an email template (Internal Ticket).

  • Improved loading time of the Category list view (Internal Ticket).

  • The assignee filter for the Document List now requires Permissions to view the Members of a Project (Internal Ticket).

released-2022-11-16_12-13-49

Fixed

  • Speedup the Document List page for Superuser (Internal Ticket).

  • The Annotation creation on empty areas in a Document is now possible (Internal Ticket).

released-2022-11-11_13-19-29

This version uses Konfuzio AbstractExtractionAI in version v.0.3.21, the Konfuzio Python SDK in version v.0.1.16 and Konfuzio Document Validation UI in version 0.0.10-pre-release-7.

Please note: When you upgrade to this version (or a newer one) you need to run “python manage.py init_user_permissions”. This needs to be run after the usual update actions.

Added

Changed

Fixed

  • The numbering of Annotation Sets in the SmartView does not consider deleted Annotation Sets anymore (Internal Ticket).

  • In some situations a Project could not be deleted via API (Internal Ticket).

  • In specific scenarios, the deletion of the last remaining Annotation in a Document was not possible (Internal Ticket).

  • The SmartView did not use rotated pages due to a caching problem (Internal Ticket).

  • The arrow in the Project- and language selector was not clickable (Internal Ticket).

  • On the Annotation list Page the Category filter was not showing all Annotations (Internal Ticket).

  • Fix an issue where the Category- and Document API V3 endpoint did not include all relevant Label-Sets (Internal Ticket).

  • Fix an issue that prevented specific SmartView messages to be dismissed (Internal Ticket).

  • Fix an issue that prevented null values to be passed to the API v3 annotation creation endpoint (Internal Ticket).

released-2022-10-28_07-23-39

Added

Fixed

  • Accepting an Annotation was overwriting already existing custom offset strings (Internal Ticket).

  • In some cases an Extraction AI training was failing when detection mode ‘Character’ was selected (Internal Ticket).

released-2022-09-21_12-00-31

This version uses Konfuzio AbstractExtractionAI in version v.0.3.22, the Konfuzio Python SDK in version v.0.1.15 and Konfuzio Document Validation UI in version 0.0.8.

Fixed

  • Prevent an issue where a popup window could not be closed when using the SmartView (Internal Ticket).

  • Filtering for “feedback required” on Document overview (Internal Ticket).

released-2022-09-04_09-11-18

This version uses Konfuzio AbstractExtractionAI in version v.0.3.21, the Konfuzio Python SDK in version v.0.1.15 and Konfuzio Document Validation UI in version 0.0.8.

Added

Changed

Fixed

  • Top annotation filter in the SmartView now considers unrevised annotations.

  • Fix an issue in which a Document cannot be processed because negative bounding boxes are detected.

  • Fix an issue which caused the processing time to be shown as negative.

released-2022-07-28_15-55-29

This version uses Konfuzio AbstractExtractionAI in version v.0.3.21, the Konfuzio Python SDK in version v.0.1.15 and Konfuzio Document Validation UI in version 0.0.6.

Changed

  • Speed up runtime of Extraction AIs

Fixed

  • Fix an issue which causes some Extraction AIs to crash on multipage documents.

  • Fix an issue that prevents the calculation of bounding boxes for small or slightly rotated characters.

released-2022-07-25_21-20-48

Added

  • Allow setting a default assignee for uploaded documents

  • Allow notifying users via email when they get assigned to documents

Fixed

  • Top annotation filter in the SmartView now takes accepted Annotations into account

  • Error messages in case a document could not be processed are now displayed correctly

released-2022-07-19_16-30-46

Changed

  • New Extraction AIs are saved in a more efficient way

released-2022-07-05_19-35-21

This version uses Konfuzio AbstractExtractionAI in version v.0.3.15, the Konfuzio Python SDK in version v.0.1.15 and Konfuzio Document Validation UI in version 0.4.0.

Added

  • Show the user who started an AI training on the detail page of an AI

  • Allow setting a time (in days) after which documents are automatically deleted

  • Allow rotating pages via API

  • Add thumbnail images for document pages

Changed

  • Links to deleted annotation will now redirect to the respective document

released-2022-06-10_15-32-19

This version uses Konfuzio AbstractExtractionAI in version v.0.3.15 and the Konfuzio Python SDK in version v.0.1.15.

Added

  • Option to enforce running OCR even if text embeddings are present

  • Improved error messages in case a document cannot be processed

  • Option to exclude email content when using the email-integration

  • Option to make document accessible via public link

  • Beta Version of APIV3

  • Beta Version of new document dashboard (based on Konfuzio Document Validation UI)

  • Automatic rotation of pages (#8980)

Changed

  • For on-premise Users, PostgreSQL 10 is now the minimum version

  • Improved Extraction AI

  • On-premise containers now run as non-root and use a read-only filesystem

  • Improved mouse pointer in the SmartView

Fixed

  • An issue where empty Annotation Sets could appear on Documents

  • An issue where conflicting annotations could be created

  • An issue where negative annotations were not correctly being deleted (#9127)

  • Rare cases where OCR text included some characters multiple times

released-2022-04-27_14-23-38

Added

  • Add assignee attribute of a Document to the API

released-2022-03-15_09-14-17

Changed

  • “Rerun extraction” via the user interface applies new annotations now also to training and test documents

released-2022-02-11_23-12-26

Added

  • Add option to filter for related annotation sets

Fixed

  • Sorting of annotation sets in the csv export

  • Document API endpoint returning declined annotations

released-2022-01-18_11-08-24

Added

  • Added api_name to Label API

Fixed

  • Link to documentation page

  • Missing translation on document list page

  • Evaluation did not complete for AIs with a large amount of training data

released-2021-12-11_14-33-57

Added

  • For on-premise installations, the OCR method for new projects is chosen based on the available OCR solutions.

  • For on-premise installations, the project import now considers declined annotations.

  • For on-premise installations, Superusers can see the Konfuzio Server version and how many pages and documents have been processed.

released-2021-11-21_19-14-19

Added

  • Text summarization endpoint.

  • Categorization AI parameters in the Project view

Fixed

  • An issue where the reload after uploading new documents does not happen

released-2021-11-26_08-11-36

This version uses Konfuzio AbstractExtractionAI in version v.0.3.0. We recommend using the Konfuzio Python SDK in version 0.1.15.

Added

  • “Sentence” option to the available detection modes

Fixed

  • An error where an invalid date in the document text stopped the training process

released-2021-11-23_18-14-28

Fixed

  • E-mails without an attachment have not been processed.

released-2021-11-16_23-02-22

Fixed

  • CSV export for ProRis by Inveos

released-2021-11-05_09-55-10

Added

  • Allow deletion of characters of an annotation without excluding it from the training process

released-2021-11-01_23-19-58

Added

  • An option to specify the category of a document when uploading it via API (and thereby skipping the categorization)

Changed

  • The GET document API endpoint now returns the annotation displayed in the SmartView (instead of only showing the extraction AI results)

released-2021-10-25_20-12-18

This version uses Konfuzio AbstractExtractionAI in version 2021-10-20_18-29-25. We recommend using the Konfuzio Python SDK in version 0.1.10.

Added

  • CSV export compatible with ProRis by Inveos

released-2021-10-16_13-20-12

Added

  • Improve detection of annotations which consist of multiple words

  • Date filtering for project documents API endpoint

  • Filtering of labels and label sets according to the category of a document (in the SmartView)

Fixed

  • Selection of characters in SmartView incomplete when editing an annotation

  • Dark Mode setting of browser not compatible with Konfuzio Server

  • Some cases where the document list was not reloaded automatically

released-2021-10-07_11-42-29

Added

  • More advanced task priorities and improved worker resource usage

  • Auto-reload of newly uploaded documents

released-2021-09-28_09-29-43

Fixed

  • Evaluation does not complete if no test documents are specified

released-2021-09-24_13-53-32

Fixed

  • Incomplete evaluation

  • Formatting of the “Check your browser” page for logged out users.

released-2021-09-16_12-25-23

Fixed

  • Adding of categories to existing label sets

released-2021-09-08_16-01-46

Added

  • Migration scripts for user permissions and e-mail templates

released-2021-09-07_12-24-26

Added

  • Support for SMTP e-Mail backends via environment variables

Fixed

  • DOS protection prevents start of Konfuzio server

released-2021-09-05_20-57-31

Added

  • Autosave for any change on the document list page

  • German language support

  • Fine-tuning of extraction AIs via parameters

  • New fields AI quality and data quality

  • More detailed evaluation

  • Description Field for extraction AI, label sets, categorization AI, categories

Changed

  • Rename project invitations to members

  • Rename the dataset status from “OCR Error” to “Excluded”

  • Start training per extraction AI

  • Get more insights via the document detail page

released-2021-08-10_17-08-11

Changed

  • Deactivate adoption of template settings according to the AI model if not explicitly allowed.

released-2021-08-10_11-19-33

Added

  • Maximum number of pages per document

Fixed

  • Slow processing of extraction tasks

  • Evaluation when multiple annotations are present

released-2021-07-28_18-53-12

Changed

  • Make word-based tokenizer the default for new projects

released-2021-07-23_09-33-20

Fixed

  • Usage of word-based tokenizer

  • Duplicated hints

released-2021-07-20_17-29-23

Fixed

  • Edited annotations were excluded from the training process

released-2021-07-15_17-29-25

Added

  • Support to reuse label sets across categories

Changed

  • Allow “rerun extraction” on test and training documents

  • Remove “project statistic csv export” as it is redundant to document csv export

  • Include evaluation for training data in the AI model evaluation report

Fixed

  • A bug where the EXIF orientation attribute corrupted the bounding boxes of images

  • “accept top annotations” does not update human created annotations

released-2021-07-02_18-13-01

Changed

  • Rate limits for task system

released-2021-06-29_22-14-33

Added

  • HTTP codes to API interface

Fixed

  • Content type description for some API endpoints

released-2021-06-22_22-45-48

Added

  • An experimental version of a training health report

released-2021-06-20_15-14-31

Fixed

  • Failed retrainings for some projects

  • Increased disk usage due to a cache deletion issue

  • Filtering of project invitations according to currently selected project

  • Clarify return types in API documentation

released-2021-05-26_20-16-02

Added

  • Show confidence for categorization results

  • Show evaluation of categorization AI models

  • Track version (number of retrainings) for all AI models

  • Track project and template origin of AiModel

released-2021-05-24_13-42-45

Changed

  • Use business evaluation implementation from training package

  • Loading time for CSV export evaluation reduced by saving it in the database.

released-2021-05-18_16-37-25

Added

  • Global project switcher

  • “Top candidates” filter in SmartView

  • “Change dataset” functionality in SmartView

  • Landing page in case the user has no projects (i.e. just registered)

  • Language switcher (not enabled yet)

  • Initial support for German translations (not enabled yet)

Fixed

  • Label threshold is now limited from 0.0 to 1.0

Changed

  • New design for login/signup/reset password pages

  • Design improvements in the control panel and SmartView

  • New logo and favicon

  • API documentation has been improved with types and examples; is now based on OpenAPI 3

  • Updated frontend dependencies and tooling

Removed

  • admin_importer, copy_extraction_as_annotation and related functions have been removed

released-2021-05-04_12-37-16

Fixed

  • Calculation of true negative when using multiple templates.

released-2021-04-28_12-27-19

Added

  • Filter for top annotations in SmartView

Changed

  • Don't allow training if there are no training documents

released-2021-04-25_20-09-04

Added

  • Protect signup with captcha

Fixed

  • Editing of annotation if there are already declined annotations.

released-2021-04-19_22-32-19

Added

  • Add label creation endpoint

  • Token-based authentication for the API

released-2021-04-03_09-46-56

Added

  • Show Django sidebar in Smartview and template view.

Changed

  • Save extraction results in a more efficient way.

  • Show a warning if an annotation with a custom offset string is created

  • Show loading indicator in the Smartview search

Fixed

  • Default template dropdown sometimes disabled when creating a Template

  • Rare case where the document list could not be loaded

released-2021-03-15_15-12-04

Added

  • Add option to accept all annotations.

released-2021-03-07_21-32-41

Added

Fixed

  • Typo in privacy policy

  • Confirmation message when deleting labels

  • Performance of csv export

Changed

  • Delete old unrevised annotations when rerunning AiModel.

released-2021-02-25_09-28-07

Added

  • Option to select tokenizer for training (ProjectAdmin)

  • Option to add training parameters (SuperuserProjectAdmin)

Changed

  • Set a document's category_template on new documents if there is only one category_template available

  • Improved delete / accept performance of annotations

Fixed

  • Count of annotations on the LabelAdmin

released-2021-02-15_18-56-51

Changed

  • Show category template as empty when actually empty (instead of displaying the first available template)

  • Improved Smartview performance by changing entity loading

Added

  • Project name added to SectionLabel in the AiModelAdmin

  • Assign user to documents (“Assignee”). Can be enabled in the ProjectSuperuserAdmin

  • Add status field to the AiModel (“Training”, “Failed”, “Done”)

  • Don't allow a new retraining if there is a training in progress for the AiModel.

released-2021-02-13_18-18-52

Changed

  • Use annotation permalink in LabelAdmin

Fixed

  • OCR Read API did not use text embeddings when available

  • Files with missing fonts could not be processed

  • Creation of small annotations when accepting or declining

released-2021-02-10_13-52-15

Added

  • Admin action for Microsoft Graph API / Planner API

Fixed

  • SuperUserDocumentAdmin performance

  • OutOfMemory errors in the categorization

released-2021-02-03_17-07-23

Added

  • Permalink for annotations

  • Add an additional routine to fix corrupted PDFs

  • Improved frontend error tracking

Fixed

  • Validation when editing an annotation

Changed

  • Renamed option ‘priority_ocr’ to ‘priority_processing’

  • Allow rerun extraction for documents with revised annotations

  • Allow deletion of default templates

released-2021-01-26_18-07-11

Added

  • Add column ‘category’ to csv export

released-2021-01-20_11-17-24

Added

  • Show selection bounding boxes for automatically created annotations

released-2021-01-14_22-06-52

Added

  • Visual annotations: images and areas can now be annotated

Fixed

  • Loading time for Smartview

released-2021-01-13_23-26-03

Fixed

  • Retraining now assigns AIModels to templates even if there was none before

Added

  • Add message during evaluation that tells the user if the test set is empty.

released-2021-01-12_21-13-48

Fixed

  • Google Analytics integration

  • Empty Textextraction for ParagraphExtractions

released-2021-01-10_18-36-49

Fixed

  • Disable link formatting by sendgrid.

released-2021-01-08_22-30-10

Fixed

  • Bbox calculation in ParagraphModel

  • Evaluation sometimes not running

  • Speed up annotation creation

released-2021-01-05_11-53-22

Changed

  • Two column Annotation selection is now possible

Added

  • ParagraphModel introduced in addition to the Extraction- & CategoryModels, this is set per project via the SuperUserDocumentAdmin.

  • Option to update the document text, this is set per project via the SuperUserDocumentAdmin.

  • Document Segmentation API Endpoint

released-2020-12-22_19-04-04

Changed

  • Email templates are now managed within the application.

  • Major improvement and refactor in the underlying training package.

Fixed

  • Link to imprint on SignUp

  • Smartview when scrolling horizontally

released-2020-12-16_20-17-30

Added

  • Search for Smartview

Fixed

  • TemplateCreationForm does not allow to select parent template

released-2020-12-16_09-44-30

Added

  • Searchbar for SuperuserProjectAdmin

  • Add link to flower (task monitoring) for superusers

  • Add support for Google Tag Manager

  • Create Support Ticket for Retraining and Invitation of new Users

Changed

  • Increase SoftTimeLimit for extraction (necessary for large documents >500 pages).

Fixed

  • Fix bbox generation for Paragraph Annotations

  • Fixed Evaluation not triggered for new AiModels

released-2020-12-10_13-15-14

Added

  • Sentry error reporting for Javascript Frontend (i.e. Smartview)

  • Allow to add Project specific document CategorizationModel

Changed

  • Document Search now considers filenames and shows links to Dashboard, Labeling and Smartview

  • Allow deletion of Labels

Fixed

  • Allow “None” as confidence for rule-based ExtractionModels

released-2020-12-01_21-08-32

Added

  • Proof of Concept Microsoft Graph API connection (for logged in users): app.konfuzio.com/graph

  • Button to upload demo Documents

  • SuperuserProjectAdmin added (same as the previous ProjectAdmin, but accessible to Superusers only)

  • Google Analytics Tag for app.konfuzio.com

Changed

  • Default permission Group “CanReadProject” replaced with “CanCreateReadUpdateProject”. New users can now create new Projects.

  • Project Page for “normal” user does not show technical fields like “ocr” and “text_layout” anymore.

  • Don't show file endings like ‘.pkl’ for AiModels

released-2020-11-26_19-43-14

Fixed

  • Missing bbox attribute in Document API (prevents retraining via training package)

  • Running of proper ExtractionModel in Multi-Document-Template project

  • Loading time for the Document page (still room for improvements)

Added

  • Slightly better Categorization model.

released-2020-11-20_20-05-47

Added

  • A public registration page: https://app.konfuzio.com/accounts/signup

  • An internal registration page to create users manually and faster: https://app.konfuzio.com/register/ (you need to be logged in to see this page)

  • Users can invite new users to a project via “ProjectInvitations”

  • Password reset functionality

Fixed

  • The Smartview is much faster

  • Improved creation of Templates and additional validation logic for template inconsistencies.

Changed

  • Save bbox and entity per page in order to improve performance

released-2020-11-09_18-04-28

Added

  • Support for more than one default Template in a project

  • Categorization for multi Template projects

  • Links to related models in the Project, AIModel, Label and Template view

  • Internal user registration form, app.konfuzio.com/register

Changed

  • AiModel now belongs to DefaultTemplates instead of project

released-2020-10-27_10-37-15

Changed

  • Documents are now soft-deleted. There is a hard delete option in the SuperuserDocumentAdmin.

  • AiModels are made active automatically for matching DefaultTemplates if the AiModel is better than before.

released-2020-10-21_08-53-42

Fixed

  • Loading time when updating a project.

released-2020-10-19_22-46-49

Changed

  • Increase max allowed workflow time from 90 to 180 seconds.

Fixed

  • success messages for ‘rerun_workflow’ admin action

  • loading time of AiModel

  • csv export

Added

  • add hocr field to document api.

  • add a project option to hide the Smartview and Labeling tool.

released-2020-10-14_11-39-17

Changed

  • AIModel can be uploaded and evaluated before being set active for a project

released-2020-10-13_15-10-22

Added

  • Multilanguage Support (DE/EN) in the backend (actual translations are not included yet)

Changed

  • ‘create_labels_and_templates’ is now a project option (false by default).

  • Gunicorn workers restart after 500 requests.

  • Flower dashboard now runs in a separate container

Fixed

  • Fix upload_ai_model to upload files larger than 2GB

  • Loading speed for SequenceAnnotation Admin

released-2020-10-03_15-18-47

Fixed

  • Recover tasks in case celery worker crashes

released-2020-10-01_12-02-37

Fixed

  • Internet Explorer warning badge

  • ‘Not machine-readable’ was not detecting 0 as proper value for normalization.

Changed

  • Remove extraction count from AiModel admin.

  • Refactor annotation accept/delete buttons to separate components and SVG

released-2020-09-16_18-19-53

Added

  • Additional normalization formats

  • Sentry message if retraining is triggered.

  • Detectron (fully implemented) and preparation for visual classification results in SuperUserDocumentAdmin

Changed

  • Don't raise sentry error if document got deleted during workflow

Fixed

released-2020-09-11_13-47-51

Added

  • Add sentry message if project retraining is triggered.

  • Fix cpu minute calculation.

released-2020-09-09_16-12-44

Added

Changed

  • Allow extractions which do not have an accuracy.

  • On the dashboard: Don't show the section.position column if all extractions have the same position. Don't show the accuracy column if no extraction has one.

  • Don't show the retraining webhook URL (on the project detail page). It is displayed as **** like a password.

released-2020-09-08_22-38-19

Added

  • Per-project measuring of cpu time.

  • Additional date-formats for normalization.

  • First draft of boolean-formats for normalization.

released-2020-09-08_09-17-00

Added

  • Document Filter added for ‘human feedback required’ and ‘100% machine readable’.

  • Additional normalization formats for numbers.

  • Document Categorization Classifier added to DocumentSuperUserAdmin

Changed

  • For the document view and Smartview, rename ‘possibly incorrect’ to ‘not machine-readable’

  • For the document view and Smartview, rename ‘pending review’ to ‘require feedback’

  • For the document view, divide column NOTES into FEEDBACK REQUIRED and NOT MACHINE-READABLE

Fixed

  • Don't raise an error if ai_model predicts a section with a template that does not exist.

released-2020-09-07_17-48-22

Fixed