All notable changes to this project will be documented in this file.
- Option to select tokenizer for training (ProjectAdmin)
- Option to add training parameters (SuperuserProjectAdmin)
- Set a document's category_template on new documents if only one category_template is available
- Improved delete / accept performance of annotations
- Count of annotations on the LabelAdmin
- Show category template as empty when actually empty (instead of displaying the first available template)
- Improved Smartview performance by changing entity loading
- Project name added to SectionLabel in the AiModelAdmin
- Assign user to documents ("Assignee"). Can be enabled in the ProjectSuperuserAdmin
- Add status field to the AiModel ("Training", "Failed", "Done")
- Don't allow a new retraining if an AiModel training is in progress.
- Use annotation permalink in LabelAdmin
- OCR Read API did not use text embeddings when available
- Files with missing fonts could not be processed
- Creation of small annotations when accepting or declining
- Admin action for Microsoft Graph API / Planner API
- SuperUserDocumentAdmin performance
- OutOfMemory errors in the categorization
- Permalink for annotations
- Add an additional routine to fix corrupted PDFs
- Improved frontend error tracking
- Validation when editing an annotation
- Renamed option 'priority_ocr' to 'priority_processing'
- Allow rerun extraction for documents with revised annotations
- Allow deletion of default templates
- Add column 'category' to csv export
- Show selection bounding boxes for automatically created annotations
- Visual annotations: images and areas can now be annotated
- Loading time for Smartview
- Retraining now assigns AIModels to templates even if none was assigned before
- Add a message during evaluation that tells the user if the test set is empty.
- Google Analytics integration
- Empty Textextraction for ParagraphExtractions
- Disable link formatting by SendGrid.
- Bbox calculation in ParagraphModel
- Evaluation sometimes not running
- Speed up annotation creation
- Two column Annotation selection is now possible
- ParagraphModel introduced in addition to the Extraction- & CategoryModels; this is set per project via the SuperUserDocumentAdmin.
- Option to update the document text; this is set per project via the SuperUserDocumentAdmin.
- Document Segmentation API Endpoint
- Email templates are now managed within the application.
- Major improvement and refactor in the underlying training package.
- Link to imprint on SignUp
- Smartview when scrolling horizontally
- Search for Smartview
- TemplateCreationForm does not allow to select parent Template
- Searchbar for SuperuserProjectAdmin
- Add link to flower (task monitoring) for superusers
- Add support for Google Tag Manager
- Create Support Ticket for Retraining and Invitation of new Users
- Increase SoftTimeLimit for extraction (necessary for large documents >500 pages).
- Fix bbox generation for Paragraph Annotations
- Fixed Evaluation not triggered for new AiModels
- Allow to add Project specific document CategorizationModel
- Document Search now considers filenames and shows links to Dashboard, Labeling and Smartview
- Allow deletion of Labels
- Allow "None" as confidence for rule-based ExtractionModels
- Proof of Concept Microsoft Graph API connection (for logged in users): app.konfuzio.com/graph
- Button to upload demo Documents
- SuperuserProjectAdmin added (same as the previous ProjectAdmin, but accessible to superusers only)
- Google Analytics Tag for app.konfuzio.com
- Default permission Group "CanReadProject" replaced with "CanCreateReadUpdateProject". New users can now create new Projects.
- Project page for "normal" users no longer shows technical fields like "ocr" and "text_layout".
- Don't show file endings like '.pkl' for AiModels
- Missing bbox attribute in Document API (prevents retraining via training package)
- Running of proper ExtractionModel in Multi-Document-Template project
- Loading time for the Document page (still room for improvements)
- Slightly better Categorization model.
- A public registration page: https://app.konfuzio.com/accounts/signup
- An internal registration page to create users manually and faster: https://app.konfuzio.com/register/ (you need to be logged in to see this page)
- Users can invite new users to a project via "ProjectInvitations"
- Password reset functionality
- The Smartview is much faster
- Improved creation of Templates and additional validation logic for Template inconsistencies.
- Save bbox and entity per page in order to improve performance
- Support for more than one default Template in a project
- Categorization for multi Template projects
- Links to related models in the Project, AIModel, Label and Template view
- Internal user registration form, app.konfuzio.com/register
- AiModel now belongs to DefaultTemplates instead of the project
- Documents are now soft-deleted. There is a hard delete option in the SuperuserDocumentAdmin.
- AiModels are made active automatically for matching DefaultTemplates if the AiModel is better than before.
- Loading time when updating a project.
- Increase max allowed workflow time from 90 to 180 seconds.
- Success messages for 'rerun_workflow' admin action
- loading time of AiModel
- csv export
- Add hocr field to Document API.
- add a project option to hide the Smartview and Labeling tool.
- AIModel can be uploaded and evaluated before being set active for a project
- Multilanguage support (DE/EN) in the backend (actual translations are not included yet)
- 'create_labels_and_templates' is now a project option (false by default).
- Gunicorn workers restart after 500 requests.
- Flower dashboard now runs in a separate container
- Fix upload_ai_model to upload files larger than 2GB
- Loading speed for SequenceAnnotation Admin
- Recover tasks in case a Celery worker crashes
- Internet Explorer warning badge
- 'Not machine-readable' was not detecting 0 as proper value for normalization.
- Remove extraction count from AiModel admin.
- Refactor annotation accept/delete buttons to separate components and SVG
- Additional normalization formats
- Sentry message if retraining is triggered.
- Detectron (fully implemented) and preparation for visual classification results in SuperUserDocumentAdmin
- Don't raise a Sentry error if a document is deleted during the workflow
- Creation of Templates - Calculation of width and height dimensions when creating sandwich PDFs and when using Azure
- Add Sentry message if project retraining is triggered.
- Fix cpu minute calculation.
- Allow extractions that do not have an accuracy.
- On the dashboard: Don't show the section.position column if all extractions have the same value. Don't show the accuracy column if no extraction has one.
- Don't show the retraining webhook URL (on the project detail page). It is displayed with ** like a password.
- Per-project measuring of cpu time.
- Additional date-formats for normalization.
- First draft of boolean-formats for normalization.
- Document Filter added for 'human feedback required' and '100% machine readable'.
- Additional normalization formats for numbers.
- Document Categorization Classifier added to DocumentSuperUserAdmin
- For the document view and Smartview, rename 'possibly incorrect' to 'not machine-readable'
- For the document view and Smartview, rename 'pending review' to 'require feedback'
- For the document view, divide column NOTES into FEEDBACK REQUIRED and NOT MACHINE-READABLE
- Don't raise an error if ai_model predicts a section with a template that does not exist.