Machine learning improvements (#644231, #644243)
The machine learning API has been improved. The training data can now be exported to a CSV file, which contains the category and the text as defined in the training specification. The user can inspect and edit the file before using it to create a model. Because the model can be created on any system, it is possible, for example, to extract the data from a production system and train the model on a development system.
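Before training, it can be useful to check how the rows in the exported file are distributed across categories. The following is a minimal sketch of such a check in plain Java. It is not part of the machine learning API. The class name TrainingCsvInspector is hypothetical, and the file layout is an assumption: a header line followed by rows of the form category,text, with the category in front of the first comma and every row on a single line. Adjust the parsing if the actual export uses another delimiter or multi-line text fields.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for inspecting an exported training file.
// Not part of the machine learning API.
class TrainingCsvInspector {

    // Counts the rows per category.
    // Assumption: the first line is a header and each further row is
    // "category,text", so everything before the first comma is the category.
    static Map<String, Integer> countByCategory(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        int firstDataRow = Math.min(1, lines.size()); // skip the header line
        for (String line : lines.subList(firstDataRow, lines.size())) {
            int comma = line.indexOf(',');
            if (line.isBlank() || comma < 0) {
                continue; // skip blank rows and rows without a delimiter
            }
            String category = line.substring(0, comma).trim();
            counts.merge(category, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // The path to the exported CSV file is passed as the first argument.
        List<String> lines = Files.readAllLines(Path.of(args[0]), StandardCharsets.UTF_8);
        countByCategory(lines).forEach((category, n) -> System.out.println(category + ": " + n));
    }
}
```

A category with very few rows compared to the others may be worth reviewing in the file before the model is created.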
The class MLTicketTextClassifierTrainingSpec of the machine learning API was replaced by MLTicketTextClassifierSpec.
If you use MLTicketTextClassifierTrainingSpec in your scripts, you need to adapt the scripts accordingly.
When using the API to train a model based on a CSV file, the language of the string fields is detected automatically to improve the training results. Alternatively, the user can set the language manually by adding the following line to the script:
mlFileField.action.language = NormalizeLanguage.GERMAN;
If the language cannot be detected, a warning is written to the log files.
Coding example
The following example shows how to export the training data to a file on the source system and import that file on the training system:
// on the source system (for example, the production system): export the data set together with its metadata
mlFileService.exportWithMetaData(dataset.getId(), new FileOutputStream("/tmp/ticket_priority.csv.jar"));
mlFileService.delete(dataset.getId());
// on the training system: import the exported file (note that the data set may be renamed if a file named ticket_priority.csv already exists)
MLFile dataset = mlFileService.importWithMetaData(new FileInputStream("/tmp/ticket_priority.csv.jar"));