Analysis

Knowledge base

This function has not yet been adapted for the new user interface for the Supervisor. Instead, the old user interface is included.

The list shows all versions of knowledge bases that have ever been in the system, including backups. Knowledge bases can be uploaded, activated, downloaded and deleted directly via the novomind iAGENT Supervisor.

There are two privileges available for the configuration of the menu “Knowledge base” in the group administration (Administration –> Master Data –> Groups):

  • Knowledge base – Display
  • Knowledge base – Administration

Activating Knowledge base – Display shows the menu item Knowledge base in the Analysis menu. With the Display privilege, the user can see an overview of the knowledge bases (all knowledge bases uploaded via this menu are displayed) and view or save the displayed knowledge bases locally as a zip archive via the button Download.

With the Knowledge base – Administration right, new knowledge bases can be uploaded, and uploaded ones can be activated and deleted. Multiple knowledge bases can be selected at once, which makes it possible to delete old knowledge bases collectively.

With this privilege, it is also possible to lock the knowledge base when downloading it (a download without the Administration right does not lock it). This symbolic lock has the following purpose: if a user wants to download a live knowledge base that is locked, they are advised that it has already been downloaded since its activation. This means that another user may already be working on the affected knowledge base. With regard to templates, locking also means that a user who tries to activate a knowledge base including templates is advised if the current templates have been edited since the knowledge base was locked.

The active knowledge base (marked with the word LIVE) appears at the top of the list. Clicking on a column header sorts the data by that column.
The columns have the following meaning:

  • Name:
    Name of the zip archive that contains the knowledge base. If a knowledge base was not loaded via this menu, “unknown_knowledgebase” is shown.
  • Live/Backup Date:
    Date when a knowledge base was activated and/or backed up. Upon activation of a knowledge base, a backup of the most recently activated knowledge base is automatically created. Uploaded knowledge bases that have not yet been activated do not have an entry in this column.
  • Live/Backup User:
    User who activated the knowledge base and/or initiated the backup. Analogous to the Live/Backup Date, no user is specified if the knowledge base has not yet been activated.
  • Upload Date:
    Date when the upload of the knowledge base occurred. Backups of a previously active knowledge base do not have an entry in this column because no upload took place in this case.
  • Upload User:
    User who uploaded the knowledge base. Analogous to the Upload Date, no user is specified if the knowledge base has been backed up.

If the active knowledge base was locked and is selected, the date and the user of the first download of this knowledge base are displayed in the lower part of the window:

Knowledge base

If the entry is a system-generated backup caused by the activation of a new knowledge base, it is also displayed in the lower part of the window.

Clicking on the double arrow in the lower right corner (see Fig. above) displays the description window. This window shows any comments on the respective knowledge base. They are created via the novomind Composer and cannot be edited here. Clicking on the double arrow again hides the description window.

Depending on the privileges of the respective user and the selected knowledge base, the available functions are displayed in the button bar above the list. Only functions that are enabled at the time are visible.

Uploading a knowledge base

You can upload a new knowledge base to the novomind iAGENT system by clicking the Upload button in the main button bar. For this function to be available, the user needs to have the Knowledge base – Administration right. After clicking on the button, an upload dialogue appears where a local file may be selected.

Clicking on Browse accesses the file system. After selecting the respective knowledge base (zip file format), the upload is started by clicking on Upload file, subject to the following conditions:

  • The knowledge base must be available as a zip archive (Encoding CP437 Standard under Windows). The name of the zip file is used as the name for the knowledge base.
  • The zip archive must contain exactly one folder, which contains the knowledge base. The name of the folder must match the value of the parameter “IQProject” in the nmIQmail.cfg (usually “iMAILMaster”).
  • The archive must contain a “*.iqp” file.
  • The archive must have a template directory called “TEMPLATES”, whose name must match the parameter “templateDirectory” in the nmIQmail.cfg.

If one of the conditions is not met, a warning message advising the user of the specific problem appears. After a successful upload, the new knowledge base is automatically selected in the overview but appears at the bottom of the list, because the list is sorted by activation date by default and the new knowledge base does not have an activation date yet.
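The structural conditions above can be checked locally before uploading. The following is a minimal sketch, not the check the Supervisor itself performs: the function name `check_knowledge_base_zip` is hypothetical, the defaults “iMAILMaster” and “TEMPLATES” are the usual values named above and should be replaced by the values from your nmIQmail.cfg, and the “*.iqp” file and the template directory are accepted at any depth of the archive because their exact location is not specified here. The file name encoding is not checked.

```python
import zipfile
from pathlib import PurePosixPath


def check_knowledge_base_zip(path, project_folder="iMAILMaster",
                             template_dir="TEMPLATES"):
    """Return a list of problems; an empty list means all checks passed.

    project_folder and template_dir are assumptions standing in for the
    IQProject and templateDirectory parameters of the nmIQmail.cfg.
    """
    with zipfile.ZipFile(path) as archive:
        # Normalize separators in case the archive uses backslashes.
        names = [n.replace("\\", "/") for n in archive.namelist()]

    top_level = set()   # names of all entries at the root of the archive
    directories = set() # names of all directories at any depth
    has_iqp = False
    for name in names:
        parts = PurePosixPath(name).parts
        if not parts:
            continue
        top_level.add(parts[0])
        if name.endswith("/"):
            # An entry ending in "/" is itself a directory.
            directories.update(parts)
        else:
            # Every parent of a file entry is a directory.
            directories.update(parts[:-1])
            if parts[-1].lower().endswith(".iqp"):
                has_iqp = True

    problems = []
    if top_level != {project_folder}:
        problems.append(f"archive root must contain only the folder '{project_folder}'")
    if not has_iqp:
        problems.append("archive contains no *.iqp file")
    if template_dir not in directories:
        problems.append(f"archive contains no directory named '{template_dir}'")
    return problems
```

Calling `check_knowledge_base_zip("my_kb.zip")` with a hypothetical file name returns an empty list if all three checks pass; otherwise each entry names one violated condition.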

Deleting a knowledge base

To delete one or multiple knowledge bases, select the entries to be deleted and click the Delete button in the main button bar. To delete knowledge bases, the user must have the privilege Knowledge base – Administration. Since only inactive knowledge bases can be deleted, the button is only displayed if an inactive knowledge base is selected.

The deletion of a knowledge base cannot be undone: a deleted knowledge base cannot be restored. A corresponding security warning appears after clicking on Delete.

  • OK: The knowledge base is permanently deleted.
  • Cancel: The deletion is cancelled.

To delete multiple knowledge bases, multi-select them by clicking while holding either the CTRL or the SHIFT key.

If another user activates the selected knowledge base in the meantime, a warning message that the knowledge base cannot be deleted is displayed after the confirmation of the deletion. The knowledge base is not deleted. The view is refreshed so that the previously selected knowledge base is now highlighted as a live knowledge base.

Activating a knowledge base

To activate a knowledge base, that is, to make it available for use in novomind iAGENT, click the Activate button in the main button bar. To activate uploaded knowledge bases, the user must also have the Knowledge base – Administration right.

All knowledge bases – except for the one that is already active – can be activated. Upon successful activation, a copy of the previously active knowledge base is automatically created as a backup. For activation, select the desired knowledge base and click Activate in the main button bar.

If the active knowledge base has already been downloaded and locked, a dialogue with a warning appears. It advises the user that someone may be working on the knowledge base elsewhere. This message is also displayed if the same user carried out the download or locking. The time and user of the first download are shown below the table after selecting the active knowledge base.

Selecting OK will continue the activation process. If changes to the templates have been made in the Supervisor interface since the knowledge base was last downloaded or locked, another confirmation dialogue will come up. Cancel will abort the activation process.

When the activation process continues, you are asked whether to activate the new knowledge base together with the new templates. Select OK if you wish to activate the new knowledge base without templates; this will not overwrite the current templates. If, however, you have made changes to the templates in the Composer (i.e., offline), you can activate the knowledge base with templates by clicking Incl. Templ. (which stands for “including templates”). In that case, make sure that changes made to the current templates in the meantime via the novomind iAGENT Supervisor are not accidentally overwritten.

If you wish to activate the knowledge base without templates via OK, you will be prompted to confirm the activation itself. OK now confirms the activation and the knowledge base will be activated. Cancel aborts the activation.

If you have opted for an activation with templates via Incl. Templ. you will be prompted to confirm this action. With OK, you confirm the activation and the knowledge base will be activated. Cancel aborts the activation and no changes will be made to the knowledge base or templates.
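The sequence of dialogues described above can be summarized as a small decision function. This is only an illustrative sketch: the function `activation_outcome` and its parameters are hypothetical, the order of the dialogues is read off the description above, and treating any answer other than OK or Incl. Templ. as an abort is an assumption. The Supervisor itself may order or word the dialogues differently.

```python
def activation_outcome(locked, templates_changed, answers):
    """Walk through the activation dialogues in the order described above.

    locked:            the active knowledge base was downloaded and locked
    templates_changed: templates were edited in the Supervisor since then
    answers:           buttons clicked by the user, in order,
                       e.g. ["OK", "Incl. Templ.", "OK"]
    """
    clicks = iter(answers)

    if locked:
        # Warning that someone may be working on the knowledge base.
        if next(clicks) != "OK":
            return "aborted"
        # Additional confirmation if templates were edited (assumed to
        # follow the lock warning).
        if templates_changed and next(clicks) != "OK":
            return "aborted"

    # Choice: "OK" activates without templates, "Incl. Templ." with them.
    choice = next(clicks)
    if choice not in ("OK", "Incl. Templ."):
        return "aborted"

    # Final confirmation of the activation itself.
    if next(clicks) != "OK":
        return "aborted"

    if choice == "Incl. Templ.":
        return "activated with templates"
    return "activated without templates"
```

For example, `activation_outcome(False, False, ["Incl. Templ.", "Cancel"])` models a user who chooses an activation with templates and then cancels at the final confirmation, so neither the knowledge base nor the templates are changed.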

If the cartridges in the knowledge base have been changed, the Core process on the novomind iAGENT server must also be restarted after the activation of the knowledge base.

Statistic Engine

This functionality has not yet been adapted for the new Supervisor interface. Instead, the old user interface is included.

A knowledge base can access a Statistic Engine in order to categorize inbound mails. Further information on how this is done is contained in the novomind Composer user’s manual.

The panels and menus covered in this chapter are used to configure the Statistic Engine used in the knowledge base. In particular, mails that are already in the novomind iAGENT system as well as the categories allocated to them can be used for the configuration. This enables the Statistic Engine to be improved and modified as necessary over time. Furthermore, it can be linked to the knowledge base while it is operational, i.e. it is not necessary to stop ongoing processes beforehand.

On the left side of the dialogue the available mail sets are shown. A mail set consists of a number of categories to which training and evaluation mails have been allocated. These mails are used to train and evaluate the Statistic Engine.

Underneath the mail set, you will see a number of buttons that are used to

  • duplicate a mail set (Clone),
  • train it (Train),
  • link it to the knowledge base (Go live)
  • or add new mails to the existing set (Add).

The right side of the display shows information on a selected mail set and provides access to the panels used to modify an existing mail set.

The following chapters describe how to create a new Statistic Engine and maintain and develop existing Statistic Engines. Information is provided on the available panels, what they contain and how they are used. In the following text, a mail set is treated as equivalent to the Statistic Engine that can be generated from that particular set of mails.

Creating a mailset

You can create a new mail set either by using the New button in the static menu bar or by clicking on the button marked Add in area 1 (see image below). Next, you will see the panel shown in area 2. This panel is used to select mails that have already entered the novomind iMAIL system. The selected mails can then be added to a new or existing mail set by means of the New / Add button.

Creating/Modifying a mailset

The following settings can be defined for areas 1 – 5 of the image above:

  • Area 1:
    The From and To limits can be configured for mail IDs. The minimum and maximum mail IDs for all mails currently in the novomind iAGENT system are already defined here.
  • Area 2:
    Restrictions applying to the date of entry for mails can be entered here. Only those mails are taken into account that entered the system between the specified From and To dates. The minimum and maximum entry date for all mails is already defined here.
  • Area 3:
    This area is used to select, by category, the mails that should be allocated to a mail set. The contents of the selection list depend on the categories that have been configured in the novomind iMAIL system (see section Categories). The current category of a mail is used, which means that any recategorizations are also taken into account. Multiple categories can be chosen from the list by holding down the SHIFT or CTRL key when making your selection.
  • Area 4:
    The target mail set to which the selected mails are added.
  • Area 5:
    This selection list is used to specify the category that should be allocated to the mails in this particular mail set. The contents of the selection list depend on the categories that have been configured in the novomind iAGENT system (see section Categories). If you select the first option Keep categories, a mail in that mail set will be allocated the same category it received in the novomind iMAIL system.
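The combined effect of the settings in areas 1, 2, 3 and 5 can be pictured as a filter followed by a category assignment. The sketch below is purely illustrative: the `Mail` class and the function `select_mails` are hypothetical and do not correspond to any novomind API, and treating the From and To limits as inclusive is an assumption. Passing `target_category=None` stands for the option Keep categories.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Mail:
    mail_id: int
    entry_date: date
    category: str  # current category, including any recategorization


def select_mails(mails, id_from, id_to, date_from, date_to,
                 source_categories, target_category=None):
    """Select mails as in areas 1-3 and assign a category as in area 5."""
    selected = []
    for mail in mails:
        # Area 1: mail ID limits (assumed inclusive).
        if not id_from <= mail.mail_id <= id_to:
            continue
        # Area 2: entry date limits (assumed inclusive).
        if not date_from <= mail.entry_date <= date_to:
            continue
        # Area 3: only mails whose current category was selected.
        if mail.category not in source_categories:
            continue
        # Area 5: assign the chosen category, or keep the existing one.
        if target_category is not None:
            mail = replace(mail, category=target_category)
        selected.append(mail)
    return selected
```

The resulting list corresponds to the mails that would be added to the target mail set chosen in area 4.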

Instead of creating a new mail set, it is possible to make a copy of an existing mail set. To clone a set, simply select the relevant mail set, and then click on the button marked Clone.

Modify a mailset

A mail set contains a number of categories. Training and evaluation mails are assigned to each category. This chapter deals with the panels that are used to modify the categories allocated to the mails as well as their status as training or evaluation mails.

The upper area shows a series of folders (tree structure), each of which represents an individual mail set. The folders can be opened or closed by double-clicking on the + symbol or on the name of the mail set. The selected mail set can be renamed by clicking on the relevant icon beside the name of the mail set in the tree. An input window then opens in which the new mail set name can be entered.

The categories contained in the mail set are listed below the main folder. Here you will see the training and evaluation mails that have been allocated to this particular category. The number of mails is included in brackets after the name of the corresponding mail set, categories and training / evaluation mails. Click on the name of a mail set to display the details panel of the particular mailset. This details panel also contains the results of the evaluation.

Viewing Mails within a Mailset

If you click on a category or on the training and evaluation mails for that category in the mailset-tree, a panel showing the relevant mails is displayed. Each line represents a mail.

The first column lists the mail IDs and the second column contains the subject of the corresponding mail. The check boxes in the following two columns indicate whether or not the mails are part of the bundle used for training and evaluation purposes.

To display the contents of a mail, as well as the category currently allocated to it, simply click on the subject of the mail. The subject line will then be shown in orange and a window will open. The header section in this window contains the mail ID and a selection box indicating the current category. The lower section displays the mail subject and contents.

The selection box in the header enables the mail category to be changed. To do this, the user simply selects the new category from the list.

The change of category only affects the current mail set, not the categorization process within the iAGENT system as a whole. This means that the mail itself remains untouched.

The checkboxes allow the user to define which mails belong to the group of mails used for training and evaluation. Several mails can be selected simultaneously by holding down the SHIFT or CTRL key and selecting one or more mail subjects from the list. Once several mails have been selected, a new menu will appear above the mail list. This menu is used to modify the category allocation for all selected mails, as well as to define whether these mails should be used for training and evaluation purposes or not.

When the ID column in the list is shown in orange, this indicates that the category for that particular mail has been changed for this mailset. If the Train and Eval columns are colored orange, this shows that the mail’s status as a training or evaluation mail has been modified. For example, the last two mails in the list (ID 2279 and 2251) have been assigned a new category. To apply the modifications to the relevant mail set, click on the button marked Save changes in the static menu bar.

Before adding a new category that was not previously available in the novomind iAGENT system to a Statistic Engine, the category must first be created (see section Categories). The new category can then be assigned to a mail either by categorizing mails accordingly or by changing the category through the mail set as described above.

Training and Evaluation

A mail set is used to train and evaluate a Statistic Engine. In order to do this, each mail set must contain at least two different categories. For each category, around 20 to 30 training mails must be available. These training mails must be representative of the type of mails usually allocated to that particular category. If no training mails have been allocated to a particular category, that category will not be recognized by the Statistic Engine.

The evaluation mails allocated to a category are used to check whether the Statistic Engine is functioning effectively. We recommend using between 50 and 100 mails per category. Once a Statistic Engine has been created, the evaluation mails are classified by the Statistic Engine. The results are used to check how effectively the Statistic Engine is working and draw conclusions regarding potential problems with the classification process.
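The guidelines above can be turned into a quick plausibility check of a mail set before training. This is a minimal sketch under stated assumptions: the function `check_mail_set` is hypothetical, and the default thresholds simply take the lower ends of the ranges mentioned above (20 training mails and 50 evaluation mails per category). They are guidelines from this chapter, not limits enforced by the system.

```python
def check_mail_set(counts, min_train=20, min_eval=50):
    """Return notes on a mail set; an empty list means nothing to report.

    counts maps a category name to a tuple
    (number of training mails, number of evaluation mails).
    """
    notes = []
    if len(counts) < 2:
        notes.append("a mail set needs at least two different categories")
    for category, (train, evaluation) in sorted(counts.items()):
        if train == 0:
            # Without training mails the category cannot be recognized.
            notes.append(f"{category}: no training mails, category will not be recognized")
        elif train < min_train:
            notes.append(f"{category}: only {train} training mails")
        if evaluation < min_eval:
            notes.append(f"{category}: only {evaluation} evaluation mails")
    return notes
```

For example, `check_mail_set({"Order": (25, 60), "Complaint": (5, 60)})`, with hypothetical category names, reports only the low number of training mails for the second category.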

Once a mail set has been created, training is started by selecting the mail set and clicking on the button marked Train. A status bar is then displayed underneath the mail set menu, which provides the user with information about the current status of the training or evaluation process. To stop the process, click on Stop. Once the process is complete, the status bar will disappear and the evaluation results will be displayed on the right-hand side. See next chapter for more details.

The Go live button enables a trained Statistic Engine to be linked up to the novomind iAGENT system’s classification process. The word LIVE is displayed in red letters after the name of the Statistic Engine (Mailset) currently in use. You should avoid making modifications to the mail set for the Statistic Engine in use so that new results can always be compared directly with the previous set of results. Instead, any changes should be made using a copy of the mail set, which can be done by clicking on the Clone button. This enables Statistic Engines to be constantly improved without losing any earlier versions in the process.

Evaluation Result

After a Statistic Engine has been generated, the evaluation mails are used to check whether it is working effectively or not. The Statistic Engine classifies the evaluation mails and displays the actual results alongside the expected category classifications. The expected classification is the category allocated to a mail by the mail set.

The Evaluation Result dialogue displays the evaluation results for a Statistic Engine (Mailset). The area on top contains results relating to the overall classification of the evaluation mails. The area below lists the classification results alongside the relevant categories.

Tool tips provide additional information about the figures listed in a particular column. To view the relevant tool tip, hold the mouse pointer over the heading of the column for which you need further information.

The columns in the top table (headed Overall) contain the following information:

  • Correct:
    The number of mails which were allocated the correct category by the Statistic Engine.
  • Incorrect:
    The number of mails which were allocated the wrong category by the Statistic Engine.
  • RecRate (recognition rate):
    Percentage of evaluation mails that were correctly categorized.
  • Ø Rec (average weighted return rate):
    Average return rate of the categories, weighted according to the number of mails expected.
  • Ø Prec (average weighted precision rate):
    Average precision, weighted according to the number of mails expected.
  • Ø Conf (average weighted confidence rate):
    Average confidence rate for the categories, weighted according to the number of mails expected.

The columns in the “Details” table below contain the following information:

  • Exp (expected):
    The number of mails expected. Is equivalent to the number of evaluation mails for this particular category.
  • Act (actual):
    Number of evaluation mails that were actually allocated to this category.
  • Cor (correct):
    Number of evaluation mails correctly allocated to this category.
  • Incor (incorrect):
    Number of evaluation mails that were incorrectly allocated to this category.
  • Mis (missing):
    Number of evaluation mails that should have been allocated to this category but were allocated to a different category instead.
  • Rec (return):
    Ratio of the number of evaluation mails correctly classified for this category (Cor) to the number of mails expected for it (Exp), in percent.
  • Prec (precision):
    Ratio of the number of evaluation mails correctly allocated to this category (Cor) to the number of evaluation mails actually allocated to it (Act), in percent.
  • Conf (confidence):
    The confidence rate for this category, based on its return and precision ratings using the following formula: 2 * return rate * precision rate / (return rate + precision rate).
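The figures in both tables can be reproduced from a list of expected and actual categories. The sketch below is an illustration, not the Statistic Engine's own code: the function `evaluate` is hypothetical, the ASCII keys `AvgRec`, `AvgPrec` and `AvgConf` stand in for the columns Ø Rec, Ø Prec and Ø Conf, and the weighting "according to the number of mails expected" is interpreted as weighting each category by its Exp value. In common terminology, the return rate corresponds to recall, and the confidence formula is the harmonic mean of return and precision (the F1 score).

```python
from collections import Counter


def evaluate(pairs):
    """Compute the Overall and Details figures.

    pairs: iterable of (expected category, actual category),
           one pair per evaluation mail.
    """
    pairs = list(pairs)
    exp = Counter(e for e, _ in pairs)            # expected per category
    act = Counter(a for _, a in pairs)            # actually allocated
    cor = Counter(e for e, a in pairs if e == a)  # correctly allocated

    details = {}
    for cat in sorted(set(exp) | set(act)):
        rec = 100 * cor[cat] / exp[cat] if exp[cat] else 0.0
        prec = 100 * cor[cat] / act[cat] if act[cat] else 0.0
        # Conf = 2 * return rate * precision rate / (return rate + precision rate)
        conf = 2 * rec * prec / (rec + prec) if rec + prec else 0.0
        details[cat] = {
            "Exp": exp[cat], "Act": act[cat], "Cor": cor[cat],
            "Incor": act[cat] - cor[cat],  # wrongly allocated to this category
            "Mis": exp[cat] - cor[cat],    # expected here, allocated elsewhere
            "Rec": rec, "Prec": prec, "Conf": conf,
        }

    total = len(pairs)
    correct = sum(cor.values())

    def weighted(key):
        # Average over categories, weighted by the number of mails expected.
        if not total:
            return 0.0
        return sum(d[key] * d["Exp"] for d in details.values()) / total

    overall = {
        "Correct": correct,
        "Incorrect": total - correct,
        "RecRate": 100 * correct / total if total else 0.0,
        "AvgRec": weighted("Rec"),
        "AvgPrec": weighted("Prec"),
        "AvgConf": weighted("Conf"),
    }
    return overall, details
```

Under this interpretation, the Exp-weighted average return rate always equals the overall recognition rate, so the two columns carry different information only if the application weights them differently.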

After a Statistic Engine has been trained, additional information will be displayed in the mail list. The mail IDs for mails which had been categorized incorrectly by the Statistic Engine will be displayed in red. To identify which category has been allocated incorrectly to a particular mail, simply position the cursor over the mail ID of the message in question. A tool tip will be displayed containing the name of the allocated category.

If the mail set used by a Statistic Engine is modified, the evaluation results are no longer valid. This is indicated by the label “out dated” in the title at the top of the evaluation results page.