Overview
The Predict App allows you to create your own classifier (Enrichment), which is applied to generate a prediction for each entry in your selected data source. For example, you could build a document classifier to label each document on your company's network drive by its contents, e.g. invoice, product description, meeting minutes, etc.
Graft can use labeled data (examples of the correct classification) either within the data source you are working with or from a separate source added to Graft as a data source.
Click here for more details on preparing your labels, whether in a separate or combined data source.
This guide is for building custom predict apps. If you need one of Graft's preconfigured prediction apps, please see here.
What's going to happen...
- First we will name and configure the Entity, selecting the field(s) (Target) in the data source that you want to make a prediction from
- Second we will name & configure the App (classifier) that will be built and then used to generate predictions on the field we selected earlier
- Finally we will create your predict App which will generate those predictions on the target field
We can then test using the Try It tab in the UI. If the performance of your predict app is not what you are expecting, you can use Active Learning to review predictions, update those that need it, and retrain the model.
Prerequisites
You will need to have created a project to build your predict app in.
If you are using a second data source for your label data this will need to have been created.
App Selection
- Select the App type you would like to create from the App Library template tab
- Mousing over the App tile will display a short description of the App and its applications
If you have your own data ready to go, select the method you would like to use to create the App.
Alternatively, use the Graft Demo data source.
Entity Configuration
Samples will be shown below the selections allowing you to validate you have made the correct selections
Note: If you are using our demo data, you will not need to select the primary key.
If a row in your data is missing an entry for one or more of the fields selected to be predicted against, the row will not be embedded and a prediction cannot be generated for it.
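Before uploading, it can help to check your own data for rows that would be skipped for this reason. The following is a minimal sketch, not part of Graft: it assumes your data is available as a CSV file and treats an empty or whitespace-only value as "missing" (how Graft itself defines a missing entry is not stated here). The function name `rows_missing_fields` and the sample rows are hypothetical; the column names follow the demo fields used in this guide.

```python
import csv
import io


def rows_missing_fields(rows, required_fields):
    """Return (row_number, missing_field_names) for every row that has
    no value in at least one of the required fields."""
    problems = []
    for number, row in enumerate(rows, start=1):
        # Assumption: empty or whitespace-only values count as missing.
        missing = [f for f in required_fields if not (row.get(f) or "").strip()]
        if missing:
            problems.append((number, missing))
    return problems


# Tiny illustrative sample; replace with the contents of your own file.
sample_csv = """product_id,product_category,product_description
1,Outerwear,Insulated winter jacket
2,,Lightweight linen shirt
3,Footwear,
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
problems = rows_missing_fields(rows, ["product_category", "product_description"])
for number, missing in problems:
    print(f"row {number}: missing {', '.join(missing)}")
```

Rows reported by a check like this can be completed or removed before the data source is added, so that every remaining row can receive a prediction.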
Selecting your label source
Once the basic entity configuration is made (which field(s) we want to generate a prediction for), we need to provide some labels to show examples of the prediction values we are looking for.
Graft supports two different methods of using labeled data. The following sections detail the steps for each method. Further information on how to prepare your own label data can be found in Preparing label data.
Using labels in your current data source (in this entity)
The simplest option is to use labels from within the data source you are planning on making a prediction for.
By default, the "in this entity" button is selected.
Choose the field in the entity data source which contains the labels.
Add a name for the Prediction App (Graft will default to the selected field name followed by _prediction).
Samples will be shown below the selections, allowing you to validate that you have made the correct selections: the label field (in this example Season) and the fields selected when configuring the Entity (product_category and product_description).
Using labels in another data source
To use another data source as labels, click on the Somewhere Else radio button.
This second file MUST have a unique key which matches the original file so that each label can be matched up with an entry in the original file.
From the Data source drop down, select the data source you would like to use.
Select the field which matches the primary key in the original data source. For the demo data source the primary key is product_id.
Select the field containing the labels from those available.
Add a name for the Prediction App (Graft will default to the selected field name followed by _prediction).
Samples will be shown below the selections, allowing you to validate that you have made the correct selections: the label field (in this example Season) and the fields selected when configuring the Entity (product_category and product_description).
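Because the label file must carry a unique key that matches the original file, a quick local check of the two files can catch problems before they are added to Graft. This is an illustrative sketch only: `check_label_keys` is a hypothetical helper, the rows are made-up sample data, and it assumes both sources have already been loaded as lists of dictionaries (for example with `csv.DictReader`).

```python
from collections import Counter


def check_label_keys(entity_rows, label_rows, entity_key, label_key):
    """Report label keys that are duplicated or that have no matching
    entry in the original (entity) data."""
    entity_keys = {row[entity_key] for row in entity_rows}
    label_counts = Counter(row[label_key] for row in label_rows)
    return {
        # A key that appears more than once is not unique.
        "duplicate_label_keys": sorted(k for k, n in label_counts.items() if n > 1),
        # A label whose key is absent from the entity data cannot be matched up.
        "labels_without_match": sorted(k for k in label_counts if k not in entity_keys),
    }


# Made-up sample rows using the demo field names from this guide.
entity_rows = [{"product_id": "1"}, {"product_id": "2"}, {"product_id": "3"}]
label_rows = [
    {"product_id": "1", "Season": "Winter"},
    {"product_id": "2", "Season": "Summer"},
    {"product_id": "2", "Season": "Spring"},
    {"product_id": "9", "Season": "Autumn"},
]

report = check_label_keys(entity_rows, label_rows, "product_id", "product_id")
print(report)
```

An empty list for both entries suggests that every label can be matched to exactly one entry in the original data.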
Finalize the APP configuration
Once you have completed the configuration and are happy with the provided samples:
- Click on CREATE APP to start the app creation process, or click CANCEL or the icon to return to the previous step
Graft will now process all of your data (which may take some time depending on the size of your data set) and upon completion will show the App TRY IT tab. Once the data ingestion has been completed you can navigate away from the App UI and return later to review the completed App if required.
TRY IT
The TRY IT tab allows you to experiment with your new app without the need for any coding skills and allows you to test the configuration, filter results and if necessary adjust the app configuration.
The right hand panel shows the current configuration:
- The name of the entity which holds your data to be predicted against
- The Enrichment name and its current performance (see Monitoring for more details)
- IMPROVE PERFORMANCE, a shortcut to Active Learning
- The fields to be used to generate a prediction
- The current status of the data set, with options to sync or export data
- Clicking on IMPROVE PERFORMANCE will start an Active Learning session for the prediction data allowing the user to review and correct predictions and improve their custom prediction model
- Clicking on the EXPORT DATA button will jump to the Project SQL tab allowing the user to export their prediction data, further details below.
- Clicking on the SYNC DATA button will ingest and process all the items associated with the data source. Because of how the predict app is created, all initial data will already have been synced; however, if further data is added to the source, this button can be used when needed.
- Clicking on the icon will schedule daily ingestion and processing of your data source. If you would like to configure more frequent data processing you can enable a processing workflow.
Exporting your predictions
Clicking on the EXPORT DATA button will jump to an automatic SQL query of your prediction results
By default, the SQL will include the fields of your data source in order, plus additional fields related to predictions. This query is limited to 100 entries, but you can remove or adjust this line and tweak the SQL as much as you require.
The additional fields included are...
- target_<entity_name>: the value of the field which is going to be labeled
- <entity_name>_prediction: the predicted value
- <entity_name>_pred_prob: the probability of the generated predicted value
Once satisfied, click RUN QUERY to generate your results, which can then be exported. Further details on Managing Results can be found here.
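Once the results are exported, the probability column can be used to pick out predictions worth a closer look. The sketch below is an assumption-laden illustration rather than a Graft feature: it assumes a CSV export with a header row, column names that follow the `<entity_name>_prediction` and `<entity_name>_pred_prob` pattern listed above, and probabilities expressed between 0 and 1. The entity name `products`, the threshold of 0.6, and the sample rows are all hypothetical.

```python
import csv
import io


def low_confidence_rows(rows, entity_name, key_field, threshold=0.6):
    """Return (key, prediction, probability) for every row whose
    prediction probability is below the threshold."""
    prediction_col = f"{entity_name}_prediction"
    probability_col = f"{entity_name}_pred_prob"
    flagged = []
    for row in rows:
        probability = float(row[probability_col])
        if probability < threshold:
            flagged.append((row[key_field], row[prediction_col], probability))
    return flagged


# Made-up export for a hypothetical entity called "products".
exported_csv = """product_id,target_products,products_prediction,products_pred_prob
1,Insulated winter jacket,Winter,0.93
2,Lightweight linen shirt,Summer,0.88
3,Waterproof trail shoe,Autumn,0.41
"""

rows = list(csv.DictReader(io.StringIO(exported_csv)))
flagged = low_confidence_rows(rows, "products", "product_id")
for key, prediction, probability in flagged:
    print(f"{key}: {prediction} ({probability:.2f})")
```

Rows flagged this way are natural candidates to review in an Active Learning session.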
Adhoc
The ADHOC sub tab within TRY IT allows you to query your new app with text that is not within your data set.
For example, using our demo product catalog you could add a short description of an item and see the prediction generated.
Search in Data Set
The SEARCH IN DATASET sub tab within TRY IT allows you to see the current predictions and filter on specific text.
Type in a suitable phrase and click FILTER.
The fields are laid out in the following sequence...
- Unique Key: the Unique reference value for each element in your data source (e.g. product_id)
- Prediction: the prediction made by the model, in a field named <enrichment_name>_prediction (e.g. mwh_season_pred_1_prediction)
- Embedded Fields: one or more fields used in the creation of the Entity (e.g. product_category & product_description)
- Remaining fields: from the data source: product_title, price etc
API tab
This tab allows you to experiment with an API snippet which can then be copied and used in your own development environment to retrieve the complete prediction results.
You will need an API key to start. If none are available, click on the Create an API key link and follow these instructions. If there are one or more API keys in the project, you can switch between them using the API key drop down menu. Click on MANAGE KEYS to add or delete keys.
The API snippet is controlled from the configuration bar on the right.
Changes made to the configuration are immediately shown in the code.
Use the Output format drop down to switch between CSV (default), TSV or JSONL (JSON Lines) formats.
The Entity predict adhoc field allows you to enter your test text for each field you selected earlier in the configuration of the Predict App.
In this example there are 2 fields, product_description and product_category. Replace the default "query text" with an appropriate entry.
Ensure that all fields have an appropriate entry or Graft will return poor results.
Once you have added an API key click on TEST REQUEST to run the API query.
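Whichever output format you choose, the response text can be turned into a list of records in your own code. The following sketch only covers that parsing step and makes no call to Graft: `parse_response` is a hypothetical helper, the sample payloads and their field names are invented, and it assumes that CSV and TSV responses begin with a header row.

```python
import csv
import io
import json


def parse_response(text, output_format):
    """Parse response text in CSV, TSV or JSONL form into a list of dicts."""
    fmt = output_format.lower()
    if fmt == "jsonl":
        # JSON Lines: one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if fmt in ("csv", "tsv"):
        delimiter = "," if fmt == "csv" else "\t"
        # Assumption: the first line is a header row naming the fields.
        return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    raise ValueError(f"Unsupported output format: {output_format}")


# Invented sample payloads carrying the same single record in each format.
csv_text = "product_id,prediction\n1,Winter\n"
tsv_text = "product_id\tprediction\n1\tWinter\n"
jsonl_text = '{"product_id": "1", "prediction": "Winter"}\n'

csv_records = parse_response(csv_text, "CSV")
tsv_records = parse_response(tsv_text, "TSV")
jsonl_records = parse_response(jsonl_text, "JSONL")
print(csv_records, tsv_records, jsonl_records)
```

JSONL keeps value types such as numbers intact, whereas CSV and TSV values arrive as strings and may need converting.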
Copying the Request Code
Click on the icon to the right of the REQUEST heading in the top left hand corner of the code window
Copying the response
Click on the icon to the right of the RESPONSE heading in the top left hand corner of the results window
Settings tab
This tab allows you to delete your app, see which field is being processed for the App and also enable the App for external API access.
You can edit your App's configuration by clicking on the icon. See here for more details on what can be changed.
App API limits
The number of Apps available for external access is limited by your tier. If no APIs are remaining, you will need to disable an existing App API, save the changes, and then enable the App API of interest, or consider upgrading to a higher tier.