Graft supports the use of label data to build Enrichment models for use in Custom prediction apps. Label data can be held either in the data source currently used in the App or in a separate data source.
It is NOT required that all rows in your data have a label. Our "How many labels do I need?" FAQ sets out our recommendations for the number and distribution of labels. What you do need is a representative sample. For example, if you need to label the seasons, it is important to provide labels for all 4 seasons (and all year, if appropriate) so that Graft has reasonable samples of each season. If there are no 'Summer' labels, Graft will not be able to predict entries for the Summer season.
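Before building an Enrichment, it can help to check that every class you expect is represented in your labels. The sketch below is plain Python and is not part of Graft. The sample rows, the `Season` field name, and the helper names `label_counts` and `missing_classes` are assumptions made for illustration.

```python
from collections import Counter


def label_counts(rows, label_field):
    """Count non-empty labels; rows with a missing or blank label are skipped."""
    return Counter(
        row[label_field].strip()
        for row in rows
        if (row.get(label_field) or "").strip()
    )


def missing_classes(rows, label_field, expected):
    """Return the expected classes that have no labeled rows at all."""
    counts = label_counts(rows, label_field)
    return [name for name in expected if counts[name] == 0]


# Hypothetical sample data: some rows are labeled, some are not.
rows = [
    {"product": "Wool scarf", "Season": "Winter"},
    {"product": "Rain jacket", "Season": "Spring"},
    {"product": "Beach towel", "Season": ""},
    {"product": "Fleece gloves", "Season": "Winter"},
    {"product": "Leaf rake", "Season": "Autumn"},
    {"product": "Sunglasses"},
]
expected = ["Spring", "Summer", "Autumn", "Winter"]

counts = label_counts(rows, "Season")
missing = missing_classes(rows, "Season", expected)
print(dict(counts))
print("Classes with no labels:", missing)
```

In this sample no row is labeled 'Summer', so `missing` contains that class. That is the situation described above, where the Summer season could not be predicted.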
In data source labels
For labels in the current data source, you only need to specify the field that holds your labels. For example, using this short sample set, if we need to train a model to recognize the seasonality of a product, we can use the Season field as our label.
Once trained, the model would be able to predict the season for all rows in our data.
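To make the role of the label field concrete, the sketch below separates rows that already carry a label from rows that do not. It illustrates the idea only and does not show how Graft processes data internally. The sample rows and the helper name `split_by_label` are hypothetical.

```python
def split_by_label(rows, label_field):
    """Split rows into those with a non-empty label and those without one."""
    labeled, unlabeled = [], []
    for row in rows:
        value = (row.get(label_field) or "").strip()
        (labeled if value else unlabeled).append(row)
    return labeled, unlabeled


# Hypothetical sample set with a partially filled Season field.
sample = [
    {"product": "Wool scarf", "Season": "Winter"},
    {"product": "Beach towel", "Season": "Summer"},
    {"product": "Rain jacket", "Season": ""},
    {"product": "Sunglasses"},
]

labeled, unlabeled = split_by_label(sample, "Season")
print(len(labeled), "rows carry a label;", len(unlabeled), "rows do not")
```

The labeled rows play the role of examples, and the trained model can then predict a season for every row, including the ones that had no label.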
Independent Label Source file
To use a separate file, there must be a way to link each label to an entry in the primary data source. This is typically a primary key, but it can be any field with unique entries that can be matched.
Unlike with labels in the primary data source, Graft expects each row of the label file to contain a unique ID and a label. The file can contain other fields, but these are not used for building an Enrichment.
Note: If the label file contains additional unique IDs that are not in the main data source, these are not used in the creation of an Enrichment model.
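The linking rules above can be sketched in plain Python to preview what a separate label file would contribute. This is an illustration under stated assumptions, not Graft's actual matching logic. The `sku` key, the `Season` label field, the sample rows, and the helper name `attach_labels` are all hypothetical.

```python
def attach_labels(primary_rows, label_rows, key, label_field):
    """Link labels to primary rows by a shared unique ID.

    Returns the primary rows with the label attached (None where no label
    exists) and the list of label-file IDs that match no primary row.
    """
    label_by_id = {}
    for row in label_rows:
        row_id = row[key]
        if row_id in label_by_id:
            # Each row of the label file is expected to have a unique ID.
            raise ValueError(f"duplicate ID in label file: {row_id!r}")
        # Only the ID and the label are read; other fields are ignored.
        label_by_id[row_id] = row[label_field]

    primary_ids = {row[key] for row in primary_rows}
    unmatched = sorted(set(label_by_id) - primary_ids)

    merged = [
        {**row, label_field: label_by_id.get(row[key])}
        for row in primary_rows
    ]
    return merged, unmatched


# Hypothetical primary data source and label file.
primary = [
    {"sku": "A1", "name": "Wool scarf"},
    {"sku": "A2", "name": "Beach towel"},
    {"sku": "A3", "name": "Rain jacket"},
]
labels = [
    {"sku": "A1", "Season": "Winter", "reviewer": "kim"},
    {"sku": "A2", "Season": "Summer"},
    {"sku": "Z9", "Season": "Autumn"},
]

merged, unmatched = attach_labels(primary, labels, "sku", "Season")
print(merged)
print("Label IDs not in the primary data source:", unmatched)
```

Here the extra `reviewer` field is ignored, the row with ID `Z9` matches nothing in the primary data and so contributes no label, and `A3` remains unlabeled.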