Graft supports several options for Entity embeddings. The most commonly used is a single field embedding, but others may be helpful depending on your use case. This article describes the available options.
General notes
Graft selects a default model based on the field type. You can change it by selecting another option from the drop-down menu; see the model section of this help center.
In each case you may change the Embedding Strategy based on your needs.
Single field Embedding
The simplest option is to embed a single field. The resulting embeddings can be exported, searched, or used to generate predictions (by applying an Enrichment).
- In the Create Entity workflow, select a single field from the available list.
- Click FINISH to complete the Entity set-up
The Embedding configuration will be shown at the top of the Entity Dashboard, listing the field, model and embedding strategy.
At this point you can select one of the preconfigured processing workflows.
Multiple field embedding
Creating a single embedding from two or more fields concatenates the results of each field into a single embedding space. The fields can be of the same or different modalities.
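As a rough illustration of what concatenation means here, the sketch below joins per-field vectors end to end. The field names, the toy vectors, and the `concat_embeddings` helper are hypothetical and are not part of Graft; the actual models and vector sizes depend on your configuration.

```python
from typing import Dict, List, Sequence


def concat_embeddings(field_vectors: Dict[str, Sequence[float]],
                      field_order: Sequence[str]) -> List[float]:
    """Join per-field vectors end to end, in a fixed field order."""
    combined: List[float] = []
    for name in field_order:
        combined.extend(field_vectors[name])
    return combined


# Hypothetical per-field embeddings (toy values, not real model output).
vectors = {
    "title": [0.1, 0.2, 0.3],   # e.g. produced by a text model
    "thumbnail": [0.9, 0.8],    # e.g. produced by an image model
}

entity_vector = concat_embeddings(vectors, ["title", "thumbnail"])
print(len(entity_vector))  # 5: the combined length is the sum of the field lengths
```

In this sketch the combined vector is as long as its parts put together, which is why an embedding built from several fields is larger than one built from a single field.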
Benefits
- Build more semantically rich representations of entities by combining information from different fields
- Entity representations can be more robust to bad data (if the fields overlap in information content, you get some redundancy)
Notes:
- Ensure that your data is complete when using multiple fields. Data MUST exist in each field to be embedded, or the embedding will be skipped.
- Multi-field embeddings are larger and take longer to process.
- Select 2 or more fields from those selected in the earlier step
- Click FINISH to complete the Entity set-up
The Entity Dashboard will show the resulting Entity configuration.
At this point you can select one of the preconfigured processing workflows.
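The completeness requirement in the notes above can be checked on your source data before you create the Entity. The sketch below only illustrates the rule, assuming that absent values and empty strings both count as missing data; the record layout and helper names are hypothetical, and Graft's own checks may differ.

```python
from typing import Any, Dict, List, Sequence, Tuple

Record = Dict[str, Any]


def has_all_fields(record: Record, fields: Sequence[str]) -> bool:
    """True when every selected field holds a non-empty value (assumed rule)."""
    return all(record.get(name) not in (None, "") for name in fields)


def split_by_completeness(records: Sequence[Record],
                          fields: Sequence[str]) -> Tuple[List[Record], List[Record]]:
    """Separate records that can be embedded from those that would be skipped."""
    complete: List[Record] = []
    skipped: List[Record] = []
    for record in records:
        (complete if has_all_fields(record, fields) else skipped).append(record)
    return complete, skipped


# Toy records for illustration only.
records = [
    {"id": 1, "title": "Red kettle", "description": "Stovetop, 2 litre"},
    {"id": 2, "title": "Blue mug", "description": ""},   # empty field
    {"id": 3, "title": "Green bowl"},                     # field absent
]

complete, skipped = split_by_completeness(records, ["title", "description"])
print([r["id"] for r in complete])  # [1]
print([r["id"] for r in skipped])   # [2, 3]
```

A check like this shows how many records would be left without a multi-field embedding, so gaps can be filled before processing.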
Multiple embeddings
This option lets you create 2 or more independent Embeddings within a single Entity. Each embedding can use a single field or multiple fields, as described earlier in this document.
This is helpful when you want to experiment with embedding strategies: you can generate 2 or more embeddings and compare their performance on search or classification tasks.
For example, build an enrichment (classifier) for each embedding, each with a different embedding strategy, to see which one performs best before moving the most accurate into production.
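The comparison described above can be pictured with a small offline experiment. The sketch below is not Graft's Enrichment; it substitutes a simple nearest-centroid classifier and invented toy vectors, and assumes you have exported the vectors of each embedding together with a label. All helper names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# (label, {embedding name: vector})
Example = Tuple[str, Dict[str, List[float]]]


def centroids(train: Sequence[Example], name: str) -> Dict[str, List[float]]:
    """Average the training vectors of one embedding, per label."""
    grouped: Dict[str, List[List[float]]] = defaultdict(list)
    for label, vectors in train:
        grouped[label].append(vectors[name])
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def accuracy(train: Sequence[Example], test: Sequence[Example], name: str) -> float:
    """Fraction of test examples whose nearest centroid carries the right label."""
    centers = centroids(train, name)

    def predict(vector: List[float]) -> str:
        # Nearest centroid by squared Euclidean distance.
        return min(
            centers,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(vector, centers[label])),
        )

    hits = sum(predict(vectors[name]) == label for label, vectors in test)
    return hits / len(test)


# Invented toy data: two labels, two embeddings per example.
train = [
    ("A", {"my_embedding": [1.0, 0.0], "my_embedding_1": [0.5, 0.5]}),
    ("A", {"my_embedding": [0.9, 0.1], "my_embedding_1": [0.4, 0.6]}),
    ("B", {"my_embedding": [0.0, 1.0], "my_embedding_1": [0.5, 0.4]}),
    ("B", {"my_embedding": [0.1, 0.9], "my_embedding_1": [0.6, 0.5]}),
]
test = [
    ("A", {"my_embedding": [0.8, 0.2], "my_embedding_1": [0.6, 0.4]}),
    ("B", {"my_embedding": [0.2, 0.8], "my_embedding_1": [0.5, 0.4]}),
]

scores = {name: accuracy(train, test, name) for name in ("my_embedding", "my_embedding_1")}
best = max(scores, key=scores.get)
print(scores)  # {'my_embedding': 1.0, 'my_embedding_1': 0.5}
print(best)    # my_embedding
```

With real data you would use a larger held-out set and the same split for every embedding, so that the only thing varying between runs is the embedding strategy.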
- To add a second Embedding, click the icon under the initial Embedding
- Check the field(s) that you require for the second embedding
Each embedding is given a unique name: my_embedding, then my_embedding_1, and so on.
- Click FINISH to complete the Entity set-up
The Entity Dashboard will show the resulting Entity configuration.
- Click the arrows to step through the configuration of each embedding
At this point you can select one of the preconfigured processing workflows.
Note: At this time workflows only run against a single embedding; processing of additional embeddings must be initiated manually from the Entity Dashboard.