Usage
Use this strategy when most of your text is longer than approximately 400 words and your primary goal is to perform similarity or semantic searches on the text.
Methodology
The chunk and token average strategy splits the text into chunks of n tokens, where n is dependent on the model chosen; see the Trunk Models documentation for more details. Within each chunk, the individual token embeddings are extracted and averaged to produce a single embedding for that chunk. The embeddings for all chunks are then also averaged, yielding one embedding for the whole text.
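The two averaging steps can be sketched as follows. This is a minimal illustration, assuming per-token embeddings are already available as a NumPy array of shape (num_tokens, dim); the function name `chunk_and_average` is a hypothetical helper, not part of any documented API.

```python
import numpy as np

def chunk_and_average(token_embeddings: np.ndarray, chunk_size: int) -> np.ndarray:
    """Illustrative sketch of the chunk-and-token-average strategy.

    token_embeddings: (num_tokens, dim) array of per-token embeddings.
    chunk_size: n, the number of tokens per chunk (model-dependent).
    Returns a single (dim,) embedding for the whole text.
    """
    # Split the token sequence into chunks of n tokens
    # (the final chunk may be shorter).
    chunks = [token_embeddings[i:i + chunk_size]
              for i in range(0, len(token_embeddings), chunk_size)]
    # First average: token embeddings within each chunk.
    chunk_embeddings = np.stack([chunk.mean(axis=0) for chunk in chunks])
    # Second average: chunk embeddings into one text-level embedding.
    return chunk_embeddings.mean(axis=0)
```

Note that when every chunk holds exactly n tokens, the result equals a plain mean over all tokens; when the final chunk is shorter, its tokens receive proportionally more weight because each chunk contributes equally to the second average.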