New and improved embedding mannequin

Unification of capabilities. Now we have considerably simplified the interface of the /embeddings endpoint by merging the 5 separate fashions proven above (text-similarity, text-search-query, text-search-doc, code-search-text and code-search-code) right into a single new mannequin. This single illustration performs higher than our earlier embedding fashions throughout a various set of textual content search, sentence similarity, and code search benchmarks.

Longer context. The context size of the brand new mannequin is elevated by an element of 4, from 2048 to 8192, making it extra handy to work with lengthy paperwork.

Smaller embedding measurement. The brand new embeddings have solely 1536 dimensions, one-eighth the scale of davinci-001 embeddings, making the brand new embeddings more economical in working with vector databases.

Diminished worth. Now we have lowered the worth of latest embedding fashions by 90% in comparison with previous fashions of the identical measurement. The brand new mannequin achieves higher or comparable efficiency because the previous Davinci fashions at a 99.8% decrease worth.

Total, the brand new embedding mannequin is a way more highly effective instrument for pure language processing and code duties. We’re excited to see how our clients will use it to create much more succesful functions of their respective fields.