Can anyone explain why #opensource embedding models perform so much better than closed source? #machinelearning
Today we open-weight (Apache 2.0) released the two best embedding models for ecommerce search and recommendations available anywhere. Marqo ecommerce models significantly outperform models from Amazon, Google, Cohere and Jina (see below). Fun fact: we had to create a significantly smaller and easier evaluation dataset just to accommodate some of the private models! + Up to 88% improvement on the best private model, Amazon-Titan-Multimodal (and better than Google Vertex, Cohere). + Up to 31% improvement on the best open source model, ViT-SO400M-14-SigLIP. + 5ms single text/image inference (A10g). + Up to 231% improvement over other bench-marked models (see blog below). + Evaluated on over 4M products across 10,000's of categories. + Detailed performance comparisons across three major tasks: Text2Image, Category2Image, and AmazonProducts-Text2Image. + Released 2 evaluation datasets: GoogleShopping-1m and AmazonProducts-3m. + Released evaluation code. + Apache 2.0 model weights available on Hugging Face and to test out on Hugging Face Spaces. Blog: https://lnkd.in/gNAypw-K Hugging Face: https://lnkd.in/gv3PGg5W GitHub: https://lnkd.in/gedNJsAu