TensorFlow Similarity is a Python package designed to make training similarity models easy and fast on TensorFlow, the open source end-to-end machine learning platform.

The ability to search for related items has many real-world applications, the TensorFlow team points out: from finding similar-looking clothes, to identifying the song that is currently playing, to helping rescue missing pets.

More generally, being able to quickly retrieve related items is a vital part of many core information systems, such as multimedia search, recommender systems and clustering pipelines.

Under the hood, many of these systems are powered by deep learning models trained using contrastive learning. Contrastive learning teaches the model to learn an embedding space in which similar examples are close together, while dissimilar ones are far apart. For example, images belonging to the same class are pulled together, while distinct classes are pushed away from each other.
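To make the idea concrete, here is a minimal sketch of one classic pairwise formulation of a contrastive loss, written in plain Python. It is an illustration only: the function names, the margin value and the 2-D embeddings are assumptions made for this example, and the losses actually shipped with TensorFlow Similarity may be formulated differently.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_class, margin=1.0):
    """Pairwise contrastive loss for a single pair of embeddings.

    Similar pairs are penalized by their squared distance, which pulls them
    together. Dissimilar pairs are penalized only while they are closer than
    `margin`, which pushes them apart.
    """
    d = euclidean_distance(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings, chosen only for illustration.
cat_1 = [0.0, 0.0]
cat_2 = [0.3, 0.4]  # distance 0.5 from cat_1
dog_1 = [0.6, 0.8]  # distance 1.0 from cat_1

loss_similar_pair = contrastive_loss(cat_1, cat_2, same_class=True)
loss_close_dissimilar = contrastive_loss(cat_1, cat_2, same_class=False)
loss_far_dissimilar = contrastive_loss(cat_1, dog_1, same_class=False)
```

In this sketch a dissimilar pair that already sits at the margin or beyond contributes essentially no loss, so training effort goes into the pairs that are still too close.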

When applied to an entire dataset, the TensorFlow team goes on to explain, contrastive losses allow a model to learn how to project items into the embedding space so that the distances between embeddings are representative of how similar the input examples are.

At the end of training, you end up with a well-clustered space in which the distance between similar items is small and the distance between dissimilar items is large.

Once the model is trained, you build an index that contains the embeddings of the various items you want to make searchable. Then, at query time, TensorFlow Similarity uses fast Approximate Nearest Neighbor (ANN) search to retrieve the closest matching items from the index in sub-linear time.
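The index-then-query workflow can be sketched in a few lines of plain Python. Note that this sketch deliberately swaps in an exact brute-force scan for the ANN search: it compares the query against every stored embedding, so it runs in linear time, whereas an ANN library trades a little exactness for sub-linear lookups. The helper names (`build_index`, `lookup`), the labels and the embeddings are hypothetical and are not the TensorFlow Similarity API.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


def build_index(embeddings, labels):
    """Pair each embedding with its label.

    A real ANN index would build a search structure here instead of a list.
    """
    return list(zip(embeddings, labels))


def lookup(index, query, k=2):
    """Return the k (label, distance) pairs closest to the query, nearest first."""
    scored = [(label, cosine_distance(query, emb)) for emb, label in index]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical embeddings produced by an already-trained model.
index = build_index(
    embeddings=[[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]],
    labels=["shoe", "shoe", "hat", "hat"],
)

neighbors = lookup(index, query=[0.8, 0.2], k=2)
neighbor_labels = [label for label, _ in neighbors]
```

Here the query points in roughly the same direction as the two "shoe" embeddings, so both of its nearest neighbors carry that label.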

This lookup is fast and yields high retrieval accuracy, the TensorFlow team stresses, and it is also easier to scale. The Approximate Nearest Neighbor indexing system built into TensorFlow Similarity, which is based on NMSLIB, makes it possible to search over millions of indexed items, retrieving the top-K similar matches in a fraction of a second.

In addition to accuracy and retrieval speed, the other major advantage of similarity models is that they allow you to add an unlimited number of new classes to the index without having to retrain. You only need to compute the embeddings for representative items of the new classes and add them to the index.

This ability to dynamically add new classes is particularly useful when tackling problems where the number of distinct items is unknown ahead of time, constantly changing, or extremely large.
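The following sketch illustrates why no retraining is needed: the embedding function stays frozen, and a new class becomes searchable as soon as one representative embedding is added to the index. Everything here is an assumption made for illustration, including the `EmbeddingIndex` class, the `embed` stand-in for a trained model, and the class names; none of it is the TensorFlow Similarity API.

```python
import math


class EmbeddingIndex:
    """Toy in-memory index (hypothetical, for illustration only)."""

    def __init__(self):
        self._items = []  # list of (embedding, label) pairs

    def add(self, embedding, label):
        """Make one more embedding searchable; the model is not touched."""
        self._items.append((embedding, label))

    def nearest_label(self, query):
        """Label of the stored embedding with the smallest Euclidean distance."""
        _, label = min(self._items, key=lambda item: math.dist(item[0], query))
        return label


def embed(features):
    """Stand-in for a frozen, already-trained embedding model (assumption).

    It only L2-normalizes its input; a real model would be a neural network.
    """
    norm = math.sqrt(sum(x * x for x in features))
    return [x / norm for x in features]


index = EmbeddingIndex()
index.add(embed([1.0, 0.0, 0.0]), "cat")
index.add(embed([0.0, 1.0, 0.0]), "dog")

# A query from a class the index does not know yet.
query = embed([0.1, 0.2, 2.0])
label_before = index.nearest_label(query)  # forced onto a known class

# Add a single representative of the new class; `embed` is unchanged.
index.add(embed([0.0, 0.0, 1.0]), "rabbit")
label_after = index.nearest_label(query)
```

Before the new entry is added, the query can only be matched to one of the known classes; afterwards the same query resolves to the new class, with the embedding function left exactly as it was.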

TensorFlow Similarity provides all the components needed to make similarity training, evaluation, and querying intuitive and easy. In particular, TensorFlow Similarity introduces SimilarityModel(), a new Keras model that natively supports embedding indexing and querying. This allows you to perform end-to-end training and evaluation quickly and efficiently.

You can start experimenting with TensorFlow Similarity through the "Hello World" tutorial published online; all further information is available in the project's GitHub repository.
