Hi Anna, I mentioned the two options for the loss in the article, but in the code I decided to explore a simple alternative. We can take the dot product of two normalized embedding vectors (which is their cosine similarity). If they are from the same class, we push the dot product toward 1 (similar); otherwise we push it toward 0 (not similar). This encourages embeddings of different classes to be orthogonal to each other.
To bring the dot product closer to the target output of 1 or 0, I used MSE as the loss. I mentioned this in "Part 2: Create the model". Let me know if this answers your question.
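In case it helps, here is a minimal NumPy sketch of that idea. The function names and the toy vectors are made up for illustration; this is not the exact code from the article, which builds the same thing inside the model.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length; eps guards against division by zero.
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def dot_product_mse_loss(emb_a, emb_b, same_class):
    # Hypothetical helper: MSE between the dot product of normalized
    # embeddings and a 1/0 target (1 = same class, 0 = different class).
    a = l2_normalize(np.asarray(emb_a, dtype=float))
    b = l2_normalize(np.asarray(emb_b, dtype=float))
    similarity = np.sum(a * b, axis=-1)  # row-wise dot product
    target = np.asarray(same_class, dtype=float)
    return float(np.mean((similarity - target) ** 2))

# Toy batch of two pairs: the first pair is parallel, the second orthogonal.
emb_a = np.array([[1.0, 0.0], [1.0, 0.0]])
emb_b = np.array([[2.0, 0.0], [0.0, 3.0]])
labels = np.array([1, 0])  # first pair same class, second pair different

loss = dot_product_mse_loss(emb_a, emb_b, labels)
print(loss)
```

With these toy pairs the loss is zero, because the same-class pair already points in the same direction and the different-class pair is already orthogonal. Swapping the labels would give the largest error for this batch.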
You can also check out my other article, on bird song classification using Siamese networks: https://towardsdatascience.com/bird-song-classification-using-siamese-networks-and-dilated-convolutions-3b38a115bc1. I used triplet loss in that one.