
Building Cross-Lingual End-to-End Product Search with Tensorflow · Han Xiao Tech Blog

Open hanxiao opened this issue 8 years ago • 10 comments

https://hanxiao.github.io/2018/01/10/Build-Cross-Lingual-End-to-End-Product-Search-using-Tensorflow/

hanxiao avatar Jan 25 '18 21:01 hanxiao

Migrated from Disqus Mike Leonard commented on 2018-01-19T08:42:31Z

Hey, great blog post and blog. I found it via Hacker News today and wanted to add your blog to my Feedly RSS reader, but it doesn't accept it. It works for almost all sites I've tried, so I thought I'd let you know: you may need to add or fix the RSS feed if you'd like people to be able to follow your blog via Feedly and other RSS readers :)

hanxiao avatar Jan 26 '18 08:01 hanxiao

Migrated from Disqus Han Xiao commented on 2018-01-19T22:47:36Z

The RSS feed should work fine now :)

Hey great blog post and blog. I found it via hackernews today and wanted to add your blog to my Feedly rss reader but it doesn't accept it...

hanxiao avatar Jan 26 '18 09:01 hanxiao

What I really like about this is that it's accessible to a beginner like me, because the steps are clearly defined. You could almost call this a recipe, it's so clear.

Mike4tech avatar Apr 26 '18 08:04 Mike4tech

Thanks for the nice words @Mike4tech!

hanxiao avatar Apr 30 '18 11:04 hanxiao

Wow, thanks for the post. Thanks also for taking the time to note and link to other concepts/terms that are very beneficial for a novice like me.

duythvn avatar May 02 '18 05:05 duythvn

Thank you so much for providing valuable information.

I am using a bidirectional RNN as a query encoder, and the output of the model is a query vector with a shape of (batch_size, timestep, embedding_size).

My question is: how can I resize the query vector to match a document vector, let's say 100 dimensions? I ended up using tf.reshape to (batch_size, -1) followed by tf.layers.dense with 100 units. However, the downside is that this requires batch_size and timestep to be fixed in advance.

Is there a better way to implement this? Otherwise, could you please provide some code for your query encoder? I have tried your dilated RNN but could not figure out how to use it.
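A minimal sketch of my current approach (all sizes here are made up, and I'm using Keras-style layers just for illustration), which shows why the static shapes are needed:

```python
import tensorflow as tf

# hypothetical sizes, just for illustration
batch_size, timestep, embedding_size = 4, 7, 256

# stand-in for the bidirectional-RNN output: [batch, time, dim]
query = tf.random.normal([batch_size, timestep, embedding_size])

# flatten time and embedding together -> [batch, timestep * embedding_size]
flat = tf.reshape(query, [batch_size, -1])

# project down to the document-vector size (100 dims);
# the layer's weight matrix has timestep * embedding_size rows,
# so timestep cannot vary between batches
dense = tf.keras.layers.Dense(units=100)
query_vec = dense(flat)
```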

Ekkalak-T avatar Jul 26 '18 07:07 Ekkalak-T

@Ekkalak-T you have your query represented as a [B, T, D]-shaped tensor (batch, time, dimension) and you want to transform it to [B, 100], is that your question?

If it is, you first need to pool over the T axis, e.g. query = tf.reduce_max(query, axis=1), which gives you a [B, D] tensor. Then you do tf.layers.dense(query, units=100).

In this article, I didn't use any fancy pooling strategy; I simply took the last output from the RNN.

For more pooling strategies, please refer to my other post here.
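Here's a minimal sketch of the pooling-then-dense idea (sizes are made up; I'm using Keras' Dense layer for illustration). Because the pooling collapses the time axis, the dense layer no longer depends on timestep:

```python
import tensorflow as tf

# hypothetical sizes, just for illustration
batch_size, timestep, embedding_size = 4, 7, 256
query = tf.random.normal([batch_size, timestep, embedding_size])  # [B, T, D]

# max-pooling over the time axis removes the dependence on T -> [B, D]
pooled = tf.reduce_max(query, axis=1)

# dense projection to the document-vector size -> [B, 100]
query_vec = tf.keras.layers.Dense(units=100)(pooled)

# taking the last output of the RNN (what the article does) also gives [B, D]
last = query[:, -1, :]
```

Since the dense layer only sees a [B, D] input, the same weights work for queries of any length.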

hanxiao avatar Jul 28 '18 11:07 hanxiao

@hanxiao Thank you so much for the solution. By the way, could you please give me some advice on how to run inference with the model in detail? Right now I've been using metric_p as the relevance score of each document when comparing against a single query. I also asked this question in a different thread: https://hanxiao.github.io/2017/11/08/Optimizing-Contrastive-Rank-Triplet-Loss-in-Tensorflow-for-Neural/

Ekkalak-T avatar Aug 06 '18 04:08 Ekkalak-T

@Ekkalak-T It is answered in the other post.

hanxiao avatar Aug 07 '18 02:08 hanxiao

Great work, Hanxiao. And so happy to see your work moving towards NLU.

munaAchyuta avatar Sep 07 '18 07:09 munaAchyuta