Report

Efﬁcient Retrieval of Recommendations in a Matrix Factorization Framework Noam Koenigstein Parikshit Ram Yuval Shavitt School of Electrical Engineering Computational Science & Engineering School of Electrical Engineering Tel Aviv University Georgia Institute of Technology Tel Aviv University Motivation • In the field of Recommender System, Matrix Factorization (MF) models have shown superior accuracy for recommendation tasks. E.g., The Netflix Prize, KDD-Cup’11, etc. • Training is fast. Computing test scores is fast. But… Retrieval of Recommendations (RoR) is s--l--o--w ! • This problem is well known in the industry, yet never been approached before in academia! ITEMS 2 Yahoo! Music: 2 1M Users 625K Items 4 5 U S E R S 3 3 Naïve Multithreading: High latency + wasteful 2 .... 4 1 2 2 5 6 Tera elements ~300 multiplications ~5 days CPU Reduction to Inner Product = = cos = ∈ T =1 Core problem: Given a user vector and a set of item, find an item vector that will maximize () () () () () () () () . . . . . . () () 1 Best Matches Algorithms • Metric Space • Cosine Similarity • Locality Sensitive Hashing Metric Trees R R Branch-and-bound Algorithm Bounding Inner Product Similarity Approximate Solution Users vectors can be normalized Users can be clustered based on their spherical angle! = = cos =1 Relative Error Bound What is the error when recommendations are retrieved based on an approximate user vector? − = ≤1− cos + Δ cos Adaptive Approximate Solution Experimentations Set-up MovieLens Netflix Yahoo! Music 1,000,206 100,480,507 252,800,275 Users 6,040 480,189 1,000,990 Items 3,952 17,770 624,961 95.81% 98.82% 99.96% Ratings Sparsity Yahoo! Music Recommendations: Modeling Music Ratings with Temporal Dynamics and Item Taxonomy Gideon Dror, Noam Koenigstein, Yehuda Koren (RecSys-11`) Exact Alg. Speedup Approximate Alg. Speedup Speedup vs. Precision Speedup vs. MedianRank Conclusions • We introduce a new and relevant research problem • An exact solution with limited speedup • An approximate solution with a tradeoff between accuracy and speedup • Much room for further research…