GLMix: Generalized Linear Mixed Models For Large Scale Response Prediction
published: Sept. 22, 2016, recorded: August 2016, views: 1416
The generalized linear model (GLM) is a widely used class of models for statistical inference and response prediction. For instance, to recommend relevant content to a user or to optimize for revenue, many web companies use logistic regression models to predict the probability that a user clicks on an item (e.g., an ad, news article, or job). When data is abundant, a more fine-grained model at the user or item level can lead to more accurate predictions, because it better captures each user's personal preferences for items and each item's specific appeal to users. One common approach is to introduce ID-level regression coefficients in addition to the global regression coefficients in a GLM setting; such models are called generalized linear mixed models (GLMix) in the statistical literature. However, for big data sets with a large number of ID-level coefficients, fitting a GLMix model can be computationally challenging. In this paper, we report how we overcame the scalability bottleneck by applying parallelized block coordinate descent under the Bulk Synchronous Parallel (BSP) paradigm. We deployed the model in the LinkedIn job recommender system, where it generated 20% to 40% more job applications from job seekers.
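The idea of global plus ID-level coefficients fitted by block coordinate descent can be illustrated with a small single-machine sketch. This is not the authors' implementation: the function names (`fit_logistic`, `fit_glmix`, `predict_proba`), the use of plain gradient descent, the L2 penalties, and all hyperparameter values are assumptions made for illustration. The per-user loop runs serially here; in the approach described above, those independent per-ID updates are the part that would be parallelized under BSP.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, offset, l2, w_init, steps=200, lr=0.5):
    """L2-regularized logistic regression by gradient descent.
    `offset` is a fixed per-row score contributed by the other block."""
    w = w_init.copy()
    for _ in range(steps):
        p = sigmoid(X @ w + offset)
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
    return w

def fit_glmix(X, user_ids, y, n_users, l2_global=0.01, l2_user=0.1, sweeps=5):
    """Alternate between the global block and the per-user blocks.
    Hyperparameters are illustrative assumptions, not values from the paper."""
    w_global = np.zeros(X.shape[1])
    w_user = np.zeros((n_users, X.shape[1]))
    for _ in range(sweeps):
        # Block 1: global coefficients, holding per-user scores fixed.
        user_score = np.einsum("ij,ij->i", X, w_user[user_ids])
        w_global = fit_logistic(X, y, user_score, l2_global, w_global)
        # Block 2: per-user coefficients, holding the global score fixed.
        # Each user's problem is independent of the others given w_global.
        global_score = X @ w_global
        for u in range(n_users):
            rows = np.where(user_ids == u)[0]
            if rows.size == 0:
                continue  # no data for this user; keep zero coefficients
            w_user[u] = fit_logistic(X[rows], y[rows], global_score[rows],
                                     l2_user, w_user[u])
    return w_global, w_user

def predict_proba(X, user_ids, w_global, w_user):
    """Click probability = sigmoid(global score + user-specific score)."""
    return sigmoid(X @ w_global + np.einsum("ij,ij->i", X, w_user[user_ids]))
```

A user with no training rows keeps zero ID-level coefficients, so the prediction for that user falls back to the global model alone.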