The rapid development of deep learning has resulted in impressive performance across many natural language processing (NLP) tasks. Because today’s ubiquitous search systems deal with natural language in user queries and profiles, as well as in the countless documents they crawl on the web, search would seem an ideal environment for deep NLP. Its application in industry search engines, however, faces a number of unique challenges, including serving latency, robustness and effectiveness.
In the new paper Deep Natural Language Processing for LinkedIn Search Systems, a LinkedIn research team studies the use of deep NLP on various representative search engine tasks, aiming to provide useful insights for the development of better industry search engines.
The team summarizes their contributions as:
- To our best knowledge, this is the first comprehensive study for applying deep learning to five representative NLP tasks in search engine productions. For each task, we highlight the difference between classic NLP tasks and search tasks, provide practical solutions, and deploy the models in LinkedIn’s commercial search engines.
- While previous works focus on offline relevance improvement, we strive to reach a balance between latency, robustness and effectiveness. We summarize the observations and best practices across the five tasks into lessons, which could be a valuable resource for the development of search engines and in other industry applications.
This paper poses three questions: (1) When is deep NLP helpful/not helpful in search systems? (2) How to address latency challenges? (3) How to ensure model robustness? To find the answers, the researchers conducted a comprehensive study on the application of deep learning to five representative NLP tasks in the search engine field.
The team first outlines the three main components of a typical search system: language understanding, which extracts important features from the inputs; language generation, which suggests queries that are more likely to lead to desirable search results; and document retrieval & ranking, which produces the final results.
They then conduct experiments on five representative search tasks covering classic NLP challenges: query intention prediction, query tagging, query auto-completion, query suggestion and document ranking.
For query intention prediction, the LinkedIn system predicts the probability of a user’s intent towards seven search verticals: people, job, feed, company, group, school and event. For this task, the study compared convolutional neural network (CNN) and long short-term memory (LSTM) approaches against the production model as the baseline. The results show that both the CNN and LSTM models outperform the production model, indicating that they are more effective at capturing query intent.
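To make the task shape concrete, here is a minimal sketch of a CNN-style intent classifier: word vectors, one convolution layer, max-over-time pooling, and a softmax over the seven verticals. It is not the paper's model. The dimensions, the random untrained weights, and the names `embed` and `predict_intent` are all assumptions made for illustration, so the predicted probabilities carry no meaning.

```python
import math
import random

VERTICALS = ["people", "job", "feed", "company", "group", "school", "event"]
EMBED_DIM, NUM_FILTERS, KERNEL = 4, 3, 2  # hypothetical sizes

rng = random.Random(0)  # fixed seed so the toy weights are reproducible


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Untrained, randomly initialised parameters (assumption: a real system learns these).
embeddings = {}
filters = rand_matrix(NUM_FILTERS, KERNEL * EMBED_DIM)
out_weights = rand_matrix(len(VERTICALS), NUM_FILTERS)


def embed(word):
    """Look up a word vector, creating a random one for unseen words."""
    if word not in embeddings:
        embeddings[word] = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
    return embeddings[word]


def predict_intent(query):
    """Return a probability for each vertical given a raw query string."""
    vectors = [embed(w) for w in query.lower().split()]
    while len(vectors) < KERNEL:  # zero-pad very short queries
        vectors.append([0.0] * EMBED_DIM)

    pooled = []
    for f in filters:
        activations = []
        # Slide the filter over windows of KERNEL consecutive word vectors.
        for i in range(len(vectors) - KERNEL + 1):
            window = [x for vec in vectors[i:i + KERNEL] for x in vec]
            activations.append(max(0.0, sum(a * b for a, b in zip(f, window))))  # ReLU
        pooled.append(max(activations))  # max-over-time pooling

    logits = [sum(w * p for w, p in zip(row, pooled)) for row in out_weights]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {v: e / total for v, e in zip(VERTICALS, exps)}


probs = predict_intent("software engineer jobs")
```

The output is a distribution over verticals rather than a single label, which matches the article's description of predicting "the probability of a user's intent" for each vertical.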
The query tagging task aims to identify named entities in queries. The LinkedIn search system identifies seven such entity types: first name, last name, company name, school name, geolocation, title, and skill. For this task, the team used a semi-Markov conditional random field (SCRF) as the production model and compared it with a bidirectional LSTM-CRF architecture. Here, the traditional SCRF method achieved the best results, indicating that the LSTM’s ability to capture long-distance dependencies is not helpful in this task.
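A semi-Markov model scores whole segments rather than single tokens, so a multi-word entity such as a job title can be tagged as one unit. The sketch below shows only the decoding step (a segment-level Viterbi search) under heavy simplification: the `LEXICON` and `TRANSITION` scores are hand-written hypothetical values rather than learned weights, the label set is a subset of the seven types plus an assumed `other` label, and no training is shown.

```python
# Hypothetical hand-written scores; a trained SCRF would learn these from data.
LEXICON = {
    ("software engineer", "title"): 3.0,
    ("software", "skill"): 1.0,
    ("engineer", "title"): 1.0,
    ("linkedin", "company"): 2.0,
}
TRANSITION = {("skill", "title"): -0.5}  # score for adjacent segment labels
LABELS = ["title", "skill", "company", "other"]
MAX_SEGMENT_LEN = 3


def segment_score(words, label):
    """Score a candidate segment (one or more words) under a label."""
    phrase = " ".join(words)
    if (phrase, label) in LEXICON:
        return LEXICON[(phrase, label)]
    if label == "other" and len(words) == 1:
        return 0.0
    return -1.0 * len(words)  # penalty for unsupported segments


def decode(tokens):
    """Return the highest-scoring list of (segment text, label) pairs."""
    n = len(tokens)
    # best[end][label] = (score, start, prev_label) for the best segmentation
    # of tokens[:end] whose final segment carries `label`.
    best = [dict() for _ in range(n + 1)]
    best[0][None] = (0.0, None, None)
    for end in range(1, n + 1):
        for length in range(1, min(MAX_SEGMENT_LEN, end) + 1):
            start = end - length
            for label in LABELS:
                seg = segment_score(tokens[start:end], label)
                for prev_label, (prev_score, _, _) in best[start].items():
                    score = prev_score + seg + TRANSITION.get((prev_label, label), 0.0)
                    if label not in best[end] or score > best[end][label][0]:
                        best[end][label] = (score, start, prev_label)

    # Follow back-pointers from the best final label.
    label = max(best[n], key=lambda l: best[n][l][0])
    end, segments = n, []
    while end > 0:
        _, start, prev_label = best[end][label]
        segments.append((" ".join(tokens[start:end]), label))
        end, label = start, prev_label
    return segments[::-1]


tagged = decode("linkedin software engineer".split())
```

With these toy scores, the two-word segment "software engineer" as a title outscores tagging "software" and "engineer" separately, which is the kind of segment-level decision a token-level tagger has to reconstruct from per-token labels.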
For query auto-completion, the team applied an unnormalized language model to the candidate ranking phase and compared it to neural language modelling. The results show that the proposed unnormalized language model can reach the same level of relevance performance while significantly reducing latency.
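One way to see where the latency saving can come from: a normalized language model needs a softmax over the entire vocabulary to produce a probability, whereas an unnormalized score needs only the logit of the word being scored. The sketch below illustrates that idea on a single prediction step. It assumes the unnormalized score is simply the raw logit, which may differ from the paper's exact formulation, and the vocabulary, weights and hidden state are hypothetical toy values.

```python
import math

# Hypothetical toy output layer: 4-word vocabulary, 2-dimensional hidden state.
VOCAB = ["software", "engineer", "sales", "manager"]
OUTPUT_WEIGHTS = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-0.2, 0.9]]
OUTPUT_BIAS = [0.0, 0.1, 0.0, -0.1]


def logit(hidden, word_id):
    """Unnormalized score of one word: a single dot product plus bias."""
    return sum(h * w for h, w in zip(hidden, OUTPUT_WEIGHTS[word_id])) + OUTPUT_BIAS[word_id]


def normalized_log_prob(hidden, word_id):
    """Exact log-probability: requires the logits of every vocabulary word."""
    all_logits = [logit(hidden, i) for i in range(len(VOCAB))]
    peak = max(all_logits)  # log-sum-exp with max subtraction for stability
    log_z = peak + math.log(sum(math.exp(x - peak) for x in all_logits))
    return all_logits[word_id] - log_z


def rank(hidden, candidates, score_fn):
    """Order candidate words from highest to lowest score."""
    return sorted(candidates, key=lambda w: score_fn(hidden, VOCAB.index(w)), reverse=True)


hidden_state = [0.7, 0.2]  # hypothetical encoding of the typed prefix
by_logit = rank(hidden_state, VOCAB, logit)
by_log_prob = rank(hidden_state, VOCAB, normalized_log_prob)
```

Within a single step the normalization term is the same for every candidate, so both rankings agree while the unnormalized version does work proportional to the number of candidates instead of the vocabulary size. For multi-word completions the normalization term changes from step to step, so summed logits are only an approximation of the true log-probability.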
For query suggestion, the team tested a seq2seq model in federated search and compared it with LinkedIn’s baseline frequency-based method. The results show that the seq2seq model can find better candidates, while also generating in-domain terminology for novice users.
Finally, on document ranking, the team compared CNN-ranking with the baseline XGBoost method. On this task, the proposed CNN-ranking outperformed the baseline by a large margin.
Based on their experiments, the team concluded that deep NLP can be particularly beneficial for search-related language generation tasks and for data with rich paraphrasing. Deep NLP, however, was not helpful for the query tagging task. The researchers also identify latency as the biggest challenge in this domain, and show that robustness and overfitting issues can typically be handled via careful data analysis.
The paper Deep Natural Language Processing for LinkedIn Search Systems is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang