On August 8, Chinese Q&A website Zhihu, whose name comes from the Mandarin phrase for “Do You Know?”, announced that it had raised US$270 million in its Series E funding round. With 180 million registered users and 110 million answered posts, Zhihu is China’s biggest Q&A website and its fourth-largest social media platform, and is now valued at US$2.5 billion.
Goldman Sachs Managing Director Kaixun Zhang, who took part in this investment round, summed up Zhihu’s positioning:
“When compared with Quora, [Zhihu] has a highly active social network backing its high-quality Q&A content; when compared with Reddit, it focuses on improving the professionality of its content.”
Zhihu brands itself as the biggest knowledge-sharing platform in China. Its typical users are college graduates aged 25 to 35 with monthly salaries ranging from CN¥5,000 to CN¥30,000: a community of brainy users willing to pay for quality content.
Zhihu began to commercialize in 2017, focusing on ad sales and knowledge products. The company also introduced its premium service Zhihu Live, which offers professional knowledge-sharing sessions to paying users.
Upon completion of the company’s latest funding round, Zhihu CEO Yuan Zhou sent a letter to employees:
“We will use AI to cope with upcoming challenges, finding solutions for content creation, content quality control, community management, information recommendation, etc. In the past few years, algorithms have been playing the commander role in core features such as question routing and recommendation. This transforms the original model of relying on an asker’s social graph to connect users and questions, turning it into an automated and large-scale Q&A connection among the entire platform…”
An Overview of Zhihu’s AI Engine
So how does China’s largest Q&A site use AI? In a recent interview with Synced, Zhihu’s team identified key AI-powered operations as content creation, content distribution and consumption, connectivity and management, and layout optimization.
Content Creation
Zhihu’s Q&A system identifies users’ areas of interest based on their search history. If a user has not found a satisfactory answer, the system helps them construct a more clearly organized question. When a new question is posted, a convolutional neural network (CNN) model picks the best match from over 250,000 topics, which in turn links the question to potential answerers in the community and lets the system recommend it to them.
The system performs question routing to find the most competent candidates, inviting them to come up with high-quality answers to the new question. This is a typical learning-to-rank problem in machine learning, which can be approached in three ways: pointwise (score each candidate independently), pairwise (compare candidates two at a time), and listwise (optimize the ordering of the whole candidate list).
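To give a feel for how a CNN can match a question to a topic, here is a toy forward pass in plain Python. It is only a sketch: the vocabulary, the two-dimensional embeddings, the filters and the topic weights are all hand-set, hypothetical values, whereas a production model such as Zhihu's would learn them from data and choose among a far larger topic set.

```python
# Toy text-CNN forward pass for topic matching. All values are hypothetical.

# 2-d word embeddings: dimension 0 ~ "programming", dimension 1 ~ "cooking".
EMBEDDINGS = {
    "python": [1.0, 0.0],
    "code": [0.9, 0.1],
    "bug": [0.8, 0.0],
    "recipe": [0.0, 1.0],
    "bake": [0.1, 0.9],
    "bread": [0.0, 0.8],
}
UNKNOWN = [0.0, 0.0]  # words outside the toy vocabulary contribute nothing

# Two convolution filters of width 2 (each spans two consecutive word vectors).
FILTERS = [
    [[1.0, 0.0], [1.0, 0.0]],  # responds to adjacent "programming" words
    [[0.0, 1.0], [0.0, 1.0]],  # responds to adjacent "cooking" words
]

# One weight row per topic, applied to the pooled filter outputs.
TOPIC_WEIGHTS = {"Programming": [1.0, 0.0], "Cooking": [0.0, 1.0]}


def conv_max_pool(vectors, filt):
    """Slide one filter over the word vectors, apply ReLU, keep the maximum."""
    width = len(filt)
    best = 0.0  # starting at zero acts as the ReLU floor
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        activation = sum(
            w * x
            for row, vec in zip(filt, window)
            for w, x in zip(row, vec)
        )
        best = max(best, activation)
    return best


def best_topic(question):
    """Return the highest-scoring topic and the score of every topic."""
    vectors = [EMBEDDINGS.get(tok, UNKNOWN) for tok in question.lower().split()]
    pooled = [conv_max_pool(vectors, filt) for filt in FILTERS]
    scores = {
        topic: sum(w * p for w, p in zip(weights, pooled))
        for topic, weights in TOPIC_WEIGHTS.items()
    }
    return max(scores, key=scores.get), scores


topic, scores = best_topic("how to fix python code bug")
```

The convolution looks at short windows of neighbouring words and max pooling keeps the strongest signal wherever it appears in the question, which is why this family of models suits short texts like question titles.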
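The three families of learning-to-rank approaches differ mainly in the loss they minimize. The sketch below shows one common loss of each kind in plain Python; these particular losses (squared error, a RankNet-style logistic loss and a ListNet-style cross-entropy) are illustrative choices, not a description of what Zhihu uses.

```python
import math


def pointwise_loss(score, label):
    """Pointwise: squared error between one candidate's score and its label."""
    return (score - label) ** 2


def pairwise_loss(score_better, score_worse):
    """Pairwise: logistic loss, small when the better candidate scores higher."""
    return math.log(1.0 + math.exp(-(score_better - score_worse)))


def softmax(values):
    """Turn raw values into a probability distribution (numerically stable)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores, labels):
    """Listwise: cross-entropy between softmax(labels) and softmax(scores)."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Hypothetical relevance labels for three candidate answerers of one question.
labels = [3.0, 2.0, 1.0]
good_order = listwise_loss([3.0, 2.0, 1.0], labels)
bad_order = listwise_loss([1.0, 2.0, 3.0], labels)
```

In a question-routing setting, each "candidate" would be a potential answerer, and the label would reflect how well that person answered similar questions in the past.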
Content Distribution and Consumption
Search is a typical problem where AI can help, by returning and ranking the most relevant questions or topics based on a user’s query.
Recommendation is a key feature of homepage feeds, providing users with trending stories, content of interest, and related topics recommended by Zhihu. To search and sort the more than 100 million Q&A entries on its site, Zhihu uses a recommender system that consists of hierarchical deep neural networks (DNN). The current DNN model is essentially a regression model for predicting whether a user will be interested in a piece of content. Regression can provide a comprehensive score of user interest based on clicks, reads, time spent reading, positive or negative comments, and so on.
Drawing on a user’s activity history, Zhihu builds a user profile, a content profile and other meta information, which it uses to recall relevant items from its database through a variety of sources (e.g. follows and followers, keywords, videos). Depending on the type of source, an inverted index or an embedding-based lookup is the usual recall method. A fixed number of relevant items are then assigned to queues based on specific tags.
Usually, dozens of such queues are recalled, followed by a DNN-based learning-to-rank step that completes within 100 milliseconds for content recommendation. As a result, most information displayed on Zhihu homepages is highly personalized. Zhihu also strategically adjusts a small portion of the recommended content to avoid redundant entries.
The recommendation system’s algorithm design has gone through a series of changes. The homepage initially used the EdgeRank algorithm for sorting posts. A ranking technology based on gradient boosted decision trees (GBDT) was introduced later. To overcome the limited data capacity of previous algorithms, DNN was then applied to all stages of recall and ranking for content recommendations.
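The two recall methods mentioned here can be sketched in a few lines of Python. The item IDs, keywords and two-dimensional embeddings below are made-up illustrations, and a real system would use an approximate nearest-neighbour index rather than scoring every item.

```python
import math
from collections import defaultdict

# Hypothetical content items with keywords and precomputed embeddings.
ITEMS = {
    "q1": {"keywords": {"python", "recursion"}, "embedding": [0.9, 0.1]},
    "q2": {"keywords": {"bread", "baking"}, "embedding": [0.1, 0.9]},
    "q3": {"keywords": {"python", "testing"}, "embedding": [0.8, 0.3]},
}


def build_inverted_index(items):
    """Map each keyword to the set of item IDs that contain it."""
    index = defaultdict(set)
    for item_id, item in items.items():
        for keyword in item["keywords"]:
            index[keyword].add(item_id)
    return index


def recall_by_keywords(index, user_keywords, limit):
    """Recall items sharing keywords with the user, most overlaps first."""
    hits = defaultdict(int)
    for keyword in user_keywords:
        for item_id in index.get(keyword, ()):
            hits[item_id] += 1
    ranked = sorted(hits, key=lambda item_id: (-hits[item_id], item_id))
    return ranked[:limit]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_by_embedding(items, user_vector, limit):
    """Recall the items whose embeddings are closest to the user's vector."""
    ranked = sorted(
        items,
        key=lambda item_id: (-cosine(items[item_id]["embedding"], user_vector), item_id),
    )
    return ranked[:limit]


index = build_inverted_index(ITEMS)
keyword_queue = recall_by_keywords(index, {"python"}, limit=2)
embedding_queue = recall_by_embedding(ITEMS, [1.0, 0.0], limit=2)
```

The inverted index suits exact signals such as keywords or followed topics, while embedding lookup can surface items that are related in meaning without sharing any literal keyword.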
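A rough picture of this recall-then-rank flow, written as a Python sketch: candidates from several queues are merged and de-duplicated, sorted by a predicted interest score, and capped per topic so the feed does not fill up with near-identical entries. The queue names, scores, topics and the `max_per_topic` rule are hypothetical stand-ins; in particular, the score lookup takes the place of the DNN ranker.

```python
def rank_feed(queues, score_fn, topic_of, feed_size, max_per_topic):
    """Merge recall queues, rank by score, and limit repeats of one topic."""
    # Merge all queues, keeping the first occurrence of each item.
    seen = set()
    candidates = []
    for item_ids in queues.values():
        for item_id in item_ids:
            if item_id not in seen:
                seen.add(item_id)
                candidates.append(item_id)

    # Highest predicted interest first; ties broken by item ID.
    candidates.sort(key=lambda item_id: (-score_fn(item_id), item_id))

    # Fill the feed while skipping items from over-represented topics.
    feed = []
    per_topic = {}
    for item_id in candidates:
        if len(feed) >= feed_size:
            break
        topic = topic_of(item_id)
        if per_topic.get(topic, 0) >= max_per_topic:
            continue
        feed.append(item_id)
        per_topic[topic] = per_topic.get(topic, 0) + 1
    return feed


# Hypothetical recall queues, model scores and topic labels.
QUEUES = {
    "followed_topics": ["a1", "a2", "a3"],
    "keyword_match": ["a2", "a4"],
    "trending": ["a5", "a1"],
}
SCORES = {"a1": 0.9, "a2": 0.8, "a3": 0.7, "a4": 0.6, "a5": 0.5}
TOPICS = {"a1": "python", "a2": "python", "a3": "python", "a4": "baking", "a5": "travel"}

feed = rank_feed(QUEUES, SCORES.get, TOPICS.get, feed_size=4, max_per_topic=2)
```

Here `a3` is dropped even though it outscores `a4` and `a5`, because two "python" items are already in the feed.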
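For readers unfamiliar with EdgeRank, it is usually described publicly as a sum over the interactions ("edges") attached to a post, where each edge contributes affinity × weight × time decay. The Python sketch below follows that description; the field names and the `1 / (1 + age in hours)` decay are assumptions made for illustration, and nothing here reflects how Zhihu actually configured its version.

```python
def edgerank_score(edges, now):
    """Sum affinity * weight * time decay over all edges on a post."""
    total = 0.0
    for edge in edges:
        age_hours = (now - edge["created_at"]) / 3600.0
        decay = 1.0 / (1.0 + age_hours)  # hypothetical decay: older edges count less
        total += edge["affinity"] * edge["weight"] * decay
    return total


# Two hypothetical posts, each with a single interaction; timestamps in seconds.
now = 7200
fresh_post = [{"affinity": 0.5, "weight": 2.0, "created_at": 7200}]
older_post = [{"affinity": 0.5, "weight": 2.0, "created_at": 3600}]

fresh_score = edgerank_score(fresh_post, now)
older_score = edgerank_score(older_post, now)
```

A hand-tuned formula like this has only a few adjustable parts, which helps explain the later moves to GBDT and then DNN models that can learn from much larger volumes of behavior data.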
Users’ average daily time spent on Zhihu increased 50 percent from December 2017 to May 2018, with strong growth corresponding to the application of DNN for recall in January and the introduction of DNN for ranking in March.
Connectivity and Management
A social network community like Zhihu needs both connectivity and management.
With the help of AI, Zhihu can build a graph of each user’s profile information, such as reading history and follows. Using graph embedding techniques, the platform can achieve a better understanding of all users in the community, and connect those who share similar values, hobbies or interests.
To improve the platform’s discussion environment, Zhihu’s online content analysis bot Walle performs real-time detection and purging of bad user behavior such as giving irrelevant answers, spamming or leaving unfriendly comments.
Layout Optimization
Computer vision supported by AI can help optimize user interfaces and render news feeds, including image selection and tailoring, to provide a clear and concise presentation of appropriate information for a better user experience.
For example, ResNet, an accurate and efficient neural network architecture for image recognition, can score images according to content and quality. This helps Zhihu automatically filter thousands of sensitive or low-grade images on a daily basis.
Zhihu is determined to meet the challenges involved in maintaining China’s largest Q&A platform. Zhihu Homepage Project Manager Rui Zhang told Synced, “We named our AI-powered recommender system the ‘Crystal Ball’ because we believe our AI can ‘sense’ what users are seeking, enabling us to make appropriate recommendations.”
Localization: Tingting Cao | Editor: Michael Sarazen, Meghan Han