Dr. Yu Zheng describes urban computing as “playing a giant chess game.”
Dr. Zheng joined JD Finance last month as Vice President and Chief Data Scientist, Urban Computing Business Unit President, and Urban Computing Lab Director. JD Finance is the fintech arm of JD.com, China’s largest e-commerce platform by revenue.
Synced recently spoke with Dr. Zheng about JD Finance’s entry into urban computing.
Why would a fintech company like JD Finance work on urban computing?
Fintech is composed of two keywords: finance and technology. Many people think of JD Finance as a fintech company, but this is not accurate: we provide empowering technology for the finance industry.
JD Finance’s business model is B2B2C. To illustrate this, let’s say we provide a better risk control model for banks, which in turn give better loan services for customers. In this equation JD Finance is the first B (business), the bank is the second B (business), and the end customer is C (customer).
We can also replace the middle B (business) with G (government), providing government institutions with the proper technology to serve their people. In other words, urban computing broadens JD Finance’s existing businesses. The Chinese government’s invitation to JD Finance to enter urban computing areas such as transportation and environment will change the public’s view of our company.
What’s your hiring plan for this new unit?
That is confidential for the time being; we have some upcoming announcements. But I can tell you that we are building sub-divisions focusing on environment and transportation control, and the teams will be big.
What are the ties between JD Finance and urban computing?
Smart commerce. People’s impression of urban computing is environment, transportation, and various types of city planning. But in fact commerce also plays a big role in this panorama, for example in business site selection, real estate assessment, and helping banks with risk management.
For example, if a company seeks a bank loan to build a casino, then the bank will conduct risk assessment on the project itself, whereas in the past, the bank’s approach may have been to just assess the company’s own credit qualifications such as bad debt ratios, creditworthiness, and so on.
Even companies with good qualifications may undertake risky projects, and assessing that risk means factoring in local development and consumption levels, which are only reflected in data. Criteria such as the local spending index, travel methods, and urban infrastructure, including power networks and transportation, all contribute to the decision-making process.
Urban computing uses diverse spatio-temporal data to do calculations such as analysis, prediction, causal analysis, and anomaly detection for a given scenario.
What are JD Finance’s competitive advantages in this space?
We have a lot of data. JD.com has nearly 300 million active users, as shown in our latest financial report. Our huge datasets are composed of product info, user transaction data, and logistics data. Financial figures such as wealth management, payment, and consumption also contribute to our datasets.
In sufficient quantity, data can accurately reflect a city’s economic well-being. Logistics data, for example, maps the commercial flow of an area and its business relations with surrounding areas. We have this kind of data in abundance, which is both rare and valuable.
What are the technical difficulties for urban computing compared to other AI technologies?
Urban computing relies on spatio-temporal data, which is neither video, image, nor text. It has its own data management methods and AI algorithms. In other words, you can’t solve the problems at hand by throwing in a CNN or LSTM alone. Spatio-temporal attributes, including time trends, periods, and proximity, as well as spatial distance and spatial gradation, are characteristics that cannot be grasped by commonly used algorithms.
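To make the temporal attributes mentioned here more concrete, the following is a minimal sketch of how a forecasting model might choose which past time steps to look at: the most recent steps (proximity), the same hour on previous days (period), and the same hour in previous weeks (trend). The function name `temporal_indices`, its parameters, and the hourly sampling rate are assumptions made for illustration, not details from the interview.

```python
def temporal_indices(t, closeness=3, period=2, trend=1, steps_per_day=24):
    """Return the past time steps used as inputs when predicting step t.

    Assumes one observation per hour (steps_per_day=24); all defaults
    are hypothetical values chosen for illustration.
    """
    steps_per_week = 7 * steps_per_day
    # Proximity: the steps immediately before t.
    recent = [t - i for i in range(1, closeness + 1)]
    # Period: the same hour on the preceding days.
    daily = [t - i * steps_per_day for i in range(1, period + 1)]
    # Trend: the same hour in the preceding weeks.
    weekly = [t - i * steps_per_week for i in range(1, trend + 1)]
    return {"closeness": recent, "period": daily, "trend": weekly}


idx = temporal_indices(200)
print(idx)
```

A plain sequence model sees only a contiguous window of recent steps. Selecting inputs at daily and weekly lags is one common way to expose periodic structure explicitly.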
There are also multiple data sources. For instance, the casino case we mentioned above demands the use of POI and road network data points, plus a lot of other data such as environmental and spending data, which jointly predict future changes.
Multivariate data fusion is difficult, and it is also a relatively new discipline and research direction in machine learning. How can data from different fields be combined so that 1+1 is greater than 2? This is very difficult.
At the same time, urban computing is not a simple cloud computing problem. Cloud computing platforms can’t support such spatio-temporal data. The data structures and query methods for spatio-temporal data, as well as the multi-source data fusion and indexing mechanisms just described, do not yet exist on these platforms.
The cloud service providers currently on the market are not suitable for urban computing. Service providers must undergo a special technical build-up in order to manage, analyze, and tap into spatio-temporal big data and form a dynamic closed loop. It is very difficult, and the threshold is also very high.
Can you explain the complexity a bit more?
Let me give you a concrete example: traffic light control is more difficult to tackle than AlphaGo. AlphaGo faces a 19×19 grid, and each point on the grid has only three possible states: black, white, or empty.
Yet there are tens of thousands of intersections with traffic lights in Beijing, and the status and actions of each intersection have more possibilities. Traffic may be flowing at 40 km/h, 45 km/h, or 30 km/h; signal light timing may be 30 seconds for the red light or 20 seconds for the green light. All of these are changing continuously. We are also missing data. Roads with no pedestrians, cars, or sensors do not give us any data. The road is also an open system. One man, or even a dog, crossing the road will alter the traffic state.
Therefore, urban problems have a large state space, a large action space, and an open system. This is certainly much more difficult to solve than problems on a Go board.
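The size comparison can be roughed out with some arithmetic. The sketch below counts joint configurations as (states per site) raised to (number of sites) and reports the result as a power of ten. For Go, 3 states on 361 points gives an upper bound that includes illegal positions. For traffic, the 10,000 intersections and 4 discrete states per intersection are hypothetical figures chosen for illustration. As described above, the real quantities vary continuously, so the true space is larger still.

```python
import math


def log10_state_count(num_sites, states_per_site):
    """Return log10 of states_per_site ** num_sites (joint configurations)."""
    return num_sites * math.log10(states_per_site)


# Go: 19 x 19 points, each black, white, or empty (upper bound).
go_log10 = log10_state_count(19 * 19, 3)

# Traffic: hypothetical 10,000 intersections with 4 discrete states each.
traffic_log10 = log10_state_count(10_000, 4)

print(f"Go board: about 10^{go_log10:.0f} configurations")
print(f"Traffic lights: about 10^{traffic_log10:.0f} configurations")
```

Even under this deliberately coarse discretization, the exponent for the traffic network is far larger than the one for the Go board, and that is before accounting for missing data or the open-system effects.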
Let me give another example. In urban population flow forecasting, we divide cities into a number of grids, and we want to predict how many people will enter and exit each grid.
The flow of people in a grid is related not only to how many people have entered and exited in the previous hour, but also to how many people are moving in and out of neighboring grids. You also want to loop in areas that are far away, taking into consideration that when big events happen, people will emerge from subway exits. If you only rely on local changes around the grid, you can’t predict bigger things happening, say tragic incidents such as a stampede.
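As a small illustration of the quantities being predicted, the sketch below counts how many people enter and exit each grid cell, given each person's cell at successive time steps. The function name `count_flows`, the input format, and the sample trajectories are all made up for illustration. For simplicity it aggregates over the whole observation window rather than per hour, and it does not attempt the forecasting step itself.

```python
from collections import Counter


def count_flows(trajectories):
    """Count entries into and exits from each grid cell.

    trajectories: one list per person, holding the (row, col) cell
    that person occupies at each successive time step.
    """
    inflow, outflow = Counter(), Counter()
    for cells in trajectories:
        # Compare each position with the next one along the trajectory.
        for prev, curr in zip(cells, cells[1:]):
            if prev != curr:
                outflow[prev] += 1  # the person left the previous cell
                inflow[curr] += 1   # and entered the current one
    return inflow, outflow


# Hypothetical trajectories of three people over three time steps.
trajectories = [
    [(0, 0), (0, 1), (0, 1)],
    [(1, 1), (0, 1), (0, 2)],
    [(0, 0), (0, 0), (0, 1)],
]
inflow, outflow = count_flows(trajectories)
print(dict(inflow))
print(dict(outflow))
```

A forecasting model would take these per-cell counts, together with the counts of neighboring and distant cells, as inputs for predicting the next time step.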
What qualities does a company need to succeed in urban computing?
It’s data and team. Urban computing relies on good databases and data resources. Everyone thinks that governments have a lot of data, but this is not the case. In many cases, they need industry data to support their own decision-making processes and solve problems.
Also important is the team. People say AI is a talent war, but it’s definitely not a war of numbers. A team of one hundred is not necessarily better than a team of ten. But there are many times when you can’t solve a problem yourself and so rely on others for inspiration. If we are stuck with missing critical data, perhaps an “a-ha!” moment from a teammate will provide alternative measures. This is what the word “talent” stands for.
A good engineer can work on ten projects at the same time, while a mediocre team might use one hundred people without any results. We realize the importance of talent, and so we have invested heavily in bringing together top-notch talent.
Prior to joining JD Finance, Dr. Zheng led the urban computing team at Microsoft Research Asia, publishing profusely in this area, while also holding the positions of Chair Professor of Shanghai Jiaotong University, Guest Professor of Hong Kong University of Science and Technology, and Editor-in-Chief of ACM TIST.
Journalist: Meghan Han | Editor: Michael Sarazen