“In epidemiology analysis, the relationship between people in space and time is very important,” explains Bao Jie, head of the JD Group city data management platform, “and human activity trajectories are a typical form of spatiotemporal data.”
Part 1. How the response to an 1854 London cholera outbreak informed Wuhan strategy
Wuhan announced a citywide lockdown at 2:00 am on January 23, 2020, suspending all buses, subways, ferries, and long-distance travel by air or rail. However, data released by the Wuhan Railway Bureau of China shows that nearly 300,000 people left the city by rail before the lockdown came into full effect, with at least 251 trains departing Wuhan station on the day of the announcement.
Realizing the severity of the situation, JD Group vice president Zheng Yu pulled together an R&D team of 30. JD Data Management department head Bao Jie was appointed to lead the team. Bao did his PhD under Dr. Mohamed Mokbel, a renowned professor in the field of spatiotemporal databases, and has published over 40 research papers in journals and conferences such as SIGKDD, ICDE, VLDB, and AAAI. Bao’s main research interests include spatiotemporal data management/mining and distributed computing platforms.
There were two main challenges facing the JD team: determining how many people were leaving the disease epicentre and where they were heading, and tracing their close contacts in order to isolate them from other citizens and stop the spread. Bao decided to make use of spatiotemporal big data, inspired by a cholera outbreak he had studied in the first year of his PhD.
The Broad Street cholera outbreak of 1854 killed 127 Londoners in three days. All the victims lived on or near Broad Street in the city’s Soho district. By the following week, three quarters of area residents had fled.
Physician John Snow counted the number of deaths in each household during the outbreak and plotted their geographical locations on a map. His subsequent analysis suggested the Broad Street deaths could be related to the local water pump. Authorities shut down the pump and the spread of the disease was contained.
“The spread of cholera could be traced using spatiotemporal data analysis. This is the most classic example of spatiotemporal big data analysis,” Bao explains.
Part 2. Designing the right system to track down COVID-19
The JD Group had launched its JD Urban Spatiotemporal Data Engine in November 2019 as part of its delivery and logistics platform, intending to use it to analyze the trajectories of couriers, identify missing networks, predict road traffic, recommend courier travel routes and improve overall delivery efficiency.
Bao’s team set to work adapting the system for its new role: tracking the coronavirus outbreak. More than 30 people from the JD Urban Data Team participated in the “anti-epidemic” project, including two data developers and an algorithm engineer who were stranded in Hubei and worked remotely. Typically the team was on conference calls from about 10 am until late evening. On busy days, as Bao recalls, “often it was 4 or 5 am”.
The team set up a “JD Epidemic Prevention and Control Technical Support System” powered by federated learning, homomorphic encryption, and other data protection technologies. The system linked telecommunications, government, public security systems and enterprise data to create an AI system to help predict and control COVID-19 spread.
The team also designed a complete SQL engine so that all operations could be expressed as SQL-like statements, lowering the barrier for operators and increasing system flexibility. The system exposes query categories including “time range”, “spatial range”, “stay time in space-time range”, and “target city.” Bao says the spatial scope “can be as large as the entire Hubei Province or as small as one Wuhan street or neighbourhood.”
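To make the four query categories concrete, here is a minimal Python sketch of the kind of filter such a statement might express. It is not JD’s engine or its SQL dialect: the `Fix` record, the `query` function, the bounding-box form of the spatial range, and the use of a person’s latest recorded city as the “target city” are all assumptions made for illustration. Stay time is approximated as the span between a person’s first and last fix inside the space-time window.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    """One location record (hypothetical schema, not JD's)."""
    person: str
    t: int        # timestamp in seconds
    lon: float
    lat: float
    city: str     # city label attached to the fix


def query(fixes, time_range, bbox, min_stay_s, target_city):
    """Return people who stayed in bbox during time_range for at least
    min_stay_s and whose latest fix is in target_city."""
    t0, t1 = time_range
    min_lon, min_lat, max_lon, max_lat = bbox
    inside = {}      # person -> timestamps of fixes inside the window
    last_city = {}   # person -> (timestamp, city) of the latest fix seen
    for f in fixes:
        in_time = t0 <= f.t <= t1
        in_space = min_lon <= f.lon <= max_lon and min_lat <= f.lat <= max_lat
        if in_time and in_space:
            inside.setdefault(f.person, []).append(f.t)
        if f.person not in last_city or f.t > last_city[f.person][0]:
            last_city[f.person] = (f.t, f.city)
    return sorted(
        p for p, ts in inside.items()
        if max(ts) - min(ts) >= min_stay_s and last_city[p][1] == target_city
    )


# Illustrative data only: three people seen near Wuhan, then elsewhere.
fixes = [
    Fix("p1", 0, 114.3, 30.6, "Wuhan"), Fix("p1", 7200, 114.3, 30.6, "Wuhan"),
    Fix("p1", 90000, 116.4, 39.9, "Beijing"),
    Fix("p2", 0, 114.3, 30.6, "Wuhan"), Fix("p2", 600, 114.3, 30.6, "Wuhan"),
    Fix("p2", 90000, 116.4, 39.9, "Beijing"),
    Fix("p3", 0, 114.3, 30.6, "Wuhan"), Fix("p3", 7200, 114.3, 30.6, "Wuhan"),
    Fix("p3", 90000, 121.5, 31.2, "Shanghai"),
]
result = query(fixes, (0, 86400), (113.7, 29.9, 115.1, 31.4), 3600, "Beijing")
```

Shrinking the bounding box corresponds to narrowing the spatial scope from a province down to a single street; a production engine would presumably use spatial indexes rather than a linear scan.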
Locating a patient’s close contacts is not easy: a patient’s recollection of where they were, and when, before isolation is not always complete or accurate. And if a patient has gone to a grocery market to buy food or to a restaurant to eat, it becomes next to impossible to identify all possible contacts from memory alone.
In JD’s system, subject “B” is judged to be a “close contact person” if the cumulative time they spend in spatial proximity to subject “A” exceeds one hour.
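The rule can be sketched in a few lines of Python. The article gives only the one-hour threshold, so everything else here is an assumption: trajectories are taken to be sampled on a shared time grid (`step_s` seconds apart), positions are planar coordinates in metres, and “spatially close” is modelled as a hypothetical 50 m radius (`max_dist_m`).

```python
from math import hypot


def cumulative_contact_seconds(traj_a, traj_b, max_dist_m=50.0, step_s=60):
    """Total time two subjects spent within max_dist_m of each other.

    traj_a, traj_b: dicts mapping timestamp (s) -> (x_m, y_m), both
    sampled on the same step_s grid (an assumption of this sketch).
    """
    shared = traj_a.keys() & traj_b.keys()
    close_samples = sum(
        1 for t in shared
        if hypot(traj_a[t][0] - traj_b[t][0],
                 traj_a[t][1] - traj_b[t][1]) <= max_dist_m
    )
    # Each co-located sample stands for one sampling interval.
    return close_samples * step_s


def is_close_contact(traj_a, traj_b, min_seconds=3600, **kwargs):
    """True if cumulative proximity time exceeds one hour (per the article)."""
    return cumulative_contact_seconds(traj_a, traj_b, **kwargs) > min_seconds


# Illustrative data: A stays put for 90 minutes; B is nearby for 61 of them.
a = {t: (0.0, 0.0) for t in range(0, 5400, 60)}
b = {t: (30.0, 40.0) for t in range(0, 3660, 60)}
flagged = is_close_contact(a, b)
```

Because the time is cumulative, several short encounters spread over a day count the same as one long one, which is what distinguishes this rule from a single-meeting check.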
Trajectory data was processed to identify meaningful segments within long travel trajectories. This enabled the system to pinpoint neighbourhoods or streets with a higher probability of suspected cases.
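The article does not say how segments were extracted. One common way to pull meaningful segments out of a long trajectory is stay-point detection, sketched below as an illustration rather than as JD’s method: a run of consecutive points that all lie within `max_dist_m` of the first point of the run, and that spans at least `min_stay_s`, is reported as one stay. Both thresholds and the planar metre coordinates are assumptions.

```python
from math import hypot


def stay_points(points, max_dist_m=200.0, min_stay_s=1200):
    """Detect stays in a time-sorted list of (t_s, x_m, y_m) samples.

    Returns a list of (arrive_t, leave_t, centre_x, centre_y).
    """
    stays = []
    i, n = 0, len(points)
    while i < n:
        # Extend j while points remain within max_dist_m of the anchor i.
        j = i + 1
        while j < n and hypot(points[j][1] - points[i][1],
                              points[j][2] - points[i][2]) <= max_dist_m:
            j += 1
        if points[j - 1][0] - points[i][0] >= min_stay_s:
            segment = points[i:j]
            cx = sum(p[1] for p in segment) / len(segment)
            cy = sum(p[2] for p in segment) / len(segment)
            stays.append((points[i][0], points[j - 1][0], cx, cy))
            i = j          # continue after the stay
        else:
            i += 1         # anchor did not start a stay; slide forward
    return stays


# Illustrative trajectory: travel, a 30-minute stop near x=3000 m, travel.
track = (
    [(0, 0.0, 0.0), (60, 1000.0, 0.0), (120, 2000.0, 0.0)]
    + [(180 + 300 * k, 3000.0 + (k % 2) * 10.0, 0.0) for k in range(7)]
    + [(2100, 5000.0, 0.0)]
)
stops = stay_points(track)
```

Aggregating such stays by neighbourhood or street is one plausible way to rank areas by exposure, though the ranking step itself is not described in the article.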
In the first three weeks of February the JD Group helped the Beijing municipal government locate more than 500 high-risk COVID-19 cases. As of March 1, a quarter of Suqian city’s diagnosed coronavirus patients had been identified by the system. The system is currently at work assessing high-risk groups in 18 provinces and cities including Guangzhou, Nanjing and Chengdu.
Source: Synced China
Localization: Meghan Han | Editor: Michael Sarazen