Unless you know precisely how Shanghai’s 364 subway stations align with the metropolis above them, navigating the subterranean tangle of China’s largest city can leave you frustrated and hopelessly lost.
On December 5, Alibaba, Ant Financial, and Shanghai Shentong Metro Group jointly launched a new voice interaction system for purchasing subway tickets. Riders can now tell a machine their destination (Zhongshan Park, for example), and the system will use the AutoNavi (高德地图) cloud map to issue a ticket for the station nearest that destination.
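The destination-to-station step can be pictured as a nearest-neighbor lookup over station coordinates. The sketch below is only an illustration of that idea, not the AutoNavi service or Alibaba's implementation: the station list, the rough coordinates, and the function names are all assumptions, and a real system would first resolve the spoken place name to coordinates through a map service.

```python
import math

# Hypothetical station table with rough, illustrative coordinates
# (latitude, longitude). These are placeholders, not surveyed values.
STATIONS = {
    "Zhongshan Park": (31.2185, 121.4170),
    "Jing'an Temple": (31.2235, 121.4460),
    "People's Square": (31.2330, 121.4750),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_station(destination, stations=STATIONS):
    """Return the name of the station closest to the destination point."""
    return min(stations, key=lambda name: haversine_km(destination, stations[name]))


# Example: a point a short walk from Zhongshan Park (assumed coordinates).
print(nearest_station((31.2190, 121.4175)))
```

Straight-line distance is the simplest possible criterion; a production system would more plausibly rank stations by walking time or route.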
Alibaba executive chairman Jack Ma was among the first to try out the new system, which is expected to be available on all of Shanghai’s ticketing machines next year as part of China’s project to build IoT infrastructure for smart cities.
Current voice dialogue systems such as smart speakers and voice assistants require “trigger words.” The iPhone’s voice assistant, for example, activates when it hears “Hey Siri.” Alibaba’s new system uses multimodal interaction. Zhijie Yan, head of the Alibaba voice team, says the goal is to eliminate trigger words altogether. “You just need to approach the machine and it will interact with you naturally.”
Says Yan, “Real life environments are most likely noisy, and that remains to be the biggest technical challenge.” Voice recognition is difficult in open, noisy environments, which is exactly what the Shanghai subway is, especially during rush hours. Alibaba’s new ticketing system uses computer vision to identify the speaker’s lip movements and measure the distance between the speaker and the machine before finalizing its voice input. These visual signals are combined with audio signals captured by a large microphone array, with a supporting software signal processor suppressing noise and interference.
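One way to picture how visual cues can stand in for a trigger word is a simple gate: an audio frame counts as the rider's speech only when the camera also sees moving lips at close range. The sketch below is a toy illustration under that assumption, not Alibaba's method; the `Frame` fields, the thresholds, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    audio_energy: float  # loudness after noise suppression (arbitrary units)
    lips_moving: bool    # visual cue: the camera detects lip movement
    distance_m: float    # estimated speaker-to-machine distance in meters


# Hypothetical thresholds, chosen only for illustration.
MAX_DISTANCE_M = 1.0
MIN_ENERGY = 0.2


def accept_frame(frame):
    """Keep a frame only if the audio and visual cues agree it is the rider."""
    return (frame.lips_moving
            and frame.distance_m <= MAX_DISTANCE_M
            and frame.audio_energy >= MIN_ENERGY)


def speech_frame_indices(frames):
    """Return the indices of frames that pass the audio-visual gate."""
    return [i for i, frame in enumerate(frames) if accept_frame(frame)]


frames = [
    Frame(0.9, False, 0.5),  # loud, but nobody at the machine is speaking
    Frame(0.6, True, 0.6),   # rider speaking at close range
    Frame(0.7, True, 3.0),   # someone talking, but too far away
    Frame(0.1, True, 0.6),   # lips moving, but almost no audio
]
print(speech_frame_indices(frames))  # [1]
```

A hard yes/no gate keeps the example short; a real system would more likely weigh the audio and visual evidence jointly rather than apply fixed cutoffs.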