Lawyee is a Peking University spinoff founded in 2003 that was one of the first companies specializing in building searchable legal databases and digitizing IT systems for Chinese courts and law firms.
When Lawyee started, the Internet’s legal landscape was barren and computer technologies were underdeveloped. Things accelerated in 2014, however, when China’s Supreme People’s Court made it mandatory to publish court decisions in an official online archive within seven days of their entry into force, excluding politically sensitive, private, or juvenile crime cases.
“Lawyee became the contractor for building this database,” says company General Vice Manager Chen Hao, “and we accumulated several million documents by the end of 2014.” This number skyrocketed to 40 million by mid-2018, and page views exceeded 14 billion. Lawyee also runs a case law database that Chinese law schools can access for an annual subscription fee of a few thousand US dollars.
Asked about recent advances in AI, Chen explains, “AI evolved out of decades of statistics research. Legal researchers have also used statistics for a long time, and there are many machine learning algorithms in SPSS Statistics. Since 2006 deep learning has upgraded machine perception.”
“However, there aren’t many mature commercial applications, at least in areas of law. For both common and continental legal systems, technology so far has only tackled vertical problems, and that is not enough to reinvent the legal industry. Now there are startups building applications on top of IBM Watson that may prove successful.”
Building on strong database resources, AI can generate legal documents and perform compliance review of contracts, quality control of documents, and analysis of legal risks, business guidelines, and so on.
One of Lawyee’s current AI projects is identifying the core issues in a legal document. Lawyee’s AI, trained on more than 30 million data samples, can currently identify core arguments in about 70 out of 100 documents and is exactly on target in 60 of them.
Chen says the hard part is manually labelling large quantities of data. “When data samples rise from 100 to 10k, people can no longer handle the repetitive work, not to mention when data size goes up to 300k plus.”
Another challenge is finding benchmarks for algorithm performance, which are crucial for measuring accuracy. “There are no benchmark datasets such as ImageNet for images or SQuAD for NLP in the legal space. Labelling them is very expensive, thus impossible for normal companies. For accuracy, we claim that certain software can go up to 70-80 percent. Customers are willing to pay more if other options on the market don’t work as well,” says Chen.
The Chinese government directive mandating legal document digitalization has given rise to a number of Beijing legaltech startups, most notably AI-powered legal consulting companies Lvpin Technology and Itslaw, along with natural language processing solution providers Fa’gou’gou and, most recently, deepcurious.ai.
Journalist: Meghan Han | Editor: Michael Sarazen