Chinese Researchers Use CNNs to Classify 3000-Year-Old Oracle Bone Scripts

The earliest evidence of China’s recorded history is found in the Shang dynasty (~1600 to 1046 BC), and this has survived thanks largely to oracle bones (甲骨 – jiǎgǔ) which carry the oldest known ancient Chinese scripts. The inscriptions, typically on ox scapulae and turtle or tortoise shells due to their flat surfaces, were used mainly in divination.

To better understand the form of Chinese characters used on oracle bones from over 3,000 years ago, a group of Chinese researchers recently applied a multi-regional convolutional neural network (CNN) to classify oracle bone rubbings. Their study has been published in the journal IEEE Computer Graphics and Applications.

The paper’s first author, China’s Southwest University Associate Professor Shanxiong Chen, told Synced, “Our team, as part of the Southwest University’s Database and Artificial Intelligence Laboratory, is particularly interested in using deep learning theories and methods in the field of image processing to identify, classify and hopefully assist in repairing ancient Chinese books and documents.” The new study is a partnership between Chen’s team and Bofeng Mo from the Center for Oracle Bone Studies at Capital Normal University in Beijing.

Chen had previously cooperated with universities in China’s Guizhou province, home to almost 50 of China’s 55 ethnic minorities, using AI technologies to extract characters from texts written in the script of the ancient Yi ethnic group. They proposed a CNN framework that could effectively recognize Yi characters from even smudgy or incomplete documents and published the results last May.

Examples of Yi characters

Older than the ancient Yi books, and similar to many other traces of ancient civilizations, most surviving oracle bones are broken pieces and are preserved as part of the Yinxu Museum collection. To interpret the inscriptions, oracle experts have relied on hundreds of thousands of oracle rubbings.

In recent years, in addition to the inscription itself, oracle experts have started to pay more attention to oracle materials, forms, shapes, locations, and so on. Capital Normal University Professor Tianshu Huang, who specializes in oracle inscriptions, has even proposed the establishment of a branch of “oracle morphology” to focus on the study of oracle materials and forms.

“Differentiating the materials of oracle bones is one of the most basic steps for oracle bone morphology: we need to first make sure we don’t assemble pieces of ox bones with tortoise shells,” Chen told Synced. The process of splicing fragmentary bones, termed conjugation, is vital for the recovery of complete, identifiable oracle bone scripts.

The current classification process for oracle bones depends almost entirely on the experts’ experience, and given the number of existing oracle bone rubbings, it’s a great deal of work. While it usually takes long-term study and a considerable accumulation of professional knowledge to become an oracle expert, Chen’s team believed a CNN could be a faster learner when it comes to classification tasks.

Examples of oracle bone script from Yinxu

The classification of tortoise shells and animal bones by oracle experts is based mainly on two characteristics that are found only on tortoise shells: shield grain and tooth grain. Though seemingly straightforward, the task can be very challenging because most of the original oracle bones have been worn down over thousands of years, leaving many natural cracks that are easily confused with shield grain or tooth grain. In addition, oracle experts have found it difficult to describe shield grain and tooth grain clearly with mathematical models.

After scanning and imaging the oracle bone rubbings, the researchers used a CNN to extract features and predict classes, establishing a material classification and recognition model for the rubbings. They also used local features of the rubbings to improve classification performance.

Classification framework of oracle bone rubbings

The input data is first divided into three local regions (a shield grain region, a tooth grain region, and a region with neither shield grain nor tooth grain). Each region has its own feature extraction subnet, consisting of two Conv-Pooling-ReLU layers and two fully connected layers, that extracts the features of that region. Since the features of each local region are vectors, the researchers then applied a multi-feature fusion subnet made up of four Auto-Encoding layers to fuse these features and reduce their size, producing the fusion features used for the final output.
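To make the layer layout described above easier to picture, here is a shape-only sketch in plain Python. It is not the authors' implementation. The patch size, kernel size, channel counts and layer widths are all assumptions chosen for illustration (the article reports none of them), and fusing by concatenating the three region vectors is likewise assumed.

```python
# Shape-only sketch of a multi-regional network: three region subnets feeding
# a four-layer auto-encoding fusion subnet. All numeric sizes are assumptions.

REGIONS = ("shield_grain", "tooth_grain", "other")


def conv_pool_relu(side, kernel=3, pool=2):
    """Side length after a valid convolution and non-overlapping pooling.

    ReLU is element-wise, so it leaves the shape unchanged.
    """
    return (side - kernel + 1) // pool


def region_subnet(side=64, channels=(16, 32), fc_widths=(256, 128)):
    """Trace one subnet: two Conv-Pooling-ReLU layers, then two FC layers.

    Returns a list of (layer kind, flattened output size) pairs.
    """
    shapes = []
    for ch in channels:
        side = conv_pool_relu(side)
        shapes.append(("conv-pool-relu", ch * side * side))
    for width in fc_widths:
        shapes.append(("fc", width))
    return shapes


def fusion_subnet(widths=(256, 128, 64, 32)):
    """Four auto-encoding layers that progressively shrink the fused vector."""
    return [("auto-encoding", w) for w in widths]


def describe_network():
    """Build the per-region traces and the fusion trace."""
    per_region = {name: region_subnet() for name in REGIONS}
    # Assumed fusion input: the three region vectors concatenated end to end.
    fused_dim = sum(layers[-1][1] for layers in per_region.values())
    return per_region, fused_dim, fusion_subnet()


if __name__ == "__main__":
    per_region, fused_dim, fusion = describe_network()
    for name, layers in per_region.items():
        print(name, layers)
    print("fusion input size:", fused_dim)
    print("fusion layers:", fusion)
```

Under these assumed sizes, each region subnet ends in a 128-element vector, so the fusion subnet receives 384 values and compresses them in four steps before the final two-class output.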

The researchers used a dataset consisting of 1,476 tortoise shell rubbings and 300 ox bone rubbings, from which they chose one-third as the test set and two-thirds as the training set. Experimental results show the proposed method reaches a level close to that of oracle experts.
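The split described above can be reproduced arithmetically with a short Python sketch. The article does not say whether the split was stratified by class. The version below assumes it was, so that both materials appear in the test set in the same proportion as in the full dataset. The function name `stratified_split` and the label strings are hypothetical.

```python
import random


def stratified_split(labels, test_fraction=1 / 3, seed=0):
    """Split sample indices into train and test sets, class by class.

    Stratification is an assumption; the source only states the
    one-third / two-thirds ratio.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# Class counts taken from the article; the label names are placeholders.
labels = ["tortoise_shell"] * 1476 + ["ox_bone"] * 300
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))  # 1184 592
```

With these counts the test set holds 492 tortoise shell and 100 ox bone rubbings. The classes are imbalanced: a classifier that always answered "tortoise shell" would already be right about 83% of the time, so per-class results are more informative than overall accuracy alone.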

“As I said, classification is the first step,” Chen explained. “This study specifically focused on telling between animal bones and tortoise shells, and we’re continuously working with Capital Normal University’s Center for Oracle Bone Studies on further classifying different types of animal bones.”

Meanwhile, Chen is also building up models for oracle bone conjugation and plans on publishing the results later this year as part of his ongoing exploration of these ancient artifacts: “Achieving automatic conjugation will be one more step towards our ultimate goal to hopefully help provide complete oracle bone scripts for experts to interpret the messages.”

Journalist: Yuan Yuan | Editor: Michael Sarazen
