Harvard-ZJU ancient book project makes great progress

2023-04-06

Trends

By Zhu Jingning

Few would deny that China has the richest collections of ancient books and documents in the world. Preliminary estimates put the number of surviving ancient books in the country at more than 600,000, of which about 250,000 have been digitized as photographic images.

Digitization here means scanning the ancient books and putting them online so that readers can view them on web pages. Regrettably, readers cannot search the contents of the books, much less edit or hyperlink them.

Xu Yongming, a professor at the School of Humanities of Zhejiang University, believes that ancient books can become convenient to use only when the scanned pages are converted into text files.

In recent years, he has led the research and development of the “Zhejiang Literature Network”, “Academic Mapping Platform” and “China Smart Ancient Books”. Thanks to his team's efforts, large numbers of ancient books that once sat on shelves now have a more permanent digital home.

To date, the “Academic Mapping Platform”, a project co-founded by Harvard University and Zhejiang University in 2018, hosts travel route maps for more than 700 historical and literary figures and about 1,200 geographic distribution maps, which have been accessed by readers from more than 70 countries. “China Smart Ancient Books” has uploaded indexed ancient books along with the genealogies and social relationships of hundreds of people, attracting readers from more than 30 countries since its debut in 2021.

"The intellectualization of ancient books is still in its early stage and has a long way to go. The process of constructing a platform and database requires a significant amount of funds from multiple parties," Xu said at the conference East Asian Digital Humanities – The Tools of the Trade, held at Harvard University last month.

At present, about 600 teachers and students have worked on the platform's online collections of ancient books, and all are paid according to the difficulty of their tasks. Although the accuracy of optical character recognition (OCR) now reaches 90% or higher, correcting the remaining errors still takes considerable human effort.

The crowdsourcing system, the purchase of ancient books, the protection of data security, and the maintenance and updating of the platform all involve substantial expense, whereas research funding for the liberal arts is relatively modest.

Xu has been looking for entrepreneurs with a humanistic outlook to partner with. "Despite the challenges, the intellectualization of ancient books has broad prospects. With the development of technology, it is possible to preserve these ancient texts and make them more accessible to a broader audience. This will not only benefit scholars and researchers but also allow the general public to gain a deeper appreciation of China’s rich cultural heritage."