Ancient, damaged Roman scrolls have been deciphered using AI
“I can’t believe it worked!” says Nat Friedman, co-founder of the Vesuvius Challenge, which offered $1m in prizes to anyone who could use artificial intelligence (AI) to decipher papyrus scrolls that were carbonized when Mount Vesuvius erupted in 79 A.D. But work it did. On February 5th Mr. Friedman announced that a team of three had been awarded $700,000 for successfully producing four passages of text, each at least 140 characters long, and with at least 85% of the characters legible, from a scroll known as Banana Boy. The three winners, Luke Farritor, Youssef Nader and Julian Schilliger, are all computer science students.
The scroll is one of hundreds found in the library of a villa in the Roman town of Herculaneum, which is thought to have belonged to Julius Caesar’s father-in-law. Along with the other scrolls in the villa’s library, it was carbonized by scorching gases that engulfed the town in the same eruption that buried nearby Pompeii.
It is difficult to read text from the scrolls because the heat turned them into brittle charcoal logs; all attempts to unroll them physically resulted in their destruction. So attention shifted to finding ways to unroll them virtually, through computer analysis of 3D scans of the scrolls made using X-rays. Deciphering the scrolls thus turned into a software problem, but a very complicated one.
Virtual unrolling is a two-stage process pioneered by W. Brent Seales, a computer scientist at the University of Kentucky. The first stage, called segmentation, involves tracing the edges of the rolled-up papyrus sheet within the 3D scan, then extracting 2D images of the sheet’s surface. The second stage, ink detection, analyzes the resulting images to distinguish the ink of the text from the papyrus background. This is particularly difficult for the Herculaneum scrolls, which are written in carbon-based ink, so there is little contrast with the carbonized papyrus background.
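To make the first stage concrete, here is a minimal Python sketch of how a flat image might be sampled from a scan once the sheet has been traced. It is an illustration under simplifying assumptions, not the method used by Dr. Seales or the contestants: the tiny `volume`, the `spiral` path and the function `flatten_surface` are all hypothetical, and real tools work on enormous scans and interpolate between voxels.

```python
# Toy illustration only: the volume, the traced curve and all names are hypothetical.

def flatten_surface(volume, traced_curves):
    """Sample a 3D scan along a traced sheet to get a flat 2D image.

    volume: nested lists indexed as volume[z][y][x] (voxel intensities).
    traced_curves: for each slice z, a list of (x, y) points that follow the
        rolled-up sheet in that slice (the output of segmentation).
    Returns a 2D list: one row per slice, one column per traced point.
    """
    image = []
    for z, curve in enumerate(traced_curves):
        row = []
        for x, y in curve:
            # Nearest-neighbour lookup; real tools interpolate between voxels.
            row.append(volume[z][round(y)][round(x)])
        image.append(row)
    return image


# A two-slice, 3x3 toy volume in which the "sheet" curls around the centre voxel.
volume = [
    [[0, 1, 2],
     [7, 0, 3],
     [6, 5, 4]],
    [[0, 10, 20],
     [70, 0, 30],
     [60, 50, 40]],
]
# Clockwise path around the centre, identical in both slices.
spiral = [(1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

flat = flatten_surface(volume, [spiral, spiral])
print(flat)  # [[1, 2, 3, 4, 5, 6, 7], [10, 20, 30, 40, 50, 60, 70]]
```

Each row of `flat` is the curled sheet from one slice laid out straight. The second stage would then look for ink in that flattened image.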
Dr. Seales, along with Mr. Friedman and Daniel Gross, two technology entrepreneurs, thought AI approaches could be successfully applied to both of these problems, and launched a prize challenge to find out. Since then a community of thousands of enthusiasts has developed a range of tools and tricks to speed up the fiddly process of segmentation, and to detect the ink of individual letters, and then whole words. In October 2023 Mr. Farritor and Mr. Nader received smaller prizes for extracting the first legible word (“porphyras”, which means “purple” in ancient Greek) from the Banana Boy scroll (so named because of its size and shape).
The two students then teamed up and, together with Mr. Schilliger, further refined the machine-learning technique used for ink detection. By manually labeling areas as ink, they were able to train a neural network to find more of them; these were fed back into the model to improve its detection capabilities. Mr. Nader also switched the neural network to a new architecture called TimeSformer, which produced sharper results. At the same time, Mr. Schilliger devised a tool to automate more of the segmentation process (much of which still has to be done by hand).
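The feedback loop described above, in which a model's own confident findings become new training labels, can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the winners' code: a one-number threshold replaces the neural network, and the brightness values, the `margin` parameter and the function names are all assumptions made for illustration.

```python
# Toy illustration only: a one-feature threshold model stands in for the
# neural network, and every name and number here is hypothetical.

def fit_threshold(labelled):
    """Place the decision threshold midway between the two class means."""
    ink = [v for v, is_ink in labelled if is_ink]
    background = [v for v, is_ink in labelled if not is_ink]
    return (sum(ink) / len(ink) + sum(background) / len(background)) / 2


def self_train(labelled, unlabelled, margin=0.5, rounds=5):
    """Repeatedly train, label the confident samples, and retrain."""
    labelled = list(labelled)
    remaining = list(unlabelled)
    for _ in range(rounds):
        threshold = fit_threshold(labelled)
        # Only samples far from the threshold count as confident.
        confident = [v for v in remaining if abs(v - threshold) >= margin]
        if not confident:
            break
        # Feed the model's own confident predictions back in as labels.
        labelled += [(v, v > threshold) for v in confident]
        remaining = [v for v in remaining if abs(v - threshold) < margin]
    return fit_threshold(labelled), labelled, remaining


hand_labelled = [(1.0, False), (3.0, True)]  # (patch brightness, is it ink?)
unlabelled = [0.2, 0.4, 3.2, 3.4, 2.4]

threshold, labelled, remaining = self_train(hand_labelled, unlabelled)
print(len(labelled), remaining)
```

In this toy run the sample at 2.4 is too close to call in the first round, but becomes a confident "ink" label once the threshold has shifted after retraining, which is the point of feeding predictions back in.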
The deadline for submitting entries for the grand prize was the end of December, and the prize was awarded to the trio after the entries had been evaluated by a team of papyrologists. (Three runners-up will each receive smaller prizes of $50,000.) The winning entry revealed 15 columns of text, written in Greek. Reading it was “mind-blowing”, says Federica Nicolardi, a papyrologist at the University of Naples Federico II, who was one of the judges. The text is believed to be a previously unknown work on pleasure by Philodemus, an Epicurean philosopher who lived in Herculaneum.
Mr. Friedman now wants to scale up the whole process. With ink detection solved, he says, “the bottleneck is now segmentation”. Mr. Schilliger’s auto-segmentation tool is a big step forward, and he has agreed to make it open source, and to collaborate with others to improve it. Further prizes are being offered as incentives. At the same time, Mr. Friedman aims to scan more scrolls using the Diamond Light Source, a particle accelerator in Britain, and to standardize the scanning process.
That costs money. After giving away $1.2m in prizes, some of it out of his own pocket, Mr. Friedman is looking for other sponsors to support the project. He hopes that deciphering ancient scrolls will lead to the rediscovery of lost works from antiquity (“each scroll is a mystery box,” he says) and, eventually, to revived interest in excavating the site at Herculaneum, which could contain thousands more of them. ■