JERUSALEM — Israeli researchers said Thursday they are developing a computer program to make ancient documents more legible and easily indexed, which could eventually lead to a searchable catalog of archived historical texts.
The program, which is being developed by a team of computer scientists and historians at Ben-Gurion University of the Negev, would make the faded, smudged or overwritten words in ancient texts easier to read.
The program can also be used to determine which documents are original through a process called writer identification, said Jihad El-Sana, a researcher on the project and assistant professor of computer science.
"We are developing a kind of technology to enhance documents' visual (properties) for two reasons — to make them easier to read and because we want to archive and index them," El-Sana said Thursday.
As more documents are digitized, a program like the one the team is developing would cut the time needed to study ancient texts, said Daphna Weinshall, a computer science professor at the Hebrew University in Jerusalem.
"If they can just digitize these documents, machines are much more efficient than humans. Once it's on the computer, they can do it a lot easier," Weinshall said.
The technology could also be used to piece together fragments of texts housed in different locations throughout the world.
El-Sana said the team hopes to create a system that lets historians search the images of a document for the words they seek, reducing the time they spend poring over documents.
"We're trying to ... utilize the talents of the human being better and save human hours," El-Sana said.
The researchers have worked mainly with ancient Hebrew and Arabic texts.
An open-source program could be available to researchers for download in three years, but the algorithms would need more "refinement" before they are ready for the general public, El-Sana added.
© 2013 The Associated Press. All rights reserved.