Digital Text

But now we (and our texts) live in the digital world.  What does that mean for transmission?

I think texts are getting crazier… Texts on paper, even hand-copied ones, are fairly static and stable over centuries.  How we interpret them changes a lot, but the witnesses are pretty solid.

Digital texts, on the other hand, are so easy to change, copy, distribute, and redistribute.  The internet is a textual Wild West where everyone copies back and forth from each other and text is constantly transforming.  It is almost impossible to trace a text's life as it tumbles through the digital world.

The last few decades have seen a chaotic proliferation of free online books, often of highly suspect quality and consistency, almost always with poor metadata.  Projects such as the Universal Library/Million Book Project rushed to create scanned copies of, well, millions of books, most of which are of terrible quality.  Pages are missing, images are out of focus or capture a hand, and files got jumbled (you can see some of the wreckage on Internet Archive:…).  These digitized images of books are supposed to be a type of facsimile edition, attempting to exactly represent the qualities of the original book.  However, simple images of pages do not take advantage of the digital medium.

To enable search functions, OCR is used to create transcriptions of the images, essentially creating a new version of the text: a low-accuracy, machine-generated/interpreted version…  Since the OCR output is machine readable, it is used to generate all sorts of other formats such as EPUB or TXT.  Few of these receive any human editing.

However, as early as the 1970s, Project Gutenberg was creating ebook editions of public domain works specifically focused on general reading.  They are not attempting to create authoritative or critical editions, nor are they trying to exactly reproduce a specific print copy of the work.  Instead, the texts are proofread and edited by real people to generate good human-readable ebooks.  Anyone (i.e. YOU) can volunteer to become a proofreader.  Check out their Distributed Proofreaders:

This project follows on that type of tradition.

However, there are a lot of other directions you can go with digital texts.  There is growing demand for academic-quality texts to use as raw material for digital analytic techniques.  Standards such as the Text Encoding Initiative (TEI) offer possibilities for enriching text with semantic markup.
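To give a rough feel for what semantic markup buys you, here is a minimal sketch in Python.  The sample sentence is invented for illustration, and the helper names (tagged_names, plain_text) are my own, not part of any TEI tool.  It uses only the standard library, the TEI namespace, and the TEI elements persName and placeName.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; the "tei" prefix is just a local alias used in queries.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample, invented for this sketch (not from any real edition).
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  <persName>Alice</persName> walked to <placeName>Oxford</placeName> at dawn.
</p>"""


def tagged_names(xml_text):
    """Return (people, places) pulled out of the semantic tags."""
    root = ET.fromstring(xml_text)
    people = [el.text for el in root.findall(".//tei:persName", TEI_NS)]
    places = [el.text for el in root.findall(".//tei:placeName", TEI_NS)]
    return people, places


def plain_text(xml_text):
    """Strip the markup and collapse whitespace to get a reading text."""
    root = ET.fromstring(xml_text)
    return " ".join("".join(root.itertext()).split())


if __name__ == "__main__":
    print(tagged_names(SAMPLE))
    print(plain_text(SAMPLE))
```

The same file serves two audiences: a reader gets the plain sentence, while an analysis script can ask for every person or place without guessing from capitalization.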

You can also do a lot of neat stuff with online readers to create digital editions that collate and display several variant texts together, or a critical text with the variant readings as hyperlinks.  For example, check out the “Online Critical Pseudepigrapha”:
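Collation itself can be sketched in a few lines.  This is only a toy, assuming two witnesses given as plain strings and a word-by-word comparison with Python's standard difflib module; real collation tools handle far more (normalization, many witnesses, transpositions).  The function name collate and the sample readings are made up for illustration.

```python
import difflib


def collate(base, witness):
    """List the places where a witness differs from the base text.

    Returns (kind, base_reading, witness_reading) tuples, where kind is
    one of difflib's opcode tags: "replace", "delete", or "insert".
    """
    a, b = base.split(), witness.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            variants.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return variants


if __name__ == "__main__":
    # Hypothetical readings, invented for this sketch.
    for variant in collate("the quick brown fox jumps",
                           "the quick red fox leaps"):
        print(variant)
```

Each tuple is roughly one entry in a critical apparatus, and an online reader could turn those entries into the hyperlinked variant readings described above.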

A digital text can represent the extremely complicated nature of transmission, supporting multiple readings of the text (set in a historical context) rather than a single critical edition.  For example, check out the “Homer Multitext Project”:

Anyway, forget about all that–I will get back to Aladore soon…

