DigitalAladore 1.0 is a valid EPUB2. To recap: EPUB was chosen for the ebook because it is a free and open format built on open web standards (in contrast to proprietary formats such as Kindle AZW). And we love Free because of the many practical benefits of open source development plus the moral ideals of respecting the user’s freedom.
The EPUB2 standard was first released in 2007, but has since been superseded by EPUB3 released in October 2011. EPUB3 was designed to take advantage of new elements introduced in HTML5 and allow more interactive functionality (script). However, support of the full specification continues to be very poor. The only readers with full support seem to be commercial apps that deliver interactive books in a closed ecosystem. For example, AZARDI offers a cost-free reading app that has good support of advanced features of EPUB3, but it is focused on secure “content fulfillment” of interactive textbook subscriptions. To publish to the platform, authors must use their proprietary ebook creation application. Kobo and Apple have developed tweaked versions of EPUB3 that do not fully comply with the standard and focus on the possibilities for improved DRM, rather than functionality not found in EPUB2.
However, for simple functionality (i.e. a linear novel) EPUB3 is supported by most reading devices. I decided to update the Aladore EPUB2 to an EPUB3 version for future compatibility, higher specs, and improved semantic inflection. Guidelines now suggest adding larger images and cover images than I used in the EPUB2 to ensure they don’t look terrible on HD tablets. So while DigitalAladore 1.0 was optimized for older e-ink ereaders, the EPUB3 version will be optimized for larger, more powerful devices.
However, Sigil does not currently support the creation of ebooks following the EPUB3 spec. If you make changes to the markup following EPUB3, Sigil will actually correct them back to EPUB2 when saving the file. So, to create the Aladore EPUB3 we have to do a few extra steps:
- Replace all the image files with larger versions using Sigil.
- Use the Sigil plugin ePub3-itizer to export a pseudo EPUB3. Sigil developers intend to implement full EPUB3 creation and editing support soon, so this plugin is considered a “stop-gap measure.” It changes the HTML headers, restructures a few files, and adds the nav.xhtml.
- Unzip the ePub3-itizer output to edit the contents. Because Sigil limits the markup to XHTML valid under the EPUB2 spec, it is not possible to add HTML5 tags such as section or EPUB3 attributes such as epub:type inside Sigil (thus, the export is what I call a pseudo EPUB3). I used the IDPF Accessibility Guidelines (The epub:type attribute) plus the attribute vocabulary in the EPUB 3 Structural Semantics Vocabulary to add some semantic structure to the text. This markup can be used for styling the document with CSS, but is also useful for machine processing and accessibility options. You can mark up sections of the ebook (frontmatter, body, backmatter), divisions within (abstract, chapters), types of content (footnote), or individual elements (title). In the EPUB2 I had added div tags with class attributes, which I converted to section tags: for example, each <div class="chapter"> became <section epub:type="chapter">. I used these epub:type values: cover, titlepage, chapter, epigraph, toc, and loi. Since I made each chapter a single XHTML file, another option would be to add the epub:type attribute to the body element. However, those attributes would be lost if the HTML files were merged, so I prefer the section tags.
- Delete the toc.ncx file. This file was used by older reading devices to provide navigation functionality, but it is not part of the EPUB3 spec as it is replaced by nav.xhtml. However, many people seem to be leaving this file in the EPUB for legacy support. If you leave it, everything should work fine, but the file will NOT fully validate.
- Re-Zip the new EPUB3. EPUBs need to be zipped in the correct order or they will not function. This means you must create the zip archive first (in Windows right click somewhere and choose New > Compressed (zip) Folder), then add the mimetype file (drag it into the new zip folder). Then all the rest of the content can be added. Finally, change the extension from .zip to .epub.
- Validate with the IDPF EPUB Validator.
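For anyone who would rather script the re-zip step than drag files around by hand, here is a minimal Python sketch. It assumes the edited EPUB has been unzipped into a single folder; the function name build_epub and the folder and file names in the usage note are made up for illustration. Besides adding the mimetype file first, it also stores that file uncompressed, which the EPUB container format requires.

```python
import zipfile
from pathlib import Path


def build_epub(source_dir: str, epub_path: str) -> None:
    """Zip an unpacked EPUB folder so that `mimetype` is the first entry."""
    root = Path(source_dir)
    mimetype = root / "mimetype"
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype file goes in first and is stored without compression.
        zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        # Everything else follows, compressed, with paths relative to the root.
        for path in sorted(root.rglob("*")):
            if path.is_file() and path != mimetype:
                zf.write(
                    path,
                    path.relative_to(root).as_posix(),
                    compress_type=zipfile.ZIP_DEFLATED,
                )
```

A call such as build_epub("aladore_epub3", "DigitalAladore.epub") would write the archive directly with the .epub extension. Keep the output file outside the source folder so it is not swept into its own archive, and still run the result through the validator.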
The sketchyTech blog talks about the differences created by this process in more detail if you want to hear from someone else…
But basically, that’s it! Not too complicated, although it requires some thought about 1) the quality of images to include, 2) changes to styling with larger screens in mind, and 3) consideration of semantic inflection to provide better accessibility and machine readability. I will post the new Aladore EPUB3 soon!
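The semantic markup step can also be scripted instead of edited by hand. The following is only a sketch using Python's standard xml.etree.ElementTree, not the method used for Aladore: the function name divs_to_sections is hypothetical, and it assumes each chapter file is well-formed XHTML without named entities such as &nbsp; (which this parser rejects). It also drops any XML declaration and DOCTYPE on output, so those would need to be added back.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"

# Serialize XHTML as the default namespace and use the usual "epub" prefix.
ET.register_namespace("", XHTML_NS)
ET.register_namespace("epub", EPUB_NS)


def divs_to_sections(xhtml: str, css_class: str = "chapter") -> str:
    """Turn <div class="chapter"> into <section epub:type="chapter">."""
    root = ET.fromstring(xhtml)
    # Collect matches first so renaming tags cannot disturb the traversal.
    for div in list(root.iter(f"{{{XHTML_NS}}}div")):
        if div.get("class") == css_class:
            div.tag = f"{{{XHTML_NS}}}section"
            del div.attrib["class"]
            div.set(f"{{{EPUB_NS}}}type", css_class)
    return ET.tostring(root, encoding="unicode")
```

Passing a different css_class (for example "epigraph") would handle the other divisions the same way, provided the class name matches the epub:type value you want. If your stylesheet selects on the old class names, the CSS selectors need updating too.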
AUTHOR OF ‘THE TWYMANS,’ ‘THE NEW JUNE,’ ETC.
ILLUSTRATED IN COLLOTYPE FROM THE DRAWINGS BY
THE LADY HYLTON
William Blackwood and Sons
Edinburgh and London
If you have been following along, all that prototyping, testing, and tweaking eventually brings us to a NEW Aladore EPUB! I am calling it DigitalAladore 1.0, because there might be some more versions to come (for example, conversion to the EPUB3 standard)…
This is an EPUB2 file which should render well on dedicated e-ink readers for a high quality reading experience. The text is much better than the auto-generated editions I encountered at the beginning of this project (here is one of the source editions on Internet Archive with a crummy PDF and EPUB available). We have done a lot of work to go beyond the first Digital Aladore draft edition. The images are nicer, the underlying markup is sensible, the metadata is complete, and the EPUB package is put together correctly. And we did it all with Free software.
This is a major milestone for Digital Aladore, but I still have more to say (of course). For example, I uploaded the new EPUB to Internet Archive, which I think is an amazing resource: we need to talk more about free distribution and the public domain. Let’s save it for another day! For now:
DigitalAladore 1.0: Aladore, by Henry Newbolt (1914), https://archive.org/details/AladoreHenryNewbolt
Internet Archive has recently rolled out a major redesign of their website. I don’t love everything about the design (it’s much less information dense, so it requires more navigation and is less easy to browse), but one thing is AWESOME: they now provide direct download of the page image files!
Remember that workaround I came up with to harvest the page images out of their online reader? You don’t have to do that any more. Just click on the download button, and most items will offer raw and edited scan images (JP2) in addition to the usual EPUB, PDF, and other access derivatives. They actually expose all files related to an item if you click “See All Files”, including the metadata in a bunch of formats.
Check out Aladore 1915 for an example: https://archive.org/details/aladorehen00newbrich
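If you prefer a script to clicking through the download menu, Internet Archive also exposes an item's file list as JSON at archive.org/metadata/<identifier>, and individual files under archive.org/download/<identifier>/<filename>. The sketch below builds download links for the page image archives. The function names are hypothetical, and the "_jp2.zip" suffix is an assumption about how the scan archives are named, so check the item's file list before relying on it.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def fetch_metadata(identifier: str) -> dict:
    """Fetch the JSON metadata record for an Internet Archive item."""
    with urlopen(f"https://archive.org/metadata/{identifier}") as resp:
        return json.load(resp)


def download_urls(identifier: str, metadata: dict, suffix: str = "_jp2.zip") -> list:
    """Build download links for files whose names end with `suffix`."""
    return [
        f"https://archive.org/download/{identifier}/{quote(f['name'])}"
        for f in metadata.get("files", [])
        # Assumption: page image archives are named with a "_jp2.zip" suffix.
        if f["name"].endswith(suffix)
    ]
```

For the 1915 Aladore item, that would be download_urls("aladorehen00newbrich", fetch_metadata("aladorehen00newbrich")). Changing the suffix to ".epub" or ".pdf" would list the access derivatives instead.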
Wow! I am seriously impressed and excited! I mentioned the issue of only providing limited access versions in previous posts, so it’s great to see the huge collection at Internet Archive take this massive leap forward in enabling users and re-use.
p.s. on a related note, just as I cropped out the illustrations from Aladore and provide them in nicely edited versions here for re-use, Internet Archive initiated a project about a year ago to mine the wealth of images out of their digitized books. They started posting images on their Flickr website in July 2014, https://www.flickr.com/photos/internetarchivebookimages
Today they have 2,878,891 images posted! That’s a lot to browse through… they haven’t gotten to Aladore yet, so you will still have to visit here!
Since I have been holding the Digital Aladore world in suspense for too long, I decided to release a draft version of the EPUB. I uploaded it to the Internet Archive Community Texts collection for easy distribution: https://archive.org/details/AladoreNewbolt
This version of the EPUB is minimally formatted. The cover is pretty ugly. And there are no stylesheets, so it won’t look very fancy. But it has the most up-to-date edited text, all the images, and it works! So enjoy!
P.S. I also uploaded the plain text version to the Internet Archive page.
Following on the Public Domain Day post from last week, I thought I should point out the Exciting and Interesting comic book discussion of fair use and copyright from the Duke Center for the Study of the Public Domain:
“Bound by Law?” http://web.law.duke.edu/cspd/comics/digital.php
The comic book is a fun way to communicate about complex and poorly understood issues, but I think it also demonstrates another aspect of openness–open distribution. The comic is licensed CC-by-nc-sa and you can get it in many different forms, recognizing many different potential users and uses.
If you want a nice physical copy, there are print versions available for purchase from multiple sources. If you want to read online, there is the free HTML website or a Flash version (with more functionality than HTML, but it requires Flash Player and more resources). Furthermore, there are many different versions available for download, depending on your needs and resources: two sizes of PDF, high res cover images for web or printing, or a zip of the individual page images for easy reuse. There are links to versions in several languages, and even an edition without the text to facilitate users making their own translations. That is OPEN!
More than just providing material free of cost, more than just publishing content, they are interacting with users, encouraging further creativity and thought.
If you want some more comics (relating to copyright!), check out “Mimi and Eunice’s Intellectual Pooperty” from Nina Paley, http://questioncopyright.com/mimi-book-ip.html
The page offers print booklets for purchase or a free CBR format version to download, and links to the free master files on Archive.org. More comics are available at the main Mimi and Eunice site. Nina Paley describes her approach to intellectual property as Copyheart: ♡2010 by Author/Artist. Copying is an act of love. Please copy.
For a more academic discussion of issues surrounding the public domain, check out James Boyle, The Public Domain: Enclosing the Commons of the Mind, http://www.thepublicdomain.org
Boyle’s website offers free download of several books and educational resources.
In addition to marking the New Year, January 1 is also Public Domain Day, the day when a fresh crop of works officially enters the public domain as their copyright expires! The Duke Law Center for the Study of the Public Domain puts it this way:
The end of the copyright term on these works means that they enter the public domain, completing the copyright bargain. Copyright gives creators – authors, musicians, filmmakers, photographers – exclusive rights over their works for a limited time. This encourages creators to create and publishers to distribute – that’s a very good thing. But when the copyright ends, the work enters the public domain – to join the plays of Shakespeare, the music of Mozart, the books of Dickens – the material of our collective culture. That’s a good thing too! (Public Domain Day 2015)
It is an exciting day at Digital Aladore where we rely exclusively on public domain content (and Free software). But it’s not a very happy holiday in the United States right now… Because exactly ZERO works entered the public domain thanks to the insane copyright extensions of 1998. In fact the extension ensures no works will enter the public domain until 2019! When copyright was first enacted in America the term was 14 years, but it has been gradually extended ever since. In 1998 copyright was extended to the life of the author plus 70 years or 95 years in the case of corporate authorship. But the reason we have no new works this year is that the law also retroactively applied a 95-year term to ALL works copyrighted between 1923 and 1977… Seriously, that’s insanely long. It has NOTHING to do with benefiting and encouraging authors and artists. It is only about benefiting giant corporations and hoarders of capital.
For more information about this sad holiday in America, check out the great site from the Duke Law Center for the Study of the Public Domain, Public Domain Day 2015. There are extensive articles explaining the legal and cultural situation, and teasers about all the works that SHOULD have entered the public domain this year…
In many other places the holiday is more cheerful! For example, in Canada copyright term is “life plus 50 years” meaning up north we will get to enjoy the full Class of 2015 put together by The Public Domain Review. Most of the European Union is “life plus 70”, but isn’t hobbled by the 95 year extension applied in the United States, so they will still get some holiday treats.
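The dates above follow from simple arithmetic, sketched here in Python. This is a simplification under stated assumptions: copyright runs through the end of the calendar year in which the term expires, so a work becomes free on the following January 1, and details such as notice and renewal requirements are ignored. The function names are made up for illustration.

```python
def us_published_work_public_domain_year(publication_year: int, term_years: int = 95) -> int:
    """Year whose January 1 frees a US work published between 1923 and 1977."""
    # The term covers `term_years` full calendar years after publication.
    return publication_year + term_years + 1


def life_plus_public_domain_year(death_year: int, term_years: int) -> int:
    """Year whose January 1 frees a work under a "life plus N years" rule."""
    return death_year + term_years + 1


print(us_published_work_public_domain_year(1923))  # 2019
print(life_plus_public_domain_year(1964, 50))      # 2015 (life plus 50)
print(life_plus_public_domain_year(1944, 70))      # 2015 (life plus 70)
```

So the oldest works caught by the 95-year term, those from 1923, stay locked up until January 1, 2019, while "life plus 50" and "life plus 70" countries welcome authors who died in 1964 and 1944 respectively this year.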
Recent research and economic modeling suggest that current copyright terms are too long and do NOT provide incentive for creation. Instead our shared culture is being locked away by corporate profiteers. In fact, the majority of works still protected by copyright are orphans: out of print with no likelihood of ever being used again commercially. Projects like Digital Aladore, Free software, and honestly the majority of the internet point out that creators aren’t purely profit driven. It’s time to reform copyright to benefit the creators rather than hoarders of capital (who already have plenty of power and wealth!).
Happy Public Domain Day and best wishes for the New Year!