
Onward to EPUB3?

DigitalAladore 1.0 is a valid EPUB2. To recap: EPUB was chosen for the ebook because it is a free and open format built on open web standards (in contrast to proprietary formats such as Kindle AZW). And we love Free because of the many practical benefits of open source development plus the moral ideals of respecting the user’s freedom.

The EPUB2 standard was first released in 2007, but has since been superseded by EPUB3, released in October 2011. EPUB3 was designed to take advantage of new elements introduced in HTML5 and to allow more interactive functionality (scripting). However, support for the full specification remains very poor. The only readers with full support seem to be commercial apps that deliver interactive books in a closed ecosystem. For example, AZARDI offers a cost-free reading app with good support for the advanced features of EPUB3, but it is focused on secure “content fulfillment” of interactive textbook subscriptions. To publish to the platform, authors must use the company’s proprietary ebook creation application. Kobo and Apple have developed tweaked versions of EPUB3 that do not fully comply with the standard and that focus on possibilities for improved DRM rather than on functionality not found in EPUB2.

However, for simple functionality (e.g. a linear novel), EPUB3 is supported by most reading devices. I decided to update the Aladore EPUB2 to an EPUB3 version for future compatibility, higher specs, and improved semantic inflection. Guidelines now suggest using larger images and cover images than I used in the EPUB2, so that they don’t look terrible on HD tablets. So while DigitalAladore 1.0 was optimized for older e-ink ereaders, the EPUB3 version will be optimized for larger, more powerful devices.

However, Sigil does not currently support the creation of ebooks following the EPUB3 spec. If you make changes to the markup following EPUB3, Sigil will actually correct them back to EPUB2 when saving the file.  So, to create the Aladore EPUB3 we have to do a few extra steps:

  • Replace all the image files with larger versions using Sigil.
  • Use the Sigil plugin ePub3-itizer to export a pseudo EPUB3. Sigil developers intend to implement full EPUB3 creation and editing support soon, so this plugin is considered a “stop-gap measure.” It changes the HTML headers, restructures a few files, and adds the nav.xhtml.
  • Unzip the ePub3-itizer output to edit the contents. Because Sigil limits the markup to XHTML that is valid under the EPUB2 spec, it is not possible to add HTML5 tags such as section or EPUB3 attributes such as epub:type from within Sigil (thus, it is what I call a pseudo EPUB3). I used the IDPF Accessibility Guidelines (The epub:type attribute) plus the attribute vocabulary, EPUB 3 Structural Semantics Vocabulary, to add some semantic structure to the text. This markup can be used for styling the document with CSS, but is also useful for machine processing and accessibility options. You can mark up sections of the ebook (frontmatter, body, backmatter), divisions within (abstract, chapters), types of content (footnote), or individual elements (title). In the EPUB2 I had added div tags with class attributes, which I now converted to section tags; for example, each <div class="chapter"> became <section epub:type="chapter">. I used these epub:type values: cover, titlepage, chapter, epigraph, toc, and loi. Since I made each chapter a single XHTML file, another option would be to add the epub:type attribute to the body element. However, those attributes would be lost if the HTML files were ever merged, so I prefer the section tags.
  • Delete the toc.ncx file.  This file was used by older reading devices to provide navigation functionality, but it is not part of the EPUB3 spec as it is replaced by nav.xhtml. However, many people seem to be leaving this file in the EPUB for legacy support. If you leave it, everything should work fine, but the file will NOT fully validate.
  • Re-zip the new EPUB3. EPUBs need to be zipped in the correct order or they will not function: the mimetype file must be the first item in the archive. This means you must create the zip archive first (in Windows, right-click somewhere and choose New > Compressed (zip) Folder), then add the mimetype file (drag it into the new zip folder). Then all the rest of the content can be added. Finally, change the extension from .zip to .epub.
  • Validate with the IDPF EPUB Validator.
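
Two of the manual steps above, converting the chapter divs and re-zipping in the right order, can also be scripted. What follows is only a minimal Python sketch, not the process I actually used: the function names mark_chapters and pack_epub are made up for illustration, and it assumes the XHTML files are well-formed XML without HTML-only named entities such as &nbsp; (the standard-library parser will reject those). The EPUB container format also expects the mimetype entry to be stored uncompressed, so the sketch does that explicitly.

```python
import os
import zipfile
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
OPS_NS = "http://www.idpf.org/2007/ops"

# Keep the serialized output free of auto-generated ns0:/ns1: prefixes.
ET.register_namespace("", XHTML_NS)
ET.register_namespace("epub", OPS_NS)


def mark_chapters(xhtml):
    """Turn every <div class="chapter"> into <section epub:type="chapter">.

    Hypothetical helper. Accepts the content of one XHTML file (str or bytes)
    and returns the converted markup as a string. The XML declaration and
    DOCTYPE are not preserved by ElementTree, so re-add them if you need them.
    """
    root = ET.fromstring(xhtml)
    for elem in root.iter(f"{{{XHTML_NS}}}div"):
        if elem.get("class") == "chapter":
            elem.tag = f"{{{XHTML_NS}}}section"
            del elem.attrib["class"]
            elem.set(f"{{{OPS_NS}}}type", "chapter")
    return ET.tostring(root, encoding="unicode")


def pack_epub(source_dir, epub_path):
    """Zip an unpacked EPUB folder with the mimetype entry first.

    Hypothetical helper. epub_path should live outside source_dir,
    otherwise the archive would try to include itself.
    """
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry goes in first, stored without compression.
        zf.write(os.path.join(source_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        # Everything else follows, compressed as usual.
        for folder, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full = os.path.join(folder, name)
                arcname = os.path.relpath(full, source_dir).replace(os.sep, "/")
                if arcname != "mimetype":
                    zf.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

With the ePub3-itizer output unzipped into a folder (say, a hypothetical Aladore_epub3), calling pack_epub("Aladore_epub3", "Aladore.epub") would write the finished file directly with the .epub extension, so no renaming is needed. The result still needs to go through the validator.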

The sketchyTech blog talks about the differences created by this process in more detail if you want to hear from someone else…

But basically, that’s it!  Not too complicated, although it requires some thought about 1) the quality of images to include, 2) changes to styling with larger screens in mind, and 3) consideration of semantic inflection to provide better accessibility and machine readability. I will post the new Aladore EPUB3 soon!


After completing the tweaks outlined in the last few posts, I opened the Aladore epub with Calibre’s built-in editor for a final look. As mentioned in previous posts, the editor is comparable to Sigil, although not necessarily designed for creating ebooks from scratch. However, because it is built into Calibre’s ebook library management platform, it is great for making tweaks on the fly for testing on your reading devices. Also, development on the project currently seems more active than Sigil’s.

To get an overview of the contents of the epub, I open Reports from the Tools menu. This analyzes the package, listing all the files, words, images, styles, characters, and links. It is a nice way to quickly look for any issues that might still be lurking. I scan through the words to see if any weirdness stands out, then check the characters to ensure there is nothing strange. You will learn interesting factoids, such as that “and” is the most used word at 4001 times, or that there are 66,910 spaces in the ebook.
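
If you are curious how such counts come about, they can be roughly approximated outside Calibre. This is a minimal Python sketch under loud assumptions: epub_text_stats is a made-up name, the text files are assumed to be UTF-8, and tags are stripped with a crude regular expression. It is not how Calibre computes its report, so the numbers will not match exactly (replacing tags with spaces inflates the space count, for one).

```python
import re
import zipfile
from collections import Counter
from html import unescape


def epub_text_stats(epub_path):
    """Count words and characters across the (X)HTML files of an EPUB.

    Hypothetical helper: a rough approximation, not Calibre's algorithm.
    Returns two Counters: (words, characters).
    """
    words, chars = Counter(), Counter()
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            markup = zf.read(name).decode("utf-8")
            # Crude tag stripping: every tag becomes a single space.
            text = unescape(re.sub(r"<[^>]+>", " ", markup))
            chars.update(text)
            # A word is a run of letters, optionally joined by apostrophes.
            words.update(re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower()))
    return words, chars
```

Calling words.most_common(10) on the result lists the ten most frequent words, and sorting the character counter makes any stray or unexpected character easy to spot.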

It is worth noting that Calibre slightly modifies the metadata when ebooks are added to the library. If you are anal about your newly perfected markup, you might want to re-edit it. One powerful feature of the editor is “Compare to another book” under the File menu. It creates a nice visualization highlighting the differences between versions of the ebook (compare with Juxta used earlier in Digital Aladore). Here it is showing the differences introduced by the automatic Calibre metadata edits:
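
A plain-text version of the same kind of comparison can be roughed out with Python’s standard difflib module. This is only a sketch of the idea, not what Calibre does internally: diff_epubs and the TEXT_EXTENSIONS list are assumptions for illustration, and text files are assumed to be UTF-8.

```python
import difflib
import zipfile

# Assumption: these extensions cover the text files worth diffing line by line.
TEXT_EXTENSIONS = (".opf", ".xhtml", ".html", ".htm", ".css", ".ncx", ".xml")


def diff_epubs(old_path, new_path):
    """Return a list of lines describing how two EPUB files differ.

    Hypothetical helper. Text files get a unified diff; other files
    are only reported as changed, added, or removed.
    """
    report = []
    with zipfile.ZipFile(old_path) as old, zipfile.ZipFile(new_path) as new:
        old_names, new_names = set(old.namelist()), set(new.namelist())
        for name in sorted(old_names | new_names):
            if name not in new_names:
                report.append(f"only in old: {name}")
                continue
            if name not in old_names:
                report.append(f"only in new: {name}")
                continue
            a, b = old.read(name), new.read(name)
            if a == b:
                continue
            if name.lower().endswith(TEXT_EXTENSIONS):
                report.extend(difflib.unified_diff(
                    a.decode("utf-8").splitlines(),
                    b.decode("utf-8").splitlines(),
                    fromfile=f"old/{name}", tofile=f"new/{name}", lineterm=""))
            else:
                report.append(f"binary file differs: {name}")
    return report
```

Printing the result with print("\n".join(diff_epubs(before, after))), where before and after are the paths of the two versions, gives a quick look at what changed, for example in the metadata of the .opf file.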


So everything looks okay! I also flipped through it on my reader for a final “user testing” session.

Finally, we want to run it through IDPF’s EPUB Validator (a free web-based tool) to ensure everything is kosher:


Ready for distribution?