To create my Aladore edition, I decided to use the wonderful Free software (GPLv3) EPUB editor, Sigil, https://github.com/user-none/Sigil
I have used Sigil to tweak and create ebooks for many years. At the beginning of this project several months ago, I was sad to read that active development was going to stop. In February 2014, Sigil’s main developer John Schember posted that development was stalled and suggested using Calibre’s newly expanded epub editor (we will visit Calibre in another post soon!). However, at the end of September 2014, Sigil surged back to life with a major new release! Hooray! Thank you John Schember and Kevin Hendricks!
A good user guide and tutorials can be found hosted on the old Google code page, http://web.sigil.googlecode.com/git/files/OEBPS/Text/introduction.html, or downloaded as an epub, https://github.com/user-none/Sigil/blob/master/docs/Sigil_User_Guide_0_7_2.epub.
Sigil is a true, full-featured EPUB editor. It can easily import text or HTML files, add images and style sheets, generate a table of contents and index, and create a cover. It allows you to toggle between WYSIWYG and code editing views. It has a number of utilities built in to tidy and validate your EPUB. Its only disadvantage at this point is that it is based on EPUB2 and does not yet support the full features of the newest standard, EPUB3.
So let's open Sigil and get going!
First, I go to File > Add > Existing Files and select all six HTML files containing the text of Aladore (generated by YAGF, edited by Bluefish). The directory structure of the EPUB is represented in the “Book Browser” in the left-hand pane, where I can see the HTML files I added. Double-click a file in the Book Browser to open it in the main window, where all the active files are tabbed.
Now, I work through each file to break it into chapters. Normally, each chapter is contained in its own HTML file. This ensures that there is a page break and that the file sizes remain small enough for good performance on ereaders (which may have very low specs). I put the cursor in front of the chapter title and press Ctrl+Enter (or select Edit > Split at Cursor from the menu, or click the Split at Cursor icon in the tool bar). The chapter title will now be at the head of a new HTML file. Select the chapter title and change the style to H1. You may want to use H2 or H3 to allow some diversity in your heading formatting, but the disadvantage is that Sigil interprets this as a hierarchy. The table of contents will have all the H2 and lower headings nested underneath the preceding H1. In practical terms, this means that when you access the TOC on your ereader you have to click through a hierarchy rather than going straight to a chapter. With Sigil the important thing is just to get the headings marked, since it is very easy to change the level as a batch later on if you want to tweak the formatting.
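If you would rather script the splitting than press Ctrl+Enter for every chapter, the idea can be sketched in a few lines of Python. This is only a rough illustration, not how Sigil does it: the function name `split_at_headings` is made up for this example, and it assumes the chapter titles are already tagged as H1 and that you are working on a plain HTML fragment rather than a complete XHTML file.

```python
import re
from typing import List

def split_at_headings(html_body: str) -> List[str]:
    """Split an HTML fragment into chunks, each starting at an <h1> tag.

    Hypothetical helper: any text before the first <h1> (front matter)
    is kept as its own chunk.
    """
    # The zero-width lookahead splits *before* each <h1> without consuming it.
    parts = re.split(r"(?=<h1\b)", html_body, flags=re.IGNORECASE)
    # Drop empty chunks, e.g. when the text starts with a heading.
    return [part for part in parts if part.strip()]

sample = "<p>intro</p><h1>One</h1><p>a</p><h1>Two</h1><p>b</p>"
for number, chunk in enumerate(split_at_headings(sample), start=1):
    print(number, chunk)
```

Each chunk would still need to be wrapped in a proper XHTML file before an ereader could use it, which is exactly the bookkeeping Sigil handles for you.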
I keep the directory of page images handy in case I need to refer to the original to make any corrections as I scan through the text. Spelling errors are only marked in Code view.
If your text has a lot of strange words, like Aladore does, it is a good idea to set up a custom dictionary to work with. Go to Edit > Preferences > Spellcheck Dictionaries and add a new dictionary.
Now, as I work through the text I add the common weird words (for example, the name of the hero Ywain) to the new Aladore dictionary by right-clicking in Code view. This makes it easier to identify actual errors.
Now it's just a long slog through the chapters, carefully tagging the headings and touching up the text. Mainly, I am just fixing the paragraph breaks as needed and looking for anything that seems weird.
At this point, I also begin to notice patterns of errors. For example, “80” often appears in place of “So,” and zero in place of “O”. Time for some more Find & Replace!
Luckily, Sigil offers advanced find & replace features much like Bluefish. It supports regular expressions and can work on selections or the whole book at once.
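To get a feel for what such a regular expression might look like, here is a small Python sketch of the two substitutions mentioned above. The patterns and the function name `fix_ocr_confusions` are illustrative assumptions, not anything built into Sigil, and Sigil's regex dialect is not identical to Python's `re`, so treat this as a way to test the idea rather than a recipe.

```python
import re

# Hypothetical patterns for the two OCR confusions described above.
# A standalone "80" followed by an optional comma and a lowercase word -> "So".
EIGHTY_FOR_SO = re.compile(r"\b80\b(?=,? [a-z])")
# A standalone zero followed by a word -> the letter "O".
ZERO_FOR_O = re.compile(r"\b0\b(?= [A-Za-z])")

def fix_ocr_confusions(text: str) -> str:
    """Replace likely misreadings of "So" and "O" in OCR output."""
    text = EIGHTY_FOR_SO.sub("So", text)
    text = ZERO_FOR_O.sub("O", text)
    return text

print(fix_ocr_confusions("80 he rode on, crying 0 Ywain!"))
# prints: So he rode on, crying O Ywain!
```

Patterns like these will also catch genuine numbers (“80 men” would become “So men”), so it is safer to step through the matches one at a time than to replace everything blindly.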
Another feature helpful for isolating errors is generating a report via Tools > Reports. The report gives summary information about everything in the EPUB. For example, the HTML section lists all the files with their size, word count, spelling error count, and other statistics. Looking at the Characters section often highlights errors that would otherwise be impossible to identify.
The character report for Aladore listed a bunch of instances of the exotic characters “ﬁ” (which is not “f i”) and “ﬂ” (which is not “f l”). These are ligatures, single characters that combine two letters. Here they are OCR artifacts and need to be replaced with normal characters. Clicking on the report places the cursor on an example so you can jump right into Find & Replace. Finally, I click Tools > Spellcheck > Spellcheck, which generates a spelling error report similar to the character report. You can quickly scan through this list of errors to locate issues, even where there are many strange words as with Aladore. Clicking on a word in the list highlights it in the HTML so you can fix it.
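The same kind of character check and ligature cleanup can be sketched outside Sigil in Python. This is a minimal illustration under my own assumptions: the names `character_report` and `replace_ligatures` are hypothetical, and the report here simply counts non-ASCII characters, which is only a rough stand-in for what Sigil's Characters report shows.

```python
from collections import Counter

# The two ligature characters named in the post, mapped to plain letters.
LIGATURES = {
    "\ufb01": "fi",  # the single-character "fi" ligature
    "\ufb02": "fl",  # the single-character "fl" ligature
}

def character_report(text: str) -> Counter:
    """Count each non-ASCII character so that oddities stand out."""
    return Counter(ch for ch in text if ord(ch) > 127)

def replace_ligatures(text: str) -> str:
    """Swap each ligature character for its ordinary two-letter spelling."""
    for ligature, plain in LIGATURES.items():
        text = text.replace(ligature, plain)
    return text

sample = "The \ufb01rst \ufb02ower"
print(character_report(sample))
print(replace_ligatures(sample))
```

Curly quotes and other legitimate typography will show up in the count as well, so the report is a list of things to look at, not a list of things to delete.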
Alright, keep going, we almost have an EPUB! Next post…