0.13.0 #256

Open
wants to merge 50 commits into
base: master

eshellman
Collaborator

@eshellman eshellman commented Jan 14, 2025

0.13.1 January 20, 2025
With this release we add some important new capabilities to Ebookmaker; other more profound functionalities such as PDF generation have been deferred to the next major version (0.14), but important groundwork is laid so that problems can surface sooner rather than later.

  • This release includes support for MathML. Although MathML is more than 10 years old, only recently have all the major web browsers supported it. MathML has been part of the EPUB3 spec since its beginning. It also has important accessibility attributes, and some (but not all) assistive technologies support it. It renders math beautifully and allows equations to reflow in ebooks; the lack of reflow makes our math-containing ebooks a poor experience in EPUB readers and Kindles. Fixes MathML test #254

    • For compatibility with older ebook readers, we are converting math elements to img elements in our EPUB2 files. The alttext attribute of the math element will be used for the alt attribute. Because of this, use of the altimg and alttext attributes on the math element should be encouraged. If altimg is not supplied, the text content of the math element will be used in the EPUB2 file.
    • While support for MathML in Ebookmaker enables experimental use and possibly deployment at PG, it is up to PG and DP management to determine if and when to recommend its use.
    • MathCAT is software to look into for generating alt text from MathML https://nsoiffer.github.io/MathCAT/
    • An online service existed to generate alt text automatically from MathML: https://github.com/openstax/mathmlcloud I am trying to find out more about its status.
    • arXiv uses LaTeXML to convert LaTeX math to MathML https://math.nist.gov/~BMiller/LaTeXML/
    • fixed issue where html5 math element produced EPUB3 that didn't validate
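The EPUB2 fallback described above can be sketched roughly as follows. This is a minimal illustration using the standard library's ElementTree, not Ebookmaker's actual code; the function name `math_to_epub2` and the choice of a span wrapper for the text fallback are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def math_to_epub2(math):
    """Return an EPUB2-safe replacement for a MathML <math> element.

    Hypothetical helper: with altimg, build an <img> whose alt comes from
    alttext; without altimg, fall back to the element's text content.
    """
    altimg = math.get("altimg")
    if altimg:
        img = ET.Element(f"{{{XHTML_NS}}}img")
        img.set("src", altimg)
        img.set("alt", math.get("alttext", ""))
        return img
    # Assumption: the text fallback is wrapped in a span.
    span = ET.Element(f"{{{XHTML_NS}}}span")
    span.text = "".join(math.itertext()).strip()
    return span


source = (
    f'<math xmlns="{MATHML_NS}" altimg="images/eq1.png" alttext="x squared">'
    "<msup><mi>x</mi><mn>2</mn></msup></math>"
)
replacement = math_to_epub2(ET.fromstring(source))
print(replacement.get("src"), "|", replacement.get("alt"))
```

A math element without altimg goes down the second branch, which is why supplying both attributes gives the better result in EPUB2 readers.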
  • This release adds support for inline (embedded) SVG via the svg element. Fixes "EPUB content.opf file not reflecting presence of an <svg>" #136 and "attributes appear to be downcased, causing validation errors" #135.

    • Previous versions of Ebookmaker only supported external svg images.
    • Because Ebookmaker is required to produce XML based output in EPUB files, inline SVG must use XML syntax rather than HTML5 syntax. This means that element and attribute names inside the svg element are case-sensitive, and the svg element MUST include an xmlns="http://www.w3.org/2000/svg" attribute.
    • While use of the alt attribute on external svg images in img elements should be well understood, the impact of inline SVG on accessibility is mixed. If the svg does not contain text elements, the use of the svg title attribute is recommended. See https://www.unimelb.edu.au/accessibility/techniques/accessible-svgs for a good discussion.
    • We remove the svg role attribute in EPUB2 files, because there's no way to use it in EPUB2 files without triggering a validation error. The role attribute is retained for inline svg in EPUB3 files. No problems occur for HTML5 files.
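As a rough sketch of the EPUB2 role handling, assuming an XML parse with the standard library's ElementTree (Ebookmaker's own implementation may differ, and `strip_svg_roles` is a hypothetical name):

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def strip_svg_roles(root):
    """Remove the role attribute from every namespaced <svg> element in place.

    Returns the number of attributes removed.
    """
    removed = 0
    for svg in root.iter(f"{{{SVG_NS}}}svg"):
        if svg.attrib.pop("role", None) is not None:
            removed += 1
    return removed


page = ET.fromstring(
    f'<div xmlns="{XHTML_NS}">'
    f'<svg xmlns="{SVG_NS}" role="img" viewBox="0 0 10 10">'
    "<title>A circle</title><circle cx='5' cy='5' r='4'/></svg></div>"
)
removed_count = strip_svg_roles(page)
```

Note that the lookup is by namespace: an svg element written without the xmlns attribute would inherit the XHTML namespace here and would not be found, which is one reason the attribute is required. The XML parser also keeps `viewBox` in its original case.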
  • implement HTML5/EPUB3 audio element for sound files. Fixes Add HTML5/EPUB3 audio in 0.13 #214.

    • links to mp3 and ogg are replaced with HTML5 audio elements during initial parsing
    • HTML audio elements are downgraded to links for EPUB2 files
    • mp3 and ogg files will be included in EPUB3 files.
    • because PPers have been hiding links to sound files for ebook files using x-ebookmaker CSS, we are adding CSS to unhide elements that contain the audio elements, for EPUB3 only.
    • we no longer include music xml files in our EPUB documents
    • HTML5 audio elements have many attributes. PG and DP need to decide what settings are best to use. No autoplay please!
    • the audio element, as converted to mobi by calibre, displays but doesn't play on older Amazon products. It's our understanding that send-to-kindle does not accept EPUBs with audio, so Kindle users may need to use our EPUB2 files for send-to-kindle when the ebook contains audio.
    • When adding HTML5 audio to a book, use code like this: <audio title="" controls="controls"><source src="music/test.mp3" type="audio/mpeg" id="id-audio">Audio content is not currently supported on your device.</audio>. It's important to have localized fallback text inside the audio element, after the source element.
    • in addition to mp3 audio, the code supports "ogg" audio files, whatever they are.
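The link-to-audio replacement might look something like the sketch below. It uses the standard library's ElementTree rather than Ebookmaker's parser. The helper name `link_to_audio`, the extension table, and the English fallback string are assumptions for illustration; real fallback text should be localized.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from file extension to media type.
AUDIO_TYPES = {".mp3": "audio/mpeg", ".ogg": "audio/ogg"}
FALLBACK = "Audio content is not currently supported on your device."


def link_to_audio(link):
    """Build an <audio> element for a link to an mp3 or ogg file.

    Returns None when the href does not end in a known audio extension.
    """
    href = link.get("href", "")
    for ext, mime in AUDIO_TYPES.items():
        if href.lower().endswith(ext):
            audio = ET.Element("audio", {"controls": "controls"})
            source = ET.SubElement(audio, "source", {"src": href, "type": mime})
            # Fallback text goes after the source element, so it is its tail.
            source.tail = FALLBACK
            return audio
    return None


audio = link_to_audio(ET.fromstring('<a href="music/test.mp3">Listen</a>'))
print(ET.tostring(audio, encoding="unicode"))
```

No autoplay attribute is set, in keeping with the request above.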
  • groundwork for PDF generation from html5 (v 0.14)

    • removed html.noimages, pdf.noimages.
    • having css attached to the body element can cause all sorts of problems for HTML print pagination. This release adds a 'screen' media selector to problematic css rules that include body in their selector. For now, this is just margin and padding rules. If you find that this changes anything, please let us know!
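To make the body rule change concrete, here is a simplified regex-based sketch. It is not the code in this release: it assumes a flat stylesheet (no nested at-rules, no braces inside comments or strings), it wraps the whole matching rule in a screen media block, and the name `screen_only_body_rules` is made up for the example.

```python
import re

# One flat rule: selector text followed by a brace-delimited declaration block.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")
# "body" as a type selector, not .body, #body, tbody or body-text.
BODY_RE = re.compile(r"(?<![\w.#-])body(?![\w-])")
# margin or padding properties, including longhands such as margin-left.
BOX_RE = re.compile(r"(?<![\w-])(?:margin|padding)")


def screen_only_body_rules(css):
    """Wrap body rules that set margin or padding in '@media screen'."""

    def wrap(match):
        rule = match.group(0)
        selector, declarations = match.group(1), match.group(2)
        if BODY_RE.search(selector) and BOX_RE.search(declarations):
            stripped = rule.lstrip()
            lead = rule[: len(rule) - len(stripped)]  # keep leading whitespace
            return f"{lead}@media screen {{ {stripped} }}"
        return rule

    return RULE_RE.sub(wrap, css)


css_in = "body { margin: 10%; }\np { margin: 0; }"
css_out = screen_only_body_rules(css_in)
print(css_out)
```

With rules scoped this way, print pagination no longer sees the body margins, while on-screen rendering should be unchanged.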
  • added the "production" flag to allow context-sensitive log entries. If the flag is set, then lack of header/footer markers will be reported as CRITICAL errors; if not (the default), there will be INFO messages instead. The production flag also turns on ALTTEXT logging. Fixes Add a config flag to suppress log messages irrelevant to pre-upload usage #251
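The effect of the flag on the marker check can be illustrated with Python's standard logging module. The function and logger names below are hypothetical and only show the level selection, not how Ebookmaker reads its configuration.

```python
import logging

logger = logging.getLogger("ebookmaker_sketch")  # hypothetical logger name


def marker_log_level(production):
    """Pick the level for a missing header/footer marker report."""
    return logging.CRITICAL if production else logging.INFO


def report_missing_markers(production=False):
    """Log the missing-marker message at the level implied by the flag."""
    level = marker_log_level(production)
    logger.log(level, "No header/footer markers found")
    return level
```

Pre-upload users who leave the flag unset therefore see an informational message rather than a critical error.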

  • added FILESDIR and CACHEDIR to sample conf file

  • changed the WARNING when FILESDIR was not configured to an INFO; there was already a sensible default. Fixes set default for FILESDIR #248.

  • it's been a long time since logging was rationalized. In general, there was much unneeded logging. Many WARNING messages were changed to INFO; many INFO messages were changed to DEBUG, and several DEBUG messages were removed, commented out, or changed to summary messages outside of a loop. Fixes Review INFO warnings and change to DEBUG where appropriate #250, Change pagenum warnings to INFO #249

  • any links to gutenberg.org, pglaf.org, or pgdp.org are now considered gutenberg links and get an INFO message rather than a WARNING.
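One plausible way to classify such links, shown only as a sketch: compare the URL's host against the three domains. Treating subdomains such as www.gutenberg.org as matches is an assumption, as is the helper name `is_gutenberg_link`.

```python
from urllib.parse import urlsplit

GUTENBERG_DOMAINS = ("gutenberg.org", "pglaf.org", "pgdp.org")


def is_gutenberg_link(url):
    """Return True when the URL's host is one of the domains or a subdomain."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in GUTENBERG_DOMAINS)
```

Relative links have no host and are never classified as gutenberg links by this check.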

  • Flow optimization

    • store sourcefile urls in ParserFactory to enable skipping duplicate epub generation.
    • this should allow faster processing and less duplication of log messages.
  • nonstandard single-line comments in style elements are now removed. These made up a majority of our ERROR logs. fixes EBM's removal of CDATA markers from style elements leaves debris #252

  • removed the dependency on CherryPy, which made code hard to understand and was useful only when using Ebookmaker as a web spider, which we don't do. Mediatypes are no longer wrapped in a HeaderElement object; they're just strings.

  • inappropriate alt text ERRORs changed to WARNINGs.

  • added test for html5 source file

remove html.noimages, pdf.noimages.
Add html.images as a pd requirement
store sourcefiles in ParserFactor to enable skipping the duplicate html parse