Compressing SVG with EXI

Youenn Fablet, Romain Bellessort, Jun Fujisawa,
Anthony Grasso, Hervé Ruellan
Canon

EXI is a binary format for encoding XML Infosets. It is currently in Candidate Recommendation, one of the last steps of the W3C standardization process. The main goal of this format is to provide very good compaction for a wide range of XML documents, applications and devices. To meet this goal, several encoding options are defined within the EXI format. Those encoding options must be selected with care to meet the requirements of an application in terms of compression, processing efficiency and memory usage.

This study describes the possibilities offered and the results achieved by the EXI format for SVG applications. The impact of the encoding options is evaluated in this context, in particular schema-less versus schema-informed encoding (i.e. whether schema information is available) and compression (i.e. whether a further generic compression step is applied to the encoded data).

A comparison with existing technologies and practices (SVGZ and BiMSVG) is also conducted to show the strengths and weaknesses of EXI applied to SVG. While clearly an improvement over SVGZ, EXI in its most interoperable form and without the compression option is less compact than BiMSVG. Indeed, BiMSVG has built-in support for very efficient encoding of SVG textual content, whereas the most interoperable form of EXI relies only on schema information. However, as EXI enables the definition of built-in data types, the addition of specific SVG support to an EXI processor is evaluated in the study, demonstrating largely improved results.
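As a point of reference for the SVGZ baseline mentioned above, SVGZ is simply an SVG document compressed with gzip, so its size is easy to measure. The following is a minimal sketch, not the measurement procedure used in the study; the helper name `svgz_size` and the sample document are assumptions made for illustration.

```python
import gzip


def svgz_size(svg_text: str) -> tuple[int, int]:
    """Return (raw size, gzip-compressed size) in bytes for an SVG document."""
    raw = svg_text.encode("utf-8")
    # SVGZ is gzip applied to the serialized SVG document.
    compressed = gzip.compress(raw, compresslevel=9)
    return len(raw), len(compressed)


# Hypothetical sample: a repetitive document, as produced by many SVG generators.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    + "".join(
        f'<rect x="{i}" y="{i}" width="10" height="10"/>' for i in range(200)
    )
    + "</svg>"
)

raw_size, gz_size = svgz_size(sample)
print(f"raw: {raw_size} bytes, svgz: {gz_size} bytes")
```

Because gzip works on the serialized text without any knowledge of XML structure or SVG data types, this gives the generic-compression baseline against which structure-aware formats such as EXI and BiMSVG are compared.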

This study also highlights specific benefits and issues: interoperability (lack of a convenient, well-known XML Schema for SVG document encoding; SVG versions and modules), the impact of the compression option on SVG progressive rendering, and the usefulness of features such as schema deviation support (i.e. enabling the efficient encoding of items not described in the schema), self-contained elements (i.e. encoding parts of the SVG document independently of each other) and the datatype representation map (i.e. mapping dedicated codecs to specific SVG textual content).
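To illustrate what a dedicated codec for SVG textual content might look like, the sketch below encodes a coordinate list (such as the value of a `points` attribute) as packed 32-bit floats instead of text. It is only an illustration of the general idea: it is not the codec defined by EXI or the one evaluated in the study, the function names are hypothetical, and values that are not exactly representable as 32-bit floats would be rounded.

```python
import re
import struct


def encode_number_list(text: str) -> bytes:
    """Encode a comma/whitespace-separated number list as little-endian float32."""
    tokens = [tok for tok in re.split(r"[\s,]+", text.strip()) if tok]
    values = [float(tok) for tok in tokens]
    return struct.pack(f"<{len(values)}f", *values)


def decode_number_list(data: bytes) -> list[float]:
    """Decode bytes produced by encode_number_list back into a list of floats."""
    count = len(data) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f"<{count}f", data))


# Hypothetical attribute value, e.g. from <polyline points="...">
points = "100.25,200.5 300.75,400.125"
encoded = encode_number_list(points)
decoded = decode_number_list(encoded)
print(len(points), "characters as text,", len(encoded), "bytes encoded")
```

A typed encoding of this kind removes separators and decimal digits from the stream, which is the sort of gain a generic string encoding cannot obtain on its own.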

To conclude, the study makes recommendations on the best use of EXI for SVG (notably for both open and closed environments) and looks at the future. With SVG 2.0 on the way, some features, such as the definition of a more structured syntax for some SVG constructs (notably path and animation elements), have a very positive impact on the compaction results. This is assessed within the study. The study also raises questions on which feedback from the SVG community is sought. Is there a need for a common compact format for all SVG applications? Is there a need to define interoperable EXI support for SVG? Should the SVG 2.0 syntax be assessed in terms of EXI compaction results? Is there a need to define a common SVG schema for EXI processors? How should the different SVG versions (1.0, 1.1, 1.2 and the future 2.0) be handled?