Multipage DjVu documents.

The DjVu3 system supports two models for multi-page documents: bundled multi-page documents and indirect multi-page documents. The multi-page API allows you to assemble already compressed pages and to create multi-page DjVu documents using either model.

Bundled multi-page documents --- A bundled multi-page DjVu document uses a single file to represent the entire document. This single file contains all the pages as well as ancillary information (e.g. the page directory, data shared by several pages, thumbnails, etc.). Using a single file format is very convenient for storing documents or for sending email attachments.

A bundled multi-page document is composed of a single "FORM:DJVM" composite chunk. This composite chunk always begins with a "DIRM" chunk containing the document directory (see DjVmDir.h), which lists the component files that compose the document. The component files themselves are then encoded as IFF85 composite chunks following the "DIRM" chunk.
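The layout described above can be illustrated with a minimal sketch that builds a tiny synthetic container in memory and walks its top-level chunks. It is not the library's decoder. It assumes the usual IFF conventions (4-byte chunk ID, big-endian 32-bit size, padding to an even length) and a leading "AT&T" magic number, and it uses "FORM:DJVU" and "FORM:DJVI" as the component types for pages and shared files. The helper names and the placeholder "DIRM" payload are hypothetical; a real directory has its own internal encoding that is not reproduced here.

```python
import struct

def make_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Encode one IFF chunk: ID, big-endian size, payload, pad byte if odd."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack(">I", len(payload)) + payload + pad

def make_form(secondary_id: bytes, children: bytes) -> bytes:
    """Encode a composite FORM chunk with the given secondary ID."""
    return make_chunk(b"FORM", secondary_id + children)

def list_components(data: bytes):
    """Return (id, secondary id or None, size) for each chunk in FORM:DJVM."""
    # Assumption: the file starts with the "AT&T" magic, then the outer FORM.
    if data[:4] != b"AT&T" or data[4:8] != b"FORM":
        raise ValueError("not an IFF file with the assumed AT&T prefix")
    (form_size,) = struct.unpack(">I", data[8:12])
    payload = data[12:12 + form_size]
    if payload[:4] != b"DJVM":
        raise ValueError("not a bundled multi-page document")
    found, pos = [], 4  # skip the secondary ID "DJVM"
    while pos + 8 <= len(payload):
        chunk_id = payload[pos:pos + 4]
        (size,) = struct.unpack(">I", payload[pos + 4:pos + 8])
        body = payload[pos + 8:pos + 8 + size]
        secondary = body[:4].decode("ascii") if chunk_id == b"FORM" else None
        found.append((chunk_id.decode("ascii"), secondary, size))
        pos += 8 + size + (size & 1)  # chunks start on even offsets
    return found

# Synthetic document: directory first, then one shared file and one page.
dirm = make_chunk(b"DIRM", b"\x00\x00\x00")  # placeholder, not a real directory
shared = make_form(b"DJVI", make_chunk(b"Djbz", b"abc"))
page = make_form(b"DJVU", make_chunk(b"INFO", b"\x00" * 10))
bundle = b"AT&T" + make_form(b"DJVM", dirm + shared + page)

components = list_components(bundle)
```

In this sketch the first entry of `components` is always the "DIRM" chunk, and every following entry is a composite chunk for one component file.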

Including shared information --- Any DjVu image file contained in a multi-page file may contain an "INCL" chunk containing the ID of a shared component file. The decoder should then process the chunks contained in the shared component file as if they were contained in the DjVu image file. The shared component file may contain any information otherwise allowed in a DjVu image file (except, of course, for the "INFO" chunk). There are several benefits to storing such shared information in separate files. A well-designed browser may keep pre-decoded copies of these files in a cache, which reduces the amount of data transferred over the Internet and increases display speed. The multi-page DjVu compressor, for instance, identifies similar object shapes occurring in several pages. These shapes are encoded in a shape dictionary (chunk "Djbz") placed in a shared component file. All relevant pages include this shared component file. Although they appear in several pages, these shared shapes are encoded only once in the document.
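The inclusion rule can be sketched as a small recursive expansion. This is an illustration only, not the library's decoder: the dictionary of component files, the file IDs, the chunk payloads and the function name `expand_includes` are all hypothetical, and each component is modeled simply as a list of (chunk ID, payload) pairs.

```python
def expand_includes(file_id, components, _active=()):
    """Return the chunks of file_id with every INCL replaced by the
    chunks of the shared component file it names."""
    if file_id in _active:
        raise ValueError(f"circular INCL reference through {file_id!r}")
    expanded = []
    for chunk_id, payload in components[file_id]:
        if chunk_id == "INCL":
            # The payload of an INCL chunk is the ID of the shared file.
            expanded.extend(
                expand_includes(payload, components, _active + (file_id,))
            )
        else:
            expanded.append((chunk_id, payload))
    return expanded

# Hypothetical document: two pages share one shape dictionary.
doc = {
    "shared.iff": [("Djbz", b"shapes")],
    "p1.djvu": [("INFO", b"i1"), ("INCL", "shared.iff"), ("Sjbz", b"s1")],
    "p2.djvu": [("INFO", b"i2"), ("INCL", "shared.iff"), ("Sjbz", b"s2")],
}

page1_chunks = expand_includes("p1.djvu", doc)
page2_chunks = expand_includes("p2.djvu", doc)
```

Both pages see the "Djbz" dictionary as if it were their own chunk, while the dictionary is stored only once in `doc`.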

Browsing a multi-page document --- You can view the pages using the DjVu plugin and a web browser. When you type the URL of a multi-page document, the browser starts downloading the whole file, but displays the first page as soon as it is available. You can immediately navigate to other pages using the DjVu toolbar. Suppose, however, that the document is stored on a remote web server. You can easily access the first page and see that this is not the document you wanted. Although you will never display the other pages, the browser keeps transferring data for them, wasting the bandwidth of your server (and of the Internet). You could also read the summary of the document on the first page and jump to page 100. But page 100 cannot be displayed until data for pages 1 to 99 has been received, so you may have to wait for the transmission of unnecessary page data. This second problem (the unnecessary wait) can be solved using the "byte serving" option of the HTTP/1.1 protocol. This option has to be supported by the web server, the proxies, the caches and the browser. Support is growing but is not yet universal. Byte serving, however, does not solve the first problem (the waste of bandwidth).
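Byte serving relies on the HTTP/1.1 "Range" request header, which names an inclusive byte interval of the resource. The following sketch shows how a viewer could compute that header for one page of a bundled file. The offset table, the file size and the function name are hypothetical values chosen for illustration; in a real viewer the offsets would come from the document directory.

```python
def page_range_header(offsets, total_size, page_index):
    """Build the HTTP Range header covering one component of a bundled file.

    offsets[i] is the byte offset where component i starts; a component
    ends where the next one starts, or at the end of the file.
    """
    start = offsets[page_index]
    if page_index + 1 < len(offsets):
        end = offsets[page_index + 1]
    else:
        end = total_size
    # HTTP byte ranges are inclusive at both ends, hence end - 1.
    return {"Range": f"bytes={start}-{end - 1}"}

# Hypothetical three-page bundle of 12000 bytes.
example_offsets = [100, 5100, 9100]
header_page_2 = page_range_header(example_offsets, 12000, 1)
header_last = page_range_header(example_offsets, 12000, 2)
```

A server that supports byte serving would answer such a request with only the requested bytes, so the viewer does not have to wait for the preceding pages.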

Indirect multi-page documents --- DjVu solves both problems using a special multi-page format named the indirect model. An indirect multi-page DjVu document is composed of several files. The main file is named the index file. You can browse a document using the URL of the index file, just as you do with a bundled multi-page document. The index file, however, is very small. It simply contains the document directory and the URLs of secondary files containing the page data. When you browse an indirect multi-page document, the browser only accesses data for the pages you are viewing. This can be done at a reasonable speed because the browser maintains a cache of pages and sometimes pre-fetches a few pages ahead of the current page. This model uses the web server's bandwidth much more effectively. It also eliminates unnecessary delays when jumping ahead to pages located anywhere in a long document.
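The access pattern of the indirect model can be sketched as follows. This is a toy model, not the plugin's implementation: the class name, the file names, the example URL and the `fetch` callback are all hypothetical, and page files are assumed to be named relative to the index file's URL.

```python
from urllib.parse import urljoin

class IndirectViewer:
    """Toy viewer that fetches page files on demand, with a cache
    and a small look-ahead."""

    def __init__(self, index_url, page_files, fetch, prefetch=1):
        # Resolve each page file name against the URL of the index file.
        self.urls = [urljoin(index_url, name) for name in page_files]
        self.fetch = fetch
        self.prefetch = prefetch
        self.cache = {}

    def _load(self, page):
        if page not in self.cache:
            self.cache[page] = self.fetch(self.urls[page])
        return self.cache[page]

    def show(self, page):
        """Return the data for one page (0-based) and pre-fetch ahead."""
        data = self._load(page)
        last = min(page + 1 + self.prefetch, len(self.urls))
        for ahead in range(page + 1, last):
            self._load(ahead)
        return data

# Hypothetical 200-page document; the fake fetch only records requests.
requested = []

def fake_fetch(url):
    requested.append(url)
    return url.encode("ascii")

viewer = IndirectViewer(
    "http://example.com/doc/index.djvu",
    [f"p{n:04d}.djvu" for n in range(1, 201)],
    fake_fetch,
)
viewer.show(99)  # jump straight to page 100
viewer.show(99)  # second visit is served from the cache
```

After the jump, `requested` holds only the files for page 100 and the one pre-fetched page that follows it; none of pages 1 to 99 is transferred.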

Obsolete Formats --- The library also supports two other multipage formats which are now obsolete. These formats are technologically inferior and should no longer be used.



DjVu is a trademark of LizardTech, Inc.
All other products mentioned are registered trademarks or trademarks of their respective companies.