File Formats

Background

In the coming years, there will be a file format war reminiscent of the DVD format wars of the mid-2000s. To most people, this war will seem like a tempest in a teapot, a bunch of computer weenies arguing angrily over minutiae. However, far from being minutiae, the differences between the file formats are significant and can have serious financial consequences for your business.

File formats are critical because you may want to open a document 10 or 20 years down the road, only to discover that your software won’t open it. Hiring consultants to convert the files into a usable format can be expensive. Worse, a conversion effort can introduce errors. Sticking with one software platform, Microsoft Office for example, is not the way to avoid this problem; later versions of Word have been known to have trouble opening documents created in earlier versions of the application.

To meet the need for reliable long-term data formats, various organizations have attempted to develop file formats that would remain consistent over the years. These include Adobe’s Portable Document Format (PDF), Microsoft’s Office Open XML format (OOXML), and OASIS’ OpenDocument Format (ODF).

The Controversy

The most capable specifications are OOXML and ODF; unlike PDF documents, they can store dynamic information such as the contents of a spreadsheet. The PDF specification describes how to render the document, whereas the other two formats are nominally XML formats, which describe the document structure while allowing the software to decide the best way to render or display the data. While the ODF format is technically superior to OOXML, Microsoft’s support for ODF is very poor.
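To make the structural difference concrete, here is a minimal sketch, in Python, of how a program might tell the three formats apart by looking inside the file. It relies on three facts: a PDF begins with the marker `%PDF-`, while ODF and OOXML files are both ZIP packages of XML parts (ODF carries `mimetype` and `content.xml` entries, OOXML carries `[Content_Types].xml`). The function names `sniff_format` and `make_package` are my own, for illustration only, and the check is deliberately shallow.

```python
import io
import zipfile


def sniff_format(data):
    """Guess whether raw file bytes are PDF, ODF, or OOXML (shallow check)."""
    # PDF files begin with a "%PDF-" header.
    if data.startswith(b"%PDF-"):
        return "pdf"
    buffer = io.BytesIO(data)
    # ODF and OOXML are both ZIP packages that hold XML parts.
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as package:
            names = set(package.namelist())
        if "[Content_Types].xml" in names:
            return "ooxml"
        if "mimetype" in names and "content.xml" in names:
            return "odf"
    return "unknown"


def make_package(members):
    """Build a tiny in-memory ZIP package (hypothetical helper for the demo)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, text in members.items():
            package.writestr(name, text)
    return buffer.getvalue()


if __name__ == "__main__":
    fake_odf = make_package({
        "mimetype": "application/vnd.oasis.opendocument.text",
        "content.xml": "<placeholder/>",
    })
    print(sniff_format(fake_odf))
    print(sniff_format(b"%PDF-1.4\n"))
```

A real tool would also validate the package contents; this only shows that the two XML formats can be inspected with ordinary ZIP and XML tools, while a PDF cannot.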

The PDF Format

The chief benefit of the PDF format is that it has been around for a while, and there is no chance that you will be unable to open a PDF document. However, the PDF format is based on a printing format; it is not designed to store spreadsheets with calculated cells, for example. Moreover, as a publishing format, it is really optimized for generating printed documents. It is not conducive to producing well-rendered documents on devices with non-standard displays, such as cell phones.

OOXML

The chief benefit of OOXML is that it is supported by Microsoft Office. The problem with it is that nobody else supports it, and that OOXML is suffering from the same creep that plagued older versions of Office files. The OOXML files produced by Office 2007, for example, are not standard OOXML files. Moreover, the OOXML format is deeply flawed, a consequence of the effort to maintain backward compatibility with Office components that can’t read XML documents.

The end result is that software developers who do not work for Microsoft are struggling to produce software that can read OOXML files. This lack of broad adoption means that Microsoft will face little opposition if it unilaterally alters the file format in future versions of Office.

ODF Format

The ODF format was originally invented by engineers working at Sun Microsystems. It is a truly open specification in that the published specification is complete and fully describes how to write and read documents. Moreover, OASIS has a good system for inviting people to submit recommendations for improving the standard and for reviewing the changes. As a result, it has supplanted the original native formats of many non-Microsoft office suites, such as OpenOffice and KOffice.

The benefit of saving files in the ODF format is that once a file is saved in this format, you will be able to open the file at any arbitrary date in the future.
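As a rough illustration of why this holds up, the sketch below pulls the paragraph text out of an ODF text document using nothing but the Python standard library. It assumes the document body lives in the package’s `content.xml` and that paragraphs are `p` elements in the ODF text namespace; the function name `odf_paragraphs` is hypothetical, and the code ignores styles, tables, and nested paragraphs.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by ODF for text elements such as paragraphs.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odf_paragraphs(data):
    """Return the plain text of each paragraph in an ODF document's bytes."""
    # An ODF file is a ZIP package; the body is stored in content.xml.
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        root = ET.fromstring(package.read("content.xml"))
    # Join the text of each paragraph, including text inside child spans.
    return ["".join(p.itertext()) for p in root.iter("{%s}p" % TEXT_NS)]
```

Called as `odf_paragraphs(open("report.odt", "rb").read())`, where `report.odt` is a placeholder file name, it returns a list of strings. Even if every office suite disappeared, a few lines like these would still recover the text.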

The major problem with the format is Microsoft’s hostility to it. Microsoft’s business model is dependent on people routinely upgrading Microsoft Office and Microsoft Windows. Unfortunately for Microsoft, OpenOffice is a full-featured office suite that costs nothing to use. In order to limit the adoption of this free rival, Microsoft has settled on a strategy of non-interoperability. They reason that so long as alternative programs are incapable of reading Microsoft Office files while Microsoft Office programs are incapable of opening files produced by rival programs, there will be large disincentives associated with abandoning Microsoft Office. Their native support for ODF is intentionally very poor. I strongly expect that they are going to attempt a version of their famous Embrace, Extend, Extinguish strategy. However, this strategy is probably doomed to failure, because there exist several plug-ins that allow a person to save ODF files from within Microsoft Office.

My Recommendations

In my experience, Microsoft file formats are a poor long-term storage option. Microsoft cannot resist modifying their formats in unpredictable ways. The ODF format is what an industry standard should be.

I recommend that people either use OpenOffice or download, install, and use Sun’s ODF Plugin for Microsoft Office.

President, Waverley Computer Services
