Accessing File Formats (Wiki forum at Coderanch)

How do I access the XYZ file format in Java?

Large databases of file extensions can be found at www.file-extensions.org and dotwhat.net.

And if you don't know what type a given file is, there are various ways to determine it programmatically: http://www.rgagnon.com/javadetails/java-0487.html
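
One common way to do this is to look at the first few bytes of the file (its "magic number") instead of trusting the extension. The following is only a minimal sketch of that idea, not necessarily one of the techniques described on the linked page. The class name MagicSniffer and its method names are made up for illustration, the signature table covers just a handful of well-known formats, and the file-based overload assumes Java 11 or later for readNBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper: guesses a file type from its leading bytes. */
final class MagicSniffer {

    private static final byte[] PDF = "%PDF".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] RTF = "{\\rtf".getBytes(StandardCharsets.US_ASCII);
    // ZIP local file header; Office Open XML and OpenDocument files are ZIP containers.
    private static final byte[] ZIP = {'P', 'K', 3, 4};
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G'};
    // OLE2 compound document, the container of legacy DOC/XLS/PPT files.
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    private MagicSniffer() {}

    /** Reads the first 8 bytes of the file and classifies them. */
    static String detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(8));
        }
    }

    /** Classifies an already-read header; returns "unknown" if nothing matches. */
    static String detect(byte[] header) {
        if (startsWith(header, PDF)) return "PDF";
        if (startsWith(header, RTF)) return "RTF";
        if (startsWith(header, ZIP)) return "ZIP";
        if (startsWith(header, PNG)) return "PNG";
        if (startsWith(header, OLE2)) return "OLE2";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A result of "ZIP" or "OLE2" only identifies the container, so telling a DOCX from an ODT, or a DOC from an XLS, would still require looking inside it.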

An interesting article about Microsoft's binary file formats, especially DOC and XLS, is "Why are the Microsoft Office file formats so complicated? (And some workarounds)". It also mentions some alternatives to dealing with those formats directly.

Access

CHM

Excel

Gedcom

HDF (Hierarchical Data Format)

Image and movie files

INI

Matlab

mbox

OpenDocument (ODF)

Office Open XML

OpenOffice Java API

Outlook / PST

PDF

PowerPoint

Project

QIF (used by Microsoft Money and Quicken)

RTF

Visio

Word

Something else?
If you encounter an obscure format for which no library is available, it may be feasible to write a reader for it yourself, provided you have a description of the file format (which may be available on Wotsit, see link above). Several libraries, so-called lexers and parsers, are available that help in creating a reader, especially if the file format is ASCII rather than binary. You will need knowledge of regular expressions, though. File formats that have been tackled using this approach include RTF, CSV, HPGL and PBM/PGM/PPM. Lexers are easier to start with, but parsers can do more of the work for you. All of these libraries have ready-to-use examples on their web sites.
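
To give a feel for what such a reader involves, here is a hand-written sketch for one of the simplest ASCII formats mentioned above, plain PGM (magic number "P2"). It uses no lexer or parser library, just a regular expression that plays the role of the lexer and a few lines of code that play the role of the parser. The class name PlainPgmReader and its methods are invented for this example, and error handling is kept to the bare minimum.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical reader for plain (ASCII) PGM images, magic number "P2". */
final class PlainPgmReader {

    // Lexer rule: either a comment ('#' up to end of line) or a run of
    // characters that are neither whitespace nor '#'.
    private static final Pattern TOKEN = Pattern.compile("#[^\\n]*|[^\\s#]+");

    private PlainPgmReader() {}

    /** Lexer step: splits the text into tokens and drops comments. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            if (!m.group().startsWith("#")) {
                tokens.add(m.group());
            }
        }
        return tokens;
    }

    /** Parser step: magic number, width, height, max value, then the pixels. */
    static int[][] parse(String text) {
        List<String> t = tokenize(text);
        if (t.size() < 4 || !t.get(0).equals("P2")) {
            throw new IllegalArgumentException("not a plain PGM (P2) file");
        }
        int width = Integer.parseInt(t.get(1));
        int height = Integer.parseInt(t.get(2));
        int maxVal = Integer.parseInt(t.get(3));
        if (width <= 0 || height <= 0 || t.size() != 4 + width * height) {
            throw new IllegalArgumentException("header does not match pixel count");
        }
        int[][] pixels = new int[height][width];
        for (int i = 0; i < width * height; i++) {
            int value = Integer.parseInt(t.get(4 + i));
            if (value < 0 || value > maxVal) {
                throw new IllegalArgumentException("pixel value out of range: " + value);
            }
            pixels[i / width][i % width] = value; // row-major order
        }
        return pixels;
    }
}
```

Calling PlainPgmReader.parse on the contents of a P2 file returns the grey values as rows of integers. For formats with a richer grammar, such as RTF, a lexer or parser library would take over the tokenizing and structure-checking parts that are written by hand here.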

