Using OPC to Store Your Own Data
Let’s say you’ve been tasked with writing an application that stores some related information, such as an album of pictures. You also need to store some metadata about each picture.
One traditional way to store such data is to use a directory of files. This approach is simple, but opens the door to the files getting out of sync. For example, someone might inadvertently delete a picture without deleting the associated metadata file, or vice versa. Also, when sharing this data, all the files need to be bundled together (often in a zip file). Finally, such an application would be completely custom: you would be able to share the pictures and associated metadata only by publishing the exact data format so others could write programs that read and write the same format.
Another approach is to use a database. This is a proven method, and reduces the possibility of the data getting out of sync, but it requires a database engine (such as SQL Server Express or SQL Server Compact) and database access code. You’ll also still need a way to import and export the data so that it can be shared with other people; in other words, you still have the custom format problem.
Enter the Open Packaging Conventions
One attractive method is to use the same technology that is used to store Office 2007 files: the Open Packaging Conventions (OPC). OPC stores related data as parts inside a single zip file.
To see OPC in action, create a document using Word 2007 (or use Word 2003 with the Office Compatibility Pack). Save the document as something like Document.docx. Close the document, then rename the file, changing the extension to .zip (e.g. Document.zip).
After doing that, you can view the contents of the package (see Figure 1) using your favorite ZIP utility (WinZip, WinRAR, Windows, etc.).
Figure 1. Zipped Contents: After renaming a .docx file so it has a .zip extension, you can examine the contents with any ZIP file utility.
Author’s Note: While it’s possible to view the contents of a package or extract files from it with virtually any zip utility, making changes to the package file might make the package unreadable by the packaging classes. One utility that lets you modify the package without such problems is 7-Zip.
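You can also inspect a package from code rather than with a zip utility. The following is a minimal sketch, not part of the article's sample code: the module and function names are made up for illustration, and it assumes your project references WindowsBase (or the System.IO.Packaging package on newer .NET versions). It opens a package read-only and lists each part with its content type.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.Packaging

Module PackageInspector
    ' Returns one "uri (content type)" entry per part in the package.
    Function DescribeParts(ByVal path As String) As List(Of String)
        Dim entries As New List(Of String)
        ' Open read-only so the file is never modified by the inspection.
        Using pkg As Package = Package.Open(path, FileMode.Open, FileAccess.Read)
            For Each part As PackagePart In pkg.GetParts()
                entries.Add(part.Uri.ToString() & " (" & part.ContentType & ")")
            Next
        End Using
        Return entries
    End Function

    Sub Main(ByVal args As String())
        ' args(0) is assumed to be the path to a .docx or other OPC file.
        For Each entry As String In DescribeParts(args(0))
            Console.WriteLine(entry)
        Next
    End Sub
End Module
```

Running it against Document.docx should print the same part names you see in the zip utility, which is a quick way to confirm that the packaging classes can still read a file.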
Creating a Package in .NET
You don’t actually need Word to create a package; creating one in .NET takes little effort. You’ll find most of the classes you need in the System.IO.Packaging namespace (part of the WindowsBase assembly in .NET Framework 3.0 and later).
Dim package As Package = _
    Package.Open("c:\Album1.palb", IO.FileMode.Create)
Author’s Note: Whenever you open a package programmatically, it’s important to remember to close the package by either calling package.Close() or Dispose(). Failure to do this will leave the package file locked for the lifetime of your application.
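One way to make sure the package is always closed is to wrap it in a Using block, since Package implements IDisposable. This is a small sketch with a hypothetical helper name (CreateEmptyAlbum), not code from the article's sample:

```vb
Imports System
Imports System.IO
Imports System.IO.Packaging

Module AlbumFile
    ' Creates an album package at the given path and closes it again.
    Sub CreateEmptyAlbum(ByVal path As String)
        Using pkg As Package = Package.Open(path, FileMode.Create)
            ' Add parts and relationships here.
        End Using ' Dispose runs here, even if the code above throws.
    End Sub
End Module
```

Because the package is disposed when the block ends, the file is released right away and can be reopened, moved, or deleted by other code.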
Our Data Model
Here’s an example that creates the photo album application described earlier, using OPC. The application adds a number of image files (picture parts) and stores some additional information about each picture (pictureInfo parts) as an associated XML file. Therefore, the package needs two kinds of parts, as shown in Figure 2.
Figure 2. Sample Data Model: Here’s the data model for the picture album program example.
Add an Image
Adding a part requires a few steps. First, create the package part:
Dim uri As New Uri("/pictures/Bee.jpg", _
    UriKind.RelativeOrAbsolute)
Dim picturePart As PackagePart = _
    package.CreatePart(uri, System.Net.Mime.MediaTypeNames.Image.Jpeg, _
    CompressionOption.Maximum)
Note that the preceding code doesn’t put any data in the package. To store the image in the PackagePart, you need to open a stream (via the PackagePart.GetStream method), and then write the image data to that stream:
Using stream As IO.Stream = _
    picturePart.GetStream(IO.FileMode.Create)
    Dim bytes As Byte() = IO.File.ReadAllBytes("c:\Bee.jpg")
    stream.Write(bytes, 0, bytes.Length)
End Using
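For large images you may prefer not to read the whole file into a byte array first. The sketch below combines the two steps above into one helper and copies the file stream by stream. AddJpeg is a hypothetical name, and Stream.CopyTo assumes .NET Framework 4 or later; on earlier versions you would copy through a buffer in a loop instead.

```vb
Imports System
Imports System.IO
Imports System.IO.Packaging
Imports System.Net.Mime

Module PictureWriter
    ' Adds the JPEG file at sourcePath to pkg as the part named partName
    ' (for example "/pictures/Bee.jpg") and returns the new part.
    Function AddJpeg(ByVal pkg As Package, ByVal sourcePath As String, _
                     ByVal partName As String) As PackagePart
        ' CreatePartUri validates and normalizes the part name.
        Dim partUri As Uri = _
            PackUriHelper.CreatePartUri(New Uri(partName, UriKind.Relative))
        Dim part As PackagePart = _
            pkg.CreatePart(partUri, MediaTypeNames.Image.Jpeg, CompressionOption.Maximum)
        Using source As FileStream = File.OpenRead(sourcePath)
            Using target As Stream = part.GetStream(FileMode.Create)
                source.CopyTo(target) ' streams the file without a full in-memory copy
            End Using
        End Using
        Return part
    End Function
End Module
```

A call such as AddJpeg(package, "c:\Bee.jpg", "/pictures/Bee.jpg") would then replace the two snippets above.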
It’s All About Relationships
At this point, you’ve added a part to the package, but there’s no good way to navigate from the package to the part. For that, you need to create a relationship. Each Package contains a list of relationships. Each PackagePart also has a list of relationships. Each relationship can point to either an internal PackagePart or an external URI.
Creating a relationship between the package and the picturePart is straightforward:
Const Package2Picture As String = _
    "http://schemas.fowlcoder.com/package/relationships/picture"
_package.CreateRelationship(picturePart.Uri, _
    TargetMode.Internal, _
    Package2Picture)
The last argument in the call to CreateRelationship, the relationship type, is important. You’ll need this value later to enumerate the relationships from the package to the various picture parts.
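Relationships can also point outside the package. As a rough illustration of the external case mentioned earlier, the sketch below links the package to a web address. The relationship type string and the helper name are hypothetical and exist only for this example.

```vb
Imports System
Imports System.IO.Packaging

Module ExternalLinks
    ' Hypothetical relationship type, invented for this sketch.
    Const Package2Website As String = _
        "http://schemas.example.com/package/relationships/website"

    ' Records an external link on the package itself; no part is created.
    Function AddWebsiteLink(ByVal pkg As Package, ByVal url As String) As PackageRelationship
        Dim target As New Uri(url, UriKind.Absolute)
        Return pkg.CreateRelationship(target, TargetMode.External, Package2Website)
    End Function
End Module
```

When reading relationships back, check the TargetMode property before calling GetPart, because an external target does not correspond to a part in the package.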
Adding Picture Metadata
Now that you’ve added the image itself to the package, you need to add the metadata for that picture. An easy way to do this is to create another part that will hold an XML file:
' Part URIs must start with a forward slash.
Dim uri As New Uri("/pictures/Bee.jpg.xml", _
    UriKind.RelativeOrAbsolute)
Dim pictureInfoPart As PackagePart = _package.CreatePart(uri, _
    System.Net.Mime.MediaTypeNames.Text.Xml, _
    CompressionOption.Maximum)
You can use a simple PictureInfo class, with public Name and Description properties, to give the picture a name and a description, and then use an XmlSerializer to create the XML from the PictureInfo:
Dim pictureInfo As New PictureInfo()
pictureInfo.Name = "Bee"
pictureInfo.Description = "A stunning picture of a bee."
Using stream As IO.Stream = _
    pictureInfoPart.GetStream(IO.FileMode.Create)
    Dim serializer As New XmlSerializer(GetType(PictureInfo))
    serializer.Serialize(stream, pictureInfo)
End Using
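The PictureInfo class itself is not listed here, so the following is only a guess at its shape, based on the two properties the code above sets. XmlSerializer requires a public class with a public parameterless constructor; the auto-implemented properties assume Visual Basic 10 or later. The ToXml and FromXml helpers are hypothetical and simply show the round trip without a package.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Assumed shape of the metadata class; the real sample may differ.
Public Class PictureInfo
    Public Property Name As String
    Public Property Description As String
End Class

Module PictureInfoXml
    ' Serializes a PictureInfo to an XML string.
    Function ToXml(ByVal info As PictureInfo) As String
        Dim serializer As New XmlSerializer(GetType(PictureInfo))
        Using writer As New StringWriter()
            serializer.Serialize(writer, info)
            Return writer.ToString()
        End Using
    End Function

    ' Rebuilds a PictureInfo from XML produced by ToXml.
    Function FromXml(ByVal xml As String) As PictureInfo
        Dim serializer As New XmlSerializer(GetType(PictureInfo))
        Using reader As New StringReader(xml)
            Return DirectCast(serializer.Deserialize(reader), PictureInfo)
        End Using
    End Function
End Module
```

Keeping the class this small means the XML stored in each metadata part stays easy to read by hand when you inspect the package as a zip file.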
Creating a Relationship Between PackageParts
With the picture and the metadata stored in PackageParts, you now need to create a relationship between the picture and its information. That’s straightforward, and similar to code you’ve already seen:
Const Picture2PictureInfo As String = _
    "http://schemas.fowlcoder.com/package/relationships/picture2info"
picturePart.CreateRelationship(pictureInfoPart.Uri, _
    TargetMode.Internal, _
    Picture2PictureInfo)
Now the two parts are related.
Getting the Data Out
Now that you’ve inserted all this meaningful information, you also need a way to get it out. The manual way (and a good way to check what you’ve done so far) is to rename the album file to Album1.zip and inspect the contents (see Figure 3):
Figure 3. Package Contents: The figure shows the contents of the album package.
In Figure 3, you can see the two added PackageParts. Also, note the presence of the _rels folder, which stores the relationships whose source is a part in the same folder (the _rels folder at the root of the package holds the package-level relationships).
To read the contents of this package from code, you first open the package (if it’s not already open):
Dim package As Package = Package.Open( _
    "c:\Album1.palb", IO.FileMode.Open)
Then, enumerate the relationships between the package and the pictures (here’s where you need that Package2Picture argument I noted earlier).
Dim pictureRelationships = _
    package.GetRelationshipsByType(Package2Picture)
Now, you can enumerate the pictureRelationships list to get the images. For each relationship, retrieve the picturePart, and then get the data via a Stream:
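For Each pictureRelationship In pictureRelationships
    Dim picturePart As PackagePart = _
        package.GetPart(pictureRelationship.TargetUri)
    Dim image As Image
    Using stream As IO.Stream = _
        picturePart.GetStream(IO.FileMode.Open)
        ' GDI+ needs the stream for the lifetime of an Image created by
        ' FromStream, so copy it into a Bitmap before the stream closes.
        Using loaded As Image = System.Drawing.Image.FromStream(stream)
            image = New Bitmap(loaded)
        End Using
    End Using
Next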
Similarly, enumerate the picture part’s relationships to get the metadata for each picture, deserializing the data into a PictureInfo instance:
Dim relationships = _
    picturePart.GetRelationshipsByType(Picture2PictureInfo)
'In production code, you would want to make sure that there
' is a relationship before doing this.
Dim relationship As PackageRelationship = relationships(0)
Dim pictureInfoPart As PackagePart = _
    package.GetPart(relationship.TargetUri)
Dim pictureInfo As PictureInfo
Using stream As IO.Stream = _
    pictureInfoPart.GetStream(IO.FileMode.Open)
    Dim serializer As New XmlSerializer( _
        GetType(PictureInfo))
    pictureInfo = DirectCast( _
        serializer.Deserialize(stream), _
        PictureInfo)
End Using
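The snippets above assume that every relationship exists and that every target URI can be passed straight to GetPart. A more defensive traversal might look like the sketch below. It is not the article's sample code: the module and function names are hypothetical, and it resolves each target with PackUriHelper.ResolvePartUri so that relative targets would also work. It returns a map from each picture part URI to its metadata part URI (or Nothing when a picture has no metadata).

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO.Packaging

Module AlbumReader
    Const Package2Picture As String = _
        "http://schemas.fowlcoder.com/package/relationships/picture"
    Const Picture2PictureInfo As String = _
        "http://schemas.fowlcoder.com/package/relationships/picture2info"

    Function MapPicturesToInfo(ByVal pkg As Package) As Dictionary(Of String, String)
        Dim result As New Dictionary(Of String, String)
        Dim packageRoot As New Uri("/", UriKind.Relative)
        For Each pictureRel As PackageRelationship In pkg.GetRelationshipsByType(Package2Picture)
            ' External targets are not parts, so skip them.
            If pictureRel.TargetMode <> TargetMode.Internal Then Continue For
            Dim pictureUri As Uri = _
                PackUriHelper.ResolvePartUri(packageRoot, pictureRel.TargetUri)
            ' Skip dangling relationships instead of letting GetPart throw.
            If Not pkg.PartExists(pictureUri) Then Continue For
            Dim picturePart As PackagePart = pkg.GetPart(pictureUri)

            ' Take the first metadata relationship, if there is one.
            Dim infoUri As String = Nothing
            For Each infoRel As PackageRelationship In _
                picturePart.GetRelationshipsByType(Picture2PictureInfo)
                infoUri = PackUriHelper.ResolvePartUri( _
                    picturePart.Uri, infoRel.TargetUri).ToString()
                Exit For
            Next
            result(pictureUri.ToString()) = infoUri
        Next
        Return result
    End Function
End Module
```

From there, loading the image and deserializing the PictureInfo work exactly as shown above, using GetPart with the resolved URIs.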
Deleting an Image
Deleting a PackagePart is simple. The following code deletes the pictureInfo XML file as well as the picture itself, where pictureSummary is an object in the sample code that holds the URIs of both parts:
_package.DeletePart(pictureSummary.PictureInfoUri)
_package.DeletePart(pictureSummary.PictureUri)
DeletePart automatically deletes the relationships that were created from that part. However, it does not delete relationships to that part. To delete those, call DeleteRelationship on the source of the relationship (here, the package), which requires the relationship ID. One way to get this ID is to keep track of it when creating the relationship (CreateRelationship returns a PackageRelationship with an Id property). The other is to read it when enumerating the relationships.
_package.DeleteRelationship(packageToPicturePartRelationshipId)
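If you did not keep the relationship ID, you can look it up at deletion time. The sketch below shows one possible way to remove a picture completely, using the enumeration approach; RemovePicture is a hypothetical helper, not part of the article's sample, and it assumes pictureUri is a valid part URI such as "/pictures/Bee.jpg".

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO.Packaging

Module PictureRemover
    Const Package2Picture As String = _
        "http://schemas.fowlcoder.com/package/relationships/picture"
    Const Picture2PictureInfo As String = _
        "http://schemas.fowlcoder.com/package/relationships/picture2info"

    ' Deletes a picture part, its metadata part(s), and the package-level
    ' relationship(s) that point at the picture.
    Sub RemovePicture(ByVal pkg As Package, ByVal pictureUri As Uri)
        If Not pkg.PartExists(pictureUri) Then Return
        Dim picturePart As PackagePart = pkg.GetPart(pictureUri)
        Dim packageRoot As New Uri("/", UriKind.Relative)

        ' Collect the metadata parts the picture points to.
        Dim infoUris As New List(Of Uri)
        For Each rel As PackageRelationship In _
            picturePart.GetRelationshipsByType(Picture2PictureInfo)
            If rel.TargetMode = TargetMode.Internal Then
                infoUris.Add(PackUriHelper.ResolvePartUri(picturePart.Uri, rel.TargetUri))
            End If
        Next

        ' Collect the IDs of package relationships that target the picture.
        ' Gathering them first means the collection is not changed mid-enumeration.
        Dim staleIds As New List(Of String)
        For Each rel As PackageRelationship In pkg.GetRelationshipsByType(Package2Picture)
            If rel.TargetMode = TargetMode.Internal Then
                Dim target As Uri = PackUriHelper.ResolvePartUri(packageRoot, rel.TargetUri)
                If PackUriHelper.ComparePartUri(target, pictureUri) = 0 Then staleIds.Add(rel.Id)
            End If
        Next

        For Each id As String In staleIds
            pkg.DeleteRelationship(id)
        Next
        For Each infoUri As Uri In infoUris
            If pkg.PartExists(infoUri) Then pkg.DeletePart(infoUri)
        Next
        ' Deleting the part also removes the relationships it owns.
        pkg.DeletePart(pictureUri)
    End Sub
End Module
```

Doing all three deletions in one place keeps the package from accumulating relationships that point at parts that no longer exist.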
Putting It All Together
You’ve seen most of the fundamental operations required to store data in an OPC package. To demonstrate how all this code hooks together, the sample code for this article combines all the code you’ve seen so far with a simple user interface (see Figure 4) that lets you create or open an album, and add and remove pictures.
Figure 4. Explore OPC: This sample application illustrates the principles of storing, associating, and retrieving information from OPC.
OPC Recommendations
Although you’ve seen how to perform the individual operations required to work with OPC, when you’re building a real-world application, you’ll want to follow these recommendations:
- Chunk data: You probably noticed that the article example associated a separate XML file with each image rather than creating one big XML file for the entire album. One advantage of this approach is that you won’t lose all your data if the package somehow gets corrupted.
- Use indirection: Rather than referencing stored data directly, you can use relationships to reference the same data from multiple parts. For example, if you use the same image 50 times in a Word document, you only need to store one copy of the image. This both reduces the amount of space necessary to store the document and allows you to replace that image in all 50 locations in your document with one operation.
You’ve seen that OPC can be used to store disparate data types in a single container. This approach has many advantages over using a directory of files, including organization and portability. While OPC isn’t a replacement for a full-blown database, it is a compelling choice for storing many kinds of application data.