First things first: I exaggerate a little when I say "programmatically"; what I really mean is "not manually"! What we're going to look at is effectively performing a mail merge without having to use any COM or other API nasties, just the custom XML support within MS Word. To run through this you will need the following:
- MS Word 2007 (of course!)
- A source XML file; in my case I'm going to generate one with InfoPath 2007
- The Word 2007 Content Control Toolkit
- [optional] XML Spy to view what’s going on
The Scenario
You have a standard set of documents that you require customers to fill in at the start of a new engagement. These documents are full of customisations such as the customer's name, order details etc. You would like to generate these documents automatically on a server without having to automate MS Word (!).
You would rather not convert the document to an XSL stylesheet, as the document changes quite frequently and you don't want the burden of reworking the stylesheet each time. A solution whereby the users can change the template themselves would be preferable.
The Data Source
There are two categories of data source: the first is the content for the customisations (e.g. the customer name), the second is the template documents. For the first we will use an InfoPath form as below:
This form should be self-explanatory, so let's move on to the template Word document. In this case we'll assume the document has been written by someone else and the merge fields have been clearly marked, such as:
Now we want to replace the values inside <> with values that come out of our InfoPath form. To do that we must convert the temporary placeholder text to Word content controls. We do this by highlighting the text and then selecting the plain text control from the Developer tab as below:
Next we need to set the properties for our new control by highlighting the control and clicking Properties. Put some meaningful text in the 'Tag' field, as you will need this to identify the control later.
Perform this action for all of the fields you wish to populate. Where the same field is used more than once you can copy and paste the content control, which will retain the same tag. I recommend switching on Design Mode to check that you have converted all of the fields and that they are tagged correctly, giving you something like:
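If you would rather check the tags with a script than by eye, something like the following rough sketch should do it. It assumes the main document part lives at word/document.xml (where Word normally writes it) and ignores headers and footers; the function name list_control_tags is my own invention for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace, in ElementTree's "{uri}" prefix form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def list_control_tags(docx):
    """Return the Tag of every content control in the main document part.

    Assumption: the main part is word/document.xml. Controls that have
    no Tag are reported as None so they are easy to spot.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    tags = []
    for sdt_pr in root.iter(W + "sdtPr"):
        tag = sdt_pr.find(W + "tag")
        tags.append(tag.get(W + "val") if tag is not None else None)
    return tags
```

Calling list_control_tags("yourdoc.docx") returns one entry per control, so a field you pasted three times shows up three times, and any None in the list is a control you forgot to tag.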
Now let’s complete our InfoPath form and save some data:
Save the output XML file somewhere temporarily as we will need this later. Ensure the Word document is closed and open it with the Content Control Toolkit:
Note that the tags we applied to the Word document are shown; without these you can't tell the controls apart. Now create a new custom XML part using the link on the right and switch it to Edit View. You should see:
We must now put our custom XML data from the InfoPath form into the Word document. To do this, simply open the XML file from InfoPath using the folder icon in the Custom XML Parts pane as shown above, giving:
Now change to the Bind View and drag the fields from the right-hand pane onto the appropriate control tag on the left:
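Under the covers, binding a control adds a w:dataBinding element, carrying an XPath into the custom XML part, to that control's properties in the document XML. As a hedged sketch (same assumption as before that the main part is word/document.xml; list_bindings is a hypothetical helper name), you can list which tag is bound to which XPath like this:

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace, in ElementTree's "{uri}" prefix form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def list_bindings(docx):
    """Return (tag, xpath) pairs, one per content control.

    xpath is None for a control that has not been bound yet.
    Assumption: the main part is word/document.xml.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    pairs = []
    for sdt_pr in root.iter(W + "sdtPr"):
        tag = sdt_pr.find(W + "tag")
        binding = sdt_pr.find(W + "dataBinding")
        pairs.append((
            tag.get(W + "val") if tag is not None else None,
            binding.get(W + "xpath") if binding is not None else None,
        ))
    return pairs
```

Any pair with None as its XPath is a control you have not yet dragged a field onto.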
Save and close the document and re-open in Word and….
…you should have the values from the InfoPath form showing inside the Word document. Obviously you're not going to want to go through this process each time you generate the document; however, changing the data now that it is bound is very easy.
Rename the document from “yourdoc.docx” to “yourdoc.docx.zip”, open it in Explorer and navigate to the ‘customXml’ folder, where you will see a file called item1.xml.
If you double-click that file you will note that it opens in InfoPath. Thus, to change the document ‘data’ you just need to replace the XML file within the Word document package each time.
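On a server you would not rename anything by hand; a .docx is already a zip package, so the swap can be scripted. Here is a minimal sketch in Python using only the standard library. The function name replace_custom_xml and the default part name are my own assumptions (the part may not be item1.xml if the document holds more than one custom XML part), and the new XML must have the same structure and namespaces as the file you bound against, otherwise the XPaths in the bindings will not match.

```python
import zipfile


def replace_custom_xml(template, output, xml_bytes, part="customXml/item1.xml"):
    """Copy the template package to output, swapping one custom XML part.

    template and output may be file paths or file-like objects.
    Every other part of the package is copied across unchanged.
    """
    with zipfile.ZipFile(template) as src:
        if part not in src.namelist():
            raise KeyError(f"{part} not found in package")
        with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as dst:
            for name in src.namelist():
                # Substitute the new data for the bound part only.
                data = xml_bytes if name == part else src.read(name)
                dst.writestr(name, data)
```

You would call it along the lines of replace_custom_xml("template.docx", "customer.docx", new_xml), where new_xml holds the bytes of the freshly saved InfoPath file. The template itself is never modified, so the same bound document can be reused for every customer.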
I should also mention that, following the steps above, the document is bound “two ways”: if you change the clientName field in the Word document, not only will all occurrences of that field change throughout the document, but the XML file inside the package will be changed as well. This behaviour can be changed using the properties of each content control.
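The two-way binding also means you can read edited values back out of a returned document. A rough sketch, assuming the data sits in customXml/item1.xml and that the fields you care about are leaf elements with unique names (read_custom_xml_fields is a hypothetical helper, not part of any toolkit):

```python
import zipfile
import xml.etree.ElementTree as ET


def read_custom_xml_fields(docx, part="customXml/item1.xml"):
    """Return {element name: text} for every leaf element in the part.

    Namespace prefixes are dropped, so a later element with the same
    local name overwrites an earlier one.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read(part))
    fields = {}
    for element in root.iter():
        if len(element) == 0:  # no child elements, so treat it as a field
            name = element.tag.rsplit("}", 1)[-1]
            fields[name] = element.text or ""
    return fields
```

For the example in this post, the returned dictionary would include a clientName entry holding whatever was last typed into that control in Word.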
Limitations
This type of data merging / generation is only really suitable for simple text insertion. If you have repeating data, for example, you will probably need to use an XSL approach.