In one of our projects, we ran into a use case in which we had to process zipped XML files. The zipped XML files were stored in one big zip file.
With the basic functionality of the WSO2 ESB (WSO2 Enterprise Service Bus, sometimes spelled as WS02), one might consider the following steps:
- Unzip the big zip file outside the WSO2 ESB, giving us a bunch of smaller zip files.
- Unzip all the smaller zipped XML files outside the WSO2 ESB too, leaving us with the XML files
- Process all the XML files in the WSO2 ESB
- Clean up the temporary files we created
Sounds complicated and cumbersome, doesn’t it? A better approach would be to do all these steps inside the WSO2 ESB, but how?
Well, the answer is quite simple: use a custom mediator. I created an UnzipMediator which opens a zip file and iterates over all entries in it, just like an Iterate mediator does.
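The core idea, walking over every entry of a zip archive and handing each one to a processing step, can be sketched in plain Java with `java.util.zip`. This is only an illustration and not the actual UnzipMediator code: the class name `ZipWalker`, the method `forEachEntry` and the callback are hypothetical, and the real mediator would pass each entry to its inner sequence instead of a `BiConsumer`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.BiConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of the UnzipMediator itself.
class ZipWalker {

    // Reads a zip archive from a stream and calls the handler once per file entry,
    // passing the entry name and its uncompressed bytes.
    public static void forEachEntry(InputStream zipped, BiConsumer<String, byte[]> handler)
            throws IOException {
        try (ZipInputStream zin = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content to process
                }
                handler.accept(entry.getName(), readCurrentEntry(zin));
            }
        }
    }

    // ZipInputStream.read returns -1 at the end of the current entry,
    // so this loop collects exactly one entry's content.
    private static byte[] readCurrentEntry(ZipInputStream zin) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = zin.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```

Because the archive is read from a stream and every entry is delivered as a byte array, a nested zip file can be handled by calling `forEachEntry` again on a `ByteArrayInputStream` wrapped around that entry's bytes, without writing temporary files to disk.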
Let’s take it one step further: what if we have to process files containing a mixture of plain XML files and zipped XML files? Of course I should educate the customer … but that will not always take away the problem.
Inspired by the ‘magic number’ concept (see https://en.wikipedia.org/wiki/Magic_number_%28programming%29), I created a class MagicEntry, which can be included in the definition of the Unzip mediator and used in the sequence inside the Unzip mediator to act differently on different types of files in the zip file.
The Unzip mediator will try to find a match for every file inside the zip file. When the first bytes of the file match the content of the <ynl:magic> element, it knows how to pass the content of the file to the target’s sequence:
- When the content is XML, we want the ESB to process it as XML, so you can use XPath expressions to iterate over and extract data from the XML. Therefore we need to pass the XML as an Axiom OMElement to the inner sequence.
- When the content is binary, for instance another zip file or an image, we still want the ESB to be able to process it. Therefore we must pass it as a Base64-encoded binary payload, because that’s the way the ESB processes binary messages.
- One thing I struggled with while passing the binary data to the inner sequence was the ESB throwing a ‘java.lang.RuntimeException: ContentID is null’ at me when I tried to save the binary content to disk using a VFS endpoint. WSO2 development support helped me out on this one: I had to call setOptimize(true) on the OMText object (see code).
- When the content is text, it is passed as text to the inner sequence.
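The matching step described above can be sketched as follows. This is a minimal illustration under assumptions, not the real MagicEntry class: the `Magic` class, its fields and the `firstMatch` and `toBase64` helpers are hypothetical names, and the byte prefixes used in the usage note (`<?xml` for XML, `PK\3\4` for zip) are just common examples of magic numbers.

```java
import java.util.Base64;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of magic-number matching; the real MagicEntry may differ.
class MagicMatcher {

    enum PayloadType { XML, TEXT, BINARY }

    // Stand-in for a <ynl:magic> definition: leading bytes, mimetype and payload type.
    static final class Magic {
        final byte[] prefix;
        final String mimeType;
        final PayloadType payloadType;

        Magic(byte[] prefix, String mimeType, PayloadType payloadType) {
            this.prefix = prefix;
            this.mimeType = mimeType;
            this.payloadType = payloadType;
        }

        // True when the content starts with exactly the configured bytes.
        boolean matches(byte[] content) {
            if (content.length < prefix.length) {
                return false;
            }
            for (int i = 0; i < prefix.length; i++) {
                if (content[i] != prefix[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    // Returns the first configured entry whose prefix matches the file content, if any.
    public static Optional<Magic> firstMatch(List<Magic> entries, byte[] content) {
        return entries.stream().filter(m -> m.matches(content)).findFirst();
    }

    // Binary content is Base64-encoded before it is placed in the message.
    // In the ESB this text would be wrapped in an Axiom OMText, with
    // setOptimize(true) called on it as described in the list above.
    public static String toBase64(byte[] content) {
        return Base64.getEncoder().encodeToString(content);
    }
}
```

With two entries configured, one for the prefix `<?xml` marked as XML and one for the bytes `0x50 0x4B 0x03 0x04` marked as BINARY, a nested zip file resolves to BINARY and would be passed on via `toBase64`, while a file that matches no entry yields an empty `Optional`.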
The Unzip mediator does a few more things: it sets a couple of properties in the context that is passed to the inner sequence:
- UNZIPPED.FILENAME : the filename of the current entry being processed
- UNZIPPED.MIMETYPE : the mimetype of the matched Magic entry
- UNZIPPED.PAYLOADTYPE : XML, TEXT or BINARY, also determined by the matched Magic entry
Inside the inner sequence, we can decide how to process the payload (= the content of the file) based on the values of these properties.
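In the ESB this decision is made in the sequence configuration, for example with a switch or filter mediator on the property values. The Java below only models that decision logic so it can be read and tested in isolation. The property names come from the list above; the class `PayloadDispatcher`, the helper `propertiesFor` and the route names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the routing decision made inside the inner sequence.
class PayloadDispatcher {

    // Builds the three properties the Unzip mediator sets for each entry.
    public static Map<String, String> propertiesFor(String fileName, String mimeType,
                                                    String payloadType) {
        Map<String, String> props = new HashMap<>();
        props.put("UNZIPPED.FILENAME", fileName);
        props.put("UNZIPPED.MIMETYPE", mimeType);
        props.put("UNZIPPED.PAYLOADTYPE", payloadType);
        return props;
    }

    // Picks a processing route based on the payload type; the route names are made up.
    public static String route(Map<String, String> props) {
        String type = props.getOrDefault("UNZIPPED.PAYLOADTYPE", "");
        switch (type) {
            case "XML":
                return "process-xml";    // e.g. extract data with XPath
            case "BINARY":
                return "store-binary";   // e.g. unzip again or write to disk
            case "TEXT":
                return "process-text";
            default:
                return "skip";           // no Magic entry matched
        }
    }
}
```

A nested zip file would arrive with UNZIPPED.PAYLOADTYPE set to BINARY and could then be routed to a branch that runs the Unzip mediator again, which is how the mixture of plain and zipped XML files can be handled.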
To be able to do all this zip magic, we need a couple of classes:
- Of course we need the UnzipMediator class; this is the class that does all the work.
- To be able to use the UnzipMediator by adding a <ynl:unzip> to the sequence, we need the UnzipMediatorFactory. This factory creates the UnzipMediator when the WSO2 ESB is parsing the sequence configuration.
- When the source of a sequence is shown in the admin UI, serializers are used to recreate the XML document. To serialize the UnzipMediator we need the UnzipMediatorSerializer.
- When the UnzipMediatorFactory finds <ynl:magic> elements while parsing the sequence.xml, MagicEntry objects are added to a list inside the mediator. (We don’t need a MagicEntrySerializer because the UnzipMediatorSerializer serializes the MagicEntry objects too.)
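Besides the Java code, we need a few more files:
- A service provider file for the factory (in Synapse this is META-INF/services/org.apache.synapse.config.xml.MediatorFactory). This file tells the ESB which class is a MediatorFactory.
- A service provider file for the serializer (in Synapse this is META-INF/services/org.apache.synapse.config.xml.MediatorSerializer). This file tells the ESB which class is a MediatorSerializer.
- The Maven build config file (pom.xml), which tells Maven how to create the jar that we will deploy to the WSO2 ESB when the build is completed.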
To use this mediator in your WSO2 ESB, you have to:
- Build the UnzipMediator jar:
  - Unzip the zip file
  - Use Maven to build the package
- Deploy the resulting jar:
  - Copy the UnzipMediator-<version>.jar to $ESB_HOME/repository/components/dropins
  - Restart the ESB
- Include the xmlns:ynl="https://www.yenlo.com/wso2/mediators" namespace in the definition of your sequence(s).
- Add the <ynl:unzip> mediator to your sequence(s).