An application sends data updated by users to several dependent systems (via a different mechanism per target), each of which has its own maintenance schedule. When a dependent system is down, updates are lost or delayed while someone cleans up manually.
Solution (attempted):
Use Apache Synapse to mediate between the source and target systems. Set it up to queue messages (a "dead letter queue") when a target system is unavailable, and deliver the queued messages once the target comes back up (the system retries sending on a scheduled timer). One reason this solution is attractive is its potential transparency to the application (no code changes to the client).
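To make the intended pattern concrete, here is a minimal sketch of store-and-forward with scheduled retry. This is not Synapse itself or its API - the class and method names (`DeadLetterRetry`, `Sender`, `retryNow`) are illustrative only - it just shows the behavior I wanted the mediator to provide:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of the dead-letter-queue-with-timed-retry pattern (illustrative
// names, not Synapse classes): failed sends land in a queue and a timer
// periodically retries them against the target.
class DeadLetterRetry {
    private final BlockingQueue<String> deadLetters = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    // Stands in for the real delivery mechanism; returns false on failure.
    interface Sender { boolean send(String message); }

    void enqueue(String message) { deadLetters.offer(message); }

    // Drain and retry; anything that still fails goes back on the queue.
    int retryNow(Sender sender) {
        String msg;
        int delivered = 0;
        while ((msg = deadLetters.poll()) != null) {
            if (sender.send(msg)) {
                delivered++;
            } else {
                deadLetters.offer(msg); // target still down
                break;                  // stop until the next tick
            }
        }
        return delivered;
    }

    // Run the retry pass on a fixed schedule.
    void start(Sender sender, long periodSeconds) {
        timer.scheduleAtFixedRate(() -> retryNow(sender),
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    int pending() { return deadLetters.size(); }

    void stop() { timer.shutdownNow(); }
}
```

The appeal of doing this in a mediator like Synapse rather than in code is exactly the transparency mentioned above: the client keeps calling what it thinks is the target service.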
Initial technology involved: Synapse 2.1.0, CXF 2.4.2 client, MS FIM 2010 & IIS hosted target web service
Solution (implemented):
I ended up implementing several important additions to the client application, plus some minor changes; Synapse was not used at all.

The first addition was functionality that verifies whether a WSDL URL returns data (no network connection failure and an HTTP 200 response). This was integrated into some existing "service checking" code which runs in a regularly scheduled thread and manages the state transitions of flags that the client application consults to determine whether services are available, without making DB or HTTP calls during each end-user activity.

The second major addition was a queue/web service proxy (implementing the same interface as the existing service) which holds data structures representing the data provided by the caller of the target web service. The caller was modified to check the WSDL service status flag on each call; if the flag indicated the service was unavailable, it retrieved a Spring-based singleton for the queue/proxy instance. Also, if the client received a network IO type exception while the service flag indicated the service was available, the client performs the call again, this time against the queue/proxy.

The logic above results in either successful calls to the final target web service or queued-up data for future retries. To process any queued calls, the "service checking" thread (on a transition from WSDL URL down to up) tells the queue-based proxy to use the normal client to send the queued messages - they either succeed or stay in the queue if service/network connectivity issues remain. I may document this a bit better here later.
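The two additions can be sketched roughly as below. This is a simplified illustration, not the actual code - `UserUpdateService`, `ServiceStatus`, and `QueueingProxy` are hypothetical names, and the real implementation wires the proxy up as a Spring singleton and carries richer data structures than a string:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical service interface shared by the real CXF client and the proxy.
interface UserUpdateService {
    void sendUpdate(String update);
}

// Addition 1: WSDL availability check, run from the scheduled
// "service checking" thread, which maintains the status flag.
class ServiceStatus {
    final AtomicBoolean wsdlUp = new AtomicBoolean(false);

    // True only if the WSDL URL answers with HTTP 200 (no connection failure).
    boolean checkWsdl(String wsdlUrl) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(wsdlUrl).openConnection();
            conn.setConnectTimeout(3000);
            conn.setReadTimeout(3000);
            boolean ok = conn.getResponseCode() == HttpURLConnection.HTTP_OK;
            wsdlUp.set(ok);
            return ok;
        } catch (IOException e) {
            wsdlUp.set(false);
            return false;
        }
    }
}

// Addition 2: a proxy implementing the same interface as the real client.
// It holds the caller's data while the target is down and replays it on a
// down-to-up transition.
class QueueingProxy implements UserUpdateService {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    @Override
    public void sendUpdate(String update) {
        pending.add(update); // queue instead of calling the unavailable target
    }

    // Called by the checker thread when the WSDL transitions from down to up.
    void flush(UserUpdateService realClient) {
        String update;
        while ((update = pending.poll()) != null) {
            try {
                realClient.sendUpdate(update);
            } catch (RuntimeException e) {
                pending.add(update); // still failing; keep it for the next retry
                break;
            }
        }
    }

    int pendingCount() { return pending.size(); }
}
```

The caller's per-call decision is then just: if the flag says the service is down (or a network IO exception occurs despite the flag saying up), route the same call to the `QueueingProxy` instead of the real client.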
Notes regarding attempt with Synapse:
- The Synapse docs are not great, and a number of classes and settings are poorly named or have various inconsistencies. There are a good number of examples, but they tend to demonstrate only very simple situations. I found very few third-party references to Synapse, which makes me wonder how widely it is used in production applications. I wish there were a greater variety of more complex examples and better documentation. This may not be a good long-term solution overall, but it could fit a near-term tactical need if it worked.
- The most time consuming aspect was trying to get the CXF client to communicate with Synapse in certain required use cases. The CXF client utilizes the exposed WSDL and did not like what was being produced by Synapse.
- A better detailed explanation is:
- CXF did not like the default WSDL returned by Synapse - which was really a modified version of what the live FIM web service returned.
- The use case requirement was that the CXF client must be able to communicate with the Synapse proxy when the final target web service is not running (implying that the target FIM web service isn't returning data for a WSDL request).
- The normal Synapse samples appear to require the target service to return WSDL. There were some docs/samples mentioning the Synapse mechanism for serving a hard-coded WSDL.
- The solution that came nearest to succeeding was to produce a compatible WSDL using wsdl2soap and force Synapse to return it. The problem I ran into, though, was that some imported schema references still pointed at the FIM/IIS server, which would not be available in this use case. I gave up trying to rework the WSDL to get around the imported parts. Getting to that point took about 2 days, and the production solution was needed in only about a week.
- It took 3-4 days to get the basic proxy with in-memory "dead letter" queue support working. A bit embarrassing: I missed something in the docs and did not spend enough time in the samples, which cost me about 2-3 hours fighting an incorrect URL for the proxy in some initial prototyping. Synapse uses Axis2 and doesn't allow custom URLs, which is somewhat annoying. While testing the target-service-down use case, I ran into null pointer exceptions in the Synapse code. I tracked that down, and a minor change to their source plus a local build got me farther, but then Synapse reported errors about the configuration being in the wrong format as it prepared to reprocess messages on the transition from target service down to up. After about 5 minutes of looking at the Synapse source producing the error, I had to make the call that this was not the way to go at this time. I may try to submit the couple of changes I made back to Synapse (if my management is OK with it). I wish this had been more straightforward.
- Some further googling on general "dead letter" handling in many systems returned lots of similar reports of people trying to do something similar and the technology they used not fully supporting it. I know I saw a thread indicating changes had been checked into some project (Synapse, ServiceMix or Camel??) that sounded like it would come closer to handling this use case, but I don't think a release is out yet. I don't remember which one it was at the moment. Will have to revisit in the future - I know this type of need will occur again.
- Apache ServiceMix
- Took too much time going through docs; feature rich but complex
- Not sure we are ready to tackle OSGi
- Plan to review further later
- Apache Camel
- Usable from ServiceMix or directly
- Started looking into this but research incomplete
- Mule ESB
- Need to review license/user agreement further; don't want to get tied up in legal tape
- I think this solution requires attribution to the original authors (and I am all for that), but I am not sure how to get management/legal authorization for that type of change. Not sure the solution warrants the pain of working through the red tape.
- Not sure I want to maintain an Erlang install on servers
- WebMethods
- We own this but are trying to migrate off due to ridiculous cost for only minor benefit
- Reasonably easy to use
- Oracle ESB
- We own this as well but cost is excessive
- Resource intensive & complex
- Forces certain architectural aspects to meet our needs which increases overall cost
- WSO2 ESB
- Need to review license/user agreement; don't want to get tied up in legal tape
Things to consider:
- At some point, we will likely implement some work-flow solutions so we should make sure that any BPEL type technology integrates cleanly into long term technology selections.
- Long term, I expect large amounts of application functionality to end up being shared, most likely by exposing it via web services (OK, I'll say SOA). It doesn't make sense to reinvent the wheel or share only via frameworks.
- With a future out-sourced portal in the plans, it makes a lot of sense to use web services locally and expose only the minimal interface to the provider that will host the portal. This should reduce security exposure by limiting access to secure data to well-defined APIs which can be secured and audited easily.