
Monday, November 17, 2014

Apache ServiceMix - initial thoughts

So you have a really expensive (purchase price) integration product with ridiculously high yearly maintenance fees, in a public-service type organization that isn't out to make money.

What can you do to bring costs down and provide better value to the public/consumers?

In this case, I am evaluating replacing the commercial product with Apache ServiceMix.

My first impression is that there is a lot of value bundled with ServiceMix and in the wide assortment of technologies it works with.  The initial major downside appears to be a lack of equivalent tooling for handling the details of various DB-trigger-based integrations.  The existing commercial tool is more technical-end-user / operations oriented, while ServiceMix is developer oriented.  For this organization, this is a somewhat painful tradeoff, but hopefully methods to reduce the pain will be found over time.

Other items to note: there is a general expectation that Maven is used; the ServiceMix examples involve its use.  I decided to include Eclipse in the mix since a number of people will be involved in this, and I am hoping that a GUI may help in the transition.  I may regret the Eclipse / Maven integration; it has provided a few pain points in getting started.  If there is a budget for more formal training, it would likely be very worthwhile.

Primary aspects of ServiceMix functionality I want to evaluate:
  • OSGi and the impact on the development and operations processes
  • CXF web service implementations
  • Camel routes to tie some of the functionality together
  • ActiveMQ for some internal messaging/high availability/reliability needs
  • Activiti for performing some more complex workflows
I'm not an expert (yet!) in any of these, but I have been slowly working my way through the details.  I've been spending a little time over the last month or so getting a better grasp of the technology and if/how it may fit our needs. Below is just my brief overview - I am planning more in-depth posts as time permits.

OSGi is a lot of things, but it all starts with modularity.  In the case of ServiceMix, Karaf is the underlying core OSGi container.  I am focusing on Karaf 2.4.0 (in ServiceMix 5.3.0) for now; it is a recent release which serves as a bridge across the major differences between 2.3.x and 3.0.x.  It supports the OSGi R5 specification and has fairly recent dependencies.  I am not convinced that we really need some of the features in R5+ right now, but hopefully this version may ease any later upgrade we need to perform.  There are some supporting technologies which I am looking into as well; the Apache Cave and Cellar projects are partly what drew me into taking a closer look.  The Cave project is an OSGi bundle repository; I am still figuring out how it fits in compared to a plain Maven repository or a repository manager such as Artifactory.  The Cellar project is for clustering Karaf, which may be a good method of gaining scalability and availability in a production environment.
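As a concrete sketch, bolting Cellar onto a running Karaf 2.x instance is roughly the following console session (the feature URL and the 2.3.2 version number are my assumptions - use the Cellar release that matches your Karaf version):

```
karaf@root> features:addurl mvn:org.apache.karaf.cellar/apache-karaf-cellar/2.3.2/xml/features
karaf@root> features:install cellar
karaf@root> cluster:node-list
```

The cluster:* commands then operate on the Hazelcast-backed cluster state rather than on the local container alone.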

I have been prototyping a few things but decided to take a chance on a few books to see if they could provide a little extra insight or cover the material more quickly and clearly.  The books in question are:
  • Enterprise OSGi in Action - Manning Publications
  • Learning Karaf Cellar - Packt Publishing
  • Learning Apache Karaf - Packt Publishing
  • Apache Karaf Cookbook - Packt Publishing
Sort of funny but I am reading them in the above order.

I had picked up Enterprise OSGi in Action quite a while back and had skimmed it, but had no time to really put it to use.  I think it is a decent book, and it was my main initial exposure to much of the terminology.  There are things that didn't become clear until I read a few other books/resources.

I found Learning Karaf Cellar to be a reasonable book.  It clarified and confirmed a few things I was wondering about.  It didn't answer all my questions but got me a little further than I was.  One area of clarification was the need for and use of the cellar-eventadmin feature - in my prior testing I had installed it, but the book seems to neglect it.  Most of what I was really looking for (at the moment) was in the last 20 pages or so of the book.

Learning Apache Karaf covered a few items I did not know.  There is overlap with the Cellar book, but that isn't too terrible.  Since I have been playing with Karaf for a bit over a month now, some of the basic command info wasn't needed, but it would be useful if you were starting from nothing.

I'm just poking around the Cookbook for now.  It tries to point out differences between Karaf 2.x and 3.0 but may be more 3.0 focused.  Some of the items are interesting and I may try to evaluate or leverage them.  I will have to read it more in depth before commenting much further on it.

Between my research and prototyping, I am thinking that my desire to use the Distributed OSGi (DOSGi) enterprise functionality may be premature (with regard to remote services mainly).  I think just clustering Karaf with Cellar and putting it behind a load balancer will likely provide most of what I am thinking of for an initial deployment.  I also recognize that the default Cellar setup isn't fully capable of meeting my initial expectations - mainly by not persisting some settings/config to disk.  The Cellar book discusses that aspect of its use of Hazelcast a bit, which is nice, but I will likely have to find some Hazelcast-specific documentation to get the details I need.  I am still trying to work out if/how declarative services should fit into my plans (versus Spring or Blueprint).

On to CXF: I have a service and client in production using this, but they are outside of any OSGi environment.  I have prototyped a couple of other services in OSGi with CXF and am very pleased with some aspects.  One big plus is with PeopleSoft; I must be able to utilize multiple versions of the PeopleSoft-provided API jars for accessing application servers.  OSGi allows me to do that, letting me create services targeting multiple PeopleSoft instances with different PeopleTools versions and deploy those services in the same process.
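The side-by-side versioning can be sketched as two bundle manifests, assuming the PeopleSoft API jars are wrapped as versioned OSGi bundles - the bundle names, the psft.pt8.joa package, and the version numbers here are illustrative assumptions, not the real artifacts:

```
Bundle-SymbolicName: ps-hr-service-tools853
Import-Package: psft.pt8.joa;version="[8.53,8.54)"
```

```
Bundle-SymbolicName: ps-hr-service-tools854
Import-Package: psft.pt8.joa;version="[8.54,8.55)"
```

The framework wires each service bundle to the matching wrapped API bundle, so both versions can coexist in one Karaf process.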

For Camel, I am still working out how to best utilize it.  I think that creating a number of fine-grained services in place of the few beast-like monolithic services is a better way to go.  With those in place, I can use Camel to tie them together to match existing data flows.  Combined with OSGi, there is a distinct dynamism which we don't currently have and which will hopefully speed up turn-around on changes.
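As a rough sketch of that idea, a Blueprint-deployed Camel route could front one fine-grained CXF service and pass the message on to others; the endpoint bean names below are hypothetical:

```xml
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
  <camelContext xmlns="http://camel.apache.org/schema/blueprint">
    <route id="person-update-flow">
      <!-- receive the update from one fine-grained service endpoint -->
      <from uri="cxf:bean:personUpdateEndpoint"/>
      <!-- tie the small services together to match the existing data flow -->
      <to uri="cxf:bean:directoryServiceClient"/>
      <to uri="cxf:bean:notificationServiceClient"/>
    </route>
  </camelContext>
</blueprint>
```

Because the route lives in its own bundle, the flow can be changed and redeployed without touching the services themselves.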

I am still considering ActiveMQ for our environment.  It is a tough call whether the cost of potential lost transactions is higher than the implementation and operations cost of using ActiveMQ.  I am really torn on this; it would probably benefit our users on likely rare occasions but would incur a cost to manage/maintain.  Still an item of active consideration.

There are a few places where real workflow would just make a lot of sense and using appropriate tools likely will be a big benefit here versus just codifying it.  That is the reason I would like to look into Activiti.  It will not do much without changes to some of our applications and those are likely big changes.  This is a longer term target.

Adopting something like this will have an effect on our development processes/environment, testing environments, production setups and deployment/maintenance methods.  I'm planning on trying to document some of the details of processes and procedures I am going through during my research and planning. 

God Bless!
Scott

Monday, July 2, 2012

Web Services, Service Buses and such

Problem:
An application sends data updated by users to several other dependent systems (via different mechanisms per target) where each has its own maintenance schedule.  The downtime of dependent systems results in lost updates or delays during manual cleanup.

Solution (attempted):
Use Apache Synapse to mediate between source and target systems.  Set it up to queue messages ("dead letter queue") when the target system is unavailable.  The queued messages are delivered once the target system becomes available (the system retries sending messages on a scheduled timer).  A reason this particular solution is attractive is its potential transparency to the application (no code change to the client application).

Initial technology involved:  Synapse 2.1.0, CXF 2.4.2 client, MS FIM 2010  & IIS hosted target web service

Solution (implemented):
I ended up implementing several important additions to the client application along with some minor changes; Synapse was not used at all.
  1. The first addition was functionality which verifies whether a WSDL URL returns data (no network connection failure and an HTTP 200 response).  This was integrated into some existing "service checking" code which runs in a regularly scheduled thread and manages the state transitions of some flags; the client application references these flags to determine if services are available without making DB or HTTP calls during each end-user activity.
  2. The second major addition was a queue/web services proxy (implementing the same interface as the existing service) which holds data structures representing the data provided by the caller of the target web service.  The caller was modified to check the WSDL service status flag on each call; if the flag indicated the service was unavailable, it retrieved a Spring-based singleton for the queue/proxy instance.  Also, if the client received a network IO type exception while the service flag indicated the service had been available, the client performs the call again but against the queue/proxy.  This logic results in either successful calls to the final target web service or queued-up data for future retry calls.
  3. To process any queued-up calls, the "service checking" thread (on a transition from WSDL URL down to URL up) tells the queue-based proxy to use the normal client to send the queued messages - so they either succeed or stay in the queue if there are service/network connectivity issues.
I may document this a bit better here later.
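The queue/proxy part of this can be sketched in plain Java.  This is a minimal, hypothetical illustration - the UpdateService interface and the String payload stand in for the generated CXF client classes and the real data structures:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Stand-in for the generated CXF service interface (hypothetical).
interface UpdateService {
    boolean send(String payload);
}

// Queue-backed proxy: implements the same interface as the real client,
// holds payloads while the target service is flagged down, and replays
// them when the service-checking thread reports a down -> up transition.
class QueuingProxy implements UpdateService {
    private final Queue<String> pending = new ArrayDeque<>();
    private final UpdateService realClient;

    QueuingProxy(UpdateService realClient) {
        this.realClient = realClient;
    }

    @Override
    public synchronized boolean send(String payload) {
        pending.add(payload); // service is down: queue instead of calling out
        return true;          // the caller proceeds as if the call succeeded
    }

    // Invoked by the scheduled service-checking thread on a down -> up transition.
    synchronized void flush() {
        while (!pending.isEmpty()) {
            if (realClient.send(pending.peek())) {
                pending.poll();  // delivered; remove from the queue
            } else {
                break;           // still failing; keep the rest for a later retry
            }
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

In the real implementation the caller consults the WSDL-status flag to choose between the normal client and this proxy, and falls back to the proxy when a network IO exception arrives while the flag still reads "up".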

Notes regarding attempt with Synapse:
  1. The Synapse docs are not great, and a number of classes and settings are poorly named or have various inconsistencies.  There are a good number of examples, but they tend to only demonstrate very simple situations.  I found very few third-party references to Synapse, which makes me wonder how widely it is used in production applications.  I wish there was a greater variety of more complex examples and better documentation.  This may not be a good long-term solution overall but could fit a near-term tactical need if it worked.
  2. The most time-consuming aspect was trying to get the CXF client to communicate with Synapse in certain required use cases.  The CXF client utilizes the exposed WSDL and did not like what was being produced by Synapse.
    1. A better detailed explanation is:
      1. CXF did not like the default WSDL returned by Synapse - which was really a modified version of what the live FIM web service returned.
      2. The use case requirement was that the CXF client must be able to communicate with the Synapse proxy when the final target web service is not running (implying that the target FIM web service isn't returning data for a WSDL request).
    2. The normal Synapse samples appear to require the target service to return WSDL.  There was some documentation/samples mentioning the Synapse way of producing a hard coded WSDL.
    3. The solution which was nearest to being successful was to produce a compatible WSDL using wsdl2soap and force Synapse to return that WSDL.  The problem I ran into, though, is that some imported schema references were still referencing the FIM/IIS server, which would not be available in this use case.  I gave up trying to rework the WSDL to get around parts being imported.  Getting to that point took about 2 days of time, and the need for a production solution was only about a week away.
  3. It took 3-4 days to get the basic proxy with in-memory "dead letter" queue support working.  A bit embarrassing, but I missed something in the docs and did not spend enough time in the samples, which resulted in about 2-3 lost hours fighting an incorrect URL for the proxy in some initial prototyping.  Synapse uses Axis2 and doesn't allow custom URLs, which is somewhat annoying.  During my testing of the target-service-down use case, I was running into null pointer exceptions in the Synapse code.  I was able to track that down, and a minor change to their source plus a local build got me farther, but then I was getting Synapse errors about the configuration being in the wrong format as it prepared to reprocess messages when I would transition from target service down to target service up.  After about 5 minutes of looking at the Synapse source producing the error, I had to make the call that this was not the way to go at this time.  I may try to submit the couple of changes I made back to Synapse (if my management is OK with it).  I wish this had been more straightforward.
  4. Some further googling on the general "dead letter" handling in many systems returned lots of similar reports of people trying to do something similar and the technology used not fully supporting it.  I know I saw some thread which indicated changes had been checked into some project (Synapse, ServiceMix or Camel??) which sounded like they would come closer to handling this use case, but I don't think a release is out yet.  Don't remember which one it was at the moment.  Will have to revisit in the future - I know this type of need will occur again.
Technology considered:
  • Apache ServiceMix
    • took too much time going through docs; feature rich but complex
    • Not sure we are ready to tackle OSGI
    • Plan to review further later
  • Apache Camel
    • Usable from ServiceMix or directly
    • Started looking into this but research incomplete
  • Mule ESB
    • Need to review license/user agreement further; don't want to get tied up in legal tape
      • I think this solution requires attribution to original authors (and I am all for that) but I am not sure of how to get management/legal authorization to make that type of change.  Not sure the solution warrants the pain of trying to work through the red tape. 
    • Not sure I want to maintain an Erlang install on servers
  • WebMethods
    • We own this but are trying to migrate off due to ridiculous cost for only minor benefit
    • Reasonably easy to use
  • Oracle ESB
    • We own this as well but cost is excessive
    • Resource intensive & complex
    • Forces certain architectural aspects to meet our needs which increases overall cost
  • WSO2 ESB
    • Need to review license/user agreement; don't want to get tied up in legal tape
Several others were reviewed, but I don't remember which ones off the top of my head.

Things to consider:
  1. At some point, we will likely implement some work-flow solutions so we should make sure that any BPEL type technology integrates cleanly into long term technology selections.  
  2. Long term, I expect large amounts of various application functionality to end up being shared.  I expect this sharing to likely be done by exposing it via web services (OK, I'll say SOA).  It doesn't make sense to reinvent the wheel or share only via frameworks.
  3. With a future out-sourced portal in the plans, it makes a lot of sense to use web services locally and only expose the minimal interface to the provider that will host the portal.  This should reduce the security exposure by limiting secure data access to well defined API's which can be secured and audited easily.