Wednesday, November 7, 2012

Caching quagmire

Views on caching differ drastically, ranging from "it is a crutch for poor designs" to "it is a key design aspect that supports scalability in big data environments."  I take a practical view: use what makes the most sense for the problem at hand.

My particular experience mostly involves two cases.  In the first, an application was created that did not scale well, and a lack of time to reengineer it left caching as the most expedient solution.  The second involves an ERP system that was stretched to its limits on fairly expensive hardware, leaving very little capacity for data manipulation by externally integrated applications.  Limited money combined with substantial growth in the user base led to caching as a way to minimize the impact of non-ERP application overhead.

Memory tends to be cheaper than large quantities of CPU cores, so my organization has invested heavily in low/mid-range servers with memory sizes in the 8-32GB+ range.

Where this takes us at the moment is an ad hoc collection of solutions which includes Memcached, EhCache and JBoss Infinispan.  I find supporting Memcached with a specific Java application of ours overly painful.  Maintenance on servers or unplanned downtime tends to result in data problems.  I can't blame this fully on Memcached - some developers decided to use Memcached like a DB in some functionality, which was a mistake.  EhCache worked OK, but I have always found the documentation lacking, and trying to figure out the feature/licensing differences between it and Terracotta drove me to find something more straightforward.  The "phone home" feature of EhCache is also a little disconcerting, even though there is a way to disable it.  Anyway, I have written a small JSR 107 wrapper that allowed me to convert one application from EhCache to Infinispan without any real difficulty.  I can't say that it made any real performance difference, but that wasn't the problem which needed solving.  The conversion from Memcached to Infinispan is being planned - that has some challenges due to some of the odd ways Memcached is used.  My preference is to keep the Infinispan cache in-process with the application on each cluster node (at least until I can take a broader look at things and determine whether a more grid-like environment serving data to our entire application environment makes sense).  I'll have to post updates as the process evolves.
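
To give a rough idea of what such a wrapper can look like, here is a minimal sketch in Java.  It is not the actual wrapper from our application, and it does not use the real javax.cache API: SimpleCache, MapBackedCache and CacheAside are hypothetical names, and the interface is only loosely modeled on the get/put/remove shape of a JSR 107 cache.  The map-backed class stands in for an adapter that would delegate to EhCache or Infinispan.  The CacheAside helper also illustrates the point about Memcached above: if every miss falls back to the system of record, losing the cache costs a reload rather than data.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical facade the application codes against instead of a vendor API.
interface SimpleCache<K, V> {
    V get(K key);
    void put(K key, V value);
    boolean remove(K key);
    void clear();
}

// Stand-in provider. A real adapter would delegate each call to the
// underlying cache product; this one just uses a ConcurrentHashMap.
final class MapBackedCache<K, V> implements SimpleCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();

    @Override public V get(K key) { return store.get(key); }
    @Override public void put(K key, V value) { store.put(key, value); }
    @Override public boolean remove(K key) { return store.remove(key) != null; }
    @Override public void clear() { store.clear(); }
}

// Cache-aside access: the cache is only ever a copy, and the loader
// (assumed to read from the system of record) is consulted on a miss.
final class CacheAside<K, V> {
    private final SimpleCache<K, V> cache;
    private final Function<K, V> loader;

    CacheAside(SimpleCache<K, V> cache, Function<K, V> loader) {
        this.cache = cache;
        this.loader = loader;
    }

    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            if (value != null) {
                cache.put(key, value); // never cache a missing record
            }
        }
        return value;
    }
}

class CacheWrapperDemo {
    public static void main(String[] args) {
        SimpleCache<String, String> cache = new MapBackedCache<>();
        // The lambda is a placeholder for a database or ERP lookup.
        CacheAside<String, String> customers =
                new CacheAside<>(cache, id -> "customer-" + id);

        System.out.println(customers.get("42")); // miss, loaded from source
        cache.clear();                           // simulate a cache restart
        System.out.println(customers.get("42")); // miss again, reloaded
    }
}
```

With this shape, swapping providers means writing one new SimpleCache implementation; the application code that calls CacheAside does not change.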

I recently took a quick look at some info on Hazelcast, and it has some interesting aspects.  It seemed a little easier to configure (fewer options and some defaults which may make sense for us).  I may take a closer look at it down the road.
