Sunday, September 21, 2014

Jenkins and Groovy - ERP data quality and validation

Many organizations have an ERP and they may manage large amounts of data.  I won't bother debating the tradeoffs of commercial ERP systems but I want to mention some ways I am dealing with certain data issues.

Our ERP is one that has some pretty old roots (and they show their grey) - with the changes in technology, schema, functionality, etc there are plenty of issues in data validation by the system.  Each new vendor maintenance release is like rolling the dice in regards to what validations change (for better or worse).  On top of that, the vendor process for major upgrades doesn't really handle conversion and/or reporting of some data and it is usually tables which really need extra work.  There are also business process and training issues within the distributed organization.  All of this results in many data messes.

Data issues on their own are bad enough but the way the problems tend to manifest for us provide a very bad experience for our customers.  In our case, the vendor technology manages data in a way where some data which is not actively being used (i.e. by an end user) is still operated on and validated as part of some transaction.  The problem is that when the actively used/accessed data is saved, the system also performs validations on the non-actively used data.  If that non-actively used data is invalid then the system throws a tantrum which usually has no useful message for the end-user since the data at fault is not in anyway related to what they did.  This is especially true when you have a custom application accessing the system - useful error responses from the source system are a rarity.

So what do you do?

I am slowly putting Jenkins jobs which perform checks and validations for some of the common problems.  I write the jobs to identify bad data, form a report of some sort and email it to particular users whom I hope are able to direct it to an appropriate person for correction/mitigation, etc.

The solution I am documenting here is the use of Jenkins with Groovy using JDBC to perform a check, report generation and emailing of any failure result.

I am modifying things somewhat to take organization identifying type information out.

import groovy.text.SimpleTemplateEngine
import javax.naming.*
import javax.sql.DataSource
import javax.mail.internet.InternetAddress
import javax.mail.internet.MimeMessage
import javax.mail.*
import javax.activation.*
import groovy.sql.Sql
import static java.util.Calendar.*
 
class PlanValidations

    // Produce a connection to the database
    def connect()
    {
          Context initCtx = new InitialContext();
          DataSource sqlDS = initCtx.lookup("jdbc/connid");
          def sql = new groovy.sql.Sql(sqlDS);
          return sql;
    }

        // An array to contain the column names returned by the query
    def colnames = []
 
    def checkPlanInactivation(sql)
    {
            // The query is looking for a misconfiguration which is typically a failure to follow the proper business process.
            // The correct process for disabling the data item is add a new row with a current effective date which states which
            //  period the data should be disabled starting at.  The query finds each max effective item plus several "lagging" fields for which an outer query compares to determine if the proper setup was used.
         
      def badInactivePlanSQL = "
SELECT *
FROM
  (SELECT INSTITUTION,
    FIRST_PERIOD_VALID,
    PROG,
    PLAN,
    LAST_PERIOD,
    LAG(LAST_PERIOD, 1, ' ')
      OVER (PARTITION BY INSTITUTION, PROG, PLAN ORDER BY EFFDT ASC)
        PREV_LAST_PERIOD,
    EFF_STATUS,
    LAG(eff_status, 1, ' ')
      OVER (PARTITION BY INSTITUTION, PROG, PLAN ORDER BY EFFDT ASC)
        PREV_eff_status,
    effdt,
    LAG(effdt, 1, NULL)
      OVER (PARTITION BY INSTITUTION, PROG, PLAN ORDER BY EFFDT ASC)
        PREV_eff_dt
  FROM PLAN_TBL
  ORDER BY INSTITUTION,
    PROG,
    plan,
    effdt
  )
WHERE EFF_STATUS    = 'I'
AND prev_eff_status = 'A'
AND prev_PERIOD = ' '
AND EFFDT           > SYSDATE 
           ";

   
         // This will be a map of lists  {inst -> [rows]}
      def institutionBadSetup = [:].withDefault {[]} 
            
      sql.eachRow(badInactivePlanSQL,
                  // keep track of the column names returned by the query
         { meta ->
           (1..meta.columnCount).each { colnames << meta.getColumnName( it ).toLowerCase() }
         },
                 // save off the bad data by institution
         {
             institutionBadSetup[it.INSTITUTION] << it.toRowResult()
         }
      )
      return institutionBadSetup
    }

    def doChkInactivations()
    {
         def conn = connect();        
         def invalidPlanSetups = checkPlanInactivation(conn)
                   // take all the results and send off an enail per institution
         invalidPlanSetups.each() {
           institution, lst -> sendEmail(getEmailTo(institution, conn), getSubject(institution), getMsg(institution, lst))
         }     
        
         conn.close();        
    }

        //  Produce an html message with description of problem, related data and hints at fixing.
    def getMsg(institutionId, plans)
      {
       def html = """\
         <html>
         <head></head>
         <body>
         There is a problem with the setup of following plans for $institutionId.<br/><br/> 
       
          *** Provide a message here which describes the problem and provides help in correcting. ***
        <br/><br/>
         """
     
            // use the saved off column names and match up with the field data while constructing the simple report
       plans.each() { plan ->
          (0..colnames.size()-1).each {
            println plan.PLAN + " " + it
            html += "${colnames[it]}[${plan[it]}] "
          }
         html +="<br/><br/>"
       }
     
       html += """\
        </body>
        </html>""";
     
        return html
      }
 
    // Produce a subject for the email
    def getSubject(institutionId)
      {
      def env = System.getenv()
      String environ = env['ENV_TYPE']
      String hostname = env['HOSTNAME']
     
      return "invalid INACTIVE plan setup - institution(${institutionId}) <system name>(${environ}) server(${hostname})"
      }
 
      // Use an existing table of users to provide i
      def getEmailTo(institutionId, sql)
      {
      // hard coded a user [me] as email "to" during R&D
      def emailSql = "select 'me@someplace.org' as cntct_eml from inst_misc where institution = '${institutionId}'"
      def emailAdr = sql.firstRow(emailSql)
      return emailAdr.vx_oa_admcntct_eml
      }
 
    def sendEmail(receiver_email, email_subject, msgtext)
    {       
        def reportdt = String.format('%tY/%<tm/%<td  %<tH:%<tM:%<tS', Calendar.instance);       
 
        println "$msgtext";

        Properties mprops = new Properties();
        mprops.setProperty("mail.transport.protocol", "smtp");
        mprops.setProperty("mail.host", "smtp.somewhere.org");
        mprops.setProperty("mail.smtp.port", "25");
        Session lSession = Session.getDefaultInstance(mprops, null);
        MimeMessage msg = new MimeMessage(lSession);
       
        msg.setText(msgtext, "utf-8", "html");
       
        def me = "sender@somewhere.org";
        StringTokenizer tok = new StringTokenizer(receiver_email, ";");
        
        ArrayList emailTos = new ArrayList();
        while(tok.hasMoreElements())
        {
           emailTos.add(new InternetAddress(tok.nextElement().toString()));
        }
        
        InternetAddress[] to = new InternetAddress[emailTos.size()];
        to = (InternetAddress[]) emailTos.toArray(to);
        msg.setRecipients(MimeMessage.RecipientType.TO,to);
        
        msg.setSubject(email_subject);
        msg.setFrom(new InternetAddress(me));
       
        //Send the message via our own SMTP server, but don't include the envelope header.
        Transport transporter = lSession.getTransport("smtp");
        transporter.connect();
        transporter.send(msg);
     }
}

validationChk = new PlanValidations();
validationChk.doChkInactivations();

No comments:

Post a Comment