Successful Process Automation: A Summary

July 26th, 2010

InformationWeek Analytics (http://analytics.informationweek.com/index) invited me to write about the subject of process automation.  The article, part of their series covering application architectures, was released in July of this year.  It provided an opportunity for me to articulate the key components that are required to succeed in the automation of business processes.

Both the business and IT are positioned to make-or-break the use of process automation tools and techniques. The business must redefine its processes and operational rules so that work may be automated.  IT must provide the infrastructure and expertise to leverage the tools of the process automation trade.

Starting with the business, there must be clearly defined processes by which work gets done.  Each process must be documented, including the points where decisions are made.  The rules for those decisions must then be documented.  Repetitive, low-value and low-risk decisions are immediate candidates for automation.

Sustainable, meaningful value from process automation depends on reaching a key milestone, measured in terms of Straight Through Processing (STP).  STP requires that work arrive from a third party and be processed automatically, returning a final decision and the necessary output (letter, claim payment, etc.) without a person handling the work.

Most businesses begin using process automation tools without achieving any significant STP rate.  This is fine as a starting point so long as the business reviews the manual work, identifies groupings of work, focuses on the largest groupings (large may be based on manual effort, cost or simple volume) and looks to automate the decisions surrounding that group of work.  As STP is achieved for some work, the review process continues as more and more types of work are targeted for automation.
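
The review loop described above can be sketched in a few lines of Python.  This is only an illustration, assuming each work item records its type, whether a person handled it, and the manual minutes spent; the work types and numbers are hypothetical and not taken from the article.

```python
from collections import Counter

# Hypothetical work items: (work_type, handled_manually, manual_minutes)
work_items = [
    ("address_change", False, 0),
    ("address_change", True, 5),
    ("claim_payment", True, 30),
    ("claim_payment", True, 25),
    ("claim_payment", False, 0),
    ("policy_quote", True, 10),
]


def stp_rate(items):
    """Fraction of items completed with no person involved."""
    if not items:
        return 0.0
    automated = sum(1 for _, manual, _ in items if not manual)
    return automated / len(items)


def manual_effort_by_type(items):
    """Total manual minutes per work type, largest grouping first."""
    effort = Counter()
    for work_type, manual, minutes in items:
        if manual:
            effort[work_type] += minutes
    return effort.most_common()


print(f"STP rate: {stp_rate(work_items):.0%}")
for work_type, minutes in manual_effort_by_type(work_items):
    print(f"{work_type}: {minutes} manual minutes")
```

Here the groupings are ranked by manual effort; ranking by cost or simple volume would only change what gets summed.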

The end goal of process automation is to have people involved in truly exceptional, high-value, high-risk, business decisions.  The business benefits by having people attend to items that truly matter rather than dealing with a large amount of background noise that lowers productivity, morale and client satisfaction.

All of this is great in theory but requires an information technology infrastructure that can meet these business objectives.

Clam Festival 2010

July 19th, 2010

We made our annual trek to Yarmouth, Maine for this year’s Clam Festival.  We had signed up to participate in The Levity Project – Maine so we had to be there by 4:30 pm on Friday.  This was the first time we’ve had a fixed schedule when heading up and one of the only years we have been there in time for the parade.

After parking we rushed past the Food Circle in order to get to the North Yarmouth Academy Gymnasium and receive our instructions and umbrella hats for “Maine’s Longest Smile” being organized by The Levity Project.  Although temperatures as we drove through Massachusetts and New Hampshire and up into Maine were in the 90s, by the time we got to Yarmouth it was about 75.  The gym was another story, hot and humid, but full of festivity!

Hippity-hop balls were being test driven by a variety of people while others were testing out the new hats.  We signed in, filled out a photo release and started reviewing the instructions we had been given.  By 5pm we were being walked through the overall plan and were ready to head out to our assigned locations by 5:30.  Before getting into position I snuck by the First Parish Congregation Church for my annual lobster roll.  They make a perfect lobster roll – a bun and lots of lobster meat!  Nothing else to distract from the delicious lobster flavor.

Destination Reached: CISSP

July 2nd, 2010

I am happy to report that I have been awarded the Certified Information Systems Security Professional (CISSP) by the International Information Systems Security Certification Consortium [(ISC)²].

I started pursuing the certification in mid-2009, got serious about studying early this year (2010), took the exam in late April, was notified that I passed and had my background endorsed in May, had to update my resume for an auditor in early June and was awarded the CISSP designation at the end of June.

I felt that this certification was important both professionally and personally.

Professionally, the certification serves as a validation that I have a solid and broad understanding of information systems’ security.  People who have worked with me know that I have been focused on IS security for many years.

Whether performing security-centered code reviews, fixing flawed implementations or teaching designers and developers how to improve the security of their systems, I have been on a mission to mentor and train people to observe effective security practices and principles.  I’ve also had operational responsibility for system infrastructures.  With that experience I was able to pass GIAC’s GSEC and Red Hat’s RHCE exams several years ago.

Personally, the process of studying and passing the exam allowed me to pursue and attain a non-trivial goal.  I am enrolled and taking classes toward my master’s degree, but completing that work will require several more years of part time attendance.  Setting and achieving intermediate goals helps to keep me focused and learning.

If you are wondering what the CISSP is all about, please read on.

My First Semantic Web Program

June 5th, 2010

I have created my first slightly interesting, to me anyway, program that uses some semantic web technology.  Of course I’ll look back on this in a year and cringe, but for now it represents my understanding of a small set of features from Jena and Pellet.

The basis for the program is an example program that is described in Hebeler, Fisher et al.’s book “Semantic Web Programming” (ISBN: 047041801X).  The intent of the program is to load an ontology into three models, each running a different level of reasoner (RDF, RDFS and OWL), and output the resulting assertions (triples).

I made a couple of changes to the book’s sample’s approach.  First I allow any supported input file format to be automatically loaded (you don’t have to tell the program what format is being used).  Second, I report the actual differences between the models rather than just showing all the resulting triples.
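
The idea of reporting only the differences between models can be illustrated without any semantic web library.  The sketch below is plain Python, not Jena or Pellet and not the program described in this post: it holds triples as tuples, applies two RDFS-style rules (subclass transitivity and type inheritance) as a toy stand-in for a reasoner, and prints only the triples the reasoning added.  The `ex:` names are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_like_closure(triples):
    """Return the triples plus those entailed by two RDFS-style rules.

    This is a toy reasoner for illustration only.
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == SUBCLASS and o == s2:
                    if p == SUBCLASS:
                        new.add((s, SUBCLASS, o2))  # subclass transitivity
                    elif p == TYPE:
                        new.add((s, TYPE, o2))  # instance of superclass
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


def added_by_reasoner(base, reasoned):
    """Triples present after reasoning that were not asserted."""
    return sorted(reasoned - base)


asserted = {
    ("ex:AutoPolicy", SUBCLASS, "ex:Policy"),
    ("ex:Policy", SUBCLASS, "ex:Contract"),
    ("ex:policy123", TYPE, "ex:AutoPolicy"),
}

for triple in added_by_reasoner(asserted, rdfs_like_closure(asserted)):
    print(triple)
```

Comparing three models (no inference, RDFS, OWL) amounts to running the same set difference between each adjacent pair.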

As I worked on the code, which is currently housed in one uber-class (that’ll have to be refactored!), I realized that there will be lots of reusable “plumbing” code that comes with this type of work.  Setting up models with various reasoners, loading ontologies, reporting triples, interfacing to triple stores, and so on will become nuisance code to write.

Libraries like Jena help, but they abstract at a low level.  I want a semantic workbench that makes playing with the various libraries and frameworks easy.  To that end I’ve created a Sourceforge project called “Semantic Workbench”.

I intend for the Semantic Workbench to provide a GUI environment for manipulating semantic web technologies. Developers and power users would be able to use such a tool to test ontologies, try various reasoners and validate queries.  Developers could use the workbench’s source code to understand how to utilize frameworks like Jena or reasoner APIs like that of Pellet.

I invite other interested people to join the Sourceforge project. The project’s URL is: http://semanticwb.sourceforge.net/

On the data side, in order to have a rich semantic test data set to utilize, I’ve started an ontology that I hope to grow into an interesting example.  I’m using the insurance industry as its basis.  The rules around insurance and the variety of concepts should provide a rich set of classes, attributes and relationships for modeling.  My first version of this example ontology is included with the sample program.

Finally, I’ve added a semantic web section to my website where I’ll maintain links to useful information I find as well as sample code or files that I think might be of interest to other developers.  I’ve placed the sample program and ontology described earlier in this post on that page along with links to a variety of resources.

My site’s semantic web page’s URL is: http://monead.com/semantic/
The URL for the page describing the sample program is: http://monead.com/semantic/proj_diffinferencing.html

Angels and Saints, Patience Please

May 31st, 2010


“I like the silent church before the service begins, better than any preaching.” -Ralph Waldo Emerson

This quote has been running through my head a lot as I’ve been spending time alone in church.  Our church music director will be away for the first three Sundays in June and asked if I would be willing to take the reins during his absence.  This isn’t the first time I’ve done this, but I believe it is the longest stint I’ve had.

My initial thinking, and usual approach, when playing the service is to use the piano to accompany everything.  This is where I feel safest.  I spend a lot of time at the piano, working with the Children’s Choir, rehearsing with the Brass Kickers and accompanying myself for a variety of solos and duets.

However, I’ve begun to feel compelled to use the organ.  Its variety of colors, range of tones and wide dynamic range cannot be approached by the piano.  Although I love the sound of a piano accompanying a solo voice, the organ adds significant sonic breadth, especially when accompanying hymns.

In fact, I believe that it is the flatness of verse after verse of a hymn played on the piano that has continually pushed me to move out of my personal comfort zone and explore the organ as a more versatile and ultimately more appropriate instrument for such situations.  To be sure, it is now taking me an inordinate amount of time to prepare for a service.

When using the piano, all I needed to do was learn to play the notes.  Now I need to worry about the voicings for each verse.  Looking at the text to suggest color and dynamics adds work.  Basic tasks such as figuring out which manual to use for each verse and configuring piston settings so that they are convenient to access while playing also add complexity for someone who does not use the instrument often.

I have great respect for those that make such planning and preparation look easy.  I cannot imagine doing this week after week, at least not with a separate full time job.  I would guess that over time one would get to know the instrument and have a more organized approach to this process.  For me there is a great deal of experimentation, figuring out which ranks extend into which octaves and which timbres sound well together.

To be sure, it is an amazing experience to fiddle with such decisions and hear the difference in the feeling invoked by a given hymn.  Played with certain stops, the piece is upbeat.  Change the sounds and it is suddenly reflective or pensive.  In fact, my biggest risk is probably over-using the breadth of sounds and dynamics.

For instance the carillon seems like a great choice for bringing out a melody, perhaps to introduce a hymn.  However, it would probably be tiresome for the congregation if used every week.  Also, it is tempting to approach some hymns with a powerful accompaniment.  I enjoy hearing the reverberation at the end of the piece while practicing.  However, the congregation shouldn’t be in a shouting match with the organ, so I’ll need to tame my “Virgil Fox-ness”1.

This will be an interesting month of Sundays for me and the congregation.  I pray we will each find enjoyment and meaning during the worship time spent together.

At the least, I hope those worshiping don’t come away saying, “I like the silent church before the service begins, better than Dave’s organ playing.”
:)

1 http://en.wikipedia.org/wiki/Virgil_Fox

Database Refactoring and RDF Triples

May 12th, 2010

One of the aspects of agile software development that may lead to significant angst is the database.  Unlike refactoring code, the refactoring of the database schema involves a key constraint – state!  A developer may rearrange code to his or her heart’s content with little worry since the program will start with a blank slate when execution begins.  However, the database “remembers.”  If one accepts that each iteration of an agile process produces a production release then the stored data can’t be deleted as part of the next iteration.

The refactoring of a database becomes less and less trivial as project development continues.  While developers have IDEs to refactor code, change packages, and alter build targets, there are few tools for refactoring databases.

My definition of a database refactoring tool is one that assists the database developer by remembering the database transformation steps and storing them as part of the project – e.g. part of the build process.  This includes both the schema changes and data transformations.  Remember that the entire team will need to reproduce these steps on local copies of the database.  It must be as easy to incorporate a peer’s database schema changes, without losing data, as it is to incorporate the code changes.

These same data-centric complexities exist in waterfall approaches when going from one version to the next.  Whenever the database structure needs to change, a path to migrate the data has to be defined.  That transformation definition must become part of the project’s artifacts so that the data migration for the new version is supported as the program moves between environments (test, QA, load test, integrated test, and production).  Also, the database transformation steps must be automated and reversible!
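
To make the requirements above concrete (steps stored with the project, covering schema and data, automated and reversible), here is a minimal sketch using Python’s built-in sqlite3 module.  It is an assumption-laden illustration rather than a recommended tool: the table names, the `MIGRATIONS` list and the `migrate` helper are all hypothetical, and the schema version is tracked with SQLite’s `user_version` pragma.

```python
import sqlite3

# Each migration is (version, forward SQL, rollback SQL), kept in the
# project next to the code.  Table and column names are hypothetical.
MIGRATIONS = [
    (1,
     "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)",
     "DROP TABLE customer"),
    (2,
     # Schema change plus data transformation: phone becomes one-to-many.
     # The old column is left in place so the rollback loses no data.
     "CREATE TABLE phone (customer_id INTEGER, number TEXT);"
     "INSERT INTO phone SELECT id, phone FROM customer "
     "WHERE phone IS NOT NULL",
     "DROP TABLE phone"),
]


def current_version(conn):
    """Read the schema version stored in the database file itself."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn, target):
    """Apply forward or rollback steps, in order, until target is reached."""
    version = current_version(conn)
    while version < target:
        new_version, up, _ = MIGRATIONS[version]
        conn.executescript(up)
        version = new_version
        conn.execute(f"PRAGMA user_version = {version}")
    while version > target:
        old_version, _, down = MIGRATIONS[version - 1]
        conn.executescript(down)
        version = old_version - 1
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()


conn = sqlite3.connect(":memory:")
migrate(conn, 1)
conn.execute("INSERT INTO customer (name, phone) VALUES ('Ann', '555-0100')")
migrate(conn, 2)  # upgrade: existing data is carried into the new table
print(conn.execute("SELECT * FROM phone").fetchall())
migrate(conn, 1)  # rollback: the original data is still intact
print(conn.execute("SELECT name, phone FROM customer").fetchall())
```

Because every teammate runs the same ordered list against a local copy, picking up a peer’s schema change becomes a matter of calling `migrate` with the latest version.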

That last point, the ability to roll back, is a key part of any rollout plan.  We must be able to back out changes.  It may be that the approach to a rollback is to create a full database backup before implementing the update, but that assumption must be documented and vetted (e.g. the approach of a full backup to support the rollback strategy may not be reasonable in all cases).

This database refactoring issue becomes very tricky when dealing with multiple versions of an application.  The transformation of the database schema and data must be done in a defined order.  As more and more data is stored, the process consumes more storage and processing resources.  This is the ETL side-effect of any system upgrade.  Its impact is simply felt more often (e.g. potentially during each iteration) in an agile project.

As part of exploring semantic technology, I am interested in contrasting this to a database that consists of RDF triples.  The semantic relationships of data do not change as often (if at all) as the relational constructs.  Many times we refactor a relational database as we discover concepts that require one-to-many or many-to-many relationships.

Is an RDF triple-based database easier to refactor than a relational database?  Is there something about the use of RDF triples that reduces the likelihood of a multiplicity change leading to a structural change in the data?  If so, using RDF as the data format could be a technique that simplifies the development of applications.  For now, let’s take a high-level look at a refactoring use case.
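
As a small illustration of the multiplicity question, consider triples held as plain Python tuples.  This sketch does not answer the questions above and uses no RDF library; the `ex:` names and the `values` helper are hypothetical.  It only shows that moving a property from one value to many adds a triple without changing the shape of the store or of the query.

```python
# A tiny triple store: a set of (subject, predicate, object) tuples.
triples = {
    ("ex:ann", "ex:name", "Ann"),
    ("ex:ann", "ex:phone", "555-0100"),
}


def values(store, subject, predicate):
    """All objects recorded for a subject and predicate, sorted."""
    return sorted(o for s, p, o in store if s == subject and p == predicate)


print(values(triples, "ex:ann", "ex:phone"))

# A customer gains a second phone number: no new table, no new column,
# just one more triple.
triples.add(("ex:ann", "ex:phone", "555-0199"))
print(values(triples, "ex:ann", "ex:phone"))
```

In a relational schema the same change would typically mean moving the phone column into its own table and migrating the existing rows.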

Business Ontologies and Semantic Technologies Class

May 9th, 2010

Last week I had the pleasure of attending Semantic Arts’ training class entitled, “Designing and Building Business Ontologies.”  The course, led by Dave McComb and Simon Robe, provided an excellent introduction to semantic technologies and tools as well as coverage of ontological best practices.  I thoroughly enjoyed the 4-day class and achieved my principal goals in attending; namely to understand the semantic web landscape, including technologies such as RDF, RDFS, OWL, SPARQL, as well as the current state of tools and products in this space.

Both Dave and Simon have a deep understanding of this subject area.  They also work with clients using this technology so they bring real-world examples of where the technology shines and where it has limitations.  I recommend this class to anyone who is seeking to reach a baseline understanding of semantic technologies and ontology strategies.

Why am I so interested in semantic web technology?  I am convinced that structuring information such that it can be consumed by systems, in ways more automated than current data storage and association techniques allow, is required in order to achieve any meaningful advancement in the field of information technology (IT). Whether wiring together web services or setting up ETL jobs to create data marts, too much IT energy is wasted on repeatedly integrating data sources; essentially manually wiring together related information in the absence of the computer being able to wire it together autonomously!

Full Disk Encryption – A (Close to Home) Case Study

April 28th, 2010

This is a follow-up to my previous entry regarding full disk encryption (see: http://monead.com/blog/?p=319).  In this entry I’ll look at Blue Slate’s experience with rolling out full disk encryption company-wide.

Blue Slate began experimenting with full disk encryption in 2008.  I was actually the first user at our company to have a completely encrypted disk.  My biggest surprise was the lack of noticeable impact on system performance.  My machine (Gateway M680) was running Windows XP and I had 2GB of RAM and a similarly-sized swap space.  Beyond a lot of programming work I do video and audio editing.  I did not notice significant impact on editing and rendering of such projects.

Later in 2008, we launched a proof of concept (POC) project involving team members from across the company (technical and non-technical users).  This test group utilized laptops with fully encrypted drives for several months.  We wanted to assure that we would not have problems with the various software packages that we use. During this time we went through XP service pack releases, major software version upgrades and even a switch of our antivirus solution.  We had no reports of encryption-related issues from any of the participants.

By 2009 we were focused on leveraging full disk encryption on every non-server computer in the company.  It took some time due to two constraints.

First, we needed to roll out a company-wide backup solution (as mentioned in my previous post on full disk encryption, recovery of files from a corrupted encrypted device is nearly impossible).  Second, we needed to work through a variety of scheduling conflicts (we needed physical access to each machine to set up the encryption product) across our decentralized workforce.

Full Disk Encryption – Two Out of Three Aren’t Bad

April 14th, 2010

Security is a core interest of mine.  I have written and taught about security for many years, have consistently kept our team focused on secure solutions, and am pursuing the CISSP certification.  Some aspects of security are hard to make work effectively and other aspects are fairly simple, having more to do with common sense than technical expertise.

In this latter category I would put full disk encryption.  Clearly there are still many companies and individuals who have not embraced this technique.  The barrage of news articles describing lost and stolen computers containing sensitive information on unencrypted hard drives makes this point every day.

This leads me to the question of why people don’t use this technology.  Is it a lack of information, limitations in the available products or something else?  For my part I’ll focus this posting on providing information regarding full disk encryption, based on experience. A future post will describe Blue Slate’s deployment of full disk encryption.

Security focuses on three major concepts: Confidentiality, Integrity and Availability (CIA).  These terms apply across the spectrum of potential security-related issues.  Whether considering the physical environment, hardware, applications or data, there are techniques to protect the CIA within that domain.

Privacy Lost – Unmasking Masked Data

April 1st, 2010

Privacy is an issue which is consistently in the news.  Large amounts of data are stored by retailers, governments, health care providers, employers and so forth. Much of this data contains personal information.  Keeping that data private has proven itself to be a difficult task.

We have seen numerous examples of unintended data loss (unintended by the company whose systems are stolen or attacked).

We hear about thefts of laptops containing personal information for hundreds of thousands of people.  Internet-based attacks that allow attackers access to financial transaction data and even rogue credit card swiping equipment hidden in gas pumps have become background noise in a sea of leaked data.  This is an area that gets the lion’s share of attention in the media and by security professionals.

Worse than these types of personal data loss, because they are completely preventable, are those that are predicated on a company consciously releasing its customer data.  Such companies always assume that they are not introducing risk, but often they are.  In all cases, if the owner of the data had simply held it internally no privacy loss would have occurred.

There have been cases of personal data loss due to mistakes in judgment.

AOL released a large collection of search data to researchers.  The people releasing the data didn’t consider this a risk to privacy.  How could the search terms entered by anonymous people present a risk to privacy?

Of course we now know that within the data were people’s social security numbers (SSN), phone numbers, credit card numbers and so forth.  Why?  Well, it turns out that some people will search for those things, quite possibly to prove to themselves that their data is safe.  What better way to see if your SSN or credit card number is published on the Internet than by typing it into a search engine?  No matches, great!

Personal data has even been lost by companies releasing data after attempting to mask or anonymize it.

The intent of masking is to remove enough information, the personally identifying information (PII), so that the data cannot be associated with real people. Of course this has to be done without losing the important details that allow patterns and relationships in the data to be found.
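
A minimal sketch of that trade-off follows, assuming a hypothetical record layout (name, SSN, ZIP code, birth year, claim type) and made-up values.  Direct identifiers are dropped and the remaining fields are generalized so that patterns survive; the final count shows how a masked record can still be the only one of its kind.

```python
from collections import Counter

# Fields treated as direct identifiers in this hypothetical layout.
PII_FIELDS = {"name", "ssn"}


def mask(record):
    """Drop direct identifiers and coarsen ZIP code and birth year."""
    masked = {k: v for k, v in record.items() if k not in PII_FIELDS}
    masked["zip"] = record["zip"][:3] + "**"  # keep only the ZIP prefix
    masked["birth_year"] = record["birth_year"] // 10 * 10  # decade
    return masked


# Made-up records for illustration only.
records = [
    {"name": "Ann Example", "ssn": "000-00-0001", "zip": "12205",
     "birth_year": 1961, "claim": "auto"},
    {"name": "Bob Example", "ssn": "000-00-0002", "zip": "12208",
     "birth_year": 1967, "claim": "home"},
    {"name": "Cy Example", "ssn": "000-00-0003", "zip": "04096",
     "birth_year": 1984, "claim": "auto"},
]

masked = [mask(r) for r in records]

# How many masked records share each (zip, birth decade) combination?
groups = Counter((m["zip"], m["birth_year"]) for m in masked)
for combination, count in sorted(groups.items()):
    print(combination, count)
```

A combination with a count of one still points at a single person, which is where the remaining privacy risk lies.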
