<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Dave&#039;s Reflections</title>
	<atom:link href="http://monead.com/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://monead.com/blog</link>
	<description></description>
	<lastBuildDate>Tue, 04 Sep 2012 03:14:59 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Target’s Unit Pricing Looks Like a Bad Yolk on Consumers</title>
		<link>http://monead.com/blog/?p=1643</link>
		<comments>http://monead.com/blog/?p=1643#comments</comments>
		<pubDate>Tue, 04 Sep 2012 03:14:59 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Consumer]]></category>
		<category><![CDATA[Quality]]></category>
		<category><![CDATA[consumer awareness]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[unit pricing]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1643</guid>
		<description><![CDATA[I don’t often purchase dairy goods at Target.  Today was an exception.  Usually I head to Target for videos, office supplies, gift cards, Halloween decorations and so forth.  On this occasion I needed some shipping boxes, birthday napkins and eggs.  Rather than stop at two stores I decided to get the eggs at Target. So [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;">I don’t often purchase dairy goods at Target.  Today was an exception.  Usually I head to Target for videos, office supplies, gift cards, Halloween decorations and so forth.  On this occasion I needed some shipping boxes, birthday napkins and eggs.  Rather than stop at two stores I decided to get the eggs at Target.</p>
<p>So why the posting?  Well, either my math skills are getting really poor (probably due to my relentless exposure to computers and calculators) or Target has some issue with understanding egg pricing.  Here are pictures of the price labels in the dairy section (taken at the Target in Clifton Park, NY on September 3, 2012).  <strong>Anything look askew?</strong></p>
<p><a href="http://monead.com/blog/wp-content/uploads/2012/09/EggUnitPricingPricing.Target.201209031.png"><img class="alignnone size-full wp-image-1654" title="Egg Unit Pricing - Target - 2012-09-03" src="http://monead.com/blog/wp-content/uploads/2012/09/EggUnitPricingPricing.Target.201209031.png" alt="Egg Unit Pricing - Target - 2012-09-03" width="410" height="496" /></a></p>
<p>Now, if the errors were consistent I guess I could understand.  After all, converting from “12 eggs” to “price per dozen” takes some understanding of the word “dozen” along with the principle of unit pricing.  <strong>What I find interesting is that the unit pricing calculation for these eggs is somewhat cracked since it does not seem to follow a pattern.  </strong>Also, apparently no one has noticed these strange unit prices.</p>
<p><span id="more-1643"></span>Unit pricing is <em><strong>supposed</strong> </em>to help people make informed decisions about the price difference between options.  It is irritating enough when the unit price for a “value” or “saver” size is higher than the regular size.  But at least the shopper who is paying attention can readily identify the deception.</p>
<p>More troublesome is looking at a group of products and finding that their unit prices don’t use a consistent unit.  For example, I’ve looked at cereals and had some brands give a unit price per ounce and others per pound.  Worse, I’ve seen volume and weight measures used on similar products.  Clearly such a store doesn’t want me (the consumer) to have an easy path to understanding the relative prices of those items.</p>
<p><strong>In this case the unit price is just plain wrong. </strong> It cannot be used in any way to compare the relative prices of the eggs since the error isn’t consistent, creating some sort of shell game.  In several cases the unit price is correct; in others it is off by a factor of 12 (treating the price for 12 eggs as if it were the price for 1).  In one case the error is a factor of 8.  <em>Where did they come up with 8?</em></p>
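<p>For anyone who wants to check a label themselves, here is a minimal sketch of the arithmetic.  The function names and the prices used with them are made up for illustration; they are not the values from the labels pictured above.</p>

```python
from fractions import Fraction


def unit_price_per_dozen(shelf_price, egg_count):
    """Return the price per dozen eggs as an exact fraction of a dollar."""
    return Fraction(shelf_price) * 12 / egg_count


def label_error_factor(shelf_price, egg_count, labeled_unit_price):
    """Return labeled / expected unit price; 1 means the label is correct."""
    expected = unit_price_per_dozen(shelf_price, egg_count)
    return Fraction(labeled_unit_price) / expected


# Hypothetical labels: (shelf price, eggs in carton, unit price printed per dozen)
labels = [("2.40", 12, "2.40"), ("2.40", 12, "0.20"), ("3.60", 18, "2.40")]
for shelf, count, printed in labels:
    print(shelf, count, printed, label_error_factor(shelf, count, printed))
```

<p>A factor of 1 means the printed unit price matches the shelf price.  A factor of 1/12 is the “price for 12 treated as the price for 1” mistake described above.</p>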
<p>Being both amused and frustrated by these errors, I’ll be taking a closer look at unit pricing at a variety of stores just to understand the reliability of these “consumer aids.” It will be interesting to compare the results between retail chains as well.</p>
<p>Anyone out there ever use unit price information when choosing which product to buy?  Ever found errors or confusing unit pricing schemes being used?</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1643</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Semantics in the Cognitive Corporation™ Framework</title>
		<link>http://monead.com/blog/?p=1632</link>
		<comments>http://monead.com/blog/?p=1632#comments</comments>
		<pubDate>Tue, 14 Aug 2012 21:46:53 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Cognitive Corporation]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[Semantic Technology]]></category>
		<category><![CDATA[Software Composition]]></category>
		<category><![CDATA[cognitive corporation]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[enterprise applications]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[system integration]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1632</guid>
		<description><![CDATA[When depicting the Cognitive Corporation™ as a graphic, the use of semantic technology is not highlighted.  Semantic technology serves two key roles in the Cognitive Corporation™ – data storage (part of Know) and data integration, which connects all of the concepts.  I’ll explore the integration role since it is a vital part of supporting a [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;">When depicting the Cognitive Corporation™ as a graphic, the use of semantic technology is not highlighted.  <em>Semantic technology serves two key roles in the Cognitive Corporation™ –</em> <em>data storage (part of Know) and data integration</em>, which connects all of the concepts.  I’ll explore the integration role since it is a vital part of supporting a learning organization.</p>
<p>In my last post I talked about the fact that <strong><em>integration between components has to be based on the meaning of the data</em></strong>, not simply passing compatible data types between systems.  Semantic technology supports this need through its design.  What key capabilities does semantic technology offer in support of integration?  Here I’ll highlight a few.</p>
<p><strong>Logical and Physical Structures are (largely) Separate</strong></p>
<p>Compared with a relational database, semantic technology loosens the tie between the logical and physical structures of the data.  In a relational database it is the physical structure (columns and tables) along with the foreign keys that maintain the relationships in the data.  Just think back to relational database design class: in a normalized database all of the column values are related to the table’s key.</p>
<p><em>This tight tie between data relationships (logical) and data structure (physical) imposes a steep cost if a different set of logical data relationships is desired</em>.  Traditionally, we create data marts and data warehouses to allow us to represent multiple logical data relationships.  These are copies of the data with differing physical structures and foreign key relationships.  We may need these new structures to allow us to report differently on our data or to integrate with different systems which need the altered logical representations.</p>
<p>With semantic data we can take a physical representation of the data (our triples) and apply different logical representations in the form of ontologies.  To be fair, the physical structure (subject-&gt;predicate-&gt;object) forces certain constraints on the ontology but <em>a logical transformation is far simpler than a physical one</em> even with such constraints.</p>
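<p>As a rough illustration of one physical structure serving several logical views, here is a toy sketch in plain Python.  The identifiers, predicates and view names are all hypothetical, and a dictionary of renamed predicates is only a stand-in for an ontology; a real one (in OWL, for example) expresses far more than this.</p>

```python
# One physical representation: a list of (subject, predicate, object) triples.
TRIPLES = [
    ("emp:17", "worksIn", "dept:claims"),
    ("emp:17", "reportsTo", "emp:4"),
    ("emp:4", "worksIn", "dept:claims"),
]

# Two hypothetical logical views. Each maps a stored predicate to the term
# that view uses; predicates a view does not mention are outside that view.
HR_VIEW = {"worksIn": "memberOf", "reportsTo": "managedBy"}
FINANCE_VIEW = {"worksIn": "chargedTo"}


def apply_view(triples, view):
    """Re-express the stored triples in a view's vocabulary without copying
    or restructuring the underlying data."""
    return [(s, view[p], o) for (s, p, o) in triples if p in view]


print(apply_view(TRIPLES, HR_VIEW))
print(apply_view(TRIPLES, FINANCE_VIEW))
```

<p>The stored triples never change; only the mapping applied to them does.  In the relational case the equivalent change would typically mean a new physical copy of the data.</p>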
<p><strong><span id="more-1632"></span>Meaning is Inherent in the Infrastructure</strong></p>
<p>Semantic technology includes a powerful ontological representation at its core.  This key capability, constraining terms as a way to create precise definitions, allows us to represent information with clear and unambiguously defined meaning.  More to the point, by starting with precise definitions at an enterprise level, we can assure that each system ties to that ontology through related system-level ontologies.</p>
<p>The “meaning” aspect of semantic technology is valuable in the Cognitive Corporation™ since the data from all systems is being integrated and used to inform our predictive models and ultimately our applications, through workflow, rules and data changes.  Having a concise set of definitions that each system can tie to allows us to incorporate legacy systems as well as create new systems each sharing the concepts from our enterprise ontology.</p>
<p><strong><em>The integration of operational data and behavioral data is vital to a learning organization.</em></strong><em> </em>In order to learn<strong> </strong>it is not good enough to see that some value is changing, <em>we have to be able to track down <strong>why</strong> the value is changing</em>.  This means that the operational data alone is insufficient to inform learning. Operational data allows us to build models representing what is known to be happening. It is not particularly useful in helping to identify the root cause.  Our broader behavioral data is required and must be correlated with the operational data.  Semantic technology supports this requirement very well.</p>
<p><strong>Flexibility in Data Representation through URIs</strong></p>
<p>When integrating systems we may need to work with a variety of data, including rich data such as documents and media files.  The structure used to represent data in semantic solutions gives us significant latitude in terms of the data being represented.  We can create relationships to constants (string, integer, floating point and so forth) but of more interest is how we create relationships to other objects.  That is done through URIs.</p>
<p><em>This simple approach to data representation, using URIs, removes a great deal of noise from the design of our database</em>.  It also allows us to take advantage of an ongoing trend to move unstructured information into Content Management Systems (CMS).  Any modern CMS will permit referencing its information through a URL – easily integrated with our semantic data store.  Further, integration between systems, even those from other businesses, is much more reasonably accomplished through URLs than copying data or creating proprietary integrations.</p>
<p>Another benefit to representing objects through the use of URLs is the fact that the content represented by a given URL can differ – an image in one case, a video in another and an MP3 in a third.  We do not need to bother our data tier with these details.  When representing data we are interested in intent, the logical relationships.  <strong><em>Rendering is not the job of the data tier – so why add that overhead to our data infrastructure?</em></strong>  It is the data tier’s job to enable the logical structure so that we can find what we are looking for.  What to do with the data once we find it is the job of other tiers in our solution stack.</p>
<p>I hope this post shines a little light on how semantics plays a key role in supporting a Cognitive Corporation™.  As mentioned in an earlier post, we will be putting these concepts to the test with a set of enterprise products.  The team looks forward to being able to share their insights and lessons learned.</p>
<p>Are you looking at semantic technology within the scope of your enterprise architecture?  What do you see as its principal benefits, if any?</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1632</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cognitive Corporation™ Innovation Lab Kickoff!</title>
		<link>http://monead.com/blog/?p=1608</link>
		<comments>http://monead.com/blog/?p=1608#comments</comments>
		<pubDate>Sat, 11 Aug 2012 03:24:18 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[BPM]]></category>
		<category><![CDATA[Business Processes]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Cognitive Corporation]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[Data Analytics]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[Semantic Technology]]></category>
		<category><![CDATA[Software Composition]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[business rules]]></category>
		<category><![CDATA[cognitive corporation]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[enterprise applications]]></category>
		<category><![CDATA[enterprise systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1608</guid>
		<description><![CDATA[I am excited to share the news that Blue Slate Solutions has kicked off a formal innovation program, creating a lab environment which will leverage the Cognitive Corporation™ framework and apply it to a suite of processes, tools and techniques.  The lab will use a broad set of enterprise technologies, applying the learning organization concepts [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;"><strong>I am excited to share the news that Blue Slate Solutions has kicked off a formal innovation program</strong>, creating a lab environment which will leverage the Cognitive Corporation™ framework and apply it to a suite of processes, tools and techniques.  The lab will use a broad set of enterprise technologies, applying the learning organization concepts implicit in the Cognitive Corporation’s™ feedback loop.</p>
<p>I’ve blogged a couple of times (see references at the end of this blog entry) about the Cognitive Corporation™.  The depiction has changed slightly but the fundamentals of the framework are unchanged.</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2012/08/CC-v2-crop.485x321.png"><img class="alignright size-medium wp-image-1621" title="Cognitive Corporation Depiction" src="http://monead.com/blog/wp-content/uploads/2012/08/CC-v2-crop.485x321-300x198.png" alt="Cognitive Corporation Depiction" width="300" height="198" align="right" /></a>The focus is to <em><strong>create a learning enterprise</strong></em>, where the learning is built into the system integrations and interactions. Enterprises have been investing in these individual components for several years; however <em>they have not truly been integrating them in a way to promote learning.</em></p>
<p>By “<strong>integrating</strong>” I mean <strong>allowing the systems to understand the meaning of the data being passed</strong> between them.  Creating a screen in a workflow (BPM) system that presents data from a database to a user is not “integration” in my opinion.  It is simply passing data around.  This prevents the enterprise ecosystem (all the components) from working together and collectively learning.</p>
<p>I liken such connections to my taking a hand-written note in a foreign language, which I don’t understand, and typing the text into an email for someone who does understand the original language.  Sure, the recipient can read it, but I, representing the workflow tool passing the information from database (note) to screen (email) in this case, have no idea what the data means and cannot possibly participate in learning from it.  <strong>Integration requires understanding.  Understanding requires defined and agreed-upon semantics.</strong></p>
<p>This is just one of the Cognitive Corporation™ concepts that we will be exploring in the lab environment.  We will also be looking at the value of these technologies within different horizontal and vertical domains.  Given our expertise in healthcare, finance and insurance, our team is well positioned to use the lab to explore the use of learning BPM in many contexts.</p>
<p><span id="more-1608"></span>From an implementation perspective we have a core execution team of 8 individuals assigned as owners across Requirements, BPM, BRM, Data Mining, Semantic Technology, Relational Database Technology, and Application Architecture.  Scrum will be used as the lifecycle for the lab projects.  Beyond the deep expertise provided by our team, <em>I’m excited by the energy and enthusiasm that the execution team members bring to the lab</em>.</p>
<p>A separate advisory team has also been established.  This group will provide a sounding board as the sprints complete and new stories are considered.  It will also serve as a pool of team members to swap into the execution team since these folks will be in the loop with the ongoing projects.</p>
<p>The tools we will use are enterprise solution components including workflow and rules engines, web service and message-based middleware, traditional relational database management systems, data mining tools and semantic technology such as an ontology editor, reasoner, triple store and SPARQL endpoint.  <strong>We are creating an architecture with components similar to what we find our clients using, <em>augmented with less common components such as predictive modeling tools and semantics.</em></strong></p>
<p>The lab’s goal is to advance our company’s expertise with these enterprise tools and techniques, providing insight regarding best practices for leveraging them as components of a learning system.  Methods for dynamic interactions between components such as predictive analytics driving business rules changes or semantic technology simplifying data integration will be tried, honed and <em>ultimately become part of Blue Slate Solutions’ best practices for advancing the value of BPM within our clients’ enterprises.</em></p>
<p>As part of our process, we will be sharing updates, both from individual team members through blog entries and also in general communications highlighting the overall progress and outputs from the lab.  We all hope that others will find the information thought provoking and educational.</p>
<p>If you have thoughts about what we are undertaking, or would like to learn more about it, please feel free to send an email to info@blueslate.net.</p>
<p><em><strong>Prior blog entries on the Cognitive Corporation™:</strong></em></p>
<ul>
<li><a href="http://monead.com/blog/?p=1231" target="_new">The Cognitive Corporation – An Introduction</a></li>
<li><a href="http://monead.com/blog/?p=1156" target="_new">The Cognitive Corporation &#8211; Effective BPM Requires Data Analytics</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1608</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>State Selection Lists on Website Forms &#8211; How Hard Are They to Sort?</title>
		<link>http://monead.com/blog/?p=1588</link>
		<comments>http://monead.com/blog/?p=1588#comments</comments>
		<pubDate>Thu, 14 Jun 2012 04:07:09 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Quality]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[online forms]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[quality]]></category>
		<category><![CDATA[sorting]]></category>
		<category><![CDATA[user interface]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1588</guid>
		<description><![CDATA[This post certainly falls into the “nitpick” category, but the flaw occurs often enough to be somewhat irritating.  The problem you ask?  Drop-down lists of state names that are not ordered by the state name but instead by the state’s 2-letter postal abbreviation.  Granted, this error pales in comparison to applications containing SQL injection flaws [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;">This post certainly falls into the “nitpick” category, but the flaw occurs often enough to be somewhat irritating.  The problem, you ask?  <strong>Drop-down lists of state names that are not ordered by the state name but instead by the state’s 2-letter postal abbreviation. </strong> Granted, <em>this error pales in comparison to applications containing SQL injection flaws or race conditions exposing personal information,</em> but I’m going to complain nonetheless.</p>
<p>What exactly is the issue?  Well, it turns out that the two-letter postal abbreviations (for example AK for Alaska and HI for Hawaii) can’t be used as the key for sorting the state names into alphabetical order.  For the most part it works; however, for some states, such as Nevada through New Mexico, it breaks.  As a New York resident I get tripped up by this.</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2012/06/BadStateSorting.cropped.png"><img class="alignleft size-full wp-image-1592" title="BadStateSorting.cropped" src="http://monead.com/blog/wp-content/uploads/2012/06/BadStateSorting.cropped.png" alt="Example of Incorrect State Sorting" width="282" height="262" align="left" /></a>The image shown here is a web form for a college admissions site.  As you can see, Nevada follows New Hampshire, New Jersey and New Mexico but precedes New York.  In reality it should follow Nebraska and precede New Hampshire.  This order is incorrect; it is based on the state abbreviations.  If instead of state names the website were displaying the state abbreviations, the order would be NH, NJ, NM, NV, NY and all would be fine.</p>
<p>The developer(s) of this site are not alone in their mistaken use of the postal abbreviation as the sort key.  I’ve encountered this issue with online shopping sites, reservation systems and survey forms.  I typically do a quick “view source” of the site and invariably they are using the state abbreviation as the actual value being passed to the server.  I’m sure they are using that for sorting as well.</p>
<p>You might think this sort of thing doesn’t matter.  From my point of view it represents a “<strong>broken window</strong>,” using Andy Hunt&#8217;s and Dave Thomas&#8217; language from <em>The Pragmatic Programmer</em>.  Little things count.  Little things left uncorrected form an environment where developers may become more and more sloppy.  After all, if I don’t need to pay attention to my sort key for state, what’s to say I won’t make a similar mistake with country or a product list or any other collection of values that is supposed to be ordered to make access easier?</p>
<p>Please, if you are designing an input form, make sure that sorted information displayed by your widgets is sorted by the <strong><em>display value</em></strong>, not some internal code.  It will make the use of your form easier for users and garner the respect of your fellow developers.</p>
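<p>To make the difference concrete, here is a minimal sketch using a handful of states.  The list and function names are my own for illustration; this is not the code behind any of the sites I mentioned.</p>

```python
# (postal abbreviation, display name) pairs; a small subset for illustration.
STATES = [
    ("NY", "New York"),
    ("NV", "Nevada"),
    ("NM", "New Mexico"),
    ("NJ", "New Jersey"),
    ("NH", "New Hampshire"),
    ("NE", "Nebraska"),
]


def names_sorted_by_code(states):
    """The flawed approach: order the display names by the internal code."""
    return [name for code, name in sorted(states, key=lambda pair: pair[0])]


def names_sorted_by_name(states):
    """The fix: order the display names by what the user actually reads."""
    return [name for code, name in sorted(states, key=lambda pair: pair[1])]


print(names_sorted_by_code(STATES))
print(names_sorted_by_name(STATES))
```

<p>Sorting by code puts Nevada after New Mexico, exactly the behavior in the screenshot.  Sorting by name puts it right after Nebraska.  The form can still submit the abbreviation as the value; only the sort key changes.</p>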
<p>Have you seen this flaw on websites you’ve visited?  Do you have pet peeves with online form designs?  I’d enjoy hearing about them.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1588</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Semantic Technology and Business Conference, East 2011 – Reflections</title>
		<link>http://monead.com/blog/?p=1519</link>
		<comments>http://monead.com/blog/?p=1519#comments</comments>
		<pubDate>Wed, 07 Dec 2011 17:22:13 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[Semantic Technology]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[enterprise systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[system integration]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1519</guid>
		<description><![CDATA[I had the pleasure of attending the Semantic Technology and Business Conference in Washington, DC last week.  I have a strong interest in semantic technology and its capabilities to enhance the way in which we leverage information systems.  There was a good selection of topics discussed by people with a variety of  backgrounds working in [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;">I had the pleasure of attending the Semantic Technology and Business Conference in Washington, DC last week.  I have a strong interest in semantic technology and its capabilities to enhance the way in which we leverage information systems.  There was a good selection of topics discussed by people with a variety of  backgrounds working in different verticals.</p>
<p>To begin the conference I attended the half-day <strong>“Ontology 101” presented by Elisa Kendall and Deborah McGuinness</strong>.  They indicated that this presentation has been given at each semantic technology conference and the interest is still strong.  The implication is that new people continue to want to understand this art.</p>
<p>Their material was very useful and if you are someone looking to get a grounding in ontologies (<em>what are they?  how do you go about creating them?</em>) I recommend attending this session the next time it is offered.  Both leaders clearly have deep experience and expertise in this field.  Also, the discussion was not tied to a technology (e.g. RDF) so it was applicable regardless of underlying implementation details.</p>
<p>I wrapped up the first day with <strong>Richard Ordowich who discussed the process of reverse engineering semantics (meaning) from legacy data</strong>.  The goal of such projects is to harmonize data across the enterprise.</p>
<p>A point he stressed was that <em><strong>a business really needs to be ready to start such a journey</strong></em>.  This type of work is very hard and very time consuming.  It requires an enterprise-wide discipline.  He suggests that before working with a company on such an initiative one should ask for examples of prior enterprise program success (e.g. something like BPM, SDLC).</p>
<p>Fundamentally, a project that seeks to harmonize the meaning of data across an enterprise requires organization readiness to go beyond project execution.  The enterprise must put effective governance in place to operate and maintain the resulting ontologies, taxonomies and metadata.</p>
<p><strong>The full conference kicked off the following day. </strong> One aspect that jumped out for me was that <em><strong>a lot of the presentations dealt with government-related projects. </strong></em> This could have been a side-effect of the conference being held in Washington, DC but I think it is more indicative that spending on this technology is weighted more heavily toward the public sector than toward private industry.</p>
<p>Because the projects were government-centric, I found any claims of “value” suspect.  A project can be valuable, or show value, without being cost effective.  Commercial businesses have gone bankrupt even though they delivered value to their customers.  More exposure of positive-ROI commercial projects will be important to help accelerate the adoption of these technologies.</p>
<p>Other than the financial aspect, <em><strong>the presentations were incredibly valuable in terms of presenting lessons learned, best practices and in-depth tool discussions.</strong></em>  I’ll highlight a few of the sessions and key thoughts that I believe will assist as we continue to apply semantic technology to business system challenges.</p>
<p><strong><span id="more-1519"></span>Marcel Jemio from the Department of the Treasury</strong> discussed a massive project seeking to apply semantics to their siloed transactional data.  His focus was to present his lessons learned as well as a possible blueprint for similar projects.  He did a great job on both fronts.  A focus he suggested was that <strong><em>such projects should not be concerned with classifying data; rather they should seek to classify knowledge. </em></strong> It is knowledge that is actionable and provides meaning and relevance.</p>
<p><em>Some other important points he made included the need for business data governance, standardized data (vocabulary, semantics) and a shared information repository</em>.  Beyond his excellent sharing of experiences, his slides included recommendations for how to present information to business stakeholders so as to inform and educate.</p>
<p><strong>Richard Green’s presentation looked at how to go about creating a practical ontology for an enterprise. </strong> He dove into detail, showing some of the structured approaches that he uses to gather the requirements that feed an ontology design.  A key point for Richard was that ontologies must be practical (delivering value, being easy to use), define an ontology (describing what exists and their relationships) and work for a specific enterprise or constituent.</p>
<p>I thought his approach was very logical and included useful guidance for thinking through the process.  Sharing his documentation templates helped to clarify how the approach works and specifically how his “5 w’s” are applied to different situations.  I intend to integrate some of his approach into my practices.</p>
<p>Another presentation that contained specific guidance was <strong>Janet Millenson’s ROI of RDF discussion</strong>.  She outlined some of the key drivers that would lead an organization to apply RDF-based technology as a more efficient way to approach a business challenge.  She went beyond simple ROI and also discussed the underlying business drivers, such as reducing costs and supporting their mission.</p>
<p>Her approach to making the business case was standard procedure for introducing large scale change to an organization&#8217;s leadership.  Like Marcel, she cautioned against the use of technology terms in these types of discussions.  The technology is irrelevant to business leadership; the results are what matter.  Based on her experience she shared the importance of setting realistic expectations and creating the right team in order to succeed.</p>
<p>When discussing actual project planning and execution she gave a good overview of some major aspects that must be thought through and managed in order to succeed.  These include people with their skills as well as technologies leveraged and the business domain being addressed.  Considering technology issues (data quality, governance), organizational concerns (build/buy, vendor relationships) and market risk (vendor consolidation, standards evolution) is vital in order to create the most effective result.</p>
<p><em>Her bottom line was that a successful project will address a specific problem in the organization (not boil the ocean), understand and work within constraints, start with realistic expectations and plan flexibility into the result to deal with technology and business changes.</em></p>
<p><strong>David Booth (presenting on behalf of Jurgen Angele) walked through 2 healthcare industry case studies</strong> where semantic technology was used to improve the effectiveness and safety of care while reducing costs.  The initial case was from the Cleveland Clinic.</p>
<p>The usual starting point of having multiple stovepipe systems that contained related data was presented.  The total data set, if combined in a meaningful way, would create a much more useful information source for doctors and researchers. Since semantic technology simplifies the integration of separate data sources, it was an obvious choice for this project.</p>
<p>David described the high-level process that was used to bring the data together.  The basic subsumption capability of inferencing is a simple yet powerful way to quickly combine data.  He quoted Jim Hendler, <em><strong>“A little inferencing goes a long way.” </strong></em> The presentation described the pipeline process that was applied and which makes a lot of sense as a standard approach to dealing with such integration projects.</p>
<p>The second case study revolved around PanGenX, which seeks to use human genome information to understand which drugs will work for which individuals.  The overall data integration process was very similar to the first case study.  One point that was made, which resonated with me, was the statement that <em><strong>SPARQL is a convenient rule language. </strong></em> This comment was in the context of simple rules, obviously not inferencing.  However if simple rules are being created, having them in the same syntax as the queries simplifies the process.</p>
<p><strong>Neil Raden’s presentation,</strong> albeit abbreviated, was thought provoking.  Neil’s direct manner, devoid of pretense, allowed him to actually cover a lot of information in a very short time.  His take on the overhyped and overly complex processes that represent current BI practices, such as large MDM projects and monolithic data warehouses, was that these are not where businesses should be spending their BI dollars.</p>
<p>He dropped a lot of information in one-liners.  Some of which weren’t related to semantics as much as data in general.  I found his presentation a call to action to <em><strong>leverage semantic technology in support of a new data paradigm, not a new way to implement existing paradigms. </strong></em> I don’t know if that was a point he intended to make but it seemed like a logical extension of what he was saying.</p>
<p>The conference wrapped up with a <strong>panel discussion between David Booth, Elisa Kendall, David McComb and Dennis Wisnosky. </strong> They discussed some of the changes they have seen with semantic technology including the fact that project implementation seems to be getting easier as tools mature and the semantic technology ecosystem gets fleshed out.  They are also seeing people focusing on more pragmatic and meaningful projects, going beyond simply playing with the technology.</p>
<p>There was some discussion about the continuing hype associated with this area and the need to not fall prey to that.  The progress that has been made represents hard work and collegial sharing.  <em>The solutions are neither turnkey nor ready to simply buy and install.</em></p>
<p>There continues to be a focus on unstructured data which is good since so much corporate information is sitting idle in documents rather than databases.</p>
<p><em><strong>Some specific guidance for getting started with semantic technology:</strong></em> prove it to yourself with a POC.  You can’t discuss it more broadly if you don’t believe in it.  Choose an initiative and give it a name so that people recognize it.  Plan to deliver value on a regular (no less than every 6 months) basis.  Start small and grow from there.</p>
<p>Controlling scope, as in all projects, is central to working with new technology in order to get to a success point and begin acquiring lessons learned, which will allow for future success.  The guidance from “The Mythical Man-Month” is still relevant: <em>you always throw the first one away.</em></p>
<p><em><strong>A theme that was heard in several sessions and repeated in the panel discussion was that use cases must be leveraged</strong></em> when beginning a semantic technology project.  In reality that has nothing to do with semantics.  I leverage use cases (or user stories) in every business system project.  The fact that they are required for semantic technology projects simply points out that these projects are still focused on business success and must be driven by business needs.</p>
<p><em><strong>On the technology side there were several vendors present</strong></em> demonstrating their tools and providing sessions that explored their case studies.   The capabilities of these tools continue to advance quickly.  I saw several that I will be trying out including ontology editors, inference engines and integration tools.</p>
<p>It is great to see the tool landscape continuing to grow in breadth and depth.  It is also nice to see a mix of commercial and open source alternatives, allowing a company to start a small POC without a large capital investment and then having commercial alternatives based on needs.</p>
<p><strong>Overall the conference presented useful experiential information through case studies and lessons learned. </strong> More commercial experience needs to be shared.  Also, a way needs to be found to make the output and advancement that the working groups are driving more visible and understandable.</p>
<p>In my case I am not looking to build a triple store or an OWL interpreter, so those types of standards are not interesting to me.  I need more basic information.  Some of this exists but may not be as visible as it could be.  There was discussion around the importance of raising the visibility of things like CIO-level information and developer (of semantic technology-based applications, not the semantic technology itself) documentation.</p>
<p><em><strong>I continue to be convinced that semantic technology will fundamentally change the way we work with data and system integrations. </strong></em> It is simply a matter of time to get there.  Practitioners and thought leaders have an important role to play in order to get this vision out into the mainstream and keep it in the forefront of enterprise system conversations.</p>
<p>As always, if you have thoughts you’d like to share about a topic on my blog, feel free to add a comment.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1519</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Using ARQoid for Android-based SPARQL Query Execution</title>
		<link>http://monead.com/blog/?p=1420</link>
		<comments>http://monead.com/blog/?p=1420#comments</comments>
		<pubDate>Thu, 01 Dec 2011 18:22:39 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1420</guid>
		<description><![CDATA[I was recently asked about the SPARQL support in Sparql Droid and whether it could serve as a way for other Android applications to execute SPARQL queries against remote data sources.  It could be used in this way but there is a simpler alternative I&#8217;d like to discuss here. On the Android platform it is [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;">I was recently asked about the SPARQL support in <a href="http://monead.com/blog/?p=1028" target="_new">Sparql Droid</a> and whether it could serve as a way for other Android applications to execute SPARQL queries against remote data sources.  It could be used in this way but there is a simpler alternative I&#8217;d like to discuss here.</p>
<p>On the Android platform it is actually quite easy to execute SPARQL against remote SPARQL endpoints, RDF data and local models.  The heavy lifting is handled by <a href="http://code.google.com/p/androjena/" target="_new">Androjena’s</a> <a href="http://code.google.com/p/androjena/wiki/ARQoid" target="_new">ARQoid</a>, an Android-centric port of HP’s Jena ARQ engine.</p>
<p>Both engines (the original and the port) do a great job of simplifying the execution of SPARQL queries and consumption of the resulting data.  In this post I’ll go through a simple example of using ARQoid.  Note that all the <a href="http://monead.com/semantic/AndrojenaDemo.zip">code being shown here is available for download</a>.  This post is based specifically on the <em>queryRemoteSparqlEndpoint()</em> method in the <em>com.monead.androjena.demo.arqoid.SparqlExamples</em> class.</p>
<h2>Setup</h2>
<p>To begin, some environment setup needs to be done in order to have a properly configured Android project ready to use ARQoid.</p>
<p>First, obtain the ARQoid JAR and its dependencies.  This is easily accomplished using the <a href="http://code.google.com/p/androjena/downloads/list" target="_new">download</a> page on the <a href="http://code.google.com/p/androjena/wiki/ARQoid" target="_new">ARQoid Wiki</a> and obtaining the latest ARQoid ZIP file.  Unzip the downloaded archive.   Since I’m discussing an Android application I’d expect that you would have created an Android project and that it contains a libs directory where the JAR files should be placed.</p>
<p>Second, add the JAR files to the classpath for your Android project.  I use the ADT plugin for Eclipse to do Android development.  So to add the JARs to my project I choose the <strong>Project</strong> menu item, select <strong>Properties</strong>, choose <strong>Build Path</strong>, select the <strong>Libraries</strong> tab, click the <strong>Add JARs…</strong> button, <strong>navigate to the libs</strong> directory, <strong>select the JAR files</strong> and click <strong>OK</strong> on the open dialogs.</p>
<p>Third, set up a minimal Android project.  The default layout, with a small change to its definition, will work fine.</p>
<h2>Overview</h2>
<p><em><strong>Now we are ready to write the code that uses ARQoid to access some data.</strong></em>  For this first blog entry I’ll focus on a trivial query against a SPARQL endpoint.  There would be some small differences if we wanted to query a local model or a remote data set.  Those will be covered in follow-on entries.</p>
<p><strong>Here is a list of the ARQoid classes we will be using for this initial example:</strong></p>
<ul>
<li><strong>com.hp.hpl.jena.query.Query</strong> – represents the query being executed</li>
<li><strong>com.hp.hpl.jena.query.Syntax</strong> – represents the query syntaxes supported by ARQoid</li>
<li><strong>com.hp.hpl.jena.query.QueryFactory</strong> – creates a <em>Query</em> instance based on supplied parameters such as the query string and syntax definition</li>
<li><strong>com.hp.hpl.jena.query.QueryExecution</strong> – provides the service to  execute the query</li>
<li><strong>com.hp.hpl.jena.query.QueryExecutionFactory</strong> – creates a <em>QueryExecution</em> instance based on supplied parameters such as a <em>Query</em> instance and SPARQL endpoint URI</li>
<li><strong>com.hp.hpl.jena.query.ResultSet</strong> – represents the returned data and metadata associated with the executed query</li>
<li><strong>com.hp.hpl.jena.query.QuerySolution</strong> – represents one row of data within the <em>ResultSet</em>.</li>
</ul>
<p>We’ll use these classes to execute a simple SPARQL query that retrieves some data associated with space exploration.  <a href="http://www.talis.com/" target="_new">Talis</a> provides an endpoint that we can use to access some interesting space exploration data.  The endpoint is located at <a href="http://api.talis.com/stores/space/services/sparql" target="_new">http://api.talis.com/stores/space/services/sparql</a>.<br />
<strong>The query we will execute is:</strong></p>
<pre>SELECT ?dataType ?data
WHERE {
  &lt;http://nasa.dataincubator.org/launch/1961-012&gt; ?dataType ?data.
}</pre>
<p>This query will give us a little information about Vostok 1 launched by the USSR in 1961.</p>
<h2><span id="more-1420"></span>Create the Query instance</h2>
<p>We begin by creating the <em>Query</em> instance using the <em>QueryFactory</em>.</p>
<pre>// Create a Query instance
Query query = QueryFactory.create(queryString, Syntax.syntaxARQ);</pre>
<p>This code assumes that the query given earlier has been assigned to the String variable <em>queryString</em>.</p>
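<p>As a minimal sketch of that assumption, the query text could be assembled in plain Java as shown here.  The class name <em>QueryStringExample</em> and the method <em>buildQueryString()</em> are hypothetical names used only for illustration; they are not part of ARQoid or of the downloadable demo.</p>
<pre>public class QueryStringExample {
  // Builds the SPARQL query shown in the Overview, one line per clause
  static String buildQueryString() {
    return "SELECT ?dataType ?data\n"
        + "WHERE {\n"
        + "  &lt;http://nasa.dataincubator.org/launch/1961-012&gt; ?dataType ?data.\n"
        + "}";
  }

  public static void main(String[] args) {
    // The variable name matches the one used with QueryFactory
    String queryString = buildQueryString();
    System.out.println(queryString);
  }
}</pre>
<p>Keeping each clause on its own line makes the query easier to read in a log or on screen while debugging.</p>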
<h2>Create the QueryExecution instance</h2>
<p>We next create a <em>QueryExecution</em> using the <em>QueryExecutionFactory</em>.</p>
<pre>// This query uses an external SPARQL endpoint for processing
// This is the syntax for that type of query
QueryExecution qe = QueryExecutionFactory.sparqlService(sparqlEndpointUri, query);</pre>
<p>This code assumes that the Talis endpoint mentioned above has been assigned to the String variable <em>sparqlEndpointUri</em>.</p>
<h2>Execute the query and obtain the ResultSet instance</h2>
<p>We are now ready to actually execute the query and obtain the <em>ResultSet</em> instance.</p>
<pre>// Execute the query and obtain results
ResultSet resultSet = qe.execSelect();</pre>
<p>The resultSet variable now provides us access to the query results.</p>
<h2>Retrieve the column names</h2>
<p>A useful piece of metadata in the <em>ResultSet</em> instance is the list of column names.  These will be based on the information requested in the <em>SELECT</em> clause.  ARQoid can return them as a List&lt;String&gt;.</p>
<pre>// Get the column names (the aliases supplied in the SELECT clause)
List&lt;String&gt; columnNames = resultSet.getResultVars();</pre>
<p>The columnNames List will contain the aliases given in the SELECT clause.</p>
<h2>Iterate through the resulting rows</h2>
<p>We can now iterate through the resulting rows, asking for each row’s data which is represented as a <em>QuerySolution</em> instance.</p>
<pre>// Iterate through all resulting rows
while (resultSet.hasNext()) {
  // Get the next result row
  QuerySolution solution = resultSet.next();</pre>
<p>The solution variable will contain the current result row&#8217;s data.</p>
<h2>Obtain the data for a row and column</h2>
<p>To actually access the data you can request a specific column name from the <em>QuerySolution</em> instance.  However, you need to know whether the data is null, a literal value or a URI.  The following code performs the necessary tests and then prints the data to the standard output (in the downloadable code it will be presented on the Android device&#8217;s screen).</p>
<pre>// var is assumed to hold the name of the column being read
// (for example, one of the entries in columnNames)
// Data value will be null if optional and not present
if (solution.get(var) == null) {
  System.out.println("{null}");
// Test whether the returned value is a literal value
} else if (solution.get(var).isLiteral()) {
  System.out.println(solution.getLiteral(var).toString());
// Otherwise the returned value is treated as a URI
} else {
  System.out.println(solution.getResource(var).getURI());
}</pre>
<p>From the code above you can see that in order to access the basic data value you use the <strong>get(String)</strong> method, which expects the column name to be passed.  This will return <em>null</em> if there is no data associated with this row and column.  If there is a value, the method <strong>isLiteral()</strong> may be called to test whether the data is a literal.  If it is not, then it will likely be a URI.  The URI can be accessed by calling the <strong>getResource(String)</strong> method, passing the column name, and calling the <strong>getURI()</strong> method on that value.</p>
<h2>Close the QueryExecution instance</h2>
<p>The last step is important.  The <em>QueryExecution</em> instance should be cleaned up by calling its close() method.</p>
<pre>// Important - free up resources used running the query
qe.close();</pre>
<p>These are the basic steps you need to carry out in order to execute a SPARQL query against a SPARQL endpoint using ARQoid.  Here is a screen shot of the Android emulator executing the example just covered.</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2011/12/ARQoid.example.screenshot.png"><img class="alignnone" title="ARQoid demo example screenshot" src="http://monead.com/blog/wp-content/uploads/2011/12/ARQoid.example.screenshot-300x212.png" alt="ARQoid demo running in the Android emulator example screenshot" width="300" height="212" align="left" /></a></p>
<h2>Notes</h2>
<p>A few other items for completeness.  First, <em><strong>remember to add the INTERNET permission to your Android manifest</strong></em> (&lt;uses-permission android:name=&quot;android.permission.INTERNET&quot;/&gt;).  Failure to do so will lead to an ARQoid failure when it tries to access the remote endpoint.  The stack trace will indicate a failure to access the URI – it won’t mention that there is a permission issue.</p>
<p>Also, depending on your query, you may be trying to access a lot of data which could be time consuming and also could cause issues with small memory devices.  <em><strong>You may limit the number of rows returned and set a starting row for the results. </strong></em> This allows you to create a sliding window in your application by only pulling a few results and then allowing the user to ask for more.  These values are set on the <em>Query</em> instance.  Once you have the <em>Query</em> from the <em>QueryFactory</em> you can use these methods.</p>
<pre>// Limit the number of results returned
// Setting the limit is optional - default is unlimited
query.setLimit(10);

// Set the number of leading results to skip (SPARQL OFFSET)
// Setting the offset is optional - default is to skip none
query.setOffset(10);</pre>
<p>The limit and offset given above would cause ARQoid to return 10 records, starting with the 11th one found. If there were fewer than 11 results then no records would be returned.</p>
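<p>The sliding window comes down to a small calculation.  Here is a sketch in plain Java, assuming a zero-based page index and treating the offset as the number of rows to skip, which is how SPARQL defines OFFSET.  The class <em>PagingExample</em> and the method <em>offsetForPage()</em> are hypothetical names for illustration and are not part of ARQoid.</p>
<pre>public class PagingExample {
  // Number of rows to skip so that the given zero-based page is returned
  static long offsetForPage(int pageIndex, int pageSize) {
    return (long) pageIndex * pageSize;
  }

  public static void main(String[] args) {
    int pageSize = 10;

    // Show the offset that would be used for the first three pages
    for (int page : new int[] {0, 1, 2}) {
      System.out.println("page " + page + " skips "
          + offsetForPage(page, pageSize) + " rows");
    }

    // With ARQoid the values would be applied to the Query instance:
    //   query.setLimit(pageSize);
    //   query.setOffset(offsetForPage(page, pageSize));
  }
}</pre>
<p>When the user asks for more results, the application would increment the page index, apply the new offset and execute the query again.</p>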
<h2>Conclusion</h2>
<p>Hopefully if you are trying to use ARQoid this will give you a quick template to leverage the basic features of the engine.  In the future I’ll expand on this by adding other data sources as well as using the URIs returned by the query to create a richer result for the user.</p>
<p>Remember that you may <a href="http://monead.com/semantic/AndrojenaDemo.zip">download the sample code</a> if you want to see the working Android demonstration application described here.  Also, you can <a href="https://market.android.com/details?id=com.monead.semantic.android.sparql" target="_new">install Sparql Droid</a> which contains a variety of sample SPARQL queries that use the local model, SPARQL endpoints, RDF data sources as well as demonstrating federated queries.</p>
<p>If you have questions or comments about this topic please add them to this post or send them to me via my contact page.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1420</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The Cognitive Corporation™ &#8211; Effective BPM Requires Data Analytics</title>
		<link>http://monead.com/blog/?p=1156</link>
		<comments>http://monead.com/blog/?p=1156#comments</comments>
		<pubDate>Tue, 25 Oct 2011 18:35:12 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[BPM]]></category>
		<category><![CDATA[Business Processes]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Cognitive Corporation]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[Data Analytics]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[business intelligence]]></category>
		<category><![CDATA[business rules]]></category>
		<category><![CDATA[cognitive corporation]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data analytics]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[machine learning]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1156</guid>
		<description><![CDATA[The Cognitive Corporation™ is a framework introduced in an earlier posting.  The framework is meant to outline a set of general capabilities that work together in order to support a growing and thinking organization.  For this post I will drill into one of the least mature of those capabilities in terms of enterprise solution adoption [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;"><a href="http://monead.com/blog/?p=1231">The Cognitive Corporation</a><strong>™</strong> is a framework introduced in <a href="http://monead.com/blog/?p=1231">an earlier posting</a>.  The framework is meant to outline a set of general capabilities that work together in order to support a growing and thinking organization.  <strong>For this post I will drill into one of the least mature of those capabilities in terms of enterprise solution adoption &#8211; <em>Learn</em>.</strong></p>
<p>Business rules, decision engines, BPM, complex event processing (CEP), these all evoke images of computers making speedy decisions to the benefit of our businesses.  The infrastructure, technologies and software that provide these solutions (SOA, XML schemas, rule engines, workflow engines, etc.) support the decision automation process.<strong><em>  However, they don’t know what decisions to make.</em></strong></p>
<p>The BPM-related components we acquire provide the <strong><em>how</em></strong> of decision making (send <em><strong>an</strong></em> email, route <em><strong>a</strong></em> claim, suggest <em><strong>an</strong></em> offer).  <strong><em>Learning</em></strong>, supported by data analytics, provides a powerful path to the <em><strong>what</strong></em> and <em><strong>why</strong></em> of automated decisions (send <strong><em>this</em></strong> email to <strong><em>that</em></strong> person<em><strong> because</strong> they are at risk of defecting</em>, route <strong><em>this</em></strong> claim to <strong><em>that</em></strong> underwriter <em><strong>because</strong> it looks suspicious</em>, suggest <strong><em>this</em></strong> product to <em><strong>that</strong></em> customer<em><strong> because</strong> they appear to be buying these types of items</em>).</p>
<p>I’ll start by outlining the high level journey from <em>data</em> to <em>rules</em> and the cyclic nature of that journey.  Data leads to rules, rules beget responses, responses manifest as more data, new data leads to new rules, and so on.  Therefore, the journey does not end with the definition of a set of processes and rules.  <strong><em>This link between updated data and the determination of new processes and rules is the essence of any learning process, providing a key function for the cognitive corporation.</em></strong></p>
<p><span id="more-1156"></span>The following image depicts the overall process of using data analytics to take corporate information and derive new business processes and rules, hence <strong><em>learning</em></strong>.  I will refer to the numbered items throughout this post.</p>
<p style="text-align: center;"><a href="http://monead.com/blog/wp-content/uploads/2011/10/Cognitive-Corporation-BPM-Requires-Data-Analytics.png"><img class="size-medium wp-image-1369 aligncenter" title="The Cognitive Corporation’s™ Process of Using Data Analytics to Derive Business Rule and Process Definitions" src="http://monead.com/blog/wp-content/uploads/2011/10/Cognitive-Corporation-BPM-Requires-Data-Analytics-300x202.png" alt="The Cognitive Corporation’s™ Process of Using Data Analytics to Derive Business Rule and Process Definitions" width="300" height="202" /></a></p>
<p>Data Analytics is a general term and there are many subtleties within such a broad field.  In this case <strong>I’m focused on the machine (computer) learning aspect of analytics</strong>.  I’m skipping past the need to identify key data sources, create canonical definitions, and normalize information.  Overused, yet accurate, the phrase, &#8220;<em><strong>Garbage In, Garbage Out</strong></em>&#8221; applies to data analytics as much as any other aspect of computing.  The relevant data must be effectively organized before proceeding (<em>in the diagram this is depicted as items 1, 2 and 3</em>).</p>
<p><strong>What is machine learning? </strong> It is a way that computers can be used to look at data and find patterns that we don’t realize exist.  There are a variety of algorithms that allow computers to be used for this purpose.  Some are very difficult to understand and some are quite simple.  They each have strengths and weaknesses.  Not all work well for a given type of data or analytics.</p>
<p><!--more-->Therefore, the first step for leveraging data analytics (on clean data, of course) is to understand the computer’s perception of that data at a high-level.  This is known as <strong>data preprocessing</strong> and contains within it tasks that are necessary in order to find actionable information within the data (<em>diagram item 4</em>).  The tasks can be outlined as: Aggregation, Sampling, Dimensionality Reduction, Feature Subset Selection, Feature Creation, Discretization, and Feature Transformation.  I will explore each of these tasks in future postings.</p>
<p>Once the data is understood and we know how to structure it for automated analytics (<em>diagram items 5 and 6</em>), we need to run it through <strong>machine learning algorithms</strong> (<em>diagram item 7</em>).  These are the secret sauce of many data analytics tools.  The algorithms involve a variety of mathematical approaches to identifying relationships within the data.  What isn&#8217;t always easy to understand is why the analytics are identifying certain relationships.  Some algorithms make this harder than others to discern.  As with data preprocessing&#8217;s steps, I&#8217;ll delve into these algorithms in future posts.</p>
<p>Fundamentally, the output from this type of tool is a <strong>predictive model</strong> (<em>diagram item 8</em>) that can take new data and predict the outcomes.  Assuming we have focused on something pertinent to our operations, retaining customers perhaps, then we want the computer to find a way to predict which customers are likely to defect.  If such a model is tested and found to provide accurate results then we are confident that within its black box it “<em>knows</em>” something about our defecting clients that we might be able to use in order to improve retention.</p>
<p>It is the next step, <em><strong>being able to use the model to derive a business rule change</strong></em>, which requires human intervention.  The task is to <strong>root out the underlying cause(s)</strong>, which the computer has found, in a way that we are able to understand (<em>diagram item 9</em>) and, more importantly, <strong>that we may act upon</strong> (<em>diagram item 10</em>).  This is the crux of leveraging data analytics to improve business operations.</p>
<p><strong>To successfully leverage the data analytics findings</strong> (<em>items 9 and 10</em>) a business needs three things. <strong> First, it must have a formal understanding of the processes used within the analytic environment. </strong> Experts must be able to translate the predictive model back to the actual source data that is pertinent to the predictions.</p>
<p><strong>Second, there must be an open and collaborative environment that accepts unanticipated data relationships</strong>, working to define them in business terms – not looking to dispute or discredit them. Often, people who “know the business” cannot accept what the data analytics results are showing.  The findings, often by definition, are at odds with commonly accepted &#8220;truths.&#8221;  This is part of the power of using these tools: they aren&#8217;t beholden to our mental baggage and, by design, they think outside of our business-knowledge-based box.</p>
<p><strong>Third, the business must have an agile environment that can make changes to the underlying processes and rules culled from the model.</strong>  This permits the patterns and causes found by the data analytics tool to manifest as timely action within the business&#8217; operation.  Therefore, there must be business leaders empowered to change business processes and rules, as well as an IT infrastructure that supports implementing rapid rule and process changes (<em>diagram step 11</em>).</p>
<p>The faster an organization can take data analytics results and update business operations (rules and processes), the better its ability to respond to changing business environments.  Such a capability provides a key differentiator for any business.  More and more companies are finding themselves competing in ever more commoditized markets<strong>.  It is an organization&#8217;s unique ability to use automation as a way to quickly personalize and flex to specific situations that allows it to be more productive, more responsive and ultimately more successful.</strong></p>
<p>This process is not simple nor is it completely automatic.  It consistently requires manual effort and tuning.  <em><strong>Machines don&#8217;t do the thinking, they simply provide leaders with actionable information by which conclusions may be drawn and decisions may be made.</strong></em></p>
<p>Some business use cases for data analytics tools are easier to implement than others.  For instance, predictive models around actuarial data lead to scoring rules in a business rule environment fairly directly.  The relationships between buying patterns and social media and their impact on business processes and rules are not as straightforward to map.  The rewards, however, are meaningful and worthwhile as long as the correct data is being used to drive the correct decisions.</p>
<p>If you are exploring data analytics as an input to business rule and process improvements I’d be interested in hearing about your experiences and thoughts.  Also, as I mentioned at the beginning, this post is meant to introduce the high-level concepts.  Future postings will dive deeper into the individual processes, tools and techniques that were touched upon.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1156</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Expanding on “Code Reviews Trump Unit Testing, But They Are Better Together”</title>
		<link>http://monead.com/blog/?p=1334</link>
		<comments>http://monead.com/blog/?p=1334#comments</comments>
		<pubDate>Wed, 19 Oct 2011 01:18:13 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Quality]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[efficient coding]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[Managing IS Projects]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1334</guid>
		<description><![CDATA[Michael Delaney, a senior consulting software engineer at Blue Slate, commented on my previous posting.  As I created a reply I realized that I was expanding on my reasoning and it was becoming a bit long.  So, here is my reply as a follow-up posting.  Also, thank you to Michael for helping me think more [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;"><em>Michael Delaney, a senior consulting software engineer at <a href="http://www.blueslate.net/" target="_new">Blue Slate</a>, commented on my <a href="http://www.blueslate.net/roller/daveread/entry/code_reviews_trump_unit_testing" target="_new">previous posting</a>.  As I created a reply I realized that I was expanding on my reasoning and it was becoming a bit long.  So, here is my reply as a follow-up posting.  Also, thank you to Michael for helping me think more about this topic.<br />
</em></p>
<p>I understand the desire to rely on unit testing and its ability to find issues and prevent regressions.  For TDD, I&#8217;ll need to write separately.  Fundamentally I&#8217;m a believer in white box testing.   Black box approaches, like TDD, seem to be of relatively little value to the overall quality and reliability of the code.  <em>Meaning, <strong>I’d want to invest more effort in white box testing than in black box testing.</strong></em></p>
<p>I&#8217;m somewhat jaded, being concerned with the code&#8217;s security, which to me is strongly correlated with its reliability.  That said, I believe that unit testing is much more constrained as compared to formal reviews.  <strong><em>I’m not suggesting that unit tests be skipped</em></strong>, rather that we understand that <em>unit tests can catch certain types of flaws and that those types are narrow as compared to what formal reviews can identify.</em></p>
<p><span id="more-1334"></span>Here is how I view the constraints around unit testing:</p>
<p><strong>First, unit tests are normally implemented by the same developer who created (or will create) the code. </strong> This is a huge constraint on the <em><strong>brain power (BP)</strong></em> being applied to the quality of the code.  In this case <strong>BP=1</strong>.  If the unit tests are created by a separate individual then <strong>BP&gt;1 and, I argue, BP&lt;2</strong> because the tester is focused on the micro level (the unit level of code).</p>
<p><strong>Second, unit testing is often used to only check that the code does what it is supposed to do.</strong>  This means unit tests don&#8217;t often check whether code does things it isn&#8217;t supposed to.</p>
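<p>To illustrate this second constraint, here is a minimal sketch in Python.  The <code>parse_quantity</code> function and its 1 to 100 limit are hypothetical, invented purely for illustration.  The first test confirms that the code does what it is supposed to do; the second checks that it refuses input it is not supposed to accept, which is the kind of test that tends to be left out.</p>

```python
def parse_quantity(text):
    """Hypothetical function under test: parse an order quantity from 1 to 100."""
    value = int(text)
    if value not in range(1, 101):
        raise ValueError("quantity out of range: %d" % value)
    return value


def test_accepts_valid_quantity():
    # Positive test: the code does what it is supposed to do.
    assert parse_quantity("5") == 5


def test_rejects_invalid_quantities():
    # Negative tests: the code must not accept input it should refuse.
    for bad in ["0", "-3", "101", "abc", ""]:
        try:
            parse_quantity(bad)
        except ValueError:
            continue
        raise AssertionError("accepted invalid input: %r" % bad)


test_accepts_valid_quantity()
test_rejects_invalid_quantities()
```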
<p><strong>Third, unit tests don&#8217;t consider maintainability of the code, or really any macro-level concerns. </strong> Passing unit tests doesn&#8217;t mean that the code is easy to understand or logically organized.  I agree that single-purpose methods and well-written code are easier to test, but bad code can be tested and even meet high coverage requirements.</p>
<p>The reason I believe that the study&#8217;s results (discussed in the <a href="http://www.blueslate.net/roller/daveread/entry/code_reviews_trump_unit_testing" target="_new">previous post</a>) are probably still accurate is that many software fundamentals are no different now than in 1986.  We used unit testing frameworks (sometimes called scaffolds) and leveraged unit tests to prevent regressions.  In some ways we had to be more concerned with the completeness of our test scenarios in 1986 due to the overhead of modifying an application.  Developers today can be somewhat cavalier with code changes since our system environments tend to make distribution of updates easier.</p>
<p><em><strong>I agree that new languages, paradigms and environments have not reduced the need for unit testing,</strong></em> quite the contrary.  We have more options and threats to deal with.  However, we can more quickly make and distribute an updated application than we could in 1986.</p>
<p>Considering the breadth of what can be identified,<strong> <em>formal design and code reviews go beyond simply checking that for some inputs a valid output is calculated. </em></strong> Instead, formal reviews apply <strong>BP=T(RE%)</strong> where <em><strong>T is the team size</strong></em> and <em><strong>RE% is the review effort percentage</strong></em> the team invests.</p>
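<p>As a rough worked example, the Python sketch below reads BP=T(RE%) as the team size multiplied by the review effort expressed as a fraction.  That reading, the function name and the sample numbers are assumptions made for illustration only.</p>

```python
def review_brain_power(team_size, review_effort_pct):
    """BP = T(RE%), read here as team size times the review effort fraction."""
    return team_size * (review_effort_pct / 100.0)


print(review_brain_power(5, 50))  # 2.5
print(review_brain_power(5, 0))   # 0.0
```

<p>Under these assumptions, five reviewers who each invest half of their attention yield BP=2.5, compared with BP=1 for a developer testing their own code.  A team that holds no reviews contributes nothing through this channel.</p>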
<p>These reviews<strong> </strong><em><strong>catch maintenance issues, identify best practices violations</strong></em><strong> </strong>(beyond what a style checker would do),<strong> </strong><em><strong>find opportunities for refactoring</strong></em><strong> </strong>(reducing code, reducing complexity, etc.) and<strong> </strong><em><strong>serve as a way for the team to increase its overall expertise through interactions and discussion. </strong></em> Clearly, this represents a much broader value set than that provided by unit testing.</p>
<p>As I said in the original posting, this thinking isn&#8217;t a defense to skip unit testing in favor of formal reviews.  (I clamor for unit tests and high coverage metrics.)  Instead, my message is a call to understand that you need both (and other types of testing and inspections), realizing that the formal reviews catch a more comprehensive set of issues than unit testing.  Time and again I have seen organizations leveraging unit testing but not formal reviews.  <strong><em>This means that their RE% is zero</em>.  </strong><em><strong>For me this is a poor investment choice.</strong></em></p>
<p>Perhaps it would be worth doing a small scale study to see if our shop has a similar experience?  It might be quite informative for the whole team.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1334</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Code Reviews Trump Unit Testing, But They Are Better Together</title>
		<link>http://monead.com/blog/?p=1291</link>
		<comments>http://monead.com/blog/?p=1291#comments</comments>
		<pubDate>Wed, 12 Oct 2011 03:28:25 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Quality]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[efficient coding]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[Managing IS Projects]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1291</guid>
		<description><![CDATA[Last week I was participating in a formal code review (a.k.a. code inspection) with one of our clients.  We have been working with this client, helping them strengthen their development practices.  Holding formal code reviews is a key component for us.  Part of the formal process we introduced includes reviewing the unit testing results, both [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;">Last week I was participating in a formal code review (a.k.a. code inspection) with one of our clients.  We have been working with this client, helping them strengthen their development practices.  Holding formal code reviews is a key component for us.  Part of the formal process we introduced includes reviewing the unit testing results, both the (successful) output report and the code coverage metrics.</p>
<p>At one point we were reviewing some code that had several error handling blocks that were not being covered in the unit tests.  These blocks were, arguably, unlikely or impossible to reach (such as a Java StringReader throwing an IOException).  There was some discussion by the team about the necessity of mocking enough functionality to cover these blocks.</p>
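<p>For readers who have not run into this situation, the sketch below shows the shape of the problem.  The code under review was Java; this stand-in uses Python&#8217;s <code>io.StringIO</code> and <code>unittest.mock</code>, and the <code>first_line</code> function is hypothetical.  The error handling block cannot be reached with a real in-memory reader, so the only way to cover it is to mock the reader and force the failure.</p>

```python
import io
from unittest import mock


def first_line(text):
    """Return the first line of text, or None if reading fails."""
    reader = io.StringIO(text)
    try:
        return reader.readline().rstrip("\n")
    except OSError:
        # Effectively unreachable for an in-memory reader, much like the
        # StringReader case; only a mock will exercise this block.
        return None


def test_error_block_is_covered():
    # Replace io.StringIO with a mock whose readline() raises OSError.
    with mock.patch("io.StringIO") as fake_reader_class:
        fake_reader_class.return_value.readline.side_effect = OSError("simulated")
        assert first_line("hello\nworld") is None


assert first_line("hello\nworld") == "hello"
test_error_block_is_covered()
```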
<p>Although we agreed that some of the more esoteric error conditions weren’t worth the programmer’s time to mock-up, it occurred to me later that we were missing an important point.  <strong>What mattered was that we were holding a formal code review and looking at those blocks of code.</strong></p>
<p>Let me take a step back.  In 1986, <a href="http://en.wikipedia.org/wiki/Capers_Jones" target="_new">Capers Jones</a> published a book entitled <a href="http://en.wikipedia.org/wiki/Special:BookSources/9780070328112" target="_new"><strong>Programming Productivity</strong></a>.  Although dated, the book contains many excellent points that cause you to think about how to create software in an efficient way.  Here efficiency is <em><strong>not</strong></em> <em>about lines of code per unit of time</em>, but<strong> more importantly</strong>, <em><strong>lines of correct code per unit of time</strong></em>.  This means taking into account rework due to errors and omissions.</p>
<p>One of the studies presented in the book relates to identifying defects in code.  It is a study whose results seem obvious when we think about them.  <strong><em>However, we don’t always align our software development practices to leverage the study’s lessons and maximize our development efficiency. </em></strong> Perhaps we believe that the statistics have changed due to language constructs, experience, tooling and so forth.  We’d need similar studies to the ones presented by Capers Jones in order to prove that, though.</p>
<p>Below are a few of the actions from the book’s study of defect detection approaches.  I’ve skipped the low-end and high-end numbers that Capers includes, simply giving the modes (averages) which are a good basis for comparison:</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2011/10/DefectIdentificationRatesData.png"><img title="Defect Identification Rates Data" src="http://monead.com/blog/wp-content/uploads/2011/10/DefectIdentificationRatesData.png" alt="Defect Identification Rates Data Table" width="285" height="255" align="left" /></a><br />
<a href="http://monead.com/blog/wp-content/uploads/2011/10/DefectIdentificationRatesGraph.png"><img title="Defect Identification Rates Graph" src="http://monead.com/blog/wp-content/uploads/2011/10/DefectIdentificationRatesGraph-300x182.png" alt="Defect Identification Rates Graph" width="300" height="182" align="right" /></a><br clear="all" /><br />
<span id="more-1291"></span><strong>Based on this data, we see that formal reviews of the design and code are much more effective at finding defects than unit testing. </strong> In this case, unit testing is focused on branch coverage.  Obviously, changing the testing expectations, such as applying domain testing concepts, might change the effectiveness. My guess is that such a change would not bring unit testing on par with formal code reviews but would improve its performance, at a cost of productivity – since domain testing takes more rigor to apply.</p>
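<p>The difference between the two testing styles can be sketched with a small Python example.  The <code>shipping_fee</code> function and its business rule are hypothetical and are not drawn from the study.  Two tests are enough to achieve full branch coverage and both pass, yet a boundary defect remains that only the domain-style tests expose.</p>

```python
def shipping_fee(order_total):
    """Hypothetical rule: orders of 50 or more ship free, otherwise the fee is 5."""
    if order_total > 50:  # Bug: the rule says 50 or more, so this should be >=
        return 0
    return 5


def branch_coverage_tests():
    # Two tests execute both branches, giving full branch coverage, and pass.
    return shipping_fee(100) == 0 and shipping_fee(10) == 5


def domain_boundary_tests():
    # Domain testing probes values on and around each input domain boundary.
    return shipping_fee(49) == 5 and shipping_fee(50) == 0 and shipping_fee(51) == 0


print(branch_coverage_tests())   # True: the bug goes unnoticed
print(domain_boundary_tests())   # False: the boundary value 50 exposes it
```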
<p>As Capers points out, these percentages can’t be added together.  <em>In other words, there is an intersection of the defects found by these methods. </em> In his studies these methods found some of the same and some unique defects.  <em><strong>Therefore, this data does not provide a defense to skip unit testing. </strong></em> Instead, <strong>each defect detection method should be used in a complementary fashion.</strong>  The statistical data does inform us regarding level-of-effort decisions.  Also, we can use this information to help define where to cut corners, if necessary.</p>
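<p>A small Python sketch makes the overlap concrete.  The defect IDs and counts below are invented for illustration and are not figures from the book.  Because both methods find some of the same defects, the combined detection rate is the size of the union, not the sum of the individual rates.</p>

```python
# Hypothetical defect IDs found by each method on the same code base.
found_by_unit_tests = {1, 2, 3, 4}
found_by_code_review = {3, 4, 5, 6, 7, 8}

total_defects = 10
combined = found_by_unit_tests | found_by_code_review

naive_sum = (len(found_by_unit_tests) + len(found_by_code_review)) / total_defects
actual = len(combined) / total_defects

print(naive_sum)  # 1.0
print(actual)     # 0.8
print(sorted(found_by_unit_tests - found_by_code_review))  # [1, 2]
```

<p>In this made-up case, adding the rates suggests every defect was found, while the union shows two were missed.  The last line lists the defects that only unit testing caught, which is why neither method should be dropped.</p>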
<p><strong>If we have to reduce the time we spend in the testing phase, we would do better to give up some unit testing in favor of keeping more formal code reviews. </strong> This isn’t always the way we approach such situations.  It is easy to skip the formal reviews, believing that the time would be better spent by developers creating more tests for their code.  Apparently this is not the case.</p>
<p>I think it is valuable to periodically remind ourselves about studies like these as a way to make sure we aren’t falling into bad habits or being misled by incorrect assumptions.  Also, as teams bring on new members, it is good to review our practices, understand the relevant literature and actively discuss why we do what we do (and whether it needs to change).</p>
<p>A parting thought about statistics like the ones quoted here.  They are an average from a very broad array of problem types and development shops.  As <a href="http://en.wikipedia.org/wiki/Boris_Beizer" target="_new">Boris Beizer</a> has pointed out, <em>each shop has a “<strong>bug fingerprint</strong>”. </em> The bug fingerprint is impacted by things like the technologies being used, types of applications being developed and the experience of the team members.</p>
<p>Further, different types of bugs are better identified through different detection methods (branch tests, domain tests, code inspections, etc.).  Therefore, the optimal set of testing approaches for a given group of developers will differ somewhat from the average.  I’ll discuss this in a future post.</p>
<p>What does your team do with regard to testing and inspections?  Are there other techniques you find useful for identifying software defects?  I welcome your comments and feedback.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1291</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Cognitive Corporation™ &#8211; An Introduction</title>
		<link>http://monead.com/blog/?p=1231</link>
		<comments>http://monead.com/blog/?p=1231#comments</comments>
		<pubDate>Mon, 26 Sep 2011 18:56:58 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[BPM]]></category>
		<category><![CDATA[Cognitive Corporation]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[cognitive corporation]]></category>
		<category><![CDATA[enterprise applications]]></category>
		<category><![CDATA[enterprise systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[system integration]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1231</guid>
		<description><![CDATA[Given my role as an enterprise architect, I’ve had the opportunity to work with many different business leaders, each focused on leveraging IT to drive improved efficiencies, lower costs, increase quality, and broaden market share throughout their businesses.  The improvements might involve any subset of data, processes, business rules, infrastructure, software, hardware, etc.  A common [...]]]></description>
				<content:encoded><![CDATA[<p style="padding-top: 14px;">Given my role as an enterprise architect, I’ve had the opportunity to work with many different business leaders, each focused on leveraging IT to drive improved efficiencies, lower costs, increase quality, and broaden market share throughout their businesses.  The improvements might involve any subset of data, processes, business rules, infrastructure, software, hardware, etc.  A common thread is that <em><strong>each project seeks to make the corporation smarter through the use of information technology.</strong></em></p>
<p>As I’ve placed these separate projects into a common context of my own, I’ve concluded that the long term goal of leveraging information technology must be for it to support cognitive processes.  I don’t mean that the computers will think for us, rather that<em><strong> IT solutions must work together to allow a business to learn, corporately.</strong></em></p>
<p>The individual tools that we utilize each play a part.  However, we tend to utilize them in a manner that focuses on isolated and directed operation rather than incorporating them into an overall learning loop.  In other words,<em><strong> we install tools that we direct without asking them to help us find better directions to give.</strong></em></p>
<p>Let me start with a definition: similar to thinking beings, <strong>a <span style="text-decoration: underline;">cognitive corporation</span>™ leverages a feedback loop of information and experiences to inform future processes and rules. </strong> Fundamentally, learning is a process and it involves taking known facts and experiences and combining them to create new hypotheses, which are tested in order to derive new facts, processes and rules.  Unfortunately, we don’t often leverage our enterprise applications in this way.</p>
<p>We have many tools available to us in the enterprise IT realm.  These include database management systems, business process management environments, rule engines, reporting tools, content management applications, data analytics tools, complex event processing environments, enterprise service buses, and ETL tools.  <em>Individually, these components are used to solve specific, predefined issues with the operation of a business.  </em>However, this is not an optimal way to leverage them.</p>
<p>If we consider that these tools mimic aspects of an intelligent being, then we need to leverage them in a fashion that manifests the cognitive capability in preference to simply deploying a point-solution.  This involves thinking about the tools somewhat differently.</p>
<p><span id="more-1231"></span>As I consider the vision of empowering a learning corporation, I break enterprise tools into 6 general areas.  These areas provide a portion of what is required in order to survive and grow in an environment.  <strong>These 6 areas are: Think, Communicate, Act, Know, Sense and Learn.</strong>  They are in no particular order since the absence of any will prevent meaningful growth from occurring.</p>
<p>I’ll be expanding on this concept over time, drilling into each area and the solutions we often use to meet that need.  Depicted below is a very high-level diagram of these areas with the categories of products or functions I place in each.  The categories are broad and I’ve chosen key terms to represent a lot of different tools and techniques.</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2011/09/Cognitive-Corporation-Enterprise-Components2.png"><img class="aligncenter size-medium wp-image-1288" title="Cognitive Corporation Enterprise Components" src="http://monead.com/blog/wp-content/uploads/2011/09/Cognitive-Corporation-Enterprise-Components2-300x220.png" alt="Cognitive Corporation Enterprise Components Depiction" width="300" height="220" /></a></p>
<p>This framework provides an infrastructure, similar to how a person is made up of a set of systems.  However,<em><strong> it is the information that is shared between systems that becomes the key to actually supporting the idea of a thinking and learning system.</strong></em>  There are assumptions we make as we design enterprise systems that will limit our ability to empower a cognitive corporation™.</p>
<p>For instance, we may limit our definition of valuable data to the information collected by our on-line systems.  A broader interpretation begins to consider the process steps followed and the rules executed for a given interaction as data.  There are many places where such assumptions and limitations will interfere with gaining a true advantage from our enterprise solutions.</p>
<p><em><strong>We must change our view of the way that a business benefits through the use of IT.</strong></em>  The focus on feature sets for a specific type of problem must be replaced with a <em>focus on the whole set of systems and how they interrelate and may be leveraged to benefit business growth &#8211; in multiple dimensions</em>.  This will allow us to make an order-of-magnitude jump in the value derived from IT investments.</p>
<p>Therefore, <strong>an enterprise architecture goes beyond the role that systems will play; it must define how the systems will interact within the entire enterprise. </strong> We are then in a position to leverage these powerful tools to create a cognitive corporation™.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1231</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
