<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Dave&#039;s Reflections</title>
	<atom:link href="http://monead.com/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://monead.com/blog</link>
	<description></description>
	<lastBuildDate>Wed, 07 Dec 2011 17:22:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Semantic Technology and Business Conference, East 2011 – Reflections</title>
		<link>http://monead.com/blog/?p=1519</link>
		<comments>http://monead.com/blog/?p=1519#comments</comments>
		<pubDate>Wed, 07 Dec 2011 17:22:13 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[Semantic Technology]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[enterprise systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[system integration]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1519</guid>
		<description><![CDATA[I had the pleasure of attending the Semantic Technology and Business Conference in Washington, DC last week.  I have a strong interest in semantic technology and its capabilities to enhance the way in which we leverage information systems.  There was a good selection of topics discussed by people with a variety of  backgrounds working in [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;">I had the pleasure of attending the Semantic Technology and Business Conference in Washington, DC last week.  I have a strong interest in semantic technology and its capabilities to enhance the way in which we leverage information systems.  There was a good selection of topics discussed by people with a variety of  backgrounds working in different verticals.</p>
<p>To begin the conference I attended the half-day <strong>“Ontology 101” presented by Elisa Kendall and Deborah McGuinness</strong>.  They indicated that this presentation has been given at each semantic technology conference and that interest is still strong, which implies that new people continue to want to understand this art.</p>
<p>Their material was very useful and if you are someone looking to get a grounding in ontologies (<em>what are they?  how do you go about creating them?</em>) I recommend attending this session the next time it is offered.  Both leaders clearly have deep experience and expertise in this field.  Also, the discussion was not tied to a technology (e.g. RDF) so it was applicable regardless of underlying implementation details.</p>
<p>I wrapped up the first day with <strong>Richard Ordowich who discussed the process of reverse engineering semantics (meaning) from legacy data</strong>.  The goal of such projects is to harmonize the meaning of data across the enterprise.</p>
<p>A point he stressed was that <em><strong>a business really needs to be ready to start such a journey</strong></em>.  This type of work is very hard and very time consuming.  It requires an enterprise-wide discipline.  He suggests that before working with a company on such an initiative one should ask for examples of prior enterprise program success (e.g. something like BPM, SDLC).</p>
<p>Fundamentally, a project that seeks to harmonize the meaning of data across an enterprise requires organizational readiness to go beyond project execution.  The enterprise must put effective governance in place to operate and maintain the resulting ontologies, taxonomies and metadata.</p>
<p><strong>The full conference kicked off the following day. </strong> One aspect that jumped out for me was that <em><strong>a lot of the presentations dealt with government-related projects. </strong></em> This could have been a side-effect of the conference being held in Washington, DC but I think it is more indicative that spending on this technology is weighted more heavily toward the public sector than toward private industry.</p>
<p>Because these projects were government-centric, I found any claims of “value” suspect.  A project can be valuable, or show value, without being cost effective.  Commercial businesses have gone bankrupt even though they delivered value to their customers.  More exposure of positive-ROI commercial projects will be important to help accelerate the adoption of these technologies.</p>
<p>Other than the financial aspect, <em><strong>the presentations were incredibly valuable in terms of presenting lessons learned, best practices and in-depth tool discussions.</strong></em>  I’ll highlight a few of the sessions and key thoughts that I believe will assist as we continue to apply semantic technology to business system challenges.</p>
<p><strong><span id="more-1519"></span>Marcel Jemio from the Department of the Treasury</strong> discussed a massive project seeking to apply semantics to their siloed transactional data.  His focus was to present his lessons learned as well as a possible blue print for similar projects.  He did a great job on both fronts.  A focus he suggested was that <strong><em>such projects should not be concerned with classifying data; rather they should seek to classify knowledge. </em></strong> It is knowledge that is actionable and provides meaning and relevance.</p>
<p><em>Some other important points he made included the need for business data governance, standardized data (vocabulary, semantics) and a shared information repository</em>.  Beyond his excellent sharing of experiences, his slides included recommendations for how to present information to business stakeholders so as to inform and educate.</p>
<p><strong>Richard Green’s presentation looked at how to go about creating a practical ontology for an enterprise. </strong> He dove into detail, showing some of the structured approaches that he uses to gather the requirements that feed an ontology design.  A key point for Richard was that such an ontology must be practical (delivering value, being easy to use), be a genuine ontology (describing the things that exist and their relationships) and work for a specific enterprise or constituent.</p>
<p>I thought his approach was very logical and included useful guidance for thinking through the process.  Sharing his documentation templates helped to clarify how the approach works and specifically how his “5 w’s” are applied to different situations.  I intend to integrate some of his approach into my practices.</p>
<p>Another presentation that contained specific guidance was <strong>Janet Millenson’s ROI of RDF discussion</strong>.  She outlined some of the key drivers that would lead an organization to apply RDF-based technology as a more efficient way to approach a business challenge.  She went beyond simple ROI and also discussed the underlying business drivers, such as reducing costs and supporting the organization’s mission.</p>
<p>Her approach to making the business case was standard procedure for introducing large scale change to an organization&#8217;s leadership.  Like Marcel, she cautioned against the use of technology terms in these types of discussions.  The technology is irrelevant to business leadership; the results are what matter.  Based on her experience she shared the importance of setting realistic expectations and creating the right team in order to succeed.</p>
<p>When discussing actual project planning and execution she gave a good overview of some major aspects that must be thought through and managed in order to succeed.  These include people with their skills as well as technologies leveraged and the business domain being addressed.  Considering technology issues (data quality, governance), organizational concerns (build/buy, vendor relationships) and market risk (vendor consolidation, standards evolution) is vital in order to create the most effective result.</p>
<p><em>Her bottom line was that a successful project will address a specific problem in the organization (not boil the ocean), understand and work within constraints, start with realistic expectations and plan flexibility into the result to deal with technology and business changes.</em></p>
<p><strong>David Booth (presenting on behalf of Jurgen Angele) walked through two healthcare industry case studies</strong> where semantic technology was used to advance the effectiveness and safety of care while reducing costs.  The initial case was from the Cleveland Clinic.</p>
<p>The usual starting point of having multiple stovepipe systems that contained related data was presented.  The total data set, if combined in a meaningful way, would create a much more useful information source for doctors and researchers. Since semantic technology simplifies the integration of separate data sources, it was an obvious choice for this project.</p>
<p>David described the high-level process that was used to bring the data together.  The basic subsumption capability of inferencing is a simple yet powerful way to quickly combine data.  He quoted Jim Hendler, <em><strong>“A little inferencing goes a long way.” </strong></em> The presentation described the pipeline process that was applied and which makes a lot of sense as a standard approach to dealing with such integration projects.</p>
<p>The second case study revolved around PanGenX, which seeks to use human genome information to understand which drugs will work for which individuals.  The overall data integration process was very similar to the first case study.  One point that was made, which resonated with me, was the statement that <em><strong>SPARQL is a convenient rule language. </strong></em> This comment was in the context of simple rules, obviously not inferencing.  However if simple rules are being created, having them in the same syntax as the queries simplifies the process.</p>
<p><strong>Neil Raden’s presentation,</strong> albeit abbreviated, was thought provoking.  Neil’s direct manner, devoid of pretense, allowed him to actually cover a lot of information in a very short time.  His take on the overhyped and overly complex processes that represent current BI practices, such as large MDM projects and monolithic data warehouses, was that these are not where businesses should be spending their BI dollars.</p>
<p>He dropped a lot of information in one-liners, some of which related to data in general more than to semantics.  I found his presentation a call to action to <em><strong>leverage semantic technology in support of a new data paradigm, not a new way to implement existing paradigms. </strong></em> I don’t know if that was a point he intended to make but it seemed like a logical extension of what he was saying.</p>
<p>The conference wrapped up with a <strong>panel discussion between David Booth, Elisa Kendall, David McComb and Dennis Wisnosky. </strong> They discussed some of the changes they have seen with semantic technology including the fact that project implementation seems to be getting easier as tools mature and the semantic technology ecosystem gets fleshed out.  They are also seeing people focusing on more pragmatic and meaningful projects, going beyond simply playing with the technology.</p>
<p>There was some discussion about the continuing hype associated with this area and the need to not fall prey to that.  The work that has been done represents hard work and collegial sharing.  <em>The solutions are neither turnkey nor ready to simply buy and install.</em></p>
<p>There continues to be a focus on unstructured data which is good since so much corporate information is sitting idle in documents rather than databases.</p>
<p><em><strong>Some specific guidance for getting started with semantic technology:</strong></em> prove it to yourself with a POC.  You can’t discuss it more broadly if you don’t believe in it.  Choose an initiative and give it a name so that people recognize it.  Plan to deliver value on a regular (no less than every 6 months) basis.  Start small and grow from there.</p>
<p>Controlling scope, as in all projects, is central to working with new technology in order to get to a success point and begin acquiring lessons learned, which will allow for future success.  The guidance from “The Mythical Man-Month” is still relevant: <em>you always throw the first one away.</em></p>
<p><em><strong>A theme that was heard in several sessions and repeated in the panel discussion was that use cases must be leveraged</strong></em> when beginning a semantic technology project.  In reality that has nothing to do with semantics.  I leverage use cases (or user stories) in every business system project.  The fact that they are required for semantic technology projects simply points out that these projects are still focused on business success and must be driven by business needs.</p>
<p><em><strong>On the technology side there were several vendors present</strong></em> demonstrating their tools and providing sessions that explored their case studies.   The capabilities of these tools continue to advance quickly.  I saw several that I will be trying out including ontology editors, inference engines and integration tools.</p>
<p>It is great to see the tool landscape continuing to grow in breadth and depth.  It is also nice to see a mix of commercial and open source alternatives, allowing a company to start a small POC without a large capital investment and then having commercial alternatives based on needs.</p>
<p><strong>Overall the conference presented useful experiential information through case studies and lessons learned. </strong> More commercial experience needs to be shared.  Also, a way needs to be found to make the output and advancement that the working groups are driving more visible and understandable.</p>
<p>In my case I am not looking to build a triple store or an OWL interpreter; those types of standards are not interesting to me.  I need more basic information.  Some of this exists but may not be as visible as it could be.  There was discussion around the importance of raising the visibility of things like CIO-level information and developer (of semantic technology-based applications, not the semantic technology itself) documentation.</p>
<p><em><strong>I continue to be convinced that semantic technology will fundamentally change the way we work with data and system integrations. </strong></em> It is simply a matter of time to get there.  Practitioners and thought leaders have an important role to play in order to get this vision out into the mainstream and keep it in the forefront of enterprise system conversations.</p>
<p>As always, if you have thoughts you’d like to share about a topic on my blog, feel free to add a comment.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1519</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Using ARQoid for Android-based SPARQL Query Execution</title>
		<link>http://monead.com/blog/?p=1420</link>
		<comments>http://monead.com/blog/?p=1420#comments</comments>
		<pubDate>Thu, 01 Dec 2011 18:22:39 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1420</guid>
		<description><![CDATA[I was recently asked about the SPARQL support in Sparql Droid and whether it could serve as a way for other Android applications to execute SPARQL queries against remote data sources.  It could be used in this way but there is a simpler alternative I&#8217;d like to discuss here. On the Android platform it is [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;">I was recently asked about the SPARQL support in <a href="http://monead.com/blog/?p=1028" target="_new">Sparql Droid</a> and whether it could serve as a way for other Android applications to execute SPARQL queries against remote data sources.  It could be used in this way but there is a simpler alternative I&#8217;d like to discuss here.</p>
<p>On the Android platform it is actually quite easy to execute SPARQL against remote SPARQL endpoints, RDF data and local models.  The heavy lifting is handled by <a href="http://code.google.com/p/androjena/" target="_new">Androjena’s</a> <a href="http://code.google.com/p/androjena/wiki/ARQoid" target="_new">ARQoid</a>, an Android-centric port of HP’s Jena ARQ engine.</p>
<p>Both engines (the original and the port) do a great job of simplifying the execution of SPARQL queries and consumption of the resulting data.  In this post I’ll go through a simple example of using ARQoid.  Note that all the <a href="http://monead.com/semantic/AndrojenaDemo.zip">code being shown here is available for download</a>.  This post is based specifically on the <em>queryRemoteSparqlEndpoint()</em> method in the <em>com.monead.androjena.demo.arqoid.SparqlExamples</em> class.</p>
<h2>Setup</h2>
<p>To begin, some environment setup needs to be done in order to have a properly configured Android project ready to use ARQoid.</p>
<p>First, obtain the ARQoid JAR and its dependencies.  This is easily accomplished using the <a href="http://code.google.com/p/androjena/downloads/list" target="_new">download</a> page on the <a href="http://code.google.com/p/androjena/wiki/ARQoid" target="_new">ARQoid Wiki</a> and obtaining the latest ARQoid ZIP file.  Unzip the downloaded archive.   Since I’m discussing an Android application I’d expect that you would have created an Android project and that it contains a libs directory where the JAR files should be placed.</p>
<p>Second, add the JAR files to the classpath for your Android project.  I use the ADT plugin for Eclipse to do Android development.  So to add the JARs to my project I choose the <strong>Project</strong> menu item, select <strong>Properties</strong>, choose <strong>Build Path</strong>, select the <strong>Libraries</strong> tab, click the <strong>Add JARs…</strong> button, <strong>navigate to the libs</strong> directory, <strong>select the JAR files</strong> and click <strong>OK</strong> on the open dialogs.</p>
<p>Third, set up a minimal Android project.  The default layout, with a small change to its definition, will work fine.</p>
<h2>Overview</h2>
<p><em><strong>Now we are ready to write the code that uses ARQoid to access some data.</strong></em>  For this first blog entry I’ll focus on a trivial query against a SPARQL endpoint.  There would be some small differences if we wanted to query a local model or a remote data set.  Those will be covered in follow-on entries.</p>
<p><strong>Here is a list of the ARQoid classes we will be using for this initial example:</strong></p>
<ul>
<li><strong>com.hp.hpl.jena.query.Query</strong> – represents the query being executed</li>
<li><strong>com.hp.hpl.jena.query.Syntax</strong> – represents the query syntaxes supported by ARQoid</li>
<li><strong>com.hp.hpl.jena.query.QueryFactory</strong> – creates a <em>Query</em> instance based on supplied parameters such as the query string and syntax definition</li>
<li><strong>com.hp.hpl.jena.query.QueryExecution</strong> – provides the service to  execute the query</li>
<li><strong>com.hp.hpl.jena.query.QueryExecutionFactory</strong> – creates a <em>QueryExecution</em> instance based on supplied parameters such as a <em>Query</em> instance and SPARQL endpoint URI</li>
<li><strong>com.hp.hpl.jena.query.ResultSet</strong> – represents the returned data and metadata associated with the executed query</li>
<li><strong>com.hp.hpl.jena.query.QuerySolution</strong> – represents one row of data within the <em>ResultSet</em>.</li>
</ul>
<p>We’ll use these classes to execute a simple SPARQL query that retrieves some data associated with space exploration.  <a href="http://www.talis.com/" target="_new">Talis</a> provides an endpoint that we can use to access some interesting space exploration data.  The endpoint is located at <a href="http://api.talis.com/stores/space/services/sparql" target="_new">http://api.talis.com/stores/space/services/sparql</a>.<br />
<strong>The query we will execute is:</strong></p>
<pre>SELECT ?dataType ?data
WHERE {
  &lt;http://nasa.dataincubator.org/launch/1961-012&gt; ?dataType ?data.
}</pre>
<p>This query will give us a little information about Vostok 1 launched by the USSR in 1961.</p>
<h2><span id="more-1420"></span>Create the Query instance</h2>
<p>We begin by creating the <em>Query</em> instance using the <em>QueryFactory</em>.</p>
<pre>// Create a Query instance
Query query = QueryFactory.create(queryString, Syntax.syntaxARQ);</pre>
<p>This code assumes that the query given earlier has been assigned to the String variable <em>queryString</em>.</p>
<h2>Create the QueryExecution instance</h2>
<p>We next create a <em>QueryExecution</em> using the <em>QueryExecutionFactory</em>.</p>
<pre>// This query uses an external SPARQL endpoint for processing
// This is the syntax for that type of query
QueryExecution qe = QueryExecutionFactory.sparqlService(sparqlEndpointUri, query);</pre>
<p>This code assumes that the Talis endpoint mentioned above has been assigned to the String variable <em>sparqlEndpointUri</em>.</p>
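<p>For readers curious about what travels over the wire, here is a minimal sketch of the GET request form described by the SPARQL protocol: the endpoint URI followed by a URL-encoded <em>query</em> parameter.  It uses only the standard Java library and is an illustration rather than ARQoid's actual implementation; the class name <em>SparqlRequestUrl</em> and the method <em>buildQueryUrl()</em> are hypothetical names chosen for this example.</p>

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SparqlRequestUrl {
    // Hypothetical helper: builds the GET form of a SPARQL protocol request,
    // i.e. the endpoint URI plus a URL-encoded "query" parameter.
    static String buildQueryUrl(String sparqlEndpointUri, String queryString) {
        try {
            return sparqlEndpointUri + "?query=" + URLEncoder.encode(queryString, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sparqlEndpointUri = "http://api.talis.com/stores/space/services/sparql";
        String queryString =
            "SELECT ?dataType ?data\n"
            + "WHERE {\n"
            + "  <http://nasa.dataincubator.org/launch/1961-012> ?dataType ?data.\n"
            + "}";
        // Print the request URL for the query used in this post
        System.out.println(buildQueryUrl(sparqlEndpointUri, queryString));
    }
}
```

<p>Seeing the encoded URL can help when debugging, since the same URL can be pasted into a browser to check whether the endpoint itself is responding.</p>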
<h2>Execute the query and obtain the ResultSet instance</h2>
<p>We are now ready to actually execute the query and obtain the <em>ResultSet</em> instance.</p>
<pre>// Execute the query and obtain results
ResultSet resultSet = qe.execSelect();</pre>
<p>The resultSet variable now provides us access to the query results.</p>
<h2>Retrieve the column names</h2>
<p>A useful piece of metadata in the <em>ResultSet</em> instance is the list of column names.  These will be based on the information requested in the <em>SELECT</em> clause.  ARQoid can return them as a List&lt;String&gt;.</p>
<pre>// Get the column names (the aliases supplied in the SELECT clause)
List&lt;String&gt; columnNames = resultSet.getResultVars();</pre>
<p>The columnNames List will contain the aliases given in the SELECT clause.</p>
<h2>Iterate through the resulting rows</h2>
<p>We can now iterate through the resulting rows, asking for each row’s data which is represented as a <em>QuerySolution</em> instance.</p>
<pre>// Iterate through all resulting rows
while (resultSet.hasNext()) {
  // Get the next result row
  QuerySolution solution = resultSet.next();</pre>
<p>The solution variable will contain the current result row&#8217;s data.</p>
<h2>Obtain the data for a row and column</h2>
<p>To actually access the data you can request a specific column name from the <em>QuerySolution</em> instance.  However, you need to know whether the data is null, a literal value or a URI.  The following code performs the necessary tests and then prints the data to the standard output (in the downloadable code it will be presented on the Android device&#8217;s screen).</p>
<pre>// Examine each column of the current row
for (String var : columnNames) {
  // Data value will be null if optional and not present
  if (solution.get(var) == null) {
    System.out.println("{null}");
  // Test whether the returned value is a literal value
  } else if (solution.get(var).isLiteral()) {
    System.out.println(solution.getLiteral(var).toString());
  // Otherwise the returned value is a URI
  } else {
    System.out.println(solution.getResource(var).getURI());
  }
} // end of the loop over columns
} // closes the while loop opened in the previous step</pre>
<p>From the code above, you can see that in order to access the basic data value you use the <strong>get(String)</strong> method which expects the column name to be passed.  This will return <em>null</em> if there is no data associated with this row and column.  If there is a value, the method <strong>isLiteral()</strong> may be called to test whether the data is a literal.  If it is not, then it will likely be a URI.  The URI can be accessed by calling the <strong>getResource(String)</strong> method, passing the column name, and calling the <strong>getURI()</strong> method on that value.</p>
<h2>Close the QueryExecution instance</h2>
<p>The last step is important.  The <em>QueryExecution</em> instance should be cleaned up by calling its close() method.</p>
<pre>// Important - free up resources used running the query
qe.close();</pre>
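<p>Because the query can fail part way through (an unreachable endpoint, for example), one common way to guarantee the cleanup is to place the <em>close()</em> call in a <em>finally</em> block.  The sketch below shows the pattern using a hypothetical stand-in class, <em>StubExecution</em>, so that it runs without ARQoid on the classpath; in a real application the same <em>try/finally</em> shape would wrap the <em>QueryExecution</em> instance.</p>

```java
public class CloseInFinally {
    // Hypothetical stand-in for QueryExecution: it only records whether close() ran
    static class StubExecution {
        boolean closed = false;

        String execSelect(boolean fail) {
            if (fail) {
                throw new IllegalStateException("endpoint unreachable");
            }
            return "results";
        }

        void close() {
            closed = true;
        }
    }

    // Run the query and always release the execution, even when execSelect() throws
    static String runAndClose(StubExecution qe, boolean fail) {
        try {
            return qe.execSelect(fail);
        } finally {
            qe.close();
        }
    }

    public static void main(String[] args) {
        StubExecution ok = new StubExecution();
        System.out.println(runAndClose(ok, false) + ", closed=" + ok.closed);

        StubExecution failing = new StubExecution();
        try {
            runAndClose(failing, true);
        } catch (IllegalStateException e) {
            // The exception still propagates, but the resources were released first
            System.out.println("failed, closed=" + failing.closed);
        }
    }
}
```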
<p>These are the basic steps you need to carry out in order to execute a SPARQL query against a SPARQL endpoint using ARQoid.  Here is a screen shot of the Android emulator executing the example just covered.</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2011/12/ARQoid.example.screenshot.png"><img class="alignnone" title="ARQoid demo example screenshot" src="http://monead.com/blog/wp-content/uploads/2011/12/ARQoid.example.screenshot-300x212.png" alt="ARQoid demo running in the Android emulator example screenshot" width="300" height="212" align="left" /></a></p>
<h2>Notes</h2>
<p>A few other items for completeness.  First, <em><strong>remember to add the INTERNET permission to your Android manifest</strong></em> (&lt;uses-permission android:name=&quot;android.permission.INTERNET&quot;/&gt;).  Failure to do so will lead to an ARQoid failure when it tries to access the remote endpoint.  The stack trace will indicate a failure to access the URI – it won’t mention that there is a permission issue.</p>
<p>Also, depending on your query, you may be trying to access a lot of data which could be time consuming and also could cause issues with small memory devices.  <em><strong>You may limit the number of rows returned and set a starting row for the results. </strong></em> This allows you to create a sliding window in your application by only pulling a few results and then allowing the user to ask for more.  These values are set on the <em>Query</em> instance.  Once you have the <em>Query</em> from the <em>QueryFactory</em> you can use these methods.</p>
<pre>// Limit the number of results returned
// Setting the limit is optional - default is unlimited
query.setLimit(10);

// Set the number of leading results to skip
// Setting the offset is optional - default is no offset (nothing skipped)
query.setOffset(10);</pre>
<p>The limit and offset given above would cause ARQoid to return up to 10 records, starting with the 11th one found. If there were 10 or fewer results then no records would be returned.</p>
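<p>The sliding window idea can be sketched with a little arithmetic.  In SPARQL, OFFSET is the number of solutions to skip, so the offset for a window is the zero-based window number multiplied by the window size.  The class <em>SparqlPaging</em> and its methods are hypothetical names for this illustration, and the sketch assumes the query text does not already contain LIMIT or OFFSET clauses.  The computed values could equally be passed to <em>setLimit()</em> and <em>setOffset()</em> on the <em>Query</em> instance.</p>

```java
public class SparqlPaging {
    // Number of solutions to skip so that the given zero-based page is returned
    static long offsetForPage(int pageIndex, int pageSize) {
        return (long) pageIndex * pageSize;
    }

    // Append LIMIT and OFFSET clauses to a SELECT query that has neither
    static String pageOf(String selectQuery, int pageIndex, int pageSize) {
        return selectQuery
            + "\nLIMIT " + pageSize
            + "\nOFFSET " + offsetForPage(pageIndex, pageSize);
    }

    public static void main(String[] args) {
        String base = "SELECT ?dataType ?data WHERE { ?s ?dataType ?data }";
        // First window: results 1 through 10
        System.out.println(pageOf(base, 0, 10));
        // Second window: results 11 through 20
        System.out.println(pageOf(base, 1, 10));
    }
}
```

<p>Note that without an ORDER BY clause an endpoint is not required to return solutions in the same order on each request, so a stable sort order is worth adding if the windows must not overlap.</p>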
<h2>Conclusion</h2>
<p>Hopefully if you are trying to use ARQoid this will give you a quick template to leverage the basic features of the engine.  In the future I’ll expand on this by adding other data sources as well as using the URIs returned by the query to create a richer result for the user.</p>
<p>Remember that you may <a href="http://monead.com/semantic/AndrojenaDemo.zip">download the sample code</a> if you want to see the working Android demonstration application described here.  Also, you can <a href="https://market.android.com/details?id=com.monead.semantic.android.sparql" target="_new">install Sparql Droid</a> which contains a variety of sample SPARQL queries that use the local model, SPARQL endpoints, RDF data sources as well as demonstrating federated queries.</p>
<p>If you have questions or comments about this topic please add them to this post or send them to me via my contact page.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1420</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The Cognitive Corporation™ &#8211; Effective BPM Requires Data Analytics</title>
		<link>http://monead.com/blog/?p=1156</link>
		<comments>http://monead.com/blog/?p=1156#comments</comments>
		<pubDate>Tue, 25 Oct 2011 18:35:12 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[BPM]]></category>
		<category><![CDATA[Business Processes]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Cognitive Corporation]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[Data Analytics]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[business intelligence]]></category>
		<category><![CDATA[business rules]]></category>
		<category><![CDATA[cognitive corporation]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data analytics]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[machine learning]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1156</guid>
		<description><![CDATA[The Cognitive Corporation™ is a framework introduced in an earlier posting.  The framework is meant to outline a set of general capabilities that work together in order to support a growing and thinking organization.  For this post I will drill into one of the least mature of those capabilities in terms of enterprise solution adoption [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;"><a href="http://monead.com/blog/?p=1231">The Cognitive Corporation</a><strong>™</strong> is a framework introduced in <a href="http://monead.com/blog/?p=1231">an earlier posting</a>.  The framework is meant to outline a set of general capabilities that work together in order to support a growing and thinking organization.  <strong>For this post I will drill into one of the least mature of those capabilities in terms of enterprise solution adoption &#8211; <em>Learn</em>.</strong></p>
<p>Business rules, decision engines, BPM, complex event processing (CEP): these all evoke images of computers making speedy decisions to the benefit of our businesses.  The infrastructure, technologies and software that provide these solutions (SOA, XML schemas, rule engines, workflow engines, etc.) support the decision automation process.<strong><em>  However, they don’t know what decisions to make.</em></strong></p>
<p>The BPM-related components we acquire provide the <strong><em>how</em></strong> of decision making (send <em><strong>an</strong></em> email, route <em><strong>a</strong></em> claim, suggest <em><strong>an</strong></em> offer).  <strong><em>Learning</em></strong>, supported by data analytics, provides a powerful path to the <em><strong>what</strong></em> and <em><strong>why</strong></em> of automated decisions (send <strong><em>this</em></strong> email to <strong><em>that</em></strong> person<em><strong> because</strong> they are at risk of defecting</em>, route <strong><em>this</em></strong> claim to <strong><em>that</em></strong> underwriter <em><strong>because</strong> it looks suspicious</em>, suggest <strong><em>this</em></strong> product to <em><strong>that</strong></em> customer<em><strong> because</strong> they appear to be buying these types of items</em>).</p>
<p>I’ll start by outlining the high level journey from <em>data</em> to <em>rules</em> and the cyclic nature of that journey.  Data leads to rules, rules beget responses, responses manifest as more data, new data leads to new rules, and so on.  Therefore, the journey does not end with the definition of a set of processes and rules.  <strong><em>This link between updated data and the determination of new processes and rules is the essence of any learning process, providing a key function for the cognitive corporation.</em></strong></p>
<p><span id="more-1156"></span>The following image depicts the overall process of using data analytics to take corporate information and derive new business processes and rules, hence <strong><em>learning</em></strong>.  I will refer to the numbered items throughout this post.</p>
<p style="text-align: center;"><a href="http://monead.com/blog/wp-content/uploads/2011/10/Cognitive-Corporation-BPM-Requires-Data-Analytics.png"><img class="size-medium wp-image-1369 aligncenter" title="The Cognitive Corporation’s™ Process of Using Data Analytics to Derive Business Rule and Process Definitions" src="http://monead.com/blog/wp-content/uploads/2011/10/Cognitive-Corporation-BPM-Requires-Data-Analytics-300x202.png" alt="The Cognitive Corporation’s™ Process of Using Data Analytics to Derive Business Rule and Process Definitions" width="300" height="202" /></a></p>
<p>Data Analytics is a general term and there are many subtleties within such a broad field.  In this case <strong>I’m focused on the machine (computer) learning aspect of analytics</strong>.  I’m skipping past the need to identify key data sources, create canonical definitions, and normalize information.  Overused, yet accurate, the phrase, &#8220;<em><strong>Garbage In, Garbage Out</strong></em>&#8221; applies to data analytics as much as any other aspect of computing.  The relevant data must be effectively organized before proceeding (<em>in the diagram this is depicted as items 1, 2 and 3</em>).</p>
<p><strong>What is machine learning? </strong> It is a way that computers can be used to look at data and find patterns that we don’t realize exist.  There are a variety of algorithms that allow computers to be used for this purpose.  Some are very difficult to understand and some are quite simple.  They each have strengths and weaknesses.  Not all work well for a given type of data or analytics.</p>
<p><!--more-->Therefore, the first step for leveraging data analytics (on clean data, of course) is to understand the computer’s perception of that data at a high-level.  This is known as <strong>data preprocessing</strong> and contains within it tasks that are necessary in order to find actionable information within the data (<em>diagram item 4</em>).  The tasks can be outlined as: Aggregation, Sampling, Dimensionality Reduction, Feature Subset Selection, Feature Creation, Discretization, and Feature Transformation.  I will explore each of these tasks in future postings.</p>
<p>Once the data is understood and we know how to structure it for automated analytics (<em>diagram items 5 and 6</em>), we need to run it through <strong>machine learning algorithms</strong> (<em>diagram item 7</em>).  These are the secret sauce of many data analytics tools.  The algorithms involve a variety of mathematical approaches to identifying relationships within the data.  What isn&#8217;t always easy to understand is why the analytics are identifying certain relationships.  Some algorithms make this harder than others to discern.  As with data preprocessing&#8217;s steps, I&#8217;ll delve into these algorithms in future posts.</p>
<p>Fundamentally, the output from this type of tool is a <strong>predictive model</strong> (<em>diagram item 8</em>) that can take new data and predict the outcomes.  Assuming we have focused on something pertinent to our operations, retaining customers perhaps, then we want the computer to find a way to predict which customers are likely to defect.  If such a model is tested and found to provide accurate results then we are confident that within its black box it “<em>knows</em>” something about our defecting clients that we might be able to use in order to improve retention.</p>
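<p>To make the idea of a predictive model concrete, here is a minimal sketch of about the simplest one imaginable: a single learned cutoff.  The data, the field names and the functions are all hypothetical, invented purely for illustration; real analytics tools apply far richer algorithms to far more data.</p>

```python
# Hypothetical history: (months since last purchase, did the customer defect?)
history = [
    (1, False), (2, False), (3, False), (4, False),
    (5, True), (6, False), (7, True), (9, True), (12, True),
]


def learn_cutoff(rows):
    """Find the cutoff that misclassifies the fewest customers when we
    predict 'will defect' for every value at or above the cutoff."""
    best_cutoff, best_errors = None, len(rows) + 1
    for cutoff in sorted({months for months, _ in rows}):
        errors = sum((months >= cutoff) != defected for months, defected in rows)
        if best_errors > errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff, best_errors


# "Training" the model amounts to choosing the cutoff from past data.
cutoff, errors = learn_cutoff(history)


def predict_defection(months_since_purchase):
    """Apply the learned model to a new customer."""
    return months_since_purchase >= cutoff


print("Learned cutoff (months):", cutoff, "with training errors:", errors)
```

<p>A model this small is easy to read back as a candidate business rule.  Most real models are not, which is why translating them into something actionable takes human effort.</p>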
<p>It is the next step, <em><strong>being able to use the model to derive a business rule change</strong></em>, which requires human intervention.  The task is to <strong>root out the underlying cause(s)</strong>, which the computer has found, in a way that we are able to understand (<em>diagram item 9</em>) and, more importantly, <strong>that we may act upon</strong> (<em>diagram item 10</em>).  This is the crux of leveraging data analytics to improve business operations.</p>
<p><strong>To successfully leverage the data analytics findings</strong> (<em>items 9 and 10</em>) a business needs three things. <strong> First, it must have a formal understanding of the processes used within the analytic environment. </strong> Experts must be able to translate the predictive model back to the actual source data that is pertinent to the predictions.</p>
<p><strong>Second, there must be an open and collaborative environment that accepts unanticipated data relationships</strong>, working to define them in business terms – not looking to dispute or discredit them. Often, people who “know the business” cannot accept what the data analytics results are showing.  The findings, often by definition, are at odds with commonly accepted &#8220;truths.&#8221;  This is part of the power of using these tools: they aren&#8217;t beholden to our mental baggage and, by design, they think outside of our business-knowledge-based box.</p>
<p><strong>Third, the business must have an agile environment that can make changes to the underlying processes and rules culled from the model.</strong>  This permits the patterns and causes found by the data analytics tool to manifest as timely action within the business&#8217; operation.  Therefore, there must be business leaders empowered to change business processes and rules, as well as an IT infrastructure that supports implementing rapid rule and process changes (<em>diagram step 11</em>).</p>
<p>The faster an organization can take data analytics results and update business operations (rules and processes), the better its ability to respond to changing business environments.  Such a capability provides a key differentiator for any business.  More and more companies are finding themselves competing in ever more commoditized markets<strong>.  It is an organization&#8217;s unique ability to use automation as a way to quickly personalize and flex to specific situations that allows it to be more productive, more responsive and ultimately more successful.</strong></p>
<p>This process is neither simple nor completely automatic.  It consistently requires manual effort and tuning.  <em><strong>Machines don&#8217;t do the thinking; they simply provide leaders with actionable information by which conclusions may be drawn and decisions may be made.</strong></em></p>
<p>Some business use cases for data analytics tools are easier to implement than others.  For instance, predictive models around actuarial data lead to scoring rules in a business rule environment fairly directly.  The relationships between buying patterns and social media and their impact on business processes and rules are not as straightforward to map.  The rewards, however, are meaningful and worthwhile as long as the correct data is being used to drive the correct decisions.</p>
<p>If you are exploring data analytics as an input to business rule and process improvements I’d be interested in hearing about your experiences and thoughts.  Also, as I mentioned at the beginning, this post is meant to introduce the high-level concepts.  Future postings will dive deeper into the individual processes, tools and techniques that were touched upon.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1156</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Expanding on “Code Reviews Trump Unit Testing, But They Are Better Together”</title>
		<link>http://monead.com/blog/?p=1334</link>
		<comments>http://monead.com/blog/?p=1334#comments</comments>
		<pubDate>Wed, 19 Oct 2011 01:18:13 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Quality]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[efficient coding]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[Managing IS Projects]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1334</guid>
		<description><![CDATA[Michael Delaney, a senior consulting software engineer at Blue Slate, commented on my previous posting.  As I created a reply I realized that I was expanding on my reasoning and it was becoming a bit long.  So, here is my reply as a follow-up posting.  Also, thank you to Michael for helping me think more [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;"><em>Michael Delaney, a senior consulting software engineer at <a href="http://www.blueslate.net/" target="_new">Blue Slate</a>, commented on my <a href="http://www.blueslate.net/roller/daveread/entry/code_reviews_trump_unit_testing" target="_new">previous posting</a>.  As I created a reply I realized that I was expanding on my reasoning and it was becoming a bit long.  So, here is my reply as a follow-up posting.  Also, thank you to Michael for helping me think more about this topic.<br />
</em></p>
<p>I understand the desire to rely on unit testing and its ability to find issues and prevent regressions.  For TDD, I&#8217;ll need to write separately.  Fundamentally I&#8217;m a believer in white box testing.   Black box approaches, like TDD, seem to be of relatively little value to the overall quality and reliability of the code.  <em>Meaning, <strong>I’d want to invest more effort in white box testing than in black box testing.</strong></em></p>
<p>I&#8217;m somewhat jaded, being concerned with the code&#8217;s security, which to me is strongly correlated with its reliability.  That said, I believe that unit testing is much more constrained as compared to formal reviews.  <strong><em>I’m not suggesting that unit tests be skipped</em></strong>, rather that we understand that <em>unit tests can catch certain types of flaws and that those types are narrow as compared to what formal reviews can identify.</em></p>
<p><span id="more-1334"></span>Here is how I view the constraints around unit testing:</p>
<p><strong>First, unit tests are normally implemented by the same developer who created (or will create) the code. </strong> This is a huge constraint on the <em><strong>brain power (BP)</strong></em> being applied to the quality of the code.  In this case <strong>BP=1</strong>.  If the unit tests are created by a separate individual then <strong>BP&gt;1 and, I argue, BP&lt;2</strong> because the tester is focused on the micro level (the unit level of code).</p>
<p><strong>Second, unit testing is often used to only check that the code does what it is supposed to do.</strong>  This means unit tests don&#8217;t often check whether code does things it isn&#8217;t supposed to.</p>
<p><strong>Third, unit tests don&#8217;t consider maintainability of the code, or really any macro-level concerns. </strong> Passing unit tests doesn&#8217;t mean that the code is easy to understand or logically organized.  I agree that single-purpose methods and well written code are easier to test, but bad code can be tested and even meet high coverage requirements.</p>
<p>The reason I believe that the study&#8217;s results (discussed in the <a href="http://www.blueslate.net/roller/daveread/entry/code_reviews_trump_unit_testing" target="_new">previous post</a>) are probably still accurate is that many software fundamentals are no different now than in 1986.  We used unit testing frameworks (sometimes called scaffolds) and leveraged unit tests to prevent regressions.  In some ways we had to be more concerned with the completeness of our test scenarios in 1986 due to the overhead of modifying an application.  Developers today can be somewhat cavalier with code changes since our system environments tend to make distribution of updates easier.</p>
<p><em><strong>I agree that new languages, paradigms and environments have not reduced the need for unit testing,</strong></em> quite the contrary.  We have more options and threats to deal with.  However, we can more quickly make and distribute an updated application than we could in 1986.</p>
<p>Considering the breadth of what can be identified,<strong> <em>formal design and code reviews go beyond simply checking that for some inputs a valid output is calculated. </em></strong> Instead, formal reviews apply <strong>BP=T(RE%)</strong> where <em><strong>T is the team size</strong></em> and <em><strong>RE% is the review effort percentage</strong></em> the team invests.</p>
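<p>As a small worked illustration of that formula, here is a sketch that reads T(RE%) as team size multiplied by the fraction of effort spent on reviews.  That reading, the function name and the sample numbers are assumptions made for the example, not measurements.</p>

```python
def review_brain_power(team_size, review_effort_pct):
    """BP = T(RE%): team size scaled by the share of effort spent reviewing."""
    return team_size * review_effort_pct / 100


# A hypothetical six-person team that never holds formal reviews
no_reviews = review_brain_power(team_size=6, review_effort_pct=0)

# The same hypothetical team investing half of its effort in reviews
some_reviews = review_brain_power(team_size=6, review_effort_pct=50)

print("BP without reviews:", no_reviews)
print("BP with 50% review effort:", some_reviews)
```

<p>Compare these values with the BP=1 that applies when the author alone writes the unit tests.</p>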
<p>These reviews<strong> </strong><em><strong>catch maintenance issues, identify best practices violations</strong></em><strong> </strong>(beyond what a style checker would do),<strong> </strong><em><strong>find opportunities for refactoring</strong></em><strong> </strong>(reducing code, reducing complexity, etc.) and<strong> </strong><em><strong>serve as a way for the team to increase its overall expertise through interactions and discussion. </strong></em> Clearly, this represents a much broader value set than that provided by unit testing.</p>
<p>As I said in the original posting, this thinking isn&#8217;t a defense to skip unit testing in favor of formal reviews.  (I clamor for unit tests and high coverage metrics.)  Instead, my message is a call to understand that you need both (and other types of testing and inspections), realizing that the formal reviews catch a more comprehensive set of issues than unit testing.  Time and again I have seen organizations leveraging unit testing but not formal reviews.  <strong><em>This means that their RE% is zero</em>.  </strong><em><strong>For me this is a poor investment choice.</strong></em></p>
<p>Perhaps it would be worth doing a small scale study to see if our shop has a similar experience?  It might be quite informative for the whole team.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1334</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Code Reviews Trump Unit Testing, But They Are Better Together</title>
		<link>http://monead.com/blog/?p=1291</link>
		<comments>http://monead.com/blog/?p=1291#comments</comments>
		<pubDate>Wed, 12 Oct 2011 03:28:25 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Quality]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[efficient coding]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[Managing IS Projects]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1291</guid>
		<description><![CDATA[Last week I was participating in a formal code review (a.k.a. code inspection) with one of our clients.  We have been working with this client, helping them strengthen their development practices.  Holding formal code reviews is a key component for us.  Part of the formal process we introduced includes reviewing the unit testing results, both [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;">Last week I was participating in a formal code review (a.k.a. code inspection) with one of our clients.  We have been working with this client, helping them strengthen their development practices.  Holding formal code reviews is a key component for us.  Part of the formal process we introduced includes reviewing the unit testing results, both the (successful) output report and the code coverage metrics.</p>
<p>At one point we were reviewing some code that had several error handling blocks that were not being covered in the unit tests.  These blocks were, arguably, unlikely or impossible to reach (such as a Java StringReader throwing an IOException).  There was some discussion by the team about the necessity of mocking enough functionality to cover these blocks.</p>
<p>Although we agreed that some of the more esoteric error conditions weren’t worth the programmer’s time to mock-up, it occurred to me later that we were missing an important point.  <strong>What mattered was that we were holding a formal code review and looking at those blocks of code.</strong></p>
<p>Let me take a step back.  In 1986, <a href="http://en.wikipedia.org/wiki/Capers_Jones" target="_new">Capers Jones</a> published a book entitled <a href="http://en.wikipedia.org/wiki/Special:BookSources/9780070328112" target="_new"><strong>Programming Productivity</strong></a>.  Although dated, the book contains many excellent points that cause you to think about how to create software in an efficient way.  Here efficiency is <em><strong>not</strong></em> <em>about lines of code per unit of time</em>, but<strong> more importantly</strong>, <em><strong>lines of correct code per unit of time</strong></em>.  This means taking into account rework due to errors and omissions.</p>
<p>One of the studies presented in the book relates to identifying defects in code.  It is a study whose results seem obvious when we think about them.  <strong><em>However, we don’t always align our software development practices to leverage the study’s lessons and maximize our development efficiency. </em></strong> Perhaps we believe that the statistics have changed due to language constructs, experience, tooling and so forth.  We’d need similar studies to the ones presented by Capers Jones in order to prove that, though.</p>
<p>Below are a few of the actions from the book’s study of defect detection approaches.  I’ve skipped the low-end and high-end numbers that Capers includes, simply giving the modes (averages) which are a good basis for comparison:</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2011/10/DefectIdentificationRatesData.png"><img title="Defect Identification Rates Data" src="http://monead.com/blog/wp-content/uploads/2011/10/DefectIdentificationRatesData.png" alt="Defect Identification Rates Data Table" width="285" height="255" align="left" /></a><br />
<a href="http://monead.com/blog/wp-content/uploads/2011/10/DefectIdentificationRatesGraph.png"><img title="Defect Identification Rates Graph" src="http://monead.com/blog/wp-content/uploads/2011/10/DefectIdentificationRatesGraph-300x182.png" alt="Defect Identification Rates Graph" width="300" height="182" align="right" /></a><br clear="all" /><br />
<span id="more-1291"></span><strong>Based on this data, we see that formal reviews of the design and code are much more effective at finding defects than unit testing. </strong> In this case, unit testing is focused on branch coverage.  Obviously, changing the testing expectations, such as applying domain testing concepts, might change the effectiveness. My guess is that such a change would not bring unit testing on par with formal code reviews but would improve its performance, at a cost of productivity – since domain testing takes more rigor to apply.</p>
<p>As Capers points out, these percentages can’t be added together.  <em>In other words, there is an intersection of the defects found by these methods. </em> In his studies these methods found some of the same and some unique defects.  <em><strong>Therefore, this data does not provide a defense to skip unit testing. </strong></em> Instead, <strong>each defect detection method should be used in a complementary fashion.</strong>  The statistical data does inform us regarding level-of-effort decisions.  Also, we can use this information to help define where to cut corners, if necessary.</p>
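<p>To see why the percentages can’t simply be added, here is a small sketch using made-up defect IDs.  The sets and the total defect count are hypothetical and are not taken from the book’s data.</p>

```python
# Hypothetical defect IDs found by each method in the same code base
found_by_unit_tests = {"D1", "D2", "D3"}
found_by_code_review = {"D2", "D3", "D4", "D5", "D6"}
total_defects = 10  # assumed number of defects actually present

# Defects caught by both methods, and by at least one of them
found_by_both = found_by_unit_tests.intersection(found_by_code_review)
found_by_either = found_by_unit_tests.union(found_by_code_review)

unit_rate = len(found_by_unit_tests) / total_defects
review_rate = len(found_by_code_review) / total_defects
combined_rate = len(found_by_either) / total_defects

print("Unit testing alone:", unit_rate)
print("Code review alone:", review_rate)
print("Both methods together:", combined_rate)
print("Naive sum of the two rates:", unit_rate + review_rate)
```

<p>Because two defects are found by both methods, the combined rate comes out lower than the sum of the individual rates, yet higher than either method alone.</p>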
<p><strong>If we have to reduce the time we spend in the testing phase, we would do better to give up some unit testing in favor of keeping more formal code reviews. </strong> This isn’t always the way we approach such situations.  It is easy to skip the formal reviews, believing that the time would be better spent by developers creating more tests for their code.  Apparently this is not the case.</p>
<p>I think it is valuable to periodically remind ourselves about studies like these as a way to make sure we aren’t falling into bad habits or being misled by incorrect assumptions.  Also, as teams bring on new members, it is good to review our practices, understand the relevant literature and actively discuss why we do what we do (and whether it needs to change).</p>
<p>A parting thought about statistics like the ones quoted here.  They are an average from a very broad array of problem types and development shops.  As <a href="http://en.wikipedia.org/wiki/Boris_Beizer" target="_new">Boris Beizer</a> has pointed out, <em>each shop has a “<strong>bug fingerprint</strong>”. </em> The bug fingerprint is impacted by things like the technologies being used, types of applications being developed and the experience of the team members.</p>
<p>Further, different types of bugs are better identified through different detection methods (branch tests, domain tests, code inspections, etc.).  Therefore, the optimal set of testing approaches for a given group of developers will differ somewhat from the average.  I’ll discuss this in a future post.</p>
<p>What does your team do in regards to testing and inspections?  Are there other techniques you find useful for identifying software defects?  I welcome your comments and feedback.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1291</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Cognitive Corporation™ &#8211; An Introduction</title>
		<link>http://monead.com/blog/?p=1231</link>
		<comments>http://monead.com/blog/?p=1231#comments</comments>
		<pubDate>Mon, 26 Sep 2011 18:56:58 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[BPM]]></category>
		<category><![CDATA[Cognitive Corporation]]></category>
		<category><![CDATA[Information Systems]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[cognitive corporation]]></category>
		<category><![CDATA[enterprise applications]]></category>
		<category><![CDATA[enterprise systems]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[system integration]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1231</guid>
		<description><![CDATA[Given my role as an enterprise architect, I’ve had the opportunity to work with many different business leaders, each focused on leveraging IT to drive improved efficiencies, lower costs, increase quality, and broaden market share throughout their businesses.  The improvements might involve any subset of data, processes, business rules, infrastructure, software, hardware, etc.  A common [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;">Given my role as an enterprise architect, I’ve had the opportunity to work with many different business leaders, each focused on leveraging IT to drive improved efficiencies, lower costs, increase quality, and broaden market share throughout their businesses.  The improvements might involve any subset of data, processes, business rules, infrastructure, software, hardware, etc.  A common thread is that <em><strong>each project seeks to make the corporation smarter through the use of information technology.</strong></em></p>
<p>As I’ve placed these separate projects into a common context of my own, I’ve concluded that the long term goal of leveraging information technology must be for it to support cognitive processes.  I don’t mean that the computers will think for us, rather that<em><strong> IT solutions must work together to allow a business to learn, corporately.</strong></em></p>
<p>The individual tools that we utilize each play a part.  However, we tend to utilize them in a manner that focuses on isolated and directed operation rather than incorporating them into an overall learning loop.  In other words,<em><strong> we install tools that we direct without asking them to help us find better directions to give.</strong></em></p>
<p>Let me start with a definition: similar to thinking beings, <strong>a <span style="text-decoration: underline;">cognitive corporation</span>™ leverages a feedback loop of information and experiences to inform future processes and rules. </strong> Fundamentally, learning is a process and it involves taking known facts and experiences and combining them to create new hypotheses which are tested in order to derive new facts, processes and rules.  Unfortunately, we don’t often leverage our enterprise applications in this way.</p>
<p>We have many tools available to us in the enterprise IT realm.  These include database management systems, business process management environments, rule engines, reporting tools, content management applications, data analytics tools, complex event processing environments, enterprise service buses, and ETL tools.  <em>Individually, these components are used to solve specific, predefined issues with the operation of a business.  </em>However, this is not an optimal way to leverage them.</p>
<p>If we consider that these tools mimic aspects of an intelligent being, then we need to leverage them in a fashion that manifests the cognitive capability in preference to simply deploying a point-solution.  This involves thinking about the tools somewhat differently.</p>
<p><span id="more-1231"></span>As I consider the vision of empowering a learning corporation, I break enterprise tools into 6 general areas.  These areas provide a portion of what is required in order to survive and grow in an environment.  <strong>These 6 areas are: Think, Communicate, Act, Know, Sense and Learn.</strong>  They are in no particular order since the absence of any will prevent meaningful growth from occurring.</p>
<p>I’ll be expanding on this concept over time, drilling into each area and the solutions we often use to meet that need.  Depicted below is a very high-level diagram of these areas with the categories of products or functions I place in each.  The categories are broad and I’ve chosen key terms to represent a lot of different tools and techniques.</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2011/09/Cognitive-Corporation-Enterprise-Components2.png"><img class="aligncenter size-medium wp-image-1288" title="Cognitive Corporation Enterprise Components" src="http://monead.com/blog/wp-content/uploads/2011/09/Cognitive-Corporation-Enterprise-Components2-300x220.png" alt="Cognitive Corporation Enterprise Components Depiction" width="300" height="220" /></a></p>
<p>This framework provides an infrastructure, similar to how a person is made up of a set of systems.  However,<em><strong> it is the information that is shared between systems that becomes the key to actually supporting the idea of a thinking and learning system.</strong></em>  There are assumptions we make as we design enterprise systems that will limit our ability to empower a cognitive corporation™.</p>
<p>For instance, we may limit our definition of valuable data to the information collected by our on-line systems.  A broader interpretation begins to consider the process steps followed and the rules executed for a given interaction as data.  There are many places where such assumptions and limitations will interfere with gaining a true advantage from our enterprise solutions.</p>
<p><em><strong>We must change our view of the way that a business benefits through the use of IT.</strong></em>  The focus on feature sets for a specific type of problem must be replaced with a <em>focus on the whole set of systems and how they interrelate and may be leveraged to benefit business growth &#8211; in multiple dimensions</em>.  This will allow us to make an order-of-magnitude jump in the value derived from IT investments.</p>
<p>Therefore, <strong>an enterprise architecture goes beyond the role that systems will play; it must define how the systems will interact within the entire enterprise. </strong> We are then in a position to leverage these powerful tools to create a cognitive corporation™.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1231</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Going Green Means More Green Going?</title>
		<link>http://monead.com/blog/?p=1123</link>
		<comments>http://monead.com/blog/?p=1123#comments</comments>
		<pubDate>Thu, 11 Aug 2011 05:18:37 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Cars]]></category>
		<category><![CDATA[Civic Hybrid]]></category>
		<category><![CDATA[fuel efficiency]]></category>
		<category><![CDATA[hybrid car]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[mileage]]></category>
		<category><![CDATA[TCO]]></category>
		<category><![CDATA[total cost of ownership]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1123</guid>
		<description><![CDATA[Readers of my blog may be aware that I own a hybrid car, a 2007 Civic Hybrid to be precise.  I have kept a record of almost every gas purchase, recording the date, accumulated mileage, gallons used, price paid as well as the calculated and claimed MPG.  I thought since I now have four years [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;">Readers of my blog may be aware that I own a hybrid car, a 2007 Civic Hybrid to be precise.  I have kept a record of almost every gas purchase, recording the date, accumulated mileage, gallons used, price paid as well as the calculated and claimed MPG.  Now that I have four years of data, I thought I could use it to evaluate the fuel efficiency’s impact on my total cost of ownership (TCO).</p>
<p><strong>I had two questions I wanted to answer: 1) did I achieve the vehicle’s advertised MPG; and 2) are the gas savings significant versus owning a non-hybrid? </strong></p>
<p>To answer the second question I needed to choose an alternate vehicle to represent the non-hybrid.  I thought a good non-hybrid to compare would be the 2007 Civic EX since the features are similar to my car, other than the hybrid engine.</p>
<p><em>Some caveats: I am not including service visits, new tires or the time value of money in my TCO calculations.</em></p>
<p>First some basic statistics.  I have driven my car a little over 105,500 miles at this point.  I have used about 2,508 gallons of gas costing me $7,466 over the last four years.  I have had to fill up the car about 290 times.  My mileage over the lifetime of the car has averaged 42 MPG which matches the expected MPG from the original sticker.  <strong><em>Question 1 answered – advertised MPG achieved.</em></strong></p>
<p>To explore question 2, I needed an average MPG for the EX.  Since traditional cars have different city and highway MPG I had to choose a value that made sense based on my driving, yet be conservative enough to give me a meaningful result.  The 2007 Civic EX had an advertised MPG of 30 city and 38 highway.  I do significantly more highway than city driving, but thought I’d be really conservative and choose 32 MPG for my comparison.</p>
<p>With that assumption in place, I can calculate the gas consumption I would have experienced with the EX.  Over the 105,500 miles I would have used about 3,306 gallons of gas costing about $9,903.</p>
<p>What this means is that <strong><em>if I had purchased the EX in 2007 instead of the Hybrid I would have used about 798 more gallons of gas costing me an additional $2,437 over that time period.</em></strong>  That is good to know, both in terms of my reduced carbon footprint and fuel cost savings.</p>
<p>However, there is a cost difference between the two vehicle purchase prices.  The Hybrid MSRP was $22,600 while the EX was $18,710.  The Hybrid cost me $3,890 more to purchase.</p>
<div id="attachment_1131" class="wp-caption alignleft" style="width: 241px"><a href="http://monead.com/blog/wp-content/uploads/2011/08/GasComparisonChart.png"><img class="size-full wp-image-1131" title="Gas Consumption (Hybrid versus Postulated EX)" src="http://monead.com/blog/wp-content/uploads/2011/08/GasComparisonChart.png" alt="Gas Consumption (Hybrid versus Postulated EX)" width="231" height="109" /></a><p class="wp-caption-text">Gas Consumption (Hybrid versus Postulated EX)</p></div>
<p><strong>So over the four years I’ve owned the car, I’m actually currently behind by $1,453 compared with purchasing the EX</strong> (again not considering the time value of money, which would make it worse).  I will need to keep the car for several more years to break even, and in reality it may not be possible to ever break even if I start including the time value factor.   <em>Question 2 answered and it isn&#8217;t such good news.</em></p>
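<p>The arithmetic behind that result can be sketched in a few lines of Java.  This is only an illustration using the rounded figures quoted above; the class and method names are made up for the example, and the break-even estimate assumes that the fuel savings per mile stay the same as over the first 105,500 miles.</p>

```java
// Illustrative only: figures are the rounded values quoted in the post,
// and the class and method names are hypothetical.
final class HybridBreakEven {
    // Extra gallons the comparison car would have burned over the same miles.
    static int extraGallons(int comparisonGallons, int hybridGallons) {
        return comparisonGallons - hybridGallons;
    }

    // Fuel money saved by the hybrid over the same miles.
    static int fuelSavings(int comparisonFuelCost, int hybridFuelCost) {
        return comparisonFuelCost - hybridFuelCost;
    }

    // Positive means the hybrid is ahead; negative means it is still behind.
    static int netPosition(int fuelSavings, int pricePremium) {
        return fuelSavings - pricePremium;
    }

    // Additional miles needed to recover the shortfall, assuming the
    // savings per mile seen so far continue unchanged.
    static double milesToBreakEven(int shortfall, int fuelSavings, double milesDriven) {
        double savingsPerMile = fuelSavings / milesDriven;
        return shortfall / savingsPerMile;
    }

    public static void main(String[] args) {
        int gallons = extraGallons(3306, 2508);
        int savings = fuelSavings(9903, 7466);
        int premium = 22600 - 18710;
        int net = netPosition(savings, premium);
        System.out.println("Extra gallons avoided: " + gallons);
        System.out.println("Fuel savings: $" + savings);
        System.out.println("Purchase premium: $" + premium);
        System.out.println("Net position: $" + net);
        System.out.println("Miles to break even: "
                + Math.round(milesToBreakEven(-net, savings, 105500.0)));
    }
}
```

<p>Under those assumptions the break-even point is roughly another 63,000 miles away, which at my driving rate is a couple more years even before the time value of money is considered.</p>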
<p><strong><em>My conclusion is that purchasing a hybrid is not a financially smart choice.</em></strong>  I also wonder if it is even an environmentally sound one given the chemicals involved in manufacturing the battery.  Maybe the environment comes out ahead or maybe not.  I think it is unfortunate that the equation for the consumer doesn’t even hit break even when trying to do the right thing.</p>
<p><span id="more-1123"></span>With my main questions answered I did spend a little time playing with the data.  After all, exploring data is fun.  For the moment I’ll stick with the more mundane reporting.  I’ve loaded the data into Rapid Miner and Weka to do some mining, which I’ll write about later.</p>
<div id="attachment_1127" class="wp-caption alignleft" style="width: 310px"><a href="http://monead.com/blog/wp-content/uploads/2011/08/MilesMPGBySeason.png"><img class="size-medium wp-image-1127 " title="Miles Driven and MPG by Season" src="http://monead.com/blog/wp-content/uploads/2011/08/MilesMPGBySeason-300x180.png" alt="Miles Driven and MPG by Season" width="300" height="180" /></a><p class="wp-caption-text">Miles Driven and MPG by Season</p></div>
<p>I split the data into seasons, since I’ve noted an obvious pattern in MPG changing through the year.  I then reported on miles driven and MPG in each season.  To keep the reporting balanced I only used the three complete years of data that I have (it’ll be another 7 months to fill out another complete set of seasons).</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2011/08/AvgMPG.png"><img title="Average MPG by Season" src="http://monead.com/blog/wp-content/uploads/2011/08/AvgMPG-150x150.png" alt="Average MPG by Season" width="150" height="150" align="left" /></a> <a href="http://monead.com/blog/wp-content/uploads/2011/08/Distance.png"><img title="Distance by Season" src="http://monead.com/blog/wp-content/uploads/2011/08/Distance-150x150.png" alt="Distance by Season" width="150" height="150" align="left" /></a> <a href="http://monead.com/blog/wp-content/uploads/2011/08/GallonsUsed.png"><img title="Gallons Used by Season" src="http://monead.com/blog/wp-content/uploads/2011/08/GallonsUsed-150x150.png" alt="Gallons Used by Season" width="150" height="150" align="left" /></a><br />
<br clear="all" /><br />
Looking at the charts it is interesting to see the differences in total driving distance by season.  I wasn’t expecting to see that; if anything, I expected winter to involve fewer miles driven than other seasons.  As I’d observed informally, the average MPG clearly changes during winter.  I think increased ethanol added by fuel companies during cold months is the likely culprit.  <em><strong>Interesting that gas prices don’t reflect that lowered efficiency during the winter.</strong></em></p>
<p>All told, although I do 25% of my driving in the winter, that driving uses 27% of the gas I purchase.  Not a huge difference, but certainly not optimal.  I should drive less in the winter to maximize my annual savings.</p>
<p>As I begin considering the purchase of a new car, I am less sure that a hybrid will be my top choice, especially since many small cars get good mileage, easily within 6 MPG (highway) of a lot of hybrids.</p>
<p>If others out there have had a significantly different experience with their hybrid I would be interested in hearing about it.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1123</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Android Programming Experiences with Sparql Droid</title>
		<link>http://monead.com/blog/?p=1069</link>
		<comments>http://monead.com/blog/?p=1069#comments</comments>
		<pubDate>Mon, 11 Jul 2011 03:18:02 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1069</guid>
		<description><![CDATA[As I release my 3rd Alpha-version of Sparql Droid I thought I&#8217;d document a few lessons learned and open items as I work with the Android environment.  Some of my constraints are based on targeting smart phones rather than tablets, but the lessons learned around development environments, screen layouts, and memory management are valuable. I&#8217;ll [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;">As I release my 3rd Alpha-version of <a href="https://market.android.com/details?id=com.monead.semantic.android.sparql&amp;hl=en" target="_new">Sparql Droid</a> I thought I&#8217;d document a few lessons learned and open items as I work with the Android environment.  Some of my constraints are based on targeting smart phones rather than tablets, but the lessons learned around development environments, screen layouts, and memory management are valuable.</p>
<p>I&#8217;ll start on the development side.  <em><strong>I use Eclipse and the android development plugin is very helpful. </strong></em>It greatly streamlines the development process.  Principally, it automates the generation of the resources from the source files.  These resources, such as screen layouts and menus, require a conversion step after being edited.  The automation, though, comes at a price.</p>
<p>Taking a step back, Android doesn&#8217;t use an Oracle-compliant JVM.  Instead it uses the <a href="http://en.wikipedia.org/wiki/Dalvik_%28software%29" target="_new">Dalvik VM</a>.  This difference creates two major ramifications:<strong> 1) not all the standard packages are available; and 2) any compiled Java code has to go through a step to &#8220;align&#8221; it for Dalvik.</strong> This alignment process is required for class files you create and for any third-party classes (such as those found in external JAR files).  Going back to item 1, if an external JAR file you use needs a package that isn&#8217;t part of Dalvik, you&#8217;ll need to recreate it.</p>
<p>The alignment process works pretty fast for small projects.  My first application was a <a href="https://market.android.com/details?id=com.monead.games.android.sequence" target="_new">game</a> that used no external libraries.  The time required to compile and align was indistinguishable from typical compile time.  However, with Sparql Droid, which uses several large third-party libraries, the alignment time is significant &#8211; on the order of a full minute.</p>
<p>That delay doesn&#8217;t sound so bad, unless you consider the <strong>Build Automatically feature in Eclipse.  This is a feature that you want to turn off when doing Android development that includes third-party libraries of any significance. </strong>Turning off that feature simply adds an extra step to the editing process, a manual build, and slightly reduces the convenience of the environment.</p>
<p>With my first Android project, I was able to edit a resource file and immediately jump back to my Java code and have the resource be recognized.   Now I have to manually do a build (waiting a minute or so) after editing a resource file before it is recognized on the code side.  Hopefully the plug-in will be improved to cache the aligned libraries, saving that time when the libraries aren&#8217;t being changed.</p>
<p><em><strong><span id="more-1069"></span>Another lesson learned for me relates to GUI interaction</strong></em>, specifically updating the GUI.  In Sparql Droid I am using threads so that processing of ontologies or execution of SPARQL queries does not interfere with UI interaction.  I immediately hit the Android design principle that <em><strong>only the thread that created a view can update the view. </strong></em></p>
<p>Attempting to interact with a UI component from another context (thread) leads to a <strong>CalledFromWrongThreadException </strong>being thrown.  <strong>The correct way to have an asynchronous update to the UI is using a Handler instance that is created as part of the view and therefore runs under that view&#8217;s thread. </strong>Once I set up a handler and used events to communicate updates, I was able to create threads and manage asynchronous GUI updates without any issues.</p>
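<p>The pattern itself is easy to show outside of Android.  The following is a plain-Java sketch, not Sparql Droid&#8217;s code: a single-thread executor stands in for the UI thread, and the class and method names are hypothetical.  On Android, a Handler bound to the UI thread plays the role that the executor plays here.</p>

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-Java stand-in for the pattern: one thread owns the "view" state and
// every other thread posts work to it instead of touching the state directly.
final class UiThreadSketch {
    // Stand-in for the UI thread (an assumption for this sketch).
    private final ExecutorService uiThread = Executors.newSingleThreadExecutor();
    // "View" state; only ever modified by tasks running on uiThread.
    private final StringBuilder viewText = new StringBuilder();

    // Safe to call from any thread: queues the update rather than applying it.
    void postUpdate(String text) {
        uiThread.execute(() -> viewText.append(text).append('\n'));
    }

    // Runs placeholder work on a background thread and posts the result back.
    Thread runInBackground(String queryName) {
        Thread worker = new Thread(() -> postUpdate("finished " + queryName));
        worker.start();
        return worker;
    }

    // Runs one background task, drains the queued updates, returns the view text.
    static String demo(String queryName) {
        UiThreadSketch ui = new UiThreadSketch();
        try {
            ui.runInBackground(queryName).join();
            ui.uiThread.shutdown();
            ui.uiThread.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return ui.viewText.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo("query-1"));
    }
}
```

<p>The point of the design is that the background thread never touches the view state itself; it only hands a small unit of work to the thread that owns it.</p>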
<p><em><strong>A major frustration point for me was an Ant 1.7 compatibility issue. </strong></em>I hadn&#8217;t bothered upgrading to Ant 1.8 and for my first Android application I had no issues with the release target that the Android environment provides.  However, the reason it worked was that I was not using any third-party JAR files.</p>
<p>There is an incompatibility with the Android development environment&#8217;s dynamic inclusion of JARs in the default <em>libs </em>folder.  The real issue is that the Android setup doesn&#8217;t complain about Ant 1.7.  According to <a href="http://code.google.com/p/android/issues/detail?id=13091" target="_new">this bug report</a> that shortcoming will be resolved in the next Android development tools release (perhaps it has already).</p>
<p><strong>Unfortunately the build failure is silent (since the build classpath includes the JARs, only the Dalvik alignment process doesn&#8217;t) and the symptom is, obviously, a ClassNotFoundException when you run the released application. </strong>Focusing on that error led me down the wrong path for a while since I thought the issue might be an incompatibility with a class being aligned &#8211; even though it was working in the emulator.  I eventually unzipped the APK file and saw that the JARs were not there. Given that fact, Google quickly led me to a report about the Ant 1.7 and Android tools problem.</p>
<p><em><strong>Of course a major constraint when programming for the phone is memory. </strong></em>When I programmed for my VIC-20 (with a whopping 3.5 K of user memory) I prided myself on what I could squeeze into the system.  I&#8217;m back to pursuing that line of design.</p>
<p><strong>For me, the most interesting aspect of dealing with constrained memory is reacting to OutOfMemoryError situations. </strong>I&#8217;m experimenting with detecting these and changing the operation of the program.  In many cases the out of memory condition isn&#8217;t fatal and can be handled (at least with the functionality I&#8217;m working with currently).</p>
<p>For example, a common place for me to exhaust memory is when resizing images that have been downloaded.  In my initial design, I downloaded the image and held onto the original, returning a resized copy to my view.  My first memory-sensitive fix was to resize the image and only keep the smaller version.</p>
<p>However, depending on the number of images being downloaded this can still overwhelm the available heap.  I have been experimenting with catching the out of memory condition and abandoning the image.  This seems to work fine in the emulator.  Next I&#8217;ll be trying to save the image to local storage as a last-ditch effort to avoid losing it.</p>
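<p>A minimal sketch of that &#8220;catch and abandon&#8221; idea in plain Java follows.  The Resizer interface and the method names are hypothetical stand-ins for the real image handling, and the out of memory condition is simulated rather than provoked.</p>

```java
// Sketch of "try the memory-hungry step, abandon the item if the heap runs out".
final class MemoryFallbackSketch {
    // Hypothetical hook for the memory-hungry step, such as scaling an image.
    interface Resizer {
        int[] resize(int width, int height);
    }

    // Returns the resized pixel buffer, or null when memory ran out and the
    // image was abandoned so that the rest of the program can carry on.
    static int[] resizeOrAbandon(Resizer resizer, int width, int height) {
        try {
            return resizer.resize(width, height);
        } catch (OutOfMemoryError e) {
            // The failed allocation never completed, so there is nothing to
            // release here; the caller simply goes without this image.
            return null;
        }
    }

    public static void main(String[] args) {
        Resizer normal = (w, h) -> new int[w * h];
        // Simulates an exhausted heap without actually filling memory.
        Resizer exhausted = (w, h) -> { throw new OutOfMemoryError("simulated"); };
        System.out.println("normal resize kept: "
                + (resizeOrAbandon(normal, 4, 4) != null));
        System.out.println("exhausted resize kept: "
                + (resizeOrAbandon(exhausted, 4, 4) != null));
    }
}
```

<p>Catching OutOfMemoryError is only reasonable around a single, well-contained allocation like this one, where the program has an obvious way to continue without the result.</p>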
<p><em><strong>I have also been learning a bit about image manipulation. </strong></em>There are API calls that seem to imply they will support resizing an image in a general fashion but they don&#8217;t seem to actually alter the image.  Specifically, <em><strong>I tried creating a Canvas with my desired dimensions and calling the Image class&#8217; draw() method, </strong></em>keeping the returned Image instance.  However that image had the same dimensions as the original.</p>
<p>To actually resize an image, I ended up having to create a Bitmap instance from the Image and use the createBitmap() method, passing in the desired dimensions.  It isn&#8217;t clear to me from the API documentation why my original approach doesn&#8217;t work.</p>
<p><em><strong>One other design consideration that I&#8217;m still fussing with is laying out sets of query results. </strong></em>When pulling data that is best represented in groups we often use tables.  Each row is a member of the group and each column is one attribute of the group.  However, on the small UI of a phone that design almost always guarantees horizontal scrolling.  Also, if the columns require varying amounts of vertical space then the table looks odd when the taller columns aren&#8217;t visible.</p>
<p>When I first released Sparql Droid I had the SPARQL results reported in a table.  I didn&#8217;t mind the horizontal scrolling and for simple result sets the tables weren&#8217;t too wide and my columns had similar amounts of data in them so no rows were taller than others.  Once I opened up my queries to a broader set of endpoints and added images that all changed.</p>
<p><strong>Using the HorizontalLayout becomes problematic; it implies an infinitely wide canvas. </strong>The side effect I encountered was that large text fields simply ended up as one long line rather than wrapping.  Add to that the fact that images occupy varying vertical space, and the table ended up looking very odd.</p>
<p>I removed the use of a horizontal layout and instead now group results with a horizontal divider label row.  I like this visualization better for most situations, though I think that for small sets of attributes that have short values, a horizontal layout makes more sense.  This is another place, like dealing with memory constraints, where I&#8217;ll probably want the program to make a dynamic decision, in this case between a horizontal and a vertical record layout.</p>
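<p>One way such a dynamic decision might look is sketched below.  This is not what Sparql Droid does today; the class name and both thresholds are guesses chosen purely for illustration, and a real version would also need to account for images.</p>

```java
// Hypothetical heuristic: use the table-like horizontal layout only when
// every row has few attributes and every value is short.
final class ResultLayoutChooser {
    enum Layout { HORIZONTAL, VERTICAL }

    // Illustrative thresholds, not measured values.
    static final int MAX_COLUMNS = 3;
    static final int MAX_VALUE_LENGTH = 20;

    static Layout choose(String[][] rows) {
        for (String[] row : rows) {
            if (row.length > MAX_COLUMNS) {
                return Layout.VERTICAL; // too many attributes to fit side by side
            }
            for (String value : row) {
                if (value.length() > MAX_VALUE_LENGTH) {
                    return Layout.VERTICAL; // long text would not wrap well in a row
                }
            }
        }
        return Layout.HORIZONTAL;
    }

    public static void main(String[] args) {
        String[][] shortRows = { { "Alice", "42" }, { "Bob", "37" } };
        String[][] longRows = {
            { "Alice", "A long abstract that would never fit in a narrow column" }
        };
        System.out.println(choose(shortRows));
        System.out.println(choose(longRows));
    }
}
```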
<p>All told, these are the Android programming events that have been of the most interest, and in some cases frustration, for me over the past few weeks.  I&#8217;m looking forward to adding more features to Sparql Droid while learning more about the Android environment.</p>
<p>I&#8217;d appreciate your feedback whether about programming in general, semantic technology or the Sparql Droid application.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1069</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sparql Droid – A Semantic Technology Application for the Android Platform</title>
		<link>http://monead.com/blog/?p=1028</link>
		<comments>http://monead.com/blog/?p=1028#comments</comments>
		<pubDate>Fri, 24 Jun 2011 17:53:55 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1028</guid>
		<description><![CDATA[The semantic technology concepts that comprise what is generally called the semantic web involve paradigm shifts in the ways that we represent data, organize information and compute results. Such shifts create opportunities and present challenges.  The opportunities include easier correlation of decentralized information, flexible data relationships and reduced data storage entropy.  The challenges include new [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;"><img class="alignleft size-full wp-image-190" title="Sparql Droid logo" src="http://monead.com/blog/wp-content/uploads/2011/06/rdf.512x512-150x150.png" alt="Sparql Droid logo" width="100" height="100" align="left" /><em><strong>The se</strong></em><em><strong></strong></em><em><strong>mantic technology concepts that comprise what is generally called the semantic web involve paradig</strong></em><em><strong></strong></em><em><strong>m shifts in the ways that we represent data, organize information and compute resul</strong></em><em><strong></strong></em><em><strong>ts.</strong></em> <em><strong></strong></em>Su<em><strong></strong></em>ch shifts cr<em><strong></strong></em>e<em><strong></strong></em>ate <em><strong></strong></em>opportunities and present challenges.  The opportunities inclu<em><strong></strong></em>de easier correlation of decentralized information, flexible data relationships and reduced data storage entropy.  The challenges include new data management technology, new syntaxes, and a new separation of data and its relationships.</p>
<p>I am a str<em><strong></strong></em>ong advocate of leveraging semantic technology.  <strong>I believe that this new paradigms provide a more flexible basis for our journey to create meaningful, efficient and effective business automation solutions. </strong>However, one challenge that differentiates leveraging semantic technology from more common technology (such as relational databases) is the lack of mature tools supporting a business system infrastructure.</p>
<p>It will take a while for solid solutions to appear.  Support for mainstream capabilities such as reporting, BI, workflow, application design and development that all leverage semantic technology are missing or weak at best.  Again, this is an opportunity and a challenge.  For those who enjoy creating computer software it presents a new world of possibilities.  For those looking to leverage mature solutions in order to advance their business vision it will take investment and patience.</p>
<p><em><strong>In parallel with the semantic paradigm we have an ever increasing focus on mobile-based solutions. </strong></em>Smart phones and tablet devices, focused on network connectivity as the enabler of value, rather than on-board storage and compute power, are becoming the standard tool for human-system interaction.  As we design new solutions we must keep the mobile-accessible mantra in mind.</p>
<p>As part of my exploration of these two technologies, <strong>I’ve started working on a semantic technology mobile application called <a href="https://market.android.com/details?id=com.monead.semantic.android.sparql&amp;feature=search_result" target="_new">Sparql Droid</a>. </strong>Built for the Android platform, my goal is a tool for exploring and mashing semantic data sources.  As a small first-step I’ve leveraged the <a href="http://code.google.com/p/androjena/" target="_new">Androjena</a> port of the Jena framework and created an application with some basic capabilities.</p>
<p><span id="more-1028"></span>Sparql Droid allows the user to enter an ontology, run the Androjena OWL reasoner and then use SPARQL to query the resulting triples.  This functionality provided a way to prove to myself that the pieces would work, as well as to get a feel for the performance and scalability of the Androjena libraries running on a smartphone.  I also wanted to see how a multi-threaded application would perform on Android.</p>
<p><a href="http://monead.com/blog/wp-content/uploads/2011/06/SparqlDroid.OntologyTab.png"><img class="alignnone size-medium wp-image-1059" title="Ontology Input" src="http://monead.com/blog/wp-content/uploads/2011/06/SparqlDroid.OntologyTab-180x300.png" alt="Ontology Input" width="180" height="300" /></a> <a href="http://monead.com/blog/wp-content/uploads/2011/06/SparqlDroid.SparqlTab.Results.png"><img class="alignnone size-medium wp-image-1060" title="SPARQL Output" src="http://monead.com/blog/wp-content/uploads/2011/06/SparqlDroid.SparqlTab.Results-180x300.png" alt="SPARQL Query Output" width="180" height="300" /></a> <a href="http://monead.com/blog/wp-content/uploads/2011/06/SparqlDroid.TreeViewTab.png"><img class="alignnone size-medium wp-image-1061" title="Tree View" src="http://monead.com/blog/wp-content/uploads/2011/06/SparqlDroid.TreeViewTab-180x300.png" alt="" width="180" height="300" /></a></p>
<p>The application works and I have placed it in the Android Market for those that are willing to experiment with it.  <strong>The real goal of the application is to become a point-and-click tool for exploring and mashing semantic data. </strong>However, having a semantic reasoner available in order to do some experimentation seems like a feature one would want in a semantic toolbox so I believe this functionality will remain useful.</p>
<p>I&#8217;ll be writing a bit in an upcoming blog post about my development journey.  I have learned a bit about Android programming in the process of building this application and perhaps others will be able to benefit from my experiences.</p>
<p>In the meantime, if you are willing to test the application and provide feedback on its operation as well as collaborate on fleshing this out I would appreciate hearing from you.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1028</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>OpenOffice in a Heterogeneous Office Tool Environment</title>
		<link>http://monead.com/blog/?p=1015</link>
		<comments>http://monead.com/blog/?p=1015#comments</comments>
		<pubDate>Sat, 05 Mar 2011 02:54:03 +0000</pubDate>
		<dc:creator>David Read</dc:creator>
				<category><![CDATA[Tools and Applications]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[microsoft office suite]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://monead.com/blog/?p=1015</guid>
		<description><![CDATA[A few months ago I blogged about my new computer and my quest to use only OpenOffice as my document tool suite (How I Spent My Christmas Vacation).  For a little over a month I was able to work effectively, exchanging documents and spreadsheets with coworkers without incident.  However, it all came crashing down.  My [...]]]></description>
			<content:encoded><![CDATA[<p style="padding-top: 14px;">A few months ago I blogged about my new computer and my quest to use only <strong>OpenOffice </strong>as my document tool suite (<a href="http://monead.com/blog/?p=902" target="_new">How I Spent My Christmas Vacation</a>).  For a little over a month I was able to work effectively, exchanging documents and spreadsheets with coworkers without incident.  However, it all came crashing down.  My goal in this blog entry is to describe what worked and what didn’t.</p>
<p>OpenOffice provides five key office-type software packages: <em><strong>Writer </strong></em>for word processing, <em><strong>Calc </strong></em>for spreadsheets, <em><strong>Impress </strong></em>for presentations, <em><strong>Base </strong></em>for database work and <em><strong>Draw </strong></em>for diagrams.  There is a sixth tool, <em><strong>Math</strong></em>, for creating scientific formulas and equations, which is similar to the equation editor available with MS Word.</p>
<p>As one of my coworkers suggests when providing positive and negative feedback, I’ll use the sandwich approach.  If you’ve not heard of this approach, the idea is to start with some good points, then go through the issues and wrap up with a positive item or two.</p>
<p>On a positive note, <strong>the OpenOffice suite is production worthy</strong>.  For the two tools that seem to be most commonly used in office settings, word processing and spreadsheets, the Writer and Calc tools have all the features that I was used to using with the Microsoft Office (MS Office) tools.  In fact <em>for the most part I was unaware that I was using a different word processor or spreadsheet. </em>From a usability perspective there is little or no learning curve for an experienced MS Office user to effectively use these OpenOffice tools.</p>
<p>Of key importance to me was the ability to work with others who were using MS Office.  The ability for OpenOffice to open the corresponding MS Office documents worked well at first but then cracks began to show.</p>
<p>OpenOffice Writer was able to work with MS Office documents in both the classic Word &#8220;doc&#8221; format and the newer Word 2007 and later “docx” format.  However, Writer cannot save to the “docx” format.  If you open a “docx” then the only MS Office format that can be used to save the document is the “doc” format.  <em>At first this was a small annoyance but obviously meant that if a “docx” feature was used it would be lost on the export to “doc”.</em></p>
<p>Another aggravating issue was confusion when using the “Record Changes” feature, which is analogous to the “Track Changes” feature in MS Word.  Although the updates made using MS Word could be seen in Writer, notes created in Word were inconsistently presented in Writer.  The tracked changes were also somewhat difficult to understand when multiple iterations of edits had occurred.  At work we often use track changes as we collaborate on documentation so this feature needs to work well for our team.</p>
<p>I eventually ran into two complete show-stoppers.  In the first case, <em><strong>OpenOffice was unable to display certain images embedded in an MS Word document</strong></em>.  Although some images had previously been somewhat distorted, it turned out that certain types of embedded images wouldn’t display at all.  The second issue involved the Impress (presentation) tool.</p>
<p>I’ve mentioned that Writer and Calc are very mature and robust.  The Impress tool doesn’t seem to be as solid.  As I began working with a team member on a presentation we were delivering in February I discovered that <em><strong>there appears to be little compatibility between MS PowerPoint and Impress. </strong></em>I was unable to work with the PowerPoint presentation using Impress.  The images, animations and text were all completely wrong when opened in Impress.</p>
<p>To be fair, I have created standalone presentations using Impress and the tool has a good feature set and works reliably.  I’ve used it to create and deliver presentations with no issues.  OpenOffice even seems to provide a nicer set of boilerplate presentation templates than the ones that come with MS PowerPoint.</p>
<p><strong>My conclusion after working with OpenOffice now for about 3 months is that it is a completely viable solution if used as the document suite for a company. </strong>However, <em><strong>it is not possible to succeed with these tools in a heterogeneous environment where documents must be shared with MS Office users.</strong></em></p>
<p>I will probably continue to use OpenOffice for personal work.  I’ll also continue to upgrade and try using it with MS Office documents from time to time.  Perhaps someday it will be possible to leverage this suite effectively in a multi-platform situation. Certainly from an ROI perspective it becomes harder and harder to justify the cost of the MS Office suite when such a capable and well-designed open source alternative exists.</p>
<p>Have you tried using alternatives to MS Office in a heterogeneous office tool environment?  Have you had better success than I have?  Any pointers on being able to succeed with such an approach?  Is such an approach even reasonable?  Please feel free to share your thoughts.</p>
]]></content:encoded>
			<wfw:commentRss>http://monead.com/blog/?feed=rss2&#038;p=1015</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

