Posts Tagged ‘linkedin’

Going Green Means More Green Going?

Thursday, August 11th, 2011

Readers of my blog may be aware that I own a hybrid car, a 2007 Civic Hybrid to be precise.  I have kept a record of almost every gas purchase, recording the date, accumulated mileage, gallons used, price paid as well as the calculated and claimed MPG.  I thought since I now have four years of data that I could use the data to evaluate the fuel efficiency’s impact on my total cost of ownership (TCO).

I had two questions I wanted to answer: 1) did I achieve the vehicle’s advertised MPG; and 2) are the gas savings significant versus owning a non-hybrid?

To answer the second question I needed to choose an alternate vehicle to represent the non-hybrid.  I thought a good non-hybrid to compare would be the 2007 Civic EX since the features are similar to my car, other than the hybrid engine.

Some caveats: I am not including service visits, new tires or the time value of money in my TCO calculations.

First some basic statistics.  I have driven my car a little over 105,500 miles at this point.  I have used about 2,508 gallons of gas costing me $7,466 over the last four years.  I have had to fill up the car about 290 times.  My mileage over the lifetime of the car has averaged 42 MPG which matches the expected MPG from the original sticker.  Question 1 answered – advertised MPG achieved.

To explore question 2, I needed an average MPG for the EX.  Since traditional cars have different city and highway MPG I had to choose a value that made sense based on my driving, yet was conservative enough to give me a meaningful result.  The 2007 Civic EX had an advertised MPG of 30 city and 38 highway.  I do significantly more highway than city driving, but thought I’d be really conservative and choose 32 MPG for my comparison.

With that assumption in place, I can calculate the gas consumption I would have experienced with the EX.  Over the 105,500 miles I would have used about 3,306 gallons of gas costing about $9,903.

What this means is that if I had purchased the EX in 2007 instead of the Hybrid I would have used about 798 more gallons of gas costing me an additional $2,437 over that time period.  That is good to know, both in terms of my reduced carbon footprint and fuel cost savings.

However, there is a cost difference between the two vehicle purchase prices.  The Hybrid MSRP was $22,600 while the EX was $18,710.  The Hybrid cost me $3,890 more to purchase.

Gas Consumption (Hybrid versus Postulated EX)

So over the four years I’ve owned the car, I’m currently behind by $1,453 compared with purchasing the EX (again not considering the time value of money, which would make it worse).  I will need to keep the car for several more years to break even, and in reality it may not be possible to ever break even if I start including the time value factor.  Question 2 answered and it isn’t such good news.
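The arithmetic behind that result can be sketched in a few lines of Java.  This is only an illustration using the figures quoted above; the class and method names are hypothetical, and the break-even estimate assumes that the fuel savings per mile stay at their four-year average, which real gas prices are unlikely to honor.

```java
class HybridTcoSketch {
    // Extra fuel cost the comparison car would have incurred, in dollars
    static int fuelSavings(int comparisonFuelCost, int hybridFuelCost) {
        return comparisonFuelCost - hybridFuelCost;
    }

    // Purchase premium minus fuel savings; a positive result means the hybrid is still behind
    static int netDeficit(int hybridPrice, int comparisonPrice, int fuelSavings) {
        return (hybridPrice - comparisonPrice) - fuelSavings;
    }

    // Additional miles needed to break even, ASSUMING savings per mile remain constant
    static double milesToBreakEven(int deficit, int fuelSavings, int milesDriven) {
        double savingsPerMile = (double) fuelSavings / milesDriven;
        return deficit / savingsPerMile;
    }

    public static void main(String[] args) {
        int savings = fuelSavings(9903, 7466);            // 2437
        int deficit = netDeficit(22600, 18710, savings);  // 1453
        double miles = milesToBreakEven(deficit, savings, 105500);
        System.out.println("Fuel savings so far: $" + savings);
        System.out.println("Still behind by: $" + deficit);
        System.out.println("Estimated miles to break even: " + Math.round(miles));
    }
}
```

Under those assumptions the break-even point is roughly 63,000 additional miles, before any adjustment for the time value of money.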

My conclusion is that purchasing a hybrid is not a financially smart choice.  I also wonder if it is even an environmentally sound one given the chemicals involved in manufacturing the battery.  Maybe the environment comes out ahead or maybe not.  I think it is unfortunate that the equation for the consumer doesn’t even hit break even when trying to do the right thing.


Android Programming Experiences with Sparql Droid

Sunday, July 10th, 2011

As I release my 3rd Alpha-version of Sparql Droid I thought I’d document a few lessons learned and open items as I work with the Android environment.  Some of my constraints are based on targeting smart phones rather than tablets, but the lessons learned around development environments, screen layouts, and memory management are valuable.

I’ll start on the development side.  I use Eclipse and the Android development plug-in is very helpful. It greatly streamlines the development process.  Principally, it automates the generation of the resources from the source files.  These resources, such as screen layouts and menus, require a conversion step after being edited.  The automation, though, comes at a price.

Taking a step back, Android doesn’t use an Oracle-compliant JVM.  Instead it uses the Dalvik VM.  This difference creates two major ramifications: 1) not all the standard packages are available; and 2) any compiled Java code has to go through a step to “align” it for Dalvik. This alignment process is required for class files you create and for any third-party classes (such as those found in external JAR files).  Going back to item 1, if an external JAR file you use needs a package that isn’t part of Dalvik, you’ll need to recreate it.

The alignment process works pretty fast for small projects.  My first application was a game that used no external libraries.  The time required to compile and align was indistinguishable from typical compile time.  However, with Sparql Droid, which uses several large third-party libraries, the alignment time is significant – on the order of a full minute.

That delay doesn’t sound so bad, unless you consider the Build Automatically feature in Eclipse.  This is a feature that you want to turn off when doing Android development that includes third-party libraries of any significance. Turning off that feature simply adds an extra step to the editing process, a manual build, and slightly reduces the convenience of the environment.

With my first Android project, I was able to edit a resource file and immediately jump back to my Java code and have the resource be recognized.   Now I have to manually do a build (waiting a minute or so) after editing a resource file before it is recognized on the code side.  Hopefully the plug-in will be improved to cache the aligned libraries, saving that time when the libraries aren’t being changed.


Sparql Droid – A Semantic Technology Application for the Android Platform

Friday, June 24th, 2011

The semantic technology concepts that comprise what is generally called the semantic web involve paradigm shifts in the ways that we represent data, organize information and compute results. Such shifts create opportunities and present challenges.  The opportunities include easier correlation of decentralized information, flexible data relationships and reduced data storage entropy.  The challenges include new data management technology, new syntaxes, and a new separation of data and its relationships.

I am a strong advocate of leveraging semantic technology.  I believe that these new paradigms provide a more flexible basis for our journey to create meaningful, efficient and effective business automation solutions. However, one challenge that differentiates leveraging semantic technology from more common technology (such as relational databases) is the lack of mature tools supporting a business system infrastructure.

It will take a while for solid solutions to appear.  Support for mainstream capabilities such as reporting, BI, workflow, application design and development that all leverage semantic technology is missing or weak at best.  Again, this is an opportunity and a challenge.  For those who enjoy creating computer software it presents a new world of possibilities.  For those looking to leverage mature solutions in order to advance their business vision it will take investment and patience.

In parallel with the semantic paradigm we have an ever increasing focus on mobile-based solutions. Smart phones and tablet devices, focused on network connectivity as the enabler of value, rather than on-board storage and compute power, are becoming the standard tool for human-system interaction.  As we design new solutions we must keep the mobile-accessible mantra in mind.

As part of my exploration of these two technologies, I’ve started working on a semantic technology mobile application called Sparql Droid. It is built for the Android platform, and my goal is a tool for exploring and mashing up semantic data sources.  As a small first step I’ve leveraged the Androjena port of the Jena framework and created an application with some basic capabilities.


OpenOffice in a Heterogeneous Office Tool Environment

Friday, March 4th, 2011

A few months ago I blogged about my new computer and my quest to use only OpenOffice as my document tool suite (How I Spent My Christmas Vacation).  For a little over a month I was able to work effectively, exchanging documents and spreadsheets with coworkers without incident.  However, it all came crashing down.  My goal in this blog entry is to describe what worked and what didn’t.

OpenOffice provides five key office-type software packages: Writer for word processing, Calc for spreadsheets, Impress for presentations, Base for database work and Draw for diagrams.  There is a sixth tool, Math, for creating scientific formulas and equations, which is similar to the equation editor available with MS Word.

As one of my coworkers suggests when providing positive and negative feedback, I’ll use the sandwich approach.  If you’ve not heard of this approach, the idea is to start with some good points, then go through the issues and wrap up with a positive item or two.

On a positive note, the OpenOffice suite is production worthy.  For the two tools that seem to be most commonly used in office settings, word processing and spreadsheets, the Writer and Calc tools have all the features that I was used to using with the Microsoft Office (MS Office) tools.  In fact for the most part I was unaware that I was using a different word processor or spreadsheet. From a usability perspective there is little or no learning curve for an experienced MS Office user to effectively use these OpenOffice tools.

Of key importance to me was the ability to work with others who were using MS Office.  The ability for OpenOffice to open the corresponding MS Office documents worked well at first but then cracks began to show.

OpenOffice Writer was able to work with MS Office documents in both the classic Word “doc” format and the newer Word 2007 and later “docx” format.  However, Writer cannot save to the “docx” format.  If you open a “docx” then the only MS Office format that can be used to save the document is the “doc” format.  At first this was a small annoyance but obviously meant that if a “docx” feature was used it would be lost on the export to “doc”.

Another aggravating issue was confusion when using the “Record Changes” feature, which is analogous to the “Track Changes” feature in MS Word.  Although the updates made using MS Word could be seen in Writer, notes created in Word were inconsistently presented in Writer.  The tracked changes were also somewhat difficult to understand when multiple iterations of edits had occurred.  At work we often use track changes as we collaborate on documentation so this feature needs to work well for our team.

I eventually ran into two complete show-stoppers.  In the first case, OpenOffice was unable to display certain images embedded in an MS Word document.  Although some images had previously been somewhat distorted, it turned out that certain types of embedded images wouldn’t display at all.  The second issue involved the Impress (presentation) tool.

I’ve mentioned that Writer and Calc are very mature and robust.  The Impress tool doesn’t seem to be as solid.  As I began working with a team member on a presentation we were delivering in February I discovered that there appears to be little compatibility between MS PowerPoint and Impress. I was unable to work with the PowerPoint presentation using Impress.  The images, animations and text were all completely wrong when opened in Impress.

To be fair, I have created standalone presentations using Impress and the tool has a good feature set and works reliably.  I’ve used it to create and deliver presentations with no issues.  OpenOffice even seems to provide a nicer set of boilerplate presentation templates than the ones that come with MS PowerPoint.

My conclusion after working with OpenOffice now for about 3 months is that it is a completely viable solution if used as the document suite for a company. However, it is not possible to succeed with these tools in a heterogeneous environment where documents must be shared with MS Office users.

I will probably continue to use OpenOffice for personal work.  I’ll also continue to upgrade and try using it with MS Office documents from time to time.  Perhaps someday it will be possible to leverage this suite effectively in a multi-platform situation. Certainly from an ROI perspective it becomes harder and harder to justify the cost of the MS Office suite when such a capable and well-designed open source alternative exists.

Have you tried using alternatives to MS Office in a heterogeneous office tool environment?  Have you had better success than I have?  Any pointers on being able to succeed with such an approach?  Is such an approach even reasonable?  Please feel free to share your thoughts.

Domain Testing at the Unit Level, Part 1: An Introduction

Tuesday, February 1st, 2011

It is surprising how many times I still find myself talking to software teams about unit testing.  I’ve written before that the term “unit testing” is not definitive.  “Unit testing” simply means that tests are being defined that run at the unit level of the code (typically methods or functions).  However, the term doesn’t mean that the tests are meaningful, valuable, or quality-focused.

From what I have seen, the term is often used as a synonym for path or branch level unit testing.  Although these are good places to start, such tests do not form a complete unit test suite.  I argue that the pursuit of 100% path or branch coverage and the exclusion of other types of unit testing is a waste of time. It is better for the overall quality of the code if the unit tests achieve 80% branch coverage and include an effective mix of other unit test types, such as domain, fuzz and security tests.

For the moment I’m going to focus on domain testing.  I think this is an area ripe for improvement.  Extending the “ripe” metaphor, I’d say there is significant low-hanging fruit available to development teams which will allow them to quickly experience the benefits of domain testing.

First, for my purposes in this article, what is unit-level domain testing?  Unit-level domain testing is the exercising of program code units (methods, functions) using well-chosen values based on the sets of values grouped, often, by Boolean tests in the code. (Note that the well-chosen values are not completely random.  As we will see, they are constrained by the decision points and logic in the code.)

The provided definition is not meant to be mathematically precise or even receive a passing grade on a comp-sci exam.  In future postings I’ll delve into more of the official theory and terminology.  For now I’m focused on the basic purpose and value of domain testing.

I do need to state an assumption and create two baseline definitions in order to proceed:

Assumption: We are dealing only with integer numeric values and simple Boolean tests involving a variable and a constant.


  • Domain – The set of values included (or excluded) by a Boolean test.  For example, the test “X > 3” has a domain of matching values 4, 5, 6, … and a domain of non-matching values 3, 2, 1, …
  • Boundary – The constant used in a Boolean test forming the point between the included and excluded sets of values.  So for “X > 3” the boundary value is 3.

Now let’s look at some code.  Here is a simple Java method:

public int add(int op1, int op2) {
    return op1 + op2;
}

This involves one domain, the domain of all integers, sort of.  Looking closely there is an issue; the domain of possible inputs (integers) is not necessarily the domain of possible (correct) outputs.

If two large integers were added together they could produce a value too large to fit in a Java 32-bit integer.  So the output domain is the set of values that can be derived by adding any two integers.  In Java we have the constants MIN_VALUE and MAX_VALUE in the java.lang.Integer class.  Using that vernacular, the domain of all output values for this method can be represented as: MIN_VALUE + MIN_VALUE through MAX_VALUE + MAX_VALUE.
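To make the mismatch between input and output domains concrete, here is a minimal sketch.  The class name and the addWide, addChecked and overflows helpers are hypothetical additions for illustration; only add comes from the example above.  Math.addExact is part of the standard library from Java 8 onward.

```java
class AddOverflowSketch {
    // The method under discussion: the result silently wraps on overflow
    static int add(int op1, int op2) {
        return op1 + op2;
    }

    // Widening to long first lets the result cover the full output domain
    static long addWide(int op1, int op2) {
        return (long) op1 + op2;
    }

    // Math.addExact throws ArithmeticException instead of wrapping
    static int addChecked(int op1, int op2) {
        return Math.addExact(op1, op2);
    }

    // True when the sum of the two ints falls outside the int range
    static boolean overflows(int op1, int op2) {
        try {
            addChecked(op1, op2);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(add(Integer.MAX_VALUE, 1));                     // wraps to Integer.MIN_VALUE
        System.out.println(addWide(Integer.MAX_VALUE, Integer.MAX_VALUE)); // 4294967294
        System.out.println(overflows(Integer.MAX_VALUE, 1));               // true
    }
}
```

A domain test for add would include values at and around Integer.MAX_VALUE and Integer.MIN_VALUE, since those are where the output leaves the range an int can represent.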

Here is another simple method:

public int divide(int dividend, int divisor) {
    return dividend / divisor;
}

Again we seem to have one domain, the set of all integers.  However we all know there is a problem latent in this code.  Would path testing effectively find it?
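One way to probe that question is to pick divisor values from the edges of the domain rather than relying on whatever values happen to cover the single path through the method.  The sketch below is an illustration only; the class name and the throwsArithmetic helper are hypothetical.

```java
class DivideDomainSketch {
    // The method under discussion
    static int divide(int dividend, int divisor) {
        return dividend / divisor;
    }

    // Reports whether the division fails with an ArithmeticException
    static boolean throwsArithmetic(int dividend, int divisor) {
        try {
            divide(dividend, divisor);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Divisor values chosen around the boundary at zero
        int[] divisors = { -1, 0, 1 };
        for (int divisor : divisors) {
            System.out.println("7 / " + divisor + " throws: " + throwsArithmetic(7, divisor));
        }
        // A second, less obvious edge: this quotient does not fit in an int
        // and wraps back to Integer.MIN_VALUE without throwing
        System.out.println(divide(Integer.MIN_VALUE, -1) == Integer.MIN_VALUE);
    }
}
```

A single call such as divide(6, 3) gives complete path coverage for this method, yet it exercises neither the zero divisor nor the Integer.MIN_VALUE / -1 case.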


Fuzzing – A Powerful Technique for Software Security Testing

Friday, January 21st, 2011

I was participating in a code review today and was reminded by a senior architect, who started working as an intern for me years ago, of a testing technique I had used with one of his first programs.  He had been assigned to create a basic web application that collected some data from a user and wrote it to a database.  He came into my office, announced it was done and proudly showed it to me.  I walked over to the keyboard, entered a bunch of junk and got a segmentation fault in response.

Although I didn’t have a name for it, that was a standard technique I used when evaluating applications.  After all, the tried and true paths, expected inputs and easy errors will be tested early and often as the developer exercises the application using the basic use cases.  As Boris Beizer said, “The high-probability paths are always tested if only to demonstrate that the system works properly.” (Beizer, Boris. Software Testing Techniques. Boston, MA: Thomson Computer Press, 1990: 76.)

It is unexpected input that is useful when looking to find untested paths through the code. If someone shows me an application for evaluation the last thing I need to worry about is using it in an expected fashion; everyone else will do that.  In fact, I default to entering data outside the specification when looking at a new application.  I don’t know that my team always appreciates the approach.  They’d probably like to see the application work at least once while I’m in the room.

These days there is a formal name for testing of this type: fuzzing.  A few years ago I preferred calling it “gorilla testing” since I liked the mental picture of beating on the application. (Remember the American Tourister luggage ad in the 1970s?)  But alas, it appears that fuzzing has become the accepted term.

Fuzzing involves passing input that breaks the expected input “rules”.  Those rules could come from some formal requirements, such as an RFC, or informal requirements, such as the set of parameters accepted by an application.  Fuzzing tools can use formal standards, extracted patterns and even randomly generated inputs to test an application’s resilience against unexpected or illegal input.
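The randomly generated variety is the easiest to sketch.  The following is a minimal illustration, not a real fuzzing tool: parseAge is a hypothetical, deliberately fragile target that expects input such as “age=42”, and the harness simply throws seeded random printable strings at it and records every input that triggers an unchecked exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Consumer;

class FuzzSketch {
    // Hypothetical target: works for "age=42" but does no validation at all
    static int parseAge(String input) {
        String[] parts = input.split("=");
        return Integer.parseInt(parts[1].trim());
    }

    // Builds a random string of printable ASCII characters, 0 to maxLength long
    static String randomInput(Random rng, int maxLength) {
        int length = rng.nextInt(maxLength + 1);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) (32 + rng.nextInt(95)));
        }
        return sb.toString();
    }

    // Feeds random inputs to the target and collects those that make it throw
    static List<String> fuzz(Consumer<String> target, long seed, int runs) {
        Random rng = new Random(seed); // seeded so that failures are reproducible
        List<String> failures = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            String input = randomInput(rng, 20);
            try {
                target.accept(input);
            } catch (RuntimeException e) {
                failures.add("[" + input + "] -> " + e.getClass().getSimpleName());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        for (String failure : fuzz(FuzzSketch::parseAge, 42L, 10)) {
            System.out.println(failure);
        }
    }
}
```

Seeding the generator is a deliberate choice here: a failure found by random input is only useful if it can be replayed while fixing the code.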


Tag, You’re It!

Wednesday, January 12th, 2011

The Internet is full of examples of simplifications creating vulnerabilities.  A good number of these can be represented as indirection enablers.  IP addresses, domain names, URIs, tiny URLs, QR Codes and now Microsoft tags are all examples.  Each of these serves the purpose of simplifying and decoupling.  We have seen many exploits for the first four; what about these last two?

As you likely know, QR Codes and Microsoft tags are graphical images targeted at print media, though there is no reason they can’t be used in an online fashion.  They are most often presented as rectangular graphics (examples below).  The reason for using them is to provide an easy way for someone to access a web page (or other online resource) related to the printed content.  Since these images represent character data they can also be used to house information, like contact details, that does not require online access to interpret.

The use case is simple: install a special program that interprets the codes or tags; point the camera from a smart phone or computer at the graphic; and voilà, your phone presents a web page, phone number or other embedded content. Basically this avoids having to manually enter a URL.  Depending on a company’s marketing strategy this is a powerful feature since a particular ad might want to direct a person to a URL that embeds  information about the specific advertisement, media source, publication page and so forth.  Typing in a complicated URL would put off many people but this removes most of the effort while making the print media interactive.

The main issues with adoption are educating the public about the use of these codes and getting people to install the reader software.  Some of you may recall Radio Shack trying to do something similar several years ago.  They created a scanning device, given out for free, that people had to connect to their PCs.  They could then scan a specific item in a Radio Shack catalog or advertisement and be brought to a web page with detailed information and ordering instructions.

Although that particular attempt failed, these newer approaches have the advantages of being broadly available, leveraging a common accessory on a smart phone (camera) and providing benefits to more than one company.  It will be interesting to see if any of the competing standards catch on with the general public (beyond the two mentioned already there are others such as Data Matrix, Quickmark and PDF417).

My concern, however, isn’t whether these graphical links become popular; it is whether they present another security risk. I believe that they do, in a manner similar to Tiny URLs, yet possibly more insidious.


How I Spent My Christmas Vacation

Wednesday, January 5th, 2011

(or Upgrading to Android and Windows 7)

The holidays are usually a time I can use to catch up on some extra reading or research.  This year I had two major infrastructure changes that occupied my time.  I moved from my Blackberry Storm to an HTC Incredible and from my old Gateway M680 with Windows XP to a Dell Vostro 3700 running Windows 7.  It has been a bumpy couple of weeks getting my virtual life back in order.

Before getting into some of the details of the experiences, I’ll summarize by saying that both upgrades were worth the learning curve and associated frustration.  The Incredible’s hardware and the Android OS are orders-of-magnitude beyond the Storm in terms of usability, reliability, and functionality.  On my computer, Windows 7 (64-bit professional version) provides a clean and efficient environment.  The compatibility with 32-bit applications has worked flawlessly so far.

The phone journey…

I ordered the Incredible with the intention of switching over to it during the week before Christmas.  I would be off from work that week so any issues with email and calendar wouldn’t pose much risk.  However Verizon had other plans.  A day after the Incredible arrived they shut off my Storm.  This meant I had to get the Incredible going immediately.  This was during a week that I was traveling to Alabama and Vermont so I needed my cell phone working reliably.

I was pleasantly surprised at how quickly I was fully operational with the basic services (phone, email and calendar).  Blue Slate uses Google as our hosted email service so its ease of integration with the Android environment isn’t a surprise.  The phone setup process through Verizon has changed since I got my Storm several years ago.  Making on-line changes to my services is now simple.  I quickly expanded my data plan so that I could use the 3G Mobile feature of the Incredible while at the client’s site.  No issues at all!

My main disappointment with the Incredible is its battery life. With my Storm I could go days without recharging.  Now I have to recharge my phone every night.  I’ve gone through the “kill the app” phase and found that process doesn’t really help.  I use WiFi as much as possible since that is supposed to save battery life over using the cell connection to access email and internet services.  I keep the screen dimmed and turn off location services when they are not needed.

On the bright side, the variety of applications, including a nice SSH tool, makes the phone amazingly versatile.  I don’t have to fire up my computer to check on a batch job or fix a basic database problem on our Linux servers.  The GPS services surpass my Magellan’s capabilities so I have one less device to carry with me on trips.

All in all I’m very pleased with my move to the Incredible.  I probably would have considered the iPhone but really prefer Verizon’s coverage.  This phone should serve me well for my 2-year contract.

The computer journey…

My new Dell arrived several weeks before Christmas.  I put off doing anything with it, knowing that the process of moving my virtual life, installed and configured over the course of 4 years on my trusty Gateway laptop, would be onerous.  I’m glad I waited.  Although the Dell is a great machine, the process of getting products installed (or obtaining newer versions) and getting files and configurations in place took several days.


Creating a SPARQL Endpoint Using Joseki

Monday, November 29th, 2010

Being a consumer of semantic data I thought creating a SPARQL endpoint would be an interesting exercise.  It would require having some data to publish as well as working with a SPARQL library.  For data, I chose a set of mileage information that I have been collecting on my cars for the last 5 years.  For technology, I decided to use the Joseki SPARQL Server, since I was already using Jena.

For those who want to skip the “how” and see the result, the SPARQL endpoint along with sample queries and a link to the ontology and data is at: http://monead.com/semantic/query.html

The first step in this project was to convert my mileage spreadsheets into triples.  I looked briefly for an existing ontology in the automobile domain but didn’t find anything I could use.  I created an ontology that would reflect my approach to recording automobile mileage data.  My data  records the miles traveled between fill-ups as well as the number of gallons used.  I also record the car’s claimed MPG as well as calculating the actual MPG.

The ontology reflects this perspective of calculating the MPG at each fill-up.  This means that the purchase of gas is abstracted to a class with information such as miles traveled, gallons used and date of purchase as attributes.  I abstracted the gas station and location as classes, assuming that over time I might be able to flesh these out (in the spreadsheet I record the name of the station and the town/state).

A trivial Java program converts my spreadsheet (CSV) data into triples matching the ontology.  I then run the ontology and data through Pellet to derive any additional triples from the ontology.  The entire ontology and current data are available at http://monead.com/semantic/data/HybridMileageOntologyAll.Inferenced.xml.
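The conversion step can be pictured with a small sketch that uses only the standard library and writes N-Triples lines.  It is not the program described above: the namespace, the class and property names (GasPurchase, purchaseDate, milesTraveled, gallonsUsed) and the column order are all assumptions made for illustration, and the real ontology may differ.

```java
import java.util.ArrayList;
import java.util.List;

class CsvToTriplesSketch {
    // Hypothetical namespace; not the IRIs used by the published ontology
    static final String NS = "http://example.org/mileage#";
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";
    static final String RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>";

    // Formats a typed literal in N-Triples syntax
    static String literal(String value, String xsdType) {
        return "\"" + value + "\"^^<" + XSD + xsdType + ">";
    }

    // Converts one CSV row (assumed columns: date, miles, gallons) into N-Triples lines
    static List<String> toTriples(int rowNumber, String csvLine) {
        String[] fields = csvLine.split(",");
        String subject = "<" + NS + "fillup" + rowNumber + ">";
        List<String> triples = new ArrayList<>();
        triples.add(subject + " " + RDF_TYPE + " <" + NS + "GasPurchase> .");
        triples.add(subject + " <" + NS + "purchaseDate> " + literal(fields[0].trim(), "date") + " .");
        triples.add(subject + " <" + NS + "milesTraveled> " + literal(fields[1].trim(), "decimal") + " .");
        triples.add(subject + " <" + NS + "gallonsUsed> " + literal(fields[2].trim(), "decimal") + " .");
        return triples;
    }

    public static void main(String[] args) {
        // Made-up sample row, not taken from the actual mileage data
        for (String triple : toTriples(1, "2010-11-01,410.5,9.8")) {
            System.out.println(triple);
        }
    }
}
```

Each fill-up becomes one subject with a type and three typed literals, which mirrors the idea of modeling the gas purchase as a class with miles, gallons and date as attributes.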

It turns out that the ontology creation and data conversion were the easy parts of this project.  Getting Joseki to work as desired took some time, mostly because I couldn’t find much documentation for deploying it as a servlet rather than using its standalone server feature.  I eventually downloaded the Joseki source in order to understand what was going wrong.  The principal issue is that Joseki doesn’t seem to understand the WAR environment and relative paths (e.g. relative to its own WAR).

I had two major path issues: 1) getting Joseki to find its configuration (joseki-config.ttl); and 2) getting Joseki to find the triple store (in this case a flat file).


Semantic Web Summit (East) 2010 Concludes

Thursday, November 18th, 2010

I attended my first semantic web conference this week, the Semantic Web Summit (East) held in Boston.  The focus of the event was how businesses can leverage semantic technologies.  I was interested in what people were actually doing with the technology.  The one and a half days of presentations were informative and diverse.

Our host was Mills Davis, a name that I have encountered frequently during my exploration of the semantic web.  He did a great job of keeping the sessions running on time as well as engaging the audience.  The presentations were generally crisp and clear.  In some cases the speaker presented a product that utilizes semantic concepts, describing its role in the value chain.  In other cases we heard about challenges solved with semantic technologies.

My major takeaways were: 1) semantic technologies work and are being applied to a broad spectrum of problems and 2) the potential business applications of these technologies are vast and ripe for creative minds to explore.  This all bodes well for people delving into semantic technologies since there is an infrastructure of tools and techniques available upon which to build while permitting broad opportunities to benefit from leveraging them.

As a CTO with 20+ years focused on business environments, including application development, enterprise application integration, data warehousing, and business intelligence, I identified most closely with the sessions geared around intra-business and B2B uses of semantic technology.  There were other sessions looking at B2C which were well done but not applicable to the world in which I find myself currently working.

Talks by Dennis Wisnosky and Mike Dunn were particularly focused on the business value that can be achieved through the use of semantic technologies.  Further, they helped to define basic best practices that they apply to such projects.  Dennis in particular gave specific information around his processes and architecture while talking about the enormous value that his team achieved.

Heartening to me was the fact that these best practices, processes and architectures are not significantly different than those used with other enterprise system endeavors.  So we don’t need to retool all our understanding of good project management practices and infrastructure design; we just need to internalize where semantic technology best fits into the technology stack.