
Posts Tagged ‘open source’

OpenOffice in a Heterogeneous Office Tool Environment

Friday, March 4th, 2011

A few months ago I blogged about my new computer and my quest to use only OpenOffice as my document tool suite (How I Spent My Christmas Vacation).  For a little over a month I was able to work effectively, exchanging documents and spreadsheets with coworkers without incident.  However, it all came crashing down.  My goal in this blog entry is to describe what worked and what didn’t.

OpenOffice provides five key office-type software packages: Writer for word processing, Calc for spreadsheets, Impress for presentations, Base for database work and Draw for diagrams.  There is a sixth tool, Math, for creating scientific formulas and equations; it is similar to the equation editor available with MS Word.

As one of my coworkers suggests when providing positive and negative feedback, I’ll use the sandwich approach.  If you’ve not heard of this approach, the idea is to start with some good points, then go through the issues and wrap up with a positive item or two.

On a positive note, the OpenOffice suite is production worthy.  For the two tools that seem to be most commonly used in office settings, word processing and spreadsheets, the Writer and Calc tools have all the features that I was used to using with the Microsoft Office (MS Office) tools.  In fact, for the most part I was unaware that I was using a different word processor or spreadsheet.  From a usability perspective there is little or no learning curve for an experienced MS Office user to effectively use these OpenOffice tools.

Of key importance to me was the ability to work with others who were using MS Office.  The ability for OpenOffice to open the corresponding MS Office documents worked well at first but then cracks began to show.

OpenOffice Writer was able to work with MS Office documents in both the classic Word “doc” format and the newer Word 2007 and later “docx” format.  However, Writer cannot save to the “docx” format; if you open a “docx” document, the only MS Office format available for saving it is “doc”.  At first this was a small annoyance, but it meant that any “docx”-specific features would be lost when the document was saved back to “doc”.

Another aggravating issue was confusion when using the “Record Changes” feature, which is analogous to the “Track Changes” feature in MS Word.  Although updates made using MS Word could be seen in Writer, notes created in Word were presented inconsistently in Writer.  The tracked changes were also somewhat difficult to follow when multiple iterations of edits had occurred.  At work we often use track changes as we collaborate on documentation, so this feature needs to work well for our team.

I eventually ran into two complete show-stoppers.  In the first case, OpenOffice was unable to display certain images embedded in an MS Word document.  Although some images had previously been somewhat distorted, it turned out that certain types of embedded images wouldn’t display at all.  The second issue involved the Impress (presentation) tool.

I’ve mentioned that Writer and Calc are very mature and robust.  The Impress tool doesn’t seem to be as solid.  As I began working with a team member on a presentation we were delivering in February, I discovered that there appears to be little compatibility between MS PowerPoint and Impress.  I was unable to work with the PowerPoint presentation using Impress; the images, animations and text were all rendered completely wrong.

To be fair, I have created standalone presentations using Impress and the tool has a good feature set and works reliably.  I’ve used it to create and deliver presentations with no issues.  OpenOffice even seems to provide a nicer set of boilerplate presentation templates than the ones that come with MS PowerPoint.

My conclusion after working with OpenOffice for about three months is that it is a completely viable solution when used as a company’s sole document suite.  However, it is not possible to succeed with these tools in a heterogeneous environment where documents must be shared with MS Office users.

I will probably continue to use OpenOffice for personal work.  I’ll also continue to upgrade and try using it with MS Office documents from time to time.  Perhaps someday it will be possible to leverage this suite effectively in a multi-platform situation. Certainly from an ROI perspective it becomes harder and harder to justify the cost of the MS Office suite when such a capable and well-designed open source alternative exists.

Have you tried using alternatives to MS Office in a heterogeneous office tool environment?  Have you had better success than I have?  Any pointers on being able to succeed with such an approach?  Is such an approach even reasonable?  Please feel free to share your thoughts.

Creating a SPARQL Endpoint Using Joseki

Monday, November 29th, 2010

Being a consumer of semantic data, I thought creating a SPARQL endpoint would be an interesting exercise.  It would require having some data to publish as well as working with a SPARQL library.  For data, I chose a set of mileage information that I have been collecting on my cars for the last five years.  For technology, I decided to use the Joseki SPARQL Server, since I was already using Jena.

For those who want to skip the “how” and see the result, the SPARQL endpoint along with sample queries and a link to the ontology and data is at: http://monead.com/semantic/query.html

The first step in this project was to convert my mileage spreadsheets into triples.  I looked briefly for an existing ontology in the automobile domain but didn’t find anything I could use, so I created an ontology that reflects my approach to recording automobile mileage data.  My data records the miles traveled between fill-ups as well as the number of gallons used.  I also record the car’s claimed MPG and calculate the actual MPG.

The ontology reflects this perspective of calculating the MPG at each fill-up.  This means that the purchase of gas is abstracted to a class with information such as miles traveled, gallons used and date of purchase as attributes.  I abstracted the gas station and location as classes, assuming that over time I might be able to flesh these out (in the spreadsheet I record the name of the station and the town/state).

A trivial Java program converts my spreadsheet (CSV) data into triples matching the ontology.  I then run the ontology and data through Pellet to derive any additional triples from the ontology.  The entire ontology and current data are available at http://monead.com/semantic/data/HybridMileageOntologyAll.Inferenced.xml.
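The conversion step can be sketched in plain Java.  The `auto:` prefix and property names below are illustrative stand-ins, not the actual ontology published on my site, and the real program reads the entire CSV file rather than a single hard-coded row:

```java
public class MileageToTurtle {
    // Convert one spreadsheet record (date, miles, gallons) into a Turtle
    // fragment.  The "auto:" prefix and property names are hypothetical
    // examples, not the published mileage ontology.
    static String toTurtle(int id, String date, double miles, double gallons) {
        double mpg = miles / gallons;  // actual MPG computed at each fill-up
        StringBuilder sb = new StringBuilder();
        sb.append("auto:FillUp_").append(id).append("\n");
        sb.append("    a auto:GasPurchase ;\n");
        sb.append("    auto:purchaseDate \"").append(date).append("\" ;\n");
        sb.append("    auto:milesTraveled ").append(miles).append(" ;\n");
        sb.append("    auto:gallonsUsed ").append(gallons).append(" ;\n");
        sb.append(String.format("    auto:actualMpg %.1f .%n", mpg));
        return sb.toString();
    }

    public static void main(String[] args) {
        // One sample row as it might appear in the spreadsheet
        System.out.print(toTurtle(1, "2010-11-01", 450.0, 10.0));
    }
}
```

The derived `auto:actualMpg` value is the one piece of data not taken directly from the spreadsheet; everything else is a straight column-to-property mapping.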

It turns out that the ontology creation and data conversion were the easy parts of this project.  Getting Joseki to work as desired took some time, mostly because I couldn’t find much documentation on deploying it as a servlet rather than using its standalone server feature.  I eventually downloaded the Joseki source in order to understand what was going wrong.  The principal issue is that Joseki doesn’t seem to understand the WAR environment and relative paths (e.g. relative to its own WAR).

I had two major path issues: 1) getting Joseki to find its configuration (joseki-config.ttl); and 2) getting Joseki to find the triple store (in this case a flat file).


Semantic Workbench, Get It In Gear

Tuesday, September 21st, 2010

I received a helpful push from Paul Evans this evening.  He reminded me that the Semantic Workbench SourceForge project (semanticwb.sourceforge.net) is just sitting idle, waiting to be kicked off.  We talked about the vision around the project, which needs to be clearly and concisely articulated as a mission.  At that point we’ll have a direction to take.

This conversation coincided with my attendance at two semantic-web presentations at Oracle OpenWorld, which I am able to attend since it is co-located with JavaOne.  I’ll write more about my experiences at this year’s JavaOne conference soon.

These semantic-web presentations validated the value of semantic technologies and the need to make them more visible to the IT community.  For my part, this means I need to do more writing and presenting about semantic technologies while creating a renewed vigor around the Semantic Workbench project.

As Paul and I spoke and I tried to define my vision for the project, I realized that I was being too wordy for a mission statement.  The fundamentals of my description also differed from the current project overview on SourceForge.  The overview does not describe the truly useful application that I would like to see come out of the project.

Recognizing this disconnect reinforced the need to come up with a more useful and actionable mission.  In the hopes that the project can be of value, I present this mission statement:

The Semantic Workbench strives to provide a complete Java-based GUI and tool set for exploring, testing, and validating common semantic web-based operations.


Semantic Workbench – A Humble Beginning

Wednesday, August 18th, 2010

As a way to work with semantic web concepts, including asserting triples, seeing the resulting inferences and also leveraging SPARQL, I have needed a GUI.  In this post I’ll describe a very basic tool that I have created and released that allows a user to interact with a semantic model.

My objectives for this first GUI were basic:

  1. Support input of a set of triples in any format that Jena supports (e.g. RDF/XML, N3, N-Triples and Turtle)
  2. See the inferences that result for a set of assertions
  3. Create a tree view of the ontology
  4. Make it easy to use SPARQL queries with the model
  5. Allow the resulting model to be written to a file, again using any format supported by Jena

Here are some screen shots of the application.  Explanations of the tabs are then provided.

The program provides each feature in a very basic way.  On the Assertions tab a text area is used for entering assertions.  The user may also load a text file containing assertions using the File|Open menu item.  Once the assertions are entered, a button is enabled that allows the reasoner to process them.  The reasoner level is selected by the user from a drop-down.


Creating RDF Triples from a Relational Database

Thursday, August 5th, 2010

In an earlier blog entry I discussed the potential reduction in refactoring effort if our data is represented as RDF triples rather than relational structures.  As a way to give myself easy access to RDF data and to work more with semantic web tool features I have created a program to export relational data to RDF.

The program is really a proof-of-concept.  It takes a SQL query and converts the resulting rows into assertions of triples.  The approach is simple: given a SQL statement and a chosen primary key column (PK) to represent the instance for the exported data, assert triples with the primary key column value as the subject, the column names as the predicates and the non-PK column values as the objects.

Here is a brief sample taken from the documentation accompanying the code.

  • Given a table named people with the following columns and rows:
       id    name    age
       --    ----    ---
       1     Fred    20
       2     Martha  25
  • And a query of:  select id, name, age from people
  • And the primary key column set to: id
  • Then the asserted triples (shown using Turtle and skipping prefixes) will be:
       dsr:PK_1
          a       owl:Thing , dsr:RdbData ;
          rdfs:label "1" ;
          dsr:name "Fred" ;
          dsr:age "20" .

       dsr:PK_2
          a       owl:Thing , dsr:RdbData ;
          rdfs:label "2" ;
          dsr:name "Martha" ;
          dsr:age "25" .

You can see that the approach represents a quick way to convert the data.
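A minimal sketch of this mapping in plain Java, with a `Map` standing in for a JDBC result-set row (the actual program executes the SQL query itself); it reproduces the `dsr:PK_1` convention shown above:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowToTriples {
    // Map one result-set row to Turtle: the PK column value becomes the
    // subject, column names become predicates and the remaining column
    // values become the objects.
    static String rowToTurtle(String pkColumn, Map<String, String> row) {
        StringBuilder sb = new StringBuilder();
        sb.append("dsr:PK_").append(row.get(pkColumn)).append("\n");
        sb.append("   a       owl:Thing , dsr:RdbData ;\n");
        sb.append("   rdfs:label \"").append(row.get(pkColumn)).append("\" ;\n");
        for (Map.Entry<String, String> col : row.entrySet()) {
            if (col.getKey().equals(pkColumn)) continue;  // PK handled above
            sb.append("   dsr:").append(col.getKey())
              .append(" \"").append(col.getValue()).append("\" ;\n");
        }
        // Replace the final " ;" separator with " ." to close the description
        int last = sb.lastIndexOf(";");
        sb.replace(last, last + 1, ".");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "1");
        row.put("name", "Fred");
        row.put("age", "20");
        System.out.println(rowToTurtle("id", row));
    }
}
```

Note that every object is asserted as a string literal; mapping SQL column types onto typed literals would be a natural next refinement.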


My First Semantic Web Program

Saturday, June 5th, 2010

I have created my first slightly interesting, to me anyway, program that uses some semantic web technology.  Of course I’ll look back on this in a year and cringe, but for now it represents my understanding of a small set of features from Jena and Pellet.

The basis for the program is an example described in Hebler, Fischer et al.’s book “Semantic Web Programming” (ISBN: 047041801X).  The intent of the program is to load an ontology into three models, each running a different level of reasoner (RDF, RDFS and OWL), and output the resulting assertions (triples).

I made a couple of changes to the approach used in the book’s sample.  First, I allow any supported input file format to be loaded automatically (you don’t have to tell the program what format is being used).  Second, I report the actual differences between the models rather than just showing all the resulting triples.
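The difference reporting can be illustrated without Jena by treating triples as plain strings; the real program performs the equivalent comparison on Jena Model objects, and the sample triples below are hypothetical:

```java
import java.util.Set;
import java.util.TreeSet;

public class ModelDiff {
    // Report the triples present in 'inferred' but not in 'base', i.e. the
    // statements the reasoner added.  Triples are plain strings here; the
    // actual program compares reasoner-backed Jena models the same way.
    static Set<String> newTriples(Set<String> base, Set<String> inferred) {
        Set<String> diff = new TreeSet<>(inferred);
        diff.removeAll(base);  // keep only the additional inferences
        return diff;
    }

    public static void main(String[] args) {
        Set<String> asserted = Set.of(
            "ex:Car rdfs:subClassOf ex:Vehicle",
            "ex:myCar rdf:type ex:Car");
        Set<String> afterReasoning = Set.of(
            "ex:Car rdfs:subClassOf ex:Vehicle",
            "ex:myCar rdf:type ex:Car",
            "ex:myCar rdf:type ex:Vehicle");  // added by an RDFS reasoner
        System.out.println(newTriples(asserted, afterReasoning));
    }
}
```

Running the same comparison between the RDF, RDFS and OWL models makes it easy to see exactly what each stronger reasoner contributes.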

As I worked on the code, which is currently housed in one uber-class (that’ll have to be refactored!), I realized that there will be lots of reusable “plumbing” code that comes with this type of work.  Setting up models with various reasoners, loading ontologies, reporting triples, interfacing to triple stores, and so on will become nuisance code to write.

Libraries like Jena help, but they abstract at a low level.  I want a semantic workbench that makes playing with the various libraries and frameworks easy.  To that end I’ve created a Sourceforge project called “Semantic Workbench”.

I intend for the Semantic Workbench to provide a GUI environment for manipulating semantic web technologies. Developers and power users would be able to use such a tool to test ontologies, try various reasoners and validate queries.  Developers could use the workbench’s source code to understand how to utilize frameworks like Jena or reasoner APIs like that of Pellet.

I invite other interested people to join the Sourceforge project. The project’s URL is: http://semanticwb.sourceforge.net/

On the data side, in order to have a rich semantic test data set to utilize, I’ve started an ontology that I hope to grow into an interesting example.  I’m using the insurance industry as its basis.  The rules around insurance and the variety of concepts should provide a rich set of classes, attributes and relationships for modeling.  My first version of this example ontology is included with the sample program.

Finally, I’ve added a semantic web section to my website where I’ll maintain links to useful information I find as well as sample code or files that I think might be of interest to other developers.  I’ve placed the sample program and ontology described earlier in this post on that page along with links to a variety of resources.

My site’s semantic web page’s URL is: http://monead.com/semantic/
The URL for the page describing the sample program is: http://monead.com/semantic/proj_diffinferencing.html