
MongoDB and Java – Powerful Complementary Platforms

May 31st, 2016

I have found that including MongoDB in the design of Java applications allows me a valuable level of flexibility in meeting client objectives. I have created an initial open source project on GitHub, JavaMongo, with the goal of providing working examples of Java and MongoDB integration. A secondary goal is to include development best practices, such as using testing frameworks and good coding style.

This posting is intended to give a little background on why I find Java and MongoDB to be useful tools in my software development arsenal and then to introduce the JavaMongo project. Future postings will include some videos walking developers through the examples as well as the frameworks being used (like JUnit, Cobertura and Checkstyle).

Background

Java is a ubiquitous platform for creating business applications. It has proven itself across a wide range of use cases from small point-based solutions to large generalized solution stacks. The variety of libraries, frameworks and tools for designing, building, testing and managing Java applications provides significant benefits to companies building solutions using Java. However, an application without ready access to data isn’t particularly useful. As enterprise-scale database options have broadened to include NoSQL, those individuals creating Java-based solutions must be sure to take advantage of new data options in order to benefit from the strengths of such components.

MongoDB is a great NoSQL platform that can be used to provide additional capabilities to your applications. MongoDB is a document store that has proven its reliability, scalability and ease of integration across numerous small and large-scale applications. Its value and focus complement the way we use relational databases for online transaction-oriented processing (OLTP) and offer advantages over the way we use relational databases for data marts and warehouses.

A point of clarification before proceeding: I’m not here to say that MongoDB is better than some other data product, or, more generally, that document stores are better than relational databases. I find such arguments meaningless without a specific use case or project goal. These technologies are different and have individual strengths and weaknesses in the face of a specific set of project objectives.

I have found that MongoDB plugs in well when I need a place to federate data (structured, semi-structured and unstructured). Given a common platform, it simplifies the work required to build and alter connections between attributes. If you’ve looked at other information about my background you’ll see that I find the use of semantic technology to be incredibly valuable for data federation and classification. MongoDB as a flexible repository plays well with semantics. At the end of this post I’ll give you a small example of that.

JavaMongo Project

The JavaMongo project is intended to provide Java developers with working examples of Java and MongoDB integrations. Over time I expect a variety of common situations to be demonstrated, with associated documentation explaining the use case and the resulting implementation.

In order to have some interesting data to work with, I’m using data sets that my company releases to the public domain. In order to work with the JavaMongo examples you’ll need to import that data into your MongoDB instance. For more information about downloading and importing the sample data, see the discussion on MongoDB Collection of Honeypot Data on my NoSQL topic page.

The initial JavaMongo project contains a basic README file with information on running the example code. Instead of rehashing that information in this post, I’d like to walk through the basic operations being demonstrated in the example code. The main class we’ll explore is BasicStatistics (us.daveread.education.mongo.honeypot.BasicStatistics).

As you know, a Java program starts execution with the main() method. The first step that the BasicStatistics main() method takes is to create an instance of the BasicStatistics class.

BasicStatistics Constructor

The constructor code goes through the entire process of connecting to a MongoDB database, accessing a collection and running a query on data in the collection.

First, an instance of MongoClientOptions is created. This class allows us to configure certain client side options related to the connection. I’ll get into more detail with this in future examples. In this case, the program is simply setting the connection timeout to 2000 milliseconds (2 seconds) so that if the instance is not available the program won’t hang for a long time. You wouldn’t make the timeout this short in a production environment but it helps for debugging our local environment by failing fast if something is wrong.
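The fail-fast idea behind that short timeout can be sketched without the MongoDB driver at all. The following is not the JavaMongo code and does not use MongoClientOptions; it is a minimal illustration using plain java.net sockets, where the class name FailFastProbe, the method isReachable, and the choice of localhost with MongoDB's default port 27017 are all assumptions made for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether something is listening before we commit to real work. */
class FailFastProbe {

    /**
     * Returns true if a TCP connection to host:port succeeds within timeoutMillis.
     * A refused connection or a timeout both count as "not reachable".
     */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws SocketTimeoutException (an IOException) once the timeout expires
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: a local MongoDB instance on its default port, 2000 ms as in the text above
        boolean up = isReachable("localhost", 27017, 2000);
        System.out.println(up
                ? "Something is listening on port 27017"
                : "Nothing reachable on port 27017 within 2 seconds");
    }
}
```

A probe like this only confirms that a port accepts connections, not that a healthy MongoDB instance is behind it. In a production setting you would likely choose a longer timeout, as noted above.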

Read the rest of this entry »

Accountable Care Organizations, Data Federation and CMS’ Updated Final Rule for the Medicare Shared Savings Program

June 8th, 2015

CMS has published a final rule (http://federalregister.gov/a/2015-14005) focused on changes to the Medicare Shared Savings Program (MSSP) which impacts Accountable Care Organizations (ACO) significantly. There are a variety of interesting changes being made to the program. For this discussion I’m looking at CMS’ continual drive toward data use and integration as a basis for improving quality of care, gaining efficiency and cutting costs in health care. One way this drive is manifested in the new rule regards an ACO’s plans as related to “enabling technologies,” which is an umbrella term for leveraging electronic data.

As background, Subpart B (425.100 to 425.114) of the MSSP describes ACO eligibility requirements. Two of the changes in this section clearly underscore the importance of electronic data and data integration to the fundamental operation of an ACO. Specifically, looking at page 127, the following updates are being made to section 425.112(b)(4) (emphasis mine):

Therefore, we proposed to add a new requirement to the eligibility requirements under § 425.112(b)(4)(ii)(C) which would require an ACO to describe in its application how it will encourage and promote the use of enabling technologies for improving care coordination for beneficiaries. Such enabling technologies and services may include electronic health records and other health IT tools (such as population health management and data aggregation and analytic tools), telehealth services (including remote patient monitoring), health information exchange services, or other electronic tools to engage patients in their care.

It goes on to add:

Finally, we proposed to add a provision under § 425.112(b)(4)(ii)(E) to require that an ACO define and submit major milestones or performance targets it will use in each performance year to assess the progress of its ACO participants in implementing the elements required under § 425.112(b)(4). For instance, providers would be required to submit milestones and targets such as: projected dates for implementation of an electronic quality reporting infrastructure for participants;

It is clear from the first change that an ACO must have a documented plan in place for continually expanding its use of electronic data and providing data visibility and integration between itself and its beneficiaries and providers. This is a tall order. The number of different systems and data formats along with myriad reporting and analytic platforms makes a traditional integration approach tedious at best and a significant business risk at worst.

The second change, keeping CMS apprised of the progress of data-centric projects, is clearly intended to keep the attention on these data publishing and integration projects. It won’t be enough to have a well-articulated plan; the ACO must be able to demonstrate progress on a regular basis.

Read the rest of this entry »

Impetus for Our Semantics and NoSQL Workshop at the 2015 SmartData Conference

May 15th, 2015

I’m looking forward to being one of the presenters for infuzIT’s hands-on data integration and analysis workshop at this year’s SmartData Conference in San Jose. Giving people the opportunity to see the amazing power of semantics combined with NoSQL to quickly integrate and analyze data makes my day.

My background includes significant work with data, both as an application developer and data warehouse architect. The acceleration of data-centric hardware and software capabilities over the past 10 years now supports a very different paradigm for exploring, reporting and analyzing data. Processes and procedures for creating a data warehouse or mart, the accepted rules of the road for creating integrated data repositories, are no longer clear cut. The data federation debate is no longer Inmon or Kimball.

A significant shift in data integration revolves around the required lifespan of the integrated data. This lifespan has two key aspects whose evolution now allows us to rethink our approach to data federation. This permits us to be much more agile when bringing heterogeneous data sources together. The two aspects are reflected in these design questions: 1) what data, if any, will be rehosted; and 2) what relationships will be supported within the integrated data?

Rehosting Data

In a traditional data warehouse the data must be rehosted. The new repository is the target where transformed data (cleaned-up, standardized) exists. The queries that will be retrieving data from multiple sources are really pulling data from a single source that has been populated from multiple sources. It represents a heavyweight process, driven by Extract-Transform-Load (ETL) scripts and requiring space to host redundant information.

Relationships Between Data Elements

The target warehouse schema determines what relationships are defined between the data elements being combined. Getting this “right” requires careful planning and coordination between the various groups that will use the warehouse. Given the significant effort, represented as cost, organizations tend to design data warehouses to support broad constituencies as a way to amortize the investment across departments and projects.

Paradigm Shift

Semantics and NoSQL allow us to reduce the effort of integrating data by orders of magnitude. They support a completely different mindset for bringing data together. Instead of carefully designing a model that works well in the general sense (reducing the value in specific cases) we have environments that allow us to experiment, adjust and focus on each case.

Below are several drivers which allow us to approach data federation differently using semantics and NoSQL.

Read the rest of this entry »

Medicaid Managed Care Congress Conversations Highlight the Value of Data Federation

May 22nd, 2014

Photo of Scott, Chris and Dave at MMCC 2014

This week I had the opportunity to attend the Medicaid Managed Care Congress (MMCC) in Baltimore, MD and the privilege of speaking with a variety of leaders from provider, payer, and services organizations. With me from Blue Slate Solutions were Scott Van Buren and Chris Garber. A common theme we heard as we spoke with the attendees was the challenge of bringing data together from multiple sources and making sense of that information.

Medicaid is potentially the most complex government program that exists in the United States. There are federal and state aspects as well as portions that are handled at a local level. Some funding and services are defined as required while others are optional. The financial models’ formulas involve many variables. In short, there are numerous challenges in Medicaid, including the dual eligible changes that seek to address the services disconnects that often exist when a person is eligible for both Medicare and Medicaid.

Combining data from providers, payers, patients, government entities and the community is necessary in order to optimize the quality of care that is provided to each patient. The definition of provider continues to expand, covering not just the medical needs of a person but incorporating the various social services, so important to the holistic care of an individual, under the umbrella of “provider.”

As we listened to people and talked about their data challenges we were also able to walk them through the Data Unleashed™ approach. The iterative learn-as-you-go process resonated across the board, whether people represented patient advocacy groups, provider organizations or healthcare plans. The capability to start small, obtain value quickly and adapt rapidly to changing environments fits the Medicaid complexities well.

Data Unleashed Front End Screenshot

If you would like to learn more about our agile and lightweight approach to accessing data from across your enterprise in order to quickly begin creating meaningful reporting and analytics, please check out dataunleashed.com for descriptions, videos and case studies. We’d also appreciate the opportunity to host a webinar with your team where we can explore Data Unleashed™ in more depth and discuss your specific data challenges.

Data Unleashed™ Headed to the 2014 Medicaid Managed Care Congress

May 15th, 2014

For those of you spending time in Baltimore next week (May 19-21, 2014) to attend the Medicaid Managed Care Congress, please stop by Blue Slate’s booth. Our MINI road trip begins Sunday as we head for Camden Yards and the beautiful inner harbor area. Our goal in attending? Having the opportunity to speak with you about your data challenges as well as your Medicaid journey.

We will be demonstrating what we mean by lightweight data federation and agile analytics as the drivers behind creating the Data Unleashed™ service platform. Given our extensive healthcare focus, we have deep experience working with companies on Medicaid initiatives, such as those involving dual eligibles, for instance the FIDA program in New York State.

Beyond data integration and analytics, we provide expertise for plans to: implement business process and business rule management solutions; prepare for site reviews and audits; and unify data from a variety of internal and cloud-based systems. More broadly beyond Medicaid, we work extensively in the Medicare and commercial healthcare space, leading transformative change for businesses such as Medicare Administrative Contractors (MACs) and Blues plans.

We look forward to having a chance to learn more about your operational challenges and share with you our organization’s background and focus areas. Let’s get together and explore opportunities to advance your organization’s strategic goals around improving quality of care and reducing costs.

Why Isn’t Everybody Doing It?

April 28th, 2014

That is a very dangerous question for a leader to ask when evaluating options. Yet it is one I hear far too often in the healthcare realm. It encapsulates a rejection of innovation, evolution and learning all in one terse, often rhetorical, question.

A common context for this question, often prefixed by, “If this is so great…,” is when discussing semantics and semantic technology. Although these concepts are not new to some industries, such as media, they are foreign in many healthcare organizations. Yet we know that healthcare payers and providers alike struggle with massive data integration and data analytics challenges just like media conglomerates.

The needs to combine siloed information, drive an analytics mindset throughout an organization, and support the flexibility of a constantly changing IT environment are common in large healthcare organizations. Repeated attempts by organizations to meet these needs betray a lack of consensus around how to best achieve a valuable result.

Further, the implication that how most organizations solve a problem is optimal ignores the fact that best practices must change over time. The best way to solve a problem last year may not be the same this year. The healthcare industry is changing, the physical world of servers, networks, disk drives, memory is changing, and the expectations of members are changing. What was infeasible years ago becomes commonplace. Relational databases were all but unworkable in the 1970s due to a lack of experienced DBAs, slow disk drives, slow processors and limited memory.

In the same way, semantic formalization and graph databases were too new and limited to deal with large data sets until people gained expertise with ontologies while system hardware benefitted from another generation of Moore’s law. In the face of ongoing innovation, the question leaders should ask when approaching a challenge is, “What advancements have been made since the last time we looked at this problem?”

Leadership requires leading, not following. Leaders mentor their organizations through change in order to reach new levels of success. Leadership is based on learning, open-mindedness, creativity and risk-taking. The question, “Why isn’t everybody doing it?” is the antithesis of leadership and has no place there. In fact, if everybody is doing something, a leader would be better off asking, “How do we get ahead of what everybody is doing?”

Leaders must be at the forefront of pushing for better, faster, cheaper. Questioning the status quo, looking for new opportunities, seeking to leapfrog the competition: those are key foci for leadership.

As a leader, the next time you find yourself limiting your willingness to explore an option because everybody isn’t doing it, keep in mind that calculators, computers, automobiles, elevators, white boards, LED light bulbs, Google maps, telephones, the Internet, 3-D printing, open heart surgery, and many more concepts that are accepted or gaining traction, had a day when only one person or organization was “doing it.” Challenge yourself and your organization to find new options, new best practices and new paradigms for advancing your strategy and goals.

How Does Semantic Technology Enable Agile Data Analytics?

April 25th, 2014

I’m glad you asked. Scott Van Buren and I will be presenting a Dataversity webinar entitled, Using Semantic Technology to Drive Agile Analytics, on exactly that topic. Scheduled for May 14, 2014 (and available for replay afterwards), this webinar will highlight key semantic technology capabilities and how those provide an environment for data agility.

We will focus most of the webinar on a case study that demonstrates the agility of semantic technology being used to conduct data analysis within a healthcare payer organization. Healthcare expertise is not required in order to understand the case study.

As we look into several iterations of data federation and analysis, we will see the effectiveness of bringing the right subset of data together at the right time for a particular data-centric use. This concept translates well to businesses that have multiple sets of data or applications, including data from third parties, and seek to combine relevant subsets of that information for reporting or analytics. Further, we will see how this augments data warehousing projects, where the lightweight and agile data federation approach informs the warehouse design.

Please plan to join us virtually on May 14 as we describe semantic technology, lightweight data federation and agile data analytics. There will also be time for you to pose questions and delve into areas of interest that we do not cover in our presentation.

The webinar registration page is: http://content.dataversity.net/051414BlueslateWebinar_DVRegistrationPage.html

We look forward to having the opportunity to share our data agility thoughts and experiences with you.

San Jose and the SemTechBiz 2014 Conference, Here I Come!

April 18th, 2014

I am thrilled to have been invited back to participate at the Semantic Technology and Business (SemTechBiz) conference. This is the premier US conference for learning about, exploring and getting your hands on semantic technology. I’ll be part of a Blue Slate team (including Scott Van Buren and Michael Delaney) who will be conducting a half-day hands-on workshop, Integrating Data Using Semantic Technology, on August 19, 2014. Our mission is to have participants use semantic technology to integrate, federate and perform analysis across several data sources.

We have some work to do to iron out our overall use case, pulling from work we have done with several clients. At a minimum we’ll be working with database schemas, ontologies, reasoners and data analytics tools. It will be a fun and educational experience for attendees.

I’ll post more specifics once the SemTechBiz agenda is published and we have finalized the workshop structure. I hope to see you this August 19-21 in San Jose for our workshop and the amazing learning opportunities throughout the conference.

For more information on the conference, visit its website: http://semtechbizsj2014.semanticweb.com/index.cfm

Initial Time to Build? Vision to Release in Days? Those Aren’t Relevant Measures for Business Agility!

April 15th, 2014

I routinely receive emails, tweets and snail mail from IT vendors that focus on how their solution accelerates the creation of business applications. They will quote executives and technology leaders, citing case studies that compare the time to build an application on their platform versus others. They will make the claim that this speed to release proves that their platform, tool or solution is “better” than the competition. Further, they claim that it will provide similar value for my business’ application needs. The focus of these advertisements is consistently, “how long did it take to initially create some application.”

This speed-to-create metric is pointless for a couple of reasons. First, an experienced developer will be fast when throwing together a solution using his or her preferred tools. Second, an application spends years in maintenance, far longer than the time spent building its first version.

Build it fast!

Years ago I built applications for GE in C. I was fast. Once I had a good set of libraries, I could build applications for turbine parts catalogs in days. This was before windowing operating systems. There were frameworks from companies like Borland that made it trivial to create an interactive interface. I moved on to Visual Basic and SQLWindows development and was equally fast at creating client-server applications for GE’s field engineering team. I progressed to C++ and created CGI-based web applications. Again, building and deploying applications in days. Java followed, and I created and deployed applications using the leading edge Netscape browser and Java Applets in days and eventually hours for trivial interfaces.

Since 2000 I’ve used BPM and BRM platforms such as PegaRULES, Corticon, Appian and ILOG. I’ve developed applications using frameworks like Struts, JSF, Spring, Hibernate and the list goes on. Through all of this, I’ve lived the euphoria of the initial release and the pain of refactoring for release 2. In my experience not one of these platforms has simplified the refactoring of a weak design without a significant investment of time.

Speed to initial release is not a meaningful measure of a platform’s ability to support business agility. There is little pain in version 1 regardless of the design thought that goes into it. Agility is about versions 2 and beyond. Specifically, we need to understand what planning and practices during prior versions are necessary to promote agility in future versions.

Read the rest of this entry »

Heartbleed – A High-level Look

April 12th, 2014

There has been a lot of information flying about on the Internet concerning the Heartbleed vulnerability in the OpenSSL library. Among system administrators and software developers there is a good understanding of exactly what happened, the potential data losses and proper mitigation processes. However, I’ve seen some inaccurate descriptions and discussion in less technical settings.

I thought I would attempt to explain the Heartbleed issue at a high level without focusing on the implementation details. My goal is to help IT and business leaders understand a little bit about how the vulnerability is exploited, why it puts sensitive information at risk and how this relates to their own software development shops.

Heartbleed is a good case study for developers who don’t always worry about data security, feeling that attacks are hard and vulnerabilities are rare. This should serve as a wake-up call that programs need to be tested in two ways: for use cases and for misuse cases. We often focus on use cases, “does the program do what we want it to do?” Less frequently do we test for misuse cases, “does the program do things we don’t want it to do?” We need to do more of the latter.
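To make the use case versus misuse case distinction concrete, here is a small Java sketch. It is not OpenSSL's code and not taken from the video; EchoService, echo and rejects are hypothetical names invented for this illustration. The service echoes a payload back, and the caller also states how long the payload is. The use-case check asks whether an honest request is echoed correctly. The misuse-case check asks whether a request that claims more data than it actually sent is refused, which is the kind of check Heartbleed showed to be missing.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical echo service used only to illustrate use-case and misuse-case checks. */
class EchoService {

    /**
     * Echoes the payload back to the caller.
     * The claimed length must match the real payload length; anything else is rejected
     * instead of being trusted.
     */
    static byte[] echo(byte[] payload, int claimedLength) {
        if (payload == null || claimedLength < 0 || claimedLength != payload.length) {
            throw new IllegalArgumentException("claimed length does not match payload");
        }
        return Arrays.copyOf(payload, claimedLength);
    }

    /** Misuse-case helper: returns true when the service refuses the request. */
    static boolean rejects(byte[] payload, int claimedLength) {
        try {
            echo(payload, claimedLength);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] hello = "hello".getBytes(StandardCharsets.UTF_8);

        // Use case: does the program do what we want it to do?
        System.out.println(new String(echo(hello, 5), StandardCharsets.UTF_8)); // prints "hello"

        // Misuse case: does the program refuse to do what we don't want it to do?
        System.out.println(rejects(hello, 500)); // prints "true"
    }
}
```

In a real project these two checks would live in a test framework such as JUnit, with the misuse cases given the same standing as the use cases.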

I’ve created a 10-minute video to walk through Heartbleed. It includes the parable of a “trusting change machine.” The parable is meant to explain the Heartbleed mechanics without requiring that the viewer be an expert in programming or data encryption.

If you have thoughts about ways to clarify concepts like Heartbleed to a wider audience, please feel free to comment. Data security requires cooperation throughout an organization. Effective and accurate communication is vital to achieving that cooperation.

Here are the links mentioned in the video: