
Posts Tagged ‘data’

Thoughts on Blockchain’s Relationship to Data Security

Wednesday, June 13th, 2018

After reading an article in the Wall Street Journal, “Blockchain Could Be the Security Answer. Maybe.” (May 30, 2018), I was concerned that information in the article could mislead readers regarding the place of blockchain in a cybersecurity discussion. Further, various media sources offer ruminations on blockchain’s ability to protect information with insufficient detail regarding exactly how that information is protected.

This post isn’t meant to explain blockchain; there are many resources for that. Instead I focus on a few points made in the article specific to data security. In general, I find there is a lack of understanding about blockchain’s place in a data security context; the article simply highlights a few of the misunderstandings. I’ll frame my discussion using a common cybersecurity framework, the CIA triad.

When considering data security, we often separate information protection into three categories: 1) Confidentiality – data should only be visible to those with a legitimate reason to access it; 2) Integrity – data should be accurate and no unauthorized changes should be made to it; and 3) Availability – the data should be accessible when it is needed. These three categories of protection, Confidentiality, Integrity, and Availability, form the CIA triad. To secure information, computers and programs must effectively provide all three.

Blockchain Protects Data Integrity

Blockchain was created to focus on the integrity of data. That is, the premise for blockchain is that a group wants to share information and ensure that no one changes the data without consensus. The data is visible to anyone with access to the blockchain. Public and private keys in blockchain are used only to authenticate data changes – managing the integrity of the data.
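
To make the integrity point concrete, here is a minimal sketch of my own (not any particular blockchain implementation) showing how each block’s hash incorporates the previous block’s hash, so that an unauthorized change to any record invalidates every block that follows it:

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    // Each block stores its data in the clear along with the hash of the
    // prior block. Changing any block's data changes its hash and breaks
    // every later link; the group's consensus would reject such a change.
    final class Block {
        final String data;          // visible to anyone with chain access
        final String previousHash;  // links this block to its predecessor
        final String hash;          // SHA-256 over previousHash + data

        Block(String data, String previousHash) throws NoSuchAlgorithmException {
            this.data = data;
            this.previousHash = previousHash;
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(
                (previousHash + data).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            this.hash = hex.toString();
        }
    }

Note that nothing in this sketch encrypts anything: the record sits in the chain in plain view, and the hashing only makes unauthorized changes detectable. That distinction matters for the confidentiality discussion below.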

A byproduct of a typical blockchain deployment is enhanced availability. If multiple organizations each hold a complete copy of the blockchain, then the information is stored redundantly across multiple systems and is accessible through multiple networks. This is not the focus of blockchain, nor a guaranteed security feature, especially if a single organization is using the technology privately; still, blockchain’s support for a distributed implementation can be used to enhance availability.

Confidentiality Is Another Issue

As it relates to confidentiality, keeping private data private, the article implies that the keys used with blockchain encrypt the data and hence aid in confidentiality. For instance, the article states, “With blockchain, the patient’s entire medical record is stored in a ledger and encrypted with the patient’s private key.” There are three significant errors in this statement.

(more…)

MongoDB and Java – Powerful Complementary Platforms

Tuesday, May 31st, 2016

I have found that including MongoDB in the design of Java applications affords me a valuable level of flexibility in meeting client objectives. I have created an initial open source project on GitHub, JavaMongo, with the goal of providing working examples of Java and MongoDB integration. A secondary goal is to include development best practices, such as using testing frameworks and good coding style.

This posting is intended to give a little background on why I find Java and MongoDB to be useful tools in my software development arsenal and then to introduce the JavaMongo project. Future postings will include videos walking developers through the examples as well as the frameworks being used (such as JUnit, Cobertura and Checkstyle).

Background

Java is a ubiquitous platform for creating business applications. It has proven itself across a wide range of use cases, from small point solutions to large generalized solution stacks. The variety of libraries, frameworks and tools for designing, building, testing and managing Java applications provides significant benefits to companies building solutions with Java. However, an application without ready access to data isn’t particularly useful. As enterprise-scale database options have broadened to include NoSQL, those creating Java-based solutions must take advantage of the new data options in order to benefit from the strengths of such components.

MongoDB is a great NoSQL platform that can be used to provide additional capabilities to your applications. MongoDB is a document store that has proven its reliability, scalability and ease of integration across numerous small and large-scale applications. Its value and focus complement the way we use relational databases for online transaction processing (OLTP) and offer advantages over the way we use relational databases for data marts and warehouses.

A point of clarification before proceeding: I’m not here to say that MongoDB is better than some other data product, or, more generally, that document stores are better than relational databases. I find such arguments meaningless without a specific use case or project goal. These technologies are different and have individual strengths and weaknesses in the face of a specific set of project objectives.

I have found that MongoDB plugs in well when I need a place to federate data (structured, semi-structured and unstructured). Given a common platform, it simplifies the work required to build and alter connections between attributes. If you’ve looked at other information about my background you’ll see that I find the use of semantic technology to be incredibly valuable for data federation and classification. MongoDB as a flexible repository plays well with semantics. At the end of this post I’ll give you a small example of that.
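
In the meantime, here is a deliberately simplified illustration of my own (the class, method, collection and URNs are made up for this sketch): a subject-predicate-object statement drops naturally into a MongoDB document, and new kinds of relationships can be stored later without any schema migration.

    import org.bson.Document;
    import com.mongodb.client.MongoCollection;

    public final class TripleStore {
        // Store one semantic-style triple as a document. No schema change
        // is needed when new predicates (relationship types) appear later.
        public static void addTriple(MongoCollection<Document> triples,
                String subject, String predicate, String object) {
            triples.insertOne(new Document("subject", subject)
                .append("predicate", predicate)
                .append("object", object));
        }
    }

For example, addTriple(triples, "urn:example:patient:123", "urn:example:hasProvider", "urn:example:provider:456") records a relationship with no prior table design at all.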

JavaMongo Project

The JavaMongo project is intended to provide Java developers with working examples of Java and MongoDB integrations. Over time I expect a variety of common situations to be demonstrated, with associated documentation explaining the use case and the resulting implementation.

In order to have some interesting data to work with, I’m using data sets that my company releases to the public domain. In order to work with the JavaMongo examples you’ll need to import that data into your MongoDB instance. For more information about downloading and importing the sample data, see the discussion on MongoDB Collection of Honeypot Data on my NoSQL topic page.

The initial JavaMongo project contains a basic README file with information on running the example code. Instead of rehashing that information in this post, I’d like to walk through the basic operations being demonstrated in the example code. The main class we’ll explore is BasicStatistics (us.daveread.education.mongo.honeypot.BasicStatistics).

As you know, a Java program starts execution with the main() method. We see that the first step the BasicStatistics main() method takes is to create an instance of the BasicStatistics class.
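
In outline, the structure looks like this (a sketch of what is being described, not the project’s exact source):

    package us.daveread.education.mongo.honeypot;

    public class BasicStatistics {
        public static void main(String[] args) {
            // The constructor does the interesting work: connecting to
            // MongoDB, accessing a collection and running a query.
            BasicStatistics stats = new BasicStatistics();
        }
    }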

BasicStatistics Constructor

The constructor code goes through the entire process of connecting to a MongoDB database, accessing a collection and running a query on data in the collection.

First, an instance of MongoClientOptions is created. This class allows us to configure certain client-side options related to the connection; I’ll get into more detail on it in future examples. In this case, the program simply sets the connection timeout to 2000 milliseconds (2 seconds) so that the program won’t hang for long if the instance is not available. You wouldn’t make the timeout this short in a production environment, but it helps when debugging a local environment by failing fast if something is wrong.
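
Inside the constructor, the setup looks roughly like the following sketch (assuming the MongoDB Java driver of that era; the host, port and database name are my assumptions for a local instance loaded with the honeypot sample data):

    import com.mongodb.MongoClient;
    import com.mongodb.MongoClientOptions;
    import com.mongodb.ServerAddress;
    import com.mongodb.client.MongoDatabase;

    // Fail fast: give up on connecting after 2000 milliseconds (2 seconds)
    // rather than hanging if the local instance is not running.
    MongoClientOptions options = MongoClientOptions.builder()
        .connectTimeout(2000)
        .build();

    // Connect to a local MongoDB instance using those options.
    MongoClient client = new MongoClient(
        new ServerAddress("localhost", 27017), options);

    // Access the database holding the imported honeypot sample data.
    MongoDatabase db = client.getDatabase("honeypot");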

(more…)

Accountable Care Organizations, Data Federation and CMS’ Updated Final Rule for the Medicare Shared Savings Program

Monday, June 8th, 2015

CMS has published a final rule (http://federalregister.gov/a/2015-14005) focused on changes to the Medicare Shared Savings Program (MSSP) that significantly impact Accountable Care Organizations (ACOs). There are a variety of interesting changes being made to the program. For this discussion I’m looking at CMS’ continual drive toward data use and integration as a basis for improving quality of care, gaining efficiency and cutting costs in health care. One way this drive is manifested in the new rule regards an ACO’s plans for “enabling technologies,” an umbrella term for leveraging electronic data.

As background, Subpart B (425.100 to 425.114) of the MSSP describes ACO eligibility requirements. Two of the changes in this section clearly underscore the importance of electronic data and data integration to the fundamental operation of an ACO. Specifically, looking at page 127, the following updates are being made to section 425.112(b)(4) (emphasis mine):

Therefore, we proposed to add a new requirement to the eligibility requirements under § 425.112(b)(4)(ii)(C) which would require an ACO to describe in its application how it will encourage and promote the use of enabling technologies for improving care coordination for beneficiaries. Such enabling technologies and services may include electronic health records and other health IT tools (such as population health management and data aggregation and analytic tools), telehealth services (including remote patient monitoring), health information exchange services, or other electronic tools to engage patients in their care.

It goes on to add:

Finally, we proposed to add a provision under § 425.112(b)(4)(ii)(E) to require that an ACO define and submit major milestones or performance targets it will use in each performance year to assess the progress of its ACO participants in implementing the elements required under § 425.112(b)(4). For instance, providers would be required to submit milestones and targets such as: projected dates for implementation of an electronic quality reporting infrastructure for participants;

It is clear from the first change that an ACO must have a documented plan in place for continually expanding its use of electronic data and providing data visibility and integration between itself and its beneficiaries and providers. This is a tall order. The number of different systems and data formats along with myriad reporting and analytic platforms makes a traditional integration approach tedious at best and a significant business risk at worst.

The second change, keeping CMS apprised of the progress of data-centric projects, is clearly intended to keep the attention on these data publishing and integration projects. It won’t be enough to have a well-articulated plan; the ACO must be able to demonstrate progress on a regular basis.

(more…)

Impetus for Our Semantics and NoSQL Workshop at the 2015 SmartData Conference

Friday, May 15th, 2015

I’m looking forward to being one of the presenters for infuzIT’s hands-on data integration and analysis workshop at this year’s SmartData Conference in San Jose. Giving people the opportunity to see the amazing power of semantics combined with NoSQL to quickly integrate and analyze data makes my day.

My background includes significant work with data, both as an application developer and data warehouse architect. The acceleration of data-centric hardware and software capabilities over the past 10 years now supports a very different paradigm for exploring, reporting and analyzing data. Processes and procedures for creating a data warehouse or mart, the accepted rules of the road for creating integrated data repositories, are no longer clear cut. The data federation debate is no longer Inmon or Kimball.

A significant shift in data integration revolves around the required lifespan of the integrated data. This lifespan has two key aspects whose evolution now allows us to rethink our approach to data federation. This permits us to be much more agile when bringing heterogeneous data sources together. The two aspects are reflected in these design questions: 1) what data, if any, will be rehosted; and 2) what relationships will be supported within the integrated data?

Rehosting Data

In a traditional data warehouse the data must be rehosted. The new repository is the target where transformed data (cleaned-up, standardized) exists. The queries that will be retrieving data from multiple sources are really pulling data from a single source that has been populated from multiple sources. It represents a heavyweight process, driven by Extract-Transform-Load (ETL) scripts and requiring space to host redundant information.

Relationships Between Data Elements

The target warehouse schema determines what relationships are defined between the data elements being combined. Getting this “right” requires careful planning and coordination between the various groups that will use the warehouse. Given the significant effort, represented as cost, organizations tend to design data warehouses to support broad constituencies as a way to amortize the investment across departments and projects.

Paradigm Shift

Semantics and NoSQL allow us to reduce the effort of integrating data by orders of magnitude. They support a completely different mindset for bringing data together. Instead of carefully designing a model that works well in the general sense (reducing the value in specific cases) we have environments that allow us to experiment, adjust and focus on each case.

Below are several drivers which allow us to approach data federation differently using semantics and NoSQL.

(more…)

Medicaid Managed Care Congress Conversations Highlight the Value of Data Federation

Thursday, May 22nd, 2014

Photo of Scott, Chris and Dave at MMCC 2014

This week I had the opportunity to attend the Medicaid Managed Care Congress (MMCC) in Baltimore, MD and the privilege of speaking with a variety of leaders from provider, payer, and services organizations. With me from Blue Slate Solutions were Scott Van Buren and Chris Garber. A common theme we heard as we spoke with the attendees was the challenge of bringing data together from multiple sources and making sense of that information.

Medicaid is potentially the most complex government program that exists in the United States. There are federal and state aspects as well as portions that are handled at a local level. Some funding and services are defined as required while others are optional. The financial models’ formulas involve many variables. In short, there are numerous challenges in Medicaid, including the dual eligible changes that seek to address the services disconnects that often exist when a person is eligible for both Medicare and Medicaid.

Combining data from providers, payers, patients, government entities and the community is necessary in order to optimize the quality of care provided to each patient. The definition of provider continues to expand, covering not just the medical needs of a person but incorporating the various social services, so important to the holistic care of an individual, under the umbrella of “provider.”

As we listened to people and talked about their data challenges we were also able to walk them through the Data Unleashed™ approach. The iterative learn-as-you-go process resonated across the board, whether people represented patient advocacy groups, provider organizations or healthcare plans. The capability to start small, obtain value quickly and adapt rapidly to changing environments fits the Medicaid complexities well.

Data Unleashed Front End Screenshot

If you would like to learn more about our agile and lightweight approach to accessing data from across your enterprise in order to quickly begin creating meaningful reporting and analytics, please check out dataunleashed.com for descriptions, videos and case studies. We’d also appreciate the opportunity to host a webinar with your team where we can explore Data Unleashed™ in more depth and discuss your specific data challenges.

Data Unleashed™ Headed to the 2014 Medicaid Managed Care Congress

Thursday, May 15th, 2014

For those of you spending time in Baltimore next week (May 19-21, 2014) to attend the Medicaid Managed Care Congress, please stop by Blue Slate’s booth. Our MINI road trip begins Sunday as we head for Camden Yards and the beautiful inner harbor area. Our goal in attending? Having the opportunity to speak with you about your data challenges as well as your Medicaid journey.

We will be demonstrating what we mean by lightweight data federation and agile analytics as the drivers behind creating the Data Unleashed™ service platform. Given our extensive healthcare focus, we have deep experience working with companies on Medicaid initiatives, such as those involving dual eligibles, for instance the FIDA program in New York State.

Beyond data integration and analytics, we provide expertise for plans to: implement business process and business rule management solutions; prepare for site reviews and audits; and unify data from a variety of internal and cloud-based systems. More broadly beyond Medicaid, we work extensively in the Medicare and commercial healthcare space, leading transformative change for businesses such as Medicare Administrative Contractors (MACs) and Blues plans.

We look forward to having a chance to learn more about your operational challenges and share with you our organization’s background and focus areas. Let’s get together and explore opportunities to advance your organization’s strategic goals around improving quality of care and reducing costs.

Why Isn’t Everybody Doing It?

Monday, April 28th, 2014

That is a very dangerous question for a leader to ask when evaluating options. Yet it is one I hear far too often in the healthcare realm. It encapsulates a rejection of innovation, evolution and learning all in one terse, often rhetorical, question.

A common context for this question, often prefixed by, “If this is so great…,” is when discussing semantics and semantic technology. Although these concepts are not new to some industries, such as media, they are foreign in many healthcare organizations. Yet we know that healthcare payers and providers alike struggle with massive data integration and data analytics challenges just like media conglomerates.

The needs to combine siloed information, drive an analytics mindset throughout an organization, and support the flexibility of a constantly changing IT environment are common in large healthcare organizations. Repeated attempts by organizations to meet these needs betray a lack of consensus around how best to achieve a valuable result.

Further, the implication that the way most organizations solve a problem is optimal ignores the fact that best practices must change over time. The best way to solve a problem last year may not be the best way this year. The healthcare industry is changing, the physical world of servers, networks, disk drives and memory is changing, and the expectations of members are changing. What was infeasible years ago becomes commonplace. Relational databases were all but unworkable in the 1970s due to a lack of experienced DBAs, slow disk drives, slow processors and limited memory.

In the same way, semantic formalization and graph databases were too new and limited to deal with large data sets until people gained expertise with ontologies while system hardware benefited from another generation of Moore’s law. In the face of ongoing innovation, the question leaders should ask when approaching a challenge is, “What advancements have been made since the last time we looked at this problem?”

Leadership requires leading, not following. Leaders mentor their organizations through change in order to reach new levels of success. Leadership is based on learning, open-mindedness, creativity and risk-taking. The question, “Why isn’t everybody doing it?” is the antithesis of leadership and has no place there. In fact, if everybody is doing something, a leader would be better off asking, “How do we get ahead of what everybody is doing?”

Leaders must be on the forefront of pushing for better, faster, cheaper. Questioning the status quo, looking for new opportunities, seeking to leapfrog the competition, those are key foci for leadership.

As a leader, the next time you find yourself limiting your willingness to explore an option because everybody isn’t doing it, keep in mind that calculators, computers, automobiles, elevators, white boards, LED light bulbs, Google maps, telephones, the Internet, 3-D printing, open heart surgery, and many other concepts now accepted or gaining traction had a day when only one person or organization was “doing it.” Challenge yourself and your organization to find new options, new best practices and new paradigms for advancing your strategy and goals.

How Does Semantic Technology Enable Agile Data Analytics?

Friday, April 25th, 2014

I’m glad you asked. Scott Van Buren and I will be presenting a Dataversity webinar entitled Using Semantic Technology to Drive Agile Analytics on exactly that topic. Scheduled for May 14, 2014 (and available for replay afterwards), this webinar will highlight key semantic technology capabilities and how they provide an environment for data agility.

We will focus most of the webinar on a case study that demonstrates the agility of semantic technology being used to conduct data analysis within a healthcare payer organization. Healthcare expertise is not required in order to understand the case study.

As we look into several iterations of data federation and analysis, we will see the effectiveness of bringing the right subset of data together at the right time for a particular data-centric use. This concept translates well to businesses that have multiple sets of data or applications, including data from third parties, and seek to combine relevant subsets of that information for reporting or analytics. Further, we will see how this approach augments data warehousing projects, where the lightweight and agile data federation approach informs the warehouse design.

Please plan to join us virtually on May 14 as we describe semantic technology, lightweight data federation and agile data analytics. There will also be time for you to pose questions and delve into areas of interest that we do not cover in our presentation.

The webinar registration page is: http://content.dataversity.net/051414BlueslateWebinar_DVRegistrationPage.html

We look forward to having the opportunity to share our data agility thoughts and experiences with you.

San Jose and the SemTechBiz 2014 Conference, Here I Come!

Friday, April 18th, 2014

I am thrilled to have been invited back to participate in the Semantic Technology and Business (SemTechBiz) conference. This is the premier US conference for learning about, exploring and getting your hands on semantic technology. I’ll be part of a Blue Slate team (including Scott Van Buren and Michael Delaney) conducting a half-day hands-on workshop, Integrating Data Using Semantic Technology, on August 19, 2014. Our mission is to have participants use semantic technology to integrate, federate and perform analysis across several data sources.

We have some work to do to iron out our overall use case, pulling from work we have done with several clients. At a minimum we’ll be working with database schemas, ontologies, reasoners and data analytics tools. It will be a fun and educational experience for attendees.

I’ll post more specifics once the SemTechBiz agenda is published and we have finalized the workshop structure. I hope to see you this August 19-21 in San Jose for our workshop and the amazing learning opportunities throughout the conference.

For more information on the conference, visit its website: http://semtechbizsj2014.semanticweb.com/index.cfm

Data Unleashed™ – Addressing the Need for Data-centric Agility

Thursday, April 3rd, 2014

Data Unleashed™. The name expresses a vision of data freed from its shackles so that it can be quickly and iteratively accessed, related, studied and expanded. In order to achieve that vision, the process of combining, or federating, the data must be lightweight. That is, the approach must facilitate rapid data set expansion and on-the-fly relationship changes so that we may quickly derive insights. By the same token, the process must not demand a significant up-front investment in data structure design, since agility requires that we avoid a rigid structure.

Over the past year Blue Slate Solutions has been advancing its processes and technology to support this vision, which comprises the integration between components in our Cognitive Corporation® framework. More recently we have invested in an innovation development project to take our data integration experiences and semantic technology expertise and create a service offering backed by a lightweight data federation platform. Our platform, Data Unleashed™, enables us to partner with customers who are seeking an agile, lightweight enhancement to traditional data warehousing.

I want to emphasize that we believe that the Data Unleashed™ approach to data federation works in tandem with traditional Data Warehouses (DW) and other well-defined data federation options. It offers agility around data federation, benefiting focused data needs for which warehouses are overkill while supporting a process for iteratively deriving value using a lightweight data warehouse™ approach that informs a broader warehousing solution.

At a couple of points below I emphasize differences between Data Unleashed™ and a traditional DW. This is not meant to disparage the value of a DW but to explain why we feel that Data Unleashed™ adds a set of data federation capabilities to those of the DW.

As an aside, Blue Slate is producing a set of videos specifically about semantic technology, which is a core component of Data Unleashed™. The video series, “Semantic Technology, An Enterprise Introduction,” will be organized in two tracks, business-centric and technology-centric. Our purpose in creating these is to promote a holistic understanding of the value that semantics brings to an organization. The initial video provides an overview of the series.

What is Data Unleashed™ All About?

Data Unleashed™ is based on four key premises:

  1. the variety of data and data sources that are valuable to a business continue to grow;
  2. only a subset of the available data is valuable for a specific reporting or analytic need;
  3. integration and federation of data must be based on meaning in order to support new insights and understanding; and
  4. lightweight data federation, which supports rapid feedback regarding data value, quality and relationships, speeds the process of developing a valuable data set.

I’ll briefly describe our thinking around each of these points. Future posts will go into more depth about Data Unleashed™ as well. In addition, several Blue Slate leaders will be posting their thoughts about this offering and platform.

(more…)