Entries by admin

Schema-on-what?

Recently, a customer asked us to help transition a set of data flows from an overwhelmed RDBMS to a “Big Data” system. These data flows had a batch dynamic, and there was some comfort with Pig Latin in-house, so this made for an ideal target platform for the production data flows (with architectural flexibility for […]

The Data Architecture Lifecycle

It’s a very exciting time to be in the data world, with new and groundbreaking technologies released seemingly every day. There is every temptation to pick up today’s new shiny, find an excuse to throw it into production, and call it an architecture. Of course, a more deliberate approach is required for long-term success – […]

Not Just Open Source: 7 High-Value Areas To Consider Commercial Components For Big Data Architectures

The Big Data landscape is largely dominated by powerful free open source technologies. Different configurations and applications of these technologies seemingly consume the majority of mindshare, and it can be easy to lose sight of commercial offerings that can provide relevant business value. Some of the areas in which commercial vendors offer particular value: Managed […]

FTC & Big Data Bias Warnings

A recent WSJ article echoes an FTC report released last Wednesday warning of the possible consequences of bias in Big Data applications. The article identifies a number of valid concerns around privacy, equal opportunity, and accuracy. It also rightly hints at possible positive consequences. For example, it quotes cases where people judged poor […]

Big Data Architecture Patterns

Repeatable Approaches to Big Data Challenges for Optimal Decision Making. Abstract: A number of architectural patterns are identified and applied to a case study involving ingest, storage, and analysis of a number of disparate data feeds. Each of these patterns is explored to determine the target problem space for the pattern and pros and […]

The Metadata Lifecycle

When designing an enterprise architecture for business intelligence, advanced analytics, and other data-centric applications, it is often useful to capture major data flows. This may require some research into use cases and tooling and even a bit of hard thinking, but it’s a straightforward exercise. What isn’t so straightforward is capturing the state of metadata […]

A Big Data Analysis Paradox

There is a nuance about Big Data analysis. It’s really about small data. While this may seem confusing and counter to the whole Big Data “movement”, small data is the product of Big Data analysis. This is not a new concept, nor is it unfamiliar to people who have been doing data analysis for any […]