Customers often ask what qualifies us to work in their industry. They wonder whether we can handle the massive amounts and varieties of data available within their respective industries. Before we answer these questions, consider the following:

Picture in your mind the industry you work in. Do you believe you offer a unique set of skills that no other industry can match? Are your data sources large and unwieldy, seemingly more complex than those of other industries? Do you feel it takes a person from within your industry to fully comprehend the data complexities you have to manage?

If you answered “yes” to any of these questions, you’re wrong.

That’s not entirely true; you might not be completely wrong. But chances are that, while your data may be unique in some ways, it’s probably no harder or more complex than that of most other industries. Now you’re saying to yourself, “Well, how do you know? You don’t work in my industry, do you?” You might be surprised to find that we do work in your industry. In fact, we work in all industries.

When it comes to leveraging Big Data, breadth of skill set and ability are key to managing the overwhelmingly complex sets of data that you encounter in your industry. The problem many organizations face is that they don’t actually have that breadth to work with. Yes, they may be leaders in their industry, but they remain held within the confines of that one industry, not knowing what else is out there that might work for them. That is where we come in. You see, our work in multitudinous industries (eCommerce, Healthcare, Finance, Manufacturing, and Life Sciences, to name a few) across myriad platforms has provided us with a vast breadth of skill sets and abilities that pertain not only to the industries in which they were acquired but to innumerable other industries as well.

Oftentimes, problems that may seem unprecedented or distinct within one industry have more than likely already occurred along a similar vein within another industry. Because we work in multiple organizations across many industries, we have the ability to identify and solve many, many problems by comparing them to problems other industries have already experienced. Additionally, as Country Music Hall of Famer Kenny Rogers so eloquently explains, you’ve got to know when to hold ‘em, know when to fold ‘em, know when to walk away, and know when to run. The same principle applies to solving Big Data problems. We have high-horsepower, high-caliber data scientists with the good judgment to know when to bridge across organizations and industries, when to focus within a single industry, and when to find another solution entirely.

Our engineering team has extensive experience across many industries, thrives in new environments, and can help you with your company’s Big Data, Machine Learning, and Custom Software needs. For more information on how we can help handle these needs, visit our library of case studies and white papers.

As I outline in the Machine Learning Field Guide, the concept of Machine Learning arose from an interest in having machines learn from data. The industry has seen cycles of stagnation and resurgence in machine learning/AI research since as early as the 1950s. During the 1980s, we saw the emergence of the Multi-layer Perceptron and its backpropagation training mechanism, both fundamental to today’s highly sophisticated Deep Learning architectures capable of image recognition and behavior analysis. However, to reach its zenith, this field depended on advancements in data proliferation and acquisition that wouldn’t materialize for many more decades. As promising as the initial results were, early attempts at industrial application of artificial intelligence as a whole fizzled.

Though the practice of Machine Learning only ascended to prominence recently, much of its mathematical foundation dates back centuries. Thomas Bayes, father of the Bayesian method on which we base contemporary statistical inference, wrote his famous equation in the 1700s. Shortly after, in the early 1800s, immortalized academics like Legendre and Gauss developed early forms of the statistical regression models we use today. Statistical analysis as a discipline remained an academic curiosity from this time until the commoditization of low-cost computing in the 1990s and the onslaught of social media and sensor data in the 2000s.
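That famous equation is compact enough to quote here. In modern notation, Bayes’ theorem relates the probability of a hypothesis H given evidence E to the reverse conditional probability, which is the backbone of the statistical inference described above:

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}
```

Every Bayesian model in use today, from spam filters to medical diagnostics, is at heart an application of this one line: update a prior belief P(H) into a posterior P(H | E) as evidence arrives.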

What does this mean for Machine Learning today? Enterprises are sitting on data goldmines and collecting more at a staggering rate with ever greater complexity. Today’s Machine Learning is about mining this treasure trove, extracting actionable business insights, predicting future events, and prescribing next best actions, all in laser-sharp pursuit of business goals. In the rush to harvest these goldmines, Machine Learning is entering its golden age, buoyed by Big Data technology, Cloud infrastructure, and abundant access to open source software. Intense competition in the annual ImageNet contest between global leaders like Microsoft, Google, and Tencent rapidly propels machine learning and image recognition technology forward, and the source code for all winning entries is made available to the public free of charge. Most contestants on the Kaggle machine learning platform share their work in the same spirit. In addition to this source code, excellent free machine learning tutorials compete for mindshare on Coursera, edX, and YouTube. Hardware suppliers such as Nvidia and Intel further the cause by continuing to push the boundary of denser packaging for high-performance GPUs to speed up Neural Networks. Thanks to these abundant resources, any aspiring entrepreneur or lone-wolf researcher has access to petabytes of storage, massively parallel utility computing, open source data, and software libraries. As of 2015, this access had led to computer image recognition capabilities that outperform human image recognition abilities.

With recent stunning successes in Deep Learning research, the floodgates have opened for industrial applications of all kinds. Practitioners enjoy a wide array of options when targeting specific problems. While Neural Networks clearly lead in the high-complexity, high-data-volume end of the problem space, classical machine learning still achieves higher prediction and classification quality for low-sample-count applications, not to mention the drastic cost savings in computing time and hardware. Research suggests that the crossover occurs at around one hundred thousand to one million samples. Just a short time ago, numbers like these would have scared away any level-headed project manager. Nowadays, data scientists are asking for more data and are getting it expediently and conveniently. A good Data Lake and data pipeline are necessary precursors to any machine learning practice. Mature data enterprises emphasize close collaboration between data engineering (infrastructure) teams and data science teams. “Features” are the lingua franca of their interactions, not “files,” “primary keys,” or “provisions.”
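The crossover figure above can be turned into a rule of thumb. The sketch below is a hypothetical heuristic, not a product of the cited research: it simply encodes the idea that classical methods tend to win below roughly one hundred thousand samples, deep learning wins past roughly one million, and the band in between is worth benchmarking both ways.

```python
def recommended_model_family(sample_count: int) -> str:
    """Rough heuristic for choosing a model family by training-set size.

    Encodes the crossover range discussed above (~1e5 to ~1e6 samples);
    the exact thresholds are illustrative assumptions, not hard rules.
    """
    CLASSICAL_CEILING = 100_000   # below this, classical ML usually wins
    DEEP_FLOOR = 1_000_000        # above this, deep learning usually wins

    if sample_count < CLASSICAL_CEILING:
        return "classical"        # e.g. regression, gradient boosting
    if sample_count < DEEP_FLOOR:
        return "benchmark both"   # crossover zone: measure, don't guess
    return "deep learning"        # high data volume favors neural nets
```

In practice, factors like feature complexity, label quality, and hardware budget shift these thresholds, which is why the middle band returns "benchmark both" rather than a single answer.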

Furthermore, execution environments should be equipped with continuous and visual monitoring capabilities, as any long-running Neural Network training session (days to weeks) involves frequent mid-course adjustment based on feedback from evolving model parameters. Whether for the most common Linear Regression or the deepest Convolutional Neural Network, the challenge of any machine learning experimentation is wading through the maze of configuration parameters and picking out a winning combination. After selecting the candidate models, a competent data scientist navigates a series of decisions from starting point, to learning rate, to sample size, to regularization settings, along with constant examination of convergence on parallel training runs and various runtime tuning, all in an attempt to get the most accurate model in the shortest amount of time.
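The "maze of configuration parameters" can be made concrete with a minimal sketch. The toy sweep below, a standalone illustration rather than anyone's production tooling, fits a one-variable linear regression by gradient descent at several learning rates and keeps the configuration with the lowest error, which is the tuning loop described above in miniature:

```python
def train_linear(data, lr, epochs=200):
    """Fit y = w*x + b by plain gradient descent on mean squared error.

    Returns (w, b, mse) after the given number of epochs.
    """
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    mse = sum((w * x + b - y) ** 2 for x, y in data) / n
    return w, b, mse

def sweep_learning_rates(data, learning_rates):
    """Try each learning rate; return (best_lr, {lr: final_mse})."""
    results = {lr: train_linear(data, lr)[2] for lr in learning_rates}
    return min(results, key=results.get), results

# Synthetic data drawn from y = 3x + 1, no noise.
data = [(k / 10, 3 * (k / 10) + 1) for k in range(50)]
best_lr, results = sweep_learning_rates(data, [0.001, 0.01, 0.1])
```

Too small a learning rate and the model barely moves in 200 epochs; too large and it oscillates or diverges. A real practice replaces this grid with parallel runs, convergence dashboards, and early stopping, but the decision being made is the same.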

As I state in my recent e-book “Machine Learning Field Guide,” Machine Learning is smarter than ever and improving rapidly. This predictive juggernaut is coming fast and furious and will transform any business in its path. For the moment, it remains black magic in the hands of the high priests of statistics. As an organization with a mission to deliver its benefits to clients, we have trained an internal team of practitioners, organized an external board of AI advisors, and packaged a Solutions Playbook as a practice guide. We have harnessed best practices, specialty algorithms, experiential guidelines, and training tutorials, all in an effort to streamline delivery and concentrate our engagement efforts on the areas that require specific customization.

To find out more, check out the Machine Learning Field Guide, by Chief Data Scientist Bruce Ho.

To most in the know, Watson has long been considered more hype and marketing than technical reality. Given how it is presented, as infinitely capable, bleeding-edge technology, you might think the well-known Watson brand would be delivering explosive growth to IBM.

Reality is far different. IBM’s stock is down in a roaring market. The company is, in effect, laying off thousands of workers by ending its work-from-home policy. More than $60M has reportedly been wasted by MD Anderson on a failed Watson project. All of this is happening against the backdrop of a rapidly expanding market for Machine Learning solutions.

But why? I saw Watson dominate on Jeopardy.

And dominate it did, soundly beating Ken Jennings and Brad Rutter. So think for a moment about what Watson was built to do. Watson, as was proven then, is a strong Q&A engine. It does a fine job in this realm and was truly state of the art…in 2011. In this rapidly expanding corner of the tech universe, that’s an eternity ago. The world has changed exponentially, and Watson hasn’t kept pace.

So what’s wrong with Watson?

  • It’s not the all-encompassing answer for every business. It offers some core competencies in Natural Language Processing and other domains, but Watson, like any Machine Learning tech, and perhaps more than most, requires a high degree of customization to do anything useful. As such, it’s a brand around which Big Blue sells services. Expensive services.
  • The tech is now old. The bleeding edge of Machine Learning is Deep Learning, leveraging architectures Watson isn’t built to support.
  • The best talent is going elsewhere. With the next generation of tech leaders competing for talent, IBM is now outgunned.
  • …and much more discussed here.

The Machine Learning market is strong and growing. IBM has been lapped by Google, Facebook, and other big name companies, and these leaders are open sourcing much of their work.

Will Watson survive? Time will tell.