For many years, and with rapidly accelerating levels of targeting sophistication, marketers have been tailoring their messaging to our tastes. Leveraging our data and capitalizing upon our shopping behaviors, they have successfully delivered finely-tuned, personalized messaging.

Consumers are curating their media ever more by the day. We’re buying smaller cable bundles, cutting cords, and buying OTT services a la carte. At the same time, we’re watching more and more short-form video. Video media is tilting toward snack-size bites and, of course, on demand.

Cable has been in decline for years, and the effects are now hitting ESPN, once the mainstay of a cable package. Even live sports programming, long considered must-see and even bulletproof by media executives, has seen declining viewership.

So what’s to be done?

To thrive, and perhaps merely to survive, content owners must adapt. Leagues and networks have come a long way toward embracing a “TV Everywhere” distribution model despite the obnoxious gates at every turn. But that’s not enough and the sports leagues know it.

While there are many reasons for declining viewership and low engagement among younger audiences, the length of games and broadcasts is a significant factor. The leagues recognize that games are too long. The NBA has made some changes that will speed up the action, and the NFL is also considering shortening games to avoid losing viewership. MLB has long been tinkering in the same vein. These changes are small, incremental, and of little consequence to the declining number of viewers.

Most sporting events are characterized by long stretches of calm, less interesting play that is occasionally accented by higher intensity action. Consider for a moment how much actual action there is in a typical football or baseball game. Intuitively, most sports fans know that the bulk of the three-hour event is consumed by time between plays and pitches. Still, it’s shocking to see the numbers from the Wall Street Journal, which point out that there are only 11 minutes of action in a typical football game and a mere 18 minutes in a typical baseball game.

A transformational opportunity

There is so much more the leagues can do. Recent advances in neural network technology have enabled an array of features to be extracted from streaming video. The applications are broad and the impacts significant. In this sports media context, the opportunity is nothing short of transformational.

Computers can now be trained to programmatically classify the action in the underlying video. With intelligence around what happens where in the game video, the productization opportunities are endless. Fans could catch all of the action, or whatever plays and players are most important to them, in just a few minutes. With a large indexed database of sports media content, the leagues could present near unlimited content personalization to fans.

Want to see David Ortiz’s last ten home runs? Done.

Want to see Tom Brady’s last ten TD passes? You’re welcome.
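
As a rough illustration of what such a query could look like once plays have been classified and indexed, here is a minimal sketch. The `Clip` schema, the event labels, and the `clips/...` paths are all hypothetical placeholders, not part of any league's actual system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Clip:
    """One indexed play; every field here is a hypothetical schema choice."""
    player: str
    event: str          # label a video classifier might assign, e.g. "home_run"
    played_on: date
    video_uri: str      # placeholder location of the clipped segment


def latest_clips(index, player, event, limit=10):
    """Return the most recent `limit` clips matching a player and event type."""
    matches = [c for c in index if c.player == player and c.event == event]
    return sorted(matches, key=lambda c: c.played_on, reverse=True)[:limit]


# Tiny made-up index; a real one would be populated by the video classifier.
index = [
    Clip("Player A", "home_run", date(2016, 9, 1), "clips/0001.mp4"),
    Clip("Player A", "strikeout", date(2016, 9, 2), "clips/0002.mp4"),
    Clip("Player B", "home_run", date(2016, 9, 3), "clips/0003.mp4"),
    Clip("Player A", "home_run", date(2016, 9, 20), "clips/0004.mp4"),
]

highlights = latest_clips(index, player="Player A", event="home_run", limit=10)
for clip in highlights:
    print(clip.played_on, clip.video_uri)
```

The hard part is producing the labels, not querying them. Once each play carries a player and an event tag, a "last ten" request reduces to a filter and a sort.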

Robust features like these will drive engagement and revenue. With this level of control, fans are more likely to subscribe to premium offerings, providing predictable recurring revenue that will outpace advertising in the long run.

Computer-driven, personalized content is going to happen. It’s going to be amazing, and we are one step closer to getting there.

Scientists have been working on the puzzle of human vision for many decades. Convolutional Neural Network (CNN or convnet)-based Deep Learning reached a new landmark for image recognition when Microsoft announced it had beaten the human benchmark in 2015. Five days later, Google one-upped Microsoft with a 0.04% improvement.

Figure 1. In a typical convnet model, the forward pass reduces the raw pixels into a vector representation of visual features. In its condensed form, the features can be effectively classified using fully connected layers.

source: https://docs.google.com/presentation/d/10XodYojlW-1iurpUsMoAZknQMS36p7lVIfFZ-Z7V_aY/edit#slide=id.g18c28bf1e3_0_201

Data Scientists don’t sleep. The competition immediately moved to the next battlefield of object segmentation and classification for embedded image content. The ability to pick out objects inside a crowded image is a precursor to fantastic capabilities, like image captioning, where the model describes a complex image in full sentences. The initial effort to translate full-image recognition to object classification involved different means of localization to efficiently derive bounding boxes around candidate objects. Each bounding box is then processed with a CNN to classify the single object inside the box. A direct pixel-level dense prediction without preprocessing was, for a long time, a highly sought-after prize.

Figure 2. Using bounding boxes to classify embedded objects in an image.

source: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/object_localization_and_detection.html

In 2016, a UC Berkeley group, led by E. Shelhamer, achieved this goal using a technique called Fully Convolutional Neural Network. Instead of using a convnet to extract visual features and then fully connected layers to classify the input image, the technique converts the fully connected layers into additional convolutional layers. Whereas the fully connected layers completely lose all information on the original pixel locations, the cells in the final layer of a convnet are path-connected to the original pixels through a construct called receptive fields.
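
To make the conversion concrete, the following NumPy sketch (toy sizes and random weights, all chosen for illustration rather than taken from the paper's code) reshapes a fully connected layer's weight matrix into a convolution kernel. On an input the size of the kernel, the two produce identical scores. On a larger input, the convolution yields a spatial map of scores, one per window position.

```python
import numpy as np


def dense(x_flat, weights, bias):
    """Ordinary fully connected layer on a flattened input."""
    return x_flat @ weights + bias


def conv_valid(feature_map, kernel, bias):
    """'Valid' convolution: feature_map (H, W, C), kernel (kh, kw, C, K)."""
    H, W, _ = feature_map.shape
    kh, kw, _, K = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1, K))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = feature_map[i:i + kh, j:j + kw, :]
            out[i, j] = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2])) + bias
    return out


rng = np.random.default_rng(0)
kh, kw, C, K = 2, 2, 3, 4                      # toy sizes, chosen arbitrarily
weights = rng.normal(size=(kh * kw * C, K))    # the "fully connected" weights
bias = rng.normal(size=K)
kernel = weights.reshape(kh, kw, C, K)         # same numbers, viewed as a kernel

# On an input exactly the size of the kernel, both layers agree.
small = rng.normal(size=(kh, kw, C))
dense_out = dense(small.reshape(-1), weights, bias)
conv_out = conv_valid(small, kernel, bias)     # shape (1, 1, K)

# On a larger input, the convolution keeps a score vector per location.
large = rng.normal(size=(5, 6, C))
score_map = conv_valid(large, kernel, bias)    # shape (4, 5, K)
print(conv_out.shape, score_map.shape)
```

Each cell of `score_map` is the dense layer applied to one window of the input, which is how location information survives the classification step.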

Figure 3. During the forward pass, a convnet reduces raw pixel information to condensed visual features which can then be effectively classified using fully connected neural network layers. In this sense, the feature vectors contain the semantic information derived from looking at the image as a whole.

source: https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf

Figure 4. In dense prediction, we want to both leverage the semantic information contained in the final layers of the convnet and assign the semantic meaning back to the pixels that generated the semantic information. The upsampling step, also known as the backward pass, maps the feature representations back onto the original pixel positions.

source: https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf

The upsampling step is of great interest. In a sense, it deconvolves the dense representation back to its original resolution, and the deconvolution filters can be learned through Stochastic Gradient Descent, just like any forward pass learning process. A good visual demonstration of deconvolution can be found here. The most practical way to implement this deconvolution step is through bilinear interpolation, as discussed later.

The best dense prediction goes beyond just upsampling the last and coarsest convnet layer. By fusing results from shallower layers, the result becomes much more finely detailed. Using a skip architecture as shown in Figure 5, the model is able to make accurate local predictions that respect global structure. The fusion operation is based on concatenating vectors from two layers and performing a 1 x 1 convolution to reduce the vector dimension back down again.
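
The fusion described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's code: the map sizes, the nearest-neighbor upsampling, and the random weights are assumptions made for brevity. It also shows that element-wise summation of the two maps, another common way to fuse, is the special case where the 1 x 1 convolution weights are two stacked identity matrices.

```python
import numpy as np


def upsample_nearest_2x(x):
    """Double the height and width of an (H, W, C) map by repeating cells."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def conv1x1(x, weights):
    """A 1 x 1 convolution is a per-pixel matrix multiply over channels."""
    return x @ weights


rng = np.random.default_rng(0)
num_classes = 3
coarse = rng.normal(size=(2, 2, num_classes))   # deep layer: coarse but semantic
fine = rng.normal(size=(4, 4, num_classes))     # shallow layer: finer but local

coarse_up = upsample_nearest_2x(coarse)                  # (4, 4, 3)
stacked = np.concatenate([coarse_up, fine], axis=-1)     # (4, 4, 6)

# Weights that would normally be learned; random here for illustration.
learned = rng.normal(size=(2 * num_classes, num_classes))
fused = conv1x1(stacked, learned)                        # back to (4, 4, 3)

# Special case: stacked identity weights reduce the fusion to a plain sum.
sum_weights = np.vstack([np.eye(num_classes), np.eye(num_classes)])
fused_sum = conv1x1(stacked, sum_weights)
print(fused.shape, np.allclose(fused_sum, coarse_up + fine))
```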

Figure 5. Fusing upsampling results from shallower layers pushes the prediction limits to a finer scale.

source: https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf

LABELED DATA

As is often the case when working with Deep Learning, collecting high-quality training data is a real challenge. In the image recognition field, we are blessed with open source data from the PASCAL VOC project. The 2011 dataset provides 11,530 images with 20 classes. Each image is pre-segmented with pixel-level precision by academic researchers. Examples of segmented images can be found here.

OPEN SOURCE MODELS

Computer vision enthusiasts also benefit hugely from open source projects, which implement almost every exciting new development in the deep learning field. The authors’ group posted a Caffe implementation of FCN. For Keras implementations, you will find no fewer than 9 FCN projects on GitHub. After trying out a few, we focused on the Aurora FCN project, which started running with very few modifications. The authors provided rather detailed instructions on environment setup and downloading of datasets. We chose the AtrousFCN_Resnet50_16s model out of the six included in the project. The training took 4 weeks on a cluster of two Nvidia 1080 cards, which was surprising but perhaps understandable given the huge number of layers. The overall model architecture can be visualized with either a JSON tree or PNG graphics, although both are too long to fit on one page. The figure below shows just one tiny chunk of the overall model architecture.

Figure 6. Top portion of the FCN model. The portion shown is less than one-tenth of the total.

source: https://github.com/aurora95/Keras-FCN/blob/master/Models/AtrousFCN_Resnet50_16s/model.png

It is important to point out that the authors of the paper and code both leveraged established image recognition models, generally the winning entries of the ImageNet competition, such as the VGG nets, ResNet, AlexNet, and GoogLeNet. Imaging is the one area where transfer learning applies readily. Researchers without the near-infinite resources found at Google and Microsoft can still leverage their training results and retrain high-quality models by adding only small new datasets or making minor modifications. In this case, the proven classification architectures named above are modified by stripping away the fully connected layers at the end and replacing them with fully convolutional and upsampling layers.

RESNET (RESIDUAL NETWORK)

In particular, the open source code we experimented with is based on ResNet from Microsoft. ResNet has the distinction of being the deepest network ever presented on ImageNet, with 152 layers. In order to make such a deep network converge, the submitting group had to tackle a well-known problem where the error rate tends to rise rather than drop after a certain depth. They discovered that by adding skip (aka highway) connections, the overall network converges much better. The explanation lies in the relative ease of training intermediate layers to minimize residuals rather than the originally intended mapping (thus the name Residual Network). The figure below illustrates the skip connections used in the original ResNet paper, which are also found in the open source FCN model derived from ResNet.

Figure 7a. ResNet uses multiple skip connections to improve the overall error rate of a very deep network.

source: https://arxiv.org/abs/1512.03385

Figure 7b. Middle portion of the Aurora model displaying skip connections, which are characteristic of ResNet.

source: https://github.com/aurora95/Keras-FCN/blob/master/Models/AtrousFCN_Resnet50_16s/model.png

The exact intuition behind Residual Network is less than obvious. There is plenty of good discussion in this Quora blog.
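
One way to build intuition is to compare a plain two-layer block with its residual counterpart. The NumPy sketch below is a deliberate simplification. It uses dense layers instead of convolutions and omits the batch normalization and post-addition ReLU of the original ResNet. The layer sizes are arbitrary. With all weights at zero, the residual block already behaves as the identity, so extra depth does no harm by default. The plain block would have to learn the identity mapping explicitly.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def plain_block(x, w1, w2):
    """Two stacked layers that must learn the full mapping H(x) directly."""
    return relu(x @ w1) @ w2


def residual_block(x, w1, w2):
    """Same two layers, but they only learn the residual F(x) = H(x) - x."""
    return x + plain_block(x, w1, w2)   # the skip connection adds x back


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))             # batch of 4 feature vectors
zeros = np.zeros((8, 8))

identity_out = residual_block(x, zeros, zeros)   # equals x
plain_out = plain_block(x, zeros, zeros)         # collapses to all zeros
print(np.allclose(identity_out, x), np.allclose(plain_out, 0.0))
```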

BILINEAR UPSAMPLING

As alluded to in Figure 4, at the end stage the resolution of the tensor must be brought back to its original dimensions using an upsampling step. The original paper stated that a simple bilinear interpolation is fast and effective. This is the approach taken in the Aurora project, as illustrated below.

Figure 8. Only a single upsampling stage was implemented in the open source code.

source: https://github.com/aurora95/Keras-FCN/blob/master/Models/AtrousFCN_Resnet50_16s/model.png

Although the paper authors pointed out the improvement achieved by the use of skips and fusions in the upsampling stage, they are not implemented by the Aurora FCN project. The diagram for the end stage illustrates that only a single upsampling layer is used. This may leave room for further improvement in error rate.

The code simply makes a TensorFlow call to implement this upsampling stage:

X = tf.image.resize_bilinear(X, new_shape)
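
Note that `tf.image.resize_bilinear` is a TensorFlow 1.x name. In TensorFlow 2.x the closest equivalent is `tf.image.resize(X, new_shape, method="bilinear")`. To show what the call computes, here is a small NumPy sketch of bilinear interpolation. It assumes an "align corners" sampling convention for simplicity, so its border handling may differ from TensorFlow's defaults. It is an illustration, not the project's code.

```python
import numpy as np


def resize_bilinear(image, new_h, new_w):
    """Resize an (H, W, C) array to (new_h, new_w, C) by bilinear interpolation.

    Assumes the corner pixels of input and output coincide ("align corners").
    """
    h, w, _ = image.shape
    ys = np.linspace(0, h - 1, new_h)        # fractional source rows
    xs = np.linspace(0, w - 1, new_w)        # fractional source columns
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)           # clamp neighbors at the border
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]            # vertical blend weights
    wx = (xs - x0)[None, :, None]            # horizontal blend weights

    top = image[y0][:, x0] * (1 - wx) + image[y0][:, x1] * wx
    bottom = image[y1][:, x0] * (1 - wx) + image[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


# A 2 x 2 single-channel map upsampled to 3 x 3.
coarse = np.array([[0.0, 1.0],
                   [2.0, 3.0]])[..., None]
upsampled = resize_bilinear(coarse, 3, 3)
print(upsampled[..., 0])
```

Each new pixel is a weighted average of its four nearest source pixels, which is why the step is cheap and needs no learned parameters.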

ERROR RATE

The metric used to measure segmentation accuracy is intersection over union (IOU). The IOU values measured over 21 randomly selected test images are:

[ 0.90853866  0.75403876  0.35943439  0.63641792  0.46839113  0.55811771
  0.76582419  0.70945356  0.74176198  0.23796475  0.50426148  0.34436233
  0.5800221   0.59974548  0.67946723  0.79982366  0.46768033  0.58926592
  0.33912701  0.71760929  0.54273803]

These have a mean of 0.585907, which is very close to the number published in the original paper. The pixel-level classification accuracy is very high at 0.903266, meaning that when a pixel is classified as a certain object type, it is correct about 90% of the time.
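
For readers unfamiliar with the metric, the sketch below computes IOU and pixel accuracy on a made-up pair of label maps, then re-derives the mean of the IOU values listed above. The toy arrays and function names are illustrative assumptions, not the project's evaluation code.

```python
import numpy as np


def class_iou(pred, truth, label):
    """Intersection over union for one class label; NaN if the class is absent."""
    pred_mask = pred == label
    truth_mask = truth == label
    union = np.logical_or(pred_mask, truth_mask).sum()
    if union == 0:
        return float("nan")
    return float(np.logical_and(pred_mask, truth_mask).sum() / union)


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return float((pred == truth).mean())


# Toy 2 x 4 label maps (0 = background, 1 = object), made up for illustration.
truth = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1]])
pred = np.array([[0, 1, 1, 1],
                 [0, 0, 0, 1]])

toy_iou = class_iou(pred, truth, label=1)    # 3 shared pixels / 5 in the union
toy_accuracy = pixel_accuracy(pred, truth)   # 6 of 8 pixels agree

# The IOU values reported above, copied verbatim.
reported = [
    0.90853866, 0.75403876, 0.35943439, 0.63641792, 0.46839113, 0.55811771,
    0.76582419, 0.70945356, 0.74176198, 0.23796475, 0.50426148, 0.34436233,
    0.5800221, 0.59974548, 0.67946723, 0.79982366, 0.46768033, 0.58926592,
    0.33912701, 0.71760929, 0.54273803,
]
reported_mean = float(np.mean(reported))
print(toy_iou, toy_accuracy, round(reported_mean, 6))
```

IOU penalizes both missed and spurious pixels, which is why it can sit well below pixel accuracy on the same predictions.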

CONCLUSION

The ability to identify image pixels as members of a particular object without a pre-processing step of bounding box detection is a major step forward for deep image recognition. The technique demonstrated in Shelhamer’s paper achieves this goal by combining coarse-level semantic identification with pixel-level location information. This technique leverages transfer learning based on pre-trained image recognition models that were winning entries in the ImageNet competition. Various open source projects have replicated the results. Certain implementations require extraordinarily long training times.

Voice Ordering Is Here. Voice Shopping Is Coming… And It’s Far More Interesting

Siri has been with us for years, but it’s in the last few months and largely due to Amazon that voice assistants have won rapid adoption and heightened awareness.

Over these past few months, we’ve been shown the power of a new interaction paradigm. I have an Echo Dot and I love it. Controlling media and home devices (some lights now, maybe the thermostat soon) seems among the most useful and sticky applications. The Rock, Paper, Scissors skill… yeah, that one’s probably not going to see as much use. But let’s not forget that this slick device is brought to us by the most dominant eCommerce business in the known universe. So it’s great for voice shopping, right? No, not at all, as it doesn’t actually do “shopping.”

“But I heard the story about the six-year-old who ordered herself a dollhouse?” So did I, and it reinforces my point. Let me explain. The current state of commerce via Alexa is most like a broad set of voice-operated Dash Buttons. For quick reorders of things you buy regularly and when you’re not interested in price comparisons, it’s fine. What it’s not is voice shopping. Shopping is an exercise in exploration, research, and comparison. That experience requires a friendly and intelligent guide. As such, voice shopping isn’t supported by the ubiquitous directive-driven (do X, response, end) voice assistants.

Enter Jaxon and Conversational AI

Shopping is about feature and price comparison, consideration of reviews, suggestions from smart recommendation engines, and more. Voice shopping is enabled by a conversational voice experience, one that understands history and context and delivers a far richer experience than is widely available today.

The Mobile Impact

Mobile commerce isn’t new but is still growing fast. Yet, despite consumers spending far more time on mobile devices than on desktops (broadly defined, including laptops), small screen eCommerce spending still lags far behind.

So why can’t merchants close on mobile? The small screen presents numerous challenges. Small screens make promotion difficult and negatively impact upselling and cross-selling. Another major factor, and one you’ve probably experienced, is the often terrible mobile checkout process. Odds are you’ve abandoned a mobile purchase path after fiddling with some poorly designed forms. I have. Maybe you went back via your laptop. Maybe you didn’t. Either way, that’s a terrible user experience.

Our approach to Conversational AI solves these small screen challenges. Merchants can now bring a human commerce experience to the small screen without the mess. It’s a new, unparalleled engagement opportunity — a chance to converse with your customer, capture real intelligence about their needs, and offer just the right thing. It’s an intelligent personal shopper in the hands of every customer.

Come re-imagine voice shopping with us. Imagine product discovery and comparison, driven by voice. Imagine being offered just what you were looking for, based on a natural language description of what you need. Imagine adjusting your cart with your voice. Imagine entering your payment and shipping info quickly and seamlessly, via voice. It’s all possible and it’s happening now with Jaxon.

Customers often ask what gives us the qualifications to work in their industry (industries like these, for example). They wonder whether we are able to handle the massive amounts and types of data they have available within their respective industries. Before we answer these questions, consider the following:

Picture in your mind the industry you work in. Do you think you offer a unique set of skills that no other industry can match? Are your data sources large and unwieldy, seemingly more complex than those of other industries? Do you feel as though it takes a person within your industry to fully comprehend the data complexities you have to manage?

If you answered “yes” to any of these questions, you’re wrong.

That’s not entirely true; you might not be completely wrong. But chances are that, while your data may be unique in some ways, it’s probably not harder or more complex than that of most other industries. Now you’re saying to yourself, “Well, how do you know? You don’t work in my industry, do you?” But you might be surprised to find that we do work in your industry. In fact, we work in all industries.

When it comes to leveraging Big Data, breadth of skill set and ability are key to managing the overwhelmingly complex sets of data that you encounter in your industry. The problem many of these industries face is that they don’t actually have that breadth to work with. Yes, they may be leaders in their industry, but that still means they are held within the confines of only one industry, not knowing what else is out there that might work for them. That is where we come in. You see, our work in multitudinous industries (eCommerce, Healthcare, Finance, Manufacturing, and Life Sciences, to name a few) across myriad platforms has provided us with a vast breadth of skill sets and abilities that pertain not only to the industry in which they were acquired but to innumerable other industries as well.

Oftentimes, problems that seem unprecedented or distinct within one industry have more than likely already occurred in a similar vein within another industry. Since BigR.io works in multiple organizations across many industries, we have the ability to identify and solve many, many problems and compare them to many other problems experienced within those industries. Additionally, as Country Music Hall of Famer Kenny Rogers so eloquently explains, you got to know when to hold ’em, know when to fold ’em, know when to walk away and know when to run. The same principle applies to solving Big Data problems. We have high-horsepower, high-caliber data scientists with good judgment who know when to bridge across organizations and industries, when to focus within a single industry, and when to find another solution entirely.

BigR.io’s engineering team has extensive experience across many industries, thrives in new environments, and can help you with your company’s Big Data, Machine Learning, and Custom Software needs. For more information on how we can help handle these needs, visit our library full of case studies and white papers.

As I outline in the Machine Learning Field Guide, the concept of Machine Learning arose from interest in having machines learn from data. The industry has seen cycles of stagnation and resurgence in machine learning/AI research since as early as the 1950s. During the 1980s, we saw the emergence of the Multi-layer Perceptron and its backpropagation training mechanism, both fundamental to today’s highly sophisticated Deep Learning architectures capable of image recognition and behavior analysis. However, to reach its zenith, this field depended on advancements in data proliferation and acquisition that wouldn’t materialize for many more decades. As promising as the initial results were, early attempts at industrial application of artificial intelligence as a whole fizzled.

Though the practice of Machine Learning only ascended to prominence recently, much of its mathematical foundation dates back centuries. Thomas Bayes, father of the Bayesian method on which we base contemporary statistical inference, wrote his famous equation in the 1700s. Shortly after, in the early 1800s, immortalized academics like Legendre and Gauss developed early forms of the statistical regression models we use today. Statistical analysis as a discipline remained an academic curiosity from this time until the commoditization of low-cost computing in the 1990s and the onslaught of social media and sensor data in the 2000s.

What does this mean for Machine Learning today? Enterprises are sitting on data goldmines and collecting more at a staggering rate with ever greater complexity. Today’s Machine Learning is about mining this treasure trove, extracting actionable business insights, predicting future events, and prescribing next best actions, all in laser-sharp pursuit of business goals. In the rush to harvest these gold mines, Machine Learning is entering its golden age, buoyed by Big Data technology, Cloud infrastructure, and abundant access to open source software. Intense competition in the annual ImageNet contest between global leaders like Microsoft, Google, and Tencent rapidly propels machine learning/image recognition technology forward, and source code for all winning entries is made available to the public free of charge. Most contestants on the Kaggle machine learning site share their work in the same spirit as well. In addition to this source code, excellent free machine learning tutorials compete for mindshare on Coursera, edX, and YouTube. Hardware suppliers such as Nvidia and Intel further the cause by continuing to push the boundary for denser packaging of high-performance GPUs to speed up Neural Networks. Thanks to these abundant resources, any aspiring entrepreneur or lone-wolf researcher has access to petabytes of storage, utility-style massively parallel computing, open source data, and software libraries. As of 2015, this access has led to computer image recognition capabilities that outperform human image recognition abilities.

With recent stunning successes in Deep Learning research, the floodgates have opened for industrial applications of all kinds. Practitioners enjoy a wide array of options when targeting specific problems. While Neural Networks clearly lead in the high-complexity and high-data-volume end of the problem space, classical machine learning still achieves higher prediction and classification quality for low-sample-count applications, not to mention the drastic cost savings in computing time and hardware. Research suggests that the crossover occurs at around one hundred thousand to one million samples. Just a short time ago, numbers like these would have scared away any level-headed project manager. Nowadays, data scientists are asking for more data and are getting it expediently and conveniently. A good Data Lake and data pipeline are necessary precursors to any machine learning practice. Mature data enterprises emphasize the close collaboration of data engineering (infrastructure) teams with data science teams. “Features” are the lingua franca of their interactions, not “files,” “primary keys,” or “provisions.”

Furthermore, execution environments should be equipped with continuous and visual monitoring capabilities, as any long-running Neural Network training session (days to weeks) involves frequent mid-course adjustment based on feedback from evolving model parameters. Whether the model is the most common Linear Regression or the deepest Convolutional Neural Network, the challenge of any machine learning experimentation is wading through the maze of configuration parameters and picking out a winning combination. After selecting the candidate models, a competent data scientist navigates a series of decisions from starting point, to learning rate, to sample size, to regularization setting, as well as constant examination of convergence on parallel training runs and various runtime tuning, all in an attempt to get the most accurate model in the shortest amount of time.
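
A toy version of that search is sketched below for the simplest case, linear regression trained by gradient descent. The synthetic data, the grid values, and the function names are all assumptions invented for illustration. A real project would search far more settings and monitor convergence along the way.

```python
import itertools
import numpy as np


def train_linear_gd(X, y, learning_rate, l2, steps=200):
    """Fit linear-regression weights by gradient descent with an L2 penalty."""
    w = np.zeros(X.shape[1])                       # starting point: all zeros
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * l2 * w
        w -= learning_rate * grad
    return w


def mse(X, y, w):
    """Mean squared error of the predictions X @ w against y."""
    return float(np.mean((X @ w - y) ** 2))


# Synthetic data, invented purely for illustration.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
X_train = rng.normal(size=(200, 3))
y_train = X_train @ true_w + 0.1 * rng.normal(size=200)
X_val = rng.normal(size=(100, 3))
y_val = X_val @ true_w + 0.1 * rng.normal(size=100)

learning_rates = [0.001, 0.01, 0.1]
l2_settings = [0.0, 0.01, 1.0]

# Try every combination and score each on held-out validation data.
results = []
for lr, l2 in itertools.product(learning_rates, l2_settings):
    w = train_linear_gd(X_train, y_train, lr, l2)
    results.append({"learning_rate": lr, "l2": l2, "val_mse": mse(X_val, y_val, w)})

best = min(results, key=lambda r: r["val_mse"])
print(best)
```

Even this tiny grid requires nine training runs, which hints at why the search becomes expensive when a single run takes days or weeks.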

As I state in my recent e-book “Machine Learning Field Guide,” Machine Learning is smarter than ever and improving rapidly. This predictive juggernaut is coming fast and furious and will transform any business in its path. For the moment, it’s still black magic in the hands of the high priests of statistics. As an organization with a mission to deliver its benefits to clients, BigR.io trained an internal team of practitioners, organized an external board of AI advisors, and packaged a Solutions Playbook as a practice guide. We have harnessed best practices, specialty algorithms, experiential guidelines, and training tutorials, all in an effort to streamline delivery and concentrate most of our engagement efforts on areas that require specific customizations.

To find out more, check out the Machine Learning Field Guide, by Chief Data Scientist Bruce Ho.

To most in the know, Watson has long been considered more hype and marketing than technical reality. Given that it is presented as infinitely capable, bleeding-edge technology, you might think the well-known Watson brand would be delivering explosive growth to IBM.

Reality is far different. IBM’s stock is down in a roaring market. The company is, in effect, laying off thousands of workers by ending its work-from-home policy. More than $60M has perhaps been wasted by MD Anderson on a failed Watson project. All of this is happening against the backdrop of a rapidly expanding market for Machine Learning solutions.

But why? I saw Watson dominate on Jeopardy.

And dominate it did, soundly beating Ken Jennings and Brad Rutter. So think for a moment about what Watson was built to do. Watson, as was proven then, is a strong Q&A engine. It does a fine job in this realm and was truly state of the art…in 2011. In this rapidly-expanding corner of the tech universe, that’s an eternity ago. The world has changed exponentially, and Watson hasn’t kept pace.

So what’s wrong with Watson?

  • It’s not the all-encompassing answer to all businesses. It offers some core competencies in Natural Language and other domains, but Watson, like any Machine Learning tech, and perhaps more than most, requires a high degree of customization to do anything useful. As such, it’s a brand around which Big Blue sells services. Expensive services.
  • The tech is now old. The bleeding edge of Machine Learning is Deep Learning, leveraging architectures Watson isn’t built to support.
  • The best talent is going elsewhere. With the next generation of tech leaders competing for talent, IBM is now outgunned.
  • …and much more discussed here.

The Machine Learning market is strong and growing. IBM has been lapped by Google, Facebook, and other big name companies, and these leaders are open sourcing much of their work.

Will Watson survive? Time will tell.

Chopping down a tree is a lot like taking on a business project (e.g., for me that could mean delivering a large software application, creating a new artificial intelligence platform, or building a company). I am on vacation this week and decided to do some yardwork today. The first project I took on was chopping down a tree. I came at it with an axe and a thin pair of gloves – definitely not the right equipment, but sometimes you just have to dive in with whatever tools you have in front of you. I threw on a pair of shades to cover the minimal safety precautions and started with a fury. I swung with all my might for a “sprint” and then had to slow down or I would have burnt out. I then started to notice the similarities to business and began reflecting:

  • The first step is to identify and get clarity on what you are trying to accomplish. Some people think that, when it comes to starting a company or creating a product, that’s the hard part. The reality is that while the selection process does play a role in the ultimate success, it is the perseverance, perspiration, and positioning that are core.
  • Planning is a luxury that is vitally important, but it needs to be balanced with diving in and getting the job done. I am a huge proponent of planning and being organized, but oftentimes it is instinct that prevails to spark growth. If I had planned better, I would have had the right gloves at the very least. However, if I had done a hardware store run, I would have lost inspiration and that tree would still be standing.
  • Assess risk; don’t obsess about it. I have chopped down a tree before, so the risk of chopping down a tree was precalculated, but I did have to size up this particular tree. Speed and spontaneity can get buried by overanalyzing (analysis paralysis).
  • It helps to change your approach and come at it from different angles. It’s easy to fall into a rut. Shift gears if you feel diminishing returns. Take a step back and come back with a fresh perspective (a new stance). Don’t go too far adrift, as initial efforts can, and should, be built upon.
  • Brute force works for a bit, but letting the weight of the axe do most of the work is the long game. The analogy here is that you don’t need to go it alone. That method just doesn’t scale. Pull together the right team and rely on them to add to the effort (ideally autonomously).
  • Keep on doing and going. OK, I admit it, that is a quote from Henry Ford, but I have adopted it. Keep chipping away at whatever you are working on and you will eventually get there.
  • Find a certain angle that works and focus there. When you home in on one area, you will make strides. Ride the momentum and make a significant dent when you have it in front of you. Never ease up when you feel you are making progress.
  • In the end, it was a coordinated effort coming at it from both sides with all my might that won. Teamwork is the name of the game.

I am sore, I have a big blister on my hand, but I did it. I thought about borrowing a chainsaw or hiring someone to do it, but I knew I had it. At some point, you just know that you got it. Own it when you do!

“Alexa, how are you different than Siri?” “I’m more of a home-body”

I’m away from my desk, so I guess I can’t ask Alexa. No problem, I’ve got an iPhone in my pocket.

“Hey Siri, what’s the status of my Amazon order?” “I wish I could, but Amazon hasn’t set that up with me yet.” Doh!

IPAs (intelligent personal assistants*) are in their infancy, but they are the next major step in human-computer interaction. With the expected concurrent growth of IoT and connected devices, IPAs will be everywhere soon. Consider that it is easier to fit a small mic and speaker into a device than a screen and keyboard, and that outside of the desktop environment it is often easier to interact with such devices by voice.

However, as the highly-contrived (after all I actually am at my desk typing this, and Alexa is giving me dirty looks) scenario above illustrates, IPAs have different capabilities, and different strengths and weaknesses. While Alexa and Siri both want to be my concierge, I’m more likely to talk to Watson when I want to discuss cancer treatments or need to pwn Jeopardy. When I’m hungry after midnight, it’s TacoBot to the rescue.

As a user, I already interact with more than one IPA, and over time this number is only going to grow. I want to use the IPA that is both best and most convenient for my immediate need. I have no interest in being restricted to a single IPA vendor’s ecosystem; likewise I don’t want to have to juggle endpoints and IPAs for every little task. And Taco Bell wants to craft their own brand and persona into TacoBot instead of subsuming it into one of the platform IPAs or chasing every third-party platform in a replay of the mobile app days.

What I really need is for the assorted IPAs in my life to work together on my behalf as a team. When I’m out and about, I want Siri to go ask Alexa when my order will arrive. Neither IPA alone can meet my criterion (report on order status while I’m away from home), but Siri [mobile] and Alexa [connected to Amazon ordering] can achieve this collaboratively. Consider some of the aspects of complex, non-trivial tasks:

  • Mobility and location
  • Interactions with multiple, cross-vendor external systems
  • Asynchronous: real-world actions may take time to occur and aren’t containable within a one-time “conversation” with a current-state IPA
  • Deep understanding of both complicated domains and of my highly-personalized circumstances and context

So how do we herd these cats? One challenge is the mechanics of IPA-to-IPA communication. Will they speak the same language? How will each understand what another is good at? What if the other is knowledgeable about an area completely outside of the first IPA’s own?

APIs are the first, easiest option. They generally require explicit programming, but the interfaces are highly efficient and well-defined structurally. This is both a strength and a weakness, as well-defined structure imparts a rigidity and implication of understanding on both “client” and “server”. The Semantic Web was one attempt to address understanding gaps without explicit programming on both sides.

Another option is the use of human language. IPAs are rapidly getting better at this defining skill, and if they can communicate with people then why not use natural language capability with each other? Human language can be very expressive, if limited in information rate (good luck speaking more bits/s than a broadband connection), but efficiency and accuracy are a concern, at least with the current state of technology. One argument is that an IPA that does not fully understand a user’s language may better serve the user by simply relaying these words to another more suitable IPA instead of attempting to parse that poorly-understood language into an appropriate API call.

Of course, this is not an either/or decision and both may be utilized to better effect.
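
To make the "both" option concrete, here is a minimal sketch of one IPA that calls a structured API when it understands the request and otherwise relays the user's raw words to a better-suited IPA. Everything in it is hypothetical: the function names, the `ORDERS` data, and the keyword matching are stand-ins for illustration, not how Siri or Alexa actually work.

```python
from typing import Optional, Tuple

# Hypothetical order data standing in for a vendor back end.
ORDERS = {"user-1": "arriving Thursday"}


def alexa_order_status_api(user_id: str) -> str:
    """Structured API call: explicit and efficient, but rigid."""
    return ORDERS.get(user_id, "no open orders")


def alexa_hear(utterance: str, user_id: str) -> Optional[str]:
    """Natural-language entry point: the second IPA parses the words itself."""
    if "order" in utterance.lower():
        return alexa_order_status_api(user_id)
    return None


def siri_parse(utterance: str) -> Optional[str]:
    """The first IPA's own, deliberately limited, intent parser."""
    if "status of my amazon order" in utterance.lower():
        return "order_status"
    return None


def siri_handle(utterance: str, user_id: str) -> Tuple[str, str]:
    """Return (route, answer): use the API when the intent is understood,
    otherwise relay the raw words to the other IPA."""
    if siri_parse(utterance) == "order_status":
        return "api", alexa_order_status_api(user_id)
    relayed = alexa_hear(utterance, user_id)
    if relayed is not None:
        return "relay", relayed
    return "none", "Sorry, nobody on the team can help with that."
```

A request the first IPA can parse goes through the API route; a loosely phrased one such as "Where is my order?" falls through to the relay route and still gets answered.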

Language Interface for Conversation AIs

As this team of IPAs becomes more collaborative, another issue emerges that any manager will appreciate: how best to coordinate so that these IPAs function as a team rather than an inefficient collection of individuals.

  • One low-friction model is command-and-control. Alexa (or Siri, or Cortana, or Google, or… ) is the boss, makes dispatch decisions, and delegates to other IPAs.
  • Agile methodologies may provide inspiration for more collaborative processes. Goals are jointly broken down and estimated in terms of confidence, capability, etc. by the team of IPAs, and individual subtasks agreed upon and committed to by a voting system.
  • Because computation is cheap and generally fast in human time, a Darwinian approach may also work. Individual IPAs can proceed in competition and the best, or fastest, result wins. Previous wins, within a given context, will add a statistical advantage for the winning IPA in future tasks.
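
Two of the models above, command-and-control and the Darwinian approach, can be sketched in a few lines. This is a toy illustration under stated assumptions: an "agent" is just a function returning a confidence and an answer, the routing table is a keyword lookup, and the size of the bonus for past wins (`bonus_per_win`) is an arbitrary, hypothetical parameter.

```python
from collections import defaultdict
from typing import Callable, Dict, Optional, Tuple

# An agent takes a task string and returns (confidence in [0, 1], answer).
Agent = Callable[[str], Tuple[float, str]]


def command_and_control(task: str, routes: Dict[str, str],
                        agents: Dict[str, Agent]) -> Optional[str]:
    """The 'boss' delegates using a fixed keyword -> agent routing table."""
    for keyword, name in routes.items():
        if keyword in task.lower():
            return agents[name](task)[1]
    return None  # no delegate known for this task


class DarwinianTeam:
    """Every agent attempts the task and the best score wins.
    Past wins in the same context add a small bonus to future scores."""

    def __init__(self, agents: Dict[str, Agent], bonus_per_win: float = 0.05):
        self.agents = agents
        self.bonus_per_win = bonus_per_win
        self.wins = defaultdict(int)  # (context, agent name) -> win count

    def solve(self, context: str, task: str) -> Tuple[str, str]:
        best_name, best_score, best_answer = "", float("-inf"), ""
        for name, agent in self.agents.items():
            confidence, answer = agent(task)
            score = confidence + self.bonus_per_win * self.wins[(context, name)]
            if score > best_score:
                best_name, best_score, best_answer = name, score, answer
        self.wins[(context, best_name)] += 1  # remember the winner
        return best_name, best_answer
```

The trade-off shows up directly in the code: command-and-control does one call but depends on the boss's routing table being right, while the Darwinian team calls every agent and lets results, plus history within a context, decide.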

As IPAs become more and more entwined in our daily lives and embedded into the devices that surround us, we will learn to utilize them as a collaborative community rather than as individual devices. Unique personas become a “customer service” skill, but IPAs with whom we do not always wish to communicate directly still have value to provide. This collective intelligence is one of the directions in which we can expect to see significant advances.

* Also delicious, delicious beers. Mmmmmm beer…

While working with eBay Enterprise on a large multi-company effort to create a marketing platform, I saw first-hand the struggle around effective project management. I played Scrum Master for our development team and immersed myself in the eBay Enterprise team, including keeping a desk at their facility. Actually, our entire team kept desks there. We didn’t come in every day, as we were more productive working from home or in our offices that had private rooms, creating an environment with limited distractions. Collaborating with team members in person was a weekly, if not bi-weekly, occurrence. We were at eBay every Tuesday (all day), regardless of scheduled meetings. The eBay architects and managers we interfaced with knew we would be there every Tuesday and would schedule meetings accordingly. “Water cooler talk” (actually more like foosball and lunch talk) was also pretty strong on Tuesdays. We had 2-week sprints and scheduled the demo/review, retrospective, and planning meetings all on the last Friday of each sprint, also held on site. We would also frequently come in between scheduled days on site, as needed.

I am going into detail on how we as a team were amalgamated into eBay’s development methodology and infrastructure. We were a tightly integrated extension of their team. It worked very well. When the Scaled Agile Framework (SAFe) was deployed, it wreaked havoc. First off, it was done mid-project, so it was disruptive. The real issue was that the project was already off track. We were 1 of 5 other teams on the project. There was another large Fortune 500 company on as a partner that doubled as a contractor, building a complementary product that was intended to be married to the eBay engine and sold in separate domains (the other company was mostly interested in credit card and banking customers).

We were the “A Team” that set the quality standards. We had a higher velocity than the other teams, but points were normalized, which I thought was a huge mistake. It was done for the sake of making it easier to report points and track progress for upper management. The bigger issue was the fact that many of the other teams were “Agile-in-Name-Only”, an all too common phenomenon we run across. If you are not continuously cranking out quality, tested, ready-to-deploy code every sprint (with exceptions for Sprint 0 for setup and for Hardening, Innovation, and Planning (HIP) sprints), then you are not Agile. Continuous is the operative word and that should transcend development, testing, DevOps, and management.

SAFe has benefits and should be leveraged to create your most effective project teams. For us, the main boost to productivity that it brought was the concept of a Product Owner, typically an architect who serves as the main translator of product requirements from Product Managers. Product Owners are responsible for making sure there are enough stories written out to keep the development team coding. They are also responsible for doing code reviews each sprint and accepting stories.

Developers are still the heart of Agile efforts and must be protected from typical company distractions like excessive meetings and other activities that are not coding. The fable of the Chicken and the Pig is a great way of thinking about team dynamics. Chickens (product managers, project managers, CTOs, VPs of Engineering, etc.) are not core to the development of code. They are influential and very important, but they are not as “committed” as the Pigs (the developers) and must respect the developers’ time. We have added in a 3rd character, The Wolf, whose role is to make sure the whole operation is embracing Agile and not being affected by outside factors (e.g., upper management budgeting and scheduling in Waterfall ways). The Wolves BigR.io has on staff are adept at identifying the people, policies, and processes that might impact velocity and diplomatically removing them. Wolves need not worry about corporate politics or any other roadblock to delivering good quality production-ready code that meets or exceeds spec and covers all non-functional requirements.

The principles of Agile are available in a multitude of locations, but be sure to read through the Twelve Principles of Agile Software first as well as the history behind the Agile Manifesto. Also note that Agile is not a “religion” and while structure is key, adaptability to what will make your team thrive is paramount. Beyond the fundamentals like continuous development and testing, it really is all about the rhythm of the team. Ease of collaboration, team camaraderie and energy shouldn’t be underestimated. Here are a handful of components in our approach that I believe are mandatory:

  • Team lunches – help build the team dynamic and foster more cohesion. Ideally at least once a week and ideally in person, but virtual is better than not doing it at all.
  • True retrospectives – give everyone an equal voice and effectuate the changes (within reason).
  • Modern environment – the right tools, equipment, and infrastructure.
    1. QA automation and continuous testing
    2. Continuous integration (e.g., Jenkins)
  • Planning Poker – have all Pigs in a room (ideally the last day of the prior sprint), prioritize the stories that will make it into the next sprint, and then estimate each story by having every Pig vote. Our preference is to use the Fibonacci sequence for estimating relative size of stories and throw out the outliers – 1 on the high side and 1 on the low.
  • Tasking stories – the sooner this is done the better, but in our experience this is better done after everyone has had some time to think them through. We usually make sure all tasking is done by the end of the 1st Tuesday of each sprint.
  • “Control-Freedom Balance” – we make sure each team member burns down their stories as they progress. Keeping a pulse on progress is key for highlighting problem areas and making sure the project stays on track. As tech lead or Scrum Master, you really need to “measure so you can manage” and strike the delicate balance of just enough controls with developer freedom.
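
As a rough illustration of the Planning Poker step in the list above, the sketch below drops one vote on the high side and one on the low side. How the remaining votes are combined is not something described above; averaging them and snapping to the nearest Fibonacci value (rounding up on ties) is an assumption made purely for this example, as are the function name and the point scale.

```python
from typing import List

# Assumed story-point scale; teams often use a variant of this.
FIBONACCI_POINTS = [1, 2, 3, 5, 8, 13, 21]


def planning_poker_estimate(votes: List[int]) -> int:
    """Drop the single highest and single lowest vote, then snap the mean
    of the remaining votes to the nearest Fibonacci value."""
    if any(v not in FIBONACCI_POINTS for v in votes):
        raise ValueError("votes must be Fibonacci story points")
    if len(votes) < 3:
        raise ValueError("need at least three votes to drop one high and one low")
    trimmed = sorted(votes)[1:-1]  # throw out 1 low and 1 high outlier
    mean = sum(trimmed) / len(trimmed)
    # Ties (e.g. a mean of 4, halfway between 3 and 5) resolve to the larger value.
    return min(FIBONACCI_POINTS, key=lambda p: (abs(p - mean), -p))
```

In practice a wide spread of votes is usually a cue for discussion and a re-vote rather than arithmetic, so treat the combining rule here as a placeholder.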

BigR.io’s adaptation of Agile is a key element of our team’s DNA. Our engineers embrace and employ our methodology to consistently deliver for our customers. Team member roles and responsibilities are well known at the outset of every project and everyone maintains strict adherence to the established methodology (which by design is adaptable) throughout the project. I will close by saying that Agile is not hype. It really works, especially if you embrace the core fundamentals/principles but adapt the structure to your team and company.

For more information on our engagement with eBay please review our case study.

Recently, a customer asked us to help transition a set of data flows from an overwhelmed RDBMS to a “Big Data” system. These data flows had a batch dynamic, and there was some comfort with Pig Latin in-house, so Hadoop made for an ideal target platform for the production data flows (with architectural flexibility for Spark and other technologies for new functionality, but here I digress).

One wrinkle from a vanilla Hadoop deployment: they wanted schema enforcement soup-to-nuts. A first instinct might be that this is simply a logical data warehouse – and perhaps it is. So often these days one hears about Hadoop and Data Lakes and Schema-On-Read as the new shiny that it is easy to forget that Schema-On-Write also has a time and a place, and as with most architectural decisions, there are tradeoffs – right (bad pun… intended?) times for each.

Schema-On-Read works well when:

  • Different valid views can be projected on a given data set. This data set may not be well understood, or may be applicable across a number of varied use cases.
  • Flexibility outweighs performance.
  • The variety “V” is a dominant characteristic. Not all data will fit neatly into a given schema, and not all will actually be used; save the effort until it is known to be useful.

Schema-On-Write may be a better choice when:

  • Productionizing established flows using well-understood data.
  • Working with data that is more time-sensitive at use than it is at ingest. Fast interactive queries fall into this category, and traditional data warehousing reports do as well.
  • Data quality is critical – schema enforcement and other validation prevents “bad” data from being written, removing this burden from the data consumer.
  • Governance constraints require metadata to be tightly controlled.
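
The difference between the two approaches can be shown with a small in-memory sketch. This is not the customer's Hadoop/Pig implementation; the store classes, the `ORDER_SCHEMA` fields, and the `validate` helper are all hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List

# Hypothetical schema: field name -> required Python type.
ORDER_SCHEMA = {"order_id": int, "amount": float}


def validate(record: Dict[str, Any], schema: Dict[str, type]) -> None:
    """Raise ValueError unless the record has exactly the schema's fields and types."""
    if set(record) != set(schema):
        raise ValueError(f"fields {sorted(record)} do not match schema {sorted(schema)}")
    for name, expected in schema.items():
        if not isinstance(record[name], expected):
            raise ValueError(f"{name!r} should be {expected.__name__}")


class SchemaOnWriteStore:
    """Validates at ingest, so readers can trust every stored row."""

    def __init__(self, schema: Dict[str, type]):
        self.schema = schema
        self.rows: List[Dict[str, Any]] = []

    def write(self, record: Dict[str, Any]) -> None:
        validate(record, self.schema)  # "bad" data is rejected here
        self.rows.append(record)

    def read(self) -> List[Dict[str, Any]]:
        return list(self.rows)


class SchemaOnReadStore:
    """Accepts anything at ingest; each reader projects its own view."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, Any]] = []

    def write(self, record: Dict[str, Any]) -> None:
        self.rows.append(record)

    def read(self, schema: Dict[str, type]) -> List[Dict[str, Any]]:
        view = []
        for record in self.rows:
            try:
                projected = {name: record[name] for name in schema}
                validate(projected, schema)
            except (KeyError, ValueError):
                continue  # the consumer bears the cost of bad data
            view.append(projected)
        return view
```

With schema-on-write the validation cost is paid once at ingest and reads are simple; with schema-on-read the same raw rows can serve several different views, but every reader repeats the filtering work.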