A brute-force crowdsourcing approach to data labeling is taxing even for well-heeled enterprises. For a startup, the expense and effort can easily be prohibitive. While deep learning exercises involve sample sizes in the millions, even conventional machine learning models require a minimum of tens of thousands of samples. Beyond the sheer numbers, the quality of labeling results on the Mechanical Turk platform is far from assured without a well-designed cross-checking and validation methodology.
BigRio leverages state-of-the-art semi-supervised techniques to enhance classification accuracy by a factor of 10 to 100, thus reducing the amount of labeling to fewer than 1,000 samples per experiment. Often, a PoC project can extract considerable value from data even without labels by means of unsupervised learning, such as clustering and visualization, an area in which BigRio has long-established credentials as an innovator.
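To make the semi-supervised idea concrete, here is a minimal sketch of self-training, one common semi-supervised technique. It is not BigRio's actual method. The nearest-centroid classifier, the `margin` confidence rule, the function names, and the synthetic two-class data are all assumptions chosen for illustration. A handful of labeled samples seeds the model, which then pseudo-labels the unlabeled points it is confident about and refits.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def class_centroids(labeled):
    """Map each label to the mean of its feature vectors."""
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)]
            for y, xs in groups.items()}

def predict(cents, x):
    """Return the label of the nearest class centroid."""
    return min(cents, key=lambda y: sq_dist(x, cents[y]))

def self_train(labeled, unlabeled, rounds=10, margin=4.0):
    """Hypothetical self-training loop (assumes at least two classes).

    Each round fits centroids on the current labeled set, then moves an
    unlabeled point into it only if the nearest centroid is at least
    `margin` times closer (in squared distance) than the runner-up.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        cents = class_centroids(labeled)
        confident, uncertain = [], []
        for x in pool:
            dists = sorted(sq_dist(x, c) for c in cents.values())
            if dists[0] * margin <= dists[1]:
                confident.append((x, predict(cents, x)))
            else:
                uncertain.append(x)
        if not confident:
            break  # nothing new to learn from
        labeled += confident
        pool = uncertain
    return class_centroids(labeled), pool

def blob(cx, cy, n, rng):
    """Synthetic 2-D Gaussian cluster (illustrative data only)."""
    return [[rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)] for _ in range(n)]

rng = random.Random(0)
# Only three hand-labeled samples per class; the rest are unlabeled.
seed_labels = ([(p, "A") for p in blob(0, 0, 3, rng)] +
               [(p, "B") for p in blob(5, 5, 3, rng)])
unlabeled = blob(0, 0, 200, rng) + blob(5, 5, 200, rng)

centroids, leftover = self_train(seed_labels, unlabeled)

held_out = ([(p, "A") for p in blob(0, 0, 100, rng)] +
            [(p, "B") for p in blob(5, 5, 100, rng)])
accuracy = sum(predict(centroids, x) == y for x, y in held_out) / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}, still unlabeled: {len(leftover)}")
```

On real data the confidence rule matters a great deal: a threshold that is too loose lets early mistakes reinforce themselves, so pseudo-labels are usually spot-checked against a small validation set.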
A visualization technique applied to unsupervised clustering is used to investigate the effectiveness of battery charging cycle data for classifying battery aging behaviors.
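As a rough illustration of the clustering-plus-visualization workflow, the sketch below runs a small k-means on synthetic data and draws a coarse text scatter plot. Everything specific here is an assumption: the two features (charge time and capacity), their values, the choice of k-means, and the helper names are hypothetical stand-ins, not the data or method behind the study described above.

```python
import random

def standardize(rows):
    """Rescale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    cents = [points[0]]
    while len(cents) < k:
        cents.append(max(points,
                         key=lambda p: min(sq_dist(p, c) for c in cents)))
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, cents[j]))
                  for p in points]
        # Recompute each centroid; keep the old one if a cluster is empty.
        new = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            new.append([sum(col) / len(members) for col in zip(*members)]
                       if members else cents[j])
        if new == cents:
            break  # converged
        cents = new
    return labels, cents

def ascii_scatter(points, labels, width=40, height=12):
    """Render 2-D points as a text grid, one digit per cluster label."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for (x, y), l in zip(points, labels):
        col = int((x - x0) / (x1 - x0) * (width - 1))
        row = height - 1 - int((y - y0) / (y1 - y0) * (height - 1))
        grid[row][col] = str(l)
    return "\n".join("".join(r) for r in grid)

rng = random.Random(1)
# Hypothetical per-cycle features: [charge time (min), capacity (Ah)].
fresh = [[rng.gauss(60, 3), rng.gauss(2.9, 0.05)] for _ in range(40)]
aged = [[rng.gauss(78, 3), rng.gauss(2.3, 0.05)] for _ in range(40)]

points = standardize(fresh + aged)
labels, cents = kmeans(points, k=2)
print(ascii_scatter(points, labels))
```

No aging labels are used at any point. If the clusters separate cleanly in the plot, that suggests the charging-cycle features carry enough signal to be worth labeling and modeling further.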