Chapter 1, or how an exciting data science project turned into a full-time data labeling nightmare
A few years ago, while working at a data science company that analyzes Twitter data, I was tasked with building classifiers for detecting newsworthy objects in Twitter images. For a data scientist, such a project is amazing – interesting technology, challenging goals, lots of value, and all the machine learning buzzwords one can ask for. I was excited about this project, and immediately fired up a GPU instance.
If you’re still building classifiers based on annotated isolated images instead of annotated videos, you’re losing valuable information, getting lower-quality annotations, spending more money, and walking away from much more training data. Read on to learn why, and how you can transition to scalable, fast, and cost-effective video annotation.
The legend of the Data Scientist’s Fairy Tale goes like this:
Once upon a time, in a village not far from here, lived a young girl who enjoyed looking at data. One day, she got hold of a unicorn-sized trove of magical data, and after a quick look, found a huge diamond inside it. The End!
Over the past year, I’ve been working on building a solution to a problem I frequently encountered while working as a data scientist. To train the machine learning models I was building, I had to use clean, labeled data. I often ended up manually reviewing and labeling data myself, a process that was tedious, lengthy, boring, error-prone, and probably not the best use of my time. As a fellow data scientist eloquently put it, “Machine Learning is 99% Manual Labor.”