Not since LeBron James was a high school basketball player in Akron, Ohio, has there been as much hype as Wall Street is now lavishing on big data. One joke has it that big data has become so hard to pin down that you could get hired just for being able to explain it. The Wall Street Journal warns of a permanent pink slip if you don’t at least understand the term. Consider this a primer.
Employers are in fact stepping up their search for data analysts across industries and are even using big data to find those people. Yes, this gets very meta, very quickly.
As I recently covered, the tech-sourcing of white collar jobs is a veritable phenomenon. But is it true that employees with no knowledge of big data are sealing their own fates, as the Wall Street Journal opined in an interview with an executive recruiter earlier this month? Does having slept through college stats and never looked back now mean “a permanent pink slip”: not just firing, but permanent unhireability?
Headlines of that kind beg for attention. But the point behind such articles nonetheless appears valid. Across industries, an inability to speak the language of big data, to use phrases like “signal to noise” and the verb “to crunch” (without meaning that something is being eaten or stepped on), likely means that workers, managers and executives will get left behind. Yes, new things are scary things, but it’s time for even the most suspicious to stop pretending that big data doesn’t exist, or asserting that it’s a fad, and to step back for a calm, practical look at its present and future.
The Wall Street Journal story focuses on two different kinds of knowledge workers: analytics professionals and data scientists. Data scientists work with unstructured data sets, whereas analytics professionals work with structured data. Another way to clarify the terms here (and they are confusing) might be that data science is a larger, interdisciplinary field that encompasses big data (and, yes, there are other kinds of data). Data science is the study of data extraction, one abstract step removed from data extraction or analysis itself. Data scientists build the mathematical and statistical models, the hardware and the software tools that data analysts use to derive communicable, actionable knowledge from structured data.
According to the WSJ, hiring has increased most dramatically for data analysts, those workers who know how to gather, analyze and visualize the many different forms of information involved. That means everyone from online marketers, who can derive telling patterns from millions of clicks and views, to supply chain managers newly empowered to monitor and reduce energy consumption through sensors installed on every device in every factory. Interestingly, the means by which workers are now educating themselves have scaled up too: Instead of seeking out night classes or stints at university extension schools, adult employees are enrolling in Massive Open Online Courses, or MOOCs, offered even by the likes of Harvard University. This is distance learning scaled up on an Internet architecture meant to accommodate an almost infinite number of students and available much more immediately than most classroom courses.
Grading in such courses is sometimes itself performed by machines (I warned that this gets very meta, very quickly), employing some of the same data tools used to rapidly break down all kinds of text into analyzable pieces. That’s worth noting because it’s a primary indicator that smart educators will learn how to think with such tools, a concrete instance of big data at work where you might not expect it. Notable here, too, is that our consumption of knowledge has expanded and continues to move outside of an older, qualitative model. The same is true of our understanding of the scale of useful knowledge itself. We’re living in a very big age. (Not even the old bastions of the qualitative, university English departments, have passed up the use of large data corpora to inflect their work.)
But why is big data so useful, so intensely valuable to businesses? The simplest answer is competitive intelligence. Masses of information about a thousand different aspects of any market aren’t obscure piles of unreadable signs and secrets for highly specialized humans and machines to decode. They’re open doors, and thriving, smart companies are already walking right through them. Large blocks of historical data about weather conditions have been used to project agricultural productivity for decades. Sentiment analysis run on millions of tweets, past and in real time, can lend insight into the performance both of a company’s own products and of its competitors’. Obvious applications abound for stock-picking.
Companies are also using big data to improve or alter their business models. According to case studies released by Gartner, a technology research group and leader in big data consulting, a major fast-food retailer “is training cameras on drive-through lanes to determine what to display on its digital menu board. When the lines are longer, the menu features products that can be served up quickly; when the lines are shorter, the menu features higher-margin items that take longer to prepare.” Precise calculations of waste mean better and better ideas of how to increase efficiency at every level. Operations, manufacturing, supply chain management: the applications for big data across industry are as wide as the human imagination. Gartner also cites what it calls dark data, information collected by an organization for one purpose, long ago, and then reanalyzed using new tools for insight still buried within it.
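For the curious, the drive-through rule Gartner describes comes down to a single decision, and it can be sketched in a few lines of Python. This is only an illustration, not the retailer’s actual system: the menu items, preparation times, margins, the five-car threshold and the function name choose_featured are all invented for the example, and the camera that counts the cars is assumed to exist somewhere upstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MenuItem:
    name: str
    prep_seconds: int  # hypothetical average preparation time
    margin: float      # hypothetical profit per item, in dollars


def choose_featured(items, cars_in_line, long_line_threshold=5, slots=2):
    """Pick which items to feature on the menu board.

    Long line: feature the quickest items to prepare.
    Short line: feature the highest-margin items.
    The threshold and number of slots are assumptions for illustration.
    """
    if cars_in_line >= long_line_threshold:
        ranked = sorted(items, key=lambda item: item.prep_seconds)
    else:
        ranked = sorted(items, key=lambda item: item.margin, reverse=True)
    return [item.name for item in ranked[:slots]]


# Invented sample menu; none of these figures come from Gartner.
MENU = [
    MenuItem("fries", 30, 0.90),
    MenuItem("soft drink", 15, 1.10),
    MenuItem("grilled chicken wrap", 240, 2.40),
    MenuItem("specialty burger", 300, 3.10),
]

print(choose_featured(MENU, cars_in_line=8))
print(choose_featured(MENU, cars_in_line=1))
```

With eight cars waiting, the sketch features the two fastest items; with one car, it switches to the two with the fattest margins. A real system would presumably tune the threshold from its own sales data rather than hard-code it.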
The Gartner case studies are at once ingenious and eminently practical. Even human resources departments can now predict how many employees at different tiers and in different wings of large companies will retire or have children in a given year. The things we can know about our own activities and about the behavior we can explore are increasingly being measured by the second.
Gartner estimates that 4.4 million jobs will have been created around big data by 2015. Our current iteration of the knowledge economy has been called the information economy, and is now also being called the data economy. Full participation in that economy requires broad-stroke knowledge of its contours and enthusiasm for learning and applying its tools as they continue to emerge.
At minimum, it does seem worthwhile to sign up in the evenings for a machine-graded, big-data-grade-curve-calibrated online statistics course from one of the MOOC providers, or a course in big data for marketing. You might just get an unexpected raise after completing the course (which itself might be correlation rather than causation), but at least by then you will be able to opine eloquently on the distinction between the two, which will carry you through the first three quarters of every job interview over the next decade, if the big data on employment trends is to be believed.