When you pretend the subjective is objective.
I first detected a slightly fishy smell in how data was used while hanging around human resources. At some point the old-fashioned employee evaluation, which involved having the boss sit down with the employee and discuss his or her performance, was replaced by online forms that you fill out staring at your screen rather than at your employee.
These software apps produce an evaluation that is numerical. So subjective evaluations of things like leadership or teamwork, attitude or aptitude, are converted into numbers on a form. That way calculations can be made comparing employees across departments or industries. The numbers can be used to determine promotions, firings or raises, and they can be used within an organization to stack rank employees. If your goal is to destroy the morale of your staff, no tool is more effective than stack ranking.
I’ve been involved in dozens, if not hundreds, of these evaluations. For example, I have had three different bosses evaluate my ability to communicate by plugging a number from 1 to 5 into a form. The results from the three were wildly different. Do you think that is because my ability to communicate went up and down like a roller coaster? Or did the form tell you more about the evaluator than the evaluated? If you do this on a companywide basis and have dozens of different evaluators trying to interpret behavior with statistics, what are you likely to find out? Perhaps that the software company that sold you that HR system took you for a ride.
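To make that question concrete, here is a minimal Python sketch. Every number in it is made up for illustration, and the names (`ratings`, `evaluator_averages`, `relative_scores`) are hypothetical, not taken from any real HR system. It assumes three bosses each score the same three people on the 1-5 form, then compares the raw scores with the same scores measured against each boss's own average.

```python
from statistics import mean

# Hypothetical 1-5 "communication" scores. All values are invented
# for illustration; they are not real evaluation data.
ratings = {
    "boss_a": {"me": 2, "colleague_1": 1, "colleague_2": 3},
    "boss_b": {"me": 4, "colleague_1": 3, "colleague_2": 5},
    "boss_c": {"me": 3, "colleague_1": 2, "colleague_2": 4},
}


def evaluator_averages(ratings):
    """Average score each evaluator hands out across everyone they rated."""
    return {boss: mean(scores.values()) for boss, scores in ratings.items()}


def relative_scores(ratings):
    """Re-express each score as its distance from that evaluator's own average."""
    averages = evaluator_averages(ratings)
    return {
        boss: {person: score - averages[boss] for person, score in scores.items()}
        for boss, scores in ratings.items()
    }


raw_for_me = [scores["me"] for scores in ratings.values()]
relative_for_me = [scores["me"] for scores in relative_scores(ratings).values()]

print("Raw scores for me:     ", raw_for_me)
print("Relative to each boss: ", relative_for_me)
print("Average given by boss: ", evaluator_averages(ratings))
```

In this invented case my raw scores are 2, 4 and 3, yet I sit exactly at each boss's average every time. All of the spread comes from how generous each evaluator is, not from the person being evaluated. Real forms will be messier, and subtracting an average does nothing to make the underlying judgment any less subjective.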
If you want to let your imagination loose on what this could eventually look like, read Dave Eggers’ novel The Circle. In his story, the Circle is a massive technology company that evaluates its employees in real time on a scale of 1-100. If you think that is a good idea, this post is for you.
When the system is gamed.
There seems to be no more straightforward use of online data than to gauge the popularity and effectiveness of content, whether it is a news story, advertisement or Web page. By looking at what gets the most views you’ll find out what was most widely read. That is, unless you only want to consider views by human beings.
You can buy views for just about anything: online articles, ads, video. This is something that publishers have been found to do. They buy thousands of views, which makes it appear they have more reach than they really have, and then charge advertisers accordingly. What do advertisers get for their high rate? They get pinged thousands of times by robots.
Using crowdsourcing to produce numerical ratings for anything from restaurants to businesses to books again appears to be a pretty attractive application of data. Until you find out there are people in places like the Philippines getting paid a buck a shot to hype certain entities and trash others. Yelp at one time estimated that up to 25% of its reviews were fake (Digital Deception: Astroturfers).
Another application of big data that caught on quickly is using data to identify online influence. This is done through a calculation of followers, connections, likes, retweets, comments, etc. I’m not certain that actually tells you how influential someone is, but I can say with some certainty that if you’ve bought those followers and likes, and the Internet is full of offers to do just that, the number really has nothing to do with influence.
The gold standard of online influencer ratings is Klout. As I mentioned in an earlier post (Digital Deception: The Illusion of Influence), my Klout score will go down if I’m travelling for a couple of weeks and ignore my social media accounts. But if I catch my dog in a cute pose, post it on Facebook, and get a few dozen likes, my Klout score goes up. Neither of these things has anything to do with whether or not I am influential.
When you need a little context
The one application of big data that most of us have experienced is the recommendation engine. Most commonly this is used on ecommerce sites. If you bought this, you might want this. Just like being prompted to get fries with your burger.
Amazon is well known for this. If I look at my Amazon account history, I’ll find purchases of books about architecture. There will be books intended for young readers that are fantasy fiction. There will be fashion-oriented coffee table books. And when I look at what Amazon recommends to me, there is nothing that I find of any interest. That’s because the missing piece of information here, the piece that big data is powerless to identify, is the fact that all of the above purchases were gifts for different people. The absence of context makes the recommendation engine fail. Except that maybe it reminds me that since I bought my sister a book for Christmas last year I should get something different this year. Not what Amazon is going for.
Publishers and Web-site operators use another kind of data-driven recommendation engine, one that recommends content supposedly of interest to the person reading the current page. Outbrain is one of the leading providers of this type of recommendation service, and CNN is one of its customers. I went to CNN and took a look at a story about the U.S. trying to combat the Islamic State. And then I looked at what stories Outbrain recommended at the bottom of the page. They included “Celebrities Who Have Turned Heads With Their Sense of Style” and “How Much House Can You Afford.” Are you kidding me? This is likely more about what Outbrain is getting paid to promote than it is about what their supposed recommendation engine finds to be contextually relevant.
When the data becomes the goal in itself
One area where the growth of big data offers some promise is in education. Standardized tests have been around since the SAT was created in the 1920s. But today’s standardized tests, and our ability to absorb, analyze and interpret the data that comes from them (especially if administered online), are starting to be used to evaluate teachers and school systems. The promise is that the data will tell us which school systems are underperforming and need help, and identify the best teachers and the worst.
My concern with that whole process is that the education our children get starts to be geared toward the tests. Considering how the tests are being used, are schools and teachers focusing on preparing students to score on them instead of focusing in the classroom on literature or math or science? I suspect that those teachers who are most insecure about their abilities would be the ones who have the most to lose and hence would be the most focused on assuring their students do well on these tests. That in turn probably makes them less effective as teachers even if they are able to get their students’ scores up.
The world of data that we have at our fingertips really does give us the potential to understand things we didn’t have access to before, to evaluate the way we do things to find the best approach and to manage more effectively whether in the shipping department or the classroom. But that potential is realized only when we use data in an appropriate way. It is not a substitute for vision, for effective decision making or for leadership. And if you don’t bring that to the table, no amount of data is going to be big enough.
(See also The Use and Misuse of Big Data)