When Big Data Comes Up Small

When you pretend the subjective is objective.

I first detected a slightly fishy smell in how data was used while hanging around human resources. At some point the old-fashioned employee evaluation, which involved having the boss sit down with the employee and discuss his or her performance, was replaced by online forms that you fill out staring at your screen rather than at your employee.

These software apps produce an evaluation that is numerical. So subjective evaluations of things like leadership or teamwork, attitude or aptitude, are converted into numbers on a form. That way calculations can be made comparing employees across departments or industries, the numbers can be used to determine promotions, firings or raises, and they can be used within an organization to stack rank employees. If your goal is to destroy the morale of your staff, no tool is more effective than stack ranking.

I’ve been involved in dozens, if not hundreds, of these evaluations. For example, I have had three different bosses evaluate my ability to communicate by plugging a number from 1 to 5 into a form. The results from the three were wildly different. Do you think that is because my ability to communicate went up and down like a roller coaster? Or did the form tell you more about the evaluator than the evaluated? If you do this on a companywide basis and have dozens of different evaluators trying to interpret behavior with statistics, what are you likely to find out? Perhaps that the software company that sold you that HR system took you for a ride.

If you want to let your imagination loose on what this could eventually look like, read Dave Eggers’ novel The Circle. In his story, the Circle is a massive technology company that evaluates its employees in real time on a scale of 1-100. If you think that is a good idea, this post is for you.

When the system is gamed.

There seems to be no more straightforward use of online data than to gauge the popularity and effectiveness of content, whether it is a news story, advertisement or Web page. By looking at what gets the most views you’ll find out what was most widely read. That is, unless you only want to consider views by human beings.

You can buy views for just about anything: online articles, ads, video. This is something that publishers have been found to do. They buy thousands of views, which makes it appear they have more reach than they really have, and then charge advertisers accordingly. What do advertisers get for their high rate? They get pinged thousands of times by robots.

Using crowdsourcing to produce numerical ratings for anything from restaurants to businesses to books again appears to be a pretty attractive application of data. Until you find out there are people in places like the Philippines getting paid a buck a shot to hype certain entities and trash others. Yelp at one time estimated that up to 25% of its reviews were fake (Digital Deception: Astroturfers).

Another application of big data that caught on quickly is using data to identify online influence. This is done through a calculation of followers, connections, likes, retweets, comments, etc. I’m not certain that actually tells you how influential someone is, but I can say with some certainty that if you’ve bought those followers and likes, and the Internet is full of offers to do just that, it really has nothing to do with influence.

The gold standard of online influencer ratings is Klout. As I mentioned in an earlier post (Digital Deception: The Illusion of Influence), my Klout score will go down if I’m travelling for a couple of weeks and ignore my social media accounts. But if I catch my dog in a cute pose, post it on Facebook, and get a few dozen likes, my Klout score goes up. Neither of these things has anything to do with whether or not I am influential.

When you need a little context

The one application of big data that most of us have experienced is the recommendation engine. Most commonly this is used on ecommerce sites. If you bought this, you might want this. Just like being prompted to get fries with your burger.

Amazon is well known for this. If I look at my Amazon account history, I’ll find purchases of books about architecture. There will be books intended for young readers that are fantasy fiction. There will be fashion-oriented coffee table books. And when I look at what Amazon recommends to me, there is nothing that I find of any interest. That’s because the missing piece of information here, the piece that big data is powerless to identify, is the fact that all of the above purchases were gifts for different people. The absence of context makes the recommendation engine fail. Except that maybe it reminds me that since I bought my sister a book for Christmas last year I should get something different this year. Not what Amazon is going for.

Publishers and Web site operators use another kind of data-driven recommendation engine, one that recommends content supposedly of interest to people reading the currently viewed page. Outbrain is one of the leading providers of this type of recommendation service, and CNN uses it. I went to CNN and took a look at this story about the U.S. trying to combat the Islamic State. Then I looked at what stories Outbrain recommended at the bottom of the page. They included “Celebrities Who Have Turned Heads With Their Sense of Style” and “How Much House Can You Afford.” Are you kidding me? This is likely more about what Outbrain is getting paid to promote than about what its supposed recommendation engine finds to be contextually relevant.

When the data becomes the goal in itself

One area where the growth of big data offers some promise is in education. Standardized tests have been around since the SAT was created in the 1920s. But today’s standardized tests, and our ability to absorb, analyze and interpret the data that comes from them (especially if administered online), are starting to be used to evaluate teachers and school systems. The promise is that it will tell us which school systems are underperforming and need help, and identify the best teachers and the worst.

My concern with that whole process is that the education our children get starts to be geared toward the tests. Considering how they are being used, are schools and teachers focusing on preparing students to score on these tests instead of focusing in the classroom on literature or math or science? I suspect that those teachers who are most insecure about their abilities would be the ones who have the most to lose and hence would be the most focused on assuring their students do well in these tests. That in turn probably makes them less effective as teachers even if they are able to get their students’ scores up.

The world of data that we have at our fingertips really does give us the potential to understand things we didn’t have access to before, to evaluate the way we do things to find the best approach and to manage more effectively whether in the shipping department or the classroom. But that potential is realized only when we use data in an appropriate way. It is not a substitute for vision, for effective decision making or for leadership. And if you don’t bring that to the table, no amount of data is going to be big enough.

(See also The Use and Misuse of Big Data)

This entry was posted in Technology.

21 Responses to When Big Data Comes Up Small

  1. Pingback: The Use and Misuse of Big Data | off the leash

  2. Lenie says:

    When I first started using Twitter, I would receive followers who would sell me 1,000 or more followers for just $5.00 or whatever, I always thought that was interesting because the people who were selling these ‘followers’ mostly only had a hundred or so followers themselves. Obviously, there was something fishy.


  3. jacquiegum says:

    Ken this is becoming more of an issue for me personally. I get truly angry because these gauges…these bars that we’re all meant to believe matter, are useless in the real world. But my fear is that the data is now becoming the real world. It’s like the worst hamster wheel of all. I feel that we are losing the vision…the good intent of is mutating because there seems to be no accountability when the data is compromised.


  4. patweber says:

    Ken I love your example of Amazon and why sometimes, big data doesn’t work. Sometimes, even with technology, the game is still not fair. I usually question data, and many times, find inconsistencies in it. Now you’ve warned us and given us the whys! Thanks.


  5. Arleen says:

    Interesting your post on data. Just recently it came out that when you rate a product, don’t always believe that they are really true ratings. Data can be misconstrued to work in your behalf or against you depending upon how you read it. If you rely on data you will lose your creativity and that is what made the baby boomers innovative. Creativity..


  6. This is a provocative post, Ken. It is ironic as well as poignant to give an online employee evaluation measuring communication skills. In terms of education, I think standardized testing has become extremely detrimental. Teachers teach to the test, not because they want to, but because they have to. Indeed, school principals can lose their jobs if students don’t score well. Yet, these tests do not test for creativity and the art education is continuously slashed.


  7. Have filled in several personality tests for different headhunting companies. The interesting thing is that my personality is different in each test. Whoever constructs any kind of test or evaluation to be filled in online determines the outcome. It’s bad luck if your big date comes out small, nothing else.


  8. Ken Dowell says:

    Great example. I’m sure your personality doesn’t change every time you take a test.


  9. jankedonna says:

    Lots of good points here. We have so much data available, but analyzing and understanding in context goes beyond compiling the numbers. The measure of online influence through the use of numbers that are easily manipulated or bought and the example you’ve cited with Amazon offering you selections they think you prefer because of your gift purchases highlight that.


  10. Donna Janake says:

    Lots of good points here. It is so easy to collect a lot of data these days, not so easy to truly understand what it means. I have a hard time accepting the measures of online influence when the indicators are so easily manipulated and bought.


  11. andleeb says:

    I think that different online personality tests are build to get insight into the skills of candidate. As I am linked with education from last 10 years and have made different tests or evolved in preparation of many such tests.
    Between I was totally unaware of the fact that views are bought to get into some business.


  12. Very true post about a sad issue that affects us in many ways. Sometimes the only way to make a decision is to read reviews, and then those reviews are fake? That really sucks. And as for reducing personality evaluation to online number crunching you are right that this does not achieve the intended objective nor produce consistent results.


  13. I hate when I’m recommended something by a site it is always completely wrong. Klout is confusing my score goes down even when I have a large amount of activity. Job evaluations are senseless. All this filling out and being rated by a number to accomplish what?


  14. Too bad there isn’t a better way to provide quality control with data – but then, I guess that’s the appeal for some companies. I know that for a hundred bucks or so, you can actually buy a couple of thousand twitter followers. Not sure what purpose that serves?!


    • Ken Dowell says:

      It doesn’t really serve any purpose Krystyna other than to create the appearance of having a broader reach than you really have. It isn’t just random egomaniacs that buy followers, it is often celebrities, politicians and even publishers.


  15. Tim says:

    As always I love your posts Ken. I have often had a chuckle when looking at my Amazon account and there is something recommended for me based on something I searched for once, a long time ago, for a reason I no longer remember. Big data has its flaws that’s for sure and to be honest I am not a big fan of having absolutely everything quantifiable. Being scored by someone else, or a group of someone elses, really does bring with it so many outside influences that by day’s end it has little to do with you and more to do with the evaluator. Great post.


  16. Pat Amsden says:

    The truth is that same type of stack evaluating has been used to evaluate teachers in every course I’ve ever taken in the last twenty years. I never feel it gives a true idea of what the teacher is like. Very few people will give 1’s on a 1-10 scale of effectiveness. Ditto 10. People tend to go to the middle which doesn’t strike me very useful. The same goes for product evaluations.

    The truth is as a boss or business owner I would be far more swayed by someone who raved about the service someone gave them unasked. And I’d definitely be looking into something a client or customer took the trouble to phone or talk to me about.


  17. Meredith says:

    This does NOT make me miss my old career in HR! You make some really good points here, that I’ve often wondered about, at least on a subconscious level about the subjectivity of data.


  18. Eileen says:

    Hi Ken! I have been in a dilemma lately because of my Alexa rank that continues to drop. After reading your post, I don’t care about it anymore. Those numbers represent something I am not. As long as I continue to write good content, as long as people read my posts, I am happy.


  19. Those employee evaluations should have a scoring system or coding system so that it wouldn’t matter who was doing the evaluation the score would be the same. That’s how it’s supposed to work. The problem is that many businesses don’t take the time or effort required to adopt a valid/reliable evaluation system. They just want to show that they have an evaluation system and they are following it. Dog and Pony Data. Data can be so powerful and informative, unfortunately it’s often skewed or misrepresented by interested parties.


  20. Now I wonder if Angie’s List is objective. As for standardized testing, I doubt that the same system measuring students’ performance will effectively assess a teacher’s. So much for big data, aka hype.

