Who Will Be the Data Keepers?

As Big Data grows bigger, more complicated, and more difficult to deal with, the choice of which data get retained and which discarded becomes pivotal.

June 12, 2014

CFO.com | US

John Parkinson

Some years ago, around 2000, when I was running a research program on the then rapidly growing digital economy, I had a team working on estimating how much of the total of human experience was being captured and stored digitally. There were about 6 billion people in the world at the time, with maybe 1 billion “online” in some fashion. This was before smartphones (as we now think of them) and tablets, so there were lots of documents, emails, texts and images, but relatively little video. With some reasonable assumptions about usage, some fairly comprehensive secondary research on stored data volumes and some modeling, we came up with a number: around 1 percent of human experience was being captured digitally.

Back then, therefore, 99 percent of human experience was either being lost or was confined to human memory (not a great long-term storage mechanism, for accuracy or speed of recall). Roll the calendar forward to 2014 and we are creating and storing much more information. I recently saw an estimate of more than 350 gigabytes a year for every online individual, plus another 350 gigabytes of associated data — log files, account information and so on. With 3 billion online users by 2015, that’s a lot of data — no wonder it’s called “Big”!
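To put a rough number on “a lot,” the estimate above can be multiplied out. This is only a back-of-the-envelope sketch: the per-user and user-count figures are the ones quoted in the paragraph, while the decimal units (1 gigabyte = 10**9 bytes) and the function name are assumptions made for illustration.

```python
# Figures quoted in the article. Decimal units are an assumption.
GB = 10**9    # bytes in a gigabyte
ZB = 10**21   # bytes in a zettabyte

personal_gb_per_user = 350       # data created per online individual per year
associated_gb_per_user = 350     # log files, account information and so on
online_users = 3_000_000_000     # online users expected by 2015


def yearly_volume_bytes(users, personal_gb, associated_gb):
    """Total bytes created per year across all online users."""
    return users * (personal_gb + associated_gb) * GB


total_bytes = yearly_volume_bytes(
    online_users, personal_gb_per_user, associated_gb_per_user
)
total_zettabytes = total_bytes / ZB
print(f"{total_zettabytes:.1f} zettabytes per year")
```

On these assumptions the total comes to roughly 2 zettabytes a year, before counting any duplicate or backup copies.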

And it’s growing every year as we collect more of each individual’s experience and add more connected users. Back in 2000, when we were doing the original research, we estimated that at the then-current rate of growth, even if we used just one atom to store each “bit” of data and there were no duplicate copies, we’d run out of atoms sometime before 2020, and as far as I can tell, we’re still on track to do so. All those atoms cost money and can’t be used for anything else, so left alone, Big Data will eventually “eat the entire world,” or at least the entire budget.

But the total of human digital experience isn’t the sum of each individual’s. There are masses of duplication: intersecting and overlapping viewpoints, and “shared” experiences, where viewpoints may differ but the view is the same for everyone. As challenging as it may seem (and will be), we’re going to have to start reducing the amount of duplication in the stored data (or get more bits per atom).

And then there are all the things that don’t change much or at all from moment to moment (or day to day, or year to year). If the view is always the same, we can “edit” it out of the data and replace it with a “tag” that links each view to the first (that’s essentially how compression software works to reduce the size of large files). If we do this well, we can reduce the size of the stored data by more than 80 percent and still keep enough to recreate every scene as it actually happened from the viewpoint of everyone who was involved.
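The “tag” idea can be sketched in a few lines. This is a minimal illustration, not a description of any particular product: it assumes each view arrives as a block of bytes, uses a SHA-256 hash as the tag, and works on made-up sample data. The function names are hypothetical.

```python
import hashlib


def deduplicate(frames):
    """Keep each distinct frame once; represent every frame by its tag."""
    store = {}   # tag -> frame bytes, stored a single time
    tags = []    # one tag per original frame, in the original order
    for frame in frames:
        tag = hashlib.sha256(frame).hexdigest()
        store.setdefault(tag, frame)   # only the first copy is kept
        tags.append(tag)
    return store, tags


def reconstruct(store, tags):
    """Rebuild the full sequence of frames from the tags."""
    return [store[tag] for tag in tags]


# Made-up sample: a view that does not change for 19 moments, then one that does.
static_view = b"same street, same light" * 100
changed_view = b"a car drives past" * 100
frames = [static_view] * 19 + [changed_view]

store, tags = deduplicate(frames)
raw_size = sum(len(frame) for frame in frames)
stored_size = sum(len(frame) for frame in store.values()) + sum(len(t) for t in tags)
reduction = 1 - stored_size / raw_size
print(f"stored {len(store)} distinct frames, reduction {reduction:.0%}")
```

Every original frame can still be rebuilt from the tags. How much space is saved depends entirely on how repetitive the input is, so the figure printed here says nothing about real-world data.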

(OK, for the math purists among you, I know some things get larger when processed by “lossless” compression algorithms, but in the real world, where there’s a lot of “well behaved” and static data, the reduction percentage is a pretty good target.)
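Both halves of that caveat are easy to see with a general-purpose lossless compressor. The sketch below uses Python’s standard zlib module on two made-up inputs: a highly repetitive one standing in for “well behaved” data, and random bytes standing in for data with no pattern at all. The sample sizes are arbitrary assumptions.

```python
import os
import zlib

# Made-up inputs of equal size: one static and repetitive, one patternless.
repetitive = b"nothing changed in this view. " * 10_000
random_like = os.urandom(len(repetitive))


def compressed_ratio(data):
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data, 9)) / len(data)


ratio_repetitive = compressed_ratio(repetitive)
ratio_random = compressed_ratio(random_like)

print(f"repetitive data: {ratio_repetitive:.4f} of original size")
print(f"random data:     {ratio_random:.4f} of original size")
```

The repetitive input shrinks to a small fraction of its size, while the random input comes out slightly larger than it went in, because the compressor has to add its own bookkeeping and finds nothing to remove.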

So we can probably push out the day when we will run out of atoms to store our bits by a couple of decades (maybe), but sooner or later we are going to run out. At which point “curation” strategies will become really important. Just what should be kept? Who and what gets edited out and essentially forgotten?

If this seems unfair or unreasonable, remember that throughout history almost everything that happened has been forgotten. Only a tiny fraction of the total of human experience made it from generation to generation — especially before the invention of the printing press. Something had to be pretty important to get remembered — and even so, plenty of pretty important things weren’t. Many great ideas were lost and had to be rediscovered — and it’s probable that some remain lost to this day.

So curation will matter. So will who gets to be a curator.

And then there’s the time factor. The closer we get to recording all of human experience, the less time there is to go back and review what we recorded. Today, we can use the huge gaps in the total recorded and stored experience to watch what happened to others, real or imagined. But at close to 100 percent experience capture, there’ll be no time to do so. We will be living only forward. And if we can’t ever go back and review the past (because by doing so we will miss being part of the present), why bother to record everything in the first place?

Finally, there’s entropy — which you can think of as the propensity for organized things to self-randomize over time. The more bits we store, the more bits will be randomly flipping from one to zero or vice versa, unless we watch them to make sure they don’t. (We have to keep adding orderliness to the total system to counter the inclination to randomness.) But the more bits we store, the more time we need to check for errors and the less time we have to do so. At some point, we’re going to be doing damage just with the checking process, which is also part of the entropic environment. Eventually, if the curators don’t delete you, entropy will.
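“Watching” the bits usually means storing a checksum alongside the data and periodically recomputing it. The sketch below is one simple way to do that, assuming SHA-256 checksums and a tiny in-memory set of records; the record names, contents and function names are hypothetical, and the bit flip is simulated by hand.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Fingerprint of the data as it was when first stored."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted (simulated decay)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def scrub(records):
    """Recheck every record; return the names whose data no longer matches."""
    return [
        name
        for name, (data, digest) in records.items()
        if checksum(data) != digest
    ]


# Hypothetical records: each holds the data plus the checksum taken at storage time.
records = {
    "memo": (b"quarterly forecast", checksum(b"quarterly forecast")),
    "photo": (b"family picnic", checksum(b"family picnic")),
}

# Simulate one random bit flip in the stored photo.
data, digest = records["photo"]
records["photo"] = (flip_bit(data, 3), digest)

damaged = scrub(records)
print("damaged records:", damaged)
```

Note that the scrub has to read every stored byte each time it runs, which is the cost the paragraph above describes: the more that is kept, the longer each full check takes.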

The Big Data frenzy we are experiencing today is just the tip of the iceberg. It’s going to get bigger, more complicated and more difficult to deal with. Not a pretty picture, even with all the claimed benefits.

And always remember Sturgeon’s Law: 90 percent of everything is, in general, crud. Which specific 90 percent depends on your point of view. Better start training as a curator. So you get to decide which points of view matter.

John Parkinson is an affiliate partner at Waterstone Management Group in Chicago. He has been a global business and technology executive and a strategist for more than 35 years.

