Big Data

(This article appeared in the Anderson Valley Advertiser April 2012)

“Mathematics are well and good but nature keeps dragging us around by the nose.” Albert Einstein

A wintry April day—rain, cold, our two woodstoves hard at work translating matter into energy so we may carry on in comfort. Yesterday we celebrated the idea of spring, if not the reality, with the delivery of four cords of firewood from Frank’s Firewood of Boonville, so now several days of stacking wood are upon us. I am graduating from my seventh Mendocino winter, and Frank’s fantastic firewood has kept me snug and warm through every one of them. Thank you, Frank!

Yesterday also brought an email from a friend with the subject heading Data Plague, with a link to an article from the New York Times about Big Data, a hot topic in the world of computer science and technology. Big Data is the incomprehensibly large amount of raw data piling up from all electronic activities that leave digital traces, including scientific research and social media. For instance, every minute of every day some forty-eight hours of video are uploaded to YouTube: the equivalent of eight years of content each day.

According to the Big Data article, many people in government and academia and private industry are interested in mining this rapidly growing data universe, and President Obama has earmarked 200 million dollars for his Big Data Research and Development Initiative. And just last month the National Science Foundation awarded 10 million dollars to Berkeley’s A.M.P. Expedition, which stands for “algorithms machines people,” a team of U.C. Berkeley professors and graduate students working to advance Big Data analysis.

As usual, no one asked my opinion about any of this, but here are my thoughts on the intrinsic and extrinsic value of Big Data. Once upon a time there was this emperor, see, and he wasn’t actually wearing any clothes, but because he was the emperor everyone had to pretend he was wearing clothes even though he wasn’t.

“The man ignorant of mathematics will be increasingly limited in his grasp of the main forces of civilization.” John Kemeny

Stacking firewood, one might surmise, is something like trying to make sense of Big Data. There on the driveway (in cyberspace) is a huge jumble of firewood (pile of data) composed of many separate pieces of wood (bits of data). Over time, I will get all that wood neatly organized in eight or nine stacks in the woodshed, and over more time I will burn those stacks to heat our home. Meanwhile, the Big Data geeks will try to organize their ever-expanding pile of data bits (measured in petabytes, one million gigabytes, and exabytes, one billion gigabytes) and then…and then nothing.

“Still more astonishing is that world of rigorous fantasy we call mathematics.” Gregory Bateson

Eight years of YouTube video uploaded every day? That’s 240 years per month! Joe points his phone camera out the bus window as we make our way through Chinatown. Okay. Cool. Click, click. Uploaded to YouTube. Here are Margaret and Binny eating ice cream. Good. Click, click. Uploaded to YouTube. Ralph’s three-legged cat named Popsicle is eating a mouse. Ew! Click, click. Uploaded to YouTube. Becky’s Great Dane Buffy rolls on something dead. Hardee har har. Click, click. Uploaded to YouTube. Here are millions of videos of people looking into their cameras and making silly faces. Yes! Click, click. Uploaded to YouTube. And here is Zigmund Olafson, pulling down two hundred grand a year (of taxpayers’ money) as Permanently Visiting Professor of Theoretical Cyber Whatever at U.C. Berkeley running 1700 centuries of such stuff through a supercomputer in the basement of ADE (Algorithms Digest Emptiness) and after nine months of data digestion and crunching and analysis discovering that…kittens and puppies are cuter than heck!

“We’ll judge our success by whether we build a new paradigm of data.” Michael Franklin, director of A.M.P. Expedition.

A new paradigm of data? Puh-leez. How about a new paradigm of excellent and affordable healthcare for everybody? How about a new paradigm of equitable taxation? How about a new paradigm of funding our parks and schools? How about a new paradigm of peaceful resolution of conflicts? How about a new paradigm of closing all the insanely dangerous nuclear power plants and insulating our homes and solarizing every viable rooftop? How about a new paradigm of generosity and love? Oh, no. What we need is a new paradigm of data. And just what might that new paradigm of data look like? We have absolutely no idea, but we’ll let you know if we think we’re successful in building that paradigm after we’ve spent hundreds of billions of dollars, you know, feeding digital stuff into really fast computers. Okay? Cool. Click, click. Uploaded to YouTube.

“I don’t agree with mathematics; the sum total of zeros is a frightening figure.” Stanislaw J. Lec

Of my many unhappy experiences with publishers, one of the saddest had to do with a chunk of data that followed me around like the Hound of the Baskervilles and is no doubt following me still. This chunk of data suggests that my second, third, fourth, and fifth novels did not sell many copies. Never mind that the various publishers involved did absolutely nothing to promote or distribute my books, and in most cases suspended all support for the books before they were published. No, the data says the books did not sell, which translates in corporate parlance to “Todd does not sell.”

Being reminded of this damning data every time I approached an agent or publisher, I nevertheless continued to try to interest mainstream publishers in my work for many years, with and without the services of literary agents. Of course, agents are privy to this same database, and so I was a pariah to most of them. But eight years ago, shortly before moving to Mendocino, I succeeded in interesting an agent in representing my novel Bender’s Lover, a metaphysical love story comedy thriller set in San Francisco and having to do with music, friendship, and power. I warned this agent about the damning data that was following me, but she seemed undaunted. “After all,” she said, “those sales figures are over twenty years old and this book is so good that…”

She sent copies of my tome to fourteen editors in New York, eleven declining to consider the manuscript because of the aforementioned database. Three said they would give the book a read, and lo, a miracle occurred (or so we thought). A senior editor at Viking went mad for the book, called my agent with a fat offer, and asked that we all get together for a conference call the next day, which we did. My oh my, did we have fun, a ménage à trois love fest during which we designed the cover and cast the movie and read aloud our favorite parts from Bender’s Lover; and for the next forty-eight hours I believed the curse had finally been lifted from my career and I would at last be allowed to ascend to my rightful place in the pantheon of American novelists.

This delightful editor’s last words to me were, “I don’t anticipate any problems, since I have carte blanche here, but as a formality I do have to run this by a couple people in Sales and then I’ll call you with my formal offer. I cannot tell you how excited I am to be getting this book. It’s going to be huge.”

Alas, Sales nixed the deal because the data says Todd doesn’t sell, never mind how old the data or what the data is based on. Never mind anything except the raw little numbers, which in truth are miraculous for being more than zeros.

My agent’s voice was trembling as she gave me the sorry news, and then she took a deep breath and said, “So…under the circumstances, I don’t think there’s really any point in our continuing to work together. Do you?”

Cue the howling hound!

And that is just one of many reasons I do not care much for data, big or small.