Unless you’ve been hiding in a bunker for the last few years (and who could blame you if you were?), you know that data science, big data, and machine learning are all the rage. And you know that the NSA has gotten scary good at surveilling the world via its data-parsing mojo.
The Virtues of Being Anonymous
These trends have overturned, or at least added a whole new wrinkle to, the concern so prevalent when I was a kid: that individuals in modern societies were becoming faceless numbers in an uncaring machine. Faceless number? These days, a lot of people aspire to that. They turn to the likes of the search engine DuckDuckGo in the hope of reverting to being just a blip of anonymous bits lost amid mighty currents of data.
Well, unless you’re willing to live off the grid — or get almost obsessively serious about using encryption tools such as PGP — you’ll just have to dream your grand dreams of obscurity. Even if we somehow rein in the U.S. government, businesses will be doing their own “surveilling” for the foreseeable future.
But look on the bright side. From one perspective, there’s been progress. You’re not just a number these days, you’re a whole vector, or maybe even a matrix — with possible aspirations of becoming a data frame.
No, this isn’t an allusion to the disease vectors that have become such a hot topic during the pandemic. The statheads among you may recognize those classifications as belonging to the statistical programming language R, which vies with Python for the title of best data science language.
What’s a Vector?
In R’s parlance, a vector is “a single entity consisting of a collection of things.” I love the sheer all-encompassing vagueness of that definition. After all, it could apply to me or you, our dogs or cats, or even our smartphones.
But in R, a vector tends to be a grouping of numbers or other values of the same type (character strings, say) that can, if needed, be acted on en masse by the program. It’s a mighty handy tool. With just a couple of keystrokes, you can take one enormous string of numbers and work on them all simultaneously (for example, by multiplying them all by another string of numbers, plugging them all into the same formula, or turning them into a table). It’s just easier to breathe life into the data this way. It’s what Mickey Mouse would have brandished if he were a statistician’s rather than a sorcerer’s apprentice in Fantasia.
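To make "acted on en masse" a little more concrete, here is a rough sketch in plain Python rather than R (in R itself, the multiplication would be a one-liner along the lines of prices * quantities). The variable names and every number are invented for illustration; treat it as an analogy for how vectorized work feels, not as R's actual machinery.

```python
# Plain-Python stand-in for R-style vectorized arithmetic.
# All names and numbers here are made up for illustration.
prices = [2.50, 4.00, 10.00]
quantities = [4, 1, 3]

# Multiply one "vector" by another, element by element.
totals = [p * q for p, q in zip(prices, quantities)]

# Plug every element into the same formula (here, a hypothetical 8% tax).
with_tax = [round(p * 1.08, 2) for p in prices]

print(totals)
print(with_tax)
```

The point is that nobody writes a separate instruction for each number. One expression covers the whole collection, whether it holds three values or three million.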
Now imagine yourself as a vector or, at least, as being represented by a vector. Your age, height, weight, cholesterol numbers and recent blood work all become a vector, one your doctor can peruse and analyze with interest. Meanwhile, your purchasing habits, credit rating, income estimates, level of education and other factors are another vector that retail and financial organizations want to tap into. To those vectors could be added many more until they become one super-sized vector with your name on it.
Now, glom your vectors together with millions of other people’s vectors, and you’ve got one huge, honking, semi-cohesive collection of potentially valuable information. With it, you and others can, like Pinky and the Brain, take over the world! Or at least sell a lot more toothpaste and trucks.
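For the curious, the glomming can be sketched in a few lines of plain Python. This is only a loose analogy for what R does when it binds equal-length vectors into a data frame, and the people, ages, and toothpaste counts are all hypothetical.

```python
# Hypothetical people and values, invented purely for illustration.
names = ["Ana", "Ben", "Chi"]
ages = [34, 51, 28]
toothpaste_tubes_per_year = [6, 4, 9]

# Glue the column "vectors" together into rows, one record per person.
table = [
    {"name": n, "age": a, "tubes": t}
    for n, a, t in zip(names, ages, toothpaste_tubes_per_year)
]

# Ask questions of the whole collection at once.
heavy_buyers = [row["name"] for row in table if row["tubes"] > 5]
average_age = sum(ages) / len(ages)

print(heavy_buyers)
print(average_age)
```

Scale those three rows up to millions and you have the kind of table a toothpaste marketer dreams about.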
Three Choices
The bottom line is that we have three basic choices in this emerging Age of Vectors:
Ignore It: Most folks will opt for this one, being too busy or bored for the whole “big data” hoopla. Yes, they know people are collecting tons of data about them, but who cares? As long as it doesn’t mess up their lives in some way (as in identity theft), this is just a trend they can dismiss, worrying about it case by case when it directly affects them.
Fight the Power: If you don’t want to be vectorized — or if you at least want to limit the degree to which you are — you can try every trick in the book to keep yourself off the radar of the many would-be private and public data-hunters who want to dig through your data-spoor in their quest to track your habits (either as an individual or as part of a larger herd).
Use the Vector, Luke: Some will gladly try to harness the power of the vector, both professionally and personally. They’ll try to squeeze every ounce of utility out of recommendation engines, work assiduously to enhance their social media rankings, and try to leverage every data collection/presentation service out there to boost their credit ratings, get offered better jobs, or win hearts (or other stuff) on dating sites. They will certainly wield vectors at work for the purpose of predictive analytics. They may even turn the vector scalpel inward with the goal of “hacking themselves” into better people, like the Quantified Selfers who want to gain “self knowledge through numbers.”
That’s not to say that we can’t pick and choose some aspects of each of these three basic strategies. For instance, I’m just not cut out for the quantified-self game, being just too data-apathetic (let’s say a 7 on a scale of 10) to quantify my life. But, when it comes to analyzing other stuff, from labor data to survey findings to insects in my backyard, I’m all in, willing and ready to use the Force of the Vector. Now, I just have to figure out where I misplaced my statistical lightsaber…
Featured image from IkamusumeFan - Plot SVG using text editor.