Big Data Makes for Big Sci-Fi Plots

We’re fans of science fiction, and of its conversion into science fact, around here. Ray Bradbury has written that “science fiction is the most important literature in the history of the world because it’s the history of ideas, the history of our civilization birthing itself . . . Science fiction is central to everything we’ve ever done, and people who make fun of science fiction writers don’t know what they’re talking about.” It’s in that notion of “civilization birthing itself” that we find plotlines dealing with mass information and the various ways humans use it, for better or worse. Look at the concepts and plotlines of science fiction stories in which societies crunch mountains of data and you find stories that span thousands, millions, or billions of years; stories that juggle simultaneous alternate realities; stories of unprecedented intersystem contact; and more. Big data accompanies big ideas and big leaps in intergalactic evolution.

This post looks at the role of big data in science fiction plots, though big data as the plot is relatively rare; the exceptions discussed here are noteworthy. Almost four years ago, James Bridle published a story called “The End of Big Data” on Vice, with art by Gustavo Torres. “It’s the world after personal data,” reads the synopsis. Most entities are forbidden from holding any identifying information: “No servers, no search records, no social, no surveillance.” The ban is enforced by satellites that constantly monitor the planet to “make sure the data centers are turned off—and stay off.” Bridle’s story imagines a different kind of fictional future, and it’s unusual in that we typically expect science fiction to push limits forward, not back.

But this is sci-fi about policy and law, not just tech. The story features scenes of data cops busting data pirates, because naturally, if you make something illegal, you create a black market for it. The pirates move the data, collecting it in “receiver horns” the way rainwater is collected in a catchment system, and shipping it in physical containers: “Sealed and tagged, these containerized data stores could be shipped anonymously to brokers in India and Malaysia, just a few more boxes among millions, packed with transistors instead of scrap metal and plastic toys.” Bridle describes the mundane and sometimes challenging life of a data cop, surveilling the whole planet through satellite monitoring and, sure, the collection of data; the sovereign must stand outside the law in order to enforce it, of course.

The black market forms because “Data is power,” as the main character observes. Of course, as Kelly Faircloth points out in another post, the science fiction world is already here: dating sites that can predict when a potential match is lying, social networks that already know who you know, extremely fast and efficient Orwellian surveillance. And big data has both positive and nefarious uses. It can help you sleep better by collecting and processing sleep data, from REM patterns to body positions to ambient noise. It can facilitate the early detection of natural disasters, epidemics, and acts of violence.

Here’s a debate you can spring on your friends or students. Resolved: The benefits of big data’s early epidemiological and medical detection of deadly diseases outweigh its detrimental effects on privacy and civic life. Because in science fiction, as in public policy, the question is who is using the data and what their intentions are.

Paul Bricman blogs about three works of science fiction with data-driven plots and in doing so raises an interesting point. We hear a lot about the dystopian or tragic use of big data in sci-fi (and in real life). Bricman’s analysis includes (as it should) The Minority Report by Philip K. Dick, in which cops bust people for crimes they haven’t committed and arguably might never commit. Minority Report is a classic example of data dystopianism, but Bricman also includes two works that offer a more optimistic, or at least hopeful, view: the Foundation series by Isaac Asimov, in which the mathematician Hari Seldon develops a “mathematical sociology” to save his civilization from ruin, playing a very long, data-driven game to do so; and The Three-Body Problem by Cixin Liu, which turns on the efforts of a cooperative organization spanning two civilizations, Earth and Trisolaris, to solve the problems created by the latter’s three suns. Their methods include “genetic algorithms applied to movement equations” and the development of “an in-game computer based on millions of medieval soldiers which emulate transistors and other electrical components” in order to predict the behavior of the Trisolaris system’s three suns. Look hard enough and you’ll find stories about the good that data crunchers and philosophers of data can do.
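
The three-body problem Liu builds his novel around is a real one: the motion of three mutually gravitating bodies has no general closed-form solution and is, in general, chaotic, so tiny errors in the starting measurements swamp any long-range forecast. As a rough illustration only (a toy sketch in plain Python with made-up masses and starting positions; nothing here comes from the novel or from Bricman’s post), the snippet below integrates two nearly identical three-sun systems and reports how far apart they drift, the kind of sensitivity that defeats prediction in the story.

```python
# Toy illustration of three-body sensitivity to initial conditions.
# Three equal-mass bodies under Newtonian gravity, integrated with a simple
# leapfrog (kick-drift-kick) scheme. All values (G = 1, masses, positions,
# velocities) are arbitrary illustrative choices, not taken from the novel.

import math

G = 1.0  # gravitational constant in arbitrary units


def accelerations(pos, masses):
    """Net gravitational acceleration on each body from the other two."""
    acc = [[0.0, 0.0] for _ in pos]
    for i, (xi, yi) in enumerate(pos):
        for j, (xj, yj) in enumerate(pos):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            r = math.hypot(dx, dy)
            acc[i][0] += G * masses[j] * dx / r**3
            acc[i][1] += G * masses[j] * dy / r**3
    return acc


def simulate(positions, velocities, masses, dt=0.001, steps=20000):
    """Integrate the system forward and return the final positions."""
    pos = [list(p) for p in positions]
    vel = [list(v) for v in velocities]
    acc = accelerations(pos, masses)
    for _ in range(steps):
        for i in range(len(pos)):  # half kick
            vel[i][0] += 0.5 * dt * acc[i][0]
            vel[i][1] += 0.5 * dt * acc[i][1]
        for i in range(len(pos)):  # drift
            pos[i][0] += dt * vel[i][0]
            pos[i][1] += dt * vel[i][1]
        acc = accelerations(pos, masses)
        for i in range(len(pos)):  # half kick
            vel[i][0] += 0.5 * dt * acc[i][0]
            vel[i][1] += 0.5 * dt * acc[i][1]
    return pos


masses = [1.0, 1.0, 1.0]
start = [(-1.0, 0.0), (1.0, 0.0), (0.0, 0.8)]
vels = [(0.0, -0.4), (0.0, 0.4), (0.3, 0.0)]

# The same system, with one sun nudged by one part in a million.
nudged = [(-1.0, 0.0), (1.0, 1e-6), (0.0, 0.8)]

a = simulate(start, vels, masses)
b = simulate(nudged, vels, masses)
drift = max(math.hypot(pa[0] - pb[0], pa[1] - pb[1]) for pa, pb in zip(a, b))
print(f"Largest positional divergence after 20,000 steps: {drift:.4f}")
```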

These are all works written after the “golden age” of science fiction, which ran from the late 1930s to the late 1940s; most data-driven sci-fi emerged in the 1960s and 1970s. But there’s a special place for Olaf Stapledon (who published just prior to the golden age) in any analysis of metadata-based sci-fi, because Stapledon’s work was all about drawing the broadest possible abstractions and generalizations from virtually infinite fields of data across the universe. Stapledon wrote Last and First Men, a “future history” novel, in 1930, and followed it with Last Men in London and Star Maker, the latter being Stapledon’s history of the entire universe. Last and First Men encompasses the history of humanity over the next two billion years, detailing the evolution of eighteen iterations of the human species beginning with our own. Stapledon anticipated both genetic engineering and the “hivemind” or “supermind.” The data processing involved in mapping and describing this evolution may not fit neatly into the categories we use when appending lead data with our client Accurate Append (an email and phone contact data vendor), but Stapledon’s work beautifully, sometimes almost poetically, captures such wide perspectives. It’s impossible to collect and process information on thousands or millions of humans without feeling that some kind of collective consciousness, elusive but real, is at work.

Huxley, Big Data, and the Artistic Mind

Imagine you’re a famous painter, but in an effort to get with the times, you crowd-source your newest painting. You make a big show of it, soliciting ideas creatively, bringing people into your studio (and sharing the encounters on social media), maybe even hosting focus groups to discuss themes, content, and forms. You finish a painting that is the product of a dialogical exchange with your audience.

Most thoughtful art scholars and critics would call that a pretty creative artistic endeavor, and because you facilitated it, you’d get credit as the artist, even though the intent of the piece’s “performative,” creative aspect was to de-center you as the artist.

What if you were a musician and did something similar? Maybe you’d invite audience members to hum into a recording device and then mix the sounds into what you believe to be an optimal, if somewhat discordant, combination. Obviously, this would be considered a creative act on your part as well. The inclusion of audience suggestions is part of the aesthetic experience. It calls artistic individuality into question, making an important philosophical point that you, the individual artist (see what I did there), get credit for developing and illustrating.

These innovative gestures blur the line between artist and audience, but they don’t raise the kinds of concerns that have been raised about the relationship between big data and artistic expression. In the forthcoming book Beyond the Valley: How Innovators around the World are Overcoming Inequality and Creating the Technologies of Tomorrow, Ramesh Srinivasan explores more troubling technological questions: fashion designers whose creative work is based on algorithms built by consolidating millions of users’ preferences, for example, or art and music created specifically to appeal to the semiconscious desires of listeners. In those cases the data that goes into the work is mass data, not individual aesthetic experience or individual expressions of desire.

Srinivasan seems mostly concerned that big data produces “undemocratic” outcomes, but I don’t think he means this in a strictly political sense. I think he means that democracy carries a certain expectation of self-consciousness. Even the experimental collaborative art and music I imagined earlier is self-consciously participatory. The participants know they are helping create something, and are intentionally contributing. This isn’t the case when the list of acceptable preferences distilled and given to artists is based on millions of pieces of data. 

This is not to say that big data can’t or shouldn’t play a role in developing products with aesthetic value, like furniture or clothing. That seems like a legitimate form of product development, and there’s nothing inherently wrong with it. But it would be a mistake to call it “artistic” without a radical redefinition of the word, because the “art” in it is not conscious or intentional in the sense that an individual artist’s painting, or even a collectively written piece of ensemble theater, is.

This depersonalization of aesthetics through big data was anticipated by Aldous Huxley (whom Srinivasan cites in his book) in works like Brave New World. Huxley was concerned about industries, politics, and other endeavors lacking transparency: transparency not just in the sense that stakeholders can see the decisions being made, but that they can participate in them too. “Whatever passes for transparency today seems one-directional,” Srinivasan writes. “Tech companies know everything about us, but we know almost nothing about them. So how can we be sure that the information they feed us hasn’t been skewed and framed based on funding models and private news industry politics?”

Huxley, who died 56 years ago on the same day as JFK, November 22, 1963, has developed a reputation as a scathing critic of industrial society, but he was more than that. His thoughtfulness about ethics wasn’t abstract: when he became relatively wealthy working as a Hollywood screenwriter in the 1930s and 40s (he’d immigrated to the U.S. and settled in Southern California), Huxley used a great deal of that money to transport Jews and left-wing writers and artists from Europe to the United States, where they would be safe from fascism. Huxley saw the threat of depersonalized and depersonalizing technology as “an ordered universe in a world of planless incoherence.” That’s not far from how big data skeptics describe the data industry. 

In a recent Washington Post piece, Catherine Rampell echoes these concerns. The “vast troves of data on consumer preferences” owned by large firms are largely collected surreptitiously. “There are philosophical questions,” she writes, “about who should get credit for an artistic work if it was conjured not solely through human imagination but rather by reflecting and remixing customer data.” 

If big data’s fashion or informational choices are alienated from the conscious preferences of audiences and consumers, there is at least one theory of art holding that artists themselves should be consciously removed from the preferences of the audiences who appreciate their work. It’s the “theory of obscurity,” made (somewhat) famous by the San Francisco avant-garde band The Residents, who borrowed it from N. Senada, a musician who may not have existed as such. The theory holds that “an artist can only produce pure art when the expectations and influences of the outside world are not taken into consideration.” In other words, a true artist can’t worry about what the audience thinks. N. Senada had a corollary, the “theory of phonetic organization,” which holds that “the musician should put the sounds first, building the music up from [them] rather than developing the music, then working down to the sounds that make it up.” Big data aggregation could play no meaningful role in such work. Perhaps listening to avant-garde music is the best way to avoid being assimilated into a giant cybernetic vat of data goo. But I doubt that solution could be implemented universally, since very few people enjoy that music.