Big Data Viz
Written response to Shirley Wu's hong kong artists, women
Data visualisation is as old as the oldest cave paintings, or making maps from the stars. Its contemporary format, however, is the one most of us are familiar with: ‘data viz’. This is computer-rendered, visualised information. More precisely, it is data that is aestheticised in the style of scientific or mathematical charts, graphs, plots, and arrays, with the aim of swiftly elucidating relationships between data points to viewers. Logically, then, it is most commonly found in corporately influenced presentations and pitch decks, annual reports, and trend forecasts from the sciences to government sectors, as well as in news articles, which can even seem incomplete if data viz is absent.
Data viz has become nearly ubiquitous in roughly a decade: a fact of life, like body scanners at airport security checkpoints. But it is difficult to trace how and why this happened, in part because there is no contemporary history of data visualisation. This isn’t to say there aren’t resources on the subject. Most of them are technical (i.e. ‘This is an example of good design, and this is how you make it’) or staunchly located in their present and, it should be said, laudatory of the format. The most influential may be Information Is Beautiful (2009) by David McCandless, which argues for new visual forms that could help us relate to the massive amount of information with which we are being ‘bombarded’ every day. It spurred an entire movement of data viz artists, start-ups, and conferences, intent on creating data-driven visuals that ‘reveal how the world really works’.
Unsurprisingly, data viz grew alongside ‘big data’—the collection, storage, and analysis of information from digital transfers, keystrokes, clicks, and computational processes—and resonates with its logic that more data equals more knowledge; or, the more data, the better. The two seem to be integral to each other (something like ‘big data viz’) yet paradoxically at odds: as more data is collected every day, we are supposedly gaining invaluable insights, which computational processing enables us to analyse with a degree of accuracy that would be impossible at a human scale. At the same time, such large amounts of data are confusing in raw form. Data viz therefore helps distil this information and make it legible. Just as those who collect data believe they are gleaning more about the world, those who promulgate and design data viz believe they are offering clarity. But if more data is better, the question is, for whom?
In what seems to be the only critical piece of writing related to data viz, media theorist and computer programmer Alexander R. Galloway reflects on the stifling amount of visual information represented in a PowerPoint slide, writing:
Unlike realism in painting or photography, wherein an increase in technical detail tends to bring a heightened sense of reality (at least in the traditional definition of aesthetic realism that has held sway since the Renaissance), the high level of technical detail visible here overwhelms the human sensorium, attenuating the viewer’s sense of reality. Rather, like a fractal whose complexity does not decrease when viewed through a magnifying glass, the information contained in the slide does not grow more coherent the longer one inspects it. Eschewing lucidity, the diagram withdraws from the viewer’s grasp, effectively neutering its capacity as a vehicle for information. One is left wondering what exactly the slide is meant to communicate.
What does such an overwhelming amount of detail obfuscate rather than reveal? What does data viz hide in plain sight?
Data viz has become especially pervasive in journalism, in a public way and with a public purpose, relative to NDA–sealed investor meetings, badged-off conference rooms, and invite-only or high-priced academic or technical conferences.
In 2008, one year before Information Is Beautiful was published, Nate Silver launched his polling analytics blog, FiveThirtyEight, which rose to prominence during the United States presidential election that same year. Its accuracy accelerated its popularity, and by 2010 it was licensed to the New York Times. The Times is arguably still the ‘newspaper of record’, and its integration of statistical infographics into its stories set a precedent throughout the industry. It was around this time that Apple introduced the first iPhone (2007), WebGL was released (2011), and mobile data transfer was becoming more rapid, all of which enabled swifter uploading, loading, and sharing of graphically dynamic content.
The contemporary drive for ‘objective’ truth in journalism enabled by data can be traced to Philip Meyer’s ‘precision journalism’, which originated in the late 1960s. Following his Nieman Journalism Fellowship at Harvard (1966–1967), Meyer deployed the social science methodologies that were the subject of his research to conduct citizen surveys about the Detroit Riot and its underlying causes, contributing to the Detroit Free Press’s coverage. Concurrently, developments in digital computing technologies facilitated faster and more accurate data collection at a larger scale. Meyer’s initial interest in this approach, however, stemmed from his observation that politicians were using survey research in their campaigns to anticipate and sway voter opinion. His techniques would go on to influence generations of reporters, but Meyer was aware that these methodologies and their reliance on data could be deployed deceptively; Donald Trump’s 2016 election with the aid of Facebook and Cambridge Analytica is all too indicative.
Today, in what is colloquially called the post-truth era, we question the nature of ‘facts’ perhaps more than ever before. Philosopher Bruno Latour, reflecting on what has now become a common practice of deconstructing reality, argues that ‘facts remain robust only when they are supported by a common culture’. Arguably, data (and data viz) has been able to maintain its position because of the deeply embedded cultural construct, dating to Platonic times, that numbers are objective. Moreover, the computational processes that enable us to arrive at results and visualise these data sets are now perceived as more accurate than human thought, even when we know that the computer is wrong. Take the 2018 study from the Harvard Business School ‘Algorithm Appreciation: People Prefer Algorithmic to Human Judgment’, in which the researchers, through a series of six experiments, demonstrate that ‘people adhere more to advice when they think it comes from an algorithm than a person’. Those who deploy computational thinking, from Silicon Valley founders to STEM advocates, are specifically tapping into these cultural biases when they speak of techno-solutionism and present data as though it is in and of itself revelatory.
Blatant power dynamics in data viz reinforce this logic. Since its design is arbitrary, the ‘author’ of a visualisation is the only one who truly knows how to read the graphic. The reader is dependent on the author’s interpretation, rendering their own interpretation impossible. This is compounded by the content, derived from data and numbers, and its aesthetics, from science and mathematics, both of which give the impression that the visualisation is objective. The myth that data is pure is perpetuated even when it is fallible in both its collection methods and its intentions. Overwhelming levels of detail are like visual defence mechanisms, creating an impenetrable barrier while making the visualisation essential as an explanatory device (recall Galloway’s passage from above).
Data journalist Maddy Varner, of the newly launched non-profit newsroom and publication The Markup, likens the experience of reading a data viz to reading a book review without the book, but even more harmful, since it is designed to be accepted as fact. When we are unable to comprehend (and what’s more, made to feel ignorant for not understanding), how are we meant to contribute?
The promise of a piece like Shirley Wu’s hong kong artists, women (2020) is that it was designed to illuminate what is missing in the data set, underscoring the problems of data collection and entry. Its aesthetics reveal this as much as the content itself. Wu offers a serene mountain range with a virtual trail that encourages exploration, a stark contrast to typically authoritative data viz projects. Rendered with inky washes of lichen green and aquamarine blue, the fifty-eight mountains are clearly labelled with a legend and biographical information, linking out to the source Wikipedia page for the Hong Kong female-identifying artist that each peak represents.
Wikipedia, one of the largest and most popular online references, relies on user-generated entries and edits. Even if we know that it may not be the most comprehensive, or even the most reliable, of sources, it tends to be a default search mode when we’re curious about a new subject. Articles are available in nearly three hundred languages, though English is by far dominant, both in the number of entries and the activity of its editors, most of whom identify as male. The data set is obviously skewed, and therefore we can assume that Wu’s source material was fundamentally flawed. Even the Wikipedia search terms that Wu used to arrive at her graphics (‘Hong Kong’, ‘woman’, ‘artist’) are inadequate. Hong Kong is a city with a complex political history. ‘Woman’ is a label that an individual may or may not choose for themselves. And ‘artist’ is difficult to define. (Wu adjusted this term to encompass poets, dancers, actors, musicians, and other creative practitioners.)
Unlike typically faux-objective data viz projects, Wu’s hong kong artists, women highlights the shortcomings of the data set. The project does not present itself as complete or didactic; rather, Wu offers a space that encourages introspection and interpretation, to critically consider what is missing and, at best, how it may be remedied. Some users may be motivated to edit entries on Wikipedia, which could bring more diverse perspectives to the platform—even though systemic inequalities will continue to prevent many from contributing crowd-sourced, unpaid labour. Others might envision artworks, projects, and activist forms of organising. For information to be beautiful, it needs to be accessible to all of us.
About Kerry Doran
Kerry Doran is one of the external advisors on the curatorial board for M+ Digital Commissions. She is an art historian, critic, and curator, with projects based out of Buenos Aires, Hong Kong, and New York. Her interdisciplinary, research-based work centres on politically subversive media practices and artistic strategies; pop cultural channels, circulation methods, and formats; and the relationship between digital infrastructures, socioeconomics, and art practice.
Doran is a contributor to Artforum, specialising in time-based media, performance, and the historical avant-garde from the Southern Cone. Her writing appears in exhibition catalogues, artist books, and independent publications, including Affidavit, Art in America, BOMB, Flash Art International, Foam Magazine, ramona, Real Life, Rhizome, Terremoto, and SFMOMA’s Open Space. Her curatorial projects have been featured in Artforum, ARTnews, Modern Painters, New York Magazine, the New York Times, Página/12, Rhizome, and The Village Voice, among others. She has presented her research and facilitated discussions at the British Computer Society, Carnegie Mellon University, Harvard’s Graduate School of Design, the International Center of Photography Museum, Goldsmiths, M+ Museum, the Massachusetts Institute of Technology, the Rhode Island School of Design, Virginia Polytechnic Institute and State University, and on Montez Press Radio.
Doran was previously the director of Postmasters and bitforms, respectively, and was one of the founding members of NEW INC, the New Museum’s incubator for art, design, and technology. She holds a master’s with distinction from the Courtauld Institute of Art in London, where she was an Associate Scholar at the Research Forum, and three degrees, summa cum laude, from the University of Colorado Boulder.
Shirley Wu’s hong kong artists, women is commissioned for M+’s series of digital commissions, exploring online creative practices that sit at the intersection of visual culture and technology.
Visualising data can be traced to the prehistoric cave paintings, from Indonesia to Lascaux; these representations indicate the idea of visual, figurative art itself. Later, diagramming stars enabled map making. See Howard G. Funkhouser, ‘A Note on a Tenth-Century Graph’, Osiris 1 (January 1936): 260–262; and Michael Friendly, ‘A Brief History of Data Visualization’, Handbook of Data Visualization (Berlin and Heidelberg: Springer, 2008).
‘Data viz’ is how I will refer to this specific contemporary form of data visualisation throughout the piece, to distinguish it from other examples, both in the present and historically.
Funkhouser’s ‘Historical Development of the Graphical Representation of Statistical Data’ (1937) might be our most modern example. Psychologist and statistician Michael Friendly’s ‘A Brief History of Data Visualization’ project (2006), accompanied by an interactive online compendium tracing data visualisation techniques alongside relevant technological developments, seeks to address this gap in knowledge, though it doesn’t appear to have been updated since 2009. Then there is the perennially essential and oft-referenced The Visual Display of Quantitative Information (1983) by Edward R. Tufte, discussing and illustrating examples of what constitutes effective and ineffective design; however, it is by no means a comprehensive historical survey. See: Funkhouser, ‘Historical Development of the Graphical Representation of Statistical Data’ (Ph.D. diss., Columbia University, 1937); Friendly, ‘A Brief History of Data Visualization’; and Tufte, The Visual Display of Quantitative Information (Cheshire, CT: Graphics Press, 2015).
David McCandless, Information Is Beautiful (New York: Collins, 2009). On Information Is Beautiful’s website, the mission is described as ‘dedicated to helping you make clearer, more informed decisions about the world.’ https://informationisbeautiful.net/about/.
Alexander R. Galloway, The Interface Effect (Cambridge: Polity, 2012), 78.
FiveThirtyEight was licensed to the New York Times on a three-year contract, drawing massive amounts of traffic during the 2012 U.S. presidential elections (the site predicted the correct voting outcome for all fifty states). After Silver’s departure to ESPN in 2013, the Times started ‘The Upshot’, a data-driven column that ‘emphasises data visualisation and graphics to offer an analytical approach to the day’s news’. In an article from that time, the founding editor of ‘The Upshot’, David Leonhardt, states that the column’s intention is to help readers get ‘to the essence of issues’, and that data could explain ‘reality to people’. His language echoes that of McCandless (and the majority of those within the field of data viz and big data, for that matter), proselytising the illuminating quality of data. John McDuling, ‘“The Upshot” is the New York Times’ replacement for Nate Silver’s FiveThirtyEight’, Quartz, 10 March 2014, https://qz.com/185922/the-upshot-is-the-new-york-times-replacement-for-nate-silvers-fivethirtyeight/.
Everette E. Dennis described Meyer’s work as such in 1971, to contrast it with narrative-driven journalistic techniques. See Philip Meyer, The New Precision Journalism (Lanham, MD: Rowman & Littlefield, 2001).
Cameron Robertson, ‘Reading the Riots: how the 1967 Detroit riots were investigated – video’, The Guardian, 9 December 2011, https://www.theguardian.com/uk/video/2011/dec/09/reading-the-riots-detroit-meyer-video.
As gleaned in an interview with Meyer, even though ‘precision journalism’ does not directly relate to computers, the collection of data was enabled by the implementation of digital computing. Marília Gehrke and Luciana Mielniczuk, ‘Philip Meyer, the Outsider Who Created Precision Journalism’, intexto, https://pdfs.semanticscholar.org/2fda/4cd2019c360a239c634a9f02e579fbd9675e.pdf.
Ava Kofman, ‘Bruno Latour, the Post-Truth Philosopher, Mounts a Defense of Science’, New York Times, 25 October 2018, https://www.nytimes.com/2018/10/25/magazine/bruno-latour-post-truth-philosopher-science.html.
‘Algorithm Appreciation: People Prefer Algorithmic to Human Judgment’, https://www.hbs.edu/faculty/Publication%20Files/17-086_610956b6-7d91-4337-90cc-5bb5245316a8.pdf.
See also: Julia Angwin, ‘A Letter from the Editor,’ The Markup, 25 February 2020, https://themarkup.org/2020/02/25/editor-letter-julia-angwin.
Maddy Varner in conversation with the author, New York, 25 January 2020. This text would not have been possible without Varner’s expertise and references to sources.
As detailed on Art + Feminism’s website: ‘Wikipedia’s gender trouble is well documented. In a 2011 survey, the Wikimedia Foundation found that less than 10% of its contributors identify as female; more recent research puts that number at 16% globally and 23% in the United States. Further, data analysis tools and computational linguistics studies have concluded that Wikipedia has fewer and less extensive articles on women; those same tools have shown gender biases in biographical articles.’ ‘About’, Art + Feminism, http://www.artandfeminism.org/#about.
This is a complicated point in the context of Wikipedia, as former Wikipedian-in-Residence and communications scholar Dorothy Howard has illustrated: ‘Wikipedia does pay some contributors, such as developers, administrators, and outreach personnel. The Wikimedia Foundation keeps its paid staff to a scant number under 300—with most involved in software and engineering, fundraising, and community support. Its volunteer numbers, however, are comprised of around 30,000,000 global users and 118,000 regularly contributing ones. But again, I’ve observed that if you ask the average Wikipedia contributor if they would take money for their edits (if it were allowed), they generally scoff. And the idea of introducing money into the network is considered so obscene that doing so might jeopardize your own standing in the community.’ Dorothy Howard, ‘Labor and the New Encyclopedia’, DIS Magazine, February 2015, http://dismagazine.com/discussion/73109/dorothy-howard-intellectual-labor-and-the-datalogical-encyclopedia/.