Getting a Sense of Big Data and Well-being

Susan Oman

doi:10.1007/978-3-030-72937-0_5

New Directions in Cultural Policy Research

https://doi.org/10.1007/978-3-030-72937-0_5

Abstract

Can Big Data improve understanding of well-being and can they harm well-being? The chapter opens by asking what even is ‘Big Data’, and is ‘it’ actually new when large datasets have been valuable in understanding population-level health, wealth and well-being for 6000 years. It reviews the failed promises of Big Data to predict and prevent pandemics, including COVID-19, comparing new data infrastructures with old ones. It presents examples and case studies of social media data and data mining on large scales, and for smaller organisations to understand how we feel. We find there are more limits to Big Data and new data technologies to understand well-being than are made explicit, and question the ethics of Big Data insights and their monetary value in the context of well-being.

CHAPTER 5

Getting a Sense of Big Data and Well-being

© The Author(s) 2021. S. Oman, Understanding Well-being Data, New Directions in Cultural Policy Research, https://doi.org/10.1007/978-3-030-72937-0_5

5.1 What Even Is ‘Big Data’?

Big data generally capture what is easy to ensnare—data that are openly expressed (what is typed, swiped, scanned, sensed, etc.; people’s actions and behaviours; the movement of things)—as well as data that are the ‘exhaust’, a by-product … It takes these data at face value, despite the fact that they may not have been designed to answer specific questions and the data produced might be messy and dirty. (Kitchin 2014, Chap. 2, p. 3 of individual chapter version)

Rob Kitchin is possibly one of the most cited definers of ‘Big Data’, opening books and dissertations up and down the land. Yet, as we are about to discover, Kitchin himself tells us that while the term ‘Big Data’ is repeatedly defined (Kitchin 2014, Chap. 2, p. 3), big data themselves defy categorical labelling. So, it is not clear-cut, because differentiating what ‘it’ is and what they are not is often side-stepped, or comes with caveats.[1] We encountered something similar before, if you remember, in Chap. 2. When it comes to understanding what well-being is, those inclined to measure are sometimes keen to measure well-being to understand it, rather than define what it is that is being measured. In a similar way, those describing Big Data are often more concerned with what Big Data does (or do), rather than what Big Data is, or are.

In this chapter on Big Data, we will discover that how they are used can defy some of the old definitions of how to use data or what data are for. So, let us start with some definitions and what is different.
For Kitchin, the lack of ‘ontological clarity’ of Big Data (as the individual concepts and categories of Big Data and the relations between them) means the term acts as a vague, catch-all label for a wide selection of data (Kitchin 2014, Chap. 2, p. 3). Despite this, he has reviewed how other people define it and proposes the key traits of Big Data. These qualities are outlined in Table 5.1. Given the word ‘big’, it is probably no surprise that volume is one of ‘the 3Vs’ identified by Doug Laney back in 2001. The other two are velocity and variety. Other qualities include exhaustivity, resolution, indexicality, relationality, extensionality and scalability (Kitchin and McArdle 2016; Kitchin 2014). But what does this mean? How do these characteristics help us understand the data?

Table 5.1 Ways that Big Data are different

| Label/definition | Origin | Meaning | Pre Big Data | Big Data |
| Volume | Laney (2001) | Consisting of enormous quantities of data | Limited to large | Very large |
| Velocity | Laney (2001) | Created in real-time | Slow, freeze-framed/bundled | Fast, continuous |
| Variety | Laney (2001) | Being structured, semi-structured and unstructured | Narrow[2] | Wide |
| Exhaustivity | Mayer-Schönberger and Cukier (2013) | An entire system is captured, rather than being sampled | Samples | Entire populations |
| Resolution and identification | Dodge and Kitchin (2005) | Fine-grained (in resolution) and uniquely indexical (in identification) | Coarse and weak to tight and strong | Tight and strong |
| Relationality | Boyd and Crawford (2012) | Containing common fields that enable the conjoining of different datasets | Weak to strong | Strong |
| Flexible and scalable | Marz and Warren (2012) | Can add/change new fields easily and can expand in size rapidly | Low to middling | High |

Adapted from tables in Kitchin (2014) and Kitchin and McArdle (2016)
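One way to make the traits in Table 5.1 concrete is to treat them as a checklist against which a dataset can be profiled. The sketch below is only an illustration under stated assumptions: the `DatasetProfile` class, the trait identifiers and both example datasets are invented here and are not taken from Kitchin and McArdle (2016).

```python
from dataclasses import dataclass

# The seven trait labels from Table 5.1, written as identifiers.
TRAITS = (
    "volume",
    "velocity",
    "variety",
    "exhaustivity",
    "resolution_and_identification",
    "relationality",
    "flexible_and_scalable",
)


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical record of which Table 5.1 traits a dataset shows."""
    name: str
    traits_present: frozenset

    def missing_traits(self):
        # Traits from Table 5.1 that this dataset does not exhibit,
        # returned in the order they appear in the table.
        return [t for t in TRAITS if t not in self.traits_present]

    def has_all_traits(self):
        return not self.missing_traits()


# Invented profiles for illustration only (an assumption, not source data).
profiles = [
    DatasetProfile("hypothetical_sensor_feed", frozenset(TRAITS)),
    DatasetProfile(
        "hypothetical_admin_register",
        frozenset({"volume", "exhaustivity", "relationality"}),
    ),
]

for profile in profiles:
    print(profile.name, profile.has_all_traits(), profile.missing_traits())
```

A profile that lacks several traits may still be widely described as Big Data, which is why a checklist of this kind says little without the context in which the data were produced and are used.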
Having established a series of classifications for Big Data, Kitchin tested his taxonomy of traits with co-author McArdle a few years later (Kitchin and McArdle 2016). They applied the categories to 26 datasets which are widely considered Big Data and drawn from across seven sources: mobile communication, websites, social media/crowdsourcing, sensors, cameras/lasers, transaction process generated data and administrative data. The authors find all seven traits in Table 5.1 are only applicable to ‘a handful’ of these datasets (Kitchin and McArdle 2016, 9). This shows how difficult it is to diagnose what Big Data actually are.

Rather than the qualities of the data themselves, it might be more useful to instead turn to thinking about the contexts of data again: where they come from, and what they do (Oman n.d.). The key difference in the characteristics of Big Data is context, which is often missing when the data are presented. Table 5.2 represents how difficult it is to diagnose what Big Data actually are, without considering the qualities that affect their use. It shows there are additional Vs: veracity, value and variability. These are concerned with how the data suit their re-purposing. Given the multiple insights and applications of data outside of their original setting, it can be difficult, even more difficult, to find certainty from them. This is because the data were collected, generated and produced for a specific reason, or as a by-product, that differs from how they are re-used. The value of Big Data is the variety of insights that are possible and that can be used for other purposes. However, there are many things in the data that may not be useful. This also means using Big Data can increase the risk of confounding more traditional causal explanations.
Instead, the mess of Big Data lends them to correlation with many insights, which can be used to enable prediction of well-being for individuals and society. We shall return to correlations and well-being in our case studies later in this chapter.

Table 5.2 Some qualities of Big Data

| Label/definition | Origin | Qualities of data that affect their use |
| Veracity | Marr (2014) | The data can be messy, noisy and contain uncertainty and error. |
| Value | Marr (2014) | Many insights can be extracted and the data repurposed. |
| Variability | McNulty (2014) | Data whose meaning can be constantly shifting in relation to the context in which they are generated. |

Synthesised from Kitchin and McArdle (2016)

Table 5.3 looks at sources of different kinds of data typically used to predict well-being along with their pros and cons. These sources were drawn from an article in a journal for Data Science Analytics (Voukelatou et al. 2020), and I have synthesised these with Kitchin’s seven sources (mobile communication, websites, social media/crowdsourcing, sensors, cameras/lasers, transaction process generated data and administrative data), retaining commentary from Voukelatou et al. on the pros and cons for their use to understand well-being. You may look at these and feel that these data sources seem like strange ways to understand people’s well-being, given the difference between their origins and what they may be used for. You may also note that the authors’ presentation of the pros and cons, based on these sources, does not really prompt consideration for the people whose data they are, more their ease of use for the Data Scientist. Returning to contexts of use: mobile phone data, for example, are primarily generated for billing, or because apps need location data to work (such as maps or for local restaurant recommendations). This is very different from these data being used to understand trends about people and society.
Our previous examples of data re-use (or secondary analysis) have largely involved data that were collected in national surveys, or through more qualitative methods with smaller samples to understand a specific aspect of people and society more deeply in some way. Notably, even if the research question is different when data are re-used in Chap. 3’s examples, the purpose of the data’s collection is not as different, or as removed, as this ‘exhaust’, ‘by-product’ nature of the data Kitchin refers to.

The process which has come to be known as ‘datafication’ (as coined by Mayer-Schönberger and Cukier 2013) describes the increased demand for and uses of data. As we have seen in previous centuries, appetite for numbers (pandemics being one accelerator of data desire) has coincided with technological evolutions with numbers. In turn, and as we have seen over the last four chapters, different disciplines have increased and expanded their capacities for data and knowing the human experience in their own, particular way, and ‘new sciences’ have been declared. ‘Big Data’, as data with the qualities presented above, result from mounting capacity and faster instruments that increase the possibilities for the origins and volumes of data that can be stored in expanding databases, or in different databases which can be readily linked for a variety of purposes.
Table 5.3 Sources of Big Data and their pros and cons for well-being measurement

| Data Source | Pros | Cons |
| Mobile communications (including GPS) | Captures temporal, spatial and social dimensions; worldwide diffusion; repeatability; unbiased and classified; real-time monitoring | Not publicly available; data sparsity; geographically imprecise; limited coverage in rural areas; indoor/altitude spatial inaccuracy |
| Social media | Measuring social dynamics; publicly available | Privacy issues; overrepresentation; social desirability bias; disturbance of normal activities to post |
| Health and fitness (including mental health and well-being apps) | Cost-effective; prediction of near-term risk of events; reduced respondent burden | Not publicly available; not necessarily representative of the population; requests for data input can disrupt daily activities; data can neglect moment-to-moment variations in mood |
| News | Variety of subject domains; variety of data; range of targets; 24/h updated; archived historical news | Gatekeeping bias; coverage bias; statement bias |
| Transaction process generated data | Modelling of dynamic household behaviour; temporal accuracy; long-term coverage; quality | Dependency on retailer’s permission; legal constraints |
| Websites and searches | Publicly available; speed, convenience, flexibility, ease of analysis; timeliness, observation of people’s behaviour through searches | Population size varies across domains; relevant queries difficult to identify; bias of content and terms; comparability of different search terms on different days |
| Crowdsourcing | Large number of data; speed, relative low-cost measurement of daily behaviour and activity | Risk of low-quality results; trade-off between quality and cost; use of self-reports; paid participation of users |
| Administration data | Accurate, temporal stability, valid for community-level understanding and cross-cultural comparisons | Limited understanding of human experience in administration data |

NOTES: Made from synthesising across Rob Kitchin’s 7: mobile communication; websites; social media/crowdsourcing; sensors; cameras/lasers; transaction process generated data; and administrative data & Voukelatou et al. (2020), with the data examples in this chapter

As we have also seen before, it can be difficult to decide which came first: appetite for data, or capacity to expand on data possibilities. In the age of Big Data, these newer data sources hold a wide variety of easy-to-capture data points, including observations of how we feel, where we are (or were), who we know, what we spend, and on what. These provide information on what products we have clicked on, and those we have not bought (Turow 2011). They can show how and where we spend our spare time and our money, both off and online. They are, therefore, incredibly valuable for research and commerce.

It is not these individual data points that are important, per se, but the links between them, that make them valuable. Through linking, assumptions can be made about how our behaviour, such as online spending, or improved mood, can be replicated in another place or time. These insights are also linked with other more familiar data points from administrative records, for example: where we were born, how much we earn, whether we own our own house. Other data are produced by loyalty cards, smartphones and in-house devices, such as Alexa, expanding such linking opportunities.
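The linking described above depends on what Table 5.1 calls relationality: common fields that allow different datasets to be conjoined. The following is a minimal sketch of that idea, assuming a shared `person_id` field. Every record, field name and the `link_records` helper are hypothetical and invented for illustration; real linkage systems are considerably more involved.

```python
# Hypothetical behavioural records, e.g. from a loyalty card (invented data).
purchases = [
    {"person_id": "p1", "item": "running shoes"},
    {"person_id": "p2", "item": "cinema ticket"},
    {"person_id": "p3", "item": "gym membership"},
]

# Hypothetical administrative register keyed on the same identifier.
admin_records = {
    "p1": {"home_owner": True},
    "p2": {"home_owner": False},
}


def link_records(events, register, key="person_id"):
    """Attach register fields to each event that shares the same key value."""
    linked = []
    for event in events:
        match = register.get(event[key])
        if match is not None:
            # Merge the two records; events with no match are left out.
            linked.append({**event, **match})
    return linked


linked = link_records(purchases, admin_records)
for row in linked:
    print(row)
```

Neither source says much alone, but once joined on the common field each purchase carries an administrative attribute with it, which is where the analytical and commercial value arises.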
Those who may try to avoid ‘being known’ by these other data will try to bypass the systems that gather these data. However, this resistance also becomes data in and of itself; avoidance still produces digital traces that can be used to gather insights. Corporations may still create an automated profile of sorts, and assumptions will be made about the kind of products ‘the resistors’ buy. The persistence of data practices and their seeming inescapability are the reason we are starting to think about the experience of Big Data as something we ‘live with’ (Kennedy et al. 2020) and as something we ‘feel’.

This chapter covers some of the pervasiveness of Big Data, alongside the possibilities that come with that. Crucially, we look at what that means for well-being. We start by looking at the ways that data about mundane aspects of our lives are increasing, alongside how normalised increasing data collection, analysis and re-use are. These ‘data practices’ present new possibilities and realities of data-driven systems and decision-making that affect culture and society.

In this chapter, we touch on some of the uncomfortable aspects of these new realities, before historicising Big Data as well-being data to contextualise contemporary concerns regarding data practices that can be harmful. The second half of the chapter uses case studies to explore these concerns about well-being and data. Firstly, we consider a high-profile case that was billed as the promise of Big Data: Google Flu Trends (GFT), looking back from the age of COVID-19. Three further, short examples show the possibilities of social media data, place-based data, and health and fitness data to understand well-being for social and cultural policy and culture and society more generally.

5.2 Big Data: A New Way to Understand Well-being?

“Big Data”, was cited 40,000 times in 2017 in Google Scholar, about as often as “happiness”!
(Bellet and Frijters 2019)

The datafication of social life has led to a profound transformation in how society is ordered, decisions are made, and citizens are governed. (Hintz and Brand n.d., 2)

Digital devices and data are becoming an ever more pervasive and part of social, commercial, governmental and academic practices. (Ruppert et al. 2013, 2)

The majority of Big Data are collected in a different way to the national surveys and interviews we encountered in Chaps. 3 and 4, and consequently have numerous different qualities. One is that surveys and questionnaires are, by and large, overt methods, in that it is obvious you are asking questions to generate data. The new technologies use data which are collected covertly and so often gathered on individuals without their ‘considered consent’, and are often processed without transparency. Figure 5.1 shows just a small selection of the types of personal data that are useful and valuable for social analytics and that are covered in this chapter.

Fig. 5.1 Some examples of personal data used for social analytics in the era of Big Data

Social analytics involve the monitoring, analysing, measuring and interpreting of data about people’s movements, characteristics, interactions, relationships, feelings, ideas and other content. Figure 5.1 shows only a few of many more examples. Here, they are categorised into domains that share the same names as the UK’s well-being measures, to enable you to cross reference the different kinds of insights available under each domain from these data (although biometrics is a new addition). The data are from ‘observations’ of how we move around the on and offline world. They can include behaviours collected by sensors (think of how your mobile phone uses data via GPS to tell you when the next bus is, or that you are about to encounter traffic on the motorway).
They include our feelings, shared by social media data, or in apps. While demographic data have long been collected, as we know, these newer forms of data can say much more about us, our well-being and quality of life. As we shall discover, this is both for good and bad and any insights gained need to be put into context.

As we have also discovered, data are not only numbers or text, but can be sound and pictures. Analysing these kinds of qualitative data as Big Data holds new possibilities. In some ways it is these new possibilities that feel the most uncomfortably non-human, whether it is concern that your phone is always listening to you or, rather, that Alexa or Siri are (to humanise these technologies). Even the Street View option of Google Maps allows us to look at other people’s homes. I remember keenly finding the image of the flat I rented in London for years, only to see my washing-up through the kitchen window. I couldn’t help but think, I wish I had known they were coming.

More notable than my neglected washing-up being on public view for judgement are other visual data used for training datasets, particularly for facial recognition. There are the moments when you know that facial recognition technology is being used: to log in to your phone, or at passport control at the airport, perhaps. However, they are also being developed for schools, public transport systems, workplaces and healthcare facilities (Ada Lovelace Institute 2019). Revelations about its use in shopping centres prompted media and public outrage, regulatory investigation and political criticism (Denham 2019; BBC 2019). These reactions are in part about the further encroachment on the way we live (like the call centre example from the 1990s that opens the book) and in part the lack of consent and knowledge about these data being collected about us.
Some people who uploaded photos to Flickr, some 10–15 years ago, more recently discovered they (as in the people’s faces and their photos) appeared in a huge facial-recognition database called MegaFace (Hill and Krolik 2019). They found the database held facial data on around 700,000 individuals, including their children, and was being downloaded by various companies to train face-identification algorithms. These algorithms were then being used to track protesters, surveil terrorists, spot problem gamblers and spy on the public at large (Hill and Krolik 2019). Notably, a colleague who read this chapter before publication (a digital sociologist,[3] no less) confessed to me their shock at reading this anecdote, as they had used Flickr and were not aware of this story. Therefore, not only are our personal data collected and used without our knowledge, but the controversies surrounding their re-use are not even shared with users. This poses questions for accountability and transparency.

The questions of who is collecting these data, and who is using them, and for what, present a more complex issue than before. Public support for the police to use facial recognition technology is conditional upon limitations and subject to appropriate safeguards, but there is no trust in private company use (Ada Lovelace Institute 2019). As we have been discovering, it is the contexts of data collection and uses that we need to understand: it is the who, what, where, why and what for that are important.

Why We Need to Ask Critical Questions of Data in the Context of Well-being

Many issues related to Big Data don’t have clear-cut answers, especially where well-being is concerned. While data reveal details of the vulnerable, often involving risk for these people and their communities, the State uses data systems that people increasingly need to be a part of to access healthcare and welfare support (Dencik 2020).
This is why the growing amount of research which problematises the utility and ethics of Big Data, and how they are used, is vital. In this area of critical data science (see Bates 2016), some researchers use Big Data to reveal the limits and social issues connected to everyday datasets that we all use, such as a search engine’s image database (e.g. Otterbacher et al. 2017). These critical studies of data and their effects on society reveal how data can not only create new problems but also perpetuate racism and misogyny, as we discovered in Chap. 1 with Safiya Noble’s example of what happens when you search for the phrase ‘black girls’ (Noble 2018). These projects reveal data’s negative social effects, and how they are already embedded in society, exacerbating issues.

Other research aims to investigate what people know and think is going on, while also looking at the possibilities of Big Data (and their associated technologies) for understanding aspects of well-being. One such example (Living With Data n.d.) presents real-life cases of public sector data practices to members of the public. It aims to understand how much people appreciate the possible benefits and how much they doubt or distrust the possible implications of data systems and sharing in their everyday lives. One possibility, of course, is that many people may not really care as much as we think they do, or should.

We touch on these issues in this chapter. Most notable is the increase in concerns regarding the harms that Big Data and new technologies are capable of, and which are happening unchecked (e.g. the UK’s Data Justice Lab n.d.; Eubanks 2018; O’Neil 2016; Noble 2018; Benjamin 2019). There are two main problems here. One is that we are compromising well-being in the so-called aim of better understanding the human condition.
The second is that we are not only using these data and technologies to understand people but also sorting and managing them in different ways that suit those who are already more powerful.

It is vital to note that key to concerns about datafication is how these practices disproportionately affect the well-being of those already most vulnerable. Facial recognition, for example, negatively impacts people already disadvantaged, owing to its own gendered, heteronormative, classed and racialised biases (Ada Lovelace Institute 2019). These technologies are also being trialled in policing in the UK, where more than 90% of matches have been reported as incorrect (Fussey and Murray 2019; Davies et al. 2018). In a more general way, all public services are adopting new data practices and possibilities.

Data-driven decision-making is growing as an everyday feature of public services. Who receives welfare (Eubanks 2018, 37), housing (Eubanks 2018, 93) and other interventions, such as child protection (Eubanks 2018, 135) or education (O’Neil 2016, 5–9; 52–60), are decisions increasingly made by algorithms, rather than people. Even when automated decisions are questioned by people (Eubanks 2018, 141), it is unclear whether ‘experienced workers’ (Eubanks 2018, 77) or the data system has the greater influence in key decisions.

Beyond welfare, algorithms intervene in other social policy areas. They monitor the ‘quality’ of education, using dubious proxies (O’Neil 2016), with various bad outcomes, including teachers undeservedly losing their jobs.[4] In COVID-19 UK in 2020, an algorithm also decided the grades awarded to school-leavers in the absence of exams, owing to social distancing measures. One national media headline (Pidd 2020) called this ‘punishment by statistics’.

The UK’s A Level algorithm example was extremely high profile, causing outrage that data-driven decision-making would have such an enormous effect on the futures of these young people.
It was seen as morally outrageous for a number of reasons. First, because our society dictates that these young people’s well-being should be protected. Second, this algorithm used data that no one had consented to: no one knew at the time that their prior grades could be used as a final grade. Third, the data model also included proxies for expected performance which were nothing to do with each student’s own academic record. Instead, they used their school’s overall performance in previous years, which were scores based on previous students’ grades, not theirs. While the governing body, Ofqual, insisted its standardisation arrangements ‘are the fairest possible to facilitate students progressing on to further study or employment as planned’ (Pidd 2020), there were further controversies over transparency around how they had arrived at ‘fair’. After which, Ofqual published a 319-page document explaining its methodology (Pidd 2020) which was criticised for not being accessible to the general public. Therefore, not only did the whole thing seem far from fair, but Ofqual didn’t make explicit how the approach was fair to those affected.

Here we see public services failing to look after well-being through the use of data in ways which go against the moral code of fairness, accountability and transparency,[5] and without the young people’s consent. Beyond their high-profile nature, what is different about these data uses? While Chap. 2 discussed the greater role of data in public services from the 1980s onwards, this ostensibly had a different rationale. It aimed to evaluate qualities of these services, such as efficiency or cost-effectiveness. While these approaches led to flawed decisions and evaluations, assessments were made at a societal level. Contemporary data-driven decision-making, whether the allocation of resources to people or the labelling of individuals at risk, is a different approach and uses data on a different level.
Or, to use the language of Chap. 3, there is a different unit of analysis, and that unit could be a vulnerable person.

In sum, why do we need to ask critical questions about how people and their well-being are being understood or about how data and data systems used to understand people can compromise well-being? Going back to those definitions, people are often concerned with the speed and size, and so on, of Big Data. Actually, as Kitchin indicates, it is the contexts of these data that are the most important ways that they are different. Not only are the contexts of origin of Big Data more different, and further from the contexts of use, than before, but the practices of analysing data feel less human. By this I mean that less human attention is now required in data analysis and in important processes that require data. What does that mean for decisions made about people and well-being?

As we will discover in a few sections, the response to COVID-19 required older data and data systems, and more human judgement, than you would have imagined if you were looking at media reports of the promise of artificial intelligence (AI) in the first half of 2020. However, as the financial value of data increases, the more expediently they can be analysed, and here we must ask other questions. Who stands to gain and who stands to lose? Who has chosen to participate? But then did people ever get to choose to participate in systems of well-being data? Or were we even thinking about data as ‘a thing’ about us, that affects our lives and was valuable? The next two sections deconstruct the financial value of Big Data and whether this reality is even new.

Value

Another major reason why we need to ask critical questions about Big Data and well-being concerns the financial value of knowing more about people and the financial value of the systems that sort people for public services and welfare distribution (Eubanks 2018).
Beyond public services, the value of the new ways that Big Data can work is not just in knowing more about people, but because of the potential this knowledge has to orient people’s thinking through suggestion and in some high-profile cases to manipulate what they do. They enable marketers to sell you products you might be most tempted by, knowing when you might be most susceptible too, based on your previous purchases or what else you’ve looked at (Turow 2011). They also enable political campaigns to target their messages in the same way and change voting behaviour (Avila 2019; Bates et al. 2016; Murgia 2017). The recent Cambridge Analytica scandal saw Facebook implicated not only in the unethical use of people’s data, and knowledge it had on their behaviour, but also in misinformation that is thought to have changed the results of the 2016 US presidential election and the Brexit referendum in the UK the same year.

The first and second waves of well-being (Bache and Reardon 2013) from Chap. 2, and to which we keep returning, evolved as historical moments in which data capabilities married policy-makers’ aims: improving the way we think about measuring human progress. Similarly, well-being metrics became more viable because well-being methodologies were evolving in a way that politicians saw as favourable. Political will and academic developments work with evolving infrastructure and technological development to enable datasets to be created with more detailed and nuanced information about quality of life. These factors work together for new methodologies to generate new kinds of data and analytical approaches which then, by extension, affect research and policy-making, which in turn impact upon our quality of life.
The increasing emphasis on Big Data as ‘the new oil’[6] (a misnomer, of course) is not because datasets are ‘better’ (which would need some qualification) or because the technologies are new (though admittedly this is partly why it has become such a fixation). Instead, ‘Big Data’ datasets offer data with different qualities than more traditional data acquired by surveys. This means big datasets offer capacity to answer different research questions, or answer the same research questions differently. Most importantly, they have been called the new oil because: (1) ‘data powers today’s most profitable corporations, just like fossil fuels energized those of the past’ (Matsakis 2019) and (2) this means these qualities can be financialised.

The amount of data on individuals that are now collected is almost impossible to visualise in our minds. The growing number of devices and sensors means we are generating more and more data that can be collected: the International Data Corporation predicts that by 2025, the total amount of digital data created worldwide will rise to 163 zettabytes (Coughlin 2018). One zettabyte is 10^21 bytes (1,000,000,000,000,000,000,000 bytes), or one trillion gigabytes, so 163 zettabytes is 163 trillion gigabytes. The European Commission forecasted the European ‘data market’ to be worth as much as €106.8 billion by 2020 (Ram and Murgia 2019). These kinds of numbers reinforce the importance of looking at Big Data as social phenomena with social effects. But how new are large datasets about people and populations?

5.3 Are Big Data Even Actually New?

While data are ‘sold’ to us as ‘the new oil’ (The Economist 2017), large datasets, and their use to understand human behaviour, are not new; neither is the relationship between governments, commerce and value, when it comes to data.
Mary Poovey’s A History of the Modern Fact: Problems of Knowledge in the Sciences of Wealth and Society (1998) describes the rise of merchants and their influence over the State, including campaigns to promote the balance of trade as the index of national well-being from the early seventeenth century onwards (Poovey 1998, 93–94). The new ‘enthusiasm for numbers’ in the early to mid-nineteenth century (Hacking 1991, 186; Porter 1986, 1996) coincided with a growing infrastructure to collect and analyse data. This desire for numbers, and the data processes that were required to provide them, led to the ‘great explosion of numbers that made the term statistics’ (Porter 1986, 11). If truth be told, the term ‘statistics’ originated for governments to understand ‘the quantum of happiness’ (Sinclair 1798, vol. 20, p. xiii). In this ‘avalanche of numbers’, ‘nation-states classified, counted and tabulated their subjects anew’ (Hacking 1990, 2; 1991, 186). However, while ‘statistics’ may be hundreds of years old, large datasets go back further.

Managing land, agricultural hierarchies and the desire to control populations have long required systems of recording. One of the oldest-known writing systems is Sumerian script, which is approximately 6000 years old (Bellet and Frijters 2019). This script is called cuneiform, and its uses are said to include the tracking of trade and taxes: you need records on who has paid, how much; who has not paid, and what they owe (Harford 2017). While the clay tablets these records were written on may not seem like a database, or feel like the Big Data futures outlined in the previous and subsequent sections, they were a dataset of sorts. Crucially, these data were used to monitor and control resources, including the management of people.

Most countries now undertake a census of sorts.
The UK Census takes place every ten years and has done since 1801.7 The first four were only headcounts, with the 1841 Census being the first to intentionally record names of all individuals in a household or institution. The UK’s ONS website offers an interesting history of censuses in the UK, back to the Domesday book ordered by the Norman (French) King, William the Conqueror in 1086 (ONS 2016). Again, censuses precede these European data moments by some 4000 years in both Egypt and China, whose governments (as they would have been formed and named in those days) recorded who lived where and how wealthy they were. The Romans held regular censuses to keep track of their expanding—and then contracting—empire. Evidence of other institutionalised data practices exists in the Bible: the book of Genesis talks of kinship and marriage records and Exodus mentions a population census to support the tabernacle. The Church collected information on births, christenings, marriages, wills and deaths; this tracked the business of a church and its parish, but was also a means of counting the faithful and tracking their wealth. You will note that the recording of trade and births, marriages and deaths is not so different from the administrative data that appear in all our examples of well-being data, from Table 3.1 to 5.3.

So, what is new about Big Data? We’ve long had large datasets that hold multiple data points on people and nations, but these are thought to be ‘state simplifications’ for officials (Scott 1998). Rationalisation and standardisation mean these representations ‘did not successfully represent the actual activity of the society depicted, nor were they intended to; they represented only the slice of it that interested the official observer’ (Scott 1998, 3). What the historian James Scott tells us here is that the sorts of information that were collected at scale lacked detail that could be used to improve quality of life.
He implies, of course, that those in charge did not actually care about quality of life, only quantity of resource, whether this was people to work the land, make armies, or pay taxes. More recently, as we have seen, governments were charged with responsibility for people’s well-being, and therefore, more complex data were required.8 One such development was the social survey.

The social survey has been used to collect data which capture various qualities of lives in richer ways, and for longer, than it is often credited for. For example, surveys in the UK in the mid-1940s (in World War II) discovered almost one in ten households did not have the number of cups deemed necessary for essential use, and ‘the shortage of scrubbing brushes seems to have been extensively felt’ (Oman 2015, 88; ONS 2001, 9). Whilst still administrative records of resource and scarcity, the survey began to be used to articulate more qualitative aspects of quality of life as proxies for well-being. This presents richer detail than many of the contemporary surveys that generate the well-being data we have seen as either objective or subjective data so far.

These more qualitative data were not only collected using government social scientists that we might imagine with clipboards. A project called Mass Observation was established in 1937 by anthropologist Tom Harrisson, poet Charles Madge and filmmaker Humphrey Jennings.9 Mass Observation aimed to record everyday life in Britain. There were paid investigators who anonymously recorded people’s conversations and their behaviour: at work, on the street and at memorable occasions, including public meetings or sporting and religious events.

This project was reminiscent of the current idea of ‘Big Data’, not only in the scope of the data gathered, but also in how they were gathered. Mass Observation had numerous phases and at one point also used a panel of around 500 voluntary ‘observers’.
The initial aims of Mass Observation were to research everyday life, making use of ‘the untrained observer, the man in the street’10 as much as those who were thought to be skilled and qualified in gathering data of this sort (Madge and Harrisson 1937, 10). The observers used various data collection methods to generate large datasets on different topics: some maintained diaries, while others replied to open-ended questionnaires. In 1938, there was ‘a competition’ for the residents of Bolton, Lancashire (see Fig. 5.2), asking people what happiness meant for them. This was one of many themes, and people would reply to what were called directives with often very long texts describing what they thought and how they felt. The data from these and from the 1938 project can still be accessed via a vast archive at the University of Sussex.11

Mass Observation began with a positive vision of democratising the processes behind how data were gathered to better understand people’s lives. However, over time, much qualitative social research shifted towards the narrower analysis of consumer choice, and Mass Observation became a market-research firm in 1949 (Albert 2019). Mass Observation re-launched in 1981, returning to its original egalitarian ideals, and the archives are testament to the ways that Mass Observation aims to engage the public in the documenting of their own lives.

These historical examples of large datasets are, therefore, not so different from the qualities found in previously crowdsourced, location-based, time-based data on how people feel about things, as seen in Table 5.3. The purchasing of scrubbing brushes was used as proxy data for other qualities of life in the same way our purchasing data are analysed to better understand us. Similarly, a lack of cups was indicative of a particular kind of poverty and lack of resources at a point in time, and this was analysed across the population. However, the democratic promise of Mass Observation and other projects of the time was superseded by the potential of understanding what makes people happy for commercial gain.

Fig. 5.2 What is happiness? Mass Observation competition flyer, 1938

The Darker Side of Historical Well-being Data and Commercial Gain

With the rise of market research came increased interest in people’s preferences, and in what made them happy or gave them pleasure (Davies 2015; Savage 2010). This involved capturing subjective well-being data, as well as cultivating communications to imply that owning or consuming certain things would increase someone’s well-being in some way. The aim in this context, of course, was to change people’s purchasing choices. With this shift, people as citizens became consumers. Over the years, ‘consumer sentiment’ indices have been assessed to see if these data can predict people’s behaviours on a macro level, from economic cycles (Carroll et al. 1994) to presidential popularity (Suzuki 1992). This marriage of mood and economics is not new to us, of course. In Chap. 4, we encountered the development of subjective well-being data, a newer, shinier well-being data, as a marriage of economics and psychology, known as happiness economics, that was able to measure subjective well-being at population level.

Mood and sentiment analysis are not new, then. Neither are big datasets. Even Fitbits and Apple watches are not new; not really, as attaching technologies to people’s bodies has been used to study and improve productivity and surveillance of workers and citizens for around a hundred years (Davies 2015; Cryle and Stephens 2017). So, what is new?
The amount and variety of data on the well-being of individuals and populations are increasing as technologies develop to manage greater amounts of different kinds of data, not only faster, but faster together.12 Therefore, it is not necessarily how one thing (not that Big Data are one thing, really) is new. Instead, it is a far more complex picture of how different aspects of, and different people across fields of, politics, science, research and technology work together—and work with commerce. These all combine as developments in what we know, and ways of knowing, about society.

The question is, what does that mean for well-being? How can we learn from previous mistakes regarding the context of who is using what data—and to what end? COVID-19 will offer us many data insights and many insights into how data can help us understand and look after well-being better. The next section looks at the role of data and learning in a pandemic, of old and new infrastructures and commercial and governmental data practices in the management of a pandemic.

5.4 A Case Study on the Promise of Commercial Big Data

One of the most high-profile cases of the possibilities of Big Data involves a tale that begins in 2009 when a new virus was discovered. This new illness spread quickly and combined elements of bird flu and swine flu. This story opens Mayer-Schönberger and Cukier’s book, Big Data: A Revolution That Will Transform How We Live, Work and Think, which you may remember is mentioned earlier in the chapter as a much-cited originator of the term ‘datafication’ (2013). The authors explain that the only way authorities could curb the spread of this new virus was through knowing where it was already. In the US, the Centers for Disease Control and Prevention (CDC) requested that doctors inform them of cases. However, the information on the pandemic that the CDC had to work with was out of date.
This was by nature of the data collected, and its ‘data journey’ (Bates et al. 2016). There were multiple data journeys to consider: data were collected at the point someone went to the doctor, which could be days after initial symptoms, let alone contraction; sharing data with the CDC was a time-consuming procedure; the CDC only processed the data once a week. Thus, the picture was probably weeks out of date, making intervention or behavioural analysis difficult. In other words, while the datasets were large, even potentially fairly detailed, these Big Data were too slow.

Coincidentally, so Mayer-Schönberger and Cukier tell us, a few weeks before the new disease made the headlines, Google engineers published a paper in a high-profile journal, Nature, which explained how Google could ‘predict’ the spread of the winter flu in the US. This was possible just through analysing what people had typed into their search engine (and, of course, knowing where those people typing were). It compared the CDC data on the spread of seasonal flu from 2003 to 2008 with the 50 million most common search terms in America. The Google engineers looked for correlations between what people typed into the Google search engine and the spread of the disease. Mayer-Schönberger and Cukier point out that Google’s method doesn’t require traditional infrastructures to distribute mouth swabs or for people to go to doctors’ surgeries.

Instead, it is built on ‘big data’—the ability of society to harness information in novel ways to produce useful insights or goods and services of significant value. With it, by the time the next pandemic comes around, the world will have a better tool at its disposal to predict and thus prevent the spread.
(Mayer-Schönberger and Cukier 2013, 2–3)

Sadly, a pandemic with wider societal and well-being effects arrived after I started writing this book, and despite the promise of Big Data, it did not prevent the spread. Data hold a very important place in the story of COVID-19 and its management, but all data have limitations in how they can inform human action to change reality, as do the different ways of analysing data. Indeed, data are not just there but are managed and used by people with their own interests. Data do not speak for themselves but are interpreted. All data realities also involve selective processes in what data are important and what data are not. These limits are not always made as clear as they should be.

Mayer-Schönberger and Cukier’s promise of Big Data as revolutionary and transformational in the US was clearly jumping the gun. Not only was the pandemic not prevented by way of predictive analytics, but actually, part of COVID-19 data management has very much involved doctors’ surgeries and mouth swabs—in the UK at least. To clarify, I was randomly selected from data held on people registered with a GP to participate in a survey in August 2020.13 I was contacted by the Real-time Assessment of Community Transmission (REACT) Study,14 which is in fact a series of studies, using home testing to understand more about COVID-19, and its transmission in communities in England. The logic behind the study was that not all people with the virus were being tested at this point, either because they were asymptomatic or for some other reason. This was one of a few projects to collect data from a sample of the population, over time, in order to understand how the virus was spreading.

This process relied on old infrastructures: I received a letter by Royal Mail, I signed up online, and then I was sent a mouth swab—also by post.
That all worked fine for me, but there was a series of steps registering different barcodes and I found myself wondering how accessible this was for everyone (when I say everyone, I often think of my once tech-savvy Dad, who’d have been bewildered at this whole process). After completing these steps, a courier was ordered to collect the test. I sat in patiently waiting for my test to be collected, slightly anxious about what felt like a huge responsibility, and acutely aware that I might need to be ready to run out and meet a courier with my test.

I live in a high-rise with no working bell or intercom (and a bunch of other things that don’t work). For three separate days, I watched for details of the courier on the app, and out of my window, waiting for them to appear on the road, or call to say I should come down. But there was no sighting of the courier in real life and no phone call. When the app showed they were coming, they disappeared without attempting to deliver. After three attempts, I was told that this particular courier company was infamous for not bothering to try and collect from my flats, because it was too inconvenient.

So, in my case, while some aspects of the traditional data infrastructure (the post) worked fine, they didn’t necessarily all work together as they might. This meant that my test remained uncollected, expired and had to be securely disposed of. This meant my data became ‘missing data’. What I was surprised by was how the information system assumed you would live somewhere that was easy to access. As we know, many people from our poorest communities live in high-rises where the lift doesn’t work, or the people in the flats themselves are difficult for a courier to access. Thinking about the contexts in which data are collected (or not) can be both extraordinary, and mundane, and we often don’t hear of these stories—when they work, and the odd occasion when they don’t, and what that might mean for the data.
Yet, these contexts have a huge impact on who is readable in data and how we understand well-being and inequality.

So why did COVID-19 data collection end up using more traditional infrastructures in the UK? On a larger scale, why did the world not use Google data as Mayer-Schönberger and Cukier predicted? As it turns out, Google Flu Trends (GFT) missed the peak of the 2013 flu season by 140%, and Google subsequently closed the project (REF). In 2014 a paper called ‘The Parable of Google Flu: Traps in Big Data Analysis’ was published in another high-profile academic journal, Science (Lazer et al. 2014). The authors concluded that while there was potential in these sorts of methodologies, and while Google’s efforts in projecting the flu may have been well meaning (which could be called into question), the method and data were opaque. This made it potentially ‘dangerous’ (Lazer and Kennedy 2015) to rely on GFT for any decision-making, as the context of the data and the analyses were not made explicit to public decision-makers. Of course, it is also perhaps unlikely that Google had designed the tool for public decision-making contexts,15 considering what government officials need to understand for this kind of decision-making.

There are other limits to the data, such as its sample. Google has a reputation for ubiquity, yet it is not the only search engine available: people choose other search engines for various reasons. Crucially, Google also does not have global reach. Most services offered by Google China, for example, were blocked by the Great Firewall in the People’s Republic of China. This was not even the first time it was banned in China. So, even if GFT were still in action, would it have pre-empted the COVID-19 outbreak in Wuhan, China, before more official announcements?
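The correlation screening described earlier in this case study, where a very large pool of search terms is compared against a short record of flu cases, can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not Google's method or code: the data are random numbers with no relationship to one another, and the names flu_cases, search_terms, pearson and screen_terms are hypothetical. It shows a general statistical risk of this kind of screening: when enough candidate series are tested against a short record, some will correlate strongly by chance alone.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def screen_terms(target, candidates):
    """Return (index, r) for the candidate series with the largest |r| against target."""
    scored = [(i, pearson(series, target)) for i, series in enumerate(candidates)]
    return max(scored, key=lambda pair: abs(pair[1]))


# Assumption: purely random, unrelated series stand in for real data.
rng = random.Random(0)   # fixed seed so the run is repeatable
WEEKS = 12               # a deliberately short "flu" record
N_TERMS = 2000           # far fewer than 50 million, but enough to show the effect

flu_cases = [rng.gauss(0, 1) for _ in range(WEEKS)]
search_terms = [[rng.gauss(0, 1) for _ in range(WEEKS)] for _ in range(N_TERMS)]

best_index, best_r = screen_terms(flu_cases, search_terms)
typical_abs_r = sum(abs(pearson(s, flu_cases)) for s in search_terms) / N_TERMS

print(f"typical |r| across all terms: {typical_abs_r:.2f}")
print(f"best-matching term #{best_index}: r = {best_r:.2f}")
```

The best-matching series looks like a strong predictor even though, by construction, nothing here is related to anything else. The sketch says nothing about why GFT specifically failed; it only illustrates why a correlation found by screening at scale needs context before it is trusted.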
If we are to think about how Big Data have transformed how we live, as Mayer-Schönberger and Cukier want us to, then we must also consider how ‘datafication’ has changed people’s practices. More and more of us scour the internet, hoping to reassure ourselves that recently developed symptoms are minor ailments. This is—as we discovered in Chap. 2—part of the anxiety introduced with audit culture: we consult technologies as a default because we can, rather than because we should. We search for confirmation that nothing is wrong, rather than only searching when something is wrong. In countries where access to healthcare is diminished, people are actively encouraged to search the internet before interacting with health services. Consequently, this limits the predictive power of search data, as their contexts have changed.

In the case of COVID-19, people searched for symptoms they didn’t necessarily have, especially in the second quarter of 2020, when most nations were in lockdown and the severity and ramifications of the disease were becoming clearer. The implications of this are that searches would not necessarily have reflected the infected state of an individual that could be aggregated to reveal community or population infections, or more importantly, predict transmission so that it might be controlled in some way. Instead, searches for COVID-19 symptoms may well be a predictor of concern or anxiety. Ironically, then, Google searches are arguably a better indicator of negative subjective well-being than of COVID-19.

The very idea of data being reliable has led to our need to feel sure—to have objective confirmation that all was OK, is OK or will be OK, and has led to an increased reliance on data. In the case of Google searches, this reliance has triggered people to search for verification of risk or safety. So how might we have cut through the ‘noise’ that the definitions at the beginning of this chapter point to, in order to know how it was spreading?
We are back at the chicken and the egg dilemma: do people search about COVID-19 because they have symptoms? Or do people search about COVID-19 because they are worried about it and feel compelled to search for confirmation—or search on behalf of friends or loved ones? I watched someone use their internet searches to check our colleague’s proclaimed symptoms against the common signs of swine flu—a very collegiate individual, but one whose search history told a story of their friend’s (potential) disease state, rather than their own. In this latter case, then, Google searches were more indicative of personality than health or even subjective well-being, although, perhaps well-being data all the same.

Bigger datasets make correlation more powerful than causation, explain Mayer-Schönberger and Cukier, devoting a whole chapter to it in their book (2013). Google queries went from 14 billion per year in 2000 to 1.2 trillion a decade later. There are even websites that show a live running tally of how many searches have been made in a day.16 If Big Data were all about scale, then GFT would have been more, not less, likely to work on the premise of correlation as search numbers increased. The scale at which we have correlations using ‘Big Data’ may be an indicator of causation, but not proof. Is this the end of the promise of Big Data, though? If we return to a case of COVID-19 and Big Data, what might we find?

Linking Big Datasets: For Well-being?

On New Year’s Day, 2020, a Canadian health monitoring company alerted its customers to the COVID-19 outbreak, some days before the US CDC or the World Health Organization (WHO) alerted anyone (Niiler 2020). Of course, the disease was not yet called COVID-19, and it was not known that it was to be a global pandemic. At this point, a cluster of unusual pneumonia cases had been detected.
One of the companies said to have beaten the WHO to this discovery is called BlueDot, which uses AI-driven algorithm searches to look at datasets, much like GFT. Unlike Google Flu Trends, BlueDot’s algorithms consolidate and analyse data from numerous sources. BlueDot’s owner, Dr. Kamran Khan, explains:

We can pick up news of possible outbreaks, little murmurs or forums or blogs of indications of some kind of unusual events going on. (Khan, in Niiler 2020)

Other data sources are more official, such as statements from health organisations, livestock and news reports in 65 languages. BlueDot also uses ‘anonymous mobile phone data’ (Whitaker 2020), flight sales and other records. These various data points enable a prediction of a possible new serious disease. Importantly, the logic is that this approach also offers insight into how that disease becomes mobile by the people who carry it and the planes that carry the people carrying the disease.

What we have done is use natural language processing and machine learning to train this engine to recognize whether this is an outbreak of anthrax in Mongolia versus a reunion of the heavy metal band Anthrax. (Niiler 2020)

Also, crucially, ‘epidemiologists check that the conclusions make sense from a scientific standpoint’ (Niiler 2020). The company website states that ‘BlueDot protects people around the world from infectious diseases with human and artificial intelligence’ (BlueDot n.d.). Therefore, despite claims to its sophistication, the automated data-sifting still requires human analysis to make sense of what has been found.

Khan’s company utilised technological developments at its disposal to synthesise many different types of data from multiple datasets to construct evidence. Only when the data were pieced together was the information useful, and only after human experts had checked it were these insights deemed useful enough to share and use. BlueDot is a commercial company. The human and artificial intelligence are synthesised as an enterprise, and Khan is often presented as both an entrepreneur and a professor of medicine and public health at the University of Toronto. Khan has also worked in hospitals, so understands how they work. Khan explains in one interview,

Disease doesn’t wait for the reviewers, so we need a more agile system. My motivation for creating a company—here to start supporting an entrepreneurial spirit—using business as the vehicle to do that. (Khan, on Charrington 20 February 2020)

There are two things to note here. Khan suggests that the old structures of peer review and scientific expertise are too slow in their use of data and evidence to tackle a global pandemic. He also suggests that his business successfully links together ‘human and artificial intelligence’ to provide what traditional science cannot: the analysis of data with veracity and variability, speed, resolution, relationality and so on. The value of BlueDot is in its claims to harnessing the qualities of Big Data.

To return to Mayer-Schönberger and Cukier, ‘Google’s method’ may not have involved distributing mouth swabs, or been built on old infrastructures, but instead, they explain:

[I]t is built on “big data”—the ability of society to harness information in novel ways to produce useful insights or goods and services of significant value. (Mayer-Schönberger and Cukier 2013, 2)

So, there we have those familiar terms of insights (a marketing term) and valuation (that we discovered from economics in Chap. 2), alongside clear communications and the presentation of novelty (Chap. 4), goods and services. Mayer-Schönberger and Cukier hint at the complex politics at play on the value of data—and the values of data more broadly than we have already encountered. Crucially, in a book about well-being and data, we have to note that BlueDot’s business is entrepreneurial because it is profitable.
In other words, the insights have to be sold to clients and customers. They were also not the only innovator (as acknowledged by the Lancet and MIT Technology Review [McCall 2020; Heaven 2020]). Here, we must return to the economic value of data because of the possibilities of well-being insights and the ideological project of the well-being agenda. If the well-being agenda is about improving redistribution of resources as an issue of social justice, we might want to think about what position we are coming from: rather than asking, ‘what are the data limits of these well-being projects?’, we might ask, ‘what are the well-being limits of data projects like these?’

However, despite the clear sophistication of BlueDot’s project, it also did not prevent COVID-19’s spread. This criticism has been noted in the MIT Technology Review:

The hype outstrips the reality. In fact, the narrative that has appeared in many news reports and breathless press releases—that AI is a powerful new weapon against diseases—is only partly true and risks becoming counterproductive. (Heaven 2020)

The point this MIT article was making here is that the over-reaching claims of AI could be damaging to its future progression, in the same way that GFT overstretched its claims. Data and the distribution of resources are very much part of the COVID-19 story, and not just of private companies profiteering, either. Such competition is also reiterated by national politicians misleading the public about ‘world-beating’ systems of data (BBC 2020).

In the same way that the social indicators movement was halted because it was not quite measuring what it thought it was measuring (Chap. 2), the ‘promise’ of Big Data has adjusted. The limits of Google’s approach are in a lack of context: the nature of what people actually search for is different than was predicted.
The limits on data are social, cultural, political and economic, and by extension, these limit the possibilities for a good society. We will explore social media and mobile communications data in the final few sec- tions to better appreciate this relationship. 5.5 Social Media Data: A Game Changer? I am sure that social media plays a role in unhappiness, but it has as many benefits as it does negatives. (Sir Simon Wessely, president of the UK’s Royal College of Psychiatrists in Campbell 2017) Social media platforms have an interesting relationship to well-being. They are often demonised as bad for well-being, especially for the younger generation who are thought to dwell on images of idealised bodies and lifestyles on Instagram (Campbell 2017). All ages feel a pang looking at the picture-perfect presentations on Facebook, and even the NHS warns people to take breaks from social media (NHS 2016). Credible, successful women leave themselves vulnerable to criticism from strangers in the shar- ing of thoughts, opinions and aspects of their identity on platforms like Twitter (Lewis et al. 2016). Similarly, hate speech against people of colour (Gayle 2018) or for their gender identity (Pearce et al. 2020) are realities of social media platforms. However, social media and online platforms also offer places for human connections, and have had beneficial effects for the social isolation brought about by measures to curb the spread of COVID-19. The jury is still out on many of the pros and cons of social media, including their propensity to spread disinformation, versus credible analysis of data and guidelines. Social media therefore hold an ambivalent place in the management of well-being. These controversial aspects of social media are not their only connections to well-being. The data we share can make them useful for well-being analy- sis. 
The most mundane aspects of our feeds, the venting of minor irritations, celebrations of small wins or just feelings shared with friends and family mean our social media accounts are full of well-being data. Think about 202 S. OMAN those ONS4 questions again (Table 4.2) that aim to gauge ‘personal well- being’. For example, they all ask you to think about how you felt yesterday overall—in terms of happiness or anxiety, as well as whether you think what you do is worthwhile, and whether you are satisfied with your life. When you think about Facebook’s most prolific posters in your timelines, for example, much of their content will indicate how they felt in similar ways at specific moments. The recent addition of emojis to Facebook means it is easier to proclaim whether you were happy, celebrating or anxious. The reminders of what you were doing this time last year or ten years ago means we are telling everyone on Facebook how we feel now, about how we were feeling in previous years. Crucially, this means it is even easier for Facebook to know this too, as you have essentially coded your own data for them. This compulsion to share how we feel means we are also sharing our data with Facebook and other platforms. These platforms are able to analyse us alongside millions of others at scale. Companies like Brandwatch monitor social media and analyse several billion emoticons each year to inform brands whether they are provoking hatred or happiness with their products. It is also possible for a broad range of actors to mine social media data, whether commercial companies, government agencies, academic research- ers or amateurs with the inclination to do so. The platforms are set up with open Application Programming Interfaces (APIs). APIs are what allow other (data mining) software to interact with social media platforms. Once access to social media data has been gained, it can be ‘scraped’ with com- parative speed with the right skills and software. 
Scraping is a process which essentially involves gathering and copying data that meet specific search terms. The data are then put into a database (which can be as crude as a spreadsheet) for later retrieval or analysis. This can be done by a person, although the term more typically refers to automated processes involving a bot or web crawler. The fact that APIs are generally open as standard indicates that these data (your data) are made available by social media platforms to be used by various different actors. Not many people think about the fact that their public post on a social media platform is public in the sense that it is no longer their private property and can be used by others in research.17

There are practical limits to what can be known through analysing people’s social media posts, of course. First, people are not neutrally representing themselves on social media. As we know, people feel compelled to publish reflections on an idealised version of their lives (Kruzan and Won 2019). Of course, our social media posts don’t always represent our lives as happier than they actually are: people often exaggerate the impact of minor negative events that are as mundane as missing the bus or being rained on. Some people collectively engage in dissatisfaction with their lot in life, leading to Twitter bubbles and what has become known as ‘the culture wars’,18 the contemporary cultural conflict between social groups. This term describes a gap between those who side with a traditional, conservative approach, and those with a liberal, progressive approach to society and social issues, such as immigration, abortion, LGBTQIA+ rights, and so on. The contemporary culture wars, as a struggle for dominance of values and beliefs, now take place on Twitter, and we might question the extent to which such rage and passion are indicative of someone’s personal well-being, or of some form of tribal rage on a larger scale.
Essentially, we are seeing how important social media can be in both distorting and shaping our well-being for better or for worse. The key to appreciating the relationship of social media, data and well-being is understanding limits and context, of both collection and use.

Social Media Data Mining in Social and Cultural Sectors

Social media data mining is not always a large-scale affair requiring APIs and special software. As found in a six-month research project with city councils and a city-based museums group in the north of England (Kennedy 2016), many small organisations use quite basic techniques to do this work. Social and cultural policy sectors are reliant on understanding well-being data, as improving well-being is at the core of what many of them do. Yet, as Chap. 1 of this book acknowledges, the sectors do not always have the skills or confidence to use data. We will look at these sectors as a whole in greater depth in the next three chapters.

The project, which explored how these smaller social and cultural organisations were already using data mining, wanted to understand how they might use it more effectively. The researchers discovered that although software packages were adopted to analyse institutional impact and engagement on Twitter, this was largely unsystematic (Kennedy 2016, 71 & 72). Keen to improve their social media data mining capacity, these organisations signed up for training in new tools that would improve their capability. However, it became clear that less data mining was happening than expected, and the capacity of workshop participants to engage with training in the new tools also fell away (Kennedy 2016, 74). Doing better with data seems a good idea, but is not always as easily resourced or incorporated into working practices as initially hoped.

Local councils, social and cultural sector organisations all have limited resources.
Despite enthusiasm for being, or becoming, data-driven, capacity to invest time and money in new tools at the organisational level is often lacking (Kennedy 2016; Oman 2019a, b). In the case of the cultural sector, there is a tendency to invest in grand schemes, new metrics and reports at policy level that claim to investigate the value of new and/or Big Data and the associated technologies required to generate or analyse them (Gilmore et al. 2018; Oman 2013a). However, when considering the (already ill-defined) cultural sector19 as a whole, differences in requirements and capacity for data technologies are obscured, and these differences are multiplied by huge variability in organisation size, type, purpose, mission and cultural offering across and within sectors (Oman 2013a). These top-down resources and contributions are not always actually used or found useful at an organisational level or across the wider sector (Oman 2013a).

Some organisations recognise that their audiences are full of people whose opinions are less easily captured by Big Data. Some people, for example, still prefer booking telephone lines to web pages and are certainly not tweeting or Instagramming their experience of a show. As such, some who attend a show are less likely to be generating data on their opinions that might then be mined. Advocates for using Big Data in small organisations acknowledge that Big Data can be ‘debilitating’ in their complexity and challenges. This is not always explored in a way that offers resolution (Oman 2013a), and as we have seen (Kennedy 2016), when recommendations, even training, are offered, there is not necessarily the capacity to take them up.

Yet, it can be very easy and fast to interact with Big Data as social media data, as long as you consider the limitations of the data and their origins, as well as how you might analyse them yourself.
Organisations and individuals do not need Big Data analytics know-how or software, although there are excellent resources freely available to help them understand how,20 as I found when I wanted to explore Twitter discussions about happiness. In 2013, Mass Observation recreated the Bolton happiness study on Twitter (see Fig. 5.3). This was still fairly experimental for them, as much as for me, when I requested access to the tweets. There were 25 responses that they captured at the time. The sample of 25 meant that, of course, I did not require data mining or sentiment analysis software, or any knowledge of APIs. In fact, I did not even need to request these tweets from Mass Observation directly, as they are still available on Twitter by searching the hashtag (or were in August 2020 when I last checked). A cursory analysis in this case simply meant reading, and noting similarities and themes, which I could have done on a piece of paper.

Fig. 5.3 Mass Observation happiness tweets

So, what did this cursory analysis tell me? Whilst 20% mentioned pets, all of which were cats (it is the internet after all), one person replied with a single word: bacon. Mainly, however, people described informal, everyday participation,21 including reading, going to gigs, watching films. There were lots of glasses of wine and some chocolate in there too. The textual content of these tweets is reproduced in Box 5.1, without Twitter handles. You might note that a surprising variety of the theories of well-being we have encountered so far in the book is present in just 25 tweets. Some map onto clear areas of social policy, others are definitely in the private domain. Some people used negative language to imply life isn’t currently great for them: ‘Day off. Smoke in peace.’ and ‘Ability for women to walk down the street & not be catcalled or threatened. Few happy women here’. Some people were philosophical, others wistful.
Some focussed on activities, others on the ‘bliss’ of doing nothing. The variety of tone and content makes for fascinating reading, but leaves these data wide open to interpretation, whether that is via human or artificial intelligence.

Box 5.1 Tweets Answering the Question: ‘What Is Happiness?’
• Beer, maps, chocolate, quizzes, the unending pursuit of knowledge
• Ability for women to walk down the street & not be catcalled or threatened. Few happy women here
• Short term happiness is different for everyone. Long term happiness is about fulfilling your potential.
• Bacon
• 5 minutes to myself and a good book, with peppermint tea and the cats curled up around me. Absolute bliss!
• Volunteering, yoga, baking, being with loved ones, reading, warm days paddling in the sea, colourful things, exploring, my cat :D
• Doing what I love (#history), a safe home by the sea, someone to love & share things with
• Good company, fireworks, being smiled at, a job well done, ‘sweet pea’ by Manfred Mann, making someone else happy, good health.
• I am happiest when discovering/learning new things, such as reading books and finding new music.
• Happiness is cooking for those I love, with a glass of wine and giggles on the side.
• Day off. Smoke in peace.
• “What is happiness?” something to do with dopamine levels
• Making things that muself [sic], and hopefully other people will enjoy
• Loving and being loved and valued for who I actually am.
• More precisely: Time, a book, a view, a friend.
• Choices and control in life not just in shopping.
• Connecting with other people, being able to make a difference to someone else, a good book and a purring cat on my lap!
• My kids
• ‘What is happiness?’ - “A warm spot on the bed in the sunshine”
• Knowing that enough is plenty
• The scent of roses on a damp morning […] being where you are without wishing to be somewhere else
• Happiness is seeing my children flourish, Swansea City FC progress & succeed & cooking for husband. In that order! ;)
• Love, health and a sense of purpose. Oh, and cake.
• What makes me happy? Cuddling up on the sofa with my partner & animals, a glass of wine, chocolate, a film & crochet- bliss
• Happiness is good relationships, a little more than enough money, satisfaction and contentment

I used these tweets as a light-hearted example, with my ever so light-touch analysis, in my first ever conference presentation in 2013. In Chap. 3, I explained that my research question at the beginning of my PhD was loosely: ‘When people describe well-being, how often do they talk about participating in different kinds of activities, and what might that tell us about aspects of social and cultural policy?’ or ‘how can qualitative data collected to understand well-being tell us how people feel about what they do?’. I noted in this presentation that state-funded cultural practices (like art galleries and museums) were less frequently mentioned by people as making them happy than what is called everyday participation (Oman 2013b). This same finding emerged from my reanalysis of the ONS free text data I used in my PhD (Oman 2017, 2020). By extension, these data (with their caveats) were another dataset to suggest we should question whether cultural funding was supporting activities that made people happier or increased their well-being.

This was not the only way of analysing these tweets to make an argument about the relationship between culture and well-being. Someone else may have counted how many of these responses included something creative and used their analysis to argue they have found the value of culture to people, thereby justifying more funding. These are debates about data and their use in politics and policy that we return to in the next chapter.
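The point that one small set of tweets can support different arguments can be sketched with a toy example. The code below takes six of the tweets from Box 5.1 and counts them against two keyword lists. Both lists (`EVERYDAY` and `CREATIVE`) and the function `count_matches` are hypothetical, invented here for illustration; they are not the coding scheme used in the analysis described above, and keyword matching is a far cruder method than reading.

```python
# Six of the tweets from Box 5.1 (text only).
TWEETS = [
    "Bacon",
    "My kids",
    "Day off. Smoke in peace.",
    "Happiness is cooking for those I love, with a glass of wine and giggles on the side.",
    "I am happiest when discovering/learning new things, such as reading books and finding new music.",
    "Volunteering, yoga, baking, being with loved ones, reading, warm days paddling in the sea, "
    "colourful things, exploring, my cat :D",
]

# Two hypothetical codebooks: each reflects one analyst's choice of keywords.
EVERYDAY = {"cooking", "reading", "baking", "wine", "cat", "kids", "bacon"}
CREATIVE = {"reading", "music", "baking", "books"}

def count_matches(tweets, keywords):
    """Count tweets that contain at least one keyword from the codebook."""
    count = 0
    for tweet in tweets:
        # Split into lower-case words, dropping surrounding punctuation.
        words = {w.strip(".,:!?").lower() for w in tweet.replace("/", " ").split()}
        if words & keywords:
            count += 1
    return count

everyday_count = count_matches(TWEETS, EVERYDAY)
creative_count = count_matches(TWEETS, CREATIVE)
print(f"Coded as everyday participation: {everyday_count} of {len(TWEETS)}")
print(f"Coded as creative activity: {creative_count} of {len(TWEETS)}")
```

The same six tweets yield two different headline figures, depending only on which words the analyst decided to look for.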
What is important here is that even with (arguably, especially with) such a small dataset we can see how human bias can interact with data and lead to different arguments. If it is difficult for humans to make categorical claims from a form of sentiment analysis that is not much more systematic or technical than reading 25 tweets, we must remember these limits when these analyses are made through machine learning. This is especially vital as time-sensitive analyses of large-scale samples of emotional expressions are being used in research on COVID-19, particularly given they are seen to have the potential to inform mental health support and help tailor risk communication to change behaviours (e.g. Pellert et al. 2020). As with all data uses mentioned in this book, it is not that using social media data, or automated sentiment analyses, is necessarily bad, but rather that their limits should be recognised. As ever, it is an issue of methodology, transparency, context and legibility.

Understanding Where People Are and How They Feel Using Twitter Data

Of course, it is not only what people say that can be mined, but also where they are. One research project attempted to gauge community well-being using Twitter data from between 27 September and 10 December 2010 (Quercia et al. 2012). Interestingly, as an aside, this coincided with the UK’s Measuring National Well-being debate, which launched in November of that year. The researchers were interested in a few things. They wanted to understand more than individuals, to measure the well-being of communities. They state their intention as moving forward the recent developments in subjective well-being measures that we discovered in the last chapter.
Rather than administering questionnaires on an individual basis, or in a national-level survey, they wanted to explore the recent possibilities of sentiment analysis to understand community well-being.

Social media data can significantly reduce the time-consuming processes that make large-scale surveys and qualitative work resource-heavy. Once these data have been ‘scraped’ and saved into a database, they can be analysed in many ways. In the case of Quercia and their co-authors, they were interested in the idea of using sentiment analysis to see if it could interpret community well-being. They used a sentiment metric which was originally derived from studying Facebook status updates (Kramer 2010). This metric standardised the difference between the percentage of positive and negative words in a Facebook user’s posts in one day. Kramer used the metric to make arguments at a national level, aiming to develop, as he suggests in the title of his paper, ‘An Unobtrusive Behavioral Model of “Gross National Happiness”’. His new standardised metric was found to correlate with self-reported life satisfaction.

Looking at the US specifically, peaks were found in life satisfaction that correlated with national and cultural holidays. This is fine in and of itself, but what does that tell us about well-being? Christmas is good for well-being? Other research indicates otherwise (Holmes and Rahe 1967; Mutz 2016), suggesting it can cause feelings of stress for various reasons: financial, family, and so on. What about the days either side when people are travelling huge distances (with everyone else) using transport infrastructure which is not fit for purpose? Or the excesses of consumption that holidays like Christmas involve, as well as their impact on the planet? What about all those who do not celebrate Christmas, as they are not of a Christian denomination? In his limitations, the author acknowledges that there is a possibility that the likelihood to wish people ‘Happy Christmas’ could have affected these results. However, he decided not to control for this, as wishing someone happy holidays is a positive sentiment. We might wonder, then, whether this study was really interested in the possibilities for understanding the human experience using the details of the Facebook posts, or whether it was interested in deriving a metric that was comparable with more established methods.

Returning to the study on community well-being, the authors state, ‘it is not clear whether the correspondence between sentiment of self-reported text and well-being would hold at community level, that is, whether sentiment expressed by community residents on social media reflects community socio-economic well-being’ (Quercia et al. 2012, 965). Therefore, they do note some of the limitations of using this approach to answer their research question. However, notably, they do not acknowledge some of the limitations of the metric itself.

London was chosen for the study to understand communities, socio-economics and well-being. Let’s break down what they did and how. The study used four types of data gathering. It:

1. ‘Crawled’ Twitter accounts whose user-specified locations report London neighbourhoods.
2. Geo-referenced the Twitter accounts by converting their locations into longitude-latitude.
3. Measured socio-economic prosperity, using the UK’s IMD.22
4. Conducted sentiment analysis on tweets between particular dates from their sample.

How did these processes work?

1. How the crawl worked: the researchers chose three popular London-based profiles of news outlets: the free newspaper The Metro, which was available in London on the Tube at the time (it has since expanded), a right-wing tabloid The Sun and the centre-left newspaper The Independent.
These media were chosen because they are thought to capture different demographics of class and politics. Using these three accounts as ‘seeds’, they used ‘a crawler’ to trace linked accounts. Crawlers are software that allows you to gather various kinds of available data based on who interacts with a particular website or Twitter account. In this instance, every user following these accounts was ‘crawled’.

2. Some Twitter users stated where they live in their profiles. The crawl found that 157k of 250k profiles listed a location, of which 1323 accounts specified London neighbourhoods. The researchers then filtered out likely bots by also ‘crawling’ using another metric23 for each profile. This brought the sample down to 573 profiles. Once these were established, locations were converted into longitude-latitude pairs, translating these data into geographical co-ordinates which are easier to work with.

3. The IMD is broken into 32,482 areas, 78 of which are within the boundaries of London used by the authors (these are not necessarily fixed). The IMD offered a score for each of London’s 78 census areas. The authors use a census area to represent ‘a community’. We shall return to this key point in a bit, but hold that thought. The data come from the ONS’ Census and form an objective list of sorts: income, employment, education, health, crime, housing, and environmental quality. It is worth noting that in the IMD, the ONS talk about ‘Lower Layer Super Output Areas’ (LSOAs), rather than communities.

4. Sentiment analysis was undertaken on the tweets using two algorithms: (1) Kramer’s metric, described above, and (2) something called a ‘Maximum Entropy classifier’, which uses machine learning. The algorithm in Kramer’s metric has a limited dictionary, so this second machine learning package was used to improve on the first, by using a training dataset of tweets with smiley and frown-y faces.
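A dictionary-based sentiment metric of the kind described in step 4 can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the published method: the word lists, profile texts and area labels are invented, and the form of standardisation used here (the z-score of the positive-word percentage minus the z-score of the negative-word percentage, taken across profiles) is an assumption made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical mini word lists; real dictionaries are far larger.
POSITIVE = {"happy", "love", "great", "bliss"}
NEGATIVE = {"sad", "hate", "awful", "stuck"}

def word_percentages(text):
    """Return (% positive words, % negative words) for one profile's text."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0, 0.0
    pos = 100 * sum(w in POSITIVE for w in words) / len(words)
    neg = 100 * sum(w in NEGATIVE for w in words) / len(words)
    return pos, neg

def standardised_scores(profiles):
    """Score each profile as z(positive %) minus z(negative %)."""
    pcts = {name: word_percentages(text) for name, text in profiles.items()}
    pos_vals = [p for p, _ in pcts.values()]
    neg_vals = [n for _, n in pcts.values()]
    mu_p, mu_n = mean(pos_vals), mean(neg_vals)
    # Fall back to 1.0 to avoid dividing by zero when every profile is identical.
    sd_p = pstdev(pos_vals) or 1.0
    sd_n = pstdev(neg_vals) or 1.0
    return {name: (p - mu_p) / sd_p - (n - mu_n) / sd_n for name, (p, n) in pcts.items()}

def area_means(scores, areas):
    """Average the profile scores within each area."""
    grouped = {}
    for name, score in scores.items():
        grouped.setdefault(areas[name], []).append(score)
    return {area: mean(vals) for area, vals in grouped.items()}

# Invented profiles and area labels, for illustration only.
PROFILES = {
    "a": "Happy day, love this great park",
    "b": "Stuck on the tube again, awful",
    "c": "Great gig but sad to leave",
}
AREAS = {"a": "Area 1", "b": "Area 2", "c": "Area 1"}

scores = standardised_scores(PROFILES)
by_area = area_means(scores, AREAS)
print(by_area)
```

Even in this toy version, the score depends entirely on which words appear in the dictionary: a profile full of irony, or of feelings expressed in words the lists do not contain, would be scored as neutral.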
The authors argue that the results across the two algorithms correlate and are accurate. They then measured the sentiment expressed by a profile’s tweets and computed, for each region, an aggregate sentiment measure of all the profiles in the region.

Findings: So what did they find? Through studying the relationship between sentiment and socio-economic well-being they found that ‘the higher the normalised sentiment score of a community’s tweets, the higher the community’s socio-economic well-being’. In other words, the sentiment metric accounted for positive and negative sentiments, enabling each area’s aggregated data to show an average score. This tended to correlate with the scale that they used to indicate poverty and prosperity in that locale (the IMD).

Limitations: What did the authors identify as limitations?

Demographic bias: Twitter users are certain types of people; therefore, these findings will over-represent the happiness of Twitter users, missing out on non-users.

Causality: our old friend. Though the causal direction is difficult to determine from observational data, one could repeatedly crawl Twitter over multiple time intervals, and use a cross-lag analysis to observe potentially causal relationships.

Sentiment: they tracked sentiment but not ‘what actually makes communities happy’ (Quercia et al. 2012, 968). The intention was to compare topics across communities. Their example:

given two communities, one talking about yoga and organic food, and the other talking about gangs and junk food, what can be said about their levels of social deprivation? The hope is that topical analysis will answer this kind of question and, in so doing, assist policy makers in making informed choices regarding, for example, urban planning. (Quercia et al. 2012, 968)

As evidenced by the possibilities for making an argument using the crude analysis of the Mass Observation tweets, and as suggested by the citation directly above, there is bias in the ways that Big Data can be used to inform social and cultural policy. However, this is not necessarily any more the case in these examples than in those using more traditional data sources explored earlier in the book. The ways our social worlds are ordered do not reside in the algorithms, but in the preconceptions, laziness and judgements which become reproduced through researchers and their research and through policy-makers and their decisions. While the Quercia et al. examples were presented as a binary of opposites for narrative effect, the ridiculousness of the proposition may not stop it coming into effect as a deductive study in future. The fact that gangs are unlikely to tweet about gangs is one thing. Furthermore, the idea that these gangs remain within their ONS-allocated geographical boundaries called LSOAs is also a nonsense.

This brings me to another point: LSOAs are not communities, not in the way that we think of community well-being as built on social relations and inter-related lives. People are not only active citizens where they live, and in a city like London especially, may actually be more likely to be active citizens where they work. Without the context of understanding London, what it is to live in London, and the complex, overlaid communities and social groups that comprise a postcode, this idea of community well-being is a misnomer. Instead, it matches one index that uses census data, which, while valuable, can be out of date, and is well-known for its various limitations as a metric of socio-economic deprivation or advantage.

Perhaps another way to look at a question of community well-being might be to look at people interacting in public space. Plunz et al. (2019) also used sentiment analysis with geo-located Twitter data.
They were interested in finding well-being indicators associated with urban park space. Their goal was to assess whether tweets generated in parks express a more positive sentiment than tweets generated in other places in New York City. Their results suggest that tweets in Manhattan are different from those in other NYC boroughs. In Manhattan, people’s tweets were more positive outside of parks than inside, whereas the opposite was true outside of Manhattan. They concluded that Twitter data could still be useful for aspects of social policy, including urban design and planning. They also note that one of the limitations of geo-located Twitter data is that GPS is less accurate than sometimes accounted for. It also does not account for elevation, so you could be on the metro underneath Central Park, or indeed, stuck in traffic alongside it. It is hard to establish whether people have gone for a walk to let off steam, or are commuting to work, for example.

The relationships between where we are standing or where we live and our well-being are not new, but are a feature of much philosophy on the nature of subjective experience, especially since the Enlightenment (which we shall come to in the next chapter). Big Data offer new ways to test what we know about place. However, these data and devices also make assumptions about place and experience (Wilmott 2016). The expectations and suppositions of what happens where, for whom and how drive these analyses with the same bias as other Big Data technologies, and we must be aware of the limitations of these data, technologies and the ideas of well-being they claim to measure. We also need to be vigilant about who holds the data and why they are analysing them.

5.6 Fit for Purpose? Health and Well-being Tracking and Apps

Recent technological developments have seen a rise in people using wearable technologies and their mobile phones to track their movements and behaviour.
These include: periods of activity, menstruation, what they have eaten, how they have slept, how far they have walked and their heart rate, in order to gain an overall picture of their health and general well-being. These practices are frequently called the Quantified Self movement (Ruckenstein and Pantzar 2017), which refers both to the cultural phenomenon of self-tracking using one’s own data, as well as the community of people who use and share data in this way.

The technologies are increasingly popular and are being discussed as cost-savers for the NHS, but there are barriers to their use (Jee 2016). Around five years ago, 85% of the general population did not own wearable devices (Lee et al. 2016). Therefore, measures which use datasets from these technologies will only account for a proportion of the population, who are most likely to be younger and more affluent (Strain et al. 2019) and already demonstrating an investment in their current and future well-being by owning such a device in the first place. We also do not yet fully understand the impact of COVID-19 on wearable devices and app use, as at the beginning of the crisis there were stories about governments using these data to monitor compliance with lockdown measures (Digital Initiatives 2020). YouGov polling data24 indicate that even in July 2020, 65% of the UK had still never owned a wearable device, with 22% currently using one (with everyone else having tried one, or owned one but not currently using one). However, the same YouGov data indicate that usage had increased from 22% to 27% by January 2021, and the proportion who have never owned a device has decreased at a similar rate. The period of COVID-19 has therefore seen an increase in the use of wearable technology, as people take an interest in their well-being data in new ways.
Self-tracking, or the practice of generating or capturing data about everyday activities like eating and exercise for the purposes of self-improvement, puts data and control in the hands of people, as well as the corporations which produce self-tracking devices and the third parties with which these data are shared (Kennedy et al. 2020). The research is ambivalent as to whether the experience of self-tracking has positive benefits, such as perception of control, agency or, in the case of professional or amateur sporting, opportunities for new communities (Ajana 2017; Lupton 2019; Pink and Fors 2017). It is also thought that these practices in and of themselves, and in their relationship to control, may decrease well-being more generally (Kennedy et al. 2020).

Data collected via mobile phone apps present similar possibilities for community and compromise. Smartphone access and usage only account for certain sections of a national demographic, much like wearable devices. Similarly, people who download an app to better understand their well-being are already self-selecting as wanting to improve their well-being, and therefore may not be considered a representative sample. A number of apps in the early 2010s wanted to further develop the insights gained from better understanding subjective well-being measurement.

In 2012, experts in geography and the lived environment based at the London School of Economics created a mobile phone app to understand happiness (MacKerron and Mourato 2013). Branded a ‘hedonimeter’ (after the nineteenth-century invention we discovered in Chap. 2), the ‘Mappiness’ app asked people to allow the app to collect objective data about where they were (automatically, using GPS data), what activity they were doing, and who they were with (as manual entries). It also asked them to provide hedonic responses (subjective well-being data) as to how awake, happy and relaxed they were. These data were collected using sliders instead of the more traditional scales we have previously encountered. The data collected by the app were used in a number of different ways to appreciate subjective well-being and we will touch on a couple here.

In 2015, a report which drew on these data was published. ‘Cultural Activities, Artforms and Wellbeing’ reported on research commissioned by Arts Council England (ACE). The authors evaluated the hedonic readings of various activities found in the data collected by the app (Fujiwara and MacKerron 2015). Table 5.4 shows what the authors describe as ‘happiness activities rankings’, with theatre, dance and concert appearing to have the highest effect, and reading the lowest, unless you incorporate other ‘everyday participation’ activities, such as TV watching. As you can see, housework, chores and DIY are negatively associated with happiness.

Table 5.4 ‘Happiness activities(a) rankings’

Activities                          Coefficient
Theatre, dance, concert             8.735***
Singing, performing                 7.731***
Exhibition, museum, library         7.457***
Hobbies, arts, crafts               5.737***
Talking, chatting, socialising      3.789***
Drinking alcohol                    3.646***
Listening to music                  3.518***
Childcare, playing with children    2.888***
Reading                             2.331***
Watching TV, film                   2.084***
Housework, chores, DIY              −0.651***

Source: Fujiwara and MacKerron (2015)
(a) The table shows coefficients, rather than rankings. Compared with the baselines, these coefficients report how much happier participants reported being when participating in these activities on a scale, when relevant variables have been controlled for. The coefficient shows the size of the impact on happiness from doing the activity (where happiness is measured on a scale of 0-100). All variables were statistically significant.

Other studies cited in this report indicate that theatre has less of an effect on life satisfaction, whereas reading fares much better (Leadbetter et al. 2013).
As we encountered in Chap. 4, there are conceptual differences between life satisfaction and happiness, and common sense might tell us that reading and attending a theatre performance present different kinds of well-being experiences. Yet, seeing that reading looks quite bad for well-being is surprising at first glance. Elsewhere in the report are regression tables25 for other activities, including birdwatching, gardening, and hunting and fishing, which score significantly better than watching a film or, indeed, poor old reading, which doesn’t win on these happiness scales.

Interestingly, when you go back to the Twitter data answering the question ‘what is happiness?’ (Box 5.1), there were many responses that mentioned reading, curling up on the sofa and watching a film, and so on. While the limited sample of the Twitter data makes it impossible to generalise, it certainly still poses questions as to what is going on with confounding results in various happiness data.

One thing that struck me returning to these cases in 2020, a world changed by COVID-19, is the difference between activities in the home and outside the home. Interestingly, the app’s inventors co-authored an academic article for the journal Global Environmental Change. Using the same data, they found that outdoor activities were better for well-being. They state:

[T]he predicted happiness of a person who is outdoors (+2.32), birdwatching (+4.32) with friends (+4.38), in heathland (+2.71), on a hot (+5.13) and sunny (+0.46) Sunday early afternoon (+4.30) is approximately 26 scale points (or 1.2 standard deviations) higher than that of someone who is commuting (−2.03), on his or her own, in a city, in a vehicle, on a cold, grey, early weekday morning. Equivalently, this is a difference of about the same size as between being ill in bed (−19.65) vs doing physical exercise (+6.51), keeping all other factors the same.
(MacKerron and Mourato 2013, 997)

The numbers in the brackets refer to ‘the scale points’, showing the increase in probable happiness by where people are, what day of the week it is and what time of day it is. Interestingly, the greener the space you are in and the hotter the day, the better (even if sunniness seems less important than you might expect). While this may appear to be common sense in one way, when you think back to how policy relies on evidence to improve well-being, what are the policy messages here from an investment point of view?

I had this app for a while and my results always told me that I was happiest in a pub beer garden with my best friends. Did I know that the data I was ploughing in when the app beeped me to do so were potentially going to be used to inform policy-making? Well, yes, of course, I guessed that, because I was researching well-being data and policy, which was why I downloaded the app in the first place. But did most people who were interested in how they felt doing certain things imagine the contexts of their data’s potential future use? What policy decisions should be made about beer gardens off the back of my interactions with some sliders on a mobile phone app after a few ciders on a summer’s day? While these data were collected at a scale that means my personal data and my interactions are no longer visible on an individual level, it does pose questions for some of the correlations we make with these data. Are people happier on a weekend because they are not working or because they can go to the pub?

5.7 Conclusion

Despite the conflicting evidence from different approaches to ‘Big Data’, people are keen to find new ways to harness them to answer the age-old policy and philosophy questions around people’s well-being. The increase in well-being research coincides with an increase in research with and on Big Data.
Both have possibilities and challenges, but could the challenges be exacerbated by combining well-being research with these data practices? Do Big Data have a capacity for good when making decisions about young people’s exam grades or whether someone is eligible for social housing? We reflected on some important examples of where this went awry in this chapter. New methods and metrics using Big Data, and indeed the research going into developing new tools to harness them, are not necessarily being checked for rigour before the approach is used elsewhere, as was the case with the Twitter community study and its use of the sentiment metrics. Generalising people’s happiness based on mobile phone data has its limitations. We cannot necessarily be entirely sure of whether it is the aesthetic grandeur of an old Victorian bandstand in the park, whether there is a classical concert inside, if you had enough sleep, whether you are picnicking with your favourite friends, with your kids, or having time away from your kids; indeed, whether you are stuck on a delayed tube underneath the park, or are walking in a hailstorm, that truly adds to (or detracts from) your momentary happiness.

The ethics of studying Big Data more broadly should be considered, as should the behaviours of those who fall outside the sample of users of wearable tech or smartphones, especially as these people may be older or poorer, for example, characteristics which we know intersect with well-being in very significant ways. Despite this, claims are still made that findings from these studies could be used to inform policy and investment. While they can offer some insights, we must be mindful of their limits and, crucially, of their implications, especially in different contexts.
All in all, Big Data and new technologies, whilst not always revolutionary in kind, can offer insights into well-being that are useful for policy-makers on a national scale, in international pandemics and for people who simply want to see what people think. But they are not without their limits, nor are they a magic bullet for the issues we have with existing data. If anything, they are also shown to have the potential to exacerbate existing problems as much as investigate solutions.

The capacity for Big Data to embrace complexity, and at greater speed, means they present new opportunities to analyse health data and, crucially, how health intersects with social concerns. Reflecting back from today on how crude the Google Flu Trends analysis in 2013 now seems, it is clear that Big Data technologies and techniques are improving at pace. The COVID-19 example, BlueDot, shows that the value of Big Data analyses is in their capacity to now cope with more of Big Data’s qualities at the same time, and in fact, to harness them: their messiness, variability, size and the capacity to link previously unconnected data sources from farming information to flight sales. The value was in the variety of data and sources used. Yet harnessing the power of Big Data was not enough to prevent a worldwide crisis, despite the grand claims.

What we think of as ‘Big Data’ offer a peculiar perspective on ‘well-being’. Consider the different things they capture, from sleep patterns to elite cycle trails to facial recognition and how many steps your walk to the post office takes. These devices exist to capture and produce data because data can be useful and commercialised. We are not even clear on whether more knowledge of the self is good for well-being or bad (yet?), let alone whether it is good at scale, when governments (and who else?) know more about us. What is clear is that data are producing and changing culture and society, as much as they are capturing it.
We need to ask questions around the commercial value of these data practices alongside social justice issues. How might these data have had a greater chance of improving well-being, had the contexts in which they were analysed been different? Who should be included in these discussions, and who is excluded? Ultimately, how will decisions and trade-offs be made between the commercial and social justice dimensions?

Notes

1. In fact, what a lot of people refer to as Big Data are not ‘Big’ at all by the initial standards of definition. They are just large datasets, or newer types of data in datasets that are not even large, and so arguably not Big at all.
2. Kitchin and McArdle’s (2016) original table says ‘Limited to wide’ here (p. 2), but I think this makes more sense as ‘Limited in width’, or narrow.
3. A digital sociologist is interested in understanding the use of digital media (often data) as part of everyday life, and how these various technologies contribute to patterns of human behaviour, identity, relationships and social change.
4. O’Neil describes how the bottom-scoring 2–5% of teachers were fired. Yet, the modelled target student scores and small classrooms made the scoring of teachers little better than random, and there was almost no correlation in a teacher’s scores from one year to the next. Qualitative data called one of the sacked teachers ‘one of the best teachers I’ve ever come into contact with’ (O’Neil 2016, 4).
5. Critical Data Studies are moving for more fairness, accountability and transparency in data practices. Please see the FAccT conference for more on this: https://facctconference.org/.
6. This is largely credited to the 2017 article in the Economist, ‘The world’s most valuable resource is no longer oil, but data’ (The Economist 2017).
7. With the exceptions of 1941 (during World War II) and Ireland in 1921.
8. Although, of course, given what we have seen elsewhere in the book, we might question whether the changing possibilities for what data could describe changed policy, rather than the other way around.
9. There were a number of iterations of Mass Observation, with different people initiating them, but these were the original founding members.
10. There were no women observing anything in those days, of course.
11. See Mass Observation (n.d.) website for more on the data available and how to access them.
12. Several new methodologies are emerging that propose new possibilities for well-being measurement through combining new data sources with the survey data we have explored in previous chapters (Bellet and Frijters 2019; Daas et al. 2013; Jahani et al. 2017). These are not only hoping to understand well-being as personal or subjective experience, but to change the way that social justice issues such as poverty are approached (Blumenstock 2016). International organisations such as the United Nations are supporting this kind of work, although primarily focussing on patterns of ‘health and well-being’ (United Nations 2014, 2015).
13. More information is available on REACT’s data collection and management here: https://www.ipsos.com/ipsos-mori/en-uk/covid-19-swab-test-faqs#nameaddress.
14. REACT was commissioned by the Department of Health and Social Care (DHSC) and is being carried out by Imperial College London in partnership with Ipsos MORI and Imperial College Healthcare NHS Trust. https://www.imperial.ac.uk/medicine/research-and-impact/groups/react-study/.
15. A review of literature on data and data practices, Kennedy et al. (2020), found that tech and policy were considered different worlds when it comes to data practices, and with different aims, although that is evolving.
16. See Internet Live Stats, ‘Google search statistics’ (Internet Live Stats n.d.). Internet Live Stats offer plenty more up-to-date data on data, if you are interested.
17. For the ethical concerns regarding social media research, see Townsend and Wallace (2016).
18. See Davies (2018) for a discussion on the greater implications of ‘the culture wars’ for politics and community.
19. If you are reading this chapter a while after reading the previous ones, then the cultural sector is a broad description of cultural institutions like libraries, heritage sites, museums, theatres and so on. Crucially, it is not only about the buildings themselves, but all the ways people make and consume culture and can include Netflix and outdoor festivals. In the UK, the cultural sector includes organisations funded by public subsidy as well as commercial organisations.
20. This post from Wasim Ahmed (2019) offers a clearly presented overview of the kinds of analyses available using different software: https://blogs.lse.ac.uk/impactofsocialsciences/2019/06/18/using-twitter-as-a-data-source-an-overview-of-social-media-research-tools-2019/.
21. ‘Everyday participation’ (Miles and Sullivan 2010) has come to mean the everyday activities we participate in, which tend to fall outside of formal subsidy, which tends to fund ‘the arts’.
22. IMD is the UK government’s Index of Multiple Deprivation.
23. This is called the PeerIndex realness score. This score is generated using information such as whether the profile has been self-certified on the PeerIndex site and/or has been linked to Facebook or LinkedIn. ‘PeerIndex realness score is a metric that indicates the likelihood that the profile is of a real person, rather than a spambot or twitter feed. A score above 50 means this account is of a real person, a score below 50 means it is less likely to be a real person’ (http://www.peerindex.net/help/scores).
24. See YouGov (n.d.) ‘Brits use of wearable device’.
25. A regression table like the one reproduced in Table 5.4 will mainly be concerned with communicating the degree of association between variables. Chapters 7 and 8 go into this in far greater detail. The coefficients in Table 5.4 are expressed in points on the 0-100 happiness scale, and the way this table has been presented shows simplified detail. Ordinarily there is additional information to show not only the degree of association, but how sure we can be that this is a correct estimate. There will always be a degree of error that has to be accounted for. Typically in a regression table, you will find asterisks, as in Table 5.4. Asterisks in a regression table indicate the level of the statistical significance of a regression coefficient.

References

Ada Lovelace Institute. 2019. Beyond Face Value: Public Attitudes to Facial Recognition Technology. Accessed 28 April 2021. https://www.adalovelaceinstitute.org/report/beyond-face-value-public-attitudes-to-facial-recognition-technology/.
Ahmed, W. 2019. Using Twitter as a Data Source: An Overview of Social Media Research Tools (2019). Impact of Social Sciences. Accessed 28 April 2021. https://blogs.lse.ac.uk/impactofsocialsciences/2019/06/18/using-twitter-as-a-data-source-an-overview-of-social-media-research-tools-2019/.
Ajana, B. 2017. Self-Tracking: Empirical and Philosophical Investigations. Springer International Publishing. https://doi.org/10.1007/978-3-319-65379-2.
Albert, A. 2019. Citizen Social Science: A Critical Investigation. PhD thesis. University of Manchester. https://www.escholar.manchester.ac.uk/api/datastream?publicationPid=uk-ac-man-scw:319481&datastreamId=FULL-TEXT.PDF.
Avila, R. 2019. Fixing Digital Democracy? The Future of Data-Driven Political Campaigning. openDemocracy. Accessed 28 April 2021. https://www.opendemocracy.net/en/fixing-digital-democracy-future-of-data-driven-political-campaigning/.
Bache, I., and Reardon, L. 2013. An Idea Whose Time has Come? Explaining the Rise of Well-Being in British Politics. Political Studies, 61(4), 898–914. https://doi.org/10.1111/1467-9248.12001.
Bates, J. 2016. Towards a Critical Data Science: The Complicated Relationship Between Data and the Democratic Project. Impact of Social Sciences. https://blogs.lse.ac.uk/impactofsocialsciences/2016/01/12/towards-a-critical-data-science-data-and-the-democratic-project/.
Bates, J., Lin, Y.-W., and Goodale, P. 2016. Data Journeys: Capturing the Socio-material Constitution of Data Objects and Flows. Big Data & Society 3 (2): p. 2053951716654502. https://doi.org/10.1177/2053951716654502.
BBC. 2019. London Mayor Quizzes King’s Cross Developer on Facial Recognition. BBC News, 14 August. Accessed 29 April 2021. https://www.bbc.com/news/technology-49343822.
BBC. 2020. Coronavirus: UK to Have Test, Track and Trace System by June. BBC News. Accessed 28 April 2021. https://www.bbc.co.uk/news/av/uk-politics-52745202.
Bellet, C. and Frijters, P. 2019. Big Data and Well-being, p. 26. Accessed 28 April 2021. https://worldhappiness.report/ed/2019/big-data-and-well-being/.
Benjamin, R. 2019. Race After Technology: Abolitionist Tools for the New Jim Code. Medford, MA: Polity.
Boyd, D., and Crawford, K. 2012. Critical Questions for Big Data. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878.
BlueDot. n.d. BlueDot | Who We Are, BlueDot. Accessed 2 May 2021. https://bluedot.global/team/.
Blumenstock, J.E. 2016. Fighting Poverty with Data. Science 353 (6301): 753–754. https://doi.org/10.1126/science.aah5217.
Campbell, D. 2017. Facebook and Twitter ‘Harm Young People’s Mental Health’. The Guardian. Accessed 28 April 2021. http://www.theguardian.com/society/2017/may/19/popular-social-media-sites-harm-young-peoples-mental-health.
Carroll, C., J.C. Fuhrer, and D.W. Wilcox. 1994. Does Consumer Sentiment Forecast Household Spending? If So, Why? The American Economic Review 84 (5): 1397–1408.
Charrington, S. 2020. How AI Predicted the Coronavirus Outbreak with Kamran Khan - #350. Accessed 28 April 2021. https://www.youtube.com/watch?v=V6BpKSGquRw.
Coughlin, T. 2018. 175 Zettabytes By 2025. Forbes. Accessed 29 March 2021. https://www.forbes.com/sites/tomcoughlin/2018/11/27/175-zettabytes-by-2025/.
Cryle, P.M., and E. Stephens. 2017. Normality: A Critical Genealogy. Chicago: The University of Chicago Press.
Daas, P. J. et al. 2013. Big Data and Official Statistics. In Proceedings of the NTTS. New Techniques and Technologies for Statistics, pp. 5–7.
Davies, B., Innes, M. and Dawson, A. 2018. An Evaluation of South Wales Police’s Use of Automated Facial Recognition. Cardiff: Crime and Security Research Institute, p. 46. https://www.statewatch.org/media/documents/news/2018/nov/uk-south-wales-police-facial-recognition-cardiff-uni-eval-11-18.pdf.
Davies, W. 2015. The Happiness Industry: How The Government and Big Business Sold Us Well-Being. London: Verso.
Davies, W. 2018. Nervous States: How Feeling Took Over the World. Jonathan Cape.
Dencik, L. 2020. The Datafied Welfare State: A Perspective from the UK, 24. Cardiff: Cardiff University. https://datajusticeproject.net/wp-content/uploads/sites/30/2020/09/The-Datafied-Welfare-State_draft.pdf.
Denham, E. 2019. Statement: Live facial recognition technology in King’s Cross. ICO. Accessed 19 August 2019. https://ico.org.uk/about-the-ico/news-and-events/news-and-blogs/2019/08/statement-live-facial-recognition-technology-in-kings-cross/.
Digital Initiatives. 2020. Strava: Striving in the Time of Corona? Digital Innovation and Transformation. Accessed 28 April 2021. https://digital.hbs.edu/platform-digit/submission/strava-striving-in-the-time-of-corona/.
Dodge, M., and Kitchin, R. 2005. Codes of Life: Identification Codes and the Machine-Readable World. Environment and Planning D: Society and Space, 23(6), 851–881. https://doi.org/10.1068/d378t.
Eubanks, V. 2018. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. St. Martin’s Publishing Group.
Fujiwara, D., and G. MacKerron. 2015. Cultural Activities, Artforms and Wellbeing. London: Arts Council England.
Fussey, P., and D. Murray. 2019. London-Met-Police-Trial-of-Facial-Recognition-Tech-Report.pdf. Essex: University of Essex, p. 128. Accessed 28 April 2021. https://48ba3m4eh2bf2sksp43rq8kk-wpengine.netdna-ssl.com/wp-content/uploads/2019/07/London-Met-Police-Trial-of-Facial-Recognition-Tech-Report.pdf.
Gayle, D. 2018. Diane Abbott: Twitter Has ‘Put Racists into Overdrive’. The Guardian. Accessed 28 April 2021. https://www.theguardian.com/politics/2018/dec/18/diane-abbott-calls-for-twitter-to-clamp-down-on-hate-speech.
Gilmore, A., Kostas, A., and Albert, A. 2018. ‘Never Mind the Quality, Feel the Width’: Big Data for Quality and Performance Evaluation in the Arts and Cultural Sector and the Case of ‘Culture Metrics’. In G. Schiuma and D. Carlucci (Eds.), Big Data in the Arts and Humanities: Theory and Practice. Boca Raton: Taylor and Francis.
Hacking, I. 1990. The Taming of Chance. Cambridge: Cambridge University Press.
Hacking, I. 1991. How Should We Do the History of Statistics? In The Foucault Effect: Studies in Governmentality, ed. G. Burchell, C. Gordon, and P. Miller. Chicago: The University of Chicago Press.
Harford, T. 2017. How the World’s First Accountants Counted on Cuneiform. BBC News. Accessed 28 April 2021. https://www.bbc.co.uk/news/business-39870485.
Heaven, W.D. 2020. AI Could Help with the Next Pandemic - But Not with This One, MIT Technology Review. Accessed 2 May 2021. https://www.technologyreview.com/2020/03/12/905352/ai-could-help-with-the-next-pandemicbut-not-with-this-one/.
Hill, K., and A. Krolik. 2019. How Photos of Your Kids are Powering Surveillance Technology. The New York Times. Accessed 28 April 2021. https://www.nytimes.com/interactive/2019/10/11/technology/flickr-facial-recognition.html.
Hintz, A., and J. Brand. n.d. Data Policies: Approaches for Data-Driven Platforms in the UK and EU. Cardiff: Data Justice Lab, p. 30. https://datajustice.files.wordpress.com/2020/01/data-policies-research-report-revised.pdf.
Holmes, T.H., and R.H. Rahe. 1967. The Social Readjustment Rating Scale. Journal of Psychosomatic Research 11 (2): 213–218. https://doi.org/10.1016/0022-3999(67)90010-4.
Internet Live Stats. n.d. Google Search Statistics - Internet Live Stats. Accessed 28 April 2021. https://www.internetlivestats.com/google-search-statistics/.
Jahani, E., et al. 2017. Improving Official Statistics in Emerging Markets Using Machine Learning and Mobile Phone Data. EPJ Data Science 6 (1): 1–21. https://doi.org/10.1140/epjds/s13688-017-0099-3.
Jee, C. 2016. Wearable Tech: Could It Save the NHS?, Techworld. Accessed 15 September 2016. http://www.techworld.com/wearables/could-wearables-save-nhs-3621960/.
Kennedy, H. 2016. Post, Mine, Repeat: Social Media Data Mining Becomes Ordinary. New York; Secaucus: Palgrave Macmillan UK. https://doi.org/10.1057/978-1-137-35398-6.
Kennedy, H., Oman, S., Taylor, M., Bates, J., and Steedman, R. 2020. Public Understanding and Perceptions of Data Practices: A Review of Existing Research. Sheffield: The University of Sheffield. https://livingwithdata.org/project/wp-content/uploads/2020/05/living-with-data-2020-review-of-existing-research.pdf.
Kitchin, R. 2014. The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. SAGE.
Kitchin, R., and G. McArdle. 2016. What Makes Big Data, Big Data? Exploring the Ontological Characteristics of 26 Datasets. Big Data & Society 3 (1): p. 2053951716631130. https://doi.org/10.1177/2053951716631130.
Kramer, A.D.I. 2010. An Unobtrusive Behavioral Model Of ‘Gross National Happiness’. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI 10, 287–290. Atlanta: Association for Computing Machinery.
Kruzan, K.P., and A.S. Won. 2019. Embodied Well-Being Through Two Media Technologies: Virtual Reality and Social Media. New Media & Society 21 (8): 1734–1749. https://doi.org/10.1177/1461444819829873.
Laney, D. 2001. 3D data management: Controlling data volume, velocity and variety. Meta Group. Accessed 16 January 2013. http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf.
Lazer, D., et al. 2014. The Parable of Google Flu: Traps in Big Data Analysis. Science 343 (6176): 1203–1205. https://doi.org/10.1126/science.1248506.
Lazer, D., and R. Kennedy. 2015. What We Can Learn from the Epic Failure of Google Flu Trends. Wired. Accessed 28 April 2021. https://www.wired.com/2015/10/can-learn-epic-failure-google-flu-trends/.
Leadbetter, C., O’Connor, N., and Commonwealth Games, Culture & Sport Analysis Scottish Government. 2013. Healthy Attendance? The Impact of Cultural Engagement and Sports Participation on Health and Satisfaction with Life in Scotland. Scotland: The Scottish Government. Accessed 17 May 2021. https://www.gov.scot/publications/healthy-attendance-impact-cultural-engagement-sports-participation-health-satisfaction-life-scotland/.
Lee, L. et al. 2016. Information Disclosure Concerns in The Age of Wearable Computing. In Proceedings 2016 Workshop on Usable Security. Workshop on Usable Security, San Diego, CA: Internet Society. https://doi.org/10.14722/usec.2016.23006.
Lewis, R., M. Rowe, and C. Wiper. 2016. Online Abuse of Feminists as An Emerging form of Violence Against Women and Girls. The British Journal of Criminology 57 (6): 1462–1481. https://doi.org/10.1093/bjc/azw073.
Living with data. n.d. Living with Data. https://livingwithdata.org/.
Lupton, D. 2019. Data Mattering and Self-Tracking: What Can Personal Data Do? Continuum 34 (1): 1–13. https://doi.org/10.1080/10304312.2019.1691149.
MacKerron, G., and S. Mourato. 2013. Happiness is Greater in Natural Environments. Global Environmental Change 23 (5): 992–1000. https://doi.org/10.1016/j.gloenvcha.2013.03.010.
Madge, C., and T.H. Harrisson. 1937. Mass Observation. London: Frederick Muller Ltd.
Marr, B. 2014. Big Data: The 5 vs everyone must know. Accessed 4 September 2015. https://www.linkedin.com/pulse/20140306073407-64875646-big-data-the-5-vs-everyone-must-know.
Mass Observation. n.d. Mass Observation. http://www.massobs.org.uk.
Matsakis, L. 2019. The WIRED Guide to Your Personal Data (and Who Is Using It). Wired. Accessed 28 April 2021. https://www.wired.com/story/wired-guide-personal-data-collection/.
Mayer-Schönberger, V., and K. Cukier. 2013. Big Data: A Revolution that Will Transform how We Live, Work, and Think. London: John Murray.
Marz, N. and Warren, J. 2012. Big Data: Principles and Best Practices of Scalable Realtime Data Systems. MEAP edition. Westhampton, NJ: Manning.
McCall, B. 2020. COVID-19 and Artificial Intelligence: Protecting Health-Care Workers and Curbing The Spread. The Lancet Digital Health 2 (4): e166–e167. https://doi.org/10.1016/S2589-7500(20)30054-6.
McNulty, E. 2014. Understanding Big Data: The seven V’s. Accessed 4 September 2015. http://dataconomy.com/seven-vs-big-data/.
Miles, A., and A. Sullivan. 2010. Understanding the Relationship Between Taste and Value in culture and Sport. London: DCMS.
Murgia, M. 2017. Watchdog Probes Cambridge Analytica’s Poll Role. Financial Times. Accessed 28 April 2021. https://www.ft.com/content/7482ec7c-01c9-11e7-aa5b-6bb07f5c8e12.
Mutz, M. 2016. Christmas and Subjective Well-Being: a Research Note. Applied Research in Quality of Life 11 (4): 1341–1356. https://doi.org/10.1007/s11482-015-9441-8.
NHS. 2016. Want to Feel Happier? Take a Break from Facebook. NHS. https://www.nhs.uk/news/mental-health/want-to-feel-happier-take-a-break-from-facebook/.
Niiler, E. 2020. An AI Epidemiologist Sent the First Alerts of the Coronavirus. Wired. Accessed 28 April 2021. https://www.wired.com/story/ai-epidemiologist-wuhan-public-health-warnings/.
Noble, S.U. 2018. Algorithms of Oppression: Data Discrimination in the Age of Google. New York: New York University Press.
Oman, S. 2013a. Review of ‘Counting What Counts: What Big Data Can Do for the Cultural Sector’. Cultural Value Initiative. http://culturalvalueinitiative.org/2013/06/08/review-of-nestas-counting-what-counts-what-big-data-can-do-for-the-cultural-sector-by-susan-oman/.
Oman, S. 2013b. Tackling the Deficit: Well-Being and Cultural Participation. Presentation at Culture, Health and Wellbeing International Conference. University of Bristol.
Oman, S. 2015. Measuring National Well-Being: What Matters to You? What Matters to Whom? In Cultures of Wellbeing: Method, Place, Policy, ed. S. White and C. Blackmore. London: Palgrave Macmillan.
Oman, S. 2017. All Being Well: Cultures of Participation and the Cult of Measurement. PhD Thesis. The University of Manchester.
Oman, S. 2019a. Improving Data Practices to Monitor Inequality and Introduce Social Mobility Measures: A Working Paper. The University of Sheffield. Accessed 29 March 2021. https://www.sheffield.ac.uk/polopoly_fs/1.867756!/file/MetricsWorkingPaper.pdf.
Oman, S. 2019b. Measuring Social Mobility in The Creative and Cultural Industries: The importance of working in partnership to improve data practices and address inequality. Sheffield: The University of Sheffield. Accessed 29 March 2021. https://www.sheffield.ac.uk/polopoly_fs/1.867754!/file/MetricsPolicyBriefing.pdf.
Oman, S. 2020. Leisure pursuits: Uncovering the ‘Selective Tradition’ in Culture and Well-being Evidence for Policy. Leisure Studies, 39(1), 11–25. https://doi.org/10.1080/02614367.2019.1607536.
Oman, S. n.d. How Data Work in Contexts. Living with Data. Accessed 29 April 2021. https://livingwithdata.org/previous-research/how-data-work-in-contexts/.
O’Neil, C. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. London: Allen Lane.
ONS. 2001. 60 Years of Social Survey: 1941–2001. Norwich: HMSO.
ONS. 2016. Early Census-Taking in England and Wales. Office for National Statistics. Accessed 28 April 2021. https://www.ons.gov.uk/census/2011census/howourcensusworks/aboutcensuses/censushistory/earlycensustakinginenglandandwales.
Otterbacher, J., Bates, J., and Clough, P. 2017. Competent Men and Warm Women: Gender Stereotypes and Backlash in Image Search Results. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 6620–6631). Association for Computing Machinery. https://doi.org/10.1145/3025453.3025727.
Pearce, R., S. Erikainen, and B. Vincent. 2020. TERF Wars: An Introduction. The Sociological Review 68 (4): 677–698. https://doi.org/10.1177/0038026120934713.
Pellert, M., et al. 2020. Dashboard of Sentiment in Austrian Social Media During COVID-19. Frontiers in Big Data 3. https://doi.org/10.3389/fdata.2020.00032.
Pidd, H. 2020. ‘Punishment by statistics’: The father who foresaw A-level algorithm flaws. The Guardian. Accessed 11 August 2021. http://www.theguardian.com/education/2020/aug/14/punishment-by-statistics-the-father-who-foresaw-a-level-algorithm-flaws.
Pink, S., and V. Fors. 2017. Being in a Mediated World: Self-Tracking and the Mind–Body–Environment. Cultural Geographies 24 (3): 375–388. https://doi.org/10.1177/1474474016684127.
Plunz, R.A., et al. 2019. Twitter Sentiment in New York City Parks as Measure of Well-Being. Landscape and Urban Planning 189: 235–246. https://doi.org/10.1016/j.landurbplan.2019.04.024.
Poovey, M. 1998. A History of the Modern Fact: Problems of Knowledge in the Sciences of Wealth and Society. Chicago: The University of Chicago Press.
Porter, T.M. 1986. The Rise of Statistical Thinking 1820–1900. Princeton: Princeton University Press.
Porter, T.M. 1996. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton: Princeton University Press.
Quercia, D. et al. 2012. Tracking ‘Gross Community Happiness’ from Tweets. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work. CSCM 2012, ed. D. Gergle, et al., 965–968. New York: ACM.
Ram, A., and M. Murgia. 2019. Data Brokers: Regulators Try to Rein in the ‘Privacy Deathstars’. Financial Times. Accessed 29 March 2021. https://www.ft.com/content/f1590694-fe68-11e8-aebf-99e208d3e521.
Ruckenstein, M., and M. Pantzar. 2017. Beyond the Quantified Self: Thematic Exploration of a Dataistic Paradigm. New Media & Society 19 (3): 401–418. https://doi.org/10.1177/1461444815609081.
Ruppert, E., J. Law, and M. Savage. 2013. Reassembling Social Science Methods: The Challenge of Digital Devices. Theory, Culture & Society 30 (4): 22–46. https://doi.org/10.1177/0263276413484941.
Savage, M. 2010. Identities and Social Change in Britain Since 1940: The Politics of Method. Oxford: Oxford University Press.
Scott, J.C. 1998. Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed. New Haven: Yale University Press (The Yale ISPS series).
Sinclair, J. 1798. Statistical Accounts of Scotland. https://stataccscot.edina.ac.uk/static/statacc/dist/home.
Strain, T., K. Wijndaele, and S. Brage. 2019. Physical Activity Surveillance Through Smartphone Apps and Wearable Trackers: Examining the UK Potential for Nationally Representative Sampling. JMIR mHealth and uHealth 7 (1): p. e11898. https://doi.org/10.2196/11898.
Suzuki, M. 1992. Political Business Cycles in the Public Mind. American Political Science Review 86 (4): 989–996. https://doi.org/10.2307/1964350.
The Economist. 2017. The World’s Most Valuable Resource Is No Longer Oil, But Data. The Economist, 6 May. Accessed 29 March 2021. https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data.
Townsend, L., and Wallace, C. 2016. Social Media Research: A Guide to Ethics. Aberdeen: The University of Aberdeen, p. 16. https://www.gla.ac.uk/media/Media_487729_smxx.pdf.
Turow, J. 2011. Introduction. In The Daily You: How the New Advertising Industry Is Defining Your Identity and Your Worth, 1–12. Yale University Press.
UK Data Justice Lab. n.d. Data Justice Lab. https://datajusticelab.org.
United Nations. 2014. A World That Counts: Mobilising the Data Revolution for Sustainable Development. Secretary-General of the United Nations. https://www.tralac.org/images/Resources/UN_Summit/A%20world%20that%20counts%20Mobilizing%20the%20data%20revolution%20for%20sustainable%20development%202014.pdf.
United Nations. 2015. Indicators and a Monitoring Framework for the Sustainable Development Goals. Launching a Data Revolution for the SDGs. Secretary-General of the United Nations, p. 233. https://sdgs.un.org/sites/default/files/publications/2013150612-FINAL-SDSN-Indicator-Report1.pdf.
Voukelatou, V., et al. 2020. Measuring Objective and Subjective Well-Being: Dimensions and Data Sources. International Journal of Data Science and Analytics. https://doi.org/10.1007/s41060-020-00224-2.
Whitaker, B. 2020. The Computer Algorithm That was Among the First to Detect the Coronavirus Outbreak. Accessed 28 April 2021. https://www.cbsnews.com/news/coronavirus-outbreak-computer-algorithm-artificial-intelligence/.
Wilmott, C. 2016. Small Moments in Spatial Big Data: Calculability, Authority and Interoperability in Everyday Mobile Mapping. Big Data & Society 3 (2): p. 2053951716661364. https://doi.org/10.1177/2053951716661364.
YouGov. n.d. Brits Use of Wearable Devices (E.g. A Smartwatch or Wearable Fitness Band). Accessed 28 April 2021. https://yougov.co.uk/topics/technology/trackers/brits-use-of-wearable-devices-eg-a-smartwatch-or-wearable-fitness-band.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made. The images or other third party material in this chapter are included in the chapter’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

A regression table like the one reproduced in Table 5.4 will mainly be concerned with communicating the degree of association between variables. Chapters 7 and 8 go into this in far greater detail. The values will always lie between 0 and 1, and this table has been presented in simplified form. Ordinarily there is additional information to show not only the degree of association, but how sure we can be that this is a correct estimate. There will always be a degree of error that has to be accounted for. Typically in a regression table, you will find asterisks, as in Table 5.4. Asterisks in a regression table indicate the level of the statistical significance of a regression coefficient.
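For readers who find it easier to see the convention than to read about it, the short Python sketch below shows one way asterisks might be attached to coefficients. It is only an illustration: the cut-offs used (0.05, 0.01 and 0.001) are a widespread convention and are an assumption here, not the thresholds behind Table 5.4, and the function name, variable names and numbers are all hypothetical.

```python
def significance_stars(p_value: float) -> str:
    """Return the asterisks often printed beside a regression coefficient."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Assumed thresholds: a common convention, not taken from Table 5.4.
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical coefficients and p-values, invented purely for illustration.
rows = [
    ("variable A", 0.42, 0.0004),
    ("variable B", 0.17, 0.03),
    ("variable C", 0.05, 0.40),
]

for name, coefficient, p_value in rows:
    print(f"{name}: {coefficient:.2f}{significance_stars(p_value)}")
```

More asterisks signal a smaller p-value, that is, stronger evidence that the association is not simply down to chance. A table should always say in a note which thresholds its asterisks stand for, because conventions differ between disciplines.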
Ada Lovelace Institute. 2019. Beyond Face Value: Public Attitudes to Facial Recognition Technology. Accessed 28 April 2021. https://www.adalovelacein- stitute.org/report/beyond-face-value-public-attitudes-to-facial-recognition- technology/.
Ahmed, W. 2019. Using Twitter as a Data Source: An Overview of Social Media Research Tools (2019). Impact of Social Sciences. Accessed 28 April 2021. https: //blogs.lse.ac.uk/impactofsocialsciences/2019/06/18/ using-twitter-as-a-data-source-an-over view-of-social-media-research- tools-2019/.
Ajana, B. 2017. Self-Tracking: Empirical and Philosophical Investigations. Springer International Publishing. https://doi.org/10.1007/978-3-319-65379-2.
Albert, A. 2019. Citizen Social Science: A Critical Investigation. PhD thesis. University of Manchester. https://www.escholar.manchester.ac.uk/api/ datastream?publicationPid=uk-ac-man-scw:319481&datastreamId=FULL-T EXT.PDF.
Avila, R. 2019. Fixing Digital Democracy? The Future of Data-Driven Political Campaigning. openDemocracy. Accessed 28 April 2021. https:// www.opendemocracy.net/en/fixing-digital-democracy-future-of-data- driven-political-campaigning/.
Bache, I., and Reardon, L. 2013. An Idea Whose Time has Come? Explaining the Rise of Well-Being in British Politics. Political Studies, 61(4), 898-914. https://doi.org/10.1111/1467-9248.12001.
Bates, J. 2016. Towards a Critical Data Science-The Complicated Relationship Between Data and the Democratic Project. Impact of Social Sciences. https:// blogs.lse.ac.uk/impactofsocialsciences/2016/01/12/towards-a-critical-data- science-data-and-the-democratic-project/.
Bates, J., Lin, Y.-W., and Goodale, P. 2016. Data Journeys: Capturing the Socio- material Constitution of Data Objects and Flows. Big Data & Society 3 (2): p. 2053951716654502. https://doi.org/10.1177/2053951716654502.
S. OMAN BBC. 2019. London Mayor Quizzes King's Cross Developer on Facial Recognition. BBC News, 14 August. Accessed 29 April 2021. https://www.bbc.com/news/ technology-49343822.
---. 2020. Coronavirus: UK to Have Test, Track and Trace System by June. BBC News. Accessed 28 April 2021. https://www.bbc.co.uk/news/av/ uk-politics-52745202.
Bellet, C. and Frijters, P. 2019. Big Data and Well-being, p. 26. Accessed 28 April 2021. https://worldhappiness.report/ed/2019/big-data-and-well-being/.
Benjamin, R. 2019. Race After Technology: Abolitionist Tools for the New Jim Code. Medford, MA: Polity.
Boyd, D., and Crawford, K. 2012. Critical Questions for Big Data. Information, Communication & Society, 15(5), 662-679. https://doi.org/10.108
/1369118X.2012.678878.
BlueDot. n.d. BlueDot | Who We Are, BlueDot. Accessed 2 May 2021. https:// bluedot.global/team/.
Blumenstock, J.E. 2016. Fighting Poverty with Data. Science 353 (6301): 753-754. https://doi.org/10.1126/science.aah5217.
Campbell, D. 2017. Facebook and Twitter 'Harm Young People's Mental Health'. The Guardian. Accessed 28 April 2021. http://www.theguard- ian.com/society/2017/may/19/popular-social-media-sites-harm-young- peoples-mental-health.
Carroll, C., J.C. Fuhrer, and D.W. Wilcox. 1994. Does Consumer Sentiment Forecast Household Spending? If So, Why? The American Economic Review 84 (5): 1397-1408.
Charrington, S. 2020. How AI Predicted the Coronavirus Outbreak with Kamran Khan-#350. Accessed 28 April 2021. https://www.youtube.com/ watch?v=V6BpKSGquRw.
Coughlin, T. 2018. 175 Zettabytes By 2025. Forbes. Accessed 29 March 2021. https://www.forbes.com/sites/tomcoughlin/2018/11/27/175-zettabytes- by-2025/.
Cryle, P.M., and E. Stephens. 2017. Normality: A Critical Genealogy. Chicago: The University of Chicago Press.
Daas, P. J. et al. 2013. Big Data and Official Statistics. In Proceedings of the NTTS. New Techniques and Technologies for Statistics, pp. 5-7.
Davies, B., Innes, M. and Dawson, A. 2018. An Evaluation of South Wales Police's Use of Automated Facial Recognition. Cardiff: Crime and Security Research Institute, p. 46. https://www.statewatch.org/media/documents/ news/2018/nov/uk-south-wales-police-facial-recognition-cardiff-uni- eval-11-18.pdf.
Davies, W. 2015. The Happiness Industry: How The Government and Big Business Sold Us Well-Being. London: Verso.
---. 2018. Nervous States: How Feeling Took Over the World. Jonathan Cape.
Dencik, L. 2020. The Datafied Welfare State: A Perspective from the UK, 24. Cardiff: Cardiff University. https://datajusticeproject.net/wp-content/ uploads/sites/30/2020/09/The-Datafied-Welfare-State_draft.pdf.
Denham, E. 2019. Statement: Live facial recognition technology in King's Cross. ICO. Accessed: 19 August 2019. https://ico.org.uk/about-the-ico/news- and-events/news-and-blogs/2019/08/statement-live-facial-recognition- technology-in-kings-cross/.
Digital Initiatives. 2020. Strava: Striving in the Time of Corona? Digital Innovation and Transformation. Accessed 28 April 2021. https://digital.hbs.edu/ platform-digit/submission/strava-striving-in-the-time-of-corona/.
Dodge, M., and Kitchin, R. 2005. Codes of Life: Identification Codes and the Machine-Readable World. Environment and Planning D: Society and Space, 23(6), 851-881. https://doi.org/10.1068/d378t.
Eubanks, V. 2018. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. St. Martin's Publishing Group.
Fujiwara, D., and G. MacKerron. 2015. Cultural Activities, Artforms and Wellbeing. London: Arts Council England.
Fussey, P., and D. Murray. 2019. London-Met-Police-Trial-of-Facial-Recognition- Tech-Report.pdf. Essex: University of Essex, p. 128. Accessed 28 April 2021. https://48ba3m4eh2bf2sksp43rq8kk-wpengine.netdna-ssl.com/wp- content/uploads/2019/07/London-Met-Police-Trial-of-Facial-Recognition- Tech-Report.pdf.
Gayle, D. 2018. Diane Abbott: Twitter Has 'Put Racists into Overdrive. The Guardian. Accessed 28 April 2021. https://www.theguardian.com/poli- tics/2018/dec/18/diane-abbott-calls-for-twitter-to-clamp-down-on- hate-speech.
Gilmore, A., Kostas, A., and Albert, A. 2018. 'Never Mind the Quality, Feel the Width': Big Data for Quality and Performance Evaluation in the Arts and Cultural Sector and the Case of 'Culture Metrics'. In G. Schiuma and D. Carlucci (Eds.), Big Data in the Arts and Humanities: Theory and Practice. Boca Raton: Taylor and Francis.
Hacking, I. 1990. The Taming of Chance. Cambridge: Cambridge University Press.
---. 1991. How Should We Do the History of Statistics? In The Foucault Effect: Studies in Governmentality, ed. G. Burchell, C. Gordon, and P. Miller. Chicago: The University of Chicago Press.
Harford, T. 2017. How the World's First Accountants Counted on Cuneiform. BBC News. Accessed 28 April 2021. https://www.bbc.co.uk/news/ business-39870485.
Heaven, W.D. 2020. AI Could Help with the Next Pandemic-But Not with This One, MIT Technology Review. Accessed 2 May 2021. https://www.tech- nologyreview.com/2020/03/12/905352/ai-could-help-with-the-next- pandemicbut-not-with-this-one/.
Hill, K., and A. Krolik 2019. How Photos of Your Kids are Powering Surveillance Technology. The New York Times. Accessed 28 April 2021. https://www. nytimes.com/interactive/2019/10/11/technology/flickr-facial- recognition.html.
Hintz, A., and J. Brand. n.d. Data Policies: Approaches for Data-Driven Platforms in the UK and EU. Cardiff: Data Justice Lab, p. 30. https://datajustice.files. wordpress.com/2020/01/data-policies-research-report-revised.pdf.
Holmes, T.H., and R.H. Rahe. 1967. The Social Readjustment Rating Scale. Journal of Psychosomatic Research 11 (2): 213-218. https://doi. org/10.1016/0022-3999(67)90010-4.
Jahani, E., et al. 2017. Improving Official Statistics in Emerging Markets Using Machine Learning and Mobile Phone Data. EPJ Data Science 6 (1): 1-21. https://doi.org/10.1140/epjds/s13688-017-0099-3.
Jee, C. 2016. Wearable Tech: Could It Save the NHS?, Techworld. Accessed 15 September 2016. http://www.techworld.com/wearables/could-wearables- save-nhs-3621960/.
Kennedy, H. 2016. Post, Mine, Repeat: Social Media Data Mining Becomes Ordinary. New York; Secaucus: Palgrave Macmillan UK. https://doi. org/10.1057/978-1-137-35398-6.
Kennedy, H., Oman, S., Taylor, M., Bates, J., and Steedman, R. 2020. Public Understanding and Perceptions of Data Practices: A Review of Existing Research. Sheffield: The University of Sheffield. https://livingwithdata.org/ project/wp-content/uploads/2020/05/living-with-data-2020-review-of- existing-research.pdf.
Kitchin, R. 2014. The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. SAGE.
Kitchin, R., and G. McArdle. 2016. What Makes Big Data, Big Data? Exploring the Ontological Characteristics of 26 Datasets. Big Data & Society 3 (1): p. 2053951716631130. https://doi.org/10.1177/2053951716631130.
Kramer, A.D.I. 2010. An Unobtrusive Behavioral Model Of 'Gross National Happiness'. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI 10, 287-290. Atlanta: Association for Computing Machinery.
Kruzan, K.P., and A.S. Won. 2019. Embodied Well-Being Through Two Media Technologies: Virtual Reality and Social Media. New Media & Society 21 (8): 1734-1749. https://doi.org/10.1177/1461444819829873.
Laney, D. 2001. 3D data management: Controlling data volume, velocity and variety. Meta Group. Accessed: 16 January 2013. http://blogs.gartner.com/ doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data- Volume-Velocity-and-Variety.pdf.
Lazer, D., et al. 2014. The Parable of Google Flu: Traps in Big Data Analysis. Science 343 (6176): 1203-1205. https://doi.org/10.1126/ science.1248506.
Lazer, D., and R. Kennedy. 2015. What We Can Learn from the Epic Failure of Google Flu Trends. Wired. Accessed 28 April 2021. https://www.wired. com/2015/10/can-learn-epic-failure-google-flu-trends/
Leadbetter, C., O'Connor, N., and Commonwealth Games, Culture & Sport Analysis Scottish Government. 2013. Healthy Attendance? The Impact of Cultural Engagement and Sports Participation on Health and Satisfaction with Life in Scotland. Scotland: The Scottish Government. Accessed 17 May 2021. https://www.gov.scot/publications/healthy-attendance-impact-cultural- engagement-sports-participation-health-satisfaction-life-scotland/.
Lee, L. et al. 2016.Information Disclosure Concerns in The Age of Wearable Computing. In Proceedings 2016 Workshop on Usable Security. Workshop on Usable Security, San Diego, CA: Internet Society. https://doi.org/10.14722/ usec.2016.23006.
Lewis, R., M. Rowe, and C. Wiper. 2016. Online Abuse of Feminists as An Emerging form of Violence Against Women and Girls. The British Journal of Criminology 57 (6): 1462-1481. https://doi.org/10.1093/bjc/azw073. Living with data. n.d. Living with Data. https://livingwithdata.org/.
Lupton, D. 2019. Data Mattering and Self-Tracking: What Can Personal Data Do? Continuum 34 (1): 1-13. https://doi.org/10.1080/1030431
2019.1691149.
MacKerron, G., and S. Mourato. 2013. Happiness is Greater in Natural Environments. Global Environmental Change 23 (5): 992-1000. https://doi. org/10.1016/j.gloenvcha.2013.03.010.
Madge, C., and T.H. Harrisson. 1937. Mass Observation. London: Frederick Muller Ltd.
Marr, B. 2014. Big Data: The 5 vs everyone must know. Accessed: 4 September 2015. https://www.linkedin.com/pulse/20140306073407-64875646-big- data-the-5-vs-everyone-must-know.
Mass Observation. n.d. Mass Observation. http://www.massobs.org.uk.
Matsakis, L. 2019 The WIRED Guide to Your Personal Data (and Who Is Using It). Wired. Accessed: 28 April 2021. https://www.wired.com/story/ wired-guide-personal-data-collection/.
Mayer-Schönberger, V., and K. Cukier. 2013. Big Data: A Revolution that Will Transform how We Live, Work, and Think. London: John Murray.
Marz, N. and Warren, J. 2012. Big Data: Principles and Best Practices of Scalable Realtime Data Systems. MEAP edition. Westhampton, NJ: Manning.
McCall, B. 2020. COVID-19 and Artificial Intelligence: Protecting Health-Care Workers and Curbing The Spread. The Lancet Digital Health 2 (4): e166- e167. https://doi.org/10.1016/S2589-7500(20)30054-6.
McNulty, E. 2014. Understanding Big Data: The seven V's. Accessed: 4 September 2015. Accessed: 4 September 2015. http://dataconomy.com/ seven-vs-big-data/.
Miles, A., and A. Sullivan. 2010. Understanding the Relationship Between Taste and Value in culture and Sport. London: DCMS.
Murgia, M. 2017. Watchdog Probes Cambridge Analytica's Poll Role. Financial Times. Accessed: 28 April 2021. https://www.ft.com/ content/7482ec7c-01c9-11e7-aa5b-6bb07f5c8e12.
Mutz, M. 2016. Christmas and Subjective Well-Being: a Research Note. Applied Research in Quality of Life 11 (4): 1341-1356. https://doi.org/10.1007/ s11482-015-9441-8.
NHS. 2016. Want to Feel Happier? Take a Break from Facebook. NHS. https://www.nhs.uk/news/mental-health/want-to-feel-happier-take- a-break-from-facebook/.
Niiler, E. 2020) An AI Epidemiologist Sent the First Alerts of the Coronavirus. Wired. Accessed: 28 April 2021. https://www.wired.com/story/ai- epidemiologist-wuhan-public-health-warnings/.
Noble, S.U. 2018. Algorithms of Oppression: Data Discrimination in the Age of Google. New York: New York University Press.
Oman, S. 2013a. Review of 'Counting What Counts: What Big Data Can Do for the Cultural Sector'. Cultural Value Initiative. http://culturalvalueini- tiative.org/2013/06/08/review-of-nestas-counting-what-counts-what- big-data-can-do-for-the-cultural-sector-by-susan-oman/.
---. 2013b. Tackling the Deficit: Well-Being and Cultural Participation. Presentation at Culture, Health and Wellbeing International Conference. University of Bristol.
---. 2015. Measuring National Well-Being: What Matters to You? What Matters to Whom? In Cultures of Wellbeing: Method, Place, Policy, ed. S. White and C. Blackmore. London: Palgrave Macmillan.
---. 2017. All Being Well: Cultures of Participation and the Cult of Measurement. PhD Thesis. The University of Manchester.
---. 2019a. Improving Data Practices to Monitor Inequality and Introduce Social Mobility Measures: A Working Paper. The University of Sheffield. Available at: https://www.sheffield.ac.uk/polopoly_fs/1.867756!/file/ MetricsWorkingPaper.pdf. Accessed: 29 March 2021.
---. 2019b. Measuring Social Mobility in The Creative and Cultural Industries: The importance of working in partnership to improve data practices and address inequality. Sheffield: The University of Sheffield. Accessed: 29 March 2021. h t t p s : / / w w w. s h e f f i e l d . a c . u k / p o l o p o l y _ f s / 1 . 8 6 7 7 5 4 ! / f i l e / MetricsPolicyBriefing.pdf.
---. 2020. Leisure pursuits: Uncovering the 'Selective Tradition' in Culture and Well-being Evidence for Policy. Leisure Studies, 39(1), 11-25. https://doi. org/10.1080/02614367.2019.1607536.
---. n.d. How Data Work in Contexts. Living with Data. Accessed: 29 April 2021. https://livingwithdata.org/previous-research/how-data-work-in- contexts/.
O'Neil, C. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. London: Allen Lane.
ONS. 2001. 60 Years of Social Survey: 1941-2001. Norwich: HMSO.
---. 2016. Early Census-Taking in England and Wales. Office for National Statistics. Accessed 28 April 2021. https://www.ons.gov.uk/ census/2011census/howourcensusworks/aboutcensuses/censushistory/ earlycensustakinginenglandandwales.
Otterbacher, J., Bates, J., and Clough, P. 2017. Competent Men and Warm Women: Gender Stereotypes and Backlash in Image Search Results. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 6620-6631). Association for Computing Machinery. https://doi. org/10.1145/3025453.3025727.
Pearce, R., S. Erikainen, and B. Vincent. 2020. TERF Wars: An Introduction. The Sociological Review 68 (4): 677-698. https://doi.org/10.1177/ 0038026120934713.
Pellert, M., et al. 2020. Dashboard of Sentiment in Austrian Social Media During COVID-19. Frontiers in Big Data 3. https://doi.org/10.3389/ fdata.2020.00032.
Pidd, H. 2020. 'Punishment by statistics': The father who foresaw A-level algo- rithm flaws. The Guardian. Accessed: 11 August 2021. http://www.theguard- ian.com/education/2020/aug/14/punishment-by-statistics-the-father- who-foresaw-a-level-algorithm-flaws.
Pink, S., and V. Fors. 2017. Being in a Mediated World: Self-Tracking and the Mind-Body-Environment. Cultural Geographies 24 (3): 375-388. https:// doi.org/10.1177/1474474016684127.
Plunz, R.A., et al. 2019. Twitter Sentiment in New York City Parks as Measure of Well-Being. Landscape and Urban Planning 189: 235-246. https://doi. org/10.1016/j.landurbplan.2019.04.024.
Poovey, M. 1998. A History of the Modern Fact: Problems of Knowledge in the Sciences of Wealth and Society. Chicago: The University of Chicago Press.
Porter, T.M. 1986. The Rise of Statistical Thinking 1820-1900. Princeton: Princeton University Press.
---. 1996. Trust in Numbers The Pursuit of Objectivity in Science and Public Life. Princeton: Princeton University Press.
Quercia, D. et al. 2012. Tracking 'Gross Community Happiness' from Tweets. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work. CSCM 2012, ed. D. Gergle, et al., 965-968. New York: ACM.
Ram, A., and M. Murgia. 2019. Data Brokers: Regulators Try to Rein in the 'Privacy Deathstars'. Financial Times. Accessed 29 March 2021. https://www. ft.com/content/f1590694-fe68-11e8-aebf-99e208d3e521.
Ruckenstein, M., and M. Pantzar. 2017. Beyond the Quantified Self: Thematic Exploration of a Dataistic Paradigm. New Media & Society 19 (3): 401-418. https://doi.org/10.1177/1461444815609081.
Ruppert, E., J. Law, and M. Savage. 2013. 'Reassembling Social Science Methods: The Challenge of Digital Devices. Theory, Culture & Society 30 (4): 22-46. https://doi.org/10.1177/0263276413484941.
Savage, M. 2010. Identities and Social Change in Britain Since 1940: The Politics of Method. Oxford: Oxford University Press.
Scott, J.C. 1998. Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed. New Haven: Yale University Press (The Yale ISPS series).
Sinclair, J. 1798. Statistical Accounts of Scotland. https://stataccscot.edina.ac.uk/ static/statacc/dist/home.
Strain, T., K. Wijndaele, and S. Brage 2019. Physical Activity Surveillance Through Smartphone Apps and Wearable Trackers: Examining the UK Potential for Nationally Representative Sampling. JMIR mHealth and uHealth 7(1): p. e11898. https://doi.org/10.2196/11898.
Suzuki, M. 1992. Political Business Cycles in the Public Mind. American Political Science Review 86 (4): 989-996. https://doi.org/10.2307/1964350.
The Economist. 2017. The World's Most Valuable Resource Is No Longer Oil, But Data. The Economist, 6 May. Accessed 29 March 2021. https:// www.economist.com/leaders/2017/05/06/the-worlds-most-valuable- resource-is-no-longer-oil-but-data.
Townsend, L., and Wallace, C. 2016. Social Media Research: A Guide to Ethics. Aberdeen: The University of Aberdeen, p. 16. https://www.gla.ac.uk/media/ Media_487729_smxx.pdf.
Turow, J. 2011 Introduction. In The Daily You: How the New Advertising Industry Is Defining Your Identity and Your Worth, 1-12. Yale University Press.
UK Data Justice Lab. n.d. Data Justice Lab. https://datajusticelab.org. United Nations. 2014. A World That Counts: Mobilising the Data Revolution for Sustainable Development. Secretary-General of the United Nations. https:// www.tralac.org/images/Resources/UN_Summit/A%20world%20that%20 counts%20Mobilizing%20the%20data%20revolution%20for%20sustainable%20 development%202014.pdf.
---. 2015. Indicators and a Monitoring Framework for the Sustainable Development Goals. Launching a Data Revolution for the SDGs. Secretary- General of the United Nations, p. 233. https://sdgs.un.org/sites/default/ files/publications/2013150612-FINAL-SDSN-Indicator-Report1.pdf.
Voukelatou, V., et al. 2020. Measuring Objective and Subjective Well-Being: Dimensions and Data Sources. International Journal of Data Science and Analytics. https://doi.org/10.1007/s41060-020-00224-2.
Whitaker, B. 2020. The Computer Algorithm That was Among the First to Detect the Coronavirus Outbreak. Accessed 28 April 2021. https://www.cbsnews.com/ news/coronavirus-outbreak-computer-algorithm-artificial-intelligence/.

About the author

Susan Oman

The University of Manchester, Post-Doc

Susan Oman is a doctoral student at the ESRC Centre for Research on Socio-Cultural Change (CRESC), University of Manchester. Her inter-disciplinary research is linked to the AHRC-funded project ‘Understanding Everyday Participation - Articulating Cultural Values’ and investigates the politics of cultural practices, participation and well-being. She is a fellow of the Centre of Excellence in Training for Theatre, an award which comprised a 30-month research secondment. This facilitated research into Higher Education collaboration with cultural and community organisations and access to creative education and the professions. Prior to this, Susan was a curator, specialising in creating platforms for emerging practitioners to engage with diverse audiences, and organised exhibitions in New York, Melbourne, Dublin and across the UK.
