Control, Alt or Delete?
Data-dependent technology, in its various forms, has become fully integrated in society and has had a transformative effect on people’s lives. However, whilst technology is at the forefront of people’s minds, consumer data is not.
We have undertaken a comprehensive programme of qualitative and quantitative research to explore what people know and how they feel about their consumer data being collected and used by commercial organisations. This included: a nationally representative phone survey, interviews with vulnerable consumers, focus groups and deliberative workshops.
The following article summarises our segmentation and our research findings. You can read our full research report at: www.which.co.uk/policy/digitisation
The Data Dozen: consumer segmentation
Our segmentation splits the population into 4 attitudinal groups, with behavioural groups nested within each of these. The different behaviours are apparent to varying extents within each attitudinal group.
- Tolerant (35%): The largest group in the population. They are a little more comfortable about data collection than average and less likely to be concerned about inferences being made. However, they are concerned about third party selling.
- Concerned (29%): They are very concerned about inferences being made. Nearly all say that they’re very cautious about sharing their information; however, the vast majority are confident that they know how to control what data they share with organisations. They are more likely to believe they don’t benefit from sharing their personal information with organisations.
- Anxious (23%): They are somewhat uncomfortable with data collection and concerned about inferences being made. They are the group most likely to lack confidence that they know how to control what data they share with organisations. In addition, they are the only group where the majority don’t trust organisations not to share their data without their permission.
- Liberal (13%): The smallest group in the population. They are similar to the “Tolerant” group in that they are more likely to be comfortable with data collection and less likely to be concerned about inferences being made. However, they also aren’t concerned about third party selling (in contrast to all other groups), and they are the most likely to say that they don’t care if people see what they post online and don’t care what organisations do with their information as long as they get what they want.
The behavioural groups nested within these are:
- Maximisers (24%): This group likes to use shortcuts such as logging into other services through their social media and saving their details. They are also more likely than average to take action to restrict what data can be observed about them.
- Casual (16%): This group is less likely than average to try to restrict what data can be observed about them, and is online an average amount.
- Protector (8%): The smallest group and only seen within the “Anxious” attitudinal segment. They are somewhat more likely than average to take action to restrict what data can be observed about them.
- Activist (19%): This group frequently takes action to restrict what data can be observed about them, and is also more likely to be “dirtying” their data by putting incorrect information in forms.
- Browser (33%): This group includes people who are online relatively little or not at all. They are significantly less likely to take action to restrict what is observed about them.
Themes from our research
The data ecosystem is invisible to consumers, limiting their knowledge of it.
This means that their attitudes are mostly formed on only a partial understanding of how their data is collected, what precisely is collected and known about them, and how it’s used. The scale of the data ecosystem is not visible to them, and they therefore have little awareness of the number of actors involved and the extent to which they can be profiled and have inferences made about them.
Consumers believe, incorrectly, that data transactions are “bounded”.
Consumers conceptualise the collection of consumer data as a series of bounded transactions, where individual pieces of data are “given” to an organisation in order to receive a specific product or service. They are very rarely aware of the extent of third party sharing or that their data can be amalgamated to form an individual level profile. When we explained the data ecosystem to them, including the fact that consumer data flows beyond the bounded transactions they imagined, they were unpleasantly surprised that their data is allowed to change hands continually, that it can be bought, and that this can all be done with profiles, not just individual pieces of data.
Consumers judge the acceptability of data collection by what impact it has on them.
They can be pragmatic about data collection and use if they see the relevance or benefit to them. Unless consumers are told how the use of their data may impact them, they do not have the information they need to assess the acceptability of data collection. Withholding contextual information on use and impact forces them to make decisions which cannot be meaningful. This is an important point, considering that the discourse around data usually centres on collection and, at best, generalised use.
In our deliberative research we gave examples of what data was collected and how it is used. From this informed perspective consumers spontaneously evaluated whether or not it was acceptable based on whether the collection was necessary for the product or service to function and what impact the use of the data had on them, i.e. whether it led to a benefit and whether it led to tangible harm.
Consumers are primed to “accept” data collection as having a positive impact, because it is easier to identify and conceptualise benefits than harms.
When informed about use, consumers tend to assess acceptability by evaluating whether there is a benefit to their data being collected and whether there could be tangible detriment. However, their evaluation is skewed: on the one side they have the direct, known and certain benefits of technology, which have become a necessity in life, and they weigh these against the intangible, unknown potential detriments of having their data collected and used. It is in this context that people appear, via their behaviour, to accept data collection. In reality, they are primed to accept it because of cognitive biases and a lack of alternatives.
Consumers are pushed into operating in a space of "rational disengagement".
Rational disengagement arises where the cost of trying to engage (e.g. understanding what data is being collected and attempting to control this) is so much greater than any benefit received from doing so that there is little reason to engage. Engagement is often perceived to offer little benefit, because there is seen to be no alternative to the product or service that is desired.
Consumers feel powerless to engage with organisations who collect and use their data.
There exists a power imbalance between consumers and organisations which results from: 1) how dependent people have become on technology in their day to day lives; 2) consumers’ lack of knowledge about the full extent of data collection and use by organisations; 3) lack of alternatives if they want to stop using specific companies whose data collection practices they might be concerned about. This means that consumers are often left feeling powerless to try and shape their engagement with organisations who collect and use their data.
Consumers want meaningful control over their data.
When people learn about the data ecosystem they tend to feel unable to control what data is collected and how it’s used. Sometimes people may want to have direct personal control over their data. However, in some instances (for example when not enough information is given for them to make a decision), the remedy may not be to make it consumers’ responsibility. Instead, the remedy is to better control the ecosystem: ensuring good governance and, when things go wrong (such as breaches), clear accountability and recompense.
Attitudes and behaviour are not necessarily congruent.
Our segmentation shows a relatively weak relationship between attitudes and behaviour. It found that the population can be split into four attitudinal groups; however, nested within each of these are groups of people who may behave very differently despite holding the same attitudes. For example, in the “Concerned” segment (people who are very concerned about inferences being made) there are people who display the following behaviours whilst still holding the same attitude:
- “Activist” behaviour: they are taking action to restrict what data can be observed about them, and in addition are more likely to be “dirtying” their data by putting incorrect information in forms. The latter is thought to be a reflection of them taking action to protect their privacy, rather than a security measure.
- “Maximiser” behaviour: they are taking advantage of the shortcuts afforded to them online, for example saving their bank details in forms and logging into other services using their social media.
- “Browser” behaviour: they are online relatively little or not at all and are less likely to take action to restrict what information can be observed about them.
A consumer’s concern about inferences being made and third party selling does not necessarily translate into taking action to restrict what data can be observed about them.
Our analysis shows that concern about inferences being made or about third party selling is not a significant predictor of whether an individual is more likely than average to be “data restrictive”, for example by clearing their cookies, restricting permissions and checking privacy settings.
Instead, the following factors are predictors:
- Time spent online: those who use the internet for more than 5 hours a day are 1.7 times more likely to be “data restrictive”, compared to those who are only going online for 1-2 hours a day.
- Going online for leisure: those who are high leisure users are 1.5 times more likely than average to be restricting their data.
In exploring the hypothesis that being comfortable with data collection and feeling in control is a result of taking action to restrict data collection, we found that:
- Those who are more likely than average to be comfortable with data collection methods are 1.5 times more likely than the rest of the population to be “data restrictive”.
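The “times more likely” figures above come from a logistic regression, which conventionally reports odds ratios; the article does not state whether its figures are odds ratios or ratios of proportions. As a minimal sketch of the difference, the Python below computes both from a two-by-two table. The counts and function names are invented for illustration and are not taken from the survey.

```python
def odds_ratio(group_yes, group_no, ref_yes, ref_no):
    """Odds of the outcome in the group divided by the odds in the reference group."""
    return (group_yes * ref_no) / (group_no * ref_yes)


def risk_ratio(group_yes, group_no, ref_yes, ref_no):
    """Share with the outcome in the group divided by the share in the reference group."""
    return (group_yes * (ref_yes + ref_no)) / (ref_yes * (group_yes + group_no))


# Hypothetical counts, invented for illustration only (not survey data):
# 30 of 100 heavy users are "data restrictive", against 20 of 100 light users.
heavy = (30, 70)  # (restrictive, not restrictive)
light = (20, 80)

print(round(odds_ratio(*heavy, *light), 2))
print(round(risk_ratio(*heavy, *light), 2))
```

With these invented counts the odds ratio is about 1.71 while the ratio of proportions is 1.5, so the two readings of “times more likely” can differ noticeably when the outcome is not rare.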
Vulnerable consumers are more likely to be concerned about data collection and use, because they perceive that tangible detriment could result from it.
This includes the concern that “irrelevant” data could be used “against” them, for example that they could be stigmatised on the basis of health conditions and charged a higher price for a product or service as a result.
The majority (81%) of the population are concerned about organisations selling anonymised data to third parties.
Third party selling is seen as a murky and morally dubious practice. When we told consumers about the extent of data sharing within the data ecosystem, these perceptions intensified and they frequently felt that:
- They don’t have control over where their data goes
- It is made purposefully difficult for them to opt out of their data being shared
- Data they consented to give in one context is being used in another, which they wouldn’t have given consent for if asked.
People are often surprised that there isn’t more regulation of data collection and use.
People generally assume that there are regulations against the widespread sharing and use of their data. When we provided them with information about the data ecosystem many were surprised that their data was allowed to change hands so many times and some assumed that regulations would not allow such practices.
Parts of the population will respond differently to policy recommendations, depending on their perceived need for change and their willingness to take action themselves.
Our segmentation indicates that policy recommendations which encourage consumers to change their behaviour will be more successful with some consumers and less successful with others. There is already some evidence to support this from a general public poll Which? conducted soon after the Cambridge Analytica and Facebook news story. We found that:
- Only 10% of people in the “Liberal” segment said that, in the light of the Cambridge Analytica and Facebook story, they were concerned about what organisations can do with information about their personality and beliefs. And only 6% said that they got a lot more concerned after the story broke.
- In contrast, 68% of those in the “Anxious” segment said they were very concerned (in the light of the story) and 48% said they became a lot more concerned after the story broke.
Those who are in the “Anxious” and “Concerned” segments are more likely to say they have reduced their use of Facebook since the news story broke. Around a quarter (28% and 26% respectively) said that they were using it less, compared to only around 1 in 10 of those in the “Tolerant” (10%) and "Liberal" (13%) segments.
However, as discussed, those exhibiting the same attitudes don’t always demonstrate the same behaviour. For example, those who are in the attitudinal segment “Anxious” vary in their response to the news story: those who are “Activists” tend to be more likely to decrease the number of shares (38%) compared to those in the “Maximiser” (24%), “Unprotected” (21%) and “Browser” (13%) segments. It therefore cannot be presumed that people with the same attitude will all take action to the same extent.
Methodology
- A quantitative telephone survey of a nationally representative sample of 2,064 UK consumers, with a separate boost of an additional 150 interviews in Scotland, between 18th and 28th January 2018. The survey was cognitively tested prior to fieldwork to ensure comprehension. Data was used to develop a segmentation of consumers.
- 6 focus groups, lasting 2 hours, with 9-10 consumers in each between 20th and 27th November 2017. Locations were London, Nottingham and Colne, Lancashire. Participants were recruited to ensure a spread of gender, age, ethnicity, self-assessed knowledge about data collection and level of comfort with data collection and sharing.
- 21 face-to-face depth interviews with vulnerable consumers in London, Nottingham, Colne, Newport, Leeds, Perth and St Albans, between 20th and 27th November 2017 and 7th and 27th February 2018. Vulnerable consumers were defined as: older consumers aged 80 years and over; consumers belonging to a lower socio-economic group (DE); consumers with a long-term physical or mental health condition or disability; consumers who do not feel confident speaking, reading or writing in English.
- 4 deliberative workshops, each one lasting 1.5 days, between 7th and 27th February 2018. Each deliberative workshop consisted of 24 consumers and locations were Newport, Leeds, Perth and St Albans. Participants were recruited to ensure a spread of gender, age, ethnicity, self-assessed knowledge about data collection and self-reported confidence online.
- Cambridge Analytica and Facebook poll: Populus, on behalf of Which?, surveyed 2,068 UK adults online between 26th and 27th March 2018. The data have been weighted to be demographically representative of the UK population.
- Segmentation: We adopted a data-led hierarchical clustering method. It was conducted in two stages: the first derived high-level clusters based purely upon attitudes, and the second was based upon the behaviour-related questions. The second-order behavioural clusters were nested within the initial attitudinal clusters. We used hierarchical clustering to determine the optimal number of clusters, then k-means clustering to determine cluster membership. The inputs to the clustering were summary constructs created via two factor analyses of batteries of survey questions. These resulted in 6 summary measures of attitudes from 23 survey questions and 3 summary measures of behaviours from 12 survey questions, explaining 55% and 47% of the variance in the data respectively.
- Logistic regression: used in determining the likelihood of being above average in taking action to restrict what data can be collected about you. “Data restrictive” is defined as being at least one standard deviation higher than average on our summary “data restrictive” measure derived via factor analysis. The survey questions that loaded strongly against this measure were: checking privacy settings on social media and email; clearing browsing history or cookies; and restricting permissions on what information apps and websites can access.
- Leisure use: a summary construct derived from factor analysis of a battery of survey questions relating to consumer data behaviours. The questions most strongly related to the construct asked about use of the following in the last 3 months: consoles or websites for gaming online, streaming services, mobile apps, social media platforms, messaging services and public wi-fi. High leisure users are those who scored above average on this construct.
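The two threshold definitions above (“data restrictive” as at least one standard deviation above average, and high leisure use as above average) can be sketched in a few lines of Python. This is an illustration only: the function names and scores are hypothetical, and it assumes an unweighted mean and a population standard deviation, neither of which the methodology specifies.

```python
from statistics import mean, pstdev


def flag_data_restrictive(scores):
    """Return True for each respondent scoring at least one SD above the mean."""
    # Assumption: population standard deviation, no survey weighting.
    threshold = mean(scores) + pstdev(scores)
    return [score >= threshold for score in scores]


def flag_high_leisure(scores):
    """Return True for each respondent scoring above the mean on leisure use."""
    average = mean(scores)
    return [score > average for score in scores]


# Hypothetical factor scores for five respondents (not survey data).
restrictive_scores = [0.0, 0.0, 0.0, 0.0, 10.0]
leisure_scores = [1.0, 2.0, 3.0, 4.0, 5.0]

print(flag_data_restrictive(restrictive_scores))
print(flag_high_leisure(leisure_scores))
```

In this made-up example the restrictive scores have a mean of 2 and a standard deviation of 4, so only the respondent scoring 10 clears the threshold of 6. The stricter threshold means the “data restrictive” flag picks out a smaller group than an above-average cut would.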