In my last post I introduced the first of two key concerns about using Big Data in research. I wrote in response to the recent WIRED article, “Scientists Are Just As Confused About the Ethics Of Big-Data Research As You.” In this second part, I continue my analysis.
The WIRED article suggested that the primary ethical concern in big data research is breach of privacy. Here I would like to present a second harm, one that I think is less discussed, harder to name, and perhaps harder to justify rationally. It is related to why we feel less bothered when we share our information with OkCupid, Facebook, or Google than when such information is used for research to which we have not explicitly agreed.
The Use of Our Information is Routine
There is a parallel issue in medical research, where specimens collected as part of regular medical care are used for research purposes without our permission. Such samples no longer bear any identifying labels; they are considered discarded, no longer ours, and consequently available for study. Such use is now routine, not because there has been broad discussion that has found the practice acceptable, but because it is allowed by our legal concepts of property and because most individuals do not know that it occurs.
That is not to say that it is wrong. The benefits that accrue to society as a whole may entirely justify this practice, but its ethical basis is philosophical and does not rest on public understanding or consensus. To me, there is danger in pursuing practices that may be justified but for which the argument has never been publicly made.
Why Does This Bother Us?
We saw this in a very concrete form with dried blood spots from newborns. The State of Texas destroyed over 5 million irreplaceable newborn dried blood spots after parents found out that the samples, without any links to a particular child or family, were being used for research without explicit consent. Most research into public attitudes has shown that the majority of people would support such research and would give permission, but that they want to be asked.
In the same way that parents do not object to the collection of newborn blood spots to screen for treatable diseases, people do not mind sharing information on Facebook for the use they intend. There is no pretense of privacy. Similarly, few object when Google uses information about browsing habits or prior searches to return search results that are likely to be more useful and relevant to an individual, or when Amazon uses purchase history to suggest other products.
So if we are willing to donate our children’s blood, post on Facebook, search on Google, and purchase on Amazon, why does it bother us when someone uses the data or specimens from these activities for research? From a privacy perspective, the data is ours to disclose, and we have disclosed it. What is the source of outrage?
We Have to Talk It Through
I think the answer has to do with the imperfect fit between legal definitions and intuitive ideas of privacy and property. It also has to do with the discrepancy between the “fine print” in the End User License Agreement and intuitive ideas of contractual obligations.
When we use an internet search engine, we think we are simply doing a search, but we implicitly understand that past searches may influence the results of our current search. We are not thinking that those searches, along with the network address of our computer, may be linked to other data, such as spending records, and sold to a third party that will use the data for its own purposes. Use of our data for a novel purpose to create value for someone else, without our knowledge or approval, can seem exploitive.
As with the use of leftover specimens, a feeling of outrage or exploitation may be just that, a feeling, and is as likely to reflect ignorance or a lack of transparency as a true ethical transgression. But leaving such feelings unaddressed only makes them more real and presents a risk to the research enterprise, in both the social sciences and medicine.
It is time for transparency about how data and specimens are used and for development of a public consensus as to what is acceptable and what is not. Barring such a discussion, when such issues do come before institutional review boards (IRBs), each of us is likely to make different decisions, based on our personal values. And we as a society are likely to destroy many more irreplaceable blood spots.