(Part 1 of a 2 part series – read part 2 here.)
Although privacy isn’t specifically guaranteed in the U.S. Constitution or Bill of Rights, it has been assumed to be a natural right in a democracy. Ten state constitutions in the U.S. do explicitly guarantee privacy as a right: Alaska, Arizona, California, Florida, Hawaii, Illinois, Louisiana, Montana, South Carolina and Washington. However, at the national level, individual court decisions have attempted to negotiate definitions of privacy rights as issues arise.
Interestingly, one of the first significant discussions of privacy rights came in an 1890 Harvard Law Review article written by Louis Brandeis and his law partner Samuel Warren, titled “The Right to Privacy,” which is considered “the first major article to advocate for a legal right to privacy.” In the article the authors argued for individual privacy protections: “Recent inventions and business methods call attention to the next step which must be taken for the protection of the person, and for securing to the individual … the right ‘to be let alone’ … Numerous mechanical devices threaten to make good the prediction that ‘what is whispered in the closet shall be proclaimed from the house-tops’.” The technology being addressed in this article was the emergent consumer camera, which was creating what were then called “Kodak fiends” who, according to Brandeis and Warren, were invading “the sacred precincts of domestic life.”
Major erosions of privacy followed 9/11, generally presented as the price we pay for security. The rise of business analytics built on internet-based data collection is likewise presented as the price we pay for the evolving era of web-based communications. The loss of privacy can result in personal embarrassment, loss of reputation, lost employment opportunities, fraud and financial loss, and even, in some cases, death or other physical harm.
Today, who doesn’t email, blog, microblog, videoblog, play massively multiplayer online games, use social bookmarking sites, video sharing sites, social or professional networking sites, use location/GPS systems, crowdsourcing platforms or online commerce? Even if somehow you have managed to avoid this, you are still tracked through surveillance systems, web-based medical, governmental, or business databases, and collaborative web-based productivity applications being adopted by schools, governments, and individuals across the globe. Google’s Apps for Education, for example, presents itself as “free for schools with 24/7 support at no cost. Never any ads, and your data is yours.” In an era of public sector austerity—from local K-12 through major research institutions—offers from Google and Microsoft for unlimited, free cloud-based storage, free sophisticated productivity software, email, blogging and other resources have been too seductive to avoid.
The University of California system describes its own arrangement with Google as follows:
The University of California has a contract with Google that provides assurances regarding the security and privacy of customer information stored on Google’s systems. UC’s contract with Google takes precedence if there is a conflict with Google’s posted terms or policies.
As part of adopting Google Apps for faculty and staff use, UC updated its 2008 Google contract originally set up for student deployments. This contract covers areas of concern to higher education and specifically to research institutions:
CORE APPS: UCSC faculty, staff, and students have access to UCSC Core Google Apps (Email, Calendar, Drive, Groups, Sites, Hangouts, Classroom, Vault). Our core Google Apps are governed by a contract between the University of California and Google. UCSC Core Google Apps are FERPA compliant.
Although these deals have allowed major educational institutions to free up resources and funds for other compelling needs, the services are not really free: in effect, they are provided in exchange for the ‘payment’ of detailed information on each person, each interaction, each search, affiliations, images, and so on. Even if the information is not linked one-to-one to specific names, the systems can’t help but recognize or identify users in a more general manner. So, rather than being free, these systems barter computer services for the individual privacy of the institutions’ members and personnel.
Writing in the New York Times, Jeffrey Rosen asks “how best to live in a world where the Internet records everything and forgets nothing—where every online photo, status update, Twitter post and blog entry by and about us can be stored forever. With Web sites like LOL Facebook Moments, which collects and shares embarrassing personal revelations from Facebook users, ill-advised photos and online chatter are coming back to haunt people months or years after the fact.”
Steve Mutkoski, Microsoft’s own Director of Public Policy, Legal & Corporate Affairs, explains that “many of these new products and services are run ‘in the cloud’ by a third party service provider as opposed to on servers operated by the school’s IT staff. This third party operation and control can raise important new regulatory compliance issues, including data protection and data privacy issues, as your school and student data will be handled by a third party. Second, increasingly these products and services are available without monetary payment for teachers to deploy directly in their classrooms. This means that the products or services often won’t go through a more formal procurement process where regulatory compliance and other similar issues would be evaluated. These new cloud products and services are being widely adopted by schools across the country because they lower school costs, increase productivity, and maximize innovation and efficiency.”
“Microsoft and Google have offerings that provide students, staff and teachers with traditional email and other productivity tools (like word processing),” he continues, “all of which are cloud based….There are even indications that social networking tools like Facebook and Twitter are used in schools. And the expectation is that this is only the beginning; with significant amounts of venture capital funding for ‘ed tech,’ we are likely to see an even greater variety of technology aimed at the classroom…as the consumerization trend has moved beyond just devices to include applications and services, workplaces need new internal controls and policies that address not just security risks, but also regulatory risks associated with storage of non-public data on servers operated by third party service providers.”
Tracy Mitrano, Director of the Institute for Internet Culture, Policy and Law (ICPL) program at Cornell and a well-known critic of these programs, believes that the “allegedly ‘free’ services require the entity to ask the question of ‘what is in it for the vendor?’ For many web vendors, their business model revolves primarily around advertising; marketing plays a supporting role.”
Mitrano outlined ten specific areas for Cornell that need to be assured in any contract for cloud-based services. Of those, the following five are most relevant here.
Regulatory Adherence. “Contract should require the entity to maintain all of the legal technical and procedural requirements of all public privacy laws.”
Identity Theft. “To date no federal law requires notification in the event of a breach of personally identifiable information. A majority of state laws do, and interpretations about how and in what ways they apply to other states differ among legal theorists. This uneven and ever-changing landscape makes clear and consistent contractual language difficult to impose. Be that as it may, institutions may decide either to impose those rules found in their home state or to require the entity to adhere to the state with the most rigorous standards of notification and protection. A third approach may be for the institution and the entity to create its own standards for the management of breaches that would include provisions about who reports the matter to whom, the test for a breach and the technical criteria used to do the assessment, and finally which party has responsibility for notifications and follow up, such as credit reporting arrangements.”
User Privacy. “Data mining, aggregation, and sale for targeted marketing is a very significant portion of their revenue. It is therefore imperative that an institution representing its users must examine its own culture, law, and traditions in the area of information privacy and be prepared to make clear claims regarding what is and is not acceptable behavior on the part of the vendor.”
“For example,” Mitrano’s guidance for Cornell continues, “the current standard Google student mail contracts reveal nothing about what Google does with web search terms. It clearly states that the stored mail will be mined, but anonymized, and that not until the student graduates will it feature advertisements on the site. Some institutions may want to bear down on these issues, for a minimum requiring that if Google does mine search terms, that it do so mechanistically and with anonymizing features. Moreover, most institutions would want assurances up front that no personally identifiable information will ever be aggregated for advertising and marketing purposes while the user still uses the account as a student.”
E-discovery. “The institution should, before entering into contract negotiations, do a cost-benefit analysis of how and in what ways to manage e-discovery concerns for which it maintains liability. For example, no liability may attach to student mail, but most certainly the institutions must consider e-discovery implications of sourcing faculty and student mail, as well as materials sourced in course management systems or data centers.”
Warranty and Indemnification. “Most likely, vendors or entities do not provide a warranty of anything! This area of the contract can be a “throw away”—that is, a meaningless term stating that the entity makes no warranty (which, in the case of negligence, will not absolve its liability).”
Keeping Our Values While Developing New Capabilities
One example of “good big data” is in the area of traffic congestion. According to data published by Nationwide Insurance, 1.9 billion gallons of fuel are wasted due to road congestion each year, more than five days’ worth of total fuel consumption in the U.S. The average urban commuter is stuck in traffic 34 hours each year. We do have a number of ways to get real-time updates on traffic slowdowns, but these generally come too late to help someone already caught in gridlock. What we need are tools that can help us anticipate congestion and predict traffic jams before they even occur, so we can make better use of our time.
Microsoft is using its cloud computing platform, Azure, in collaboration with Brazil’s Federal University of Minas Gerais on the Traffic Prediction Project, which combines detailed historical information with new types of information. Juliana Salles, Senior Program Manager at Microsoft Research, notes that “the project uses data available by social networks, Department of Transportation, and data that users create themselves while they move around the city. The idea is to combine all of that and, create a solution in real time, helping users to get from Point A to Point B more efficiently.”
Antonio Loureiro, Professor of Computer Science at the Federal University of Minas Gerais, sees great value in combining existing data with newer forms of information. “How can we predict the traffic condition in the future, and to know the future you have to basically know the past. This application, in my point of view, is different from the others because it works with time series data and information from various social networks and the statistical treatment it gives to this information. This makes the data more reliable and more accurate that will help us in controlling city traffic.” In initial research using data from London, Chicago, New York City, and Los Angeles, researchers were able to predict traffic snarls with 80 percent accuracy. That figure is good on its own, and it was based only on traffic-flow data; accuracy could rise to 90 percent when other data sources, largely mapping apps and social media, are added.
Nancy K. Herther is librarian for American Studies, Anthropology, Asian American Studies & Sociology at the University of Minnesota Libraries, Twin Cities campus.