Archive for December, 2007

Can the government get Google’s data?

Sunday, December 30th, 2007

My blog so far has focused on the powers of private companies to collect data about people’s Internet activities, but I think it’s important to mention the possibility of the government gaining access to data from companies like Google. According to their privacy policy, Google doesn’t even need to be subpoenaed to give up data to a government agency. The policy says Google will only give up personal information to third parties if…

“We have a good faith belief that access, use, preservation or disclosure of such information is reasonably necessary to (a) satisfy any applicable law, regulation, legal process or enforceable governmental request, (b) enforce applicable Terms of Service, including investigation of potential violations thereof, (c) detect, prevent, or otherwise address fraud, security or technical issues, or (d) protect against imminent harm to the rights, property or safety of Google, its users or the public as required or permitted by law.” (1)

So, if a lawsuit is filed against someone and Google is asked to turn over information, that person’s entire search history, as well as any personal information they voluntarily provided to Google, could be made public. If the court subpoenaed an ISP, then the person’s web history could easily be linked to their name. And this part of the policy applies only to personal information. Google seems to be even more willing to share non-personally-identifiable information:

“We may share with third parties certain pieces of aggregated, non-personal information, such as the number of users who searched for a particular term, for example, or how many users clicked on a particular advertisement. Such information does not identify you individually.” (1)

It is hard to tell if this category of data could include IP addresses and/or cookie IDs or is limited to data on the number of users that performed certain actions.
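To show the distinction I am drawing here, below is a minimal sketch of what "aggregated" data would look like if it really is limited to user counts. The log rows, field layout, and the function name `aggregate_by_term` are all my own assumptions for illustration; nothing in Google's policy describes how its data is actually stored or summarized.

```python
from collections import defaultdict

# Hypothetical raw log rows: (ip_address, cookie_id, search_term).
# These values are invented for illustration only.
raw_log = [
    ("203.0.113.5", "c-001", "weather boston"),
    ("203.0.113.5", "c-001", "cheap flights"),
    ("198.51.100.7", "c-002", "weather boston"),
    ("192.0.2.44", "c-003", "weather boston"),
]


def aggregate_by_term(rows):
    """Count distinct users per search term and drop every identifier."""
    users_per_term = defaultdict(set)
    for _ip, cookie_id, term in rows:
        users_per_term[term].add(cookie_id)
    # Only the term and a count survive; IP addresses and cookie IDs do not.
    return {term: len(users) for term, users in users_per_term.items()}


aggregated = aggregate_by_term(raw_log)
print(aggregated)
```

If the shared data looks like the output of this function, it says nothing about any one person. If it looks like the raw rows, with IP addresses or cookie IDs attached, it is a very different matter.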

To Google’s credit, however, it fought a government subpoena in the case of Gonzales v. Google, a major legal challenge to search engines’ right to keep data private. The U.S. Department of Justice, led by Attorney General Alberto Gonzales, filed a motion in federal court on January 18, 2006, seeking a court order that would force Google to hand over “a multi-stage random sample of one million URL’s” that are reachable using Google’s search engine and “the text of each search string entered onto Google’s search engine over a one-week period (absent any information identifying the person who entered such query).” (2) The government wanted this information to help defend the constitutionality of the Child Online Protection Act, a federal law that makes it illegal for websites to make “harmful” material available to minors. Gonzales argued that studying a sample of Google’s data would enable the government to estimate how often people search for and find material that is harmful to minors, how widely this material is available online, and how effective filtering software is. (2)

Google, however, refused to comply with the subpoena. It argued that the information was irrelevant, that it was redundant because other search engines had already complied, that handing it over would compromise trade secrets, and that it could personally identify Google’s users. The government countered that Google needed to turn over “only the text of the random sample of search strings, without any additional information that would identify the person who entered any individual search string.” (2) So it seems that the government was only asking for a list of search terms, without IP addresses, cookie IDs, or any other information.

On March 17, 2006, a judge ruled that Google must give up 50,000 random URLs (far fewer than the million the government demanded) but did not have to share any search terms. It’s interesting that Google tracks its users’ searching habits so extensively but didn’t even want to give the government a list of search terms. It seems that the information the government was asking for falls under the category of “aggregated, non-personal information,” which Google says it may share with third parties without users’ consent and without even the belief that doing so is legally necessary. Perhaps Google is being cautious in its privacy policy by informing users of the worst-case scenario regarding their privacy. Maybe Google actually tries to be more conservative about user privacy than it lets on in the policy. Was Google truly acting out of concern for users’ privacy when it resisted the subpoena? Or was Google’s main motive an economic one, such as the desire to protect its trade secrets?

Sources: 

1. Google Privacy Policy. 30 Dec. 2007 <http://www.google.com/intl/en/privacypolicy.html>. 

2. Gonzales v. Google, Inc. FindLaw. 30 Dec. 2007 <http://news.findlaw.com/hdocs/docs/google/gonzgoog11806m.html>.

What can Google and DoubleClick find out together?

Friday, December 28th, 2007

In this post I will try to determine how the Google-DoubleClick merger will expand the companies’ data-gathering powers.

Both companies track users’ IP addresses, so they would be able to combine their sets of data about each individual user. By using DART cookies, DoubleClick’s servers track each user’s IP address (and therefore location), browser type, operating system, and what ads they click on each time the user visits a site that shows ads from DoubleClick. So, DoubleClick has a complete history of which client sites each individual user has visited and when, as well as which ads the user has clicked on. As long as the user has a static IP address, opting out of the DART cookies will not prevent this type of tracking, since the opt-out cookie still tracks the user’s IP address.

The information that Google records includes IP addresses, browser types, operating systems, dates and times of searches, and links and ads clicked on. So Google can compile a list of all the terms a particular user has searched for, as well as which results and ads the user has clicked on.

After the merger, Google-DoubleClick would be able to track, by IP address, all the terms each user has searched for, all the search results the user has clicked on, all the sites that are clients of DoubleClick that the user has visited, and all the DoubleClick and Google ads the user has clicked on. And if the user has a changing IP address, Google-DoubleClick could use cookies as a backup method and still be able to track all of these things by individual user. Although Google-DoubleClick would not be able to link IP addresses to names without help from ISPs, there is a significant danger that their records of information could be personally identifiable. For example, some people google their own names, the weather in their town, or directions from their address to various destinations. Would you be happy knowing that one company has such an extensive record of your online activities and may be able to link it to your name?
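Here is a rough sketch of how simple this kind of merging could be once both sets of logs sit in one place. The records, field names, and the function `build_profiles` are hypothetical; I have no knowledge of how either company really stores its data, so this only illustrates the idea of joining two logs on a shared IP address.

```python
from collections import defaultdict

# Hypothetical search-engine log entries (invented for illustration).
search_log = [
    {"ip": "203.0.113.5", "time": "2007-12-27 09:01", "query": "weather boston"},
    {"ip": "203.0.113.5", "time": "2007-12-27 09:03", "query": "jane doe"},
    {"ip": "198.51.100.7", "time": "2007-12-27 10:15", "query": "used cars"},
]

# Hypothetical ad-network log entries (invented for illustration).
ad_log = [
    {"ip": "203.0.113.5", "time": "2007-12-27 09:10", "site": "news.example"},
    {"ip": "198.51.100.7", "time": "2007-12-27 10:20", "site": "autos.example"},
]


def build_profiles(search_rows, ad_rows):
    """Merge both logs into one time-ordered activity list per IP address."""
    profiles = defaultdict(list)
    for row in search_rows:
        profiles[row["ip"]].append((row["time"], "search", row["query"]))
    for row in ad_rows:
        profiles[row["ip"]].append((row["time"], "visit", row["site"]))
    for events in profiles.values():
        events.sort()  # timestamps in this format sort chronologically as text
    return dict(profiles)


profiles = build_profiles(search_log, ad_log)
for ip, events in profiles.items():
    print(ip, events)
```

Notice that the made-up profile for the first address contains both a town and a name, which is exactly the kind of combination that could make an "anonymous" record identifiable.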

Allowing the Google-DoubleClick merger might lead to a slippery slope. I wonder if DoubleClick will still provide the ability to opt out of cookies, or if it will conform to Google’s policy of not allowing users to opt out of having their activities recorded. What if, at some point down the line, Google teamed up with an Internet service provider? Then it would be able to match IP addresses to names with no difficulty whatsoever, and Internet activity for the customers of that ISP would lose all anonymity and privacy.

In my next post I will briefly describe the legal protections against the government gaining access to Google’s and DoubleClick’s vast records of information…

DoubleClick’s privacy policy

Sunday, December 23rd, 2007

DoubleClick seems to be even less forthcoming about privacy than Google. I checked out their privacy policy, and here is a summary of what it says:

The only personal information DoubleClick collects is voluntarily provided by people and includes contact information from people who contacted DoubleClick and e-mail addresses of people who signed up for newsletters. (1) DoubleClick defines personal information as anything that can identify a particular person, including names, addresses, e-mail addresses, telephone numbers, social security numbers, credit card numbers, or bank account numbers. (2)

Non-personal information is collected through session cookies, persistent cookies, server logs, and Web beacons. DoubleClick considers ISPs, operating systems, browser types, and cookie IDs to be non-personal information. (2)

Session cookies are cookies that last only until a user closes his or her browser. DoubleClick uses session cookies when visitors reach their site through ads on other sites. The cookies track what site the user reached DoubleClick.com from and what ad they clicked on, so that DoubleClick can track the effectiveness of their advertising.

Persistent cookies are cookies that last over more than one browser session and are stored in the Cookies folder on a user’s computer. DoubleClick doesn’t go into much detail on how extensively it uses these on its site, but states that it uses data from them “to better understand how the website is used, resolve technical problems, and enhance your experience at this site.” (1)
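The difference between the two kinds of cookies comes down to whether the server attaches a lifetime when it sets the cookie. The sketch below uses Python's standard `http.cookies` module to build both kinds of header value. The cookie names, values, and the helper `make_cookie` are my own inventions, and I use the `Max-Age` attribute for simplicity; I don't know which names or attributes DoubleClick actually uses.

```python
from http.cookies import SimpleCookie


def make_cookie(name, value, max_age_seconds=None):
    """Build a Set-Cookie header value; no lifetime means a session cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    if max_age_seconds is not None:
        # A lifetime tells the browser to keep the cookie on disk
        # after it is closed, which makes the cookie persistent.
        cookie[name]["max-age"] = max_age_seconds
    return cookie[name].OutputString()


# Hypothetical session cookie: forgotten when the browser closes.
session_header = make_cookie("referrer_ad", "banner42")

# Hypothetical persistent cookie: kept for one year.
persistent_header = make_cookie("visitor_id", "abc123", 60 * 60 * 24 * 365)

print(session_header)
print(persistent_header)
```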

DoubleClick also uses a specific type of persistent cookie called a DART cookie to serve ads on other sites. The cookie gives each user a unique numerical identifier so that DoubleClick’s client sites can track which ads and sponsored search listings that user clicks on, in order to deliver personalized ads and gauge which ads are most successful. It is the clients, not DoubleClick, that decide how they will use cookie information to determine which ads to display. So, although DoubleClick makes clients promise never to use sensitive or personally-identifiable information to choose ads, (2) one never really knows what data the client sites are gathering about their users.

Thankfully, DoubleClick makes it easy for users to opt out of the DART cookies. By merely clicking a button on DoubleClick’s privacy page, a user can replace his or her DART cookie with an opt-out cookie. This opt-out cookie contains no unique numerical identifier, but it does track the user’s operating system, browser type, IP address, and local time. For users who have opted out, ads are targeted based only on the content of the client web page that the user is viewing. Simply deleting the DART cookie is not a good way of opting out, since each site that displays DoubleClick ads will search for a DoubleClick cookie and will set a new cookie if there isn’t already one. If a user clicks on the “opt-out” link, however, all sites that display DoubleClick ads will recognize the opt-out cookie and will not set any new cookies. (3)
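The reason deleting the cookie fails while opting out works can be sketched in a few lines. This is only my reading of the behavior described above, not DoubleClick's real code: the marker value `OPT_OUT` and the function `cookie_to_send` are hypothetical, and I use a random hex string where the real cookie holds a numerical identifier.

```python
import uuid

# Hypothetical marker stored in the opt-out cookie (carries no unique ID).
OPT_OUT = "OPT_OUT"


def cookie_to_send(existing_cookie):
    """Return the cookie value an ad server would leave in the browser."""
    if existing_cookie == OPT_OUT:
        # Opt-out cookie recognized: keep it and assign no identifier.
        return OPT_OUT
    if existing_cookie is None:
        # No cookie found (first visit or user deleted it): issue a new ID.
        return uuid.uuid4().hex
    # Existing tracking cookie: keep using the same identifier.
    return existing_cookie


after_delete = cookie_to_send(None)
after_opt_out = cookie_to_send(OPT_OUT)
print(after_delete != OPT_OUT, after_opt_out == OPT_OUT)
```

In this sketch, a user who deletes the cookie simply gets a fresh identifier on the next ad request, while a user holding the opt-out cookie never receives one.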

Another way that DoubleClick collects information is through server logs, which track IP addresses and referring URLs so that DoubleClick can find out how people use its site. (1) The privacy policy doesn’t go into much more detail than this regarding server logs.

Finally, DoubleClick uses Web beacons, which they describe as “small strings of code that are placed in a Web page.” (1) I decided to do a little more research, and I found out that Web beacons are also known as “Web bugs” and are used to track users’ behavior on third-party sites. A Web bug is an image, usually transparent, that is displayed on a web site but resides on another site’s server. When a Web bug loads, the server where the image resides can log information about who views the bug, and therefore who views the third-party website. This information includes what one would expect to find in server logs, such as the user’s IP address, browser type, time of visit, and the URLs of the third-party site and the bug image. (4) So, for example, DoubleClick may place Web bugs on sites that it serves ads on. This would enable them to track how popular each site is, as well as (through IP addresses or cookies) which sites each individual views so that it can customize future ads to the individual based on his or her browsing history.
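To make the Web bug idea more concrete, here is a small sketch of what the server hosting the image could record each time a browser fetches it. The domain names, the image path, and the function `log_beacon_request` are made up for illustration. The header names (`Referer`, `User-Agent`, `Cookie`) are standard HTTP request headers, but which of them DoubleClick actually logs is my assumption.

```python
# A hypothetical beacon as it might appear in a client site's HTML:
# a 1x1 image that is hosted on the tracking company's server.
BEACON_HTML = '<img src="http://tracker.example/bug.gif" width="1" height="1">'


def log_beacon_request(client_ip, path, headers, timestamp):
    """Collect the details a server can see when a browser loads the image."""
    return {
        "ip": client_ip,
        "time": timestamp,
        "beacon_url": path,
        # The Referer header reveals which third-party page embedded the bug.
        "page_viewed": headers.get("Referer"),
        "browser": headers.get("User-Agent"),
        # Any cookie previously set by the tracker is sent back automatically.
        "cookie": headers.get("Cookie"),
    }


entry = log_beacon_request(
    client_ip="203.0.113.5",
    path="/bug.gif",
    headers={
        "Referer": "http://news.example/article.html",
        "User-Agent": "ExampleBrowser/1.0",
        "Cookie": "id=abc123",
    },
    timestamp="2007-12-23 14:30",
)
print(entry)
```

The user never sees the image, yet a single request for it hands over the page being read, the browser in use, and the cookie that ties this visit to earlier ones.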

Towards the end of its privacy policy, DoubleClick states that it may share information with third parties if it has a contract with them to “provide some part of the information or service that you have requested.” (1) It also reminds users that personal information “may be subject to disclosure pursuant to judicial or other government subpoenas, warrants or orders.” (1) Finally, DoubleClick says that it takes “reasonable security measures in order to protect both personal and non-personal information from loss, misuse and unauthorized access, disclosure, alteration or destruction.” (1)

Sources: 

1. “Privacy Policy for Information Use at This Website.” DoubleClick.com. 28 Aug. 2006. 23 Dec. 2007. <http://www.doubleclick.com/privacy/index.aspx>.

2. “FAQ.” DoubleClick.com. 23 Dec. 2007 <http://www.doubleclick.com/privacy/faq.aspx>.

3. “DART Ad-serving and Search Cookie Opt-Out.” DoubleClick.com. 23 Dec. 2007 <http://www.doubleclick.com/privacy/dart_adserving.aspx>.

4. Smith, Richard M. “The Web Bug FAQ.” EFF.org. 11 Nov. 1999. 23 Dec. 2007 <http://w2.eff.org/Privacy/Marketing/web_bug.html>.