There is a growing debate as to when data should be considered personally identifiable and when it should be considered anonymous (or pseudonymous). I don't think we can hope here to establish a bright-line test that determines once and for all whether controversial identifiers like IP addresses are or are not PII, or how many queries it takes before you can triangulate the identity of the person who is querying, but the obvious cases are fairly simple to address.
When Cookies Are Not Anonymous
We often hear that cookies are anonymous and are shown the cookie itself as proof of such anonymity. Who could tell you the identity of id=abc123? And there is the question: who could? As we saw earlier in the discussion of how cookies store data, there are two possible mechanisms, which I have called direct storage and referential storage. Determining whether a direct storage cookie is PII is pretty straightforward. If you see a cookie containing, e.g., email@example.com, you know immediately that it isn't an anonymous cookie. But what about id=abc123? The answer depends on what the cookie issuer has been able to link to the cookie. If they were able to link the cookie to, e.g., an Order ID or a Customer ID AND have access to the data associated with that Order ID or Customer ID, then clearly the cookie is not anonymous!
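The distinction between direct and referential storage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name classify_cookie, the links and customer_records tables, and the sample IDs are all hypothetical, and the email pattern is deliberately crude.

```python
import re

# Crude, illustrative pattern for an email address stored directly in a cookie.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def classify_cookie(cookie_value, links, customer_records):
    """Classify a cookie from the point of view of ONE party.

    links:            hypothetical table mapping cookie value -> Customer ID
    customer_records: hypothetical table mapping Customer ID -> personal data
    """
    # Direct storage: the identity sits in the cookie itself.
    if EMAIL_RE.search(cookie_value):
        return "direct PII"
    # Referential storage: the cookie is only a key, so the answer depends on
    # what this party has linked to it AND whether it can read the linked data.
    customer_id = links.get(cookie_value)
    if customer_id is not None and customer_id in customer_records:
        return "referential PII"
    return "anonymous to this party"


# Hypothetical data held by a cookie issuer.
links = {"id=abc123": "C-1001"}
customer_records = {"C-1001": {"email": "email@example.com"}}

print(classify_cookie("email@example.com", links, customer_records))
print(classify_cookie("id=abc123", links, customer_records))
print(classify_cookie("id=abc123", {}, {}))
```

The last call passes empty tables to stand in for a party that holds no link for the cookie. The same id=abc123 comes back as anonymous there, which is the point: the cookie value alone settles nothing.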
It turns out these questions of who knows what are particularly complicated, because ownership and control of the cookie data may rest with a party other than the one that sets the cookie. This is particularly the case in online advertising, where large companies acting as service providers collect data through, and linked to, their cookie, while the data itself belongs to the end advertising client. If that advertising client then maps a Customer ID to the cookie, the technology service provider can truthfully say, "we can't associate personal data to the cookie", but left unsaid is the fact that the owner and controller of such data (the advertiser) can and does do this mapping, as it is often a crucial part of the business plan, as discussed in ad reporting.
An unfortunate side effect of many entities sharing the same cookie is that it makes it extremely difficult to be transparent. To be clear, not every advertiser using large ad serving technology providers directly maps customer identity to the cookie, but many do. In the analytics world, where the primary data collection often happens on the merchant's site, we may expect the percentage to be even higher.
There has also been a potentially bad habit in the industry of being semantically correct while missing transparency completely. Behavioral advertisers will almost unanimously tell you that they don't use personal data for ad targeting. I have no doubt this is almost universally true. Unfortunately, nothing in that statement precludes the same cookie that was used to perform such targeting from being used to identify you by name and email address on the reporting side. In effect, this allows ads to be targeted by the service provider based on the cookie's anonymous association with having visited an automobile site, but should the ads be seen, the same cookie may be used by the advertiser to reference the full identity of its owner. Again, such a lookup relies on the difference between what the advertiser links to a cookie and what the service provider links to the same cookie.
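The split between targeting and reporting can be sketched as two tables keyed by the same cookie, each held by a different party. Everything here is a hypothetical illustration: the table names, the function names choose_ad and report_impression, and the sample customer are assumptions, not any vendor's actual system.

```python
COOKIE = "id=abc123"

# Service provider's view: only behavioral segments, no personal data.
provider_segments = {COOKIE: {"visited_auto_site"}}

# Advertiser's view: the SAME cookie mapped to a customer record.
advertiser_customers = {
    COOKIE: {"name": "Jane Doe", "email": "jane@example.com"},
}


def choose_ad(cookie):
    """Targeting side (service provider): uses only anonymous segments."""
    segments = provider_segments.get(cookie, set())
    return "car_ad" if "visited_auto_site" in segments else "generic_ad"


def report_impression(cookie, ad):
    """Reporting side (advertiser): joins the impression to its own records."""
    return {"ad": ad, "cookie": cookie, "customer": advertiser_customers.get(cookie)}


ad = choose_ad(COOKIE)
print(ad)
print(report_impression(COOKIE, ad))
```

In this sketch choose_ad never touches personal data, so the claim that targeting is anonymous holds, yet report_impression resolves the same cookie to a name and email address.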
1st Parties Are The New Third Party
Another troubling recent trend is that cookies which traditionally appeared only in 1st party contexts are now also appearing in 3rd party contexts. This is particularly troubling when so many sites' privacy policies state in some form or fashion that third party cookies are anonymous. The reason this has become so problematic is that when I provide my identity to Facebook, Twitter, LinkedIn, Amazon or Yahoo! on their site, I have a natural expectation that they may save that identity in a cookie to recognize me and recall my preferences upon my return to the site. For better or worse, many of the companies we have traditionally thought of as 1st parties appear with great frequency in 3rd party contexts, and when they do, the same cookie(s) used to recognize you on their own site are sent in the 3rd party context.
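A toy model of a browser cookie jar shows why this happens. It is a deliberately simplified sketch: it assumes cookies are selected purely by the domain a request goes to, it ignores any browser settings or cookie attributes that may restrict third party cookies, and the class name and the domains social.example and news.example are hypothetical.

```python
class CookieJar:
    """Toy cookie jar: one cookie per domain, chosen by request domain only."""

    def __init__(self):
        self._cookies = {}  # domain -> cookie string

    def set_cookie(self, domain, cookie):
        self._cookies[domain] = cookie

    def cookies_for_request(self, request_domain, page_domain):
        # The cookie is picked by where the request is GOING, not by which
        # page the user is looking at.
        cookie = self._cookies.get(request_domain)
        context = "1st party" if request_domain == page_domain else "3rd party"
        return cookie, context


jar = CookieJar()
# The user logs in on the social site, which stores an identifying cookie.
jar.set_cookie("social.example", "id=abc123")

# Later visit to the social site itself.
print(jar.cookies_for_request("social.example", "social.example"))
# Visit to a news site that embeds a widget served from the social site.
print(jar.cookies_for_request("social.example", "news.example"))
```

Both calls return the same id=abc123. Only the context label differs, so a cookie tied to a logged-in identity reaches its issuer from the embedding page as well.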