It is nice to have scientific evidence of what we’ve been telling business leaders ever since they wanted to start using SSNs as identifiers and passwords!
Today Carnegie Mellon University (CMU) released a very revealing report, “Predicting Social Security numbers from public data.” I want to expand upon some of the issues covered within it, and then urge you to communicate the related concerns for your organization effectively to your business leaders…
I love the first sentence in the report, “In modern information economies, sensitive personal data hide in plain sight amidst transactions that rely on their privacy yet require their unhindered circulation.” So true, and eloquently put!
Implications of this research’s findings — for business, government and the individual
Social Security numbers were created for the purpose of tracking earnings and paying benefits. They were never meant to be used by businesses as an identifier, but they have taken on that role because everyone has one, and so it was easy to do. Why create something new if there was already a unique ID available, right? At least that is what business leaders argued; after all, it saved them time and money.
Early in my career, while a systems analyst in the late 1980s, I expressed my concern to management about using SSNs as the employee computer account identifier as well as the employee ID. The response was that it was an existing piece of information that was readily available to uniquely identify employees, and that since there was no law against using it, they would continue to do so. This was before there was any connectivity to outside networks.
In the mid-1990s I once more expressed my concern about the use of SSNs as identifiers for our customers when the company was planning to give customers access to their accounts via our Internet site. This time the business execs pointed out how many applications would need to be changed (literally hundreds and possibly thousands) and how many different databases (also hundreds and likely thousands) would be affected: at the least, literally millions of SSNs would have to be replaced, and at the most, the databases would have to be completely re-architected to accommodate a different ID structure. It would take months, perhaps years, to change everything, and since there were no laws against using SSNs at that time, they continued using them.
Finally, after the dawn of the new century and after I’d left the company, they stopped using the full SSN as the customer identifier. However, they were still using the last six digits in many of the business units for customer IDs (calling them customer IDs and not SSNs of course), and also for their internal networks and their personnel IDs. All 20,000+ personnel could see each other’s personnel IDs in various directories and as part of their email addresses. And a large portion of personnel could see customer IDs in any number of the thousands of databases.
Over the tenure of my stay at that company, there were occasionally people who complained about the practice. They often used the argument that if someone knew the last six digits of the SSN, it would be fairly easy to figure out the first three digits, and then any one of the 20,000+ people in the company could go off and do bad things armed with the full SSN. The executives always argued, though, that there was never any proof or study that showed this was possible, and that because personnel came from all states and numerous different countries, just knowing the last six digits would not pose an identity fraud threat. I always wished I had a study to point to that showed the relative ease with which the full SSN could be discovered.
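To put a rough number on the complainers' argument, here is a minimal Python sketch of how little is left to guess once the last six digits are known. It assumes the familiar nine-digit layout (a three-digit area number, a two-digit group number and a four-digit serial number). The function name, the sample digits and the ten-number area range are all made up for illustration; the range is a placeholder and not a real allocation for any state.

```python
# Sketch only: count the 9-digit strings consistent with six known trailing digits.
# Assumed layout: AAA-GG-SSSS (area, group, serial).

def remaining_candidates(known_last_six: str, area_range: range = range(0, 1000)) -> list[str]:
    """Return every 9-digit string that ends with the known last six digits."""
    if len(known_last_six) != 6 or not known_last_six.isdigit():
        raise ValueError("expected exactly six digits")
    return [f"{area:03d}{known_last_six}" for area in area_range]

# With no other knowledge, all 1,000 three-digit prefixes remain possible.
all_prefixes = remaining_candidates("123456")

# Hypothetical: suppose the issuing state is known and maps to ten area numbers.
HYPOTHETICAL_STATE_AREAS = range(100, 110)  # placeholder, not a real allocation
narrowed = remaining_candidates("123456", HYPOTHETICAL_STATE_AREAS)

print(len(all_prefixes), len(narrowed))  # prints: 1000 10
```

Even in the worst case for the guesser there are only a thousand possibilities, and anything that narrows the area number (such as knowing where someone grew up) shrinks the list further.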
I was pleasantly surprised to hear that Carnegie Mellon had just finished a study that not only proved the ease with which SSNs could be predicted, but also showed that you don’t even need to know the last six digits! All you really need to determine most SSNs is some key, often publicly available, information, such as simply place of birth and date of birth.
Think about all the places where these information items are found!
The first paragraph of the CMU report states, “Unless mitigating strategies are implemented, the predictability of SSNs exposes them to risks of identify theft on mass scales.” Actually, it goes beyond just identity theft to all types of identity fraud and crimes, and also presents very real physical risks through medical identity theft; many hospitals and clinics use the SSN as their main identifier.
This scientifically sound study demonstrates how relatively simple it is for folks who can write computer programs (and there are MANY of them out there!) to take easily and quickly found pieces of otherwise mundane information and use computing power to guess valid SSNs, often accurately.
Something I suspect most people don’t realize comes from page 3 of the report: “In practical applications, SSNs are often used as authenticators in inquiries processed by Credit Reporting Agencies (CRAs). Since consumer credit reports contain errors and inconsistencies, CRAs are known to accept as valid even inquiries where just seven out of nine SSN digits are actually correct.” This will alarm many, as it should.
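A quick back-of-the-envelope check shows how much slack that tolerance creates. The Python sketch below assumes “seven out of nine digits correct” means a position-by-position comparison; the report excerpt does not say how CRAs actually compare digits, so treat that as my assumption. The function names are hypothetical.

```python
from math import comb

def accepted_strings(total_digits: int = 9, min_correct: int = 7) -> int:
    """Count digit strings agreeing with one true number in at least min_correct positions."""
    max_wrong = total_digits - min_correct
    # Pick which k positions are wrong; each wrong position has 9 possible wrong values.
    return sum(comb(total_digits, k) * 9**k for k in range(max_wrong + 1))

def matches(guess: str, actual: str, min_correct: int = 7) -> bool:
    """Position-by-position comparison under the assumed tolerance."""
    if len(guess) != len(actual):
        return False
    return sum(g == a for g, a in zip(guess, actual)) >= min_correct

exact = accepted_strings(min_correct=9)     # only the true number itself
tolerant = accepted_strings(min_correct=7)  # true number plus all near misses
print(exact, tolerant)  # prints: 1 2998
```

Under that assumption, 2,998 different nine-digit strings would pass for any one real SSN, so a blind guess goes from a 1 in a billion chance to roughly 3 in a million, before the guesser uses any of the prediction techniques the study describes.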
This study also shows why we must educate the public in general, and business leaders in particular, about why it is important not to post so many types of PII otherwise considered “harmless” onto public sites, and why businesses should collect only the minimum information necessary to perform business activities. Too many businesses collect so much more PII than is actually necessary, and then thousands of their employees, other types of workers, and business associates have access to it all.
If you read nothing else, you should take away a key sentence from the report, and that is, “Such findings highlight the hidden privacy costs of widespread information dissemination and the complex interactions among multiple data sources in modern information economies, underscoring the role of public records as breeder documents of more sensitive data.” Even though large numbers of information security and IT folks have believed and communicated this to business leaders for years, it has never been demonstrated with such clarity before.
The public and businesses must see this and understand that they should not be so thoughtless and eager to post large amounts of information to the Internet, or to give it to their employers; the information items, each on their own, may not have much meaning or power, but when considered all together they can reveal the current key to a person’s very identity: the Social Security number.
What solutions could address or at least mitigate the problems this research highlights?
There is no one silver bullet solution. Multiple actions need to occur to address the existing SSNs, and to make changes in how future SSNs are created.
1) Considering how widespread SSN use is, and how dependent the government has become on this identifier for providing benefits, simply stopping the use of SSNs is likely not feasible. However, the government should establish a new way to create SSNs going forward that does not depend upon widely available information to establish the number. If you read the press releases accompanying the report, you’ll see that the Social Security Administration states it plans to start using a different method to create SSNs in 2010. (About time!)
2) The government should stop publicly publishing the SSA Death Master File. It is published to make it easy for those doing HR and other related activities that require verifying customers to check the validity of the SSNs they have been provided. It makes sense that specific persons within businesses need access to this information to perform their job responsibilities and help to prevent fraud. However, just give the folks with these specific responsibilities access to the database; it would be fairly straightforward to do this instead of just throwing all the information out there for everyone in the world, including millions of crooks and would-be crooks, to see.
3) Organizations should stop using SSNs as verification items and account identifiers! As the study shows, too many people can figure SSNs out based upon very little publicly available information.
4) Stop using the SSN prediction items to verify individual identities. There are many other better ways to verify identity than asking for your birth city/state and/or birthdate.
5) Increase the awareness in the general public of the dangers of sharing personally identifiable information (PII) on any type of Internet site, particularly on Web 2.0 types of sites. Even if the site operator says that an area on the site is “private” or “secure,” everyone should consider that anything they put on the site could be seen, copied and used by others. There have certainly been many examples of this already. Now that people know the few types of information that can be used to determine SSNs with confidence, they should realize that by posting any type of PII to a Web 2.0 Internet site they are putting themselves in harm’s way.
6) Require Internet sites and other types of online service providers to protect information, such as birth city and birthdate, that is used now or was used in the past to create SSNs.
7) While the problems and the resulting reasoning are understandable, CRAs must stop accepting inquiries as valid when only 7 of 9 SSN digits are correct; this makes it much too easy for crooks to get access to all sorts of damaging information about people! Lotteries don’t give you the jackpot for matching all but two digits on your ticket, but here we have CRA businesses handing the jackpot to virtually anyone who asks and provides a “close” match.
8) THIS IS WORTH REPEATING BECAUSE IT NEEDS TO BE DONE, BUT IT IS NEVER DONE ENOUGH! Increase the awareness in the general public of the dangers of sharing personally identifiable information (PII) on any type of Internet site, particularly on Web 2.0 types of sites. Even if the site operator says that an area on the site is “private” or “secure,” everyone should consider that anything they put on the site could be seen, copied and used by others. There have certainly been many examples of this already. Now that people know the few types of information that can be used to determine SSNs with confidence, they should realize that by posting any type of PII to a Web 2.0 Internet site they are putting themselves in harm’s way.
Just a couple of decades or so ago, it wasn’t much of a threat to have so many diverse pieces of personal information available through widely scattered sources, mostly in physical formats that could not be quickly accumulated and analyzed by human brains to produce valuable assimilated PII that could be used to invade and take over a person’s life. We are now in a highly computerized world, with massively powerful computing capabilities, and the old practices are just too risky to continue using.