Interesting University Paper: “Privacy as an Operating System Service”

Periodically I check university sites for research papers on information security, privacy and compliance.  They are a wonderful source of research references, stimulate further thinking, and often contain interesting, forward-thinking proposals that you do not hear about from vendors or practitioners.

Today I ran across a paper posted on the Columbia University site in July of this year, "Privacy as an Operating System Service" by Sotiris Ioannidis, Stelios Sidiroglou, and Angelos D. Keromytis.  It contains some intriguing ideas about how to implement pervasive privacy services in the personal computer operating systems that most non-technical folks use.

I think it is interesting to think of "privacy," which I view as a goal, state or right in some situations, as part of a technical operating system service.  Certainly there are many technical privacy services out there right now, such as P3P.  Viewed from the strictly technical aspect, then, privacy baked into the operating system is a wonderful goal.  I’ve written often about the need to incorporate privacy and information security into applications and systems, so this is a nice discussion of how to do that within a personal computer OS.  Okay…now to look at a few of the points within the paper and provide a few thoughts…

The concept of removing personally identifiable information (PII) through the OS is quite interesting.  There are a growing number of vendor products out there right now that are attempting this, and most (if not all) have some very big challenges in thoroughly accomplishing this task.

They provide a good list of challenges with implementing privacy within the OS, as follows:

  • "Protocol Spanning: The operating system must have knowledge of the data and meta-data representation of applications. It needs to use this information to sanitize private information for each application in the system, or at least for those applications that the user has specified. For example, in order to scrub user name information in Microsoft Word and Open Office documents, the scrubbing module will have to be able to parse and according to policy remove user name references in both formats.
  • Single Point-of-Failure: Adopting a centralized operating system approach introduces the risk of global failure. If the operating system has a fault in the way it sanitizes private information, all applications will be affected.
  • Performance: It is possible that due to the centralized nature of an OS-center solution, that we might cause a performance bottleneck when executing privacy operations."
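To make the protocol spanning point concrete, here is a minimal sketch of what "one scrubbing module, many formats" might look like. It is my own illustration, not anything from the paper: documents are modeled as plain Python dicts, and the field names ("author", "creator", "body") and the format labels are hypothetical stand-ins for whatever the real file formats store.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a format label to its scrubber.
# Real formats would need real parsers; dicts stand in for them here.
Scrubber = Callable[[dict, str], dict]
SCRUBBERS: Dict[str, Scrubber] = {}


def register(fmt: str):
    """Decorator that records a scrubber for one document format."""
    def wrap(fn: Scrubber) -> Scrubber:
        SCRUBBERS[fmt] = fn
        return fn
    return wrap


@register("word")
def scrub_word(doc: dict, user_name: str) -> dict:
    # Assumed layout: the user name lives in an "author" field and the body.
    cleaned = dict(doc)
    cleaned["author"] = ""
    cleaned["body"] = doc["body"].replace(user_name, "[removed]")
    return cleaned


@register("openoffice")
def scrub_openoffice(doc: dict, user_name: str) -> dict:
    # Assumed layout: same idea, but the field is called "creator".
    cleaned = dict(doc)
    cleaned["creator"] = ""
    cleaned["body"] = doc["body"].replace(user_name, "[removed]")
    return cleaned


def scrub(fmt: str, doc: dict, user_name: str) -> dict:
    """Dispatch to the format-specific scrubber, failing loudly if none exists."""
    scrubber = SCRUBBERS.get(fmt)
    if scrubber is None:
        # An unknown format is exactly the protocol spanning gap:
        # the OS cannot sanitize what it cannot parse.
        raise ValueError(f"no scrubber registered for format: {fmt}")
    return scrubber(doc, user_name)
```

Even in this toy version, every new application format needs its own entry in the registry, which is the heart of the challenge the authors describe.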

Yes, protocol spanning would be a huge challenge.  Think about all the possible applications that individual computer users could have, the diversity of all the vendors, and the likelihood that they would all cooperate to allow the type of collaboration and integration that would be necessary.  Most home computer users use a vast variety of software packages that are very unlike business software, and most of them collect and/or use PII in one way or another.  I’m thinking now about all the software packages (educational, interactive, etc.) that my sons use, and I’m not sure how the PII could be scrubbed from those accompanying data storage repositories.  The first thought is, well maybe that is not necessary, since those types of files would not be sent out of the computer anyway.  However, if that computer is also sometimes attached to the Internet, and an incoming probe or spyware makes its way through the personal firewall, then that data would be put at risk.  On the other hand, that is a risk today, so having the privacy in the OS work with SOME applications would be better than nothing, as long as the computer user does not get a false sense of complete privacy by using the OS privacy capabilities.

The paper gives a concise discussion of the challenges of scrubbing PII from meta data, para data and raw data.  However, it doesn’t suggest possible resolutions to these challenges, or even how to go about trying to resolve them.  I would have liked to have seen more about that.

Of course, the primary problem is the definition of what exactly constitutes PII, and then having a common format or look to those PII items.  PII is not universally defined.  Just within U.S. federal laws, PII is defined in many different ways.  Looking globally you find even more definitions.  Across roughly 90 laws worldwide I’ve found around 50 different specific types of information that fall within these legal definitions.  Trying to integrate all of them would be an insurmountable task, it would seem.  However, if you would pick, let’s just say, the 10 most common or critical types of PII (perhaps those used most commonly for identity theft and fraud) to define globally, that would certainly be a very good start.

Also key is the ease with which the computer user would actually be able to set their own chosen privacy settings.  Making this very easy for a non-technical computer user is certainly a challenge in and of itself, even after a usable solution has been found and implemented in the OS.

I would also want such a solution to be customizable so that it is not TOO aggressive in removing everything it identifies as PII from your outbound traffic…there may be instances where you need to send out what at least appears to be valid PII.
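As a rough sketch of the kind of tuning I have in mind, the Python below lets a user switch individual PII types on or off and exempt specific values they really do need to send. Everything here is my own assumption for illustration: the two patterns are deliberately simplistic, and the names PII_PATTERNS, PrivacyPolicy and scrub_outbound are made up, not taken from the paper or any product.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only; real PII detection is far harder than this.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class PrivacyPolicy:
    """Hypothetical user-editable settings for outbound scrubbing."""
    # Which PII types to scrub; defaults to every known type.
    enabled_types: set = field(default_factory=lambda: set(PII_PATTERNS))
    # Exact values the user has chosen to let through untouched.
    allowed_values: set = field(default_factory=set)


def scrub_outbound(text: str, policy: PrivacyPolicy) -> str:
    """Replace enabled PII types in text, skipping user-approved values."""
    for pii_type in sorted(policy.enabled_types):
        pattern = PII_PATTERNS[pii_type]

        def replace(match, pii_type=pii_type):
            value = match.group(0)
            if value in policy.allowed_values:
                return value  # user explicitly allowed this value
            return f"[{pii_type} removed]"

        text = pattern.sub(replace, text)
    return text
```

With the default policy both the email address and the SSN-like number in a message would be replaced; adding an address to allowed_values, or dropping a type from enabled_types, dials the aggressiveness back.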

Overall this paper was a good high-level look at the concept of implementing privacy within the OS.  While it wandered here and there from the main idea at times, it was thought-provoking (at least it generated all kinds of questions for me as I read it) and is a good discussion centerpiece for this topic.
