News Feature | August 7, 2014

Tokenization And De-identification Of PHI

By Megan Williams, contributing writer

It’s time to take a serious look at de-identification of patient data.

Recent reports suggest that the healthcare industry as a whole faces a greater security risk than the already compromised retail sector. We also don’t have to look far to find news of companies penalized for putting large amounts of patient information (20 million pieces in one case) at risk. Healthcare is a large and varied industry, even on the tech front, so no security solution should be overlooked.

De-identification Should Not Be Dismissed

A recent article from Health IT Outcomes looked at what Ontario’s Information And Privacy Commissioner, Dr. Ann Cavoukian, had to say about the importance of de-identification as a security option.

"De-identification remains one of our strongest and most important tools for protecting privacy. To suggest that information may only be de-identified at the expense of data quality is based on an outdated zero-sum paradigm," Cavoukian said, challenging recent reports that call into question the usefulness of de-identification. "We need to continue working towards perfecting de-identification techniques and re-identification risk management frameworks, thereby ensuring that de-identification remains an essential tool in protecting privacy, both now, and well into the future."


While de-identification is nothing new, tokenization has not received much attention within the industry. A 2013 research article published by BioMed Central, “Improved De-identification Of Physician Notes Through Integrative Modeling Of Both Public And Private Medical Text,” covered the application of tokenization to public and private medical text sources.

The authors reported that their model correctly recalled 98 percent of PHI tokens from 220 discharge summaries. They were able to use medical concepts and terms such as “elevated white blood cell count” as de-identification indicators. Overall, the results exceeded the approved criteria established by four Institutional Review Boards.
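For readers who want to see what "recalled 98 percent of PHI tokens" means in practice, here is a minimal sketch of a token-level recall calculation. It is not the researchers' code: the function name, the sample sentence, and the labels are all invented for illustration. The only assumption carried over is that recall means the share of true PHI tokens that the model also flagged.

```python
def phi_token_recall(gold_labels, predicted_labels):
    """Return the fraction of true PHI tokens that were also flagged as PHI.

    Both arguments are lists of booleans, one entry per token
    (True = PHI). This is a hypothetical helper for illustration only.
    """
    if len(gold_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    true_phi = sum(1 for g in gold_labels if g)
    if true_phi == 0:
        raise ValueError("no PHI tokens in the gold labels")
    caught = sum(1 for g, p in zip(gold_labels, predicted_labels) if g and p)
    return caught / true_phi


# Invented example: a short note split into tokens.
tokens = ["Mr.", "Smith", "was", "admitted", "on", "March", "3",
          "with", "elevated", "white", "blood", "cell", "count"]

# Gold labels mark "Smith", "March" and "3" as PHI.
gold = [False, True, False, False, False, True, True,
        False, False, False, False, False, False]

# The hypothetical model catches "Smith" and "March" but misses "3".
predicted = [False, True, False, False, False, True, False,
             False, False, False, False, False, False]

recall = phi_token_recall(gold, predicted)
print(f"PHI token recall: {recall:.2f}")
```

In this toy case the model finds two of the three PHI tokens. A missed token is a piece of patient information left in the text, which is why recall is the figure to watch in de-identification work.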

Your Clients

For you and your clients, that means invulnerable data, even in healthcare, is not an unreasonable goal. You stand to take advantage of an opportunity in the gap between existing tokenization technology and the needs of the healthcare sector.

Go Deeper

As always, healthcare presents special challenges: HHS has already established a standard for what qualifies as de-identified PHI. To read more on the standards and the rationale behind de-identification from a government perspective, see HHS’ guidance document.