I was glad to see so many people interested in this technical topic of preserving data privacy from a computing perspective. We have seen an explosion in the number of IoT devices, meaning there are privacy concerns regarding the sensor data collected by these ubiquitous IoT devices. However, this data can be very useful. When data privacy is protected and preserved, there are many opportunities to improve safety, productivity and quality of life.
Preserving data privacy needs a holistic approach. However, one critical but often overlooked angle is the computational one. This is what I would like to address in today’s talk: how can we use a computational approach to protect and preserve the data privacy of connected workers while keeping this data useful?
Here are five key takeaways I’ve identified for you. If you prefer, you can watch the recording of the full webinar.
1. We will soon be tracking anything, at any time
At its recent announcement event, Apple revealed the new AirTag. It’s a coin-sized, low-cost wireless location tag that uses ultra-wideband (UWB) and Bluetooth, allowing you to track the location of an item accurately, using an anchor like an iPhone or Apple TV. At $25 each and with a battery life of about a year, AirTags are among Apple’s least expensive products ever. Even a dongle is more expensive than that!
I use this example to illustrate how easy it already is to know the location of your everyday objects. Maybe today you’re only tracking the location of your most expensive machines, but very soon, everything that can be tracked will be tracked. The barriers of cost and connectivity will be removed, thanks to 5G.
Once this ocean of data is available from all of our workers and assets, it will raise even more privacy concerns.
2. Geospatial data is much more than just “pins on a map”
I’ve encountered a lot of IoT software platforms that claim to be geospatial IoT platforms because they can put asset data as pins on a map. That doesn’t mean they can extract hidden insights from spatio-temporal IoT data. “Spatial is special” is especially true for IoT. That means specialized data models, algorithms, and architectures are needed to handle the spatio-temporal observations from IoT devices. It also means that handling the privacy of location tracking data (i.e., trajectories) requires specialized methodologies.
3. Striking a balance between privacy and accuracy
Allow me to explain. During the webinar, I present our trajectory processing workflow in a single slide, to demonstrate a reference architecture that handles raw location tracking data and transforms it into decision-ready insights. One of the critical components in that workflow is a privacy-preserving module. To put it simply, uncertainty reduction (i.e., improving accuracy) and privacy preservation are two competing sides of the same coin. When you increase uncertainty, you preserve privacy. When you reduce uncertainty, privacy concerns go up. From a computational perspective, the key is to develop trajectory privacy-preserving algorithms that strike a balance between accuracy and privacy.
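To make that trade-off concrete, here is a minimal sketch. It is not the module from our workflow, just one simple, well-known technique (snapping locations to a grid, sometimes called spatial cloaking), and the coordinates, function names and cell sizes are all made up for illustration. A bigger grid cell means more uncertainty: the published positions are less accurate, but more workers become indistinguishable from one another.

```python
import math

def cloak(point, cell_size):
    """Snap an (x, y) point, in metres, to the centre of its grid cell."""
    x, y = point
    cx = (math.floor(x / cell_size) + 0.5) * cell_size
    cy = (math.floor(y / cell_size) + 0.5) * cell_size
    return (cx, cy)

def mean_error(points, cell_size):
    """Average distance between each true point and its cloaked version."""
    return sum(math.dist(p, cloak(p, cell_size)) for p in points) / len(points)

def distinct_cells(points, cell_size):
    """Number of different cloaked locations; fewer means workers blend together."""
    return len({cloak(p, cell_size) for p in points})

# Hypothetical worker positions in a local metric coordinate system.
points = [(3.0, 4.0), (12.0, 7.0), (18.0, 16.0), (41.0, 38.0), (47.0, 44.0)]

for cell_size in (1.0, 10.0, 50.0):
    print(
        f"cell={cell_size:>5} m  "
        f"mean error={mean_error(points, cell_size):6.2f} m  "
        f"distinct locations={distinct_cells(points, cell_size)}"
    )
```

With these sample points, the mean error grows as the cell size grows, while the number of distinct published locations shrinks. Picking the cell size is exactly the balancing act described above.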
4. Removing known identifiers doesn’t mean privacy is preserved
Data privacy concerns usually arise whenever third parties are involved. In some cases, it is necessary to share data with third parties for the greater good, for example, to fight pandemics. When a third party (an entity other than the one that the individual initially shared their data with) accesses data, it is necessary to protect the identities of the individuals sharing that data. This can mean sharing details such as medical histories, locations and other private data with a third party. The question is, how can we protect the privacy of the individual, but also share the necessary information on an individual level and keep the data useful?
The easy answer is to anonymize the microdata, i.e., remove known unique identifiers like name or social security number. But does that mean privacy is preserved? Not really. There are values that, taken together, can still identify the individual. These attributes are called quasi-identifiers. Some examples could be gender, ZIP code, and date of birth. Quasi-identifiers can be used to join two datasets together to link personal identities, meaning privacy is not preserved. And location tracking data is, in fact, a very powerful quasi-identifier. For example, every day at 7:30 am, there’s a guy from Calgary who drives northwest from a specific location, towards a specific school, and then back towards the University of Calgary. I think there’s only one such person. If you can join those together, you know it’s me, Steve. For this reason, even publishing de-identified user trajectory data can still cause serious privacy threats if an adversary (the bad guys) has access to certain background knowledge.
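Here is a small sketch of how such a join works. All the records, names and field names below are invented for illustration. The code first counts how many de-identified records share each combination of quasi-identifiers (the smallest group size is what the privacy literature calls the k in k-anonymity), then joins the de-identified data with a hypothetical public list to show which individuals can be singled out.

```python
from collections import Counter

# Assumed quasi-identifier fields, following the examples in the text.
QUASI_IDENTIFIERS = ("gender", "zip", "birth_date")

def group_sizes(records, keys=QUASI_IDENTIFIERS):
    """Count how many records share each quasi-identifier combination."""
    return Counter(tuple(r[k] for k in keys) for r in records)

def k_anonymity(records, keys=QUASI_IDENTIFIERS):
    """Size of the smallest group; k == 1 means someone is unique."""
    return min(group_sizes(records, keys).values())

def link(deidentified, public, keys=QUASI_IDENTIFIERS):
    """Join on quasi-identifiers; return (name, diagnosis) for unique matches."""
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = []
    for row in deidentified:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # exactly one candidate: identity is revealed
            matches.append((names[0], row["diagnosis"]))
    return matches

# Hypothetical "anonymized" microdata: names removed, quasi-identifiers kept.
deidentified = [
    {"gender": "F", "zip": "90001", "birth_date": "1980-02-14", "diagnosis": "asthma"},
    {"gender": "M", "zip": "90001", "birth_date": "1975-07-30", "diagnosis": "diabetes"},
    {"gender": "M", "zip": "90001", "birth_date": "1975-07-30", "diagnosis": "flu"},
]

# Hypothetical public dataset the adversary already has.
public = [
    {"name": "Alice Example", "gender": "F", "zip": "90001", "birth_date": "1980-02-14"},
    {"name": "Bob Example", "gender": "M", "zip": "90001", "birth_date": "1975-07-30"},
    {"name": "Carl Example", "gender": "M", "zip": "90001", "birth_date": "1975-07-30"},
]

print("k =", k_anonymity(deidentified))
print("re-identified:", link(deidentified, public))
```

In this toy data, the one record with a unique combination of quasi-identifiers is linked back to a name, while the two records that share a combination stay ambiguous. A trajectory behaves like a very long quasi-identifier, which is why it is so often unique.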
5. Spatial computing techniques are needed to preserve trajectory privacy
Learn more about using geospatial data in your connected worker strategy
An effective connected worker solution requires geospatial location data. Putting pins on a map is simply not enough. Extracting insights from IoT location data requires a specialized geospatial IoT platform and geospatial IoT experts. Are you building connected worker initiatives for your organization? To reduce your risks, have your technical teams work with experts like us at SensorUp to design IoT applications that are useful and, at the same time, preserve the data privacy of your workers.