You are Here (and we know it!)



What if you found out that someone in Kimberly, Australia, knows where you went biking last Sunday? If that doesn’t creep you out enough, what if that person also knows that you ran across the pavement in front of your house yesterday?

If you don’t fall into the Unconcerned category of Westin’s privacy segmentation, you have probably guessed that we are hinting at the scary state of location privacy today.

With ubiquitous apps like Google (search), Facebook, and Uber recording your movements almost every second (with or without your permission), it’s hard to imagine how much location-based information services like Google Maps, Foursquare, and Strava possess.








What is our project about?

If these apps are collecting location-based data, what have we, as students, got to do with it?
Well, this project involved no rocket science. We simply took publicly available routes uploaded by a user and predicted one small detail about that user: his/her home (or work) location.

A home or work location might seem like common knowledge; your neighbors know it, your friends know it, some people from your workplace know it. So what if we know it too?
The difference is that we, a group of students, can identify you and know your personal details, while you, whose information we possess, have no clue about it.

In an extreme case, if a crime were committed using this location-based information of yours, someone like us wouldn’t even be on the list of possible suspects.

Our analysis was done on maps uploaded by users on Strava. If you don’t know what Strava is, the next section is here to help!


What is Strava?

Strava is a social fitness network that is primarily used to track cycling and running using GPS data.

Created by the millions of Strava athletes, segments mark popular stretches of road or trail (like your favorite local climb) and create a leaderboard of times set by every Strava athlete who has been there before. Strava has an active user base, adding one million new users every 45 days, with 8 million activities uploaded each day.



How do we do it?

These are the steps we followed to get from a user’s account to an address:

We built an application using the official Strava API.
We then asked Strava users to authorize this application so that we could receive their access tokens. Why is this authorization required? It is a measure Strava takes to limit the collection and processing of its users’ publicly available maps (and activities) by developers, or anyone else on the Internet, through its official API.

The following is a screenshot of the message we sent to active Strava users on Twitter.





  1. Heat maps of all activities of a user. 
Once we had users’ access tokens, we could access all of their publicly available maps and activity traces.
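As a rough illustration of this step, here is a minimal sketch of how an access token might be used to list an athlete’s activities. It assumes the Strava API v3 activities endpoint and the `map.summary_polyline` field of each activity summary; the helper names `build_activities_request`, `extract_polylines`, and `fetch_polylines` are ours for illustration and are not taken from the project’s code.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: lists the activities of the athlete who owns the token.
API_URL = "https://www.strava.com/api/v3/athlete/activities"


def build_activities_request(access_token, page=1, per_page=50):
    """Build (but do not send) a request for one page of activities."""
    query = urllib.parse.urlencode({"page": page, "per_page": per_page})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def extract_polylines(activities):
    """Collect the encoded route of each activity, skipping those without a map."""
    polylines = []
    for activity in activities:
        encoded = (activity.get("map") or {}).get("summary_polyline")
        if encoded:
            polylines.append(encoded)
    return polylines


def fetch_polylines(access_token, page=1):
    """Send the request and return the encoded routes (needs network access)."""
    request = build_activities_request(access_token, page)
    with urllib.request.urlopen(request) as response:
        return extract_polylines(json.load(response))
```

Calling `fetch_polylines(token)` page by page until an empty list comes back would collect every route the token gives access to.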

We first converted these activity traces (which come in the form of polylines) to coordinates (latitudes and longitudes).
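The conversion can be sketched as follows, assuming the traces use Google’s encoded polyline format with 5 decimal places of precision. The function name `decode_polyline` is ours; this is an illustrative decoder, not the project’s actual code.

```python
def decode_polyline(encoded, precision=5):
    """Decode an encoded polyline string into a list of (lat, lng) tuples."""
    coords = []
    index = lat = lng = 0
    factor = 10 ** precision
    while index < len(encoded):
        # Each point stores a latitude delta followed by a longitude delta.
        for axis in ("lat", "lng"):
            shift = result = 0
            while True:
                byte = ord(encoded[index]) - 63
                index += 1
                result |= (byte & 0x1F) << shift  # low 5 bits carry data
                shift += 5
                if byte < 0x20:  # continuation bit not set: last chunk
                    break
            # The lowest bit holds the sign; the rest is the magnitude.
            delta = ~(result >> 1) if result & 1 else result >> 1
            if axis == "lat":
                lat += delta
            else:
                lng += delta
        coords.append((lat / factor, lng / factor))
    return coords


# Example string from Google's description of the format:
print(decode_polyline("_p~iF~ps|U_ulLnnqC_mqNvxq`@"))
# [(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)]
```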

We then plotted these on geographical maps. Here is a sample of a user’s map.









  2. Start and end location tags for each activity.
Now that we had every trace in the form of coordinates, we placed start and end markers on every activity of a particular user. We also counted how often these markers fell in nearby locations.
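A minimal sketch of the tagging step, assuming each trace is already a list of `(lat, lng)` tuples. The function name `tag_endpoints` and the `"start"`/`"end"` labels are illustrative choices, not the project’s own.

```python
def tag_endpoints(traces):
    """Return a ("start" | "end", (lat, lng)) marker for both ends of every trace."""
    markers = []
    for trace in traces:
        if not trace:  # skip activities that have no GPS points
            continue
        markers.append(("start", trace[0]))
        markers.append(("end", trace[-1]))
    return markers


rides = [
    [(28.54321, 77.27231), (28.55000, 77.28000), (28.54324, 77.27233)],
    [(28.54318, 77.27229), (28.56000, 77.29000)],
]
for tag, point in tag_endpoints(rides):
    print(tag, point)
```

People tend to start and finish activities at home or at work, which is why these endpoints are the interesting part of each trace.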









  3. Aggregation and pinpointing.
Finally, we rounded each coordinate to 4 decimal places, counted how often each rounded location occurred, and assigned the highest probability score to the most frequent one.
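One way to sketch this aggregation is shown below. It assumes the score is simply the share of points that fall in the most frequent rounded cell; the function name `most_likely_location` and that scoring rule are our assumptions, since the exact formula is not spelled out here.

```python
from collections import Counter


def most_likely_location(points, decimals=4):
    """Round points to a grid and return (most frequent cell, its share of points)."""
    if not points:
        raise ValueError("need at least one point")
    cells = Counter(
        (round(lat, decimals), round(lng, decimals)) for lat, lng in points
    )
    cell, hits = cells.most_common(1)[0]
    return cell, hits / sum(cells.values())


endpoints = [
    (28.54321, 77.27231),
    (28.54324, 77.27233),
    (28.54400, 77.27100),
    (28.54318, 77.27229),
]
print(most_likely_location(endpoints))
# ((28.5432, 77.2723), 0.75)
```

Rounding merges points that differ only by GPS jitter into the same cell, so repeated visits to one spot add up instead of being spread over many nearly identical coordinates.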


We then converted that latitude/longitude pair back into an address, thus predicting the home location of the user.






Privacy Zones: A case in point

Strava offers a feature called privacy zones, designed specifically to protect its users from targeted attacks of this sort.

A privacy zone lets users hide every part of an activity trace that falls within a circular area centered on a coordinate they choose. This concealment technique essentially lets users keep their house and the area around it from becoming public.
To see how effective privacy zones really are, we first ran our experiment on a user’s Strava data with the privacy zones feature disabled. Next, we predicted the user’s address with a privacy zone enabled.


From what we observed, the distance between the two predicted locations was small (~30 m).
This is because our prediction technique involves:
  •  Aggregating multiple coordinates from the user’s activity tracks.
  •  Rounding coordinates (to 4 decimal places) to get a better estimate of the exact location.
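To put these distances in perspective, the gap between two predicted points can be computed with the haversine formula. The sketch below assumes a spherical Earth with a mean radius of 6,371 km; `haversine_m` is an illustrative helper, not part of the project’s code.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lng) points in degrees."""
    lat1, lng1 = map(math.radians, a)
    lat2, lng2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


# One step in the 4th decimal place of latitude:
print(round(haversine_m((0.0, 0.0), (0.0001, 0.0)), 2))
# 11.12
```

A single step in the fourth decimal place of latitude is therefore about 11 m, so a gap of about 30 m amounts to only a few grid steps.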


In short, a location predicted in spite of privacy zones, combined with some social engineering, can still lead an attacker to your correct address!






Concerns

Though Strava has privacy options such as privacy zones and a few other settings in place, our experiment suggests that these features are not strong enough. Furthermore, here is an article that specifically describes how home locations can be predicted using privacy zones.

It may seem very scary, but this is the reality of today’s social media platforms. We can’t stop using these popular mainstream platforms just because of some small security and privacy vulnerabilities in them. A vulnerability, by definition, is a surface open to potential attacks (which may or may not be severe).

So, what can a simple user do? A simple user can only become aware of such possibilities of an attack and react accordingly, which is essentially the purpose of our project!

If you use Strava frequently and have a few activities uploaded, you could give our app a try:
