RICHMOND, Va. – The world could be more closely connected than you think, and Facebook certainly is one way to track our collective “degrees of separation.”
For decades, it has been proposed that everyone on the planet is connected to everyone else by just six other people. Now the world is getting cozier: Facebook's own analysis puts the more appropriate number at 3.57. And who better to make an educated guess than a social network with more than 1.5 billion users worldwide?
In the United States, that number is 3.46.
Each person in the world (at least among the 1.59 billion people active on Facebook) is connected to every other person by an average of three and a half other people. The average distance we observe is 4.57, corresponding to 3.57 intermediaries or “degrees of separation.” Within the US, people are connected to each other by an average of 3.46 degrees.
Facebook says that as more people use its site, the distance between any two people in the world has shrunk. A similar study in 2011 calculated the degrees of separation to be 3.74. That was when Facebook had just 721 million users.
If you are logged into Facebook, and click this link, you can see your personal degrees of separation.
Mark Zuckerberg has above-average social connectedness, with only 3.17 degrees between him and everybody else on the planet. Sheryl Sandberg is more connected, with 2.92 degrees.
The author of this story is connected to everyone in the world by 3.22 degrees.
Facebook detailed the statistical techniques it used to estimate the distance between people. That explanation is reproduced below.
Calculating degrees of separation in a network with hundreds of billions of edges is a monumental task, because the number of people reached grows very quickly with the degree of separation.
Imagine a person with 100 friends. If each of his friends also has 100 friends, then the number of friends-of-friends will be 10,000. If each of those friends-of-friends also has 100 friends then the number of friends-of-friends-of-friends will be 1,000,000. Some of those friends may overlap, so we need to filter down to the unique connections. We’re only two hops away and the number is already big. In reality this number grows even faster since most people on Facebook have more than 100 friends. We also need to do this computation 1.6 billion times; that is, for every person on Facebook.
Rather than calculate it exactly, we relied on statistical algorithms developed by Kang and others [6-8] to estimate distances with great accuracy, basically finding the approximate number of people within 1, 2, 3 (and so on) hops away from a source.
More accurately, for each number of hops we estimate the number of distinct people you can reach from every source. This estimation can be done efficiently using the Flajolet-Martin algorithm [9]. How does it work? Imagine you have a set of people and you want to count how many are unique. First you assign each person a random integer; let’s call it a hash. Approximately 1/2 of the people will have an even hash: the binary representation of the hash will end with 0. Approximately 1/4 of the people will have a hash divisible by 4; that is, the binary representation ends with 00. In general, 1/2^n of the people will have the binary representation of their hash end with n zeros. Now, we can reverse this and try to count how many different people we have by reading their hash values one by one. To do that, we track the biggest number of trailing zeroes we’ve seen. Intuitively, if the biggest number was n zeroes, we can expect the set to have c*2^n unique members, where c is some constant. For better accuracy we can do this computation multiple times with different hash values.
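The counting trick described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Facebook's code. The function names are made up for the example, a SHA-256 digest stands in for the "random integer," and the constant c is left at a placeholder value of 1.0 because the explanation does not give it.

```python
import hashlib


def trailing_zeros(x: int, width: int = 32) -> int:
    """Count how many zero bits the binary form of x ends with."""
    if x == 0:
        return width
    # x & -x isolates the lowest set bit; its position is the zero count.
    return (x & -x).bit_length() - 1


def hash_item(item, seed: int) -> int:
    """Stand-in for a random integer: a 32-bit hash of the item.

    Assumption: SHA-256 with a seed prefix is used purely for illustration.
    """
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


def estimate_distinct(items, num_hashes: int = 64, c: float = 1.0) -> float:
    """Estimate the number of distinct items as c * 2^n.

    n is the largest trailing-zero count seen, averaged over several
    differently seeded hashes to reduce noise. c = 1.0 is a placeholder.
    """
    total = 0
    for seed in range(num_hashes):
        max_zeros = 0
        for item in items:
            max_zeros = max(max_zeros, trailing_zeros(hash_item(item, seed)))
        total += max_zeros
    return c * 2 ** (total / num_hashes)
```

Because only the maximum is tracked, seeing the same person twice changes nothing, so duplicates are filtered out for free. The result is a rough estimate rather than an exact count.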
This algorithm maps well to our problem: you just find the biggest number of zeroes among all friends’ hashes. By using a bitwise OR operation on the hash, this process can be repeated recursively to estimate the number of unique friends-of-friends, and then friends-of-friends-of-friends. We used Apache Giraph [10] to run this computation on the entire Facebook friendship graph. In our implementation, at each step each person sends a bitwise ORed hash value to all his friends. We do this recursively and use Flajolet-Martin’s math to estimate the number of unique friends for each degree of separation.
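The friend-to-friend step can also be sketched on a toy graph in plain Python. This is a simplified illustration under stated assumptions; it does not use Apache Giraph and is not Facebook's implementation. Each person starts with one small bit mask per hash, with a single bit set at the position given by the number of trailing zeroes. ORing masks together keeps the highest bit, which is the biggest number of zeroes seen so far. All names here (NUM_HASHES, lowest_bit_mask, propagate_once, estimate_reach) are hypothetical, and c is again a placeholder.

```python
import hashlib

NUM_HASHES = 32  # assumption: number of independent hashes per person


def lowest_bit_mask(node, seed: int) -> int:
    """Return a mask with one bit set at the hash's trailing-zero count."""
    digest = hashlib.sha256(f"{seed}:{node}".encode()).digest()
    # The extra bit at position 32 guarantees the value is never zero.
    h = int.from_bytes(digest[:4], "big") | (1 << 32)
    return h & -h


def initial_sketches(nodes):
    """Before any messages are sent, each person only knows about themselves."""
    return {v: [lowest_bit_mask(v, s) for s in range(NUM_HASHES)] for v in nodes}


def propagate_once(adjacency, sketches):
    """One step: every person ORs in the masks received from each friend."""
    updated = {}
    for v, friends in adjacency.items():
        merged = list(sketches[v])
        for u in friends:
            merged = [a | b for a, b in zip(merged, sketches[u])]
        updated[v] = merged
    return updated


def estimate_reach(sketch, c: float = 1.0) -> float:
    """Estimate how many people a sketch covers as c * 2^(mean highest bit)."""
    mean_zeros = sum(mask.bit_length() - 1 for mask in sketch) / len(sketch)
    return c * 2 ** mean_zeros
```

Calling propagate_once k times leaves each person holding a summary of everyone within k hops, themselves included, and estimate_reach turns that summary into an approximate head count. The adjacency argument is assumed to be a dictionary that lists every person as a key, with friendships recorded in both directions.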
In summary, we find that the world is more closely connected than you might think.