Bitcoin from a Network Science perspective

Funny looking network you got there!

  When I first chose to take ICS 622: Network Science, I thought we would be learning about packet-switched networks. I had always been fascinated by the internet and its workings, so I had decided to take all the networking classes available at the University of Hawaiʻi at Mānoa. But on the first day of class I learned that this class dealt with a different kind of network. These networks represent relationships between entities, and they come with their own equations and terminology. It was in this class that we learned how Google’s search algorithm used to work, how the spread of an epidemic can be mapped, and how the six degrees of separation theory came to be. When I was tasked with finding a project to work on, I had just been introduced to the technical side of blockchain technology. I decided to see whether there were any data sets of Bitcoin transactions and how accessible they were. Luckily there was a small data set of simplified transactions from 2009-2013. I say simplified because a Bitcoin account has the ability to produce many public keys, so transactions made by the same account can appear under different keys. This is the feature that gives Bitcoin owners their pseudonymity. But through network manipulation, the public transaction history can be used to group public keys into specific accounts, and this data set had already done that grouping. So given my limited computing power, as well as my limited experience with R, I decided to use this simple data set and make a project out of it.
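
To make the idea of grouping public keys a little more concrete, here is a minimal Python sketch (the project itself used R). It assumes one commonly described heuristic: keys that are spent together as inputs of the same transaction probably belong to the same owner. That heuristic, the function name `cluster_keys`, and the toy input are illustrative assumptions, not necessarily how this data set was prepared.

```python
from collections import defaultdict


def cluster_keys(transactions):
    """Group keys that ever appear together as inputs of one transaction.

    `transactions` is a hypothetical list of lists: the input public keys
    of each transaction. Returns a list of sets, one per inferred account.
    """
    parent = {}  # union-find forest: key -> parent key

    def find(key):
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for input_keys in transactions:
        for key in input_keys:
            find(key)  # register every key, even ones that appear alone
        for key in input_keys[1:]:
            union(input_keys[0], key)

    clusters = defaultdict(set)
    for key in list(parent):
        clusters[find(key)].add(key)
    return list(clusters.values())


# Toy input: k1, k2 and k3 are linked through two transactions.
example = [["k1", "k2"], ["k2", "k3"], ["k4"], ["k5", "k6"]]
clusters = cluster_keys(example)
```

With this toy input the keys collapse into three groups: k1 to k3, k4 on its own, and k5 with k6.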

Code to organize the dataset by time period.

Mapping the Bitcoin network.

  The data set I worked with for this project was a single table. Each row recorded the sending account, the receiving account, the amount of Bitcoin transferred, and when the transfer occurred. From a network science perspective, each node is an account and each edge between two accounts is a transaction. The amount transferred in a transaction is the weight of its edge, and the number of edges attached to an account is the number of transactions that account was involved in. The edges can also be directed, since there is always a sender and a receiver. Using code from previous assignments, I took the data and looked at attributes of the network graph. Some of these attributes were the number of components, the proportion of nodes that make up the giant component, and the local clustering coefficient. All of this information was then used to create a report based on my findings as well as previously published research.
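
The mapping described above can be sketched in a few lines of plain Python (the project itself used R, so this is an illustration rather than the original code). The row layout, the toy data, and the function names are assumptions. Components and clustering are computed on the undirected version of the graph, which is one common convention.

```python
from collections import defaultdict

# Hypothetical rows: (sender, receiver, amount_btc, timestamp)
rows = [
    ("A", "B", 1.5, 1),
    ("B", "C", 0.5, 2),
    ("C", "A", 2.0, 3),
    ("A", "B", 0.25, 4),
    ("D", "E", 3.0, 5),
]


def build_graph(rows):
    weight = defaultdict(float)   # (sender, receiver) -> total BTC sent
    tx_count = defaultdict(int)   # account -> transactions it took part in
    neighbors = defaultdict(set)  # undirected adjacency, self-loops ignored
    for sender, receiver, amount, _timestamp in rows:
        weight[(sender, receiver)] += amount
        for account in {sender, receiver}:
            tx_count[account] += 1
            neighbors[account]  # make sure every account is a node
        if sender != receiver:
            neighbors[sender].add(receiver)
            neighbors[receiver].add(sender)
    return weight, tx_count, neighbors


def components(neighbors):
    """Connected components of the undirected graph, via depth-first search."""
    seen, result = set(), []
    for start in list(neighbors):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        result.append(comp)
    return result


def local_clustering(neighbors, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = list(neighbors[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in neighbors[nbrs[i]]
    )
    return 2 * links / (k * (k - 1))


weight, tx_count, neighbors = build_graph(rows)
comps = components(neighbors)
giant_share = max(len(c) for c in comps) / len(neighbors)
```

On the toy rows, A, B and C form a triangle and D and E form a separate pair, so there are two components and the giant component holds three of the five accounts. Even this simple clustering loop compares every pair of a node's neighbors, which hints at why these calculations get expensive on a large graph.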

Number of Bitcoin accounts in each time period. Each line represents a price spike.

Results, lessons, and future additions.

  If there’s one takeaway from this project, it is that calculating network attributes takes a lot of resources. It takes a lot of time, a lot of CPU power, and a lot of memory. I had a larger data set to work with in the beginning, but even after leaving my computer on overnight my graphs still weren’t complete. At one point I cut that data set in half and still ran out of memory on my 16GB MacBook. After switching to the simplified data set, I was able to see interesting trends in transaction usage. As most people know, Bitcoin has increased astronomically in value since it was introduced in 2008. During that time there were periods where Bitcoin’s price spiked, and these coincided with spikes in the number of transactions performed and accounts created. This could mean either that newcomers were buying in out of fear of missing out (FOMO), or that early investors were trying to cash out. An interesting addition to this project would be to take the graphs and map them against the price of Bitcoin day by day.
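
The per-period counts behind that trend can be sketched as follows, again in plain Python rather than the R used for the project. The row layout, the ISO date strings, the monthly buckets, and the toy data are all assumptions. An account is counted as "created" in the period where it first appears in a transaction.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (sender, receiver, amount_btc, ISO date)
rows = [
    ("A", "B", 1.0, "2011-01-03"),
    ("B", "C", 0.5, "2011-01-20"),
    ("A", "C", 2.0, "2011-02-11"),
    ("D", "A", 0.1, "2011-02-12"),
    ("D", "E", 4.0, "2011-02-28"),
]


def summarize_by_month(rows):
    """Return {(year, month): (transactions, newly seen accounts)}."""
    tx_count = defaultdict(int)
    new_accounts = defaultdict(int)
    seen = set()
    # ISO dates sort chronologically, so first appearances are found in order.
    for sender, receiver, _amount, day in sorted(rows, key=lambda r: r[3]):
        d = date.fromisoformat(day)
        period = (d.year, d.month)
        tx_count[period] += 1
        for account in (sender, receiver):
            if account not in seen:
                seen.add(account)
                new_accounts[period] += 1
    return {p: (tx_count[p], new_accounts[p]) for p in sorted(tx_count)}


summary = summarize_by_month(rows)
```

Lining up a table like `summary` with a daily or monthly price series would be one way to approach the price comparison suggested above.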

Link to the Google Drive Directory with all source code and datasets

Thanks to Daniel Suthers for code templates to generate the graphs.