A Decentralized Need for Speed: An Empirical Investigation into Transaction Latency and Construction of Predictive Machine Learning Models for Blockchain

Author: 
Megha Joshi
Adviser(s): 
Fan Zhang
Abstract: 

With its emphasis on privacy and security, blockchain is a promising innovation with the potential to transform industries that extend far beyond the field of finance. However, the decentralized ledger technology faces issues when it comes to time-sensitive transactions, an essential element of most data-carrying applications, such as those in healthcare. To better understand what causes long latency in blockchain transactions, an empirical analysis was conducted. First, latency was differentiated by three aspects of a transaction's journey in the mempool through use of an efficient API querying workflow. In most cases, the wait time between submission and eligibility should be minimal, but in abnormal cases a significant percentage of latency came from a transaction waiting for eligibility. A missing previous nonce explains some of this waiting, but not all of it. The median latency between transaction eligibility and inclusion fell around 14 seconds. Using this result, a binary label was created, and a logistic regression analysis was performed. The results showed three statistically significant predictors of latency: a transaction's position in a block, the block number itself, and the gas price a user paid for the transaction. Of these, only gas price was an expected predictor of latency based on previous studies. Position and block number were then tested as input features for training six commonly used binary classifiers. The accuracies of these models were compared to those of control models that used gas price and gas use as input features. Models trained on the new predictors always outperformed the control models and had an average accuracy of 70%. This work shows the potential for new predictors of long latency that, if controlled, could move blockchain in the direction of being a highly viable platform for industries that value rapid, secure, and private data storage and relay.
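
The latency split and binary labeling described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual workflow: the record fields (`submitted`, `eligible`, `included`), the sample timestamps, and the helper names `split_latency` and `label_long_latency` are all hypothetical, and the rule "long latency means strictly above the median" is an assumption about how the binary label was derived.

```python
from statistics import median

# Hypothetical transaction records with timestamps in seconds.
# Field names and values are illustrative only, not data from the thesis.
transactions = [
    {"hash": "0xa", "submitted": 0.0, "eligible": 0.0, "included": 9.0},
    {"hash": "0xb", "submitted": 1.0, "eligible": 1.5, "included": 15.5},
    {"hash": "0xc", "submitted": 2.0, "eligible": 20.0, "included": 45.0},
    {"hash": "0xd", "submitted": 3.0, "eligible": 3.0, "included": 15.0},
]


def split_latency(tx):
    """Return (wait for eligibility, eligibility-to-inclusion latency)."""
    wait_for_eligibility = tx["eligible"] - tx["submitted"]
    eligibility_to_inclusion = tx["included"] - tx["eligible"]
    return wait_for_eligibility, eligibility_to_inclusion


def label_long_latency(txs):
    """Label each transaction 1 if its eligibility-to-inclusion latency
    is strictly above the sample median, else 0 (assumed labeling rule)."""
    inclusion = [split_latency(tx)[1] for tx in txs]
    threshold = median(inclusion)
    labels = [1 if value > threshold else 0 for value in inclusion]
    return threshold, labels


threshold, labels = label_long_latency(transactions)
for tx, label in zip(transactions, labels):
    wait, inclusion = split_latency(tx)
    print(tx["hash"], wait, inclusion, label)
```

Labels produced this way could serve as the target of a logistic regression or another binary classifier, with features such as block position, block number, or gas price as inputs.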

Term: 
Spring 2023