Management of blockchain-based data provenance and addressing smart contract security issues
University of New Brunswick
Blockchain has gained massive popularity in recent years, in industry as well as academia, owing to properties that enable mutually distrusting parties to manage information pseudo-anonymously and in a decentralized way through enforceable rules (in the form of smart contracts). Eager adoption of the technology has led to a considerable amount of funds being converted into, and managed as, cryptocurrencies. On the other hand, the ease of access to public blockchains, together with the enforcement of pseudo-anonymous transactions, has incentivized their use in criminal activities and in numerous security-related attacks. Consequently, interest groups and authorities have been motivated to explore the traceability and accountability of users, both to enable regulatory enforcement and to prevent and mitigate security-related issues. In this work, we present different approaches to address these issues. To enable traceability and auditing of smart contracts (abbreviated as contracts), and to analyze, detect, and mitigate contracts’ security issues, we present the EideticEther and EtherProv frameworks. The frameworks collect contracts’ execution flow, including the data they access, across time and at different granularities. Specifically, the EtherProv framework collects execution flow provenance at the control flow graph level, across all participating contracts. With the aid of provenance retrieval capabilities, the collected provenance facilitates root-cause and forensic analysis, detection of security issues, and traceability. Moreover, EtherProv enables mitigation of deployed contracts that exhibit undesired activity. To help deanonymize pseudo-anonymous addresses, we present an approach that uses stylometry techniques to extract unique features of Ethereum contracts’ code that can represent the coding style of the contracts’ developers.
We explore the feasibility of using these features to attribute contracts’ code to their deployers’ addresses and, consequently, to link addresses that were used to deploy contracts written by the same developer. The described approaches require efficient management and retrieval of historical data. However, current blockchain indexes support querying only a single key and its latest value. Our proposed AMVSL blockchain index enables efficient, authenticated management of historical data and its retrieval over a large range of keys, at their current or historical values. To enable rigorous regulatory enforcement, auditors require frequent access to multiple blockchains. However, because blockchain data volume is growing rapidly and blockchains offer poor support for querying large amounts of data, maintaining local blockchain nodes for querying can prove inefficient. To this end, we propose a system that enables remote auditing of blockchain data. The system provides efficient, richer queries and supports private information retrieval, using cryptographic techniques over semi-trusted servers to protect the auditors’ identities, their queries, and the query results. To handle large data volumes, the system employs a scalable distributed processing solution for big data.