Investigation of encrypted and obfuscated network traffic utilizing machine learning
University of New Brunswick
This thesis uses machine learning to classify both the encryption applied to network traffic and the activities carried out inside it. The work has two motivations. First, additional encryption hides ports and headers, which makes traditional traffic classification difficult. Second, the results indicate how effective currently available privacy-enhancing technologies are. A new dataset is created containing Pure (without additional encryption), Tor, Tor with obfuscation, VPN, and VPN+Tor network traffic. Five activities are performed during each kind of traffic recording: audio streaming, browsing, P2P and SFTP file transfers, and video conferencing. The traffic is classified using flow-based features calculated by ARGUS and CICFlowMeter, with three classifiers combined with seven feature selection algorithms. The results for classifying the encryption are good and clearly indicate that this detection system could be used, in a modified form, in a practical application. For detecting the activities inside the encrypted network traffic, the results show that the protection expected in theory is not achieved. Overall, this reveals the need to improve the resistance of commonly used network traffic protection techniques against machine learning.
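The evaluation scheme summarized above, in which every feature selection algorithm is paired with every classifier, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy flow values, the three feature names, the two stand-in selectors, and the two stand-in classifiers are invented and are not the algorithms, features, or data used in the thesis. It relies only on the Python standard library and does not call ARGUS or CICFlowMeter.

```python
from itertools import product
from statistics import mean, pvariance

# Invented toy flows (NOT thesis data):
# (duration_s, mean_packet_len, mean_interarrival_ms) -> traffic label.
FLOWS = [
    ((1.0, 500.0, 10.0), "Pure"), ((1.2, 520.0, 12.0), "Pure"),
    ((0.8, 510.0, 9.0), "Pure"), ((1.1, 586.0, 80.0), "Tor"),
    ((0.9, 590.0, 85.0), "Tor"), ((1.0, 588.0, 90.0), "Tor"),
]


def select_all(X, k):
    """Hypothetical baseline selector: keep every feature."""
    return list(range(len(X[0])))


def select_top_variance(X, k):
    """Hypothetical selector: keep the k features with the largest variance."""
    variances = [pvariance(col) for col in zip(*X)]
    ranked = sorted(range(len(variances)), key=lambda i: variances[i], reverse=True)
    return sorted(ranked[:k])


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(train_X, train_y, sample):
    """Stand-in classifier: predict the label of the closest class centroid."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], sample))


def one_nearest_neighbour(train_X, train_y, sample):
    """Stand-in classifier: predict the label of the closest training flow."""
    best = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i], sample))
    return train_y[best]


def loo_accuracy(X, y, classifier):
    """Leave-one-out accuracy of a classifier on the feature matrix X."""
    hits = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += classifier(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def evaluate_grid(flows, selectors, classifiers, k=2):
    """Score every (selector, classifier) pair on the given flows."""
    X = [list(features) for features, _ in flows]
    y = [label for _, label in flows]
    results = {}
    for (s_name, sel), (c_name, clf) in product(selectors.items(), classifiers.items()):
        # Selection uses all flows for brevity; a real study would select per fold.
        keep = sel(X, k)
        reduced = [[row[i] for i in keep] for row in X]
        results[(s_name, c_name)] = loo_accuracy(reduced, y, clf)
    return results


SELECTORS = {"all": select_all, "top_variance": select_top_variance}
CLASSIFIERS = {"centroid": nearest_centroid, "1nn": one_nearest_neighbour}

if __name__ == "__main__":
    for combo, acc in evaluate_grid(FLOWS, SELECTORS, CLASSIFIERS).items():
        print(combo, acc)
```

With three classifiers and seven selectors in place of the stand-ins, the same loop would produce the 21 combinations the abstract describes; comparing their scores shows which feature subset and classifier pairing separates the traffic classes best.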