Abstract | Profiling internet users of encrypted applications is a long-standing challenge; solving it helps identify targeted users' interests. This paper proposes a machine-learning-based solution for creating encrypted-application signatures without relying on assumptions about the underlying network infrastructure, such as IP addresses, port numbers, or network flow characteristics. These application signatures can later be used with passive network monitoring to profile targeted users by their usage of selected applications, such as Facebook or Tor. We propose a proof-of-concept (PoC) framework with effective features to identify (i) encrypted payloads in any network traffic, and (ii) the targeted applications, such as Tor or Skype, on which the model is trained. Our study shows that classical Shannon entropy alone can help recognize encrypted payloads, but may not help identify the payloads of a particular application. We design features based on standard encodings (e.g., UTF-8), entropy measures (e.g., Shannon entropy and BiEntropy), and payload size, so that machine learning algorithms can be used to identify encrypted applications. |
---|
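
To make the feature families named in the abstract concrete, the following is a minimal Python sketch of three per-payload features: byte-level Shannon entropy, a UTF-8 decodability ratio, and payload size. It is an illustration under stated assumptions, not the paper's implementation: the function names (`shannon_entropy`, `utf8_valid_fraction`, `payload_features`) and the exact definition of the UTF-8 feature are hypothetical, and BiEntropy is omitted.

```python
import math
import os
from collections import Counter


def shannon_entropy(payload: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (0 to 8)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)  # iterating over bytes yields ints in 0..255
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def utf8_valid_fraction(payload: bytes) -> float:
    """Fraction of bytes that belong to valid UTF-8 sequences.

    Assumed feature definition (not taken from the paper): decode while
    dropping invalid bytes, re-encode, and compare lengths.
    """
    if not payload:
        return 0.0
    kept = payload.decode("utf-8", errors="ignore").encode("utf-8")
    return len(kept) / len(payload)


def payload_features(payload: bytes) -> dict:
    """Bundle the illustrative features into one record for a classifier."""
    return {
        "size": len(payload),
        "shannon_entropy": shannon_entropy(payload),
        "utf8_valid_fraction": utf8_valid_fraction(payload),
    }


if __name__ == "__main__":
    # Plain text versus random bytes (a stand-in for ciphertext).
    print(payload_features(b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"))
    print(payload_features(os.urandom(1024)))
```

Random bytes typically score close to 8 bits per byte and decode poorly as UTF-8, while plain text scores lower on entropy and decodes fully. This matches the abstract's observation that entropy can separate encrypted from unencrypted payloads; distinguishing one encrypted application from another would require the additional features and a trained model.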