ProDigger: Towards Robust Automatic Network Protocol Fingerprint Learning via Byte Embedding


Abstract:

As a prerequisite technique, Deep Packet Inspection (DPI) plays a major role in contemporary network security and management. The key component of DPI is a repository of protocol fingerprints. However, inferring fingerprints for diverse and newly emerging protocols, and keeping them up to date as those protocols continuously evolve, is very difficult. In this paper, we propose ProDigger, a robust automatic protocol fingerprint learning framework for DPI traffic recognition. The key insight of ProDigger is that byte embedding, a distributional vector representation of a single byte that captures the contextual information of the packet payload, can be pre-learned from protocol traces, which is conducive to obtaining an efficient numerical representation of packet payloads with different message formats or semantic information. Multiple fine-grained clusters are obtained by feeding the constructed packet payload representations to a clustering algorithm. Finally, by applying a byte-embedding-based payload alignment algorithm to each cluster, we obtain the target protocol fingerprints in the form of a series of substrings. We implement our approach and evaluate it on real-world Internet traffic traces. The experimental results demonstrate that ProDigger is capable of identifying the corresponding traffic based on the learned fingerprints and achieves better Precision and Recall than the state-of-the-art approach.
Date of Conference: 23-26 August 2016
Date Added to IEEE Xplore: 09 February 2017
Electronic ISSN: 2324-9013
Conference Location: Tianjin, China

I. Introduction

Network protocol recognition is the task of determining which protocol or application generated a given piece of network traffic. It matters to Internet Service Providers and network administrators, who want to know what types of traffic are traversing their network backbones. Protocol identification makes the source of monitored network traffic visible and has many potential applications, such as Quality of Service (QoS), network security monitoring (IDS/IPS), traffic visualization, network forensics, and tracking trends and changes in network applications. Protocol identification through Deep Packet Inspection is the most widely applied technique in industry and has become the de facto standard, although it is considered extremely expensive in terms of processing cost on high-speed networks. Fortunately, this concern can be alleviated by exploiting new high-performance techniques [10] and optimization strategies [4]. The core of DPI is to match the content of the traffic payload against pre-constructed fingerprints, also called signatures, typically in the form of regular expressions. However, inferring accurate and efficient fingerprints for various application protocols faces several challenges. (i) Traditionally, it is a time-consuming, challenging task that requires extensive manual analysis by network protocol experts based on protocol specifications and packet traces. (ii) Most proprietary protocols lack publicly available documentation, although standard RFCs exist for public-domain protocols. (iii) Even when a protocol fingerprint can be derived from an open specification, it may not cover all variants, because the same protocol may have different implementations, and some of these implementations do not comply with the openly available specification. (iv) The labour-intensive manual signature extraction process has to be repeated from time to time to keep the signature repository up to date.
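
To make the notion of regular-expression fingerprints concrete, the following minimal sketch matches a packet payload against a small signature repository. The three patterns and the names `FINGERPRINTS` and `identify_protocol` are hypothetical and hand-written for illustration only; they are not fingerprints learned by ProDigger, and a production DPI engine would use a far larger repository and a compiled multi-pattern matcher.

```python
import re
from typing import Optional

# Hypothetical, hand-written signatures for illustration only.
# Each one anchors on the well-known opening bytes of a text protocol.
FINGERPRINTS = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) \S+ HTTP/1\.[01]\r\n"),
    "ssh": re.compile(rb"^SSH-\d+\.\d+-"),
    "smtp": re.compile(rb"^(HELO|EHLO) \S+\r\n"),
}


def identify_protocol(payload: bytes) -> Optional[str]:
    """Return the name of the first fingerprint that matches, else None."""
    for name, pattern in FINGERPRINTS.items():
        if pattern.search(payload):
            return name
    return None


if __name__ == "__main__":
    print(identify_protocol(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"))
    print(identify_protocol(b"SSH-2.0-OpenSSH_7.4\r\n"))
    print(identify_protocol(b"\x16\x03\x01\x00\xa5"))
```

Every entry in such a table has to be written and revised by hand, which is the maintenance burden that challenges (i) to (iv) describe.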
