General TCP state inference model from passive measurements using machine learning techniques
Journal article, Peer reviewed
Published version
Date
2018-05-04
Original version
Hagos DH, Engelstad P.E., Yazidi A, Kure Ø. General TCP state inference model from passive measurements using machine learning techniques. IEEE Access. 2018;6:28372-28387. http://dx.doi.org/10.1109/ACCESS.2018.2833107
Abstract
Many Internet applications use the reliable, end-to-end Transmission Control Protocol
(TCP) as their transport protocol for practical reasons. Many different TCP variants are widely
in use, and each variant uses a specific congestion control algorithm to avoid congestion while
attempting to share the underlying network capacity equally among competing users. This paper shows
how an intermediate node (e.g., one run by a network operator) can identify the transmission state of the TCP client
associated with a TCP flow by passively monitoring the TCP traffic. Here, we present a robust, scalable, and
generic machine learning-based method, of potential interest to network operators, that experimentally
infers the congestion window (cwnd) and the underlying loss-based TCP variant of a flow
from passive traffic measurements collected at an intermediate node. The method can also be extended to
predict other TCP transmission states of the client. We believe that our study can also benefit
researchers in the networking community, in both academia and industry,
who want to assess the characteristics of TCP transmission states related to network congestion. We validate
the robustness and scalability of our prediction model through a large number of controlled
experiments. Surprisingly, the learned prediction model performs reasonably well
when applied to a real-life scenario by leveraging knowledge from the emulated network.
Thus, our prediction model is general, bearing similarity to the concept of transfer learning in the machine
learning community. The accuracy of our experimental results in emulated, realistic, and
combined scenario settings, and across multiple TCP congestion control variants, demonstrates that our model
is reasonably effective and has considerable potential.