SmartIO: Zero-overhead Device Sharing through PCIe Networking
Markussen, Jonas Sæther; Kristiansen, Lars Bjørlykke; Halvorsen, Pål; Kielland-Gyrud, Halvor; Stensland, Håkon Kvale; Griwodz, Carsten
Peer reviewed, Journal article
Published version
Date
2021-07-08
Original version
ACM Transactions on Computer Systems. 2021, 38 (1-2), 1-78. https://doi.org/10.1145/3462545
Abstract
The large variety of compute-heavy and data-driven applications accelerates the need for a distributed I/O solution that enables cost-effective scaling of resources between networked hosts. For example, in a cluster system, different machines may have various devices available at different times, but moving workloads to remote units over the network is often costly and introduces large overheads compared to accessing local resources. To facilitate I/O disaggregation and device sharing among hosts connected using Peripheral Component Interconnect Express (PCIe) non-transparent bridges, we present SmartIO. NVMes, GPUs, network adapters, or any other standard PCIe device may be borrowed and accessed directly, as if they were local to the remote machines. We provide capabilities beyond existing disaggregation solutions by combining traditional I/O with distributed shared-memory functionality, allowing devices to become part of the same global address space as cluster applications. Software is entirely removed from the data path, and simultaneous sharing of a device among application processes running on remote hosts is enabled. Our experimental results show that I/O devices can be shared with remote hosts, achieving native PCIe performance. Thus, compared to existing device distribution mechanisms, SmartIO provides more efficient, low-cost resource sharing, increasing the overall system performance.