By Karen Walker

Digital tools and services generate an ever-greater amount of data. Statista, a German firm that provides analyses of market and consumer data trends, forecasts that the total amount of data created in the world will reach 175 zettabytes in 2025. All that data needs to be stored somewhere, likely in one of more than seven million data centers in operation across the globe. Statista projects that within the next year, data centers will contain around 1,327 exabytes (one exabyte is one quintillion bytes). Big data, meaning data sets too large or too complex for traditional data processing applications, makes up a substantial share of data centers' holdings. Big data takes up 124 exabytes of data center storage globally, with anticipated growth to 403 exabytes by 2021.
Organizations continuously call on their data repositories for business operations and decision-making. Computer engineers develop the architectures, tools and algorithms to automatically move bulk data across and through these geographically dispersed data centers with maximum efficiency.
A team of computer engineers and computer scientists at Fuzhou University in China, Shanghai Jiao Tong University and the University of Virginia has developed a promising solution that meets massive bulk data transfer demands across the networks that connect data centers around the world. They developed a new store-and-forward scheduling method, based on a mature concept in telecommunications that the team customized for inter-data center bulk data transfers. The team's method temporarily stores bulk data that is not time-sensitive at intermediate sites and forwards the data when the network is less congested.
The team earned a best paper award for this research, presented at the Institute of Electrical and Electronics Engineers 2020 International Conference on High Performance Switching and Routing.
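For readers who want a feel for the store-and-forward idea, here is a minimal Python sketch. It is not the team's algorithm. It is a toy illustration under stated assumptions: a single outgoing link from one intermediate site, time divided into slots, and a fixed "busy" threshold above which delay-tolerant data stays in storage. The function name, parameters and threshold value are all hypothetical.

```python
def plan_store_and_forward(transfer_gb, background_load_gb, link_capacity_gb,
                           busy_threshold_pct=70):
    """Toy store-and-forward plan for one delay-tolerant bulk transfer.

    transfer_gb:        amount of bulk data waiting at the intermediate site
    background_load_gb: other traffic on the outgoing link, one value per time slot
    link_capacity_gb:   how much the link can carry in one time slot
    busy_threshold_pct: hypothetical cutoff; at or above it the link counts as congested

    Returns (plan, remaining), where plan is a list of (slot, gb_sent) pairs.
    """
    remaining = transfer_gb
    plan = []
    for slot, load in enumerate(background_load_gb):
        if remaining == 0:
            break
        # Link is congested in this slot: keep the data in storage.
        if load * 100 >= busy_threshold_pct * link_capacity_gb:
            continue
        # Link is quiet enough: forward as much as the spare capacity allows.
        sent = min(link_capacity_gb - load, remaining)
        plan.append((slot, sent))
        remaining -= sent
    return plan, remaining


if __name__ == "__main__":
    # 120 GB to move; the link is busy in slots 0 and 1 and quieter in slots 2 and 3.
    plan, remaining = plan_store_and_forward(120, [90, 80, 50, 20], 100)
    print(plan, remaining)  # [(2, 50), (3, 70)] 0
```

In this example the data waits through the two congested slots and is forwarded in the two quieter ones. A real scheduler would also have to choose among routes and storage sites, which this sketch leaves out.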
Conventional scheduling methods require knowledge of the entire network, which involves the continual pinging and processing of return signals from every link and node. “Consider the case where computing resources used for scheduling are limited,” said Yuanlong Tan, a co-author and Ph.D. student at UVA. “In this case, the conventional solutions may be too complex to implement for large-scale networks and dynamic traffic.”
Xiao Lin, an assistant professor at Fuzhou University and the paper's first author, explained that the store-and-forward method can make decisions based on the status of a few pre-selected routes. “Our studies show that continually assessing the state of the entire network involves a lot of redundancy, offering no new information but imposing extra computational burdens on scheduling,” Lin said.
The team's approach merges steady-state knowledge about the network with in-the-moment information about the most consequential, pre-selected links. “This approach reduces redundancy and maximizes computing resources spent on time management of data flowing across a large-scale network.”
The team's quest to achieve high performance with less complexity began with a conversation between Shanghai Jiao Tong University professor Weiqiang Sun and the late Malathi Veeraraghavan, professor of electrical and computer engineering at UVA. The pair discussed Sun's ideas for efficiently transferring bulk data in networks, which aligned with Veeraraghavan's research in control-plane architectures, signaling protocols and hybrid networks.
Lin, who was one of Sun's advisees at the time, formalized the research partnership when he joined Veeraraghavan's group as a visiting Ph.D. student in 2016. “From 2016 to 2017, we gained a lot from our cooperation. And, eventually, addressing the problem of bulk data transfer in networks, we created a theoretical model and formed a systemic research method that benefits both teams,” Lin said.
The conference paper emerged from Veeraraghavan's visit to Shanghai Jiao Tong University in 2018. The research partnership expanded when Lin started his faculty position at Fuzhou University.
“Professor Veeraraghavan had a unique vision and firm confidence in our work,” said Tan, whom Veeraraghavan advised throughout his dissertation proposal, defended successfully in March 2020. “She believed we were working on a meaningful problem and encouraged us to stick with our store-and-forward scheme even when other researchers questioned our work.”
“We are delighted to have proven Professor V. right,” said Lin. “Academic cooperation should be long-term and stable.” Lin and Tan continue to collaborate and share ideas, following Veeraraghavan's example.
The team's collaborative research has been published in the Journal of Optical Communications and Networking, Optical Switching and Networking and IEEE Access, in addition to conference papers presented at the Asia-Pacific Conference on Communications (2017), the International Telecommunication Networks and Applications Conference (2018) and the IEEE International Conference on High Performance Switching and Routing (2019 and 2020).