UVA High-Speed Networks (HSN) research
Scientific applications in fields such as high-energy physics, genomics, Earth sciences, climate sciences, and fusion science often require high-performance computing and high-speed networking resources.
The goal of the UVA HSN research group is to contribute advances to high-speed networking to support such scientific applications. Broadly speaking, the networking needs of this community fall into two categories: (i) intra-datacenter networking and (ii) Wide-Area Networking (WAN).
In the intra-datacenter networking space, we have two projects:
- Develop new scheduling algorithms for Hadoop MapReduce for use in hybrid datacenter networks that include optical circuit switches and electronic packet switches
- Develop a Dynamic Congestion Management System (DCMS) and a Virtual Network Configuration System (VNCS) for InfiniBand networks to improve the performance (execution time) of large-core-count scientific applications
In the WAN networking space, we are working on the following problems:
- Develop protocols to enable reliable multicast of file-streams from a single sender to hundreds of receivers. Specifically, since 1993, real-time meteorology data such as radar data and satellite data have been distributed by UCAR, in a program called Unidata, to hundreds of scientists located in various universities and research labs. Our solution leverages Software Defined Network (SDN) capability to build a multipoint VLAN across multiple domains, and a newly developed reliable multicast transport protocol called File Multicast Transport Protocol (FMTP), to make this data distribution more sustainable (in terms of required network bandwidth and sender compute resources)
- Develop cross-layer designs for effective leveraging of high-speed end-to-end network paths (L2 virtual circuits or L1 optical circuits) for the transfers of large datasets
- Develop methods for online identification of alpha or elephant flows (high-rate, large-sized flows) and their redirection to isolated network slices to reduce their negative impact on other flows
Our work is supported by research grants from the National Science Foundation (NSF) Computer & Information Science & Engineering (CISE) Directorate’s Advanced Cyberinfrastructure (ACI) and Computer and Network Systems (CNS) Divisions, and the Department of Energy Office of Science Advanced Scientific Computing Research (ASCR).
US Ignite: Collaborative Research: Track 1: Industrial Cloud Robotics across Software Defined Networks
Funding agency: CNS, Division of Computer and Network Systems
PI: Malathi Veeraraghavan; Award: 1531065; $424,261.00; Started: August 24, 2015
Co-PIs: Shaun Edwards, SwRI, & Andrea Fumagalli, UTD
Currently, industrial robots are cost-effective for repetitive and high-volume tasks such as welding and painting, but not for lower-volume, mixed-part production. The need for robotic part handling for unstructured industrial applications is diverse. In manufactured-goods distribution centers, where multiple bins are presented to an operator, a human is required to handle a range of parts that must be boxed and shipped. In the reclamation and recycling industry, humans sort waste streams of mixed products on conveyor belts. Assembly and kitting operations in manufacturing are termed robotic opportunities, but they require a solution for handling many part types in the same work-cell. This project will research and integrate technologies to enable the use of industrial robots for low-volume mixed-part production tasks. The proposed solution will include 3D image sensors, high-speed flexible networking, cloud computing, and industrial robots. The inclusion of cutting-edge new software such as the Robot-Operating System Industrial (ROS-I) and Cloud Computing platforms offers excellent educational opportunities for both undergraduate and graduate students. The software developed in this project will be widely distributed to enable further innovations by other teams.
The project objective is to develop cloud robotics applications that leverage high-performance computing and high-speed software-defined networks (SDN). Specifically, the target applications combine big-data analytics of sensor data (of the type collected from factory floors) with the control of industrial robots for low-volume, mixed-part production tasks. Cloud computers located at a remote facility relative to the factory floor on which industrial robots operate can be used for compute-intensive applications such as object identification from 3D sensor data, and grasp planning for the robots to perform object manipulation. The project methods will consist of (i) integrating ROS-I components and developing new software as required to transmit the 3D sensor data to remote computers, running the object identification and grasp planning applications, and returning robot instructions to the original site, (ii) running this software on geographically distributed compute clouds, and (iii) collecting measurements and enhancing the software to meet real-time delay requirements. The technical challenge lies in meeting these stringent real-time requirements. For example, high-speed networks with the flexibility to connect arbitrary factory floors and data centers are needed to transfer the 3D sensor data quickly to the remote cloud computers and to deliver the computed robot instructions (hence, SDN).
Leveraging DYNES for Weather Data Distribution on Multicast Virtual Circuits
Funding agency: ACI, Division of Advanced Cyberinfrastructure
PI: Malathi Veeraraghavan. Award: 1340910 $899,946. October 1, 2013 – September 30, 2016.
The project objective is to create a new version of a software program called Local Data Manager (LDM), which is used to distribute weather data by the University Corporation for Atmospheric Research (UCAR) to over 170 institutions. This new version uses network multicast services in order to decrease the required computing resources and network capacity.
The methods being employed are as follows: (i) implement a new version of LDM, LDM-7, by integrating a reliable multicast transport protocol called Virtual Circuit Multicast Transport Protocol (VCMTP) into the current LDM-6, (ii) install LDM-7 on Dynamic Network System (DYNES) hosts that have been deployed at several universities as part of another NSF grant, (iii) compare performance of LDM-7 across two types of network multicast service: layer-3 (IP) multicast and layer-2 multicast virtual circuits, and (iv) beta test LDM-7 at U. Wisconsin and Rutgers U. for future transition to practice at the other 170 institutions.
The potential benefits of the project are promising because LDM-6 has a large deployed base; in addition to UCAR, it is used by NOAA, NASA, US Geological Survey, US Army Corps of Engineers, US Air Force, US Navy, and international agencies. Our expected outcome, based on a preliminary evaluation of VCMTP, is that LDM-7 will require fewer servers and lower network capacity to achieve the same level of performance as today’s LDM-6. More broadly, this network multicast solution with the reliable transport protocol can be used to disseminate other types of information such as financial data and video files.
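The resource argument can be illustrated with a back-of-the-envelope sketch that compares the sender's outbound bandwidth when a feed is replicated over per-receiver unicast connections against sending it once over network multicast. This is a simplification under stated assumptions: the 20 Mbps feed rate and the function names are hypothetical, the sender is assumed to serve every receiver directly, and retransmissions needed for reliability are ignored. Only the receiver count of 170 comes from the text above.

```python
def unicast_sender_mbps(feed_rate_mbps: float, receivers: int) -> float:
    """Sender transmits one copy of the feed per receiver."""
    return feed_rate_mbps * receivers


def multicast_sender_mbps(feed_rate_mbps: float, receivers: int) -> float:
    """Sender transmits one copy; the network replicates it toward receivers."""
    return feed_rate_mbps if receivers > 0 else 0.0


FEED_RATE_MBPS = 20.0  # hypothetical feed rate, for illustration only
RECEIVERS = 170        # institution count quoted in the text

unicast = unicast_sender_mbps(FEED_RATE_MBPS, RECEIVERS)
multicast = multicast_sender_mbps(FEED_RATE_MBPS, RECEIVERS)
print(f"unicast sender load:   {unicast:.0f} Mbps")
print(f"multicast sender load: {multicast:.0f} Mbps")
```

Under these assumptions the sender's load grows linearly with the number of receivers in the unicast case and stays constant in the multicast case, which is the intuition behind expecting fewer servers and lower network capacity.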
ACTION: Applications Coordinating with Transport, IP, and Optical Networks
Funding agency: CNS, Division of Computer and Network Systems
PI: Malathi Veeraraghavan. Award: 1405171 $150,000. February 1, 2014 – January 31, 2017.
Today’s Internet is significantly over-provisioned for a number of reasons, e.g., to absorb traffic bursts, to handle the extra (rerouted) load when failures occur, and for long-term traffic growth without frequent upgrades in the field. This overprovisioning adds significant capital costs to the Internet.
The ACTION project enables adaptation of the Internet to changing needs and demands by dynamically adjusting its “optical highway” capacities based on network resource utilization. The project name ACTION stands for “Applications Coordinate through Transport Interfaces with Optical Networks,” implying that the capacity of the Internet optical highways is a time-adjustable commodity, which can be intelligently controlled by the applications via specially designed interfaces. For example, part of the optical spectrum can be freed on highways with low traffic volume and either reassigned to other highways where needed or switched off to conserve energy. Such flexible highways are achieved by combining WDM (Wavelength Division Multiplexing) Flexible Grid technology in the metro-core with DSCM (Digital Subcarrier Multiplexing) technology in the campus/access. The ACTION research activities will culminate in a final demonstration on the Japan Gigabit Network-extreme (JGN-X) test-bed.
The ACTION project will actively engage undergraduate students in Senior Thesis and Design Projects, and will recruit women and under-represented minorities by leveraging the UVA Center for Diversity in Engineering and the UTDallas long-standing relationship with CONACYT (National Council for Science and Technology of Mexico). Findings, documents, links to publications and software will be disseminated through a wiki-based Web site. All instructional materials developed will be made available at TeachEngineering.com.
US/Japan Trustworthy Networking Workshop
Funding agency: CNS, Division of Computer and Network Systems
PI: Malathi Veeraraghavan. Award: 1624676 $73,812. February 15, 2016 – January 31, 2017.
This project brings together US and Japanese networking researchers in a 1.5-day workshop to identify key challenges in enabling Trustworthy Networking for Smart and Connected Communities. The two focus areas of the proposed workshop are: (i) trustworthy optical networking, and (ii) trustworthy computing/networking platforms (e.g., IoT/CPS, edge cloud). Optical networks are necessary in the core of provider and enterprise networks as data rates increase, while IoT/CPS/edge clouds are expected to dominate future access networks. Funds requested for this workshop will be used to cover workshop expenses and travel costs for only US attendees.
The goals of the workshop are to identify key challenges in the focus areas of trustworthy optical networking and trustworthy computing/networking platforms for IoT/CPS and edge clouds. Specific attributes of trustworthiness include reliability, dependability, resilience, survivability and robustness. Attendees will make presentations, participate in breakout sessions to identify key challenges, and then collaborate on a workshop report. The expected outcomes include: (i) A list of key research challenges in the two focus areas of the workshop; (ii) An improved mutual understanding of research interests, research contributions, and testbeds in the two countries; (iii) An improved understanding of processes used in the two countries to enable mutually beneficial collaborations; (iv) Seeding of individual linkages; (v) An identification of research objectives and projects that are missing in one of the two countries with the goal of increasing awareness of unaddressed challenges; and (vi) A workshop report.
Real-time Alpha-Flow Traffic Management System
Funding agency: DOE SBIR, Department of Energy
PI: Alan Commike. Award: DE-FOA-0000969 $149,994. Award year: 2014
As supercomputing speeds increase and storage costs drop, scientists running applications in various disciplines, such as high-energy physics, genomics, and climate studies, generate datasets of ever-increasing sizes at distributed locations around the world. Backbone Research-and-Education Network (REN) providers such as DOE’s Energy Sciences Network (ESnet) and Internet2 provide high-speed, reliable network services to support these scientists located at various US Department of Energy (DOE) national laboratories and universities. Our work will identify, in real-time, very large data flows across these networks and move them to alternative network paths or queues so that interactive applications such as voice, video, and other applications can retain a high quality of service, thereby extending the interval to the next inevitable costly equipment upgrade cycle. We are developing a system called real-time Hybrid Network Traffic Engineering System (rHNTES), which will tap the traffic between networking installations to determine which network flows are large, ongoing, and of high bandwidth, i.e., elephant flows. We consider the timely completion of these elephant flows less important than interactive network traffic such as voice, video, or other applications, and as such, change the network priority so that elephant flows have lower priority than other, more important network traffic. Our elephant flow identification is accomplished in real-time, using off-the-shelf network processors and innovative software algorithms. Once these flows are identified, we will reprogram the switching infrastructure to ensure they do not impact the higher-priority flows. This project will develop a software packet processing engine using readily available commodity network processors, which will be able to analyze network packets at up to 100 billion bits per second. We will use this software engine to develop our real-time HNTES (rHNTES) system based on prior HNTES algorithm work.
The network routers will be reprogrammed on-the-fly to mitigate the impact of each of the elephant flows we identify. This system will then be integrated into existing network performance monitoring tools to help identify where network bottlenecks occur. Commercial Applications and Other Benefits: This work will enhance collaborative science by making more efficient use of existing and future networking technologies that will help usher in scientific breakthroughs.
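The idea of flagging flows that are both large and high-rate can be sketched as a simple threshold test over per-flow byte counts. This is an illustrative, offline sketch only, not the HNTES or rHNTES algorithm: the `Packet` record, the `find_elephant_flows` function, the thresholds, and the sample traffic are all hypothetical, and a real system would run on network processors at line rate rather than over an in-memory list.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    timestamp: float  # seconds since start of capture
    flow_id: tuple    # e.g. (src_ip, dst_ip, src_port, dst_port, proto)
    size_bytes: int


def find_elephant_flows(packets, min_bytes, min_rate_bps):
    """Return the set of flow ids whose total size and average rate
    both meet the given (hypothetical) thresholds."""
    total_bytes = defaultdict(int)
    first_seen = {}
    last_seen = {}
    for pkt in packets:
        fid, ts = pkt.flow_id, pkt.timestamp
        total_bytes[fid] += pkt.size_bytes
        first_seen[fid] = min(first_seen.get(fid, ts), ts)
        last_seen[fid] = max(last_seen.get(fid, ts), ts)

    elephants = set()
    for fid, nbytes in total_bytes.items():
        duration = last_seen[fid] - first_seen[fid]
        if duration <= 0:
            continue  # a single packet has no measurable rate
        rate_bps = 8 * nbytes / duration
        if nbytes >= min_bytes and rate_bps >= min_rate_bps:
            elephants.add(fid)
    return elephants


# Hypothetical traffic: one bulk transfer and one low-rate interactive flow.
bulk = ("10.0.0.1", "10.0.1.1", 50000, 2811, "tcp")
voice = ("10.0.0.2", "10.0.1.2", 40000, 5060, "udp")
packets = [Packet(i * 0.001, bulk, 9000) for i in range(1000)]
packets += [Packet(i * 0.02, voice, 200) for i in range(50)]

elephants = find_elephant_flows(packets, min_bytes=1_000_000,
                                min_rate_bps=10_000_000)
print(elephants)
```

With these sample thresholds only the bulk transfer is flagged; the interactive flow is both too small and too slow to qualify. In a deployment, the flagged flow identifiers would be the input to whatever mechanism reprograms routers or switches to lower the priority of those flows.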