Network Operating System Evolution

Origin and Evolution of Network Operating Systems

Contemporary network operating systems are mostly advanced and specialized branches of POSIX-compliant software platforms and are rarely developed from scratch. Generally speaking, network operating systems in routers can be traced to three generations of development, each with distinctively different architectural and design goals.

First-Generation OS: Monolithic Architecture

Typically, first-generation network operating systems for routers and switches were proprietary images running in a flat memory space. While they supported multiple processes for protocols, packet handling, and management, they used a cooperative multitasking model in which each process ran to completion or until it voluntarily relinquished the CPU.

First-generation network operating systems made networking commercially viable and were deployed on a wide range of products. The downside was that these systems were plagued with a host of problems associated with resource management and fault isolation.

Second-Generation OS: Control Plane Modularity

Second-generation network operating systems are relieved of packet switching and can therefore focus on control plane functions. Most core and edge routers installed in the past few years are running second-generation operating systems, and these systems are currently responsible for moving the bulk of traffic on the Internet and in corporate networks.

However, the lack of a software data plane in second-generation operating systems prevents them from powering low-end devices without a separate (hardware) forwarding plane. Also, some customers cannot migrate from their older software easily because of compatibility issues and legacy features still in use.

These restrictions led to the rise of transitional OS designs, in which a first-generation monolithic image would run as a process on top of the second-generation scheduler and kernel, thus bridging legacy features with newer software concepts.

Third-Generation OS: Flexibility, Scalability and Continuous Operation

Although second-generation designs were very successful, the past 10 years brought new challenges. Increased competition created the need to lower operating expenses and made a coherent case for network software flexible enough to be redeployed in network devices across the larger part of the end-to-end packet path.

Another key goal of third-generation operating systems is the capability to run with zero downtime, whether planned or unplanned. Third-generation operating systems should also make the migration path completely transparent to customers. They must offer an evolutionary upgrade experience rather than the revolutionary one typical of the retirement of legacy software designs.

Basic OS Design Considerations

As networking vendors develop their own code, they get further and further away from the original port, not only in protocol-specific applications but also in the system area. Extensions such as control plane redundancy, in-service software upgrades and multichassis operation require significant changes on all levels of the original design. However, it is highly desirable to continue borrowing content from the donor OS in areas that are not normally the primary focus of networking vendors, such as improvements in memory management, scheduling, multicore and symmetric multiprocessing (SMP) support, and host hardware drivers.

Commercial Versus Open-Source Donor OS

The advantage of a more active and popular donor OS is not limited to minor improvements: staying at the cutting edge of technology creates new dimensions of product flexibility and usability. Not being locked into a single-vendor framework and roadmap enables greater control of product evolution as well as the potential to gain from progress made by independent developers.

Functional Separation and Process Scheduling

Multiprocessing, functional separation and scheduling are fundamental to almost any software design, including network software. Because CPU and memory are shared resources, all running threads and processes have to access them in a serial and controlled fashion. The following sections briefly explain the intricate relationship between memory, CPU cycles, system performance and stability.

Memory Model

The memory model defines whether processes (threads) run in a common memory space. If they do, the overhead for switching the threads is minimal, and the code in different threads can share data via direct memory pointers.
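As a minimal sketch of the shared-memory case, assume a hypothetical routing_table dictionary and learn_route function (neither comes from the white paper). Two threads in one address space update the same object directly, with no copying or message passing between them:

```python
import threading

# Hypothetical routing table that lives in the one address space
# shared by every thread of this process.
routing_table = {}


def learn_route(prefix, next_hop):
    # The thread writes straight into the shared dict through a direct
    # reference; nothing is copied and no IPC is involved.
    routing_table[prefix] = next_hop


def run_shared_memory_demo():
    workers = [
        threading.Thread(target=learn_route, args=("10.0.0.0/8", "192.0.2.1")),
        threading.Thread(target=learn_route, args=("172.16.0.0/12", "192.0.2.2")),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Both updates are visible here because all threads saw the same object.
    return dict(routing_table)
```

The same direct access is what makes fault isolation hard in a flat memory space: nothing stops a faulty thread from overwriting data that belongs to another.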

Scheduling Discipline

Scheduling choices are primarily between cooperative and preemptive models, which define whether thread switching happens voluntarily. A cooperative multitasking model allows a thread to run to completion or until it voluntarily yields, whereas a preemptive design ensures that every thread gets access to the CPU regardless of the state of other threads.
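The cooperative model can be illustrated with a toy round-robin scheduler built on Python generators. This is only a sketch under stated assumptions: the task names, the task helper and run_cooperative are hypothetical, and each yield stands in for a process voluntarily relinquishing the CPU.

```python
from collections import deque


def task(name, steps, log):
    # A hypothetical protocol task; each `yield` is a voluntary
    # release of the CPU back to the scheduler.
    for step in range(steps):
        log.append(f"{name}:{step}")
        yield


def run_cooperative(tasks):
    # Round-robin over ready tasks. The scheduler never interrupts a
    # task: it regains control only when the task yields or finishes.
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)
        except StopIteration:
            continue  # the task ran to completion
        ready.append(current)


def demo():
    log = []
    run_cooperative([task("bgp", 2, log), task("ospf", 3, log)])
    return log
```

A task that loops without ever yielding would keep the CPU indefinitely in this sketch, which is the weakness a preemptive scheduler removes by interrupting threads on its own schedule.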

Virtual Memory/Preemptive Scheduling Programming Model

Virtual memory with preemptive scheduling is a great design choice for properly constructed functional blocks, where interaction between different modules is limited and well defined. This technique is one of the main benefits of the second-generation OS designs and underpins the stability and robustness of contemporary network operating systems. However, it has its own drawbacks.

Setting aside the overhead associated with context switching, consider the interaction between two threads, A and B, that both rely on a common resource R. Because threads in the preemptive model are unaware of their relative scheduling, they can access R in varying order and with varying intensity.
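One common way to keep R consistent under such unpredictable interleaving is to serialize access with a mutex. The white paper does not prescribe this remedy; the sketch below is an illustration only, and SharedResource, worker and run_demo are hypothetical names.

```python
import threading


class SharedResource:
    """Hypothetical resource R: a counter guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def update(self, delta):
        # The read-modify-write sequence is one critical section, so a
        # preempted thread can never leave a half-finished update behind.
        with self._lock:
            current = self.value
            self.value = current + delta


def worker(resource, delta, repeats):
    for _ in range(repeats):
        resource.update(delta)


def run_demo(repeats=10000):
    r = SharedResource()
    thread_a = threading.Thread(target=worker, args=(r, 1, repeats))
    thread_b = threading.Thread(target=worker, args=(r, -1, repeats))
    thread_a.start()
    thread_b.start()
    thread_a.join()
    thread_b.join()
    # Equal numbers of +1 and -1 updates cancel out under any interleaving.
    return r.value
```

The lock guarantees that each update is applied atomically, not that A and B run in any particular order. Code that depends on ordering needs additional coordination.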

Architecture and Infrastructure

Parallelism

Advances in multicore CPU development and the capability to run several routing processors in a system constitute the basis for increased efficiency in a router control plane. However, finding the right balance of price and performance can also be very difficult.

Flexibility and Portability

A sign of a good OS design is the capability to adapt the common software platform to various needs. The capability to extend the common operating system over several products brings the following important benefits:

  • Reduced OPEX from consistent UI experience and common management interface
  • Same code for all protocols; no unique defects and interoperability issues
  • Common schedule for software releases; a unified feature set in the control plane
  • Accelerated technology introduction; once developed, the feature ships on many platforms

Degrees of Modularity

Software modularity, as previously described, focused on the case where tasks are split into multiple loosely coupled modules. This type of modularity is called “horizontal,” as it aims at limiting dependency and mutual impact between processes operating at the same peer level. Another interesting degree of modularity is known as “vertical modularity,” where modular layers are defined between parts of the operating system in the vertical direction.

Open Architecture

An interesting implication of vertical modularity is the capability to structure code well enough to document appropriate software interfaces and allow external pluggable code. While a high degree of modularity within a system allows easy porting to different and diverse hardware architectures, a well-defined and documented application programming interface (API) can be made available to third parties for development of their own applications.
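A documented interface for pluggable third-party code might look roughly like the sketch below. Every name in it (RouteEventListener, RoutingService, AuditLogger) is hypothetical and is not taken from any vendor SDK; the point is only that external code touches the system through a narrow, well-defined boundary.

```python
from abc import ABC, abstractmethod


class RouteEventListener(ABC):
    """Hypothetical published interface that third-party code implements."""

    @abstractmethod
    def on_route_added(self, prefix: str, next_hop: str) -> None:
        ...


class RoutingService:
    """Hypothetical system-side component that exposes the interface."""

    def __init__(self):
        self._listeners = []

    def register(self, listener):
        # Accept only objects that honor the documented interface.
        if not isinstance(listener, RouteEventListener):
            raise TypeError("listener must implement RouteEventListener")
        self._listeners.append(listener)

    def add_route(self, prefix, next_hop):
        # Notify every registered extension through the public interface.
        for listener in self._listeners:
            listener.on_route_added(prefix, next_hop)


class AuditLogger(RouteEventListener):
    """Example third-party extension that records route additions."""

    def __init__(self):
        self.entries = []

    def on_route_added(self, prefix, next_hop):
        self.entries.append(f"{prefix} via {next_hop}")
```

A third party would register its listener once, for example service.register(AuditLogger()), and from then on receive events without access to the internals of the routing component.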

Product Maintenance

Another important characteristic of products is maintainability. It covers the process of dealing with software defects and new features, the ability to improve existing code, and the introduction of new services and capabilities. It also makes a big difference in the number and quality of network operations center (NOC) personnel required to run a network. Maintainability is where a large portion of OPEX resides.

Excerpted from Juniper.net, copyright 2010, Juniper Networks, Inc. For a more detailed look at Network Operating System Evolution, see the complete white paper.
