### **Current Status of Network Processors**

Neha Jain Research Scholar Department of Computer Science Mohanlal Sukhadia University, Udaipur, India

#### **ABSTRACT**

The number of Internet users is increasing day by day, and so is the demand for new applications. A network processor can be built to meet users' demands. This paper describes software tools that help in writing code for network processors (NPs). It also gives brief information about NPs that are currently on the market.

#### **General Terms**

Brief introduction to network processors and their current status

#### **Keywords**

NP Software, Network Processor Applications, Current status of NPs, Network Processors detail, Currently available NPs, Latest NPs

#### 1. INTRODUCTION

As the number of Internet users grows, so does the demand for new applications. Demand for Internet video is rising, the user bases of YouTube, Facebook, and Google keep growing, users increasingly prefer online shopping, and new protocols are being introduced more rapidly. Together these trends drive significant growth in data traffic. Beyond increased bandwidth, users require QoS-enabled bandwidth for services such as IPTV, assured business networking services, and mobile voice and data services. All of these require the development and deployment of high-speed telecommunication systems. Specialized processors, called network processors (NPs), can address these problems.

Gupta et al. [3] present a methodology for processor evaluation in an embedded system design environment. This evaluation can help in selecting a suitable processor core or in evaluating changes to an ASIP. An NP is in fact an ASIP. An ASIP, or application-specific instruction set processor, is designed for a specific set of applications. For example, NPs are ASIPs for network services; Kin et al. [4] explore a low-power application-specific programmable processor (ASPP) for media processing. According to Liem et al. [15], ASIPs lie between ASICs and general-purpose processors. Manoj Jain et al. [1] describe the steps involved in the ASIP design process.

An NP is a device that processes packets in a network, for example a processor on a router line card or in network access equipment. NPs should offer scalability, product differentiation, low cost, and fast time to market. An NP must have many processing engines (PEs), dedicated hardware accelerators, and memory resources. Programming a network processor is a difficult task. A typical NP architecture is shown in Fig. 1.

Manoj Kumar Jain Associate Professor Department of Computer Science Mohanlal Sukhadia University, Udaipur, India



Fig 1: Typical NP Architecture

Examples of NP software are as follows. Teja: this software simplifies NP programming through a graphical user interface. Teja's software can only be used to program a single network processor, the IXP1200 from Intel Corp. Developers define the application logic in C. These modular logic components are then assigned to the hardware elements of a specific target, and optimized production code is generated automatically. The application logic can be reused across multiple products or across generations of network processors; the user only needs to reassign it to the hardware of the new target. Teja NP consists of an Application Development Environment, a Network Processing Operating System (NPOS), and a library of foundation application building blocks (including TCP Termination, IPv4 Forwarding, IPv6 Forwarding, and ATM).

Teja NP also allows integration of third-party hardware and software components, such as coprocessors and protocol stacks, that complement Intel network processors. Benefits of using Teja NP include accelerated time to market, reduced engineering risk, and reliable, high-performance products.

Zebra is multi-server routing software. It supports routing protocols such as RIP, OSPF, and BGP, as well as IPv6 routing protocols such as RIPng and BGP-4+. Zebra runs on GNU/Linux, FreeBSD, NetBSD, and OpenBSD. It supports common client commands.

Click router: The Click Router is a modular software router. Its modules are known as elements. Each element performs a specific packet-processing task, such as packet classification, queuing, scheduling, or interfacing with a network device. NsClick is an interface that allows the Click router to be used inside the NS network simulator, which is very useful for developing and testing a router.
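The element idea can be illustrated with a minimal Python sketch. This is not Click's configuration language or API; the class names (`Element`, `Classifier`, `Counter`, `Queue`) and the dictionary-based packets are hypothetical, chosen only to show how single-purpose modules can be chained into a packet path.

```python
from collections import deque


class Element:
    """A packet-processing module with one downstream neighbour (hypothetical)."""

    def __init__(self):
        self.next = None

    def connect(self, other):
        self.next = other
        return other  # allows a.connect(b).connect(c)

    def push(self, packet):
        # Default behaviour: hand the packet on unchanged.
        if self.next is not None:
            self.next.push(packet)


class Classifier(Element):
    """Forwards packets whose 'proto' field matches and drops the rest."""

    def __init__(self, proto):
        super().__init__()
        self.proto = proto
        self.dropped = 0

    def push(self, packet):
        if packet.get("proto") == self.proto:
            super().push(packet)
        else:
            self.dropped += 1


class Counter(Element):
    """Counts packets passing through."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def push(self, packet):
        self.count += 1
        super().push(packet)


class Queue(Element):
    """Buffers packets up to a fixed capacity; excess packets are tail-dropped."""

    def __init__(self, capacity):
        super().__init__()
        self.capacity = capacity
        self.buffer = deque()

    def push(self, packet):
        if len(self.buffer) < self.capacity:
            self.buffer.append(packet)


def build_router():
    """Wire the elements into a chain: classifier -> counter -> queue."""
    classifier = Classifier("ip")
    counter = Counter()
    queue = Queue(capacity=2)
    classifier.connect(counter).connect(queue)
    return classifier, counter, queue
```

Pushing packets into the first element of the chain returned by `build_router()` sends them through every stage in turn. Replacing or reordering a stage only changes the wiring, which is the property that makes a modular router convenient to experiment with.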

The NS Network Simulator is a powerful tool for simulating network topologies and traffic through OTcl scripting.

Network Processor Applications:

Routing table lookup: Determine the next hop for incoming packets.

Packet Classification: Classify packets by matching header fields against a set of rules.

URL-based Switching: Distribute HTTP requests based on URLs.

Other applications: Transcoding, encryption/decryption, intrusion detection, firewalling, access control checking, and denial-of-service protection.
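The first two applications can be sketched in a few lines of Python. This is an illustrative software model only: the route entries, rule set, and function names are hypothetical, and a real NP performs these lookups in dedicated hardware rather than with a linear scan.

```python
import ipaddress

# Hypothetical routing table: (prefix, next hop).
ROUTES = [
    ("0.0.0.0/0", "gateway"),
    ("10.0.0.0/8", "port1"),
    ("10.1.0.0/16", "port2"),
]

# Hypothetical classification rules: (protocol, destination port, action).
# None acts as a wildcard; the first matching rule wins.
RULES = [
    ("tcp", 80, "http-queue"),
    ("udp", None, "best-effort"),
]


def build_table(routes):
    """Parse prefix strings into network objects once, ahead of lookups."""
    return [(ipaddress.ip_network(prefix), hop) for prefix, hop in routes]


def next_hop(table, address):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for network, hop in table:
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None


def classify(rules, protocol, dst_port, default="drop"):
    """Return the action of the first rule matching the header fields."""
    for rule_proto, rule_port, action in rules:
        if rule_proto in (None, protocol) and rule_port in (None, dst_port):
            return action
    return default
```

With this table, `next_hop(build_table(ROUTES), "10.1.2.3")` selects the /16 entry rather than the shorter /8 or default route, which is the longest-prefix-match rule that routing table lookup relies on.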

The network processors considered in this paper are: Bay Microsystems Chesapeake, Au1550 network processor, XLP980 Network Processor, C-5 Network Processor, CX8620x, EZchip's NP product family, Marvell Xelerated HX300 Family of Network Processors, Anybus NP40<sup>TM</sup> network processor, and FP3 400G network processor.

#### 2. BAY MICROSYSTEMS: IMPLEMENT CHESAPEAKE

Chesapeake is a 50 Gbps network processor with support for traffic management. It is reported to offer the highest performance of any network processor on the market and the lowest power per Gbps in its class. It has 50G raw data interfaces and a 125G internal data path, and it can process 122 million packets per second.
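The two headline figures can be related with a short calculation. The sketch below assumes that the 50 Gbps line rate and the 122 million packets per second refer to the same operating point and ignores any per-packet framing overhead; both are assumptions, not statements from the vendor.

```python
LINE_RATE_BPS = 50e9      # 50 Gbps raw data interface
PACKET_RATE_PPS = 122e6   # 122 million packets per second


def bits_per_packet(line_rate_bps, packet_rate_pps):
    """Average packet size (in bits) at which both rates are reached together."""
    return line_rate_bps / packet_rate_pps


def time_per_packet_ns(packet_rate_pps):
    """Time budget per packet, in nanoseconds, at the given packet rate."""
    return 1e9 / packet_rate_pps


avg_bits = bits_per_packet(LINE_RATE_BPS, PACKET_RATE_PPS)
avg_bytes = avg_bits / 8
budget_ns = time_per_packet_ns(PACKET_RATE_PPS)
```

Under these assumptions the processor sustains its packet rate for packets of roughly 51 bytes and has a little over 8 ns in which to accept each new packet.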

In conventional multicore processors, each core runs a sequence of instructions on its associated data. In Bay's design, the packet data move from one core to the next, and each core executes just one instruction: the data move, but the instructions do not. For algorithms that are relatively short and require little or no looping, this is a surprisingly efficient approach in spite of all the data movement. The key is that each hop from one core to the next is very short. The architecture is also highly deterministic: the packet will definitely have received all the processing it needs by the time it leaves the far end of the pipeline. The programming model is simple as well; to the programmer, there appears to be only one core.
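A toy model makes the "data move, instructions stay" idea concrete. The sketch below is a simplified simulation, not Bay's actual design: the function name `run_pipeline`, the three arithmetic stages, and the use of integers as packets are all hypothetical.

```python
from collections import deque


def run_pipeline(stages, packets):
    """Clock packets through a linear pipeline; return (outputs, ticks).

    Each stage holds one fixed operation. On every tick the packet in the
    last stage leaves the pipeline, every other packet moves one stage
    forward, and a new packet (if any) enters the first stage.
    """
    slots = [None] * len(stages)  # packet currently held by each stage
    pending = deque(packets)
    done = []
    ticks = 0
    while pending or any(slot is not None for slot in slots):
        if slots[-1] is not None:
            done.append(slots[-1])
        # Shift every packet one stage forward (a short hop to the neighbour).
        for i in range(len(slots) - 1, 0, -1):
            slots[i] = slots[i - 1]
        slots[0] = pending.popleft() if pending else None
        # Each stage applies its single operation to the packet it now holds.
        for i, packet in enumerate(slots):
            if packet is not None:
                slots[i] = stages[i](packet)
        ticks += 1
    return done, ticks


# Three hypothetical one-operation stages.
STAGES = [lambda p: p + 1, lambda p: p * 2, lambda p: p - 3]
outputs, ticks = run_pipeline(STAGES, [1, 2])
```

In this model every packet visits every stage exactly once, so the latency is fixed by the pipeline length and one packet can complete per tick once the pipeline is full. That is the determinism described above, and it also shows why the number of stages bounds the processing available per packet.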

This processor supports MEF (Metro Ethernet Forum) services: E-Line, E-LAN, hierarchical bandwidth profiles, VLAN-aware L2 switching, Q-in-Q, PBT, multicast, VPLS, PWE3, MEF 10.1-compliant hard QoS, and a multiprotocol stack. In the Chesapeake design, the pipeline is over 300 stages long, which limits how much processing can be performed on each packet. Bay has optimized different segments of the pipeline to perform different functions, which adds slightly to the programming complexity but seems a good tradeoff. Chesapeake (110 nm, including a traffic manager) consumes only about 16 W.

#### 3. AU1550® PROCESSOR SECURITY NETWORK PROCESSOR

The Au1550 is a security network processor: a versatile, high-performance, low-power, highly integrated security network system. It is suitable for use in Linux-based networking and remote access devices such as gateways, network attached storage (NAS) units, wireless access points, and VoIP (Voice over Internet Protocol) applications. The entire VPN packet protocol is implemented in the Au1550's hardware. It supports DES, 3DES, AES, ARC-4, SHA-1, and MD5. By implementing the complete IPSec packet-processing task in hardware, the Au1550 gives better security performance than other network processors.
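To give a sense of the per-packet work that such a security engine offloads, the sketch below computes and checks a packet integrity value in software. It assumes HMAC-SHA-1 truncated to 96 bits, the form in which SHA-1 is commonly used for IPSec authentication; the function names and the key are hypothetical, and the code does not represent the Au1550's programming interface.

```python
import hashlib
import hmac


def integrity_check_value(key: bytes, packet: bytes) -> bytes:
    """HMAC-SHA-1 over the packet, truncated to 96 bits (12 bytes)."""
    return hmac.new(key, packet, hashlib.sha1).digest()[:12]


def verify(key: bytes, packet: bytes, icv: bytes) -> bool:
    """Recompute the value and compare it in constant time."""
    return hmac.compare_digest(integrity_check_value(key, packet), icv)


# Hypothetical 160-bit key and payload.
KEY = b"k" * 20
icv = integrity_check_value(KEY, b"payload")
```

A general-purpose CPU has to run this computation for every packet in both directions, in addition to encryption, which is why moving it into hardware frees the processor core for other work.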

This security engine supports virtual private network (VPN) solutions for both IPSec and SSL. It has a configurable DDR or SDRAM memory interface, which extends the power-performance range. It is based on the MIPS32® instruction set. It delivers maximum performance at low power: the processor runs at up to 500 MHz, and power dissipation is less than 500 mW for the 400 MHz version. The available processor clock speeds are 333, 400, and 500 MHz. Target device operating system support is available for Linux, VxWorks, and Windows CE .NET. The Au1550 security network processor architecture is shown in Fig. 2.



Fig 2: Au1550 Security Network Processor Architecture

#### 4. XLP980 NETWORK PROCESSOR

Broadcom Corp. introduced the XLP980, which it describes as the world's highest-performance multicore network processor architecture. According to Broadcom, this NP can compute 1 trillion operations per second. Manufactured in a 28 nm CMOS process, the chip uses 20 quad-threaded, out-of-order MIPS cores to drive throughput to 160 Gbit/s, scalable to 1.28 Tbit/s using multi-chip coherency. The XLP980 is optimized for deployment of network functions such as hardware acceleration, virtualization, and deep packet inspection.

The XLP processor incorporates several networking acceleration engines. Broadcom describes these engines as operating autonomously; they cover packet processing, encryption/decryption, RSA acceleration, compression and decompression, storage acceleration, and pattern matching. The XLP combines these engines to process packets completely without CPU intervention. System designers can combine the XLP's autonomous packet processing with the processor's hardware virtualization to meet the requirements of emerging applications such as network functions virtualization (NFV). NFV decouples network functions, such as network address translation (NAT), firewalling, intrusion detection, domain name service (DNS), and caching, from hardware appliances so that they can run in software. Broadcom has designed virtualization capabilities into all functional blocks of the processor, including the CPUs, engines, accelerators, and I/Os. These blocks work in unison to support virtualization throughout the device without needing the CPU to manage operations. System designers therefore need not worry about packet processing performance and can focus on Layer 4-7 applications.

#### 5. C-5 NETWORK PROCESSOR

This processor works at all levels of the protocol stack, Layers 2-7. Its operating frequencies are 166 MHz, 200 MHz, and 233 MHz. It can transmit up to 15 million packets per second at wire speed. It has 17 programmable RISC cores (for cell/packet forwarding) and 32 programmable Serial Data Processors (for processing bit streams). It can perform up to 133 million table lookups per second. The main components of the C-5 NP are the Channel Processors, Executive Processor, Fabric Processor, Buffer Management Unit, Table Lookup Unit, and Queue Management Unit.
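The quoted C-5 figures imply a per-packet budget that can be worked out directly. The sketch below assumes the peak packet rate and peak lookup rate are reached at the same time, and that all 17 RISC cores contribute equally to packet processing at the highest clock; both are simplifying assumptions made only for illustration.

```python
PACKETS_PER_S = 15e6    # up to 15 million packets per second
LOOKUPS_PER_S = 133e6   # up to 133 million table lookups per second
RISC_CORES = 17
CLOCK_HZ = 233e6        # highest listed operating frequency


def lookups_per_packet(lookups_per_s, packets_per_s):
    """Average number of table lookups available for each packet."""
    return lookups_per_s / packets_per_s


def cycles_per_packet(cores, clock_hz, packets_per_s):
    """Aggregate RISC clock cycles available for each packet."""
    return cores * clock_hz / packets_per_s


lookup_budget = lookups_per_packet(LOOKUPS_PER_S, PACKETS_PER_S)
cycle_budget = cycles_per_packet(RISC_CORES, CLOCK_HZ, PACKETS_PER_S)
```

Under these assumptions each packet can use just under nine table lookups and roughly 264 core cycles at full rate, which shows how tight the processing budget of wire-speed code is.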



Fig 3: C-5 Network Processor Architecture

#### 6. CX8620x

The CX8620x provides networking system solutions for a wide variety of mainstream applications in residential and small office/home office (SOHO) environments. It is based on a 180 MHz ARM926. This single-chip network processor has an integrated switch that supports five 10/100 Mbps Ethernet ports with auto-MDIX capability. It supports Ethernet, and 802.11 through a PCI interface, and can provide Internet access sharing for multiple PCs. It is based on the IEEE 802.3 specification. It includes five embedded DSP-based 10Base-T/100Base-TX transceivers. An auto-negotiation function selects either the 10Base-T or the 100Base-TX mode. It has advanced serial and VoIP capabilities and an MMU for Linux and WinCE support. It supports high-speed secure VPN implementations.

Fig 4: CX8620x Network Processor Architecture

#### 7. EZCHIP'S NP PRODUCT FAMILY

Network processors of this family are cost-effective and efficient in power and board space. They are used in metro switches, edge and core routers, data center switches, and 3G/4G wireless infrastructure equipment. They support Ethernet aggregation nodes and multi-10 Gbps firewalls and VPNs, and they can be used in intrusion detection appliances, server load balancing switches, and network monitoring and analysis services.

NP-5 Network Processor: a 240-Gigabit network processor for Carrier Ethernet applications. It has integrated 240 Gbps traffic management, which provides granular bandwidth control. It is suited for line card, services card, and pizza box applications. It uses DDR3 DRAM memory chips to minimize power and cost.

EZchip is scheduled to sample its newest product family, the NPS (Network Processors for Smart networks), in 2014. It will be based on C-based programming and the Linux operating system. It will provide full 7-layer processing, integrated traffic management, and hardware acceleration for security and DPI. The NPS thus enables a wide range of network services at extremely high performance. Software can be updated as needed, which makes the NPS an excellent fit to facilitate the next transition in the networking market.

#### 8. MARVELL XELERATED HX300 FAMILY OF NETWORK PROCESSORS

Carrier Ethernet packet processing is the main job of the processors in this family. The HX300 family targets switch-routers, packet-optical transport/OTN, metro aggregation, and cloud computing platforms. The family has five members: HX337, HX336, HX326, HX330, and HX320. HX300 processors perform packet processing, switching, and traffic management, and provide Ethernet MAC and PHY functionality. TCAM and SRAM are integrated in these processors. TCAM (ternary content-addressable memory) is a specialized type of high-speed memory that searches its entire contents in a single clock cycle. Using it increases the speed of route lookup, packet classification, and packet forwarding. Processors of this family offer the low cost and power efficiency of custom-designed fixed-function ASICs, together with the advantages of reduced time to market and value differentiation through software.
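The behaviour of a TCAM can be modelled in software to show what "ternary" means. In the sketch below each entry stores a value and a mask, and mask bits set to 0 are "don't care" positions. The class and entry values are hypothetical, and the sequential loop only imitates the result of a lookup; a real TCAM compares all entries in parallel.

```python
def ternary_match(key, value, mask):
    """True if key equals value on every bit position where mask is 1."""
    return (key & mask) == (value & mask)


class Tcam:
    """Software model of a TCAM: the first (highest-priority) match wins."""

    def __init__(self):
        self.entries = []  # (value, mask, result) in priority order

    def add(self, value, mask, result):
        self.entries.append((value, mask, result))

    def lookup(self, key):
        # Hardware checks every entry at once; this loop visits them in order.
        for value, mask, result in self.entries:
            if ternary_match(key, value, mask):
                return result
        return None


# Hypothetical IPv4 forwarding entries, most specific prefix first.
tcam = Tcam()
tcam.add(0x0A010000, 0xFFFF0000, "port2")   # 10.1.0.0/16
tcam.add(0x0A000000, 0xFF000000, "port1")   # 10.0.0.0/8
tcam.add(0x00000000, 0x00000000, "default")  # matches any key
```

Because the entries are ordered from most to least specific, a single lookup returns the longest-prefix match, which is why TCAM suits route lookup and packet classification.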



Fig 5: Marvell Xelerated HX300 Family of Network Processors Architecture

#### 9. ANYBUS NP40<sup>TM</sup> NETWORK PROCESSOR

The Anybus NP40 is suitable for all major industrial networks. It is best suited to high-end real-time Industrial Ethernet and fieldbus applications that have fast network cycles and synchronization demands. The Anybus NP40 is a flash-based, single-chip network processor that includes a high-performance ARM® Cortex<sup>TM</sup>-M3 and an FPGA fabric. The FPGA fabric implements the various real-time Ethernet interfaces, while the ARM core runs the protocol and application stacks. Since the NP40 is flash-based, the device can be reprogrammed for several different industrial Ethernet networks; a single hardware platform can therefore support several different networks simply by downloading new firmware. For high-performance networks, the architecture makes it possible to get practically immediate data transfer with "zero delay".

Optimized stacks and controllers: the network controllers, implemented in hardware (VHDL), are optimized together with the protocol stacks for best performance and flexibility. A unique API handling method separates the network application from the host interface. It provides a high-performance, event-driven architecture that is flexible and straightforward to modify and extend. Protocol preprocessing is done in hardware (VHDL).

#### 10. POWER EDGE OF NETWORK (POWEREN) PROCESSOR

The IBM Power Edge of Network (PowerEN) processor delivers throughput-oriented computing, connectivity efficiency, and flexible, scalable resource deployment to meet the requirements of Smarter Systems. PowerEN integrates sixteen general-purpose processors. It has embedded hardware accelerators for key applications: Host Ethernet acceleration for network protocol processing, encryption/decryption, pattern matching, Extensible Markup Language (XML) processing, and compression/decompression. It works on Layers 4-7.

#### 11. FP3: 400G NETWORK PROCESSOR

The FP3 is the world's first 400G network processor. It is four times faster than the most advanced networks available today. It provides high performance and advanced Quality of Service (QoS), and it is highly scalable. Its power consumption per Gb/s is half that of the FP2. Other network processors currently on the market achieve high performance only by sacrificing the range of services they support or, conversely, scale service support only by limiting performance. The FP3 can scale both performance and services simultaneously, without compromise. To achieve 400 Gb/s, the FP3 uses 40 nm chip technology, which reduces transistor size and enables more processing capacity to be delivered in a smaller package. It has 288 RISC cores. Its core frequency is GHz. Today's memory technology does not support 400 Gb/s processing capability. Alcatel-Lucent has therefore worked closely with leading memory suppliers to develop new techniques that increase the speed of memory access and accelerate the leap to 400 Gb/s processing capability. Active power management is built into the chip design: to ensure optimal energy efficiency across diverse applications, portions of the chip whose features are not in use can be turned off.

#### 12. COMPARISON OF DIFFERENT NPs

The Chesapeake network processor uses a pipeline architecture and a DRAM controller. It has a 125G internal data path and 50G raw data interfaces. The AU1550® processor uses descriptor-based DMA (DDMA). It operates in parallel with the CPU pipeline and provides support for DES, 3DES, AES, ARC4, SHA1, and MD5. The XLP980 network processor focuses on Layer 4-7 applications. It is optimized for deployment of network functions such as hardware acceleration, virtualization, and deep packet inspection. The C-5 network processor has 17 programmable RISC cores (for cell/packet forwarding) and 32 programmable Serial Data Processors (for processing bit streams). The CX8620x supports complete networking system solutions for a wide variety of mainstream applications in residential and SOHO environments. The Anybus NP40<sup>TM</sup> network processor has an event-based hardware API. This comparison is shown in Table 1.

#### 13. CONCLUSION

This survey has presented the fields of networking in which network processors can be used and the software tools that help in programming them. It has also presented the features of different network processors. The Bay Microsystems design does not provide data security: DES and AES implementations are not supported. It uses DRAM, although TCAM, which is much faster than DRAM, is available on the market; using it could increase performance drastically. The Au1550 processor provides security services, but it also uses DDR or SDRAM, which gives lower performance than TCAM. The XLP980 works on Layers 4 to 7, but processors that work on all layers are also available on the market, e.g. the C-5 NP and the NP-5 NP. Although manufacturers have made significant improvements, scope for further improvement remains.

**Table 1: Comparison of different Network Processors** 

| Network Processor | Architectural Discussion | Task based | Sample Features |
|---|---|---|---|
| Chesapeake | Pipeline architecture<br>DRAM controller | 50G raw data interfaces<br>125G internal data path<br>122 million packets per second processing capacity<br>TM supports hierarchical scheduling and shaping | 50G raw data interfaces<br>125G internal data path<br>122 million packets per second processing<br>MEF10.1 compliant - Hard-QoS<br>Multiprotocol stack |
| AU1550® Processor | High-speed MIPS CPU core<br>Pipeline: scalar 5-stage pipeline<br>Descriptor-based DMA (DDMA) | Operates in parallel to CPU pipeline<br>Executes all integer multiply and divide instructions<br>High-speed access to on-chip buses | Direct support for DES, 3DES, AES, ARC4, SHA1, MD5<br>Full IPSec packet protocol processing implemented in hardware |
| XLP980 Network Processor | Manufactured in a 28 nm CMOS process | Focuses on Layer 4-7 applications<br>Can compute 1 trillion operations per second | Optimized for deployment of network functions such as hardware acceleration, virtualization and deep packet inspection |
| C-5 Network Processor | 17 programmable RISC cores (for cell/packet forwarding) and 32 programmable Serial Data Processors (for processing bit streams) | Programmability at all levels of the protocol stack: Layers 2-7<br>Can transmit up to 15 million packets per second at wire speed<br>Can perform up to 133 million table lookups per second | Support for virtually any third-party protocol stack, PHY or fabric interface, and industry standard tools<br>Smart Networks Alliance Program ensures a wide range of verified solutions |
| CX8620x | Based on a 180 MHz ARM926 | TDMI interface for VoIP<br>Embedded UART interface for external modem applications<br>PCI interface | Supports complete networking system solutions for a wide variety of mainstream applications in residential and SOHO environments |
| EZchip's Network Processor Family | | Delivers a wide variety of applications for layer 2-3 switching and routing, layer 4-7 stateful session processing, and packet payload manipulation | Can be used in intrusion detection appliances, server load balancing switches, and network monitoring and analysis services<br>Suited for line card, services card and pizza box applications |
| Anybus NP40<sup>TM</sup> Network Processor | Event-based hardware API: 8/16-bit parallel and high-speed SPI<br>I/O (shift register interface) | Event-based interface method enables easy access to input and output data at any time | Flash-based: the device can be reprogrammed for several different industrial Ethernet networks, so a single hardware platform can support several different networks simply by downloading new firmware |

#### 14. REFERENCES

- [1] Manoj Kumar Jain, M. Balakrishnan, Anshul Kumar: "ASIP Design Methodologies: Survey and Issues", Proceedings of the IEEE/ACM International Conference on VLSI Design, 2001.
- [2] A. Tanenbaum: Computer Networks. Prentice Hall PTR, January 1996.
- [3] Gupta, T.V.K.; Sharma, P.; Balakrishnan, M.; Malik, S.: "Processor evaluation in an embedded systems design environment", Proceedings of Thirteenth International Conference on VLSI Design 2000, 3-7 Jan. 2000, Pages: 98-103.
- [4] Kin, J.; Chunho Lee; Mangione-Smith, W.H.; Potkonjak, M.: "Power efficient mediaprocessors: design space exploration", Proceedings of the 36th Design Automation Conference 1999, 21-25 June 1999, Pages: 321-326.
- [5] Karras, K.: "A folded pipeline network processor architecture for 100 Gbit/s networks", Architectures for Networking and Communications Systems (ANCS), 2010 ACM/IEEE Symposium on, 25-26 Oct. 2010, Pages: 1-11.
- [6] Intel® IXP4XX Product Line of Network Processors, July 2010.
- [7] N. Shah: "Understanding Network Processors", M.S. thesis, http://www-cad.eecs.berkeley.edu/niraj/papers/UnderstandingNPs, 2001.
- [8] Gupta, T.V.K.; Sharma, P.; Balakrishnan, M.; Malik, S.: "Processor evaluation in an embedded systems design environment", Proceedings of Thirteenth International Conference on VLSI Design 2000, 3-7 Jan. 2000, Pages: 98-103.
- [9] Bay Microsystems: http://www.baymicrosystems.com/
- [10] www.netlogicmicro.com
- [11] http://www.embedded.com/electronicsnews/4416337/Broadcom-launches-tera-ops-networkprocessor
- [11] www.freescale.com
- [12] http://www.ezchip.com
- [13] Anybus® NP40<sup>TM</sup> network processor: http://www.hms.se/
- [14] IBM Power Edge of Network Processor, 2012.
- [15] Liem, C.; May, T.; Paulin, P.: "Instruction-set matching and selection for DSP and ASIP code generation", Proceedings of the European Design and Test Conference, 1994. EDAC, the European Conference on Design Automation. ETC European Test Conference. EUROASIC, 28 Feb.-3 March 1994, Pages: 31-37.

IJCA™: www.ijcaonline.org