Sky is the limit: August 2009

Every telecommunication equipment vendor has to supply some level of manageability for the equipment. The management software supplied with the network element can be a propreitary software interoperating with the vendor's element management system or it can be a standards based management agent so that it can interoperate with both the vendor's element management system and other third party network management systems. In this article I will cover various aspects to consider in the architecture of a management solution for network elements, especially those based on ATCA platform.

What is element management?

Managing a network element involves the following

Configuration of the network element. The configurations can be initial provisioning (which remain static as long as the network element is operational) and run time configurations (which are dynamically added / modified / deleted at run time)
Collection of fault notifications from the network element.
Collection of statistics and call counters from the network element.

What is an element management system?

An element management system provides some mechanism, in the form of GUI or command line utilities, to manage the network element. The GUI or command line utilities will have fields / commands to configure the device, view alarms and statistics. Typical element management architectures are

GUI is a web based. Network element runs an HTTP server with CGI scripts which communicate with back end APIs in the network element to configure the device and collect alarms. The advantage here is that, GUI can be made extremely interactive if AJAX is employed. Since most of the browsers now support JavaScript, portability doesn't become an issue. The disadvantage is that it doesnt use any standard protocol for management. If a customer wants to integrate a third party network management system (explained later) with the network element, it is not possible unless the third party NMS understands how to invoke the CGI scripts on the network element. Figure 1 provides a conceptual view of this architecture.

Figure 1

GUI is a stand alone JAVA GUI which uses SNMP PDUs to communicate with the network element. Network element runs an SNMP agent which interprets the PDUs and provides configuration and fault management facilities. The advantage of this solution is that it is standards based. Any third party network management system can be integrated with the network element. The SNMP MIB serves as the contract between the NMS and the network element. The disadvantage is that, Java based GUI require separate installation and are generally less interactive (compared to AJAX based web interface). Figure 2 provides a conceptual view of this architecture.

Figure 2

GUI is web based. Network element runs CGI scripts which communicate with local SNMP agent using SNMP. This approach combines the best of (1) and (2) above. Figure 3 provides a conceptual view of this architecture.

Figure 3

What is a Network Management System (NMS)?

Network management systems are standards based management systems that can control and monitor a variety of network elements within a network. A network can comprise multiple devices from multiple vendors. Hence it becomes imperative for the network management system to talk the same language to all the devices it manages. Hence a standard protocol for management becomes mandatory. SNMP is used as the standard protocol to configure various network elements.

Steps to do to make a network element provide stadards compliant management interface:

The following are the steps involved in making a device standards compliant for integration with any NMS.

Define an information model for managing the device. The information model should not depend on how it is going to be managed from an EMS or NMS. That is, the model should care only about what should be managed and not how it should be managed. If some attributes / fields are added in the model just to satisfy some ease of use of use from the NMS / EMS it breaks this goal. If th model takes this path, then it becomes extremely difficult to integrate the network element's instrumentation of the model (the SNMP agent part) with the NMS (the SNMP manager part).
While defining the tables in the information model, normalize the tables using database normalization techniques. This calls for expertise in database design.
Capture the information model as an SNMP MIB.
Implement the instrumentation for the MIB as an SNMP agent at the network element. Instrumentation means, for example if a SET request is given for a particular attribute, the agent should handle propagating that information to the appropriate module in the network element that handles that attribute.

How SNMP based network management systems function?

The function of an NMS can be split into the following

Setting the configuration on a network element (uses SNMP SET). These operations should follow the ACID properties (Atomicity, Consistency, Integrity and Durability). To ensure atomicity of these operations, configuration set requests from NMS are typically blocking calls.
Querying the existing configuration from a network element. This happens when a new network element is added into the NMS (or) when NMS loses connectivity with the network element for sometime and then comes back. The query can be on a single attribute or it can on the complete database. Though many NMS implement both these operations as blocking calls, it makes sense to implement bulk queries as asynchronous operations. This requires the SNMP manager module in the NMS to buffer the sent request so that it can match the responses with the request.
Listening for asynchronous events from the network element. These events include alarms and statistics.

Issues with Synchronous SNMP towards network elements with multiple process architecture

Most of the NMS employ synchronous (blocking call) SNMP towards the SNMP agent for configuration query and apply. This provides a simple implementation for network elements where the entire application along with management agent (SNMP agent) are compiled as a single image and loaded. This is the case with propreitary hardware based solutions.

However, next generation network elements (PDG, AAA servers, e-PDG, MME) running on standards based ATCA chassis environment are architected with multiple UNIX processes for high availability. Such network elements will have each process catering to a separate functionality. For example in a AAA server, one process could be catering to the RADIUS / DIAMETER interface towards the PDG or other NAS, while another process could be interfacing towards the traditional HLR through an SS7 interface. This SS7 process could run on a separate blade, having an SS7 card. In such an architecture, synchronous SNMP adds to lot of delay. Consider the example given in the figure below

Figure 4

As shown in the above figure, in multi cluster HA architectures some of the configuration data need to be set in every cluster. If the SNMP agent blocks for response from all of them it adds to a significant delay. Sometimes some of the clusters may be busy in data plane or control plane processing. It may process the configuration message after some delay. If the NMS which gave the configuration request is set to timeout say, 2 seconds, it may timeout while the request is still in processing at the network element. This leads to momentary data inconsistency between the configuration data available in the NMS and the configuration processed in the hardware.

There are two ways to address this problem

Make the SNMP command request and response asynchronous. This means the NMS will never block once it sends out an SNMP SET request. It may receive the SET response later. But it has to maintain a window of requests sent, so that it can match the response to the request.
Use synchronous SNMP from NMS to SNMP agent. The SNMP agent should return immediately after verifying the validity of the message. It should then process the configuration request in a separate thread and once it configures the data in all the clusters it should send a success / failure notification back to the NMS as an unsolicited message (TRAP or INFORM REQUEST PDU). In this approach the NMS need not maintain any window of requests. It should commit the configured value into the database, only after it receives the success notification from the agent. The notification will carry the OID of the attribute that was configured, the value set and its status.

Conclusion:

In summary, to make a network element provide a standard management interface and be interoperable with any NMS the following need to be done:

The SNMP MIB, should model only what should be managed and not how it should be managed. The model should be well normalized. Make sure that the MIB defines events for notifying success or failure of configuration requests.
Make the mode of operation of SNMP agent (synchronous / asynchronous) configurable (or) a compile time option. If the agent is configured to operate in asynchronous mode, then it can use the asynchronous events to signal the success or failure of operation to the NMS.

This is an excerpt from an advanced C++ training that I gave at my office couple of months back.

Most of our text books give a feeling that the standard template library (STL) in C++ is implemented as a shared library that is linked with your code by the C++ compiler / linker automatically. Templates by declaration dont hold any memory for its definition unless a specific instance of the template needs to be expanded by the compiler. For example a List class is defined to contain elements of a templated type, then a separate class will be created by the compiler for each of the instantiation of the class. That is, if an integer list is declared, then the tempplated List class will be expanded into List class containing integer data. If a string list is declared then the compiler will expand the templated List class into another List class containing string data.

So if the STL implementation is a dynamically linked library with your code, then how is it possible for the STL library writer to have defined an instantiation for all possible data types for all the STL classes. For example if you want to declare a list of your own data structure, you declare it as

std::list yourlist;

When this piece of code is hit by the compiler, it will create a new class called yourstruct_list by expanding the templated declaration of std::list class. But if the STL were to be a predefined shared library it is not possible to have the author of std::list class to have compiled it for the symbol yourstruct_list since, the author has no way of knowing how user defined data structures will look like. If the complete STL were a pre-compiled .so library and if you simply link it then the .so will never have the class instantiations for your data types.

So how is this problem solved by STL?

In STL only the basic functionalities which are independent of templates are compiled as libstdc++.so library. All the STL templated classes like list, vector, map etc.,. are coded in the header files which we include. For example /usr/include/c++//vector -->includes /usr/include/c++//bits/vector.tcc. The *.tcc files have the complete definition of the templated classes. When your C++ code uses STL the following is what happens

1. You define an instantiation of a templated class (eg: vector) for your data type
2. Compiler gets the blue print of the vector class from the included vector-->/usr/include/c++//bits/vector.tcc file
3. Compiler creates an instance of the vector class for your data type as per the definition of its blue print in /usr/include/c++//bits/vector.tcc

For a better understanding of the STL classes look at the source code in /usr/include/c++//bits/*.tcc files.

Sky is the limit

Tuesday, August 11, 2009

What India needs is self discipline to contain any viral outbreak

Sunday, August 9, 2009

Element Management Systems Architecture

Tuesday, August 4, 2009

Facts about STL

My Profiles

Sridhar's shared items

Blog Archive

About Me