Convenient fault diagnosis with LLDP
30 September 2010
This rule-based ‘expert system’ helps to locate network misconfiguration errors
In the field of industrial automation technology, network infrastructure devices based on Ethernet provide a wide range of functionality and support many different network protocols. This wide functional range also increases the risk of mismatches in the resulting configuration settings, which may have adverse effects on the availability and processing speed of the network.
In the network management tools commonly available on the market, fault management is usually limited to recognising and visualising dynamic deterioration in the monitored network. These tools use, for instance, the Simple Network Management Protocol (SNMP), with which monitored devices send ‘traps’ to report changes of condition. But a misconfiguration is static by nature and normally does not generate a change of condition or a recordable event. Such faults are therefore hard to detect with the tools currently in use.
As a general rule, a static configuration analysis can be performed by a human being with the help of a ‘traditional’ network management system such as Hirschmann Industrial HiVision or Nagios. To do so, this person needs to check the configuration of individual devices against patterns that are known to be valid and error-free.
However, in many cases the sheer size of the network makes it virtually impossible to keep track of all details. This is essentially a matter of the psychology of perception: human beings can process only a limited amount of information at once. Computer-implemented configuration error detection is not subject to such limits and always works with the same consistent reliability. What is needed is a solution that provides the knowledge of a human expert (such as a service technician who is able to verify the configuration) in software, reliably and at the push of a button. This is why such software is often referred to as an ‘expert system.’
The Hirschmann tool that provides the misconfiguration detection functionality essentially consists of programmed sets of rules, which is why it is referred to as a rule-based expert system. The information to which the rules are applied is taken directly from the devices involved using SNMP, specifically from the LLDP-MIB (Link Layer Discovery Protocol Management Information Base).
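To illustrate what ‘programmed sets of rules’ can mean in practice, the following is a minimal sketch of a rule-based check in Python. It is not the Hirschmann implementation: the names `Rule`, `run_rules` and `vlan_id_rule`, and the fact keys `local_pvid` and `remote_pvid`, are hypothetical, and the facts are assumed to have been read from the device beforehand.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical fact record for one port: values assumed to have been read
# from the device beforehand (for example via SNMP from the LLDP MIBs).
Facts = Dict[str, int]


@dataclass(frozen=True)
class Rule:
    """One configuration rule: a name plus a check returning a message on violation."""
    name: str
    check: Callable[[Facts], Optional[str]]


def vlan_id_rule(facts: Facts) -> Optional[str]:
    # Illustrative rule: both ends of a link should use the same port VLAN ID.
    local, remote = facts.get("local_pvid"), facts.get("remote_pvid")
    if local is None or remote is None:
        return None  # not enough data to decide
    if local != remote:
        return f"port VLAN ID {local} differs from neighbour's {remote}"
    return None


def run_rules(rules: List[Rule], facts: Facts) -> List[str]:
    """Apply every rule to the facts and collect the violations found."""
    findings = []
    for rule in rules:
        message = rule.check(facts)
        if message is not None:
            findings.append(f"{rule.name}: {message}")
    return findings


RULES = [Rule("vlan-mismatch", vlan_id_rule)]
```

Keeping each rule as a small independent function means new checks can be added to the list without touching the evaluation loop, which is one common way to structure such a system.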
Figure 1 ‘VLAN Misconfiguration’: Within one physical context, different VLAN IDs are configured on two switches. An analysis of the Port and Protocol VLAN information provided in the LLDP-EXT-DOT1-MIB makes it possible to detect differing VLAN settings between infra
(For details on the basic functionality of LLDP please refer to the longer version of this article on the Control Engineering Europe website. Readers will also find direct links to references for general information on how to use LLDP for configuration error detection.)
For misconfiguration detection, LLDP (IEEE 802.1AB) offers a number of benefits:
* LLDP Protocol Data Units (PDUs), the data frames sent by an LLDP-enabled device, can still pass over a direct connection between devices even when that connection is blocked and no longer available for regular communication (for example because of a VLAN misconfiguration). This makes it possible to recognise misconfigurations even when no network traffic is possible on higher protocol layers.
* LLDP PDUs transmit data to directly neighbouring devices only and therefore accurately represent a physical context, i.e. the direct connection between exactly two devices. As a result, each device holds both its own local data, as precisely defined in the LLDP MIB, and the configuration data of its neighbouring devices, so that the two can be compared.
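Because a device holds both its local data and what each neighbour announced, the two views can be joined per port before any comparison takes place. The sketch below shows this under simplifying assumptions: the classes `Neighbour` and `LinkRecord` and the function `join_link_data` are hypothetical, the tables are plain dictionaries keyed by local port number, and only the port VLAN ID is considered.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Neighbour:
    """What the neighbour announced about itself on one local port (hypothetical shape)."""
    system_name: str
    port_id: str
    pvid: int


@dataclass(frozen=True)
class LinkRecord:
    """Local and remote view of one physical link, ready for comparison."""
    local_port: int
    local_pvid: int
    neighbour: Optional[Neighbour]  # None if no LLDP neighbour was seen


def join_link_data(local_pvids: Dict[int, int],
                   neighbours: Dict[int, Neighbour]) -> List[LinkRecord]:
    """Pair each local port with the neighbour learned on that port, if any."""
    return [
        LinkRecord(port, pvid, neighbours.get(port))
        for port, pvid in sorted(local_pvids.items())
    ]


# Usage: port 1 faces a neighbour with a different VLAN ID, port 2 has no neighbour.
records = join_link_data({1: 10, 2: 10}, {1: Neighbour("switch-b", "5", 20)})
mismatched = [r.local_port for r in records
              if r.neighbour is not None and r.neighbour.pvid != r.local_pvid]
```

Ports without a neighbour entry are kept in the result rather than dropped, so that a later step can distinguish ‘no error’ from ‘no data’.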
Figure 2 ‘Error Detection’: The error detection GUI component. The upper part displays the neighbouring devices, which can be selected; the text field underneath contains a detailed description of the detected misconfigurations.
When the programmed configuration rules are applied to the data, the identified problems have to be visualised precisely and with graphical clarity, that is to say, ergonomically. To achieve this, a condition model with four colour-coded error conditions was developed:
1. Green symbol: No error detected;
2. Yellow warning symbol: An error was detected that could affect the network performance, but has no immediate impact on the availability of the network;
3. Red warning symbol: An error was detected that could affect the availability;
4. Grey symbol: Insufficient data, no error detection possible.
The detected conditions are analysed and displayed per switch port.
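The four-condition model with per-port evaluation can be sketched as follows. This is an illustration only: the names `Condition`, `port_condition` and `switch_report` are hypothetical, and the precedence used here (red overrides yellow, missing data yields grey) is an assumption rather than something the article specifies.

```python
from enum import Enum
from typing import Dict, List, Optional


class Condition(Enum):
    GREEN = "no error detected"
    YELLOW = "error could affect network performance"
    RED = "error could affect availability"
    GREY = "insufficient data, no error detection possible"


def port_condition(findings: Optional[List[Condition]]) -> Condition:
    """Reduce the findings for one port to a single display condition.

    None means the port could not be analysed at all (assumed convention);
    an empty list means it was analysed and no rule was violated.
    """
    if findings is None:
        return Condition.GREY
    if Condition.RED in findings:
        return Condition.RED  # assumed precedence: availability outranks performance
    if Condition.YELLOW in findings:
        return Condition.YELLOW
    return Condition.GREEN


def switch_report(ports: Dict[int, Optional[List[Condition]]]) -> Dict[int, Condition]:
    """Evaluate every port of a switch, ordered by port number."""
    return {port: port_condition(found) for port, found in sorted(ports.items())}


report = switch_report({
    1: [],                                # analysed, nothing found
    2: [Condition.YELLOW, Condition.RED], # two findings, worst one wins
    3: None,                              # no LLDP data available
})
```

Separating ‘analysed, nothing found’ from ‘not analysable’ keeps a green symbol from being shown for a port about which nothing is actually known.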