IP Networks need TLC too…

By Patrick Hunter

In the spirit of our proactive network maintenance issue this season, it just makes sense to talk about the importance of care and feeding for our inside networks as much as the outside plant. While I certainly must concede that outside cabling and electronics suffer much harsher conditions than the cabling and components inside network closets, hubsites, and data centers, it cannot be ignored that none of our network components is “set it and forget it.” Not by a long shot, my friends.

There are a number of areas that require focus with regard to network components such as routers, switches, CMTSs (just a super-specialized router, as you may recall from previous discussions), firewalls, application load-balancing appliances, and much more. By now, most everyone likely appreciates that these devices are just very specialized computers, with one or more central processing units (CPUs) and many application-specific integrated circuits (ASICs). Like any other computer, they consume electricity and generate heat. Managing the ingestion of electricity and the removal of heat is of paramount importance and has become its own field of specialization in our industry.

In larger organizations, it is not uncommon for a “critical infrastructure” team to be in charge of monitoring, managing, and even designing the electrical and environmental systems that keep our network gear cool, dry, and supplied with clean, steady power. Our own SCTE•ISBE has done a thorough job of publishing standards related to these very concerns via the Energy Management Subcommittee of the Engineering Committee. Good reads on this topic are the operational practices SCTE 219 2015 and ANSI/SCTE 226 2016, both of which are available at https://www.scte.org/.

Additionally, even as we make certain that all environmental controls are active and healthy and our electrical systems are providing clean, well-conditioned power, we must also focus on the network equipment and the physical cabling that interconnects it. A solid preventive maintenance program should include what many would consider the basic foundation of network monitoring systems. Telemetry tools that monitor the connection to the network device via Internet Control Message Protocol (ICMP) echo requests, or “pings,” are the most basic form of this. If the monitoring tool can successfully ping the device and get a response, that device is considered to be “up” and is often assumed to be fully operational.
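To make that concrete, here is a minimal sketch (in Python) of the kind of check a polling engine performs under the hood, shelling out to the operating system’s ping utility. The hostnames are placeholders of my own invention, and it assumes Linux-style ping flags (-c for packet count, -W for the reply timeout in seconds):

    import subprocess

    def is_up(host: str, timeout_s: int = 2) -> bool:
        """Send a single ICMP echo request via the system ping utility.
        Returns True if the device answered. Assumes Linux-style flags
        (-c = packet count, -W = reply timeout in seconds)."""
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0

    # Poll a couple of hypothetical closet switches and report status.
    for device in ("closet-sw-01.example.net", "closet-sw-02.example.net"):
        print(device, "up" if is_up(device) else "DOWN")

A production network management system does essentially this on a schedule, typically alerting only after several consecutive failures so that a single lost packet doesn’t wake anyone up at 3 a.m.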

Now, it’s important to note that all this test ultimately proves is that the network path from the polling engine to the remote device is functioning, and both the poller and the device are able to respond. It’s not a truly comprehensive evaluation of the health of a device. What about the other functions and features of the device? What about the temperature of the air at the intake ports or exhaust ports of the device, assuming it has cooling components? What about the voltage being supplied to the device, whether it is direct current (DC) or alternating current (AC)? Perhaps the optical receive levels arriving at an optoelectronic transceiver (you have to imagine the “air quotes” here: it’s a “la-ser”) are too high or too low. Perhaps an interface, either virtual or physical, is in trouble and not able to function correctly.

One of the key developments in the history of the Internet is the Simple Network Management Protocol (SNMP). This communication protocol was designed specifically with network administration and monitoring in mind. While an exhaustive editorial on SNMP is outside the scope of this article, let’s cover some of the basics.

There are several message types used to communicate between an SNMP-managed device (like a router, switch, or cable modem) and the network management station (usually a specialized server or group of servers). Essentially, each network device runs a piece of software known as the SNMP agent. The agent’s job is to translate the device’s configuration and other important management information to and from an SNMP-compliant format. The message types include “GetRequest,” “Response,” and “Trap.” GetRequest is a message from the network manager to the agent requesting specific information, including the environmental and electrical conditions mentioned previously. Response is just what it sounds like: the agent’s reply to a GetRequest from a manager, or to other request types not named here. One very important message type is the Trap. This is an unsolicited message sent to the manager by an agent running on a network device. Traps are typically an important part of creating alerts in a network management environment, so that when something is amiss, network administrators can be informed and, if necessary, swift action can be taken to correct the problem.
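As a rough illustration of the GetRequest/Response exchange, the sketch below shells out to the snmpget utility from the Net-SNMP suite. Every value an agent can report is addressed by an object identifier (OID); sysUpTime (1.3.6.1.2.1.1.3.0) is a standard one that nearly every agent exposes, while intake temperature, supply voltage, and optical receive levels live under vendor-specific OIDs. The hostname and the “public” community string are placeholders, and this assumes the device accepts SNMPv2c reads:

    import subprocess

    def snmp_get(host: str, oid: str, community: str = "public") -> str:
        """Issue an SNMP GetRequest via Net-SNMP's snmpget and return the
        agent's Response as text (-v2c = SNMP version 2c, -c = read-only
        community string)."""
        result = subprocess.run(
            ["snmpget", "-v2c", "-c", community, host, oid],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()

    # sysUpTime: how long the agent has been running since its last restart.
    print(snmp_get("closet-sw-01.example.net", "1.3.6.1.2.1.1.3.0"))

Traps travel in the opposite direction: the agent sends them unsolicited to a trap receiver on the management station (Net-SNMP ships one as snmptrapd), so bad news doesn’t have to wait for the next polling cycle.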

SNMP has gone through a number of versions historically, and the latest approved by the Internet Engineering Task Force (IETF) is SNMPv3, which adds authentication and encryption among other enhancements. If you manage a network in which credit card information is transmitted across network devices and servers, SNMPv3 should be your protocol of choice.
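Continuing the earlier sketch, the same GetRequest can be made over SNMPv3 so that the exchange is authenticated and encrypted. The user name, passphrases, and protocol choices below are placeholders that would have to match a v3 user actually configured on the device:

    import subprocess

    def snmpv3_get(host: str, oid: str, user: str,
                   auth_pass: str, priv_pass: str) -> str:
        """Same GetRequest as before, but over SNMPv3 with authentication
        and privacy. Net-SNMP flags: -l sets the security level, -u the
        user, -a/-A the authentication protocol and passphrase, -x/-X the
        encryption protocol and passphrase."""
        result = subprocess.run(
            ["snmpget", "-v3", "-l", "authPriv",
             "-u", user, "-a", "SHA", "-A", auth_pass,
             "-x", "AES", "-X", priv_pass, host, oid],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()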

From practical experience, I can honestly state that the single biggest cause of the network incidents engineers and administrators deal with is still human error. This could be in the form of misconfigurations or simple physical layer mistakes like unplugging the wrong cable or inadvertently pinching a fiber optic jumper in a cabinet door. Years of experience have led us to the conclusion that restricting access to network closets (much like we do for our headends and hubsites) is absolutely critical to reducing the frequency of network impairments and other outages. It also reduces the likelihood that cardboard boxes end up stored in the network closet, a practice that should be strictly off limits. Cardboard is a combustibility risk, and even undisturbed it sheds incredible amounts of contaminants and particulates that clog the intake fans of network equipment and can compromise the air handling systems as well. It’s not terribly surprising to note that many network closets in small branch facilities like payment centers and technical offices often double as storage areas. That is a recipe for network unreliability, for certain.

In closing, the foundation of preventive (and proactive) maintenance in network closets is relatively simple. Restricting access, maintaining a clean, organized environment, following industry-standard recommended practices for critical infrastructure, and monitoring the health of all systems via SNMP in conjunction with other network management systems will yield the best reliability for our networks.


Patrick Hunter — “Hunter”

Director, IT Enterprise Network and Telecom,
Charter Communications
hunter.hunter@charter.com

Hunter has been employed with Charter since 2000 and has held numerous positions, including Installer, System Technician, Technical Operations management, Sales Engineer, and Network Engineer. His responsibilities include providing IP connectivity to all users in Charter’s approximately 4,000 facilities, including executive and regional offices, technical centers, call centers, stores, headends, hubsites, and data centers. Mr. Hunter has served on the SCTE Gateway Chapter Board of Directors since 2005. He spends his spare time mentoring, teaching, and speaking on IP and Ethernet networks as well as careers in the network field.
