A Standardization Success Story from Cox Communications

By Michael Garramone and David Ririe

“Standardizing your network is a good idea.” – Captain Obvious


What’s not as easy is committing to that decision and deciding how best to follow through on that promise. So how did we embark on our journey toward standardization?

When faced with a considerable hardware upgrade to our M-CMTS platform back in early 2011, we took it as an ideal opportunity to define a set of installation and configuration standards. Not that we were wildly different at the time, but there were certainly disparate configurations across the country. Our company is organized with a central access engineering team that designs the network, with field engineering resources that own, install, deploy, and maintain it. Those field teams are spread across 26 markets and six regions coast to coast.

First, we got the right people together. What makes them right? It’s important to have your subject matter experts of course, but also folks who can engage in healthy debate and collaboration. It will come as quite a shock that engineering discussions can get impassioned, which is perfectly fine. It’s how we get to the best possible outcome, by being inclusive of all ideas and putting everything on the table. Because the team included key people from the field, not only did it help us glean the best practices from across the company, it inherently created their endorsement as well. People do their best work when they feel ownership, and this process allowed everyone to have their say.

Next, we broke up the team into subgroups, each tackling a specific configuration section or feature set of the platform. While reaching consensus is the end goal, and that happens often enough, not all decisions will always boil down to right or wrong. Faced with two relatively equal choices, blue versus green as we like to call them, sometimes it comes down to a judgment call to pick one and move forward. Engaging your vendors for recommended best practices is an important part of this process. Many standards can start and end with their suggestions, and in any case, can only help the discussions.

A few months and many meetings later, we had our first version, with tweaks, fixes, and updates continuing for years to come. The overall steps were to document, set deadlines, test, discuss some more, trial, repeat if necessary, and finally deploy. For us, documentation was arguably the most important piece, and it is where we focused from the start. We knew the foundation of this whole endeavor had to be a single point of reference for the field engineers to use. It was not intended to be a manual or MOP, but a “playbook” for how this platform is deployed by us. In other words, if this device is on our network, then this is how it needs to be configured and managed.

Cox Communications DOCSIS Playbook

Now that we had our standards documented and communicated to the field, enforcing compliance with those standards was the next obstacle. A standard unused is a useless standard, and that is exactly the position we found ourselves in. The point of writing standards is for them to be put to good use in production, but we went through a period without adequate visibility into whether the standards were being followed, or who was supposed to fix things when we found otherwise. The problems we identified were accountability and trust, understanding that you can’t expect compliance without both.

The problem of accountability was not in the sense of pointing fingers, but having a well-defined process for changes and assigning roles and responsibilities. To that end, a compliance committee was formed, composed of regional directors to whom the field engineers responsible for maintaining standards reported. All proposed standards were sent to the committee for review, scheduling, dissemination, and tracking.

The problem of trust lay in the data and reporting. As we all know, trust is easy to lose and hard to gain back. Those responsible for compliance can’t act, or be held accountable for inaction, if they are forced to rely on inaccurate data. Early on, compliance was determined via a proprietary vendor application, one we had no direct access to or control over, and which required submitting requests to the vendor for updates. The dashboard was sometimes inconsistent and not intuitive, and the data frequently disagreed with both our standards document and what was actually configured. Consequently, it was regularly ignored, we didn’t truly know how compliant the network was, and the changes needed to resolve that were not taking place. We couldn’t argue: if you verify that a chassis is following a standard while the report tells you otherwise, the report no longer provides value.

Its shortcomings were clear, and often lamented, so the solution was a third-party, platform-agnostic, locally and centrally managed application to check policy compliance. Rules were written by the design engineers using conditional regular expressions and grouped into policies that directly correspond to high-level sections of the standards document. Reports are generated, rolled up into a dashboard and scorecard, and reviewed by the compliance committee. A set amount of time is allotted to resolve any items out of compliance, e.g., 30 days for normal severity and 48 hours for high priority.

Example Compliance Scorecard
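To make the idea concrete, the sketch below shows how a regex-based rule check might work in principle. It is a simplified illustration in Python: the rule names, severities, and configuration syntax are hypothetical stand-ins, not our actual rules, any vendor’s CLI, or the compliance application itself.

import re
from dataclasses import dataclass

# Minimal sketch of the rule/policy approach: each rule is a regular
# expression evaluated against a device's configuration text, and rules are
# grouped into policies that mirror sections of the standards document.
# Names, severities, and config syntax are hypothetical illustrations.

@dataclass
class Rule:
    name: str
    pattern: str       # regex applied to the full configuration text
    must_match: bool   # True: config must contain a match; False: it must not
    severity: str      # "normal" (30 days to fix) or "high" (48 hours)

SECURITY_POLICY = [
    Rule("bpi-plus-enforced", r"(?m)^\s*privacy bpi-plus enforce\s*$", True, "high"),
    Rule("no-self-signed-certs", r"(?m)^\s*privacy accept-self-signed-certificate\s*$", False, "high"),
]

def check_policy(config_text, policy):
    """Return (rule name, severity) for every rule the configuration violates."""
    violations = []
    for rule in policy:
        found = re.search(rule.pattern, config_text) is not None
        if found != rule.must_match:
            violations.append((rule.name, rule.severity))
    return violations

# A config missing the BPI+ line and allowing self-signed certificates
# would be flagged with two high-severity items:
sample_config = "interface cable 1/0/0\n privacy accept-self-signed-certificate\n"
print(check_policy(sample_config, SECURITY_POLICY))
# [('bpi-plus-enforced', 'high'), ('no-self-signed-certs', 'high')]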

Things never stay the same of course, and we learned lessons along the way that helped us improve the process. Back to those continuous changes and tweaks we mentioned earlier. Probably the biggest piece of feedback we got from the field engineers was not about the amount of work they had to do to reach compliance, but the frequency with which they were being sent updates. Early on, changes were coming from design at an irregular pace, adhering to no reliable schedule. The rules were being updated the moment those changes were sent out, before the field had a chance to deploy them. The intent was for the compliance tool to always stay in alignment with the standards. However, put in this position, not only was the field in a seemingly perpetual state of maintenance activity, but they felt like compliance was a moving target they could never hit. As soon as the scorecard cleared up, a new change came in to kick them right back into noncompliance.

As a result, we devised a process by which the compliance committee would group changes into a quarterly release schedule. The bulletin and MOP would be sent to the field with an agreed upon deadline. Thirty days before that deadline, the rules would be created in the compliance tool with an informational-only severity for tracking purposes, then changed to the normal severity when the date passed. All the work in a region can then be scheduled at the same time, in the same maintenance window, and completed in one fell swoop. Any deviation outside this schedule would require separate funding and approval.
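In rough terms, the severity handling for a new rule follows the timeline sketched below. The dates are made up, and the logic is a simplified assumption about the process rather than the compliance tool’s actual behavior or API.

from datetime import date

# Sketch of the quarterly-release severity logic: a rule appears in the
# compliance tool 30 days before the agreed deadline as informational only,
# then flips to normal severity once the deadline passes. Purely illustrative.

def rule_severity(today, deadline):
    days_until = (deadline - today).days
    if days_until > 30:
        return "not yet published"   # rule is not loaded into the tool yet
    if days_until > 0:
        return "informational"       # visible for tracking, not scored
    return "normal"                  # deadline passed; remediation clock starts

deadline = date(2019, 9, 30)         # hypothetical quarterly deadline
for check_date in (date(2019, 7, 1), date(2019, 9, 15), date(2019, 10, 5)):
    print(check_date, rule_severity(check_date, deadline))
# 2019-07-01 not yet published
# 2019-09-15 informational
# 2019-10-05 normal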

Another lesson, closely related, was the lack of alignment between sending changes to the field and the timing of publishing a new version of the standards document that contained those changes. Although the standards were defined in the bulletin, the playbook would often be published several weeks before or after it, resulting in discrepancies and causing confusion. It was clearly not optimal, so to resolve this we simply had to commit to bringing those timelines into sync as much as possible. The new playbook version and the quarterly field bulletin are now published within a few days of each other at most.

The last lesson was to ensure we had an open feedback loop between design and field engineering, mostly to verify that the rules were written correctly whenever a potential discrepancy came up. One wrong character in a regular expression results in noncompliance that doesn’t really exist, as the example below illustrates. There will also always be continuous improvement, writing new standards or updating existing ones, for which perspective from the field is invaluable. It’s not that two-way communication didn’t exist before, but it was important to make everyone feel comfortable and motivated to get involved, and it was crucial to regaining that trust in the data and reports.
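Consider a rule whose pattern differs from the intended configuration by a single character; the line and patterns in this sketch are hypothetical, not actual standards or CLI syntax.

import re

# One wrong character (a hyphen typed as a space) makes a perfectly
# compliant line look noncompliant.

config_line = "cable upstream bonding-group 1"

correct_rule = r"^cable upstream bonding-group \d+$"
typo_rule = r"^cable upstream bonding group \d+$"   # hyphen typed as a space

print(bool(re.match(correct_rule, config_line)))   # True  -> compliant
print(bool(re.match(typo_rule, config_line)))      # False -> false noncompliance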

Since enacting these processes, all feedback from the field has been extremely positive and we’re operating like a well-oiled machine. We had 19 major updates last year and only had to touch the boxes five times, four quarterly releases and one funded outside the schedule. DOCSIS equipment is at 99.9% compliance today and is used as the gold standard for documentation and compliance for other organizations in the company to follow.

The benefits of standardization may be hard to quantify, with so many variables affecting customer experience and satisfaction, trouble call and truck roll rates, NPS, and so on. So why did we do it? The value that consistency provides to installation, configuration, maintenance, and troubleshooting is easy to see, and it undoubtedly contributes positively to all those metrics. With our regions spread out coast to coast, having a common language helps communication between the central and field engineering teams. Any chassis you log into is no different from any other, each is installed via an approved template, and every issue is troubleshot against the same configuration. We know the scheduler is balancing traffic across bonded upstreams for all customers. We know self-signed certificates are not accepted, dynamic shared-secret is enabled, and BPI+ is strictly enforced to combat modem theft. We know every DOCSIS 3.0 modem has the same downstream bonding group options. We know every DOCSIS 3.1 modem is using the same approved OFDM channel profile. The list goes on and on, down every section of the standards document.

The fruits of our labors were later seen with our FTTH and CCAP deployments, and will now be realized with DAA. This has put us in prime position for the next step, everybody’s favorite buzzword: automation. We have a new organization focused on maintenance verification, software upgrades, customer provisioning, and, among other things, compliance standards. They will soon be taking all we have accomplished and exploring automated compliance remediation and configuration updates as we continue to look for areas to improve.


Michael Garramone

Cable Access Engineer,
Cox Communications, Inc.
michael.garramone@cox.com

Michael Garramone is a 20-year veteran at Cox Communications, spending his first 13 years in the Las Vegas market before moving to the Access Engineering Design team at the Atlanta corporate headquarters in 2011. His current efforts center around CCAP design, standards development and documentation, and lab management and evaluation.



David Ririe

Sr. Director, Access Engineering
Cox Communications, Inc.
david.ririe@cox.com

David Ririe serves as a Senior Director in the Access Engineering group for Cox Communications in Atlanta, GA. He has worked in various engineering roles for Cox Communications, starting as a Network Engineer in the Omaha, NE market 14 years ago. He has led the team through a number of transitions and upgrades on both the DOCSIS and PON technology platforms in that time. He began his telecom career in the U.S. Air Force, working on various communications and data networking platforms in the air and on the ground.