Problems With Lab Testing Vs Real-World Environments

By Ryan Ridley, Software Engineer

April 23, 2020

One of our core principles here at DPS is reliability. In remote telemetry monitoring, reliable equipment can make or break a deal. What good is your equipment if it fails when you need it most? Our RTUs are designed to work for over 10 years, but sometimes a combination of unexpected conditions can cause issues. When this happens, we will work with you not only to get your equipment up and running again, but also to resolve the issue that caused the downtime in the first place.

One reason these issues can happen is the difference between lab and real-life environments. Here at DPS, we test our units thoroughly. Before a unit ships out the door, we ensure every component is working up to standard. Even so, unexpected conditions and scenarios can cause hiccups that are difficult, and sometimes impossible, to replicate in-house. Some of our clients have massive facilities with hundreds of different pieces of equipment talking to each other, and often it's not possible to replicate that here at DPS. This is one of the reasons we do remote sessions. If we can't reproduce the issue here, we can work with you to gather data and resolve it.

DPS will work with you until your issue is resolved

In November of last year, a client reached out to us because their NetGuardians' web interfaces were going down following a network scan. The network security team would perform a vulnerability scan every Friday, and on the following Monday the NOC team would need to come in and reboot each NetGuardian one by one. The scan would cause each NetGuardian's web interface to lock up until the unit was rebooted. Once rebooted, the units would resume reporting as normal.

This was obviously a large inconvenience that greatly reduced the usability of their units, so we set out to resolve it as quickly as we could. The first thing we tried was having the client use the lockdown options in NGEdit, as we thought that one of the ports getting hit by the scan was causing the issue. However, after further discussion with the client, we concluded that this would not resolve it.

The next step was to try to reproduce the issue in-house. We set up five NetGuardian 832As and loaded them with the exact same firmware the client was using. We then installed the same software the client used to run their network scans and began running the same scans on the test units here. Another engineer and I worked together on running the scans and attempting to break our units.

We ran all the same tests that the client ran, but we were unable to reproduce the issue. We tried using simulated lag and packet drops, as well as running the tests on the same schedule the client used. We even had the client attempt to break one of our units on a public IP address! No matter what we tried, we were unable to reproduce the issue on our equipment here.

Rapid Development on Feature Request Solves the Problem

After these failed attempts to break the unit, the client came up with an idea. What if we could automate the reboot process? That way, even though we couldn't identify the root cause, the client would still have a functional solution. Of course we can! We planned a new feature that would allow the client to specify a day and time at which the unit would automatically reboot. Our engineers got to programming, and within a couple of days we had a solution. We sent the client the new firmware update and they put it to use. The client was excited to have the issue resolved, and we were happy to get their equipment working again quickly.
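To give a rough idea of the logic behind a "reboot on this day at this time" setting, here is a minimal sketch in Python. It is not the NetGuardian firmware, and the function name `next_weekly_reboot` and its parameters are hypothetical, chosen only for illustration. It assumes a weekly schedule and simply works out the next moment a reboot would be due, for example early Saturday morning after a Friday scan.

```python
from datetime import datetime, timedelta


def next_weekly_reboot(now: datetime, weekday: int, hour: int, minute: int) -> datetime:
    """Return the next scheduled reboot time strictly after `now`.

    `weekday` follows datetime.weekday(): Monday is 0, Sunday is 6.
    All names here are hypothetical; this only illustrates the scheduling idea.
    """
    # Start from today's date at the configured time of day.
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Move forward to the configured weekday (0 to 6 days ahead).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment has already passed, schedule for the following week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    # Thursday, April 23, 2020 at noon; reboot configured for Saturday 02:00.
    now = datetime(2020, 4, 23, 12, 0)
    print(next_weekly_reboot(now, weekday=5, hour=2, minute=0))
    # prints 2020-04-25 02:00:00
```

A device would compare its clock against this value and trigger the reboot once the time is reached, then compute the next one. Scheduling the reboot shortly after the weekly scan means the web interface is back before the NOC team needs it.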

If you're having issues with your equipment, please don't hesitate to contact us. We want you to have the best equipment that fits your needs perfectly. If your site has problems, we want to solve them.
