Infrastructure monitoring can span multiple stakeholders, data center sites and metrics. Use the five W's, plus how, to establish a comprehensive monitoring strategy.
1. Figure out why real-time monitoring is necessary
Before an IT team invests in any type of real-time monitoring, it should determine why the data center needs it. Potential reasons include cost reduction, productivity improvement, streamlined management, and fewer surprises and less downtime.
These reasons are particularly compelling as organizations look to expand infrastructure outside an on-premises data center and integrate colocation and edge as part of the overall infrastructure. When hybrid IT evolved, businesses that ran their own data centers found capacity management a huge issue, Graham said.
The technology originally came about to provide snapshots of data center infrastructure, which makes it ideal to help IT teams more effectively manage the data center and help with increasingly complex setups that involve multiple technology types.
"Real-time monitoring has advanced so much; it has always been there. What we've seen over the years, they were taking the information at the rack level and trying to optimize that with the infrastructure. Monitoring systems started to get the health of the data center and it became a way to automate and optimize," said Rajan Battish, principal at RSP Architects.
2. Identify who must be involved
Once the IT team builds a business case, they must think about who should be involved as part of the new real-time monitoring setup.
This includes who must be informed of issues, who the system routes information to in a timely manner, who focuses on facility conditions, which personnel address new application rollouts and which teams focus on process improvements and effectiveness.
If an organization has off-premises infrastructure, admins must account for any outside parties, such as managed service providers, colocation and cloud providers, partners, and vendors that might require reporting.
IT teams must establish the major stakeholders because doing so affects reporting structure and software alerts. It also can reduce reporting redundancies.
"Every stakeholder has different priorities, and IT managers may have different needs than a facilities manager. They [are the ones] to decide what's critical and what's not so critical," Rogers said.
He added that organizations can start with localized alarms that aren't necessarily connected to everything.
"Unless you staff 24/7, when you don't have visibility into alarms, then things can go south very quickly. You need to look at the facility and get the alarms to the right people," he said.
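One way to picture getting alarms to the right people is a small routing table that maps each alarm category to the stakeholder groups that care about it. The following is a minimal sketch, not any product's API: the category names, group names and the `on_call` escalation rule are all hypothetical and would come from the stakeholder discussion described above.

```python
from dataclasses import dataclass

# Hypothetical routing table: alarm category -> stakeholder groups to notify.
ROUTES = {
    "power": ["facilities", "it_operations"],
    "cooling": ["facilities"],
    "application": ["it_operations"],
}

# Hypothetical fallback for categories nobody has claimed yet.
DEFAULT_ROUTE = ["it_operations"]


@dataclass(frozen=True)
class Alarm:
    source: str      # device or system that raised the alarm
    category: str    # e.g. "power", "cooling", "application"
    critical: bool   # whether the stakeholders classified it as critical


def recipients(alarm: Alarm, on_call: str = "on_call") -> list[str]:
    """Return the groups that should receive this alarm."""
    groups = list(ROUTES.get(alarm.category, DEFAULT_ROUTE))
    # Assumption: critical alarms also go to whoever is on call, so they
    # are seen even when the site is not staffed around the clock.
    if alarm.critical and on_call not in groups:
        groups.append(on_call)
    return groups


if __name__ == "__main__":
    print(recipients(Alarm("crac-2", "cooling", critical=True)))
```

Keeping the table in one place also makes reporting redundancies easier to spot, because every category lists exactly who is notified.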
3. See what the most useful metrics are
There's certainly a lot of data to track, especially when it comes to infrastructure. With a real-time monitoring setup, managers should decide which metrics are important to them and which can provide information quickly.
"It really comes down to what assets you're looking to manage and converge," Rogers said.
Most organizations rely on some common metrics, such as power usage effectiveness (PUE), data center infrastructure efficiency (DCiE), energy reduction and IT equipment utilization. But admins should approach these metrics with some caution.
PUE is a widely used metric, but it's based on general estimates of total facility power and IT equipment power. If teams make IT upgrades that lower the IT load while facility overhead stays roughly the same, then the PUE could go up even though total energy use falls. Graham suggested that IT admins use PUE as an internal measurement, instead of trying to compare it against other data centers outside the organization.
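A short worked example shows why PUE can rise after an IT upgrade. PUE is total facility power divided by IT equipment power, and DCiE is its reciprocal expressed as a percentage. The kilowatt figures below are made up for illustration, and the sketch assumes cooling and other overhead stay constant after the upgrade, which is a simplification.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def dcie(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Data center infrastructure efficiency, as a percentage (100 / PUE)."""
    if total_facility_kw <= 0:
        raise ValueError("total facility power must be positive")
    return 100.0 * it_equipment_kw / total_facility_kw


if __name__ == "__main__":
    overhead_kw = 500.0  # illustrative cooling, lighting and power losses

    # Before the upgrade: 1,000 kW of IT load.
    before = pue(1000.0 + overhead_kw, 1000.0)  # 1.5

    # After the upgrade: more efficient servers draw 800 kW,
    # while overhead is assumed unchanged.
    after = pue(800.0 + overhead_kw, 800.0)     # 1.625

    print(f"PUE before: {before}, after: {after}")
```

Total draw falls from 1,500 kW to 1,300 kW, yet PUE worsens from 1.5 to 1.625. That is one reason the metric works better as an internal trend than as a comparison against other data centers.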
Graham and Rogers explained that managers can use other metrics besides PUE and DCiE, such as cooling metrics, but these can require more data and analysis, which results in low adoption.
4. Establish when to implement monitoring or expand capacity
Figuring out when to bring on more capacity or system applications can be tricky, especially as IT needs constantly evolve. Your team should consider whether your monitoring and management tools can help with the planning, scheduling and performance of internal development and improvements.
"You can't just think about when you need to have equipment available and running to have an application. You need to work back from your go-live date to cover all resources needed through each step through the implementation," Graham said.
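Working back from a go-live date can be sketched as simple date arithmetic. The step names and durations below are hypothetical placeholders, not figures from the article, and the sketch assumes the steps run strictly one after another with no overlap.

```python
from datetime import date, timedelta


def back_schedule(go_live: date, steps: list[tuple[str, int]]):
    """Given steps as (name, duration_days) in execution order, return
    (name, start, end) tuples so that the last step ends on go_live."""
    plan = []
    end = go_live
    # Walk the steps from last to first, pushing each one earlier.
    for name, days in reversed(steps):
        start = end - timedelta(days=days)
        plan.append((name, start, end))
        end = start
    plan.reverse()  # restore execution order
    return plan


if __name__ == "__main__":
    # Hypothetical lead times, for illustration only.
    steps = [
        ("procure hardware", 60),
        ("rack and cable", 14),
        ("install and test application", 21),
    ]
    for name, start, end in back_schedule(date(2025, 6, 1), steps):
        print(f"{name}: {start} to {end}")
```

The first start date in the result is the latest day the project can begin without moving the go-live date.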
Real-time monitoring also helps on a day-to-day basis with metrics for specific incidents: times that an incident occurs, is reported and is resolved. Identification of these times can help organizations become more proactive with incident response, especially if these instances show patterns over time or regularly occur at specific intervals.
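The three timestamps mentioned above (occurred, reported, resolved) are enough to compute basic response metrics and to look for recurring times of day. The following is a minimal sketch under assumed field names, with invented sample incidents. A real monitoring tool would supply these records in its own format.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class Incident:
    occurred: datetime
    reported: datetime
    resolved: datetime


def mean_minutes(deltas) -> float:
    """Average a sequence of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


def summarize(incidents: list[Incident]) -> dict:
    """Summarize a non-empty list of incidents."""
    return {
        "mean_minutes_to_report": mean_minutes(
            [i.reported - i.occurred for i in incidents]
        ),
        "mean_minutes_to_resolve": mean_minutes(
            [i.resolved - i.reported for i in incidents]
        ),
        # Count incidents by hour of day to expose recurring intervals.
        "incidents_by_hour": Counter(i.occurred.hour for i in incidents),
    }


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        Incident(datetime(2024, 1, 8, 2, 10), datetime(2024, 1, 8, 2, 25),
                 datetime(2024, 1, 8, 3, 25)),
        Incident(datetime(2024, 1, 15, 2, 5), datetime(2024, 1, 15, 2, 10),
                 datetime(2024, 1, 15, 2, 40)),
        Incident(datetime(2024, 1, 17, 14, 30), datetime(2024, 1, 17, 14, 40),
                 datetime(2024, 1, 17, 15, 10)),
    ]
    print(summarize(sample))
```

In this sample, two of the three incidents start in the 02:00 hour, the kind of pattern that might point to a scheduled job or a nightly load change worth investigating.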
5. Know where infrastructure is located
Organizations should also look at where they must place monitoring software and hardware -- whether it's in the on-premises data center, off-site at edge nodes or in a colocation facility. There should be a continuous process to track capacity and connected devices from an internal standpoint.
There's also information from cloud providers, so IT teams should ask if data can be tracked and traced within the cloud and establish any compliance needs for documentation or specialized applications.
With a more accurate picture of where all the data and hardware sits within an IT setup, organizations can figure out which real-time monitoring offering is most effective for their needs and can support all the required technology types. This ensures consistent performance and effective capacity management once real-time monitoring is in place.
6. Find out how to monitor infrastructure
After the IT teams, managers and stakeholders discuss when, where and what, admins should investigate how the organization should implement real-time monitoring and increase infrastructure capacity. This involves an understanding of factors that can affect infrastructure growth, such as hot spots, running out of floor space, outages, lack of cooling resources and water events.
Most organizations face challenges during implementation, especially when it comes to having hardware and software communicate with each other, Battish said.
This makes a convergence protocol important, as most organizations use multiple vendors to build out data center infrastructure. Sensors and protocol converters can help bridge the gap by collecting data from different systems and increasing real-time monitoring capabilities.
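In software terms, a protocol converter translates each vendor's output into one common record so that the monitoring layer sees consistent names and units. The sketch below illustrates the idea only: `vendor_a`, `vendor_b`, their payload keys and their units are hypothetical and do not describe any real product's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Common record used by the monitoring layer."""
    device_id: str
    metric: str
    value: float
    unit: str


def from_vendor_a(payload: dict) -> Reading:
    # Hypothetical vendor A: power reported in watts under "pwr_w".
    return Reading(payload["id"], "power", payload["pwr_w"] / 1000.0, "kW")


def from_vendor_b(payload: dict) -> Reading:
    # Hypothetical vendor B: inlet temperature reported in Fahrenheit.
    celsius = (payload["temp_f"] - 32) * 5 / 9
    return Reading(payload["device"], "inlet_temp", celsius, "C")


# Registry of converters, so adding a vendor means adding one function.
CONVERTERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}


def normalize(vendor: str, payload: dict) -> Reading:
    """Convert a vendor-specific payload into the common Reading record."""
    try:
        convert = CONVERTERS[vendor]
    except KeyError:
        raise ValueError(f"no converter registered for {vendor!r}") from None
    return convert(payload)


if __name__ == "__main__":
    print(normalize("vendor_a", {"id": "pdu-1", "pwr_w": 4200}))
    print(normalize("vendor_b", {"device": "rack-12", "temp_f": 77}))
```

A registry like this keeps vendor-specific details at the edge of the system, which helps avoid locking the rest of the monitoring stack into one proprietary format.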
"There are numerous options when it comes to sensors, so make sure to do your homework and not lock yourself into something that's proprietary and doesn't have the needed functionality for today, tomorrow and in the future," Rogers said.
Industry offerings include wired and wireless sensors for power distribution units and uninterruptible power supplies, as well as a host of software that organizations can purchase off the shelf or customize to internal needs. This combination of sensors and software can help admins and managers reduce the number of everyday remedial tasks.
"When people are constantly fixing the same issues, it's not good for morale. We're not saying that everyone needs a single pane of glass, but they should reduce redundancy, get consolidating and address the right issues. Transformation in monitoring and management allows people to be more effective and lead in our industry and focus on more optimization," Graham said.