Programmatically manage triggered alarms in vCenter with vRO
If you ever tried to manage vCenter’s alarms with vRO, you probably knows how interesting this rabbit hole can be. Today, we’re going to see, how to acknowledge a specific alert in a vCenter with vRO and even try to make it as a pro, using some development best practices.
PS. As often happens, a quick idea for a post tends to grow as I dive into it! 😊 In this case, we’ll use vRBT to write our code in TypeScript and introduce a new AlarmManagement
class. Along the way, we’ll aim to follow some best practices—highly recommend Uncle Bob’s lectures for inspiration. This includes implementing a dedicated class for error handling, utilizing private methods, and more.
General goals:
- Programmatically acknowledge a specific triggered on ESXi hosts.
- Clear alarms of specific type.
The use case:
- Certain configuration activities may trigger alarms during execution, which we aim to handle appropriately.
- Specific alarms, such as those caused by periodic actions like backups, require special consideration.
The solution
Let’s take an example where we want to acknowledge a specific alarm and explore it in detail.
Acknowledge a specific alarm
Get triggered alarms
We begin with a method called getTriggeredAlarms
. The alarm manager is a vCenter-level entity, which means in vRO, we must retrieve it through the SDK connection. To do this, we first need to obtain the alarm manager from the SDK object.
1
2
3
4
5
6
7
8
if (!vCenter) {
throw this.handleError("No vCenter connections available.");
}
const alarmManager = vCenter.alarmManager;
if (!alarmManager) {
throw this.handleError("Alarm Manager not found.");
}
Now, we can get all ESXi hosts. Of course they can be filtered by some criteria, but for now, we’ll just get all of them.
1
2
3
4
const objectsToCheck = VcPlugin.allHostSystems;
if (!objectsToCheck || objectsToCheck.length === 0) {
throw this.handleError("No host systems available to check.");
}
Let’s create a helper function to build an object containing key alarm properties, making it easier to describe the current situation. Our goal is to fetch all alarms that are in a red
or yellow
state and are unAcknowledgedAlarms
. Additionally, we want to extract basic details such as alarmName
, status
, time
, and alarmObject
to show them later in the logs and use in other functions. To do so, we can use a filter
method to get only the alarms we’re interested in and then map
to extract the necessary information and contract it into the alarmStates
object.
1
2
3
4
5
6
7
8
9
10
11
const extractTriggeredAlarms = (managedObject, alarmStates) => {
return alarmStates
.filter((state) => state.overallStatus === VcManagedEntityStatus.red || state.overallStatus === VcManagedEntityStatus.yellow)
.map((state) => ({
vcHost: managedObject,
alarmName: state.alarm.info.name,
status: state.overallStatus.value,
time: state.time,
alarmObject: state.alarm
}));
};
Now, we’re ready to retrieve all triggered alarms based on the pre-defined filter. Let’s iterate through all ESXi hosts (objectsToCheck
) and get the alarms for each of them. We’ll use the getAlarmState
method to get the alarm states for a specific managed object. If the alarm states are available, we’ll filter them to get only the unacknowledged alarms. If the alarms are found, we’ll extract the triggered alarms and push them to the triggeredAlarms
array.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
const triggeredAlarms = [];
objectsToCheck.forEach((managedObject) => {
try {
const alarmStates: Array<VcAlarmState> = alarmManager.getAlarmState(managedObject);
const unAcknowledgedAlarms: Array<VcAlarmState> = alarmStates.filter((state) => state.acknowledged === false);
if (alarmStates) {
triggeredAlarms.push(...extractTriggeredAlarms(managedObject, unAcknowledgedAlarms));
}
} catch (error) {
System.warn(`Failed to get alarm states for ${managedObject.name}: ${error}`);
}
});
return triggeredAlarms;
Process Alarms
With our alarms in hand, we can begin processing them. Since we’re working with an array of alarms, the looping will be handled in our main code. Each iteration will invoke the processAlarms
method. Let’s take a closer look at it.
1
2
3
4
5
6
7
8
9
public processAlarms(vCenter: VcSdkConnection, vcHost: VcManagedEntity, alarmName: string, alarm: VcAlarm): void {
const alarmManager: VcAlarmManager = vCenter.alarmManager;
const alarmToAcknowledge: VcAlarm | undefined = this.getAlarmByName(alarm, alarmName);
if (alarmToAcknowledge) {
this.acknowledgeSpecificAlarm(alarmManager, vcHost, alarmToAcknowledge);
} else {
System.warn(`Alarm with name ${alarmName} not found.`);
}
}
First, to acknowledge a specific alarm, we need to locate it. We achieve this using the helper method getAlarmByName
. This method takes the alarm object and the alarm name as arguments.
1
2
3
private getAlarmByName(alarm: VcAlarm, alarmName: string): VcAlarm | undefined {
return alarm.info.name === alarmName ? alarm : undefined;
}
If the alarm is found, we acknowledge it using the acknowledgeAlarm
method. If the alarm is not found, we log a warning message.
1
2
3
4
5
6
7
private acknowledgeSpecificAlarm(alarmManager: VcAlarmManager, vcHost: VcManagedEntity, alarm: VcAlarm): void {
try {
alarmManager.acknowledgeAlarm(alarm, vcHost);
} catch (error) {
this.handleError(`Failed to acknowledge alarm ${alarm.info.name} on ${vcHost.name}: ${error}`);
}
}
Clear triggered alarms
Now, let’s move on to clearing alarms. We’ll create a method called clearAlarms
to clear alarms based on the alarm type, trigger type, and status. We’ll use the VcAlarmFilterSpec
object to filter the alarms based on the specified criteria. The clearTriggeredAlarms
method will be used to clear the alarms. This method takes the VcAlarmFilterSpec
object as an argument and its a bit limited. We can only filter by typeEntity
, typeTrigger
, and status
.
entityType
and triggerType
are enums that define the entity and trigger types, respectively. The status
is an array of VcManagedEntityStatus
values that define the status of the alarms to be cleared.
entityType
can be one of the following:
entityTypeAll
entityTypeVm
entityTypeHost
(our case)
triggerType
can be one of the following:
triggerTypeMetric
(our case)triggerTypeAll
triggerTypeEvent
status
can be one of the following:
green
yellow
(our case)red
(our case)gray
To clear all triggered alarms on ESXi hosts, we can combine the values mentioned above in various ways. For instance, if we want to clear all red
alarms (status: red
) on ESXi hosts (type: entityTypeHost
) caused by high CPU utilization (type: triggerTypeMetric
), we can use the following code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
private clearAlarm(
alarmManager: VcAlarmManager,
entityType: VcAlarmFilterSpecAlarmTypeByEntity,
triggerType: VcAlarmFilterSpecAlarmTypeByTrigger,
status: VcManagedEntityStatus
): void {
var vcAlarmFilterSpec = new VcAlarmFilterSpec();
vcAlarmFilterSpec.typeEntity = entityType.entityTypeHost;
vcAlarmFilterSpec.typeTrigger = triggerType.triggerTypeMetric;
vcAlarmFilterSpec.status = [status.red];
try {
alarmManager.clearTriggeredAlarms(vcAlarmFilterSpec);
} catch (error) {
this.handleError(`Failed to clear triggered alarms: ${error}`);
}
}
The clearAlarm
method can serve as an alternative to the clearTriggeredAlarms
method, allowing alarms to be cleared based on specific criteria, such as those defined in the processAlarms
method. It can also be adapted for other implementations depending on the requirements, enabling us to either acknowledge or clear the alarm.
Bonus: Error handling
We can create a class to handle errors. This class will contain a method called handleError
that will log the error message and throw an exception. This way, we can ensure that all errors are handled consistently.
1
2
3
4
5
6
export class AlarmManagementError extends Error {
constructor(message: string) {
super(message);
this.name = "AlarmManagementError";
}
}
Now, we can use this class in our main AlarmManagement
class to handle errors.
1
2
3
4
5
6
private handleError(errorDescription: string): never {
throw new AlarmManagementError(`Error: ${errorDescription}.`);
}
// Usage
this.handleError(`Error description: ${error}`);
Testing the solution
Let’s create a simple workflow to test our solution. We’ll create a new workflow called Acknowledge Alarm
and add an input parameter of type string
called alarmName
. We’ll also add a scriptable task to the workflow and write the following code:
1
2
3
4
5
6
7
8
9
10
11
var vCenter = VcPlugin.AllSdkConnections[0];
var alarmManagementFunctions = System.getModule("com.examples.vmware_aria_orchestrator_examples.actions").alarmManagement();
var triggeredAlarms = alarmManagementFunctions.AlarmManagement.prototype.getTriggeredAlarms(vCenter)
for (var k = 0; k < triggeredAlarms.length; k++) {
var alarm = triggeredAlarms[k];
System.log("Triggered Alarm on " + alarm.vcHost.name + ": " + alarm.alarmName + " (" + alarm.status + ")");
System.log("Time: " + alarm.time);
alarmManagementFunctions.AlarmManagement.prototype.processAlarms(vCenter, alarm.vcHost, alarmName, alarm.alarmObject);
}
The logic is straightforward but sufficient for testing our solution. We retrieve all triggered alarms, log them, and then acknowledge them. The key aspect is leveraging the AlarmManagement
class to handle the functionality. We define alarmManagementFunctions
as a new instance of the AlarmManagement
class and invoke the getTriggeredAlarms
and processAlarms
methods.
We have defined a new test alarm called A test alarm 2
in vCenter.
We can now run the workflow and see the results. If the alarm is triggered, we should see it in the logs and then acknowledge it.
In case of failure, we’ll see a nice and descriptive error message in the logs including the module name AlarmManagementError
, which is very helpful for debugging.:
errorError in (Dynamic Script Module name : alarmManagement#28) AlarmManagementError: Error: No vCenter connections available..
Summary
In this post, we explored how to programmatically manage triggered alarms in vCenter with vRO. We created a new AlarmManagement
class and introduced a few best practices, such as creating a class for errors handling, using private methods, and more. We acknowledged a specific alarm and cleared alarms of a specific type. We also discussed the use case and general goals of the solution.
Source Code
The source code with the unit tests can be found here.
The vRO package is also available here and the external ECMASCRIPT package here.
All packages should be imported