Finance

Charts

Statistics

Macros

Search

Automate data anomaly detection processes, Excel VBA

Here’s the same VBA code explained in English for automating the process of detecting data anomalies based on statistical thresholds (like 3 standard deviations from the mean).

  1. Structure of the Data

Suppose the data is in column A starting from row 2 (A2

), and we want to detect anomalies in this data range.

  1. VBA Code for Detecting Anomalies:
Sub DetectAnomalies()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim dataRange As Range
    Dim value As Double
    Dim mean As Double
    Dim stdev As Double
    Dim thresholdUpper As Double
    Dim thresholdLower As Double
    Dim cell As Range   
    ' Reference to the active sheet
    Set ws = ThisWorkbook.Sheets("Sheet1")   
    ' Define the data range (column A)
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    Set dataRange = ws.Range("A2:A" & lastRow)   
    ' Calculate the mean and standard deviation of the data
    mean = Application.WorksheetFunction.Average(dataRange)
    stdev = Application.WorksheetFunction.StDev(dataRange)  
    ' Define the anomaly thresholds: 3 standard deviations above or below the mean
    thresholdUpper = mean + 3 * stdev
    thresholdLower = mean - 3 * stdev
    ' Loop through each cell in the data range to detect anomalies
    For Each cell In dataRange
        value = cell.Value       
        ' Check if the value is an anomaly (above or below the thresholds)
        If value > thresholdUpper Or value < thresholdLower Then
            ' Highlight the cell as an anomaly (e.g., in red)
            cell.Interior.Color = RGB(255, 0, 0) ' red
            cell.Offset(0, 1).Value = "Anomaly" ' Mark the cell next to the value
        Else
            ' If the value is not an anomaly, no change
            cell.Interior.ColorIndex = -4142 ' Remove color
            cell.Offset(0, 1).Value = ""
        End If
    Next cell
    ' Show a message when the process is complete
    MsgBox "Anomaly detection complete!", vbInformation
End Sub

Code Explanation:

  1. Variable Definitions:
    • ws: Reference to the active worksheet where the data is stored.
    • lastRow: Variable that stores the last row with data in column A.
    • dataRange: The range of cells containing the numerical data to analyze (A2).
    • mean: The mean (average) of the values in the data range.
    • stdev: The standard deviation of the values in the data range.
    • thresholdUpper and thresholdLower: The upper and lower anomaly thresholds (mean ± 3 standard deviations).
  1. Calculating the Mean and Standard Deviation:
    • The code calculates the average of the data range using Application.WorksheetFunction.Average.
    • It also calculates the standard deviation using Application.WorksheetFunction.StDev.
  2. Anomaly Detection:
    • The code loops through each cell in the dataRange.
    • For each value, it checks if the value is greater than the thresholdUpper or less than the thresholdLower (i.e., outside 3 standard deviations from the mean).
    • If the value is an anomaly, the cell is highlighted in red and the text « Anomaly » is added in the adjacent column (column B).
    • If the value is not an anomaly, any color formatting is removed, and the adjacent column (B) is cleared.
  3. Completion Message:
    • Once the anomaly detection process is complete, a message box is displayed to inform the user that the process is finished.
  1. How to Use the Code?
  1. Open the VBA Editor:
    • Press Alt + F11 to open the VBA editor.
    • In the editor, go to Insert > Module to insert a new module.
  2. Copy and Paste the Code:
    • Copy the code above and paste it into the module you just created.
  3. Run the Code:
    • Press F5 to run the code and detect anomalies in the data in column A of the active sheet.
  1. Customization:
  • Anomaly Thresholds: You can adjust the threshold (currently set as 3 standard deviations) by modifying the thresholdUpper and thresholdLower values.
  • Data Range: If your data is not in column A or extends across multiple columns, you can modify the dataRange.
  • Anomaly Formatting: You can change the highlight color or apply other actions, such as sending an email or saving the results to a different sheet.

This code provides a good starting point for automating anomaly detection in data using VBA. You can tailor it to fit different datasets or anomaly criteria.

0 0 votes
Évaluation de l'article
S’abonner
Notification pour
guest
0 Commentaires
Le plus ancien
Le plus récent Le plus populaire
Online comments
Show all comments
Facebook
Twitter
LinkedIn
WhatsApp
Email
Print
0
We’d love to hear your thoughts — please leave a commentx