Detect Anomaly Point in Real Time Traffic Data




In this guide we demonstrate how to use the Chronos Anomaly Detector for time series anomaly detection in 3 simple steps.

Step 0: Prepare Environment

We recommend using conda to prepare the environment. Please refer to the install guide for more details.

conda create -n my_env python=3.7 # "my_env" is conda environment name, you can use any name you like.
conda activate my_env
pip install bigdl-chronos
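
After the install finishes, a quick sanity check is to confirm that the package can be found from the active environment. The sketch below uses only the Python standard library. The helper name `is_importable` is ours, not part of Chronos, and it assumes the package exposes the top-level module `bigdl`, as the imports later in this guide suggest.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "bigdl" is assumed to be the top-level package installed by bigdl-chronos.
    print("bigdl importable:", is_importable("bigdl"))
```

If this reports `False`, the most likely cause is that the conda environment was not activated before running `pip install`.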

Step 1: Prepare dataset

For demonstration, we use publicly available real-time traffic data from the Twin Cities Metro area in Minnesota, collected by the Minnesota Department of Transportation. Detailed information can be found here.

Next, we clean and preprocess the raw data. Note that this part can vary between datasets. For the traffic data, the preprocessing consists of 2 parts:

  1. Change the time interval from irregular to 5 minutes.

  2. Check missing values and handle missing data.

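from bigdl.chronos.data import TSDataset

# `df` is assumed to be a pandas DataFrame holding the raw traffic data,
# with a datetime column "timestamp" and a numeric column "value".
tsdata = TSDataset.from_pandas(df, dt_col="timestamp", target_col="value")

# Resample to a regular 5-minute interval, fill the resulting gaps by
# linear interpolation, then convert back to a pandas DataFrame.
df = tsdata.resample("5min")\
           .impute(mode="linear")\
           .to_pandas()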
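
To make the two preprocessing parts concrete, here is a minimal pure-Python sketch of the same idea: bucket irregular readings into 5-minute bins, then fill the empty bins by linear interpolation. It is an illustration only, not how Chronos implements `resample` or `impute`. The function names and the sample readings are made up, and it assumes that readings falling in the same bin are averaged.

```python
from datetime import datetime, timedelta


def floor_to_step(t, step):
    """Round a datetime down to the start of its bin."""
    return t - (t - datetime.min) % step


def resample_mean(points, step=timedelta(minutes=5)):
    """Average (timestamp, value) pairs into fixed-width bins.

    Every bin between the first and last reading is returned;
    bins without any reading get the value None.
    """
    buckets = {}
    for t, v in points:
        buckets.setdefault(floor_to_step(t, step), []).append(v)
    t, last = min(buckets), max(buckets)
    out = []
    while t <= last:
        vals = buckets.get(t)
        out.append((t, sum(vals) / len(vals) if vals else None))
        t += step
    return out


def impute_linear(series):
    """Fill None values by interpolating between the nearest known neighbours."""
    values = [v for _, v in series]
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        delta = values[right] - values[left]
        for i in range(left + 1, right):
            values[i] = values[left] + delta * (i - left) / (right - left)
    return [(t, v) for (t, _), v in zip(series, values)]


if __name__ == "__main__":
    # Hypothetical irregular readings with a gap between 11:07 and 11:21.
    raw = [
        (datetime(2015, 9, 8, 11, 1), 60.0),
        (datetime(2015, 9, 8, 11, 3), 70.0),
        (datetime(2015, 9, 8, 11, 7), 68.0),
        (datetime(2015, 9, 8, 11, 21), 80.0),
    ]
    for t, v in impute_linear(resample_mean(raw)):
        print(t.strftime("%H:%M"), v)
```

In this sample, the two readings at 11:01 and 11:03 are merged into a single 11:00 bin, and the empty 11:10 and 11:15 bins are filled on the straight line between the 11:05 and 11:20 values.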

Step 2: Use Chronos Anomaly Detector

Chronos provides several anomaly detectors. Here we use DBScanDetector as an example. More anomaly detectors can be found here.

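from bigdl.chronos.detector.anomaly import DBScanDetector

# eps is the neighbourhood radius and min_samples is the number of
# neighbours a point needs to count as a core point (DBSCAN parameters).
ad = DBScanDetector(eps=0.3, min_samples=6)
ad.fit(df['value'].to_numpy())
# Positions in the input array that were flagged as anomalies.
anomaly_indexes = ad.anomaly_indexes()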
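
For intuition about what `eps` and `min_samples` control, the following is a minimal sketch of DBSCAN on one-dimensional data in pure Python, where any point left without a cluster is reported as an anomaly. It is a teaching aid under our own assumptions, not the Chronos implementation. The function names and sample speeds are made up, and the real detector may preprocess the data (for example by scaling it), so `eps` values are not directly comparable.

```python
from collections import deque


def dbscan_1d(values, eps, min_samples):
    """Return a cluster id per value, or -1 for noise."""
    n = len(values)
    # The neighbourhood of a point includes the point itself.
    neighbours = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbours]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster outward from this unassigned core point.
        labels[i] = cluster
        queue = deque([i])
        while queue:
            p = queue.popleft()
            if not is_core[p]:
                continue  # border points join a cluster but do not extend it
            for q in neighbours[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    queue.append(q)
        cluster += 1
    return labels


def noise_indexes(values, eps, min_samples):
    """Indexes of the values that DBSCAN leaves unclustered."""
    labels = dbscan_1d(values, eps, min_samples)
    return [i for i, label in enumerate(labels) if label == -1]


if __name__ == "__main__":
    # Hypothetical speed readings; the value 12.0 is far from all the others.
    speeds = [60.0, 61.0, 59.0, 62.0, 60.0, 58.0, 12.0, 61.0]
    print(noise_indexes(speeds, eps=2.0, min_samples=3))
```

With these sample readings, only index 6 (the 12.0 reading) is reported. A larger `eps` or a smaller `min_samples` makes the clustering more permissive, so fewer points are flagged. Tightening either one flags more.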