Description
In this assignment we will practice detecting anomalies in a benchmark dataset. We will explore deep learning approaches, including a sequence-to-sequence MLP and an autoencoder.
Dataset
NAB (Numenta Anomaly Benchmark) is a benchmark for evaluating anomaly detection algorithms in streaming, real-time applications. It consists of over 50 data files intended for research in streaming anomaly detection, covering both real-world and artificial time series with labeled periods of anomalous behavior.
Tasks
Part I: MLP for Anomaly Detection
- Choose any dataset from NAB (except those used in class) and prepare it for training (normalize, split into train/validation/test sets). Explore the dataset by visualizing it and reporting its summary statistics.
- Build an MLP/LSTM model that predicts a sequence of values (at least 5 values). Try 3 different combinations of input window size and output sequence length.
- Identify the anomalies in the dataset using 3 different loss/distance measures, and compare the results obtained with each measure.
- Discuss the results and provide graphs, e.g. train vs. validation accuracy and loss over time. Show a confusion matrix (normal vs. anomaly).
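The steps in Part I can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not a reference solution: the series is synthetic (a noisy sine wave with an injected spike) rather than a NAB file, the forecaster is a naive "repeat the last value" placeholder where a trained MLP/LSTM would go, and the helper names (make_windows, window_errors, confusion_matrix), the three measures chosen, and the 99th-percentile threshold are all hypothetical choices made for the example.

```python
import numpy as np

def make_windows(series, window, horizon):
    """Slice a 1-D series into (input window, output sequence) pairs."""
    n = len(series) - window - horizon + 1
    X = np.stack([series[i:i + window] for i in range(n)])
    Y = np.stack([series[i + window:i + window + horizon] for i in range(n)])
    return X, Y

def window_errors(y_true, y_pred):
    """Three per-window distance measures between target and prediction."""
    diff = y_true - y_pred
    return {
        "mae": np.mean(np.abs(diff), axis=1),
        "mse": np.mean(diff ** 2, axis=1),
        "max_abs": np.max(np.abs(diff), axis=1),
    }

def confusion_matrix(labels, flags):
    """2x2 matrix: rows = actual (normal, anomaly), columns = predicted."""
    labels = np.asarray(labels, dtype=bool)
    flags = np.asarray(flags, dtype=bool)
    return np.array([
        [np.sum(~labels & ~flags), np.sum(~labels & flags)],
        [np.sum(labels & ~flags), np.sum(labels & flags)],
    ])

# Assumption: synthetic stand-in for a NAB series, with a spike at t = 400..404.
rng = np.random.default_rng(0)
t = np.arange(600)
series = np.sin(2 * np.pi * t / 50) + 0.05 * rng.standard_normal(t.size)
series[400:405] += 3.0

# Chronological split; normalize with statistics from the training part only.
train_end = 300
mean, std = series[:train_end].mean(), series[:train_end].std()
scaled = (series - mean) / std

window, horizon = 20, 5  # one of the three setups to compare
X, Y = make_windows(scaled, window, horizon)

# Placeholder forecaster: repeat the last observed value `horizon` times.
# A trained MLP/LSTM would produce Y_pred here instead.
Y_pred = np.repeat(X[:, -1:], horizon, axis=1)
errors = window_errors(Y, Y_pred)

# A window counts as anomalous if its target span overlaps the injected spike.
starts = np.arange(len(Y)) + window
labels = (starts < 405) & (starts + horizon > 400)

# Threshold each measure at the 99th percentile of its training-set errors.
n_train = train_end - window - horizon + 1
for name, err in errors.items():
    threshold = np.percentile(err[:n_train], 99)
    flags = err > threshold
    print(name, confusion_matrix(labels, flags).tolist())
```

Printing one confusion matrix per measure makes the comparison between measures direct. With real NAB data, the labels would come from the benchmark's labeled anomaly windows rather than from a known injected spike.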
Part II: Autoencoder for Anomaly Detection
- Build an autoencoder model for predicting a sequence of values. Show 3 different autoencoder setups (e.g. using Dense/LSTM/Conv1D layers).
- For one of the models built in step 1, show the hyperparameter tuning process (e.g. thresholds, number of layers, activation functions).
- Discuss the results and provide graphs, e.g. train vs. validation accuracy and loss over time. Show the confusion matrix.
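As one illustration of the threshold-tuning step, the sketch below sweeps the anomaly threshold over several percentiles of the reconstruction error and reports precision, recall, and F1 for each. It rests on several assumptions: the data is synthetic, the model is a linear autoencoder fitted with SVD (equivalent to PCA) standing in for the Dense/LSTM/Conv1D autoencoders the task asks for, and all function names and the percentile grid are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, n_components):
    """Fit a linear autoencoder (equivalent to PCA); return (mean, components)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruct(X, mean, components):
    """Encode windows to the bottleneck, then decode back to window space."""
    codes = (X - mean) @ components.T
    return codes @ components + mean

def precision_recall_f1(labels, flags):
    """Precision, recall and F1 for boolean label and prediction arrays."""
    tp = np.sum(labels & flags)
    fp = np.sum(~labels & flags)
    fn = np.sum(labels & ~flags)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Assumption: synthetic series with a level shift injected at t = 600..609.
rng = np.random.default_rng(1)
t = np.arange(800)
series = np.sin(2 * np.pi * t / 40) + 0.05 * rng.standard_normal(t.size)
series[600:610] += 2.5

window = 30
windows = np.stack([series[i:i + window]
                    for i in range(len(series) - window + 1)])
starts = np.arange(len(windows))
labels = (starts < 610) & (starts + window > 600)

# Train on the first 400 windows, which contain no anomaly.
n_train = 400
mean, components = fit_linear_autoencoder(windows[:n_train], n_components=2)
recon = reconstruct(windows, mean, components)
errors = np.mean((windows - recon) ** 2, axis=1)

# Sweep the threshold over percentiles of the training reconstruction error.
results = []
for q in (90, 95, 99, 99.9):
    threshold = np.percentile(errors[:n_train], q)
    flags = errors > threshold
    results.append((q, threshold, *precision_recall_f1(labels, flags)))

for q, threshold, p, r, f1 in results:
    print(f"q={q:>5}: threshold={threshold:.4f} "
          f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")

best = max(results, key=lambda row: row[-1])
print("best percentile by F1:", best[0])
```

For brevity the sweep here is scored on the whole series. In an actual submission the threshold would be selected on the validation split and the confusion matrix reported on the test split. The same loop structure extends to other hyperparameters, such as the bottleneck size or the number of layers.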



