Mcfly: An easy-to-use tool for deep learning for time series classification

A new mcfly 3.0 release is out. See how it works and how it can help you to apply deep learning to time series classification.

Florian Huber
Netherlands eScience Center
Apr 15, 2020

Time series data are everywhere: motion sensor data, climate data, weather data, stock market data, population dynamics, and many other kinds of sensor readings. In machine learning, people working with time series are usually trying to solve either a regression task or a classification task.

A regression task is when you want to predict how a known time series will continue. What will the stock market look like tomorrow, or more likely over the next few microseconds? What will the temperature be tomorrow, or how will the arctic sea ice concentration develop over the next few weeks?

Time series REGRESSION and CLASSIFICATION are typical machine-learning tasks which can also be done using deep learning. Regression refers to making predictions on the continuation of a time series. Classification is the task of sorting given time series into a given number of classes (e.g. here it could be motion sensor data where we want to tell whether the respective person was resting or active).

A classification task is when you want to sort a given time series into known classes. Is a patient with a certain heartbeat recording healthy or not? What kind of activity is being revealed by motion sensor data? Is an oxygen sensor signal showing more or fewer people in the office?

The rest of this post is all about the classification of time series.

Classifying time series is not a novel task, so it should come as no surprise that there are plenty of techniques and algorithms to classify such data. But just as in many other fields, deep learning has recently changed this game drastically. Whereas in the past a number of algorithms with long lists of skillfully set parameters might have done the job, more and more deep learning tools have now begun to outperform classical techniques.

Would you perhaps love to test whether deep learning could help you with your own classification problem, but you aren’t a deep learning practitioner?

No false promises. It can be hard to apply deep learning tools properly. Not so much on the technical level, as I will show you in a second, but from a statistical and mathematical standpoint. But for now just ignore all those concerns. All you want to do is test whether deep learning can give you decent results on your dataset.

And that can indeed be super simple. How? Just use mcfly.

mcfly is a Python library developed by the Netherlands eScience Center to generate a bunch of deep learning networks for you and to train them on your dataset. We recently released version 3.0 which comes with more deep learning architectures and runs on newer Python libraries. The main goal of mcfly is to: (1) provide a quick entry point to the field of deep learning, and (2) quickly give you a first guesstimate of how (and if!) deep learning can help with your problem.

What is needed?

  1. You will need Python (3.5 or higher). Then simply pip install mcfly:
    pip install mcfly
  2. Next you need a time series dataset to work on.
    Your dataset should contain samples belonging to different categories (or classes), with multiple samples per category. The more samples per category, the better, but how many samples your dataset must have for deep learning to work is difficult to answer. The more categories you have, and the smaller the differences between them, the more samples are needed for deep learning models to (hopefully) pick up the subtle differences in the patterns.

Let’s do some deep learning now…

For a quick start let’s use a simple data set to figure out how it all works. For this purpose I selected a very simple motion sensor data set called “RacketSports”. It consists of mobile phone motion sensor data taken when playing either squash or badminton. The goal here is to correctly tell from a given motion sensor time series what the person was doing, with four possible classes in the data set:

Squash_BackhandBoast
Squash_ForehandBoast
Badminton_Smash
Badminton_Clear

The dataset is publicly available here. I downloaded it, split it into training, validation, and test sets, and saved them as numpy arrays. The processed files can now be found on Zenodo, or, even simpler, you can import them from our mcfly-tutorial. If you clone the mcfly-tutorial repo you can run the following:

Snippet from a much more extensive mcfly tutorial notebook.

This will give you 3 arrays with the data for training, validation, and testing, together with 3 arrays of the corresponding labels. Those labels, however, are still strings and need to be translated into binary labels. For instance like this:

Snippet from a much more extensive mcfly tutorial notebook.
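As a rough illustration of what such a translation from string labels to binary (one-hot) labels could look like, here is a minimal sketch in plain Python. It is not the tutorial's own code: the helper name `to_one_hot`, the variable names, and the four example labels are assumptions made for illustration.

```python
# Hypothetical string labels, standing in for the label arrays described above.
y_train = ["Badminton_Smash", "Squash_ForehandBoast", "Badminton_Clear", "Badminton_Smash"]


def to_one_hot(labels, classes):
    """Return one row per label, with a 1 at the index of its class and 0 elsewhere."""
    index = {name: i for i, name in enumerate(classes)}
    return [[1 if index[label] == i else 0 for i in range(len(classes))]
            for label in labels]


# Fix the class order once (sorted here) so it can be reused for the
# training, validation, and test labels alike.
classes = sorted([
    "Squash_BackhandBoast",
    "Squash_ForehandBoast",
    "Badminton_Smash",
    "Badminton_Clear",
])

y_train_binary = to_one_hot(y_train, classes)
print(classes)
print(y_train_binary[0])
```

The important design choice is to fix the class order once and apply the same mapping to all three label arrays, so that a given column always means the same activity.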

Generate deep learning models using mcfly

This is the heart of mcfly. It will generate number_of_models models for you. These are built from four different types of deep learning architectures, which are displayed in the figure below. They all have different pros and cons. One of the purposes of mcfly is to quickly test those architectures and see if some work better than others on a given problem. If you want to select only one or a few of those model types, simply add the argument model_types = ['CNN', 'DeepConvLSTM'] to the function call, listing the types you want.

Mcfly (from v3.0 on) generates four different types of deep learning networks for time series classification. Figure appeared in [Kuppevelt et al., 2020, doi: 10.1016/j.softx.2020.100548]

For all four architecture types, mcfly randomly chooses the key hyperparameters within a set range. Those ranges can of course also be adjusted manually (see the mcfly documentation).
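To make the idea of randomly choosing hyperparameters within a set range more concrete, here is a small sketch in plain Python. It does not reproduce mcfly's internals: the parameter names, the ranges, and the function `sample_hyperparameters` are all assumptions chosen for illustration.

```python
import random

# Hypothetical search ranges; the names and values are illustrative,
# not mcfly's actual defaults.
RANGES = {
    "number_of_layers": (1, 10),         # integer range, both ends inclusive
    "learning_rate_exponent": (-4, -1),  # learning rate = 10 ** exponent
}


def sample_hyperparameters(rng):
    """Draw one random hyperparameter set from RANGES."""
    low, high = RANGES["number_of_layers"]
    number_of_layers = rng.randint(low, high)
    # Sampling the exponent uniformly spreads the learning rates evenly
    # on a logarithmic scale instead of clustering them near the upper end.
    low, high = RANGES["learning_rate_exponent"]
    learning_rate = 10 ** rng.uniform(low, high)
    return {"number_of_layers": number_of_layers, "learning_rate": learning_rate}


rng = random.Random(0)  # seeded so that repeated runs give the same candidates
candidates = [sample_hyperparameters(rng) for _ in range(5)]
for candidate in candidates:
    print(candidate)
```

Each candidate would then correspond to one generated model. Widening or narrowing the entries in `RANGES` is the analogue of adjusting the ranges manually.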

Once the models are generated, they will be trained on the given data (or a subset to speed things up). This is done using train_models_on_samples.

See full tutorial notebook for more information.

The performance of the trained models can then be compared interactively using mcfly, for example by comparing the accuracy on the validation set versus a number of key hyperparameters (see screenshot below). The built-in visualization is interactive and lets you select specific models (here numbered 0 to 7), specific architectures (here ‘CNN’ or ‘InceptionTime’), or learning rates. The most important features to look at are the two plots at the top, which display the development of the accuracy on both the training set and the validation set. A good model should perform decently well on both.

mcfly built-in visualization of the model performance. While in this example all models achieve good results on the training dataset, far fewer reach decent results on the validation dataset.

Overfitting

Since the RacketSports dataset only consists of 150 training examples, you will frequently see that generated deep learning models overfit the data. Overfitting is one of the most common problems when working with deep learning. It essentially means that you optimize too much on the training data, so that the performance on unseen data suffers. A typical signature of this is that models do well on the training data (high training accuracy) but perform poorly on the validation data (low validation accuracy). A typical example is shown below with 8 models, most of which report high train_accuracy but very low val_accuracy values!

Example results of 8 mcfly generated models trained on RacketSports dataset. Six perform well on the training data, but only 2 models also perform well on the validation data.
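The overfitting signature described above (high training accuracy, low validation accuracy) can also be checked programmatically. The sketch below is a simple heuristic in plain Python, not an mcfly feature: the accuracy values are made up for illustration, and the function `overfits` and its `max_gap` threshold are assumptions.

```python
# Hypothetical final accuracies per model; the numbers are invented
# for illustration and are not results from the RacketSports dataset.
results = [
    {"model": 0, "train_accuracy": 0.98, "val_accuracy": 0.35},
    {"model": 1, "train_accuracy": 0.95, "val_accuracy": 0.90},
    {"model": 2, "train_accuracy": 0.40, "val_accuracy": 0.38},
]


def overfits(result, max_gap=0.15):
    """Flag a model whose training accuracy exceeds its validation accuracy by more than max_gap."""
    return result["train_accuracy"] - result["val_accuracy"] > max_gap


overfitting_models = [r["model"] for r in results if overfits(r)]

# Pick the model to keep by validation accuracy, never by training accuracy.
best = max(results, key=lambda r: r["val_accuracy"])

print("Overfitting models:", overfitting_models)
print("Best model on validation data:", best["model"])
```

Note that a small gap alone does not make a model good: model 2 in this made-up example does not overfit, it simply performs poorly on both sets.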

If we now pick one of the better performing models (or iteratively generate and train more models), we can get quite good results on the RacketSports dataset. Below you can see how we could inspect this by generating a confusion matrix. And here it indeed reveals that most of the time the picked model correctly predicts the actual activity!

Generating a confusion matrix using the validation dataset.
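For readers unfamiliar with confusion matrices, the following plain-Python sketch shows how one is built from true and predicted class labels. It is an illustration only, with made-up labels and predictions. The function name `confusion_matrix` and the variable names are assumptions, not the tutorial's code.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions per class: rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


classes = ["Badminton_Clear", "Badminton_Smash",
           "Squash_BackhandBoast", "Squash_ForehandBoast"]

# Hypothetical validation labels and model predictions, for illustration only.
y_true = ["Badminton_Clear", "Badminton_Smash", "Badminton_Smash", "Squash_ForehandBoast"]
y_pred = ["Badminton_Clear", "Badminton_Smash", "Badminton_Clear", "Squash_ForehandBoast"]

cm = confusion_matrix(y_true, y_pred, classes)

# Correct predictions sit on the diagonal, so accuracy is the diagonal sum
# divided by the total number of samples.
accuracy = sum(cm[i][i] for i in range(len(classes))) / len(y_true)

for name, row in zip(classes, cm):
    print(f"{name:22s} {row}")
print("accuracy:", accuracy)
```

Off-diagonal entries show which activities get confused with each other. In this made-up example, one Badminton_Smash sample is mistaken for a Badminton_Clear.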

And now… what are you waiting for?

Grab some interesting time series data and try out some deep learning!

Did you use mcfly?

Awesome! We are always happy to hear from people who have made good use of mcfly. Please get in touch if you have suggestions and ideas for future developments or fixes (e.g. via GitHub, via Twitter, or as a response to this post). Many thanks.

Reference:
D. van Kuppevelt, C. Meijer, F. Huber, A. van der Ploeg, S. Georgievska, V.T. van Hees. Mcfly: Automated deep learning on time series. SoftwareX, Volume 12, 2020. doi: 10.1016/j.softx.2020.100548

