
How can you merge datasets with different timescales?

Handling frequency in machine learning

One of the trickiest situations in machine learning is dealing with datasets that come from different time scales.

A typical case: you are handling financial data, where some of it arrives at a monthly frequency (e.g. sales reports) and some at a daily frequency (e.g. stock market prices). How can you create a model that utilises both pieces of information at the same time?

One solution is to aggregate the higher-frequency features. In this case, for example, you can aggregate the daily features to a monthly level using functions like the mean and the standard deviation. However, this loses information, and it might be a suboptimal solution.
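As a quick illustration, the aggregation step might look like the snippet below, assuming the daily data sits in a DataFrame indexed by date (the file and column names are hypothetical):

import pandas as pd

# Hypothetical daily data indexed by a 'date' column
daily = pd.read_csv('day_data.csv', index_col='date', parse_dates=True)

# Collapse each month into its mean and standard deviation; the
# within-month dynamics are discarded in the process
monthly_aggregates = daily.resample('M').agg(['mean', 'std'])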

A better solution comes through the use of deep learning.


Deep learning for handling data with different frequencies

Using deep neural networks, it is possible to do this in a very smooth manner. You can create two subnetworks: one reads the daily data, the other reads the monthly data. The outputs of the two subnetworks are then joined together before being passed on to further layers. The code below shows how you could do this for the two datasets outlined above (the reshaping step assumes the daily CSV stores flattened 30-day windows).

import pandas as pd
from tensorflow import keras
from tensorflow.keras.layers import (LSTM, Dense, Concatenate,
                                     BatchNormalization, Dropout)

X_day = pd.read_csv('day_data.csv')
X_month = pd.read_csv('month_data.csv')

# The daily CSV is assumed to hold flattened 30-day windows; reshape it to
# (samples, timesteps, features) so the LSTM can read it as a sequence.
timesteps = 30
n_day_features = X_day.shape[1] // timesteps
X_day = X_day.to_numpy().reshape(-1, timesteps, n_day_features)

num_units = 50  # width of the dense branch that reads the monthly features

# One input per frequency
day_input = keras.Input(shape=(timesteps, n_day_features), name="day_input")
monthly_input = keras.Input(shape=(X_month.shape[1],), name="monthly_input")

# Subnetwork 1: an LSTM summarises the daily sequence into a single vector
x1 = LSTM(50)(day_input)
# Subnetwork 2: a dense layer processes the monthly features
x2 = Dense(num_units, activation='elu')(monthly_input)

# Merge the two subnetworks
merged = Concatenate()([x1, x2])

x = Dense(100, activation='elu')(merged)
x = BatchNormalization()(x)
x = Dropout(0.2)(x)
y = Dense(3, activation='softmax')(x)
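
To make this trainable, you would wrap the inputs and the output head into a model. A minimal sketch, assuming a 3-class problem with integer labels in a hypothetical y_train array:

# Wrap both inputs and the output head into a single model
model = keras.Model(inputs=[day_input, monthly_input], outputs=y)

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Each input receives its own array, matched by position
model.fit([X_day, X_month], y_train, epochs=10, batch_size=32)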

The benefit of using deep learning

The benefit of using deep learning in this case is that you keep the information that would otherwise be lost by aggregating features together. This works through the use of a layer like LSTM, GRU or 1D convolution, which can read sequential data. We simply use a dense layer for the monthly data (processing each month in one batch), and then merge the two subnetworks together.
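Swapping the sequence layer only changes one spot in the daily branch. A minimal sketch with a 1D convolution (or a GRU) in place of the LSTM; the filter count and kernel size here are arbitrary choices:

from tensorflow.keras.layers import Conv1D, GlobalAveragePooling1D, GRU

# Convolutional alternative: slide filters over the daily sequence,
# then pool over time to get a fixed-size vector for the merge step
x1 = Conv1D(filters=50, kernel_size=5, activation='elu')(day_input)
x1 = GlobalAveragePooling1D()(x1)

# Or a GRU, which, like the LSTM, returns one vector per sequence:
# x1 = GRU(50)(day_input)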

So, make sure to check this trick out next time you are faced with this problem!