Retraining Update Strategies on Old and New Data

Update Model on Old and New Data
We can update the model on a combination of both old and new data.

An extreme version of this approach is to discard the model and simply fit a new model on all available data, new and old. A less extreme version would be to use the existing model as a starting point and update it based on the combined dataset.

Again, it is a good idea to test both strategies and see what works well for your dataset.
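
For example, here is a minimal sketch of how such a comparison might look, assuming we hold back a test set for evaluation (the tutorial itself does not do this); the split sizes and epoch counts are illustrative:

# compare retraining from scratch vs. continued training (illustrative sketch)
from numpy import vstack, hstack
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

def build_model(n_features, learning_rate):
    # same architecture as the tutorial model
    model = Sequential()
    model.add(Dense(20, kernel_initializer='he_normal', activation='relu', input_dim=n_features))
    model.add(Dense(10, kernel_initializer='he_normal', activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer=SGD(learning_rate=learning_rate, momentum=0.9), loss='binary_crossentropy')
    return model

# data: old, new, and a held-back test set (an assumption for this sketch)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=1)
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.20, random_state=1)
X_old, X_new, y_old, y_new = train_test_split(X_rest, y_rest, test_size=0.50, random_state=1)
X_both, y_both = vstack((X_old, X_new)), hstack((y_old, y_new))

# strategy 1 (extreme): discard the old model and fit a new one on all data
scratch = build_model(X.shape[1], 0.01)
scratch.fit(X_both, y_both, epochs=150, batch_size=32, verbose=0)

# strategy 2 (less extreme): fit on old data, then continue on all data
updated = build_model(X.shape[1], 0.01)
updated.fit(X_old, y_old, epochs=150, batch_size=32, verbose=0)
updated.compile(optimizer=SGD(learning_rate=0.001, momentum=0.9), loss='binary_crossentropy')
updated.fit(X_both, y_both, epochs=100, batch_size=32, verbose=0)

# evaluate both strategies on the held-back test set
for name, m in [('from scratch', scratch), ('updated', updated)]:
    yhat = (m.predict(X_test, verbose=0).ravel() > 0.5).astype(int)
    print(name, accuracy_score(y_test, yhat))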

We will focus on the less extreme update strategy in this case.

As before, we can create the synthetic dataset, split it into old and new data, and fit the model on the old data.

# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=1)
# record the number of input features in the data
n_features = X.shape[1]
# split into old and new data
X_old, X_new, y_old, y_new = train_test_split(X, y, test_size=0.50, random_state=1)
# define the model
model = Sequential()
model.add(Dense(20, kernel_initializer='he_normal', activation='relu', input_dim=n_features))
model.add(Dense(10, kernel_initializer='he_normal', activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# define the optimization algorithm
opt = SGD(learning_rate=0.01, momentum=0.9)
# compile the model
model.compile(optimizer=opt, loss='binary_crossentropy')
# fit the model on old data
model.fit(X_old, y_old, epochs=150, batch_size=32, verbose=0)
New data becomes available, and we wish to update the model on a combination of both the old and new data.

First, we must use a much smaller learning rate so that the current weights serve as a starting point for the search, rather than being quickly overwritten by large updates.

Note: you will need to discover a learning rate appropriate for your model and dataset, one that achieves better performance than simply fitting a new model from scratch.
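
For instance, here is a minimal sketch of a small learning-rate sweep for the update step; the candidate rates are illustrative assumptions, and clone_model/set_weights are used so that each trial starts from the same fitted weights (the composite dataset is created here ahead of the step shown below):

# illustrative sweep over candidate learning rates for the update step
from numpy import vstack, hstack
from tensorflow.keras.models import clone_model

X_both, y_both = vstack((X_old, X_new)), hstack((y_old, y_new))
for lr in [0.01, 0.001, 0.0001]:
    # copy the architecture and the fitted weights for a fresh trial
    trial = clone_model(model)
    trial.set_weights(model.get_weights())
    trial.compile(optimizer=SGD(learning_rate=lr, momentum=0.9), loss='binary_crossentropy')
    history = trial.fit(X_both, y_both, epochs=100, batch_size=32, verbose=0)
    # report the final training loss for each candidate rate
    print(lr, history.history['loss'][-1])

In practice, the comparison is better made on held-out data than on training loss alone.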

# update model with a smaller learning rate
opt = SGD(learning_rate=0.001, momentum=0.9)
# compile the model
model.compile(optimizer=opt, loss='binary_crossentropy')
We can then create a composite dataset composed of old and new data.

# create a composite dataset of old and new data
X_both, y_both = vstack((X_old, X_new)), hstack((y_old, y_new))
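
As a quick check, the composite arrays should contain all 1,000 rows: vstack stacks the feature rows vertically and hstack concatenates the label vectors.

# illustrative shape check: 500 old + 500 new samples
print(X_old.shape, X_new.shape, X_both.shape)  # (500, 20) (500, 20) (1000, 20)
print(y_old.shape, y_new.shape, y_both.shape)  # (500,) (500,) (1000,)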
Finally, we can update the model on this composite dataset.

# fit the model on a combination of old and new data
model.fit(X_both, y_both, epochs=100, batch_size=32, verbose=0)
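
A possible variation, not part of the tutorial: if the new data should count for more than the old data during the update, Keras' fit() accepts per-sample weights. The 0.5/1.0 weighting below is an illustrative assumption:

# weight new samples twice as heavily as old ones (illustrative values)
from numpy import full, hstack

weights = hstack((full(len(X_old), 0.5), full(len(X_new), 1.0)))
model.fit(X_both, y_both, sample_weight=weights, epochs=100, batch_size=32, verbose=0)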
Tying this together, the complete example of updating a neural network model on both old and new data is listed below.

# update neural network with both old and new data
from numpy import vstack
from numpy import hstack
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=1)
# record the number of input features in the data
n_features = X.shape[1]
# split into old and new data
X_old, X_new, y_old, y_new = train_test_split(X, y, test_size=0.50, random_state=1)
# define the model
model = Sequential()
model.add(Dense(20, kernel_initializer='he_normal', activation='relu', input_dim=n_features))
model.add(Dense(10, kernel_initializer='he_normal', activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# define the optimization algorithm
opt = SGD(learning_rate=0.01, momentum=0.9)
# compile the model
model.compile(optimizer=opt, loss='binary_crossentropy')
# fit the model on old data
model.fit(X_old, y_old, epochs=150, batch_size=32, verbose=0)
# save model...
# load model...
# update model with a smaller learning rate
opt = SGD(learning_rate=0.001, momentum=0.9)
# compile the model
model.compile(optimizer=opt, loss='binary_crossentropy')
# create a composite dataset of old and new data
X_both, y_both = vstack((X_old, X_new)), hstack((y_old, y_new))
# fit the model on a combination of old and new data
model.fit(X_both, y_both, epochs=100, batch_size=32, verbose=0)
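
The save and load steps are elided in the listing above. Here is a minimal sketch of what they might look like using Keras' standard persistence API; the file name is an illustrative assumption:

# sketch of the elided save/load step; 'old_model.h5' is an assumed file name
from tensorflow.keras.models import load_model

# after fitting on the old data:
model.save('old_model.h5')
# ...later, when new data becomes available:
model = load_model('old_model.h5')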