Update Model on New Data Only
We can update the model on the new data only.
One extreme version of this approach is to ignore the new data and simply re-train the model on the old data, which is effectively the same as doing nothing in response to the new data. At the other extreme, a model could be fit on the new data only, discarding the old data and the old model.
- Ignore new data, do nothing.
- Update existing model on new data.
- Fit new model on new data, discard old model and data.
We will focus on the middle ground in this example, but it might be interesting to test all three approaches on your problem and see what works best; for contrast, a minimal sketch of the third option follows the data preparation below.
First, we can define a synthetic binary classification dataset and split it in half, using one portion as “old data” and the other portion as “new data.”
...
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=1)
# record the number of input features in the data
n_features = X.shape[1]
# split into old and new data
X_old, X_new, y_old, y_new = train_test_split(X, y, test_size=0.50, random_state=1)
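For contrast, the third option above, fitting a new model on the new data only, would discard the old model entirely. A minimal sketch of what that might look like (the architecture mirrors the MLP defined next; the epochs and batch size are illustrative choices):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

# define a fresh model with no knowledge of the old data or the old model
new_model = Sequential()
new_model.add(Dense(20, kernel_initializer='he_normal', activation='relu', input_dim=n_features))
new_model.add(Dense(10, kernel_initializer='he_normal', activation='relu'))
new_model.add(Dense(1, activation='sigmoid'))
new_model.compile(optimizer=SGD(learning_rate=0.01, momentum=0.9), loss='binary_crossentropy')
# fit on the new data only, ignoring the old data entirely
new_model.fit(X_new, y_new, epochs=150, batch_size=32, verbose=0)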
We can then define a Multilayer Perceptron (MLP) model and fit it on the old data only.
...
# define the model
model = Sequential()
model.add(Dense(20, kernel_initializer='he_normal', activation='relu', input_dim=n_features))
model.add(Dense(10, kernel_initializer='he_normal', activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# define the optimization algorithm
opt = SGD(learning_rate=0.01, momentum=0.9)
# compile the model
model.compile(optimizer=opt, loss='binary_crossentropy')
# fit the model on old data
model.fit(X_old, y_old, epochs=150, batch_size=32, verbose=0)
We can then imagine saving the model and using it for some time.
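The save and load steps are elided in the complete example below, but with the Keras API each can be a single call; a minimal sketch (the 'old_model.keras' filename is an illustrative choice):

from tensorflow.keras.models import load_model

# save the model fitted on the old data (the filename is an assumption for illustration)
model.save('old_model.keras')
# ...later, load it back, ready to be updated on the new data
model = load_model('old_model.keras')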
Time passes, and we wish to update it on new data that has become available.
This would involve using a much smaller learning rate than normal so that we do not wash away the weights learned on the old data.
Note: you will need to discover a learning rate appropriate for your model and dataset that achieves better performance than simply fitting a new model from scratch; a sketch of one rough way to search for it appears after the update code below.
...
# update model on new data only with a smaller learning rate
opt = SGD(learning_rate=0.001, momentum=0.9)
# compile the model
model.compile(optimizer=opt, loss='binary_crossentropy')
We can then fit the model on the new data only with this smaller learning rate.
...
model.compile(optimizer=opt, loss='binary_crossentropy')
# fit the model on new data
model.fit(X_new, y_new, epochs=100, batch_size=32, verbose=0)
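As noted above, the update learning rate is worth searching rather than guessing. One rough approach is to restart from the same saved weights with a few candidate rates and compare each on a held-out slice of the new data; a sketch, assuming the model was saved to 'old_model.keras' as in the earlier sketch:

from sklearn.model_selection import train_test_split
from tensorflow.keras.models import load_model
from tensorflow.keras.optimizers import SGD

# hold out part of the new data to score each candidate rate (an illustrative split)
X_fit, X_val, y_fit, y_val = train_test_split(X_new, y_new, test_size=0.3, random_state=1)
for lr in [0.01, 0.001, 0.0001]:
    # restart from the same old-data weights each time for a fair comparison
    candidate = load_model('old_model.keras')
    candidate.compile(optimizer=SGD(learning_rate=lr, momentum=0.9), loss='binary_crossentropy')
    candidate.fit(X_fit, y_fit, epochs=100, batch_size=32, verbose=0)
    print('lr=%.4f, validation loss=%.4f' % (lr, candidate.evaluate(X_val, y_val, verbose=0)))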
Tying this together, the complete example of updating a neural network model on new data only is listed below.
# update neural network with new data only
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
# define dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=1)
# record the number of input features in the data
n_features = X.shape[1]
# split into old and new data
X_old, X_new, y_old, y_new = train_test_split(X, y, test_size=0.50, random_state=1)
# define the model
model = Sequential()
model.add(Dense(20, kernel_initializer='he_normal', activation='relu', input_dim=n_features))
model.add(Dense(10, kernel_initializer='he_normal', activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# define the optimization algorithm
opt = SGD(learning_rate=0.01, momentum=0.9)
# compile the model
model.compile(optimizer=opt, loss='binary_crossentropy')
# fit the model on old data
model.fit(X_old, y_old, epochs=150, batch_size=32, verbose=0)
# save model...
# load model...
# update model on new data only with a smaller learning rate
opt = SGD(learning_rate=0.001, momentum=0.9)
# compile the model
model.compile(optimizer=opt, loss='binary_crossentropy')
# fit the model on new data
model.fit(X_new, y_new, epochs=100, batch_size=32, verbose=0)
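The complete example updates the model but never scores it. For a quick sanity check, you could hold back part of the new data and compare classification accuracy before and after the update, replacing the final fit on X_new above; a minimal sketch (the split size and the 0.5 threshold are illustrative choices):

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# hold back part of the new data for evaluation (an illustrative split)
X_upd, X_test, y_upd, y_test = train_test_split(X_new, y_new, test_size=0.3, random_state=1)

def holdout_accuracy(m):
    # turn predicted probabilities into crisp class labels at a 0.5 threshold
    yhat = (m.predict(X_test, verbose=0) > 0.5).astype(int).flatten()
    return accuracy_score(y_test, yhat)

print('accuracy before update: %.3f' % holdout_accuracy(model))
# update on the held-in portion only, using the smaller learning rate compiled above
model.fit(X_upd, y_upd, epochs=100, batch_size=32, verbose=0)
print('accuracy after update: %.3f' % holdout_accuracy(model))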