Introduction

This week, we are going to replicate the result published on the arXiv 1805.00794 (<a href ="https://arxiv.org/pdf/1805.00794.pdf"> link </a>). They provides on Kaggle both datasets already prepared so nearly only the model has to be trained.

Let's start by loading data !

Loading data

In [1]:
import math
import random
import pickle
import itertools

from keras.models import load_model

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, label_ranking_average_precision_score, label_ranking_loss, coverage_error 
import pandas as pd
import numpy as np

from sklearn.utils import shuffle

from scipy.signal import resample

import matplotlib.pyplot as plt

np.random.seed(42)
C:\python36\envs\machine_learning\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
In [38]:
df = pd.read_csv("F:/data/heartbeat/mitbih_train.csv", header=None)
df2 = pd.read_csv("F:/data/heartbeat/mitbih_test.csv", header=None)
In [39]:
df = pd.concat([df, df2], axis=0)
In [40]:
df.head()
Out[40]:
0 1 2 3 4 5 6 7 8 9 ... 178 179 180 181 182 183 184 185 186 187
0 0.977941 0.926471 0.681373 0.245098 0.154412 0.191176 0.151961 0.085784 0.058824 0.049020 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 0.960114 0.863248 0.461538 0.196581 0.094017 0.125356 0.099715 0.088319 0.074074 0.082621 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 1.000000 0.659459 0.186486 0.070270 0.070270 0.059459 0.056757 0.043243 0.054054 0.045946 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 0.925414 0.665746 0.541436 0.276243 0.196133 0.077348 0.071823 0.060773 0.066298 0.058011 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.967136 1.000000 0.830986 0.586854 0.356808 0.248826 0.145540 0.089202 0.117371 0.150235 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

5 rows × 188 columns

In [41]:
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 109446 entries, 0 to 21891
Columns: 188 entries, 0 to 187
dtypes: float64(188)
memory usage: 157.8 MB

So me know that the first 187 features are time-padded heartbeat and the last column is the class. We can check the dataset balance.

In [42]:
df[187].value_counts()
Out[42]:
0.0    90589
4.0     8039
2.0     7236
1.0     2779
3.0      803
Name: 187, dtype: int64

The dataset is quite unbalanced which is quite logical as we have less people sick than healthy. The most complex one remain the 4th category which is very low. We will try to augment it afterward

In [43]:
M = df.as_matrix()
In [44]:
X = M[:, :-1]
y = M[:, -1].astype(int)
In [45]:
del df
del df2
del M
In [46]:
C0 = np.argwhere(y == 0).flatten()
C1 = np.argwhere(y == 1).flatten()
C2 = np.argwhere(y == 2).flatten()
C3 = np.argwhere(y == 3).flatten()
C4 = np.argwhere(y == 4).flatten()

Let's plot a beat for every kind of heartbeat :

In [47]:
x = np.arange(0, 187)*8/1000

plt.figure(figsize=(20,12))
plt.plot(x, X[C0, :][0], label="Cat. N")
plt.plot(x, X[C1, :][0], label="Cat. S")
plt.plot(x, X[C2, :][0], label="Cat. V")
plt.plot(x, X[C3, :][0], label="Cat. F")
plt.plot(x, X[C4, :][0], label="Cat. Q")
plt.legend()
plt.title("1-beat ECG for every category", fontsize=20)
plt.ylabel("Amplitude", fontsize=15)
plt.xlabel("Time (ms)", fontsize=15)
plt.show()

I'm not doctor so I won't comment it a lot but we can see that the category the risk is mainly related to the height of the middle area. The higher it is, the less amplitude the hearth have so the more risks there is "I guess"

Data augmentation

To train properly the model, we sould have to augment all data to the same level. Nevertheless, for a first try, we will just augment the smallest class to the same level as class 1. With that we will be able to have a test set of around 5x800 observations. For the augmentation, we will stretch the signal in time and in amplitude too. For amplitude, the factor will depend on the current amplitude of the signal to keep the max at 1. As we saw previously, we should also not amplify it a lot to not change the category.

For the time, a small stretching factor is apply and signal is re-padded to 187 steps

In [48]:
def stretch(x):
    l = int(187 * (1 + (random.random()-0.5)/3))
    y = resample(x, l)
    if l < 187:
        y_ = np.zeros(shape=(187, ))
        y_[:l] = y
    else:
        y_ = y[:187]
    return y_
In [49]:
def amplify(x):
    alpha = (random.random()-0.5)
    factor = -alpha*x + (1+alpha)
    return x*factor
In [50]:
def test(x):
    result = np.zeros(shape= (3, 187))
    for i in range(3):   # we need to multiply each input by around 4 to reach a correct amount
        if random.random() < 0.33:
            new_y = stretch(x)
        elif random.random() < 0.66:
            new_y = amplify(x)
        else:
            new_y = stretch(x)
            new_y = amplify(new_y)
        result[i, :] = new_y
    return result
In [51]:
plt.plot(X[0, :])
plt.plot(amplify(X[0, :]))
plt.plot(stretch(X[0, :]))
plt.show()
In [52]:
result = np.apply_along_axis(test, axis=1, arr=X[C3]).reshape(-1, 187)
In [53]:
classe = np.ones(shape=(result.shape[0],), dtype=int)*3

Now let's add those augmented datas to the dataset

In [ ]:
X = np.vstack([X, result])
y = np.hstack([y, classe])

Split

As our dataset is unbalanced, we will use only a subset of same quantity of each class. This will be more accurate.

In [55]:
subC0 = np.random.choice(C0, 800)
subC1 = np.random.choice(C1, 800)
subC2 = np.random.choice(C2, 800)
subC3 = np.random.choice(C3, 800)
subC4 = np.random.choice(C4, 800)
In [56]:
X_test = np.vstack([X[subC0], X[subC1], X[subC2], X[subC3], X[subC4]])
y_test = np.hstack([y[subC0], y[subC1], y[subC2], y[subC3], y[subC4]])
In [57]:
X_train = np.delete(X, [subC0, subC1, subC2, subC3, subC4], axis=0)
y_train = np.delete(y, [subC0, subC1, subC2, subC3, subC4], axis=0)
In [58]:
X_train, y_train = shuffle(X_train, y_train, random_state=0)
X_test, y_test = shuffle(X_test, y_test, random_state=0)
In [59]:
del X
del y
In [60]:
X_train = np.expand_dims(X_train, 2)
X_test = np.expand_dims(X_test, 2)
In [61]:
print("X_train", X_train.shape)
print("y_train", y_train.shape)
print("X_test", X_test.shape)
print("y_test", y_test.shape)
X_train (109147, 187, 1)
y_train (109147,)
X_test (4000, 187, 1)
y_test (4000,)
In [62]:
np.save("F:/data/heartbeat/X_train.npy", X_train)
np.save("F:/data/heartbeat/y_train.npy", y_train)
np.save("F:/data/heartbeat/X_test.npy", X_test)
np.save("F:/data/heartbeat/y_test.npy", y_test)

Model

In [15]:
import numpy as np
import pickle
import math
from sklearn.preprocessing import OneHotEncoder

from keras.models import Model
from keras.layers import Input, Dense, Conv1D, MaxPooling1D, Softmax, Add, Flatten, Activation# , Dropout
from keras import backend as K
from keras.optimizers import Adam
from keras.callbacks import LearningRateScheduler, ModelCheckpoint
In [18]:
X_train = np.load("F:/data/heartbeat/X_train.npy")
y_train = np.load("F:/data/heartbeat/y_train.npy")
X_test = np.load("F:/data/heartbeat/X_test.npy")
y_test = np.load("F:/data/heartbeat/y_test.npy")
In [21]:
ohe = OneHotEncoder()
y_train = ohe.fit_transform(y_train.reshape(-1,1))
y_test = ohe.transform(y_test.reshape(-1,1))
In [4]:
print("X_train", X_train.shape)
print("y_train", y_train.shape)
print("X_test", X_test.shape)
print("y_test", y_test.shape)
X_train (109147, 187, 1)
y_train (109147, 5)
X_test (4000, 187, 1)
y_test (4000, 5)
In [5]:
n_obs, feature, depth = X_train.shape
batch_size = 500

The model is explained in part III A of the document

In [6]:
K.clear_session()

inp = Input(shape=(feature, depth))
C = Conv1D(filters=32, kernel_size=5, strides=1)(inp)

C11 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(C)
A11 = Activation("relu")(C11)
C12 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(A11)
S11 = Add()([C12, C])
A12 = Activation("relu")(S11)
M11 = MaxPooling1D(pool_size=5, strides=2)(A12)


C21 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(M11)
A21 = Activation("relu")(C21)
C22 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(A21)
S21 = Add()([C22, M11])
A22 = Activation("relu")(S11)
M21 = MaxPooling1D(pool_size=5, strides=2)(A22)


C31 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(M21)
A31 = Activation("relu")(C31)
C32 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(A31)
S31 = Add()([C32, M21])
A32 = Activation("relu")(S31)
M31 = MaxPooling1D(pool_size=5, strides=2)(A32)


C41 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(M31)
A41 = Activation("relu")(C41)
C42 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(A41)
S41 = Add()([C42, M31])
A42 = Activation("relu")(S41)
M41 = MaxPooling1D(pool_size=5, strides=2)(A42)


C51 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(M41)
A51 = Activation("relu")(C51)
C52 = Conv1D(filters=32, kernel_size=5, strides=1, padding='same')(A51)
S51 = Add()([C52, M41])
A52 = Activation("relu")(S51)
M51 = MaxPooling1D(pool_size=5, strides=2)(A52)

F1 = Flatten()(M51)

D1 = Dense(32)(F1)
A6 = Activation("relu")(D1)
D2 = Dense(32)(A6)
D3 = Dense(5)(D2)
A7 = Softmax()(D3)

model = Model(inputs=inp, outputs=A7)

model.summary()
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 187, 1)       0                                            
__________________________________________________________________________________________________
conv1d_1 (Conv1D)               (None, 183, 32)      192         input_1[0][0]                    
__________________________________________________________________________________________________
conv1d_2 (Conv1D)               (None, 183, 32)      5152        conv1d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 183, 32)      0           conv1d_2[0][0]                   
__________________________________________________________________________________________________
conv1d_3 (Conv1D)               (None, 183, 32)      5152        activation_1[0][0]               
__________________________________________________________________________________________________
add_1 (Add)                     (None, 183, 32)      0           conv1d_3[0][0]                   
                                                                 conv1d_1[0][0]                   
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 183, 32)      0           add_1[0][0]                      
__________________________________________________________________________________________________
max_pooling1d_2 (MaxPooling1D)  (None, 90, 32)       0           activation_4[0][0]               
__________________________________________________________________________________________________
conv1d_6 (Conv1D)               (None, 90, 32)       5152        max_pooling1d_2[0][0]            
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 90, 32)       0           conv1d_6[0][0]                   
__________________________________________________________________________________________________
conv1d_7 (Conv1D)               (None, 90, 32)       5152        activation_5[0][0]               
__________________________________________________________________________________________________
add_3 (Add)                     (None, 90, 32)       0           conv1d_7[0][0]                   
                                                                 max_pooling1d_2[0][0]            
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 90, 32)       0           add_3[0][0]                      
__________________________________________________________________________________________________
max_pooling1d_3 (MaxPooling1D)  (None, 43, 32)       0           activation_6[0][0]               
__________________________________________________________________________________________________
conv1d_8 (Conv1D)               (None, 43, 32)       5152        max_pooling1d_3[0][0]            
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 43, 32)       0           conv1d_8[0][0]                   
__________________________________________________________________________________________________
conv1d_9 (Conv1D)               (None, 43, 32)       5152        activation_7[0][0]               
__________________________________________________________________________________________________
add_4 (Add)                     (None, 43, 32)       0           conv1d_9[0][0]                   
                                                                 max_pooling1d_3[0][0]            
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 43, 32)       0           add_4[0][0]                      
__________________________________________________________________________________________________
max_pooling1d_4 (MaxPooling1D)  (None, 20, 32)       0           activation_8[0][0]               
__________________________________________________________________________________________________
conv1d_10 (Conv1D)              (None, 20, 32)       5152        max_pooling1d_4[0][0]            
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 20, 32)       0           conv1d_10[0][0]                  
__________________________________________________________________________________________________
conv1d_11 (Conv1D)              (None, 20, 32)       5152        activation_9[0][0]               
__________________________________________________________________________________________________
add_5 (Add)                     (None, 20, 32)       0           conv1d_11[0][0]                  
                                                                 max_pooling1d_4[0][0]            
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 20, 32)       0           add_5[0][0]                      
__________________________________________________________________________________________________
max_pooling1d_5 (MaxPooling1D)  (None, 8, 32)        0           activation_10[0][0]              
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 256)          0           max_pooling1d_5[0][0]            
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 32)           8224        flatten_1[0][0]                  
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 32)           0           dense_1[0][0]                    
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 32)           1056        activation_11[0][0]              
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 5)            165         dense_2[0][0]                    
__________________________________________________________________________________________________
softmax_1 (Softmax)             (None, 5)            0           dense_3[0][0]                    
==================================================================================================
Total params: 50,853
Trainable params: 50,853
Non-trainable params: 0
__________________________________________________________________________________________________

Based also on their document:

Learning rate is decayed exponentially with the decay factor of 0.75 every 10000 iterations

In [7]:
def exp_decay(epoch):
    initial_lrate = 0.001
    k = 0.75
    t = n_obs//(10000 * batch_size)  # every epoch we do n_obs/batch_size iteration
    lrate = initial_lrate * math.exp(-k*t)
    return lrate

lrate = LearningRateScheduler(exp_decay)

Based also on their document:

For training the networks, we used Adam optimization method [22] with the learning rate, beta-1, and beta-2 of 0.001, 0.9, and 0.999, respectively

In [8]:
adam = Adam(lr = 0.001, beta_1 = 0.9, beta_2 = 0.999)

This is from me to save the best model

In [9]:
ckpt = ModelCheckpoint("model", 
                       monitor='val_loss', 
                       verbose=0, 
                       save_best_only=False, 
                       save_weights_only=False, 
                       mode='min', 
                       period=1)

Based also on their document:

Cross entropy loss on the softmax outputs is used as the loss function

In [10]:
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
In [11]:
history = model.fit(X_train, y_train, 
                    epochs=150, 
                    batch_size=batch_size, 
                    verbose=2, 
                    validation_data=(X_test, y_test), 
                    callbacks=[lrate, ckpt])
Train on 109147 samples, validate on 4000 samples
Epoch 1/150
 - 6s - loss: 0.3812 - acc: 0.8870 - val_loss: 0.8361 - val_acc: 0.7680
Epoch 2/150
 - 4s - loss: 0.1322 - acc: 0.9641 - val_loss: 0.6263 - val_acc: 0.8320
Epoch 3/150
 - 4s - loss: 0.0944 - acc: 0.9738 - val_loss: 0.5435 - val_acc: 0.8308
Epoch 4/150
 - 4s - loss: 0.0780 - acc: 0.9780 - val_loss: 0.4927 - val_acc: 0.8637
Epoch 5/150
 - 4s - loss: 0.0694 - acc: 0.9806 - val_loss: 0.3962 - val_acc: 0.8758
Epoch 6/150
 - 4s - loss: 0.0602 - acc: 0.9824 - val_loss: 0.3434 - val_acc: 0.8910
Epoch 7/150
 - 4s - loss: 0.0528 - acc: 0.9845 - val_loss: 0.4123 - val_acc: 0.8712
Epoch 8/150
 - 4s - loss: 0.0488 - acc: 0.9853 - val_loss: 0.3270 - val_acc: 0.8955
Epoch 9/150
 - 4s - loss: 0.0464 - acc: 0.9862 - val_loss: 0.4620 - val_acc: 0.8725
Epoch 10/150
 - 4s - loss: 0.0430 - acc: 0.9873 - val_loss: 0.2814 - val_acc: 0.9088
Epoch 11/150
 - 4s - loss: 0.0406 - acc: 0.9875 - val_loss: 0.3313 - val_acc: 0.9015
Epoch 12/150
 - 4s - loss: 0.0388 - acc: 0.9882 - val_loss: 0.2301 - val_acc: 0.9255
Epoch 13/150
 - 4s - loss: 0.0352 - acc: 0.9892 - val_loss: 0.2745 - val_acc: 0.9092
Epoch 14/150
 - 4s - loss: 0.0345 - acc: 0.9895 - val_loss: 0.1951 - val_acc: 0.9350
Epoch 15/150
 - 4s - loss: 0.0326 - acc: 0.9900 - val_loss: 0.2163 - val_acc: 0.9293
Epoch 16/150
 - 4s - loss: 0.0315 - acc: 0.9900 - val_loss: 0.2371 - val_acc: 0.9220
Epoch 17/150
 - 4s - loss: 0.0296 - acc: 0.9907 - val_loss: 0.2490 - val_acc: 0.9240
Epoch 18/150
 - 4s - loss: 0.0285 - acc: 0.9907 - val_loss: 0.2171 - val_acc: 0.9340
Epoch 19/150
 - 4s - loss: 0.0272 - acc: 0.9913 - val_loss: 0.3188 - val_acc: 0.8995
Epoch 20/150
 - 4s - loss: 0.0267 - acc: 0.9915 - val_loss: 0.1955 - val_acc: 0.9437
Epoch 21/150
 - 4s - loss: 0.0262 - acc: 0.9913 - val_loss: 0.1665 - val_acc: 0.9510
Epoch 22/150
 - 4s - loss: 0.0233 - acc: 0.9922 - val_loss: 0.2662 - val_acc: 0.9245
Epoch 23/150
 - 4s - loss: 0.0221 - acc: 0.9925 - val_loss: 0.2021 - val_acc: 0.9427
Epoch 24/150
 - 4s - loss: 0.0220 - acc: 0.9928 - val_loss: 0.2110 - val_acc: 0.9422
Epoch 25/150
 - 4s - loss: 0.0210 - acc: 0.9931 - val_loss: 0.1901 - val_acc: 0.9470
Epoch 26/150
 - 4s - loss: 0.0207 - acc: 0.9932 - val_loss: 0.2179 - val_acc: 0.9358
Epoch 27/150
 - 4s - loss: 0.0196 - acc: 0.9934 - val_loss: 0.1992 - val_acc: 0.9500
Epoch 28/150
 - 4s - loss: 0.0205 - acc: 0.9931 - val_loss: 0.2357 - val_acc: 0.9302
Epoch 29/150
 - 4s - loss: 0.0207 - acc: 0.9928 - val_loss: 0.3241 - val_acc: 0.9150
Epoch 30/150
 - 4s - loss: 0.0174 - acc: 0.9941 - val_loss: 0.2219 - val_acc: 0.9448
Epoch 31/150
 - 4s - loss: 0.0177 - acc: 0.9942 - val_loss: 0.1965 - val_acc: 0.9497
Epoch 32/150
 - 4s - loss: 0.0172 - acc: 0.9940 - val_loss: 0.2576 - val_acc: 0.9265
Epoch 33/150
 - 4s - loss: 0.0163 - acc: 0.9943 - val_loss: 0.2884 - val_acc: 0.9317
Epoch 34/150
 - 4s - loss: 0.0160 - acc: 0.9948 - val_loss: 0.2286 - val_acc: 0.9453
Epoch 35/150
 - 4s - loss: 0.0176 - acc: 0.9941 - val_loss: 0.2666 - val_acc: 0.9380
Epoch 36/150
 - 4s - loss: 0.0144 - acc: 0.9951 - val_loss: 0.2572 - val_acc: 0.9395
Epoch 37/150
 - 4s - loss: 0.0148 - acc: 0.9948 - val_loss: 0.2831 - val_acc: 0.9330
Epoch 38/150
 - 4s - loss: 0.0141 - acc: 0.9950 - val_loss: 0.1942 - val_acc: 0.9548
Epoch 39/150
 - 4s - loss: 0.0147 - acc: 0.9950 - val_loss: 0.2967 - val_acc: 0.9230
Epoch 40/150
 - 4s - loss: 0.0140 - acc: 0.9950 - val_loss: 0.1988 - val_acc: 0.9522
Epoch 41/150
 - 4s - loss: 0.0132 - acc: 0.9955 - val_loss: 0.1825 - val_acc: 0.9540
Epoch 42/150
 - 4s - loss: 0.0151 - acc: 0.9950 - val_loss: 0.2994 - val_acc: 0.9245
Epoch 43/150
 - 4s - loss: 0.0126 - acc: 0.9959 - val_loss: 0.2870 - val_acc: 0.9220
Epoch 44/150
 - 4s - loss: 0.0126 - acc: 0.9956 - val_loss: 0.2601 - val_acc: 0.9415
Epoch 45/150
 - 4s - loss: 0.0141 - acc: 0.9948 - val_loss: 0.2624 - val_acc: 0.9385
Epoch 46/150
 - 4s - loss: 0.0126 - acc: 0.9956 - val_loss: 0.3122 - val_acc: 0.9363
Epoch 47/150
 - 4s - loss: 0.0101 - acc: 0.9965 - val_loss: 0.2568 - val_acc: 0.9495
Epoch 48/150
 - 4s - loss: 0.0109 - acc: 0.9961 - val_loss: 0.3270 - val_acc: 0.9335
Epoch 49/150
 - 4s - loss: 0.0111 - acc: 0.9960 - val_loss: 0.2800 - val_acc: 0.9443
Epoch 50/150
 - 4s - loss: 0.0137 - acc: 0.9950 - val_loss: 0.2434 - val_acc: 0.9457
Epoch 51/150
 - 4s - loss: 0.0097 - acc: 0.9967 - val_loss: 0.2682 - val_acc: 0.9460
Epoch 52/150
 - 4s - loss: 0.0121 - acc: 0.9958 - val_loss: 0.2193 - val_acc: 0.9435
Epoch 53/150
 - 4s - loss: 0.0104 - acc: 0.9963 - val_loss: 0.2776 - val_acc: 0.9337
Epoch 54/150
 - 4s - loss: 0.0098 - acc: 0.9965 - val_loss: 0.3082 - val_acc: 0.9382
Epoch 55/150
 - 4s - loss: 0.0120 - acc: 0.9958 - val_loss: 0.2295 - val_acc: 0.9542
Epoch 56/150
 - 4s - loss: 0.0086 - acc: 0.9971 - val_loss: 0.3306 - val_acc: 0.9320
Epoch 57/150
 - 4s - loss: 0.0092 - acc: 0.9969 - val_loss: 0.3523 - val_acc: 0.9178
Epoch 58/150
 - 4s - loss: 0.0121 - acc: 0.9957 - val_loss: 0.2716 - val_acc: 0.9440
Epoch 59/150
 - 4s - loss: 0.0103 - acc: 0.9963 - val_loss: 0.2358 - val_acc: 0.9497
Epoch 60/150
 - 4s - loss: 0.0131 - acc: 0.9953 - val_loss: 0.2136 - val_acc: 0.9490
Epoch 61/150
 - 4s - loss: 0.0093 - acc: 0.9968 - val_loss: 0.2687 - val_acc: 0.9432
Epoch 62/150
 - 4s - loss: 0.0102 - acc: 0.9965 - val_loss: 0.3113 - val_acc: 0.9300
Epoch 63/150
 - 4s - loss: 0.0083 - acc: 0.9972 - val_loss: 0.2908 - val_acc: 0.9480
Epoch 64/150
 - 4s - loss: 0.0094 - acc: 0.9966 - val_loss: 0.2787 - val_acc: 0.9500
Epoch 65/150
 - 4s - loss: 0.0089 - acc: 0.9969 - val_loss: 0.2356 - val_acc: 0.9560
Epoch 66/150
 - 4s - loss: 0.0071 - acc: 0.9974 - val_loss: 0.2708 - val_acc: 0.9525
Epoch 67/150
 - 4s - loss: 0.0096 - acc: 0.9966 - val_loss: 0.3661 - val_acc: 0.9210
Epoch 68/150
 - 4s - loss: 0.0102 - acc: 0.9963 - val_loss: 0.2982 - val_acc: 0.9383
Epoch 69/150
 - 4s - loss: 0.0127 - acc: 0.9955 - val_loss: 0.3590 - val_acc: 0.9330
Epoch 70/150
 - 4s - loss: 0.0092 - acc: 0.9968 - val_loss: 0.4136 - val_acc: 0.9258
Epoch 71/150
 - 4s - loss: 0.0059 - acc: 0.9980 - val_loss: 0.3529 - val_acc: 0.9292
Epoch 72/150
 - 4s - loss: 0.0083 - acc: 0.9972 - val_loss: 0.2357 - val_acc: 0.9558
Epoch 73/150
 - 4s - loss: 0.0068 - acc: 0.9977 - val_loss: 0.2920 - val_acc: 0.9482
Epoch 74/150
 - 4s - loss: 0.0100 - acc: 0.9965 - val_loss: 0.3429 - val_acc: 0.9302
Epoch 75/150
 - 4s - loss: 0.0090 - acc: 0.9969 - val_loss: 0.2201 - val_acc: 0.9545
Epoch 76/150
 - 4s - loss: 0.0075 - acc: 0.9974 - val_loss: 0.3090 - val_acc: 0.9467
Epoch 77/150
 - 4s - loss: 0.0068 - acc: 0.9977 - val_loss: 0.3870 - val_acc: 0.9317
Epoch 78/150
 - 4s - loss: 0.0082 - acc: 0.9970 - val_loss: 0.2934 - val_acc: 0.9345
Epoch 79/150
 - 4s - loss: 0.0088 - acc: 0.9971 - val_loss: 0.2709 - val_acc: 0.9460
Epoch 80/150
 - 4s - loss: 0.0057 - acc: 0.9980 - val_loss: 0.2406 - val_acc: 0.9622
Epoch 81/150
 - 4s - loss: 0.0118 - acc: 0.9958 - val_loss: 0.4168 - val_acc: 0.9090
Epoch 82/150
 - 4s - loss: 0.0063 - acc: 0.9976 - val_loss: 0.3317 - val_acc: 0.9330
Epoch 83/150
 - 4s - loss: 0.0073 - acc: 0.9974 - val_loss: 0.3339 - val_acc: 0.9375
Epoch 84/150
 - 4s - loss: 0.0108 - acc: 0.9963 - val_loss: 0.3253 - val_acc: 0.9392
Epoch 85/150
 - 4s - loss: 0.0050 - acc: 0.9983 - val_loss: 0.3221 - val_acc: 0.9412
Epoch 86/150
 - 5s - loss: 0.0071 - acc: 0.9975 - val_loss: 0.3323 - val_acc: 0.9335
Epoch 87/150
 - 4s - loss: 0.0087 - acc: 0.9971 - val_loss: 0.3526 - val_acc: 0.9423
Epoch 88/150
 - 4s - loss: 0.0062 - acc: 0.9978 - val_loss: 0.4236 - val_acc: 0.9307
Epoch 89/150
 - 4s - loss: 0.0067 - acc: 0.9976 - val_loss: 0.5435 - val_acc: 0.9155
Epoch 90/150
 - 4s - loss: 0.0091 - acc: 0.9969 - val_loss: 0.2541 - val_acc: 0.9422
Epoch 91/150
 - 4s - loss: 0.0080 - acc: 0.9972 - val_loss: 0.3941 - val_acc: 0.9335
Epoch 92/150
 - 4s - loss: 0.0083 - acc: 0.9972 - val_loss: 0.2730 - val_acc: 0.9500
Epoch 93/150
 - 4s - loss: 0.0073 - acc: 0.9976 - val_loss: 0.2838 - val_acc: 0.9517
Epoch 94/150
 - 4s - loss: 0.0052 - acc: 0.9982 - val_loss: 0.3408 - val_acc: 0.9445
Epoch 95/150
 - 4s - loss: 0.0072 - acc: 0.9975 - val_loss: 0.3400 - val_acc: 0.9397
Epoch 96/150
 - 4s - loss: 0.0056 - acc: 0.9980 - val_loss: 0.3462 - val_acc: 0.9435
Epoch 97/150
 - 4s - loss: 0.0066 - acc: 0.9976 - val_loss: 0.3263 - val_acc: 0.9425
Epoch 98/150
 - 4s - loss: 0.0058 - acc: 0.9979 - val_loss: 0.2820 - val_acc: 0.9505
Epoch 99/150
 - 4s - loss: 0.0069 - acc: 0.9975 - val_loss: 0.2997 - val_acc: 0.9407
Epoch 100/150
 - 4s - loss: 0.0063 - acc: 0.9978 - val_loss: 0.3760 - val_acc: 0.9457
Epoch 101/150
 - 4s - loss: 0.0071 - acc: 0.9975 - val_loss: 0.3127 - val_acc: 0.9437
Epoch 102/150
 - 4s - loss: 0.0082 - acc: 0.9972 - val_loss: 0.2972 - val_acc: 0.9395
Epoch 103/150
 - 4s - loss: 0.0069 - acc: 0.9975 - val_loss: 0.3728 - val_acc: 0.9312
Epoch 104/150
 - 4s - loss: 0.0068 - acc: 0.9976 - val_loss: 0.3095 - val_acc: 0.9420
Epoch 105/150
 - 4s - loss: 0.0061 - acc: 0.9980 - val_loss: 0.2703 - val_acc: 0.9490
Epoch 106/150
 - 4s - loss: 0.0039 - acc: 0.9985 - val_loss: 0.4268 - val_acc: 0.9322
Epoch 107/150
 - 4s - loss: 0.0083 - acc: 0.9974 - val_loss: 0.2772 - val_acc: 0.9445
Epoch 108/150
 - 4s - loss: 0.0044 - acc: 0.9985 - val_loss: 0.3340 - val_acc: 0.9445
Epoch 109/150
 - 4s - loss: 0.0084 - acc: 0.9972 - val_loss: 0.3204 - val_acc: 0.9422
Epoch 110/150
 - 4s - loss: 0.0042 - acc: 0.9985 - val_loss: 0.3161 - val_acc: 0.9457
Epoch 111/150
 - 4s - loss: 0.0075 - acc: 0.9974 - val_loss: 0.3228 - val_acc: 0.9403
Epoch 112/150
 - 4s - loss: 0.0050 - acc: 0.9982 - val_loss: 0.3024 - val_acc: 0.9520
Epoch 113/150
 - 4s - loss: 0.0037 - acc: 0.9987 - val_loss: 0.3301 - val_acc: 0.9467
Epoch 114/150
 - 4s - loss: 0.0083 - acc: 0.9973 - val_loss: 0.3013 - val_acc: 0.9500
Epoch 115/150
 - 4s - loss: 0.0059 - acc: 0.9979 - val_loss: 0.3353 - val_acc: 0.9450
Epoch 116/150
 - 4s - loss: 0.0066 - acc: 0.9976 - val_loss: 0.2638 - val_acc: 0.9545
Epoch 117/150
 - 4s - loss: 0.0056 - acc: 0.9982 - val_loss: 0.3455 - val_acc: 0.9457
Epoch 118/150
 - 4s - loss: 0.0031 - acc: 0.9990 - val_loss: 0.3422 - val_acc: 0.9462
Epoch 119/150
 - 4s - loss: 0.0047 - acc: 0.9984 - val_loss: 0.3007 - val_acc: 0.9480
Epoch 120/150
 - 4s - loss: 0.0063 - acc: 0.9978 - val_loss: 0.4240 - val_acc: 0.9307
Epoch 121/150
 - 4s - loss: 0.0074 - acc: 0.9974 - val_loss: 0.3499 - val_acc: 0.9373
Epoch 122/150
 - 4s - loss: 0.0061 - acc: 0.9979 - val_loss: 0.3304 - val_acc: 0.9440
Epoch 123/150
 - 4s - loss: 0.0039 - acc: 0.9985 - val_loss: 0.3128 - val_acc: 0.9457
Epoch 124/150
 - 4s - loss: 0.0064 - acc: 0.9978 - val_loss: 0.3663 - val_acc: 0.9395
Epoch 125/150
 - 4s - loss: 0.0028 - acc: 0.9991 - val_loss: 0.3032 - val_acc: 0.9468
Epoch 126/150
 - 4s - loss: 0.0086 - acc: 0.9973 - val_loss: 0.3432 - val_acc: 0.9437
Epoch 127/150
 - 5s - loss: 0.0049 - acc: 0.9985 - val_loss: 0.3079 - val_acc: 0.9438
Epoch 128/150
 - 4s - loss: 0.0030 - acc: 0.9990 - val_loss: 0.3106 - val_acc: 0.9492
Epoch 129/150
 - 4s - loss: 0.0054 - acc: 0.9982 - val_loss: 0.3203 - val_acc: 0.9482
Epoch 130/150
 - 4s - loss: 0.0060 - acc: 0.9980 - val_loss: 0.3817 - val_acc: 0.9360
Epoch 131/150
 - 4s - loss: 0.0048 - acc: 0.9984 - val_loss: 0.3175 - val_acc: 0.9442
Epoch 132/150
 - 4s - loss: 0.0049 - acc: 0.9984 - val_loss: 0.3039 - val_acc: 0.9510
Epoch 133/150
 - 4s - loss: 0.0056 - acc: 0.9981 - val_loss: 0.3595 - val_acc: 0.9302
Epoch 134/150
 - 4s - loss: 0.0078 - acc: 0.9973 - val_loss: 0.3876 - val_acc: 0.9345
Epoch 135/150
 - 4s - loss: 0.0041 - acc: 0.9986 - val_loss: 0.4704 - val_acc: 0.9190
Epoch 136/150
 - 4s - loss: 0.0052 - acc: 0.9981 - val_loss: 0.3952 - val_acc: 0.9342
Epoch 137/150
 - 4s - loss: 0.0060 - acc: 0.9981 - val_loss: 0.2508 - val_acc: 0.9540
Epoch 138/150
 - 4s - loss: 0.0028 - acc: 0.9990 - val_loss: 0.4105 - val_acc: 0.9383
Epoch 139/150
 - 4s - loss: 0.0060 - acc: 0.9980 - val_loss: 0.2960 - val_acc: 0.9468
Epoch 140/150
 - 4s - loss: 0.0056 - acc: 0.9980 - val_loss: 0.4142 - val_acc: 0.9425
Epoch 141/150
 - 4s - loss: 0.0043 - acc: 0.9985 - val_loss: 0.4132 - val_acc: 0.9410
Epoch 142/150
 - 4s - loss: 0.0035 - acc: 0.9988 - val_loss: 0.3921 - val_acc: 0.9417
Epoch 143/150
 - 4s - loss: 0.0054 - acc: 0.9982 - val_loss: 0.3563 - val_acc: 0.9385
Epoch 144/150
 - 4s - loss: 0.0087 - acc: 0.9973 - val_loss: 0.3450 - val_acc: 0.9432
Epoch 145/150
 - 4s - loss: 0.0044 - acc: 0.9984 - val_loss: 0.3892 - val_acc: 0.9368
Epoch 146/150
 - 4s - loss: 0.0037 - acc: 0.9988 - val_loss: 0.3887 - val_acc: 0.9462
Epoch 147/150
 - 5s - loss: 0.0033 - acc: 0.9988 - val_loss: 0.3902 - val_acc: 0.9422
Epoch 148/150
 - 5s - loss: 0.0056 - acc: 0.9982 - val_loss: 0.3856 - val_acc: 0.9352
Epoch 149/150
 - 4s - loss: 0.0051 - acc: 0.9982 - val_loss: 0.3408 - val_acc: 0.9450
Epoch 150/150
 - 4s - loss: 0.0037 - acc: 0.9987 - val_loss: 0.3584 - val_acc: 0.9452
In [12]:
with open("result_full.pkl", "wb") as f:
    pickle.dump(history.history, f)

Result Training

Now we trained the model on lot less epoch that what they did because we have quickly overfitting. This is visible by the trend of the test loss. I don't know why they don't had it on their test. In my case, i needed 4s/Epoch on a GTX1070 and they let it ran for 2+hours on a GTX1080 Ti so more than 2k Epochs

In [3]:
with open("result_full.pkl", "rb") as f:
    history = pickle.load(f)
In [15]:
plt.figure(figsize=(20,12))
plt.plot(history["acc"], label = "Training Accuracy")
plt.plot(history["val_acc"], label = "Test Accuracy")
plt.plot(history["loss"], label = "Training Loss")
plt.plot(history["val_loss"], label = "Training Loss")
plt.vlines(20, 0, 1, label = "Early Stop")
plt.ylim(0, 1)
plt.legend()
plt.xlabel("Epochs", fontsize = 15)
plt.title("Result of training", fontsize= 20)
plt.show()

Evaluation

Now let's look at result but still with a critic eye as we still use the test set which was used for the early stop...

In [16]:
model = load_model("model")
C:\python36\envs\machine_learning\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
In [22]:
y_pred = model.predict(X_test, batch_size=1000)
In [24]:
print(classification_report(y_test.argmax(axis=1), y_pred.argmax(axis=1)))
             precision    recall  f1-score   support

          0       0.83      1.00      0.91       800
          1       0.99      0.85      0.92       800
          2       0.94      0.97      0.96       800
          3       0.99      0.92      0.95       800
          4       1.00      0.99      0.99       800

avg / total       0.95      0.95      0.95      4000

In [40]:
print("ranking-based average precision : {:.3f}".format(label_ranking_average_precision_score(y_test.todense(), y_pred)))
print("Ranking loss : {:.3f}".format(label_ranking_loss(y_test.todense(), y_pred)))
print("Coverage_error : {:.3f}".format(coverage_error(y_test.todense(), y_pred)))
ranking-based average precision : 0.971
Ranking loss : 0.017
Coverage_error : 1.068
In [37]:
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

# Compute confusion matrix
cnf_matrix = confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
np.set_printoptions(precision=2)

# Plot non-normalized confusion matrix
plt.figure(figsize=(10, 10))
plot_confusion_matrix(cnf_matrix, classes=['N', 'S', 'V', 'F', 'Q'],
                      title='Confusion matrix, without normalization')
plt.show()
Confusion matrix, without normalization

The accuracy is close to the one from the paper but we have a lot more confusion between category N and S (12% vs 8%) for them.

Unsupervised learning on learned features

They also used their model as feature extractor with after a TSNE optimisation to visualize it. We can do the same but to avoid overfitting, use the second dataset (even if we don't have the same class)

In [7]:
df = pd.read_csv("F:/data/heartbeat/ptbdb_normal.csv", header=None)
df2 = pd.read_csv("F:/data/heartbeat/ptbdb_abnormal.csv", header=None)
df = pd.concat([df, df2], axis=0)
In [12]:
model = load_model("model")
In [8]:
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 14552 entries, 0 to 10505
Columns: 188 entries, 0 to 187
dtypes: float64(188)
memory usage: 21.0 MB
In [9]:
M = df.as_matrix()
X = M[:, :-1]
y = M[:, -1].astype(int)

X = np.expand_dims(X, 2)
In [10]:
del df
del M
In [16]:
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer("flatten_1").output)
In [19]:
y_latent = intermediate_layer_model.predict(X, batch_size=150)
y_label = model.predict(X, batch_size=150)
In [22]:
np.save("F:/data/heartbeat/ylatent.npy", y_latent)
np.save("F:/data/heartbeat/y_label.npy", np.argmax(y_label, axis=1))
In [27]:
from sklearn.manifold import TSNE

tsne = TSNE(n_components=2)
y_embedded = tsne.fit_transform(y_latent)

If we plot the result of TSNE on the second dataset with predicted classes of the model initial. We can see that we nearly only find heartbeat without issues even if there is several ones

In [28]:
plt.figure(figsize=(20, 12))
plt.scatter(y_embedded[:, 0], y_embedded[:, 1], c=np.argmax(y_label, axis=1))
plt.show()