Keras Learning Notes
References: Keras documentation (Chinese), Keras documentation (English), Keras tutorials, W3CSCHOOL, TensorFlow Chinese community.
Sequential
Define a Sequential model:

```python
from keras.models import Sequential

model = Sequential()
```
Use `add` to stack layers:

```python
from keras.layers import Dense

model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))
```
- `activation`: the activation function.
- `kernel_regularizer` and `bias_regularizer`: regularization applied to the layer's weights and biases.
- `kernel_initializer` and `bias_initializer`: how the weights and biases are initialized when the layer is created; the default is Glorot uniform.
```python
# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
layers.Dense(64, kernel_regularizer=keras.regularizers.l1(0.01))

# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
layers.Dense(64, bias_regularizer=keras.regularizers.l2(0.01))

# A linear layer with a kernel initialized to a random orthogonal matrix:
layers.Dense(64, kernel_initializer='orthogonal')

# A linear layer with a bias vector initialized to 2.0s:
layers.Dense(64, bias_initializer=keras.initializers.constant(2.0))
```
Use `compile` to configure the learning process:

```python
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
```

Configure the optimizer:
```python
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(lr=0.01, momentum=0.9, nesterov=True))
```

Train on data:
```python
# x_train and y_train are Numpy arrays
model.fit(x_train, y_train, epochs=5, batch_size=32)
```

Evaluate the model:
```python
loss_and_metrics = model.evaluate(x_test, y_test, batch_size=128)
```
Predict:

```python
classes = model.predict(x_test, batch_size=128)
```
Functional API
- Can be used to define more complex models.
- Layers are callable and return a tensor.
- Input tensors and output tensors are used to define a `tf.keras.Model` instance. Training works the same way as for `Sequential`.
```python
inputs = keras.Input(shape=(32,))  # Returns a placeholder tensor

# A layer instance is callable on a tensor, and returns a tensor.
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(64, activation='relu')(x)
predictions = keras.layers.Dense(10, activation='softmax')(x)

# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs
model.fit(data, labels, batch_size=32, epochs=5)
```
Model methods and attributes

- `model.layers`: a flattened list of the layers that make up the model.
- `model.inputs`: the list of the model's input tensors.
- `model.outputs`: the list of the model's output tensors.
- `model.summary()`: prints a summary of the model; it is a shortcut for `utils.print_summary`.
- `model.get_config()`: returns a dictionary containing the model's configuration. The model can be re-instantiated from this configuration (see the sketch after this list).
- `model.get_weights()`: returns a list of all the model's weight tensors as Numpy arrays.
- `model.set_weights(weights)`: sets the model's weights from Numpy arrays. The arrays in the list must have the same shapes as those returned by `get_weights()`.
- `model.save_weights(filepath)`: saves the model's weights to an HDF5 file.
- `model.load_weights(filepath, by_name=False)`: loads the weights from an HDF5 file created by `save_weights`. By default, the architecture is expected to be unchanged. To load weights into a different model that shares some layers, set `by_name=True` to load only the layers whose names match.
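A minimal sketch of the config round trip mentioned for `get_config()`, using the standard `from_config` counterparts (assumes `Model` and `Sequential` are imported from `keras.models`; only the architecture is restored, not the weights):

```python
config = model.get_config()

# Re-instantiate a functional model from its configuration
model = Model.from_config(config)
# or, for a Sequential model:
model = Sequential.from_config(config)
```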
Model subclassing

Build a custom model by subclassing `tf.keras.Model`: create the layers in the `__init__` method and define the forward pass in the `call` method. Inside `call` you can also add custom loss terms by calling `self.add_loss(loss_tensor)`.
```python
import keras

class SimpleMLP(keras.Model):

    def __init__(self, use_bn=False, use_dp=False, num_classes=10):
        super(SimpleMLP, self).__init__(name='mlp')
        self.use_bn = use_bn
        self.use_dp = use_dp
        self.num_classes = num_classes
        self.dense1 = keras.layers.Dense(32, activation='relu')
        self.dense2 = keras.layers.Dense(num_classes, activation='softmax')
        if self.use_dp:
            self.dp = keras.layers.Dropout(0.5)
        if self.use_bn:
            self.bn = keras.layers.BatchNormalization(axis=-1)

    def call(self, inputs):
        x = self.dense1(inputs)
        if self.use_dp:
            x = self.dp(x)
        if self.use_bn:
            x = self.bn(x)
        return self.dense2(x)

model = SimpleMLP()
model.compile(...)
model.fit(...)
```

In a subclassed model the topology is defined by Python code rather than by a static graph of layers, so it cannot be inspected or serialized. As a result, the following methods and attributes are not available for subclassed models:
- `model.inputs` and `model.outputs`
- `model.to_yaml()` and `model.to_json()`
- `model.get_config()` and `model.save()`
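The weights of a subclassed model can still be checkpointed; it is only full-model saving (`model.save()`) that is unavailable. A minimal sketch, assuming the `SimpleMLP` class above, hypothetical `data`/`labels` arrays, and a file name of my choosing:

```python
model = SimpleMLP()
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit(data, labels, epochs=1)

# Save only the weights; the architecture lives in the Python class itself.
model.save_weights('simple_mlp_weights.h5')

restored = SimpleMLP()
restored.predict(data)  # run the model once so the layers create their weights
restored.load_weights('simple_mlp_weights.h5')
```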
Custom layers

A custom layer can be created by subclassing `tf.keras.layers.Layer` and implementing the following methods:

- `build`: create the layer's weights, adding them with `add_weight`.
- `call`: define the forward pass.
- `compute_output_shape`: specify how to compute the layer's output shape from the input shape.
- A layer can be made serializable by implementing the `get_config` and `from_config` methods.
```python
import tensorflow as tf
from tensorflow import keras

class MyLayer(keras.layers.Layer):

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        shape = tf.TensorShape((input_shape[1], self.output_dim))
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=shape,
                                      initializer='uniform',
                                      trainable=True)
        # Be sure to call this at the end
        super(MyLayer, self).build(input_shape)

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

    def compute_output_shape(self, input_shape):
        shape = tf.TensorShape(input_shape).as_list()
        shape[-1] = self.output_dim
        return tf.TensorShape(shape)

    def get_config(self):
        base_config = super(MyLayer, self).get_config()
        base_config['output_dim'] = self.output_dim
        return base_config

    @classmethod
    def from_config(cls, config):
        return cls(**config)

# Create a model using the custom layer
model = keras.Sequential([MyLayer(10),
                          keras.layers.Activation('softmax')])

# The compile step specifies the training configuration
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs.
model.fit(data, targets, batch_size=32, epochs=5)
```
LSTM
```python
import keras
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

# Headline input: receives sequences of 100 integers, each between 1 and 10000.
# Note that any layer can be named by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')

# The Embedding layer encodes the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)

# The LSTM layer transforms the vector sequence into a single vector
# that contains information about the entire sequence.
lstm_out = LSTM(32)(x)

# Insert an auxiliary loss
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])

# Stack several fully-connected layers
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# Finally, the main logistic-regression output
main_output = Dense(1, activation='sigmoid', name='main_output')(x)

# Define the model
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])

# Compile
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[1., 0.2])

# Pass the input arrays and target arrays
model.fit([headline_data, additional_data], [labels, labels],
          epochs=50, batch_size=32)

# Or compile using the "name" arguments
model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.2})

# Then train with:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
          {'main_output': labels, 'aux_output': labels},
          epochs=50, batch_size=32)
```

GELU
```python
import numpy as np
import tensorflow as tf
from keras.layers import Activation, Dense
from keras.utils.generic_utils import get_custom_objects

def custom_gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1 + tf.tanh(tf.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))))

# Register the activation so it can also be referred to by name.
get_custom_objects().update({'custom_gelu': Activation(custom_gelu)})

# fit1 is a Sequential model defined elsewhere in the original notes.
fit1.add(Dense(output_dim=1, activation=custom_gelu))
```
- `lstm1, lstm_h, lstm_c = LSTM(hidden_size, return_sequences=True, return_state=True)(input)` returns the hidden state at every time step (`lstm1`), the final hidden state (`lstm_h`), and the final cell state (`lstm_c`).
- `lstm_out = Bidirectional(LSTM(10, return_sequences=True))(input)`. The two directions can also be written separately:

```python
forward_layer = LSTM(10, return_sequences=True)(input)
backward_layer = LSTM(10, activation='relu', return_sequences=True,
                      go_backwards=True)(input)
```
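When the two directions are written out separately like this, their outputs still have to be merged by hand. A minimal sketch of one way to do it (my own merge step, not from the original notes), mirroring what the `Bidirectional` wrapper does internally with `merge_mode='concat'`:

```python
import keras.backend as K
from keras.layers import Concatenate, Lambda

# go_backwards=True emits the sequence in reversed time order, so flip it back
# before concatenating with the forward sequence along the feature axis.
backward_aligned = Lambda(lambda t: K.reverse(t, axes=1))(backward_layer)
lstm_out = Concatenate(axis=-1)([forward_layer, backward_aligned])
```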
- A bidirectional encoder whose final states initialize a decoder LSTM:

```python
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = Bidirectional(LSTM(latent_dim, return_state=True))
encoder_outputs, forward_h, forward_c, backward_h, backward_c = encoder(encoder_inputs)
state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])
encoder_states = [state_h, state_c]

decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(latent_dim * 2, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
```
- Masking layer:

```python
mask = keras.layers.Masking(mask_value=0, input_shape=(time_step, feature_size))(input)
lstm_output = keras.layers.LSTM(hidden_size, return_sequences=True)(mask)
```

Embedding layer:

```python
embed = keras.layers.Embedding(vocab_size, embedding_size, mask_zero=True)(input)
lstm_output = keras.layers.LSTM(hidden_size, return_sequences=True)(embed)
```
- `tf.keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)`
  - `decay`: learning-rate decay (see the sketch below).
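A minimal usage sketch, assuming a `model` built as in the earlier examples; in the legacy Keras optimizers, `decay` shrinks the effective learning rate on every update, roughly as `lr / (1 + decay * iterations)`:

```python
# Hypothetical setup: decay a little on every batch update.
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-4)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
```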
Learning rate
TF 2.0: subclass `tf.keras.optimizers.schedules.LearningRateSchedule`:
```python
class CustomSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):

    def __init__(self, d_model, warmup_steps=4000):
        super(CustomSchedule, self).__init__()
        self.d_model = tf.cast(d_model, tf.float32)
        self.warmup_steps = warmup_steps

    def __call__(self, step):
        arg1 = tf.math.rsqrt(step)
        arg2 = step * (self.warmup_steps ** -1.5)
        return tf.math.rsqrt(self.d_model) * tf.math.minimum(arg1, arg2)

learning_rate = CustomSchedule(200)
custom_adam = tf.keras.optimizers.Adam(learning_rate, beta_1=0.9, beta_2=0.98,
                                       epsilon=1e-9)
```

Keras: `keras.callbacks.LearningRateScheduler(schedule)`
```python
import keras.backend as K
from keras.callbacks import LearningRateScheduler

def scheduler(epoch):
    # Every 100 epochs, reduce the learning rate to 1/10 of its current value.
    if epoch % 100 == 0 and epoch != 0:
        lr = K.get_value(model.optimizer.lr)
        K.set_value(model.optimizer.lr, lr * 0.1)
        print("lr changed to {}".format(lr * 0.1))
    return K.get_value(model.optimizer.lr)

reduce_lr = LearningRateScheduler(scheduler)
model.fit(train_x, train_y, batch_size=32, epochs=300, callbacks=[reduce_lr])
```
Reduce LR On Plateau
`keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', epsilon=0.0001, cooldown=0, min_lr=0)`
```python
from keras.callbacks import ReduceLROnPlateau

reduce_lr = ReduceLROnPlateau(monitor='val_loss', patience=10, mode='auto')
model.fit(train_x, train_y, batch_size=32, epochs=300, validation_split=0.1, callbacks=[reduce_lr])
```