Hung-yi Lee 2021 Machine Learning Course HW2 Notes

This post is a journal-style log that records my thoughts and code changes as they happened. It may not be especially useful as a reference; it is mainly for my own review later, and I may reorganize it once the project is finished.

2021/08/23 entry:

Today I adjusted the model based on an article I was following: a larger Batch_size, more layers, and L2 regularization to fend off overfitting. Even so, the model started overfitting once accuracy reached about 0.78.
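For later reference, a rough sketch of what these adjustments look like in code; the concrete values are illustrative assumptions, and train_set and model are assumed to already exist as in the HW2 sample code.

import torch
from torch.utils.data import DataLoader

# Illustrative values only; train_set and model are assumed to exist already.
BATCH_SIZE = 512  # larger batch size
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

# weight_decay in AdamW is the L2 regularization mentioned above
optimizer = torch.optim.AdamW(model.parameters(), lr=0.005, weight_decay=0.001)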

2021/08/21 entry:

Training now reliably reaches above 0.75 accuracy on every run, then enters the overfitting stage at around 150 epochs; some other tricks are probably needed to keep improving.

num_epoch = 200
learning_rate = 0.005  # this learning rate was found to converge faster

optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=0.001)
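Regarding the "other tricks" mentioned above: one common option, sketched here rather than something already applied, is to checkpoint the model whenever validation accuracy improves, so the best weights are kept even after overfitting sets in. val_acc and best_acc are assumed to be computed in the usual per-epoch validation pass.

# Hypothetical sketch: keep the best checkpoint seen so far (val_acc computed each epoch).
if val_acc > best_acc:
    best_acc = val_acc
    torch.save(model.state_dict(), 'model_best.ckpt')
    print(f'saving model with val acc {best_acc:.3f}')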

The current model is:

import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        self.bn0 = nn.BatchNorm1d(429)
        self.layer1 = nn.Linear(429, 1024)
        self.bn1 = nn.BatchNorm1d(1024)
        self.layer2 = nn.Linear(1024, 512)
        self.bn2 = nn.BatchNorm1d(512)
        self.layer3 = nn.Linear(512, 256)
        self.bn3 = nn.BatchNorm1d(256)
        self.layer4 = nn.Linear(256, 128)
        self.bn4 = nn.BatchNorm1d(128)
        self.out = nn.Linear(128, 39)

        self.act_fn = nn.ReLU()
        self.dropout = nn.Dropout(p=0.5)
        self.dropout1 = nn.Dropout(p=0.1)
        self.dropout2 = nn.Dropout(p=0.3)

    def forward(self, x):
        x = self.bn0(x)
        x = self.layer1(x)
        x = self.bn1(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer2(x)
        x = self.bn2(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer3(x)
        x = self.bn3(x)
        x = self.dropout1(x)
        x = self.act_fn(x)

        x = self.layer4(x)
        x = self.bn4(x)
        x = self.dropout2(x)
        x = self.act_fn(x)

        x = self.out(x)

        return x

2021/08/18 entry:

Another day without much progress. I even found that yesterday's 0.75 accuracy was a fluke and could not be reproduced. However, learning-rate decay does make the model converge faster.

learning_rate = 0.01

scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5, verbose=True)
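For StepLR to take effect, scheduler.step() has to be called once per epoch; a minimal sketch of where it sits in the training loop (model, criterion, train_loader and optimizer are assumed to exist as before):

for epoch in range(num_epoch):
    model.train()
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # with step_size=10, gamma=0.5 this halves the LR every 10 epochs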

2021/08/17 entry:

Finally some progress today. I raised the Dropout rate to 0.5 to further reduce overfitting, and increased the Batch_size to 256 (in my tests 256 converges faster; 512 speeds up each epoch but the results are not as good), since Batch Normalization only works well when each batch is large enough. I also tried adding an adaptive learning-rate schedule, but apart from slowing training down it had no real effect; Adam already adapts the learning rate by itself, so it is enough on its own.

As for judging when the model has actually fit the data, I found today that you need to watch how the loss changes. Over the past few days I thought the model had reached overfitting, but that was an illusion caused by not training long enough. Today, within 80 epochs, test accuracy reached above 0.75.

Here is an article about fitting for reference.
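A minimal sketch of the kind of per-epoch bookkeeping that makes this call easier (criterion, train_loader and val_loader are assumed to exist, matching the usual HW2 setup):

train_hist, val_hist = [], []
for epoch in range(num_epoch):
    model.train()
    train_loss = 0.0
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()

    model.eval()
    val_loss, correct = 0.0, 0
    with torch.no_grad():
        for inputs, labels in val_loader:
            outputs = model(inputs)
            val_loss += criterion(outputs, labels).item()
            correct += (outputs.argmax(dim=1) == labels).sum().item()

    train_hist.append(train_loss / len(train_loader))
    val_hist.append(val_loss / len(val_loader))
    # Train loss falling while val loss rises -> overfitting;
    # both still falling -> keep training longer.
    print(f'epoch {epoch:3d} | train loss {train_hist[-1]:.4f} | '
          f'val loss {val_hist[-1]:.4f} | val acc {correct / len(val_loader.dataset):.3f}')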

BATCH_SIZE = 512
import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        self.layer1 = nn.Linear(429, 1024)
        self.bn1 = nn.BatchNorm1d(1024)
        self.layer2 = nn.Linear(1024, 512)
        self.bn2 = nn.BatchNorm1d(512)
        self.layer3 = nn.Linear(512, 256)
        self.bn3 = nn.BatchNorm1d(256)
        self.layer4 = nn.Linear(256, 128)
        self.bn4 = nn.BatchNorm1d(128)
        self.out = nn.Linear(128, 39)

        self.act_fn = nn.ReLU()
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, x):
        x = self.layer1(x)
        x = self.bn1(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer2(x)
        x = self.bn2(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer3(x)
        x = self.bn3(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer4(x)
        x = self.bn4(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.out(x)

        return x

num_epoch = 1000

2021/08/15 entry:

Tried using weight_decay in Adam to prevent overfitting, but since Dropout is already in use the effect was not significant. In fact the model is no longer overfitting at this point, so the architecture probably needs to change before accuracy can improve further. I raised the number of training epochs to 40 to check the final result: at around 0.743 the loss is only improving slowly, and I suspect we are already close to a critical point.

Next time I will increase the Batch_size and see what happens; the batch_size is still fairly small right now, so the normalization may not be bringing much benefit.

The current model architecture is:

import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        self.layer1 = nn.Linear(429, 1024)
        self.bn1 = nn.BatchNorm1d(1024)
        self.layer2 = nn.Linear(1024, 512)
        self.bn2 = nn.BatchNorm1d(512)
        self.layer3 = nn.Linear(512, 256)
        self.bn3 = nn.BatchNorm1d(256)
        self.layer4 = nn.Linear(256, 128)
        self.bn4 = nn.BatchNorm1d(128)
        self.out = nn.Linear(128, 39)

        self.act_fn = nn.ReLU()
        self.dropout = nn.Dropout(p=0.3)

    def forward(self, x):
        x = self.layer1(x)
        x = self.bn1(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer2(x)
        x = self.bn2(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer3(x)
        x = self.bn3(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer4(x)
        x = self.bn4(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.out(x)

        return x

optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=1e-5)

2021/08/14 entry:

Ran the baseline model as-is and found that Train Acc was far higher than Val Acc, which I initially judged to be overfitting. Batch Norm on its own only improves accuracy a little, so I added Dropout to every layer to reduce overfitting; within the default 20 epochs this raised accuracy to about 0.737.

The modified model is below:

import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        self.layer1 = nn.Linear(429, 1024)
        self.bn1 = nn.BatchNorm1d(1024)
        self.layer2 = nn.Linear(1024, 512)
        self.bn2 = nn.BatchNorm1d(512)
        self.layer3 = nn.Linear(512, 128)
        self.bn3 = nn.BatchNorm1d(128)
        self.out = nn.Linear(128, 39)

        self.act_fn = nn.ReLU()
        self.dropout = nn.Dropout(p=0.3)

    def forward(self, x):
        x = self.layer1(x)
        x = self.bn1(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer2(x)
        x = self.bn2(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.layer3(x)
        x = self.bn3(x)
        x = self.dropout(x)
        x = self.act_fn(x)

        x = self.out(x)

        return x
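For completeness, a minimal sketch of how this Classifier is wired up for training; the loss is the standard choice for 39-class classification, and the learning-rate value here is just an assumption.

model = Classifier()
criterion = nn.CrossEntropyLoss()  # 39 phoneme classes
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)  # lr value is illustrative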

Author: Lebenito. Published 2021-08-14, last updated 2022-09-09.