[Model] An Introduction to the Inception-V3 Convolutional Neural Network

Author: 老饼   Published: 2024-01-18 11:21:32   Updated: 2024-11-19 15:59:49

Original article from this site. When reposting, please credit 《老饼讲解-深度学习》 www.bbbdata.com

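The Inception-V3 model is a new Inception convolutional neural network, redesigned after "rethinking" how convolutional neural networks should be built.

This article explains the structure of Inception-V3 and each of its Inception modules, and presents a code implementation of Inception-V3.

After reading it, you will know what the Inception-V3 convolutional neural network is and how to implement an Inception-V3 model in code.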

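01. What Is the Inception-V3 Convolutional Neural Network

This section introduces the structure of the Inception-V3 model, giving a quick picture of what the model is.

What is the Inception-V3 model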

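The Inception-V3 model comes from the 2015 paper "Rethinking the Inception Architecture for Computer Vision".
The paper collects a number of separate observations on the nature and properties of convolutional neural networks and of Inception,
then uses those observations to modify the structure of the Inception-V1 and V2 models, which yields the new Inception-V3 model.
A distinctive feature of Inception-V3 is that it introduces Inception modules dedicated to reduction (downsampling the feature map).
The structure of the Inception-V3 model is as follows:
[Figure: Inception-V3 model structure]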

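The main body of Inception-V3 is a modification of Inception-V1.
The main changes are:
1. Input
   The input size is changed to 299*299
2. Plain convolution layers (C1, C2)
   (1) C1 replaces the 7*7 convolution of V1 with three 3*3 convolutions
   (2) C1 and C2 drop LRN normalization
3. Inception layers (C3, C4, C5)
   (1) C3, C4 and C5 each use a different Inception module structure
   (2) C3 and C4 use Inception-R1 and Inception-R2 in place of pooling
   (3) C4 has one fewer Inception module
4. Auxiliary classifier
   The low-level auxiliary classifier is removed; only the high-level one is kept
5. BN layers
   A BN layer is added before every nonlinearity (ReLU and softmax) for normalization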
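To make change 2(1) concrete, the sketch below compares one 7*7 convolution with a stack of three 3*3 convolutions. Both cover a 7*7 receptive field, but the stack uses fewer weights. This is a simplified illustration: the helper names and the channel count of 64 are assumptions made for the example, biases are ignored, all layers are assumed to have stride 1, and every layer is assumed to keep the same number of channels (the real C1 layer changes channels and uses stride 2 in its first convolution).

```python
def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolution layer (bias ignored)."""
    return kernel_h * kernel_w * in_channels * out_channels


def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1 convolutions applied one after another."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


channels = 64  # illustrative assumption, not a value from the article

single_7x7 = conv_weights(7, 7, channels, channels)
three_3x3 = 3 * conv_weights(3, 3, channels, channels)

print(stacked_receptive_field([7]))        # 7
print(stacked_receptive_field([3, 3, 3]))  # 7
print(single_7x7, three_3x3)               # 200704 110592
print(round(three_3x3 / single_7x7, 2))    # 0.55
```

Under these assumptions the stacked version needs roughly 55% of the weights of the single 7*7 layer, and it also applies a nonlinearity after each of the three layers.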

The Inception modules in Inception-V3

The three Inception modules in Inception-V3 are as follows:
[Figure: the three Inception modules of Inception-V3]



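Inception-V3 introduces two Inception reduction modules to replace plain pooling.
The idea is to shrink the feature map with pooling while, in parallel, using convolutions to supply additional channels, so that not too much information is lost.
The two Inception reduction modules in Inception-V3 are as follows:
[Figure: the two Inception reduction modules of Inception-V3]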
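The effect of a reduction module on the feature map can be checked with simple arithmetic. The sketch below follows the branch layout of the InceptionV3_R1 class in this article's code section (a stride-2 3*3 convolution, a 1*1 / 3*3 / stride-2 3*3 chain, and a stride-2 pooling branch) and uses the configuration that the code passes to it. The helper names `out_size` and `reduction_r1_shape` are hypothetical and exist only for this illustration.

```python
def out_size(size, kernel, stride, padding):
    """Spatial size after a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def reduction_r1_shape(size, in_channels, c1, c2):
    """Return (spatial size, channels) after the three concatenated branches."""
    # Branch 1: 3*3 convolution with stride 2
    s1 = out_size(size, 3, 2, 0)
    # Branch 2: 1*1 -> 3*3 (padding 1) -> 3*3 with stride 2
    s2 = out_size(out_size(out_size(size, 1, 1, 0), 3, 1, 1), 3, 2, 0)
    # Branch 3: 3*3 pooling with stride 2; pooling keeps the channel count
    s3 = out_size(size, 3, 2, 0)
    assert s1 == s2 == s3, "branches must agree spatially to be concatenated"
    return s1, c1 + c2[-1] + in_channels


print(reduction_r1_shape(35, 288, 384, [64, 96, 96]))  # (17, 768)
```

The grid shrinks from 35*35 to 17*17, while the channel count grows from 288 to 768: the pooling branch contributes the original 288 channels and the two convolution branches add 384 and 96 more.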

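02. Inception-V3 Model: Detailed Configuration

This section presents the detailed configuration of the Inception-V3 model, from which the model's computation can be followed step by step.

GoogLeNet-Inception-V3 configuration and computation

The detailed computation and configuration of GoogLeNet-Inception-V3 are as follows:
[Figure: detailed configuration and computation of GoogLeNet-Inception-V3]

The computation and configuration of the auxiliary classifier on the side branch are as follows:
[Figure: configuration and computation of the auxiliary classifier]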

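If the input is smaller than 299 × 299, the network can be adjusted following the scheme given in the original paper:
1. If the input is somewhat smaller than 299 × 299, change the stride of the first convolution layer to 1
   For example, this applies to a 151 × 151 input
2. If the input is much smaller than 299 × 299, change the stride of the first convolution layer to 1 and remove the first pooling layer
   For example, this applies to a 79 × 79 input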
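The two adjustments can be checked by tracing the grid size through the C1 and C2 layers. The sketch below uses the layer settings of this article's code section and assumes that "the first pooling layer" means the max pooling at the end of C1; that reading, and the helper names, are assumptions of this example rather than statements from the paper.

```python
def out_size(size, kernel, stride, padding=0):
    """Spatial size after a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_out_size(size, first_stride=2, first_pool=True):
    """Grid size that reaches C3, for the C1/C2 layers used in this article."""
    size = out_size(size, 3, first_stride)  # C1: first 3*3 convolution
    size = out_size(size, 3, 1)             # C1: second 3*3 convolution
    size = out_size(size, 3, 1, padding=1)  # C1: third 3*3 convolution (padded)
    if first_pool:
        size = out_size(size, 3, 2)         # C1: 3*3 max pooling
    size = out_size(size, 1, 1)             # C2: 1*1 convolution
    size = out_size(size, 3, 1)             # C2: 3*3 convolution
    return out_size(size, 3, 2)             # C2: 3*3 max pooling


def after_reductions(size):
    """Grid size after the two stride-2 reduction modules (C3 and C4)."""
    return out_size(out_size(size, 3, 2), 3, 2)


cases = {
    299: stem_out_size(299),
    151: stem_out_size(151, first_stride=1),
    79: stem_out_size(79, first_stride=1, first_pool=False),
}
for input_size, grid in cases.items():
    print(input_size, grid, after_reductions(grid))
# 299 35 8
# 151 35 8
# 79 36 8
```

Under these assumptions all three inputs end with an 8*8 grid in front of the final 8*8 average pooling, even though the 79 × 79 case enters C3 at 36*36 rather than 35*35.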

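03. Inception-V3: Code Implementation

This section presents a code implementation of the Inception-V3 convolutional neural network.

Inception-V3 code implementation

To implement the Inception-V3 model, first define each Inception module, then configure the layers in the order of the model's main pipeline.
The code is as follows: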

# This code implements the InceptionV3 model
# When reposting, please credit 《老饼讲解-深度学习》 www.bbbdata.com
from torch import nn
import torch

# Convolution followed by BN and ReLU
class CovWithBNReLu(nn.Module):
    def __init__(self,in_channels,out_channels,k,s,p):
        super(CovWithBNReLu, self).__init__()
        self.stack = nn.Sequential(
        nn.Conv2d(in_channels ,out_channels, kernel_size=k,stride=s,padding=p),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True)
        )
    def forward(self, x):
        return self.stack(x)   
        
# InceptionV3_1 module
class InceptionV3_1(nn.Module):
    def __init__(self,C,C1,C2,C3,C4,pool='max'):
        super(InceptionV3_1, self).__init__()

        # 1*1 convolution branch
        self.R1 = CovWithBNReLu(C,C1,k=1,s=1,p=0)
        # 3*3 convolution branch (1*1 followed by 3*3)
        self.R2 = nn.Sequential(CovWithBNReLu(C ,C2[0],k=1,s=1,p=0),
                                CovWithBNReLu(C2[0],C2[1],k=3,s=1,p=1))
        # Double 3*3 convolution branch (1*1 followed by two 3*3)
        self.R3 = nn.Sequential(CovWithBNReLu(C ,C3[0],k=1,s=1,p=0),
                                CovWithBNReLu(C3[0],C3[1],k=3,s=1,p=1),
                                CovWithBNReLu(C3[1],C3[2],k=3,s=1,p=1))
        # Pooling branch (3*3 pooling followed by 1*1 convolution)
        if(pool=='max'):
            P = nn.MaxPool2d(kernel_size=3,stride=1,padding=1)
        else:
            P = nn.AvgPool2d(kernel_size=3,stride=1,padding=1)
        self.R4 = nn.Sequential(P, CovWithBNReLu(C,C4, k=1,s=1,p=0))

    def forward(self, x):
        y1 =self.R1(x)
        y2 =self.R2(x)
        y3 =self.R3(x)
        y4 =self.R4(x)
        y  = torch.cat((y1,y2, y3, y4), dim=1)
        return y
    
# InceptionV3_2 module
class InceptionV3_2(nn.Module):
    def __init__(self,C,C1,C2,C3,C4,pool='max'):
        super(InceptionV3_2, self).__init__()
        #-------Define the computation used by each branch---------
        # 1*1 convolution branch
        self.R1 = CovWithBNReLu(C,C1, k=1,s=1,p=0)

        # Single 1*7/7*1 convolution branch
        self.R2 = nn.Sequential(
                 CovWithBNReLu(C ,C2[0], k=1,s=1,p=0)
                ,CovWithBNReLu(C2[0],C2[1], k=(1,7),s=1,p=(0,3))
                ,CovWithBNReLu(C2[1],C2[2], k=(7,1),s=1,p=(3,0)))

        # Double 1*7/7*1 convolution branch
        self.R3 = nn.Sequential(
                 CovWithBNReLu(C ,C3[0], k=1,s=1,p=0)
                ,CovWithBNReLu(C3[0],C3[1],k=(1,7),s=1,p=(0,3))
                ,CovWithBNReLu(C3[1],C3[2],k=(7,1),s=1,p=(3,0))
                ,CovWithBNReLu(C3[2],C3[3],k=(1,7),s=1,p=(0,3))
                ,CovWithBNReLu(C3[3],C3[4],k=(7,1),s=1,p=(3,0)))
        # Pooling branch (3*3 pooling followed by 1*1 convolution)
        if(pool=='max'):
            P = nn.MaxPool2d(kernel_size=3,stride=1,padding=1)
        else:
            P = nn.AvgPool2d(kernel_size=3,stride=1,padding=1)
        self.R4 =  nn.Sequential(P,CovWithBNReLu(C,C4,k=1,s=1,p=0))
        
    def forward(self, x):
        y1 = self.R1(x)
        y2 = self.R2(x)
        y3 = self.R3(x)
        y4 = self.R4(x)
        y = torch.cat((y1, y2, y3, y4), dim=1)
        return y
    
# InceptionV3_3 module
class InceptionV3_3(nn.Module):
    def __init__(self,C,C1,C2,C3,C4,pool='max'):
        super(InceptionV3_3, self).__init__()
        #-------Define the computation used by each branch---------
        # 1*1 convolution branch
        self.R1 =  CovWithBNReLu(C,C1, k=1,s=1,p=0)

        # 1*1 and 3*3 convolution, then a split into 1*3 and 3*1 convolutions
        self.R2   = nn.Sequential(
                     CovWithBNReLu(C,C2[0],k=1,s=1,p=0)
                    ,CovWithBNReLu(C2[0],C2[1],k=3,s=1,p=1)
                    )
        self.R21  = CovWithBNReLu(C2[1],C2[2],k=(1,3),s=1,p=(0,1))
        self.R22  = CovWithBNReLu(C2[1],C2[3],k=(3,1),s=1,p=(1,0))

        # 1*1 convolution, then a split into 1*3 and 3*1 convolutions
        self.R3  = CovWithBNReLu(C ,C3[0],k=1,s=1,p=0)
        self.R31 = CovWithBNReLu(C3[0],C3[1],k=(1,3),s=1,p=(0,1))
        self.R32 = CovWithBNReLu(C3[0],C3[2],k=(3,1),s=1,p=(1,0))

        # Pooling branch (3*3 pooling followed by 1*1 convolution)
        if(pool=='max'):
            P = nn.MaxPool2d(kernel_size=3,stride=1,padding=1)
        else:
            P = nn.AvgPool2d(kernel_size=3,stride=1,padding=1)
        self.R4 = nn.Sequential(P,CovWithBNReLu(C,C4,k=1,s=1,p=0))

    def forward(self, x):
        #-----------y1-----------
        y1 = self.R1(x)
        #-----------y2-----------
        y2  = self.R2(x)
        y21 = self.R21((y2))
        y22 = self.R22((y2))
        #-----------y3-----------
        y3  = self.R3(x)
        y31 = self.R31(y3)
        y32 = self.R32(y3)
        #-----------y4----------
        y4 = self.R4(x)
        y = torch.cat((y1, y21,y22,y31,y32, y4), dim=1)
        return y
    
# InceptionV3_R1 reduction module
class InceptionV3_R1(nn.Module):
    def __init__(self,C,C1,C2,pool='max'):
        super(InceptionV3_R1, self).__init__()
        #-------Define the computation used by each branch---------
        # Single 3*3 convolution branch (stride 2)
        self.R1 = CovWithBNReLu(C,C1, k=3,s=2,p=0)

        # Double 3*3 convolution branch (1*1, 3*3, then 3*3 with stride 2)
        self.R2 = nn.Sequential(
                 CovWithBNReLu(C,C2[0],k=1,s=1,p=0)
                ,CovWithBNReLu(C2[0],C2[1],k=3,s=1,p=1)
                ,CovWithBNReLu(C2[1],C2[2],k=3,s=2,p=0)
                )
        # Pooling branch (stride 2, no convolution)
        if(pool=='max'):
            P = nn.MaxPool2d(kernel_size=3,stride=2,padding=0)
        else:
            P = nn.AvgPool2d(kernel_size=3,stride=2,padding=0)
        self.R3 = P

    def forward(self, x):
        y1 = self.R1(x)
        y2 = self.R2(x)
        y3 = self.R3(x)
        y = torch.cat((y1, y2,y3), dim=1)
        return y
    
# InceptionV3_R2 reduction module
class InceptionV3_R2(nn.Module):
    def __init__(self,C,C1,C2,pool='max'):
        super(InceptionV3_R2, self).__init__()
        #-------Define the computation used by each branch---------
        # 3*3 convolution branch (1*1, then 3*3 with stride 2)
        self.R1 = nn.Sequential(
                     CovWithBNReLu(C,C1[0],k=1,s=1,p=0)
                    ,CovWithBNReLu(C1[0],C1[1],k=3,s=2,p=0)
                    )
        # 1*7/7*1 convolution branch (1*1, 1*7, 7*1, then 3*3 with stride 2)
        self.R2 = nn.Sequential(
                     CovWithBNReLu(C,C2[0],k=1,s=1,p=0)
                    ,CovWithBNReLu(C2[0],C2[1],k=(1,7),s=1,p=(0,3))
                    ,CovWithBNReLu(C2[1],C2[2],k=(7,1),s=1,p=(3,0))
                    ,CovWithBNReLu(C2[2],C2[3],k=3,s=2,p=0)
                    )
        # Pooling branch (stride 2, no convolution)
        if(pool=='max'):
            P = nn.MaxPool2d(kernel_size=3,stride=2,padding=0)
        else:
            P = nn.AvgPool2d(kernel_size=3,stride=2,padding=0)
        self.R3 = P

    def forward(self, x):
        y1 =self.R1(x)
        y2 =self.R2(x)
        y3 =self.R3(x)
        y = torch.cat((y1, y2,y3), dim=1)
        return y

# Structure of the InceptionNet3 convolutional neural network
class InceptionNet3(nn.Module):
    def __init__(self,in_channel,num_classes):
        super(InceptionNet3, self).__init__()
        self.nn_stack=nn.Sequential(
            #--------------C1 layer-------------------
            nn.Conv2d(in_channel,32, kernel_size=3,stride=2,padding=0),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            # Output 149*149*32
            nn.Conv2d(32,32, kernel_size=3,stride=1,padding=0),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            # Output 147*147*32
            nn.Conv2d(32,64, kernel_size=3,stride=1,padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3,stride=2,padding=0),
            # Output 73*73*64
            #--------------C2 layer-------------------
            nn.Conv2d(64,80, kernel_size=1,stride=1,padding=0),
            nn.BatchNorm2d(80),
            nn.ReLU(inplace=True),
            # Output 73*73*80 (a 1*1 convolution keeps the spatial size)
            nn.Conv2d(80,192, kernel_size=3,stride=1,padding=0),
            nn.BatchNorm2d(192),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3,stride=2,padding=0),
            # Output 35*35*192
            #--------------C3 layer-------------------
            InceptionV3_1(C=192,C1=64,C2=[48,64],C3=[64,96,96],C4=32,pool='avg'),
            InceptionV3_1(C=256,C1=64,C2=[48,64],C3=[64,96,96],C4=64,pool='avg'),
            InceptionV3_1(C=288,C1=64,C2=[48,64],C3=[64,96,96],C4=64,pool='avg'),
            InceptionV3_R1(C=288,C1=384,C2=[64,96,96],pool='avg'),
            # Output 17*17*768
            #--------------C4 layer-------------------
            InceptionV3_2(C=768,C1=192,C2=[128,128,192],C3=[128,128,128,128,192],C4=192,pool='avg'),
            InceptionV3_2(C=768,C1=192,C2=[160,160,192],C3=[160,160,160,160,192],C4=192,pool='avg'),
            InceptionV3_2(C=768,C1=192,C2=[160,160,192],C3=[160,160,160,160,192],C4=192,pool='avg'),
            InceptionV3_2(C=768,C1=192,C2=[192,192,192],C3=[192,192,192,192,192],C4=192,pool='avg'),
            InceptionV3_R2(C=768,C1=[192,320],C2=[192,192,192,192],pool='max'),
            # Output 8*8*1280
            #--------------C5 layer-------------------
            InceptionV3_3(C=1280,C1=320,C2=[448,384,384,384],C3=[384,384,384],C4=192,pool='avg'),
            InceptionV3_3(C=2048,C1=320,C2=[448,384,384,384],C3=[384,384,384],C4=192,pool='avg'),
            nn.AvgPool2d(kernel_size=8,stride=1,padding=0),
            # Output 1*1*2048
            #--------------Fully connected layer F6----------
            nn.Flatten(),  
            torch.nn.Dropout(p=0.2),
            nn.Linear(2048, num_classes),
            nn.BatchNorm1d(num_classes),
            )
    def forward(self, x):
        p = self.nn_stack(x)
        return p
# ------Test the model---------
x = torch.rand(2,3,299,299)
model = InceptionNet3(in_channel =3,num_classes=1000)
y= model(x)
print(y.shape)   # torch.Size([2, 1000])
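In the model definition above, the `C` argument of each module has to equal the number of channels produced by the module before it. The sketch below recomputes those channel counts with plain arithmetic from the same configuration values, which is a quick way to check a modified configuration before building the network. The helper function names are hypothetical and are not part of the implementation above.

```python
def inception_12_out(c1, c2, c3, c4):
    """Output channels of InceptionV3_1 / InceptionV3_2: four concatenated branches."""
    return c1 + c2[-1] + c3[-1] + c4


def inception_3_out(c1, c2, c3, c4):
    """Output channels of InceptionV3_3: two branches each end in a 1*3 / 3*1 split."""
    return c1 + c2[2] + c2[3] + c3[1] + c3[2] + c4


def reduction_out(c_in, c1, c2):
    """Output channels of a reduction module: two conv branches plus the pooled input."""
    c1_last = c1[-1] if isinstance(c1, list) else c1
    return c1_last + c2[-1] + c_in


channels = [192]  # channels leaving C2
# C3: three InceptionV3_1 modules, then InceptionV3_R1
channels.append(inception_12_out(64, [48, 64], [64, 96, 96], 32))
channels.append(inception_12_out(64, [48, 64], [64, 96, 96], 64))
channels.append(inception_12_out(64, [48, 64], [64, 96, 96], 64))
channels.append(reduction_out(channels[-1], 384, [64, 96, 96]))
# C4: four InceptionV3_2 modules, then InceptionV3_R2
for mid in (128, 160, 160, 192):
    channels.append(inception_12_out(192, [mid, mid, 192], [mid] * 4 + [192], 192))
channels.append(reduction_out(channels[-1], [192, 320], [192, 192, 192, 192]))
# C5: two InceptionV3_3 modules
for _ in range(2):
    channels.append(inception_3_out(320, [448, 384, 384, 384], [384, 384, 384], 192))

print(channels)
# [192, 256, 288, 288, 768, 768, 768, 768, 768, 1280, 2048, 2048]
```

Each value in the list matches the `C` argument of the next module in the model, and the final 2048 matches the input size of the `nn.Linear` layer.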

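Only the code implementation of the Inception-v3 model is shown here; for training, the procedure used for the Inception-v1 model can serve as a reference.

That covers the main content of GoogLeNet-Inception-V3.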

End
