PyTorch Implementations of 17 Attention-Mechanism Papers, Including Popular MLP and Re-Parameter Series Papers

Report by 机器之心 (Synced)

Editor: Chen

PyTorch implementations of a range of attention mechanisms.

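The attention mechanism was first applied in computer vision and later flourished in NLP. It concentrates a limited attention budget on the most important parts of the input, which saves resources and surfaces the most useful information quickly.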
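To make the idea of "concentrating on the important information" concrete, here is a minimal sketch of scaled dot-product attention, the form used in Transformer-style models. It is an illustration only and is not taken from the project discussed in this article; the function name and tensor sizes are assumptions chosen for the example.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Illustrative helper (hypothetical name, not from the project).

    q: (batch, n_q, d), k: (batch, n_k, d), v: (batch, n_k, d_v)
    """
    # Similarity between every query and every key, scaled by sqrt(d).
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # (batch, n_q, n_k)
    # Softmax turns each row of scores into weights that sum to 1.
    weights = torch.softmax(scores, dim=-1)
    # Each output is a weighted average of the values.
    return weights @ v, weights

torch.manual_seed(0)
q = torch.randn(2, 5, 16)
k = torch.randn(2, 7, 16)
v = torch.randn(2, 7, 32)
out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape)      # torch.Size([2, 5, 32])
print(weights.shape)  # torch.Size([2, 5, 7])
```

Each row of `weights` says how much one query attends to each key, so large weights mark the positions the model treats as important.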

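In 2014, Google DeepMind published "Recurrent Models of Visual Attention", which popularized the attention mechanism. In 2015, Bahdanau et al. applied attention to NLP for the first time in "Neural Machine Translation by Jointly Learning to Align and Translate". In 2017, Google's machine translation team published "Attention is All You Need", which dropped RNN and CNN structures entirely and used only attention for machine translation, with very good results. Attention has been a research hotspot ever since.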

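After several years of development, the field has produced a large body of attention-mechanism papers, and this work has achieved good results in CV and NLP. Recently, a researcher on GitHub published PyTorch code implementations and usage instructions for 17 attention-related papers.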

Project URL: https://github.com/xmu-xiaoma666/External-Attention-pytorch

Project overview

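The project author groups the work into three series: the Attention series, the MLP series, and the ReP (Re-Parameter) series. The Attention series covers 11 papers, including the well-known "Attention is All You Need". The recently popular MLP series includes Google's MLP-Mixer and gMLP, Facebook's ResMLP, and Tsinghua's RepMLP. The ReP (Re-Parameter) series includes RepVGG and ACNet, proposed by Tsinghua and collaborators.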

The 11 papers in the Attention series with PyTorch implementations are:

PyTorch implementation of "Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks---arXiv 2021.05.05"
PyTorch implementation of "Attention Is All You Need---NIPS2017"
PyTorch implementation of "Simplified Self Attention Usage"
PyTorch implementation of "Squeeze-and-Excitation Networks---CVPR2018"
PyTorch implementation of "Selective Kernel Networks---CVPR2019"
PyTorch implementation of "CBAM: Convolutional Block Attention Module---ECCV2018"
PyTorch implementation of "BAM: Bottleneck Attention Module---BMVC2018"
PyTorch implementation of "ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks---CVPR2020"
PyTorch implementation of "Dual Attention Network for Scene Segmentation---CVPR2019"
PyTorch implementation of "EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network---arXiv 2021.05.30"
PyTorch implementation of "ResT: An Efficient Transformer for Visual Recognition---arXiv 2021.05.28"

The MLP (multilayer perceptron) series contains PyTorch implementations of 4 papers:

PyTorch implementation of "RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition---arXiv 2021.05.05"
PyTorch implementation of "MLP-Mixer: An all-MLP Architecture for Vision---arXiv 2021.05.17"
PyTorch implementation of "ResMLP: Feedforward networks for image classification with data-efficient training---arXiv 2021.05.07"
PyTorch implementation of "Pay Attention to MLPs---arXiv 2021.05.17"
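For readers unfamiliar with "all-MLP" designs, the following is a rough sketch of a Mixer-style block: one MLP mixes information across tokens, a second mixes across channels, and each has a residual connection. The class name `MixerBlockSketch` and all sizes are hypothetical and chosen for illustration; this is not the project's code and omits details such as dropout.

```python
import torch
import torch.nn as nn

class MixerBlockSketch(nn.Module):
    """Hypothetical Mixer-style block for illustration only."""

    def __init__(self, num_tokens, dim, token_hidden, channel_hidden):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Token-mixing MLP acts along the token axis.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_tokens, token_hidden),
            nn.GELU(),
            nn.Linear(token_hidden, num_tokens),
        )
        self.norm2 = nn.LayerNorm(dim)
        # Channel-mixing MLP acts along the feature axis.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden),
            nn.GELU(),
            nn.Linear(channel_hidden, dim),
        )

    def forward(self, x):  # x: (batch, tokens, dim)
        y = self.norm1(x).transpose(1, 2)          # (batch, dim, tokens)
        x = x + self.token_mlp(y).transpose(1, 2)  # mix across tokens
        x = x + self.channel_mlp(self.norm2(x))    # mix across channels
        return x

block = MixerBlockSketch(num_tokens=49, dim=64, token_hidden=32, channel_hidden=128)
mixed = block(torch.randn(4, 49, 64))
print(mixed.shape)  # torch.Size([4, 49, 64])
```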

The ReP (Re-Parameter) series contains PyTorch implementations of 2 papers:

PyTorch implementation of "RepVGG: Making VGG-style ConvNets Great Again---CVPR2021"
PyTorch implementation of "ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks---ICCV2019"
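The core trick behind re-parameterization can be shown in a few lines: because convolution is linear, parallel branches used during training can be folded into a single convolution for inference. The sketch below merges a 3x3 branch and a 1x1 branch into one 3x3 convolution. It is a simplified illustration under stated assumptions (no BatchNorm folding and no identity branch, both of which RepVGG-style models also handle) and is not the project's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Two parallel branches, as they might exist at training time.
conv3 = nn.Conv2d(8, 8, kernel_size=3, padding=1)
conv1 = nn.Conv2d(8, 8, kernel_size=1)

# Single fused convolution for inference.
fused = nn.Conv2d(8, 8, kernel_size=3, padding=1)
with torch.no_grad():
    # Zero-pad the 1x1 kernel to 3x3 so it sits at the kernel centre,
    # then add the two kernels and the two biases.
    fused.weight.copy_(conv3.weight + F.pad(conv1.weight, [1, 1, 1, 1]))
    fused.bias.copy_(conv3.bias + conv1.bias)

x = torch.randn(2, 8, 16, 16)
with torch.no_grad():
    y_branches = conv3(x) + conv1(x)
    y_fused = fused(x)

# The two outputs agree up to floating-point rounding.
print(torch.allclose(y_branches, y_fused, atol=1e-4))
```

The fused layer computes the same function with one convolution instead of two, which is why such models can be trained with several branches yet deployed as a plain stack of convolutions.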

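In summary, the project implements 17 attention-mechanism papers in PyTorch. Each entry includes the paper title (linked directly to the paper), the network architecture, and the code. An example follows: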

Paper: "Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks".

Network architecture: [figure]

Code:

from attention.ExternalAttention import ExternalAttention
import torch

input = torch.randn(50, 49, 512)
ea = ExternalAttention(d_model=512, S=8)
output = ea(input)
print(output.shape)
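The usage above only shows how the module is called. As a rough guide to what such a module might contain, here is a self-contained sketch built from the paper title's description of "two linear layers". The class name `ExternalAttentionSketch` is hypothetical, and the double normalization (softmax over tokens, then division by the sum over the `S` memory slots) is an assumption about the method, not something this article confirms. Treat it as an illustration, not the project's implementation.

```python
import torch
import torch.nn as nn

class ExternalAttentionSketch(nn.Module):
    """Hypothetical stand-in for the project's ExternalAttention module."""

    def __init__(self, d_model, S=64):
        super().__init__()
        # Two learnable linear layers act as shared external memories.
        self.mk = nn.Linear(d_model, S, bias=False)
        self.mv = nn.Linear(S, d_model, bias=False)

    def forward(self, x):  # x: (batch, n, d_model)
        attn = self.mk(x)                            # (batch, n, S)
        attn = torch.softmax(attn, dim=1)            # normalize over the n tokens (assumed)
        attn = attn / attn.sum(dim=2, keepdim=True)  # then over the S slots (assumed)
        return self.mv(attn)                         # (batch, n, d_model)

x = torch.randn(50, 49, 512)
ea = ExternalAttentionSketch(d_model=512, S=8)
out = ea(x)
print(out.shape)  # torch.Size([50, 49, 512])
```

Because the memories are small (`S` slots) and shared across all inputs, the cost grows linearly with the number of tokens instead of quadratically as in self-attention.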

