Reducing the Computational Complexity of Transformers

Efficient Transformers.

R-Drop: Regularized Dropout for Neural Networks

R-Drop: a regularized dropout method.

Hypothesis Testing in Machine Learning

Hypothesis Test in Machine Learning.

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers

A benchmark of different deep learning optimizers.

Addressing Some Limitations of Transformers with Feedback Memory

Feedback Transformer: improving the Transformer's ability to extract sequential information.

Pooling Layers in Convolutional Neural Networks

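Pooling Layers. Pooling is an important component of convolutional neural networks. It downsamples the feature map, which reduces the network's parameter count and computational cost and, to some extent, lowers the risk of overfitting. The roles of pooling include: through downsampling, increasing...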
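
To make the downsampling step concrete, here is a minimal sketch of 2D max pooling in plain Python. The function name `max_pool_2d`, its `size` and `stride` parameters, and the sample feature map are all assumptions chosen for illustration; they do not come from the post, and max pooling is only one of several pooling variants.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Illustrative max pooling over a 2D feature map (list of equal-length rows).

    Each output value is the maximum of one size x size window; windows that
    would run past the edge of the input are skipped (no padding).
    """
    rows = len(feature_map)
    cols = len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the values inside the current window.
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map; 2x2 pooling with stride 2 yields a 2x2 output.
fmap = [
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [7, 2, 9, 4],
    [0, 1, 3, 5],
]
pooled = max_pool_2d(fmap)
print(pooled)  # [[6, 2], [7, 9]]
```

With these settings the 16 input values shrink to 4, so any layer that consumes the pooled map sees a quarter as many inputs. The pooling operation itself has no learnable weights in this sketch.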