Squeeze-and-Excitation Networks

SENet: a channel attention mechanism for convolutional neural networks.

Self-Orthogonality Module: A Network Architecture Plug-in for Learning Orthogonal Filters

The self-orthogonality module: a plug-in for network architectures that learns orthogonal filters.

Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning

Raising a child in a large language model: toward effective and generalizable fine-tuning.

Why gradient clipping accelerates training: A theoretical justification for adaptivity

Why gradient clipping speeds up training: a theoretical basis in adaptivity.

Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

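The variational discriminator bottleneck: improving imitation learning, inverse RL, and GANs by constraining information flow.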

AdderNet: Do We Really Need Multiplications in Deep Learning?

AdderNet: convolutional neural networks that use only addition operations.