Radar-Based Human Activity Recognition With 1-D Dense Attention Network

1-D-DAN: a 1-D dense attention network designed for radar spectrograms, applied to human activity recognition.

Lite-HRNet: A Lightweight High-Resolution Network

Lite-HRNet: a lightweight high-resolution network.

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity

Switch Transformer: training language models with trillions of parameters.

Chronicles of the Marvel Cinematic Universe (MCU): Volume Four

A record of Phase Four of the Marvel Cinematic Universe (MCU).

Knowledge Neurons in Pretrained Transformers

Knowledge neurons in pretrained Transformers.

Transformer Feed-Forward Layers Are Key-Value Memories

The fully connected (feed-forward) layers of a Transformer act as key-value memories.