Zheng Zhijie's Personal Blog
Welcome
Continuously Differentiable Exponential Linear Units
CELU: a continuously differentiable exponential linear unit.
Mish: A Self Regularized Non-Monotonic Activation Function
Mish: a self-regularized non-monotonic activation function.
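A sketch of the Mish activation from the linked paper, Mish(x) = x · tanh(softplus(x)); the helper names are my own:

```python
import math

def softplus(x: float) -> float:
    # softplus(x) = ln(1 + exp(x)); log1p keeps precision for large negative x.
    return math.log1p(math.exp(x))

def mish(x: float) -> float:
    # Mish(x) = x * tanh(softplus(x)): smooth, non-monotonic,
    # unbounded above and bounded below (minimum around -0.31).
    return x * math.tanh(softplus(x))
```

The small dip into negative values for moderately negative inputs is what makes the function non-monotonic.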
Taylor Formula
Taylor Formula.
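For reference, the standard statement the post presumably covers (a sketch in conventional notation, not necessarily the post's own):

```latex
f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + R_n(x),
\qquad
R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1}
```

where the Lagrange remainder $R_n(x)$ holds for some $\xi$ between $a$ and $x$, assuming $f$ is $(n+1)$-times differentiable on the interval.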
XLNet: Generalized Autoregressive Pretraining for Language Understanding
XLNet: training language models with permutation language modeling.
MASS: Masked Sequence to Sequence Pre-training for Language Generation
MASS: masked sequence-to-sequence language modeling.
Unified Language Model Pre-training for Natural Language Understanding and Generation
UniLM: sequence-to-sequence pre-training with BERT.