Simple Hardware-Efficient Long Convolutions for Sequence Modeling

Simple, hardware-efficient long convolutions for sequence modeling.

Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions

LaughingHyena: extracting compact recurrences from convolutions.

Hyena Hierarchy: Towards Larger Convolutional Language Models

Hyena: toward larger convolutional language models.

Large Language Model

Large Language Model.

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN

Mix-LN: unleashing the power of deeper layers by combining Pre-LN and Post-LN.

A Study of the Beijing Subway Line 3 Route

An examination of the route of Beijing Subway Line 3.