Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation.

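einops is a library for tensor operations. It makes tensor transformations simpler and more flexible to express, and it works with array and deep learning libraries such as numpy, pytorch, tensorflow, and jax.

The name einops stands for Einstein-Inspired Notation for operations, that is, a notation inspired by the Einstein summation convention.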


pip install einops


# The three core operations
from einops import rearrange, reduce, repeat

# The three layers for deep learning frameworks (PyTorch versions shown here)
from einops.layers.torch import Rearrange, Reduce, EinMix

# Helper functions
from einops import asnumpy, parse_shape, einsum, pack, unpack

1. Core operations: rearrange, reduce, repeat

(1) rearrange


from einops import rearrange
output_tensor = rearrange(input_tensor, 'b c h w -> b h w c')

⚪ Adding dimensions

x = rearrange(x, 'b h w c -> b 1 h w 1 c')

⚪ Flatten

y = rearrange(x, 'b c h w -> b (c h w)')

⚪ Spatial rearrangement (depth-to-space) as used in PixelShuffle

y = rearrange(x, 'b (h1 w1 c) h w -> b c (h h1) (w w1)', h1=2, w1=2)

⚪ Channel shuffle as used in ShuffleNet

y = rearrange(x, 'b (g1 g2 c) h w -> b (g2 g1 c) h w', g1=4, g2=4)

⚪ Splitting a tensor

y1, y2 = rearrange(x, 'b (split c) h w -> split b c h w', split=2)

(2) reduce


from einops import reduce
output_tensor = reduce(input_tensor, 'b c (h h2) (w w2) -> b h w c', 'mean', h2=2, w2=2)

⚪ Global average pooling

y = reduce(x, 'b c h w -> b c', reduction='mean')

⚪ $2\times 2$ max pooling

y = reduce(x, 'b c (h h1) (w w1) -> b c h w', reduction='max', h1=2, w1=2)
# you can skip names for reduced axes
y = reduce(x, 'b c (h 2) (w 2) -> b c h w', reduction='max')

⚪ Channel-wise mean normalization

y = x - reduce(x, 'b c h w -> 1 c 1 1', 'mean')

(3) repeat


from einops import repeat
output_tensor = repeat(input_tensor, 'h w -> h w c', c=3)

2. Layers: Rearrange, Reduce, EinMix


# Import from the module that matches your framework; only one of these lines is needed.
# Which modules and layers are available depends on the installed einops version.
from einops.layers.torch      import Rearrange, Reduce, EinMix
from einops.layers.tensorflow import Rearrange, Reduce, EinMix
from einops.layers.flax       import Rearrange, Reduce, EinMix
from einops.layers.gluon      import Rearrange, Reduce, EinMix
from einops.layers.keras      import Rearrange, Reduce, EinMix
from einops.layers.chainer    import Rearrange, Reduce, EinMix


from torch.nn import Sequential, Conv2d, MaxPool2d, Linear, ReLU
from einops.layers.torch import Rearrange

model = Sequential(
    Conv2d(6, 16, kernel_size=5),
    MaxPool2d(kernel_size=2),
    # flattening without need to write forward
    Rearrange('b c h w -> b (c h w)'),
    Linear(16*5*5, 120),
    ReLU(),
    Linear(120, 10),
)

In short, shape manipulations in pytorch such as tensor.view can all be replaced with einops. For more examples of building deep network modules with einops, see "Writing a better code with pytorch and einops".

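EinMix provides layers for MLP-based architectures; certain MLP networks can perform on par with convolutional networks or Transformers. For more details, see "EinMix: universal toolkit for advanced MLP architectures".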

3. Helper functions: asnumpy, parse_shape, einsum, pack, unpack

(1) asnumpy


from einops import asnumpy
y_numpy = asnumpy(y)

(2) parse_shape


from einops import parse_shape
parse_shape(x_5d, 'b c x y z')
# {'b': 10, 'c': 32, 'x': 100, 'y': 10, 'z': 20}
parse_shape(x_5d, 'batch c _ _ _')
# {'batch': 10, 'c': 32}

(3) einsum


from einops import einsum
C = einsum(A, B, 'b t1 head c, b t2 head c -> b head t1 t2')

(4) pack, unpack


import numpy as np
from einops import pack, unpack

h, w = 100, 200  # example image size
image_rgb = np.random.random([h, w, 3])
image_depth = np.random.random([h, w])

image_rgbd, ps = pack([image_rgb, image_depth], 'h w *')
assert image_rgbd.shape == (h, w, 4)
# ps: [(3,), ()]

unpacked_rgb, unpacked_depth = unpack(image_rgbd, ps, 'h w *')
# the shape of each piece can also be given explicitly; results are (h, w, 3) and (h, w, 1)
rgb, depth = unpack(image_rgbd, [[3], [1]], 'h w *')

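Here ps stands for Packed Shapes: for each input tensor it records the shape that tensor occupied along the packed axis (the * in the pattern), which is the information unpack needs to split the result back.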

⚪ The CLS token in Vision Transformers

# batch, height, width, c and the transformer callable are assumed to be defined elsewhere
patch_tokens = np.random.random([batch, height, width, c])
class_tokens = np.zeros([batch, c])

input_packed, ps = pack([class_tokens, patch_tokens], 'b * c')
output_packed = transformer(input_packed)
class_token_emb, patch_tokens_emb = unpack(output_packed, ps, 'b * c_out')