Building fully-connected conditional random field with pydensecrf.

1. pydensecrf简介

pydensecrf是一个构造全连接型条件随机场CRF的python第三方库。全连接CRF模型采用能量函数：

\[E(X) = \sum_{u} \Phi(u) +\sum_{u,v} \Psi(u,v)\]

其中$u,v$是图像$X$上的像素，$\Phi(u)$是一元势函数，$\Psi(u,v)$是二元势函数。

对于二维图像，使用以下语句定义DenseCRF2D类：

import numpy as np
import pydensecrf.densecrf as dcrf

d = dcrf.DenseCRF2D(640, 480, 2)  # width, height, nlabels

2. 一元势 Unary potential

定义一元势函数$\Phi(u) = -\log p(u)$，构造一元势有两种常见的方法：

由人类或其他过程产生的硬标签。
由概率分布计算得到的软标签，例如深度网络的softmax输出。

⚪ 由硬标签构造一元势

from pydensecrf.utils import unary_from_labels
U = unary_from_labels(labels, n_labels, gt_prob, zero_unsure=True)

labels：(numpy.array)图像的标签，形状与图像相同，一个像素对应一个标签值
n_labels：(int)标签总数，不包括$0$标签
gt_prob：(float)标签的置信度，取值$(0,1)$
zero_unsure：(bool)不考虑$0$标签，默认为True

⚪ 由软标签构造一元势

将softmax输出转换成负对数似然值NLL。

from pydensecrf.utils import unary_from_softmax
U = unary_from_softmax(sm, scale=None, clip=1e-5)

sm：(numpy.array)网络输出概率值，第一个维度是类别
scale：(float)网络输出的确定性，如果不为None，则softmax输出被缩放到[0,scale]范围
clip：(float)概率裁剪的最小值，避免出现$\log 0$

该函数相当于进行如下操作：

-np.log(sm).reshape([num_cls, -1]).astype(np.float32)

⚪ 将一元势加入DenseCRF2D

d.setUnaryEnergy(U)

3. 二元势 Pairwise potentials

二元势函数$\Psi(u,v)$用于描述像素点和像素点之间的关系，鼓励相似像素分配相同的标签，而相差较大的像素分配不同的标签。

对于二维图像，常见的二元势有两种:

d.addPairwiseGaussian：高斯关系，创建与颜色无关的位置特征(参数实际相对距离sxy)，大致表示位置相似的像素可能属于同一个类
d.addPairwiseBilateral：双边关系，创建与颜色相关和位置相关的特征，大致表示颜色相似或位置相似的像素可能属于同一个类

# This adds the color-independent term, features are the locations only.
d.addPairwiseGaussian(sxy=(3,3), compat=3,
                      kernel=dcrf.DIAG_KERNEL,
                      normalization=dcrf.NORMALIZE_SYMMETRIC)

# This adds the color-dependent term, i.e. features are (x,y,r,g,b).
# im is an RGB image-array, e.g. im.dtype == np.uint8 and im.shape == (640,480,3)
d.addPairwiseBilateral(sxy=(80,80), srgb=(13,13,13),
                       rgbim=im, compat=10,
                       kernel=dcrf.DIAG_KERNEL,
                       normalization=dcrf.NORMALIZE_SYMMETRIC)

高斯情况下的sxy为$\theta_{\gamma}$，双边情况下，sxy和srgb分别映射到$\theta_{\alpha}$和$\theta_{\beta}$。分别代表“x/y标准偏差”(x/y standard-deviation，sxy)和“rgb标准偏差”(rgb standard-deviation,srgb)。

其余参数选择：

compat：(int)对应公式中的$w^{(1)}$和$w^{(2)}$
kernel：可选dcrf.DIAG_KERNEL,dcrf.CONST_KERNEL,dcrf.FULL_KERNEL
normalization：可选dcrf.NORMALIZE_SYMMETRIC,dcrf.NORMALIZE_AFTER,dcrf.NORMALIZE_BEFORE,dcrf.NO_NORMALIZATION

4. 推理 Inference

CRF模型结合一元势和二元势比较全面地考虑像素之间的关系，并得出优化后的结果：

Q = d.inference(50)
Q = np.argmax(Q, axis=0)
Q = Q.reshape((H, W))

也可以实现一步一步的推理：

Q, tmp1, tmp2 = d.startInference()
for i in range(50):
    print("KL-divergence at {}: {}".format(i, d.klDivergence(Q)))
    d.stepInference(Q, tmp1, tmp2)
Q = np.argmax(Q, axis=0)
Q = Q.reshape((H, W))

5. 实例：使用pydensecrf处理灰度图像

对于灰度图像，其通道数为$1$，因此无法直接构造双边关系的二元势。输入定义如下：

import numpy as np
import pydensecrf.densecrf as dcrf
import pydensecrf.utils as utils
from PIL import Image

fn = './input/image.png'
img = Image.open(fn).convert('L')
img = np.array(img) # [H, W]
prob = img/255.     # [H, W] 模拟softmax的结果

下面人为构造包含$2$个类的概率图，即一个”foreground“和一个”background“类。实现时通过np.tile复制数组：

prob = np.tile(prob[np.newaxis,:,:],(2,1,1))
prob[1,:,:] = 1 - prob[0,:,:]
print(prob.shape)   # [2, H, W]

在构造双边二元势时，使用如下函数：

pairwise_energy = utils.create_pairwise_bilateral(sdims=(10,10), schan=(0.01,), img=image, chdim=2)
d.addPairwiseEnergy(pairwise_energy, compat=10)

sdims：(list/tuple)每个维度的比例因子，相当于addPairwiseBilateral中的sxy
schan：(list/tuple)图像中每个通道的比例因子，相当于addPairwiseBilateral中的srgb
chdim：(int)指定通道维度在图像中的位置。例如 chdim=2说明其通道值放在第二个维度上。如果图像只有一个通道，则使用chdim=-1。

使用CRF处理图像：

class DenseCRF(object):
    def __init__(self, iter_max, pos_w, pos_xy_std, bi_w, bi_xy_std, bi_rgb_std):
        self.iter_max = iter_max
        self.pos_w = pos_w
        self.pos_xy_std = pos_xy_std
        self.bi_w = bi_w
        self.bi_xy_std = bi_xy_std
        self.bi_rgb_std = bi_rgb_std

    def __call__(self, image, probmap):
        H, W = image.shape

        U = utils.unary_from_softmax(probmap)
        U = np.ascontiguousarray(U)

        image = np.ascontiguousarray(image[:,:,np.newaxis])

        d = dcrf.DenseCRF2D(W, H, 2)
        d.setUnaryEnergy(U)
        d.addPairwiseGaussian(sxy=self.pos_xy_std, compat=self.pos_w)
        pairwise_energy = utils.create_pairwise_bilateral(sdims=self.bi_xy_std,
                                                          schan=self.bi_rgb_std, img=image, chdim=2)
        d.addPairwiseEnergy(pairwise_energy, compat=self.bi_w)

        Q, tmp1, tmp2 = d.startInference()
        for i in range(self.iter_max):
            print("KL-divergence at {}: {}".format(i, d.klDivergence(Q)))
            d.stepInference(Q, tmp1, tmp2)
        Q = np.array(Q).reshape((2, H, W))
        return Q[0]


postprocessor = DenseCRF(
    iter_max=10,
    pos_xy_std=3, 
    pos_w=3,      
    bi_xy_std=(10,10),
    bi_rgb_std=(0.01,),
    bi_w=10,
)

prob = postprocessor(img, prob)
result = Image.fromarray((prob * 255).astype(np.uint8)) 
result.save('res.jpg')