Learn an R package to add a CNS-level figure to your paper (Part 5)

Complex annotations in the ComplexHeatmap package, in one post

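Hi everyone, this is Feng. Today we continue with annotations in ComplexHeatmap. The body of a heatmap shows little more than a matrix and its clustering, so any extra information on the figure has to come from annotations. The simplest are bar annotations: each annotation marks one grouping, and a gradient or a set of distinct colors separates the elements within it. Typical examples are marking disease versus non-disease samples by sample type, or annotating an expression heatmap with clinical information. High-impact papers often go further. Annotations can be placed on the top, bottom, left and right of the heatmap, several can be stacked together, row and column names can be styled so that only some are highlighted, and even the dendrogram can be annotated. Once you know ComplexHeatmap, all of this becomes easy.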
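Previous posts in this series
A hands-on guide to reproducing a CNS-level figure (code included, worth bookmarking)
Is a CNS-level figure hard to reproduce? Beginners can get started quickly too (code included) (Part 2)
Beginner's class | Learn an R package and reproduce a CNS-level figure (code included)
Learn an R package to add a CNS-level figure to your paper (Part 4)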

Today we continue with heatmap annotations in ComplexHeatmap:

Loading the packages

We build the same data as in the previous posts:

# Load the R packages
library(ComplexHeatmap)
library(circlize)

# Build the same matrix as before
set.seed(123)
nr1 = 4; nr2 = 8; nr3 = 6; nr = nr1 + nr2 + nr3
nc1 = 6; nc2 = 8; nc3 = 10; nc = nc1 + nc2 + nc3
mat <- cbind(
  rbind(matrix(rnorm(nr1*nc1, mean = 1,   sd = 0.5), nrow = nr1),
        matrix(rnorm(nr2*nc1, mean = 0,   sd = 0.5), nrow = nr2),
        matrix(rnorm(nr3*nc1, mean = 0,   sd = 0.5), nrow = nr3)),
  rbind(matrix(rnorm(nr1*nc2, mean = 0,   sd = 0.5), nrow = nr1),
        matrix(rnorm(nr2*nc2, mean = 1,   sd = 0.5), nrow = nr2),
        matrix(rnorm(nr3*nc2, mean = 0,   sd = 0.5), nrow = nr3)),
  rbind(matrix(rnorm(nr1*nc3, mean = 0.5, sd = 0.5), nrow = nr1),
        matrix(rnorm(nr2*nc3, mean = 0.5, sd = 0.5), nrow = nr2),
        matrix(rnorm(nr3*nc3, mean = 1,   sd = 0.5), nrow = nr3))
)
# Shuffle the rows and columns
mat <- mat[sample(nr, nr), sample(nc, nc)]
rownames(mat) = paste0("gene", seq_len(nr))
colnames(mat) = paste0("Sample", seq_len(nc))
head(mat)
##           Sample1     Sample2   Sample3     Sample4    Sample5     Sample6
## gene1  0.90474160 -0.35229823 0.5016096  1.26769942  0.8251229  0.16215217
## gene2  0.90882972  0.79157121 1.0726316  0.01299521  0.1391978  0.46833693
## gene3  0.28074668  0.02987497 0.7052595  1.21514235  0.1747267  0.20949120
## gene4  0.02729558  0.75810969 0.5333504 -0.49637424 -0.5261114  0.56724357
## gene5 -0.32552445  1.03264652 1.1249573  0.66695147  0.4490584  1.04236865
## gene6  0.58403269 -0.47373731 0.5452483  0.86824798 -0.1976372 -0.03565404
##          Sample7     Sample8    Sample9   Sample10  Sample11    Sample12
## gene1 -0.2869867  0.68032622 -0.1629658  0.8254537 0.7821773 -0.49625358
## gene2  1.2814948  0.38998256 -0.3473535  1.3508922 1.1183375  2.05005447
## gene3 -0.6423579 -0.31395304  0.2175907 -0.2973086 0.4322058 -0.25803192
## gene4  0.8127096 -0.01427338  1.0844780  0.2426662 0.8783874  1.38452112
## gene5  2.6205200  0.75823530 -0.2333277  1.3439584 0.8517620  0.85980233
## gene6 -0.3203530  1.05534136  0.7771690  0.4594983 0.2550648 -0.02778098
##         Sample13    Sample14   Sample15    Sample16    Sample17   Sample18
## gene1 -0.0895258 -0.35520328  0.1072694  0.96322199 -0.39245223 -0.1878014
## gene2  1.3770269 -0.77437640  0.9829664  0.23854381 -0.53589561  1.3003544
## gene3 -0.5686518 -0.51321045 -0.0451598  0.82272880 -0.02251386  0.2427300
## gene4  0.8376570  0.10797078  0.4520019  0.81648036  0.02650211  0.4072600
## gene5  1.9986067 -0.50928769  0.7708173 -0.09421702 -1.15458444  1.2715970
## gene6 -0.2112484  0.01669142 -0.3259750  0.54460361  0.89101254  0.3699738
##          Sample19  Sample20   Sample21    Sample22  Sample23  Sample24
## gene1  1.08736320 0.7132199 -0.1853300 -0.14238650 0.6407669 1.3266288
## gene2  0.14237891 0.4471643  0.4475628 -0.31251963 0.7057150 0.8120937
## gene3  1.09511516 0.5852612  0.1926402  0.51278568 1.6361334 1.1339175
## gene4 -0.02625664 0.9556956  0.3443201  0.07668656 1.7857291 0.5280084
## gene5  0.60599022 0.8304101 -0.1902355 -0.14753574 1.0123366 0.2140750
## gene6  0.81537706 1.2737905  1.8575325  0.88491126 0.5670193 0.5372756

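Empty annotation (setting up a canvas)

I think of an empty annotation the way our training camp describes setting up a canvas in ggplot2 or plot: you lay down a blank canvas first, then keep stacking annotation layers on top of it so that each one is mapped onto the canvas. That is the same workflow we follow for any ordinary plot. The function for this is anno_empty:

ha = HeatmapAnnotation(foo = anno_empty(border = TRUE))
Heatmap(mat, name = "mat", top_annotation = ha)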

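With the blank canvas in place, we can draw whatever decoration we want on it. For example, when the genes in a heatmap are split into groups, we may want to call out a few important ones, much like labeling points on a volcano plot (covered in the training camp course). The same kind of highlighting works on a heatmap: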

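# Generate random letter combinations to use as gene names
random_text = function(n) {
  sapply(1:n, function(i) {
    paste0(sample(letters, sample(3:5, 1)), collapse = "")
  })
}

# Generate four random sets of text as the annotation content
text_list = list(
  text1 = random_text(2),
  text2 = random_text(3),
  text3 = random_text(3),
  text4 = random_text(4)
)

# Set up the empty annotation, wide enough for the longest label
ha = rowAnnotation(foo = anno_empty(border = FALSE,
  width = max_text_width(unlist(text_list)) + unit(4, "mm")))

# Draw the heatmap, then write the text into the empty annotation, slice by slice
Heatmap(matrix(rnorm(1000), nrow = 100), name = "mat", row_km = 4, right_annotation = ha)
for(i in 1:4) {
  decorate_annotation("foo", slice = i, {
    # A thin colored bar on the left edge of the slice
    grid.rect(x = 0, width = unit(2, "mm"), gp = gpar(fill = i, col = NA), just = "left")
    # The gene names of this group, one per line
    grid.text(paste(text_list[[i]], collapse = "\n"), x = unit(4, "mm"), just = "left")
  })
}

# With your own data, you can save the gene names in a text file, read them in and build a text_list, then swap in your own matrix; nothing else needs to change. If you have more annotation groups, extend the loop to i = 5, 6, 7, 8, ... (and raise row_km to match, so that each slice exists)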
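The note above about building text_list from your own gene names can be sketched in base R. This is only an illustration: the data frame `highlight`, its columns `gene` and `cluster`, and the gene symbols are hypothetical placeholders, and it assumes you already know which row slice each gene should be printed next to (the k-means slice numbering in the heatmap is not guaranteed to match your own grouping).

```r
# Hypothetical input: one row per gene to highlight, plus the row slice
# (1 to 4) it should be printed next to. In practice this table could be
# read from your own file instead of being typed in.
highlight <- data.frame(
  gene    = c("TP53", "EGFR", "MYC", "KRAS", "BRCA1", "PTEN"),
  cluster = c(1, 1, 2, 3, 3, 4),
  stringsAsFactors = FALSE
)

# split() returns a named list with one character vector per cluster,
# the same shape as text_list in the example above
text_list <- split(highlight$gene, highlight$cluster)

# One label per slice: the gene names joined by line breaks, which is
# what grid.text() receives inside decorate_annotation()
labels <- vapply(text_list, paste, character(1), collapse = "\n")
print(labels)
```

Looping over `seq_along(text_list)` instead of a hard-coded `1:4` would then keep the decoration loop in step with however many groups the table contains.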

As mentioned above, an empty annotation is just a canvas. Besides text, it can of course hold a plot as well, for example:

# Add an empty annotation
ha = HeatmapAnnotation(foo = anno_empty(border = TRUE, height = unit(3, "cm")))
# Draw the heatmap
ht = Heatmap(matrix(rnorm(100), nrow = 10), name = "mat", top_annotation = ha)
ht = draw(ht)
# Extract the column order after clustering
co = column_order(ht)
# Generate 10 random values, one per column
value = runif(10)
# Draw the points inside the annotation
decorate_annotation("foo", {
  x = seq_along(co)   # x positions run from 1 to the number of columns
  value = value[co]   # y values follow the column order after clustering
  pushViewport(viewport(xscale = c(0.5, 10.5), yscale = c(0, 1)))
  grid.lines(c(0.5, 10.5), c(0.5, 0.5), gp = gpar(lty = 2), default.units = "native")
  grid.points(x, value, pch = 11, size = unit(3, "mm"),
    gp = gpar(col = ifelse(value > 0.5, "red", "blue")), default.units = "native")
  grid.yaxis(at = c(0, 0.5, 1))
  popViewport()
})

# The example above draws points with grid; other shapes can be drawn the same way
ha = HeatmapAnnotation(foo = anno_empty(border = TRUE, height = unit(3, "cm")))
ht = Heatmap(matrix(rnorm(100), nrow = 10), name = "mat", top_annotation = ha)
ht = draw(ht)
co = column_order(ht)
value = runif(10)
decorate_annotation("foo", {
  x = seq_along(co)   # x positions run from 1 to the number of columns
  value = value[co]   # y values follow the column order after clustering
  pushViewport(viewport(xscale = c(0.5, 10.5), yscale = c(0, 1)))
  grid.lines(c(0.5, 10.5), c(0.5, 0.5), gp = gpar(lty = 2), default.units = "native")
  grid.circle(x, value, r = 0.2,
    gp = gpar(fill = ifelse(value > 0.5, "pink", "skyblue")), default.units = "native")
  grid.yaxis(at = c(0, 0.5, 1))
  popViewport()
})
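Both examples index `value` by `co` before drawing. The base-R sketch below shows why that step matters. It uses the `order` component of a plain `hclust()` fit as a stand-in for `column_order(ht)`; that substitution is an assumption for illustration, and the order ComplexHeatmap actually returns may differ because it reorders the dendrogram.

```r
set.seed(1)
m <- matrix(rnorm(100), nrow = 10)
value <- runif(10)   # value[j] belongs to column j of m

# Stand-in for column_order(ht): a hierarchical clustering order of the columns
co <- hclust(dist(t(m)))$order

# Position k on the plot shows original column co[k],
# so the value drawn at position k has to be value[co[k]]
x <- seq_along(co)
y <- value[co]

# Side-by-side view: plot position, original column, value drawn there
print(data.frame(position = x, column = co, value = y))
```

Plotting `value` without this reordering would attach each point to whichever column happened to land in that position after clustering, not to the column it was measured on.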

That is enough about empty annotations for now. We will use them again later, so we will not go further here and instead move on to what is probably the most commonly used kind, the block annotation.

Block annotation

# Empty annotations use anno_empty, so block annotations use anno_block
Heatmap(mat, name = "mat",
  top_annotation = HeatmapAnnotation(foo = anno_block(gp = gpar(fill = 4:6))),
  column_km = 3)

# The code above adds a colored block to each cluster; fill is the parameter the colors are mapped to
# Add text inside the colored blocks
Heatmap(mat, name = "mat",
  top_annotation = HeatmapAnnotation(foo = anno_block(gp = gpar(fill = 4:6),
    labels = c("Cluster1", "Cluster2", "Cluster3"),
    labels_gp = gpar(col = "gray40", fontsize = 10))),
  column_km = 3,
  left_annotation = rowAnnotation(foo = anno_block(gp = gpar(fill = 4:6),
    labels = c("Writer", "Reader", "Eraser"),
    labels_gp = gpar(col = "gray40", fontsize = 10))),
  row_km = 3)

# This looks a lot like pheatmap, but gives you finer control

Point annotation

Earlier we drew points with grid, which is rather tedious. Is there a ready-made function for it? There is:

# anno_points draws a point annotation that shows the distribution of a set of values; the data x can be a single vector or a matrix
# Use a vector as the content of the point annotation (one value per row, 18 rows)
Heatmap(mat, name = "mat",
  left_annotation = rowAnnotation(foo = anno_points(runif(18))),
  column_km = 3)

# Use a matrix as the content of the annotation (24 rows, one per column of mat; 2 series)
anno_matrix <- matrix(runif(48), ncol = 2)
Heatmap(mat, name = "mat",
  top_annotation = HeatmapAnnotation(foo = anno_points(anno_matrix,
    pch = 10:11, gp = gpar(col = 10:11))),
  column_km = 3)

# There are also plenty of parameters for customization
Heatmap(mat, name = "mat",
  top_annotation = HeatmapAnnotation(Percentage = anno_points(runif(24),
    ylim = c(0, 1),
    height = unit(3, "cm"),
    axis_param = list(side = "right", at = c(0, 0.5, 1), labels = c("0", "50%", "100%")))),
  column_km = 3)

Heatmap(mat, name = "mat",
  left_annotation = rowAnnotation(Percentage = anno_points(runif(18),
    ylim = c(0, 1),
    width = unit(2, "cm"),
    axis_param = list(side = "bottom", at = c(0, 0.5, 1), labels = c("0", "50%", "100%"),
      labels_rot = 45))),
  column_km = 3)
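The axis labels in the two examples above are typed by hand. As a small convenience, they can be derived from the tick positions so the two never drift apart. This is only a sketch in base R: the variable names are made up for illustration, and the resulting list is meant to be passed as the `axis_param` argument shown above.

```r
# Tick positions on the 0 to 1 scale used by ylim = c(0, 1)
at <- c(0, 0.5, 1)

# Derive the percentage labels from the tick positions
labels <- paste0(at * 100, "%")

# Same structure as the axis_param lists in the examples above
axis_param <- list(side = "right", at = at, labels = labels)
str(axis_param)
```

Note that this writes the first tick as "0%" where the hand-typed version uses "0"; adjust the first label if you prefer the original look.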

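If points alone look too plain, you can connect them with lines, which gives a line annotation.

Line annotation

# anno_lines is often used alongside anno_points: it joins the points into a line, like an ordinary point-and-line chart
# No points, just the line
Heatmap(mat, name = "mat",
  top_annotation = HeatmapAnnotation(foo = anno_lines(runif(24))),
  column_km = 3)

# Use a matrix (24 rows, 2 series)
Heatmap(mat, name = "mat",
  top_annotation = HeatmapAnnotation(foo = anno_lines(cbind(c(1:12, 1:12), c(12:1, 12:1)),
    gp = gpar(col = 2:3), add_points = FALSE,
    pt_gp = gpar(col = 5:6), pch = c(1, 16))),
  column_km = 3)

# To add the points, just change add_points above to TRUE
# The curve can also be smoothed
Heatmap(mat, name = "mat",
  top_annotation = HeatmapAnnotation(foo = anno_lines(cbind(c(1:12, 1:12), c(12:1, 12:1)),
    gp = gpar(col = 2:3), add_points = TRUE, smooth = TRUE,
    height = unit(3, "cm"),
    pt_gp = gpar(col = 5:6), pch = c(1, 2))),
  column_km = 3)

# The smoothed curve is a fit, much like a spline regression fit, so it does not pass exactly through the data. Unless you have a specific reason, it is not recommended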
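To see how far a smoothed curve can drift from the data, here is a base-R sketch on the same zig-zag series used in the matrix example. It assumes that `smooth = TRUE` relies on a loess fit, as the anno_lines documentation describes; the `loess()` call below uses R's default settings, which may not match what the package uses internally.

```r
x <- 1:24
y <- c(1:12, 1:12)   # same zig-zag series as the first column in the example above

# Local regression fit with R's default span and degree
fit <- loess(y ~ x)
smoothed <- predict(fit)

# Largest gap between the smoothed curve and the actual values
print(max(abs(smoothed - y)))

# The sharp drop from 12 back to 1 is where the fit departs most from the data
print(round(smoothed[11:14], 2))
print(y[11:14])
```

A fit like this is fine for showing a trend, but for values a reader may want to read off the plot, the unsmoothed line is the safer choice.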

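That wraps up today's post. Where can point and line annotations be used? I think there is plenty of room to play. You could plot the familiar RiskScore, or data from another dimension such as TMB or MSI, to give your analysis one more angle for discussion, and add color to make the figure stand out. It all comes down to your own creativity. Next time we start on graphical annotations, including stacked bar plots, box plots, density plots and more, all of which build on what we covered today. See you in the next post! o(*￣▽￣*)ブ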
