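Learn an R Package to Add a CNS-Level Figure to Your Paper (Part 5)
Master Complex Annotations in ComplexHeatmap in One Article
Hello everyone, I'm Feng. Today we continue with annotations in ComplexHeatmap. The body of a heatmap shows little more than a matrix and its clustering. To put more information on the figure, we have to attach it to that body, and the place to do so is the annotations. The simplest are bar annotations: each annotation marks a grouping, and gradients or distinct colors separate the elements within it. Common examples are using sample type to distinguish the disease and non-disease groups of a heatmap, or annotating an expression-matrix heatmap with clinical information. In high-impact papers we often see much richer annotations. They can be added on the top, bottom, left, and right of a heatmap, several at a time; even the row and column names can be styled so that some of them stand out, and the dendrograms can be annotated too. Once you have learned ComplexHeatmap, all of this becomes easy.
Earlier posts in this series:
A Step-by-Step Guide to Reproducing a CNS-Level Figure! Code Included, Worth Bookmarking!
Is It Hard to Reproduce a CNS-Level Figure? Beginners Can Get Started Quickly Too (Code Included) (Part 2)
Learn an R Package to Add a CNS-Level Figure to Your Paper (Part 4)
Today we continue with heatmap annotations in ComplexHeatmap:
We again build the same data as in the previous posts:
Load the R packages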
library(ComplexHeatmap)
library(circlize)
Build the same matrix as before
set.seed(123)
nr1 = 4; nr2 = 8; nr3 = 6; nr = nr1 + nr2 + nr3
nc1 = 6; nc2 = 8; nc3 = 10; nc = nc1 + nc2 + nc3
mat <- cbind(rbind(matrix(rnorm(nr1*nc1, mean = 1, sd = 0.5), nrow = nr1),
                   matrix(rnorm(nr2*nc1, mean = 0, sd = 0.5), nrow = nr2),
                   matrix(rnorm(nr3*nc1, mean = 0, sd = 0.5), nrow = nr3)),
             rbind(matrix(rnorm(nr1*nc2, mean = 0, sd = 0.5), nrow = nr1),
                   matrix(rnorm(nr2*nc2, mean = 1, sd = 0.5), nrow = nr2),
                   matrix(rnorm(nr3*nc2, mean = 0, sd = 0.5), nrow = nr3)),
             rbind(matrix(rnorm(nr1*nc3, mean = 0.5, sd = 0.5), nrow = nr1),
                   matrix(rnorm(nr2*nc3, mean = 0.5, sd = 0.5), nrow = nr2),
                   matrix(rnorm(nr3*nc3, mean = 1, sd = 0.5), nrow = nr3)))
# Shuffle the rows and columns so the block structure is not visible up front
mat <- mat[sample(nr, nr), sample(nc, nc)]
rownames(mat) = paste0("gene", seq_len(nr))
colnames(mat) = paste0("Sample", seq_len(nc))
head(mat)
# Sample1 Sample2 Sample3 Sample4 Sample5 Sample6
# gene1 0.90474160 -0.35229823 0.5016096 1.26769942 0.8251229 0.16215217
# gene2 0.90882972 0.79157121 1.0726316 0.01299521 0.1391978 0.46833693
# gene3 0.28074668 0.02987497 0.7052595 1.21514235 0.1747267 0.20949120
# gene4 0.02729558 0.75810969 0.5333504 -0.49637424 -0.5261114 0.56724357
# gene5 -0.32552445 1.03264652 1.1249573 0.66695147 0.4490584 1.04236865
# gene6 0.58403269 -0.47373731 0.5452483 0.86824798 -0.1976372 -0.03565404
# Sample7 Sample8 Sample9 Sample10 Sample11 Sample12
# gene1 -0.2869867 0.68032622 -0.1629658 0.8254537 0.7821773 -0.49625358
# gene2 1.2814948 0.38998256 -0.3473535 1.3508922 1.1183375 2.05005447
# gene3 -0.6423579 -0.31395304 0.2175907 -0.2973086 0.4322058 -0.25803192
# gene4 0.8127096 -0.01427338 1.0844780 0.2426662 0.8783874 1.38452112
# gene5 2.6205200 0.75823530 -0.2333277 1.3439584 0.8517620 0.85980233
# gene6 -0.3203530 1.05534136 0.7771690 0.4594983 0.2550648 -0.02778098
# Sample13 Sample14 Sample15 Sample16 Sample17 Sample18
# gene1 -0.0895258 -0.35520328 0.1072694 0.96322199 -0.39245223 -0.1878014
# gene2 1.3770269 -0.77437640 0.9829664 0.23854381 -0.53589561 1.3003544
# gene3 -0.5686518 -0.51321045 -0.0451598 0.82272880 -0.02251386 0.2427300
# gene4 0.8376570 0.10797078 0.4520019 0.81648036 0.02650211 0.4072600
# gene5 1.9986067 -0.50928769 0.7708173 -0.09421702 -1.15458444 1.2715970
# gene6 -0.2112484 0.01669142 -0.3259750 0.54460361 0.89101254 0.3699738
# Sample19 Sample20 Sample21 Sample22 Sample23 Sample24
# gene1 1.08736320 0.7132199 -0.1853300 -0.14238650 0.6407669 1.3266288
# gene2 0.14237891 0.4471643 0.4475628 -0.31251963 0.7057150 0.8120937
# gene3 1.09511516 0.5852612 0.1926402 0.51278568 1.6361334 1.1339175
# gene4 -0.02625664 0.9556956 0.3443201 0.07668656 1.7857291 0.5280084
# gene5 0.60599022 0.8304101 -0.1902355 -0.14753574 1.0123366 0.2140750
# gene6 0.81537706 1.2737905 1.8575325 0.88491126 0.5670193 0.5372756
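The matrix has 18 rows (genes) and 24 columns (samples), which is why the annotation examples later in this post use runif(18) for rows and runif(24) for columns. As a rough sketch, a small helper can check that before plotting. check_anno_length is a hypothetical name of my own, not a ComplexHeatmap function, and demo only stands in for mat:

```r
# Hypothetical helper (not part of ComplexHeatmap): verify that annotation
# values have one entry (or one matrix row) per heatmap row/column.
check_anno_length <- function(values, mat, which = c("column", "row")) {
  which <- match.arg(which)
  expected <- if (which == "column") ncol(mat) else nrow(mat)
  n <- if (is.matrix(values)) nrow(values) else length(values)
  if (n != expected) {
    stop(sprintf("%s annotation has %d values but the heatmap has %d %ss",
                 which, n, expected, which))
  }
  invisible(TRUE)
}

# A stand-in with the same dimensions as the matrix built above
demo <- matrix(0, nrow = 18, ncol = 24)

check_anno_length(runif(24), demo, "column")                    # vector, one per sample
check_anno_length(matrix(runif(48), ncol = 2), demo, "column")  # 24 x 2 matrix
check_anno_length(runif(18), demo, "row")                       # one per gene
```

With your own data, pass your real matrix instead of demo.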
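I think of an empty annotation as the blank canvas you set up when plotting with ggplot2 or plot, as covered in the training camp: once the blank canvas is in place, you keep layering annotation information on top of it. Isn't that just our usual way of building a figure? We use anno_empty to create one: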
ha = HeatmapAnnotation(foo = anno_empty(border = TRUE))
Heatmap(mat, name = "mat", top_annotation = ha)
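With the blank canvas in place, we can add whatever decoration we want. For example, when drawing a heatmap we may need to group the genes and then label a few important ones, just like labelling points on a volcano plot (covered in the training camp). We can highlight genes on a heatmap in the same way:
# Randomly generate letter strings to use as gene names
random_text = function(n){
    sapply(1:n, function(i){
        paste0(sample(letters, sample(3:5, 1)), collapse = "")
    })
}
# Randomly generate four text groups as the annotation content
text_list = list(
    text1 = random_text(2),
    text2 = random_text(3),
    text3 = random_text(3),
    text4 = random_text(4)
)
# Set up the empty annotation, wide enough for the longest label
ha = rowAnnotation(foo = anno_empty(border = FALSE,
    width = max_text_width(unlist(text_list)) + unit(4, "mm")))
# Draw the heatmap, then write the text into the empty annotation, one slice at a time
Heatmap(matrix(rnorm(1000), nrow = 100), name = "mat", row_km = 4, right_annotation = ha)
for(i in 1:4) {
    decorate_annotation("foo", slice = i, {
        grid.rect(x = 0, width = unit(2, "mm"), gp = gpar(fill = i, col = NA), just = "left")
        grid.text(paste(text_list[[i]], collapse = "\n"), x = unit(4, "mm"), just = "left")
    })
}
# With your own data, save the gene names in a text file, read it in and build a text_list, then swap in your own plotting data; nothing else needs to change. If you have more annotation groups, extend the loop to i = 5, 6, 7, 8, ... (and raise row_km to match)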
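As a minimal sketch of that last step, assuming the gene names arrive as a two-column table (gene name plus group number), text_list can be built with split(). The table contents and the column names gene and cluster are made up for illustration. Which heatmap slice each group ends up in is something to check on your own figure.

```r
# Hypothetical gene table; in practice it could come from read.table() or read.csv()
gene_table <- data.frame(
  gene    = c("TP53", "MYC", "EGFR", "KRAS", "BRCA1", "PTEN"),
  cluster = c(1, 1, 2, 3, 3, 4),
  stringsAsFactors = FALSE
)

# split() returns one character vector of gene names per cluster
text_list <- split(gene_table$gene, gene_table$cluster)
names(text_list) <- paste0("text", names(text_list))

# One multi-line label per group, ready to pass to grid.text() inside the loop
labels <- vapply(text_list, paste, character(1), collapse = "\n")

# Looping with seq_along(text_list) instead of 1:4 follows the number of groups
for (i in seq_along(text_list)) {
  cat("slice", i, ":", text_list[[i]], "\n")
}
```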
As mentioned above, an empty annotation is really just a canvas. Besides text, it can of course also hold a plot, for example:
# Add an empty annotation
ha = HeatmapAnnotation(foo = anno_empty(border = TRUE, height = unit(3, "cm")))
# Draw the heatmap (keep the matrix in a variable so it can be reused below)
m = matrix(rnorm(100), nrow = 10)
ht = Heatmap(m, name = "mat", top_annotation = ha)
ht = draw(ht)
# Extract the column order after clustering
co = column_order(ht)
# Randomly generate 10 values, one per column
value = runif(10)
# Draw the points inside the annotation
decorate_annotation("foo", {
    x = seq_len(ncol(m)) # x positions run from 1 to the number of columns
    value = value[co] # reorder the y values to follow the clustered column order
    pushViewport(viewport(xscale = c(0.5, 10.5), yscale = c(0, 1)))
    grid.lines(c(0.5, 10.5), c(0.5, 0.5), gp = gpar(lty = 2),
        default.units = "native")
    grid.points(x, value, pch = 11, size = unit(3, "mm"),
        gp = gpar(col = ifelse(value > 0.5, "red", "blue")), default.units = "native")
    grid.yaxis(at = c(0, 0.5, 1))
    popViewport()
})
# The example above draws points with grid; other kinds of graphics work too, such as circles
ha = HeatmapAnnotation(foo = anno_empty(border = TRUE, height = unit(3, "cm")))
m = matrix(rnorm(100), nrow = 10)
ht = Heatmap(m, name = "mat", top_annotation = ha)
ht = draw(ht)
co = column_order(ht)
value = runif(10)
decorate_annotation("foo", {
    x = seq_len(ncol(m)) # x positions run from 1 to the number of columns
    value = value[co] # reorder the y values to follow the clustered column order
    pushViewport(viewport(xscale = c(0.5, 10.5), yscale = c(0, 1)))
    grid.lines(c(0.5, 10.5), c(0.5, 0.5), gp = gpar(lty = 2),
        default.units = "native")
    grid.circle(x, value, r = 0.2,
        gp = gpar(fill = ifelse(value > 0.5, "pink", "skyblue")), default.units = "native")
    grid.yaxis(at = c(0, 0.5, 1))
    popViewport()
})
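The value[co] step is easy to get wrong, so here is a tiny base-R sketch of what it does. The index vector co below is made up and only stands in for the result of column_order(ht). As far as I understand, column_order() returns a list of index vectors when the heatmap is split into column slices, so in that case you would handle each slice separately or unlist() it first.

```r
# One value per column, in the original matrix column order (names for clarity)
value <- c(a = 0.1, b = 0.9, c = 0.4, d = 0.7)

# Made-up stand-in for column_order(ht): position k on the heatmap shows
# the column whose original index is co[k]
co <- c(3, 1, 4, 2)

# After reordering, the k-th value belongs to the k-th drawn column
plotted <- value[co]
print(plotted)
```

Here the leftmost drawn column is original column 3, so its value (0.4) comes first.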
OK, that is enough on empty annotations for now. We will use them again later, so I won't go further here. Next up is the one we probably use most often, the block annotation.
# Empty annotations use anno_empty, so block annotations use anno_block
Heatmap(mat,
    name = "mat",
    top_annotation = HeatmapAnnotation(foo = anno_block(gp = gpar(fill = 4:6))),
    column_km = 3)
# In the code above we add a colored block for each cluster; fill is the parameter that sets the colors
# Add text inside the colored blocks
Heatmap(mat, name = "mat",
    top_annotation = HeatmapAnnotation(foo = anno_block(gp = gpar(fill = 4:6),
        labels = c("Cluster1", "Cluster2", "Cluster3"),
        labels_gp = gpar(col = "gray40", fontsize = 10))),
    column_km = 3,
    left_annotation = rowAnnotation(foo = anno_block(gp = gpar(fill = 4:6),
        labels = c("Writer", "Reader", "Eraser"),
        labels_gp = gpar(col = "gray40", fontsize = 10))),
    row_km = 3)
# This looks a lot like pheatmap, but gives you more control
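With k-means splitting you do not know in advance which cluster a block label will land on. When the groups are known beforehand (for example disease versus normal samples), one option is to split by a factor instead. This is a sketch under my own assumptions: the small matrix and the group labels are invented, and it relies on the column_split and cluster_column_slices arguments of Heatmap() to keep the slices in factor-level order so the labels line up.

```r
set.seed(123)
# Invented demo data: 6 genes x 10 samples
small <- matrix(rnorm(60), nrow = 6,
                dimnames = list(paste0("gene", 1:6), paste0("Sample", 1:10)))

# Hypothetical known grouping of the samples; level order sets the slice order
group <- factor(rep(c("Normal", "Disease"), times = c(4, 6)),
                levels = c("Normal", "Disease"))

# One label and one fill color per level, in level order
block_labels <- levels(group)
block_fill <- c("skyblue", "pink")

# Only build the heatmap if ComplexHeatmap is installed
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  library(grid)  # for gpar()
  ht <- Heatmap(small, name = "mat",
                column_split = group,
                cluster_column_slices = FALSE,  # keep slices in level order
                top_annotation = HeatmapAnnotation(
                  grp = anno_block(gp = gpar(fill = block_fill),
                                   labels = block_labels)))
  # draw(ht) renders the figure
}
```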
Earlier we drew points with grid, but that takes a fair amount of work. Is there a convenient function for it? There is:
# anno_points draws a point annotation that shows the distribution of a set of values; its input x can be a single vector or a matrix
# Use a vector as the content of the point annotation
Heatmap(mat,
    name = "mat",
    left_annotation = rowAnnotation(foo = anno_points(runif(18))),
    column_km = 3)
# Use a matrix as the content of the annotation (24 rows, one per sample; 2 columns, one per series)
anno_matrix <- matrix(runif(48), ncol = 2)
Heatmap(mat, name = "mat",
    top_annotation = HeatmapAnnotation(foo = anno_points(anno_matrix,
        pch = 10:11, gp = gpar(col = 10:11))),
    column_km = 3)
# There are of course many parameters for customization
Heatmap(mat, name = "mat",
    top_annotation = HeatmapAnnotation(Percentage = anno_points(runif(24), ylim = c(0, 1),
        height = unit(3, "cm"),
        axis_param = list(side = "right",
            at = c(0, 0.5, 1),
            labels = c("0", "50%", "100%")))),
    column_km = 3)
Heatmap(mat, name = "mat",
    left_annotation = rowAnnotation(Percentage = anno_points(runif(18), ylim = c(0, 1),
        width = unit(2, "cm"),
        axis_param = list(side = "bottom",
            at = c(0, 0.5, 1),
            labels = c("0", "50%", "100%"),
            labels_rot = 45))),
    column_km = 3)
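The examples above use random numbers. With real per-sample values, keep in mind that the annotation vector is matched to the matrix by position, so it should be put in the same order as the matrix columns first. A minimal sketch follows; the sample IDs and scores are invented, and sample_ids stands in for colnames(mat).

```r
# Stand-in for colnames(mat)
sample_ids <- paste0("Sample", 1:6)

# Invented per-sample scores, stored in a different order than the matrix columns
risk_score <- c(Sample3 = 0.9, Sample1 = 0.2, Sample6 = 0.5,
                Sample2 = 0.7, Sample5 = 0.1, Sample4 = 0.4)

# Fail early if a sample is missing or unexpected
stopifnot(setequal(names(risk_score), sample_ids))

# Reorder by name so element k belongs to matrix column k
aligned <- risk_score[sample_ids]
print(aligned)

# The aligned vector could then be used as, for example:
# HeatmapAnnotation(RiskScore = anno_points(aligned))
```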
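If points alone feel too plain, you can add lines, which gives a line annotation.
# anno_lines is usually used together with anno_points: it connects the points into lines, like an ordinary point-and-line plot
# No points, just lines
Heatmap(mat,
    name = "mat",
    top_annotation = HeatmapAnnotation(foo = anno_lines(runif(24))),
    column_km = 3)
# Use a matrix (one line per column of the matrix)
Heatmap(mat,
    name = "mat",
    top_annotation = HeatmapAnnotation(foo = anno_lines(cbind(c(1:12, 1:12), c(12:1, 12:1)),
        gp = gpar(col = 2:3),
        add_points = FALSE,
        pt_gp = gpar(col = 5:6),
        pch = c(1, 16))),
    column_km = 3)
# To add the points, just change add_points above to TRUE
# The lines can also be smoothed
Heatmap(mat,
    name = "mat",
    top_annotation = HeatmapAnnotation(foo = anno_lines(cbind(c(1:12, 1:12), c(12:1, 12:1)),
        gp = gpar(col = 2:3),
        add_points = TRUE,
        smooth = TRUE,
        height = unit(3, "cm"),
        pt_gp = gpar(col = 5:6),
        pch = c(1, 2))),
    column_km = 3)
# This smoothing behaves like a spline-regression fit (according to the package documentation, smooth = TRUE applies loess smoothing), so the curve does not follow the data exactly. Unless you have a specific purpose, it is not recommended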
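To see why a smoothed curve does not pass through the data, here is a small base-R sketch that runs loess() on the same sawtooth values used above. This is only an illustration with the default loess() settings; I have not checked which settings anno_lines uses internally, so the numbers will not necessarily match what the annotation draws.

```r
x <- 1:24
y <- c(1:12, 1:12)        # first column of the matrix passed to anno_lines above

fit <- loess(y ~ x)       # default span and degree, an assumption for illustration
smoothed <- predict(fit)  # fitted value at each x

# Compare raw and smoothed values around the jump from 12 back to 1
print(round(cbind(x = x, raw = y, smoothed = smoothed)[10:15, ], 2))

# Largest gap between the raw data and the smoothed curve
print(max(abs(smoothed - y)))
```

The sharp drop in the middle gets averaged away, which is the kind of distortion the comment above warns about.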
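OK, that's all for today. Where can these point and line annotations be used? Personally I think there is a lot of room to play. For example, you can plot the commonly used RiskScore, or data from other dimensions such as TMB or MSI, which gives your data one more dimension to discuss, and you can add color to make the figure stand out. It all depends on your own creativity. Next time we start on graphical annotations, including stacked bar plots, box plots, density plots, and more, all building on today's content! See you next time! o(* ̄▽ ̄*)ブ
You are welcome to follow the 解螺旋 bioinformatics channel's official account, 挑圈联靠~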