Understanding Thanos Multi-Cluster Monitoring in One Article
Introduction
- https://github.com/particuleio/teks/tree/main/terragrunt/live/thanos
- https://github.com/particuleio/terraform-kubernetes-addons/tree/main/modules/aws
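The Kubernetes Prometheus stack
- Prometheus: collects metrics
- Alertmanager: sends alerts to various providers based on metric queries
- Grafana: visualization with rich dashboards
Enter Thanos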
- Thanos Store
- Thanos Sidecar
- Thanos Query
Multi-cluster architecture
- One observer cluster [3]
- One observee cluster [4]
.
├── env_tags.yaml
├── eu-west-1
│   ├── clusters
│   │   └── observer
│   │       ├── eks
│   │       │   ├── kubeconfig
│   │       │   └── terragrunt.hcl
│   │       ├── eks-addons
│   │       │   └── terragrunt.hcl
│   │       └── vpc
│   │           └── terragrunt.hcl
│   └── region_values.yaml
└── eu-west-3
    ├── clusters
    │   └── observee
    │       ├── cluster_values.yaml
    │       ├── eks
    │       │   ├── kubeconfig
    │       │   └── terragrunt.hcl
    │       ├── eks-addons
    │       │   └── terragrunt.hcl
    │       └── vpc
    │           └── terragrunt.hcl
    └── region_values.yaml
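The layout above follows a region/clusters/cluster-name/component convention, with one terragrunt.hcl per component. As a rough illustration of that convention, here is a minimal Python sketch that walks such a tree and lists each cluster with its components. The function name `find_clusters` and the idea of scanning the tree from a script are assumptions made for this example; they are not part of the teks repository.

```python
from pathlib import Path


def find_clusters(live_root):
    """Return (region, cluster, components) tuples found under live_root.

    Assumes the <region>/clusters/<cluster>/<component>/terragrunt.hcl
    layout shown in the tree above.
    """
    clusters = []
    for cluster_dir in sorted(Path(live_root).glob("*/clusters/*")):
        if not cluster_dir.is_dir():
            continue
        # A component is any sub-directory that holds a terragrunt.hcl file.
        components = sorted(
            hcl.parent.name for hcl in cluster_dir.glob("*/terragrunt.hcl")
        )
        region = cluster_dir.parts[-3]  # <region>/clusters/<cluster>
        clusters.append((region, cluster_dir.name, components))
    return clusters
```

Run against the tree above, this would report the observer cluster in eu-west-1 and the observee cluster in eu-west-3, each with the eks, eks-addons and vpc components.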
- Grafana enabled
- Thanos sidecar uploading to a dedicated bucket
kube-prometheus-stack = {
  enabled                     = true
  allowed_cidrs               = dependency.vpc.outputs.private_subnets_cidr_blocks
  thanos_sidecar_enabled      = true
  thanos_bucket_force_destroy = true
  extra_values                = <<-EXTRA_VALUES
    grafana:
      deploymentStrategy:
        type: Recreate
      ingress:
        enabled: true
        annotations:
          kubernetes.io/ingress.class: nginx
          cert-manager.io/cluster-issuer: "letsencrypt"
        hosts:
          - grafana.${local.default_domain_suffix}
        tls:
          - secretName: grafana.${local.default_domain_suffix}
            hosts:
              - grafana.${local.default_domain_suffix}
      persistence:
        enabled: true
        storageClassName: ebs-sc
        accessModes:
          - ReadWriteOnce
        size: 1Gi
    prometheus:
      prometheusSpec:
        replicas: 1
        retention: 2d
        retentionSize: "10GB"
        ruleSelectorNilUsesHelmValues: false
        serviceMonitorSelectorNilUsesHelmValues: false
        podMonitorSelectorNilUsesHelmValues: false
        storageSpec:
          volumeClaimTemplate:
            spec:
              storageClassName: ebs-sc
              accessModes: ["ReadWriteOnce"]
              resources:
                requests:
                  storage: 10Gi
    EXTRA_VALUES
}
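- This CA will be trusted by the ingress in front of the sidecar on the observee clusters
- TLS certificates are generated for the Thanos querier components that will query the observee clusters
- All Thanos components are deployed
- Query frontend, which serves as the datasource endpoint for Grafana
- Store gateway, used to query the observer bucket
- Query, which runs queries against the store gateways and the other queriers
- Thanos queriers configured with TLS, one querying each observee cluster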
thanos-tls-querier = {
  "observee" = {
    enabled                 = true
    default_global_requests = true
    default_global_limits   = false
    stores = [
      "thanos-sidecar.${local.default_domain_suffix}:443"
    ]
  }
}
thanos-storegateway = {
  "observee" = {
    enabled                 = true
    default_global_requests = true
    default_global_limits   = false
    bucket                  = "thanos-store-pio-thanos-observee"
    region                  = "eu-west-3"
  }
}
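- Thanos sidecar uploading to a bucket specific to the observee cluster
- Thanos sidecar published through an Ingress object with TLS client authentication, trusting the observer cluster CA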
kube-prometheus-stack = {
  enabled                     = true
  allowed_cidrs               = dependency.vpc.outputs.private_subnets_cidr_blocks
  thanos_sidecar_enabled      = true
  thanos_bucket_force_destroy = true
  extra_values                = <<-EXTRA_VALUES
    grafana:
      enabled: false
    prometheus:
      thanosIngress:
        enabled: true
        ingressClassName: nginx
        annotations:
          cert-manager.io/cluster-issuer: "letsencrypt"
          nginx.ingress.kubernetes.io/ssl-redirect: "true"
          nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
          nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
          nginx.ingress.kubernetes.io/auth-tls-secret: "monitoring/thanos-ca"
        hosts:
          - thanos-sidecar.${local.default_domain_suffix}
        paths:
          - /
        tls:
          - secretName: thanos-sidecar.${local.default_domain_suffix}
            hosts:
              - thanos-sidecar.${local.default_domain_suffix}
      prometheusSpec:
        replicas: 1
        retention: 2d
        retentionSize: "6GB"
        ruleSelectorNilUsesHelmValues: false
        serviceMonitorSelectorNilUsesHelmValues: false
        podMonitorSelectorNilUsesHelmValues: false
        storageSpec:
          volumeClaimTemplate:
            spec:
              storageClassName: ebs-sc
              accessModes: ["ReadWriteOnce"]
              resources:
                requests:
                  storage: 10Gi
    EXTRA_VALUES
}
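The ingress annotations above switch on client-certificate verification (`auth-tls-verify-client: "on"`) against the CA stored in `monitoring/thanos-ca`, and forward traffic as gRPC. Any client reaching the sidecar therefore has to present a certificate signed by that CA. The following is a minimal sketch of the client side of such a mutual-TLS setup using only the Python standard library. It is an illustration, not how the Thanos querier (written in Go) does it; the function name and the certificate paths in the usage note are hypothetical.

```python
import ssl
from typing import Optional


def build_mtls_client_context(
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a TLS client context that can present a client certificate."""
    # Verify the server certificate against the system trust store; the
    # ingress certificate above comes from the "letsencrypt" cluster issuer.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # gRPC runs over HTTP/2, so advertise "h2" through ALPN.
    ctx.set_alpn_protocols(["h2"])
    # Present the client certificate only when both paths are supplied
    # (hypothetical files, signed by the CA the ingress trusts).
    if client_cert and client_key:
        ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return ctx
```

A caller would pass its own files, for example `build_mtls_client_context("tls.crt", "tls.key")`, and hand the resulting context to whatever HTTP/2 or gRPC client library it uses. Without a client certificate the ingress is expected to reject the handshake or the request.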
- Thanos compactor to manage downsampling for this specific cluster
thanos = {
  enabled              = true
  bucket_force_destroy = true
  trusted_ca_content   = dependency.thanos-ca.outputs.thanos_ca
  extra_values         = <<-EXTRA_VALUES
    compactor:
      retentionResolution5m: 90d
    query:
      enabled: false
    queryFrontend:
      enabled: false
    storegateway:
      enabled: false
    EXTRA_VALUES
}
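Two retention windows appear in these configurations: Prometheus keeps 2d of data locally (`retention: 2d`), while the compactor keeps 5-minute downsampled data in the bucket for 90d (`retentionResolution5m: 90d`). As a small worked example of how those windows compare, here is a sketch that parses such duration strings and checks whether data of a given age still falls inside a window. The helper names are made up for this example, and the parser only handles single-unit values like the ones used above.

```python
import re
from datetime import timedelta

# Single-unit suffixes as they appear in the configs above (e.g. "2d", "90d").
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_duration(text: str) -> timedelta:
    """Convert a value such as "2d" or "90d" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhdw])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def within_retention(age: timedelta, retention: str) -> bool:
    """Return True if data of the given age is still inside the window."""
    return age <= parse_duration(retention)


# Values taken from the configurations above.
PROMETHEUS_LOCAL = "2d"
BUCKET_5M = "90d"
```

For instance, a sample that is 30 days old is outside the local Prometheus window but still inside the 5-minute-resolution window, so it can only be answered from the bucket through a store gateway.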
Digging a little deeper
kubectl -n monitoring get pods
NAME READY STATUS RESTARTS AGE
alertmanager-kube-prometheus-stack-alertmanager-0 2/2 Running 0 120m
kube-prometheus-stack-grafana-c8768466b-rd8wm 2/2 Running 0 120m
kube-prometheus-stack-kube-state-metrics-5cf575d8f8-x59rd 1/1 Running 0 120m
kube-prometheus-stack-operator-6856b9bb58-hdrb2 1/1 Running 0 119m
kube-prometheus-stack-prometheus-node-exporter-8hvmv 1/1 Running 0 117m
kube-prometheus-stack-prometheus-node-exporter-cwlfd 1/1 Running 0 120m
kube-prometheus-stack-prometheus-node-exporter-rsss5 1/1 Running 0 120m
kube-prometheus-stack-prometheus-node-exporter-rzgr9 1/1 Running 0 120m
prometheus-kube-prometheus-stack-prometheus-0 3/3 Running 1 120m
thanos-compactor-74784bd59d-vmvps 1/1 Running 0 119m
thanos-query-7c74db546c-d7bp8 1/1 Running 0 12m
thanos-query-7c74db546c-ndnx2 1/1 Running 0 12m
thanos-query-frontend-5cbcb65b57-5sx8z 1/1 Running 0 119m
thanos-query-frontend-5cbcb65b57-qjhxg 1/1 Running 0 119m
thanos-storegateway-0 1/1 Running 0 119m
thanos-storegateway-1 1/1 Running 0 118m
thanos-storegateway-observee-storegateway-0 1/1 Running 0 12m
thanos-storegateway-observee-storegateway-1 1/1 Running 0 11m
thanos-tls-querier-observee-query-dfb9f79f9-4str8 1/1 Running 0 29m
thanos-tls-querier-observee-query-dfb9f79f9-xsq24 1/1 Running 0 29m
kubectl -n monitoring get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
kube-prometheus-stack-grafana <none> grafana.thanos.teks-tg.clusterfrak-dynamics.io k8s-ingressn-ingressn-afa0a48374-f507283b6cd101c5.elb.eu-west-1.amazonaws.com 80, 443 123m
kubectl -n monitoring get pods
NAME READY STATUS RESTARTS AGE
alertmanager-kube-prometheus-stack-alertmanager-0 2/2 Running 0 39m
kube-prometheus-stack-kube-state-metrics-5cf575d8f8-ct292 1/1 Running 0 39m
kube-prometheus-stack-operator-6856b9bb58-4cngc 1/1 Running 0 39m
kube-prometheus-stack-prometheus-node-exporter-bs4wp 1/1 Running 0 39m
kube-prometheus-stack-prometheus-node-exporter-c57ss 1/1 Running 0 39m
kube-prometheus-stack-prometheus-node-exporter-cp5ch 1/1 Running 0 39m
kube-prometheus-stack-prometheus-node-exporter-tnqvq 1/1 Running 0 39m
kube-prometheus-stack-prometheus-node-exporter-z2p49 1/1 Running 0 39m
kube-prometheus-stack-prometheus-node-exporter-zzqp7 1/1 Running 0 39m
prometheus-kube-prometheus-stack-prometheus-0 3/3 Running 1 39m
thanos-compactor-7576dcbcfc-6pd4v 1/1 Running 0 38m
kubectl -n monitoring get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
kube-prometheus-stack-thanos-gateway nginx thanos-sidecar.thanos.teks-tg.clusterfrak-dynamics.io k8s-ingressn-ingressn-95903f6102-d2ce9013ac068b9e.elb.eu-west-3.amazonaws.com 80, 443 40m
kubectl -n monitoring logs -f thanos-tls-querier-observee-query-687dd88ff5-nzpdh
level=info ts=2021-02-23T15:37:35.692346206Z caller=storeset.go:387 component=storeset msg="adding new storeAPI to query storeset" address=thanos-sidecar.thanos.teks-tg.clusterfrak-dynamics.io:443 extLset="{cluster=\"pio-thanos-observee\", prometheus=\"monitoring/kube-prometheus-stack-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-stack-prometheus-0\"}"
kubectl -n monitoring port-forward thanos-tls-querier-observee-query-687dd88ff5-nzpdh 10902
kubectl -n monitoring port-forward thanos-query-7c74db546c-d7bp8 10902
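With the port-forward in place, the querier's HTTP endpoint is reachable on localhost:10902. Thanos Query exposes a Prometheus-compatible query API, so one way to check which clusters it can see is an instant query grouped by the `cluster` external label (the label shown in the querier log). The sketch below builds such a request with the standard library. The helper names and the PromQL expression are illustrative choices, and `dedup` is passed as a Thanos query parameter on the assumption that replica deduplication is wanted.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url: str, promql: str, dedup: bool = True) -> str:
    """Build the URL of an instant query against /api/v1/query."""
    params = urlencode({"query": promql, "dedup": "true" if dedup else "false"})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def clusters_seen(base_url: str = "http://localhost:10902") -> list:
    """Return the distinct values of the `cluster` label that answer `up`.

    Needs the port-forward above to be running; not executed at import time.
    """
    url = instant_query_url(base_url, "count by (cluster) (up)")
    with urlopen(url, timeout=10) as response:
        payload = json.load(response)
    return sorted(
        series["metric"].get("cluster", "") for series in payload["data"]["result"]
    )
```

Calling `clusters_seen()` while the port-forward to the thanos-query pod is active should list the observer and observee cluster labels if both are reachable; against the TLS querier it should list only the observee.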
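- The observer cluster's local Thanos sidecar
- Our store gateways (one for the remote observee cluster, one for the local observer cluster)
- The local TLS querier, which can query the observee sidecar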
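The query path described in this article can be summarized as a small graph: Grafana talks to the query frontend, which talks to Query, which fans out to the stores listed above, with the TLS querier relaying to the observee sidecar. The following toy model encodes that fan-out and resolves the data sources that ultimately answer a Grafana query. The node names are shorthand labels chosen for this sketch, not real service names.

```python
# Toy model of the fan-out; keys query the components listed as values.
TOPOLOGY = {
    "grafana": ["query-frontend"],
    "query-frontend": ["query"],
    "query": [
        "sidecar-observer",
        "storegateway-observer",
        "storegateway-observee",
        "tls-querier-observee",
    ],
    # The TLS querier reaches the observee sidecar through its ingress.
    "tls-querier-observee": ["sidecar-observee"],
}


def leaf_stores(node, topology=TOPOLOGY):
    """Return the components with no downstream hop, in discovery order."""
    children = topology.get(node, [])
    if not children:
        return [node]
    leaves = []
    for child in children:
        for leaf in leaf_stores(child, topology):
            if leaf not in leaves:
                leaves.append(leaf)
    return leaves
```

In this model, `leaf_stores("grafana")` resolves to the two sidecars and the two store gateways: recent data comes from the sidecars, older data from the buckets behind the store gateways.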
Visualization in Grafana
Conclusion
- [1] https://docs.google.com/document/d/1H47v7WfyKkSLMrR8_iku6u9VB73WrVzBHb2SB6dL9_g/edit#heading=h.2v27snv0lsur
- [2] https://github.com/particuleio/teks
- [3] https://github.com/particuleio/teks/tree/main/terragrunt/live/thanos/eu-west-1/clusters/observer
- [4] https://github.com/particuleio/teks/tree/main/terragrunt/live/thanos/eu-west-3/clusters/observee
- [5] https://thanos.io/tip/operating/cross-cluster-tls-communication.md/
Reposted from: 分布式实验室
(Copyright belongs to the original author; the content will be removed on request.)