Implementation options for k8s autoscaling (HPA)

 

Option 1: Autoscale on Kubernetes Resource metrics (CPU and memory utilization).

Implementation:

 1.  Declare resources under containers in the Deployment YAML

resources:
  limits:
    cpu: "2"        # the container may use at most 2 CPU cores
    memory: 2Gi     # the container may use at most 2 GiB of memory
  requests:
    cpu: "1"        # the container asks for at least 1 CPU core
    memory: 1Gi     # the container asks for at least 1 GiB of memory

  • limits: the maximum amount of resources the container may use. If the container tries to exceed these limits, CPU is throttled and going over the memory limit can get the container killed and restarted.

    • cpu: the CPU limit, in cores.

    • memory: the memory limit, in bytes (units such as Ki, Mi, Gi may be used).

  • requests: the minimum amount of resources the container needs. The Kubernetes scheduler uses these requests to decide which node the Pod is placed on. A sketch of the full Deployment context follows this list.

    • cpu: the requested CPU, in cores.

    • memory: the requested memory, in bytes (units such as Ki, Mi, Gi may be used).
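
For context, a minimal sketch of where this resources block sits inside a Deployment manifest; project_name, k8s_namespace and the image are placeholders matching the HPA example below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: project_name            # placeholder; must match the HPA's scaleTargetRef.name
  namespace: k8s_namespace      # placeholder namespace
spec:
  replicas: 1                   # the HPA adjusts this at runtime between minReplicas and maxReplicas
  selector:
    matchLabels:
      app: project_name
  template:
    metadata:
      labels:
        app: project_name
    spec:
      containers:
        - name: project_name
          image: registry.example.com/project_name:latest   # placeholder image
          resources:
            limits:
              cpu: "2"
              memory: 2Gi
            requests:
              cpu: "1"
              memory: 1Gi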

 

 2.  Declare the autoscaling policy (HorizontalPodAutoscaler)

 

apiVersion: autoscaling/v2  # HPA API version
kind: HorizontalPodAutoscaler  # resource kind: HPA
metadata:
  name: project_name # name of the HPA; can be kept the same as the Deployment's name
  namespace: k8s_namespace # namespace of the HPA; keep it the same as the Deployment's namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1  # API version of the target resource
    kind: Deployment  # kind of the target resource
    name: project_name  # name of the target resource
  minReplicas: 1  # minimum number of replicas
  maxReplicas: 3  # maximum number of replicas
  metrics:
    - type: Resource  # metric type: Resource
      resource:
        name: cpu  # resource: CPU
        target:
          type: Utilization  # target type: utilization
          averageUtilization: 180  # target average CPU utilization; utilization is measured against requests, so this equals 90% of the limit (averageUtilization = target-of-limit * limit / request = 0.9 * 2 / 1 = 180)
    - type: Resource  # metric type: Resource
      resource:
        name: memory  # resource: memory
        target:
          type: Utilization  # target type: utilization
          averageUtilization: 160 # target average memory utilization, i.e. 80% of the limit (averageUtilization = target-of-limit * limit / request = 0.8 * 2Gi / 1Gi = 160)
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # scale-down stabilization window of 300s: the HPA keeps observing usage over this window and only scales down once the drop has proven stable
      policies:
        - periodSeconds: 60  # scale-down policy period: 60 seconds
          type: Percent  # policy type: percentage
          value: 100  # up to 100% of replicas may be removed per period (i.e. scale all the way down in one step)
      selectPolicy: Max  # when several policies apply, pick the one allowing the largest change
    scaleUp:
      policies:
        - periodSeconds: 60  # scale-up policy period: 60 seconds
          type: Pods  # policy type: number of Pods
          value: 1  # add at most 1 Pod per period
      selectPolicy: Max  # pick the policy allowing the largest change
      stabilizationWindowSeconds: 300  # scale-up stabilization window: 300 seconds
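
After the manifest is applied, the HPA can be checked with standard kubectl commands (a minimal sketch; hpa.yaml and the names are the placeholders used above):

kubectl apply -f hpa.yaml
kubectl get hpa project_name -n k8s_namespace        # shows current vs. target utilization and replica count
kubectl describe hpa project_name -n k8s_namespace   # shows metrics, conditions and recent scaling events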

 

 Notes:

  1. When computing average utilization for Resource metrics, current usage is fetched through the metrics.k8s.io API. In the dev/test environment that API is served by prometheus-adapter, which collects data once per minute (deployment/monitoring/prometheus-adapter: --metrics-relist-interval=1m). Pre-production and production have no Prometheus; there metrics.k8s.io is served by metrics-server, whose collection interval is also 60 seconds by default (metrics-server), while the HPA itself re-evaluates every 15 seconds by default (Horizontal Pod Autoscaling). The computed average utilization therefore always lags behind real usage.

  2. The metrics.k8s.io API measures CPU and memory for the whole Pod, and in our setup one Pod reports three entries: the microservice container, the POD itself, and the istio-proxy container (istio-proxy is injected automatically when the service's Deployment is created). The POD is not an actual container: a Pod is the basic scheduling unit in Kubernetes and can hold one or more containers, and its own overhead is negligible (the API reports under 400 KiB of memory for it). istio-proxy is the Istio service-mesh sidecar, usually an Envoy proxy; it handles inter-service communication, load balancing, service discovery, security and so on, and its CPU and memory consumption depends on traffic and configuration. The utilization the HPA computes is therefore the combined CPU or memory of the microservice container and the istio-proxy container divided by the value declared in resources (measured against requests, as the comments in the HPA manifest above reflect). In the dev/test environment istio-proxy was observed to use about 200 MiB of memory and under 100m CPU, so memory utilization carries roughly that much error. The per-container split can be inspected as shown in the sketch below.
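
A quick way to see the per-container split (assuming metrics-server or another metrics.k8s.io provider is installed; the pod name and namespace are placeholders):

kubectl top pod <pod-name> -n k8s_namespace --containers   # lists CPU/memory per container, including istio-proxy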

 

 

Option 2: Autoscale on custom metrics exposed through Prometheus.

Preparation:

 1. Install the Prometheus stack via Helm

Add the Helm chart repository:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

Install the Prometheus service:

helm install prometheus prometheus-community/prometheus -n oms4-dev

Install the prometheus-adapter service:

helm install prometheus-adapter prometheus-community/prometheus-adapter -n oms4-dev
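
If the adapter does not point at the right Prometheus out of the box, the connection can also be set through chart values at install time (a sketch; the URL assumes the prometheus release above, whose service is typically named prometheus-server in the oms4-dev namespace):

helm install prometheus-adapter prometheus-community/prometheus-adapter -n oms4-dev \
  --set prometheus.url=http://prometheus-server.oms4-dev.svc \
  --set prometheus.port=80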

After installation, make sure the prometheus-adapter ConfigMap is configured with a working connection to the Prometheus service and that the adapter starts up normally. Once the adapter has been running for a while (about 5 minutes), inspect what it exposes with:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
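
The registration of the custom metrics API itself can also be verified (a sketch; v1beta1.custom.metrics.k8s.io is the APIService that prometheus-adapter registers):

kubectl get apiservice v1beta1.custom.metrics.k8s.io   # the AVAILABLE column should read True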

 

Implementation (using the service's memory utilization as an example):

    1.  Declare resources under containers in the Deployment YAML

resources:
  limits:
    cpu: "2"        # the container may use at most 2 CPU cores
    memory: 2Gi     # the container may use at most 2 GiB of memory
  requests:
    cpu: "1"        # the container asks for at least 1 CPU core
    memory: 1Gi     # the container asks for at least 1 GiB of memory

    2.  Configure the custom metric in the adapter

      # seriesQuery is a Prometheus series selector. It picks every container_memory_usage_bytes
      # series whose namespace and container labels are non-empty; container_memory_usage_bytes
      # reports how many bytes of memory a given container in a pod is using.
      - seriesQuery: 'container_memory_usage_bytes{namespace!="",container!=""}'
        resources:
          overrides:
            namespace: {resource: "namespace"}
            pod: {resource: "pod"}
            # maps Prometheus labels onto Kubernetes resources: the namespace label maps to the
            # namespace resource and the pod label maps to the pod resource.
        name:
          matches: "^(.*)_bytes"
          as: "${1}_utilization"
          # rewrites the Prometheus metric name into the custom metric name exposed to Kubernetes:
          # matches is a regex over the Prometheus name (anything ending in _bytes) and as replaces
          # the _bytes suffix with _utilization.
        # metricsQuery is the PromQL the adapter actually runs. sum ... by is used so both sides of
        # the division return the same label set, otherwise the result cannot be resolved. It is
        # equivalent to running the following directly in Prometheus:
        # sum(container_memory_usage_bytes{namespace="oms4-dev", container=~"support.*"}) by (namespace, container, pod) / sum(container_spec_memory_limit_bytes{namespace="oms4-dev", container=~"support.*"}) by (namespace, container, pod)
        metricsQuery: |
          sum(container_memory_usage_bytes{<<.LabelMatchers>>}) by (namespace, container, pod) / sum(container_spec_memory_limit_bytes{<<.LabelMatchers>>}) by (namespace, container, pod)
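
This rule goes into the adapter's ConfigMap; a sketch of applying it, assuming the Helm release above (the ConfigMap and Deployment are both typically named prometheus-adapter):

kubectl edit configmap prometheus-adapter -n oms4-dev                 # add the rule under rules:
kubectl rollout restart deployment prometheus-adapter -n oms4-dev     # restart so the adapter reloads its config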
       
       

 Once the metric is configured and the adapter restarted, run the following command to fetch the utilization of a specific service:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/oms4-dev/pods/*/container_memory_usage_utilization"

The command returns output like:

{ "describedObject":{ "kind":"Pod", "namespace":"oms4-dev", "name":"prometheus-dev-server-5dc5d54d86-v747d", "apiVersion":"/v1" }, "metricName":"container_memory_usage_percentag", "timestamp":"2024-12-18T08:03:41Z", "value":"794m", "selector":null }

 
Here value is the utilization. It carries an m suffix because Kubernetes uses m for milli (one thousandth), so 794m is actually 0.794, i.e. a utilization of 79.4%.

 

   3. Reference the custom metric in the HPA configuration

apiVersion: autoscaling/v2  # HPA API version
kind: HorizontalPodAutoscaler  # resource kind: HPA
metadata:
  name: project_name # name of the HPA; can be kept the same as the Deployment's name
  namespace: k8s_namespace # namespace of the HPA; keep it the same as the Deployment's namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1  # API version of the target resource
    kind: Deployment  # kind of the target resource
    name: project_name  # name of the target resource
  minReplicas: 1  # minimum number of replicas
  maxReplicas: 3  # maximum number of replicas
  metrics:
    - type: Pods   # metric type: Pods (a custom per-pod metric)
      pods:
        metric:
          name: container_memory_usage_utilization  # the custom metric name exposed by the adapter
        target:
          type: AverageValue  # target the average value across pods
          averageValue: 800m  # 0.8, i.e. 80% of the limit declared in resources
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # scale-down stabilization window of 300s: the HPA keeps observing usage over this window and only scales down once the drop has proven stable
      policies:
        - periodSeconds: 60  # scale-down policy period: 60 seconds
          type: Percent  # policy type: percentage
          value: 100  # up to 100% of replicas may be removed per period (i.e. scale all the way down in one step)
      selectPolicy: Max  # when several policies apply, pick the one allowing the largest change
    scaleUp:
      policies:
        - periodSeconds: 60  # scale-up policy period: 60 seconds
          type: Pods  # policy type: number of Pods
          value: 1  # add at most 1 Pod per period
      selectPolicy: Max  # pick the policy allowing the largest change
      stabilizationWindowSeconds: 300  # scale-up stabilization window: 300 seconds
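
For reference, with a Pods metric and an AverageValue target the HPA applies its standard formula desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue). A quick worked example under the configuration above (the measured value is hypothetical):

current replicas = 2, measured average = 900m (90% of the limit), target = 800m (80%)
desiredReplicas  = ceil(2 * 900m / 800m) = ceil(2.25) = 3    # still within maxReplicas: 3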

 

Extensions:
1. For a reference of the Prometheus metrics available, see the metrics documentation.

2. A Spring Boot project can expose Prometheus-compatible metrics with the following dependencies

<dependency> 
  <groupId>org.springframework.boot</groupId> 
  <artifactId>spring-boot-starter-actuator</artifactId>
</dependency> 

<dependency> 
  <groupId>io.micrometer</groupId> 
  <artifactId>micrometer-registry-prometheus</artifactId> 
</dependency>

and the following application.properties settings:

management.metrics.tags.application=${spring.application.name}
management.endpoints.web.exposure.include=*
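
A quick local check that the endpoint responds (a sketch; port 8080 matches the scrape target used further below):

curl -s http://localhost:8080/actuator/prometheus | head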

The service's /actuator/prometheus endpoint then returns output such as:

# HELP process_cpu_usage The "recent cpu usage" for the Java Virtual Machine process
# TYPE process_cpu_usage gauge
process_cpu_usage{application="support",} 0.007317073170731708
# HELP jvm_gc_pause_seconds Time spent in GC pause
# TYPE jvm_gc_pause_seconds summary
jvm_gc_pause_seconds_count{action="end of minor GC",application="support",cause="Allocation Failure",} 99.0
jvm_gc_pause_seconds_sum{action="end of minor GC",application="support",cause="Allocation Failure",} 1.173
jvm_gc_pause_seconds_count{action="end of major GC",application="support",cause="Metadata GC Threshold",} 1.0
jvm_gc_pause_seconds_sum{action="end of major GC",application="support",cause="Metadata GC Threshold",} 0.197
jvm_gc_pause_seconds_count{action="end of major GC",application="support",cause="System.gc()",} 11.0
jvm_gc_pause_seconds_sum{action="end of major GC",application="support",cause="System.gc()",} 1.104
jvm_gc_pause_seconds_count{action="end of major GC",application="support",cause="Allocation Failure",} 4.0
jvm_gc_pause_seconds_sum{action="end of major GC",application="support",cause="Allocation Failure",} 0.425
# HELP jvm_gc_pause_seconds_max Time spent in GC pause
# TYPE jvm_gc_pause_seconds_max gauge
jvm_gc_pause_seconds_max{action="end of minor GC",application="support",cause="Allocation Failure",} 0.0
jvm_gc_pause_seconds_max{action="end of major GC",application="support",cause="Metadata GC Threshold",} 0.0
jvm_gc_pause_seconds_max{action="end of major GC",application="support",cause="System.gc()",} 0.0
jvm_gc_pause_seconds_max{action="end of major GC",application="support",cause="Allocation Failure",} 0.0
# HELP executor_queue_remaining_tasks The number of additional elements that this queue can ideally accept without blocking
# TYPE executor_queue_remaining_tasks gauge
executor_queue_remaining_tasks{application="support",name="applicationTaskExecutor",} 2.147483647E9
# HELP tomcat_sessions_created_sessions_total  
# TYPE tomcat_sessions_created_sessions_total counter
tomcat_sessions_created_sessions_total{application="support",} 0.0
# HELP executor_pool_core_threads The core number of threads for the pool
# TYPE executor_pool_core_threads gauge
executor_pool_core_threads{application="support",name="applicationTaskExecutor",} 8.0
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge
jvm_buffer_count_buffers{application="support",id="direct",} 11.0
jvm_buffer_count_buffers{application="support",id="mapped",} 0.0
# HELP jvm_gc_memory_promoted_bytes_total Count of positive increases in the size of the old generation memory pool before GC to after GC
# TYPE jvm_gc_memory_promoted_bytes_total counter
jvm_gc_memory_promoted_bytes_total{application="support",} 4.28232136E8
# HELP process_uptime_seconds The uptime of the Java virtual machine
# TYPE process_uptime_seconds gauge
process_uptime_seconds{application="support",} 109395.852
# HELP jvm_buffer_memory_used_bytes An estimate of the memory that the Java virtual machine is using for this buffer pool
# TYPE jvm_buffer_memory_used_bytes gauge
jvm_buffer_memory_used_bytes{application="support",id="direct",} 81921.0
jvm_buffer_memory_used_bytes{application="support",id="mapped",} 0.0
# HELP tomcat_sessions_active_max_sessions  
# TYPE tomcat_sessions_active_max_sessions gauge
tomcat_sessions_active_max_sessions{application="support",} 0.0
# HELP jvm_threads_peak_threads The peak live thread count since the Java virtual machine started or peak was reset
# TYPE jvm_threads_peak_threads gauge
jvm_threads_peak_threads{application="support",} 23.0
# HELP tomcat_sessions_alive_max_seconds  
# TYPE tomcat_sessions_alive_max_seconds gauge
tomcat_sessions_alive_max_seconds{application="support",} 0.0
# HELP process_files_max_files The maximum file descriptor count
# TYPE process_files_max_files gauge
process_files_max_files{application="support",} 1048576.0
# HELP executor_completed_tasks_total The approximate total number of tasks that have completed execution
# TYPE executor_completed_tasks_total counter
executor_completed_tasks_total{application="support",name="applicationTaskExecutor",} 0.0
# HELP system_cpu_usage The "recent cpu usage" of the system the application is running in
# TYPE system_cpu_usage gauge
system_cpu_usage{application="support",} 0.006428420609756098
# HELP logback_events_total Number of events that made it to the logs
# TYPE logback_events_total counter
logback_events_total{application="support",level="trace",} 0.0
logback_events_total{application="support",level="error",} 0.0
logback_events_total{application="support",level="debug",} 0.0
logback_events_total{application="support",level="warn",} 0.0
logback_events_total{application="support",level="info",} 34.0
# HELP jvm_classes_loaded_classes The number of classes that are currently loaded in the Java virtual machine
# TYPE jvm_classes_loaded_classes gauge
jvm_classes_loaded_classes{application="support",} 10971.0
# HELP process_files_open_files The open file descriptor count
# TYPE process_files_open_files gauge
process_files_open_files{application="support",} 16.0
# HELP jvm_gc_live_data_size_bytes Size of long-lived heap memory pool after reclamation
# TYPE jvm_gc_live_data_size_bytes gauge
jvm_gc_live_data_size_bytes{application="support",} 1.706524E7
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{application="support",area="nonheap",id="Metaspace",} 5.6314112E7
jvm_memory_used_bytes{application="support",area="nonheap",id="CodeHeap 'profiled nmethods'",} 1.86912E7
jvm_memory_used_bytes{application="support",area="heap",id="Eden Space",} 1325328.0
jvm_memory_used_bytes{application="support",area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 4952704.0
jvm_memory_used_bytes{application="support",area="nonheap",id="CodeHeap 'non-nmethods'",} 1323008.0
jvm_memory_used_bytes{application="support",area="heap",id="Tenured Gen",} 1.706524E7
jvm_memory_used_bytes{application="support",area="heap",id="Survivor Space",} 0.0
jvm_memory_used_bytes{application="support",area="nonheap",id="Compressed Class Space",} 6846824.0
# HELP executor_pool_max_threads The maximum allowed number of threads in the pool
# TYPE executor_pool_max_threads gauge
executor_pool_max_threads{application="support",name="applicationTaskExecutor",} 2.147483647E9
# HELP tomcat_sessions_rejected_sessions_total  
# TYPE tomcat_sessions_rejected_sessions_total counter
tomcat_sessions_rejected_sessions_total{application="support",} 0.0
# HELP application_ready_time_seconds Time taken (ms) for the application to be ready to service requests
# TYPE application_ready_time_seconds gauge
application_ready_time_seconds{application="support",main_application_class="com.y3technologies.support.SupportApplication",} 16.087
# HELP jvm_memory_max_bytes The maximum amount of memory in bytes that can be used for memory management
# TYPE jvm_memory_max_bytes gauge
jvm_memory_max_bytes{application="support",area="nonheap",id="Metaspace",} -1.0
jvm_memory_max_bytes{application="support",area="nonheap",id="CodeHeap 'profiled nmethods'",} 1.22912768E8
jvm_memory_max_bytes{application="support",area="heap",id="Eden Space",} 1.43130624E8
jvm_memory_max_bytes{application="support",area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 1.22916864E8
jvm_memory_max_bytes{application="support",area="nonheap",id="CodeHeap 'non-nmethods'",} 5828608.0
jvm_memory_max_bytes{application="support",area="heap",id="Tenured Gen",} 3.57957632E8
jvm_memory_max_bytes{application="support",area="heap",id="Survivor Space",} 1.7891328E7
jvm_memory_max_bytes{application="support",area="nonheap",id="Compressed Class Space",} 1.073741824E9
# HELP application_started_time_seconds Time taken (ms) to start the application
# TYPE application_started_time_seconds gauge
application_started_time_seconds{application="support",main_application_class="com.y3technologies.support.SupportApplication",} 15.963
# HELP system_load_average_1m The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
# TYPE system_load_average_1m gauge
system_load_average_1m{application="support",} 1.62
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
# TYPE jvm_classes_unloaded_classes_total counter
jvm_classes_unloaded_classes_total{application="support",} 180.0
# HELP executor_pool_size_threads The current number of threads in the pool
# TYPE executor_pool_size_threads gauge
executor_pool_size_threads{application="support",name="applicationTaskExecutor",} 0.0
# HELP jvm_gc_max_data_size_bytes Max size of long-lived heap memory pool
# TYPE jvm_gc_max_data_size_bytes gauge
jvm_gc_max_data_size_bytes{application="support",} 3.57957632E8
# HELP jvm_memory_usage_after_gc_percent The percentage of long-lived heap pool used after the last GC event, in the range [0..1]
# TYPE jvm_memory_usage_after_gc_percent gauge
jvm_memory_usage_after_gc_percent{application="support",area="heap",pool="long-lived",} 0.04767391019057809
# HELP tomcat_sessions_active_current_sessions  
# TYPE tomcat_sessions_active_current_sessions gauge
tomcat_sessions_active_current_sessions{application="support",} 0.0
# HELP process_start_time_seconds Start time of the process since unix epoch.
# TYPE process_start_time_seconds gauge
process_start_time_seconds{application="support",} 1.734403057283E9
# HELP jvm_threads_live_threads The current number of live threads including both daemon and non-daemon threads
# TYPE jvm_threads_live_threads gauge
jvm_threads_live_threads{application="support",} 20.0
# HELP http_server_requests_seconds Duration of HTTP server request handling
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{application="support",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/load/startMemoryLoad",} 13.0
http_server_requests_seconds_sum{application="support",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/load/startMemoryLoad",} 0.035915358
http_server_requests_seconds_count{application="support",exception="None",method="POST",outcome="CLIENT_ERROR",status="404",uri="/**",} 2.0
http_server_requests_seconds_sum{application="support",exception="None",method="POST",outcome="CLIENT_ERROR",status="404",uri="/**",} 0.01325648
http_server_requests_seconds_count{application="support",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/load/stopMemoryLoad",} 11.0
http_server_requests_seconds_sum{application="support",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/load/stopMemoryLoad",} 1.168532813
http_server_requests_seconds_count{application="support",exception="None",method="GET",outcome="CLIENT_ERROR",status="404",uri="/**",} 1.0
http_server_requests_seconds_sum{application="support",exception="None",method="GET",outcome="CLIENT_ERROR",status="404",uri="/**",} 0.04204703
http_server_requests_seconds_count{application="support",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",} 1815.0
http_server_requests_seconds_sum{application="support",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",} 8.238591722
# HELP http_server_requests_seconds_max Duration of HTTP server request handling
# TYPE http_server_requests_seconds_max gauge
http_server_requests_seconds_max{application="support",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/load/startMemoryLoad",} 0.0
http_server_requests_seconds_max{application="support",exception="None",method="POST",outcome="CLIENT_ERROR",status="404",uri="/**",} 0.0
http_server_requests_seconds_max{application="support",exception="None",method="POST",outcome="SUCCESS",status="200",uri="/load/stopMemoryLoad",} 0.0
http_server_requests_seconds_max{application="support",exception="None",method="GET",outcome="CLIENT_ERROR",status="404",uri="/**",} 0.0
http_server_requests_seconds_max{application="support",exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/prometheus",} 0.004422325
# HELP jvm_gc_overhead_percent An approximation of the percent of CPU time used by GC activities over the last lookback period or since monitoring began, whichever is shorter, in the range [0..1]
# TYPE jvm_gc_overhead_percent gauge
jvm_gc_overhead_percent{application="support",} 0.002213333333333333
# HELP jvm_threads_daemon_threads The current number of live daemon threads
# TYPE jvm_threads_daemon_threads gauge
jvm_threads_daemon_threads{application="support",} 16.0
# HELP jvm_threads_states_threads The current number of threads
# TYPE jvm_threads_states_threads gauge
jvm_threads_states_threads{application="support",state="runnable",} 6.0
jvm_threads_states_threads{application="support",state="timed-waiting",} 3.0
jvm_threads_states_threads{application="support",state="blocked",} 0.0
jvm_threads_states_threads{application="support",state="waiting",} 11.0
jvm_threads_states_threads{application="support",state="new",} 0.0
jvm_threads_states_threads{application="support",state="terminated",} 0.0
# HELP system_cpu_count The number of processors available to the Java virtual machine
# TYPE system_cpu_count gauge
system_cpu_count{application="support",} 1.0
# HELP executor_active_threads The approximate number of threads that are actively executing tasks
# TYPE executor_active_threads gauge
executor_active_threads{application="support",name="applicationTaskExecutor",} 0.0
# HELP executor_queued_tasks The approximate number of tasks that are queued for execution
# TYPE executor_queued_tasks gauge
executor_queued_tasks{application="support",name="applicationTaskExecutor",} 0.0
# HELP disk_total_bytes Total space for path
# TYPE disk_total_bytes gauge
disk_total_bytes{application="support",path="/.",} 1.06270035968E11
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge
jvm_buffer_total_capacity_bytes{application="support",id="direct",} 81920.0
jvm_buffer_total_capacity_bytes{application="support",id="mapped",} 0.0
# HELP jvm_memory_committed_bytes The amount of memory in bytes that is committed for the Java virtual machine to use
# TYPE jvm_memory_committed_bytes gauge
jvm_memory_committed_bytes{application="support",area="nonheap",id="Metaspace",} 5.8851328E7
jvm_memory_committed_bytes{application="support",area="nonheap",id="CodeHeap 'profiled nmethods'",} 1.8743296E7
jvm_memory_committed_bytes{application="support",area="heap",id="Eden Space",} 2.2872064E7
jvm_memory_committed_bytes{application="support",area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 4980736.0
jvm_memory_committed_bytes{application="support",area="nonheap",id="CodeHeap 'non-nmethods'",} 2555904.0
jvm_memory_committed_bytes{application="support",area="heap",id="Tenured Gen",} 5.6909824E7
jvm_memory_committed_bytes{application="support",area="heap",id="Survivor Space",} 2818048.0
jvm_memory_committed_bytes{application="support",area="nonheap",id="Compressed Class Space",} 7733248.0
# HELP tomcat_sessions_expired_sessions_total  
# TYPE tomcat_sessions_expired_sessions_total counter
tomcat_sessions_expired_sessions_total{application="support",} 0.0
# HELP jvm_gc_memory_allocated_bytes_total Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next
# TYPE jvm_gc_memory_allocated_bytes_total counter
jvm_gc_memory_allocated_bytes_total{application="support",} 1.626672056E9
# HELP disk_free_bytes Usable space for path
# TYPE disk_free_bytes gauge
disk_free_bytes{application="support",path="/.",} 7.6804419584E10

Finally, adding the following scrape configuration to the Prometheus ConfigMap lets Prometheus collect metrics from the Java application:

- job_name: 'microservice-actuator'
  metrics_path: /actuator/prometheus
  static_configs:
   - targets: ['support.oms4-dev:8080']
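
Once these application metrics are in Prometheus they can back further custom metrics. A hypothetical example (not part of the setup above): the per-application HTTP request rate, a common scaling signal; note that to drive an HPA the series would also need pod and namespace labels, e.g. by scraping through Kubernetes service discovery instead of the static target above.

sum(rate(http_server_requests_seconds_count{application="support"}[1m])) by (application)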