Chapter 3 Lab - Amazon EKS FinOps - Setting Up KubeCost and OpenCost

1. Deploying the Base Environment

This lab accesses the AWS Management Console with an IAM user account and uses the awscli tool with an access key.
If you have not set these up yet, expand the toggle below and complete that work before starting the lab proper (a CLI sketch also follows).
Create an IAM user and an access key
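The toggle's console walkthrough is not reproduced here. As an alternative, a minimal awscli sketch of the same setup is below; it assumes an existing administrator session, and [your-iam-user] plus the AdministratorAccess policy are illustrative placeholders, not lab requirements.

# Hypothetical CLI equivalent: create an IAM user, grant it a policy, and issue an access key
aws iam create-user --user-name [your-iam-user]
aws iam attach-user-policy \
  --user-name [your-iam-user] \
  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
aws iam create-access-key --user-name [your-iam-user]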

1.1. Deploying the Base Infrastructure with Terraform

Before deploying the base infrastructure with Terraform, check your SSH key pair, IAM User Access Key ID, and IAM User Secret Access Key, and note them down.
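To re-check these values from an existing awscli session (a sketch; note that a secret access key cannot be read back after creation, only reissued):

# List registered EC2 key pairs and the current IAM user's access key IDs
aws ec2 describe-key-pairs --query 'KeyPairs[*].KeyName' --output text
aws iam list-access-keys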
Deploy the base infrastructure with Terraform
# Enter the lab directory
cd cnaee_class_tf/ch3
# Set the Terraform environment variables
export TF_VAR_KeyName=[your ssh keypair]
export TF_VAR_MyIamUserAccessKeyID=[your iam user's access key id]
export TF_VAR_MyIamUserSecretAccessKey=[your iam user's secret access key]
export TF_VAR_SgIngressSshCidr=$(curl -s ipinfo.io/ip)/32
# Initialize Terraform and review the plan
terraform init
terraform plan
nohup sh -c "terraform apply -auto-approve" > create.log 2>&1 &
Bash
복사
Note:  nohup으로 백그라운드로 실행하도록 변경했습니다. Terraform 배포가 완료되면 정상적으로 자원 생성이 되었는지 확인을 합니다.(cat create.log)
Once the base infrastructure deployment completes, check the created resources in the management console.
Note:
Log in to the AWS Management Console with the IAM user account.

1.2. Checking and Configuring Basic Information

When the Terraform deployment finishes, find the public IP in the bastion_host_ip value of the printed Output.
Use that IP to SSH into the EKS management instance (myeks-bastion-EC2), then run the commands below to check the environment.
Note:
Because the OS of myeks-bastion-EC2 has changed, use the ubuntu account for SSH access.
(ssh -i ~/.ssh/XXXX.pem ubuntu@X.X.X.X)
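If the Output has scrolled off the terminal, it can be re-printed at any time; a small convenience, assuming the output name bastion_host_ip used above:

# Re-print the bastion public IP from the Terraform outputs
terraform output bastion_host_ip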
Basic setup and checks
# Update the EKS cluster authentication information
aws eks update-kubeconfig \
  --region $AWS_DEFAULT_REGION \
  --name $CLUSTER_NAME
# Set kubens to the default namespace
kubens default
# Print the predefined variables together
echo $AWS_DEFAULT_REGION
echo $CLUSTER_NAME
echo $VPCID
echo $PublicSubnet1,$PublicSubnet2,$PublicSubnet3
echo $PrivateSubnet1,$PrivateSubnet2,$PrivateSubnet3
# Check EKS cluster information with eksctl
eksctl get cluster
# Check node group information with eksctl
eksctl get nodegroup \
  --cluster $CLUSTER_NAME \
  --name ${CLUSTER_NAME}-node-group
# Check node information with kubectl
kubectl get node -owide
# Declare node IP variables
PublicN1=$(kubectl get node --label-columns=topology.kubernetes.io/zone --selector=topology.kubernetes.io/zone=ap-northeast-2a -o jsonpath={.items[0].status.addresses[0].address})
PublicN2=$(kubectl get node --label-columns=topology.kubernetes.io/zone --selector=topology.kubernetes.io/zone=ap-northeast-2b -o jsonpath={.items[0].status.addresses[0].address})
PublicN3=$(kubectl get node --label-columns=topology.kubernetes.io/zone --selector=topology.kubernetes.io/zone=ap-northeast-2c -o jsonpath={.items[0].status.addresses[0].address})
echo "export PublicN1=$PublicN1" >> /etc/profile
echo "export PublicN2=$PublicN2" >> /etc/profile
echo "export PublicN3=$PublicN3" >> /etc/profile
echo $PublicN1, $PublicN2, $PublicN3
# Verify SSH access to each node
for node in $PublicN1 $PublicN2 $PublicN3
do
  ssh -i ~/.ssh/kp_node.pem -o StrictHostKeyChecking=no ec2-user@$node hostname
done

1.3. Installing Grafana Mimir and Grafana Agent

Install Grafana Mimir as the metrics backend system that KubeCost and OpenCost will connect to, and install Grafana Agent as the collector that gathers the metric values.
Preparation
# Declare variables
export MyDomain=[your domain]
echo "export MyDomain=$MyDomain" >> /etc/profile; echo $MyDomain
export NICKNAME=[your nickname]
echo "export NICKNAME=$NICKNAME" >> /etc/profile; echo $NICKNAME
export OIDC_ARN=$(aws iam list-open-id-connect-providers --query 'OpenIDConnectProviderList[*].Arn' --output text)
echo "export OIDC_ARN=$OIDC_ARN" >> /etc/profile; echo $OIDC_ARN
export OIDC_URL=${OIDC_ARN#*oidc-provider/}
echo "export OIDC_URL=$OIDC_URL" >> /etc/profile; echo $OIDC_URL
# Create the gp3 StorageClass
cat <<EOT | kubectl apply -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp3
allowVolumeExpansion: true
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
  allowAutoIOPSPerGBIncrease: 'true'
  encrypted: 'true'
EOT
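A quick verification, not part of the original flow, that the class registered with the expected provisioner:

# Confirm the gp3 StorageClass exists
kubectl get sc gp3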
# Create the namespaces
kubectl create ns monitoring
kubectl create ns kubecost
kubectl create ns opencost
Install Grafana Mimir
# Add the helm repo
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Create and verify the S3 bucket for Mimir
aws s3api create-bucket \
  --bucket mimir-${NICKNAME} \
  --region $AWS_DEFAULT_REGION \
  --create-bucket-configuration LocationConstraint=$AWS_DEFAULT_REGION
aws s3 ls
# Store the S3 bucket name in a variable
export MIMIR_BUCKET_NAME="mimir-${NICKNAME}"
echo "export MIMIR_BUCKET_NAME=$MIMIR_BUCKET_NAME" >> /etc/profile; echo $MIMIR_BUCKET_NAME
# Create the grafana-mimir-s3-policy.json file
cat >grafana-mimir-s3-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MimirStorage",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::${MIMIR_BUCKET_NAME}",
        "arn:aws:s3:::${MIMIR_BUCKET_NAME}/*"
      ]
    }
  ]
}
EOF
cat grafana-mimir-s3-policy.json
# Create the aws-mimir-s3 IAM policy
aws iam create-policy \
  --policy-name aws-mimir-s3 \
  --policy-document file://grafana-mimir-s3-policy.json
# Create the Mimir IAM role trust relationship document
cat >trust-rs-mimir.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "${OIDC_ARN}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_URL}:sub": "system:serviceaccount:monitoring:mimir",
          "${OIDC_URL}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF
cat trust-rs-mimir.json
# Create the AWS-Mimir-Role
aws iam create-role \
  --role-name AWS-Mimir-Role \
  --assume-role-policy-document file://trust-rs-mimir.json
# Attach the IAM policy to the IAM role
aws iam attach-role-policy \
  --role-name AWS-Mimir-Role \
  --policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/aws-mimir-s3
# Declare the Mimir IAM role ARN variable
export MIMIR_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/AWS-Mimir-Role
echo "export MIMIR_ROLE_ARN=$MIMIR_ROLE_ARN" >> /etc/profile; echo $MIMIR_ROLE_ARN
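This policy/role pair is the standard IRSA (IAM Roles for Service Accounts) wiring: the trust relationship lets the monitoring/mimir ServiceAccount assume the role through the cluster's OIDC provider, and the attached policy grants the S3 access Mimir needs. An optional double-check:

# Inspect the trust relationship recorded on the new role
aws iam get-role --role-name AWS-Mimir-Role --query 'Role.AssumeRolePolicyDocument'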
# Create and check mimir-values.yaml
cat >mimir-values.yaml <<EOF
image:
  repository: grafana/mimir
  tag: 2.10.3
  pullPolicy: IfNotPresent
mimir:
  structuredConfig:
    limits:
      max_label_names_per_series: 60
      compactor_blocks_retention_period: 30d
    blocks_storage:
      backend: s3
      s3:
        bucket_name: ${MIMIR_BUCKET_NAME}
        endpoint: s3.${AWS_DEFAULT_REGION}.amazonaws.com
        region: ${AWS_DEFAULT_REGION}
      tsdb:
        retention_period: 13h
      bucket_store:
        ignore_blocks_within: 10h
    querier:
      query_store_after: 12h
    ingester:
      ring:
        replication_factor: 3
serviceAccount:
  create: true
  name: "mimir"
  annotations:
    "eks.amazonaws.com/role-arn": "${MIMIR_ROLE_ARN}"
minio:
  enabled: false
alertmanager:
  enabled: false
ruler:
  enabled: false
compactor:
  persistentVolume:
    enabled: true
    annotations: {}
    accessModes:
      - ReadWriteOnce
    size: 5Gi
    storageClass: gp3
ingester:
  zoneAwareReplication:
    enabled: false
  persistentVolume:
    enabled: true
    annotations: {}
    accessModes:
      - ReadWriteOnce
    size: 5Gi
    storageClass: gp3
store_gateway:
  zoneAwareReplication:
    enabled: false
  persistentVolume:
    enabled: true
    annotations: {}
    accessModes:
      - ReadWriteOnce
    size: 5Gi
    storageClass: gp3
EOF
cat mimir-values.yaml
# [Monitoring 1] monitoring namespace - watch pods, PVs, PVCs, and ConfigMaps
watch kubectl get pod,pv,pvc,cm -n monitoring
# Install Mimir with the helm chart, using the mimir-values.yaml file
helm install mimir grafana/mimir-distributed \
  -n monitoring \
  -f mimir-values.yaml \
  --version 5.4.0
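Once the pods settle, the read path can be sanity-checked before any collector is wired up. This is a sketch that assumes the chart's default mimir-nginx gateway routing; it queries with the same kubecost_mimir tenant ID the agent will write under, so an empty list is expected until metrics arrive.

# Port-forward the Mimir gateway and hit the Prometheus-compatible API
kubectl -n monitoring port-forward svc/mimir-nginx 8080:80 &
curl -s -H "X-Scope-OrgID: kubecost_mimir" \
  "http://localhost:8080/prometheus/api/v1/label/__name__/values" | head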
Install Grafana Agent
# Declare the Mimir push endpoint address variable
export MIMIR_ENDPOINT_PUSH="http://mimir-nginx.monitoring/api/v1/push"
# Create the grafana-agent ConfigMap
cat <<EOF |
kind: ConfigMap
metadata:
  name: grafana-agent
apiVersion: v1
data:
  agent.yaml: |
    metrics:
      wal_directory: /var/lib/agent/wal
      global:
        scrape_interval: 60s
        external_labels:
          cluster: ${CLUSTER_NAME}
      configs:
      - name: integrations
        remote_write:
        - headers:
            X-Scope-OrgID: kubecost_mimir
          url: ${MIMIR_ENDPOINT_PUSH}
        - url: ${MIMIR_ENDPOINT_PUSH}
        scrape_configs:
        - job_name: kubecost
          honor_labels: true
          scrape_interval: 1m
          scrape_timeout: 10s
          metrics_path: /metrics
          scheme: http
          dns_sd_configs:
          - names:
            - kubecost-cost-analyzer.kubecost
            type: 'A'
            port: 9003
        - job_name: kubecost-networking
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          # Scrape only the targets matching the following metadata
          - source_labels: [__meta_kubernetes_pod_label_app]
            action: keep
            regex: 'kubecost-network-costs'
        - job_name: prometheus
          static_configs:
          - targets:
            - localhost:9090
        - job_name: 'kubernetes-nodes-cadvisor'
          scheme: https
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: true
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          kubernetes_sd_configs:
          - role: node
          relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/\$\$1/proxy/metrics/cadvisor
          metric_relabel_configs:
          - source_labels: [ __name__ ]
            regex: (container_cpu_usage_seconds_total|container_memory_working_set_bytes|container_network_receive_errors_total|container_network_transmit_errors_total|container_network_receive_packets_dropped_total|container_network_transmit_packets_dropped_total|container_memory_usage_bytes|container_cpu_cfs_throttled_periods_total|container_cpu_cfs_periods_total|container_fs_usage_bytes|container_fs_limit_bytes|container_cpu_cfs_periods_total|container_fs_inodes_free|container_fs_inodes_total|container_fs_usage_bytes|container_fs_limit_bytes|container_cpu_cfs_throttled_periods_total|container_cpu_cfs_periods_total|container_network_receive_bytes_total|container_network_transmit_bytes_total|container_fs_inodes_free|container_fs_inodes_total|container_fs_usage_bytes|container_fs_limit_bytes|container_spec_cpu_shares|container_spec_memory_limit_bytes|container_network_receive_bytes_total|container_network_transmit_bytes_total|container_fs_reads_bytes_total|container_network_receive_bytes_total|container_fs_writes_bytes_total|container_fs_reads_bytes_total|cadvisor_version_info|kubecost_pv_info|kubelet_volume_stats_used_bytes|kubelet_volume_stats_capacity_bytes|kubelet_volume_stats_available_bytes|kubelet_volume_stats_inodes|kubelet_volume_stats_inodes_free|kubelet_volume_stats_inodes_used)
            action: keep
          - source_labels: [ container ]
            target_label: container_name
            regex: (.+)
            action: replace
          - source_labels: [ pod ]
            target_label: pod_name
            regex: (.+)
            action: replace
        - job_name: 'kubernetes-nodes'
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: true
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          kubernetes_sd_configs:
          - role: node
          relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/\$\$1/proxy/metrics
          metric_relabel_configs:
          - source_labels: [ __name__ ]
            regex: (kubelet_volume_stats_used_bytes) # this metric is in alpha
            action: keep
        - job_name: 'kubernetes-service-endpoints'
          kubernetes_sd_configs:
          - role: endpoints
          relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_endpoints_name]
            action: keep
            regex: (.*kube-state-metrics|.*prometheus-node-exporter|kubecost-network-costs)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: \$\$1:\$\$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: kubernetes_node
          metric_relabel_configs:
          - source_labels: [ __name__ ]
            regex: (container_cpu_allocation|container_cpu_usage_seconds_total|container_fs_limit_bytes|container_fs_writes_bytes_total|container_gpu_allocation|container_memory_allocation_bytes|container_memory_usage_bytes|container_memory_working_set_bytes|container_network_receive_bytes_total|container_network_transmit_bytes_total|DCGM_FI_DEV_GPU_UTIL|deployment_match_labels|kube_daemonset_status_desired_number_scheduled|kube_daemonset_status_number_ready|kube_deployment_spec_replicas|kube_deployment_status_replicas|kube_deployment_status_replicas_available|kube_job_status_failed|kube_namespace_annotations|kube_namespace_labels|kube_node_info|kube_node_labels|kube_node_status_allocatable|kube_node_status_allocatable_cpu_cores|kube_node_status_allocatable_memory_bytes|kube_node_status_capacity|kube_node_status_capacity_cpu_cores|kube_node_status_capacity_memory_bytes|kube_node_status_condition|kube_persistentvolume_capacity_bytes|kube_persistentvolume_status_phase|kube_persistentvolumeclaim_info|kube_persistentvolumeclaim_resource_requests_storage_bytes|kube_pod_container_info|kube_pod_container_resource_limits|kube_pod_container_resource_limits_cpu_cores|kube_pod_container_resource_limits_memory_bytes|kube_pod_container_resource_requests|kube_pod_container_resource_requests_cpu_cores|kube_pod_container_resource_requests_memory_bytes|kube_pod_container_status_restarts_total|kube_pod_container_status_running|kube_pod_container_status_terminated_reason|kube_pod_labels|kube_pod_owner|kube_pod_status_phase|kube_replicaset_owner|kube_statefulset_replicas|kube_statefulset_status_replicas|kubecost_cluster_info|kubecost_cluster_management_cost|kubecost_cluster_memory_working_set_bytes|kubecost_load_balancer_cost|kubecost_network_internet_egress_cost|kubecost_network_region_egress_cost|kubecost_network_zone_egress_cost|kubecost_node_is_spot|kubecost_pod_network_egress_bytes_total|node_cpu_hourly_cost|node_cpu_seconds_total|node_disk_reads_completed|node_disk_reads_completed_total|node_disk_writes_completed|node_disk_writes_completed_total|node_filesystem_device_error|node_gpu_count|node_gpu_hourly_cost|node_memory_Buffers_bytes|node_memory_Cached_bytes|node_memory_MemAvailable_bytes|node_memory_MemFree_bytes|node_memory_MemTotal_bytes|node_network_transmit_bytes_total|node_ram_hourly_cost|node_total_hourly_cost|pod_pvc_allocation|pv_hourly_cost|service_selector_labels|statefulSet_match_labels|kubecost_pv_info|up)
            action: keep
        - job_name: 'kubernetes-service-endpoints-slow'
          scrape_interval: 5m
          scrape_timeout: 30s
          kubernetes_sd_configs:
          - role: endpoints
          relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: \$\$1:\$\$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: kubernetes_node
        - job_name: 'prometheus-pushgateway'
          honor_labels: true
          kubernetes_sd_configs:
          - role: service
          relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: pushgateway
        - job_name: 'kubernetes-services'
          metrics_path: /probe
          params:
            module: [http_2xx]
          kubernetes_sd_configs:
          - role: service
          relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: true
          - source_labels: [__address__]
            target_label: __param_target
          - target_label: __address__
            replacement: blackbox
          - source_labels: [__param_target]
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: kubernetes_name
EOF
(export NAMESPACE=kubecost && kubectl apply -n $NAMESPACE -f -)
# Download and apply the Grafana Agent manifest
curl -O https://raw.githubusercontent.com/cloudneta/cnaeelab/master/_data/agent-bare.yaml
kubectl apply -f agent-bare.yaml
kubectl get all -n kubecost
# Check the grafana-agent logs
kubectl -n kubecost logs grafana-agent-0
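Healthy output shows the WAL opening and steady remote_write activity; a quick filter for problems (a sketch, assuming typical Grafana Agent log wording) is:

# Surface any remote_write or scrape errors from the agent log
kubectl -n kubecost logs grafana-agent-0 | grep -Ei 'error|remote_write' | tail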

2. Installing and Verifying KubeCost

2.1. Installing KubeCost

Install KubeCost
# Declare variables
MIMIR_ENDPOINT="mimir-nginx.monitoring"
echo "export MIMIR_ENDPOINT=$MIMIR_ENDPOINT" >> /etc/profile; echo $MIMIR_ENDPOINT
MIMIR_ORG_ID="0"
echo "export MIMIR_ORG_ID=$MIMIR_ORG_ID" >> /etc/profile; echo $MIMIR_ORG_ID
# Create the kubecost helm values file
cat >kubecost-values.yaml <<EOF
global:
  mimirProxy:
    enabled: true
    mimirEndpoint: http://${MIMIR_ENDPOINT}
    orgIdentifier: ${MIMIR_ORG_ID}
  prometheus:
    enabled: false
    fqdn: http://${MIMIR_ENDPOINT}/prometheus
persistentVolume:
  enabled: true
  storageClass: gp3
kubecostProductConfigs:
  clusterName: ${CLUSTER_NAME}
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "alb"
    alb.ingress.kubernetes.io/scheme: "internet-facing"
    alb.ingress.kubernetes.io/target-type: "ip"
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80}]'
  hosts:
    - "kubecost.${MyDomain}"
  paths:
    - "/*"
  pathType: "ImplementationSpecific"
EOF
cat kubecost-values.yaml
# Add the helm repo
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
# [Monitoring 1] kubecost namespace
watch kubectl get all -n kubecost
# Install KubeCost with helm
helm upgrade -i kubecost kubecost/cost-analyzer \
  -n kubecost \
  --version 2.6.3 \
  --values kubecost-values.yaml
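The ingress host kubecost.${MyDomain} must resolve to the ALB that the AWS Load Balancer Controller provisions for this ingress. To read the ALB address off the ingress (a sketch; if the environment already runs ExternalDNS, the DNS record may be created automatically):

# Check the ALB address assigned to the KubeCost ingress
kubectl get ingress -n kubecost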

2.2. Verifying KubeCost

Check the KubeCost WebUI
Note:
Access http://kubecost.${MyDomain}.
1. Overview: a comprehensive dashboard for managing the Kubernetes cluster's costs
2. Monitor - Allocations: detailed information on resource usage and cost allocation within the Kubernetes cluster
3. Monitor - Assets: costs and usage history for the Kubernetes cluster's resources
4. Monitor - Clusters: costs for every Kubernetes cluster that Kubecost monitors
5. Monitor - Efficiency: cost efficiency at the Kubernetes cluster or node level
6. Reports: a page that gathers your saved reports
7. Savings - Insights: cost-saving measures and their estimated savings
8. Alerts: monitoring of the Kubernetes cluster and of Kubecost itself, with configurable alerts
Warning:
After KubeCost is installed, it needs some time to collect data and initialize its data store.
Expect correct results only after waiting 5 minutes or more.
Check the KubeCost APIs
# Launch a test pod to query the KubeCost API
kubectl run curl --image=appropriate/curl --restart=Never --rm -it -- sh
# Inside the test pod: declare the endpoint variable and install jq
KUBECOST_EP="http://kubecost-cost-analyzer.kubecost.svc.cluster.local:9090"
apk add --no-cache jq
# Query Allocations information via the API
curl $KUBECOST_EP/model/allocation \
  -d window=3d \
  -d aggregate=namespace \
  -d accumulate=false \
  -d shareIdle=false \
  -d format=json \
  -G | jq
# Query Assets information via the API
curl $KUBECOST_EP/model/assets \
  -d window=today \
  -d aggregate=type \
  -d accumulate=true \
  -d disableAdjustments=true \
  -d format=json \
  -G | jq
# Exit once you are done
exit
Install kubectl cost
# Install the kubectl-cost plugin
os=$(uname | tr '[:upper:]' '[:lower:]') && \
arch=$(uname -m | tr '[:upper:]' '[:lower:]' | sed -e s/x86_64/amd64/) && \
curl -s -L https://github.com/kubecost/kubectl-cost/releases/latest/download/kubectl-cost-$os-$arch.tar.gz | tar xz -C /tmp && \
chmod +x /tmp/kubectl-cost && \
sudo mv /tmp/kubectl-cost /usr/local/bin/kubectl-cost
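A quick check, not in the original steps, that kubectl picked the plugin up from /usr/local/bin:

# Verify kubectl recognizes the cost plugin
kubectl plugin list | grep kubectl-cost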
Check kubectl cost
# Check cost per namespace
kubectl cost namespace
# Check cost per Deployment (with CPU column)
kubectl cost deployment --show-cpu
# Check cost per Pod (with PV column)
kubectl cost pod --show-pv
# Estimate monthly cost per node from 7 days of activity (with CPU and memory columns)
kubectl cost node --window 7d --show-cpu --show-memory
# Open the Terminal User Interface
kubectl cost tui
Uninstall KubeCost
# Remove the KubeCost resources
helm uninstall kubecost -n kubecost

3. Installing and Verifying OpenCost

3.1. Installing OpenCost

Install OpenCost
# Create the opencost.yaml file for the OpenCost installation
cat >opencost.yaml <<EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: opencost
  namespace: opencost
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: opencost
rules:
  - apiGroups:
      - ''
    resources:
      - configmaps
      - deployments
      - nodes
      - pods
      - services
      - resourcequotas
      - replicationcontrollers
      - limitranges
      - persistentvolumeclaims
      - persistentvolumes
      - namespaces
      - endpoints
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - daemonsets
      - deployments
      - replicasets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - statefulsets
      - deployments
      - daemonsets
      - replicasets
    verbs:
      - list
      - watch
  - apiGroups:
      - batch
    resources:
      - cronjobs
      - jobs
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - autoscaling
    resources:
      - horizontalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - policy
    resources:
      - poddisruptionbudgets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - storage.k8s.io
    resources:
      - storageclasses
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: opencost
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: opencost
subjects:
  - kind: ServiceAccount
    name: opencost
    namespace: opencost
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: opencost
  namespace: opencost
  labels:
    app: opencost
spec:
  replicas: 1
  selector:
    matchLabels:
      app: opencost
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: opencost
    spec:
      restartPolicy: Always
      serviceAccountName: opencost
      containers:
        - image: ghcr.io/opencost/opencost:latest
          name: opencost
          resources:
            requests:
              cpu: "10m"
              memory: "55M"
            limits:
              cpu: "999m"
              memory: "1G"
          env:
            - name: PROMETHEUS_SERVER_ENDPOINT
              value: "http://${MIMIR_ENDPOINT}.svc/prometheus"
            - name: CLUSTER_ID
              value: "${CLUSTER_NAME}"
          imagePullPolicy: Always
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
        - image: ghcr.io/opencost/opencost-ui:latest
          name: opencost-ui
          resources:
            requests:
              cpu: "10m"
              memory: "55M"
            limits:
              cpu: "999m"
              memory: "1G"
          imagePullPolicy: Always
---
kind: Service
apiVersion: v1
metadata:
  name: opencost
  namespace: opencost
spec:
  selector:
    app: opencost
  type: ClusterIP
  ports:
    - name: opencost
      port: 9003
      targetPort: 9003
    - name: opencost-ui
      port: 9090
      targetPort: 9090
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: opencost
  namespace: opencost
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - host: "opencost.${MyDomain}"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: opencost
                port:
                  number: 9090
EOF
cat opencost.yaml
# [Monitoring 1] opencost namespace
watch kubectl get all -n opencost
# Deploy the OpenCost resources
kubectl create -f opencost.yaml
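Before opening the UI, it is worth confirming that the opencost container can reach its Prometheus-compatible backend (Mimir here). A sketch using the container name from the manifest above; connection errors would surface in this log:

# Tail the opencost container log
kubectl -n opencost logs deploy/opencost -c opencost --tail=20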

3.2. Verifying OpenCost

Check the OpenCost WebUI
Note:
Access http://opencost.${MyDomain}.
Note:
On the Cost Allocation page, changing the Currency to KRW appears to report results in units of 1,000 KRW.
For example, ₩2 means 2,000 KRW, not 2 KRW.
Check kubectl cost
Warning:
kubectl cost targets a kubecost installation by default, so the service-port, service-name, and namespace must be specified.
# Check cost per namespace (with efficiency column)
kubectl cost \
  --service-port 9003 \
  --service-name opencost \
  --kubecost-namespace opencost \
  --allocation-path /allocation/compute \
  namespace \
  --show-efficiency=true
# Open the Terminal User Interface
kubectl cost \
  --service-port 9003 \
  --service-name opencost \
  --kubecost-namespace opencost \
  --allocation-path /allocation/compute \
  tui

4. Cleaning Up the Lab Environment

Delete the lab resources, then tear down the Terraform deployment.

4.1. Deleting the Lab Resources

Delete the lab resources
# Delete OpenCost and the Grafana Agent
kubectl delete -f opencost.yaml
kubectl delete -f agent-bare.yaml
kubectl delete pvc --all -n kubecost

Delete the helm release and PVCs
# Uninstall Mimir and delete its PVCs
helm uninstall mimir -n monitoring
kubectl delete pvc --all -n monitoring

Delete the Mimir S3 IAM policy and role
# Delete the IAM policy and role
aws iam detach-role-policy --role-name AWS-Mimir-Role --policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/aws-mimir-s3
aws iam delete-policy --policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/aws-mimir-s3
aws iam delete-role --role-name AWS-Mimir-Role

Delete the S3 bucket
# Delete the S3 bucket and verify
aws s3 rm s3://$MIMIR_BUCKET_NAME --recursive
aws s3api delete-bucket --bucket $MIMIR_BUCKET_NAME --region $AWS_DEFAULT_REGION
aws s3 ls

4.2. Terraform Teardown

Delete the Terraform-managed resources
nohup sh -c "terraform destroy -auto-approve" > delete.log 2>&1 &
Bash
복사
# terraform 자원 삭제
Warning:
nohup으로 백그라운드로 실행하도록 변경했습니다. Terraform 삭제가 완료되면 정상적으로 자원 삭제 되었는지 확인을 합니다.(cat delete.log)
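To double-check that nothing is left behind in Terraform state (optional; an empty list means everything was destroyed):

# Confirm the Terraform state is empty after the destroy finishes
terraform state list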
This concludes the second lab of Chapter 3, "Setting Up KubeCost and OpenCost".
Great work :)