1. Deploying the Base Environment
This lab accesses the AWS Management Console with an IAM user account and uses the awscli tool with that user's access key.
If you have not done this yet, expand the toggle below and complete it before starting the lab proper.
Create an IAM user and access keys
1.1. Deploying the Base Infrastructure with Terraform
Before deploying the base infrastructure with Terraform, look up your SSH key pair name, IAM user access key ID, and IAM user secret access key, and note them down.
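If you need to look these up, the following calls list the registered key pair names and access key IDs (a quick sketch, assuming awscli is already configured with working credentials; the secret access key itself is only shown at creation time):
# List EC2 key pair names and IAM access key IDs (secret keys are not retrievable)
aws ec2 describe-key-pairs --query 'KeyPairs[*].KeyName' --output text
aws iam list-access-keys --query 'AccessKeyMetadata[*].AccessKeyId' --output text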
Deploy the base infrastructure with Terraform
cd cnaee_class_tf/ch3
# Enter the lab directory
export TF_VAR_KeyName=[your SSH key pair name]
export TF_VAR_MyIamUserAccessKeyID=[your IAM user's access key ID]
export TF_VAR_MyIamUserSecretAccessKey=[your IAM user's secret access key]
export TF_VAR_SgIngressSshCidr=$(curl -s ipinfo.io/ip)/32
# Set the Terraform environment variables
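For context, Terraform reads any TF_VAR_<name> environment variable as the input variable <name>, so the exports above are equivalent to passing the values on the command line, for example:
# Equivalent to the TF_VAR_KeyName export above (illustration only)
terraform plan -var "KeyName=${TF_VAR_KeyName}"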
terraform init
terraform plan
# Initialize Terraform and preview the plan
nohup sh -c "terraform apply -auto-approve" > create.log 2>&1 &
Note:
The apply runs in the background via nohup.
Once the Terraform deployment completes, verify that the resources were created successfully (cat create.log).
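While the apply runs in the background, you can follow its progress and, once it finishes, list what was created:
# Follow the background apply's progress
tail -f create.log
# After it completes, list the resources Terraform created
terraform state list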
Once the base infrastructure has been deployed with Terraform, check the created resources in the management console.
Note:
Log in to the AWS Management Console with the IAM user account.
1.2. Checking and Configuring Basic Information
When the Terraform deployment completes, find the public IP in the bastion_host_ip output value.
SSH into the EKS management instance (myeks-bastion-EC2) at that IP and run the commands below to check the environment.
Note:
Because the OS of myeks-bastion-EC2 has changed, use the ubuntu account for SSH access.
(ssh -i ~/.ssh/XXXX.pem ubuntu@X.X.X.X)
Basic setup and checks
aws eks update-kubeconfig \
--region $AWS_DEFAULT_REGION \
--name $CLUSTER_NAME
# Update the kubeconfig for the EKS cluster
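If the update succeeded, the current context should point at the new cluster; a quick sanity check:
# Confirm the kubeconfig context and API server connectivity
kubectl config current-context
kubectl cluster-info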
kubens default
# Set the default namespace with kubens
echo $AWS_DEFAULT_REGION
echo $CLUSTER_NAME
echo $VPCID
echo $PublicSubnet1,$PublicSubnet2,$PublicSubnet3
echo $PrivateSubnet1,$PrivateSubnet2,$PrivateSubnet3
# Print the environment variables
eksctl get cluster
# Check the EKS cluster with eksctl
eksctl get nodegroup \
--cluster $CLUSTER_NAME \
--name ${CLUSTER_NAME}-node-group
# Check the node group with eksctl
kubectl get node -owide
# Check the nodes with kubectl
PublicN1=$(kubectl get node --label-columns=topology.kubernetes.io/zone --selector=topology.kubernetes.io/zone=ap-northeast-2a -o jsonpath={.items[0].status.addresses[0].address})
PublicN2=$(kubectl get node --label-columns=topology.kubernetes.io/zone --selector=topology.kubernetes.io/zone=ap-northeast-2b -o jsonpath={.items[0].status.addresses[0].address})
PublicN3=$(kubectl get node --label-columns=topology.kubernetes.io/zone --selector=topology.kubernetes.io/zone=ap-northeast-2c -o jsonpath={.items[0].status.addresses[0].address})
echo "export PublicN1=$PublicN1" >> /etc/profile
echo "export PublicN2=$PublicN2" >> /etc/profile
echo "export PublicN3=$PublicN3" >> /etc/profile
echo $PublicN1, $PublicN2, $PublicN3
# Declare node IP variables
for node in $PublicN1 $PublicN2 $PublicN3; \
do \
ssh -i ~/.ssh/kp_node.pem -o StrictHostKeyChecking=no ec2-user@$node hostname; \
done
# Verify SSH access to each node
1.3. Installing Grafana Mimir and Grafana Agent
Install Grafana Mimir as the metrics backend for KubeCost and OpenCost, and Grafana Agent as the collector that gathers the metrics.
Prerequisites
export MyDomain=[your domain]
echo "export MyDomain=$MyDomain" >> /etc/profile; echo $MyDomain
export NICKNAME=[your nickname]
echo "export NICKNAME=$NICKNAME" >> /etc/profile; echo $NICKNAME
export OIDC_ARN=$(aws iam list-open-id-connect-providers --query 'OpenIDConnectProviderList[*].Arn' --output text)
echo "export OIDC_ARN=$OIDC_ARN" >> /etc/profile; echo $OIDC_ARN
export OIDC_URL=${OIDC_ARN#*oidc-provider/}
echo "export OIDC_URL=$OIDC_URL" >> /etc/profile; echo $OIDC_URL
# Declare variables
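The OIDC_URL assignment above uses the shell's ${var#pattern} expansion to strip everything up to and including oidc-provider/ from the ARN; for example, with a hypothetical ARN:
# Hypothetical ARN, for illustration of the ${var#pattern} expansion
ARN="arn:aws:iam::111122223333:oidc-provider/oidc.eks.ap-northeast-2.amazonaws.com/id/EXAMPLE1234"
echo "${ARN#*oidc-provider/}"
# -> oidc.eks.ap-northeast-2.amazonaws.com/id/EXAMPLE1234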
cat <<EOT | kubectl apply -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp3
allowVolumeExpansion: true
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
  allowAutoIOPSPerGBIncrease: 'true'
  encrypted: 'true'
EOT
# Create the gp3 StorageClass
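You can confirm the new StorageClass was registered:
# Confirm the gp3 StorageClass exists
kubectl get sc gp3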
kubectl create ns monitoring
kubectl create ns kubecost
kubectl create ns opencost
# Create the namespaces
Install Grafana Mimir
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Add the helm repo
aws s3api create-bucket \
--bucket mimir-${NICKNAME} \
--region $AWS_DEFAULT_REGION \
--create-bucket-configuration LocationConstraint=$AWS_DEFAULT_REGION
aws s3 ls
# Create and list the S3 bucket for Mimir
export MIMIR_BUCKET_NAME="mimir-${NICKNAME}"
echo "export MIMIR_BUCKET_NAME=$MIMIR_BUCKET_NAME" >> /etc/profile; echo $MIMIR_BUCKET_NAME
# Store the S3 bucket name in a variable
cat >grafana-mimir-s3-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MimirStorage",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::${MIMIR_BUCKET_NAME}",
        "arn:aws:s3:::${MIMIR_BUCKET_NAME}/*"
      ]
    }
  ]
}
EOF
cat grafana-mimir-s3-policy.json
# Create grafana-mimir-s3-policy.json
aws iam create-policy \
--policy-name aws-mimir-s3 \
--policy-document file://grafana-mimir-s3-policy.json
# Create the aws-mimir-s3 IAM policy
cat >trust-rs-mimir.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "${OIDC_ARN}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_URL}:sub": "system:serviceaccount:monitoring:mimir",
          "${OIDC_URL}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF
cat trust-rs-mimir.json
# Create the trust policy for the Mimir IAM role
aws iam create-role \
--role-name AWS-Mimir-Role \
--assume-role-policy-document file://trust-rs-mimir.json
# Create AWS-Mimir-Role
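To confirm the role was created with the intended trust relationship, you can inspect it:
# Review the trust policy attached to the new role
aws iam get-role --role-name AWS-Mimir-Role --query 'Role.AssumeRolePolicyDocument'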
aws iam attach-role-policy \
--role-name AWS-Mimir-Role \
--policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/aws-mimir-s3
# Attach the IAM policy to the IAM role
export MIMIR_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/AWS-Mimir-Role
echo "export MIMIR_ROLE_ARN=$MIMIR_ROLE_ARN" >> /etc/profile; echo $MIMIR_ROLE_ARN
# Declare the Mimir IAM role ARN variable
cat >mimir-values.yaml <<EOF
image:
  repository: grafana/mimir
  tag: 2.10.3
  pullPolicy: IfNotPresent
mimir:
  structuredConfig:
    limits:
      max_label_names_per_series: 60
      compactor_blocks_retention_period: 30d
    blocks_storage:
      backend: s3
      s3:
        bucket_name: ${MIMIR_BUCKET_NAME}
        endpoint: s3.${AWS_DEFAULT_REGION}.amazonaws.com
        region: ${AWS_DEFAULT_REGION}
      tsdb:
        retention_period: 13h
      bucket_store:
        ignore_blocks_within: 10h
    querier:
      query_store_after: 12h
    ingester:
      ring:
        replication_factor: 3
serviceAccount:
  create: true
  name: "mimir"
  annotations:
    "eks.amazonaws.com/role-arn": "${MIMIR_ROLE_ARN}"
minio:
  enabled: false
alertmanager:
  enabled: false
ruler:
  enabled: false
compactor:
  persistentVolume:
    enabled: true
    annotations: {}
    accessModes:
      - ReadWriteOnce
    size: 5Gi
    storageClass: gp3
ingester:
  zoneAwareReplication:
    enabled: false
  persistentVolume:
    enabled: true
    annotations: {}
    accessModes:
      - ReadWriteOnce
    size: 5Gi
    storageClass: gp3
store_gateway:
  zoneAwareReplication:
    enabled: false
  persistentVolume:
    enabled: true
    annotations: {}
    accessModes:
      - ReadWriteOnce
    size: 5Gi
    storageClass: gp3
EOF
cat mimir-values.yaml
# Create and review mimir-values.yaml
watch kubectl get pod,pv,pvc,cm -n monitoring
# [Monitoring 1] Watch pods, PVs, PVCs, and ConfigMaps in the monitoring namespace
helm install mimir grafana/mimir-distributed \
-n monitoring \
-f mimir-values.yaml \
--version 5.4.0
# Install Mimir from the helm chart using mimir-values.yaml
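Once the chart is up, it is worth confirming that the mimir service account carries the IRSA annotation so its pods can assume AWS-Mimir-Role:
# Check the IRSA role annotation on the mimir service account
kubectl -n monitoring get sa mimir \
  -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}{"\n"}'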
Install Grafana Agent
export MIMIR_ENDPOINT_PUSH="http://mimir-nginx.monitoring/api/v1/push"
# Declare the Mimir push endpoint variable
cat <<EOF |
kind: ConfigMap
metadata:
  name: grafana-agent
apiVersion: v1
data:
  agent.yaml: |
    metrics:
      wal_directory: /var/lib/agent/wal
      global:
        scrape_interval: 60s
        external_labels:
          cluster: ${CLUSTER_NAME}
      configs:
      - name: integrations
        remote_write:
        - headers:
            X-Scope-OrgID: kubecost_mimir
          url: ${MIMIR_ENDPOINT_PUSH}
        - url: ${MIMIR_ENDPOINT_PUSH}
        scrape_configs:
        - job_name: kubecost
          honor_labels: true
          scrape_interval: 1m
          scrape_timeout: 10s
          metrics_path: /metrics
          scheme: http
          dns_sd_configs:
          - names:
            - kubecost-cost-analyzer.kubecost
            type: 'A'
            port: 9003
        - job_name: kubecost-networking
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          # Scrape only the targets matching the following metadata
          - source_labels: [__meta_kubernetes_pod_label_app]
            action: keep
            regex: 'kubecost-network-costs'
        - job_name: prometheus
          static_configs:
          - targets:
            - localhost:9090
        - job_name: 'kubernetes-nodes-cadvisor'
          scheme: https
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: true
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          kubernetes_sd_configs:
          - role: node
          relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/\$\$1/proxy/metrics/cadvisor
          metric_relabel_configs:
          - source_labels: [ __name__ ]
            regex: (container_cpu_usage_seconds_total|container_memory_working_set_bytes|container_network_receive_errors_total|container_network_transmit_errors_total|container_network_receive_packets_dropped_total|container_network_transmit_packets_dropped_total|container_memory_usage_bytes|container_cpu_cfs_throttled_periods_total|container_cpu_cfs_periods_total|container_fs_usage_bytes|container_fs_limit_bytes|container_cpu_cfs_periods_total|container_fs_inodes_free|container_fs_inodes_total|container_fs_usage_bytes|container_fs_limit_bytes|container_cpu_cfs_throttled_periods_total|container_cpu_cfs_periods_total|container_network_receive_bytes_total|container_network_transmit_bytes_total|container_fs_inodes_free|container_fs_inodes_total|container_fs_usage_bytes|container_fs_limit_bytes|container_spec_cpu_shares|container_spec_memory_limit_bytes|container_network_receive_bytes_total|container_network_transmit_bytes_total|container_fs_reads_bytes_total|container_network_receive_bytes_total|container_fs_writes_bytes_total|container_fs_reads_bytes_total|cadvisor_version_info|kubecost_pv_info|kubelet_volume_stats_used_bytes|kubelet_volume_stats_capacity_bytes|kubelet_volume_stats_available_bytes|kubelet_volume_stats_inodes|kubelet_volume_stats_inodes_free|kubelet_volume_stats_inodes_used)
            action: keep
          - source_labels: [ container ]
            target_label: container_name
            regex: (.+)
            action: replace
          - source_labels: [ pod ]
            target_label: pod_name
            regex: (.+)
            action: replace
        - job_name: 'kubernetes-nodes'
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: true
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          kubernetes_sd_configs:
          - role: node
          relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/\$\$1/proxy/metrics
          metric_relabel_configs:
          - source_labels: [ __name__ ]
            regex: (kubelet_volume_stats_used_bytes) # this metric is in alpha
            action: keep
        - job_name: 'kubernetes-service-endpoints'
          kubernetes_sd_configs:
          - role: endpoints
          relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_endpoints_name]
            action: keep
            regex: (.*kube-state-metrics|.*prometheus-node-exporter|kubecost-network-costs)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: \$\$1:\$\$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: kubernetes_node
          metric_relabel_configs:
          - source_labels: [ __name__ ]
            regex: (container_cpu_allocation|container_cpu_usage_seconds_total|container_fs_limit_bytes|container_fs_writes_bytes_total|container_gpu_allocation|container_memory_allocation_bytes|container_memory_usage_bytes|container_memory_working_set_bytes|container_network_receive_bytes_total|container_network_transmit_bytes_total|DCGM_FI_DEV_GPU_UTIL|deployment_match_labels|kube_daemonset_status_desired_number_scheduled|kube_daemonset_status_number_ready|kube_deployment_spec_replicas|kube_deployment_status_replicas|kube_deployment_status_replicas_available|kube_job_status_failed|kube_namespace_annotations|kube_namespace_labels|kube_node_info|kube_node_labels|kube_node_status_allocatable|kube_node_status_allocatable_cpu_cores|kube_node_status_allocatable_memory_bytes|kube_node_status_capacity|kube_node_status_capacity_cpu_cores|kube_node_status_capacity_memory_bytes|kube_node_status_condition|kube_persistentvolume_capacity_bytes|kube_persistentvolume_status_phase|kube_persistentvolumeclaim_info|kube_persistentvolumeclaim_resource_requests_storage_bytes|kube_pod_container_info|kube_pod_container_resource_limits|kube_pod_container_resource_limits_cpu_cores|kube_pod_container_resource_limits_memory_bytes|kube_pod_container_resource_requests|kube_pod_container_resource_requests_cpu_cores|kube_pod_container_resource_requests_memory_bytes|kube_pod_container_status_restarts_total|kube_pod_container_status_running|kube_pod_container_status_terminated_reason|kube_pod_labels|kube_pod_owner|kube_pod_status_phase|kube_replicaset_owner|kube_statefulset_replicas|kube_statefulset_status_replicas|kubecost_cluster_info|kubecost_cluster_management_cost|kubecost_cluster_memory_working_set_bytes|kubecost_load_balancer_cost|kubecost_network_internet_egress_cost|kubecost_network_region_egress_cost|kubecost_network_zone_egress_cost|kubecost_node_is_spot|kubecost_pod_network_egress_bytes_total|node_cpu_hourly_cost|node_cpu_seconds_total|node_disk_reads_completed|node_disk_reads_completed_total|node_disk_writes_completed|node_disk_writes_completed_total|node_filesystem_device_error|node_gpu_count|node_gpu_hourly_cost|node_memory_Buffers_bytes|node_memory_Cached_bytes|node_memory_MemAvailable_bytes|node_memory_MemFree_bytes|node_memory_MemTotal_bytes|node_network_transmit_bytes_total|node_ram_hourly_cost|node_total_hourly_cost|pod_pvc_allocation|pv_hourly_cost|service_selector_labels|statefulSet_match_labels|kubecost_pv_info|up)
            action: keep
        - job_name: 'kubernetes-service-endpoints-slow'
          scrape_interval: 5m
          scrape_timeout: 30s
          kubernetes_sd_configs:
          - role: endpoints
          relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: \$\$1:\$\$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: kubernetes_node
        - job_name: 'prometheus-pushgateway'
          honor_labels: true
          kubernetes_sd_configs:
          - role: service
          relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: pushgateway
        - job_name: 'kubernetes-services'
          metrics_path: /probe
          params:
            module: [http_2xx]
          kubernetes_sd_configs:
          - role: service
          relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: true
          - source_labels: [__address__]
            target_label: __param_target
          - target_label: __address__
            replacement: blackbox
          - source_labels: [__param_target]
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: kubernetes_name
EOF
(export NAMESPACE=kubecost && kubectl apply -n $NAMESPACE -f -)
# Apply the Grafana Agent ConfigMap
curl -O https://raw.githubusercontent.com/cloudneta/cnaeelab/master/_data/agent-bare.yaml
kubectl apply -f agent-bare.yaml
kubectl get all -n kubecost
# Download and apply the Grafana Agent manifest
kubectl -n kubecost logs grafana-agent-0
# Check the Grafana Agent logs
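As an optional end-to-end check, you can query Mimir's Prometheus-compatible read API for the metrics the agent pushed; the X-Scope-OrgID header must match the tenant used in the remote_write config (a sketch, assuming the mimir-nginx gateway listens on port 80):
# Port-forward the Mimir gateway and query the kubecost_mimir tenant
kubectl -n monitoring port-forward svc/mimir-nginx 8080:80 &
curl -s -H "X-Scope-OrgID: kubecost_mimir" \
  "http://localhost:8080/prometheus/api/v1/query?query=up" | jq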
2. Installing and Verifying KubeCost
2.1. Installing KubeCost
Install KubeCost
MIMIR_ENDPOINT="mimir-nginx.monitoring"
echo "export MIMIR_ENDPOINT=$MIMIR_ENDPOINT" >> /etc/profile; echo $MIMIR_ENDPOINT
MIMIR_ORG_ID="0"
echo "export MIMIR_ORG_ID=$MIMIR_ORG_ID" >> /etc/profile; echo $MIMIR_ORG_ID
# Declare variables
cat >kubecost-values.yaml <<EOF
global:
  mimirProxy:
    enabled: true
    mimirEndpoint: http://${MIMIR_ENDPOINT}
    orgIdentifier: ${MIMIR_ORG_ID}
  prometheus:
    enabled: false
    fqdn: http://${MIMIR_ENDPOINT}/prometheus
persistentVolume:
  enabled: true
  storageClass: gp3
kubecostProductConfigs:
  clusterName: ${CLUSTER_NAME}
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "alb"
    alb.ingress.kubernetes.io/scheme: "internet-facing"
    alb.ingress.kubernetes.io/target-type: "ip"
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80}]'
  hosts:
    - "kubecost.${MyDomain}"
  paths:
    - "/*"
  pathType: "ImplementationSpecific"
EOF
cat kubecost-values.yaml
# Create the KubeCost helm values file
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
# Add the helm repo
watch kubectl get all -n kubecost
# [Monitoring 1] Watch the kubecost namespace
helm upgrade -i kubecost kubecost/cost-analyzer \
-n kubecost \
--version 2.6.3 \
--values kubecost-values.yaml
# Install KubeCost with helm
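After the pods settle, the ALB hostname assigned to the ingress is what the kubecost.${MyDomain} DNS record should resolve to:
# Check the ingress and the ALB address assigned to it
kubectl -n kubecost get ingress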
2.2. Verifying KubeCost
Check the KubeCost web UI
Note:
Browse to http://kubecost.${MyDomain}.
1. Overview: a comprehensive dashboard for managing Kubernetes cluster costs
2. Monitor - Allocations: detailed information on resource usage and cost allocation within the cluster
3. Monitor - Assets: costs and usage history for the cluster's resources
4. Monitor - Clusters: costs for every Kubernetes cluster that KubeCost monitors
5. Monitor - Efficiency: cost efficiency at the cluster or node level
6. Reports: a page that collects your saved reports
7. Savings - Insights: cost-saving recommendations and their estimated savings
8. Alerts: monitoring and alerting for the state of the cluster and of KubeCost itself
Warning:
After installation, KubeCost spends some time collecting data and initializing its data store.
Expect correct results only after waiting five minutes or more.
Check the KubeCost APIs
kubectl run curl --image=appropriate/curl --restart=Never --rm -it -- sh
# Launch a temporary pod for testing the KubeCost API
KUBECOST_EP="http://kubecost-cost-analyzer.kubecost.svc.cluster.local:9090"
apk add --no-cache jq
# Inside the test pod, declare the endpoint variable and install jq
curl $KUBECOST_EP/model/allocation \
-d window=3d \
-d aggregate=namespace \
-d accumulate=false \
-d shareIdle=false \
-d format=json \
-G | jq
# Query allocation data via the API
curl $KUBECOST_EP/model/assets \
-d window=today \
-d aggregate=type \
-d accumulate=true \
-d disableAdjustments=true \
-d format=json \
-G | jq
# Query asset data via the API
exit
# Exit when done
Install kubectl cost
os=$(uname | tr '[:upper:]' '[:lower:]') && \
arch=$(uname -m | tr '[:upper:]' '[:lower:]' | sed -e s/x86_64/amd64/) && \
curl -s -L https://github.com/kubecost/kubectl-cost/releases/latest/download/kubectl-cost-$os-$arch.tar.gz | tar xz -C /tmp && \
chmod +x /tmp/kubectl-cost && \
sudo mv /tmp/kubectl-cost /usr/local/bin/kubectl-cost
# Install the kubectl cost plugin
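A quick way to confirm the plugin is on the PATH and visible to kubectl:
# Verify that kubectl picks up the cost plugin
kubectl plugin list
kubectl cost --help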
Check kubectl cost
kubectl cost namespace
# Check costs per namespace
kubectl cost deployment --show-cpu
# Check costs per Deployment (with CPU breakdown)
kubectl cost pod --show-pv
# Check costs per Pod (with PV breakdown)
kubectl cost node --window 7d --show-cpu --show-memory
# Project monthly cost per node from 7 days of activity (with CPU and memory)
kubectl cost tui
# Open the terminal user interface
Remove KubeCost
helm uninstall kubecost -n kubecost
# Uninstall the KubeCost resources
3. Installing and Verifying OpenCost
3.1. Installing OpenCost
Install OpenCost
cat >opencost.yaml <<EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: opencost
  namespace: opencost
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: opencost
rules:
  - apiGroups:
      - ''
    resources:
      - configmaps
      - deployments
      - nodes
      - pods
      - services
      - resourcequotas
      - replicationcontrollers
      - limitranges
      - persistentvolumeclaims
      - persistentvolumes
      - namespaces
      - endpoints
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - daemonsets
      - deployments
      - replicasets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - statefulsets
      - deployments
      - daemonsets
      - replicasets
    verbs:
      - list
      - watch
  - apiGroups:
      - batch
    resources:
      - cronjobs
      - jobs
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - autoscaling
    resources:
      - horizontalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - policy
    resources:
      - poddisruptionbudgets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - storage.k8s.io
    resources:
      - storageclasses
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: opencost
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: opencost
subjects:
  - kind: ServiceAccount
    name: opencost
    namespace: opencost
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: opencost
  namespace: opencost
  labels:
    app: opencost
spec:
  replicas: 1
  selector:
    matchLabels:
      app: opencost
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: opencost
    spec:
      restartPolicy: Always
      serviceAccountName: opencost
      containers:
        - image: ghcr.io/opencost/opencost:latest
          name: opencost
          resources:
            requests:
              cpu: "10m"
              memory: "55M"
            limits:
              cpu: "999m"
              memory: "1G"
          env:
            - name: PROMETHEUS_SERVER_ENDPOINT
              value: "http://${MIMIR_ENDPOINT}.svc/prometheus"
            - name: CLUSTER_ID
              value: "${CLUSTER_NAME}"
          imagePullPolicy: Always
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
        - image: ghcr.io/opencost/opencost-ui:latest
          name: opencost-ui
          resources:
            requests:
              cpu: "10m"
              memory: "55M"
            limits:
              cpu: "999m"
              memory: "1G"
          imagePullPolicy: Always
---
kind: Service
apiVersion: v1
metadata:
  name: opencost
  namespace: opencost
spec:
  selector:
    app: opencost
  type: ClusterIP
  ports:
    - name: opencost
      port: 9003
      targetPort: 9003
    - name: opencost-ui
      port: 9090
      targetPort: 9090
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: opencost
  namespace: opencost
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - host: "opencost.${MyDomain}"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: opencost
                port:
                  number: 9090
EOF
cat opencost.yaml
# Create and review opencost.yaml
watch kubectl get all -n opencost
# [Monitoring 1] Watch the opencost namespace
kubectl create -f opencost.yaml
# Deploy the OpenCost resources
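Before moving to the UI, you can hit the OpenCost allocation API directly, the same /allocation/compute path used with kubectl cost below; a sketch using a port-forward:
# Port-forward the OpenCost API and request a one-day allocation window
kubectl -n opencost port-forward svc/opencost 9003:9003 &
curl -s "http://localhost:9003/allocation/compute?window=1d" | jq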
3.2. Verifying OpenCost
Check the OpenCost web UI
Note:
Browse to http://opencost.${MyDomain}.
Note:
On the Cost Allocation page, switching the currency to KRW appears to report results in thousands of won.
For example, ₩2 means 2,000 won, not 2 won.
Check kubectl cost
Warning:
kubectl cost targets KubeCost by default, so you must specify the service port, service name, and namespace explicitly.
kubectl cost \
--service-port 9003 \
--service-name opencost \
--kubecost-namespace opencost \
--allocation-path /allocation/compute \
namespace \
--show-efficiency=true
# Check costs per namespace (with efficiency breakdown)
kubectl cost \
--service-port 9003 \
--service-name opencost \
--kubecost-namespace opencost \
--allocation-path /allocation/compute \
tui
# Open the terminal user interface
4. Cleaning Up the Lab Environment
Delete the lab resources, then tear down the Terraform deployment.
4.1. Deleting Lab Resources
Delete the lab resources
kubectl delete -f opencost.yaml
kubectl delete -f agent-bare.yaml
kubectl delete pvc --all -n kubecost
# Delete OpenCost and the Grafana Agent
Delete the helm release and PVCs
helm uninstall mimir -n monitoring
kubectl delete pvc --all -n monitoring
# Uninstall Mimir and delete its PVCs
Delete the Mimir S3 IAM policy and role
aws iam detach-role-policy --role-name AWS-Mimir-Role --policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/aws-mimir-s3
aws iam delete-policy --policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/aws-mimir-s3
aws iam delete-role --role-name AWS-Mimir-Role
# Delete the IAM policy and role
Delete the S3 bucket
aws s3 rm s3://$MIMIR_BUCKET_NAME --recursive
aws s3api delete-bucket --bucket $MIMIR_BUCKET_NAME --region $AWS_DEFAULT_REGION
aws s3 ls
# Delete the S3 bucket and verify
4.2. Destroying the Terraform Deployment
Destroy the Terraform resources
nohup sh -c "terraform destroy -auto-approve" > delete.log 2>&1 &
# Destroy the Terraform resources
Warning:
The destroy runs in the background via nohup.
Once the Terraform destroy completes, verify that the resources were deleted successfully (cat delete.log).
This concludes "Setting Up KubeCost and OpenCost," the second lab of Chapter 3.
Great work :)