
Prometheus vs InfluxDB

2019. 4. 21. 22:04

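This write-up is biased, so it is of course not an absolute evaluation. Pick what fits your situation; it is still a useful reference for each tool's characteristics.

Original: https://bitworking.org/news/2017/03/prometheus

We have just finished migrating all of our monitoring from InfluxDB to Prometheus, and I wrote down the reasons for the change. These are my personal observations and relate to a specific project. These issues may not apply to you, and you should evaluate each product for your own use.

Update: To clarify, the versions compared are InfluxDB 1.1.1 and Prometheus 1.5.2.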

Push vs Pull

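InfluxDB is a push-based system: the running application must actively push data into the monitoring system. Prometheus is a pull-based system: the Prometheus server periodically fetches metric values from the running applications.

Because Prometheus controls polling centrally, you can change how often targets are polled, for instance moving away from once-a-minute polling, just by adjusting the Prometheus server's configuration. With InfluxDB, changing how often metrics are pushed means redeploying every application. The pull model also lets Prometheus create and expose a synthetic "UP" metric that tracks whether an application is up and running. For short-lived applications, Prometheus offers a push gateway.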
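
The pull model and the synthetic "UP" metric described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the targets are plain functions standing in for application metrics endpoints (a real Prometheus server scrapes over HTTP), and the names healthy_app, crashed_app and scrape_once are hypothetical, not part of any Prometheus API.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for application metrics endpoints: each target is a
# callable that returns its current metrics, or raises if the app is down.
def healthy_app() -> Dict[str, float]:
    return {"cpu_load_short": 0.42}

def crashed_app() -> Dict[str, float]:
    raise ConnectionError("connection refused")

def scrape_once(targets: Dict[str, Callable[[], Dict[str, float]]]) -> Dict[str, Dict[str, float]]:
    """Pull metrics from every target and attach a synthetic 'up' sample."""
    results = {}
    for name, fetch in targets.items():
        try:
            samples = dict(fetch())
            samples["up"] = 1.0    # the scrape succeeded
        except Exception:
            samples = {"up": 0.0}  # the scrape failed; 'up' is all we know
        results[name] = samples
    return results

snapshot = scrape_once({"web": healthy_app, "worker": crashed_app})
print(snapshot)
```

Because the server drives the loop, the polling interval lives in one place, and a target that fails to answer still produces a data point, which a push-only design cannot give you.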

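Data Store

InfluxDB uses a monolithic database for both metric values and indexes. Prometheus uses LevelDB for its indexes, but each metric is stored in its own file.

Both use key/value datastores, but they use them very differently, and that affects performance. InfluxDB was slower and took up far more disk space than Prometheus for exactly the same set of metrics. Just starting InfluxDB and sending it a small number of metrics grew its datastore to 1 GB, and it then grew rapidly to 100 GB for our full set of metrics, whereas Prometheus has yet to crack 10 GB with all of our metrics. InfluxDB also lost all of our data, either after it failed or after a failed attempt to upgrade the InfluxDB version we were running.

Update: I was reminded of another datastore-related issue, startup time. Prometheus starts in a matter of seconds, whereas InfluxDB would regularly take 5 minutes while it validated or rebuilt its indexes, and it collected no data during the whole process.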

CPU

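Probably closely related to datastore efficiency: InfluxDB was maxing out the server it ran on, while Prometheus on the same instance peaks at a load of about 0.2.

Query Language

InfluxDB uses a variant of SQL. Prometheus uses a substantially simpler and more direct query model.

Which would you rather type?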

SELECT * FROM "cpu_load_short" WHERE "value" > 0.9

or

cpu_load_short > 0.9

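Configuration

InfluxDB: a mix of configuration files and SQL commands

Prometheus: text files

Prometheus configuration is simply a YAML file, and the entire configuration is done through files. With InfluxDB you have to keep in mind that part of the configuration, for example creating the named database that metrics are stored in, is actually done by sending commands to the server. Also, Prometheus by default keeps data for only 15 days, while InfluxDB by default keeps all data forever; if you do not want to keep everything forever, you have to compose SQL commands and send them to the server to control how data is retained.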

License: Attribution-NonCommercial


Fetching IPMI Logs

2019. 4. 15. 23:44

Fetching IPMI logs on Linux

Quick summary: the commands

# Check the log
ipmitool -I lan -U root -L USER sel list -H 10.178.209.106
# Current system time
ipmitool -I lan -U root -L USER sel time get -H 10.178.209.106

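Network access

On Softlayer, the private IPMI addresses are all reachable from within the same account.
That holds only when network policy allows it, of course (it is allowed by default).
Naturally it does not work if you have separated things with VLANs or a firewall.

IPMI account access

You can look up the credentials in the portal UI or through the API.
Note that the default user ID is root, yet its privilege level is plain USER. (Needlessly confusing.)
If you open a ticket asking to have the root user promoted to Admin, they will do it.
A plain ipmitool command requests Privilege Level : ADMINISTRATOR,
so it fails with Activate Session error: Requested privilege level exceeds limit.
The fix is to pass -L USER, which runs the command at USER privilege.
Since all I need right now is to read log values, I will stay at the default USER privilege.

Debug option

Passing -vv prints verbose details.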

[root@cmd ~]# ipmitool -H 10.178.209.106 -U root fru
Password:
Activate Session error:    Requested privilege level exceeds limit
Error: Unable to establish LAN session
Error: Unable to establish IPMI v1.5 / RMCP session

With -vv you can see that the error comes from requesting Privilege Level : ADMINISTRATOR.

[root@cmd ~]# ipmitool -vv -H 10.178.209.106 -U root fru
Password:
Sending IPMI/RMCP presence ping packet
Received IPMI/RMCP response packet:
  IPMI Supported
  ASF Version 1.0
  RMCP Version 1.0
  RMCP Sequence 255
  IANA Enterprise 4542

ipmi_lan_send_cmd:opened=[1], open=[-204238448]
Channel 01 Authentication Capabilities:
  Privilege Level : ADMINISTRATOR
  Auth Types      : MD2 MD5 PASSWORD
  Per-msg auth    : enabled
  User level auth : enabled
  Non-null users  : enabled
  Null users      : disabled
  Anonymous login : disabled

Proceeding with AuthType MD5
ipmi_lan_send_cmd:opened=[1], open=[-204238448]
Opening Session
  Session ID      : ff00001a
  Challenge       : 458a142850a040802142840810204080
  Privilege Level : ADMINISTRATOR
  Auth Type       : MD5
ipmi_lan_send_cmd:opened=[1], open=[-204238448]
Activate Session error:    Requested privilege level exceeds limit
Error: Unable to establish LAN session
Error: Unable to establish IPMI v1.5 / RMCP session

Checking the FRU (Field Replaceable Unit)

[root@cmd ~]# ipmitool -H 10.178.209.106 -U root -L USER fru
Password:
FRU Device Description : Builtin FRU Device (ID 0)
 Chassis Type          : Other
 Chassis Part Number   : CSE-819UTS-ㅇㅇㅇㅇ-ST031
 Chassis Serial        : C8ㅇㅇㅇㅇㅇㅇ13
 Board Mfg Date        : Mon Jan  1 09:00:00 1996
 Board Mfg             : Supermicro
 Board Serial          : OM17ㅇㅇㅇㅇ9
 Board Part Number     : X11DPU
 Product Manufacturer  : Supermicro
 Product Part Number   : SYS-ㅇㅇㅇㅇ-TN4R4T
 Product Serial        : A291ㅇㅇㅇㅇ908570

Checking the SEL (System Event Log)

-I lan selects the interface and can be omitted, as in the examples above. Note that -L USER sets the privilege level to a regular user.

 [root@cmd ~]# ipmitool -I lan -U root -L USER sel list -H 10.178.209.106
Password:
   1 | 02/12/2019 | 23:43:04 | OS Boot | C: boot completed () | Asserted
   2 | 02/12/2019 | 23:56:40 | OS Critical Stop | Run-time critical stop () | Asserted
   3 | 02/12/2019 | 23:56:40 | OS Critical Stop | OS graceful shutdown () | Asserted
   4 | 02/12/2019 | 23:58:28 | OS Boot | C: boot completed () | Asserted
   5 | 02/12/2019 | 23:59:19 | OS Critical Stop | OS graceful shutdown () | Asserted
   6 | 02/13/2019 | 00:01:27 | OS Boot | C: boot completed () | Asserted
   7 | 02/13/2019 | 00:18:01 | Unknown #0xff |  | Asserted
   8 | 02/13/2019 | 00:19:30 | Physical Security #0xaa | General Chassis intrusion () | Asserted
   9 | 02/13/2019 | 00:20:08 | OS Boot | C: boot completed () | Asserted
   a | 02/13/2019 | 00:47:00 | OS Critical Stop | OS graceful shutdown () | Asserted
   b | 02/13/2019 | 00:49:20 | OS Boot | C: boot completed () | Asserted
   c | 02/14/2019 | 03:42:02 | OS Critical Stop | OS graceful shutdown () | Asserted
   d | 02/14/2019 | 10:18:49 | OS Boot | C: boot completed () | Asserted
   e | 02/14/2019 | 10:22:57 | OS Boot | C: boot completed () | Asserted
   f | 02/14/2019 | 10:35:27 | OS Boot | C: boot completed () | Asserted
  10 | 02/14/2019 | 10:49:15 | OS Critical Stop | Run-time critical stop () | Asserted
  11 | 02/14/2019 | 10:49:15 | OS Critical Stop | OS graceful shutdown () | Asserted
  12 | 02/14/2019 | 10:51:04 | OS Boot | C: boot completed () | Asserted
  13 | 02/14/2019 | 10:51:55 | OS Critical Stop | OS graceful shutdown () | Asserted
  14 | 02/14/2019 | 10:54:03 | OS Boot | C: boot completed () | Asserted
  15 | 02/14/2019 | 11:39:11 | OS Critical Stop | OS graceful shutdown () | Asserted
  16 | 02/14/2019 | 11:41:31 | OS Boot | C: boot completed () | Asserted
  17 | 03/21/2019 | 23:50:01 | OS Critical Stop | OS graceful shutdown () | Asserted
  18 | 03/21/2019 | 23:52:24 | OS Boot | C: boot completed () | Asserted
  19 | 03/21/2019 | 23:53:10 | OS Critical Stop | OS graceful shutdown () | Asserted
  1a | 03/21/2019 | 23:55:23 | OS Boot | C: boot completed () | Asserted
  1b | 04/14/2019 | 21:24:42 | Session Audit #0xff |  | Asserted
  1c | 04/14/2019 | 21:34:01 | Session Audit #0xff |  | Asserted
  1d | 04/14/2019 | 21:40:01 | Session Audit #0xff |  | Asserted
  1e | 04/14/2019 | 21:41:17 | Session Audit #0xff |  | Asserted
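
To do anything with this output beyond reading it, the pipe-separated columns need to be split into fields. Below is a minimal Python sketch that parses lines in the format shown above. The SelEntry type and the parse_sel_list function are hypothetical names for illustration; the field layout is assumed from this one sample and may differ on other BMCs or ipmitool versions.

```python
from datetime import datetime
from typing import List, NamedTuple

class SelEntry(NamedTuple):
    record_id: int       # first column, printed in hexadecimal
    timestamp: datetime
    sensor: str
    event: str
    direction: str

def parse_sel_list(text: str) -> List[SelEntry]:
    """Parse `ipmitool sel list` output; lines that do not fit are skipped."""
    entries = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 6:
            continue  # prompt lines, "Password:", blank lines
        rec, date, time, sensor, event, direction = fields
        try:
            record_id = int(rec, 16)
            ts = datetime.strptime(f"{date} {time}", "%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue  # unexpected ID or timestamp format
        entries.append(SelEntry(record_id, ts, sensor, event, direction))
    return entries

sample = """\
   9 | 02/13/2019 | 00:20:08 | OS Boot | C: boot completed () | Asserted
   a | 02/13/2019 | 00:47:00 | OS Critical Stop | OS graceful shutdown () | Asserted
  1b | 04/14/2019 | 21:24:42 | Session Audit #0xff |  | Asserted
"""
entries = parse_sel_list(sample)
for entry in entries:
    print(entry.record_id, entry.timestamp.isoformat(), entry.sensor)
```

Note that the record ID column is hexadecimal, so `a` follows `9` and `1b` is record 27.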

Checking the current system time on the IPMI

 [root@cmd ~]# ipmitool -I lan -U root -L USER sel time get -H 10.178.209.106

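Usage

I wanted to run this per server from AWX and push the results into Elasticsearch, but I am still working out how to send only the new entries.
Maybe I should just send everything every time and deduplicate in ES.

For more IPMI details, see the links below.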
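
Both options mentioned above can be sketched in Python. This is an untested idea rather than something the post implemented: new_records keeps a per-host high-water mark so only new entries are sent, and doc_id derives a deterministic document ID so that re-sending an entry would overwrite rather than duplicate it (assuming the ID is used as the Elasticsearch document ID). The function names and the record tuple layout are hypothetical. The timestamp is part of the ID on the assumption that record IDs may start over if the SEL is cleared.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

# (record_id, "MM/DD/YYYY HH:MM:SS", description) as read from `sel list`
Record = Tuple[int, str, str]

def doc_id(host: str, record: Record) -> str:
    """Stable ID for one SEL entry: same host + record + time -> same ID."""
    rec_id, timestamp, _ = record
    key = f"{host}|{rec_id:x}|{timestamp}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()

def new_records(host: str, records: Iterable[Record], last_seen: Dict[str, int]) -> List[Record]:
    """Return records above the stored high-water mark and advance the mark."""
    mark = last_seen.get(host, 0)
    fresh = [r for r in records if r[0] > mark]
    if fresh:
        last_seen[host] = max(r[0] for r in fresh)
    return fresh

state: Dict[str, int] = {}
host = "10.178.209.106"
first = [(0x1B, "04/14/2019 21:24:42", "Session Audit"),
         (0x1C, "04/14/2019 21:34:01", "Session Audit")]
second = first + [(0x1D, "04/14/2019 21:40:01", "Session Audit")]

batch1 = new_records(host, first, state)   # both entries are new
batch2 = new_records(host, second, state)  # only record 1d is new
print(len(batch1), len(batch2), doc_id(host, batch2[0]))
```

The high-water mark has to be persisted between runs (a file or an AWX fact, for example), which is the main cost of the first approach; the deterministic ID needs no state but re-sends the whole log every time.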

https://docs.oracle.com/cd/E19464-01/820-6850-11/IPMItool.html#50602039_63068
http://fibrevillage.com/sysadmin/71-ipmitool-useful-examples
http://coffeenix.net/board_print.php?bd_code=1765
http://coffeenix.net/board_print.php?bd_code=1766
https://annvix.com/using_swatch_to_monitor_logfiles


CentOS 7, firewall-cmd Port Forwarding

2019. 4. 5. 19:02

I keep needing to set up port forwarding these days.

firewall-cmd --permanent --zone=public --add-masquerade # enable masquerading (stateful)
firewall-cmd --permanent --zone=public --add-port=1978/tcp # open port 1978 in the firewall
firewall-cmd --permanent --add-forward-port=port=1978:proto=tcp:toaddr=1.2.3.4:toport=3389 # forward the port to 1.2.3.4:3389
firewall-cmd --reload # load the permanent configuration into the running firewall
firewall-cmd --list-all # verify the settings

Capturing with tcpdump shows how it works. Watch how the ports change.
# tcpdump -nni eth1 port 3389 or port 1978

#client:1.1.9.9 nat:1.1.7.7 server:1.2.3.4
IP 1.1.9.9.50885 > 1.1.7.7.1978
IP 1.1.7.7.50885 > 1.2.3.4.3389
IP 1.2.3.4.3389 > 1.1.7.7.50885
IP 1.1.7.7.1978 > 1.1.9.9.50885
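
The four packets in the capture can be reproduced with a small Python model of what the forward-port rule (destination NAT) and masquerade (source NAT) do together. This is a simplified sketch, not how the kernel implements it: the conntrack dictionary is a stand-in for connection tracking, the source port is kept unchanged because that is what the capture shows (a real NAT may pick another port on conflict), and all names are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple

Endpoint = Tuple[str, int]

class Packet(NamedTuple):
    src: Endpoint
    dst: Endpoint

NAT_IP = "1.1.7.7"
FORWARD = {1978: ("1.2.3.4", 3389)}  # listen port -> (toaddr, toport)

# Stand-in for connection tracking: (server, nat) pair -> original packet
conntrack: Dict[Tuple[Endpoint, Endpoint], Packet] = {}

def forward(pkt: Packet) -> Packet:
    """Client -> NAT: rewrite the destination, masquerade the source."""
    new_dst = FORWARD[pkt.dst[1]]
    new_src = (NAT_IP, pkt.src[1])       # source port kept, as in the capture
    out = Packet(new_src, new_dst)
    conntrack[(out.dst, out.src)] = pkt  # remember how to undo the rewrite
    return out

def reply(pkt: Packet) -> Packet:
    """Server -> NAT: restore the original addresses in reverse."""
    original = conntrack[(pkt.src, pkt.dst)]
    return Packet(original.dst, original.src)

p1 = Packet(("1.1.9.9", 50885), ("1.1.7.7", 1978))  # client -> nat
p2 = forward(p1)                                    # nat -> server
p3 = Packet(p2.dst, p2.src)                         # server -> nat
p4 = reply(p3)                                      # nat -> client
for p in (p1, p2, p3, p4):
    print(f"IP {p.src[0]}.{p.src[1]} > {p.dst[0]}.{p.dst[1]}")
```

The server only ever sees 1.1.7.7 as its peer, which is why the return traffic finds its way back through the NAT host without any routing changes on the server.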


Reverting Work in Git, Case by Case

2019. 3. 25. 02:11

To get back to the state right after cloning, change to the repository directory and run:

git checkout . && git clean -fdx

Before git add

git checkout DIR_PATH # revert all changes under a specific directory

git checkout . # revert all changes under the current directory

git checkout FILE_PATH # revert a specific file

After git add

git reset

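After git commit:

Discard the commit and return to the previous state

This moves HEAD, which was pointing at the last commit on the master branch, back to the commit before it, discarding the commit along with its changes in the working tree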

git reset --hard HEAD^

Undo the commit but keep its changes, leaving them unstaged

git reset HEAD^

Undo the commit but keep its changes, leaving them staged

git reset --soft HEAD^
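
The difference between the three reset variants above comes down to which of HEAD, the index (staging area) and the working tree get moved back. The toy Python model below tracks a single file through each mode. It is a teaching sketch with hypothetical names (Repo, reset), not a description of how git is implemented.

```python
from dataclasses import dataclass

@dataclass
class Repo:
    head: str      # content recorded by the commit HEAD points to
    index: str     # content staged for the next commit
    worktree: str  # content of the file on disk

def reset(repo: Repo, parent: str, mode: str = "mixed") -> Repo:
    """Model of `git reset [--soft|--hard] HEAD^` for one file."""
    index = repo.index if mode == "soft" else parent       # soft keeps the index
    worktree = parent if mode == "hard" else repo.worktree  # hard also rewrites files
    return Repo(head=parent, index=index, worktree=worktree)

# State right after committing "v2" on top of "v1"
after_commit = Repo(head="v2", index="v2", worktree="v2")
for mode in ("soft", "mixed", "hard"):
    print(mode, reset(after_commit, parent="v1", mode=mode))
```

In every mode HEAD moves to the parent; --soft leaves the change staged, the default (mixed) leaves it only in the working tree, and --hard throws it away.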

Remove unneeded files and directories (untracked, including ignored files because of -x)

git clean -fdx

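After git push: roll the remote repository back as well

git reset HEAD^

git commit -m "..." # re-commit after fixing and staging (git add) whatever should stay

git push origin +master # the leading + forces the push and rewrites history on the remote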


Installing nodejs on CentOS

2019. 3. 21. 14:52

curl -sL https://rpm.nodesource.com/setup_10.x | sudo bash -

sudo yum -y install nodejs


Similarities and Differences Between Azure VNets and AWS VPC

2019. 3. 19. 13:11

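Cloud computing has changed what we know about software design and about the role and function of the data center. The journey to the cloud starts with choosing a cloud provider and provisioning private networks or extending an on-premises network. Customers looking to provision resources in the cloud can choose among the private networks offered by the various cloud providers. The two most widely deployed private networks are Virtual Network (VNet) and Virtual Private Cloud (VPC), from Microsoft and Amazon respectively. This blog looks at the similarities and differences between the two private network offerings, to give potential customers the information they need to tell them apart and make the right decision for their workloads. This first blog in the series covers the basic components of any cloud network. It does not compare pricing because of its complexity; that will be covered in a future post.

Conceptually, both Azure VNet and AWS VPC provide the foundation for provisioning resources and services in the cloud. Both networks provide the same building blocks, but they differ in how far each is implemented. Below is a summary of some of these building blocks.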

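Subnets - Both Azure VNet and AWS VPC divide the network into subnets in order to design and control the resources deployed in the cloud effectively. An AWS VPC spans all the Availability Zones (AZs) in its region, so subnets in an AWS VPC are mapped to Availability Zones. A subnet must belong to exactly one AZ and cannot span AZs. Azure VNet subnets are defined by the IP address block assigned to them. Communication between all subnets in an AWS VPC goes over the AWS backbone and is allowed by default. AWS VPC subnets can be either private or public. A subnet is public if it has an Internet Gateway (IGW) attached. AWS allows only one IGW per VPC, and a public subnet lets the resources deployed in it access the internet. AWS creates a default VPC and subnets for each region. This default VPC has a subnet for each Availability Zone in the region where the VPC resides, and any image (EC2 instance) deployed to this VPC is assigned a public IP address and therefore has internet connectivity. Azure VNet does not provide a default VNet and does not have private or public subnets as AWS VPC does. Resources connected to a VNet have internet access by default.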
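IP addresses - Both AWS VPC and Azure VNet use non-globally-routable CIDRs from the private IPv4 address ranges specified in RFC 1918. Addresses from this RFC are not globally routable, but customers can still use other public IP addresses. Azure VNet assigns the resources connected and deployed to the VNet a private IP address from the specified CIDR block. In Azure VNet the smallest supported subnet is /29 and the largest is /8. AWS also allows IP addresses from the same RFC 1918 ranges or from publicly routable IP blocks. Currently, AWS does not support direct internet access from publicly routable IP blocks, so they cannot reach the internet even through the Internet Gateway (IGW); they are reachable only through a virtual private network. For the same reason, Windows instances cannot boot correctly if launched into a VPC with a range from 224.0.0.0 to 255.255.255.255 (the Class D and Class E IP address ranges). For subnets, AWS recommends a minimum address block of /28 and a maximum of /16. At the time of writing, Microsoft Azure VNet support for IPv6 is limited, whereas AWS VPC supports IPv6 in all regions except China as of January 2017. For IPv6, the VPC is a fixed size of /56 (in CIDR notation) and the subnet size is fixed at /64. In IPv6 every address is internet-routable and can reach the internet by default. AWS VPC provides an Egress-Only Internet Gateway (EGW) for resources in private subnets; it allows outbound traffic while blocking incoming traffic. AWS lets you enable IPv6 for existing resources and provide an egress-only internet gateway for resources in private subnets that need internet access. Understanding how IP addresses are allocated from these CIDR blocks is important when designing an AWS VPC network, because changing subnet IP addresses after the design is not easy. Azure VNet offers more flexibility here: the IP addresses of a subnet can be changed after the initial design. However, the resources in the current subnet must be migrated out of it first.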
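Routing tables - AWS uses route tables to specify the allowed routes for outbound traffic from a subnet. Every subnet created in a VPC is automatically associated with the main route table, so all subnets in a VPC can accept traffic from other subnets unless security rules explicitly deny it. In Azure VNet, all resources in the VNet allow traffic flow using system routes. By default, Azure VNet provides routing between subnets, between VNets, and to on-premises networks, so you do not need to configure and manage routes. System routes handle traffic automatically, but there are cases where you want to control packet routing through a virtual appliance. Azure VNet uses the system route table to make sure that resources connected to any subnet in any VNet can communicate with each other by default. However, there are scenarios where you want to override the default routes. For those scenarios you can implement user-defined routes (UDRs) to control where traffic is routed for each subnet, and/or BGP routes (connecting your VNet to an on-premises network through an Azure VPN Gateway or an ExpressRoute connection). UDRs apply only to traffic leaving a subnet, and they can add a layer of security to an Azure VNet deployment when the goal of the UDR is to send traffic to some kind of inspection NVA. With UDRs, packets sent from one subnet to another can be forced through a network virtual appliance along a set of routes. In a hybrid setup, Azure VNet can use any of three route tables: UDR, BGP (if ExpressRoute is used), and the system route table. In Azure VNet, a subnet relies on system routes for its traffic until a route table is explicitly associated with the subnet. Once an association is established, that is, once UDR and/or BGP routes exist, routing is done by longest prefix match (LPM). If more than one route has the same prefix length, a route is selected by its origin in the following order: user-defined route, BGP route (when ExpressRoute is used), then system route. In AWS VPC, by contrast, there can be more than one route table, but they are all of the same type.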
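Security - AWS VPC provides two levels of security for resources deployed in the network. The first is called Security Groups (SGs). A security group is a stateful object applied at the EC2 instance level; technically, its rules are applied at the Elastic Network Interface (ENI) level. Once traffic is allowed, the response traffic is allowed automatically. The second security mechanism is called Network Access Control Lists (NACLs). NACLs are stateless filtering rules applied at the subnet level, covering every resource deployed in the subnet. They are stateless because, when ingress traffic is allowed, the response is not automatically allowed unless the rules for the subnet explicitly permit it. NACLs operate at the subnet level by examining the traffic entering and leaving the subnet. NACLs can be used to set both allow and deny rules. One NACL can be associated with multiple subnets, but a subnet can be associated with only one NACL at a time. NACL rules are numbered and evaluated in order, starting from the lowest-numbered rule, to decide whether traffic is allowed in or out of any subnet associated with the network ACL. The highest number you can use for a rule is 32766. The last rule is always an asterisk and denies traffic to the subnet; it is reached only when no other rule in the NACL matches the traffic. Azure VNet provides Network Security Groups (NSGs), which combine the functions of AWS SGs and NACLs. NSGs are stateful and can be applied at the subnet or NIC level. Only one NSG can be applied to a NIC.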
Gateways - Both VNet and VPC offer different gateways for different connectivity purposes. AWS VPC uses mainly three gateways, or four if you count the NAT gateway. AWS allows one Internet Gateway (IGW) to provide internet connectivity over IPv4, and an Egress-Only Internet Gateway for internet connectivity for resources that use IPv6 only. In AWS, any subnet without an IGW is considered a private subnet and has no internet connectivity without a NAT gateway or NAT instance (AWS recommends the NAT gateway for high availability and scalability). Another AWS gateway, the Virtual Private Gateway (VPG), lets AWS provide connectivity from AWS to other networks over VPN or Direct Connect. On a non-AWS network, AWS requires a Customer Gateway (CGW) on the customer side to connect to the AWS VPC. Azure VNet provides two types of gateway: the VPN gateway and the ExpressRoute gateway. The VPN gateway sends encrypted traffic from a VNet to an on-premises location across a public connection, or between VNets across Microsoft's backbone in the case of VNet-to-VNet VPN. Both ExpressRoute and VPN gateways, however, require a gateway subnet. The gateway subnet contains the IP addresses used by the virtual network gateway services. Azure VNet-to-VNet can connect natively over VPN, whereas in AWS, VPC-to-VPC connectivity requires a third-party NVA when the VPCs are in different regions.
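Hybrid connectivity - Both AWS VPC and Azure VNet allow hybrid connectivity, using VPN and/or Direct Connect and ExpressRoute respectively. With Direct Connect or ExpressRoute, connections of up to 10 Gbps are available. An AWS Direct Connect (DC) connection consists of a single connection between a port on your router and an Amazon router. Over one DC connection you can create virtual interfaces directly to public AWS services (for example, Amazon S3) or to Amazon VPC. A virtual interface must be created before AWS DC can be used. AWS allows 50 virtual interfaces per AWS Direct Connect connection, and this limit can be raised by contacting AWS. An AWS DC connection is not redundant, so a second connection is needed if redundancy is required. AWS VPN creates two tunnels between the AWS VPC and the on-premises network. To give Direct Connect fault tolerance, AWS recommends using one of the tunnels to connect to the on-premises data network over VPN and BGP. Azure ExpressRoute also provides two links for connectivity, along with an SLA: Azure guarantees a minimum of 99.95% ExpressRoute dedicated circuit availability, and hence predictable network performance.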
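
The Azure route selection described in the routing tables item (longest prefix match first, then user-defined over BGP over system routes when prefix lengths tie) can be sketched in Python with the standard ipaddress module. This is an illustration of the rule as stated in the text, not Azure's implementation; the Route type, the origin labels, and the next-hop names and addresses in the sample table are made up for the example.

```python
import ipaddress
from typing import List, NamedTuple, Optional

# Tie-break order for routes with equal prefix length (lower wins)
ORIGIN_PRIORITY = {"user": 0, "bgp": 1, "system": 2}

class Route(NamedTuple):
    prefix: str
    origin: str    # "user", "bgp" or "system"
    next_hop: str

def select_route(routes: List[Route], destination: str) -> Optional[Route]:
    """Pick the longest matching prefix, breaking ties by route origin."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen,
                       ORIGIN_PRIORITY[r.origin]),
    )

# Hypothetical route table for one subnet
routes = [
    Route("10.0.0.0/16", "system", "VnetLocal"),
    Route("10.0.1.0/24", "bgp", "ExpressRouteGateway"),
    Route("10.0.1.0/24", "user", "10.0.9.4"),  # e.g. an inspection NVA
    Route("0.0.0.0/0", "system", "Internet"),
]

for target in ("10.0.1.7", "10.0.2.7", "8.8.8.8"):
    print(target, "->", select_route(routes, target))
```

For 10.0.1.7 both /24 routes match with the same length, so the user-defined route wins over the BGP route; for 10.0.2.7 only the /16 and the default route match, and the /16 is longer.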

Original: https://devblogs.microsoft.com/premier-developer/differentiating-between-azure-virtual-network-vnet-and-aws-virtual-private-cloud-vpc/


Running Thousands of KVM Guests on Amazon's new i3.metal Instances

2019. 3. 18. 16:16

The Amazon i3 Family

Amazon has recently released to general availability the i3.metal instance, which allows us to do some things which we could not do before in the Amazon cloud, such as running an unmodified hypervisor. We were able to run more than six thousand KVM virtual machines on one of these instances, far beyond our pessimistic guess of around two thousand. In the remainder of this post we will discuss what makes these platforms important and unique, how we ran KVM virtual machines on the platform using Amazon’s own Linux distribution, and how we measured its performance and capacity using kprobes and the extended Berkeley Packet Filter (eBPF).

Read on for details!

i3.metal and the Nitro System

The i3 family platforms include two improvements from what Amazon has historically offered to AWS customers. The first is the combination of the Annapurna ASIC and the Nitro PCI card, which together integrate security, storage, and network I/O within custom silicon. The second improvement is the Nitro hypervisor, which replaces Xen for all new EC2 instance types. Together, we refer to the Nitro card, Annapurna ASIC, and Nitro hypervisor as the Nitro System. (See the EC2 FAQs entry for the Nitro Hypervisor for some additional details.)

Although Amazon has not released much information about the Nitro system, there are important technical insights in Brendan Gregg’s blog and in two videos (here and here) from the November 2017 AWS re:Invent conference. From these presentations, it is clear that the Nitro firmware includes a stripped-down version of the KVM hypervisor that forgoes the QEMU emulator and passes hardware directly to the running instance. In this sense, Nitro is more properly viewed as partitioning firmware that uses hardware self-virtualization features, including support for nested virtualization on the i3.metal instances.

Nitro protects the Annapurna ASIC and the multi-root PCI hardware from being reprogrammed for the i3.metal systems, but nothing else (this invisible presence is to protect against the use of unauthorized elastic block stores or network access). For example, while Nitro has no hardware emulation (which is the role of QEMU in a conventional KVM hypervisor), Nitro does enable self-virtualizing hardware (pdf). Importantly, Nitro on the i3.metal system exposes hardware virtualization features to the running kernel, which can be a hypervisor. Thus, a hypervisor such as KVM, Xen, or VMware can be run directly in an i3.metal instance partitioned by the Nitro firmware.

Image above: Amazon’s i3 platform includes the Annapurna ASIC, the Nitro PCI Card, and the Nitro Firmware. See https://youtu.be/LabltEXk0VQ

Key Virtualization Features Exploited by the Nitro Firmware

Below is a brief, incomplete summary of virtualization features exploited by the Nitro system—particularly in the bare metal instances.

VMCS Shadowing

Virtual Machine Control Structure (VMCS) Shadowing provides hardware-nested virtualization on Intel Processors. The VMCS is a set of registers that controls access to hardware features by a virtual machine (pdf). The first-level hypervisor—in this case the Nitro system—keeps a copy of the second to nth level VMCS and only investigates registers that are different from the cached version. Not every register in the VMCS requires the first level hypervisor to monitor. The Nitro firmware thus provides nested virtualization with no material effect on performance (consuming only a small amount of additional processor resources). If the instance hypervisor does not violate the boundaries established by Nitro, there is no intervention and no effect upon performance.

Most significantly, VMCS shadowing registers are freely available to the kernel running on the bare-metal instance, which is unique for EC2 instances.

Extended Page Tables

Once the hypervisor has established memory boundaries for the virtual machine, Extended Page Tables (EPT) are a hardware feature that allows a virtual machine to manage its own page tables. Enabling this hardware feature produced an improvement of two orders of magnitude in virtual machine performance on x86 hardware.

Like VMCS shadowing, EPT works especially well with nested hypervisors. The Nitro firmware establishes a page table for the bare-metal workload (Linux, KVM, or another hypervisor). The bare-metal workload manages its own page tables.

As long as it does not violate the boundaries established by the Nitro firmware, Nitro does not affect the performance or functionality of the bare-metal workload. Nitro’s role on i3.metal workloads prevents the workload from gaining the ability to re-configure the Annapurna ASIC or the Nitro card and violating the limits set for the instance.

Posted Interrupts

The multi-root virtualization capability (pptx) in the i3 instances virtualizes the Amazon Enhanced Networking and Elastic Block Storage (EBS) using PCI hardware devices (Annapurna ASIC and the Nitro card) assigned by the Nitro firmware to specific bare-metal workloads.

Posted interrupts (pdf) allow system firmware to deliver hardware interrupts directly to a virtual machine, when that virtual machine is assigned a PCI function. The Nitro system uses posted interrupts to allow the bare-metal workload to process hardware interrupts generated by the Nitro hardware without any intervention from the Nitro System.

That is, the Annapurna ASIC and Nitro PCI card can interrupt the bare-metal workload directly, while remaining protected from re-configuration by the bare-metal workload. There are no detrimental effects on performance as long as the Nitro System does not over-provision CPUs, which it does not do. (The bare-metal workload may over-provision, even if it is a hypervisor, as we will see below in the limited testing we did.)

Loading KVM on a Bare Metal Instance

On an EC2 Bare Metal system (i3.metal in the screen grab above), Nitro is hardware partitioning firmware. The Nitro firmware is based on KVM and does not use hardware emulation software (such as QEMU). It does initialize the custom Amazon hardware and pass-through hardware to the running instance: networking, storage, processors, PCI trees, and memory. It then jumps into the bare-metal instance kernel, which in our testing was Amazon Linux. (Amazon also supports the VMware Hypervisor as a bare-metal instance)

The Nitro firmware only activates if the bare-metal kernel violates established partitioning. The fact that the Nitro firmware is actually Linux and KVM is not new: Linux has been used as BIOS for many years for complex systems that consolidate networked or shared resources for hardware platforms.

Passing-through the VMX flag and Running Nested Virtualization

The Bare Metal kernel sees the vmx flag when it inspects /proc/cpuinfo:

grep -E "(vmx|svm)" /proc/cpuinfo | wc -l
72

This flag is necessary in order to load KVM. It indicates that the Virtual Machine Control Structure (VMCS) is programmable by the Linux-KVM kernel. VMCS Shadowing makes this possible; it uses copy-on-write methods and register caching in the processor itself to run each layer in the stack (Nitro, KVM, and the Virtual Machine) directly on the processor hardware. Each layer is controlled by the layer beneath it.
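
The same check can be done from a script. The Python sketch below counts logical CPUs whose flags line contains vmx (Intel) or svm (AMD), mirroring the grep above but matching whole flag names on the flags line only. The function name count_virt_cpus and the tiny sample text are hypothetical; on a real host you would pass in the contents of /proc/cpuinfo.

```python
def count_virt_cpus(cpuinfo_text: str) -> int:
    """Count logical CPUs whose 'flags' line lists vmx (Intel) or svm (AMD)."""
    count = 0
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags" and {"vmx", "svm"} & set(value.split()):
            count += 1
    return count

# Two-CPU sample in /proc/cpuinfo layout; only CPU 0 advertises vmx
sample = (
    "processor\t: 0\n"
    "flags\t\t: fpu vme vmx ept\n"
    "\n"
    "processor\t: 1\n"
    "flags\t\t: fpu vme\n"
)
print(count_virt_cpus(sample))

# On a real Linux host (not run here):
# with open("/proc/cpuinfo") as f:
#     print(count_virt_cpus(f.read()))
```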

The i3.metal systems use register caching and snooping to provide hardware-virtualized processors to each layer in the system, beginning with the Nitro System, up to virtual machines being run by the bare-metal instance (KVM in this case).

The Nitro firmware does not use QEMU because it does not emulate any hardware. In our testing, we did use QEMU hardware emulation in the upper layer virtual machines. This resulted in the picture below, where the Nitro firmware is running beneath the i3 instance kernel. We then loaded KVM, and used QEMU to provide hardware emulation to the virtual machines:

When running a hypervisor such as KVM on the i3.metal systems, each layer has direct access to the processor through VMCS Shadowing, which provides each layer with the Virtual Machine Control registers.

Installing KVM on an Amazon Linux Image

The Amazon Linux distribution is derived from Fedora Linux with KVM available as two loadable modules. (KVM is maintained and supported by Amazon as a standard feature of the bare metal instance.)

Some components need to be installed, for example QEMU:

[ec2-user@ip-10-0-6-93 ~]$ yum list | grep qemu-kvm
qemu-kvm.x86_64                       10:1.5.3-141.6.amzn1          @amzn-updates
qemu-kvm-common.x86_64                10:1.5.3-141.6.amzn1          @amzn-updates
qemu-kvm-tools.x86_64                 10:1.5.3-141.6.amzn1          @amzn-updates

Libvirt is not part of the Amazon Linux distribution, which saves cost. We do not need Libvirt, and it would get in the way of later testing.

[ec2-user@ip-10-0-6-93 ~]$ yum list | grep -i Libvirt | wc -l
0

Libvirt is an adequate collection of software, but qemu-kvm is not aware of it, meaning the virtual machine state information stored by Libvirt may be out of sync with qemu-kvm. Libvirt also provides an additional attack vector to KVM while providing little additional functionality over what is provided by standard Linux utilities and kernel features, with qemu-kvm.

Built-in Processor Support for KVM

The i3.metal instance has 72 threads running on 36 physical cores that support KVM and posted interrupts. This information may be read in /proc/cpuinfo:

grep -E "(vmx|svm)" /proc/cpuinfo | wc -l 
72

Loading KVM on the Nitro system is most easily done by modprobe’ing the KVM modules:

[ec2-user@ip-10-0-6-93 ~]$ sudo modprobe kvm-intel
 
[ec2-user@ip-10-0-6-93 ~]$ lsmod | grep kvm
kvm_intel             183379  0
kvm                   562462  1 kvm_intel
irqbypass               3903  1 kvm

The irqbypass module provides posted interrupts to KVM virtual machines, reminding us again that we may pass PCI devices present on the bare-metal host through to KVM virtual machines.

Built-in virtio virtual I/O at the Linux Kernel Level

virtio is a Linux kernel I/O virtualization feature. It is maintained and supported by Amazon, and it works with qemu-kvm to provide isolated (not shared as in Xen’s dom0 netback and blockback) virtual I/O devices for virtual machines that do not need direct access to a hardware PCI device. Each virtio device is a unique and private virtual PCI device with separation provided by the Linux kernel.

The Amazon Linux kernel supports virtio devices, as shown by this excerpt of the Amazon Linux configuration file:

# CONFIG_VIRTIO_VSOCKETS is not set
CONFIG_VIRTIO_BLK=m
CONFIG_SCSI_VIRTIO=m
CONFIG_VIRTIO_NET=m
CONFIG_VIRTIO_CONSOLE=m
CONFIG_HW_RANDOM_VIRTIO=m
# CONFIG_DRM_VIRTIO_GPU is not set
CONFIG_VIRTIO=m
# Virtio drivers
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_PCI_LEGACY=y
# CONFIG_VIRTIO_BALLOON is not set
# CONFIG_VIRTIO_INPUT is not set
CONFIG_VIRTIO_MMIO=m
# CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set

Kernel Samepage Merging (KSM)

KSM is a Linux kernel feature that scans memory pages, merges duplicates, marks those pages as read-only, and copies the pages when they are written (COW). KSM provides a kernel-level mechanism for over-provisioning memory. KSM is automatic, built in, and does not require an external module as Xen does, for example, with its Dom0 balloon driver.

KSM is documented in the Linux kernel documentation directory.
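
The scan, merge and copy-on-write cycle described above can be illustrated with a toy model in Python. This is only a sketch of the idea: real KSM works on physical page frames inside the kernel, and the MergedMemory class and its methods are hypothetical names for the example.

```python
from typing import Dict, List

class MergedMemory:
    """Toy model: virtual pages point at frames that may be shared."""

    def __init__(self, pages: List[bytes]):
        self.frames: Dict[int, bytes] = dict(enumerate(pages))  # frame id -> content
        self.mapping: List[int] = list(range(len(pages)))       # page -> frame id
        self._next = len(pages)

    def scan(self) -> None:
        """Merge pages with identical content onto a single frame."""
        seen: Dict[bytes, int] = {}
        for page, frame in enumerate(self.mapping):
            content = self.frames[frame]
            self.mapping[page] = seen.setdefault(content, frame)
        used = set(self.mapping)
        self.frames = {f: c for f, c in self.frames.items() if f in used}

    def write(self, page: int, content: bytes) -> None:
        """Copy-on-write: a shared frame is copied before being modified."""
        frame = self.mapping[page]
        if self.mapping.count(frame) > 1:
            frame = self._next  # give this page its own private frame
            self._next += 1
            self.mapping[page] = frame
        self.frames[frame] = content

mem = MergedMemory([b"zero", b"zero", b"data", b"zero"])
mem.scan()
frames_after_scan = len(mem.frames)  # three identical pages now share a frame
mem.write(1, b"dirty")               # page 1 gets a private copy
print(frames_after_scan, len(mem.frames), mem.mapping)
```

Four pages fit in two frames after the scan, which is where the memory over-provisioning comes from; the write then costs one extra frame without disturbing the other pages that still share the original.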

The Amazon Linux kernel is configured with KSM:

[ec2-user@ip-10-0-6-93 ~]$ cat /boot/config-4.9.77-31.58.amzn1.x86_64 | grep -i ksm
CONFIG_KSM=y

Running a KVM virtual machine with copy-on-write memory is straightforward, by starting the virtual machine with the mem-merge feature turned on:

$ sudo /x86_64-softmmu/qemu-system-x86_64 \
-enable-kvm -m 1G \
...
-chardev stdio,id=mon0 \
-mon chardev=mon0,mode=readline \
-machine mem-merge=on
 
QEMU 1.7.50 monitor - type 'help' for more information

Using the -machine mem-merge=on option upon virtual machine startup causes QEMU to execute an madvise system call with the MADV_MERGEABLE parameter for the virtual machine memory, marking the VM memory as merge-able.

To disable merging for a virtual machine upon startup, use the same command but substitute mem-merge=off.

Running the KVM Virtual Machine

We created a virtual machine using a minimal Linux distribution: TTY Linux. It has an image built specifically to run with KVM using virtio network and block devices.

We ran KVM Linux virtual machines using this command line:

sudo /usr/libexec/qemu-kvm --enable-kvm -name ttylinux -m 1G -hda ttylinux.qcow2 --cdrom ttylinux-virtio_x86_64-16.1.iso -chardev stdio,id=mon0 -mon chardev=mon0,mode=readline
QEMU 1.5.3 monitor - type 'help' for more information
(qemu) VNC server running on `127.0.0.1:5900'

Only three steps are required to create the virtual machine:

Download the TTY Linux distribution and unzip to an iso image:

[ec2-user@ip-10-0-6-93 ~]$ wget https://www.dropbox.com/s/dumum2xsajzvjvw/ttylinux-virtio_x86_64-16.1.iso.gz
https://www.dropbox.com/s/dumum2xsajzvjvw/ttylinux-virtio_x86_64-16.1.iso.gz
Resolving www.dropbox.com (www.dropbox.com)... 162.125.6.1, 2620:100:601c:1::a27d:601
Connecting to www.dropbox.com (www.dropbox.com)|162.125.6.1|:443... connected.
...
ttylinux-virtio_x86_64-16.1.is 100%[====================================================>]  41.62M  9.21MB/s   in 4.5s   
...
[ec2-user@ip-10-0-6-93 ~]$ gunzip ttylinux-virtio_x86_64-16.1.iso.gz

Create the qcow disk image for the virtual machine:

qemu-img create -f qcow2 ttylinux.qcow2 1G
Formatting 'ttylinux.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 lazy_refcounts=off 

Run the virtual machine:

sudo /usr/libexec/qemu-kvm --enable-kvm -name ttylinux -m 1G -hda ttylinux.qcow2 --cdrom ttylinux-virtio_x86_64-16.1.iso -chardev stdio,id=mon0 -mon chardev=mon0,mode=readline
QEMU 1.5.3 monitor - type 'help' for more information
(qemu) VNC server running on `127.0.0.1:5900'

We were struck by how easy it was to run KVM virtual machines on these Nitro systems, configured as they are with Amazon Linux. Each virtual machine in our testing had 1G of memory and 1G of writeable storage.

numactl and other Linux Process Control

A benefit of KVM on i3.metal is the ability to use standard Linux tools and system calls to control virtual machine resources. A good example is using the Linux numactl command to allocate CPU cores to a KVM virtual machine:

#!/usr/bin/bash
numactl --physcpubind=1 /usr/bin/qemu-system-x86_64 \
 -enable-kvm -name ttylinux -m 1G \
 -hda /var/lib/libvirt/filesystems/ttylinux.qcow \
 --cdrom /var/lib/libvirt/filesystems/ttylinux-virtio_x86_64-16.1.iso \
 -vnc 10.0.1.5:1 -chardev stdio,id=mon0 \
 -monitor stdio

The above command uses the numactl utility to bind the KVM virtual machine to core 1. It demonstrates how integrated KVM is with the Linux kernel and how simple it is to allocate memory and cores to specific virtual machines.

Integration with the Linux Kernel: cgroups, nice, numactl, taskset

We can turn the Linux kernel into a hypervisor by loading the KVM modules and starting a virtual machine, but the Linux personality is still there. We can control the virtual machine using standard Linux resource and process control tools such as cgroups, nice, numactl, and taskset:

[ec2-user@ip-10-0-6-93 ~]$ numactl -s
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 
cpubind: 0 1 
nodebind: 0 1 
membind: 0 1 

All cgroup commands work naturally with KVM virtual machines. As far as cgroups is concerned, each KVM virtual machine is a normal Linux process (although KVM runs that process at the highest privilege level in VMX guest mode, which provides hardware virtualization support directly to the virtual machine). Two utilities can bind a KVM virtual machine to a specific processor, NUMA node, or memory zone: taskset and numactl.
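As a rough illustration of what taskset does to a process, the sketch below pins the calling process to a single CPU with Python's os.sched_setaffinity (Linux only). It covers CPU affinity only; it does not reproduce numactl's memory-node binding, and the helper name pin_to_cpu is made up for this example.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 means the calling process) to one CPU, similar to `taskset -c`."""
    os.sched_setaffinity(pid, {cpu})
    # Read the mask back so the caller can confirm what the kernel accepted.
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)  # CPUs this process may currently use
    target = min(allowed)              # pick any CPU we are allowed to run on
    print("before:", sorted(allowed))
    print("after: ", sorted(pin_to_cpu(target)))
```

A child process started after the call (for example a qemu-kvm launched through subprocess) inherits the affinity mask, which is the same mechanism taskset relies on when it launches a command.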

In summary, the Linux command set along with qemu-kvm allows us native control over processors, memory zones, and other platform properties for running KVM virtual machines. Libvirt, on the other hand, is a layer over these native control interfaces that tends to obscure what is really going on at the hardware level.

Testing the Limits of Bare-Metal AWS Hypervisor Performance

To more securely run virtual-machine workloads on cloud services, we accessed a bare-metal instance for project research during the preview period. We wanted to first verify that KVM can be used as a hypervisor on EC2 bare-metal instances, and second, get a read on stability and performance. We had limited time for this portion of the research.

To measure system response, we decided to use the BPF Compiler Collection (BCC) (building and using this toolset may be the subject of another blog post).

BCC uses the extended Berkeley Packet Filter, an amazing piece of technology in recent Linux Kernels that runs user-space byte code within kernel space. BCC compiles byte code that uses dynamic kernel probes to instrument kernel behavior.

To test CPU load, we added a simple shell script to each VM’s init process:

while [[ 1 ]]; do :; done

This ensured that each virtual machine would be consuming all the CPU cycles allowed to it by KVM.

Next, we used a simple shell script to start KVM virtual machines into oblivion:

while [[ 1 ]]; do 
sudo /usr/libexec/qemu-kvm --enable-kvm -name ttylinux -m 1G -hda ttylinux.qcow2;
done

Then we ran the BCC program runqlat.py, which measures how much time processes are spending on the scheduler’s run queue – a measure of system load and stability. The histogram below shows the system when running 6417 virtual machines.

[ec2-user@ip-10-0-6-93 tools]$ sudo ./runqlat.py      
Tracing run queue latency... Hit Ctrl-C to end.
^C
usecs               : count     distribution
         0 -> 1          : 897      |*                                       |
         2 -> 3          : 4700     |*****                                   |
         4 -> 7          : 34035    |****************************************|
         8 -> 15         : 25067    |*****************************           |
        16 -> 31         : 6939     |********                                |
        32 -> 63         : 9622     |***********                             |
        64 -> 127        : 8046     |*********                               |
       128 -> 255        : 4801     |*****                                   |
       256 -> 511        : 1736     |**                                      |
       512 -> 1023       : 635      |                                        |
      1024 -> 2047       : 913      |*                                       |
      2048 -> 4095       : 1767     |**                                      |
      4096 -> 8191       : 2031     |**                                      |
      8192 -> 16383      : 1841     |**                                      |
     16384 -> 32767      : 900      |*                                       |
     32768 -> 65535      : 249      |                                        |
     65536 -> 131071     : 201      |                                        |
    131072 -> 262143     : 109      |                                        |
    262144 -> 524287     : 51       |                                        |

The histogram above demonstrates how long, within a range, each sampled process waited on the KVM scheduler’s run queue before it was actually placed on the processor and run. The wait time in usecs shows how long a process that is runnable (not sleeping or waiting for any resources or events to occur) waited in order to run. There are three things to look for in this histogram:

How closely grouped are the sampled wait times? Most processes should be waiting approximately the same time. This histogram shows this is the case, with more than half of the samples waiting between 4 and 15 microseconds.
How low are the wait times? On a system that is under-utilized, the wait times should be mostly immaterial (just a few microseconds or less on this hardware). This system is over-utilized, and yet the wait times for most of the samples are 15 microseconds or less.
How scattered are the samples in terms of wait times? In this histogram there are two groups: the larger group with wait times less than 511 microseconds, and the smaller group with wait times between 1024 and 32767 microseconds. The second group consists of only roughly 7% of samples. We would expect a distressed system to show several different groups clustered around longer wait times, with outliers comprising more than 7% of all samples.
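The shares quoted in the three points above can be recomputed from the histogram itself. The sketch below copies the bucket counts from the runqlat output and sums them per range; the helper name share is an assumption made for this example.

```python
# (low_usecs, high_usecs, count) rows copied from the runqlat output above
BUCKETS = [
    (0, 1, 897), (2, 3, 4700), (4, 7, 34035), (8, 15, 25067),
    (16, 31, 6939), (32, 63, 9622), (64, 127, 8046), (128, 255, 4801),
    (256, 511, 1736), (512, 1023, 635), (1024, 2047, 913),
    (2048, 4095, 1767), (4096, 8191, 2031), (8192, 16383, 1841),
    (16384, 32767, 900), (32768, 65535, 249), (65536, 131071, 201),
    (131072, 262143, 109), (262144, 524287, 51),
]

TOTAL = sum(count for _, _, count in BUCKETS)


def share(low, high):
    """Fraction of samples in buckets that lie entirely within [low, high] usecs."""
    inside = sum(c for lo, hi, c in BUCKETS if lo >= low and hi <= high)
    return inside / TOTAL


if __name__ == "__main__":
    print(f"total samples:      {TOTAL}")
    print(f"4 to 15 usecs:      {share(4, 15):.1%}")
    print(f"0 to 15 usecs:      {share(0, 15):.1%}")
    print(f"1024 to 32767 usecs: {share(1024, 32767):.1%}")
```

With these counts the 4 to 15 microsecond buckets hold about 56.5% of the 104,540 samples, everything up to 15 microseconds about 61.9%, and the 1024 to 32767 microsecond group about 7.1%.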

Upon reaching 6417 virtual machines, the system was unable to start any new VMs due to memory exhaustion. However, we were able to SSH into running VMs; when we stopped a VM, KVM started a new one. This system appeared to be capable of running indefinitely with this extreme load placed upon the CPU resources.

CPU and Memory Over-provisioning

When fully loaded with virtual machines, CPUs were overloaded 10:1 virtual cycles to physical cycles. There were more than thirty thousand processes running on the system, and it was actively reclaiming memory using KSM (discussed above). Before running the tests, the consensus among our team was that perhaps we could run 2K virtual machines before the system fell apart. This guess (that’s all it was) proved to be overly pessimistic. (However, we did not test I/O capacity in any significant way.)

Beyond proving that we could run a hell of a lot of virtual machines on the i3.metal platform, and that CPU over-provisioning was wickedly efficient, we didn’t accomplish much else; for example, we can conclude nothing about the I/O performance of the system. But these are rich grounds for further performance and limit testing using the BCC toolkit, which we hope to discuss in a later blog post.

"The Definitive Guide to Container Management": Understanding and Using Kubernetes - IDG

2019. 2. 10. 16:01

http://www.itworld.co.kr/techlibrary/114957

"The Definitive Guide to Container Management": Understanding and Using Kubernetes - IDG DeepDive

Kubernetes is charging ahead unchecked. It is an orchestration tool that supports container scheduling, service discovery between containers, system load balancing, rolling updates and rollbacks, high availability, and more. It easily overtook Swarm from Docker, the company behind the original container technology, and has become the de facto standard container tool, used by 60% of enterprises. Kubernetes also lies behind IBM's acquisition of Red Hat for USD 34 billion (about KRW 38 trillion).

This report looks at why Kubernetes matters in today's enterprise IT infrastructure and how to deploy it. It also analyzes in depth the tools that help with management and security and the Kubernetes services of the major cloud providers.

<Contents>
Tech Trend
- Sorry, Linux! The star of the show is now Kubernetes
- "From the latest 1.12 back to older releases": a version-by-version history of Kubernetes
HowTo
- "From distributions to examples": a proper getting-started guide to Kubernetes
- 12 major Kubernetes distributions leading the container revolution
Tech Solution
- 15 essential tools for escaping Kubernetes "management hell"
- "Making Kubernetes safer": 7 essential container security tools
- AWS vs. Azure vs. Google Cloud: an in-depth comparison of three managed Kubernetes services
Column
- Kubernetes: the pain is bitter, but the fruit is "very" sweet
- "80% lower server costs": how the UK's Financial Times adopted Kubernetes

Download it from the link above.

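DDoS Appliance Deployment Modes

2019. 1. 23. 17:38

https://www.ahnlab.com/kr/site/securityinfo/secunews/secuNewsView.do?seq=17002

[Product Issue] TG DPX: Settling the Inline vs. Out-of-Path Contest

AhnLab
2010-11-19

In the Korean DDoS market, which took off in the second half of 2006, DDoS mitigation products can be classified in several ways. The most common is by deployment mode: inline products and out-of-path products. Both terms are bound to sound unfamiliar to anyone who is not a network engineer.
This article introduces the inline and out-of-path deployment modes so that they can be understood easily and clearly.

Classification by Deployment Mode

Inline and out-of-path are distinguished by where the appliance sits relative to the network path. An inline appliance sits within the network path, and its biggest advantage is fast response. An out-of-path appliance sits outside the network path, and its advantage is high network stability. [Table 1] summarizes the characteristics of each.

[Table 1]
Aspect	Inline	Out-of-Path
Placement	Within the network path	Outside the network path
Traffic handled	Bidirectional	Unidirectional (bidirectional in some cases)
Security advantages	Fast response	Excellent network stability; well suited to high-volume traffic segments
Drawbacks	Involved in the network path at all times	Slower response than inline; complex to set up for bidirectional traffic
Typical products	Firewall, IPS, web application firewall, L2/L3/L4/L7 switch	DDoS mitigation, web application firewall, L4/L7 switch, proxy

Let us now look at each mode in detail.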

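What Is the Inline Deployment Mode?

1. Description
The inline mode places the appliance on the path that traffic travels, in the same way that network devices such as routers and switches and security devices such as firewalls and IPSs are deployed (Figure 1).

[Figure 1] Example of an inline deployment

Because the appliance sits between network segments, installing it interrupts live network traffic, and changes to the circuit layout are unavoidable. In particular, when the inline product runs in L3 (routed) mode, in which it is assigned IP addresses, the IP settings of the devices directly upstream and downstream of it must also be changed. For this reason, L2 (transparent) mode, which needs no IP address, is now preferred for inline deployments, and most inline products support it.

2. Characteristics of the network traffic an inline product handles
As [Figure 1] shows, an inline product is installed within the network path that traffic travels, so both inbound traffic entering the network and outbound traffic leaving it pass through the product.
Every customer is highly sensitive to failures or forwarding delays in an installed inline product. Inline products therefore typically offer safeguards such as minimal forwarding latency and traffic bypass on failure.

3. Security characteristics of the inline mode

From a security standpoint, the inline mode has the advantage that security policies can be applied to outbound traffic as well as inbound traffic.
TCP in particular is a session-oriented protocol, which requires two-way communication between client and server to be maintained. A typical inline firewall or IPS therefore checks that both directions of a TCP session are communicating normally and that the exchange conforms to the TCP specification, and it blocks abnormal sessions. This is the well-known stateful inspection.
In addition, because the appliance operates within the network path, a block issued on detecting a threat takes effect in real time, which is the mode's most distinctive feature. Networks today are complex and sophisticated, and it is common for links within the path to be at least doubly redundant. To handle such designs, some inline products can terminate several links on one appliance and also support active-active HA, in which two or more inline appliances share TCP session state in real time.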
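To make the stateful inspection idea above concrete, here is a minimal sketch that tracks only the TCP three-way handshake and rejects packets that arrive out of order. The class name HandshakeTracker and the flag representation are assumptions made for this illustration; a real firewall or IPS also tracks sequence numbers, timeouts, and connection teardown.

```python
from enum import Enum


class State(Enum):
    SYN_SENT = 1
    SYN_ACK_SEEN = 2
    ESTABLISHED = 3


class HandshakeTracker:
    """Tracks handshake progress per (client, server) flow."""

    def __init__(self):
        self.flows = {}  # (client, server) -> State

    def allow(self, src, dst, flags):
        """Return True if the packet fits the expected handshake order."""
        flags = frozenset(flags)
        if flags == {"SYN"}:
            # A new connection attempt from client to server.
            self.flows[(src, dst)] = State.SYN_SENT
            return True
        if flags == {"SYN", "ACK"}:
            # The reply travels server -> client, so the flow key is reversed.
            if self.flows.get((dst, src)) is State.SYN_SENT:
                self.flows[(dst, src)] = State.SYN_ACK_SEEN
                return True
            return False
        if flags == {"ACK"}:
            state = self.flows.get((src, dst))
            if state is State.SYN_ACK_SEEN:
                self.flows[(src, dst)] = State.ESTABLISHED
                return True
            if state is State.ESTABLISHED:
                return True
            # ACKs from the server side of an already established flow.
            return self.flows.get((dst, src)) is State.ESTABLISHED
        return False
```

The tracker needs to see both directions of the flow, which is why this kind of check fits an inline device and is not available to a device that only sees inbound traffic.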

4. Example products
A. Firewall
    A. Firewall in L3 (routed) mode
    B. Firewall in L2 (transparent) mode
    C. Firewall in L2 (transparent) mode with built-in bypass
B. IPS
    A. IPS in L2 (transparent) mode with built-in bypass
C. Router/Switch
    A. L3 routing (static and dynamic routing protocols)
    B. L2 switching

What Is the Out-of-Path Deployment Mode?

1. Description
As the name "out-of-path" suggests, the appliance sits off the path that traffic travels. [Figure 2] illustrates this.

[Figure 2] Example of an out-of-path deployment

Unlike inline, an out-of-path appliance is installed outside the network path, so it can be set up to carry only selected traffic, or no traffic at all under normal conditions. As a result, installing the product does not affect existing traffic flows. In other words, it is less exposed to the weaknesses of inline: forwarding delay and failures.

2. Direction of the network traffic an out-of-path product handles
Traffic through an out-of-path product falls into two types: bidirectional traffic of a specific service, or unidirectional traffic of a specific service.

[Figure 3] Out-of-path with bidirectional traffic [Figure 4] Out-of-path with unidirectional traffic

Here, "traffic of a specific service" means the portion of all network traffic that is steered through the out-of-path product. For example, suppose the externally facing servers are a web server, a DNS server, and a mail server. The out-of-path product then handles only the web server's traffic and not that of the DNS and mail servers. DNS and mail traffic continue to use the existing network path, so those services are unaffected.
With bidirectional traffic, as in the web server example, request traffic from client to server (inbound) flows into the out-of-path product, and response traffic from server to client (outbound) flows into it as well.

3. Deployment constraints and scope of application
When an out-of-path deployment carries bidirectional traffic, traffic routing on the network needs careful consideration. Network routing normally applies policies based on the destination IP, for example "send traffic destined for the web server to the out-of-path product." However, destination-based routing cannot steer the web server's responses, which go to arbitrary client IPs, to the out-of-path product. Source-IP-based routing, using a technique such as PBR (Policy Based Routing), is required to bring the web server's outbound traffic into the out-of-path product as well.
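The routing constraint described above can be sketched in a few lines. The example below compares a plain destination-based lookup with a policy lookup that checks the source address first. All addresses, next-hop names, and function names are hypothetical and chosen only for illustration; real routers implement this in their forwarding plane.

```python
import ipaddress


def destination_route(dst_ip, table):
    """Longest-prefix match on the destination address only."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [(net, hop) for net, hop in table if dst in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def policy_route(src_ip, dst_ip, policies, table):
    """Check source-based policies first, then fall back to destination routing."""
    src = ipaddress.ip_address(src_ip)
    for net, hop in policies:
        if src in net:
            return hop
    return destination_route(dst_ip, table)


# Hypothetical web server 203.0.113.10 and a default route for everything else.
TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream"),
    (ipaddress.ip_network("203.0.113.10/32"), "mitigation-device"),
]
# Source-based policy: anything sent BY the web server goes to the device.
POLICIES = [(ipaddress.ip_network("203.0.113.10/32"), "mitigation-device")]
```

With destination routing alone, a request to 203.0.113.10 reaches the device, but the server's reply to an arbitrary client such as 198.51.100.7 matches only the default route and bypasses it. The source-based policy is what pulls the reply back through the device.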

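To work around this routing problem, out-of-path products that handle bidirectional traffic are operated with a "proxy IP" that relays the communication. For example, a client sends its request to the web server's representative IP configured on the out-of-path product, and the product relays the traffic to the server using the configured proxy IP. The server's response then goes to the proxy IP, and the product forwards it to the client. The resulting structure is complex.

In most cases, however, the most efficient approach is a unidirectional security policy that looks only at inbound traffic, that is, requests from arbitrary clients, and does not touch outbound traffic. DDoS defense is the typical example.
For a web service, a security policy is needed for the large volume of DDoS traffic arriving at the web server from outside; the reverse case, in which the web server attacks arbitrary external targets, can be considered very rare. DDoS defense therefore only has to act on inbound traffic and can ignore outbound traffic. Another example is DSR (Direct Server Return) on an L4/L7 switch: the switch handles only client requests to the web service, and the responses go directly to the client without passing back through the switch.

Looking at large web services in particular, inbound traffic from clients requesting the service is small, while outbound traffic that answers requests and delivers web pages is comparatively large. When security is built around inbound traffic only, even a large service network can size its security products for the relatively small inbound volume, which also makes better use of the product.

4. Example products
Products that handle bidirectional traffic for a specific service include L4/L7 switches in transparent mode, proxy-based firewalls, and web application firewalls.
Products that handle unidirectional traffic for a specific service include L4/L7 switches in DSR (Direct Server Return) mode and DDoS mitigation products.

So far we have briefly compared the inline and out-of-path modes from a technical standpoint. The rest of this article turns to the area where there is the most to say: inline versus out-of-path in DDoS mitigation products.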

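DDoS Mitigation Products: Inline vs. Out-of-Path Head to Head

The rivalry between inline and out-of-path DDoS mitigation products is nothing new. Customers have had to weigh the two modes from the moment they start evaluating a product, and vendors have worked to highlight the strengths of whichever mode their product uses.

Understanding How DDoS Mitigation Products Are Deployed

As in [Figure 1], an inline deployment sits within the network path and consists of a single appliance or of two or more appliances. The DDoS mitigation product can then monitor and block traffic in real time.
An out-of-path deployment, as explained earlier, sits outside the network path and is idle under normal conditions. The roles are therefore split between a detection-only appliance, which monitors continuously so that mitigation can start when an attack occurs, and a mitigation-only appliance, which actually blocks the attack. To handle large traffic volumes, detection and mitigation appliances can also be deployed as clusters of two or more units.

DDoS Defense Target Traffic: Bidirectional vs. Unidirectional

Depending on the network design, both inbound and outbound DDoS attacks sometimes have to be considered. This applies when the mitigation product protects a network segment that contains clients as well as servers, especially when the servers sit in a DMZ on the organization's own network rather than in an external IDC. If the product is installed at the single point where traffic enters both the DMZ and the internal network, it defends the DMZ servers against inbound DDoS attacks, and it may also defend against outbound DDoS attacks launched by client PCs on the internal network. An inline product, which suits bidirectional traffic, is the right fit here.

In general, however, a DDoS attack generates massive traffic from outside toward an internal service so that legitimate external users cannot reach it. DDoS mitigation products therefore defend primarily from the inbound perspective.

Accordingly, to protect externally facing servers, the suitable choice is either an inline product configured to act only on inbound DDoS attacks or, for IDC segments with very large traffic volumes, an out-of-path product.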

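Suitability of DDoS Response Time: Inline vs. Out-of-Path

A DDoS attack can paralyze a service as soon as it starts, so response time is critical.
As described above, an inline deployment detects and blocks bidirectional or unidirectional traffic in real time, so it can block an attack immediately according to the configured policy. Detection and blocking can take place within one second.

In an out-of-path deployment, by contrast, the mitigation appliance is idle under normal conditions, and a separate, always-on detection appliance is set up to activate it. When an attack occurs, the detection appliance first recognizes it and sends an activation command to the mitigation appliance, which then blocks the DDoS traffic. As a result, an out-of-path product has a clear structural limitation: it responds more slowly than an inline product.

To compensate, various techniques are being developed to make out-of-path deployments act faster. For example, detection based on live traffic is replacing detection based on traffic-flow records collected from switches and routers. Techniques that monitor only inbound traffic in real time to maximize efficiency are also in use.

DDoS Defense Capabilities: Inline vs. Out-of-Path / Bidirectional vs. Unidirectional / Clustering

So far we have covered the inline and out-of-path modes and traffic direction. What matters most, however, is the defense capability itself: how an attack can be stopped in each deployment environment and for each traffic pattern. Supporting a network topology does not by itself stop DDoS attacks, so the focus must return to the core function of DDoS defense.

First, consider inline defense of bidirectional traffic. The product must, as a baseline, detect and block inbound DDoS attacks arriving from outside. Simple threshold-based detection and blocking, however, has no clear basis for judging whether the traffic it catches consists of legitimate packets. An authentication-based method that verifies whether a packet is legitimate is therefore needed ahead of threshold-based detection and blocking. For outbound DDoS attacks, the IPs of internal clients and servers are known, so traffic from any other source IP should be blocked immediately, and traffic that does not look like a normal response should be detected and blocked using authentication-based and threshold-based techniques.

Second, consider unidirectional defense, whether inline or out-of-path. As noted, accurate defense against inbound attacks requires both authentication-based packet verification and threshold-based, policy-driven detection and blocking. In these deployments, however, authentication has to work with one direction of traffic only, which calls for a distinct technique different from the authentication used in bidirectional inline deployments. Moreover, a proper defense does not rely solely on the normal/abnormal connection status that the out-of-path detection appliance derives from bidirectional traffic; it requires real-time authentication-based verification on the out-of-path mitigation appliance while the attack is under way.

Finally, the biggest strength of unidirectional inline and out-of-path deployments is that several DDoS appliances can be clustered to act as one. Here too it is essential that both authentication-based and threshold-based methods are supported. In addition, the cluster needs a management scheme that synchronizes policy across all appliances at once, and the appliances must share their authentication state in real time; only then can several DDoS appliances serve as a single mitigation solution.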
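The threshold-based method discussed above can be illustrated with a small sliding-window counter. This is a generic sketch, not the logic of any particular product; the class name ThresholdDetector and its parameters are assumptions made for the example.

```python
from collections import defaultdict, deque


class ThresholdDetector:
    """Flags a source once it sends more than `max_packets` within `window_seconds`."""

    def __init__(self, max_packets, window_seconds):
        self.max_packets = max_packets
        self.window = window_seconds
        self.seen = defaultdict(deque)  # source IP -> timestamps of recent packets

    def observe(self, src_ip, now):
        """Record one packet at time `now`; return True if the source is over the threshold."""
        timestamps = self.seen[src_ip]
        timestamps.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        return len(timestamps) > self.max_packets
```

The sketch also shows the weakness the article points out: the counter reacts only to volume, so a busy but legitimate client is flagged exactly like an attacker, which is why a separate verification step is needed before blocking.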

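[Figure 5] Example of an inline cluster [Figure 6] Example of an out-of-path cluster

AhnLab TrusGuard DPX: A DDoS Mitigation Product That Supports Inline, Out-of-Path, and Clustered Deployments

According to AhnLab, TrusGuard DPX (hereafter TG DPX) is the industry's first product to support inline, out-of-path, and clustered deployments on the same appliance with only a license change.
Through its multi-stage filter architecture, it supports threshold-based DDoS detection and blocking driven by built-in automatic learning, and it also provides real-time authentication that checks whether traffic arriving during an attack is legitimate.

[Figure 7] Example of AhnLab TrusGuard DPX deployments

TG DPX can detect DDoS attacks out-of-path using unidirectional traffic only, and it offers a distinctive technique that applies authentication-based and threshold-based defense to only the traffic of the attack target. It also shares authentication state in real time among the appliances performing mitigation, which enables unidirectional inline and out-of-path clustering.
As a result, TG DPX provides DDoS defense that is more accurate and produces fewer false positives than conventional threshold-only blocking.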


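Let's Launch VMs Very Quickly and Lightly with Firecracker

2018. 12. 14. 19:11

Firecracker is a new KVM-based hypervisor built by AWS to provide secure, fast microVMs for serverless computing.

It is a hypervisor that delivers container-class startup speed in places such as Lambda and Fargate, where containers are the natural fit but security on shared hosts is a concern.

It can run not only on AWS but also on-premises or on IBM bare metal. It runs on a PC too, of course. (It would be nice to use with Vagrant.)

Original: https://aws.amazon.com/ko/blogs/opensource/firecracker-open-source-secure-fast-microvm-serverless/

New Challenges for Virtualization Technology

Today, customers can use serverless computing to build applications without worrying about provisioning or managing infrastructure. Developers can package their code as serverless containers with AWS Fargate or as serverless functions with AWS Lambda. Customers love the low operational overhead of serverless. We too believe that serverless will continue to play a pivotal role in the future of computing.

As customers increasingly adopted serverless, we realized that existing virtualization technology was not optimized for the event-driven, sometimes short-lived nature of these workloads. We saw a need to build virtualization technology designed specifically for serverless computing. We needed a way to provide the hardware-virtualization-based security boundaries of virtual machines while keeping the small package size and agility of containers and functions.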

Firecracker Technology

Meet Firecracker, an open source virtual machine monitor (VMM) that uses the Linux Kernel-based Virtual Machine (KVM). Firecracker allows you to create micro Virtual Machines or microVMs. Firecracker is minimalist by design – it includes only what you need to run secure and lightweight VMs. At every step of the design process, we optimized Firecracker for security, speed, and efficiency. For example, we can only boot relatively recent Linux kernels, and only when they are compiled with a specific set of configuration options (there are 1000+ kernel compile config options). Also, there is no support for graphics or accelerators of any kind, no support for hardware passthrough, and no support for (most) legacy devices.

Firecracker boots a minimal kernel config without relying on an emulated BIOS and without a complete device model. The only devices are virtio net and virtio block, as well as a one-button keyboard (the reset pin helps when there’s no power management device). This minimal device model not only enables faster startup times (< 125 ms on an i3.metal with the default microVM size), but also reduces the attack surface, for increased security. Read more details about Firecracker’s promise to enable minimal-overhead execution of container and serverless workloads.

In the fall of 2017, we decided to write Firecracker in Rust, a modern programming language that guarantees thread and memory safety and prevents buffer overflows and many other types of memory safety errors that can lead to security vulnerabilities. Read more details about the features and architecture of the Firecracker VMM at Firecracker Design.

Firecracker microVMs improve efficiency and utilization with a low memory overhead of < 5 MiB per microVM. This means that you can pack thousands of microVMs onto a single machine. You can use an in-process rate limiter to control, with fine granularity, how network and storage resources are shared, even across thousands of microVMs. All hardware compute resources can be safely oversubscribed, to maximize the number of workloads that can run on a host.
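The post does not describe how the in-process rate limiter works internally. As one common way such limiters are built, the sketch below shows a generic token bucket; it is an illustration of the general technique under that assumption, not Firecracker's code, and the class and parameter names are made up for this example.

```python
class TokenBucket:
    """Generic token bucket: `capacity` tokens, refilled at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity
        self.rate = refill_per_second
        self.tokens = float(capacity)  # start with a full bucket
        self.last = now

    def try_consume(self, amount, now):
        """Refill for the elapsed time, then spend `amount` tokens if available."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False  # caller should delay or drop the I/O request
```

One bucket per microVM and per resource (for example bytes sent on a network interface) is enough to cap each guest independently, since a request that finds too few tokens simply waits for the next refill.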

We developed Firecracker with the following guiding tenets (unless you know better ones) for the open source project:

Built-In Security: We provide compute security barriers that enable multitenant workloads, and cannot be mistakenly disabled by customers. Customer workloads are simultaneously considered sacred (shall not be touched) and malicious (shall be defended against).
Light-Weight Virtualization: We focus on transient or stateless workloads over long-running or persistent workloads. Firecracker’s hardware resources overhead is known and guaranteed.
Minimalist in Features: If it’s not clearly required for our mission, we won’t build it. We maintain a single implementation per capability.
Compute Oversubscription: All of the hardware compute resources exposed by Firecracker to guests can be securely oversubscribed.

We open sourced this foundational technology because we believe that our mission to build the next generation of virtualization for serverless computing has just begun.

Firecracker Usage

AWS Lambda uses Firecracker as the foundation for provisioning and running sandboxes upon which we execute customer code. Because Firecracker provides a secure microVM which can be rapidly provisioned with a minimal footprint, it enables performance without sacrificing security. This lets us drive high utilization on physical hardware, as we can now optimize how we distribute and run workloads for Lambda, mixing workloads based on factors like active/idle periods, and memory utilization.

Previously, Fargate Tasks consisted of one or more Docker containers running inside a dedicated EC2 VM to ensure isolation across Tasks. These Tasks now execute on Firecracker microVMs, which allows us to provision the Fargate runtime layer faster and more efficiently on EC2 bare metal instances, and improve density without compromising kernel-level isolation of Tasks. Over time, this will allow us to continue to innovate at the runtime layer, giving our customers even better performance while maintaining our high security bar, and lowering the overall cost of running serverless container architectures.

Firecracker runs on Intel processors today, with support for AMD and ARM coming in 2019.

You can run Firecracker on AWS .metal instances, as well as on any other bare-metal server, including on-premises environments and developer laptops.

Firecracker will also enable popular container runtimes such as containerd to manage containers as microVMs. This allows Docker and container orchestration frameworks such as Kubernetes to use Firecracker. We have built a prototype that enables containerd to manage containers as Firecracker microVMs and would like to work with the community to take it further.

Getting Started with Firecracker

Getting Started with Firecracker provides detailed instructions on how to download the Firecracker binary, start Firecracker with different options, build from the source, and run integration tests. You can run Firecracker in production using the Firecracker Jailer.

Let’s take a look at how to get started with using Firecracker on AWS Cloud (these steps can be used on any bare metal machine):

Create an i3.metal instance using Ubuntu 18.04.1.

Firecracker is built on top of KVM and needs read/write access to /dev/kvm. Log in to the host in one terminal and set up that access:

sudo setfacl -m u:${USER}:rw /dev/kvm

Download and start the Firecracker binary:

curl -LO https://github.com/firecracker-microvm/firecracker/releases/download/v0.11.0/firecracker-v0.11.0
chmod +x firecracker-v0.11.0
./firecracker-v0.11.0 --api-sock /tmp/firecracker.sock

Each microVM can be accessed using a REST API. In another terminal, query the microVM:

curl --unix-socket /tmp/firecracker.sock "http://localhost/machine-config"

This returns a response:

{ "vcpu_count": 1, "mem_size_mib": 128,  "ht_enabled": false,  "cpu_template": "Uninitialized" }

Starting the binary launches a VMM process that waits for the microVM configuration; the query above shows the defaults. By default, one vCPU and 128 MiB of memory are assigned to each microVM. The microVM now needs to be configured with an uncompressed Linux kernel binary and an ext4 file system image to use as the root filesystem.

Download a sample kernel and rootfs:

curl -fsSL -o hello-vmlinux.bin https://s3.amazonaws.com/spec.ccfc.min/img/hello/kernel/hello-vmlinux.bin
curl -fsSL -o hello-rootfs.ext4 https://s3.amazonaws.com/spec.ccfc.min/img/hello/fsfiles/hello-rootfs.ext4

Set up the guest kernel:

curl --unix-socket /tmp/firecracker.sock -i \
    -X PUT 'http://localhost/boot-source'   \
    -H 'Accept: application/json'           \
    -H 'Content-Type: application/json'     \
    -d '{        "kernel_image_path": "./hello-vmlinux.bin", "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"    }'

Set up the root filesystem:

curl --unix-socket /tmp/firecracker.sock -i \
    -X PUT 'http://localhost/drives/rootfs' \
    -H 'Accept: application/json'           \
    -H 'Content-Type: application/json'     \
    -d '{        "drive_id": "rootfs",        "path_on_host": "./hello-rootfs.ext4",        "is_root_device": true,        "is_read_only": false    }'

Once the kernel and root filesystem are configured, the guest machine can be started:

curl --unix-socket /tmp/firecracker.sock -i \
    -X PUT 'http://localhost/actions'       \
    -H  'Accept: application/json'          \
    -H  'Content-Type: application/json'    \
    -d '{        "action_type": "InstanceStart"     }'
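The three curl calls above can also be issued from a script. The sketch below sends the same PUT requests over the Unix socket using only the Python standard library. The helper names UnixHTTPConnection, api_put, and start_sequence are assumptions made for this example, and the request bodies are copied from the curl commands.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def api_put(socket_path, resource, body):
    """PUT a JSON body to the API socket and return the HTTP status code."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("PUT", resource, body=json.dumps(body),
                     headers={"Accept": "application/json",
                              "Content-Type": "application/json"})
        return conn.getresponse().status
    finally:
        conn.close()


def start_sequence(kernel, rootfs):
    """The three requests from the walkthrough, in the order they must be sent."""
    return [
        ("/boot-source", {"kernel_image_path": kernel,
                          "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"}),
        ("/drives/rootfs", {"drive_id": "rootfs", "path_on_host": rootfs,
                            "is_root_device": True, "is_read_only": False}),
        ("/actions", {"action_type": "InstanceStart"}),
    ]


if __name__ == "__main__":
    # Requires a Firecracker process listening on this socket.
    for resource, body in start_sequence("./hello-vmlinux.bin", "./hello-rootfs.ext4"):
        print(resource, api_put("/tmp/firecracker.sock", resource, body))
```

Keeping the request bodies in one function makes the ordering explicit: the kernel and root drive must be configured before the InstanceStart action is sent.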

The first terminal now shows a serial TTY prompting you to log in to the guest machine:

Welcome to Alpine Linux 3.8
Kernel 4.14.55-84.37.amzn2.x86_64 on an x86_64 (ttyS0)
localhost login:

localhost login: root
Password:
Welcome to Alpine! 

The Alpine Wiki contains a large amount of how-to guides and general information about administrating Alpine systems. 

See <http://wiki.alpinelinux.org>. 

You can setup the system with the command: setup-alpine 

You may change this message by editing /etc/motd.

login[979]: root login on 'ttyS0' 
localhost:~#

You can see the filesystem using ls /:

localhost:~# ls /
bin         home        media       root        srv         usr
dev         lib         mnt         run         sys         var
etc         lost+found  proc        sbin        tmp

Terminate the microVM using the reboot command. Firecracker currently does not implement guest power management, as a tradeoff for efficiency. Instead, the reboot command issues a keyboard reset action which is then used as a shutdown switch.

Once the basic microVM is created, you can add network interfaces, add more drives, and continue to configure the microVM.

Want to create thousands of microVMs on your bare metal instance?

for ((i=0; i<1000; i++)); do
    ./firecracker-v0.11.0 --api-sock /tmp/firecracker-$i.sock &
done

Multiple microVMs may be configured with a single shared root file system, and each microVM can then be assigned its own read/write share.
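One way to picture the shared root filesystem arrangement is as two drive definitions per microVM: the common image attached read-only, plus a writable drive unique to that microVM. The sketch below builds those request bodies using the drive fields shown earlier. The "scratch" drive ID, the directory layout, and the helper names are hypothetical, and whether a given guest image boots cleanly from a read-only root is an assumption to verify.

```python
def api_socket(vm_index):
    """Socket path matching the naming used in the launch loop above."""
    return f"/tmp/firecracker-{vm_index}.sock"


def drive_configs(vm_index, shared_rootfs, scratch_dir):
    """Drive definitions for one microVM: shared read-only root plus a private drive."""
    return [
        {   # Same image file for every microVM, so it must not be written to.
            "drive_id": "rootfs",
            "path_on_host": shared_rootfs,
            "is_root_device": True,
            "is_read_only": True,
        },
        {   # Per-microVM writable drive (hypothetical naming scheme).
            "drive_id": "scratch",
            "path_on_host": f"{scratch_dir}/vm-{vm_index}.ext4",
            "is_root_device": False,
            "is_read_only": False,
        },
    ]
```

Each body would be sent with a PUT to /drives/<drive_id> on that microVM's own socket before the instance is started.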

Firecracker and Open Source

It is our mission to innovate on behalf of and for our customers, and we will continue to invest deeply in serverless computing at all three critical layers of the stack: the application, virtualization, and hardware layers. We want to offer our customers their choice of compute, whether instances or serverless, with no compromises on security, scalability, or performance. Firecracker is a fundamental building block for providing that experience.

Investing deeply in foundational technologies is one of the key ways that we at AWS approach innovation – not for tomorrow, but for the next decade and beyond. Sharing this technology with the community goes hand-in-hand with this innovation. Firecracker is licensed under Apache 2.0. Please visit the Firecracker GitHub repo to learn more and contribute to Firecracker.

By open sourcing Firecracker, we not only invite you to a deeper examination of the foundational technologies that we are building to underpin the future of serverless computing, but we also hope that you will join us in strengthening and improving Firecracker. See the Firecracker issues list and the Firecracker roadmap for more information.
