Cloud Server ECS: GPU Compute Instances (Part 1)

2019-04-30 | HiShop


This article describes the GPU compute instance families vgn5i, gn6i, gn6v, gn5, gn5i, and gn4, and lists the instance types in each family.

Lightweight GPU compute instance family: vgn5i

Features

· I/O optimized
· Supports IPv6
· Supports only SSD cloud disks and ultra disks
· Uses NVIDIA P4 GPU accelerators
· Instances contain virtual GPUs produced by slicing a physical GPU
  · Compute capacity of 1/8, 1/4, 1/2, or all of one NVIDIA Tesla P4
  · GPU memory of 1 GiB, 2 GiB, 4 GiB, or 8 GiB
· vCPU-to-memory ratio of 1:3
· Processor: 2.5 GHz Intel Xeon E5-2682 v4 (Broadwell)
· Network performance scales with the instance size (larger types get higher network performance)
· Use cases:
  · Real-time cloud rendering for cloud gaming
  · Real-time cloud rendering for AR/VR
  · AI (DL/ML) inference, suited to elastic deployment of Internet services that include AI inference workloads
  · Teaching and practice environments for deep learning
  · Model experimentation environments for deep learning

Instance types

| Instance type | vCPU | Memory (GiB) | Local storage (GiB) | GPU | GPU memory (GiB) | Bandwidth, out/in (Gbit/s) | Packet rate, out/in (10,000 pps) | IPv6 | NIC queues | ENIs (incl. primary) |
| ecs.vgn5i-m1.large | 2 | 6 | None | P4*1/8 | 1 | 1 | 30 | Yes | 2 | 2 |
| ecs.vgn5i-m2.xlarge | 4 | 12 | None | P4*1/4 | 2 | 2 | 50 | Yes | 2 | 3 |
| ecs.vgn5i-m4.2xlarge | 8 | 24 | None | P4*1/2 | 4 | 3 | 80 | Yes | 2 | 4 |
| ecs.vgn5i-m8.4xlarge | 16 | 48 | None | P4*1 | 8 | 5 | 100 | Yes | 4 | 5 |
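The GPU memory column follows from the slice fraction. The sketch below is an illustration only: it assumes that a full Tesla P4 exposes 8 GiB and that slices divide it evenly, which is what the figures for this family imply. The function name `vgpu_mem_gib` is hypothetical.

```shell
# Memory of one full Tesla P4 as listed for this family (assumed to be
# divided evenly among vGPU slices).
P4_MEM_GIB=8

# vgpu_mem_gib DENOMINATOR -> GiB of GPU memory for a 1/DENOMINATOR slice
vgpu_mem_gib() {
    echo $((P4_MEM_GIB / $1))
}

for d in 8 4 2 1; do
    echo "P4*1/$d -> $(vgpu_mem_gib "$d") GiB"
done
```

The four results (1, 2, 4, and 8 GiB) match the GPU memory listed for the four vgn5i instance types.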

Note: For more information, see Creating a GPU compute instance.

GPU compute instance family: gn6i

Features

· I/O optimized
· Supports IPv6
· vCPU-to-memory ratio of 1:4
· Processor: 2.5 GHz Intel Xeon Platinum 8163 (Skylake)
· Supports ESSDs (up to one million IOPS), SSD cloud disks, and ultra disks
· Built on the new-generation X-Dragon compute architecture for better performance
· GPU accelerator: T4
  · Turing architecture
  · Up to 320 Turing Tensor Cores
  · 2,560 CUDA cores
  · Mixed-precision Tensor Cores delivering 65 TFLOPS FP16, 130 TOPS INT8, and 260 TOPS INT4
  · 16 GiB of GPU memory (320 GB/s memory bandwidth)
· Network performance scales with the instance size (larger types get higher network performance)
· Use cases:
  · AI (DL/ML) inference for computer vision, speech recognition, speech synthesis, NLP, machine translation, and recommendation systems
  · Real-time cloud rendering for cloud gaming
  · Real-time cloud rendering for AR/VR
  · Graphics-heavy computing or graphics workstations
  · GPU-accelerated databases
  · High-performance computing

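Instance types

| Instance type | vCPU | Memory (GiB) | Local storage (GiB) | GPU | GPU memory (GiB) | Bandwidth, out/in (Gbit/s) | Packet rate, out/in (10,000 pps) | IPv6 | NIC queues | ENIs (incl. primary) |
| ecs.gn6i-c4g1.xlarge | 4 | 15 | None | T4*1 | 16 | 4 | 50 | Yes | 2 | 2 |
| ecs.gn6i-c8g1.2xlarge | 8 | 31 | None | T4*1 | 16 | 5 | 80 | Yes | 2 | 2 |
| ecs.gn6i-c16g1.4xlarge | 16 | 62 | None | T4*1 | 16 | 6 | 100 | Yes | 4 | 3 |
| ecs.gn6i-c24g1.6xlarge | 24 | 93 | None | T4*1 | 16 | 7.5 | 120 | Yes | 6 | 4 |
| ecs.gn6i-c24g1.12xlarge | 48 | 186 | None | T4*2 | 32 | 15 | 240 | Yes | 12 | 6 |
| ecs.gn6i-c24g1.24xlarge | 96 | 372 | None | T4*4 | 64 | 30 | 480 | Yes | 24 | 8 |
| ecs.gn6i-c32g1.8xlarge | 32 | 124 | None | T4*1 | 16 | 10 | 160 | Yes | 8 | 6 |
| ecs.gn6i-c48g1.12xlarge | 48 | 186 | None | T4*1 | 16 | 12 | 240 | Yes | 12 | 6 |
| ecs.gn6i-c72g1.18xlarge | 72 | 279 | None | T4*1 | 16 | 21.5 | 360 | Yes | 18 | 8 |

Note: For more information, see Creating a GPU compute instance.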

GPU compute instance family: gn6v

Features

· I/O optimized
· Supports IPv6
· Supports only SSD cloud disks and ultra disks
· Uses NVIDIA V100 GPUs
· vCPU-to-memory ratio of 1:4
· Processor: 2.5 GHz Intel Xeon Platinum 8163 (Skylake)
· GPU accelerator: V100 (SXM2 form factor)
  · Volta architecture
  · 16 GiB of HBM2 GPU memory
  · 5,120 CUDA cores
  · 640 Tensor Cores
  · 900 GB/s memory bandwidth
  · Six NVLink links at 25 GB/s per direction each, 300 GB/s in total when both directions are counted
· Network performance scales with the instance size (larger types get higher network performance)
· Use cases:
  · Deep learning: training and inference for AI algorithms such as image classification, autonomous driving, and speech recognition
  · Scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis
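The NVLink figures for the V100 are consistent only if each link's 25 GB/s is a per-direction rate and the 300 GB/s total counts both directions. The arithmetic can be checked as follows; the function name `nvlink_total_gbs` is a hypothetical helper, not part of any NVIDIA tool.

```shell
# nvlink_total_gbs LINKS GBS_PER_DIRECTION
# Prints the aggregate bandwidth in GB/s, counting both directions of every link.
nvlink_total_gbs() {
    echo $(($1 * $2 * 2))
}

echo "one direction:   $((6 * 25)) GB/s"
echo "both directions: $(nvlink_total_gbs 6 25) GB/s"
```

Six links at 25 GB/s give 150 GB/s in one direction and 300 GB/s across both.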

Instance types

| Instance type | vCPU | Memory (GiB) | Local storage (GiB) | GPU | GPU memory (GB) | Bandwidth, out/in (Gbit/s) | Packet rate, out/in (10,000 pps) | IPv6 | NIC queues | ENIs (incl. primary) |
| ecs.gn6v-c8g1.2xlarge | 8 | 32.0 | None | 1 * NVIDIA V100 | 1 * 16 | 2.5 | 80 | Yes | 4 | 4 |
| ecs.gn6v-c8g1.8xlarge | 32 | 128.0 | None | 4 * NVIDIA V100 | 4 * 16 | 10.0 | 200 | Yes | 8 | 8 |
| ecs.gn6v-c8g1.16xlarge | 64 | 256.0 | None | 8 * NVIDIA V100 | 8 * 16 | 20.0 | 250 | Yes | 16 | 8 |

Note: For more information, see Creating a GPU compute instance.

GPU compute instance family: gn5

Features

· I/O optimized
· Supports only SSD cloud disks and ultra disks
· Uses NVIDIA P100 GPUs
· Multiple vCPU-to-memory ratios
· High-performance local NVMe SSDs
· Processor: 2.5 GHz Intel Xeon E5-2682 v4 (Broadwell)
· Network performance scales with the instance size (larger types get higher network performance)
· Use cases:
  · Deep learning
  · Scientific computing, such as computational fluid dynamics, computational finance, genomics research, and environmental analysis
  · High-performance computing, rendering, multimedia encoding and decoding, and other server-side GPU compute workloads

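Instance types

| Instance type | vCPU | Memory (GiB) | Local storage (GiB) | GPU | GPU memory (GB) | Bandwidth, out/in (Gbit/s) | Packet rate, out/in (10,000 pps) | IPv6 | NIC queues | ENIs (incl. primary) |
| ecs.gn5-c4g1.xlarge | 4 | 30.0 | 440 | 1 * NVIDIA P100 | 1 * 16 | 3.0 | 30 | No | 1 | 3 |
| ecs.gn5-c8g1.2xlarge | 8 | 60.0 | 440 | 1 * NVIDIA P100 | 1 * 16 | 3.0 | 40 | No | 1 | 4 |
| ecs.gn5-c4g1.2xlarge | 8 | 60.0 | 880 | 2 * NVIDIA P100 | 2 * 16 | 5.0 | 100 | No | 2 | 4 |
| ecs.gn5-c8g1.4xlarge | 16 | 120.0 | 880 | 2 * NVIDIA P100 | 2 * 16 | 5.0 | 100 | No | 4 | 8 |
| ecs.gn5-c28g1.7xlarge | 28 | 112.0 | 440 | 1 * NVIDIA P100 | 1 * 16 | 5.0 | 100 | No | 8 | 8 |
| ecs.gn5-c8g1.8xlarge | 32 | 240.0 | 1760 | 4 * NVIDIA P100 | 4 * 16 | 10.0 | 200 | No | 8 | 8 |
| ecs.gn5-c28g1.14xlarge | 56 | 224.0 | 880 | 2 * NVIDIA P100 | 2 * 16 | 10.0 | 200 | No | 14 | 8 |
| ecs.gn5-c8g1.14xlarge | 54 | 480.0 | 3520 | 8 * NVIDIA P100 | 8 * 16 | 25.0 | 400 | No | 14 | 8 |

Note: For more information, see Creating a GPU compute instance.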

GPU compute instance family: gn5i

Features

· I/O optimized
· Supports IPv6
· Supports only SSD cloud disks and ultra disks
· Uses NVIDIA P4 GPUs
· vCPU-to-memory ratio of 1:4
· Processor: 2.5 GHz Intel Xeon E5-2682 v4 (Broadwell)
· Network performance scales with the instance size (larger types get higher network performance)
· Use cases:
  · Deep learning inference
  · Server-side GPU compute workloads such as multimedia encoding and decoding

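Instance types

| Instance type | vCPU | Memory (GiB) | Local storage (GiB) | GPU | GPU memory (GB) | Bandwidth, out/in (Gbit/s) | Packet rate, out/in (10,000 pps) | IPv6 | NIC queues | ENIs (incl. primary) |
| ecs.gn5i-c2g1.large | 2 | 8.0 | None | 1 * NVIDIA P4 | 1 * 8 | 1.0 | 10 | Yes | 2 | 2 |
| ecs.gn5i-c4g1.xlarge | 4 | 16.0 | None | 1 * NVIDIA P4 | 1 * 8 | 1.5 | 20 | Yes | 2 | 3 |
| ecs.gn5i-c8g1.2xlarge | 8 | 32.0 | None | 1 * NVIDIA P4 | 1 * 8 | 2.0 | 40 | Yes | 4 | 4 |
| ecs.gn5i-c16g1.4xlarge | 16 | 64.0 | None | 1 * NVIDIA P4 | 1 * 8 | 3.0 | 80 | Yes | 4 | 8 |
| ecs.gn5i-c16g1.8xlarge | 32 | 128.0 | None | 2 * NVIDIA P4 | 2 * 8 | 6.0 | 120 | Yes | 8 | 8 |
| ecs.gn5i-c28g1.14xlarge | 56 | 224.0 | None | 2 * NVIDIA P4 | 2 * 8 | 10.0 | 200 | Yes | 14 | 8 |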
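Choosing an instance type usually means finding the smallest one that meets a vCPU and memory requirement. The sketch below does this for gn5i, assuming the vCPU and memory figures listed for that family; the function name `pick_gn5i` and the embedded data layout (type, vCPU, memory in GiB) are illustrative choices, not an Alibaba Cloud API.

```shell
# pick_gn5i MIN_VCPU MIN_MEM_GIB
# Prints the first (smallest) gn5i instance type that satisfies both minimums,
# or nothing if no type is large enough. Rows are ordered from small to large.
pick_gn5i() {
    awk -v c="$1" -v m="$2" '$2 >= c && $3 >= m { print $1; exit }' <<'EOF'
ecs.gn5i-c2g1.large 2 8
ecs.gn5i-c4g1.xlarge 4 16
ecs.gn5i-c8g1.2xlarge 8 32
ecs.gn5i-c16g1.4xlarge 16 64
ecs.gn5i-c16g1.8xlarge 32 128
ecs.gn5i-c28g1.14xlarge 56 224
EOF
}

pick_gn5i 6 20
```

For a requirement of 6 vCPUs and 20 GiB, the call prints ecs.gn5i-c8g1.2xlarge. The same approach works for the other families if their rows are substituted.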

Note: For more information, see Creating a GPU compute instance.

GPU compute instance family: gn4

Features

· I/O optimized
· Supports only SSD cloud disks and ultra disks
· Uses NVIDIA M40 GPUs
· Multiple vCPU-to-memory ratios
· Processor: 2.5 GHz Intel Xeon E5-2682 v4 (Broadwell)
· Network performance scales with the instance size (larger types get higher network performance)
· Use cases:
  · Deep learning
  · Scientific computing, such as computational fluid dynamics, computational finance, genomics research, and environmental analysis
  · High-performance computing, rendering, multimedia encoding and decoding, and other server-side GPU compute workloads

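Instance types

| Instance type | vCPU | Memory (GiB) | Local storage (GiB) | GPU | GPU memory (GB) | Bandwidth, out/in (Gbit/s) | Packet rate, out/in (10,000 pps) | IPv6 | NIC queues | ENIs (incl. primary) |
| ecs.gn4-c4g1.xlarge | 4 | 30.0 | None | 1 * NVIDIA M40 | 1 * 12 | 3.0 | 30 | No | 1 | 3 |
| ecs.gn4-c8g1.2xlarge | 8 | 30.0 | None | 1 * NVIDIA M40 | 1 * 12 | 3.0 | 40 | No | 1 | 4 |
| ecs.gn4.8xlarge | 32 | 48.0 | None | 1 * NVIDIA M40 | 1 * 12 | 6.0 | 80 | No | 3 | 8 |
| ecs.gn4-c4g1.2xlarge | 8 | 60.0 | None | 2 * NVIDIA M40 | 2 * 12 | 5.0 | 50 | No | 1 | 4 |
| ecs.gn4-c8g1.4xlarge | 16 | 60.0 | None | 2 * NVIDIA M40 | 2 * 12 | 5.0 | 50 | No | 1 | 8 |
| ecs.gn4.14xlarge | 56 | 96.0 | None | 2 * NVIDIA M40 | 2 * 12 | 10.0 | 120 | No | 4 | 8 |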

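Creating a GPU compute instance

A GPU compute instance can be used only after a GPU driver is installed. You can have the driver installed automatically when the instance is created, or install it manually afterwards. This section describes how to create a GPU compute instance with automatic driver installation.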

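Notes

If you use automatic GPU driver installation, note the following:

· Automatic installation supports only Linux public images.
· Installation time depends on the internal bandwidth and number of CPU cores of the instance type and takes roughly 4 to 10 minutes. The GPU cannot be used during installation. Do not perform any operation on the instance or install other GPU-related software in that time, or the automatic installation may fail and leave the instance unusable.
· If you replace the operating system after the instance is created, use the same image or another image that supports automatic installation of CUDA and the GPU driver, or the automatic installation may fail.
· You can connect to the instance remotely and follow the installation progress and result in the installation log:
  · If you selected Auto-install GPU Driver, the log is at /root/nvidia_install.log.
  · If you configured the nvidia_install_v2.0 script in the instance user data, the log is at /root/nvidia/nvidia_install.log.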
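A quick way to read the result from either log is to look for failure markers. The sketch below assumes the convention used by the nvidia_install_v2.0 script in this article, which prefixes every failure line with INSTALL_ERROR; whether the v1.0 log uses the same marker is an assumption. The function name `install_status` is hypothetical.

```shell
# install_status LOGFILE
# Prints "missing" if the log does not exist, "failed" if it contains an
# INSTALL_ERROR line, and "no-errors" otherwise. "no-errors" can also mean
# that the installation is still running.
install_status() {
    if [ ! -f "$1" ]; then
        echo "missing"
    elif grep -q "INSTALL_ERROR" "$1"; then
        echo "failed"
    else
        echo "no-errors"
    fi
}

for f in /root/nvidia_install.log /root/nvidia/nvidia_install.log; do
    echo "$f: $(install_status "$f")"
done
```

On a failure, the matching lines (`grep INSTALL_ERROR` on the log) name the step that went wrong.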

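Procedure

This procedure focuses on the settings specific to GPU compute instances. For the other, general settings, see Create an ECS instance.

1. Go to the ECS purchase page.

2. Complete the basic configuration. Note the following:
   · Region: choose a region and zone from the table under Regions and zones that offer GPU compute instances. If the purchase page differs from the table, the purchase page takes precedence.
   · Instance: go to Heterogeneous Computing GPU/FPGA > GPU Compute, then select an instance type that meets your needs.
   · Image: some Linux public images support automatic installation of CUDA and the GPU driver. For the list, see Images that support automatic installation.
     If the image you select supports automatic installation, select Auto-install GPU Driver and choose a driver version. For a new workload, the latest version is recommended.
     If you do not select Auto-install GPU Driver, or the image does not support it, configure an installation script under instance user data, or install the GPU driver manually after the instance is created. To configure the script, see Script versions.
     Note: If you create the instance by calling RunInstances, you must upload the installation script through the UserData parameter, and the script content must be Base64-encoded.

3. Complete the networking and security group configuration. Note the following:
   · Network: select VPC.
   · Public bandwidth: choose a bandwidth that suits your workload.
   Caution: If you chose an image of Windows 2008 R2 or earlier in the basic configuration, you can no longer connect to the instance through the Management Terminal once the GPU driver takes effect; the connection shows a black screen or stays on the boot screen. Select Assign Public IP Address here, or bind an Elastic IP address after the instance is created, so that you can connect over another protocol such as RDP (the remote desktop built into Windows), PCoIP, or XenDesktop HDX 3D. RDP does not support DirectX or OpenGL applications; for those, install a VNC server and client yourself.

4. Complete the system configuration. Note the following:
   · Logon credentials: a key pair or a custom password is recommended. If you choose to set credentials later, then before you can log on through the Management Terminal you must bind an SSH key pair or reset the password and restart the instance for the change to take effect. If the GPU driver has not finished installing at that point, the restart causes the installation to fail.
   · Instance user data:
     · If you selected Auto-install GPU Driver for the image on the basic configuration page, this field shows the notes and the shell script for automatic installation of CUDA and the GPU driver.
     · If you did not, you can configure an installation script here. For a sample script, see Installation script for instance user data.

5. Complete the grouping settings as needed, confirm the order, and finish creating the GPU compute instance.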
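When the instance is created through RunInstances rather than the console, the installation script has to be passed Base64-encoded in the UserData parameter. The following is a minimal sketch of the encoding step only. It assumes the common `base64` command-line tool is available; the function name `encode_user_data` is hypothetical, and how the resulting string is passed to the API depends on the SDK or CLI you use.

```shell
# encode_user_data FILE
# Prints the Base64 encoding of FILE on a single line. Line wrapping added by
# some base64 implementations is stripped so the value can be used as one
# parameter.
encode_user_data() {
    base64 < "$1" | tr -d '\n'
}

# Encode a script file when one is given as the first argument.
if [ -n "$1" ] && [ -f "$1" ]; then
    encode_user_data "$1"
    echo
fi
```

Reading the file from standard input keeps the call the same on GNU and BSD versions of `base64`, whose option flags differ.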

Notes

· If you configured an automatic installation script, the GPU driver is installed automatically after the instance starts. The instance restarts automatically when installation finishes, and the GPU driver works properly only after that restart.
· The GPU driver is more stable in Persistence Mode. The installation script enables Persistence Mode and adds the setting to the Linux startup script so that it stays enabled by default after the instance restarts.
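After the restart you may want to confirm that Persistence Mode is actually on. The sketch below is one possible check, assuming `nvidia-smi` is on the PATH and reports one value per GPU, Enabled or Disabled, for its `persistence_mode` query field. The helper name `all_persistent` is hypothetical.

```shell
# all_persistent
# Reads one persistence-mode value per line on standard input and succeeds
# only if there is at least one line and every line says "Enabled".
all_persistent() {
    awk '{ n++; if ($1 != "Enabled") bad = 1 } END { if (n == 0 || bad) exit 1 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    if nvidia-smi --query-gpu=persistence_mode --format=csv,noheader | all_persistent; then
        echo "Persistence Mode is enabled on all GPUs"
    else
        echo "Persistence Mode is off on at least one GPU"
    fi
fi
```

If the check fails, `nvidia-smi -pm 1` (the command the installation script writes to the startup script) turns the mode back on.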

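Regions and zones that offer GPU compute instances

The following regions and zones offer each GPU compute instance family:

gn4
· China North 2 (Zone A), China East 2 (Zone B)
· China South 1 (Zone C)

gn5
· China North 2 (Zones C, E), China North 5 (Zone A)
· China East 1 (Zones G, F), China East 2 (Zones D, B, E)
· China South 1 (Zone D)
· Hong Kong (Zones C, B)
· Asia Pacific SE 1 (Zones B, A), Asia Pacific SE 2 (Zone A), Asia Pacific SE 3 (Zone A), Asia Pacific SE 5 (Zone A)
· US West 1 (Zones B, A), US East 1 (Zones B, A)
· EU Central 1 (Zone A)

gn5 (with an NGC environment)
· gn5 instances in some regions do not support deploying an NGC (NVIDIA GPU Cloud) environment. For more information, see Deploy an NGC environment on gn5 instances.

gn5i
· China North 2 (Zones C, E, A)
· China East 1 (Zone B), China East 2 (Zones D, B)
· China South 1 (Zone A)

gn6v
· China East 2 (Zone F)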

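Images that support automatic installation

The following images support automatic installation of CUDA and the GPU driver:

Public images. The following versions are supported:
· CentOS 64-bit (all versions currently offered)
· Ubuntu 16.04 64-bit
· SUSE Linux Enterprise Server 12 SP2 64-bit

Image Marketplace. Obtain an image as follows:
· Search for NVIDIA and select the image you need. Only CentOS 7.3 is currently supported.
· If the GPU compute instance is for deep learning, you can choose an image with deep learning frameworks preinstalled. Search for "deep learning" and select the image you need. Only CentOS 7.3 is currently supported.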

Script versions

When the instance starts for the first time, cloud-init runs a shell script that installs CUDA and the GPU driver.

· If you selected Auto-install GPU Driver, the instance uses the nvidia_install_v1.0 version of the installation script. The CUDA and GPU driver versions currently available are:

| CUDA | GPU driver | Supported instance families |
| 9.1.85 | 390.46 | gn5, gn5i, gn6v, gn4 |
| 9.0.176 | 390.46, 384.125, 384.111 | gn5, gn5i, gn6v, gn4 |
| 8.0.61 | 390.46, 384.125, 384.111 | gn5, gn5i, gn4 |

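· If you configure the installation script in the instance user data, the nvidia_install_v2.0 version is recommended. For its content, see Installation script for instance user data. nvidia_install_v2.0 has the following advantages:
  · It provides the latest versions of CUDA, the GPU driver, and the cuDNN library.
  · When you log on to the instance while the driver is being installed, you see a progress bar. Once installation has finished, whether or not it succeeded, you see a result message (NVIDIA INSTALL OK or NVIDIA INSTALL FAIL).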

When you use nvidia_install_v2.0, edit the following parameters in the installation script to specify the GPU driver, CUDA, and cuDNN versions, for example:

driver_version="410.79"
cuda_version="9.0.176"
cudnn_version="7.4.2"

The CUDA, GPU driver, and cuDNN versions currently supported are:

| CUDA | GPU driver | cuDNN |
| 10.0.130 | 410.79 | 7.4.2, 7.3.1 |
| 9.2.148 | 410.79, 396.44 | 7.4.2, 7.3.1, 7.1.4 |
| 9.0.176 | 410.79, 396.44, 390.46 | 7.4.2, 7.3.1, 7.1.4, 7.0.5 |
| 8.0.61 | 410.79, 396.44, 390.46 | 7.1.3, 7.0.5 |
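Because an unsupported combination only fails several minutes into the installation, it can be worth validating the three version strings first. The sketch below hard-codes the nvidia_install_v2.0 version combinations listed above; the function name `is_supported_combo` is hypothetical and the check is not part of the published script.

```shell
# is_supported_combo CUDA_VERSION DRIVER_VERSION CUDNN_VERSION
# Succeeds if the combination appears in the supported-versions list.
is_supported_combo() {
    case "$1" in
        10.0.130) drivers="410.79";               cudnns="7.4.2 7.3.1" ;;
        9.2.148)  drivers="410.79 396.44";        cudnns="7.4.2 7.3.1 7.1.4" ;;
        9.0.176)  drivers="410.79 396.44 390.46"; cudnns="7.4.2 7.3.1 7.1.4 7.0.5" ;;
        8.0.61)   drivers="410.79 396.44 390.46"; cudnns="7.1.3 7.0.5" ;;
        *) return 1 ;;
    esac
    # Pad with spaces so that only whole version strings match.
    case " $drivers " in *" $2 "*) ;; *) return 1 ;; esac
    case " $cudnns " in *" $3 "*) ;; *) return 1 ;; esac
    return 0
}

if is_supported_combo 9.0.176 410.79 7.4.2; then
    echo "supported"
else
    echo "not supported"
fi
```

The example call uses the sample values from this section and prints "supported".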

Installation script for instance user data

When you install the driver through instance user data, the nvidia_install_v2.0 version of the installation script is recommended. Its content is as follows:

#!/bin/sh

# Versions to install. Edit these before use, for example
# driver_version="410.79", cuda_version="9.0.176", cudnn_version="7.4.2".
driver_version=$1
cuda_version=$2
cudnn_version=$3

NVIDIA_DIR="/root/nvidia"
log=${NVIDIA_DIR}"/nvidia_install.log"

# Process names used to detect which installation phase is running.
PROCESS_NAME="/var/lib/cloud/instance/scripts/part-001"
DRIVER_PROCESS_NAME=${NVIDIA_DIR}"/NVIDIA-Linux-x86_64"
CUDA_PROCESS_NAME=${NVIDIA_DIR}"/cuda"
CUDNN_PROCESS_NAME=${NVIDIA_DIR}"/cudnn"
DOWNLOAD_PROCESS_NAME="wget"

SUCCESS_STR="NVIDIA INSTALL OK"
DOWNLOAD_SUCCESS_STR="Download OK"
# The next three names are echoed by the install_* functions but are not
# defined in the published script; the values here are assumed.
DRIVER_SUCCESS_STR="Driver INSTALL OK"
CUDA_SUCCESS_STR="CUDA INSTALL OK"
CUDNN_SUCCESS_STR="CUDNN INSTALL OK"

DRIVER_FAIL_STR="Driver INSTALL FAIL"
CUDA_FAIL_STR="CUDA INSTALL FAIL"
CUDNN_FAIL_STR="CUDNN INSTALL FAIL"
DOWNLOAD_FAIL_STR="Download FAIL"

install_notes="The script automatically downloads and installs an NVIDIA GPU driver and CUDA/CUDNN library.
1. The installation takes 6 to 10 minutes, depending on the intranet bandwidth and the quantity of vCPU cores of the instance. Please do not operate the GPU or install any GPU-related software until the GPU driver is installed successfully.
2. After the GPU driver is installed successfully, the instance will restart automatically."

# Show a progress bar while the given installer ("NVIDIA", "cuda" or "cudnn")
# is still running. The bar advances one step every $t seconds.
check_install()
{
    b=''
    if [ "$1" = "NVIDIA" ]; then
        ProcessName=$DRIVER_PROCESS_NAME
        t=2
    elif [ "$1" = "cuda" ]; then
        ProcessName=$CUDA_PROCESS_NAME
        t=2.5
    elif [ "$1" = "cudnn" ]; then
        ProcessName=$CUDNN_PROCESS_NAME
        t=0.5
    fi
    i=0
    while true
    do
        pid_num=$(ps -ef | grep $ProcessName |grep -v grep | wc -l)
        if [ $pid_num -eq 0 ]; then
            # Installer has exited: draw the full bar and stop.
            str=$(printf "%-100s" "#")
            b=$(echo "$str" | sed 's/ /#/g')
            printf "| %-100s | %d%% \r\n" "$b" "100";
            break
        fi
        i=$(($i+1))
        str=$(printf "%-${i}s" "#")
        b=$(echo "$str" | sed 's/ /#/g')
        printf "| %-100s | %d%% \r" "$b" "$i";
        sleep $t
    done
    echo
    return 0
}

# Show download progress for the file whose name contains $1, using the
# percentage that wget writes to the install log.
check_download()
{
    name=$1
    i=0
    b=''
    filesize=0
    percent=0
    sleep 0.5
    while true
    do
        pid_num=$(ps -ef | grep wget |grep $name |grep -v grep | wc -l)
        if [ $pid_num -eq 0 ]; then
            # wget has exited: report the final size and a full bar.
            filesize=$(du -sk /root/nvidia/${name}* | awk '{print $1}')
            str=$(printf "%-100s" "#")
            b=$(echo "$str" | sed 's/ /#/g')
            printf "%-8s| %-100s | %d%% \r\n" "${filesize}K" "$b" "100";
            break
        fi
        line=$(tail -2 /root/nvidia/nvidia_install.log)
        filesize=$(echo $line | awk -F ' ' '{print $1}')
        percent=$(echo $line | awk -F '%' '{print $1}' | awk -F ' ' '{print $NF}')
        # Draw the bar only when a numeric percentage was parsed.
        if [ "$percent" -ge 0 ] 2>/dev/null ;then
            str=$(printf "%-${percent}s" "#")
            b=$(echo "$str" | sed 's/ /#/g')
            printf "%-8s| %-100s | %d%% \r" "${filesize}" "$b" "$percent";
        else
            continue
        fi
        sleep 0.5
    done
    return 0
}

# Inspect the install log. With "NVIDIA" it reports the overall result;
# with "DRIVER", "CUDA" or "CUDNN" it looks for that step's failure string.
check_install_log()
{
    if [ ! -f "$log" ];then
        echo "NVIDIA install log $log not exist! Install may fail!"
        echo
        exit 1
    fi
    if [ "$1" = "NVIDIA" ]; then
        succstr=$SUCCESS_STR
        str2=$(cat $log |grep "INSTALL_ERROR")
        echo
        if [ -n "$succstr" ] && [ -z "$str2" ]; then
            echo "$succstr !!"
            echo
            return 0
        else
            echo "NVIDIA install may have some INSTALL_ERROR, please check log $log !"
            return 1
        fi
    fi
    if [ "$1" = "DRIVER" ]; then
        failstr=$DRIVER_FAIL_STR
    elif [ "$1" = "CUDA" ]; then
        failstr=$CUDA_FAIL_STR
    elif [ "$1" = "CUDNN" ]; then
        failstr=$CUDNN_FAIL_STR
    fi
    str1=$(cat $log |grep "$failstr")
    if [ -n "$str1" ] ;then
        echo
        echo "NVIDIA $failstr ! please check install log $log !"
        return 1
    fi
}

# Run at login ("check" mode): follow the running installation phase by phase
# until the install script itself has exited.
check_install_process()
{
    echo "CHECKING NVIDIA INSTALL, PLEASE WAIT ......"
    echo "$install_notes"
    echo
    while true
    do
        pid_num=$(ps -ef | grep $PROCESS_NAME |grep -v grep | grep -v check | wc -l)
        if [ $pid_num -eq 0 ];then
            # The install script is no longer running: report the result.
            check_install_log "NVIDIA"
            return 0
        else
            pid_num=$(ps -ef | grep $DOWNLOAD_PROCESS_NAME |grep driver |grep -v grep | wc -l)
            if [ $pid_num -gt 0 ];then
                echo "Driver-${1} downloading, need 10 seconds. Remaining installation time 360 - 600 seconds!"
                check_download "NVIDIA"
            fi
            pid_num=$(ps -ef | grep $DOWNLOAD_PROCESS_NAME |grep cuda |grep -v grep | wc -l)
            if [ $pid_num -gt 0 ];then
                echo "CUDA-${2} downloading, need 150 or more seconds. Remaining installation time 350 - 590 seconds!"
                # CUDA may consist of several files; keep following until no
                # CUDA download is left.
                while true
                do
                    check_download "cuda"
                    sleep 1
                    pid_num=$(ps -ef | grep $DOWNLOAD_PROCESS_NAME |grep cuda |grep -v grep | wc -l)
                    if [ $pid_num -eq 0 ];then
                        break
                    fi
                done
            fi
            pid_num=$(ps -ef | grep $DOWNLOAD_PROCESS_NAME |grep cudnn |grep -v grep | wc -l)
            if [ $pid_num -gt 0 ];then
                echo "cuDNN-${3} downloading, need about 30 seconds. Remaining installation time 200 - 430 seconds!"
                check_download "cudnn"
            fi
            pid_num=$(ps -ef | grep $DRIVER_PROCESS_NAME |grep -v grep | wc -l)
            if [ $pid_num -gt 0 ];then
                echo
                echo "Driver-${1} installing, need 30 - 160 seconds. Remaining installation time 160 - 400 seconds!"
                check_install "NVIDIA"
                check_install_log "DRIVER"
            fi
            pid_num=$(ps -ef | grep $CUDA_PROCESS_NAME |grep -v grep | wc -l)
            if [ $pid_num -gt 0 ];then
                echo "CUDA-${2} installing, need 80 - 200 seconds. Remaining installation time 90 - 220 seconds!"
                check_install "cuda"
                check_install_log "CUDA"
            fi
            pid_num=$(ps -ef | grep $CUDNN_PROCESS_NAME |grep -v grep | wc -l)
            if [ $pid_num -gt 0 ];then
                echo "cuDNN-${3} installing, need 10 seconds. Installation will be successful soon, please wait......"
                check_install "cudnn"
                check_install_log "CUDNN"
            fi
        fi
        sleep 1
    done
}

# Write a yum repository file for the CUDA and driver packages, using the same
# mirror host as the CentOS base repository.
create_nvidia_repo_centos()
{
    # Yields a string of the form "baseurl=http://<mirror-host>".
    baseurl_centos=$(cat /etc/yum.repos.d/CentOS-Base.repo |grep baseurl | head -1 | awk -F'[/]' '{print $1"//"$3}')
    if [ -z "$baseurl_centos" ]; then
        # Fall back to the default mirror. The published script assigned this
        # value to an unused variable, so the fallback never took effect.
        baseurl_centos="baseurl=http://mirrors.cloud.aliyuncs.com"
    fi
    cudaurl=$baseurl_centos"/opsx/ecs/linux/rpm/cuda/${version}/\$basearch/"
    driverurl=$baseurl_centos"/opsx/ecs/linux/rpm/driver/${version}/\$basearch/"
    echo "[ecs-cuda]" > /etc/yum.repos.d/nvidia.repo
    echo "name=ecs cuda - \$basearch" >> /etc/yum.repos.d/nvidia.repo
    echo $cudaurl >> /etc/yum.repos.d/nvidia.repo
    echo "enabled=1" >> /etc/yum.repos.d/nvidia.repo
    echo "gpgcheck=0" >> /etc/yum.repos.d/nvidia.repo
    echo "[ecs-driver]" >> /etc/yum.repos.d/nvidia.repo
    echo "name=ecs driver - \$basearch" >> /etc/yum.repos.d/nvidia.repo
    echo $driverurl >> /etc/yum.repos.d/nvidia.repo
    echo "enabled=1" >> /etc/yum.repos.d/nvidia.repo
    echo "gpgcheck=0" >> /etc/yum.repos.d/nvidia.repo
    yum clean all >> $log 2>&1
    yum makecache >> $log 2>&1
}

# Blacklist the open-source nouveau driver, which conflicts with the NVIDIA
# driver, and rebuild the initramfs (CentOS).
disable_nouveau_centos()
{
    echo "blacklist nouveau" > /etc/modprobe.d/blacklist-nouveau.conf
    echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist-nouveau.conf
    echo "***exec \"dracut --force\" to regenerate the kernel initramfs"
    dracut --force
}

# Same as above for Ubuntu.
disable_nouveau_ubuntu()
{
    echo "blacklist nouveau" > /etc/modprobe.d/blacklist-nouveau.conf
    echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist-nouveau.conf
    echo "***exec \"update-initramfs -u\" to regenerate the kernel initramfs"
    update-initramfs -u
}

# Install the kernel development package that matches the running kernel; the
# driver installer needs it to build its kernel module (CentOS).
install_kernel_centos()
{
    kernel_version=$(uname -r)
    kernel_devel_num=$(rpm -qa | grep kernel-devel | grep $kernel_version | wc -l)
    if [ $kernel_devel_num -eq 0 ];then
        echo "******exec \"yum install -y kernel-devel-$kernel_version\""
        yum install -y kernel-devel-$kernel_version
        if [ $? -ne 0 ]; then
            echo "INSTALL_ERROR: install kernel-devel fail!!!"
            return 1
        fi
    fi
    return 0
}

# Same as above for SUSE.
install_kernel_suse()
{
    kernel_version=$(uname -r|awk -F'-' '{print $1"-"$2}')
    kernel_devel_num=$(rpm -qa | grep kernel-default-devel | wc -l)
    if [ $kernel_devel_num -eq 0 ];then
        echo "***exec \"zypper install -y kernel-default-devel=$kernel_version\""
        zypper install -y kernel-default-devel=$kernel_version
        if [ $? -ne 0 ]; then
            echo "error: install kernel-default-devel fail!!!"
            return 1
        fi
    fi
    return 0
}

# Same as above for Ubuntu (kernel headers).
install_kernel_ubuntu()
{
    kernel_version=$(uname -r)
    linux_headers_num=$(dpkg --list |grep linux-headers | grep $kernel_version | wc -l)
    if [ $linux_headers_num -eq 0 ];then
        echo "***exec \"apt-get install -y --allow-unauthenticated linux-headers-$kernel_version\""
        apt-get install -y --allow-unauthenticated linux-headers-$kernel_version
        if [ $? -ne 0 ]; then
            echo "error: install linux-headers fail!!!"
            return 1
        fi
    fi
    return 0
}

# Download the driver, every CUDA file for the requested version, and cuDNN
# into /root/nvidia.
download()
{
    download_url="${baseurl}/opsx/ecs/linux/binary/nvidia"
    wget ${download_url}/driver/${driver_file}
    rc=$?
    if [ $rc -ne 0 ]; then
        echo "INSTALL_ERROR: Download driver fail!!! return: $rc"
        return 1
    fi
    # Scrape the directory listing for the CUDA installer and patch files.
    cudafilelist=$(curl ${download_url}/cuda/${cuda_version}/ |grep "cuda_${cuda_version}" | awk -F '>' '{print $2}' | awk -F '<' '{print $1}')
    if [ -z "$cudafilelist" ]; then
        echo "INSTALL_ERROR: Download CUDA fail!!! get cuda-${cuda_version} filename fail!!"
        return 1
    fi
    mkdir /root/nvidia/cuda
    cd /root/nvidia/cuda
    echo $cudafilelist
    for cudafile in $cudafilelist
    do
        sleep 1
        wget ${download_url}/cuda/${cuda_version}/$cudafile
        rc=$?
        if [ $rc -ne 0 ]; then
            echo "INSTALL_ERROR: Download CUDA fail!!! wget $cudafile fail! return: $rc"
            return 1
        fi
    done
    chmod +x /root/nvidia/cuda/*
    cd /root/nvidia
    wget ${download_url}/cudnn/${cuda_big_version}/${cudnn_file}
    rc=$?
    if [ $rc -ne 0 ]; then
        echo "INSTALL_ERROR: Download cuDNN fail!!! return :$rc"
        return 1
    fi
    chmod +x /root/nvidia/*
    echo "$DOWNLOAD_SUCCESS_STR !"
    return 0
}

# Run the NVIDIA driver installer without prompts.
install_driver()
{
    /root/nvidia/$driver_file --silent
    if [ $? -ne 0 ]; then
        echo "INSTALL_ERROR: driver install fail!!!"
        return 1
    fi
    echo "$DRIVER_SUCCESS_STR !"
    return 0
}

# Install the CUDA toolkit and samples, then apply any patch files.
install_cuda()
{
    cd /root/nvidia/cuda
    # The largest matching file is taken to be the main installer.
    cuda_file=$(ls -S | grep cuda | grep $cuda_version | head -1)
    echo "cuda file: "$cuda_file
    if [ -z "$cuda_file" ]
    then
        echo "INSTALL_ERROR: cuda file is null, cuda install fail!!!"
        return 1
    fi
    /root/nvidia/cuda/$cuda_file --silent --toolkit --samples --samplespath=/root
    if [ $? -ne 0 ]; then
        echo "INSTALL_ERROR: cuda install fail!!!"
        return 1
    fi
    # Every other matching file is treated as a patch.
    cuda_patchfile=$(ls | grep cuda | grep $cuda_version | grep -v ${cuda_file})
    for cuda_patch in $cuda_patchfile
    do
        echo "install cuda patch file: "$cuda_patch
        /root/nvidia/cuda/$cuda_patch --silent --installdir=/usr/local/cuda --accept-eula
        if [ $? -ne 0 ]; then
            echo "INSTALL_ERROR: cuda patch install fail!!!"
            return 1
        fi
    done
    echo "$CUDA_SUCCESS_STR !"
    return 0
}

# Unpack the cuDNN archive under /usr/local.
install_cudnn()
{
    tar zxvf /root/nvidia/$cudnn_file -C /usr/local
    if [ $? -ne 0 ]; then
        echo "INSTALL_ERROR: CUDNN INSTALL FAIL !!!"
        return 1
    fi
    echo "$CUDNN_SUCCESS_STR !"
    return 0
}

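# Create a script that enables Persistence Mode and hook it into the boot
# script ($filename) so that the mode is re-enabled after every restart.
enable_pm()
{
    echo "#!/bin/bash" > /etc/init.d/enable_pm.sh
    echo "nvidia-smi -pm 1" >> /etc/init.d/enable_pm.sh
    echo "exit 0" >> /etc/init.d/enable_pm.sh
    chmod +x /etc/init.d/enable_pm.sh
    str=$(tail -1 $filename |grep "exit")
    if [ -z "$str" ]; then
        echo "/etc/init.d/enable_pm.sh" >> $filename
    else
        # The boot script ends with "exit": insert the call before that line.
        sed -i '$i\/etc/init.d/enable_pm.sh' $filename
    fi
    chmod +x $filename
}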

# Detect the distribution and choose the login profile and boot script paths.
issue=$(cat /etc/issue | grep Ubuntu)
if [ -n "$issue" ];then
    os="ubuntu"
    profile_file="/root/.profile"
    filename="/etc/rc.local"
else
    issue=$(cat /etc/issue | grep SUSE)
    if [ -n "$issue" ];then
        os="suse"
        filename="/etc/init.d/after.local"
    else
        os="centos"
        filename="/etc/rc.d/rc.local"
    fi
    profile_file="/root/.bash_profile"
fi

if [ "$1" = "check" ];then
    # Login-time check mode: show progress, then remove the login hook.
    check_install_process $driver_version $cuda_version $cudnn_version
    sed -i '/part-001 /d' $profile_file
    exit 0
else
    mkdir $NVIDIA_DIR
    echo "begin to install, driver: $driver_version, cuda: $cuda_version, cudnn: $cudnn_version " >> $log 2>&1
    driver_file="NVIDIA-Linux-x86_64-"${driver_version}".run"
    cuda_big_version=$(echo $cuda_version | awk -F'.' '{print $1"."$2}')
    cudnn_file="cudnn-"${cuda_big_version}"-linux-x64-v"${cudnn_version}".tgz"
    # Run this script in check mode at the next root login.
    echo "sh /var/lib/cloud/instance/scripts/part-001 check" | tee -a $profile_file
fi

echo "os:$os" >> $log 2>&1

# Disable nouveau and work out the mirror base URL for this distribution.
if [ "$os" = "ubuntu" ]; then
    disable_nouveau_ubuntu >> $log 2>&1
    if [ -f "/etc/apt/sources.list.d/sources-aliyun-0.list" ]; then
        repo_file="/etc/apt/sources.list.d/sources-aliyun-0.list"
    else
        repo_file="/etc/apt/sources.list"
    fi
    baseurl=$(cat $repo_file |grep "^deb" | head -1 | awk -F'[/]' '{print $1"//"$3}' |awk -F ' ' '{print $2}')
    if [ -z "$baseurl" ]; then
        baseurl="http://mirrors.cloud.aliyuncs.com"
    fi
elif [ "$os" = "suse" ]; then
    baseurl=$(cat /etc/zypp/repos.d/SLES12-SP2-0.repo |grep baseurl | head -1| awk -F'[=/]' '{print $2"//"$4}')
    if [ -z "$baseurl" ]; then
        baseurl="http://mirrors.cloud.aliyuncs.com"
    fi
elif [ "$os" = "centos" ]; then
    baseurl=$(cat /etc/yum.repos.d/CentOS-Base.repo |grep baseurl | head -1 | awk -F'[/]' '{print $1"//"$3}' |awk -F '=' '{print $2}')
    # The published script tested an unset variable here, which always
    # overwrote baseurl with the default mirror.
    if [ -z "$baseurl" ]; then
        baseurl="http://mirrors.cloud.aliyuncs.com"
    fi
    if [ ! -f "/usr/bin/lsb_release" ]; then
        pkgname=$(yum provides /usr/bin/lsb_release |grep centos|grep x86_64 |head -1 |awk -F: '{print $1}')
        if [ -z "$pkgname" ]; then
            echo "INSTALL_ERROR: /usr/bin/lsb_release pkg not exists!" >> $log 2>&1
            exit 1
        fi
        yum install -y $pkgname >> $log 2>&1
    fi
    if [ ! -f "/usr/bin/gcc" ]; then
        yum install -y gcc
    fi
    disable_nouveau_centos >> $log 2>&1
    # Major release number, used in the repository URLs.
    str=$(lsb_release -r | awk -F'[:.]' '{print $2}')
    version=$(echo $str | sed 's/ //g')
    create_nvidia_repo_centos
fi

install_kernel_${os} >> $log 2>&1
if [ $? -ne 0 ]; then
    echo "INSTALL_ERROR: kernel-devel install fail!!!" >> $log 2>&1
    exit 1
fi

cd /root/nvidia
begin_download=$(date '+%s')
download >> $log 2>&1
if [ $? -ne 0 ]; then
    exit 1
fi
end_download=$(date '+%s')
time_download=$((end_download-begin_download))
echo "NVIDIA download OK! Using time $time_download s !!" >> $log 2>&1

begin=$(date '+%s')
install_driver >> $log 2>&1
if [ $? -ne 0 ]; then
    exit 1
fi
end=$(date '+%s')
time_install=$((end-begin))
echo "NVIDIA install driver OK! Using time $time_install s !!" >> $log 2>&1

begin=$(date '+%s')
install_cuda >> $log 2>&1
if [ $? -ne 0 ]; then
    exit 1
fi
end=$(date '+%s')
time_install=$((end-begin))
echo "NVIDIA install cuda OK! Using time $time_install s !!" >> $log 2>&1

begin=$(date '+%s')
install_cudnn >> $log 2>&1
if [ $? -ne 0 ]; then
    exit 1
fi
end=$(date '+%s')
time_install=$((end-begin))
echo "NVIDIA install cudnn OK! Using time $time_install s !!" >> $log 2>&1

# Enable Persistence Mode at boot, then restart so the driver takes effect.
enable_pm
echo "reboot......" >> $log 2>&1
sleep 2
reboot
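The names of the files the script downloads are derived entirely from the three version strings. The sketch below isolates that derivation using the same patterns as the script; the three function names are hypothetical wrappers added here for illustration.

```shell
# driver_file_name DRIVER_VERSION -> driver installer file name
driver_file_name() {
    echo "NVIDIA-Linux-x86_64-$1.run"
}

# cuda_big_version CUDA_VERSION -> major.minor part, e.g. 9.0.176 -> 9.0
cuda_big_version() {
    echo "$1" | awk -F'.' '{print $1"."$2}'
}

# cudnn_file_name CUDA_VERSION CUDNN_VERSION -> cuDNN archive file name
cudnn_file_name() {
    echo "cudnn-$(cuda_big_version "$1")-linux-x64-v$2.tgz"
}

driver_file_name 410.79
cudnn_file_name 9.0.176 7.4.2
```

With the sample versions from this article, the two calls print NVIDIA-Linux-x86_64-410.79.run and cudnn-9.0-linux-x64-v7.4.2.tgz. Checking these names against the mirror before launching an instance is one way to catch a mistyped version early.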
