JetsonNano Docker and JupyterLab
tags: docker Jupyter JetsonNano
'''
This tutorial is based on / translated from NVIDIA's free online course (in English):
https://courses.nvidia.com/courses/course-v1:DLI+S-RX-02+V2/about
'''
Glossary:
- DLI: short for Deep Learning Institute; it provides course materials as free downloads for university and college faculty
- NGC: short for NVIDIA GPU CLOUD; it offers free downloads of GPU-optimized software for deep learning, machine learning, and HPC, including a rich catalog of docker images
USB IP (when connected over micro-USB): 192.168.55.1
Download Docker And Start JupyterLab
Quick-use version:
- Connect to the Jetson Nano over SSH in a terminal
- Write a command script docker_jupyter_run.sh:
 (it still uses the course's dli image, but without mounting the camera; see the sketch after this list)
 sudo docker run --runtime nvidia -it --rm --network host \
 --volume ~/nvdli-data:/nvdli-nano/data \
 nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.5.0
- Key points:
- Start it: ./docker_jupyter_run.sh
- IP and port: <Jetson Nano IP>:8888
- Jupyter password: dlinano
- File storage location: ~/nvdli-data
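A minimal sketch of the complete docker_jupyter_run.sh, assuming the same image tag (v2.0.1-r32.5.0) and data folder as above; the trailing backslashes keep everything as one docker run command:
 #!/bin/bash
 # docker_jupyter_run.sh -- start the DLI JupyterLab container without the camera
 # Assumes ~/nvdli-data already exists (mkdir -p ~/nvdli-data)
 sudo docker run --runtime nvidia -it --rm --network host \
   --volume ~/nvdli-data:/nvdli-nano/data \
   nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.5.0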
 
Beginner tutorial version:
- First, connect to the Jetson Nano over SSH. On Windows 10, open PowerShell:
 ssh <user>@<Jetson Nano IP>
- Create a project folder (used to keep the files changed inside the docker container): nvdli-data
 mkdir -p ~/nvdli-data
- Write a docker_dli_run.sh file (fill in your L4T version at the end of the image tag):
 echo "sudo docker run --runtime nvidia -it --rm --network host \
 --volume ~/nvdli-data:/nvdli-nano/data \
 --device /dev/video0 \
 nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-<L4T version>" > docker_dli_run.sh
- L4T_version: r32.5.0
- How to find the L4T version
- [!!!] Simply run: jetson_release
- Or check the Jetson Nano version with the command below (see the small sketch after this block for extracting it automatically): cat /etc/nv_tegra_release
 Current output: # R32 (release), REVISION: 5.1, GCID: 27362550, BOARD: t210ref, EABI: aarch64, DATE: Wed May 19 18:07:59 UTC 2021
 which means the version is r32.5.1
- Check which docker image tags are offered on NGC: https://ngc.nvidia.com/catalog/containers/nvidia:dli:dli-nano-ai
- The newest matching tag seemed to be only r32.5.0, so I just tried installing it anyway (at first I installed r32.4.4 by mistake and it could not run)
- Result: fortunately it runs fine
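A small sketch for turning the /etc/nv_tegra_release line above into an rXX.Y version string automatically (the sed pattern assumes the output format shown above):
 # Prints e.g. "r32.5.1" by combining the release and revision fields
 head -n 1 /etc/nv_tegra_release | \
   sed -E 's/^# R([0-9]+) \(release\), REVISION: ([0-9.]+).*/r\1.\2/'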
 
 
 
- Give docker_dli_run.sh execute permission:
 chmod +x docker_dli_run.sh
- From now on, just run the following command to start this docker container:
 ./docker_dli_run.sh
- Logging Into The JupyterLab Server - Open the following link address:
 In a browser on the remote machine, enter: <Jetson Nano IP>:8888
 If the laptop is connected over USB, the fixed IP is: 192.168.55.1:8888
- The JupyterLab server running on the Jetson Nano will open up with a login prompt the first time.
- Enter the password: dlinano
- You will see this screen. Congratulations!
 
old-one: never reboot while connected over ssh; it causes scary problems such as the ssh public key no longer matching. VNC is turned off (it seemed to block all connections)
old user: kuihao
password: same as laptop (force change password: sudo passwd 
there are experiment records inside docker
jupyter installed, password = same as laptop
new-one: kuihao
password: same as laptop
Freshly set up; basic required packages installed, cuda configured, swap configured (the swap probably failed and should be redone following the official instructions)
- Adjusting the power mode (see the sketch after this block)
- Lock the clocks so the board stays within its power budget
 sudo jetson_clocks
- Show the current mode
 sudo nvpmodel -q
- check the current performance mode, issue:
 $ sudo nvpmodel -q --verbose
- The default is the high-performance MAXN mode (10W) (this power level needs a DC 5V 4A supply, otherwise the board can shut down abruptly)
 sudo nvpmodel -m 0
- Switch to the 5W mode (Micro-USB power)
 sudo nvpmodel -m 1
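A small sketch combining the steps above: pick a power mode first, then pin the clocks for that mode (mode IDs as listed above: 0 = MAXN/10W, 1 = 5W):
 # Switch to the 5W budget (safe on Micro-USB power), then lock the clocks
 sudo nvpmodel -m 1
 sudo jetson_clocks
 # Confirm which mode is active
 sudo nvpmodel -q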
 
- Adjusting the swap space (a combined sketch follows this block)
- Check whether the system already has swap configured with the "swapon -s" command
- check your memory and swap values with this command:
 free -m
- If you don't have the right amount of swap, or want to change the value, use the following procedure to do so (from a terminal):
- According to another user's testing, setting it to 8G makes the board much less likely to freeze
- Disable ZRAM:
 sudo systemctl disable nvzramconfig
- Create 8GB swap file
 sudo fallocate -l 8G /mnt/8GB.swap
 sudo chmod 600 /mnt/8GB.swap
 sudo mkswap /mnt/8GB.swap
- Append the following line to /etc/fstab
 [may fail]: sudo echo "/mnt/8GB.swap swap swap defaults 0 0" >> /etc/fstab
 (this fails because the >> redirection runs as the normal user rather than as root)
 [if it fails, edit the file directly instead]: sudo nano /etc/fstab
- add or modify the existing swap entry so it reads: /mnt/8GB.swap swap swap defaults 0 0
- ctrl+s to save
- ctrl+x to exit
- REBOOT! (the new setting only takes effect after a restart)
- Check whether the change succeeded:
 free -m
 You should see an extra line: Swap: total 8191
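A combined sketch of the whole procedure above, using tee -a so the append to /etc/fstab runs with root privileges (same paths and size as above):
 # Disable the default ZRAM swap
 sudo systemctl disable nvzramconfig
 # Create and register an 8 GB swap file
 sudo fallocate -l 8G /mnt/8GB.swap
 sudo chmod 600 /mnt/8GB.swap
 sudo mkswap /mnt/8GB.swap
 # Append the fstab entry as root (avoids the sudo-echo redirection pitfall)
 echo "/mnt/8GB.swap swap swap defaults 0 0" | sudo tee -a /etc/fstab
 # Reboot, then verify with: free -m
 sudo reboot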
 
 
docker image:
* image: L4T base image, for testing docker on the jetson(?)
    * https://ngc.nvidia.com/catalog/containers/nvidia:l4t-base
    # On the host: allow X11 access and start the container with the display forwarded
    xhost +
    sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.4.3
    # Inside the container: install build tools, then build and run the CUDA nbody sample
    root@nano:/# apt-get update && apt-get install -y --no-install-recommends make g++
    root@nano:/# cp -r /usr/local/cuda/samples /tmp
    root@nano:/# cd /tmp/samples/5_Simulations/nbody
    root@nano:/# make
    root@nano:/# ./nbody
* Tensorflow (failed: this image is not built for ARM hardware):
    * https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow
        docker pull nvcr.io/nvidia/tensorflow:21.06-tf1-py3
    * Run the container image.
        sudo docker run --gpus all -it --rm -v /home/kuihao/K/DockerData/TensorflowImage:/K/DockerData nvcr.io/nvidia/tensorflow:21.06-tf1-py3
        
        - `-it` means run in interactive mode (drops you into bash inside the container as a user)
        - `--rm` will delete the container when finished
        - `-v` is the mounting directory
        - `local_dir` is the directory or file from your host system (absolute path) that you want to access from inside your container.  For example, the `local_dir` in the following path is `/home/jsmith/data/mnist`.  
        -v /home/jsmith/data/mnist:/data/mnist
        
        If you are inside the container, for example, `ls /data/mnist`, you will see the same files as if you issued the `ls /home/jsmith/data/mnist` command from outside the container.
        
        - `container_dir` is the target directory when you are inside your container.  For example, `/data/mnist` is the target directory in the example:
        
        -v /home/jsmith/data/mnist:/data/mnist
        
        - `xx.xx` is the container version. For example, `20.01`.
        - `tfx` is the version of TensorFlow. For example, `tf1` or `tf2`.
    * TensorFlow is run by importing it as a Python module:
        $ python
        >>> import tensorflow as tf
        >>> print(tf.__version__)
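        To also confirm that the GPU is visible from inside the container, one more line can be added in the same Python session (tf.test.is_gpu_available() exists in both TF 1.x and 2.x, though it is deprecated in 2.x):
        >>> # Should print True when the NVIDIA runtime exposes the GPU to TensorFlow
        >>> print(tf.test.is_gpu_available())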
* L4T build of Tensorflow (for Jetson):
    * https://ngc.nvidia.com/catalog/containers/nvidia:l4t-tensorflow
        docker pull nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf2.3-py3
    * Running the Container:
        (X) sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf1.15-py3
    * Running the Container && Mounting Directories from the Host Device:
        * Tensorflow 1.15:
            sudo docker run -it --rm --runtime nvidia --network host -v /home/kuihao/K/DockerData/TF1Image:/K/DockerData nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf1.15-py3
        * Tensorflow 2.3:
            sudo docker run -it --rm --runtime nvidia --network host -v /home/kuihao/K/DockerData/TF2Image:/K/DockerData nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf2.3-py3
* docker container commands:
    * https://ithelp.ithome.com.tw/articles/10191634
    * --restart=always: if the container is stopped unexpectedly, for example by a reboot, docker will try to restart it automatically
    * --name=<name>: set the container's name to <name>
    * [Reusing the same container day to day] (see the lifecycle sketch after this block)
        * Create it:
        sudo docker run -it --runtime nvidia --restart=always --network host -v /home/kuihao/K/DockerData/TF2Image:/K/DockerData --name contain_TF2 nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf2.3-py3
        * Enter it later:
        sudo docker exec -it contain_TF2 bash
        * Delete the container:
        sudo docker rm <container ID or name>
    * More docker commands:
        * https://docs.docker.com/cloud/aci-integration/
        * https://ithelp.ithome.com.tw/articles/10191727
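    A small sketch of the surrounding lifecycle commands for that reusable container (standard docker CLI; contain_TF2 is the name chosen above):
        # List all containers, including stopped ones
        sudo docker ps -a
        # Start the container again if it is not running
        sudo docker start contain_TF2
        # Attach an interactive shell, as above
        sudo docker exec -it contain_TF2 bash
        # Stop it when finished (it is kept because it was created without --rm)
        sudo docker stop contain_TF2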
- Get the absolute path of a file:
 readlink -f file.txt
- Fan control (this can be made to start automatically at boot; a sketch follows this block): - https://blog.cavedu.com/2019/10/04/nvidia-jetson-nano-fan/
- Set a custom fan speed from 0~255; setting it to 0 turns the fan off:
 sudo sh -c 'echo 100 > /sys/devices/pwm-fan/target_pwm'
- A better option (already installed): https://github.com/Pyrestone/jetson-fan-ctl
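A minimal sketch of the "start at boot" idea above using root's crontab (assumes the same target_pwm path; the jetson-fan-ctl project linked above is the fuller solution):
 # Append an @reboot entry to root's crontab so the fan speed is set on every boot
 ( sudo crontab -l 2>/dev/null; echo '@reboot /bin/sh -c "echo 100 > /sys/devices/pwm-fan/target_pwm"' ) | sudo crontab -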
 
- Monitoring the jetson nano temperature:
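Two quick sketches for this (both are standard on Jetson/L4T; tegrastats prints a rolling status line that includes temperatures):
 # Live stats (RAM, CPU/GPU load, and temperatures), refreshed about once per second
 sudo tegrastats
 # Or read the kernel thermal zones directly (values are in millidegrees Celsius)
 cat /sys/class/thermal/thermal_zone*/type
 cat /sys/class/thermal/thermal_zone*/temp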