{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "VoiceChangerDemo", "provenance": [], "authorship_tag": "ABX9TyNpx+wMQQKflNKso1Aructr", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" }, "accelerator": "GPU", "gpuClass": "standard" }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "source": [ "MMVCプレイヤー(普通版)\n", "---\n", "\n", "このノートはColab上でMMVCのボイチェンを行うノートです。\n", "\n", "正式版はローカルPCのDocker上で動かすアプリケーションです。\n", "\n", "正式版は、多くの場合より少ないタイムラグで滑らかに音声を変換できます。\n", "\n", "詳細な使用方法はこちらの[リポジトリ](https://github.com/w-okada/voice-changer)からご確認ください。\n" ], "metadata": { "id": "Lbbmx_Vjl0zo" } }, { "cell_type": "markdown", "source": [ "# GPUを確認\n", "GPUを用いたほうが高速に処理が行えます。\n", "\n", "下記のコマンドでGPUが確認できない場合は、上のメニューから\n", "\n", "「ランタイム」→「ランタイムの変更」→「ハードウェア アクセラレータ」\n", "\n", "でGPUを選択してください。" ], "metadata": { "id": "oUKi1NYMmXrr" } }, { "cell_type": "code", "source": [ "# (1) GPUの確認\n", "!nvidia-smi" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "vV1t7PBRm-o6", "outputId": "76db7bf4-f28a-4049-833c-a9f17c1b5472" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Sat Dec 3 10:47:24 2022 \n", "+-----------------------------------------------------------------------------+\n", "| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |\n", "|-------------------------------+----------------------+----------------------+\n", "| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", "| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n", "| | | MIG M. |\n", "|===============================+======================+======================|\n", "| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |\n", "| N/A 51C P0 25W / 70W | 0MiB / 15109MiB | 0% Default |\n", "| | | N/A |\n", "+-------------------------------+----------------------+----------------------+\n", " \n", "+-----------------------------------------------------------------------------+\n", "| Processes: |\n", "| GPU GI CI PID Type Process name GPU Memory |\n", "| ID ID Usage |\n", "|=============================================================================|\n", "| No running processes found |\n", "+-----------------------------------------------------------------------------+\n" ] } ] }, { "cell_type": "markdown", "source": [ "# 使用するモデルとコンフィグファイルの指定\n", "\n", "使用するトレーニング済みのモデルと、トレーニングで使用したコンフィグファイルのパスを指定してください。\n", "\n", "多くの場合はGoogle Driveに格納されているファイルを使用すると思います。その場合は、下の(2-2)のセルを実行してドライブをマウントしてください" ], "metadata": { "id": "mHvGrgaWnIPA" } }, { "cell_type": "code", "source": [ "# (2-1) 使用するモデルとコンフィグファイルの指定\n", "CONFIG=\"/content/drive/MyDrive/VoiceChanger/config.json\"\n", "MODEL=\"/content/drive/MyDrive/VoiceChanger/G_326000.pth\"" ], "metadata": { "id": "nSXATMWYb4Ik" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "2wxD-gRSMU5R", "outputId": "ff4560ce-c04a-4fe8-bec3-3b93dfdd77ee" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Mounted at /content/drive\n" ] } ], "source": [ "# (2-2) Google Driveのマウント\n", "from google.colab import drive\n", "drive.mount('/content/drive')" ] }, { "cell_type": "markdown", "source": [ "# リポジトリのクローン\n", "リポジトリをクローンします" ], "metadata": { "id": "sLBfykjBnjWc" } }, { "cell_type": "code", "source": [ "# (3) リポジトリのクローン\n", "!git clone --depth 1 https://github.com/isletennos/MMVC_Trainer.git -b v1.3.1.5 /MMVC_Trainer\n", "!git clone --depth 1 https://github.com/w-okada/voice-changer.git\n", "%cd /MMVC_Trainer/monotonic_align \n", "!cythonize -3 -i core.pyx &> /dev/null\n", "!mv core.cpython-*.so monotonic_align &> /dev/null\n", "%cd /content/voice-changer/demo/\n" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "86wTFmqsNMnD", "outputId": "35bad20a-b0a8-44c9-a072-10e11a199c50" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Cloning into '/MMVC_Trainer'...\n", "remote: Enumerating objects: 917, done.\u001b[K\n", "remote: Counting objects: 100% (917/917), done.\u001b[K\n", "remote: Compressing objects: 100% (828/828), done.\u001b[K\n", "remote: Total 917 (delta 3), reused 889 (delta 0), pack-reused 0\u001b[K\n", "Receiving objects: 100% (917/917), 53.02 MiB | 20.41 MiB/s, done.\n", "Resolving deltas: 100% (3/3), done.\n", "Cloning into 'voice-changer'...\n", "remote: Enumerating objects: 108, done.\u001b[K\n", "remote: Counting objects: 100% (108/108), done.\u001b[K\n", "remote: Compressing objects: 100% (94/94), done.\u001b[K\n", "remote: Total 108 (delta 15), reused 75 (delta 3), pack-reused 0\u001b[K\n", "Receiving objects: 100% (108/108), 18.79 MiB | 15.05 MiB/s, done.\n", "Resolving deltas: 100% (15/15), done.\n", "/MMVC_Trainer/monotonic_align\n", "/content/voice-changer/demo\n" ] } ] }, { "cell_type": "markdown", "source": [ "# ファイルの配置\n", "アプリケーションの挙動を記した設定ファイルをコピーします(4-1)。(4-2)はコピーした設定ファイルを表示しています。もしかしたらうまく動かないときに役立つかもしれません。" ], "metadata": { "id": "jmDY8W_fnuSi" } }, { "cell_type": "code", "source": [ "# (4-1) 設定ファイルの配置\n", "!cp ../template/setting_mmvc_colab.json ../frontend/dist/assets/setting.json\n" ], "metadata": { "id": "Bn4kV8TgXp8i" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# (4-2) 設定ファイルの確認\n", "!cat ../frontend/dist/assets/setting.json\n" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "pjxPsOOaXXTj", "outputId": "0aaa53de-2a45-45c1-b0e6-cc6de98873fe" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "{\n", " \"app_title\": \"voice-changer\",\n", " \"majar_mode\": \"colab\",\n", " \"voice_changer_server_url\": \"/test\",\n", " \"sample_rate\": 48000,\n", " \"buffer_size\": 1024,\n", " \"prefix_chunk_size\": 48,\n", " \"chunk_size\": 48,\n", " \"speakers\": [\n", " {\n", " \"id\": 100,\n", " \"name\": \"ずんだもん\"\n", " },\n", " {\n", " \"id\": 107,\n", " \"name\": \"user\"\n", " },\n", " {\n", " \"id\": 101,\n", " \"name\": \"そら\"\n", " },\n", " {\n", " \"id\": 102,\n", " \"name\": \"めたん\"\n", " },\n", " {\n", " \"id\": 103,\n", " \"name\": \"つむぎ\"\n", " }\n", " ],\n", " \"src_id\": 107,\n", " \"dst_id\": 100,\n", " \"vf_enable\": true,\n", " \"voice_changer_mode\": \"realtime\",\n", " \"gpu\": 0,\n", " \"available_gpus\": [-1, 0, 1, 2, 3, 4],\n", " \"screen\": {\n", " \"enable_screen\": true,\n", " \"backgournd_image_url\": \"./assets/images/bg_natural_sougen.jpg\"\n", " },\n", " \"avatar\": {\n", " \"enable_avatar\": false,\n", " \"motion_capture_face\": false,\n", " \"motion_capture_upperbody\": false,\n", " \"lip_overwrite_with_voice\": false,\n", " \"avatar_url\": \"./assets/vrm/zundamon/zundamon.vrm\",\n", " \"background_color\": \"#0000dd\",\n", " \"chroma_key\": \"#0000dd\",\n", " \"avatar_canvas_size\": [1280, 720],\n", " \"screen_canvas_size\": [1280, 720]\n", " },\n", " \"advance\": {\n", " \"avatar_draw_skip_rate\": 3,\n", " \"screen_draw_skip_rate\": 3,\n", " \"visualizer_draw_skip_rate\": 3,\n", " \"cross_fade_lower_value\": 0.1,\n", " \"cross_fade_offset_rate\": 0.3,\n", " \"cross_fade_end_rate\": 0.6,\n", " \"cross_fade_type\": 2\n", " },\n", " \"transcribe\": {\n", " \"lang\": \"日本語(ja-JP)\",\n", " \"expire_time\": 5\n", " }\n", "}\n" ] } ] }, { "cell_type": "markdown", "source": [ "# モジュールのインストール\n", "\n", "必要なモジュールをインストールします。" ], "metadata": { "id": "8Na2PbLZSWgZ" } }, { "cell_type": "code", "source": [ "# (5) 設定ファイルの確認\n", "!apt-get install -y espeak libsndfile1-dev &> /dev/null\n", "!pip install unidecode &> /dev/null\n", "!pip install phonemizer &> /dev/null\n", "!pip install retry &> /dev/null\n", "!pip install python-socketio &> /dev/null\n", "!pip install fastapi &> /dev/null\n", "!pip install python-multipart &> /dev/null\n", "!pip install uvicorn &> /dev/null\n", "!pip install websockets &> /dev/null\n", "!pip install pyOpenSSL &> /dev/null" ], "metadata": { "id": "LwZAAuqxX7yY" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "# サーバの起動\n", "\n", "サーバを起動します。(6-1)\n", "\n", "サーバの起動状況を確認します。(6-2) \n", "\n", "このセルは繰り返し実行することになるのでCtrl+Retでセルを実行してください。\n", "\n", "アクセスできるようになるまで、1~2分かかるようです。コーヒーでも飲みに行きましょう。\n", "\n", "下記のようなテキストが表示されたら起動完了です。\n", "\n", "**`INFO:root:Loaded checkpoint ...`**\n", "\n", "```\n", "INFO:root:Loaded checkpoint '/content/drive/MyDrive/VoiceChanger/G_326000.pth' (iteration 1136)\n", "VoiceChanger Initialized (GPU_NUM:1)\n", " PHASE1:__main__\n", "Start MMVC SocketIO Server\n", " CONFIG:/content/drive/MyDrive/VoiceChanger/config.json, MODEL:/content/drive/MyDrive/VoiceChanger/G_326000.pth\n", "DEBUG:asyncio:Using selector: EpollSelector\n", " Phase name:MMVCServerSIO\n", " PHASE3:MMVCServerSIO\n", "INFO:root:Loaded checkpoint '/content/drive/MyDrive/VoiceChanger/G_326000.pth' (iteration 1136)\n", "```\n", "\n" ], "metadata": { "id": "-_2OcN9Borke" } }, { "cell_type": "code", "source": [ "# (6-1) サーバの起動\n", "import random\n", "PORT = 10000 + random.randint(1, 9999)\n", "LOG_FILE = f\"LOG_FILE_{PORT}\"\n", "\n", "get_ipython().system_raw(f'python3 MMVCServerSIO.py -t MMVC -p {PORT} -c {CONFIG} -m {MODEL} --colab True >{LOG_FILE} 2>&1 &')\n", "#print(f\"PORT:{PORT}, LOG_FILE:{LOG_FILE}\")" ], "metadata": { "id": "iNOAB7zISI6J" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# (6-2) サーバの起動確認 (Ctrl+Retで実行)\n", "!tail -20 {LOG_FILE}" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "chu06KpAjEK6", "outputId": "c1f8ca20-23e3-42c0-9143-b44498e84bd2" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\u001b[32m Phase name:__main__\u001b[0m\n", "\u001b[32m PHASE3:__main__\u001b[0m\n", "INFO:root:Loaded checkpoint '/content/drive/MyDrive/VoiceChanger/G_326000.pth' (iteration 1136)\n", "VoiceChanger Initialized (GPU_NUM:1)\n", "\u001b[32m PHASE1:__main__\u001b[0m\n", "\u001b[17mStart MMVC SocketIO Server\u001b[0m\n", "\u001b[34m CONFIG:/content/drive/MyDrive/VoiceChanger/config.json, MODEL:/content/drive/MyDrive/VoiceChanger/G_326000.pth\u001b[0m\n", "DEBUG:asyncio:Using selector: EpollSelector\n", "\u001b[32m Phase name:MMVCServerSIO\u001b[0m\n", "\u001b[32m PHASE3:MMVCServerSIO\u001b[0m\n", "INFO:root:Loaded checkpoint '/content/drive/MyDrive/VoiceChanger/G_326000.pth' (iteration 1136)\n" ] } ] }, { "cell_type": "markdown", "source": [ "# プロキシを起動\n", "ウェブサーバへのアクセスをするためのプロキシを起動します。\n", "\n", "表示されたURLをクリックして開くと別タブでアプリが開きます。\n", "\n", "Colabなので、ロードにある程度時間がかかります(30秒くらい)。" ], "metadata": { "id": "WhxcFLQEpctq" } }, { "cell_type": "code", "source": [ "# (7) プロキシを起動\n", "from google.colab.output import eval_js\n", "proxy = eval_js( \"google.colab.kernel.proxyPort(\" + str(PORT) + \")\" )\n", "print(f\"{proxy}front/\")" ], "metadata": { "id": "nkRjZm95l87C", "colab": { "base_uri": "https://localhost:8080/", "height": 34 }, "outputId": "96886f3d-3f88-4392-df41-90514fe4b0bc" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "https://3lddbr773py-496ff2e9c6d22116-11741-colab.googleusercontent.com/front/\n" ] } ] }, { "cell_type": "code", "source": [], "metadata": { "id": "Jos5WZHGmz4s" }, "execution_count": null, "outputs": [] } ] }