{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "provenance": [], "collapsed_sections": [], "authorship_tag": "ABX9TyNuz5ToQB/hiwJTFCBOyGT/", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" }, "gpuClass": "standard" }, "cells": [ { "cell_type": "markdown", "source": [ "Voice Recorder\n", "---\n", "\n", "This notebook launches \"Voice Recorder\", an app for recording the audio used to train MMVC, from Colab.\n", "\n", "Recorded audio can be uploaded to Google Drive through this notebook.\n", "\n", "As with the original Voice Recorder, you can also download the recordings to your local PC.\n", "\n", "After each recording, the browser exchanges data with the server on Colab, so updates appear with a slight delay.\n", "\n", "If you plan to train on your own PC, there is almost no benefit to recording through the Colab server, so use the [Voice Recorder hosted on GitHub](https://w-okada.github.io/voice-changer/) for a smoother recording experience.\n", "\n", "\n", "For more details, see the [repository](https://github.com/w-okada/voice-changer).\n" ], "metadata": { "id": "Lbbmx_Vjl0zo" } }, { "cell_type": "markdown", "source": [ "# Specify the folders for recorded data\n", "\n", "You need to specify the following two folders.\n", "1. The folder that stores cache data for the recording app\n", "2. 
The folder that stores the training data\n", "\n", "Recorded data is normally stored in a folder on Google Drive.\n", "\n", "First, run (1-1) to mount the drive.\n", "\n", "Then specify the two folders above in (1-2)." ], "metadata": { "id": "mHvGrgaWnIPA" } }, { "cell_type": "code", "source": [ "# (1-1) Mount Google Drive\n", "from google.colab import drive\n", "drive.mount('/content/drive')" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Eihm8H2X-7wm", "outputId": "e51016e6-7f6e-4b95-8822-a4713017a6a6" }, "execution_count": 1, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Mounted at /content/drive\n" ] } ] }, { "cell_type": "code", "source": [ "# (1-2) Specify the recorder cache folder and the training data folder\n", "RECORDER_DATA_DIR=\"/content/drive/MyDrive/VoiceChanger/voice_data\"\n", "MMVC_DATA_DIR=\"/content/drive/MyDrive/VoiceChanger/dataset\"\n" ], "metadata": { "id": "nSXATMWYb4Ik" }, "execution_count": 2, "outputs": [] }, { "cell_type": "markdown", "source": [ "# Clone the repository\n", "Clone the voice-changer repository and move into its docs directory." 
], "metadata": { "id": "sLBfykjBnjWc" } }, { "cell_type": "code", "source": [ "# (2) Clone the repository\n", "!git clone https://github.com/w-okada/voice-changer.git\n", "%cd voice-changer/docs/\n" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "86wTFmqsNMnD", "outputId": "63e02151-2e55-49f3-8219-ba16cbb28233" }, "execution_count": 3, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Cloning into 'voice-changer'...\n", "remote: Enumerating objects: 499, done.\u001b[K\n", "remote: Counting objects: 100% (83/83), done.\u001b[K\n", "remote: Compressing objects: 100% (65/65), done.\u001b[K\n", "remote: Total 499 (delta 26), reused 30 (delta 18), pack-reused 416\u001b[K\n", "Receiving objects: 100% (499/499), 21.10 MiB | 13.43 MiB/s, done.\n", "Resolving deltas: 100% (253/253), done.\n", "/content/voice-changer/docs\n" ] } ] }, { "cell_type": "markdown", "source": [ "# Place the files\n", "Copy the settings file that describes the application's behavior (3-1). Cell (3-2) prints the copied settings file, which may help with troubleshooting if something does not work." 
], "metadata": { 
"id": "jmDY8W_fnuSi" } }, { "cell_type": "code", "source": [ "# (3-1) 設定ファイルのコピー\n", "!cp ../template/setting_recorder_colab.json assets/setting.json" ], "metadata": { "id": "ow88ZaubluOJ" }, "execution_count": 4, "outputs": [] }, { "cell_type": "code", "source": [ "# (3-2) 設定ファイルの内容確認\n", "\n", "!cat assets/setting.json" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "rpWUobjlBCNF", "outputId": "0dd8bbc1-dd1e-47fe-fef6-fbc22540dc7a" }, "execution_count": 5, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "{\n", " \"app_title\": \"voice-recorder\",\n", " \"storage_type\":\"server\",\n", " \"use_mel_spectrogram\":true,\n", " \"text\": [\n", " {\n", " \"title\": \"ITA-emotion\",\n", " \"wavPrefix\": \"emotion\",\n", " \"file\": \"./assets/text/ITA_emotion_all.txt\",\n", " \"file_hira\": \"./assets/text/ITA_emotion_all_hira.txt\"\n", " },\n", " {\n", " \"title\": \"ITA-recitation\",\n", " \"wavPrefix\": \"recitation\",\n", " \"file\": \"./assets/text/ITA_recitation_all.txt\",\n", " \"file_hira\": \"./assets/text/ITA_recitation_all_hira.txt\"\n", " },\n", " {\n", " \"title\": \"wagahaiwa\",\n", " \"wavPrefix\": \"wagahaiwa\",\n", " \"file\": \"./assets/text/wagahaiwa.txt\",\n", " \"file_hira\": \"./assets/text/wagahaiwa_hira.txt\"\n", " }\n", " ]\n", "}\n" ] } ] }, { "cell_type": "markdown", "source": [ "# モジュールのインストール\n", "\n", "必要なモジュールをインストールします。" ], "metadata": { "id": "8Na2PbLZSWgZ" } }, { "cell_type": "code", "source": [ "# (4) 設定ファイルの確認\n", "!pip install flask\n", "!pip install flask_cors\n" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "LwZAAuqxX7yY", "outputId": "627e09e8-bc64-4110-ce0a-5b3f84e8bf1d" }, "execution_count": 6, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Requirement already satisfied: flask in /usr/local/lib/python3.7/dist-packages (1.1.4)\n", 
"Requirement already satisfied: click<8.0,>=5.1 in /usr/local/lib/python3.7/dist-packages (from flask) (7.1.2)\n", "Requirement already satisfied: itsdangerous<2.0,>=0.24 in /usr/local/lib/python3.7/dist-packages (from flask) (1.1.0)\n", "Requirement already satisfied: Werkzeug<2.0,>=0.15 in /usr/local/lib/python3.7/dist-packages (from flask) (1.0.1)\n", "Requirement already satisfied: Jinja2<3.0,>=2.10.1 in /usr/local/lib/python3.7/dist-packages (from flask) (2.11.3)\n", "Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.7/dist-packages (from Jinja2<3.0,>=2.10.1->flask) (2.0.1)\n", "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Collecting flask_cors\n", " Downloading Flask_Cors-3.0.10-py2.py3-none-any.whl (14 kB)\n", "Requirement already satisfied: Flask>=0.9 in /usr/local/lib/python3.7/dist-packages (from flask_cors) (1.1.4)\n", "Requirement already satisfied: Six in /usr/local/lib/python3.7/dist-packages (from flask_cors) (1.15.0)\n", "Requirement already satisfied: itsdangerous<2.0,>=0.24 in /usr/local/lib/python3.7/dist-packages (from Flask>=0.9->flask_cors) (1.1.0)\n", "Requirement already satisfied: Werkzeug<2.0,>=0.15 in /usr/local/lib/python3.7/dist-packages (from Flask>=0.9->flask_cors) (1.0.1)\n", "Requirement already satisfied: click<8.0,>=5.1 in /usr/local/lib/python3.7/dist-packages (from Flask>=0.9->flask_cors) (7.1.2)\n", "Requirement already satisfied: Jinja2<3.0,>=2.10.1 in /usr/local/lib/python3.7/dist-packages (from Flask>=0.9->flask_cors) (2.11.3)\n", "Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.7/dist-packages (from Jinja2<3.0,>=2.10.1->Flask>=0.9->flask_cors) (2.0.1)\n", "Installing collected packages: flask-cors\n", "Successfully installed flask-cors-3.0.10\n" ] } ] }, { "cell_type": "markdown", "source": [ "# サーバの起動\n", "\n", "サーバを起動します。(5-1) \n", "\n", "サーバの起動状況を確認します。(5-2) \n", "\n", 
"このセルは繰り返し実行することになるのでCtrl+Retでセルを実行してください。\n", "\n", "アクセスできるようになるまで、数秒かかります。\n", "\n", "下記のようなテキストが表示されたら起動完了です。\n", "\n", "```\n", "[2022-09-13 22:20:49,936] INFO in recorderServer: START APP\n", " * Serving Flask app \"recorderServer\" (lazy loading)\n", " * Environment: production\n", " WARNING: This is a development server. Do not use it in a production deployment.\n", " Use a production WSGI server instead.\n", " * Debug mode: on\n", "[2022-09-13 22:20:49,946] INFO in _internal: * Running on http://0.0.0.0:8018/ (Press CTRL+C to quit)\n", "[2022-09-13 22:20:49,947] INFO in _internal: * Restarting with stat\n", "[2022-09-13 22:20:50,166] INFO in recorderServer: START APP\n", "[2022-09-13 22:20:50,174] WARNING in _internal: * Debugger is active!\n", "[2022-09-13 22:20:50,200] INFO in _internal: * Debugger PIN: 334-166-753\n", "```\n", "\n" ], "metadata": { "id": "-_2OcN9Borke" } }, { "cell_type": "code", "source": [ "# (5-1) サーバの起動\n", "PORT=8018\n", "get_ipython().system_raw(f'python3 recorderServer.py {PORT} {RECORDER_DATA_DIR} >foo 2>&1 &')" ], "metadata": { "id": "iNOAB7zISI6J" }, "execution_count": 7, "outputs": [] }, { "cell_type": "code", "source": [ "# (5-2) サーバの起動確認 (Ctrl+Retで実行)\n", "!cat foo" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "chu06KpAjEK6", "outputId": "a42873c4-2826-4b54-f497-01adb1683875" }, "execution_count": 8, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "[2022-09-13 22:45:22,054] INFO in recorderServer: START APP\n", " * Serving Flask app \"recorderServer\" (lazy loading)\n", " * Environment: production\n", " WARNING: This is a development server. 
Do not use it in a production deployment.\n", " Use a production WSGI server instead.\n", " * Debug mode: on\n", "[2022-09-13 22:45:22,062] INFO in _internal: * Running on http://0.0.0.0:8018/ (Press CTRL+C to quit)\n", "[2022-09-13 22:45:22,063] INFO in _internal: * Restarting with stat\n", "[2022-09-13 22:45:22,238] INFO in recorderServer: START APP\n", "[2022-09-13 22:45:22,244] WARNING in _internal: * Debugger is active!\n", "[2022-09-13 22:45:22,268] INFO in _internal: * Debugger PIN: 334-166-753\n" ] } ] }, { "cell_type": "markdown", "source": [ "# Start the proxy\n", "Start the proxy used to access the web server.\n", "\n", "Click the displayed URL to open the app in a new tab.\n", "\n", "Because this runs on Colab, loading takes a while (about 30 seconds)." ], "metadata": { "id": "WhxcFLQEpctq" } }, { "cell_type": "code", "source": [ "# (6) Start the proxy\n", "from google.colab import output\n", "\n", "output.serve_kernel_port_as_window(PORT)" ], "metadata": { "id": "nkRjZm95l87C", "colab": { "base_uri": "https://localhost:8080/", "height": 34 }, "outputId": "768df0ee-9499-430b-ab4f-c602311114ae" }, "execution_count": 9, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "" ], "application/javascript": [ "(async (port, path, text, element) => {\n", " if (!google.colab.kernel.accessAllowed) {\n", " return;\n", " }\n", " element.appendChild(document.createTextNode(''));\n", " const url = await google.colab.kernel.proxyPort(port);\n", " const anchor = document.createElement('a');\n", " anchor.href = new URL(path, url).toString();\n", " anchor.target = '_blank';\n", " anchor.setAttribute('data-href', url + path);\n", " anchor.textContent = text;\n", " element.appendChild(anchor);\n", " })(8018, \"/\", \"https://localhost:8018/\", window.element)" ] }, "metadata": {} } ] }, { "cell_type": "markdown", "source": [ "# Training data folder\n", "\n", "The cells below create the folders used for training.\n", "\n", "\n" ], "metadata": { "id": "ZGuYYN7oCSM4" } }, { "cell_type": "code", "source": [ "# Google Drive file ID of the ITA corpus text archive\n", "corpus_id = \"14oXoQqLxRkP8NJK8qMYGee1_q2uEED1z\"\n", "\n", "data_setting 
= [\n", "    # Each entry: [name, Google Drive file ID, folder name inside the zip, destination folder, speaker ID]\n", "    [\"user\", \"\", \"\", \"00_myvoice\", \"107\"],\n", "    [\"zundamon\", \"1h8Ajyvoig7Hl3LSSt2vYX0sUHX3JDF3R\", \"1205_zundamon\", \"01_target_zundamon\", \"100\"],\n", "    [\"tsumugi\", \"14zE0F_5ZCQWXf6m6SUPF5Y3gpL6yb7zk\", \"344_tsumugi\", \"02_target_tsumugi\", \"103\"],\n", "    [\"metan\", \"1iCrpzhqXm-0YdktOPM8M1pMtgQIDF3r4\", \"459_methane\", \"03_target_metan\", \"102\"],\n", "    [\"sora\", \"1MXfMRG_sjbsaLihm7wEASG2PwuCponZF\", \"912_sora\", \"04_target_ksora\", \"101\"],\n", "]" ], "metadata": { "id": "3PhrmCD2LaCH" }, "execution_count": 43, "outputs": [] }, { "cell_type": "code", "source": [ "import os, glob\n", "\n", "# Create the training data folder and write the speaker-to-ID correspondence file\n", "os.makedirs(MMVC_DATA_DIR, exist_ok=True)\n", "speaker_list = os.path.join(MMVC_DATA_DIR, \"multi_speaker_correspondence.txt\")\n", "!echo \"00_myvoice|107\" > {speaker_list}\n", "!echo \"01_target_zundamon|100\" >> {speaker_list}\n", "!echo \"02_target_tsumugi|103\" >> {speaker_list}\n", "!echo \"03_target_metan|102\" >> {speaker_list}\n", "!echo \"04_target_ksora|101\" >> {speaker_list}\n", "\n", "!cat {speaker_list}\n", "\n" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "f5l6ggSyACLs", "outputId": "4db3571a-46e6-4fd9-c560-628cf4af9284" }, "execution_count": 57, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "00_myvoice|107\n", "01_target_zundamon|100\n", "02_target_tsumugi|103\n", "03_target_metan|102\n", "04_target_ksora|101\n" ] } ] }, { "cell_type": "code", "source": [ "# Remove previously imported recordings from the user's wav folder\n", "!rm -rf {MMVC_DATA_DIR}/00_myvoice/wav/*" ], "metadata": { "id": "UEVb2GGZSesY" }, "execution_count": 71, "outputs": [] }, { "cell_type": "code", "source": [ "import gdown\n", "\n", "# Download and extract the ITA corpus text files\n", "gdown.download(f'https://drive.google.com/uc?id={corpus_id}', 'ita_corpus.zip', quiet=False)\n", "!unzip -oq 'ita_corpus.zip'\n", "\n", "# Create the text and wav folders for each speaker\n", "for chara in data_setting:\n", " chara_root_dir = os.path.join(MMVC_DATA_DIR, chara[3])\n", " os.makedirs(chara_root_dir, exist_ok=True)\n", " \n", " chara_text_dir = 
os.path.join(chara_root_dir, \"text\")\n", " os.makedirs(chara_text_dir, exist_ok=True)\n", " chara_wav_dir = os.path.join(chara_root_dir, \"wav\")\n", " os.makedirs(chara_wav_dir, exist_ok=True)\n", "\n", " if chara[0] != \"user\":\n", "     # Download the target speaker's dataset and copy it into place\n", "     gdown.download(f'https://drive.google.com/uc?id={chara[1]}', f'{chara[0]}.zip', quiet=False)\n", "     # -o overwrites existing files without prompting, -q suppresses the file listing\n", "     !unzip -oq '{chara[0]}.zip'\n", "     !cp -rf {chara[2]}/* {chara_root_dir}/\n", "\n", " if chara[0] == \"user\":\n", "     # Copy the hiragana corpus texts into the user's text folder\n", "     !cp MMVC向けITAコーパス文章ファイル_配布用/ITA_emotion_hira_100file/* {chara_text_dir}\n", "     !cp MMVC向けITAコーパス文章ファイル_配布用/ITA_recitation_hira_324file/* {chara_text_dir}\n", "\n", "     # Extract vf24kTrim.wav from each zip saved by the recorder and copy it into the wav folder\n", "     file_list = [os.path.abspath(p) for p in glob.glob(f\"{RECORDER_DATA_DIR}/*/*.zip\")]\n", "     for f in file_list:\n", "         # print(f)\n", "         basename = os.path.basename(f)\n", "         wavname = os.path.splitext(basename)[0] + \".wav\"\n", "         full_path = os.path.join(chara_wav_dir, wavname)\n", "         # print(basename, wavname, full_path)\n", "         !unzip -oq \"{f}\" vf24kTrim.wav\n", "         !cp vf24kTrim.wav \"{full_path}\"\n" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "L8UsVp3dDs4R", "outputId": "5d640caf-87b0-45a6-aa0c-76295e537f6a" }, "execution_count": 73, "outputs": [ { "output_type": "stream", "name": "stderr", "text": [ "Downloading...\n", "From: https://drive.google.com/uc?id=14oXoQqLxRkP8NJK8qMYGee1_q2uEED1z\n", "To: /content/voice-changer/docs/ita_corpus.zip\n", "100%|██████████| 1.20M/1.20M [00:00<00:00, 87.9MB/s]\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "/content/drive/MyDrive/VoiceChanger/voice_data/ITA-emotion/emotion000.zip\n", "/content/drive/MyDrive/VoiceChanger/voice_data/ITA-emotion/emotion002.zip\n", "/content/drive/MyDrive/VoiceChanger/voice_data/ITA-emotion/emotion001.zip\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "Downloading...\n", "From: https://drive.google.com/uc?id=1h8Ajyvoig7Hl3LSSt2vYX0sUHX3JDF3R\n", "To: /content/voice-changer/docs/zundamon.zip\n", 
"100%|██████████| 55.6M/55.6M [00:00<00:00, 251MB/s]\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Archive: zundamon.zip\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "Downloading...\n", "From: https://drive.google.com/uc?id=14zE0F_5ZCQWXf6m6SUPF5Y3gpL6yb7zk\n", "To: /content/voice-changer/docs/tsumugi.zip\n", "100%|██████████| 73.0M/73.0M [00:00<00:00, 226MB/s]\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Archive: tsumugi.zip\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "Downloading...\n", "From: https://drive.google.com/uc?id=1iCrpzhqXm-0YdktOPM8M1pMtgQIDF3r4\n", "To: /content/voice-changer/docs/metan.zip\n", "100%|██████████| 51.8M/51.8M [00:00<00:00, 219MB/s]\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Archive: metan.zip\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "Downloading...\n", "From: https://drive.google.com/uc?id=1MXfMRG_sjbsaLihm7wEASG2PwuCponZF\n", "To: /content/voice-changer/docs/sora.zip\n", "100%|██████████| 70.2M/70.2M [00:00<00:00, 184MB/s]\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Archive: sora.zip\n" ] } ] }, { "cell_type": "code", "source": [], "metadata": { "id": "yHmaXx31EOta" }, "execution_count": null, "outputs": [] } ] }