{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"collapsed_sections": [],
"authorship_tag": "ABX9TyP6Wv8fttT+X2Op6MmqB/kg",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
},
"gpuClass": "standard"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"source": [
"Voice Recorder\n",
"---\n",
"\n",
"このノートでは、MMVCのトレーニング用の音声を録画するアプリ\"Voice Recorder\"をColab上から起動します。\n",
"\n",
"録音された音声はこのノートを通してGoogle Drive上にアップロードすることができます。\n",
"\n",
"また、従来のVoice Recorderと同様にローカルPCにダウンロードすることもできます。\n",
"\n",
"録音後にブラウザとcolab上のサーバ間でやり取りを行うので、更新に少しタイムラグが発生します。\n",
"\n",
"ご自身のPCでトレーニングを行う予定の場合は、colab上のサーバで録音するメリットはほぼありませんので、より快適な録音をするために[こちらのgithub上のVoice Recorder](https://w-okada.github.io/voice-changer/)をご使用ください。\n",
"\n",
"\n",
"より詳細な情報はこちらの[リポジトリ](https://github.com/w-okada/voice-changer)からご確認いただけます。\n"
],
"metadata": {
"id": "Lbbmx_Vjl0zo"
}
},
{
"cell_type": "markdown",
"source": [
"# 録音データを格納するフォルダを指定\n",
"\n",
"フォルダは次の二つを指定する必要があります。\n",
"1. 録音アプリ用のキャッシュデータ格納フォルダ\n",
"2. トレーニングデータの格納フォルダ\n",
"\n",
"通常、録音データはGoogle Drive上のフォルダに格納すると思います。\n",
"\n",
"まずは(1-1)を実行してドライブをマウントしてください。\n",
"\n",
"その後、(1-2)で上記の格納フォルダを指定してください。"
],
"metadata": {
"id": "mHvGrgaWnIPA"
}
},
{
"cell_type": "code",
"source": [
"# (1-1) Google Driveのマウント\n",
"from google.colab import drive\n",
"drive.mount('/content/drive')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Eihm8H2X-7wm",
"outputId": "76331fb1-5ef8-40e6-a381-258b5e425853"
},
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Mounted at /content/drive\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# (1-2) 使用するモデルとコンフィグファイルの指定\n",
"RECORDER_DATA_DIR=\"/content/drive/MyDrive/VoiceChanger/voice_data\"\n",
"MMVC_DATA_DIR=\"/content/drive/MyDrive/VoiceChanger/dataset\"\n"
],
"metadata": {
"id": "nSXATMWYb4Ik"
},
"execution_count": 2,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"# リポジトリのクローン\n",
"リポジトリをクローンします"
],
"metadata": {
"id": "sLBfykjBnjWc"
}
},
{
"cell_type": "code",
"source": [
"# (2) リポジトリのクローン\n",
"!git clone --depth 1 https://github.com/w-okada/voice-changer.git -b ver_1.0\n",
"%cd voice-changer/docs/\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "86wTFmqsNMnD",
"outputId": "40471833-d720-41c9-f4a7-ac15fbf18e14"
},
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Cloning into 'voice-changer'...\n",
"remote: Enumerating objects: 81, done.\u001b[K\n",
"remote: Counting objects: 100% (81/81), done.\u001b[K\n",
"remote: Compressing objects: 100% (68/68), done.\u001b[K\n",
"remote: Total 81 (delta 12), reused 53 (delta 5), pack-reused 0\u001b[K\n",
"Unpacking objects: 100% (81/81), done.\n",
"Note: checking out 'f8823cb7e2025f13227f5918408cceda224bf9f0'.\n",
"\n",
"You are in 'detached HEAD' state. You can look around, make experimental\n",
"changes and commit them, and you can discard any commits you make in this\n",
"state without impacting any branches by performing another checkout.\n",
"\n",
"If you want to create a new branch to retain commits you create, you may\n",
"do so (now or later) by using -b with the checkout command again. Example:\n",
"\n",
" git checkout -b \n",
"\n",
"/content/voice-changer/docs\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"# ファイルの配置\n",
"アプリケーションの挙動を記した設定ファイルをコピーします(3-1)。(3-2)はコピーした設定ファイルを表示しています。もしかしたらうまく動かないときに役立つかもしれません。"
],
"metadata": {
"id": "jmDY8W_fnuSi"
}
},
{
"cell_type": "code",
"source": [
"# (3-1) 設定ファイルのコピー\n",
"!cp ../template/setting_recorder_colab.json assets/setting.json"
],
"metadata": {
"id": "ow88ZaubluOJ"
},
"execution_count": 4,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# (3-2) 設定ファイルの内容確認\n",
"\n",
"!cat assets/setting.json"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "rpWUobjlBCNF",
"outputId": "285e0259-16af-4932-e78b-bec94f337e9c"
},
"execution_count": 5,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"{\n",
" \"app_title\": \"voice-recorder\",\n",
" \"storage_type\":\"server\",\n",
" \"use_mel_spectrogram\":true,\n",
" \"text\": [\n",
" {\n",
" \"title\": \"ITA-emotion\",\n",
" \"wavPrefix\": \"emotion\",\n",
" \"file\": \"./assets/text/ITA_emotion_all.txt\",\n",
" \"file_hira\": \"./assets/text/ITA_emotion_all_hira.txt\"\n",
" },\n",
" {\n",
" \"title\": \"ITA-recitation\",\n",
" \"wavPrefix\": \"recitation\",\n",
" \"file\": \"./assets/text/ITA_recitation_all.txt\",\n",
" \"file_hira\": \"./assets/text/ITA_recitation_all_hira.txt\"\n",
" },\n",
" {\n",
" \"title\": \"wagahaiwa\",\n",
" \"wavPrefix\": \"wagahaiwa\",\n",
" \"file\": \"./assets/text/wagahaiwa.txt\",\n",
" \"file_hira\": \"./assets/text/wagahaiwa_hira.txt\"\n",
" }\n",
" ]\n",
"}\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"# モジュールのインストール\n",
"\n",
"必要なモジュールをインストールします。"
],
"metadata": {
"id": "8Na2PbLZSWgZ"
}
},
{
"cell_type": "code",
"source": [
"# (4) 設定ファイルの確認\n",
"!pip install flask\n",
"!pip install flask_cors\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LwZAAuqxX7yY",
"outputId": "ea2b3b39-d571-4d47-a38b-d0e657a335cd"
},
"execution_count": 6,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
"Requirement already satisfied: flask in /usr/local/lib/python3.7/dist-packages (1.1.4)\n",
"Requirement already satisfied: Jinja2<3.0,>=2.10.1 in /usr/local/lib/python3.7/dist-packages (from flask) (2.11.3)\n",
"Requirement already satisfied: Werkzeug<2.0,>=0.15 in /usr/local/lib/python3.7/dist-packages (from flask) (1.0.1)\n",
"Requirement already satisfied: click<8.0,>=5.1 in /usr/local/lib/python3.7/dist-packages (from flask) (7.1.2)\n",
"Requirement already satisfied: itsdangerous<2.0,>=0.24 in /usr/local/lib/python3.7/dist-packages (from flask) (1.1.0)\n",
"Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.7/dist-packages (from Jinja2<3.0,>=2.10.1->flask) (2.0.1)\n",
"Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
"Collecting flask_cors\n",
" Downloading Flask_Cors-3.0.10-py2.py3-none-any.whl (14 kB)\n",
"Requirement already satisfied: Six in /usr/local/lib/python3.7/dist-packages (from flask_cors) (1.15.0)\n",
"Requirement already satisfied: Flask>=0.9 in /usr/local/lib/python3.7/dist-packages (from flask_cors) (1.1.4)\n",
"Requirement already satisfied: Werkzeug<2.0,>=0.15 in /usr/local/lib/python3.7/dist-packages (from Flask>=0.9->flask_cors) (1.0.1)\n",
"Requirement already satisfied: itsdangerous<2.0,>=0.24 in /usr/local/lib/python3.7/dist-packages (from Flask>=0.9->flask_cors) (1.1.0)\n",
"Requirement already satisfied: Jinja2<3.0,>=2.10.1 in /usr/local/lib/python3.7/dist-packages (from Flask>=0.9->flask_cors) (2.11.3)\n",
"Requirement already satisfied: click<8.0,>=5.1 in /usr/local/lib/python3.7/dist-packages (from Flask>=0.9->flask_cors) (7.1.2)\n",
"Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.7/dist-packages (from Jinja2<3.0,>=2.10.1->Flask>=0.9->flask_cors) (2.0.1)\n",
"Installing collected packages: flask-cors\n",
"Successfully installed flask-cors-3.0.10\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"# サーバの起動\n",
"\n",
"サーバを起動します。(5-1) \n",
"\n",
"サーバの起動状況を確認します。(5-2) \n",
"\n",
"このセルは繰り返し実行することになるのでCtrl+Retでセルを実行してください。\n",
"\n",
"アクセスできるようになるまで、数秒かかります。\n",
"\n",
"下記のようなテキストが表示されたら起動完了です。\n",
"\n",
"```\n",
"[2022-09-13 22:20:49,936] INFO in recorderServer: START APP\n",
" * Serving Flask app \"recorderServer\" (lazy loading)\n",
" * Environment: production\n",
" WARNING: This is a development server. Do not use it in a production deployment.\n",
" Use a production WSGI server instead.\n",
" * Debug mode: on\n",
"[2022-09-13 22:20:49,946] INFO in _internal: * Running on http://0.0.0.0:8018/ (Press CTRL+C to quit)\n",
"[2022-09-13 22:20:49,947] INFO in _internal: * Restarting with stat\n",
"[2022-09-13 22:20:50,166] INFO in recorderServer: START APP\n",
"[2022-09-13 22:20:50,174] WARNING in _internal: * Debugger is active!\n",
"[2022-09-13 22:20:50,200] INFO in _internal: * Debugger PIN: 334-166-753\n",
"```\n",
"\n"
],
"metadata": {
"id": "-_2OcN9Borke"
}
},
{
"cell_type": "code",
"source": [
"# (5-1) サーバの起動\n",
"PORT=8018\n",
"get_ipython().system_raw(f'python3 recorderServer.py {PORT} {RECORDER_DATA_DIR} >foo 2>&1 &')"
],
"metadata": {
"id": "iNOAB7zISI6J"
},
"execution_count": 7,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# (5-2) サーバの起動確認 (Ctrl+Retで実行)\n",
"!cat foo"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "chu06KpAjEK6",
"outputId": "6f5cbed9-65a7-4570-f58a-5447e402947c"
},
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"[2022-11-08 19:11:17,679] INFO in recorderServer: START APP\n",
" * Serving Flask app \"recorderServer\" (lazy loading)\n",
" * Environment: production\n",
" WARNING: This is a development server. Do not use it in a production deployment.\n",
" Use a production WSGI server instead.\n",
" * Debug mode: on\n",
"[2022-11-08 19:11:17,696] INFO in _internal: * Running on http://0.0.0.0:8018/ (Press CTRL+C to quit)\n",
"[2022-11-08 19:11:17,697] INFO in _internal: * Restarting with stat\n",
"[2022-11-08 19:11:17,893] INFO in recorderServer: START APP\n",
"[2022-11-08 19:11:17,900] WARNING in _internal: * Debugger is active!\n",
"[2022-11-08 19:11:17,930] INFO in _internal: * Debugger PIN: 225-344-519\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"# プロキシを起動\n",
"ウェブサーバへのアクセスをするためのプロキシを起動します。\n",
"\n",
"表示されたURLをクリックして開くと別タブでアプリが開きます。\n",
"\n",
"Colabなので、ロードにある程度時間がかかります(30秒くらい)。"
],
"metadata": {
"id": "WhxcFLQEpctq"
}
},
{
"cell_type": "code",
"source": [
"# (7) プロキシを起動\n",
"from google.colab import output\n",
"\n",
"output.serve_kernel_port_as_window(PORT)"
],
"metadata": {
"id": "nkRjZm95l87C",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "7d2664b8-945c-4ee6-b49f-51f7f96cf388"
},
"execution_count": 11,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
""
],
"application/javascript": [
"(async (port, path, text, element) => {\n",
" if (!google.colab.kernel.accessAllowed) {\n",
" return;\n",
" }\n",
" element.appendChild(document.createTextNode(''));\n",
" const url = await google.colab.kernel.proxyPort(port);\n",
" const anchor = document.createElement('a');\n",
" anchor.href = new URL(path, url).toString();\n",
" anchor.target = '_blank';\n",
" anchor.setAttribute('data-href', url + path);\n",
" anchor.textContent = text;\n",
" element.appendChild(anchor);\n",
" })(8018, \"/\", \"https://localhost:8018/\", window.element)"
]
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": [
"# トレーニング用データフォルダ\n",
"\n",
"以下、トレーニング用のフォルダを作成します。\n",
"\n",
"\n"
],
"metadata": {
"id": "ZGuYYN7oCSM4"
}
},
{
"cell_type": "code",
"source": [
"corpus_id = \"14oXoQqLxRkP8NJK8qMYGee1_q2uEED1z\"\n",
"\n",
"data_setting = [\n",
" [\"user\", \"\", \"\", \"00_myvoice\", \"107\"],\n",
" [\"zundamon\", \"1h8Ajyvoig7Hl3LSSt2vYX0sUHX3JDF3R\", \"1205_zundamon\", \"01_target_zundamon\", \"100\"],\n",
" [\"tsumugi\", \"14zE0F_5ZCQWXf6m6SUPF5Y3gpL6yb7zk\", \"344_tsumugi\", \"02_target_tsumugi\", \"103\"],\n",
" [\"metan\", \"1iCrpzhqXm-0YdktOPM8M1pMtgQIDF3r4\", \"459_methane\", \"03_target_metan\", \"102\"],\n",
" [\"sora\", \"1MXfMRG_sjbsaLihm7wEASG2PwuCponZF\", \"912_sora\", \"04_target_ksora\", \"101\"],\n",
"]"
],
"metadata": {
"id": "3PhrmCD2LaCH"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"import os, glob\n",
"\n",
"os.makedirs(MMVC_DATA_DIR, exist_ok=True)\n",
"speaker_list = os.path.join(MMVC_DATA_DIR, \"multi_speaker_correspondence.txt\")\n",
"!echo \"00_myvoice|107\" > {speaker_list}\n",
"!echo \"01_target_zundamon|100\" >> {speaker_list}\n",
"!echo \"02_target_tsumugi|103\" >> {speaker_list}\n",
"!echo \"03_target_metan|102\" >> {speaker_list}\n",
"!echo \"04_target_ksora|101\" >> {speaker_list}\n",
"\n",
"!cat {speaker_list}\n",
"\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "f5l6ggSyACLs",
"outputId": "4db3571a-46e6-4fd9-c560-628cf4af9284"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"00_myvoice|107\n",
"01_target_zundamon|100\n",
"02_target_tsumugi|103\n",
"03_target_metan|102\n",
"04_target_ksora|101\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"!rm -rf /content/drive/MyDrive/VoiceChanger/train_data/00_myvoice/wav/*"
],
"metadata": {
"id": "UEVb2GGZSesY"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"import gdown\n",
"\n",
"gdown.download(f'https://drive.google.com/uc?id={corpus_id}', f'ita_corpus.zip', quiet=False)\n",
"!unzip -oq 'ita_corpus.zip'\n",
"\n",
"for chara in data_setting:\n",
" chara_root_dir = os.path.join(MMVC_DATA_DIR, chara[3])\n",
" os.makedirs(chara_root_dir, exist_ok=True)\n",
" \n",
" chara_text_dir = os.path.join(chara_root_dir, \"text\")\n",
" os.makedirs(chara_text_dir, exist_ok=True)\n",
" chara_wav_dir = os.path.join(chara_root_dir, \"wav\")\n",
" os.makedirs(chara_wav_dir, exist_ok=True)\n",
"\n",
" if chara[0] != \"user\":\n",
" gdown.download(f'https://drive.google.com/uc?id={chara[1]}', f'{chara[0]}.zip', quiet=False)\n",
" !unzip -f '{chara[0]}.zip'\n",
" !cp -rf {chara[2]}/* {chara_root_dir}/\n",
"\n",
" if chara[0] == \"user\":\n",
" !cp MMVC向けITAコーパス文章ファイル_配布用/ITA_emotion_hira_100file/* {chara_text_dir}\n",
" !cp MMVC向けITAコーパス文章ファイル_配布用/ITA_recitation_hira_324file/* {chara_text_dir}\n",
"\n",
" file_list = [os.path.abspath(p) for p in glob.glob(f\"{RECORDER_DATA_DIR}/*/*.zip\")]\n",
" for f in list(file_list):\n",
" # print(f)\n",
" basename = os.path.basename(f)\n",
" wavname = os.path.splitext(basename)[0] + \".wav\"\n",
" full_path = os.path.join(chara_wav_dir, wavname)\n",
" # print(basename, wavname, full_path)\n",
" !unzip -oq {f} vf24kTrim.wav\n",
" !cp vf24kTrim.wav {full_path}\n",
"\n",
"\n",
"\n",
"\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "L8UsVp3dDs4R",
"outputId": "5d640caf-87b0-45a6-aa0c-76295e537f6a"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Downloading...\n",
"From: https://drive.google.com/uc?id=14oXoQqLxRkP8NJK8qMYGee1_q2uEED1z\n",
"To: /content/voice-changer/docs/ita_corpus.zip\n",
"100%|██████████| 1.20M/1.20M [00:00<00:00, 87.9MB/s]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"/content/drive/MyDrive/VoiceChanger/voice_data/ITA-emotion/emotion000.zip\n",
"/content/drive/MyDrive/VoiceChanger/voice_data/ITA-emotion/emotion002.zip\n",
"/content/drive/MyDrive/VoiceChanger/voice_data/ITA-emotion/emotion001.zip\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"Downloading...\n",
"From: https://drive.google.com/uc?id=1h8Ajyvoig7Hl3LSSt2vYX0sUHX3JDF3R\n",
"To: /content/voice-changer/docs/zundamon.zip\n",
"100%|██████████| 55.6M/55.6M [00:00<00:00, 251MB/s]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Archive: zundamon.zip\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"Downloading...\n",
"From: https://drive.google.com/uc?id=14zE0F_5ZCQWXf6m6SUPF5Y3gpL6yb7zk\n",
"To: /content/voice-changer/docs/tsumugi.zip\n",
"100%|██████████| 73.0M/73.0M [00:00<00:00, 226MB/s]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Archive: tsumugi.zip\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"Downloading...\n",
"From: https://drive.google.com/uc?id=1iCrpzhqXm-0YdktOPM8M1pMtgQIDF3r4\n",
"To: /content/voice-changer/docs/metan.zip\n",
"100%|██████████| 51.8M/51.8M [00:00<00:00, 219MB/s]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Archive: metan.zip\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"Downloading...\n",
"From: https://drive.google.com/uc?id=1MXfMRG_sjbsaLihm7wEASG2PwuCponZF\n",
"To: /content/voice-changer/docs/sora.zip\n",
"100%|██████████| 70.2M/70.2M [00:00<00:00, 184MB/s]\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Archive: sora.zip\n"
]
}
]
},
{
"cell_type": "code",
"source": [],
"metadata": {
"id": "yHmaXx31EOta"
},
"execution_count": null,
"outputs": []
}
]
}