voice-changer/demo/MMVC_Trainer/notebook/00_Rec_Voice.ipynb

# Recording voice data for MMVC training

ver. 2022/06/19

## 1 Overview

Record your own voice data for MMVC training.

## 2 Mount Google Drive

**Set this up so that the MMVC_Trainer you uploaded to Google Drive can be accessed.**

A popup asking "Allow this notebook to access your Google Drive files?" will be shown. Click "Connect to Google Drive", select your Google account, and choose "Allow".

If it succeeds, the following message appears:

```
Mounted at /content/drive/
```

```python
from google.colab import drive
drive.mount('/content/drive')
```

Run the `cd` command to move into the MMVC_Trainer directory on the mounted Google Drive. In the cell below, specify

%cd <the path where you uploaded MMVC_Trainer to Google Drive>

If the correct path is given, the output should look something like this:

```
-rw------- 1 root root 11780 Mar  4 16:53 attentions.py
-rw------- 1 root root  4778 Mar  4 16:53 commons.py
drwx------ 2 root root  4096 Mar  5 15:20 configs
...
```

```python
%cd /content/drive/MyDrive/
!ls -la
```

## 3 Record the training audio

Record your voice using the ITA corpus.
First, load the recording program below.

```python
# from https://gist.github.com/korakot/c21c3476c024ad6d56d5f48b0bca92be
# from https://colab.research.google.com/github/espnet/notebook/blob/master/espnet2_asr_realtime_demo.ipynb

from IPython.display import Javascript
from google.colab import output
from base64 import b64decode

# JavaScript snippet that records from the browser microphone for a given
# number of milliseconds and resolves with the audio as a base64 data URL.
RECORD = """
const sleep = time => new Promise(resolve => setTimeout(resolve, time))
const b2text = blob => new Promise(resolve => {
  const reader = new FileReader()
  reader.onloadend = e => resolve(e.srcElement.result)
  reader.readAsDataURL(blob)
})
var record = time => new Promise(async resolve => {
  stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  recorder = new MediaRecorder(stream)
  chunks = []
  recorder.ondataavailable = e => chunks.push(e.data)
  recorder.start()
  await sleep(time)
  recorder.onstop = async () => {
    blob = new Blob(chunks)
    text = await b2text(blob)
    resolve(text)
  }
  recorder.stop()
})
"""

def record(sec, filename='audio.wav'):
  # Record `sec` seconds of audio in the browser and save it to `filename`.
  display(Javascript(RECORD))
  s = output.eval_js('record(%d)' % (sec * 1000))
  b = b64decode(s.split(',')[1])
  with open(filename, 'wb+') as f:
    f.write(b)

import librosa
import librosa.display
import matplotlib.pyplot as plt
import pysndfile
from IPython.display import display, Audio
import warnings
warnings.filterwarnings('ignore')

def rec(sec, filename, text, hira):
  myvoice_dir = "./dataset/textful/00_myvoice/wav/"
  sampling_rate = 24000
  # Clips longer than 15 seconds cannot be used for training.
  if sec > 15:
    print("15秒以上の音声は学習できません。\n録音秒数を15秒以下に指定してください。")
    return
  print(f"{text}\n{hira}")
  record(sec, "temp.wav")
  print("---終了---")
```
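For reference, here is a minimal usage sketch of how the `rec` helper defined above might be called, based only on its signature `rec(sec, filename, text, hira)`. The recording length, output filename, sentence, and hiragana reading are placeholder assumptions for illustration, not values taken from the notebook or the ITA corpus.

```python
# Minimal usage sketch (assumed values): the recording length, output
# filename, sentence, and hiragana reading are placeholders, not taken
# from the notebook or the ITA corpus.
sec = 10                                  # recording length in seconds (must be 15 or less)
text = "これは録音のテストです。"           # sentence to read aloud (placeholder)
hira = "これはろくおんのてすとです。"        # its hiragana reading (placeholder)
rec(sec, "emotion001.wav", text, hira)    # prints the prompt text and records via the browser
```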