Realtime Voice Changer

VC Client

What's New!

What is VC Client

VC Client is client software for real-time voice conversion using AI models such as MMVC, so-vits-svc, and RVC (Retrieval-based-Voice-Conversion). It also provides an app for recording training audio for real-time voice changers, specifically for MMVC.

Features

  1. Cross-platform compatibility: supports Windows, Mac (including Apple Silicon M1), Linux, and Google Colaboratory.

  2. No separate audio recording app to install: audio can be recorded directly in the application hosted on Github Pages. Because it runs entirely in the browser, there is nothing special to install and no data is sent to a server.

  3. Load distribution by running the voice changer on a different PC: the real-time voice changer uses a server-client configuration. By running the MMVC server on a separate PC, you can minimize the impact on other resource-intensive tasks such as game commentary.


Usage

Details are summarized here.

(1) Recorder (Voice Recording App for Training)

This app lets you easily record training audio for MMVC. It runs on Github Pages, so it can be used on any platform with just a browser. The recorded data is stored in the browser and is not sent anywhere else.

Recorder app on Github Pages

Explanation video

(2) Player (Voice Changer App)

This app performs voice conversion with MMVC and so-vits-svc.

It can be used in three main ways, listed from easiest to hardest:

  • Using Google Colaboratory (MMVC only)
  • Using a pre-built binary
  • Setting up an environment with Docker or Anaconda and using it

If you are new to this software or to MMVC, we recommend starting from the top of the list and working down.

(2-1) Use on Google Colaboratory (MMVC only)

You can run it on Colaboratory, Google's machine learning platform. If you have already trained an MMVC model, you have presumably used Colaboratory already, so no additional setup is needed. However, the voice changer may have a large time lag depending on your network environment or the state of Colaboratory.

  • Simple version: You can run it from Colab without any prior setup.
  • Normal version: You can load your model by linking with Google Drive.

Explanation video

(2-2) Usage with pre-built binaries

You can download and run executable binaries. We offer Windows and Mac versions.

  • For the Mac version, unzip the downloaded file and double-click the startHttp_xxx.command file corresponding to your VC. If a message says that the developer cannot be verified, hold down the control key and click the file to run it again (or right-click and run it). (See note *4 below.)

  • For the Windows version, we offer three variants: "ONNX(cpu,cuda), PyTorch(cpu)"; "ONNX(cpu,cuda), PyTorch(cpu,cuda)"; and "ONNX(cpu,DirectML), PyTorch(cpu)". Download the zip file that matches your environment, unzip it, and run the start_http_xxx.bat file corresponding to your VC.

  • The following voice changers can be launched with each startHttp_xxx.command file (Mac) and start_http_xxx.bat file (Windows).

| # | Batch file | Description |
|---|------------|-------------|
| 1 | start_http_v13.bat | MMVC v.1.3.x series models can be used. |
| 2 | start_http_v15.bat | MMVC v.1.5.x series models can be used. |
| 3 | start_http_so-vits-svc_40.bat | so-vits-svc 4.0 series models can be used. |
| 4 | start_http_so-vits-svc_40v2.bat | so-vits-svc 4.0v2 series models can be used. |
| 5 | start_http_so-vits-svc_40v2_tsukuyomi.bat | Use Tsukuyomi-chan's model. (Cannot be changed) |
| 6 | start_http_so-vits-svc_40v2_amitaro.bat | Use Amitaro's model. (Cannot be changed) |
| 7 | start_http_RVC.bat | RVC series models can be used. |

  • If you are connecting remotely, please use the .command file (Mac) or .bat file (Windows) with https instead of http.

  • If you have an NVIDIA GPU on Windows, the ONNX(cpu,cuda), PyTorch(cpu) version usually works. In rare cases the GPU is not recognized; if so, use the ONNX(cpu,cuda), PyTorch(cpu,cuda) version (a much larger download).

  • If you do not have an Nvidia GPU on Windows, it will usually work with the ONNX(cpu,DirectML), PyTorch(cpu) version.

  • If you are using so-vits-svc 4.0/so-vits-svc 4.0v2 on Windows, please use the ONNX(cpu,cuda), PyTorch(cpu,cuda) version.

  • To use so-vits-svc 4.0/so-vits-svc 4.0v2 or Tsukuyomi-chan, you need the ContentVec model. Download the ContentVec_legacy 500 model from this repository and place it in the same folder as startHttp_xxx.command or start_http_xxx.bat before running.

  • You need to have the hubert model to use RVC(Retrieval-based-Voice-Conversion). Please download hubert_base.pt from this repository and store it in the folder where the batch file is located.
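The selection rules in the list above can be restated as a short Python sketch. It is not part of VC Client: the function and dictionary names are hypothetical, the launcher suffixes are copied from the batch-file table, and the https file names and the Mac suffixes are assumed to mirror the Windows http names.

```python
# Hypothetical helper: not part of VC Client, it only restates the notes above.

# Launcher suffixes, copied from the batch-file table.
LAUNCHER_SUFFIX = {
    "MMVC v.1.3.x": "v13",
    "MMVC v.1.5.x": "v15",
    "so-vits-svc 4.0": "so-vits-svc_40",
    "so-vits-svc 4.0v2": "so-vits-svc_40v2",
    "RVC": "RVC",
}

SO_VITS_SVC = {"so-vits-svc 4.0", "so-vits-svc 4.0v2"}


def choose_windows_build(vc: str, has_nvidia_gpu: bool) -> str:
    """Return the Windows download variant suggested by the notes above."""
    if vc in SO_VITS_SVC:
        # so-vits-svc needs the PyTorch(cpu,cuda) build on Windows.
        return "ONNX(cpu,cuda), PyTorch(cpu,cuda)"
    if has_nvidia_gpu:
        # Usual choice; fall back to the PyTorch(cpu,cuda) build
        # if the GPU is not recognized.
        return "ONNX(cpu,cuda), PyTorch(cpu)"
    return "ONNX(cpu,DirectML), PyTorch(cpu)"


def launcher_name(vc: str, os_name: str, remote: bool = False) -> str:
    """Build the launcher file name for a VC type.

    Assumption: https launchers and Mac launchers use the same suffix
    as the Windows http batch files listed in the table.
    """
    suffix = LAUNCHER_SUFFIX[vc]
    if os_name == "windows":
        scheme = "https" if remote else "http"
        return f"start_{scheme}_{suffix}.bat"
    if os_name == "mac":
        scheme = "Https" if remote else "Http"
        return f"start{scheme}_{suffix}.command"
    raise ValueError(f"no pre-built binary for {os_name!r}")


print(choose_windows_build("RVC", has_nvidia_gpu=True))
print(launcher_name("RVC", "windows"))
```

Check the generated name against the files in the unzipped folder before relying on it, since the naming pattern for https and Mac launchers is inferred rather than documented here.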

| Version | OS | Framework | Link | Supported VC | Size |
|---------|----|-----------|------|--------------|------|
| v.1.5.1.15b | win | ONNX(cpu,cuda), PyTorch(cpu) | normal | MMVC v.1.5.x, MMVC v.1.3.x, RVC | 773MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | normal | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2, RVC | 2794MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu) | normal | MMVC v.1.5.x, MMVC v.1.3.x, RVC | 488MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | normal | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2, RVC | 2665MB |
| | mac | ONNX(cpu), PyTorch(cpu) | normal | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2, RVC | 615MB |

| Version | OS | Framework | Link | Supported VC | Size |
|---------|----|-----------|------|--------------|------|
| v.1.5.1.15a | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | normal | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2, RVC | 2641MB |

| Version | OS | Framework | Link | Supported VC | Size |
|---------|----|-----------|------|--------------|------|
| v.1.5.1.14 | mac | ONNX(cpu), PyTorch(cpu) | normal | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2 | 581MB |
| | mac | - | Tsukuyomi-chan | - | 874MB |
| | mac | - | Kikoto Mahiro | - | 872MB |
| | mac | - | Amitaro | - | 872MB |
| | mac | - | Kikoto Kurage | - | 873MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu) | normal | MMVC v.1.5.x, MMVC v.1.3.x | 530MB |
| | win | ONNX(cpu,cuda), PyTorch(cpu,cuda) | normal | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2 | 2624MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu) | normal | MMVC v.1.5.x, MMVC v.1.3.x | 417MB |
| | win | ONNX(cpu,DirectML), PyTorch(cpu,cuda) | normal | MMVC v.1.5.x, MMVC v.1.3.x, so-vits-svc 4.0, so-vits-svc 4.0v2 | 2509MB |
| | win | - | Tsukuyomi-chan | - | 823MB |
| | win | - | Kikoto Mahiro | - | 821MB |
| | win | - | Kikoto Kurage | - | 823MB |
| | win | - | Amitaro | - | 821MB |

*1 MMVC v.1.5.x is experimental.

*2 The Tsukuyomi-chan version uses the publicly available voice data of the free-material character "Tsukuyomi-chan". (Details such as the terms of use are at the end of this document.)

*3 If unpacking or startup is slow, your antivirus software may be scanning the files. Try excluding the file or folder from scanning before running it. (At your own risk)

*4 This software is not signed by the developer. On Mac a warning message will appear, but you can run the software by clicking the icon while holding down the control key. This is due to Apple's security policy. Running the software is at your own risk.


https://user-images.githubusercontent.com/48346627/212569645-e30b7f4e-079d-4504-8cf8-7816c5f40b00.mp4

(2-3) Usage after setting up an environment such as Docker or Anaconda

Clone this repository and use it. On Windows, WSL2 is required, along with a virtual environment such as Docker or Anaconda set up on WSL2. On Mac, a Python virtual environment such as Anaconda is required. Although this method needs preparation, it runs the fastest in many environments. Even without a GPU, it may work well enough with a reasonably new CPU (see the section on real-time performance below).

Explanation video on installing WSL2 and Docker

Explanation video on installing WSL2 and Anaconda

Real-time performance

Conversion is almost instantaneous when using a GPU.

https://twitter.com/DannadoriYellow/status/1613483372579545088?s=20&t=7CLD79h1F3dfKiTb7M8RUQ

Even without a GPU, a recent CPU can perform conversions at a reasonable speed.

https://twitter.com/DannadoriYellow/status/1613553862773997569?s=20&t=7CLD79h1F3dfKiTb7M8RUQ

With an old CPU (i7-4770), it takes about 1000 msec for conversion.
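One way to relate these timings to real-time use is to compare the conversion time with the length of the audio being converted. The sketch below is not taken from VC Client: `convert` is a hypothetical stand-in for whatever function performs the conversion, and the one-second chunk and 48 kHz sample rate are assumed values for illustration only.

```python
import time
from typing import Callable, Sequence


def real_time_factor(convert: Callable[[Sequence[float]], Sequence[float]],
                     chunk: Sequence[float],
                     sample_rate: int) -> float:
    """Return processing time divided by chunk duration.

    A value below 1.0 means the conversion finishes before the next
    chunk of the same length has been recorded.
    """
    chunk_seconds = len(chunk) / sample_rate
    start = time.perf_counter()
    convert(chunk)
    elapsed = time.perf_counter() - start
    return elapsed / chunk_seconds


def dummy_convert(samples: Sequence[float]) -> Sequence[float]:
    # Placeholder that only copies the input; a real model would go here.
    return list(samples)


chunk = [0.0] * 48000  # one second of silence at an assumed 48 kHz
rtf = real_time_factor(dummy_convert, chunk, sample_rate=48000)
print(f"real-time factor: {rtf:.4f}")
```

Under these assumptions, a conversion time of about 1000 msec for a one-second chunk corresponds to a factor of about 1.0, which is the boundary of keeping up with the incoming audio. The chunk length actually used by the software is not stated here, so treat the number as illustrative.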

Acknowledgments

This software uses the voice data of the free material character "Tsukuyomi-chan," which is provided for free by CV. Yumesaki Rei.

  • Tsukuyomi-chan Corpus (CV. Yumesaki Rei)

https://tyc.rei-yumesaki.net/material/corpus/

Copyright. Rei Yumesaki

Terms of Use

In accordance with the Tsukuyomi-chan Corpus Terms of Use for the Tsukuyomi-chan Real-time Voice Changer, the use of the converted voice for the following purposes is prohibited.

  • Criticizing or attacking individuals (the definition of "criticizing or attacking" is based on the Tsukuyomi-chan character license).

  • Advocating for or opposing specific political positions, religions, or ideologies.

  • Publicly displaying strongly stimulating expressions without proper zoning.

  • Publishing the converted voice in a form that allows secondary use by others (use as material). (Distributing or selling it as a work for viewing or listening is not a problem.)

Regarding the Real-time Voice Changer Amitaro, use is restricted in accordance with the terms of use of Amitaro's koe-sozai kobo. (detail)

Regarding the Real-time Voice Changer Kikoto Mahiro, use is restricted in accordance with the terms of use of Replica doll. (detail)

Disclaimer

We are not liable for any direct, indirect, consequential, incidental, or special damages arising out of or in any way connected with the use or inability to use this software.