TTS Train and Voice Swap Script

Closed job
kam
Employer
3 deals
Job category: Other IT services
Expected budget: Negotiable


Job description

I am looking for a freelancer to create a Colaboratory script that lets me train and test TTS models based on the NVIDIA NeMo framework. It is important to also have an option to modify/fake/augment/swap the voice. I wish to train on a large amount of data and swap the voice based on a smaller sample of the data. To be more precise:

Colaboratory script:

- It should be clear and easy to understand, with comments and a description of what each code cell actually does

- It should mark the places where I can change training parameters or methods, with a comment listing the possible values to put there

- After training, there should be an option to input text and get a generated audio file

- After training, there should be an option to measure mel-cepstral distortion (MCD)

- Once the above is done, the next cell should be able to modify/fake/augment/swap the voice based on a small audio sample of a voice. For this task I would like to be able to use either the trained model or a pre-trained one from NVIDIA NeMo.

- There should be an option to upload previously trained models from my computer, with and without voice swap, and use them for TTS or MCD.
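
To make the MCD requirement concrete, here is a minimal sketch of the commonly used definition: per frame, MCD = (10 / ln 10) * sqrt(2 * sum of squared coefficient differences), averaged over all frames. It assumes the reference and synthesized recordings have already been converted to mel-cepstral coefficients and time-aligned frame by frame (for example with dynamic time warping); both of those steps are outside this sketch. The function name `mel_cepstral_distortion` is hypothetical, and skipping coefficient 0 (energy) is a common convention rather than something this posting specifies.

```python
import math


def mel_cepstral_distortion(ref_frames, syn_frames):
    """Return the mean MCD in dB between two aligned mel-cepstral sequences.

    ref_frames, syn_frames: lists of frames, each frame a list of coefficients.
    Assumption: both sequences are already time-aligned and equally long.
    Coefficient 0 (energy) is skipped, following the usual convention.
    """
    if not ref_frames or len(ref_frames) != len(syn_frames):
        raise ValueError("sequences must be non-empty and have the same length")

    scale = 10.0 / math.log(10.0)
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        if len(ref) != len(syn):
            raise ValueError("frames must have the same number of coefficients")
        # Squared Euclidean distance over coefficients 1..D.
        squared = sum((r - s) ** 2 for r, s in zip(ref[1:], syn[1:]))
        total += scale * math.sqrt(2.0 * squared)

    # Average the per-frame distortion over the whole utterance.
    return total / len(ref_frames)
```

Lower values mean the synthesized speech is spectrally closer to the reference; identical sequences give 0 dB.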

Training data:

- I need the script to be able to process M-AILABS and Common Voice 7.0, separately or joined together

- I also need to be able to provide my own training data, so I need you to define a simple data format in which I will prepare it

- Since free Colaboratory has limits, it is OK to train on fragments of the data I mentioned

- You can test on any language, but before delivery I need it to work with a West Slavic language that I will specify.
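
As one possible reading of the "simple data format" and "train on fragments" points above, the sketch below takes a pipe-delimited metadata file (`file_id|transcript`, one utterance per line) plus a folder of WAV files and writes a JSON-lines manifest with `audio_filepath`, `duration`, and `text` fields, which is the manifest layout NeMo training scripts generally expect. It then groups the entries into fragments that each hold a limited amount of audio. The input layout, the function names, and the per-fragment limit are all assumptions made for illustration, not part of the posting.

```python
import json
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Read the duration of a PCM WAV file from its header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def build_manifest(metadata_path, wav_dir, manifest_path):
    """Convert 'file_id|transcript' lines into a JSON-lines manifest.

    Assumption: each file_id maps to <wav_dir>/<file_id>.wav.
    """
    entries = []
    with open(metadata_path, encoding="utf-8") as meta:
        for line in meta:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            file_id, text = line.split("|", 1)
            wav_path = Path(wav_dir) / f"{file_id}.wav"
            entries.append({
                "audio_filepath": str(wav_path),
                "duration": round(wav_duration_seconds(wav_path), 3),
                "text": text,
            })
    with open(manifest_path, "w", encoding="utf-8") as out:
        for entry in entries:
            # ensure_ascii=False keeps non-ASCII letters readable in the file.
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entries


def split_into_fragments(entries, max_seconds):
    """Group entries into consecutive fragments of at most max_seconds each.

    A single entry longer than max_seconds still gets its own fragment.
    """
    fragments, current, total = [], [], 0.0
    for entry in entries:
        if current and total + entry["duration"] > max_seconds:
            fragments.append(current)
            current, total = [], 0.0
        current.append(entry)
        total += entry["duration"]
    if current:
        fragments.append(current)
    return fragments
```

Each fragment can be written to its own manifest file and trained on in a separate Colaboratory session, resuming from the previous checkpoint.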
