New TTS engine: MaryTTS

lolodomo · August 1, 2015, 9:33am

MaryTTS is now available as a TTS engine for the Sonos plugin: trunk – Sonos Wireless HiFi Music Systems
You will have to upload 4 files: I_Sonos1.xml + J_Sonos1.js + L_SonosTTS.lua + S_Sonos1.xml. Take them from the ZIP file.

To set up the engine, you have to set your MaryTTS server URL through the TTS tab of the plugin. This URL should look like http://192.168.0.20:59125

Please note that you have to set the language to “en_GB” or “en_US” for English. If you use only “en”, it does not work. I will update the UI later to make the choice easier.
For other languages like French, just use “fr”, as with the other engines.

Note that this engine produces a WAV file, which means a bigger file.
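
For readers who want to test their server outside the plugin, here is a minimal sketch of how a request URL for a WAV rendering could be built. It assumes the query parameters of the MaryTTS 5.x HTTP interface (`/process` with `INPUT_TEXT`, `INPUT_TYPE`, `OUTPUT_TYPE`, `AUDIO`, `LOCALE`); the actual request built in L_SonosTTS.lua is not shown in this thread, and the function name `build_marytts_url` is hypothetical.

```python
from urllib.parse import urlencode


def build_marytts_url(server_url, text, locale):
    """Return a URL asking a MaryTTS server to render `text` as a WAV file.

    Assumption: the server exposes the MaryTTS 5.x HTTP interface under /process.
    """
    params = {
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",   # WAV output, hence the bigger file
        "LOCALE": locale,       # "en_GB" or "en_US" for English, "fr" for French
    }
    # rstrip avoids a double slash when the configured URL ends with "/"
    return server_url.rstrip("/") + "/process?" + urlencode(params)


print(build_marytts_url("http://192.168.0.20:59125", "Hello world", "en_US"))
```

Opening the printed URL in a browser should download or play the audio if the server and the language are set up correctly.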

I tested with MaryTTS installed on an RPI. It takes a long time to deliver the WAV file (several seconds) and the quality of the speech is low.

In your scenes, you have to use “MARY” to identify this engine.

lolodomo · August 1, 2015, 3:24pm

I just improved error checking. Please update: my previous version led to TTS not working after a first failure.
A warning is now logged when the audio file cannot be retrieved and Sonos audio is not cut at all in this case.

I will improve error checking for MaryTTS again later by checking that the URL is OK and that the language is an available one.

lolodomo · August 2, 2015, 10:03am

I committed a few changes. If the TTS request fails, the plugin now checks whether the MaryTTS server is reachable (URL is OK) and whether the requested language is available. So the logged error is now a little more detailed.
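
The two checks described above can be sketched as follows. This is an illustration, not the plugin's Lua code: it assumes the MaryTTS server answers `/locales` with one locale per line, and the names `fetch_text`, `parse_locales` and `diagnose` are hypothetical.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=5):
    """Download a small text resource; raises OSError (URLError) on failure."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_locales(body):
    """Assumption: /locales returns one locale per line, e.g. 'en_GB'."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def diagnose(server_url, language, fetch=fetch_text):
    """Explain why a TTS request may have failed: bad URL or unknown language."""
    try:
        body = fetch(server_url.rstrip("/") + "/locales")
    except OSError as exc:
        return f"MaryTTS server not reachable at {server_url}: {exc}"
    locales = parse_locales(body)
    if language not in locales:
        return f"language '{language}' not available (server offers: {', '.join(locales)})"
    return "server reachable and language available"
```

Calling `diagnose("http://192.168.0.20:59125", "en")` against a server that only offers `en_GB` and `en_US` would report the language as unavailable, which matches the behaviour described in the first post.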

lolodomo · August 2, 2015, 11:29am

If you installed several voices for a language, one will be the default voice for this language and only this one is currently available through the plugin. To offer a choice of voice, I would have to add a new argument to the Say action. I am not sure I will do it, as it is something only available with this TTS engine.
The workaround is to keep on the server side only the voice you prefer for each language.

BOFH · December 28, 2015, 9:08pm

I decided to install MaryTTS on my WHS2011 server, which also runs Blue Iris and Serviio, and it was pretty painless. I do like the response time as the server is on my local network.

The installation procedure I followed (as I could not find it in the Wiki) is at http://fastertutorials.com/?p=35

MaryTTS requires Java runtime 1.7 or higher, which I already had installed as Serviio runs on that server.

MaryTTS can be found at: http://mary.dfki.de/download/

Voices for MaryTTS are downloadable at https://github.com/marytts/marytts/tree/master/download

One caveat for WHS2011: you will have to manually create an inbound allow rule for TCP port 59125 to provide access to the MaryTTS HTTP server. By default this port is closed.

MaryTTS server URL format: http://IP.OF.YOUR.SERVER:59125 (replace IP.OF.YOUR.SERVER with the IP address or name of your server). You can open the URL in a browser to test the MaryTTS server. It should bring up the web interface.
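
If the browser test fails, a quick way to tell a firewall problem from a MaryTTS problem is to check from another machine whether the TCP port accepts connections at all. The following is a minimal sketch using only the Python standard library; the function name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, timed out, unreachable or unresolvable host
        return False
```

For example, `port_is_open("IP.OF.YOUR.SERVER", 59125)` returning False while the server batch file is running would point at the missing inbound firewall rule.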

Michael_N_Blackwell · December 28, 2015, 9:29pm

@lolodomo, a question: I finally moved over to Microsoft TTS. Is there an advantage to using MaryTTS, e.g. better speech options, or is it quicker? Based on @BOFH’s post I could have my own copy on a local network. Mike

integlikewhoa · December 28, 2015, 10:03pm

I would also assume no limit now. Curious to see what else.

integlikewhoa · December 28, 2015, 10:49pm

I’m running this on Windows 8 with BI and PLEX and a few other things. Can I close the Command Prompt window, or is there an easy way to run it without the CMD window open?

BOFH · December 28, 2015, 10:53pm

MaryTTS has no limit on the number of characters, as it’s public license software and you run it on your own system.
It has a number of voices available and it’s great fun having e.g. an Italian female voice speak something in English. Adorable accent. MaryTTS has a lot of options for inflection, tone, timbre etc. but unfortunately at this time these don’t seem to be supported by the plugin. Nor is being able to choose the voice. I can only switch between a US female (en-US) and a UK female (en-GB) voice. I wish I could specify inflections such as ‘questioning’. Using en-CA results in silence, not even an ‘eh’, as MaryTTS only supports de, en-GB, en-US, fr, it, ru, sv, te and tr.

Installation was pretty simple and took less than 15 minutes, with another 30 spent figuring out the firewall issue and how to set up the scheduler to run the MaryTTS bat file at startup.

I’m hoping Lolodomo will consider adding some of the options MaryTTS supports to the plugin so we can actually pick the voice and inflection.

For now my default is still the MS voice until I can do more testing, but I’m pretty sure I’ll migrate to MaryTTS as the default as it is not dependent on a 3rd party server on the Internet.

@integlikewhoa: Use the task scheduler to make the server batch file run at boot; on my WHS2011 it neatly hides the CMD window. If you run it manually, it exits as soon as you log out of the server (e.g. an RDP session). I don’t know if there is a way to run it as a service. I’ve run it with the CMD window minimized with no issues.

integlikewhoa · December 29, 2015, 12:01am

OK, I made a .VBS file that I run at Windows startup; it runs this .bat file hidden in the background on my Windows machine.

I used Notepad and saved the file with the extension changed to “.VBS”.
The file contains the line below, with my own file location and name between the quotes.

CreateObject("Wscript.Shell").Run "C:\MaryTTS\marytts-5.1.2\bin\marytts-server.bat",0,True

lolodomo · December 29, 2015, 10:14am

In my humble opinion, the quality of voices is very low compared to Google or Microsoft.
On an RPI, building the WAV file takes very long, and as a result Google or Microsoft are faster to deliver the speech than MaryTTS.
The only advantage I see in MaryTTS is that it is usable without Internet. But using the Internet is not a problem for me and I prefer Microsoft quality.

lolodomo · December 29, 2015, 10:21am

Let me know how you would see the changes.
I don’t really want to add one or several parameters to the Say action as they are already very numerous. And remember that Say action is common for all engines.

BOFH · December 29, 2015, 2:45pm

@Lolodomo: Thanks for the offer and I do realize you don’t want too many parameters for the Say command. But would it be possible to incorporate the VOICE parameter so specific voices can be specified on the TTS tab for MaryTTS? I’m perfectly happy having to type the voice name if it’s too much of a programming effort for the plugin to query the defined MaryTTS server for installed voices. The list of locally installed voices is at http://IP.ADDRESS:59125/Voices . Like others, I prefer to be able to use different voices for different situations. Cheerful Poppy works fine for the weather but a male voice would be more suited in case of a fire or similar scenario.
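
As an illustration of what querying the server for installed voices could involve, here is a small sketch that parses a voice listing and picks a voice by locale and gender. It assumes the listing has one voice per line in the form `name locale gender type`; the sample data and the function names `parse_voices` and `pick_voice` are hypothetical, and nothing here reflects how the plugin is actually written.

```python
def parse_voices(body):
    """Parse a voice listing, assuming 'name locale gender ...' on each line."""
    voices = []
    for line in body.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        voices.append({"name": fields[0], "locale": fields[1], "gender": fields[2]})
    return voices


def pick_voice(voices, locale, gender):
    """Return the first voice name matching locale and gender, or None."""
    for voice in voices:
        if voice["locale"] == locale and voice["gender"] == gender:
            return voice["name"]
    return None


# Illustrative listing only; a real server returns whatever voices are installed.
sample = """dfki-poppy-hsmm en_GB female hmm
dfki-spike-hsmm en_GB male hmm
cmu-slt-hsmm en_US female hmm
"""
voices = parse_voices(sample)
print(pick_voice(voices, "en_GB", "female"))  # weather announcements
print(pick_voice(voices, "en_GB", "male"))    # alarm announcements
```

With a lookup like this, a scene would only need to state the situation (locale and gender) and the voice name could be resolved from whatever is installed on the server.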

I can see where an rPi could be slow responding. My WHS2011 server is a Lenovo TS140 with a 4-core Xeon CPU. Even though it runs Blue Iris with about 13 cameras as well as the Serviio media server, it generates a WAV file almost instantly. As you said, speech quality for MS is better, but its response times seem to vary, especially with multiple requests.

I downloaded the Prudence and Poppy voices and they are of better quality than the default ones, but not there yet compared to Google and Microsoft. MaryTTS allows you to create your own voices though. Lord help us if I ever get a hold of Barbara Eden as that would more than likely result in a ‘Jeannie’ voice. I’m trying to figure out what is needed in the form of WAV files to build a voice. It would be a good excuse for buying the DVD set of the Jeannie series if I could get all that’s needed from there.

I’ll settle for lower quality though if that means faster response and I’m not dependent on a 3rd party internet server. Just look at what happened to Google TTS.

symonsaz · December 30, 2015, 12:08am

Just before I throw in the towel with Microsoft TTS and go through setting up MaryTTS: is anyone else having issues with Microsoft as of today? Like the web server is not working, similar to how Google TTS went down a few months ago?

BOFH · December 30, 2015, 12:33am

As I replied in the other thread where you posted this: I just had Vera tell me the weather using Microsoft and the en-CA voice.

integlikewhoa · December 30, 2015, 12:36am

I also used it like 20 min ago.

lolodomo · December 30, 2015, 12:12pm

No problem for me with Microsoft. Just tested.

lolodomo · December 30, 2015, 12:22pm

As I understand it, you want to be able to select the voice and also to use other options. What I could propose is a new generic additional parameter that would allow you to provide additional information for the URL (I would concatenate the provided data to the URL). Would that be OK?
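
The concatenation idea can be sketched in a few lines. This is only an illustration of the proposal, not plugin code: the function name `append_extra` is hypothetical, and the `VOICE` parameter in the usage example is assumed to be accepted by the MaryTTS HTTP interface.

```python
def append_extra(url, extra):
    """Concatenate a user-supplied query fragment to an existing request URL."""
    # Tolerate fragments typed with a leading "&" or "?"
    extra = extra.strip().lstrip("&?")
    if not extra:
        return url  # nothing to add: behave exactly as before
    separator = "&" if "?" in url else "?"
    return url + separator + extra


base = "http://192.168.0.20:59125/process?INPUT_TEXT=Hello&LOCALE=en_GB"
print(append_extra(base, "VOICE=dfki-poppy-hsmm"))
```

Because an empty fragment leaves the URL untouched, engines other than MaryTTS would see no change in behaviour; whether adding the parameter to the action definition affects existing scenes is a separate question.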

I am not sure at all, but I think I noticed in the past that changing the parameter definitions for an action could break that action in an existing scene. If that is confirmed and I make the change, every user having a Say action in a scene would have to update these scenes! Imagine the number of users that will come here to say the TTS is not working anymore… So I hesitate. But the best would be to take the time to confirm whether this is really the case.

BOFH · December 30, 2015, 5:38pm

@lolodomo: That idea would be perfect and future proof. I do see your point about possibly breaking other users’ scenes. Perhaps you could create an additional parameter in the device options (defaulting to FALSE) that we would have to switch to TRUE for the plugin to use that new action in creating the request? If set to FALSE, it would just use the current way. Is that an option?

lolodomo · December 31, 2015, 12:25pm

No, impossible. Each action is defined in an XML file provided with the plugin.