Say action - different problems and solutions

teonebello · December 31, 2012, 1:57pm

Hi All,

Since you are discussing the Say function, I would like to ask for your help.

I created a “good morning message” using the Sonos Say function.

As the text is long, I had to split it across several calls.
I tried to separate them using the delay function.

…
luup.call_action(LS_SID, "SetURIToPlay", {URIToPlay = "x-rincon-mp3radio://translate.google.com/translate_tts?tl=it&q=buongiorno+famiglia+XXX.+al+momento+ci+sono+"..tostring( lul_temp ).."+gradi+e+vento+a+"..tostring( lul_wind ).."+kilometriorari"}, DEVICE_NO)
luup.call_action(MN_SID, "Play", {}, DEVICE_NO)
luup.sleep(6500)
luup.call_action(LS_SID, "SetURIToPlay", {URIToPlay = "x-rincon-mp3radio://translate.google.com/translate_tts?tl=it&q=le+previsioni+per+oggi+sono+"..tostring( lul_weather ).."+con+una+massima+di+"..tostring( lul_maxtemp ).."+gradi"}, DEVICE_NO)
luup.call_action(MN_SID, "Play", {}, DEVICE_NO)
luup.sleep(6500)
luup.call_action(LS_SID, "SetURIToPlay", {URIToPlay = "x-rincon-mp3radio://translate.google.com/translate_tts?tl=it&q=Il+sistema+di+allarme+eh+attualmente+"..tostring( lul_smart_text ).."+…+Buona+giornata"}, DEVICE_NO)
luup.call_action(MN_SID, "Play", {}, DEVICE_NO)
luup.sleep(7000)
luup.call_action(MN_SID, "Stop", {}, DEVICE_NO)

Is the delay function the right approach? It seems that the scene stays busy until the delays have finished.
Is there still a length limit in the Say function? Removing it would solve my issue.

Thank you for your support
Matteo
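
The long string concatenations above can be easier to maintain if the URI is built by a small helper. The following is only a sketch: the names `TTS_PREFIX`, `url_encode` and `tts_uri` are hypothetical, and only the URL prefix and the `tl` and `q` parameters are taken from the post. It percent-encodes the text so that accented characters and punctuation do not break the URL.

```lua
-- Hypothetical helper names; only the URL prefix and the tl/q parameters
-- come from the scene code in the post above.
local TTS_PREFIX = "x-rincon-mp3radio://translate.google.com/translate_tts"

-- Percent-encode a string for use as a query value (spaces become "+").
local function url_encode(s)
  -- Encode every byte that is not alphanumeric, "-", "_", ".", "~" or space.
  s = s:gsub("[^%w%-_%.~ ]", function(c)
    return string.format("%%%02X", string.byte(c))
  end)
  return (s:gsub(" ", "+"))
end

-- Build the URI to pass as URIToPlay for one text fragment.
local function tts_uri(lang, text)
  return TTS_PREFIX .. "?tl=" .. lang .. "&q=" .. url_encode(text)
end

print(tts_uri("it", "buongiorno famiglia"))
-- prints x-rincon-mp3radio://translate.google.com/translate_tts?tl=it&q=buongiorno+famiglia
```

The result would then be passed as `URIToPlay` in the `SetURIToPlay` call, exactly as in the scene code.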

lolodomo · December 31, 2012, 2:22pm

You have run into the problem of the URL being too long. I have not checked, but the way you handle it is probably one way to get it to work.
But why not use the Say function?

To come back to the Say function, we could try to cut the text automatically and then call Google several times. We will have to find the best cut points in the text (such as the ends of sentences), as there will be some delay between the playback of each fragment (the time for Sonos to switch to the next “web-radio” file). Another alternative would be to let the user define the cut points in the text, for example with a special character. We would then add other cut points only if there are still too many characters.

parkerc · January 4, 2013, 11:12am

Hi lolodomo

Just thinking (out loud, as I do)

As it seems Google TTS has a 100 character limit per request, an idea might be to have a character count next to the SAY field (or a fixed limit on that field) so people know or can see if they are going to reach the limit on any request. (There looks to be no restriction at the moment)

I’m testing your skills now, but you may consider having a ‘+’ option so you can add extra SAY commands; then, if all are requested at almost the same time, it might be possible to play them in order with little to no delay.

lolodomo · January 5, 2013, 10:32am

[quote=“parkerc, post:23, topic:173748”]Hi lolodomo

Just thinking (out loud, as I do)

As it seems Google TTS has a 100 character limit per request[/quote]

When I type my text directly into http://translate.google.fr/?hl=fr, I do not hit the limit I get when using the API. Isn’t the website using the API?

[quote=“parkerc, post:23, topic:173748”]an idea might be to have a character count next to the SAY field (or a fixed limit on that field) so people know or can see if they are going to reach the limit on any request. (There looks to be no restriction at the moment)[/quote]

You mean the field in the UI (player tab)? As it is JavaScript, it is probably possible. But that would not help users who call Say in other ways (from a scene). So the idea is rather to find a global solution, not something limited to one particular usage.

[quote=“parkerc, post:23, topic:173748”]I’m testing your skills now, but you may consider having a ‘+’ option so you can add extra SAY commands; then, if all are requested at almost the same time, it might be possible to play them in order with little to no delay.[/quote]

Sorry I don’t understand your request.

brientim · January 5, 2013, 11:12am

I would suggest the last comment was about enabling a “+” button that would provide another input for additional text.

However, if the API is limited to 100 characters, it might be a better option to cut the string at the last complete word before the 100-character limit and send it in multiple blocks.
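
The word-boundary split suggested here could look roughly like the sketch below. The function name `split_words` is hypothetical, and the sketch makes two assumptions: the limit is counted in bytes (Lua's `#`), which overestimates the length of accented text, and a single word longer than the limit is passed through unchanged.

```lua
-- Split `text` into chunks of at most `limit` bytes, breaking only at spaces.
-- Hypothetical helper; a single word longer than `limit` is kept whole.
local function split_words(text, limit)
  local chunks, current = {}, ""
  for word in text:gmatch("%S+") do
    if current == "" then
      current = word
    elseif #current + 1 + #word <= limit then
      -- The word still fits in the current chunk (plus one space).
      current = current .. " " .. word
    else
      -- Close the current chunk and start a new one with this word.
      chunks[#chunks + 1] = current
      current = word
    end
  end
  if current ~= "" then chunks[#chunks + 1] = current end
  return chunks
end

for i, chunk in ipairs(split_words("one two three four", 9)) do
  print(i, chunk)
end
```

Each chunk would then be sent to Google as its own request.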

parkerc · January 5, 2013, 12:13pm

Thanks Brientim,

Yes that was the idea, to have the ability to do multiple requests, that could then be played in order (maybe added in sequence to the queue).

@lolodomo

As for the 100-character limit, check out “The Unofficial Google Text-To-Speech API” on ViralPatel.net. I interpret this to mean that requests not made directly in the Google TTS page have a limit set on them…

brientim · January 5, 2013, 12:18pm

Something like this chain of thought.

lolodomo · January 5, 2013, 12:54pm

The limit at 100 characters is exactly what I have with my initial example.
The good news is that the limit is absolutely not dependent on the way we call Google.
So now we just have to cut any text longer than 100 characters into several blocks.

lolodomo · January 5, 2013, 1:09pm

[quote=“Brientim, post:27, topic:173748”]Something like this chain of thought.[/quote]

My question is then: can we concatenate all the downloaded files into a single file?
Some tests have to be done to check whether concatenating two files with the same sampling rate plays correctly on the Sonos.

lolodomo · January 5, 2013, 1:20pm

[quote=“lolodomo, post:29, topic:173748”][quote=“Brientim, post:27, topic:173748”]Something like this chain of thought.[/quote]

My question is then: can we concatenate all the downloaded files into a single file?
Some tests have to be done to check whether concatenating two files with the same sampling rate plays correctly on the Sonos.[/quote]

The answer is yes, I just ran a successful test 8)
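
The post does not say how the files were joined. Assuming a plain byte-for-byte append of the downloaded MP3 fragments, a minimal sketch in Lua could look like this (the function name `concat_files` and the file names in the usage comment are hypothetical):

```lua
-- Append the files listed in `parts`, in order, into the file `target`.
-- Files are opened in binary mode so the MP3 data is copied unchanged.
local function concat_files(parts, target)
  local out = assert(io.open(target, "wb"))
  for _, path in ipairs(parts) do
    local f = assert(io.open(path, "rb"))
    out:write(f:read("*a"))
    f:close()
  end
  out:close()
end

-- Hypothetical usage:
-- concat_files({"/tmp/say1.mp3", "/tmp/say2.mp3"}, "/tmp/say.mp3")
```

This does nothing clever with MP3 headers or tags; it relies on all fragments sharing the same encoding parameters, as discussed above.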

brientim · January 5, 2013, 1:24pm

That was very quick. :o

lolodomo · January 5, 2013, 1:24pm

So my idea would be the following for cutting the buffer:
1 - first find the # characters in the string and cut after these characters (that means the user can decide where to cut)
2 - if that is not enough (and/or no # character is found), cut after a period (.)
3 - if that is still not enough, cut at the end of the last word
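
The three steps above can be sketched as follows. This is an illustration under assumptions, not the plugin's actual code: the names `trim`, `cut_position` and `split_text` are hypothetical, the `#` markers are removed from the spoken text, lengths are counted in bytes, and a hard cut at the limit is used when a block contains neither a period nor a space.

```lua
-- Remove leading and trailing whitespace.
local function trim(s)
  return s:match("^%s*(.-)%s*$")
end

-- Return the index at which to cut `s` so the first piece has at most `limit` bytes.
local function cut_position(s, limit)
  local head = s:sub(1, limit)
  -- Step 2: prefer the last period inside the window.
  local dot = head:match(".*()%.")
  if dot then return dot end
  -- Step 3: otherwise cut at the last space inside the window.
  local space = head:match(".*()%s")
  if space then return space end
  -- No natural break found: hard cut at the limit.
  return limit
end

-- Split `text` into blocks of at most `limit` bytes.
local function split_text(text, limit)
  local chunks = {}
  -- Step 1: the user-defined "#" markers always produce a cut.
  for segment in text:gmatch("[^#]+") do
    local rest = trim(segment)
    while #rest > limit do
      local pos = cut_position(rest, limit)
      chunks[#chunks + 1] = trim(rest:sub(1, pos))
      rest = trim(rest:sub(pos + 1))
    end
    if rest ~= "" then chunks[#chunks + 1] = rest end
  end
  return chunks
end

for i, chunk in ipairs(split_text("Hello world.#This is a long sentence. Short one", 25)) do
  print(i, chunk)
end
```

With a limit of 25, the sample text is cut once at the `#` and once after the period, giving three blocks.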

brientim · January 5, 2013, 1:55pm

I suppose there are a few options:
What you have stated, where a distinct character is searched for and used to cut. The only issue I see here is that the user would need to know the current character position and block length in order to place the character within the blocks. E.g. if the first block ends @80, the next block finishes @180.

Provide multiple inputs with a maximum length of 99 characters each, and then concatenate them with a space.

I’d work on the premise that it may be best to do this automatically, without user intervention.

What should be the maximum entry allowed: 200, 300? Whatever is allocated, the input box would need to be resized so the user can see the full text, especially if option 1 is implemented.

parkerc · January 5, 2013, 2:01pm

Agreed.

Limiting a SAY to 100 characters would be good, and if more are needed then the user can choose to add (‘+’) another SAY input box…

(How would this work in a scene)?

parkerc · January 5, 2013, 2:07pm

@Lolodomo

I’m probably pushing my luck now, but for a future release maybe move ‘Speech’ to its own tab and think about a dynamic ‘Say’ TTS builder? (With some of these as presets - http://forum.micasaverde.com/index.php/topic,12408.0.html)

The best speeches for me are the ones that Vera makes up itself

brientim · January 5, 2013, 2:13pm

The penny drops and hits me. I had not thought about scenes. As I do not currently use Vera for any AV equipment, I am not in the habit of doing this on Vera.

How would it deal with multiple entries in a scene? One thing at a time.

lolodomo · January 5, 2013, 2:33pm

If Say is called several times (in a scene), the previous context will be restored between each call.

lolodomo · January 6, 2013, 9:48am

[quote=“Brientim, post:33, topic:173748”]I’d work on the premise it maybe best to do this automatically without user intervention.

What should be the maximum entry allowed, 200, 300? Whatever it is allocated, the input box would need to be resized to enable a user to see full text especially if option 1 is implemented.[/quote]

OK, I will do it automatically, but the rendering could be worse than with cut points chosen cleverly by the user.
I will define no limit.
Regarding the UI, the scene advanced tab is controlled by MCV; I cannot change it.
For the player tab (Sonos), I lack space. As suggested, the best option could be to add a new tab dedicated to TTS. But these TTS fields are there only for testing purposes, so I am not sure it is worth spending time on this.

lolodomo · January 6, 2013, 10:00am

[quote=“parkerc, post:35, topic:173748”]@Lolodomo

I’m probably pushing my luck now, but for a future release maybe move ‘Speech’ to its own tab and think about a dynamic ‘Say’ TTS builder? (With some of these as presets - http://forum.micasaverde.com/index.php/topic,12408.0.html)

The best speeches for me are the ones that Vera makes up itself ;)[/quote]

This is of course an interesting use of the Say command, and I will define some Say commands in my own scenes. But in my opinion, it cannot be included in the plugin because everybody has their own needs, starting with the language to be used. The best would be for everybody to add their examples to your topic.

lolodomo · January 6, 2013, 10:34am

[quote=“lolodomo, post:19, topic:173748”]Unfortunately I discovered that my solution produces a strange rendering of the end of the file.
I imagine it is due to the fact that the Sonos buffers a certain number of MP3 packets before playing them.
I think the solution would be to add real MP3 packets to the file, packets corresponding to silence.[/quote]

Does anyone have the skills to tell me what a valid silent MP3 packet would look like?