As a follow-up to my previous post, I did a bit more digging and found that there is also the Project Oxford Bing Voice Output API:
https://msdn.microsoft.com/en-us/library/mt679063.aspx
Its API endpoint is:
https://speech.platform.bing.com/synthesize
I’m guessing this is a newer API, and I suppose it is the one most likely to be supported going forward? In your post, however, the people you spoke with mentioned the “/Speak” endpoint, which this one is not. The endpoint the plugin uses has a very different URL, but it does call “/Speak”.
Regardless, I followed the documentation for the Project Oxford based API, where gender can indeed be specified, and built some PHP code to test it. With 100% consistency, my text is converted to audio in the specific “dialect” and “gender” I specify.
The key difference is calling this:
https://speech.platform.bing.com/synthesize
Versus:
http://api.microsofttranslator.com/V2/Http.svc/Speak
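For anyone who wants to try the synthesize endpoint, here is a rough sketch of how a request with an explicit dialect and gender might be assembled. It is an illustration in Python, not the PHP test code mentioned above. The SSML layout, the header names, the output format value and the example voice name are assumptions recalled from the documentation linked earlier and should be checked against it. The access token is a placeholder, and nothing is actually sent.

```python
from urllib.request import Request
from xml.sax.saxutils import escape, quoteattr

# Endpoint quoted in this post.
SYNTHESIZE_URL = "https://speech.platform.bing.com/synthesize"


def build_ssml(text, locale, gender, voice_name):
    """Return an SSML document requesting one voice in one locale and gender.

    The element and attribute layout is an assumption based on the linked
    documentation; verify it there before relying on it.
    """
    return (
        '<speak version="1.0" xml:lang={loc}>'
        "<voice xml:lang={loc} xml:gender={gen} name={name}>{body}</voice>"
        "</speak>"
    ).format(
        loc=quoteattr(locale),       # the "dialect", e.g. en-US or en-GB
        gen=quoteattr(gender),       # "Female" or "Male"
        name=quoteattr(voice_name),  # must match a voice the service offers
        body=escape(text),           # escape &, < and > in the spoken text
    )


def build_request(ssml, access_token):
    """Build (but do not send) a POST request carrying the SSML body."""
    headers = {
        # Header names and values below are assumptions, not confirmed here.
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "riff-16khz-16bit-mono-pcm",
        "Authorization": "Bearer " + access_token,
    }
    return Request(
        SYNTHESIZE_URL,
        data=ssml.encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    ssml = build_ssml(
        "Hello from the synthesize endpoint.",
        "en-US",
        "Female",
        # Example voice name; assumed, check the documentation's voice list.
        "Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)",
    )
    request = build_request(ssml, "YOUR_ACCESS_TOKEN")  # placeholder token
    print(request.get_method(), request.full_url)
    print(ssml)
```

Passing the request to `urllib.request.urlopen` would send it, which requires a valid token obtained with the separate key / secret mentioned below.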
So I suppose that, if worst comes to worst, the plugin could potentially be updated to use the Project Oxford based API. However, its limits seem to be a bit lower. I was able to use the same client ID, but had to sign up there to obtain a new, different key / secret to get it going.