I’m working on the Google Latitude plugin that @mrw298 started and am a bit stumped at the moment.
For whatever reason, an http.request() (or https.request()) call will eventually get stuck, i.e. it never returns to the calling function, regardless of what I’ve set http.TIMEOUT, socket.http.TIMEOUT or https.TIMEOUT to.
I’ve also tried combining debug.sethook() and pcall() in an attempt to force the function to at least get terminated at some point. It still isn’t, so I’m thinking it’s probably a bug in the MiOS implementation of Lua 5.1(?). I’ve reported the issue to Micasaverde tech support, but as we all know from experience, responsiveness doesn’t always appear to be one of their strong suits.
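For readers unfamiliar with the debug.sethook()/pcall() approach, here is a minimal sketch of the general idea, not the plugin's actual code. The name `run_with_deadline` and the 1000-instruction check interval are made up for illustration. One known limitation: a count hook only fires while Lua bytecode is executing, so it cannot interrupt a call that is blocked inside C code such as a socket read.

```lua
-- Hypothetical helper: run fn(...) under pcall and raise an error from a
-- count hook once roughly `limit_s` seconds of wall-clock time have passed.
-- The hook runs only between Lua VM instructions, so a call that is blocked
-- inside a C function will not be interrupted by it.
local function run_with_deadline(limit_s, fn, ...)
  local deadline = os.time() + limit_s
  debug.sethook(function()
    if os.time() >= deadline then
      error("deadline exceeded")
    end
  end, "", 1000)               -- check the clock every 1000 VM instructions
  local ok, result = pcall(fn, ...)
  debug.sethook()              -- always remove the hook again
  return ok, result
end
```

Usage would look like `local ok, res = run_with_deadline(30, poll_once)`, where `poll_once` stands in for whatever function performs the request.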
Since the problem seems related to the frequency of the https.request() calls (it manifests much sooner if I poll the Google API servers every 5 seconds than if I poll them every minute or every 4 minutes), and since it “spreads” to other Google Latitude plugin instances, it looks like a deadlock in either the Linux IP/TCP/socket layer or the Lua tcp/socket/http/https implementation. I can’t tell whether it will affect anything else Lua-related, though, since I’ve not been willing to leave the “hung” plugins in that state for an extended period (beyond overnight).
So my question is two-fold:
1: Anybody else experiencing this in their plugins?
2: How are you testing that timeouts work properly and how do you prevent functions from getting “hung” in your implementations?
I’d rather not release the Google Latitude plugin into the world with what is a massive problem.
[quote=“sjolshagen, post:1, topic:174631”]For whatever reason a http.request() (or a https.request()) call will eventually get stuck - i.e. it never, ever, returns to the calling function. Regardless of what I’ve set http.TIMEOUT, socket.http.TIMEOUT or https.TIMEOUT to. …
1: Anybody else experiencing this in their plugins?
2: How are you testing that timeouts work properly and how do you prevent functions from getting “hung” in your implementations?[/quote]
Never saw this with the Nest or ecobee thermostat plugins which use https.request() calls. The latter may sometimes do 3 https.request() calls in immediate succession; the former only performs one https.request() every polling cycle.
The plugin code doesn’t handle timeouts because I’ve never seen the calls get hung up.
I don’t see the hang until I’ve done a new request every 5 seconds for 30-40 minutes (so, ballpark, 300-400 requests or so). But it can happen with fewer requests too. For instance, it happens - it seems - randomly whenever I do a request every 3 minutes for several hours from each of the plugin instances.
Actually, as I type that, I start to wonder if there isn’t a correlation between the number of requests and the issue… I’ve got 2 Google Latitude plugins configured and one (and eventually both) will hang after a while.
A reload of the Luup/Lua engine will fix the problem for a while.
When it’s in the hung state, what does the output of [tt]netstat -a[/tt] look like? Saving the output of netstat and comparing at different times might show something.
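To make that comparison less tedious, a saved netstat dump can be summarised per TCP state. The sketch below is only an illustration: `count_tcp_states` is a made-up name, and it assumes the plain `netstat -a` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), which may differ between builds.

```lua
-- Hypothetical helper: count TCP connections per state in the text output of
-- `netstat -a`. Assumes the state is the 6th whitespace-separated column on
-- lines whose first column starts with "tcp".
local function count_tcp_states(netstat_text)
  local counts = {}
  for line in netstat_text:gmatch("[^\r\n]+") do
    local fields = {}
    for field in line:gmatch("%S+") do
      fields[#fields + 1] = field
    end
    if fields[1] and fields[1]:match("^tcp") and fields[6] then
      local state = fields[6]
      counts[state] = (counts[state] or 0) + 1
    end
  end
  return counts
end
```

Running it on snapshots taken before and during a hang, and comparing the counts for states such as CLOSE_WAIT, would show whether sockets are piling up over time.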
My thermostat plugins seem to run for days without any issues (at least, my Vera’s current uptime is 15 days+).
Being stuck in CLOSE_WAIT suggests that the client (your code, or the Luasocket library) isn’t closing its sockets properly. There is a byte in the receive queue (note the 1 in your log), and until you read it and close your handle, it stays open, just in case you want to read the byte an hour from now.
Perhaps the Luasocket http implementation doesn’t behave if there are trailing bytes after the content? (Edit: on seeing the source for the class, it’s hard to see this being true. It seems to do all the right things.)
See if supplying your own socket to the http.request() call makes any difference. (Edit: I suspect not.)
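A minimal sketch of that experiment, assuming LuaSocket's table form of http.request, which accepts a `create` function used in place of socket.tcp. The name `fetch_with_own_socket` is hypothetical, and the modules are passed in as arguments purely to keep the example self-contained. Note that LuaSocket's http module may apply its own http.TIMEOUT to the socket after `create` returns, which would override the value set here; that is worth checking against the installed http.lua.

```lua
-- Hypothetical helper: issue a request through LuaSocket's table-form
-- http.request, handing it a socket we create ourselves.
-- http, socket and ltn12 are expected to be the results of
-- require("socket.http"), require("socket") and require("ltn12").
local function fetch_with_own_socket(http, socket, ltn12, url, timeout_s)
  local body = {}
  local created = 0            -- how many sockets the request asked for
  local ok, code = http.request{
    url = url,
    sink = ltn12.sink.table(body),
    create = function()
      created = created + 1
      local sock, err = socket.tcp()
      if not sock then return nil, err end
      sock:settimeout(timeout_s)   -- may later be overridden by the http module
      return sock
    end,
  }
  return ok, code, table.concat(body), created
end
```

On failure the first return value is nil and the second holds the error message, so the caller can log it and carry on with the next polling cycle.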
It may also help to have the code you’re using to do this. Perhaps there’s a particular [LuaSocket] API usage that’s triggering the issue to occur.
For my usages of it, it’s all been local machine access, something to a machine running on my local network.
I did run into TIMEOUT issues, but have only validated the behavior under UI4/Vera2. Vera3s run a different chipset, for example, and IIRC the Ethernet driver had to be developed from scratch, so there’s always potential for problems (although I wouldn’t expect them to be clean like this).
In my usages, I’ve always set certain headers on the outgoing HTTP request. This is for stuff like telling the other end to close the connection.
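As an illustration of setting such a header, here is a small sketch that adds `Connection: close` to a LuaSocket-style request table before it is passed to http.request. The helper name `with_connection_close` is made up, and whether this header changes the hang described in this thread is untested.

```lua
-- Hypothetical helper: return a copy of a request table with an explicit
-- "Connection: close" header, dropping any existing variant of that header.
local function with_connection_close(reqt)
  local copy = {}
  for k, v in pairs(reqt) do copy[k] = v end
  local headers = {}
  for k, v in pairs(reqt.headers or {}) do
    if string.lower(k) ~= "connection" then
      headers[k] = v
    end
  end
  headers["Connection"] = "close"
  copy.headers = headers
  return copy
end
```

The original table is left untouched, so the same base request can be reused across polling cycles.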
Other tools to help see what’s going on are [tt]lsof[/tt] and [tt]strace[/tt]… although you’d want to strip down the plugins you’re running, and use the “options” parameter, before getting deeply into using [tt]strace[/tt] as it tends to generate a lot of output (I’ve been running it on Vera’s [tt]NetworkMonitor[/tt], since it’s got serious TCP problems)
It seems there may be a bug in the version of either the core liblua, the lua interpreter or luac (compiler?) used in v1.5.622 of MiOS, causing the “permahang” I’ve been trying to work around.
MCV Support told me to try and upgrade the library. I - erroneously? - assumed it had to be the luasocket library, but it seems that hasn’t been updated since 2011(?) Well, at least the version numbers haven’t changed between the current OpenWRT available online and the version used by MCV.
So, instead I upgraded to a more recent version (v5.1.5-1) of liblua, luac & lua. I have been running the plugin for 24+ hours now and it’s not “hung up” yet.
The connection to Google’s API server has gotten stuck and failed (more than once), but the plugin is able to keep on truckin and doesn’t get stuck. And it seems my pcall()/debug.sethook() combination is working as expected now.
I’m giving it at least another 48 hours of error-free running before I’m ready to declare, but I wanted to thank y’all for the input you provided.
Have you found out what the problem was? I’m stuck on the same problem right now. Could you please give me a more detailed description of how to solve it?
Thank you