I am having my Z-Wave commands delayed by 10-15 minutes (i.e. I press a button in the dashboard and the light doesn’t turn on for 10 minutes). This happens 2-3 times a day. The only way to clear these delays is to reboot Vera. I thought things were better last week with the new dongle, but then this started happening.
On the plus side, I can reboot Vera remotely and clear this problem up… whereas when the dongle locked up I had to power cycle Vera completely. Still frustrating though…
Is this happening to anyone else… or am I really really unlucky?
BTW I was running 918 and this seemed uncontrollable… back to 899 and seems better but may just take some time to get all screwy again.
Unplugging the dongle is the only thing that helps if it has completely locked up… I am not local to the Vera most of the time, so a remote reboot is the best I can do. A remote reboot does seem to fix the problem when commands are “delayed” and not completely locked up.
Honestly, I haven’t looked at the logs via SSH; the HTML logs don’t tell me much. I should probably try to get more involved and telnet in, but I don’t really know what I am looking for.
Well, from the HTML log you can see what event number you are in. If it’s, say, 1-100, then Vera is rebooting. If some polling is taking a long time, this is where the hiccup may be.
I downgraded to 899 and stopped all automatic polling. The delays seem more tolerable since I did that… I hope the next firmware addresses this issue…
BOSSTON, I found the cause of your problems. You have a node #23 “Master Bathroom Vanity Lights” which every time it’s polled sends back a Z-Wave APPLICATION_BUSY with a “retry in” value of “1 second”.
It’s very rare that devices ever send this, and it usually means the node is processing a button push at that split second. So, our Z-Wave engine got back the ‘retry in 1 second’ from the light switch and blocked the poll job for 1 second to wait and retry. But the node kept doing this over and over and over again, sometimes hundreds of times in a row. And so the reason for the delays wasn’t memory leaks or anything like that, it’s that the poll job would get to node #23 and sometimes get “stuck” for a very long time. Your node should never respond like this–if it gives us back an application_busy retry in 1 second, that is supposed to mean that we can retry in 1 second and it will be fine. So our code didn’t handle this well–we just blocked everything and waited 1 second figuring the 2nd time it would go through.
I fixed our code and patched your system. Now, if we get back the application_busy we won’t block the job anymore; we’ll let other jobs come in and be processed. I updated your system, let it run for several hours, and then tried those 2 scenes you recommended. They ran instantly. So I’m marking this issue as resolved on our end, but, if node #23 is still under warranty, you should look into getting it repaired since it is somehow defective.
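To make the change concrete, here is a minimal sketch of the idea (not Vera's actual code): a job that gets a busy reply goes back on the queue with a "ready at" time instead of blocking everything behind it. The names `run_jobs`, `send`, and `max_retries` are made up for illustration, and time is simulated so it runs instantly.

```python
import heapq
import itertools

# Hypothetical stand-ins for the reply a node sends back to a poll or command.
APPLICATION_BUSY = "busy"
OK = "ok"


def run_jobs(jobs, send, max_retries=5):
    """Run (node_id, command) jobs without letting a busy node block the rest.

    send(node_id, command) is an assumed callable that returns
    (status, retry_in_seconds). Nothing here talks to real hardware.
    """
    counter = itertools.count()  # tie-breaker so equal ready times stay in order
    queue = [(0.0, next(counter), node, cmd, 0) for node, cmd in jobs]
    heapq.heapify(queue)
    now = 0.0                    # simulated clock, in seconds
    completed, gave_up = [], []

    while queue:
        ready_at, _, node, cmd, attempts = heapq.heappop(queue)
        now = max(now, ready_at)  # skip ahead only when nothing else is ready
        status, retry_in = send(node, cmd)
        if status == OK:
            completed.append((node, cmd))
        elif attempts + 1 >= max_retries:
            gave_up.append((node, cmd))  # stop retrying a node that stays busy
        else:
            # Requeue for later instead of sleeping, so other jobs run first.
            heapq.heappush(
                queue, (now + retry_in, next(counter), node, cmd, attempts + 1)
            )
    return completed, gave_up
```

With a node that always answers busy, the other jobs in the queue still complete right away, and the stuck poll is dropped after `max_retries` attempts instead of holding up the scenes.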
I noticed you updated the system, and yes everything is working better this evening. Thank you very much.
I located the offending switch. Tried to poll it manually and, as I expected from your testing, a red poll event showed up in Vera. Curious as to how this switch responded to other commands… I sent the switch an on command and it came on. Sent it an off command, and it went off. There didn’t appear to be anything wrong with the switch. So I sent it another poll, and it came up red a second time. So polls were the only commands that were getting the busy signal. This also explains why turning off automatic polling helped with my issues in 899… I had a feeling it was getting hung up somewhere amidst all the heavy traffic.
I then pulled the air gap on the switch for a few seconds to reboot it. Once it came back online, I sent another manual poll with vera. Great success… green poll. Very interesting… all seems better as of right now.
I am glad you fixed the code though… if these devices decide to “hang up” like this one did… how is the user supposed to know which device is the offending one? The switch was working properly as far as I could tell… but it was causing a major problem on my network. I will keep an eye on it in case it does it again… but it doesn’t make me feel good, since it is 1 of about 30 Leviton Vizia RF+ switches in my house and I hope all of them don’t get “hung up” every once in a while.
Aaron, once again, I would like to thank you for your help diagnosing and solving this problem, I will keep you posted if I run into any other major problems.
I don’t know WHY they hang like this - could it be because Vera keeps polling them periodically and they just “get tired”?
And yes, controlling them still works, just polling stops working.
There should probably be a way of identifying that a switch is doing that. I would imagine that a problem in the switch firmware is causing the issue.
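For what it's worth, one way to spot a switch like this might be to count consecutive busy replies per node and flag any node that crosses a threshold. This is only a rough sketch of that idea; `BusyTracker` and its `threshold` are hypothetical and are not part of Vera or Luup.

```python
from collections import defaultdict


class BusyTracker:
    """Counts consecutive busy replies per node and flags repeat offenders."""

    def __init__(self, threshold=10):
        self.threshold = threshold       # assumed cut-off, tune to taste
        self.streaks = defaultdict(int)  # node_id -> current busy streak

    def record(self, node_id, busy):
        """Record one poll result; return True if the node should be flagged."""
        if busy:
            self.streaks[node_id] += 1
        else:
            self.streaks[node_id] = 0    # a clean reply resets the streak
        return self.streaks[node_id] >= self.threshold

    def suspects(self):
        """Node ids whose current busy streak has reached the threshold."""
        return sorted(n for n, c in self.streaks.items() if c >= self.threshold)
```

A node that answers busy once because someone pressed its button never reaches the threshold, while one that answers busy on every poll shows up in `suspects()` quickly.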
FYI, not to pile on with this problem, but tonight something stood out and bore mentioning:
I have a scene that’s triggered by a code unlocking my Schlage deadbolt, which is designed to turn on a specific lamp at 50%, wait three minutes, then turn off the same lamp.
Vera, the lamp and its RP200 module, as well as the Schlage are all within 13 feet of one another.
The turning ‘On’ of the lamp is often delayed, tonight by a solid MINUTE! The turning ‘Off’ I’ve noticed is often either delayed or just plain doesn’t happen (rarely).
I’m in Luup revision .918 firmware, and otherwise happy. Just felt like reporting or commiserating as the case may be.