There are plenty of threads regarding UI7 and the dreaded daily “Can’t detect device” error. For me, it pops up early in the morning and the only way to resolve it is a reboot (restarting LUUP does not do it).
I do not know scripting. I assembled the script below from various threads. I want the scene to determine if an error is present. If it is, restart Vera. Is the script below correct? Thanks.
local hasFailure = false
for devNum, devAttr in pairs(luup.devices) do
  local commfailure = luup.variable_get("urn:micasaverde-com:serviceId:HaDevice1", "CommFailure", devNum)
  if (commfailure == "1") then
    hasFailure = true
  end
end
if (hasFailure == true) then
  luup.call_action("os.execute("reboot")")
end
I think this would be closer to the mark:
for devNum in pairs(luup.devices) do
  local commfailure = luup.variable_get("urn:micasaverde-com:serviceId:HaDevice1", "CommFailure", devNum)
  if (commfailure == "1") then
    os.execute "reboot"
  end
end
Thank you. I’ve implemented your code into my scene.
Out of curiosity, after seeing your code, I restructured the one I originally posted. Would this have been more correct? I was wondering why it was necessary to first set the hasFailure variable to false, then check for it, then set another variable, and lastly execute based on the variable state.
Reading your profile “Less is more”, I presume this is a lot of code doing unneeded things?
Cheers.
local hasFailure = false
for devNum, devAttr in pairs(luup.devices) do
  local commfailure = luup.variable_get("urn:micasaverde-com:serviceId:HaDevice1", "CommFailure", devNum)
  if (commfailure == "1") then
    hasFailure = true
    if (hasFailure == true) then
      os.execute "reboot"
    end
  end
end
Does it work?
Out of curiosity, after seeing your code, I restructured the one I originally posted. Would this have been more correct? I was wondering why it was necessary to first set the hasFailure variable to false, then check for it, then set another variable, and lastly execute based on the variable state.
Reading your profile “Less is more”, I presume this is a lot of code doing unneeded things?
Yes, exactly so:
[ul][li]you’re not using [tt]devAttr[/tt] (which, BTW, is not exactly the device attributes)[/li]
[li]you have no need of the logical variable [tt]hasFailure[/tt], because it’s logically equivalent to the condition [tt](commfailure == "1")[/tt][/li]
[li]there is no need for the test [tt](hasFailure == true)[/tt], since the preceding statement sets it to true, it will always be so.[/li][/ul]
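To make those three points concrete, here is a minimal sketch that can be tried outside Vera. The helper name [tt]firstCommFailure[/tt] and the stand-in tables are hypothetical, introduced only for illustration; the only assumption about Vera is that [tt]luup.variable_get[/tt] takes (service, variable, device) in that order, as in the code above.

```lua
-- Hypothetical helper: returns the number of the first device whose
-- CommFailure variable is "1", or nil if there is none.
-- 'devices' is any table keyed by device number (luup.devices on Vera);
-- 'getVar' takes (service, variable, device) like luup.variable_get.
local SID = "urn:micasaverde-com:serviceId:HaDevice1"

local function firstCommFailure (devices, getVar)
  for devNum in pairs (devices) do
    if getVar (SID, "CommFailure", devNum) == "1" then
      return devNum              -- the condition itself is the flag
    end
  end
  return nil
end

-- Stand-in data so the sketch runs without a Vera.
local fakeDevices = { [3] = {}, [7] = {} }
local fakeVars = { [3] = "0", [7] = "1" }
local function fakeGet (service, variable, devNum)
  return fakeVars[devNum]
end

print (firstCommFailure (fakeDevices, fakeGet))
```

In a scene this would presumably be used as [tt]if firstCommFailure (luup.devices, luup.variable_get) then os.execute "reboot" end[/tt], which behaves the same as the short loop posted earlier.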
The less you write, the less can go wrong. I had a programmer working for me once who said:
I don't understand what went wrong, I only changed one line!
Thank you again for the explanation. The “Can’t detect devices” error pops up 5 out of 7 days. I will observe the system and let you know if the script works. Cheers.
[quote author=akbooer link=topic=36976.msg275772#msg275772 date=1458659672]
Does it work?
[/quote]
The script appears to be working. However, I was wrong about the regularity with which this error appears. I previously thought that it was always between 5 and 6 AM, but today the error did not appear until after 6 AM. By that time, the script had already run for the day.
In consideration of that, is it possible to write a “watchdog” script so that if/when the error appears and persists for ten minutes (to ensure that it is not transient), Vera then reboots?
If that is not possible, I suppose my remaining option is to run the script once per hour or in some repetitive fashion. It’s not my preference, but I see no way out of the problem until Vera fixes the root cause.
It’s fairly straightforward to write a callback routine which monitors changes in all CommFailure variables and times how long they remain in that state.
Is that really what you think is necessary to keep your Vera going?
Undesirably, yes [for now]. This thread http://forum.micasaverde.com/index.php/topic,36482.60.html is one of many examples where owners have contacted Vera with the same problem but without a resolution. Several reported that Vera acknowledges the problem and is working on a fix. However, that has been going on since 2014.
All the issues are related to UI7. Some were UI5 to UI7 upgrades; one owner bought a Vera Plus to replace a Vera Lite after he encountered the problem with the Lite, only to find that the Plus had the same problem.
I have not examined the logs to see if there is a clue (my knowledge would be limited anyhow).
I would rather not return to UI5, so for the time being, I figure I can just trigger a reboot whenever the problem arises.
The scenario would be:
Whenever the specific type of failure appears, wait ten minutes and check whether the error persists. If it does, reboot Vera. I want the wait time because sometimes cameras will show up as “not responding”, but that is temporary.
If you can provide some guidance [read: provide a script ;)], I would much appreciate it. Thanks for your continued assistance.
I imagine it would be something along these lines (run in Lua startup) …
CommsTable = {}

-- Callback for luup.variable_watch: record when each CommFailure value last changed.
function CommsMonitor (device, service, variable, _, value_new)
  local now = os.time ()
  local hash = table.concat ({device, service, variable}, '.')
  CommsTable[hash] = {time = now, state = value_new}
  luup.log (table.concat ({"Comms state change: ", hash, value_new}, ' '))
end

-- Runs once a minute: reboot if any failure has persisted longer than the timeout.
function CommsPulse ()
  local timeout = os.time() - 10 * 60 -- ten minute timeout
  for hash, info in pairs(CommsTable) do
    if (info.state == "1") and (info.time < timeout) then
      luup.log "rebooting due to overdue Comms failure"
      os.execute "reboot"
    end
  end
  luup.call_delay ("CommsPulse", 60, '') -- call_delay fires only once, so reschedule
end

luup.variable_watch ("CommsMonitor", "urn:micasaverde-com:serviceId:HaDevice1", "CommFailure")
luup.call_delay ("CommsPulse", 60, '')
…I have not tested this.
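Since the watchdog above is untested, the timeout rule can at least be checked on its own outside Vera. The following is only a sketch: [tt]overdueFailures[/tt] and the sample entries are hypothetical names and data, and the table layout simply mirrors the [tt]{time, state}[/tt] entries stored by CommsMonitor.

```lua
-- Hypothetical pure version of the timeout test used by the watchdog.
-- 'commsTable' maps a key to {time = <os.time() of last change>, state = <string>}.
-- Returns a list of keys that have been in state "1" for longer than timeoutSeconds.
local function overdueFailures (commsTable, now, timeoutSeconds)
  local overdue = {}
  for hash, info in pairs (commsTable) do
    if info.state == "1" and (now - info.time) > timeoutSeconds then
      overdue[#overdue + 1] = hash
    end
  end
  return overdue
end

-- Made-up sample data: a fixed "now" keeps the result repeatable.
local now = 100000
local sample = {
  ["5.sid.CommFailure"] = {time = now - 15 * 60, state = "1"},  -- failed 15 minutes ago
  ["6.sid.CommFailure"] = {time = now - 2 * 60,  state = "1"},  -- failed 2 minutes ago
  ["7.sid.CommFailure"] = {time = now - 60 * 60, state = "0"},  -- recovered an hour ago
}

for _, hash in ipairs (overdueFailures (sample, now, 10 * 60)) do
  print ("overdue:", hash)
end
```

Only the entry that has been failing for longer than ten minutes is reported; the recent failure and the recovered device are ignored, which is the behaviour the scenario asks for.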
Thank you very much! I’ll give this a try later today and report back.
The issue I see is that this could cause an infinite reboot loop every 10 minutes if a device fails, rendering the whole system inoperable!
Depending on the scale of the deployment, you may need to increase the time to allow more attempts.
You could increment a counter before each reboot and, once it reaches x reboots, stop rebooting and only send a notification.
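A rough sketch of that counter idea follows. Everything here is hypothetical: [tt]rebootAllowed[/tt], [tt]readCount[/tt] and [tt]writeCount[/tt] are names made up for illustration, and the in-memory table only stands in for storage that survives a reboot. On Vera the two callbacks would presumably wrap luup.variable_get and luup.variable_set on a device and service ID of your choosing, but that wiring is an assumption and is not shown.

```lua
-- Hypothetical helper: decide whether another reboot is allowed.
-- 'readCount' and 'writeCount' are caller-supplied functions that load and
-- store the persisted reboot counter (how it is persisted is up to the caller).
local function rebootAllowed (maxReboots, readCount, writeCount)
  local count = tonumber (readCount ()) or 0
  if count >= maxReboots then
    return false, count          -- limit reached: notify instead of rebooting
  end
  writeCount (count + 1)         -- record the attempt before rebooting
  return true, count + 1
end

-- In-memory stand-in for persistent storage, for trying this outside Vera.
local store = {}
local function readCount () return store.count end
local function writeCount (n) store.count = tostring (n) end

-- Simulate four consecutive watchdog triggers with a limit of three reboots.
for attempt = 1, 4 do
  local ok, count = rebootAllowed (3, readCount, writeCount)
  print (attempt, ok, count)
end
```

The counter would also need to be reset to zero once the failure clears, otherwise the limit is eventually reached by unrelated failures days apart.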
Yes, there may be more checking required, but note that the timeout only begins when the state is set to failure. I’m unsure how system startup behaves, but if the device has already failed, then the state should not be reset. It needs testing.
I really don’t approve of this approach, but needs must…
Thank you both. I have a Vera Plus inbound to replace the Lite. I will first make the upgrade, see how things go, and perhaps work with support before I apply this script. I agree that this solution is not nearly ideal. The UI5 to UI7 move has claimed many victims…