How many of you do a daily or weekly power cycle of your unit? I have the ability to do this with a WiFi outlet and was wondering if there is any benefit? Wondering if this would help with the occasional Vera errors or occasional sluggishness that just seems to crop up from time to time.
Weekly, but I do it using os.execute("reboot") rather than a power cycle. I just started a couple of weeks ago, so I can't say yet whether it appears to help stability.
I hammer on my Veras pretty hard when I'm working on plugins, and rarely find it necessary to do any kind of hard reset. When needed, I usually just launch /sbin/reboot via ssh. I can't recall how long it's been since I've had one in a state that only power cycling would recover (arriving in that state spontaneously; when upgrading firmware, all bets are off). It has happened, but not often, and not in a while.
I do a scheduled daily reboot via a scene, using Vera's OS reboot command. This ensures an orderly shutdown and restart. I used to do it weekly, but found that Vera would sometimes get a bit sluggish after four or five days. Both my Veras are connected to IP-based power switches, which are powered from a UPS. If I need to power-cycle either unit, it can be done either locally or remotely. So far that hasn't been necessary.
The IP power switch is designed for server control. It pings each Vera once per minute, and if 3 consecutive pings fail, it will turn off the power, wait 30 seconds, then turn it back on. I did this because I travel a great deal, and am frequently away from an internet connection. Having done all this, the power switch logs show that neither of the units has required a power cycle. But just in case... ;D
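The watchdog rule described above (three consecutive failed pings, power off, 30-second wait, power on) can be sketched in shell. This only illustrates the logic and is not the power switch's actual firmware; the function names and the `power_off`/`power_on` hooks are hypothetical.

```shell
#!/bin/sh
# Sketch of the watchdog rule: power-cycle after 3 consecutive failed pings.
# Function names and the power hooks are hypothetical, for illustration only.

FAIL_LIMIT=3     # consecutive failures before acting (from the post)
OFF_SECONDS=30   # how long the outlet stays off (from the post)

# next_fail_count CURRENT_COUNT PROBE_STATUS
# Prints the new consecutive-failure count: reset on success, +1 on failure.
next_fail_count() {
    if [ "$2" -eq 0 ]; then
        echo 0
    else
        echo $(( $1 + 1 ))
    fi
}

# needs_power_cycle COUNT
# Succeeds (exit status 0) once the failure limit has been reached.
needs_power_cycle() {
    [ "$1" -ge "$FAIL_LIMIT" ]
}

# Example loop, commented out so that sourcing this file has no side effects:
# count=0
# while true; do
#     ping -c 1 "$VERA_IP" >/dev/null 2>&1; status=$?
#     count=$(next_fail_count "$count" "$status")
#     if needs_power_cycle "$count"; then
#         power_off; sleep "$OFF_SECONDS"; power_on   # hypothetical hooks
#         count=0
#     fi
#     sleep 60
# done
```

Resetting the counter on any successful ping is what makes the failures "consecutive": a single dropped ping never triggers a power cycle.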
I could try that, but I've found in the past that those scenes won't run when Vera is in a Lua Error state.
M
I don't. I have just over 100 devices and a Vera Edge, with just a handful of plugins. The only reboots I do are on upgrades. I have not seen any need to do reboots, to be honest.
[quote="resq93, post:5, topic:198507"]I could try that but I've found in the past that those scenes won't run when Vera is in a Lua Error state.
M[/quote]
You could run the reboot as a scheduled job in /etc/crontab. That would work regardless of the state of the Lua subsystem.
One of my Veras is in a remote location, where I'm not present most of the time. For this unit I have a WiFi wall-plug switch (WeMo), which I can turn off and on remotely if Vera becomes unresponsive for some reason. During the last year I needed to use that switch once or twice.
After those cases I installed the Datamine2 and System Monitor plugins to track memory status. If everything is working correctly, the amounts of free, used, and cached memory should be more or less stable.
If your memory has big ups and downs or is considerably low, it is a signal that something is happening to your unit. This is how I found wrong logging settings on my Edge, which resulted in a "can't write user data" error.
But I don't do reboots on a regular basis, as I don't see any benefit. If the system works fine (stable memory amounts, no frequent luup reloads/restarts), I don't think there is a need to reboot it.
Occasional sluggishness may be caused by luup reloads (you can use the System Monitor plugin to track them), and in that case rebooting doesn't help much.
[quote="HSD99, post:7, topic:198507"][quote="resq93, post:5, topic:198507"]I could try that but I've found in the past that those scenes won't run when Vera is in a Lua Error state.
M[/quote]
You could run the reboot as a scheduled job in /etc/crontab. That would work regardless of the state of the Lua subsystem.[/quote]
Not sure I know how to do that. Can you elaborate with instructions?
Ty
M
http://forum.micasaverde.com/index.php/topic,6751.msg42761.html#msg42761
This is an old thread. The /etc/crontab edits were suggested by MCV. If you are not comfortable with SSHing into your Vera and editing files, you might not want to use this method. This solution probably will not survive a firmware upgrade or restoration from a backup as /etc/crontab may get overwritten.
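For anyone comfortable with SSH, a scheduled reboot entry could be added along these lines. This is a minimal sketch, not the exact edit from the linked thread: the helper name `add_daily_reboot` and the 04:00 time are made up for illustration, and the `root` user column assumes the system-wide /etc/crontab format. If the existing entries in your file have no user column, drop it.

```shell
#!/bin/sh
# Sketch: append a daily reboot job to a crontab file, but only once.
# The resulting line has the form:
#   min hour day-of-month month day-of-week user command
#   0   4    *            *     *           root /sbin/reboot

# add_daily_reboot CRONTAB_FILE HOUR   (hypothetical helper)
add_daily_reboot() {
    file=$1
    hour=$2
    line="0 $hour * * * root /sbin/reboot"
    # -x: match the whole line, -F: fixed string, -q: quiet.
    # Append only when the exact line is not already present.
    grep -qxF "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, commented out so that sourcing this file changes nothing:
# add_daily_reboot /etc/crontab 4
```

The cron daemon may need to be restarted before it notices the new entry, and the entry should be re-checked after any firmware upgrade.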
Additional thought about reboots
Today my Vera got unresponsive.
I rebooted and checked memory usage in Datamine: yesterday the amount of free memory dropped significantly, consumed by an unreasonable increase in cached memory (see screenshot).
I plan to make a scene that forces the controller to reboot, not on a schedule but based on the amount of free memory left (i.e. warn me if it drops below 50k and reboot if it drops below 25k).
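The threshold logic of that plan can be sketched as a shell script. Note that this is a stand-in written in shell, not the Vera scene itself, and it assumes "50k" and "25k" mean kB of free memory as reported in /proc/meminfo; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: classify free memory against the thresholds mentioned in the post.
# Assumption: "50k" and "25k" are kB, matching the MemFree unit in /proc/meminfo.

WARN_KB=50000
REBOOT_KB=25000

# free_kb [MEMINFO_FILE]
# Prints the MemFree value in kB (reads /proc/meminfo by default).
free_kb() {
    awk '/^MemFree:/ { print $2 }' "${1:-/proc/meminfo}"
}

# memory_action FREE_KB
# Prints "reboot", "warn", or "ok" depending on the thresholds above.
memory_action() {
    if [ "$1" -lt "$REBOOT_KB" ]; then
        echo reboot
    elif [ "$1" -lt "$WARN_KB" ]; then
        echo warn
    else
        echo ok
    fi
}

# Example use, commented out so that sourcing this file never reboots anything:
# case "$(memory_action "$(free_kb)")" in
#     reboot) /sbin/reboot ;;
#     warn)   logger "free memory is low" ;;
# esac
```

Keeping the decision in its own function makes it easy to try different thresholds without risking an actual reboot.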
I found that if I don't do a reboot every couple of weeks, I'm forced to do it when it stops responding altogether. I haven't scheduled one yet (and thank you, everyone, for posting methods of HOW to do this), but it will happen. It sucks pretty bad when you're not at home and you get an alert saying your Vera is down and you have NO WAY to fix it remotely. When I do an occasional "bounce" every couple of weeks, this doesn't happen.
Vera Secure
Firmware 1.7.3535
Every situation is unique. There isn't a "one size fits all" solution to the "should I schedule an automatic reboot" question. I do a daily reboot because the potential downsides (for me) are very small, while the upside is large. My experience with Vera (starting with a VL on UI5 and now two VP on UI7-1.7.3232 and 1.7.3532) is that eventually the system became unstable for whatever reason. This instability might corrupt the system to the point that a reboot didn't help, and restoring from an older backup was the only way to regain a stable system. My daily reboot ensures that memory leaks or whatever else is happening don't get an opportunity to become a problem. So far, this has worked well for me. I'll be interested in seeing if kwieto's script to reboot on low memory works for him.
I'll post an update, but considering that something like that has happened only once (I've had this controller and setup for about a month), I can't predict if and when it will happen again.
With my previous controller, I didn't do any reboots for a couple of months, except those during updates or when moving the controller to another place.
Just for information: when I checked the Datamine records, the system was fully operational during the "offline" time. Sensors were reporting data, and power measurement was also reported correctly. That is a huge advantage of the Plus (or newer firmware) over my previous controller (Edge), where low memory caused a "can't write user data" error and it simply hung.
As usual, you can make various predictions and reality goes on its own: today the problem with increasing amount of cached memory repeated.
Scene worked perfectly, rebooting controller long before memory would be drained enough to make it unstable.
Controller rebooted and in two minutes it was operational.
Nevertheless, time to issue a ticket...
[quote="kwieto, post:15, topic:198507"]As usual, you can make various predictions and reality goes on its own: today the problem with increasing amount of cached memory repeated.
Scene worked perfectly, rebooting controller long before memory would be drained enough to make it unstable.
Controller rebooted and in two minutes it was operational.
Nevertheless, time to issue a ticket...[/quote]
I'll be interested in Vera's analysis, and congratulations that your scene accomplished its purpose!
If you want, here are the error messages from around the time the increase in cached memory started. I don't understand this well enough to say where the issue is (or whether there is one):
02 02/07/18 10:18:42.391 eZWJob_PollNode::ReceivedFrame HandlePollUpdate failed job job#1602 :pollnode #12 dev:197 (0x1b21938) N:12 P:100 S:5 Id: 1602 got after 1 seconds FUNC_ID_APPLICATION_COMMAND_HANDLER node info for 12 status 0 data 0xdd 0x14 0xc0 0xdb 0x89 0x4b 0x35 0x12 (e[34;1m#####K5#e[0m)e <0x76377520>
24 02/07/18 10:18:42.784 ZWaveSerial::Send m_iFrameID 42927 type 0x0 command 0x13 got failure 0x18 iNumFailedResponse 1 m_iSendsWithoutReceive 0 numretriesforack 3 <0x76377520>
01 02/07/18 10:18:43.006 eZWaveNode::DecryptMessage node 12 dev 197 failed and backup nonce is 0e <0x76377520>
02 02/07/18 10:18:43.034 eZWJob_PollNode::ReceivedFrame HandlePollUpdate failed job job#1602 :pollnode #12 dev:197 (0x1b21938) N:12 P:100 S:5 Id: 1602 got after 2 seconds FUNC_ID_APPLICATION_COMMAND_HANDLER node info for 12 status 0 data 0x91 0x2d 0x82 0x4c 0x1b 0x36 0x34 0x5 (e[34;1m#-#L#64#e[0m)e <0x76377520>
01 02/07/18 10:18:45.116 eZWaveSerial::Send m_iFrameID 42935 type 0x0 command 0x13 got repeat failure 24 iNumFailedResponse 1 time 39213116 start time 39213076 wait 2000 m_iSendsWithoutReceive 0e <0x75d77520>
01 02/07/18 10:18:49.188 eZWJob_PollNode::Run job job#1602 :pollnode #12 dev:197 (0x1b21938) N:12 P:100 S:1 Id: 1602 ZW_Send_Data to node 12 failed 5 req (nil)/-1 abort m_iFrameID 0e <0x75d77520>
02 02/07/18 10:18:49.189 eZWJob_PollNode::PollFailed job job#1602 :pollnode #12 dev:197 (0x1b21938) N:12 P:100 S:1 Id: 1602 node 12 battery 0 notlist:0e <0x75d77520>
06 02/07/18 10:18:49.193 Device_Variable::m_szValue_set device: 1 service: urn:micasaverde-com:serviceId:ZWaveNetwork1 variable: eLastErrore was: Poll failed now: Poll failed #hooks: 0 upnp: 0 skip: 0 v:0x1143618/NONE duplicate:1 <0x75d77520>
24 02/07/18 10:18:49.193 ZWJob_PollNode::m_eJobStatus job job#1602 :pollnode #12 dev:197 (0x1b21938) N:12 P:100 S:2 Id: 1602 <0x1b21938> m_eJobStatus Failed after 10.89295000 seconds <0x75d77520>
04 02/07/18 10:18:49.194 <Job ID="1602" Name="pollnode #12 7 cmds" Device="197" Created="2018-02-07 10:18:39" Started="2018-02-07 10:18:39" Completed="2018-02-07 10:18:49" Duration="10.89295000" Runtime="10.75287000" Status="Failed" LastNote="" Node="12" NodeType="ZWaveMultiEmbedded" NodeDescription="Światło|Wisła"/> <0x75d77520>
02 02/07/18 10:18:54.995 eZWJob_SendData::ReceivedFrame job job#1603 :Wakeup done 30 dev:48 (0x1bf3c40) N:30 P:102 S:5 Id: 1603 to node 30 command 132/8 failed m_cTxStatus 1 retries 0e <0x76377520>
01 02/07/18 10:18:54.995 eZWJob_SendData::ReceivedFrame job job#1603 :Wakeup done 30 dev:48 (0x1bf3c40) N:30 P:102 S:5 Id: 1603 to node 30 command 0x84/0x08 failed 0/0 or Quit 0e <0x76377520>
24 02/07/18 10:18:54.995 ZWaveNode::m_bLastContactFailed_set device 48 = 1, force 0, m_bNotListening 1 zw poll9 <0x76377520>
24 02/07/18 10:18:54.996 ZWaveNode::m_bLastContactFailed_set device 48 skipping <0x76377520>
10 02/07/18 10:18:54.996 Job::m_sNotes_set job#1603 :Wakeup done 30 dev:48 (0x1bf3c40) N:30 P:102 S:5 Id: 1603 dataversion 955947397 changing from Waiting for node to reply after 0 retries -to- Cannot contact device, error code: 1 <0x76377520>
01 02/07/18 10:18:54.996 eZWJob_SendData::JobFailed job#1603 :Wakeup done 30 dev:48 (0x1bf3c40) N:30 P:102 S:5 Id: 1603 Priority 102e <0x76377520>
04 02/07/18 10:18:54.998 <Job ID="1603" Name="Wakeup done 30" Device="48" Created="2018-02-07 10:18:40" Started="2018-02-07 10:18:49" Completed="2018-02-07 10:18:54" Duration="14.81387000" Runtime="5.801944000" Status="Aborted" LastNote="Cannot contact device, error code: 1" Node="30" NodeType="ZWaveBinarySensor" NodeDescription="Skrzynia|Pokrywa"/> <0x76377520>
There are a lot of other messages there, but some of the ones above are shown in red or orange (the rest are black), so I suppose they are the bigger issues (?)
[quote="kwieto, post:17, topic:198507"]If you want, here are the error messages from around the time the increase in cached memory started. I don't understand this well enough to say where the issue is (or whether there is one):
There are a lot of other messages there, but some of the ones above are shown in red or orange (the rest are black), so I suppose they are the bigger issues (?)[/quote]
Is this the actual log dump, or has it been filtered for errors?
Filtered.
As the controller is in a remote location and I can't access the log easily, I used AltUI and the os.command panel button "Errors Warnings" (os.command: cat /var/log/cmh/LuaUPnP.log | grep -i -E "warning|error|failed").
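The same filter can be run directly over SSH. Here is a small sketch that wraps it in a function; the name `filter_log` is made up for illustration, while the log path and pattern are the ones from the post. Plain single quotes are used around the pattern, since curly quotes pasted from a forum page will not work in a shell.

```shell
#!/bin/sh
# Sketch: print only log lines that mention warnings, errors, or failures.

# filter_log [LOG_FILE]   (hypothetical helper; defaults to the Vera log path)
filter_log() {
    # -i: case-insensitive, -E: extended regex so | means "or".
    grep -i -E 'warning|error|failed' "${1:-/var/log/cmh/LuaUPnP.log}"
}

# Example, run on the Vera itself:
# filter_log | tail -n 50
```

Passing the file name to grep directly avoids the extra `cat` process, which makes no practical difference here but keeps the command shorter.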
Thanks. I also use ALTUI for that purpose. It's quite useful. Now if only Vera could tell you what it all means...