Constant Reboots/Restarts (Vera 3) About Every 45 Seconds

So my Vera3 has decided to restart itself about 20 times this morning, approximately every 45 seconds…and keeps going.

I get a VeraAlert each time the device is restarted…so my phone has been going pretty crazy.

Any idea on why this is happening? What should I do to enable some better troubleshooting? I am going to put a USB stick in it since it looks like that is the only way I can capture any logs.

Your best bet is to contact MCV. It honestly can be anything. I would look at what plug-ins you are using and see if they are the culprit.

  • Garrett

This could be memory exhaustion.
You should still be able to SSH (Putty) to your Vera and look around.

Hi EOppie,

Can you please enable tech support and submit a trouble ticket from the Vera unit interface, so we can look into this ?

Thank you.

I reached out to MCV and made sure to enable “Remote Support” on the Vera 3. I was able to submit a trouble ticket from the unit itself; it was assigned an ID of [MiOS #102167].

Here are the plug-ins (apps) I am currently using.

Deus Ex Machina - The Vacation Plugin	1.0	
Combination Switch	15.0	
VeraAlerts	3.9	
Program Logic Core	5.13	
Program Logic Event Generator	5.1	
Day or Night	1.5	
MiOS Update Utility	1.1	
Virtual ON/OFF Switches	1.34	
Garage Door	1.20	
Honeywell Ademco Vista Alarm Panels via AD2USB	3.0	
ERGY Plugin	1.43	
Google Calendar Switch	1.251

MCV Support has gone somewhat silent on me, so I am reaching back here for ideas.

From what I can tell, I am not maxing out my memory. I installed the system monitor app, and it always shows that I am well within the limits for the system.

Can someone give me some ideas on what to check? I have enabled USB logging; however, I am not skilled enough to access the logs…unless I am missing something? I did try to SSH into the system, but I couldn’t access the historical logs based on the instructions I followed.

I post here, and MCV support replies to my email…go figure :slight_smile:

According to support:

it appears that the Ergy plugin is causing the reload of the Luup Engine, not the entire Vera unit.

If it helps you, this is what the logs look like when the Luup engine restarts:
02 11/19/13 6:44:38.036 OL: (0x10866f4) (>9443) Lua LuaInterface.cpp l:1990 time: 6:44:30a (8 s) thread: 0x2e933680 Rel: N Got: Y <0x2bb33680>
01 11/19/13 6:44:38.037 Deadlock problem. going to reload and quit <0x2bb33680>
03 11/19/13 6:44:38.037 JobHandler_LuaUPnP::Reload: deadlock Critical 1 m_bCriticalOnly 0 dirty data 1 <0x2bb33680>

And if we search for the level-02 line (^02), we can see that the thread causing the issue is 0x2e933680:

00 11/19/13 6:44:35.466 luup_log:80: EchoEnergyManager::ERROR: getDeviceCategory: FAILED to find a matching category ID for category number 0 <0x2e933680>
00 11/19/13 6:44:35.467 luup_log:80: EchoEnergyManager::ERROR: getDeviceCategory: stack traceback:
[string "..."]:459: in function 'loopLog'
[string "..."]:556: in function 'writeToLog'
[string "..."]:1327: in function 'getDeviceCategory'
[string "..."]:2867: in function 'getReducedDeviceData'
[string "..."]:2931: in function 'getReducedDevicesData'
[string "..."]:2989: in function 'registerDevices'
[string "..."]:3902: in function <[string "..."]:3554> <0x2e933680>
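For anyone who wants to repeat what support did here without reading the log by eye, a rough Python sketch follows. It assumes the line shape seen in the excerpts in this thread (level, date, time, message, then a trailing `<0x...>` thread tag) and that the level-02 line just before the "Deadlock problem" line names the blocking thread in its `thread:` field. The function names are my own, not anything from Vera.

```python
import re

# Assumed line shape, based only on the excerpts in this thread:
#   "<level> <date> <time> <message> <0xTHREAD>"
OWNER_RE = re.compile(r"thread: (0x[0-9a-f]+)")
TAIL_RE = re.compile(r"<(0x[0-9a-f]+)>\s*$")


def find_lock_owner(lines):
    """Return the thread id from the last level-02 'thread:' line seen before a deadlock."""
    owner = None
    for line in lines:
        if line.startswith("02 "):
            m = OWNER_RE.search(line)
            if m:
                owner = m.group(1)
        elif "Deadlock problem" in line:
            return owner
    return None


def lines_from_thread(lines, thread_id):
    """Return the lines whose trailing <0x...> tag equals thread_id."""
    out = []
    for line in lines:
        m = TAIL_RE.search(line)
        if m and m.group(1) == thread_id:
            out.append(line)
    return out


# Lines copied from the support excerpt, in time order.
sample = [
    "00 11/19/13 6:44:35.466 luup_log:80: EchoEnergyManager::ERROR: getDeviceCategory: FAILED to find a matching category ID for category number 0 <0x2e933680>",
    "02 11/19/13 6:44:38.036 OL: (0x10866f4) (>9443) Lua LuaInterface.cpp l:1990 time: 6:44:30a (8 s) thread: 0x2e933680 Rel: N Got: Y <0x2bb33680>",
    "01 11/19/13 6:44:38.037 Deadlock problem. going to reload and quit <0x2bb33680>",
]

owner = find_lock_owner(sample)
for line in lines_from_thread(sample, owner):
    print(line)
```

On a real log you would read the file into a list of lines first; the continuation lines of a stack traceback carry no thread tag of their own, so they will not be picked up by this filter.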

Oh what a surprise… The ergy plugin :slight_smile:

Yeah, kinda frustrating since that is what Vera pushes for energy monitoring and management…and it is the culprit!

Since disabling it, the system seems to be pretty solid today.

I guess I will now need to look for another solution for long-term monitoring.

Still having issues with this. MCV support maintains they are “monitoring the issue”; however, I have yet to hear anything.

It looks like I am experiencing a Lua restart, which is still a pain in the butt, since it tends to happen while a scene is running. That obviously loses all timings associated with the scene, since it is usually in the middle of running something.

I took a look at the logs, and found this during one of the times that LUA rebooted:

02 12/18/13 8:49:58.009 ZW_Send_Data node 8 USING ROUTE 255.101.100.27 LEAK this:249856 start:3842048 to 0x169f000 <0x2b72b680>
04 12/18/13 8:49:58.138 <0x2b32b680>
02 12/18/13 8:50:03.105 LOG_CHECK_MEMORY_LEAK pMem start 0x1554000 now 0x169f000 last 0x1662000 leaked 1355776 <0x2b32b680>
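A small sanity check on the numbers in that LOG_CHECK_MEMORY_LEAK line, as a Python sketch. I am only assuming the field layout visible in the line itself (`start`, `now` and `last` as hex addresses, `leaked` in bytes); what the engine actually measures with them is a guess on my part.

```python
import re

LEAK_RE = re.compile(
    r"LOG_CHECK_MEMORY_LEAK pMem start (0x[0-9a-f]+) now (0x[0-9a-f]+) "
    r"last (0x[0-9a-f]+) leaked (\d+)"
)


def parse_leak(line):
    """Parse one LOG_CHECK_MEMORY_LEAK line into integers, or return None."""
    m = LEAK_RE.search(line)
    if m is None:
        return None
    start, now, last = (int(m.group(i), 16) for i in (1, 2, 3))
    return {
        "start": start,
        "now": now,
        "last": last,
        "leaked": int(m.group(4)),      # value printed in the log
        "since_start": now - start,     # growth since the first check
        "since_last": now - last,       # growth since the previous check
    }


line = ("02 12/18/13 8:50:03.105 LOG_CHECK_MEMORY_LEAK pMem start 0x1554000 "
        "now 0x169f000 last 0x1662000 leaked 1355776 <0x2b32b680>")
info = parse_leak(line)
print(info["since_start"], info["since_last"])
```

For this line, `now - start` comes out equal to the printed `leaked` value (1355776 bytes, about 1.3 MB), and `now - last` is 249856, the same figure as the `LEAK this:249856` note on the ZW_Send_Data line a few seconds earlier. Tracking `since_last` over a whole log would show which moments the growth happens in.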

I also found this line:

50 12/18/13 8:50:42.422 luup_log:30: PLEG:30:Initialize:Restart <0x2bd83680>

Any ideas guys? Let me know if there is something else I should be searching for in the logs.

Still curious about this line: 50 12/27/13 0:20:53.503 luup_log:76: PLEG:76:Initialize:Restart <0x2b5dd680>

PLEG: 76 is a PLEG Scene for “WorkWakeup” which is run by the gCal plugin.

Seems to have happened again around the time of a reboot.

PLEG:76:Initialize:Restart
Probably a poor name choice for an info message. It just means the PLEG has run previously (i.e. this is not a fresh install). You might want to upgrade to the latest PLEG/PLC. There was some work to make the startup more efficient.

It would be good to see the entire log prior to the restart. It will provide more context of what was happening before the restart.

Do you have a USB stick for Logs ?
It might help if you are running out of memory.

Thanks for the response Richard! Sorry for my delay in getting back to you.

I do have a USB stick installed, and I have verbose logging on.

I will try to grab a shot of the logs the next time Vera decides to act up again.

I guess I failed at posting a copy of the logs, however I do have another update from MCV support:

Hi Eric,

I’ve checked the logs and as far as I could see the Vera engine reload wasn’t caused by that scene, and it seems to have been caused by a memory leak.

As far as I could see from the logs it seems to have been caused by the Alarm panel plugin. Memory leaks are usually caused by memory intensive plugins.

I would suggest to remove the plugins that are not necessary for your automation.

Let me know how it goes once you do that.

Thanks.

Regards,

George Diaconescu
Vera Smarter Home Control
Technical Support Team

Whenever I check the memory usage of my Vera, it never seems to be peaking…however I guess it is possible that something is consuming all of it and then causing the reboot.

Disabling the Alarm Panel Plugin is really not my idea of a “solution”. It seems that the solution from MCV support is to just disable everything that helps…well…automate.

Anyone else have an idea?

If it really is the Alarm Panel plugin, then there is nothing you can do, beyond trying to motivate the alarm plugin developer to fix their issue.

You can only know whether or not it is that plugin by removing it and seeing if its absence resolves your issue.

Take a backup first, so that you can restore if you choose to deal with the issue or determine that the issue is not being caused by that plugin.

You’ll need to post [at least] the last 2 minutes of the Verbose log file to get more hints from folks here. Just remember to scrub it of any sensitive data before you do :wink:

It’s typically those last dying moments, in the last 2 minutes, that give the biggest clues. Feel free to give more, it’ll just take longer to scrub it, because sometimes there are multiple issues going on (like the GCal stuff from another user recently)

BTW: Is it a reboot (OS-level), or a restart (LuaUPnP level)? Support says restart, but you say reboot, and there are very different causes of each.
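If cutting out "the last 2 minutes" by hand is a chore, something like this Python sketch could do it. Assumptions on my part: lines start with a two-digit level and an `MM/DD/YY H:MM:SS.mmm` stamp as in the excerpts posted in this thread, and the restart can be located by a marker string such as "Deadlock problem" (other restart causes will log something different, so pass whatever marker fits). The scrub step only masks hosts in URLs; it is not a complete PII scrub. The sample lines marked "placeholder" are made up for illustration.

```python
import re
from datetime import datetime, timedelta

STAMP_RE = re.compile(r"^\d\d\s+(\d\d/\d\d/\d\d \d{1,2}:\d\d:\d\d\.\d{3})")
URL_HOST_RE = re.compile(r"(https?://)[^/\s]+")


def stamp(line):
    """Return the line's timestamp, or None if it does not use the assumed format."""
    m = STAMP_RE.match(line)
    if not m:
        return None
    return datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S.%f")


def window_before(lines, marker, minutes=2):
    """Return lines stamped within `minutes` before the first line containing marker."""
    restart_at = None
    for line in lines:
        if marker in line:
            restart_at = stamp(line)
            break
    if restart_at is None:
        return []
    cutoff = restart_at - timedelta(minutes=minutes)
    return [
        line for line in lines
        if (t := stamp(line)) is not None and cutoff <= t <= restart_at
    ]


def scrub(line):
    """Mask the host part of any URL; everything else is left untouched."""
    return URL_HOST_RE.sub(r"\1<host>", line)


# Made-up sample, except for the real "Deadlock problem" line from this thread.
sample = [
    "06 11/19/13 6:41:00.000 placeholder: too early <0x1>",
    "06 11/19/13 6:43:10.000 placeholder: fetch http://192.168.1.166/page <0x1>",
    "01 11/19/13 6:44:38.037 Deadlock problem. going to reload and quit <0x2bb33680>",
    "06 11/19/13 6:44:40.000 placeholder: after the restart <0x1>",
]

for line in window_before(sample, "Deadlock problem"):
    print(scrub(line))
```

Lines that do not carry the assumed timestamp (for example the "caught signal" style lines with a different date format) are skipped by this filter, so it is worth eyeballing the raw log around the same time as well.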

Sorry, I believe it is simply a restart. We experienced a restart tonight and I was home to be able to catch it in the logs. This was during our “Garage NFC Entry” scene.

Here is a dump of all the 01 (critical) errors around the time of the restart. I attached a more detailed dump of the logs from the same time period. MCV Support seems to be a bit baffled, so I would appreciate any help folks here can provide.

01 01/28/14 21:15:08.283 ZWaveNode::CurrentRouteFailed node 16 device 71 position 18 route 19.7.0.0 hop 2 (from 7) <0x2bb19680>
01 01/28/14 21:16:01.781 got CAN <0x2c319680>
01 01/28/14 21:16:02.851 got CAN <0x2c319680>
01 01/28/14 21:16:03.921 got CAN <0x2c319680>
01 01/28/14 21:16:16.181 ZWJob_PollNode::ReceivedFrame job job#2388 :pollnode_callb #10 dev:61 (0x1210ca0) N:10 P:20 S:5 got FUNC_ID_APPLICATION_COMMAND_HANDLER node info expected 10 got 9 <0x2bb19680>
01 2014-1-28 21:16:19 caught signal 11 <0x2bf19680>
01 01/28/14 21:16:22.402 UPnPAction_Send::ParseState can’t find name <0x2ad0e000>
01 01/28/14 21:16:23.012 got CAN <0x2bc41680>
01 01/28/14 21:16:24.041 got CAN <0x2bc41680>
01 01/28/14 21:16:25.111 got CAN <0x2bc41680>
01 01/28/14 21:16:31.302 got CAN <0x2bc41680>
01 01/28/14 21:16:32.371 got CAN LEAK this:36864 start:36864 to 0xfcd000 <0x2bc41680>
01 01/28/14 21:16:33.042 JobHandler_LuaUPnP::DownloadPlugin 5 failed to convert error -4: ERROR: no version relased for this plugin <0x2b641680>
01 01/28/14 21:16:33.043 JobHandler_LuaUPnP::DownloadFiles new plugin 5 returned 0 <0x2b641680>
01 01/28/14 21:16:33.442 got CAN <0x2bc41680>
01 01/28/14 21:16:39.838 luup_require can’t find veraUserTemplateDefinitions LEAK this:405504 start:2408448 to 0x1210000 <0x2b641680>
01 01/28/14 21:17:40.312 got CAN <0x2bc41680>
01 01/28/14 21:18:11.692 got CAN <0x2bc41680>
01 01/28/14 21:19:01.991 got CAN <0x2bc41680>
01 01/28/14 21:19:03.061 got CAN <0x2bc41680>
01 01/28/14 21:19:05.271 AlarmManager::Run callback for alarm 0xa0b368 entry 0xd40b20 type 52 id 188 param=0xd40ab8 entry->when: 1390961941 time: 1390961945 tnum: 1 slow 0 tardy 4 <0x2b441680>
01 01/28/14 21:20:25.118 FileUtils::ReadURL 28/resp:0 size 0 http://192.168.1.166/CgiTagMenu?page=Top&Language=0 <0x2b641680>
01 01/28/14 21:20:40.135 FileUtils::ReadURL 28/resp:0 size 0 http://192.168.1.166/CgiTagMenu?page=Top&Language=0 LEAK this:8192 start:2654208 to 0x124c000 <0x2b641680>
01 01/28/14 21:20:55.151 FileUtils::ReadURL 28/resp:0 size 0 http://192.168.1.166/CgiTagMenu?page=Top&Language=0 LEAK this:4096 start:2662400 to 0x124e000 <0x2b641680>
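To get a quick feel for which kinds of level-01 lines dominate a dump like the one above, a tally along these lines might help. This is only a sketch: the grouping key (the first word of the message, with "got CAN" kept together) is an arbitrary choice of mine, and the regex assumes every line ends with a `<0x...>` thread tag as in the excerpt.

```python
import re
from collections import Counter

# level 01, two timestamp fields, the message, then the trailing thread tag
LINE_RE = re.compile(r"^01\s+\S+\s+\S+\s+(.*?)\s*<0x[0-9a-f]+>\s*$")


def summarize_critical(lines):
    """Count level-01 lines, grouped by a rough message key."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        message = m.group(1)
        if message.startswith("got CAN"):
            key = "got CAN"
        else:
            words = message.split()
            key = words[0] if words else "(empty)"
        counts[key] += 1
    return counts


# A few lines copied from the dump above.
sample = [
    "01 01/28/14 21:15:08.283 ZWaveNode::CurrentRouteFailed node 16 device 71 position 18 route 19.7.0.0 hop 2 (from 7) <0x2bb19680>",
    "01 01/28/14 21:16:01.781 got CAN <0x2c319680>",
    "01 01/28/14 21:16:02.851 got CAN <0x2c319680>",
    "01 01/28/14 21:16:03.921 got CAN <0x2c319680>",
    "01 2014-1-28 21:16:19 caught signal 11 <0x2bf19680>",
    "01 01/28/14 21:16:32.371 got CAN LEAK this:36864 start:36864 to 0xfcd000 <0x2bc41680>",
]

for key, count in summarize_critical(sample).most_common():
    print(count, key)
```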

Don’t think this will solve your problem but it’s a starting point (two years later and still not fixed!):

http://bugs.micasaverde.com/view.php?id=2749
http://forum.micasaverde.com/index.php/topic,8885.msg58101.html#msg58101

as per this:

DownloadPlugin 5 failed to convert error -4

The VistaAlarmPanel looks as though it has a debug mode (I don’t have one) - might help to turn it on and recheck the log?

VistaAlarmPanel::getDebugMode

A few comments:

[ul][li]The log file doesn’t contain the “2 minutes before restart”, so my comments are more general in nature, rather than specific to any restarts you’re seeing.[/li]
[li]All of the Z-Wave CAN’s you’re seeing are not normal.
It looks like your Z-Wave controller module on Vera is having problems, either directly or in response to certain commands being sent out (that the devices don’t appear to understand).
There are also times when some of the polling takes a long time (4-25s), which might be related to this; either way, that doesn’t look healthy.
You can see these by running something like the following over the logs:
[tt] grep 'Duration="[1-9]' Vera\ Log\ 12-28-2014.txt[/tt]
[/li]
[li]Turn off the Network scanning in Vera (Setup/Net & Wifi/[ ] Auto detect devices).
This will eliminate the unnecessary HTTP calls that Vera is doing to probe various IP-based devices on your Network. Basically, all the lines from your log that are “[tt]FileUtils:[/tt]” (it’s used for other stuff also, but in your log they’re all the auto-configure stuff)[/li]
[li]There are times when Z-Wave calls are having to be sent repeatedly to Z-Wave nodes 10, 18 and 21.
These are the Devices “Garage Entry Door”, “Front Door Entry” and “_Home Energy Monitor” resp. See the [tt]ZW_Send_Data[/tt] calls for details, but this typically wouldn’t happen if Z-Wave is healthy. I recently “power-cycled” (at the breaker) one of my Routing nodes because it was causing havoc on my Network… might be worth a try for Z-Wave node 8 (“Garage Door Switch”, the routing node for 18, 21)[/li][/ul]
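If grep isn't handy, the two checks from the list above (slow polls and repeated sends) could be approximated in Python roughly like this. It is a sketch under assumptions: the slow-poll filter just reuses the grep pattern, so it relies on the log writing durations as `Duration="..."`, and the send counter relies on lines of the form `ZW_Send_Data node N` as seen earlier in this thread. The sample lines are made up for illustration and are not from the real log.

```python
import re
from collections import Counter

SLOW_RE = re.compile(r'Duration="[1-9]')        # same pattern as the grep
SEND_RE = re.compile(r"ZW_Send_Data node (\d+)")


def slow_lines(lines):
    """Return lines whose Duration value starts with a non-zero digit."""
    return [line for line in lines if SLOW_RE.search(line)]


def sends_per_node(lines):
    """Count ZW_Send_Data lines per Z-Wave node number."""
    counts = Counter()
    for line in lines:
        m = SEND_RE.search(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    'placeholder poll line Duration="0.31"',
    'placeholder poll line Duration="12.40"',
    "02 placeholder ZW_Send_Data node 18 USING ROUTE 8.0.0.0 <0x1>",
    "02 placeholder ZW_Send_Data node 18 USING ROUTE 8.0.0.0 <0x1>",
    "02 placeholder ZW_Send_Data node 10 USING ROUTE 8.0.0.0 <0x1>",
]

print(len(slow_lines(sample)), "slow line(s)")
for node, count in sends_per_node(sample).most_common():
    print("node", node, "->", count, "send(s)")
```

A node that shows up far more often than its neighbours is a candidate for the repeated-send problem described above, though a raw count cannot tell retries from ordinary traffic, so the timestamps still need a look.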

PS: I edited, and re-up’d your log file, since it contained a number of PII things. You may want to check it for others, since I only did the more common/typical changes.

Thank you for looking into this!

I thought that I dumped more than 2 minutes prior to the restart in the full log file attachment. Next time I get a restart, I will make sure to attach a fuller file.

I will try disabling network scanning, as well as power cycle a few of the sensors. I assume doing a network heal won’t hurt either. Is there anything else to try and determine if the Z-Wave controller is having issues with hardware, or if there is a sensor issue going on?

My home is actually quite small in comparison to others, so I would hope distance wouldn’t be a huge issue. It is 1,774 sqft, all on a single level. I have metal stud construction and a concrete block exterior…however, I do not have any sensors with a concrete wall between Vera and the sensor. The Garage Entry Door is a Schlage Lock, same with the Front Door Entry. The Garage Entry Door is literally less than 3 feet from where the Vera unit sits.

The Garage Door Switch is what triggers the garage door, it is an Evolve LFM-20.

Thank you so much for spending the time with this! I have gotten further here in a few days than months of support requests with MCV.