Securing and stabilizing the Vera by taking it off the grid

Mine has been running for a few weeks now. I just checked: I forgot the port number in the Vera URL for the dongle backup call.

It should be:

wget -O /dev/null "http://#veraip#:3480/data_request?id=action&DeviceNum=1&serviceId=urn:micasaverde-com:serviceId:ZWaveNetwork1&action=BackupDongle&Restore="

D’oh! I should have seen that…still jet lagged. :-[

Working now. Thanks!

The backup scripts are running for both VPs. I added a test in the cron script for successful completion of each backup. If there is a failure, the script emails me. An additional benefit of the local Vera backup is that the server has a RAID array, and also writes a nightly backup to an external USB 3.0 drive. Thanks for the tip!
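A completion test like the one described above could look roughly like this. It is only a sketch, not the actual cron script: the function name `backup_ok`, the 1 KB minimum size, and the path and mail address in the usage comment are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: backup_ok and its size threshold are hypothetical, not the
# poster's actual script.

# Succeeds (exit 0) if the backup file exists and is at least MIN_BYTES long.
# A missing or truncated file is treated as a failed backup.
backup_ok() {
    file=$1
    min_bytes=${2:-1024}
    [ -f "$file" ] || return 1
    size=$(( $(wc -c < "$file") ))
    [ "$size" -ge "$min_bytes" ]
}

# Example use in a cron script (path and address are placeholders):
#   backup_ok /srv/backups/vera/dongle.dump \
#       || echo "Vera backup failed" | mail -s "Vera backup FAILED" you@example.com
```

Checking the file size as well as the exit status catches the case where wget returns successfully but the Vera sent back an empty response.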

Glad you got it working. Backups and backups of backups are great indeed.

My system has been settling, and strangely I have not had any false positives on security sensors and have had fewer undetected sensor untrips in the past few days. I hope it will stay that way. The Vera still does random Luup reloads every 30-80 hours.

I turned on my old Vera Edge, upgraded it to the latest firmware, and SSHed into it, only to find that its storage has a number of bad sectors. I had retired it 2 years ago when it suddenly lost all of its Z-Wave devices out of the blue and kept wanting to downgrade to a certain firmware when I contacted support. (I spent hours and hours trying to figure it out at the time.) Given how UI7 works, with very frequent rewriting of the user_data.json file, the file can get very big on a very large system, even when compressed. Mine was a couple of hundred KB. Although it is SLC NAND flash with very good endurance, the Veras have very little storage and a portion of it gets rewritten every few seconds, so the NAND cells certainly wore out relatively quickly (~1 year). There are just not many NAND cells to do wear leveling across, assuming that the OS or the drive firmware does it at all. This was probably the reason for my impression that I had frequent data corruption and… gremlins in my setup.

Yup. Backups with backups are good.

I’ve wondered about wear-leveling, with the system checkpoint occurring every 6 minutes, plus who knows what else? My user_data.json.lzo file is around 182KB. I wonder if some of the odd crashes some folks have had are due to FLASH errors? The OpenWRT docs are vague on what wear-leveling mechanism is used. I assume the NAND flash is directly connected to the SOC, so management is coming from the OS or other software.
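To put rough numbers on the checkpoint writes, here is a back-of-the-envelope sketch. It assumes the whole compressed file is rewritten at each 6-minute checkpoint and ignores every other write (logs, plugin data, flash erase-block overhead), so it is a lower bound at best. The function name `daily_write_kb` is made up for the example.

```shell
#!/bin/sh
# Rough estimate of checkpoint write volume. Assumes the whole file is
# rewritten at every checkpoint and ignores all other writes.

# daily_write_kb INTERVAL_MIN FILE_KB -> KB written per day by checkpoints
daily_write_kb() {
    interval_min=$1
    file_kb=$2
    echo $(( (24 * 60 / interval_min) * file_kb ))
}

daily_write_kb 6 182   # 240 checkpoints/day * 182 KB = 43680 KB
```

That is a little over 40 MB per day from checkpoints alone. Whether that matters depends on how much flash the writes are spread over, which is exactly the wear-leveling question above.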

Both my systems are stable. The production system, running 1.7.3232, just works. I don’t pay much attention to LUUP reloads, but they don’t seem to occur very often. The test system, using the latest firmware, runs System Monitor—the last LUUP restart was a couple of weeks ago and may have been me playing with a camera.

Upgraded the production system to 1.7.3831 this morning, after a couple of weeks of testing on the secondary system. The production system had 66% used in the overlay partition before the upgrade. No issues during the upgrade, other than taking a couple of minutes longer than expected, no doubt due to upgrading the Z-Wave firmware on the co-processor. Everything is working, and there were some icon changes that had to be addressed by changing category/subcategory settings. The icons now do a better job of representing the actual device type, so no problem.
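For anyone who wants to check overlay usage before upgrading, a small sketch follows. It assumes the overlay is mounted at `/overlay` (the usual OpenWrt location) and that the `df` on the unit accepts `-P`. The helper name `overlay_pct` is hypothetical.

```shell
#!/bin/sh
# Sketch: extract the "Use%" column from POSIX-format df output.

# Reads `df -P` output on stdin and prints the usage percentage (no % sign)
# of the first listed filesystem.
overlay_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example on the Vera (assumes the overlay is mounted at /overlay):
#   df -P /overlay | overlay_pct
```

Reading from stdin keeps the parsing separate from the `df` call, so the same function works on output captured over SSH from another machine.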

The overlay partition is now at 81%. I can see why folks who started with an almost full overlay had upgrade issues. As has been mentioned before, this upgrade comes with the Z-Wave upgrade files for all regions. See the attached file. There’s ~1.8MB of unused Z-Wave 6.1 .hex files, and at least 720KB of unneeded Z-Wave 4.5 files. A forum member noticed this, deleted the unnecessary files, and replaced them with zero-length files so that Vera would not automatically restore them. It would obviously be much easier for the install script to delete the superfluous files, freeing up ~2.5MB of precious overlay space.
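The zero-length placeholder trick can be sketched as below. The thread does not list the file names, so the function takes whatever paths you pass it. `zero_out` is a made-up name, and deciding which .hex files are really unneeded for your region is up to you.

```shell
#!/bin/sh
# Sketch: replace files with zero-length placeholders of the same name.
# Which .hex files are safe to empty depends on your region and Z-Wave
# chip; this function does not decide that for you.
zero_out() {
    for f in "$@"; do
        # Only touch regular files that exist; ": >" truncates in place.
        [ -f "$f" ] && : > "$f"
    done
    return 0
}

# Example (placeholder paths):
#   zero_out /path/to/unneeded_region_a.hex /path/to/unneeded_region_b.hex
```

Truncating in place, instead of deleting and recreating, keeps each file's name and permissions, so the placeholder looks like the original to anything that only checks for its presence.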

Well, for me, blocking all communication between the Vera and the internet has prevented it from getting any of the files back, and I have indeed also deleted all of the unused files.
You are lucky to have such long intervals between Luup reloads… Mine still reloads every ~36-45h for no apparent reason.

I don’t monitor LUUP reloads on the production system. I had installed System Monitor last year, and it seems that the frequency of LUUP reloads increased, so I removed it. I’ll be keeping a close eye on it for the next few days. The upgraded system seems stable and everything is working.

I’ll be following your lead on taking Vera off-line, as soon as I’m home for a couple of weeks. I don’t want to make major changes and head out of town as the family doesn’t appreciate it when Vera goes off the rails. :frowning:

I treat these random Luup reloads as if they were system crashes, and they really are, since they can have dreadful consequences… As some other forum members have experienced, it can destroy your house if one is not careful, if the Vera is controlling irrigation, HVAC, or even security devices. I have experienced flooding of my yard because the Vera rebooted in the middle of a scene. Imagine your locks forgetting to lock, etc… These must be taken seriously. I have now written code in my startup Lua to prevent many of the catastrophes, but one can’t use the Vera beyond lighting if it is as unreliable as it is (the worst you would get is coming home to the house lit up like a Christmas tree, out of control). I have now removed all the files associated with the rogue Vera apps and the unit seems to still be fine.
I am now also suspecting that there is a correlation between one of my sensors having false trips and its battery status changing. Using NiMH rechargeable batteries (lower voltage), which make it report 1% even when full, seems to address the problem, since the reported battery level never changes.
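The startup Lua itself is not shown here. As a different, external safeguard, a separate always-on machine could periodically send "off" commands over the same HTTP interface as the wget call earlier in the thread. The sketch below assumes the devices are plain binary switches using the `SwitchPower1` service. The IP address and device numbers are placeholders, and `switch_off_url` and `force_off` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch of an external fail-safe, NOT the startup Lua mentioned above.
# Device numbers and the IP are placeholders for your own setup.

# Build the Luup HTTP request that turns a binary switch off.
switch_off_url() {
    vera_ip=$1
    device=$2
    printf 'http://%s:3480/data_request?id=action&DeviceNum=%s&serviceId=urn:upnp-org:serviceId:SwitchPower1&action=SetTarget&newTargetValue=0\n' \
        "$vera_ip" "$device"
}

# Send the "off" command to each listed device; returns non-zero if any
# request fails so the caller can raise an alert.
force_off() {
    vera_ip=$1; shift
    rc=0
    for device in "$@"; do
        wget -q -O /dev/null "$(switch_off_url "$vera_ip" "$device")" || rc=1
    done
    return $rc
}

# Example (hypothetical irrigation valve devices 12 and 13):
#   force_off 192.168.1.50 12 13
```

This only helps once the Vera is answering requests again after a reload, so it is a backstop for an interrupted scene and not a substitute for devices that fail safe on their own.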

All true.

I don’t use Vera for anything mission-critical–no HVAC, or locks of any kind. The HVAC system is pretty fancy and has a thermostat that’s really a terminal for the processor in the system. The only “smart” T-stat that will work with the system is the vendor’s own, and it’s ridiculously overpriced, is cloud-based, and gets some of the worst reviews imaginable. The stock non-automated T-stat works great, and there’s no reason to have it be part of Vera.

Wireless locks scare the heck out of me, and not just because Vera might hiccup and unlock all the doors. There are too many ways they can be compromised, and the latest security flaw in Z-Wave means all these locks could be compromised. I did automate the garage door when I first started with Vera four years ago. It seemed cool to be able to operate the door remotely, until the door opened all by itself for no apparent reason. I removed the control, but still have the door position sensor so I can check it remotely.

There’s an electric wall heater and a wall-mount AC unit in the shop outbuilding. These are both on Z-Wave outlets, but since they take care of themselves, there’s no danger if they get turned on unintentionally. The remote control is to either heat or cool the shop before use; convenient but not essential. And there is a scene that turns them both off at 11PM just in case we forget.

And as far as irrigation goes, I had a battery-powered irrigation hose timer fail in the “on” position when I was out of town. You can imagine the result, so no automation for irrigation, Vera or otherwise.

I use Vera for lighting, where failures are annoying but not disastrous. I also am careful not to write scenes with delays, for obvious reasons. I run with as few plugins as possible (Countdown Timer, DelayLight, Wunderground and a couple more that are solid) and don’t want to interface to external cloud systems.

I bought Vera as an experiment, to see if low-cost home automation was ready for the masses. HINT: It’s not, but I’m not convinced that any of the other solutions out there would be that much better. I don’t think Vera’s hardware is that bad—I’ve built much-higher performance real-time control systems with much less powerful hardware. Vera’s problems are on the software and corporate management side, and in fact Z-Wave itself in 2018 is a primitive, low-performance HW/SW solution. That said, my Vera has been very stable since summer of 2017 when I rebuilt it from scratch (I had carried over a config from a VL) and I’m pretty happy with it.

So I’ll follow your lead and take Vera off-line as time permits. If I decide to get a fancier system in the future, I’ll probably look at Homeseer as I have a 24/7 Win 10 box that should handle it with ease. Thanks again for sharing your experiences.

I use PLEG to start a 10 minute timer for the garage overhead door and a 1 minute timer for the front door. If the garage overhead door is open for 10 minutes, I get a reminder every 10 minutes that the door is still open. Similarly, the front door will send me a message every minute.

With PLEG I have not experienced as many failures during a LUUP reload, however, they tend to happen about once a week. This morning, at 5:30 am, I had a LUUP restart, after my bedroom fan was turned off and my wife’s nightstand light turned to 15% but before the night stand could ramp up to 100% and House Mode changed from Night to Home. So after my shower and getting dressed, I enter the living room and I am getting motion notifications on my phone. Sure enough, still on Night Mode.

Annoying but not catastrophic.

I think everyone who goes down the HA path needs to make their own risk assessment when automating a task. “Annoying but not catastrophic” is a pretty good benchmark.

I actually added a fail-safe to the garage door so it could be opened remotely in an emergency. The output from the Z-Wave pulse relay goes through a series of relays that are controlled by a LAN-based system that is not connected to Vera and has no automation capabilities. The four series relays must all be set to the correct state to pass the pulse relay signal on to the garage door opener. The relays can be programmed for an “ON” time, then they reset. You set the four relays to the correct state (kind of a hex lock code) and have 2 minutes to have Vera send a door command. I also set up alerts similar to yours when the door is open.

Probably overkill, but I had the relay system in the parts bin. ;D

[quote=“Don Phillips, post:31, topic:199140”]With PLEG I have not experienced as many failures during a LUUP reload, however, they tend to happen about once a week. This morning, at 5:30 am, I had a LUUP restart, after my bedroom fan was turned off and my wife’s nightstand light turned to 15% but before the night stand could ramp up to 100% and House Mode changed from Night to Home. So after my shower and getting dressed, I enter the living room and I am getting motion notifications on my phone. Sure enough, still on Night Mode.

Annoying but not catastrophic.[/quote]
Agree not catastrophic, lives and safety are not at risk, but it is an immense failure - that things can and do randomly fail leaving your home in an unpredictable, frustrating, and potentially unprotected state.

As someone who has had a setup that has been suffering an extreme number of such failures, I have accelerated the window of time my wife has taken to become fed up. Having things half happen, not happen, or even happen when they shouldn’t has driven her up the wall.

So while not catastrophic, over time it really does become unacceptable. Lights left on, windows or doors left open, things left in night mode, security potentially not armed, things staying dim when they need to be bright, etc won’t put lives at risk but they’ll drive your family insane.

Personally I’m at a point where I am having to be very careful and thorough with my next step, as I’ve pushed my wife so far my next step needs to either work or I simply give up and rip everything out. Given all my problems have simply been with Vera being insanely under resourced and bug ridden, rather than being a design flaw with Z-Wave itself (security concerns aside), I am confident I can nip this… but I am equally confident there is no way Vera is capable of achieving that.

One more step taken today: I have now removed my mobile apps, and I am considering removing my production Vera from my Vera account next, since they are no longer talking anyway. I am hesitating because some of the settings still require connecting to the server and logging in, which I find absurd but which is explained by the Vera’s lack of local security…

Good luck with this—I wonder if you’ll be able to remove your production Vera without it being online? Or, if you have to be online to remove it, will the Vera servers send something to the unit that will cause you problems?

[quote=“rafale77, post:29, topic:199140”]I treat these random Luup reloads as if they were system crashes, and they really are, since they can have dreadful consequences… As some other forum members have experienced, it can destroy your house if one is not careful, if the Vera is controlling irrigation, HVAC, or even security devices. I have experienced flooding of my yard because the Vera rebooted in the middle of a scene. Imagine your locks forgetting to lock, etc… These must be taken seriously. I have now written code in my startup Lua to prevent many of the catastrophes, but one can’t use the Vera beyond lighting if it is as unreliable as it is (the worst you would get is coming home to the house lit up like a Christmas tree, out of control). I have now removed all the files associated with the rogue Vera apps and the unit seems to still be fine.
I am now also suspecting that there is a correlation between one of my sensors having false trips and its battery status changing. Using NiMH rechargeable batteries (lower voltage), which make it report 1% even when full, seems to address the problem, since the reported battery level never changes.[/quote]
This is a question about the report generated by System Monitor:

Last CMH Reboot 08:47:46 Wed 14 Feb 2018
Last Vera Restart 19:31:15 Wed 16 May 2018
Last Luup Restart 01:00:55 Tue 05 Jun 2018

I know that Last Vera Restart indicates a soft reboot—you’ll see this entry after sending a reboot command from the UI. I’m not sure what Last CMH Reboot indicates—is this a power-cycle boot?

I’m now running SysMon on both the production and test system, and I see a Last Luup Restart every morning at 1:01 AM on both systems. This is when the Linux server finishes the Z-Wave/config backup script. I just noticed this today, and the logs have rotated out so I can’t see what happened. I can test this by forcing the backup script to run, but have some other tests running so I won’t get to it for a few days. I notice that a “Reload Engine” command is a complete forced shutdown and restart of Luup by looking at the logs, and is indicated in SysMon’s “Last Luup Restart” report. I’m not sure why Vera needs to do this after creating a backup.

Will try to answer your questions:

  1. I do not get a Luup reload after my backup script. Proof is I go much more than 24hr between reloads and I am backing up every 24 hrs. There must be something going on in your system causing this.
  2. My understanding of the 3 categories of reload:
    CMH Reload is when the reload is triggered through the mcv server.
    The Vera restart is a full restart of the OS
    Luup reload is a restart of the Luup engine only.

[quote]Good luck with this—I wonder if you’ll be able to remove your production Vera without it being online? Or, if you have to be online to remove it, will the Vera servers send something to the unit that will cause you problems?[/quote]

Yes, you can remove a Vera from your account without it being online. I have done it on my test unit. The worry, if it is online when you do it, is that it may trigger a factory reset on your Vera…

That explains why I may go months between CMH reloads even if I do a factory reset, firmware update, and restore from backup.

Played some more with the Vera being blocked from the internet, and here are some observations:

  1. With the Vera set up to use a local NTP server and no internet access, it actually reports the last Vera restart correctly, getting the time from the local NTP server. When it has internet access, it reports Jan 1 00:00 2000 UTC as the last reboot time during the OS boot process. This is quite interesting. It means that it is not actually using the NTP settings and must be updating the OS time from somewhere else.

  2. With the files for the Sercomm camera plugin and Alexa removed, the Vera seems to go nuts after a few minutes and will no longer run any Lua code. The Z-Wave driver and the UI still work, but test Lua, scenes, and startup Lua stop working after a few minutes. I could not reboot, reload Luup, or do anything from the advanced menus and had to unplug the Vera to get it to reboot. Not sure why MCV wants us to keep these 2 plugins so desperately. I will just let it be, but it remains a mystery. The Vera would sometimes go into an infinite Luup reload loop, or would load correctly and then lose it completely after some time and go into a full reboot. I don’t understand how the reboot behavior could be so random and inconsistent, with 3 different hard reboots leading to 3 different behaviors.

  3. The weather logo at the top right is actually obtained from yahoo.com. I noticed it in my DNS server logs.

Overall, for those who think the Vera can work without internet, it really cannot. The core engine is much too unstable, and the Luup reloads constantly lead to the Vera looking for something from their server or elsewhere on the net. It is taking me a lot of work to figure out what these items are, and I wish Vera was more transparent about them.
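For anyone wanting to do the same detective work from DNS logs, here is a minimal sketch. It assumes a dnsmasq-style query log (lines ending in `query[A] <name> from <client-ip>`), which may not match your DNS server. The function name `vera_dns_summary` and the IP and log path in the example are placeholders.

```shell
#!/bin/sh
# Sketch: summarize which host names one client asked a dnsmasq server for.
# Assumes dnsmasq query logging ("query[A] name from ip" at end of line).

# Reads log lines on stdin; prints "count name" pairs, most frequent first.
vera_dns_summary() {
    vera_ip=$1
    awk -v ip="$vera_ip" '
        NF >= 4 && $(NF-3) ~ /^query\[/ && $(NF-1) == "from" && $NF == ip {
            print $(NF-2)
        }
    ' | sort | uniq -c | sort -rn
}

# Example (placeholder IP and log path):
#   vera_dns_summary 192.168.1.50 < /var/log/dnsmasq.log
```

Counting by name makes the chatty destinations stand out, which is a reasonable starting list for deciding what to block or to answer locally.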