Okay, I have been using a VeraLite for 3 years and have almost always been happy with it. Now I am facing a restart (of Lua, I think) that recurs exactly every 6 minutes. I have searched the forum for hours and found lots of users facing the same issue, but no solution. The restart isn’t a very big problem, but I want to have it solved. What I have tried: removing Fibaro’s 3-in-1, removing all plugins, disabling all scenes, and removing all old plugin files, but nothing changes the strange behaviour. With everything removed, memory cannot be the problem (I also checked top, free, etc.). Disabling and re-enabling USB logging made no difference.
Something runs every 6 minutes (it’s like a Swiss clock, to the second) and causes the issue:
[quote="RichardTSchaefer, post:617, topic:172785"]There are two typical sources of restarts:
Memory limitations
Deadlocks (something running too often, or taking too long)
I find a better metric for determining if Vera is short of memory is to look at the MEM size % from the top command (for the LuaUPnP program).
This includes memory that it MIGHT use that it currently has not locked down into memory (i.e. has not been paged into the address space).
This includes the stacks for device that have not been active yet. But it also includes “CODE” segments that are paged out of the executable files.
So the number can legally be over 100% without exceeding memory limits. But I find that Vera gets very unstable as this approaches and exceeds 100%.
My Vera 3 with 150+ devices runs at 86% normally … and occasionally works up and restarts.
My Vera Lite with a handful of devices (20, more than half are plugins) running UI7 is running at 102% and is very unstable.
When I get it down to less than 100% it becomes much more stable …
But this is my test system and I use these other plugins to test my plugins with.
But this is NOT the correct thread for this … so further discussions should start a new THREAD.[/quote]
Maybe that is your problem… Indeed it is mine: I have 198% memory usage for LuaUPnP, and I get a lot of restarts with my VeraLite. I am planning to upgrade to a Vera Edge, but I am asking Tech Support to remove the force update flag so I can downgrade it, because I don’t like UI7. In the meantime, does someone know a way to get this number down?
@Vreo, I removed anything consuming memory and my top overview has never looked this clean, with LuaUPnP at around 80%. The strange thing is that it makes no difference to the restart behaviour whether all scenes and plugins are in or everything is removed. It looks like the reset happens every time cron runs this command: crond: USER root pid 3082 cmd /usr/bin/Rotate_Logs.sh. And this command seems to run every 6 minutes.
For the filesystems themselves, you can use “df -k” to tell you what’s what. Some will be read-only and/or 100% full (intentionally) and others might be 100% full (unintentionally)
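As a rough sketch of how that output can be read off the device, the Python below parses pasted “df -k” text and flags filesystems that are nearly full. The sample holds only the root filesystem line quoted elsewhere in this thread plus the usual df header; the parse_df_k name and the 90% threshold are assumptions for illustration, and some filesystems are intentionally 100% full, so a hit is a prompt to look closer rather than a diagnosis.

```python
def parse_df_k(text):
    """Parse `df -k` output into a list of dicts, one per filesystem."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header and anything that does not end in "<n>% <mount>".
        if len(fields) < 6 or not fields[-2].endswith("%"):
            continue
        rows.append({
            "filesystem": fields[0],
            "size_kb": int(fields[-5]),
            "used_kb": int(fields[-4]),
            "free_kb": int(fields[-3]),
            "use_pct": int(fields[-2].rstrip("%")),
            "mount": fields[-1],
        })
    return rows


# Header is the usual df header; the data line is the one quoted in this thread.
SAMPLE = """Filesystem 1K-blocks Used Available Use% Mounted on
overlayfs:/overlay 11264 1780 9484 16% /
"""

rows = parse_df_k(SAMPLE)
# Hypothetical threshold: treat 90% and above as "nearly full".
nearly_full = [r for r in rows if r["use_pct"] >= 90]
for r in rows:
    print(r["mount"], r["use_pct"], r["free_kb"])
```

Paste the complete output from your own box into SAMPLE to check every filesystem at once.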
Here’s mine, for reference, using USB Logging (with Log uploads to MiOS completely disabled):
Log rotation/compression puts a whole bunch of extra load on Vera, especially if you’re not running USB Logging, where you’ve already got a large log in memory (as a memory filesystem) and then you compress it into that same filesystem before eventually removing it.
If you have BAD Blocks, that can also be a serious issue, and it is best to get Support engaged. Think of them as landmines on the disk (Flash) surface, and when you step on one, by trying to write to it, bad/unpredictable things occur.
If you’re lucky, and already running USB Logging, and the bad blocks happen to be there, then you can swap it out for another.
Anyhow, there were indications in your original logs that something wasn’t able to write the user_data.xml files, or to compress them, hence the reason to ask how much free-space you had (in addition to BAD Blocks on the Flash)… either of which can readily trip up Vera.
Thanks Guessed. Filesystem seems to be ok. I use USB logging but I also switched it off to check if that made a difference, it didn’t. I will replace the device just to make sure.
Just an idea: you can change the crontab to rotate the logs less often, especially if you have USB logging enabled with sufficient storage space (512 MB). I am going to try that myself.
I do not think this has anything to do with the crontab …
You did however remove the Sync Energy script that runs at 12:06 every day … probably OK
However the log rotation script which runs EVERY minute is still enabled.
The 6 minute interval I believe is the interval VERA uses to save any changes to state variables in the persistent file user_data.json (in a compressed format).
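A quick way to check this against the data in this thread: the /etc/cmh listing posted further down shows the rotated user_data.json.lzo copies with modification times a few minutes apart. The sketch below, in Python and meant to be run off the device, computes the gaps between those times. The times are copied from that listing; the function name is just an illustration. It shows only that the saves are 6 minutes apart, not what triggers the restart.

```python
from datetime import datetime

# Modification times of user_data.json.lzo.5 ... .lzo.1, .lzo and .lzo.new,
# copied from the /etc/cmh listing in this thread (all on Feb 23).
SAVE_TIMES = ["18:27", "18:33", "18:39", "18:45", "18:51", "18:57", "19:03"]


def intervals_minutes(times):
    """Return the gap in minutes between each pair of consecutive HH:MM times."""
    parsed = [datetime.strptime(t, "%H:%M") for t in times]
    return [int((b - a).total_seconds() // 60) for a, b in zip(parsed, parsed[1:])]


gaps = intervals_minutes(SAVE_TIMES)
print(gaps)  # [6, 6, 6, 6, 6, 6]
```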
You appear to have enough room on the USB … so that looks good for logs.
Your root file system seems to have room … that’s where this file is written (overlayfs:/overlay 11264 1780 9484 16% /)
So a few more possibilities:
1. You do not have enough free memory to compress and write the file.
2. The device state is corrupted … so when it tries to write it out … the output file is way too large (i.e. it’s trying to write a lot of garbage to the disk).
3. The flash memory (that acts like a disk) has too many errors (bad sectors).
If it’s #1 you can fix this … What’s the output from the command:
free
or the top few lines from the command (control-c to quit):
top
Feb 22 19:37:35 MiOS_xxxxxx kern.warn kernel: ra_nand_block_checkbad: offs:760000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:764000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:768000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:76c000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:770000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:774000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:778000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:77c000 tag: BAD
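To count how many distinct flash offsets are being reported, the kernel lines can be pasted into a small Python script run off the device. This is only a sketch: the regular expression is tailored to the exact lines shown here, other firmware may word the message differently, and the function name is made up for illustration.

```python
import re

# Matches the kernel warnings shown in this thread; the offset is hexadecimal.
BAD_BLOCK = re.compile(r"ra_nand_block_checkbad: offs:([0-9a-fA-F]+) tag: BAD")


def bad_block_offsets(log_text):
    """Return the sorted, de-duplicated flash offsets reported as BAD."""
    return sorted({int(m.group(1), 16) for m in BAD_BLOCK.finditer(log_text)})


# The eight lines posted in this thread.
SAMPLE = """\
Feb 22 19:37:35 MiOS_xxxxxx kern.warn kernel: ra_nand_block_checkbad: offs:760000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:764000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:768000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:76c000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:770000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:774000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:778000 tag: BAD
Feb 22 19:37:35 MiOS_xxxxx kern.warn kernel: ra_nand_block_checkbad: offs:77c000 tag: BAD
"""

offsets = bad_block_offsets(SAMPLE)
steps = {b - a for a, b in zip(offsets, offsets[1:])}
print(len(offsets), [hex(o) for o in offsets], [hex(s) for s in steps])
```

In this sample the eight offsets are evenly spaced 0x4000 bytes apart, which suggests one contiguous region is being reported rather than damage scattered across the flash.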
are caused by the “Rotate_Logs.sh” script… It forcibly closes the log files so that it can rename/move/archive them… This is also normal behaviour, and happens regardless of where the logs are.
Your posted logs do not have these telltale signs of a LuaUPnP crash… just of the new normal operating procedures.
What kind of restart is it that I see in this code? As it seems to be normal for it to happen every 6 minutes, and it lasts for 11 seconds, I suppose it can disturb normal processes.
[code]
01 02/23/15 19:03:32.279 UserData::WriteUserData saved--before move File Size: 41733 save size 41733 LEAK this:253952 start:2052096 to 0x15bc000 <0x2b68a680>
02 02/23/15 19:03:32.279 UserData::TempLogFileSystemFailure start <0x2b68a680>
02 02/23/15 19:03:32.305 UserData::TempLogFileSystemFailure 4898
-rw-r--r-- 1 root root 33 Feb 6 18:21 /etc/cmh/HW_Key
-rw-r--r-- 1 root root 32 Feb 6 18:21 /etc/cmh/HW_Key2
-rw-r--r-- 1 root root 9 Feb 6 18:21 /etc/cmh/PK_AccessPoint
-rw-r--r-- 1 root root 7 Feb 22 18:24 /etc/cmh/PK_Account
-rw-r--r-- 1 root root 4707 Feb 23 18:56 /etc/cmh/alerts.json
-rw-r--r-- 1 root root 412 Feb 23 06:31 /etc/cmh/cmh.conf
-rw-r--r-- 1 root root 0 Jan 27 2012 /etc/cmh/devices
-rw-r--r-- 1 root root 16383 Feb 22 12:08 /etc/cmh/dongle.3.20.dump.0
-rw-r--r-- 1 root root 16383 Feb 22 11:17 /etc/cmh/dongle.3.20.dump.1
-rw-r--r-- 1 root root 16383 Feb 10 22:12 /etc/cmh/dongle.3.20.dump.2
-rw-r--r-- 1 root root 16383 Feb 8 16:44 /etc/cmh/dongle.3.20.dump.3
-rw-r--r-- 1 root root 16383 Feb 8 15:16 /etc/cmh/dongle.dump
-rw-r--r-- 1 root root 227 Feb 22 18:19 /etc/cmh/ergy.conf
-rw-r--r-- 1 root root 41 Mar 20 2012 /etc/cmh/ergy_key
-rw-r--r-- 1 root root 0 Jan 27 2012 /etc/cmh/first_boot
-rw-r--r-- 1 root root 0 Dec 17 2012 /etc/cmh/fresh_install
-rw-r--r-- 1 root root 48 Mar 20 2012 /etc/cmh/keys
-rw-r--r-- 1 root root 3 Feb 4 09:10 /etc/cmh/language
-rw-r--r-- 1 root root 2 Feb 4 09:10 /etc/cmh/language_id
-rw-r--r-- 1 root root 10 Feb 23 18:20 /etc/cmh/last_backup
-rw-r--r-- 1 root root 11 Feb 22 12:06 /etc/cmh/last_report
-rw-r--r-- 1 root root 14085 Jan 23 15:12 /etc/cmh/network_pnp.lua
-rw-r--r-- 1 root root 15971 Jan 23 15:12 /etc/cmh/network_pnp_sys.xml
-rw-r--r-- 1 root root 12 Feb 4 09:10 /etc/cmh/platform
-rw-r--r-- 1 root root 458 Jan 23 15:12 /etc/cmh/ra_key
-rw-r--r-- 1 root root 10 Jul 26 2014 /etc/cmh/reupgraded.firmware
-rw-r--r-- 1 root root 539 Feb 23 02:16 /etc/cmh/route.data
-rw-r--r-- 1 root root 0 Feb 4 09:10 /etc/cmh/scenarios
-rw-r--r-- 1 root root 1976 Feb 22 16:26 /etc/cmh/servers.conf
-rw-r--r-- 1 root root 1464 Feb 4 09:10 /etc/cmh/servers.conf.default
-rw-r--r-- 1 root root 20 Feb 22 18:20 /etc/cmh/servers.conf.timestamp
-rw-r--r-- 1 root root 1431 Feb 22 18:25 /etc/cmh/services.conf
-rw-r--r-- 1 root root 20 Feb 8 15:50 /etc/cmh/sync_kit
-rw-r--r-- 1 root root 20 Feb 8 15:50 /etc/cmh/sync_rediscover
-rw-r--r-- 1 root root 2 Feb 4 09:10 /etc/cmh/ui
-rw-r--r-- 1 root root 2 Feb 4 09:10 /etc/cmh/ui_man
-rw-r--r-- 1 root root 5 Feb 4 09:10 /etc/cmh/ui_skin
-rw-r--r-- 1 root root 504 Jul 14 2011 /etc/cmh/user_data.json.luup.lzo
-rw-r--r-- 1 root root 41798 Feb 23 18:57 /etc/cmh/user_data.json.lzo
-rw-r--r-- 1 root root 41685 Feb 23 18:51 /etc/cmh/user_data.json.lzo.1
-rw-r--r-- 1 root root 41779 Feb 23 18:45 /etc/cmh/user_data.json.lzo.2
-rw-r--r-- 1 root root 41667 Feb 23 18:39 /etc/cmh/user_data.json.lzo.3
-rw-r--r-- 1 root root 41712 Feb 23 18:33 /etc/cmh/user_data.json.lzo.4
-rw-r--r-- 1 root root 41649 Feb 23 18:27 /etc/cmh/user_data.json.lzo.5
-rw-r--r-- 1 root root 41733 Feb 23 19:03 /etc/cmh/user_data.json.lzo.new
-rw-r--r-- 1 root root 19 Feb 22 18:25 /etc/cmh/users.conf
-rw-r--r-- 1 root root 20 Feb 22 18:25 /etc/cmh/users.conf.timestamp
-rw-r--r-- 1 root root 1 Feb 4 09:10 /etc/cmh/vera_model
-rw-r--r-- 1 root root 8 Feb 4 09:10 /etc/cmh/version
-rw-r--r-- 1 root root 8 Feb 22 18:19 /etc/cmh/version_latest
-rw-r--r-- 1 root root 8 Feb 23 06:33 /etc/cmh/zwave_house_id
-rw-r--r-- 1 root root 27 Feb 22 10:11 /etc/cmh/zwave_house_id.history
-rw-r--r-- 1 root root 3 Feb 6 18:21 /etc/cmh/zwave_locale
-rw-r--r-- 1 root root 74874 Jan 23 15:12 /etc/cmh/zwave_products_sys.xml
-rw-r--r-- 1 root root 48 Feb 23 06:33 /etc/cmh/zwave_version
/etc/cmh/orig:
-rw-r--r-- 1 root root 0 Feb 4 09:10 devices
-rw-r--r-- 1 root root 0 Feb 4 09:10 first_boot
-rw-r--r-- 1 root root 0 Feb 4 09:10 fresh_install
-rw-r--r-- 1 root root 0 Feb 4 09:10 scenarios
-rw-r--r-- 1 root root 526 Feb 4 09:10 user_data.json.lzo
/etc/cmh/persist:
/etc/cmh/wan_failover:
-rw-r--r-- 1 root root 44 Jan 23 15:12 check_internet.hosts LEAK this:20480 start:2072576 to 0x15c1000 <0x2b68a680>
02 02/23/15 19:03:32.398 UserData::TempLogFileSystemFailure start <0x2b68a680>
02 02/23/15 19:03:32.425 UserData::TempLogFileSystemFailure 4809
-rw-r--r-- 1 root root 33 Feb 6 18:21 /etc/cmh/HW_Key
-rw-r--r-- 1 root root 32 Feb 6 18:21 /etc/cmh/HW_Key2
-rw-r--r-- 1 root root 9 Feb 6 18:21 /etc/cmh/PK_AccessPoint
-rw-r--r-- 1 root root 7 Feb 22 18:24 /etc/cmh/PK_Account
-rw-r--r-- 1 root root 4707 Feb 23 18:56 /etc/cmh/alerts.json
-rw-r--r-- 1 root root 412 Feb 23 06:31 /etc/cmh/cmh.conf
-rw-r--r-- 1 root root 0 Jan 27 2012 /etc/cmh/devices
-rw-r--r-- 1 root root 16383 Feb 22 12:08 /etc/cmh/dongle.3.20.dump.0
-rw-r--r-- 1 root root 16383 Feb 22 11:17 /etc/cmh/dongle.3.20.dump.1
-rw-r--r-- 1 root root 16383 Feb 10 22:12 /etc/cmh/dongle.3.20.dump.2
-rw-r--r-- 1 root root 16383 Feb 8 16:44 /etc/cmh/dongle.3.20.dump.3
-rw-r--r-- 1 root root 16383 Feb 8 15:16 /etc/cmh/dongle.dump
-rw-r--r-- 1 root root 227 Feb 22 18:19 /etc/cmh/ergy.conf
-rw-r--r-- 1 root root 41 Mar 20 2012 /etc/cmh/ergy_key
-rw-r--r-- 1 root root 0 Jan 27 2012 /etc/cmh/first_boot
-rw-r--r-- 1 root root 0 Dec 17 2012 /etc/cmh/fresh_install
-rw-r--r-- 1 root root 48 Mar 20 2012 /etc/cmh/keys
-rw-r--r-- 1 root root 3 Feb 4 09:10 /etc/cmh/language
-rw-r--r-- 1 root root 2 Feb 4 09:10 /etc/cmh/language_id
-rw-r--r-- 1 root root 10 Feb 23 18:20 /etc/cmh/last_backup
-rw-r--r-- 1 root root 11 Feb 22 12:06 /etc/cmh/last_report
-rw-r--r-- 1 root root 14085 Jan 23 15:12 /etc/cmh/network_pnp.lua
-rw-r--r-- 1 root root 15971 Jan 23 15:12 /etc/cmh/network_pnp_sys.xml
-rw-r--r-- 1 root root 12 Feb 4 09:10 /etc/cmh/platform
-rw-r--r-- 1 root root 458 Jan 23 15:12 /etc/cmh/ra_key
-rw-r--r-- 1 root root 10 Jul 26 2014 /etc/cmh/reupgraded.firmware
-rw-r--r-- 1 root root 539 Feb 23 02:16 /etc/cmh/route.data
-rw-r--r-- 1 root root 0 Feb 4 09:10 /etc/cmh/scenarios
-rw-r--r-- 1 root root 1976 Feb 22 16:26 /etc/cmh/servers.conf
-rw-r--r-- 1 root root 1464 Feb 4 09:10 /etc/cmh/servers.conf.default
-rw-r--r-- 1 root root 20 Feb 22 18:20 /etc/cmh/servers.conf.timestamp
-rw-r--r-- 1 root root 1431 Feb 22 18:25 /etc/cmh/services.conf
-rw-r--r-- 1 root root 20 Feb 8 15:50 /etc/cmh/sync_kit
-rw-r--r-- 1 root root 20 Feb 8 15:50 /etc/cmh/sync_rediscover
-rw-r--r-- 1 root root 2 Feb 4 09:10 /etc/cmh/ui
-rw-r--r-- 1 root root 2 Feb 4 09:10 /etc/cmh/ui_man
-rw-r--r-- 1 root root 5 Feb 4 09:10 /etc/cmh/ui_skin
-rw-r--r-- 1 root root 504 Jul 14 2011 /etc/cmh/user_data.json.luup.lzo
-rw-r--r-- 1 root root 41733 Feb 23 19:03 /etc/cmh/user_data.json.lzo
-rw-r--r-- 1 root root 41798 Feb 23 18:57 /etc/cmh/user_data.json.lzo.1
-rw-r--r-- 1 root root 41685 Feb 23 18:51 /etc/cmh/user_data.json.lzo.2
-rw-r--r-- 1 root root 41779 Feb 23 18:45 /etc/cmh/user_data.json.lzo.3
-rw-r--r-- 1 root root 41667 Feb 23 18:39 /etc/cmh/user_data.json.lzo.4
-rw-r--r-- 1 root root 41712 Feb 23 18:33 /etc/cmh/user_data.json.lzo.5
-rw-r--r-- 1 root root 19 Feb 22 18:25 /etc/cmh/users.conf
-rw-r--r-- 1 root root 20 Feb 22 18:25 /etc/cmh/users.conf.timestamp
-rw-r--r-- 1 root root 1 Feb 4 09:10 /etc/cmh/vera_model
-rw-r--r-- 1 root root 8 Feb 4 09:10 /etc/cmh/version
-rw-r--r-- 1 root root 8 Feb 22 18:19 /etc/cmh/version_latest
-rw-r--r-- 1 root root 8 Feb 23 06:33 /etc/cmh/zwave_house_id
-rw-r--r-- 1 root root 27 Feb 22 10:11 /etc/cmh/zwave_house_id.history
-rw-r--r-- 1 root root 3 Feb 6 18:21 /etc/cmh/zwave_locale
-rw-r--r-- 1 root root 74874 Jan 23 15:12 /etc/cmh/zwave_products_sys.xml
-rw-r--r-- 1 root root 48 Feb 23 06:33 /etc/cmh/zwave_version
/etc/cmh/orig:
-rw-r--r-- 1 root root 0 Feb 4 09:10 devices
-rw-r--r-- 1 root root 0 Feb 4 09:10 first_boot
-rw-r--r-- 1 root root 0 Feb 4 09:10 fresh_install
-rw-r--r-- 1 root root 0 Feb 4 09:10 scenarios
-rw-r--r-- 1 root root 526 Feb 4 09:10 user_data.json.lzo
[/code]
[quote="ranneman, post:18, topic:186081"]You can SSH to your Vera and run the command logread. For SSH, see this: [url=http://wiki.micasaverde.com/index.php/Logon_Vera_SSH]http://wiki.micasaverde.com/index.php/Logon_Vera_SSH[/url][/quote]Thanks! And is there a way to fix bad sectors? I have three…