Basic Event Logging Problems - TempLogFileSystemFailure

trouty00 · February 24, 2016, 8:00pm

[quote=“RichardTSchaefer, post:40, topic:182683”]This is Vera saving its state in case it restarts because of a failure …
In general Vera tries to save the current state before it restarts. Depending on the failure it might not be able to.

When vera reloads, it reverts to the last save point (within 6 min of the current time)[/quote]

OK, that makes sense I guess. Maybe I’m worrying about nothing in terms of resource use, then. Still, should we be worried that they feel the need to do this so regularly? Are Vera that worried about it falling over? I guess they can’t win, but it’s just my brain ticking…

mano · February 24, 2016, 8:10pm

[quote=“RichardTSchaefer, post:40, topic:182683”]This is Vera saving its state in case it restarts because of a failure …
In general Vera tries to save the current state before it restarts. Depending on the failure it might not be able to.

When vera reloads, it reverts to the last save point (within 6 min of the current time)[/quote]
Thanks for the insight…
hmm. so design intent !

bkurtz · February 25, 2016, 8:56pm

Mano,

Saving the state every 6 minutes is by design; getting a restart every 6 minutes is not.

I remember fondly when this was my problem. My VeraPlus doesn’t save the state for PLEG and then it crashes =).

x

mano · February 26, 2016, 10:21am

Hi xenith, Richard,
Bit more details…
I updated my V-Lite to the latest firmware with no config and it also restarts every 31 minutes.
Now the real issue is that the restarts eventually lead to Lua and scene errors, where some scenes do not run at all and become unreliable.

Following a luup restart using http://192.168.0.101:3480/data_request?id=reload, all the scenes can be triggered and used without any errors being reported by luup. Eventually the errors start creeping in. So my “suspicion” is that each of these restarts adds to the degradation as time goes on; maybe the save point from which it restarts is incomplete or corrupted…
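
The reload request above can also be sent from a script rather than a browser. This is only a minimal Python sketch, assuming the unit answers on port 3480 at the address used in this post. The function names `build_reload_url` and `request_reload` are made up for illustration, and nothing here addresses the underlying fault.

```python
from urllib.request import urlopen


def build_reload_url(host, port=3480):
    """Return the data_request URL that asks the Luup engine to reload."""
    return f"http://{host}:{port}/data_request?id=reload"


def request_reload(host, timeout=10.0):
    """Send the reload request and return the reply body (assumed to be text)."""
    with urlopen(build_reload_url(host), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Example call (needs a reachable unit; 192.168.0.101 is the address from this post):
# print(request_reload("192.168.0.101"))
```

Keeping the URL builder separate from the network call makes it easy to check the address before anything is sent to the unit.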

RichardTSchaefer · February 26, 2016, 10:11pm

There is definitely some type of system corruption going on. The fact that MCV has all of these diagnostics in the log file is an indication that they have a problem and do not have a good handle on the exact problem and as a result do not have a solution.

It might only affect a fraction of a percent … But that is a lot of people if you consider the number of units they sell.

Possible problems can be the RAM used as a disk drive randomly failing. That might be why they dump a listing of files in the etc directory.

It can also be a lack of program memory (This is different memory than the memory used for disk space, although the log files use this space if you do not have a thumb drive installed.)
A lack of program memory can cause havoc with all kinds of operations. And this is definitely the weak spot for a Vera Light.

mano · February 27, 2016, 9:03am

Richard,
Just to sync where I am,
VeraLite is on UI7, for testing around.
Vera3 is still on UI5 and has worked with all the scenes without any errors since it became available (now in slave mode and powered off).
VeraPlus is on the latest UI7 and running the house, then erroring at some point during the day.

I have raised a ticket with Support, no response. Scenes to turn lights on are failing randomly and the better half is giving me grief. I am getting to the point of dropping back to the Vera 3 on UI5 and powering off the VeraPlus.

That is, unless I issue a luup.reload to the VeraPlus every hour or so to keep it useful.

bkurtz · February 28, 2016, 12:57am

[quote=“RichardTSchaefer, post:45, topic:182683”]There is definitely some type of system corruption going on. The fact that MCV has all of these diagnostics in the log file is an indication that they have a problem and do not have a good handle on the exact problem and as a result do not have a solution.

It might only affect a fraction of a percent … But that is a lot of people if you consider the number of units they sell.

Possible problems can be the RAM used as a disk drive randomly failing. That might be why they dump a listing of files in the etc directory.

It can also be a lack of program memory (This is different memory than the memory used for disk space, although the log files use this space if you do not have a thumb drive installed.)
A lack of program memory can cause havoc with all kinds of operations. And this is definitely the weak spot for a Vera Light.[/quote]

I wouldn’t think the problem is hardware (at least centrally) because my problems got far worse moving from a VeraLite to a VeraPlus. It’s true that the system could have been corrupted when migrating to the VeraPlus, but it would not make sense for it to get worse after the switch.

The only good news is it seems we have at least a few people for whom things got worse with the VeraPlus, so maybe they’ll actually invest some time into this issue. It’s really strange this isn’t the number one issue for them given the complaints since UI7 rolled out.

X

pwlivewire · February 28, 2016, 1:41am

I’ve logged it and support have remote access. I’m chasing for an update.

mixedup · March 6, 2016, 7:45am

I see I’m getting this in the logs too. Should I await others’ feedback or log it with Vera as well?

Frequency

root@MiOS_35202858:/tmp/log/cmh# cat LuaUPnP.log | grep -i "TempLogFileSystemFailure start"
02 03/06/16 16:05:53.180 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:05:53.293 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:11:53.184 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:11:53.297 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:17:53.265 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:17:53.505 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:23:53.174 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:23:53.286 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:29:53.174 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:29:53.283 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:35:53.180 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:35:53.294 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:41:53.244 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:41:53.447 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:47:53.173 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:47:53.282 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:53:53.174 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:53:53.289 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:56:12.148 UserData::TempLogFileSystemFailure start 0 <0x2e057680>
02 03/06/16 16:56:12.257 UserData::TempLogFileSystemFailure start 0 <0x2e057680>
02 03/06/16 16:56:12.378 UserData::TempLogFileSystemFailure start 0 <0x2b0c3000>
02 03/06/16 16:56:12.486 UserData::TempLogFileSystemFailure start 0 <0x2b0c3000>
02 03/06/16 16:56:16.128 UserData::TempLogFileSystemFailure start 1 <0x2ba39000>
02 03/06/16 17:11:16.183 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:11:16.294 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:17:16.176 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:17:16.291 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:23:16.174 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:23:16.296 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:29:16.173 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:29:16.281 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:35:16.175 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:35:16.288 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:41:16.180 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:41:16.292 UserData::TempLogFileSystemFailure start 0 <0x2c38a680
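
To put a number on the frequency, the timestamps in that grep output can be turned into gaps between saves. The following is a rough Python sketch, assuming the log date format is month/day/year as in the entries above; the helper names `save_times` and `gaps_in_minutes` are invented for this example.

```python
from datetime import datetime
import re

# Captures the date and time (to the second) of entries such as:
# 02 03/06/16 16:05:53.180 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
ENTRY = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\.\d+\s+UserData::TempLogFileSystemFailure start"
)


def save_times(log_text):
    """Return one timestamp per save, collapsing entries logged in the same second."""
    times = []
    for stamp in ENTRY.findall(log_text):
        t = datetime.strptime(stamp, "%m/%d/%y %H:%M:%S")
        if not times or t != times[-1]:
            times.append(t)
    return times


def gaps_in_minutes(times):
    """Return the gap between consecutive saves, in minutes."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


sample = """
02 03/06/16 16:05:53.180 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:05:53.293 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:11:53.184 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:11:53.297 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
02 03/06/16 16:17:53.265 UserData::TempLogFileSystemFailure start 0 <0x2ba14680>
"""
print(gaps_in_minutes(save_times(sample)))  # [6.0, 6.0]
```

Run over the full output, the same calculation should show mostly 6-minute gaps, with a break around 16:56 where the value in angle brackets changes.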

Log example

[code]02 03/06/16 17:41:16.292 UserData::TempLogFileSystemFailure start 0 <0x2c38a680>
02 03/06/16 17:41:16.316 UserData::TempLogFileSystemFailure 4125 res:1
-rw-r--r-- 1 root root 33 Feb 7 11:47 /etc/cmh/HW_Key
-rw-r--r-- 1 root root 33 Jan 1 2000 /etc/cmh/HW_Key2
-rw-r--r-- 1 root root 9 Feb 7 11:47 /etc/cmh/PK_AccessPoint
-rw-r--r-- 1 root root 6 Feb 7 11:47 /etc/cmh/PK_Account
-rw-r--r-- 1 root root 4389 Mar 6 16:35 /etc/cmh/alerts.json
-rw-r--r-- 1 root root 358 Feb 7 11:47 /etc/cmh/cmh.conf
-rw-r--r-- 1 root root 0 Nov 17 17:37 /etc/cmh/devices
-rw-r--r-- 1 root root 16383 Feb 15 03:19 /etc/cmh/dongle.dump
-rw-r--r-- 1 root root 41 Mar 6 13:33 /etc/cmh/ergy_key
-rw-r--r-- 1 root root 0 Nov 17 17:37 /etc/cmh/first_boot
-rw-r--r-- 1 root root 0 Nov 17 17:37 /etc/cmh/fresh_install
-rw-r--r-- 1 root root 48 Nov 21 07:31 /etc/cmh/keys
-rw-r--r-- 1 root root 3 Jan 22 19:16 /etc/cmh/language
-rw-r--r-- 1 root root 2 Jan 22 19:16 /etc/cmh/language_id
-rw-r--r-- 1 root root 10 Mar 6 11:14 /etc/cmh/last_backup
-rw-r--r-- 1 root root 11 Mar 6 00:47 /etc/cmh/last_report
-rw-r--r-- 1 root root 28738 Nov 16 17:09 /etc/cmh/network_pnp.lua
-rw-r--r-- 1 root root 12 Jan 22 19:16 /etc/cmh/platform
-rwx------ 1 root root 458 Nov 16 17:09 /etc/cmh/ra_key
-rw-r--r-- 1 root root 1 Feb 13 11:09 /etc/cmh/ra_ports
-rw-r--r-- 1 root root 0 Jan 22 19:16 /etc/cmh/scenarios
-rw-r--r-- 1 root root 1976 Feb 7 11:49 /etc/cmh/servers.conf
-rw-r--r-- 1 root root 1464 Nov 17 17:37 /etc/cmh/servers.conf.default
-rw-r--r-- 1 root root 20 Mar 2 00:44 /etc/cmh/servers.conf.timestamp
-rw-r--r-- 1 root root 154 Nov 16 17:09 /etc/cmh/servers_whitelist
-rw-r--r-- 1 root root 1431 Mar 2 00:44 /etc/cmh/services.conf
-rw-r--r-- 1 root root 20 Nov 21 07:31 /etc/cmh/sync_kit
-rw-r--r-- 1 root root 20 Nov 21 07:31 /etc/cmh/sync_rediscover
-rw-r--r-- 1 root root 2 Jan 22 19:16 /etc/cmh/ui
-rw-r--r-- 1 root root 2 Jan 22 19:16 /etc/cmh/ui_man
-rw-r--r-- 1 root root 5 Jan 22 19:16 /etc/cmh/ui_skin
-rw-r--r-- 1 root root 504 Sep 28 16:06 /etc/cmh/user_data.json.luup.lzo
-rw-r--r-- 1 root root 23667 Mar 6 17:41 /etc/cmh/user_data.json.lzo
-rw-r--r-- 1 root root 23620 Mar 6 17:35 /etc/cmh/user_data.json.lzo.1
-rw-r--r-- 1 root root 23642 Mar 6 17:29 /etc/cmh/user_data.json.lzo.2
-rw-r--r-- 1 root root 23604 Mar 6 17:23 /etc/cmh/user_data.json.lzo.3
-rw-r--r-- 1 root root 23625 Mar 6 17:17 /etc/cmh/user_data.json.lzo.4
-rw-r--r-- 1 root root 23555 Mar 6 17:11 /etc/cmh/user_data.json.lzo.5
-rw-r--r-- 1 root root 21 Feb 7 11:48 /etc/cmh/users.conf
-rw-r--r-- 1 root root 20 Feb 7 11:49 /etc/cmh/users.conf.timestamp
-rw-r--r-- 1 root root 1 Nov 17 17:37 /etc/cmh/vera_model
-rw-r--r-- 1 root root 8 Jan 22 19:16 /etc/cmh/version
-rw-r--r-- 1 root root 8 Mar 5 19:57 /etc/cmh/version_latest
-rw-r--r-- 1 root root 8 Mar 6 16:56 /etc/cmh/zwave_house_id
-rw-r--r-- 1 root root 3 Jan 1 2000 /etc/cmh/zwave_locale
-rw-r--r-- 1 root root 48 Mar 6 16:56 /etc/cmh/zwave_version

/etc/cmh/orig:
-rw-r--r-- 1 root root 0 Jan 22 19:16 devices
-rw-r--r-- 1 root root 0 Jan 22 19:16 first_boot
-rw-r--r-- 1 root root 0 Jan 22 19:16 fresh_install
-rw-r--r-- 1 root root 0 Jan 22 19:16 scenarios
-rw-r--r-- 1 root root 519 Jan 22 19:16 user_data.json.lzo

/etc/cmh/persist:

/etc/cmh/wan_failover:
-rw-r--r-- 1 root root 44 Sep 28 16:06 check_internet.hosts
<0x2c38a680>[/code]
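
The interesting rows in that listing are the user_data.json.lzo file and its numbered copies. As a rough illustration only, this Python sketch pulls those rows out of a pasted listing; the helper name `backup_rows` is invented here, and the regular expression assumes the size/month/day/time/path layout seen in the log above.

```python
import re

# Matches the tail of a listing row: size, month, day, time, then the path
# of user_data.json.lzo or one of its numbered copies (.1, .2, ...).
ROW = re.compile(
    r"(\d+) (\w{3}) +(\d+) (\d{2}:\d{2}) (/etc/cmh/user_data\.json\.lzo(?:\.\d+)?)$"
)


def backup_rows(listing):
    """Return (path, size_in_bytes, 'Mon D HH:MM') for each matching row."""
    rows = []
    for line in listing.splitlines():
        match = ROW.search(line.strip())
        if match:
            size, month, day, clock, path = match.groups()
            rows.append((path, int(size), f"{month} {day} {clock}"))
    return rows


sample = """
root 504 Sep 28 16:06 /etc/cmh/user_data.json.luup.lzo
root 23667 Mar 6 17:41 /etc/cmh/user_data.json.lzo
root 23620 Mar 6 17:35 /etc/cmh/user_data.json.lzo.1
root 23642 Mar 6 17:29 /etc/cmh/user_data.json.lzo.2
"""
for path, size, when in backup_rows(sample):
    print(when, size, path)
```

In the listing above the numbered copies are stamped 6 minutes apart, which lines up with the timestamps of the log entries.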

mixedup · March 6, 2016, 7:47am

PS. This doesn’t impact any of our LUUP code running, does it? It doesn’t cause a LUUP engine restart or anything like that, I assume?

mano · March 6, 2016, 2:43pm

For me it’s causing errors and scenes failing with script errors. First-line support have now referred the issue to second-line support. I suggest raising a ticket with support, which can only help in locating the fault.

bkurtz · March 8, 2016, 4:22pm

When I see this, I can see certain things getting delayed, and occasionally an alarm in the log file. When Vera does crash, it often seems to be right after one of these switches.

X

Chrisfraser05 · March 8, 2016, 11:47pm

I’ve got this issue too, put in a ticket but no reply

jsingle · March 15, 2016, 5:20pm

I also have this issue on my new VeraPlus. I was wondering if it was specific to the VeraPlus, but it looks like it has happened historically. I really don’t want to reconfigure my whole house again after just finishing the upgrade to VeraPlus from Edge. Very disappointed in the amount of recent downtime MCV has and the amount of tinkering needed to keep these things operational when there is any change in the environment. Hoping things change quickly…

mano · March 26, 2016, 8:23am

Update:
I moved 90% of the scenes into PLEG and still continued to suffer random failed scenes. Because of this I send a luup reload command every hour to keep some sanity.
Looking further into LuaUPnP.log, I also noticed 4 scenes (NOT in PLEG) randomly returning a nil value.
e.g.
01 03/21/16 18:04:03.800 luainterface::callfunction_scene scene 36 failed attempt to call a nil value <…

I had one scene in particular to poll a sensor every 5 minutes, which would run for hours and then fail with the dreaded blue bar and its “error in scene script” message.
I removed these scenes from the VeraPlus and the blue error bar stopped appearing. It’s been good for a few days now.
I re-introduced the luup code in one of the scenes.

It used to be, in UI5:
luup.call_action("urn:micasaverde-com:serviceId:HaDevice1","Poll",{},91)
return true

I started to see failures, then modified the code.

Now, in UI7:
local resultCode, resultString, job, returnArguments = luup.call_action("urn:micasaverde-com:serviceId:HaDevice1","Poll",{},91)
return true

This scene has now been running for 24+ hours and has not caused the blue error bar.

some progress…

RichardTSchaefer · March 26, 2016, 12:35pm

What kind of device are you polling … If you are repeatedly force polling a device that has Z-Wave communications problems (i.e. it’s marginally connected) your Vera will become very unstable.

mano · March 27, 2016, 9:25am

Richard,
It’s an AEON 4-in-1 sensor, powered by a 5V mains PSU. I had some issues with the unit randomly going to sleep and not detecting movement. It’s 10 feet away.
Similarly, a nil value was also returned on the alarm panel, as below:
local VendorStatus = luup.variable_get("urn:micasaverde-com:serviceId:AlarmPartition2", "VendorStatus", 63)
local PanelMode = luup.variable_get("urn:micasaverde-com:serviceId:PowermaxAlarmPanel1", "PowerlinkMode", 62)

where the status and mode are then used to send a notification mail of panel activity.
M

erkme73 · June 12, 2016, 2:04am

There were a number of posters here who had started or escalated support tickets because of this error every six minutes. I’m now getting the same thing on my Edge. It’s precisely every 6 minutes. I noticed it only because Vera Alerts kept notifying me that Vera is restarting - and my PLEG actions were horribly delayed (e.g. opening a door takes 5-10 seconds to turn on the light - and sometimes it never comes on). Now, every time I hear VA say “Vera restarted” I want to puke. Likewise, when I enter a dark room and sit there counting, 1 1000, 2 1000, 3 1000… I want to rip the frigging thing out of the house.

So, did any of you that started/escalated tickets EVER get resolution on this? It seems to take DAYS before Vera support responds - and then it’s usually more questions than answers… followed by days more waiting. My patience is wearing out.

tinker3433 · June 12, 2016, 5:04am

[quote=“erkme73, post:58, topic:182683”]There were a number of posters here who had started or escalated support tickets because of this error every six minutes. I’m now getting the same thing on my Edge. It’s precisely every 6 minutes. I noticed it only because Vera Alerts kept notifying me that Vera is restarting - and my PLEG actions were horribly delayed (e.g. opening a door takes 5-10 seconds to turn on the light - and sometimes it never comes on). Now, every time I hear VA say “Vera restarted” I want to puke. Likewise, when I enter a dark room and sit there counting, 1 1000, 2 1000, 3 1000… I want to rip the frigging thing out of the house.

So, did any of you that started/escalated tickets EVER get resolution on this? It seems to take DAYS before Vera support responds - and then it’s usually more questions than answers… followed by days more waiting. My patience is wearing out.[/quote]

Here is what I was told:

Subject: Still stuck in loop

MAY 07, 2016 | 10:50AM PDT
Johnny replied:
Hello Jeff,
The TempLogFileSystemFailure is a function in the Vera engine used to log the WriteUserData function. This will check if the user_data configuration file has been saved correctly and save the output in a temporary file and if it was it will delete it. The functions will verify periodically if the configuration file has been saved, no matter if you’ve made these changes yourself or the unit updated variables based on the connected devices output.
As long as you’ll see the “WriteUserData saved” message you’ll know for sure the file has been saved correctly and the temporary file will be deleted. Each time the configuration file fails to save, it will log it in the alerts section of your Vera unit. Also keep in mind that this is one of the few safeties available for the user_data configuration file, which will backup the last 5 changes made to it so the Vera engine will be able to restore one of the earlier versions if necessary.
When the unit is losing connection with the Imperihome, have you tried to see if you can connect to the unit from our app or from our web portal? I’m trying to put all the pieces together so I can find the root of this issue.
Thank you.
Regards,

Johnny ▾ Customer Care Advocate
Vera Control, Ltd. ▾ Smarter Home Control
www.getvera.com ▾ support@getvera.com ▾ +1 (866) 966-2272


erkme73 · June 13, 2016, 3:24am

Does that make any sense? It seems like Johnny’s getting paid by the word, but ultimately says nothing definitive or helpful.

Today, I’ve noticed fewer Vera Alert push notifications stating “vera startup”, and most triggers/actions seem to be working at their regular speed. Still seeing the templogfilesystemfailure every 6 minutes, but now I’m wondering if that really isn’t the way it’s supposed to be. I hate being this helpless. This must be how most women feel when they take their cars to mechanics…