How to troubleshoot LuaUPnP.log when everything looks great

I have been fighting deadlocks. I pull up the log every time, and am at the point I can’t find anything left to fix. The attached example from a few minutes ago - almost NOTHING happened between when the system went to turn the light on at 18:10:31, and then 60 seconds later when Vera declared a deadlock at 18:11:31. Do I just assume the Z-wave message never was successful so there was a restart? The light did not turn on, but the device has an almost perfect auto-route of “0-5,43-14,21-16,27-18”, so shouldn’t be a transmission issue?

These logs really need to give us a clear error message. I have so much time invested in trying to get Vera stable, and I get real frustrated when there seem to be no adequate tools to troubleshoot. I guess I have to turn back on verbose logging, but that hasn’t gotten me anywhere to date…

If you want to see what Vera is sending/receiving at the Z-Wave level, you can enable Level 41 & 42 in the LogLevels line of [tt]cmh.conf[/tt].

If anything Z-Wave is sent, or received, and not otherwise logged in another form, then they’ll be logged here. They’ll be in Hex, which isn’t useful for anything other than confirming that something was “sent out” or “received”, but at least it’s data.

That may help fill the missing time-gap you’re seeing.

Level 51 & 52 are the user-data (Serial) equivalents of those, in case you happen to have any form of IO device attached, and that “by accident” is causing the issue.

Verbose logging may also get it for you, since it’s clear Vera thinks something is going on in the gap, since it’s attributed the lock-acquisition to that timeframe.

Looks like you are turning on multiple lights near the same time:
#109
#18
#22

What initiated each of these ?

[quote=“RichardTSchaefer, post:3, topic:186077”]Looks like you are turning on multiple lights near the same time:
#109
#18
#22

What initiated each of these ?[/quote]

All the same PLEG (attached). #109 and #107 were being turned off based on untripping of EnOcean occupancy sensor. #18 was (supposed to have been) being turned on based on tripping of EnOcean occupancy sensor. #22 was turned off due to expiration of schedule, but that was after the restart…

[quote=“RichardTSchaefer, post:3, topic:186077”]Looks like you are turning on multiple lights near the same time:
#109
#18
#22

What initiated each of these ?[/quote]

Is it just that Vera can’t handle two separate PLEG actions going off at the same time? Cause I am there - I don’t think a Vera3 on UI5 with PLEG can handle multiple simultaneous conditions firing simultaneous actions. Crashed again tonight, turning off one light switch at the same time it needed to turn on another.

It should be able to handle it … but if you have a Z-Wave network that often has transmission problems … you are setting up a potential deadlock if a 3rd Z-Wave activity needs access … such as the background polling …

If you have 3 things backed up … I think it will trigger a deadlock, (caused by transmission problems on the 1st thing).

[quote=“RichardTSchaefer, post:6, topic:186077”]It should be able to handle it … but if you have a Z-Wave network that often has transmission problems … you are setting up a potential deadlock if a 3rd Z-Wave activity needs access … such as the background polling …

If you have 3 things backed up … I think it will trigger a deadlock, (caused by transmission problems on the 1st thing).[/quote]

Yes, I am pretty sure z-wave messages are what is hanging. Is there a z-wave ‘best-practices’ guide for Vera? I really think I have done those that make sense, and the heal reports look great. Only see occasional "x"s in auto-routes, which are resolved with the overnight heal.

[ul][li]Antenna mod with 3db L-Com antenna[/li]
[li]Eliminated battery powered z-wave devices - other than locks[/li]
[li]Have approximately 40 z-wave devices spread out all around the house[/li]
[li]Use alarm system devices instead of z-wave devices when at all possible[/li][/ul]

I guess the only thing I can criticize about my setup is that the z-wave devices are heterogeneous. While most of the 40 are GE/Jasco light switches, there are 4 Leviton fan controllers, 3 Honeywell thermostats, 3 Intermatic outlets, and one Aeon labs smartstrip.

Is your Z-Wave controller centrally located ?
I have heard a few good things about the Aeon Z-Wave repeater … but I have not needed it.
I have 5000 sq/ft over two floors … mostly cube like with Vera (and an Antennae) centrally located …

[quote=“RichardTSchaefer, post:8, topic:186077”]Is your Z-Wave controller centrally located ?
I have heard a few good things about the Aeon Z-Wave repeater … but I have not needed it.
I have 5000 sq/ft over two floors … mostly cube like with Vera (and an Antennae) centrally located …[/quote]

Good question - I have a three-story brownstone, so three cubes on top of each other totaling around 2,100 square feet. Vera right now is in the middle of the bottom floor. That was partially due to the fact that my network/AV closet is on the first floor, and also due to the fact this location made Vera physically close to the door locks. So I guess I can experiment with other locations to house Vera, although there really isn’t a great central location on the second floor.

Almost all of the devices auto-route to Vera direct, with really low 3-6 scores, so I really thought the location was good…

I fired up my unused VeraLite, and moved non-critical services and applications to it.

My Vera3 is now strictly HA. I even removed EventWatcher, which I really, really did not want to do. I still have 7 plug-ins, but I consider them all core services - DSC Alarm Panel, PLEG and PL Core, Day or Night, Vera Alerts, Garage Door, and EnOcean ESP3 Gateway. Hopefully the slightly reduced load will be enough to keep the Lighting, Climate, and Security PLEGs running smooth…

Deadlocks continue unabated. I finally have a verbose log captured during a crash (attached), complete with 41/42 z-wave messages - and I STILL can’t find anything that looks out of sorts. If anyone gets a chance to look to see if I missed something - I would greatly appreciate that.

Vera appears to start a z-wave send with “ZWJob_SendData UPDATE MANUAL ROUTE”, but unlike the other times, it is not followed by “ZW_Send_Data node XX USING ROUTE”…

As a test:
If you have any PLEG actions that use more than 2 Z-Wave commands try the following:

  1. Only send one Z-Wave command per action interval.
  2. For additional Z-Wave commands … add a 1 second delay from previous command … add Z-Wave command to that.
  3. Repeat for all Z-Wave commands.

See if that does not reduce the dead locks … Put closest most reliable nodes at the front.

If that works … I may have to trottle actions

See if that does not reduce the dead locks ...

I have had Vera crashes if I fired a lot of Z-Wave actions at one time. I now space them out as a matter of course.

By that I assume you mean a lot of ZWave commands in PLEG/PLTS actions ?

By that I assume you mean a lot of ZWave commands in PLEG/PLTS actions ?

I do indeed. I haven’t seen it with actions from scenes but I have very few of these.

Previously I would walk the list and just call each action.
I will now check if they were implemented as a Job (I believe most Z-Wave device actions are jobs). If they are a job, I will wait until the job is complete before sending the next action.

Can you define “a lot”? I’ve got scenes and PLEGs with up to 7 simultaneous Z-Wave commands, not many in my opinion, and I’ve experienced no issues.

I just want to be sure that @RichardTSchaefer doesn’t slow PLEG down for extreme or edge cases that are better managed manually

[quote=“Z-Waver, post:17, topic:186077”]Can you define “a lot”? I’ve got scenes and PLEGs with up to 7 simultaneous Z-Wave commands, not many in my opinion, and I’ve experienced no issues.

I just want to be sure that @RichardTSchaefer doesn’t slow PLEG down for extreme or edge cases that are better managed manually[/quote]

It was a while back but I think the most persistent offender was the Away/Home event with around five Z-Wave actions. One of the actions is a set-point change on a battery-operated thermostat which probably compounds the problem.

I agree it would be a shame to slow down all PLEG actions to cope with rare situations. Perhaps it could be optional?

Thanks RichardTSchaefer! I have a few PLEG actions that send two z-wave commands, and I will add a second of delay between the commands to see if that helps.

Most of the rest are a single z-wave, and nearly all of them are trigger driven. So can’t predict when two triggers might execute at the same time. Like when someone is starting up the stairs and needs lights at the same time someone else enters a dark bathroom on another floor… Although often it is the "Off’ command that deadlocks - in a perfect world there would be a way to de-prioritize "Off"s in favor of “On” commands.

There are regularly as many as three triggers executing within a few seconds of each other needing "On"s, and when you factor that the "Off"s will follow (generally two minutes after the trigger untrips), there are times the Z-wave network is pretty busy. Another thought is maybe to back off polling (Only poll a node if the Z-Wave network is idle at least XX seconds) or segment my home network to keep UPnP devices from talking to Vera.

I filed a TT with MiCasaVerde, and they haven’t found anything. The update from last night was “We’ve checked the logs and it seems that this issue is caused by the PLEG plugin that you have installed on your unit. Can you please re-check the scenes that implies this plugin ?”

This is already a Vera default located at Settings → Z-Wave Settings → Options. The default time is set at 10 seconds.