I created a new implementation file for a Custom Dimmable Light device (controlled using web service requests).
I refresh the status of this device every 5 seconds by making http.request calls and getting its status (using luup.calltimer) and I can set its status using another web server request in a SetLoadLevelTarget Job (as per the pseudo code below).
Everything works great until a Job is executed at the same time when the refresh function is called.
Both http calls fail crashing luup at times with the below errors.
[tt]
01 10/14/11 19:17:01.225 LuaInterface::CallFunction-5 function refreshStatus failed /usr/lib/lua/socket/http.lua:145: attempt to call field ‘try’ (a string value) <0x402>
.
.
.
01 10/14/11 19:17:02.141 LuaInterface::CallFunction-1 device 60 function SDimming1_Dimming1_SetLoadLevelTarget_job failed /usr/lib/lua/socket/http.lua:150: attempt to call field ‘try’ (a string value) <0x400>
01 10/14/11 19:17:02.141 Lua_Job::Run job#27 : (0x9f28a0) P:50 S:0 failed <0x400>
[/tt]
It seems that LuaSocket does not implement any multithreading and a function called using luup.calltimer is not serialized or put in a queue.
Is there any way to create a timed function that runs within a user created job so both http calls are not executed at the exact same time ?
pseudo-code: (I_CustomDimmableLight1.xml)
[tt]
local http=require('socket.http')
function refreshStatus()
-- Calling the function in 5 seconds
luup.call_timer("refreshStatus", 1, "5", "", "")
-- Initiate a web request to refresh the device status
socket.http.request{...}
end
…
</functions>
<startup>Startup</startup>
<actionList>
<action>
<serviceId>urn:upnp-org:serviceId:Dimming1</serviceId>
<name>SetLoadLevelTarget</name>
<job>
-- Initiate a web request to set the device status
socket.http.request{...}
</job>
</action>
</actionList>
I didn’t notice this parameter on call_delay before.
I replaced call_timer with [tt]luup.call_delay(“refreshStatus”, 5, “”, 1)[/tt] but unfortunately I’m still getting the same errors
Can I have Vera poll the device status every 5 seconds (I implemented the urn:micasaverde-com:serviceId:HaDevice1 Poll action and having it call the web service) without using timers ?
<functions>
local http=require('socket.http')
lock = 0
function refreshStatus()
-- Wait until the other job releases the lock.
while (lock == 1) do end
lock = 1 -- Acquire the lock
socket.http.request{...}
lock = 0 -- Release the lock
-- Calling the function in 5 seconds
luup.call_delay ("refreshStatus", 5, "", 1)
end
...............
</functions>
<startup>Startup</startup>
<actionList>
<action>
<serviceId>urn:upnp-org:serviceId:Dimming1</serviceId>
<name>SetLoadLevelTarget</name>
<job>
-- Wait until the other job releases the lock.
while (lock == 1) do end
lock = 1 -- Acquire the lock.
socket.http.request{...}
lock = 0 -- Release the lock.
</job>
</action>
</actionList>
That wont work either. Lua is single threaded, but MiOS is making concurrent calls to it across threads. The more frequent these calls, the.more likely a timer & IO collision will occur.
@MCV was working on some changes in this space to avoid the issue, since it cannot be avoided from within lua code itself.
@guessed is right. I had tried this without any luck.
I also tried to implement the below logic (Calling the Poll action from the refresh function); but this time, luup crashed with each call_action call to Poll ???
luup.call_action returned weird values (null for lul_resultcode, table for lul_job)!!
<functions>
local http=require('socket.http')
function refreshStatus()
-- Calling the function in 5 seconds
luup.call_timer("refreshStatus", 1, "5", "", "")
local lul_arguments = {}
local lul_resultcode, lul_resultstring, lul_job, lul_returnarguments
lul_resultcode, lul_resultstring, lul_job, lul_returnarguments = luup.call_action("urn:micasaverde-com:serviceId:HaDevice1", "Poll", lul_arguments, tonumber(lul_device))
end
......
</functions>
<startup>Startup</startup>
<actionList>
.
.
.
.
<action>
<serviceId>urn:micasaverde-com:serviceId:HaDevice1</serviceId>
<name>Poll</name>
<job>
socket.http.request{...}
</job>
</action>
</actionList>
Sorry, I just noticed that I modified your post instead of hitting Reply. Must be the lack of sleep. Does anybody know if there is a revert function for forum posts? ;D
For future references (And in case no revert function exist): I replied that coroutines seemed to improve things a lot as http.request are now all successful but I’m still getting luup crashes without any apparent cause showing in the log file when calling the SetLoadLevelTarget action repeatedly.
A few memory leaks do show up in the log file and I tried to resolve that by calling collectgarbage(“collect”) in the refreshStatus() function but that didn’t seem to solve it.
As for your question florin, yes i’m returning 4 in case of success and 2 in case of failure (returning only one value).
I suppose that either Verbose Logging wasn’t enabled or the messages have been filtered with grep ‘^01|^06’ . I can’t tell why it crashed from these logs.
After many days of debugging, I think I managed to find out the cause of the problem.
Calling luup.variable_set within a timed function (whether using call_timer or call_delay) is creating a memory leak (tested over many days and confirmed this morning).
Setting at the same time the same variable when using a call_action crashes LuaUpnP.
I need to update the device status at least every 5 secs, and I don’t see any other solution besides using call_delay or call_timer.
Is there any way to force vera to call the “poll” action as it is done with z-wave devices ?
I removed the timed functions from the plugin and used instead the startup Lua script to call the Poll action where I refresh the device’s status.
It’s not the elegant solution I had in mind as I have to edit the startup script every time I add a new device but at least it works!
I’m glad that you found a workaround for this. I would recommend putting the luup.call_delay function after the luup.call_action functions, because polling takes some time and there is the risk that one refreshCache instance will overlap over the other refreshCache instance if polling takes 5 seconds or more.
I added bugs #1939 and #1940 to the Mantis with the problems you reported.
I had put luup.call_delay at the start of the function just in case one of later statements crashes but it I don’t think there’s such a risk in this case.
@Roy S,
If you want to guard your code from failures, you can use [tt]pcall[/tt] to wrap the bits that might fail in odd ways: Lua 5.1 Reference Manual
I didn’t know about that. Nice alternative for the missing (try…catch).
I also thought about adding an extra 10 seconds to the first call_delay call, just in case the devices r not properly initialized when the startup code runs.
Best Home Automation shopping experience. Shop at Ezlo!