HOW-TO: Time Series

rigpapa · May 10, 2019, 3:37pm

Earlier this morning, @jonas2 mentioned that he was planning on using Reactor to detect spikes in humidity data in his bathroom, and turn on the fan automatically. By detecting spikes, he doesn’t need to seasonally adjust a threshold; that is, because ambient humidity changes over the course of the year, a fixed threshold can affect the sensitivity of detection and likely require seasonal adjustment. For example, if your indoor humidity averages 33% in winter, as it does where I live, and 50% in summer, it’s hard to find a single threshold value that works all year–setting your threshold to 55% may be too high for winter, and cause nuisance fan runs on rainy summer days–so you end up having to tweak your threshold seasonally, and having to remember to do it. By detecting spikes, the system can ignore the gradual seasonal changes of humidity, and be more likely to only respond to sudden changes as seen when someone is taking a hot shower.

So what @jonas2 needs is to implement a time series–another common HA task. By using a time series, it’s easy to find spikes in the data, and take action.

Let’s create a new ReactorSensor called “Bathroom Data Collector” and add some variable/expressions to it, to create a super-basic starting point:

series = push( series, tonumber( getstate( "Bathroom Humidity Sensor", "urn:micasaverde-com:serviceId:HumiditySensor1", "CurrentLevel" ) ), 5 )
range = max(series) - min(series)

And to our Reactor conditions, we’ll add only an Interval condition, with an interval of 1 minute. That’s short, but if you’re following along and creating this RS as you read, a short interval makes it easier to watch the changes happen; in a real-world implementation, I would likely use a 5- or 10-minute interval for this particular application.

Let’s look at the series expression first, since it does the real work. Working from the inside of the expression out, the getstate() function retrieves the value of a state variable on a device, in this case the current humidity level of our humidity sensor. The idea here is that our Interval condition will fire every minute, and when it fires, the expressions are re-evaluated, so this getstate() will grab the then-current value of the humidity sensor. The getstate() call is wrapped in a tonumber() to convert it to a true numeric value (it comes back from getstate() as a string), and that is in turn wrapped in a push(), which adds the fetched, converted value to an array: the first argument is the array (which is created if it doesn’t exist), the second argument is the value to be added to the array, and the third argument is the maximum allowed length of the array. I’ve set the maximum array length at 5, so that we keep only the five most recent results; additional pushes cause the oldest value to fall off the array, so it will never be longer than five elements. Since our interval is one minute, that means our array will have five minutes’ worth of samples. The function result of push() is the new, updated array, which is stored in series (so it’s self-updating).
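If it helps to see the bounded-array behavior outside of Reactor, here is a rough Python sketch of what push() is described as doing above. It is only a model for experimenting offline, not Reactor's implementation; the Python push() helper and the sample readings are made up for illustration.

```python
def push(series, value, max_len):
    """Return a new list with value appended, keeping only the newest max_len items."""
    updated = (series or []) + [value]   # a missing array is treated as empty
    return updated[-max_len:]            # older samples fall off the front

series = None  # the array does not exist before the first sample arrives

# Hypothetical readings; getstate() hands them back as strings, so convert first.
for reading in ["41", "41", "42", "47", "52", "55", "54"]:
    series = push(series, float(reading), 5)

print(series)  # [42.0, 47.0, 52.0, 55.0, 54.0]
```

Seven samples went in, but only the five most recent remain, which is the "five minutes of history" window when sampling once a minute.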

Since we now have five minutes’ worth of humidity samples to look at, we can pretty easily find a spike in that data by finding the extrema of the array elements (the minimum and maximum) and computing the difference between them, which is what the range expression does. It uses the expression parser’s built-in min() and max() functions, passing in the array for each to scan, to make quick work of the job.

Now, to determine if we should turn on the fan, it’s a simple matter of examining range to see if it is greater than some threshold. For example, if we wanted to trigger our fan when the change in humidity is 4% or more in our five-minute period, we would simply check to see if range >= 4 (do this in another RS; see further comment below).
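To make the difference between a fixed threshold and a spike test concrete, here is a small Python comparison. All of the humidity windows are invented for illustration, and the 55% fixed threshold and 4-point spike threshold are just the example figures used in this post.

```python
def humidity_range(series):
    """Difference between the largest and smallest sample in the window."""
    return max(series) - min(series)

FIXED_THRESHOLD = 55.0  # absolute humidity level, in percent
SPIKE_THRESHOLD = 4.0   # allowed change within the window, in percentage points

# Hypothetical five-sample windows for four situations.
windows = {
    "winter, idle": [33.0, 33.5, 33.0, 34.0, 33.5],
    "winter, shower": [33.0, 36.0, 40.0, 43.0, 44.0],
    "summer, rainy": [55.0, 55.5, 56.0, 56.0, 56.5],
    "summer, shower": [50.0, 55.0, 61.0, 66.0, 68.0],
}

results = {}
for name, series in windows.items():
    fixed = series[-1] >= FIXED_THRESHOLD              # fixed-threshold test on the latest sample
    spike = humidity_range(series) >= SPIKE_THRESHOLD  # the range >= threshold test
    results[name] = (fixed, spike)
    print(f"{name}: fixed={fixed} spike={spike}")
```

With these made-up numbers, the fixed threshold misses the winter shower and fires on the rainy summer day, while the spike test fires only for the two showers.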

But there are two minor problems with this simple implementation. First, Reactor’s getstate() function forces an immediate re-evaluation of the sensor configuration whenever the subject device state variable changes, and this causes an undesirable side-effect: whenever the humidity sensor updates itself, its state variable changes, and this causes the RS to re-evaluate the conditions and store an additional element in the array between our desired intervals, which may skew our data. More importantly, though, is that our Interval condition has two states–true and false–and each transition between them causes our expressions to be re-evaluated–so they will be evaluated twice per interval. That means the configuration we have created will actually store two humidity samples per minute (one when the condition goes false to true and one when the condition goes true to false), and that will certainly goof up our data.

We can easily fix this by having the series expression only update the array when the Interval condition is true. This will filter out both the unpredictable updates from the humidity sensor itself, and the updates caused by the “falling edge” reset (true to false transition) of the interval condition itself. Here’s how we change our expressions to do that:

isint = getstate( "Bathroom Data Collector", "urn:toggledbits-com:serviceId:ReactorGroup", "GroupStatus_root" )
series = if( isint=="1", push( series, tonumber( getstate( "Bathroom Humidity Sensor", "urn:micasaverde-com:serviceId:HumiditySensor1", "CurrentLevel" ) ), 5 ), series )
range = max(series)-min(series)

The new isint variable is “self-reflective”: it’s looking at its own ReactorSensor to get the status of the root group, which contains (only) our Interval condition. So when isint goes to “1”, the Interval condition has gone true and the re-evaluation is caused by that event, and not any external update/trigger or the interval’s reset.

So then all we need to do is wrap our series expression in an if() function. The if() function takes three arguments: a conditional expression (which evaluates to true or false), an expression to be evaluated if the conditional is true, and an expression to be evaluated if the conditional expression is false. We make our push() expression the true expression, and we supply series as the false expression. That last bit is important, because all expressions return values, so if it’s not time to update our series array, we must return series as it currently is, thus setting it back to itself. So basically, with the if() in place, if we’re updating on the interval trigger (isint=="1"), we return the array with a new value pushed, and if not, we return the array untouched.

That takes care of the side-effects of re-evaluations that can happen off the interval’s schedule, and we should be good to go.
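Here is a rough Python simulation of why the gate matters. It models each re-evaluation as an event carrying the root group status and the current humidity; the event list is invented, and it assumes the sensor's own updates arrive while the Interval condition is false (that is, between the short true pulses).

```python
def push(series, value, max_len):
    """Append value and keep only the newest max_len items."""
    return (series + [value])[-max_len:]

def evaluate(series, group_status, humidity, gated):
    """One re-evaluation of the series expression, with or without the if() gate."""
    if gated and group_status != "1":
        return series                 # not the interval's rising edge: leave the array alone
    return push(series, humidity, 5)

# (root group status, humidity) for each re-evaluation, in order.
events = [
    ("1", 40.0),  # interval fires (false to true)
    ("0", 40.0),  # interval resets (true to false)
    ("0", 41.0),  # humidity sensor reports a new value between intervals
    ("1", 41.0),  # interval fires again
    ("0", 41.0),  # interval resets
]

ungated, gated = [], []
for status, humidity in events:
    ungated = evaluate(ungated, status, humidity, gated=False)
    gated = evaluate(gated, status, humidity, gated=True)

print(ungated)  # [40.0, 40.0, 41.0, 41.0, 41.0]
print(gated)    # [40.0, 41.0]
```

Two intervals elapsed, so two samples is what we want; the ungated version has already filled the whole five-element array.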

One more important implementation note: because of the evaluations performed and the importance of their timing, I would do the series data collection shown here in a separate ReactorSensor from any logic. This RS is only a data collector. Let the decisions be made elsewhere. This will help prevent excessive (and potentially unnecessary) evaluations.

Now… tuning. There are three factors in tuning this time-series generator:

The interval at which samples are collected (sampling interval);
The total length (in time) of the series;
The threshold for reaction.

The sampling interval is, predictably, controlled by the Interval condition. If you want samples collected every 15 minutes, then you would set the Interval condition for 15 minutes; if you wanted samples hourly, then you would set it for one hour.

The total length in time of the series is controlled by two factors: the sampling interval, and the maximum length of the series array, which is controlled by the third argument to push(). If our sampling interval is 15 minutes, and our maximum array length is 4, then our total time is 4 * 15 = 60 minutes. If our sampling interval is two hours, and our array length is 12, then the total time is 2 * 12 = 24 hours.
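The same arithmetic as a tiny Python helper, in case you want to play with combinations before committing to an Interval setting. The function names are my own, not anything in Reactor.

```python
import math

def window_minutes(interval_minutes, max_len):
    """Total time span covered by a full series array."""
    return interval_minutes * max_len

def length_for_window(window_min, interval_minutes):
    """Smallest array length whose window covers at least window_min minutes."""
    return math.ceil(window_min / interval_minutes)

print(window_minutes(1, 5))          # 5 minutes, as in the example RS
print(window_minutes(15, 4))         # 60 minutes
print(window_minutes(120, 12) / 60)  # 24.0 hours
print(length_for_window(30, 10))     # 3 samples to cover 30 minutes at 10-minute intervals
```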

The last tuning element is the threshold, and this is something you will just have to “feel out” based on the parameters of your room and devices, and your sensibilities. It may take a little experimentation to figure out the right value. In this case, we might watch range while running the shower for a bit and see how it changes. Or, you could add another expression with push() that stores the last 100 values of range for you to review.

We can also get as complicated as we want with the math on this data. Let’s add a couple of expressions, in the order shown:

lastMA = if( isint=="1", MA, lastMA )
MA = sum(series) / count(series)
deltaMA = abs( MA - lastMA )

Here, MA is the computed average of the series data–a simple moving average. The expression before it, lastMA, will, on the interval as we did with series, store the current value of MA. Because lastMA is evaluated before MA, lastMA will store the previously computed value of MA before MA then changes. Finally, deltaMA gives us the difference between the two. For some series, you may need to “boil out” small spikes in the data so you can focus on the really big ones you care about, and using a moving average such as this can help you do that. But the idea here is to show that once you have the data in the series array, you can do other computations with it, not just the min/max shown in the original examples.
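A quick Python walk-through of the evaluation order may make the lastMA trick easier to see. This is only a sketch with invented samples: each loop pass stands for one interval-triggered evaluation, and the None check is my own addition to handle the first pass, when there is no previous average yet.

```python
def push(series, value, max_len):
    """Append value and keep only the newest max_len items."""
    return (series + [value])[-max_len:]

series, ma, last_ma = [], None, None
deltas = []

for sample in [40.0, 40.0, 41.0, 50.0, 56.0]:
    series = push(series, sample, 5)
    last_ma = ma                     # evaluated before MA, so it captures the previous average
    ma = sum(series) / len(series)   # simple moving average over the window
    delta = None if last_ma is None else abs(ma - last_ma)
    deltas.append(delta)

print([None if d is None else round(d, 2) for d in deltas])
# [None, 0.0, 0.33, 2.42, 2.65]
print(max(series) - min(series))  # 16.0
```

On the same data the range jumps to 16 while the moving average shifts by under 3 per step, which is the smoothing effect described above.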

EDIT 2019-08-24: Several changes made above, in the text directly rather than as deltas here, for ease of reading/continuity. Most importantly, remember that getstate() always returns a string, even when the value being retrieved is (conceptually) a number–if you want to use it as a true number, you have to wrap your getstate() call with a tonumber() call.

EDIT 2020-12-28: The arraypush() function is now deprecated, so these instructions have been amended to show the new push() function replacement.

akbooer · May 10, 2019, 6:00pm

Gosh, that reminds me of an older thread…

rigpapa · May 10, 2019, 6:10pm

Absolutely! The idea certainly isn’t new, and has a lot of applications.

akbooer · May 10, 2019, 7:15pm

Ah! But it’s the way that it’s implemented that makes or breaks it.
Yours looks very neat

rigpapa · May 10, 2019, 7:21pm

I think its biggest advantage is that you can make it a true series, not just a last-current comparison as in the PLEG example (if I read it correctly–haven’t used PLEG in years so my PLEG-fu is weak). And if you really want to hammer on the series data, it’s readily accessible from Lua in the “Run Lua” actions, so if the expression syntax doesn’t deliver what’s needed in a specific case, there’s a way out of that jail.

akbooer · May 10, 2019, 7:59pm

Exactly, that’s why I implemented a really easy way to recover time series in openLuup…

…reading data from the cache/archive is trivial, using an additional {start,end} time parameter to variable_get():

local sid = "urn:upnp-org:serviceId:TemperatureSensor1"
local now = os.time()
local v2,t2 = luup.variable_get (sid,"CurrentTemperature", 33, {now-3600, now})
print ("over the last hour", pretty {values = v2, times=t2})

gives

over the last hour 	{
	 times = {1530023886,1530023926.5466,1530024526.9677,1530025127.1189,1530026327.7046,1530026929.0275,1530027486},
	 values = {31.1,31,31.2,31.3,31.2,31.3,31.3}
}

Looking forward to people making great use of time series!

Edit: Sorry, hope I haven’t derailed your thread.

jonas2 · May 11, 2019, 6:08am

You are awesome! Thank you. I will give this a try later tonight … But one question comes up when I read through this … Is all this done with one RS or two sensors? You have been really good at describing this, but I guess I have some problems understanding because English isn’t my native language … But I will give it a try tonight!

The next thing is that when I start the exhaust fan above the stove, the ventilation should go to 0% … even if the moisture is too high.

Now I have 3 simple sensors, 0%, 30% and 100%, controlled by watts from the stove, moisture from the bathroom, and the 30% that is triggered when the stove fan is off and moisture is low …

The 0% RS is always prioritized because I won’t get rid of the fumes from the stove if the house ventilation is on too high.

rigpapa · May 11, 2019, 12:36pm

The data collector in the example above should be its own RS. You can use a second RS to react to the series value from the data collector RS, and it may contain as many groups as are necessary to control all your various endpoints related to the series data. This is not a functional limitation, just a recommended approach.

jonas2 · May 12, 2019, 7:46am

I’m so sorry, I fully understand if you don’t have time to help me more than you already have, but if you do …

I still don’t get how to get this right. I have made one ReactorSensor like this, but I do not understand how I should move on.

Regards Jonas

rigpapa · May 12, 2019, 11:52am

OK. First a couple of things about what you’ve done so far…

In your getstate() function, you’ve put the device type as the second argument, but it needs to be the service ID. This is why your series is not getting any data at all. The service ID has “serviceId” as the third component, and for this particular variable, would be urn:micasaverde-com:serviceId:HumiditySensor1.
Your isint definition is out of position. In this case, we want to make sure isint is defined before it is used, so move isint above series. You can do that by grabbing the hamburger icon (you’ll see the cursor change to a hand) and dragging.

Then your time series should start building properly. Now with that done, just for a quick start to get you rolling, here’s the simplest test to turn a fan on:

1. Create another ReactorSensor;
2. Add a single condition to it:
   a. Select the Device State condition type
   b. Choose the ReactorSensor that is generating the humidity data series
   c. Choose the variable range from the variables drop-down
   d. Choose the “>=” operator
   e. For the value, put in whatever maximum change in humidity you are going to allow; let’s say 5%… put in 5.
3. Save
4. Go to the Activities tab
5. To the “is TRUE” activity, add a “Device Action” to turn on the fan.

That’s it. That will get the fan turned on. I leave it to you to figure out how/when to turn it off!

Hint: create two groups; rename the first group to “Fan On” and move/drag the condition we created above into the first group; remove the “fan on” device activity from where it currently is and put it in the “Fan On is TRUE” activity. Rename the second group to “Fan Off”, and put your condition logic for when the fan should turn off there, and the Activity “Fan Off is TRUE” should turn off the fan.

jonas2 · May 12, 2019, 1:39pm

Thank you!!! I think I have it working now … When the moisture is high AND the stove fan is off (< 11 watts), the fan is set to 100%.

When the fan is ON, the dimmer value (ventilation) is set to 0%.

You are the one that makes Vera a great product!!!

Now I just need to figure out how to set the fan to 30% when the moisture is low and the stove fan is off …

By the way, I just sent you a donation

regards Jonas

jonas2 · May 20, 2019, 3:21pm

Now to the big question … Will it be possible to make a condition that “remembers” the value that triggered the fan to go on, to use that moisture value as the trigger for fan OFF?

rigpapa · May 20, 2019, 4:02pm

I’ll give you hints:

If you create a variable with no expression (just blank/empty), then any value assigned by the SetVariable action (in Activities) on the ReactorSensor will be persistent. You can use that variable in other expressions and conditions.
In order to force the creation of the state variable that mirrors the empty expression variable, you have to assign it a value (e.g. 123) and then Restart the RS, then go back and remove/empty out the value. Then the state var will exist so you can use it in conditions. This is only necessary due to a small bug that will be fixed in an upcoming version.
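To show the shape of the logic those hints point at, here is a loose Python model, not Reactor configuration. The trigger_level attribute stands in for the expression-less variable that a SetVariable action would fill in; the FanLatch class, the readings, and the choice to switch off once humidity drops below the stored value are all assumptions for illustration.

```python
def push(series, value, max_len):
    """Append value and keep only the newest max_len items."""
    return (series + [value])[-max_len:]

class FanLatch:
    def __init__(self, spike_threshold):
        self.spike_threshold = spike_threshold
        self.fan_on = False
        self.trigger_level = None  # persists between evaluations, like the empty-expression variable

    def update(self, series):
        current = series[-1]
        if not self.fan_on:
            if max(series) - min(series) >= self.spike_threshold:
                self.fan_on = True
                self.trigger_level = current  # remember the humidity that turned the fan on
        elif current < self.trigger_level:
            self.fan_on = False               # back below the remembered value: fan off
        return self.fan_on

latch = FanLatch(spike_threshold=5.0)
series, log = [], []
readings = [41, 41, 42, 47, 52, 55, 54, 53, 52, 51, 50, 49, 48, 47, 46, 45]
for reading in readings:
    series = push(series, float(reading), 5)
    log.append(latch.update(series))

print(latch.trigger_level)  # 47.0
print(log.index(True))      # 3: the fan turns on at the fourth sample
print(log[-2:])             # [False, False]: off again once humidity is below 47
```

One thing to watch: range grows on a fast drop as well as on a rise, so if your humidity falls quickly after the fan runs, the "on" test could fire again right after the fan turns off. In this made-up data the decay is slow enough that it doesn't.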

rigpapa · May 23, 2019, 2:49pm

Have you been successful, @jonas2 ?

jonas2 · May 24, 2019, 7:43am

Nope, I haven’t had the time yet. Will give it a try this weekend

maddios · August 24, 2019, 5:41am

I’ve got a weird problem with my test of this setup.

I created a new reactor sensor, added the 3 expressions in the first post, I’m getting the series pile up an array of humidities, but for some reason the range just has an error “Can’t coerce null to number”

I have them in the order listed: isint, series then range.

series looks like: [“55.75”,“55.75”,“55.75”,“55.49”,“55.49”,“55.49”]

I’m pretty confused why max and min are both returning a null for this series, any tips are appreciated

Second issue is that I created a second ReactorSensor to control the actual fan but when I try to add device state for the series sensor I can’t select the “range” variable, it’s simply not listed in the dropdown.

I must be doing something wrong.

Thanks

rigpapa · August 24, 2019, 2:20pm

Hmm… something to fix in the original post! Notice that your series is filled with strings. The min() and max() functions are picky about data types–they need to be numbers to be considered, so with no eligible values (since they are all strings) the result of either function will be null. Our series expression needs to wrap its getstate() in a tonumber(): if( isint=="1", arraypush( series, tonumber( getstate( "Office Humidity", "urn:...etc...", "CurrentLevel" ) ), 10 ), series )
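For anyone who wants to see the type issue in isolation, here is a small Python model of the behavior just described. The picky_max() and picky_min() helpers are my own stand-ins that skip anything that isn't a true number; they are not the expression parser's actual code.

```python
def numeric_only(values):
    """Keep real numbers only; strings are skipped, mirroring the behavior described above."""
    return [v for v in values if isinstance(v, (int, float)) and not isinstance(v, bool)]

def picky_max(values):
    numbers = numeric_only(values)
    return max(numbers) if numbers else None  # no eligible values: null result

def picky_min(values):
    numbers = numeric_only(values)
    return min(numbers) if numbers else None

as_strings = ["55.75", "55.75", "55.75", "55.49", "55.49", "55.49"]
as_numbers = [float(v) for v in as_strings]  # what wrapping getstate() in tonumber() gives you

print(picky_max(as_strings), picky_min(as_strings))             # None None
print(round(picky_max(as_numbers) - picky_min(as_numbers), 2))  # 0.26
```

Subtracting one null from another is what produces the "Can't coerce null to number" error in range; once the samples are real numbers, the subtraction works.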

With regard to the range variable, mark it as “exported” (up/arrow on the expression row; it will highlight green when the expression value is exported).

I’ll update the original post as well.

maddios · August 24, 2019, 6:34pm

thanks, yeah that was it for the first issue, changing type to a number worked, thanks

Any idea how to get the 2nd reactor sensor to see these calculated values from the first sensor? Did I miss a step?

I’m following these instructions and step 2, part C i have no range:

Create another ReactorSensor;
Add a single condition to it:

Select Device State condition type
Choose the ReactorSensor that is generating the humidity data series
Choose the variable range from the variables drop down.
Choose the “>=” operator
For the value, put in whatever maximum change in humidity you are going to allow, let’s say 5%… put in 5.

rigpapa · August 24, 2019, 7:13pm

That was the second paragraph of my reply: you need to mark the “range” variable “exported”.

maddios · August 24, 2019, 7:26pm

Oh yeah, sorry I totally missed that bit.

It works now, thanks!