openLuup: Data Historian

Data Historian now supports mirroring to external Graphite and InfluxDB databases, for use with Grafana (as of development branch v18.11.26).

Great! I’m going to try it on InfluxDB!

Does that also support custom date ranges in Grafana (not only the “… so far” ranges), do you know?

Another question, I’ve been logging my sensors with Historian in a Whisper database for quite some time now.

If I look at graphs in Grafana over larger timespans, I notice that sometimes a sensor logs an invalid value (like a far too high temperature or light measurement), which gives a big peak in the graph and flattens all the other measurements.

My question is twofold:

  • is there a way to remove invalid data from the whisper files? I’ve tried this with whisper-tools, but it gives me errors about not readable metadata.
  • is there a way to specify a range for the logged variables, thus ignoring and eliminating the invalid values from getting stored in the DB in the first place?

Thanks,

Joris

Not sure I fully understand. The date range as applied by the Grafana menu bar works, as do the per-graph duration and timeshift fields. I haven’t tried an explicit time interval in the metric query line on a graph, since I’m not really SQL-savvy.

Yes, this is, IMHO, a ZWave transmission problem. It’s particularly prevalent with some devices, notably, for me, energy meters. It is not an historian problem, per se. It is, however, a nuisance.

- is there a way to remove invalid data from the whisper files? I've tried this with whisper-tools, but it gives me errors about not readable metadata.

Yes, there is. And I use it about once a week to fix such problems. You can’t use the Whisper tools, since the database is not binary compatible. I have, however, written an openLuup CGI, originally for DataYours, which allows editing of the data. I’ve made a version which is tailored towards the Historian database. You exercise the basic editing functionality through an HTML form - I have a crude webpage which suffices, but I would really like to do better. However, I’m thinking that Graphite/InfluxDB mirroring means that you could use the standard tools instead.

- is there a way to specify a range for the logged variables, thus ignoring and eliminating the invalid values from getting stored in the DB in the first place?

I wrote a user-defined processing module for DataYours which would allow just this, but, in the spirit of recording what’s actually received, rather than what you would like to have received, I haven’t implemented that for the Historian. There’s a philosophical discussion to be had there (one which has been going on on this forum for at least five years in the context of the dataMine plugin; see: http://forum.micasaverde.com/index.php/topic,14692.0.html).
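Purely by way of illustration (this is a sketch of the idea being discussed, not something the Historian actually does), such filtering is just a simple range check applied before a value reaches the database:

    -- hypothetical sketch only: a range filter of the kind being discussed,
    -- dropping obviously invalid readings before they are stored
    local limits = {
      temperature = {min = -40, max = 60},       -- degrees C
      light       = {min = 0,   max = 100000},   -- lux
    }

    local function accept (kind, value)
      local range = limits[kind]
      if not range then return true end          -- no rule defined: store everything
      return value >= range.min and value <= range.max
    end

    -- eg. accept ("temperature", 21.5)  --> true
    --     accept ("temperature", 850)   --> false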

I’m open to good suggestions as to how to proceed.

So, to recap, there is a simple (and crude) solution, which I’m happy to share, if that addresses your need.

Thanks for your reply akbooer.

I’m sure it’s a ZWave transmission glitch or sensor glitch (they are getting smaller and cheaper, so I think the ICs are prone to measurement errors once in a while).

If you could share the crude clean-up code that would be awesome. I’m going to test the InfluxDB route in the near future as well.

I’m looking at improving this massively, possibly adding it as an editable HTML table to the Console > History DB page, along with the graphic. However, for the moment here’s a file [tt]graphite-editor.lua[/tt] which you should put into [tt]cmh-ludl/cgi/[/tt].

I actually invoke this from a link on my Grafana pages:

http://openLuupIP:3480/cgi/graphite-editor.lua

This brings up an HTML page with three sections:

[ul][li]READ - a form with three fields and a “Read” button
[list]
[li]Target - the finder pattern of the metric you want to edit, eg. openLuup.2*..Memo[/li]
[li]From - the start time, can be Graphite relative time (-1d, -3h) or ISO datetime (2018-11-28T16:45)[/li]
[li]Until - the stop time, same format (can also have the value ‘now’)[/li]
[/list]
[/li]
[li]Data field - initially blank, filled with the data from the above time interval once you press the Read button. This request goes directly to the Historian’s Graphite finder, so it can include fully qualified metric names, or wildcards. It can, therefore, return data from multiple metrics, but try to stick to just one. The returned JSON data format is the standard Graphite render one (see the example after this list).[/li]
[li]WRITE - a form with a single text field and a Write button[/li]
[list]
[li]POST content - here is where you put the revised data to write. The format is exactly the same as returned in the Data field, so the easiest thing by far is simply to cut and paste everything from the returned data to this field. Find and fix the incorrect data values, taking care not to screw up the JSON format or change any of the times. For this reason, it’s best to have homed in on the required part of the data by fine-tuning the time intervals on the read request (which you can repeat as many times as you like). Once you press the Write button, the revised data is sent to the relevant archive file, and the points which have actually been changed are returned in the data area by way of verification.[/li]
[/list][/ul]
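For reference, the Data field contents follow the standard Graphite render JSON format, which looks something like this (the metric name and values here are purely illustrative); each datapoint is a [value, timestamp] pair, and the errant spike is the value you would correct before pasting into the WRITE field:

    [
      {
        "target": "some.metric.name",
        "datapoints": [[21.4, 1543420800], [9999.0, 1543424400], [21.6, 1543428000]]
      }
    ]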

There are some subtleties. All archives older than the one retrieved and edited get updated using the correct aggregation function - this means that everything is taken care of for you, you don’t need to do separate edits for each individual archive, and the Whisper file remains data-consistent. BUT, the question is “which archive did you retrieve?” The answer depends on the OLDEST time interval that you asked for and the definition of the file’s retentions (for example, with retentions of 1m:1d,10m:7d, a From time of three days ago means you retrieve and edit the 10-minute archive), so you are best to set the From time relatively close to the time of the errant sample, and to do so as soon as you spot the error. It’s not a problem, though, because after a time all the younger archives will have been overwritten.

So, as I warned you, it is very crude, but quite easy to use - the explanation takes far longer to read than it does to make an edit.

This is really great stuff. Definitely lucky to have someone as dedicated as you, akbooer. I have been playing catch-up over the last month or so and have, for the most part, caught up.

That is, I have openLuup and Grafana running on a Pi. I am logging system metrics to a NAS. I have been able to view both old DataYours whisper files and new Historian whisper files in Grafana. Really great stuff, but… I have a bit of a problem currently, and another question about pushing data to a whisper file using a scene (save that one for later).

First, the problem I’m having: I can plot openLuup metrics in Grafana, but when I try to zoom in to a time period by using the mouse to select a new time range, no data or only partial data shows. I’ve used the built-in Grafana query inspector to look at the data request and the response, and something doesn’t seem quite right. When I ‘zoom’ into old data using a relative time request like this (note: from=-2d and until=-1d),

    xhrStatus: "complete"
    request: Object
        method: "POST"
        url: "api/datasources/proxy/1/render"
        data: "target=openLuup.2_openLuup.openLuup.MemAvail_Mb&from=-2d&until=-1d&format=json&maxDataPoints=2000"

the data shows in the graph, but if I try to use UNIX Epoch time in the request like this (note: from=1549399500)…

    data:"target=openLuup.2_openLuup.openLuup.MemAvail_Mb&from=1549399500&until=-1d&format=json&maxDataPoints=2000"

or…

    data:"target=openLuup.2_openLuup.openLuup.MemAvail_Mb&from=1549399500&until=1549485900&format=json&maxDataPoints=2000"

I get nothing showing on the graph. If I inspect the response, I see that in the first case I get back 256 data points, with the first one being two days ago from the present system time. But if I use the Unix epoch time for that same time, I get back 3 points with the first one being ~24 hours after the requested time. It’s like I’m getting data back, but for the wrong time range.

I’ve searched and searched for what is going wrong, even peeked at some of openLuup source, but haven’t yet been able to find the problem. Could this be a bug in the historian.lua or whisper.lua in the conversion from one time to another?

Great diagnostic work! I wish everyone was as thorough at describing and investigating problems…

I know where the problem might be - there is a syntax ambiguity between Unix epoch and ISO format times if they’re not fully expressed.
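By way of illustration only (this is just a sketch of the kind of parsing involved, not the actual openLuup code), the three time formats seen in this thread have to be told apart something like this:

    -- minimal sketch, NOT the openLuup implementation: distinguish relative times
    -- ("-2d", "now-3h"), ISO 8601 ("2018-11-28T16:45"), and bare Unix epoch seconds
    local function parse_time (t, now)
      now = now or os.time()
      if t == "now" then return now end
      -- relative offsets from now, eg. "-2d" or "now-3h"
      local n, unit = t:match "^now%-(%d+)([smhdwy])$"
      if not n then n, unit = t:match "^%-(%d+)([smhdwy])$" end
      if n then
        local seconds = {s = 1, m = 60, h = 3600, d = 86400, w = 7*86400, y = 365*86400}
        return now - tonumber (n) * seconds[unit]
      end
      -- ISO 8601 date/time, eg. "2018-11-28T16:45"
      local y, mo, d, h, mi = t:match "^(%d+)%-(%d+)%-(%d+)T(%d+):(%d+)"
      if y then
        return os.time {year = tonumber(y), month = tonumber(mo), day = tonumber(d),
                        hour = tonumber(h), min = tonumber(mi)}
      end
      -- a bare run of digits is taken to be Unix epoch seconds (the Grafana case);
      -- note that an ISO basic date like "20181128" would be ambiguous here
      if t:match "^%d+$" then return tonumber (t) end
    end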

I’ll take a look over the weekend, because I’m otherwise busy until then.

Sorry for the problem, but glad that you are, on the whole, enjoying the system!

Wonderful, I will patiently wait for the results of your investigations. Please let me know if you would like me to provide more information.

Hi,

I see an occasional Lua error on line 101 in whisper.lua. The f: format call fails with an error that it expects a number but gets a string.

Maybe this helps.

Cheers Rene

Interesting… I’ve not noticed that, and the code hasn’t changed for over 5 years!

Must be something using it incorrectly… any more diagnostics available on that (ie. what’s going on when it happens?)

Thanks

I’ve tried, and failed, to replicate this error. Using the mouse in Grafana to zoom into a graph seems to work just fine for me. Perhaps we are using different Grafana versions? I’m somewhat out of date, running on:

[tt]Grafana v3.1.1 (commit: v3.1.0+20-g2bfccd7). [/tt]

With your HTTP request examples, I’m not surprised. The issue is, as I mentioned, that the current implementation of the Graphite API uses ISO 8601 date/time format (YYYY-MM-DDThh:mm:ss and variants). This is, indeed, not strictly in keeping with the Graphite standard, but is that way for historical reasons, and hasn’t caused me any problems with Grafana.

I need to go and check the Graphite documentation again to see which formats are supported there. Some of their relative time syntax can be quite exotic, and I didn’t implement all of it.

I checked the Grafana version and I’m running v5.4, so much newer.

Yes, it seems that Grafana is using epoch time when querying the database using a mouse selection box, and for things like ‘yesterday’ and ‘day before yesterday’ according to the query inspector. I suppose the built-in query inspector was not yet implemented in v3.1.1 (it arrived in V4.5), so you cannot easily look to see how the request from Grafana looks on your system.

The things that work are the relative times that use a ‘now’, like ‘from=now-2d’ and ‘to=now-1d’ gives me yesterday.

I have searched the documentation and cannot find a setting that allows one to choose the time format for the query. Perhaps there is a config file or something. The other option would be to add a handler for epoch time in the graphite API that openLuup uses. How to move forward?

Clearly, I need to update.

Yes, it seems that Grafana is using epoch time when querying the database using a mouse selection box, and for things like 'yesterday' and 'day before yesterday' according to the query inspector. I suppose the built-in query inspector was not yet implemented in v3.1.1 (it arrived in V4.5), so you cannot easily look to see how the request from Grafana looks on your system.

Actually, easy. Just enabling debug on the Graphite API implementation tells me what’s going on. Indeed, I see that epoch time is used. Curious, though, since this seems to be in direct conflict with the official Graphite API:

https://graphite.readthedocs.io/en/latest/render_api.html#from-until

The things that work are the relative times that use a 'now', like 'from=now-2d' and 'to=now-1d' gives me yesterday.

Yes, that’s explicitly coded, per the documentation.

I have searched the documentation and cannot find a setting that allows one to choose the time format for the query. Perhaps there is a config file or something. The other option would be to add a handler for epoch time in the graphite API that openLuup uses. How to move forward?

Maybe there isn’t a setting anywhere. Curiously, the epoch time seems to work sometimes. It’s been over 5 years since I implemented this. I need to look at it closely again. I’m sure we can fix this.

I’ve pushed a fix to the development branch (v19.2.9) … trusting that this doesn’t break any other app.

Simply type development into the openLuup Update box on the Plugins page and click the update button.

The use of Unix epoch in the from/until parameters still defies the official documentation[sup]*[/sup], IMHO, but that’s what Grafana seems to do.

Note that if you’re requesting a time interval a long time in the past, then it may be that the archive retention definition is such that no data actually exists in that interval. i.e. the sample rate of the archive is coarser than the requested rendering interval.


[sup]*[/sup] I do note, however, that epoch IS used in the Metrics (rather than the Render) API, per this doc page:

https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api

Updated to the latest development version of openLuup as you noted (v19.2.9).

Genius, absolute genius. It’s working as expected with the update.

I have a couple more questions that I will get to in the next day or so. Thanks.

[quote]Interesting… I’ve not noticed that, and the code hasn’t changed for over 5 years!

Must be something using it incorrectly… any more diagnostics available on that (ie. what’s going on when it happens?)

Thanks[/quote]
It could be some variable on my system populated with an odd value type. I’ll keep an eye out.

Cheers Rene

Now that I have data collection humming along I, of course, need to up my game, and I have a couple of questions/clarifications.

  1. I see that the ‘storage-schemas.conf’ file is in the /history path. I also understand that the storage schema for a new file is determined by regex matching on the pattern, such that ‘\.d$’ matches a file ending in ‘.d’. My question is: how do I set the filename? In DataYours, I request to watch a variable whose metric ends with ‘.watts’, for example, and I can modify the storage schemas for that metric type. Now, in Data Historian I simply check a box next to a variable; is there a way to specify the metric filename so that I can match a particular storage schema? Yes, I want to customize the storage schema for my metric types.

  2. I have an outdoor temperature sensor (MySensor-based). I have PLEG resetting a string variable each day at midnight to the current outdoor temperature. I also have PLEG watching for variable changes and setting two string variables, one tracking the lowest temperature of the day, and another tracking the highest temperature of the day. I would like to set up a PLEG trigger at 23:59 to push each of these temperatures to two whisper files (high and low) with a timestamp of noon and have a storage-schema 1d:10y.

How would I create the files initially, and secondly, how would I push these variables with a timestamp to the file? To push the variables, I think part of the answer is the ‘netcat’ command, but I’m not exactly sure. (Maybe this is not a Data Historian topic, sorry.) I know I could set it up as a storage schema of ‘5m:1d,1d:10y’ with aggregation of ‘max’ or ‘min’, depending on whether it is the ‘high’ or ‘low’ variable, respectively. But what time will be reflected in the ‘1d’ aggregation? I’ve sort of tried this and I see that the timestamp of the capture is some seemingly random time, but always the same time, just not when I want it to be captured.

Many thanks for some quick guidance on these topics.

[quote=“skogen75, post:119, topic:199464”]Now that I have data collection humming along I, of course, need to up my game and have a couple questions/clarifications.
[…]
Many thanks for some quick guidance on these topics.[/quote]

This ain’t going to be quick…

1. I see that the 'storage-schemas.conf' file is in the /history path. I also understand that the storage schema for a new file is determined by regex matching on the pattern, such that '\.d$' matches a file ending in '.d'. My question is: how do I set the filename? In DataYours, I request to watch a variable whose metric ends with '.watts', for example, and I can modify the storage schemas for that metric [i]type[/i]. Now, in Data Historian I simply check a box next to a variable; is there a way to specify the metric filename so that I can match a particular storage schema? Yes, I want to customize the storage schema for my metric types.

[ul][li]Following the Graphite system, yes, the rules for storage schemas and aggregations are set in configuration files [tt]storage-schemas.conf[/tt] and [tt]aggregation-schemas.conf[/tt] using regex. Deviating slightly from the standard, these files are placed in the database folder to which they apply. This means that different databases can have different rules. There are default files for the case where physical files are not present in the folder - these are stored in [tt]openLuup/virtualfilesystem.lua[/tt]. You could use these as a template for the physical files (there’s a small example sketch after this list).[/li]
[li]Sadly, the console UI is not quite yet sophisticated enough to use the checkboxes on the Historian cache page to define which files are archived. These are currently read-only fields (and set to that in the HTML, but the browser doesn’t seem to honour this.) I am slowly improving the HTML pages (and my knowledge of HTML at the same time) so this should come. My only rule is that it must be pure CSS/HTML5, no JavaScript. So if you have some pointers for me, that would be good.[/li]
[li]In fact, the Historian does not exactly follow the Graphite/DataYours way of using the configuration rules. Each rule has a name defined in square brackets, and these are referenced by another rule set, this time defined in the file [tt]openLuup/servertables.lua[/tt], under the structure [tt]archive_rules[/tt]. This brings another, necessary, level of flexibility for the Historian. To change these rules, you’d have to edit that file directly, since it’s not overridden by any physical file. The key fields here are:
[list]
[li]schema - defines the name of the actual aggregation rule schema to use[/li]
[li]patterns - a list of patterns to use when matching the Historian data variables. Note that these rules DON’T use regex OR Lua patterns, but the Graphite API wildcard style (as described in the Graphite “Graphing Metrics” documentation), which is very powerful and tailored specifically to the Graphite storage finder naming syntax.[/li]
[/list]
[/li]
[li]Since this second level of rule indirection is somewhat confusing, I am, in fact, in the process of developing it further to include retentions, xFF and aggregation methods, and deprecate indirect reference to .conf files, as per the comment that you will find there in the code. This means that they will be completely independent from the Graphite .conf files, although when the Historian is used as a DataYours substitute to write non-Historian files, it will behave as a simple Graphite system and revert to using them.[/li]
[li]
This may not directly answer your question #1, but all you really need to know is that once an appropriate archive file is created, the tickbox will show for that variable and the data will be archived to disc. Then the only question is how to create the file… (it’s easy!)
[/li][/ul]
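By way of example, a physical [tt]storage-schemas.conf[/tt] in the usual Graphite style looks something like this (the rule names, patterns, and retentions here are made up; the defaults in [tt]openLuup/virtualfilesystem.lua[/tt] are the definitive template to copy from):

    [day]
    pattern = \.d$
    retentions = 10m:1d

    [power]
    pattern = \.watts$
    retentions = 1m:1d,10m:7d,1h:90d,1d:1y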

2. I have an outdoor temperature sensor (MySensor-based). I have PLEG resetting a string variable each day at midnight to the current outdoor temperature. I also have PLEG watching for variable changes and setting two string variables, one tracking the lowest temperature of the day, and another tracking the highest temperature of the day. I would like to set up a PLEG trigger at 23:59 to push each of these temperatures to two whisper files (high and low) with a timestamp of noon and have a storage-schema 1d:10y.

How would I create the files initially, and secondly, how would I push these variables with a timestamp to the file? To push the variables, I think part of the answer is the 'netcat' command, but I'm not exactly sure. (Maybe this is not a Data Historian topic, sorry.) I know I could set it up as a storage schema of '5m:1d,1d:10y' with aggregation of 'max' or 'min', depending on whether it is the 'high' or 'low' variable, respectively. But what time will be reflected in the '1d' aggregation? I've sort of tried this and I see that the timestamp of the capture is some seemingly random time, but always the same time, just not when I want it to be captured.

Several ways to do this:

[ul][li]Get your PLEG code to write numeric variables (or perhaps the existing string variables will do if they have no text like units or anything) and then the Historian will save them as per normal. You would need to create the right rules (or, directly, the files) with the appropriate max/min aggregations.[/li]
[li]Push data directly using UDP datagrams to DataYours (or, in future the Historian acting as a DataYours/Graphite server.) No need for [tt]netcat[/tt] or anything like that.[/li]
[li]Write Whisper files directly yourself - this is really a one-liner and described in other posts such as this one: http://forum.micasaverde.com/index.php/topic,24669.msg173841.html#msg173841[/li][/ul]

In the Historian, you cannot lie about the time that the variable is written. In the other two options you can write using any timestamp you please…

HOWEVER…

…be aware that the aggregation functionality of Whisper files means that the precision of the timestamp will only be as good as the sampling of the archive which is holding the value. Thus with a ‘1d’ precision, the timestamp of any value is truncated to 00:00 at the start of the day in question. If you wanted ‘12h’ precision, you could do that with a 12h schema, but it would mean that you would only be using every other slot in the archive file (which may not be a problem).
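In other words, the stored time is just the written timestamp quantized to the archive step, which is also why the capture time you observed looks arbitrary but is always the same. An illustrative snippet (not Whisper internals verbatim):

    -- illustrative only: how a '1d' archive slots a timestamp
    local timestamp = os.time ()                   -- the time you wrote the value
    local step = 24 * 60 * 60                      -- archive precision in seconds
    local slot = timestamp - timestamp % step      -- stored time: 00:00 (UTC) that day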

The example post I linked to shows that, to create a one-off file directly (rather than using rules), you just need a call like:

whisper.create (filename, "1s:1m,1m:1d,5m:7d,1h:90d,6h:1y,1d:5y", 0, "sum")

To write a datapoint:

whisper.update (filename, value, timestamp)
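Putting those two calls together for your daily high/low case (the file name, location, and module path here are assumptions for illustration; see also the caution below about where such files should live):

    local whisper = require "openLuup/whisper"     -- module path assumed for illustration

    -- one-off creation of a 10-year daily archive for the daily maximum
    local filename = "whisper/outdoor_high.wsp"    -- illustrative name and location only
    whisper.create (filename, "1d:10y", 0, "max")

    -- at 23:59, write the day's high against a timestamp of noon today
    local daily_high = 24.5                        -- the value from your PLEG 'high' variable
    local t = os.date "*t"
    t.hour, t.min, t.sec = 12, 0, 0
    whisper.update (filename, daily_high, os.time (t))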

I would caution against using this direct method, unless you have very good reasons, preferring to use the Historian either directly, or as a Graphite server via the UDP datagram (which again is a couple of one-liners in openLuup):

local io = require "openLuup/io"
local myUDP = io.udp.open "123.12.34.56:1234"    -- create the port
myUDP:send "filename value timestamp"   -- write a data value

Do not, whatever you do, write arbitrary Whisper files into the Historian directory, although you may do so in the DataYours/Whisper directory.

So, as I said, not ‘quick’ guidance, but perhaps having waded through it, not very hard at all.

I feel sure you’ll have further questions, though, …