DataYours - filtering out wrong values

Hi akbooer how are you?

I finally got my VeraPlus installed, so I now have a spare VeraLite (now also on UI7) to play with, and I am evaluating DataYours and dataMine 2.

I seem to have a bit of a problem with DataYours. I get some strange values in the graphs, which makes them unreadable; an example is in the picture below.
Both graphs should show the same thing, since both plot the same dataMine data; only the top one is displayed in DataYours.

Another example: when I log the same temperature directly in DataYours, I get some readings of 99600294000 when they should be between 22 and 24.5. The same thing happens with humidity: I got some readings at 279, 535 and -32742.

I’m sure there is a possibility that the MySensors node is sometimes sending crap, but is dataMine filtering the values then?

Is there any way to clean out crap values from the database, or to define upper and lower limits for values?

Hi - this is such a good topic that I’ve split it out into its own thread.

Invalid values are something that I see from time to time on a Zwave network (typically once every two months I get a wild reading, usually with some correct digits, from my power meter readers) but I’ve never seen it on a MySensors network. This seems to be a particularly bad case and surely points to some other problem with the sensor or network?

That being said, incorrect values happen and need to be dealt with. DataMine has a filter for this, and so, in fact, does DataYours, but it’s done very differently. We need to take a dive into the workings of DataYours, a highly modular system which can be distributed over multiple processors. The basic acquisition chain is something like this:

incoming data → DataWatcher → optional daemons → DataCache → Whisper database → DataGraph → plotted graph

Each one of the processes in the chain can be on a different machine, but it’s an implementation decision as to whether you want to filter data before it goes into the database, or afterwards. Chris’s (the original dataMine author) choice was to do it afterwards, in the plotting. We had some discussion of this about three years ago: Automatically cleaning up erroneous logged data, and he said:

I don’t know whether or not @ConstantSphere has changed this philosophy in version 2.

Anyway, in DataYours you have the choice. At this time, if you define a stored graph, you have the option of setting Y-axis limits, which would effectively do what you need for display purposes. The alternative, not currently implemented but almost trivial to do, is to insert another daemon in the chain at “optional daemons” which implements whatever filtering you like, before the data goes into the Whisper database.
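
To make the “filter before the database” option concrete, here is a minimal sketch of the test such a daemon might apply to each Whisper plaintext message ("metric value timestamp"). It is not part of DataYours: the [tt]limits[/tt] table, the patterns in it and the function names are all assumptions made up for illustration, and the networking needed to receive and forward the messages is left out.

```lua
-- Hypothetical per-metric limits: patterns and numbers are illustrative only.
local limits = {
  { pattern = "Temperature", min = -50, max = 150 },
  { pattern = "Humidity",    min = 0,   max = 100 },
}

-- Split one plaintext message, "metric value timestamp", into its fields.
local function parse (message)
  local metric, value, time = message: match "^(%S+)%s+(%S+)%s+(%d+)%s*$"
  return metric, tonumber (value), tonumber (time)
end

-- Return the message unchanged if it is acceptable, or nil to drop it.
local function filter (message)
  local metric, value = parse (message)
  if not (metric and value) then return nil end     -- malformed or non-numeric
  for _, limit in ipairs (limits) do
    if metric: find (limit.pattern, 1, true) then   -- plain substring match
      if value < limit.min or value > limit.max then return nil end
    end
  end
  return message
end
```

Dropping the reading (rather than clamping it to the limit) leaves a gap in the database instead of a false value; which is preferable depends on how the graphs are going to be used.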

[quote]I don’t know whether or not @ConstantSphere has changed this philosophy in version 2.[/quote]

I’ve kept with the philosophy of not messing with the source data and applying filters to remove erroneous data, if needed. In dataMine2, I’ve moved the filters from client side to server side so they can be applied before any aggregations and are available via the API.

By default the filters are off in dataMine2, so unless you have configured them under the Properties pane of the Configuration tab, then I wouldn’t expect to see any differences between the data you are seeing in dataMine2 and DataYours.

You are right, @ConstantSphere: I had actually enabled the out-of-limits filter for the data object I was comparing. It made a big difference.

[quote=“sle111”][url=http://github.com/akbooer/DataYours/issues/1]http://github.com/akbooer/DataYours/issues/1[/url]

Data manipulation?

I was thinking about intercepting the value and, based on a simple regex (or limit), redirecting the value to a new Whisper database variable. Is this at all possible, and if so, how?[/quote]

As I responded there, Graphite / DataYours is exactly designed to be able to insert other processing into the data acquisition chain, as shown in my previous post on this thread. I see two alternatives:

[ol][li]create a new daemon which does whatever you want and place it after DataWatcher, or[/li]
[li]extend DataWatcher to be able to include a user script to do what you like.[/li][/ol]

Rather than inventing a GUI or some new syntax, I’m thinking of creating a really simple API which can interface to a user-defined Lua script. The API should be able to:

[ul][li]receive the incoming metric name / value / time[/li]
[li]modify / delete / create new value(s)[/li]
[li]write those to (possibly different) destinations[/li][/ul]

The relay-rules functionality of the relay daemon (which I have not implemented in DataWatcher) is not sufficient to be able to route to different destinations based on metric values, which is what is needed for this type of application.

I’m wondering if there is any way we could manipulate upstream of this all?

For example:

Data watcher → lua script → optional daemons → etc

The script would set a new device value variable, which would be captured by the data watcher. Basically, the script would set the value IN the device’s values list instead of manipulating the message seen by the DataWatcher. I would expect the data watcher to trigger again for the new value, which would be ignored by the script as it would be in the correct variable.

Originally I was thinking about grabbing the device implementation directly and saving it as a new implementation after tweaking it, but I don’t think the basic energy implementation in the Vera is available.

Let me know your thoughts

Oh, yes, I do that all the time. You can run a script in startup. It would be best to create two new device variables which have the correct power and voltage values, and simply log both of those with DataYours.

I have some NorthQ meter readers which don’t calculate power, so I have a script which wakes up on every new meter reading and does the calculation. This really digresses from the thread topic, now, but it is, I suppose, a useful alternative approach.

In your Lua Startup you have some code:

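-- Watch callback: copies each new numeric reading into one of two new variables.
-- It must be a global function, because luup.variable_watch takes its name as a string.
function FixPowerCallback (lul_device, lul_service, lul_variable, lul_value_old, lul_value_new)
	local v_or_p = tonumber (lul_value_new)
	if v_or_p then			-- ignore anything that is not a number
		if v_or_p < 300 then	-- values below 300 are treated as voltage
			luup.variable_set (lul_service, "Correct_V", v_or_p, lul_device)
		else
			luup.variable_set (lul_service, "Correct_KWH", v_or_p, lul_device)
		end
	end
end

-- Call FixPowerCallback whenever the KWH variable of the device changes.
luup.variable_watch("FixPowerCallback", "urn:micasaverde-com:serviceId:EnergyMetering1", "KWH",  YOUR_DEVICE_NUMBER_HERE)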

…remembering to set YOUR_DEVICE_NUMBER_HERE to your actual power meter device number.

[quote]Oh, yes, I do that all the time. You can run a script in startup. It would be best to create two new device variables which have the correct power and voltage values, and simply log both of those with DataYours.[/quote]

So simple I could cry!

I will do just that, and change the device XML and json to include the new variable along with kWh. This has been annoying me for a long time!

Next, I’ll whip up a new mySensor node with a small 2.5" LCD to view a moving graph of the past hour at a glance just like a thermometer :blush:

No need to change the device .xml or .json files at all… there’s no problem with the code simply creating the new variables the first time it runs.

The latest version of DataWatcher on GitHub (akbooer/DataYours: Pure Lua implementation of the Graphite / Carbon Whisper database system) now supports user-defined processing.

Any incoming data, which could be from any of three sources:

[ol][li]watched device variables[/li]
[li]HTTP requests (this is what AltUI uses for watches)[/li]
[li]UDP Whisper plaintext messages (basic Graphite relay daemon functionality)[/li][/ol]

…gets passed to a new module [tt]L_DataUser[/tt]

-- DataUser is a user-defined module with a single global function 'run' 
-- called by DataWatcher for every incoming metric (wherever it comes from.)
-- The processing can do anything you like within the Luup environment
-- and returns a single function, an iterator, which returns the
-- (possibly modified) metric and data to send to the DataCache for storage.
--
-- The iterator function is called until its first return argument is nil.
-- This module can, therefore, choose to totally reject an incoming metric or 
-- return multiple different ones for storage.
--
-- By default, this module simply returns each metric unchanged.

By way of example…

  -- Typically, your processing will test for a specific metric
  -- and modify the value under certain conditions.  
  -- A simple example is bounds-checking:
  --
  if metric: match "Temperature" then 
    local v = tonumber(value)
    if v then                       -- skip non-numeric values rather than raise an error
      if v < -50 then value = -50 
      elseif v > 150 then value = 150
      end
    end
  end

Take a look at the [tt]L_DataUser.lua[/tt] file to see how this example integrates with DataWatcher.

This means that you have full control over what metrics and what values get stored in the database, wherever it is. The metric names are in the usual [tt]device.serviceId.variable[/tt] format used by DataYours.
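
As an illustration of the iterator idea described above (reject a metric, pass it through, or redirect it), here is a minimal sketch in the style of [tt]L_DataUser[/tt]. It is not the shipped file. The argument list of [tt]run[/tt] (metric, value, time), the triple returned by the iterator, the limits and the [tt]_rejected[/tt] suffix are all assumptions for the sake of the example, so check the real [tt]L_DataUser.lua[/tt] for the actual calling convention.

```lua
-- Sketch only. ASSUMPTIONS: run is called as run (metric, value, time) and the
-- iterator it returns yields metric, value, time. The limits are hypothetical.
local LIMITS = {
  { pattern = "Temperature", min = -50, max = 150 },
  { pattern = "Humidity",    min = 0,   max = 100 },
}

function run (metric, value, time)
  local results = {}
  local v = tonumber (value)
  if v then
    local name = metric
    for _, limit in ipairs (LIMITS) do
      if metric: match (limit.pattern) and (v < limit.min or v > limit.max) then
        name = metric .. "_rejected"      -- hypothetical destination for bad readings
        break
      end
    end
    results[1] = { name, value, time }
  end
  -- a non-numeric value leaves results empty, so the metric is rejected entirely

  local i = 0
  return function ()                      -- called until its first result is nil
    i = i + 1
    local r = results[i]
    if r then return r[1], r[2], r[3] end
  end
end
```

Routing out-of-range readings to a separate metric keeps the main series clean while still recording how often the sensor misbehaves. Whether a database file gets created for the new metric name depends on how the storage end is configured.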

Hi akbooer,

do you think it would be difficult to write a web tool (javascript, jquery, php or similar) to edit wsp files, change some values and rewrite them? If I read the wsp file in json format, how can I rewrite it to its original format?

tnks

donato

I couldn’t say, because I’m not proficient in those languages. But the .wsp files for DataYours are in CSV format, so anything should be able to read them. Writing them as ASCII would be harder because, since the file is random access, the fields have to be fixed width.

[quote]If I read the wsp file in json format, how can I rewrite it to its original format?[/quote]

It depends where you want to do this. If you can use Lua, then the full functionality of the Whisper library is available to you. If you’re doing this in openLuup you could easily write a CGI file (in Lua) which accepts an HTTP POST of JSON data and writes the file. CGIs are easy since I added a WSAPI interface. You don’t have to modify any system files at all, just add a directory /etc/cmh-ludl/cgi/ and put your WSAPI CGI files there.
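
For anyone who hasn’t written one, a WSAPI application is just a function which takes the request environment and returns a status code, a table of headers, and an iterator producing the response body. The following is a minimal sketch, assuming the CGI file exposes a global [tt]run[/tt] function and that [tt]input: read ()[/tt] with no argument returns the whole request body. It only echoes what it was asked to do; the places where the Whisper library would be called are marked, and nothing here is taken from an actual DataYours file.

```lua
-- Decode a query string such as "target=cpu.d&from=2016-07-06T12%3A00".
local function parse_query (query)
  local function decode (s)
    s = s: gsub ("%+", " ")
    s = s: gsub ("%%(%x%x)", function (h) return string.char (tonumber (h, 16)) end)
    return s
  end
  local params = {}
  for name, value in (query or ''): gmatch "([^&=]+)=([^&]*)" do
    params[decode (name)] = decode (value)
  end
  return params
end

-- WSAPI entry point: returns status, headers, and a body iterator.
function run (wsapi_env)
  local status, body = 200, ''
  if wsapi_env.REQUEST_METHOD == "GET" then
    local p = parse_query (wsapi_env.QUERY_STRING)
    -- a real CGI would read the Whisper file here;
    -- note that 'until' is a Lua keyword, hence p["until"]
    body = table.concat ({"read", p.target or '?', p.from or '?', p["until"] or '?'}, ' ')
  elseif wsapi_env.REQUEST_METHOD == "POST" then
    local content = wsapi_env.input: read () or ''
    -- a real CGI would decode the JSON and write the Whisper file here
    body = "received " .. #content .. " bytes"
  else
    status, body = 405, "method not allowed"
  end
  local headers = { ["Content-Type"] = "text/plain" }
  local function iterator ()              -- return the body once, then nil
    local chunk = body
    body = nil
    return chunk
  end
  return status, headers, iterator
end
```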

I’d like to update some values of a wsp file in a web app/page that edits the entire file as a table with two columns (date and value). This is because some values are sometimes incorrect, due to device problems or errors.

tnks

donato

IIRC, your master Whisper database is written by an openLuup machine. My suggested solution would be a client/server approach:

[ul][li]write a couple of CGI files which read and write the Whisper database using the Lua library and respond to HTTP requests to do so.[/li]
[li]write whatever client-side code you like in JavaScript, or whatever, to run in your browser and do the editing.[/li][/ul]

Just a couple of caveats:

[ul][li]if you are editing multi-archive files, I would stick to only editing one archive at a time, but it would be up to you to ensure that different ones remained consistent.[/li]
[li]an active database may change whilst you are editing it - you could lose some real-time data.[/li][/ul]

For a small fee, I could write you the server-side code :wink:

Thanks to an excellent suggestion from @d55m14, here is a first attempt to provide some editing capability for Whisper files…

Rather than augmenting the DataYours plugin, I’ve written this as a stand-alone WSAPI CGI which you can easily add to an openLuup system. The CGI provides server-side access to reading and writing portions of Whisper files, through GET and POST requests, in a JSON-encoded format, identical to that used in Carbon/Graphite (and DataGraph) and documented here: [url=https://graphite.readthedocs.io/en/latest/render_api.html#data-display-formats]The Render URL API — Graphite 1.2.0 documentation[/url].

In addition to the CGI, I’ve also written a very crude web page with two forms (READ / WRITE) which can exercise the functionality. Although this is very basic it can, in fact, be used successfully to edit errant values in Whisper files without too much difficulty.

Installing:

You just need to place these two files:

[ul][li]whisper-editor.lua - into /etc/cmh-ludl/cgi/[/li]
[li]whisper-editor.html - into /etc/cmh-ludl/[/li][/ul]

You do need to run this on an openLuup system with DataYours up and running (it uses DataYours to find the Whisper database.)

No need for a system reload - the CGI .lua file will be loaded on request.

To access the test web page simply browse the following URL:

http://openLuupIP:3480/whisper-editor.html

A snapshot of the page displayed is shown below. To understand its use, here’s an overview of the GET/POST parameters:

GET:

The request is of the form:

http://openLuupIP:3480/cgi/whisper-editor.lua?target=cpu.d&from=2016-07-06T12:00&until=2016-07-06T14:30

Parameters are:

[ul][li]target - metric name to be read (Whisper filename without the .wsp extension)[/li]
[li]from - an ISO-8601 formatted datetime at which to begin[/li]
[li]until - an ISO-8601 formatted datetime at which to end[/li][/ul]
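
If you want to generate the read request from a script rather than from the demo page, the URL can be assembled like this. A minimal sketch: the helper names are made up, and it assumes the CGI decodes percent-encoded query values in the usual way (colons are left alone, so the result matches the example above).

```lua
-- Percent-encode anything that is not an unreserved character or a colon.
local function escape (s)
  return (tostring (s): gsub ("[^%w%-%._~:]", function (c)
    return ("%%%02X"): format (c: byte ())
  end))
end

-- Build the GET request URL for the whisper-editor CGI.
-- ('until' is a Lua keyword, so the last argument is called 'to'.)
local function read_url (host, target, from, to)
  return table.concat {
    "http://", host, ":3480/cgi/whisper-editor.lua",
    "?target=", escape (target),
    "&from=",   escape (from),
    "&until=",  escape (to) }
end

print (read_url ("openLuupIP", "cpu.d", "2016-07-06T12:00", "2016-07-06T14:30"))
```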

The demo webpage uses an input field of type datetime-local so, depending on your browser, this may display as a date-picker (sadly, Firefox does not do this, although Safari does).

POST:

The request content uses enctype=“text/plain” and the content itself should be in a JSON-encoded format identical to that which the READ request returns, prefaced by “json=” (this does not appear in the demo web page WRITE box because the form adds it automatically, from the name of the textarea in which the content is written).

Demo page usage:

Simple:

[ol][li]select a metric name, start, and stop times, and click the Read button. The received text will appear in the frame on the right-hand side. If there’s an error it may return an error message, or blank if there’s no data to return.[/li]
[li]copy and paste the returned data into the POST content window, deleting the entries you want to remain unchanged, and changing the data values (first number of each value/time pair) as required. Click on the Write button and a confirmation of the changed data should appear in the right-hand frame.[/li][/ol]

Since it’s just a quick demo page, the timestamps are in Unix epoch format. If you have the talent (and I do not) it should be straightforward to write a JavaScript front-end for this to run in your browser and make the whole thing prettier.
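
Until someone writes that front-end, a couple of small helpers make the epoch timestamps easier to work with from the Lua side. This is a minimal sketch with made-up function names, and it assumes the times are meant as local time on the machine running the code.

```lua
-- Unix epoch seconds → "YYYY-MM-DDTHH:MM" (local time)
local function epoch_to_iso (t)
  return os.date ("%Y-%m-%dT%H:%M", t)
end

-- "YYYY-MM-DDTHH:MM" (local time) → Unix epoch seconds, or nil if malformed
local function iso_to_epoch (s)
  local y, m, d, H, M = s: match "^(%d+)-(%d+)-(%d+)T(%d+):(%d+)"
  if not y then return nil end
  return os.time {
    year = tonumber (y), month = tonumber (m), day = tonumber (d),
    hour = tonumber (H), min = tonumber (M), sec = 0 }
end

print (epoch_to_iso (iso_to_epoch "2016-07-06T12:00"))
```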

It’s basic, but a whole lot easier than what I used to do to correct the odd error, which was to directly edit the Whisper files… very easy to corrupt the database like that! This approach is much safer since syntax errors should generate error messages and only valid entries are changed.

I would be very pleased to hear/see if anyone takes up the challenge of making a better front-end for this.

Hi akbooer,

finally I’ve tested your code and it seems ok. As soon as possible I’ll try to write a different web front-end.

tnks as usual

donato

Really good to know! Ask if you need anything else. The Web form I made was just for testing, so I’m sure you can do much better.

Hi akbooer,

excuse me, but what is the format of an HTTP request to write the changed wsp file?

Is it: http://openLuupIP:3480/cgi/whisper-editor.lua?json='modified json file'

tnks

donato

[quote=“d55m14, post:18, topic:191815”]what is the format of an HTTP request to write the changed wsp file?

Is it: http://openLuupIP:3480/cgi/whisper-editor.lua?json='modified json file'[/quote]

No, not quite. The request above looks like a GET query request, but for writing data, you need to generate a POST request. It’s not strictly according to HTTP protocol to use a GET request to modify data (although Vera does it all the time.)

So you use the same URL, [tt]http://openLuupIP:3480/cgi/whisper-editor.lua[/tt], but the POST method, with a plain-text encoded content which is “json={ … }”. This is what the HTML form demo I posted does.

However, if you need the editor CGI changed to accept a GET request to modify data, that can be done.
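
For a client written in Lua, the POST described above could be put together along these lines. A minimal sketch with made-up function names: [tt]json_text[/tt] stands for the edited JSON exactly as returned by a read request, and the [tt]send[/tt] function assumes LuaSocket ([tt]socket.http[/tt] and [tt]ltn12[/tt]) is installed on the machine making the request.

```lua
-- Assemble the pieces of the write request; nothing is sent at this point.
local function write_request (host, json_text)
  local body = "json=" .. json_text           -- the content must start with "json="
  return {
    url     = "http://" .. host .. ":3480/cgi/whisper-editor.lua",
    method  = "POST",
    headers = {
      ["Content-Type"]   = "text/plain",
      ["Content-Length"] = tostring (#body) },
    body    = body }
end

-- Send the request with LuaSocket; returns success flag and the reply text.
local function send (request)
  local http  = require "socket.http"
  local ltn12 = require "ltn12"
  local reply = {}
  local ok, code = http.request {
    url     = request.url,
    method  = request.method,
    headers = request.headers,
    source  = ltn12.source.string (request.body),
    sink    = ltn12.sink.table (reply) }
  return (ok and code == 200) or false, table.concat (reply)
end
```

Keeping the request assembly separate from the sending makes it easy to print and inspect the body before anything is written to the database.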

ok tnks, I’ll try to use what you already have done.