Taking the DataMine logs further

As development of the DataMine plugin seems to have stalled I would like to analyze the logs some other way to get more info out fo them. For example I would like to see my power usage per day, week, month, year and also per device and category (Heating, Lighting, Appliances etc.).

What kind of info can I get out of the DataMine logs and what tool would be the best to use for what I want to achive? Would I be better off logging the usage some other way?

Look here for an alternative:

http://forum.micasaverde.com/index.php/topic,16984.0.html

  • Garrett

Thanks for the tip, Garrett. So I would be better off dropping DataMine for the things I’d like to achive?

This might be a better alternative as DataMine development has stopped and the code is pretty complex. The alternative I offered you may be easier to work with and not as complex. Something to look into.

  • Garrett

As well as continuing development of EventWatcher, I’m also extending the dmDB module to reproduce dataMine graphics (it all builds on the same libraries). I’ll post a version here shortly for you to try out.

EventWatcher is fine for reviewing data and events over the last 24 hours, but it’s not intended to have archival capability - I’m really happy with dataMine for that.

Sounds very promising, akbooer. I’ll be happy to try it out.

OK, so here’s a package of modules which implements new graphics and reporting formats for the dataMine database.

Just unzip and load the three Lua files through the usual UI Luup files page. Then in either Lua Startup or Test Luup run:

require "dmDBserver"

The HTTP request handler should now respond to browser URLs of the form:

http://[your Vera IP address]:3480/data_request?id=lr_dmDB

The above request, on its own, responds with ‘help text’ describing the allowed parameters:

PARAMETERS: names and values can be abbreviated

  [actions]
                plot = plot specified dataMine channel [number]
               graph = plot specified dataMine graph [number]
              report = various report types [channels/graphs]

  [searchKeys]
                name = dataMine channel name [string]
             channel = dataMine channel id [number]
            variable = Luup variable name [string]
           devicenum = Luup device number [number]
           serviceid = Luup serviceId [number]

  [options]
              height = HTML output height [number]
              format = report format [csv/iso/Table/LineChart/AreaChart]
               width = HTML output width [number]

  [times]
          t2/stop/to = stop times in unix or ISO format [string]
       t1/start/from = start times in unix or ISO format [string]
         dt/interval = duration [day/week/month/year]

EXAMPLE: &start=2013-12-01&device=123&report=log&format=Table

[ul]There are four basic request types:
[list type=decimal]
[li]searches - these allow retrieval of raw dataMine channel data in various formats:
[list]
[li]all described here [url=http://forum.micasaverde.com/index.php/topic,17499.msg136838.html#msg136838]http://forum.micasaverde.com/index.php/topic,17499.msg136838.html#msg136838[/url][/li]
[/list][/li]
[li]reports - two flavours (both these types can be sorted by different columns - just click):
[list]
[li][tt]&report=channels[/tt] - lists all the channels which dataMine is logging[/li]
[li][tt]&report=graphs[/tt] - lists all the graphs which are defined in dataMine[/li]
[/list][/li]
[li]plots - enables plotting of individual dataMine channels. For example
[list]
[li][tt]&plot=3&int=week[/tt] - plots channel 3 data over the last week (you can use day/week/month, I wouldn’t advise year). This is limited to 2000 points, to save you running out of memory.[/li]
[li][tt]&plot=3&from=2013-10-01T00&to=2013-11-01T00[/tt] - plots channel 3 for the month of October (these are ISO 8601 format dates)[/li]
[/list][/li]
[li]graphs - recreate predefined plots, numbered by dataMine from 1 to N. Example:
[list]
[li][tt]&graph=7[/tt] - plots whatever you have defined as graph 7[/li]
[li][tt]&graph=7&interval=week[/tt] - as above, but overrides the default timescale to show, in this case, the last week’s worth of data[/li]
[/list][/li]
[/list][/ul]

All of these displays are available both on your local network and, via HTTPS through the MiOS secure servers with the appropriate syntax, from anywhere on the internet (unlike dataMine’s native graphics.)

Some example reports and plots attached. This is definitely Beta. I wasn’t going to release quite yet, but since you asked…

Just want to quickly thank you for the plugin. I haven’t had time to test it yet with all Christmas preparations going on but will for sure try to find a calm night soon to try it out. Very exited that there might be a future for the dataMine plugin (or what comes after it).

Merry Christmas to you and all forum readers!

@akbooer re: chat about Graphite/Whisper in another thread moved to here:

The Graphite/Whisper stuff looks really good.

What’s the plan for importing existing dataMine data. The data conversion is easy enough but through what method will the user experience the import process? Or would you write a “custom finder”?

How will meta data be handled? For example dataMine knows what category of energy consumption a device falls into eg HVAC, Whole_House, Entertainment… Plus data offsets can be applied to data to calibrate it.

Also what hardware do you see this software running on - that supports Apache, Python, etc?

I can imagine this running on a NUC, that’s say recording TV programs for me and simultaneously talking to any number of Veras and any other enabled equipment.

Given the number of reboots I experience, I worry about what would happen if Vera lost contact with the database. Any chance of caching the data for a day or so / sounding the alarm when the link dies.

Sorry for all the questions - this looks very promising but will certainly require the user have a high level of knowledge re: computers.

Thanks for the repost - a good choice of place to continue the conversation!

The Graphite/Whisper stuff looks really good.
Whisper [u]is[/u] good, because you can, if you like, just run it as a round-robin database, putting a limit on the maximum duration you're ever interested in, [u]or[/u] you can aggregate the data gracefully over time, reducing the resolution but increasing the maximum time limit. All I've done is to translate it to Lua from Python.
What's the plan for importing existing dataMine data. The data conversion is easy enough but through what method will the user experience the import process? Or would you write a "custom finder"?
There's two ways to go here (not mutually exclusive):

[ol][li]data conversion - as you’ve mentioned, could be en bloc or on an as-needed basis. I’ve found my dataMine archive getting rather ragged as old devices go and new ones replace them. Selective conversion makes sense. Obviously some user involvement here to choose what’s wanted.[/li]
[li]database federation - quite possible to add a thin layer which makes it invisible whether the data is in Whisper or dataMine. That way, with almost no user involvement, you could ensure that the old data is still available through the new interface.[/li][/ol]

How will meta data be handled? For example dataMine knows what category of energy consumption a device falls into eg HVAC, Whole_House, Entertainment.... Plus data offsets can be applied to data to calibrate it.
An excellent question! Again, two answers:

[ol][li]device category - In EventWatcher, I tried to use Vera’s device categorization, but it falls short because it is not applied uniformly (or at all, in some cases.) The only categorization which is reliable is the device / serviceId / variable parameter group as used in luup.variable_set or get. The Graphite/Whisper database presents a tree-structured hierarchy which maps directly onto this. I’m using a namespace of the form: [tt]Vera-serialNo.deviceNo.serviceId.variable[/tt], so there’s no ambiguity about which variable (eg. Watts) belongs to which device. Wildcards are also allowed in some contexts, so [tt]Vera-12345678.*.urn:micasaverde-com:serviceId:EnergyMetering1.Watts[/tt] refers to all the power variables on that machine.
[/li]
[li][tt]other metadata[/tt] - Chris’s philosophy for dataMine was that raw data was sacrosanct and shouldn’t be changed. The offsets and filters are just a layer above that in the graphics. Same will apply here, but it may need a bit more care since an erroneous reading will get propagated through the Whisper aggregation process as the data ages. Indeed, a separate ‘database’ will be needed for this metadata… MySQL ??? (no, only joking, I currently plan to retain the same metadata files as dataMine)[/li][/ol]

Also what hardware do you see this software running on - that supports Apache, Python, etc?
Well, my initial requirement was just Vera. All the above sounds rather heavy, but in fact it's not at all. Obviously the data collection front-end has to run on Vera and I'm currently testing a version which: is ~200 lines long; doesn't run as a plugin so doesn't need its own stack space; doesn't need any file system (eg. CIFS or USB) to get the data off Vera; is fully configurable over HTTP; writes UDP packets to syslog and another data logging port. The output is designed to feed easily into: StatsD, RRD, Graphite, syslog, ...

The design route I’ve been following ensures a modular structure with standard interfaces, so whilst, like dataMine, it could all run on Vera, it equally well could be split across multiple, inhomogeneous, systems. I currently have three front-end acquisitions feeding into one syslog and another UDP socket for archival storage (but not yet connected up). The graphics from dmDBserver, or EventWatcher, are very light-weight too and the rendering is done in browser itself. The translated Whisper database is pure Lua so will run anywhere.

I can imagine this running on a NUC, that's say recording TV programs for me and simultaneously talking to any number of Veras and any other enabled equipment.
Whatever takes your fancy.
Given the number of reboots I experience, I worry about what would happen if Vera lost contact with the database. Any chance of caching the data for a day or so / sounding the alarm when the link dies.
Once again, the acquisition doesn't rely on a file system, so if Vera's up (and your network) then all should be well. You could also have a local database, but I haven't thought about synchronisation on network restarts in such a configuration.
Sorry for all the questions - this looks very promising ...
The questions are fine - I'm not sure whether the answers do them justice, so ask again if not clear.
...but will certainly require the user have a high level of knowledge re: computers.
No, I think you're wrong. The plan is that out-of-the-box this will be easier than dataMine if you just want to do a plain Vera install. If you're capable, though (and I assume that someone running a NAS with mySQL or an external server will be) then lots of configurations might be possible. The plan to easily interface to existing tools means that we don't have to roll our own.

…and here’s an update (see the above post for installation and use instructions)

I’ve added a new report [tt]&report=timeline[/tt] which shows a graphical representation of all the channels stored in dataMine and their start/stop dates.

It’s SLOW, about half a second per channel, because it opens each file to check timestamps, but be patient. On my system with over 100 channels it takes about 50 seconds before returning the report… (but it’s OK, Vera should not time it out.) A snapshot is attached - you may have to scroll the frame to see all of the channels. If you hover over a blue bar it will give you a full channel name and deviceNo / serviceID / variable path name along with the start and stop dates and total duration of recorded data.

It seemed to me that this was a prerequisite for migrating data to another database.

...All the above sounds rather heavy, but in fact it's not at all.... ...translate[d] it [Whisper] to Lua from Python...

I hadn’t realised that the Whisper code was so light weight. Hence my incorrect assumption that it would need to run on some other machine. Looks like the code is about 700 lines including comments. I’m hoping that the comments survive the translation process. Some of Vera’s plugin writers don’t believe in them :-\

So yes - the install process should not be as onerous as I had once imagined.

On the meta data - sounds reasonable to keep the original format. Please do us all a favor and ensure that any code that writes out the json text, puts some linefeeds into it. I’ve spent too much time going to http://jsonlint.com/, so I can reformat the json text, just in order to read it.

Chris's philosophy for dataMine was that raw data was sacrosanct and shouldn't be changed.

No argument with this one - doing anything else is the road to despair.

Being able to spray out info to anything that will listen to it and then to be able to use any number of existing tools to envision it or manipulate it, makes sense.

Data federation looks like a good starting point with subsequent migration/conversion once the bugs settle down?

aggregate the data gracefully over time, reducing the resolution

Have you tried this out - it would probably work OK on temperature data for example but I’m not so sure on some others types. However it looks like you can choose on a per channel basis whether this process is applied.

On a side note: you seem to be recording a lot of channels and have an interest in degree days. Is there some other research going on here?

Any way keep typing!

It’s almost line-for-line the same after removing some of the more esoteric calls like file locking. All the (relevant) comments are still there. It’s the rest of the Graphite / Carbon installation which is third-party heavy - happily, we don’t need it (unless you planned on installing it all on another machine.)

On the meta data - sounds reasonable to keep the original format. Please do us all a favor and ensure that any code that writes out the json text, puts some linefeeds into it. I've spent too much time going to http://jsonlint.com/, so I can reformat the json text, just in order to read it.
Depends what (hopefully) gets packaged in the next Luup software. CJSON was mentioned previously. If it's wicked fast then I'll use that, otherwise I have my own. However, you shouldn't ever have to look at the JSON from the metadata given the appropriate tools.
Data federation looks like a good starting point with subsequent migration/conversion once the bugs settle down?
Which bugs would those be, then? ;)
aggregate the data gracefully over time, reducing the resolution

Have you tried this out - it would probably work OK on temperature data for example but I’m not so sure on some others types. However it looks like you can choose on a per channel basis whether this process is applied.

Certainly works for temperature / humidity / light / … data. For others, you have to choose the right aggregation method. Summing security sensor trips over a time interval will give a good idea of what went on. You don’t have to use aggregation at all, given enough disk space. Although using one Whisper datapoint for a single bit of security data is a bit of a waste.

On a side note: you seem to be recording a lot of channels and have an interest in degree days. Is there some other research going on here?
Well spotted. Yes, I'm not doing this [u]just[/u] for the fun of it, I really want the functionality. Energy / weather / heating control is indeed a research topic. I have a ground source heatpump with other ECO measures being incrementally installed in the home. I'm developing a thermal model of the whole thing to assess the impact of changes. (Another work in progress!)

Akbooer, how are things coming around with this part of your work? I’m quite exited to start testing NG dataMine. :slight_smile:

I’ve posted the prototype here: [url=http://forum.micasaverde.com/index.php/topic,23109.msg156425.html#msg156425]http://forum.micasaverde.com/index.php/topic,23109.msg156425.html#msg156425[/url]

Beautiful… 8)

akbooer, I installed the plugin and after restarting LUUP, I saw the following error in the log

luup_log:0: dmDB: unable to open dataMine configuration file '/dataMine/dataMineConfig.json' <0x2ca91680>

I had to manually hard code my location (/nas/) in the dmDB file. I guess it would be possible to access the “setDataDirectory” variable from datamine to get the right location? Anyhow, a hard code works for me now and the fix is pretty easy.

keep up the good work!

Thanks for the feedback. Not clear, though, whether you are referring to the dmDBserver code or the latest Whisper code here: [url=http://forum.micasaverde.com/index.php/topic,23109.msg156425.html#msg156425]http://forum.micasaverde.com/index.php/topic,23109.msg156425.html#msg156425[/url]

I was refering to the dmDBServer code

local default = { -- parameter defaults json = "json-dm", -- not great, but guaranteed to be there! database = "/nas/", -- database filesystem root earliest = os.time { day = 1, month = 1, year = 2012 }, -- earliest allowable data retrieval maxpoints = 10080, -- maximum number of points returned by keyRange }

Ah, OK. Best, perhaps, not to change the dmDB.lua file. There’s a variable on line 15 in the dmDBserver.lua file:

local root 		-- database root directory

which is used way down at the bottom (line 331 or something like that) in a named parameter to the [tt]dmDB.open[/tt] method:

  dm, dmDBstatus = dmDB.open {json = json, database = root, maxpoints = 2000}

So if you just changed line 15 to:

local root = "/nas/"		-- database root directory

that would have done the trick too.