Change to dataMine database structure

In the near future, I’m planning on migrating the dataMine database from the current flat file structure to a more tiered structure. The flat structure seemed like a good idea a couple of years ago, but I now have just under 3000 files in my database, and in a single directory, that’s trouble! Even a simple directory listing takes a significant amount of time, and WinSCP now hangs for nearly a minute when listing the directory!

The next step after this is to add the history file generation into the mix. This will generate more stats, and hence, more files. So, the 3000 files will quickly increase. For most people, the change in the database won’t matter - the plugin will sort it all out, and if anything, things might be a little quicker (probably not noticeable though).

However, if you are doing anything “non-standard” with the dataMine data - i.e. accessing the files directly - then your scripts will stop working. Likewise, backup solutions may not work, as the files will be in a different place. So, I thought I’d give everyone plenty of notice that I’m going to do this, and also give a heads up on what I’m planning!

The new structure will be as follows:
/dataMine/database/ID/TYPE/ENTRY.txt

In the /dataMine directory, there will just be a few files - the dataMine Config.json files, their backups, notifications, and sunriseSunset times.

A database subdirectory will contain a separate directory for each logged datapoint. This is the ID field above - it’s just an internal reference number that dataMine uses (if you look at the current files, it’s the number at the beginning of the filenames).

In the ID directory will be a number of other directories for different types of data - e.g. raw data, and different types of historical data. The same file format will be used as now (basically a CSV file), but the filenames will just be a number, not the long filenames we have now.

This means that each directory will only have a “few hundred” files - no more than one file per week. This should make directory listings and file manipulation much faster.
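For anyone whose scripts read the files directly, here is a minimal sketch of how a path in the new layout could be built and taken apart again. It is written in Python purely for illustration and is not part of the plugin; the function names are made up, and it assumes ID and ENTRY are plain integers and that TYPE is a directory name such as raw.

```python
from pathlib import PurePosixPath

# Root of the new layout as described in the post.
DATABASE_ROOT = PurePosixPath("/dataMine/database")


def entry_path(point_id, data_type, entry, root=DATABASE_ROOT):
    """Build ROOT/ID/TYPE/ENTRY.txt for one data file (hypothetical helper)."""
    return root / str(point_id) / data_type / f"{entry}.txt"


def parse_entry_path(path, root=DATABASE_ROOT):
    """Split a data file path back into (ID, TYPE, ENTRY).

    Assumes ID and ENTRY are integers; raises ValueError otherwise.
    """
    point_id, data_type, filename = PurePosixPath(path).relative_to(root).parts
    return int(point_id), data_type, int(PurePosixPath(filename).stem)
```

A script that used to match on the long filenames would instead walk the ID and TYPE directories and treat each numbered file as one entry.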

I’m open to suggestions if advanced users have any comments on the structure; otherwise this will be rolled out in a few weeks - I might try and release a beta version first if anyone fancies testing it. I’ve already implemented the changes, and the conversion will be done automatically by the plugin, so again, most users won’t really notice any major difference.

Any comments, let me know.

Cheers
Chris

Chris,

Sounds good. Happy to beta test if helpful, particularly if it might improve parsing of logfiles with frequent/high volume data.

Thanks,

Jon

Thanks Jon,
I’ll probably post a test version early next week. The code seems ok, but given the significant change, it would be good to test it in a wider forum before it’s released generally…

This is really the first step in getting the history stuff done - that said, most of the history code is written and largely tested as well, so if it all works, it may not be too far away…

Cheers
Chris

SQLite would be something really useful.

My intention was to use SQLite. I had this working on Vera, but it was relatively slow (compared to a file system database). It would have slowed down the inserts considerably, which I could have lived with (although it would have taken a really long time to import all my data!), but the big killer was data retrieval. I forget the benchmarks now, but it was considerably slower, and I tried various configurations for speeding it up without success…

Chris

Attached is a beta version with the updated database. I’d appreciate it if at least some people could run this before I release it generally, but please read on first to make sure you’re happy to perform the update…

Firstly, I highly recommend taking a backup of your /dataMine directory. I’m quite confident this shouldn’t be needed, but it’s always a good idea since I’d hate for people to lose their data!
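One possible way to take that backup, sketched in Python under the assumption that the /dataMine directory is reachable from a machine that has Python (for example after copying it off with WinSCP). The function name and the timestamped archive name are just illustrative choices, not anything dataMine provides.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source, dest_dir):
    """Write SOURCE into a timestamped .tar.gz under DEST_DIR and return its path.

    DEST_DIR should live outside SOURCE, otherwise the archive would
    try to include itself.
    """
    source = Path(source)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps paths inside the archive relative (dataMine/...).
        tar.add(str(source), arcname=source.name)
    return archive
```

Calling backup_directory("/dataMine", "/some/other/place") before loading the beta would leave a single archive that can be unpacked again if anything goes wrong.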

This version needs to be loaded using the standard UI5 apps interface - Apps | Develop Apps | Luup Files.

When loaded, this version will proceed to transfer the database to the new format. It will initially transfer the latest file of each logged variable to the new directory structure - this is done during the startup, and it should be reasonably quick.

Once startup is complete, a new “thread” is started to transfer the remaining files. This takes quite a while. Since Lua isn’t actually a multithreaded environment, I need to play “reasonably nicely” with everyone else so it will process one logged variable approximately every 30 seconds or so. If you’re trying to graph data during this time using dataMine, it will be a bit slow, but should work ok…
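To illustrate the "play nicely" idea, here is a rough Python sketch of processing one logged variable at a time with a pause in between. The plugin itself is Lua and its actual mechanism is not shown in this thread, so the names here (migrate_in_steps, run_politely, migrate_one) are hypothetical and only the 30 second figure comes from the post.

```python
import time


def migrate_in_steps(variable_ids, migrate_one):
    """Migrate one variable per step, handing control back after each one."""
    for variable_id in variable_ids:
        migrate_one(variable_id)
        yield variable_id


def run_politely(steps, pause_seconds=30, sleep=time.sleep):
    """Drive the steps, pausing after each so other work gets a turn.

    `sleep` is injectable so the pause can be replaced by a timer
    callback or skipped in tests.
    """
    done = []
    for variable_id in steps:
        done.append(variable_id)
        sleep(pause_seconds)
    return done
```

The point of the split is that the migration never holds on to the single Lua thread for longer than one variable takes, which is why graphing stays usable, if a bit slow, while it runs.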

On completion, all files will be copied to their new location in the /dataMine/database directory. No files are deleted from the /dataMine directory, so you’ll actually have 2 copies of everything. If Lua or Vera resets during the upgrade, this shouldn’t cause any problem to the process. Once the update is complete, you shouldn’t notice any change in dataMine functionality - just the files will be in a different place.
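A copy-only transfer like this is naturally safe to restart. As a hedged sketch (Python, not the plugin's code), a per-file step might look like the following; the ".part" temporary name and the size comparison are assumptions added for the example.

```python
import shutil
from pathlib import Path


def copy_if_missing(source, target):
    """Copy SOURCE to TARGET unless a same-sized copy already exists.

    The source file is never deleted or modified. Returns True if a
    copy was made and False if the step was skipped.
    """
    source, target = Path(source), Path(target)
    if target.exists() and target.stat().st_size == source.stat().st_size:
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    # Copy to a temporary name first so an interrupted copy never
    # leaves a half-written file under the final name.
    partial = target.with_name(target.name + ".part")
    shutil.copy2(str(source), str(partial))
    partial.replace(target)
    return True
```

Because each step only ever adds files and skips ones that are already in place, running the whole transfer again after a reset just picks up where it left off.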

If you have any questions, please ask - if you have any problems, please tell :wink:

Cheers
Chris

Chris,

Seemed to install OK, and the datamine charts are still working. I didn’t SSH in to check on the file structure but it didn’t seem to cause any adverse problems in terms of operations.

Will let you know if anything unexpected happens.

Thanks,

Jon

Thanks Jon.
Glad to hear it went ok - it might pay to SSH in and check if you get a chance - just to be sure :wink:

If it’s transferred over ok, I wouldn’t expect any issues in future. Other than the conversion routine, the change is pretty minimal…

Cheers
Chris

All folders seem present as you describe. My ‘type’ for all is raw, but I guess that’s fine/expected.

Let me know if you want me to test/check anything else.

That’s perfect - thanks Jon. All looks well ;D

Cheers
Chris

In the interest of tidiness, which old files outside the database subdirectory can I now delete?
Presumably anything that was an old datafile (starting with a number), but what else?

Thanks!