Parsing XML

I am trying to write code in Lua to parse an XML document, but I am stumped by how to do this. I have done exhaustive Google searches and tried different things, but to no avail.

I can grab the XML from the internet and read it in, but parsing it to get specific values out of it has me stumped.

Any simple, intuitive solutions out there?

A quick search on the forum would yield this relevant thread: http://forum.micasaverde.com/index.php?topic=10521.0

That doesn’t help me to write code to parse XML. I did see that thread.

The thread that @capjay references indicates that lxp is available on all Veras.

Did you try it, and find that it didn’t work?

The Belkin WeMo plugin has quite a bit of XML parsing in it. You could probably adapt some of it.

Can you be specific about what kind of parsing you want to do? Build a complete DOM of a document? Extract just one element at a given XPath? Flatten a node and its descendants?

Here’s some test code that I made up when I was having problems with the weather plugin - it’s test only stuff. The XML parsing is based on a Lua pattern. It’s a bit of a hack but works OK on simple XML jobs. If you have an API key it will run in the Lua test area and it will show results in the log.

[code]-- Test the weather parsing
-- a-lurker 19 June 2013

local YOUR_API_KEY = 'abcdefghi'
local YOUR_LATITUDE = '29.00'
local YOUR_LONGITUDE = '44.00'

local PLUGIN_NAME = 'TestWeatherParsing'
local DEBUG_MODE = true

-- log a debug message; non-string arguments are reported by type only
local function debug(textParm)
    if DEBUG_MODE then
        local text = ''
        local theType = type(textParm)
        if theType == 'string' then
            text = textParm
        else
            text = 'type is: '..theType
        end
        luup.log(PLUGIN_NAME.." debug: "..text, 50)
    end
end

local function log(textParm, levelParm)
    local text = textParm or 'nil'
    local level = levelParm or 50
    luup.log(PLUGIN_NAME.." plugin: "..text, level)
end

-- refer to: LuaSocket: HTTP support
local function urlRequest()
    -- this forces a GET instead of a POST
    local request_body = ''

    local http = require('socket.http')
    local ltn12 = require('ltn12')
    http.TIMEOUT = 5

    local response_body = {}

    -- r is 1, c is return status and h are the returned headers in a table variable
    local r, c, h = http.request {
        url = 'https://api.wunderground.com/api/'..YOUR_API_KEY..'/conditions/forecast/q/'..YOUR_LATITUDE..','..YOUR_LONGITUDE..'.xml',
        method = 'POST',
        headers = {
            ['Content-Type']   = 'application/x-www-form-urlencoded',
            ['Content-Length'] = string.len(request_body)
        },
        source = ltn12.source.string(request_body),
        sink   = ltn12.sink.table(response_body)
    }
    debug('Web site replied with status = '..tostring(c))

    local page = ''
    if type(response_body) == 'table' then
        page = table.concat(response_body)
        debug('Returned web page data is : '..page)
        return true, page
    end

    return false, page
end

local WEATHER_PATTERN, tmp = string.gsub([[<response>.*
    <current_observation>.*<observation_location>.*<latitude>(.-)</latitude>.*<longitude>(.-)</longitude>.*</observation_location>.*
    <observation_epoch>(%d-)</observation_epoch>.*
    <weather>(.*)</weather>.*
    <temp_f>([%d%.%-]-)</temp_f>.*<temp_c>([%d%.%-]-)</temp_c>.*
    <relative_humidity>(%d-)%%</relative_humidity>.*
    <wind_string>(.*)</wind_string>.*<wind_dir>(%a-)</wind_dir>.*
    <wind_mph>([%d%.%-]-)</wind_mph>.*<wind_kph>([%d%.%-]-)</wind_kph>.*
    <icon>(.-)</icon>.*
    </current_observation>.*
    <forecast>.*<simpleforecast><forecastdays><forecastday>.*<period>1</period>
    <high><fahrenheit>(.-)</fahrenheit><celsius>(.-)</celsius></high>
    <low><fahrenheit>(.-)</fahrenheit><celsius>(.-)</celsius></low>.*
    </forecastday>.*<forecastday>.*<period>2</period>.*
]], "%s*", "")

local function parseData()
    local st = os.time()
    local success, webPage = urlRequest()

    -- strip out linefeeds and indentation spaces
    local xml = webPage:gsub(">%s*<", "><")

    debug("Successful execution of URL: xml is " .. xml)

    local lat, long, epoch, condition,
          currentTempF, currentTempC,
          currentHumidity,
          windCondition, windDirection, windMPH, windKPH, icon,
          forecastHighTempF, forecastHighTempC, forecastLowTempF, forecastLowTempC
        = xml:match(WEATHER_PATTERN)

    debug("Execution time: " .. (os.time() - st))

    if (lat == nil) then
        debug("Parse ERROR")
    else
        debug("Parse SUCCESSFUL")
        debug(lat)
        debug(long)
        debug(epoch)
        debug(condition)
        debug(icon)
        debug(currentTempF)
        debug(currentTempC)
        debug(currentHumidity)
        debug(windCondition)
        debug(windDirection)
        debug(windMPH)
        debug(windKPH)
        debug(forecastHighTempF)
        debug(forecastHighTempC)
        debug(forecastLowTempF)
        debug(forecastLowTempC)
    end

end

parseData()

return true
[/code]

Thanks for all of the replies. When I get home, I will give these a shot and let you know if it works or not.

@a-lurker, great code, and it is working pretty well, except that when I enter a value for period, it seems to be off by 1. Easy enough to work around, but do you have any documentation you can point me to on the way you formatted the WEATHER_PATTERN variable?

@jmutnick,
The pattern string looks to be extracted from the Weather Plugin code itself.

For that, I built it by looking at the XML of the Feed, and hand building the pattern (and later tuning it) according to the Patterns section of the Lua reference manual, and the “Captures” sub-section, specifically:

http://www.lua.org/manual/5.1/manual.html#5.4.1
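
As a rough illustration of how captures behave in a pattern like this, here is a minimal sketch. The sample XML is made up for the example and is not the real feed; only the pattern fragments are borrowed from the test code above.

```lua
-- Hypothetical sample document, much smaller than the real feed
local xml = [[
<response>
  <temp_c>21.5</temp_c>
  <weather>Partly Cloudy</weather>
  <relative_humidity>48%</relative_humidity>
</response>
]]

-- Collapse whitespace between tags, as the test code does
local flat = xml:gsub(">%s*<", "><")

-- Each parenthesised group is a capture; match() returns them in order.
-- "%%" matches a literal percent sign, "[%d%.%-]" a digit, dot or minus.
local pattern = "<temp_c>([%d%.%-]-)</temp_c>.*"
             .. "<weather>(.-)</weather>.*"
             .. "<relative_humidity>(%d-)%%</relative_humidity>"

local tempC, weather, humidity = flat:match(pattern)
print(tempC, weather, humidity)

-- ".*" is greedy and ".-" is lazy, which matters when a tag repeats
local twice = "<a>1</a><a>2</a>"
print(twice:match("<a>(.*)</a>"))   -- 1</a><a>2
print(twice:match("<a>(.-)</a>"))   -- 1
```

If any part of the pattern fails to match, `match()` returns nil for everything, which is why the test code only checks the first capture.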

This is not a particularly good way to do XML parsing, since it will (and does) break when slight formatting differences occur. Longer term, I really need to eliminate that code, but it works well enough for now, and the code is quite old (it was retrofitted for Wunderground when Google decommissioned Weather).

For example, it will break if the ordering of fields in the feed changes, even though it is completely legitimate to re-order the elements of the XML.
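
One way to sidestep the ordering problem while staying with plain Lua patterns is to look up each element on its own rather than in one long pattern. The sketch below is only an illustration: `getElement` is a hypothetical helper, not something from the Weather plugin, and the two feeds are invented.

```lua
-- Hypothetical helper: return the text of the first <tag>...</tag> found.
local function getElement(xml, tag)
    -- Escape punctuation so the tag name is matched literally
    local safe = tag:gsub("%p", "%%%0")
    -- "[^>]*" tolerates attributes on the opening tag; "(.-)" is lazy
    return xml:match("<" .. safe .. "[^>]*>(.-)</" .. safe .. ">")
end

-- The same two fields in a different order
local feedA = "<obs><temp_c>21.5</temp_c><icon>cloudy</icon></obs>"
local feedB = "<obs><icon>cloudy</icon><temp_c>21.5</temp_c></obs>"

print(getElement(feedA, "temp_c"), getElement(feedA, "icon"))
print(getElement(feedB, "temp_c"), getElement(feedB, "icon"))
```

This is still a hack: it returns only the first occurrence, so it cannot tell apart same-named elements under different parents (for example the fahrenheit value under high versus low), and it does not decode entities or handle CDATA.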

If I were writing it again today, I’d likely try it first in LXP.
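
For anyone wanting to try that route, here is a minimal sketch of pulling a few element values out with lxp (LuaExpat). It assumes the lxp module is installed; `extract` and its `wanted` argument are hypothetical names made up for this example, and the sample XML is not the real feed.

```lua
-- LuaExpat; assumed to be installed. pcall avoids a hard error if it is not.
local ok, lxp = pcall(require, "lxp")

-- Hypothetical helper: collect the text of the first occurrence of each
-- element named in `wanted` (a set such as { temp_c = true }).
local function extract(xml, wanted)
    if not ok then return nil, "lxp is not available" end

    local values, current, buffer = {}, nil, nil

    local callbacks = {
        StartElement = function(parser, name)
            if not current and wanted[name] and values[name] == nil then
                current, buffer = name, {}
            end
        end,
        CharacterData = function(parser, text)
            -- expat may deliver one text node in several chunks
            if current then buffer[#buffer + 1] = text end
        end,
        EndElement = function(parser, name)
            if name == current then
                values[name] = table.concat(buffer)
                current, buffer = nil, nil
            end
        end,
    }

    local parser = lxp.new(callbacks)
    local parsed, err = parser:parse(xml)
    if parsed then parsed, err = parser:parse() end  -- signal end of document
    parser:close()
    if not parsed then return nil, err end
    return values
end

local result, err = extract(
    "<obs><icon>cloudy</icon><temp_c>21.5</temp_c></obs>",
    { temp_c = true, icon = true })
if result then
    print(result.temp_c, result.icon)
else
    print("parse failed: " .. tostring(err))
end
```

Because the parser reports elements as it meets them, the order of the fields in the feed no longer matters.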

@jmutnick

This is a highly cut down/simplified version of @guessed’s weather plugin, as I mentioned previously. That is, he wrote it and I simplified it so that:
a) I could check a problem I had with the weather plugin (since rectified)
b) so it could be used as a test vehicle for any site that returns a small amount of xml
So I haven’t “formatted the pattern” - best to do a few Google searches on Lua patterns.

Here’s a version of Futzle’s WeMo XML parser modified to suit the weather plugin (with Futzle’s permission) - it’s test only stuff. If you have a Wunderground API key it will run in the Lua test area and it will show the results in the log. It illustrates how the parser could be applied to any site that returns XML results.

I’ve added in a couple of variables that contain the rain prediction for the immediate future (I think these are: ‘to-day’ and ‘to-night’). This or something similar, could possibly help out those watering their gardens, with sprinkler systems: http://forum.micasaverde.com/index.php/topic,10909.15.html

@a-lurker
Thanks for the code. I am trying to use it for my sprinkler system.

try this one…XML Parser

Kerry