HTTP GET request and URL encoding: spaces to + and back?

Question regarding openLuup and URL encoding:

Suppose you have an HTTP GET request for a command to set a text value, where the text value contains a space. That space must be URL encoded, and since it is in the query part, it must be changed to a ‘+’ (in the path part, it must be percent-encoded). (This is from some quick Google searching - please correct me if I am mistaken.) If this is true, should openLuup change the ‘+’ back to a space when it receives it?

http://ip_address:3480/data_request?id=action&output_format=xml&DeviceNum=6&serviceId=urn:upnp-org:serviceId:VSwitch1&action=SetText1&newText1Value=something+with+a+space

In general, the software layer that issues the HTTP request will URL encode the arguments.
But many HTTP request layers offer different interfaces for arguments that are already encoded and arguments that are not yet encoded. Ultimately, what gets sent as a byte stream must follow the HTTP URL encoding spec.

The HTTP receiver layer typically URL decodes the information before handing it over to the user's callback.

NOTE: There might be multiple layers of URL encoding/decoding going on. It is important that senders and receivers be symmetric in their handling.
If you call URL encode twice … you must call URL decode twice.
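
To illustrate the symmetry point, here is a minimal Lua sketch. The url_encode and url_decode helpers are hypothetical, written just for this example (they are not openLuup or Vera functions), and they do plain percent-encoding with no special treatment of '+'.

```lua
-- Hypothetical helper: percent-encode every byte except the
-- RFC 3986 unreserved characters (letters, digits, - . _ ~).
local function url_encode (s)
    return (s:gsub ("[^%w%-%._~]", function (c)
        return string.format ("%%%02X", string.byte (c))
    end))
end

-- Hypothetical helper: turn each %XX sequence back into its byte.
local function url_decode (s)
    return (s:gsub ("%%(%x%x)", function (hex)
        return string.char (tonumber (hex, 16))
    end))
end

local original = "with space"
local once  = url_encode (original)   -- "with%20space"
local twice = url_encode (once)       -- "with%2520space" (the % itself got encoded)

print (url_decode (twice))                -- with%20space  (one decode is not enough)
print (url_decode (url_decode (twice)))   -- with space
```

If a sender encodes twice and the receiver decodes only once, the handler sees the literal text with%20space instead of the intended value.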

This lack of symmetry is what screwed up scenes in Vera earlier this year.

The openLuup layer should not need to deal with this. The underlying HTTP server (listening on port 3480) will deal with this … as encoding is part of the HTTP spec.

Doesn’t openLuup itself listen to port 3480? Maybe it has some other helper under the hood?

Yes, openLuup has its own port 3480 server, and in fact has no need of a system server at all (thanks to some collaboration with @amg0.)

It’s possible that there’s a bug. We need to see a direct comparison between Vera and openLuup.

I’ve done an initial test (which, to be honest, just mirrors parts of the ordinary openLuup unit testing.)

Running this code in Lua Test:

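-- a = request name, b = table of request parameters, c = output format
function _G.testHandler (a,b,c)
    local x = {}
    for j,k in pairs (b or {}) do
        x[#x+1] = j .. '=' .. tostring(k)   -- one "name=value" line per parameter
    end
    return table.concat (x, '\n')
end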

luup.register_handler ("testHandler", "testHandler")

and, from a browser, invoking with:

http://172.16.42.10:3480/data_request?id=lr_testHandler&foo=forty+two&garp=one%20two%20three

gives:

garp=one two three
foo=forty+two

on both Vera and openLuup, which is what I would expect.

Of course, this isn’t an exhaustive test of all the requests.

I’m no expert on this, but I don’t think this result is correct:

foo=forty+two

The ‘+’ is supposed to be an encoded ‘ ’ (space). So that should have come out as ‘foo=forty two’.

If the requester wanted a +, then it would have sent a %2B. I.e., your result would be correct with this input:

http://172.16.42.10:3480/data_request?id=lr_testHandler&foo=forty%2Btwo&garp=one%20two%20three

One additional point - in your test, the browser itself may have encoded the strings that got sent?

Yes, it certainly did. My point is simply that this particular test gives the same result on Vera and openLuup.

This came about because I’m trying to get openHAB to send text to a virtual switch on openLuup. My first attempt failed because the send command rightly refused to send text with a space. (I don’t know in advance what text will be sent - my first one happened to have a space, so I’m looking for a general solution.) openHAB allows you to URLencode the string, and it then was happy to send it. I was surprised that openLuup didn’t decode it back to a space and instead I saw a +.

Note: this is not high priority - I can live with the + sign - I’m more curious than anything else.

Out of curiosity, I tried curl -G with --data-urlencode "newText1Value=with space". This worked, but it encoded the space as %20, which contradicts what I had read about URL encoding of spaces in the query part, namely that a space should be a +. This Wikipedia article (https://en.wikipedia.org/wiki/Percent-encoding#The_application.2Fx-www-form-urlencoded_type) seems to indicate that the + is correct in the query part of a GET. Confusing…

That wiki is specifically for FORMS data submission, I believe. Are you sending forms data, or is this simply on the GET URL request line?

For encoded forms, openLuup does do the ‘+’ substitution.
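
To make the distinction concrete, the two decoding styles might be sketched in Lua like this. Both function names are hypothetical and this is not openLuup's actual code, just an illustration of plain percent-decoding versus form-style decoding.

```lua
-- Plain percent-decoding: only %XX sequences are translated,
-- so a '+' is passed through untouched.
local function percent_decode (s)
    return (s:gsub ("%%(%x%x)", function (hex)
        return string.char (tonumber (hex, 16))
    end))
end

-- Form-style decoding (application/x-www-form-urlencoded):
-- '+' becomes a space first, then %XX sequences are decoded,
-- so an encoded %2B still comes out as a literal '+'.
local function form_decode (s)
    return percent_decode ((s:gsub ("%+", " ")))
end

print (percent_decode ("forty+two"))           -- forty+two
print (form_decode ("forty+two"))              -- forty two
print (form_decode ("forty%2Btwo"))            -- forty+two
print (percent_decode ("one%20two%20three"))   -- one two three
```

The first and last lines match the output of the test handler above.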

BTW, are you using the latest development version?

I think you are correct - I had a misunderstanding of this. There is a distinction between the encoding for forms and the encoding for a GET query. There is a lot of information and some misinformation about this, to which I have unfortunately contributed. I should not be using the Java URLEncoder directly for this. Its output either has to be post-processed or another method must be used.
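
For what it's worth, the post-processing idea can be sketched as follows. It is shown in Lua only to stay consistent with the handler code earlier in the thread; the real fix would live on the sending side. The form_encode function is a hypothetical stand-in for a form-style encoder, not an exact replica of any library. The key assumption is that a form-style encoder writes a literal '+' as %2B, so any '+' left in its output can only mean a space.

```lua
-- Hypothetical stand-in for a form-style encoder: spaces become '+',
-- everything else outside letters, digits, - . _ ~ becomes %XX.
local function form_encode (s)
    s = s:gsub ("[^%w%-%._~ ]", function (c)
        return string.format ("%%%02X", string.byte (c))
    end)
    return (s:gsub (" ", "+"))
end

-- Post-processing step: rewrite every '+' as %20 so that a receiver
-- doing plain percent-decoding recovers the space.
local function plus_to_percent20 (encoded)
    return (encoded:gsub ("%+", "%%20"))
end

local encoded = form_encode ("a+b c")
print (encoded)                       -- a%2Bb+c
print (plus_to_percent20 (encoded))   -- a%2Bb%20c
```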