Writing an extension scriptable protocol

This section develops an extension scriptable protocol that adds detection capability for Hypertext Transfer Protocol (HTTP) traffic. Specifically, the contents of the script tag in POST requests are correlated with text in the response. If data is found in both the request and the response, a property combining the two is set on the client node.

Because this is an HTTP extension, only HTTP traffic will be offered to it for processing. The can_handle function can therefore be trivial, since all HTTP traffic should pass through the extension:

function can_handle()
  return true
end
    

The extension functionality is implemented in the update_status function:


function update_status()
  -- read_string is provided by the Guardian API and returns the whole message
  local msg = read_string()
  if is_post(msg) then
    handle_post(msg)
  elseif is_response(msg) then
    handle_response(msg)
  end
end

Of all the functions invoked here, only read_string is provided by the Guardian application programming interface (API). The others are defined within the scriptable protocol script and are shown later in this section.

This function reads the complete HTTP message (recall that extensions receive defragmented data), checks whether it is a POST request or a response, and invokes the corresponding handler. Any other message is ignored.
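The dispatch logic can be tried outside Guardian by replacing read_string with a stub. The following is a minimal sketch under that assumption: the stub, the stub_message variable and the classify helper are hypothetical names introduced for illustration and are not part of the Guardian API or of the script itself.

```lua
-- ASSUMPTION: stand-in for the Guardian-provided read_string(), used only
-- for local experimentation. It returns whatever stub_message holds.
stub_message = ""
function read_string()
  return stub_message
end

-- The same checks that the script uses.
function is_post(msg)
  return string.find(msg, "^POST") ~= nil
end

function is_response(msg)
  return string.find(msg, "^HTTP") ~= nil
end

-- Hypothetical helper: reports which branch update_status would take.
function classify()
  local msg = read_string()
  if is_post(msg) then
    return "post"
  elseif is_response(msg) then
    return "response"
  end
  return "ignored"
end

stub_message = "POST /form HTTP/1.1\r\n\r\n<script>x</script>"
print(classify())  --> post
stub_message = "HTTP/1.1 200 OK\r\n\r\n<p>y</p>"
print(classify())  --> response
stub_message = "GET / HTTP/1.1\r\n\r\n"
print(classify())  --> ignored
```

The GET request falls through both checks because its first characters are neither POST nor HTTP, so neither handler would run for it.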

The functions that support the handling of the POST messages are:

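-- Key under which the script contents are stored for a pending request
PENDING_SCRIPT_KEY = 0

-- Returns true if the message starts with "POST"
function is_post(msg)
  local req_pattern = "^POST"
  local index = string.find(msg, req_pattern)
  return index ~= nil
end

-- Returns the text between <script> and </script>, or nil if there is none
function parse_script(msg)
  local pattern = "<script>(.*)</script>"
  local _, _, script = string.find(msg, pattern)
  return script
end

-- Stores the script contents in the session so the response can use them
function handle_post(msg)
  local script = parse_script(msg)
  if script then
    session.set_pending_request_string(0, PENDING_SCRIPT_KEY, script)
  end
end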

The is_post function checks whether the HTTP message starts with the string POST. If the index returned by the Lua string.find function is not nil, true is returned. The ^ anchor ensures that a match is made only at the beginning of the string.
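The effect of the anchor can be checked in any standalone Lua interpreter. The messages below are made-up examples; string.find returns the start and end indices of the first match, or nil when there is none.

```lua
-- Anchored pattern, message starts with POST: match at indices 1..4.
print(string.find("POST /login HTTP/1.1", "^POST"))  --> 1   4

-- POST appears only inside the path, so the anchored pattern does not match.
print(string.find("GET /POST HTTP/1.1", "^POST"))    --> nil

-- Without the anchor the same message would match at indices 6..9,
-- and a GET request would wrongly be treated as a POST.
print(string.find("GET /POST HTTP/1.1", "POST"))     --> 6   9
```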

The parse_script function also uses the Lua string.find function, this time to extract the script contents: it returns the text captured by the parentheses in the pattern. If no match is found, nil is returned.
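One property of the pattern is worth knowing: in Lua patterns, .* is greedy, so when a message contains several script tags the capture runs from the first opening tag to the last closing tag. The sketch below contrasts this with the lazy quantifier .- as a possible alternative. The two function names are hypothetical and the sample body is made up; whether the lazy form is preferable depends on what the extension should detect.

```lua
-- Greedy capture, the same pattern that parse_script uses.
function parse_script_greedy(msg)
  local _, _, script = string.find(msg, "<script>(.*)</script>")
  return script
end

-- Lazy capture: stops at the first closing tag (illustrative alternative).
function parse_script_lazy(msg)
  local _, _, script = string.find(msg, "<script>(.-)</script>")
  return script
end

local body = "<script>alert(1)</script><p>hi</p><script>alert(2)</script>"

print(parse_script_greedy(body))  --> alert(1)</script><p>hi</p><script>alert(2)
print(parse_script_lazy(body))    --> alert(1)

-- With no script tag in the message, the capture is nil.
print(parse_script_greedy("<p>hi</p>"))  --> nil
```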

The handle_post function uses parse_script to search for the script contents and, if they are found, stores them in the session under the PENDING_SCRIPT_KEY key. The session.set_pending_request_string function is part of the Guardian-specific API and stores an arbitrary string in the session. The first argument is the request_id; since HTTP is not expected to multiplex request/response pairs within the same session, it is hardcoded to 0. The second argument discriminates between different data values stored for the same request; here it is PENDING_SCRIPT_KEY, which is also 0.

Thus, once the POST handling functions have run, a pending request string holding the contents of the script tag may be set on the session.

The response handling functions are:

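-- Returns true if the message starts with "HTTP" (a response status line)
function is_response(msg)
  local res_pattern = "^HTTP"
  local index = string.find(msg, res_pattern)
  return index ~= nil
end

-- Returns the text between <p> and </p>, or nil if there is none
function parse_p(msg)
  local pattern = "<p>(.*)</p>"
  local _, _, p_body = string.find(msg, pattern)
  return p_body
end

function handle_response(msg)
  -- Only act if a POST on this session stored script contents earlier
  if session.has_pending_request_value(0, PENDING_SCRIPT_KEY) then
    local p_body = parse_p(msg)
    if p_body then
      local pending_script = session.read_pending_request_string(0, PENDING_SCRIPT_KEY)
      local restored_value = pending_script .. " -- " .. p_body
      packet.destination_node():set_property("restored_value", restored_value)
    end
    -- Release the stored data whether or not a p tag was found
    session.close_pending_request(0)
  end
end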

The is_response and parse_p functions are very similar to those described for the POST side, so they are not discussed further. Instead, the focus is on the more interesting handle_response function.

When handling a response, the script first checks whether a script value has already been stored in the session (that is, whether a POST message was handled earlier). Note that, in order to access the correct slot in the session, session.has_pending_request_value is invoked with the same request_id and key that were used when the value was stored. If the script value is found in the session, the body of the p tag is searched for. If that is found as well, the script value is fetched from the session (session.read_pending_request_string) and concatenated with the body of the p tag. The resulting string is stored as a property on the destination node (packet.destination_node():set_property). Finally, the data stored in the session is cleared (session.close_pending_request), since it is no longer needed and would otherwise consume memory unnecessarily. This cleanup happens whether or not a p tag was found.
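The request/response correlation described above can be exercised in a standalone Lua interpreter by stubbing the Guardian objects. The sketch below is only an approximation for local testing: the in-memory session and packet tables, the node_properties table and the sample messages are all assumptions, and the real Guardian objects may behave differently. The two handlers follow the logic of the section, with the parsing inlined for brevity.

```lua
-- ASSUMPTION: in-memory stand-ins for the Guardian session and packet
-- objects. Their behavior is guessed from the description in this section.
local pending = {}  -- pending[request_id][key] = value
session = {
  set_pending_request_string = function(id, key, value)
    pending[id] = pending[id] or {}
    pending[id][key] = value
  end,
  has_pending_request_value = function(id, key)
    return pending[id] ~= nil and pending[id][key] ~= nil
  end,
  read_pending_request_string = function(id, key)
    return pending[id] and pending[id][key]
  end,
  close_pending_request = function(id)
    pending[id] = nil
  end,
}

node_properties = {}  -- hypothetical record of what the stub node received
local node = {}
function node:set_property(name, value)
  node_properties[name] = value
end
packet = { destination_node = function() return node end }

PENDING_SCRIPT_KEY = 0

function handle_post(msg)
  local _, _, script = string.find(msg, "<script>(.*)</script>")
  if script then
    session.set_pending_request_string(0, PENDING_SCRIPT_KEY, script)
  end
end

function handle_response(msg)
  if session.has_pending_request_value(0, PENDING_SCRIPT_KEY) then
    local _, _, p_body = string.find(msg, "<p>(.*)</p>")
    if p_body then
      local pending_script = session.read_pending_request_string(0, PENDING_SCRIPT_KEY)
      packet.destination_node():set_property("restored_value", pending_script .. " -- " .. p_body)
    end
    session.close_pending_request(0)
  end
end

handle_post("POST /c HTTP/1.1\r\n\r\n<script>alert(1)</script>")
handle_response("HTTP/1.1 200 OK\r\n\r\n<p>done</p>")
print(node_properties["restored_value"])                         --> alert(1) -- done
print(session.has_pending_request_value(0, PENDING_SCRIPT_KEY))  --> false
```

After the response is handled, the pending value is gone, so a second response on the same session would set no property unless another POST with a script tag arrives first.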

In the code block below, the complete script is provided:

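-- All HTTP traffic is passed through the extension
function can_handle()
  return true
end

-- Key under which the script contents are stored for a pending request
PENDING_SCRIPT_KEY = 0

-- Returns the text between <script> and </script>, or nil if there is none
function parse_script(msg)
  local pattern = "<script>(.*)</script>"
  local _, _, script = string.find(msg, pattern)
  return script
end

-- Returns the text between <p> and </p>, or nil if there is none
function parse_p(msg)
  local pattern = "<p>(.*)</p>"
  local _, _, p_body = string.find(msg, pattern)
  return p_body
end

-- Returns true if the message starts with "POST"
function is_post(msg)
  local req_pattern = "^POST"
  local index = string.find(msg, req_pattern)
  return index ~= nil
end

-- Returns true if the message starts with "HTTP" (a response status line)
function is_response(msg)
  local res_pattern = "^HTTP"
  local index = string.find(msg, res_pattern)
  return index ~= nil
end

-- Stores the script contents in the session so the response can use them
function handle_post(msg)
  local script = parse_script(msg)
  if script then
    session.set_pending_request_string(0, PENDING_SCRIPT_KEY, script)
  end
end

function handle_response(msg)
  -- Only act if a POST on this session stored script contents earlier
  if session.has_pending_request_value(0, PENDING_SCRIPT_KEY) then
    local p_body = parse_p(msg)
    if p_body then
      local pending_script = session.read_pending_request_string(0, PENDING_SCRIPT_KEY)
      local restored_value = pending_script .. " -- " .. p_body
      packet.destination_node():set_property("restored_value", restored_value)
    end
    -- Release the stored data whether or not a p tag was found
    session.close_pending_request(0)
  end
end

function update_status()
  -- read_string is provided by the Guardian API and returns the whole message
  local msg = read_string()
  if is_post(msg) then
    handle_post(msg)
  elseif is_response(msg) then
    handle_response(msg)
  end
end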