Rapid7 Blog

URI Parsing: It's harder than you think... or is it?

Last updated on Sep 27, 2017

I have to admit, parsing a URI is tricky. Most Metasploit modules try to do it with some kind of crazy custom regex-fu, but unfortunately most of those attempts are kind of buggy. Because of this, I've committed a new patch to HttpClient -- a target_uri function that can automatically parse the URI for you. It's only a 4-line change, but it should change the way we code HTTP-related modules.

Before I demonstrate how you can take advantage of target_uri, I should briefly explain why you should avoid doing this manually.  First off, the URI structure looks like this:

Scheme | Hierarchical URI indicator | Credential | Host | Port | Path to resource | Query string | Fragment
  • Scheme: A string that indicates the protocol, such as: http, https, smb, ftp, etc.
  • Hierarchical URI indicator: Optional. The literal string "//".
  • Credential: Optional. In this format: username:password@
  • Host: The address of the server (note this can be IPv4, IPv6, a 32-bit integer, etc.)
  • Port: Optional. This is pretty self-explanatory.
  • Path to resource: A directory or file path.  This is trickier than you think, because how do you determine whether "test" is a directory or a file?  Keep in mind that when you do "set TARGETURI test" in a browser exploit in Metasploit, 'test' is treated as a directory, not a file.
  • Query String: Optional. Pretty much anything that comes after "?".
  • Fragment: Optional. Pretty much anything that comes after "#".
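
To make the list above a bit more concrete, here's a minimal sketch that runs a made-up URI through Ruby's stdlib URI module and prints each piece. The example URI, the EXAMPLE_URI constant, and the uri_components helper are all hypothetical and only there for illustration; none of them are part of Metasploit.

```ruby
require 'uri'

# Hypothetical URI, chosen so that every component in the list is present.
EXAMPLE_URI = 'http://user:pass@example.com:8080/cms/index.php?page=1&cmd=id#top'

# Hypothetical helper: split a URI string into its components with the
# stdlib parser and return them as a Hash.
def uri_components(str)
  uri = URI.parse(str)
  {
    scheme:   uri.scheme,    # protocol
    userinfo: uri.userinfo,  # credential part, without the trailing "@"
    host:     uri.host,
    port:     uri.port,
    path:     uri.path,
    query:    uri.query,     # everything after "?", up to "#"
    fragment: uri.fragment   # everything after "#"
  }
end

uri_components(EXAMPLE_URI).each do |name, value|
  puts "#{name}: #{value.inspect}"
end
```

Note that the parser can't tell you whether the last path segment is a directory or a file either; that decision is still up to your module.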

RFC 3986 covers the generic URI syntax pretty well in case you'd like to read up more, but as you can see, it's really a lot of hassle to break a URI down by hand.  To ease this process, we came up with a simple solution that uses Ruby's stdlib -- or to be specific, the URI module. The following is a basic usage example:

require 'msf/core'

class Metasploit3 < Msf::Auxiliary

  include Msf::Exploit::Remote::HttpClient

  def initialize(info = {})
    super(update_info(info,
      'Name'        => 'URI test case',
      'Description' => %q{This module tests the target_uri function},
      'Author'      => [ 'sinn3r' ],
      'License'     => MSF_LICENSE
    ))

    register_options(
      [
        # You must use TARGETURI, or target_uri won't work
        OptString.new('TARGETURI', [true, 'The URI Path', '/cms/index.php?page=1&cmd=id'])
      ], self.class)
  end

  def run
    uri = target_uri
    print_status(uri.inspect)
  end
end

The above example should return something like this:

msf  auxiliary(test_case) > run
[*] #<URI::Generic:0x0000010c8c05e8 URL:/cms/index.php?page=1&cmd=id>

To retrieve just the resource path, you can simply do this. Note that a missing component generally comes back as nil (this goes for scheme, port, query, fragment, etc.), while in current Ruby versions a missing path comes back as an empty string, so make sure you handle both cases properly:

uri = target_uri
print_status(uri.path) # We get "/cms/index.php"
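
If you want to see what "handle that properly" might look like, here's a rough sketch using the stdlib URI module directly, outside of Metasploit. The safe_path helper and the "/" fallback are my own assumptions for illustration, not something HttpClient provides.

```ruby
require 'uri'

# Hypothetical helper (not part of Metasploit): return the path of a parsed
# URI, falling back to "/" when the parser gives us nil or an empty string.
def safe_path(uri)
  path = uri.path
  (path.nil? || path.empty?) ? '/' : path
end

relative = URI.parse('/cms/index.php?page=1&cmd=id') # same value as TARGETURI above
no_path  = URI.parse('http://example.com')           # hypothetical host, no path at all

puts safe_path(relative)      # the path from TARGETURI
puts safe_path(no_path)       # falls back to "/"
puts relative.scheme.inspect  # a relative URI has no scheme, so this is nil
```

The same kind of guard works for the other components: check for nil before calling string methods on scheme, query, or fragment.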

If you want the query string (or a specific parameter), here's another trick on how to handle it:

require 'msf/core'

class Metasploit3 < Msf::Auxiliary

  include Msf::Exploit::Remote::HttpClient
  include Msf::Auxiliary::WmapScanUniqueQuery

  def initialize(info = {})
    super(update_info(info,
      'Name'        => 'URI test case',
      'Description' => %q{This module tests the target_uri function},
      'Author'      => [ 'sinn3r' ],
      'License'     => MSF_LICENSE
    ))

    register_options(
      [
        # You must use TARGETURI, or target_uri won't work
        OptString.new('TARGETURI', [true, 'The URI Path', '/cms/index.php?page=1&cmd=id'])
      ], self.class)
  end

  def run
    uri = target_uri
    # uri.query is nil when there is no query string, so fall back to ""
    query = queryparse(uri.query || "")
    param_page = query['page']
    param_cmd = query['cmd']

    print_status("Query is a #{query.class}")
    print_status("Page is: #{param_page}")
    print_status("CMD is : #{param_cmd}")
  end
end

And this gives us the following output:

msf  auxiliary(testme) > rerun
[*] Reloading module...

[*] Query is a Hash
[*] Page is: 1
[*] CMD is : id
[*] Auxiliary module execution completed
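
For comparison, if you're poking at this in plain Ruby where queryparse isn't available, the stdlib can get you a similar Hash. This is only a sketch: the query_to_hash helper is hypothetical, it uses URI.decode_www_form in place of queryparse, and when a parameter appears more than once it keeps only the last value.

```ruby
require 'uri'

# Hypothetical helper: turn a query string (possibly nil) into a Hash
# using only the stdlib. Duplicate keys keep the last value seen.
def query_to_hash(query)
  URI.decode_www_form(query || '').to_h
end

uri    = URI.parse('/cms/index.php?page=1&cmd=id') # same value as TARGETURI above
params = query_to_hash(uri.query)

puts "Query is a #{params.class}"
puts "Page is: #{params['page']}"
puts "CMD is : #{params['cmd']}"
```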

And that's it for now. In case you're interested in other HttpClient functions, please feel free to check out the following documentation:

http://rapid7.github.io/metasploit-framework/api/