Wikiup:Lua/Modul/URLutil/en
URLutil
– Module with functions for strings in the context of internet addressing (URLs; IP addresses, both IPv4 and IPv6; and e-mail addresses). Internationalized addresses (IRI) are also supported.
Because the module is meant to benefit a wiki project, only persistent, openly accessible resources on the World Wide Web are supported. Some special cases are not implemented, but they are hardly relevant:
- IPv4 address not in common notation (dotted decimal)
- URL with IPv6 host (in brackets; conflicts somewhat with wikisyntax)
- Authority with username
Functions for templates
Most functions expect exactly one unnamed parameter, which should be provided to get a meaningful answer. Leading and trailing whitespace is ignored.
The return value is an empty string (“nothing”) if the parameter value does not fulfil the expectations. If there is a result, or if the query condition is true, at least one visible character is returned. The result does not begin or end with a space, and HTML entities are decoded.
- encode
- Encoding similar to parser function {{urlencode:}}
- Critical characters at the start will be encoded, as well as link brackets and pipe.
- Parameter 2 – (optional) encoding
- 2=% – QUERY, spaces as plus
- 2=WIKI – sparse encoding, spaces as underscore
- 2=PATH – spaces percent-encoded
- getAuthority
- Extract server access from a resource URL (lowercase result)
- nothing – if invalid
- getFragment
- Extract fragment (if any) from a resource URL
- Parameter 2 – (optional) decoding
- 2=% – URL is %-coded
- 2=WIKI – URL is Wiki-coded with dots and underscore
- Result:
- nothing – if not present
- starting with # – if present
- getHost
- Extract domain or IP address from a resource URL (lowercase result)
- nothing – if invalid
- getLocation
- Extract resource URL without a fragment, if any
- getPath
- Extract path from a resource URL without any query or fragment.
- Beginning with / as basic resource identification.
- getPort
- Extract port number from a resource URL (numeric result)
- nothing – if not present or invalid
- getQuery
- Extract query from a resource URL
- Parameter 2 – (optional) single parameter name
- Parameter 3 – alternative separator such as ; (default: &)
- Result:
- nothing – if not present
- single value, if single parameter requested
- getRelativePath
- Extract path and query including fragment (if any) from a resource URL but relative to host.
- getScheme
- Extract scheme from a resource URL (lowercase result, including double slashes)
- // – relative protocol
- https:// – protocol
- nothing – if beginning of URL is invalid
- getTLD
- Extract top level domain from a resource URL (lowercase result)
- nothing – if invalid, or IP
- getTop2domain
- Extract first two top levels of domain from a resource URL (lowercase result)
- nothing – if invalid, or IP
- getTop3domain
- Extract three top levels of domain from a resource URL (lowercase result)
- nothing – if invalid, or IP
- isAuthority
- Is it a server address (also IP) of a resource, including port?
- 1 – yes
- isDomain
- Is it a named domain, including sub domains?
- 1 – yes
- isDomainExample
- Is it an example domain defined in RFC 2606 (example.com, example.edu, example.net, example.org)?
- 1 – yes
- isDomainInt
- Is it an Internationalized Domain Name (non-ASCII or Punycode)?
- 1 – yes
- isHost
- Is it a server address without port (also IP)?
- 1 – yes
- isHostPathResource
- Is it a resource URL or a resource URL without protocol part?
- 1 – yes
- isIPlocal
- Is it an IPv4 address presumed to be local (RFC 1918, RFC 1122), including any address like 0.0.0.0 (RFC 5735)?
- 1 – yes
- isIPv4
- Is it an IPv4 address in common notation (segmentation by dots, decimal)?
- 1 – yes
- isIPv6
- Is it an IPv6 address?
- 1 – yes
- isMailAddress
- Is it an e-mail address?
- 1 – yes
- isMailLink
- Is it an e-mail link (mailto:)?
- 1 – yes
- isProtocolDialog
- Is it a URL or scheme keyword that could be used to initiate a dialog in a wiki?
- mailto, irc, ircs, ssh, telnet
- 1 – yes
- isProtocolWiki
- Is it a URL or scheme keyword that could point to a resource in a wiki?
- Relative protocol and ftp ftps git http https mms nntp sftp svn worldwind
- Not desired here: gopher, wais as well as mailto, irc, ircs, ssh, telnet.
- 1 – yes
- isResourceURL
- Is it a URL that provides general access to a resource? These are: relative protocol, http, https, ftp and also a valid host. Other URLs might be used on project or functional pages, but not in an encyclopedic context.
- 1 – yes
- isSuspiciousURL
- Is it a URL that might be syntactically problematic and might trigger a warning?
- 1 – yes
- isUnescapedURL
- Is it a URL in which the wikisyntax characters [ | ] still need to be escaped?
- 1 – yes
- isWebURL
- Is it a valid address for a resource (any protocol)?
- 1 – yes
- wikiEscapeURL
- Wikisyntax-safe escaping of [ | ] characters.
- Identical to the parameter if no problematic character is present.
- Otherwise [ | ] are replaced by webserver-safe HTML entities. A pipe cannot be passed in plain template syntax.
- failsafe
- Version identification
{{Wikipedia:Lua/Modul-Failsafe/en|Modul=URLutil}}
Examples (test page)
A test page illustrates practical use.
Functions for Lua modules (API)
All functions described above can be used by other modules:
local lucky, URLutil = pcall( require, "Module:URLutil" )
if type( URLutil ) == "table" then
    URLutil = URLutil.URLutil()
else
    -- failure; URLutil is the error message
    return "<span class='error'>" .. URLutil .. "</span>"
end
Subsequently there are available:
- URLutil.encode()
- URLutil.getAuthority()
- URLutil.getFragment()
- URLutil.getHost()
- URLutil.getLocation()
- URLutil.getPath()
- URLutil.getPort()
  numerical value, or false
- URLutil.getQuery()
- URLutil.getQueryTable(url, separator)
  table with all assignments key=value
- URLutil.getRelativePath()
- URLutil.getScheme()
- URLutil.getTLD()
- URLutil.getTop2domain()
- URLutil.getTop3domain()
- URLutil.isAuthority()
- URLutil.isDomain()
- URLutil.isDomainExample()
- URLutil.isDomainInt()
- URLutil.isHost()
- URLutil.isIP()
  numerical 4, 6, or false
- URLutil.isIPlocal()
- URLutil.isIPv4()
- URLutil.isIPv6()
- URLutil.isMailAddress()
- URLutil.isMailLink()
- URLutil.isProtocolDialog()
- URLutil.isProtocolWiki()
- URLutil.isResourceURL()
- URLutil.isSuspiciousURL()
- URLutil.isUnescapedURL()
- URLutil.isWebURL()
- URLutil.wikiEscapeURL()
- URLutil.failsafe(atleast)
- atleast – optional; nil, or minimal version request, or "wikidata"
- Returns: string or false
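As a hedged illustration of URLutil.getQueryTable(url, separator) from the list above, the following sketch renders the returned table as sorted key=value lines. The helper name listQuery is hypothetical and not part of the module. The sketch assumes that the function takes the URL as its first argument and yields either a table of assignments or false; values are passed through tostring(), since their exact type is not documented here.

```lua
-- Hypothetical helper (not part of URLutil): list the query assignments
-- of a URL in a stable, sorted order.
-- URLutil is the library table obtained via require as shown above.
local function listQuery( URLutil, url, separator )
    local assignments = URLutil.getQueryTable( url, separator )
    if type( assignments ) ~= "table" then
        return false    -- assumed: no usable query, or URL not accepted
    end
    -- collect the keys and sort them by their string form
    local keys = {}
    for k in pairs( assignments ) do
        table.insert( keys, k )
    end
    table.sort( keys, function ( a, b )
        return tostring( a ) < tostring( b )
    end )
    -- build one "key=value" line per assignment
    local lines = {}
    for _, k in ipairs( keys ) do
        table.insert( lines, tostring( k ) .. "=" .. tostring( assignments[ k ] ) )
    end
    return table.concat( lines, "\n" )
end
```

Passing the library table as an argument keeps the helper independent of how the module was loaded.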
On success, the URLutil.get*() functions return a string and the URLutil.is*() functions return true (unless an exception is mentioned above); on failure they always return false.
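Because failure is signalled by false rather than by an empty string or an error, callers can test results directly. The sketch below is an illustration under assumptions, not part of the module: hostOrFallback is a hypothetical helper, and it presumes that isResourceURL() and getHost() accept the URL string as their single argument.

```lua
-- Hypothetical helper (not part of URLutil): return the host of a
-- resource URL, or a fallback text if the URL is not acceptable.
local function hostOrFallback( URLutil, url, fallback )
    -- is*() functions yield true (or another truthy value) or false
    if not URLutil.isResourceURL( url ) then
        return fallback
    end
    -- get*() functions yield a string on success, false on failure
    return URLutil.getHost( url ) or fallback
end
```

The `or fallback` idiom works here because false is the only failure value described for the get*() functions.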
Furthermore there are three constants:
- URLutil.serial – string, current version ID (date)
- URLutil.suite – "URLutil"
- URLutil.item – number, item on Wikidata
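These constants can be combined with URLutil.failsafe() to report whether the loaded module meets a minimum version. The following is a minimal sketch with stated assumptions: versionNote is a hypothetical helper, and a false result from failsafe(atleast) is assumed to mean that the requested version is not satisfied. The only behaviour taken from the description above is that failsafe() returns a string or false.

```lua
-- Hypothetical helper (not part of URLutil): describe whether the loaded
-- module satisfies a minimal version request.
local function versionNote( URLutil, atleast )
    local version = URLutil.failsafe( atleast )
    if version then
        -- string result: taken as the version identification
        return URLutil.suite .. " " .. version
    end
    -- false result: assumed to mean the request is not satisfied
    return URLutil.suite .. " " .. tostring( URLutil.serial )
           .. " does not satisfy " .. tostring( atleast )
end
```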
Usage
General library; no limitations.
Dependencies
None.
See also
- mw: Uri library – further functionality for general URIs, and particularly helpful for wiki URLs.
Antetype
en:Module:IPAddress – 2013-03-01
- Unit tests: en:Module:IPAddress/testcases