Wikiup:Lua/Modul/URLutil/en

URLutil – Module with functions for strings in the context of internet addressing (URL; IP address, both IPv4 and IPv6; and e-mail). Internationalized addresses (IRI) are also supported.

Since the module is meant to benefit a wiki project, only persistent, openly accessible resources on the World Wide Web are supported. Some special cases are not implemented, but they are hardly relevant:

  • IPv4 address not in common notation (dotted decimal)
  • URL with IPv6 host (in brackets, which conflict somewhat with wikisyntax)
  • Authority with username

Functions for templates

Most functions expect exactly one unnamed parameter (which should be provided to get a meaningful answer). Leading and trailing whitespace is ignored.

The return value is an empty string (“nothing”) if the parameter value does not meet the expectations. If there is a result, or the queried condition is true, at least one visible character is returned. The result never begins or ends with a space, and HTML entities are decoded.

encode
Encoding similar to parser function {{urlencode:}}
Critical characters at the start will be encoded, as well as link brackets and the pipe.
Parameter 2 – (optional) encoding
  • 2=% – QUERY, spaces as plus
  • 2=WIKI – sparse encoding, spaces as underscore
  • 2=PATH – spaces percent-encoded
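
The three space conventions listed above can be pictured with a short plain-Lua sketch. It is not the module's code: percentEncode is a hypothetical name, the default mode is an assumption, and the sketch encodes every reserved byte in all three modes, whereas the WIKI mode described above is sparser.

```lua
-- Sketch only: a simplified stand-in, not the URLutil implementation.
-- Assumption: QUERY is treated as the default mode.
local function percentEncode(text, mode)
    -- Choose the replacement for the space character.
    local space = "+"                 -- QUERY
    if mode == "WIKI" then
        space = "_"
    elseif mode == "PATH" then
        space = "%20"
    end
    -- Percent-encode every byte except unreserved characters and space.
    local encoded = text:gsub("[^%w%-%._~ ]", function(char)
        return string.format("%%%02X", char:byte())
    end)
    -- A table replacement is inserted literally, so "%20" is safe here.
    -- The outer parentheses drop the substitution count.
    return (encoded:gsub(" ", { [" "] = space }))
end
```

For instance, percentEncode("a b", "PATH") yields a%20b, while the same input in WIKI mode yields a_b.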
getAuthority
Extract server access from a resource URL (lowercase result)
  • nothing – if invalid
getFragment
Extract fragment (if any) from a resource URL
Parameter 2 – (optional) decoding
  • 2=% – URL is %-coded
  • 2=WIKI – URL is Wiki-coded with dots and underscore
Result:
  • nothing – if not present
  • starting with # – if present
getHost
Extract domain or IP address from a resource URL (lowercase result)
  • nothing – if invalid
getLocation
Extract resource URL without a fragment, if any
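
How getLocation and getFragment divide a URL can be pictured with a few lines of plain Lua. This is a minimal sketch rather than the module's code; splitFragment is a hypothetical name, and the optional decoding selected by parameter 2 is left out.

```lua
-- Sketch only: hypothetical helper, not the URLutil implementation.
-- Returns the location and the fragment (including "#"), or false
-- as second value if the URL carries no fragment.
local function splitFragment(url)
    local location, fragment = url:match("^([^#]*)(#.*)$")
    if location then
        return location, fragment
    end
    return url, false
end
```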
getPath
Extract path from a resource URL without any query or fragment.
Beginning with / as basic resource identification.
getPort
Extract port number from a resource URL (numeric result)
  • nothing – if not present or invalid
getQuery
Extract query from a resource URL
Parameter 2 – (optional) single parameter name
Parameter 3 – (optional) alternative separator such as ; (default: &)
Result:
  • nothing – if not present
  • single value, if single parameter requested
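
The idea of splitting a query into key=value assignments with a configurable separator can be sketched in plain Lua as follows. The function name queryTable is hypothetical, and returning false for a URL without a query is an assumption; the module's own parsing may differ in details such as decoding.

```lua
-- Sketch only: hypothetical helper, not the URLutil implementation.
local function queryTable(url, separator)
    if not separator or separator == "" then
        separator = "&"
    end
    -- Ignore the fragment, then take everything after the first "?".
    local withoutFragment = url:match("^[^#]*")
    local query = withoutFragment:match("%?(.*)$")
    if not query then
        return false
    end
    local result = {}
    local start = 1
    while start <= #query do
        -- Plain find (4th argument true) avoids pattern magic characters.
        local stop = query:find(separator, start, true) or (#query + 1)
        local pair = query:sub(start, stop - 1)
        local key, value = pair:match("^([^=]+)=(.*)$")
        if key then
            result[key] = value
        end
        start = stop + #separator
    end
    return result
end
```

A single value, as returned when parameter 2 names one query parameter, would then simply be one field of that table.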
getRelativePath
Extract path and query including fragment (if any) from a resource URL but relative to host.
getScheme
Extract scheme from a resource URL (lowercase result, including double slashes)
  • // – relative protocol
  • https:// – protocol
  • nothing – if beginning of URL is invalid
getTLD
Extract top level domain from a resource URL (lowercase result)
  • nothing – if invalid, or IP
getTop2domain
Extract the top two levels of the domain from a resource URL (lowercase result)
  • nothing – if invalid, or IP
getTop3domain
Extract the top three levels of the domain from a resource URL (lowercase result)
  • nothing – if invalid, or IP
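
What “top levels” means for getTLD, getTop2domain and getTop3domain can be illustrated on a bare host name. The sketch below is plain Lua under stated assumptions: topDomain is a hypothetical name, it expects a host rather than a full URL, it does not reject IP addresses, and returning false when the host has fewer labels than requested is an assumption.

```lua
-- Sketch only: hypothetical helper, not the URLutil implementation.
-- Returns the last `count` labels of a host name in lowercase.
local function topDomain(host, count)
    local labels = {}
    for label in host:lower():gmatch("[^%.]+") do
        labels[#labels + 1] = label
    end
    if #labels < count then
        return false
    end
    return table.concat(labels, ".", #labels - count + 1, #labels)
end
```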
isAuthority
Is it a server address (also IP) of a resource, including port?
  • 1 – yes
isDomain
Is it a named domain, including sub domains?
  • 1 – yes
isDomainExample
Is it an example domain defined in RFC 2606 (example.com example.edu example.net example.org)?
  • 1 – yes
isDomainInt
Is it an Internationalized Domain Name (non-ASCII or Punycode)?
  • 1 – yes
isHost
Is it a server address without port (also IP)?
  • 1 – yes
isHostPathResource
Is it a resource URL or a resource URL without protocol part?
  • 1 – yes
isIP
Is it an IP address?
  • 4 if IPv4 (in common dotted decimal notation)
  • 6 if IPv6
  • nothing – else
isIPlocal
Is it an IPv4 address that is presumably local (RFC 1918, RFC 1122), including addresses such as 0.0.0.0 (RFC 5735)?
  • 1 – yes
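
As a rough illustration of such a check, the following plain-Lua sketch tests a dotted-decimal IPv4 address against the private ranges of RFC 1918 plus the loopback and “this network” blocks. Which ranges the module actually treats as local is not stated here, so the selection below is an assumption, and isLocalIPv4 is a hypothetical name.

```lua
-- Sketch only: hypothetical helper, not the URLutil implementation.
-- Assumed "local" ranges: 0.0.0.0/8, 10.0.0.0/8, 127.0.0.0/8,
-- 172.16.0.0/12 and 192.168.0.0/16.
local function isLocalIPv4(text)
    local a, b, c, d = text:match("^(%d+)%.(%d+)%.(%d+)%.(%d+)$")
    if not a then
        return false          -- not in dotted decimal notation
    end
    a, b, c, d = tonumber(a), tonumber(b), tonumber(c), tonumber(d)
    if a > 255 or b > 255 or c > 255 or d > 255 then
        return false          -- octet out of range
    end
    return a == 0 or a == 10 or a == 127
        or (a == 172 and b >= 16 and b <= 31)
        or (a == 192 and b == 168)
end
```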
isIPv4
Is it an IPv4 address in common notation (segmentation by dots, decimal)?
  • 1 – yes
isIPv6
Is it an IPv6 address?
  • 1 – yes
isMailAddress
Is it an e-mail address?
  • 1 – yes
isMailLink
Is it an e-mail link (mailto:)?
  • 1 – yes
isProtocolDialog
Is it a URL or scheme keyword that could be used to initiate a dialog in a wiki?
mailto, irc, ircs, ssh, telnet
  • 1 – yes
isProtocolWiki
Is it a URL or scheme keyword that could point to a resource in a wiki?
Relative protocol and ftp, ftps, git, http, https, mms, nntp, sftp, svn, worldwind
Not accepted here: gopher and wais, as well as mailto, irc, ircs, ssh, telnet.
  • 1 – yes
isResourceURL
Is it a URL that provides general access to a resource? This covers relative protocol, http, https and ftp, together with a valid host. Other URLs might be used on project or functional pages, but not in an encyclopedic context.
  • 1 – yes
isSuspiciousURL
Is it a URL that might be syntactically problematic and might trigger a warning?
  • 1 – yes
isUnescapedURL
Is it a URL in which wikisyntax [ | ] still needs to be escaped?
  • 1 – yes
isWebURL
Is it a valid address for a resource (any protocol)?
  • 1 – yes
wikiEscapeURL
Wikisyntax-safe escaping of [ | ] characters.
  • Identical with parameter, if no problematic character present.
  • Otherwise [ | ] replaced by webserver safe HTML entities. A pipe is not possible in plain template syntax.
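
The replacement described above can be sketched in plain Lua. The use of numeric character references (&#91; &#124; &#93;) is an assumption, since the text only speaks of HTML entities, and escapeForWiki is a hypothetical name rather than the module's function.

```lua
-- Sketch only: hypothetical helper, not the URLutil implementation.
-- Assumption: numeric character references serve as the entities.
local entities = {
    ["["] = "&#91;",
    ["|"] = "&#124;",
    ["]"] = "&#93;",
}

local function escapeForWiki(url)
    -- The character class matches "[", "|" and "]"; each match is
    -- looked up in the table. Parentheses drop the substitution count.
    return (url:gsub("[%[|%]]", entities))
end
```

A URL without any of the three characters passes through unchanged, matching the first case in the list above.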
failsafe
Version identification

{{Wikipedia:Lua/Modul-Failsafe/en|Modul=URLutil}}

Examples (test page)

A test page illustrates practical use.

Functions for Lua modules (API)

All functions described above can be used by other modules:

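-- Load the module defensively; pcall catches a missing or broken module.
local lucky, URLutil = pcall( require, "Module:URLutil" )
if type( URLutil ) == "table" then
    -- The module exports a function that returns the API table.
    URLutil = URLutil.URLutil()
else
    -- failure; URLutil is the error message
    return "<span class='error'>" .. tostring( URLutil ) .. "</span>"
end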

The following functions are then available:

  • URLutil.encode()
  • URLutil.getAuthority()
  • URLutil.getFragment()
  • URLutil.getHost()
  • URLutil.getLocation()
  • URLutil.getPath()
  • URLutil.getPort()
    numerical value, or false
  • URLutil.getQuery()
  • URLutil.getQueryTable(url, separator)
    table with all assignments key=value
  • URLutil.getRelativePath()
  • URLutil.getScheme()
  • URLutil.getTLD()
  • URLutil.getTop2domain()
  • URLutil.getTop3domain()
  • URLutil.isAuthority()
  • URLutil.isDomain()
  • URLutil.isDomainExample()
  • URLutil.isDomainInt()
  • URLutil.isHost()
  • URLutil.isIP()
    numerical 4, 6, or false
  • URLutil.isIPlocal()
  • URLutil.isIPv4()
  • URLutil.isIPv6()
  • URLutil.isMailAddress()
  • URLutil.isMailLink()
  • URLutil.isProtocolDialog()
  • URLutil.isProtocolWiki()
  • URLutil.isResourceURL()
  • URLutil.isSuspiciousURL()
  • URLutil.isUnescapedURL()
  • URLutil.isWebURL()
  • URLutil.wikiEscapeURL()
  • URLutil.failsafe(atleast)
    1. atleast
      optional
      nil or minimal version request or "wikidata"
    Returns: string or false

On success, the URLutil.get*() functions return a string and the URLutil.is*() functions return true (unless an exception is noted above); on failure, all of them return false.
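
Because failure is signalled by false rather than by an error, callers can branch on the result directly. The sketch below assumes that getHost and isIP each take the string to examine as their single argument, which the list above does not spell out. describeHost is a hypothetical helper, and the API table is passed in as a parameter so that the sketch also runs outside Scribunto with a stand-in table.

```lua
-- Sketch only: hypothetical helper built on the documented return
-- convention (string / true / number on success, false on failure).
local function describeHost(URLutil, url)
    local host = URLutil.getHost(url)
    if not host then
        return "no valid host"
    end
    local version = URLutil.isIP(host)   -- 4, 6, or false
    if version then
        return host .. " (IPv" .. version .. ")"
    end
    return host .. " (domain)"
end
```

Inside a module, the table obtained from URLutil.URLutil() would be passed as the first argument.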

Furthermore, there are three constants:

  • URLutil.serial – string, current version ID (date)
  • URLutil.suite – "URLutil"
  • URLutil.item – number, Item on Wikidata

Usage

General library; no limitations.

Dependencies

None.

See also

  • mw: Uri library – further functions for general URIs, particularly helpful for wiki URLs.

Antetype

en:Module:IPAddress – 2013-03-01