1\documentclass[11pt]{article} 2\usepackage{times} 3\usepackage{pl} 4\usepackage{html} 5\sloppy 6\makeindex 7 8\onefile 9\htmloutput{.} % Output directory 10\htmlmainfile{ssl} % Main document file 11\bodycolor{white} % Page colour 12 13\begin{document} 14 15\title{SWI-Prolog SSL Interface} 16\author{\href{https://www.metalevel.at}{Markus Triska}, Jan van der Steen, Matt Lilley and Jan Wielemaker \\[5pt] 17 E-mail: \email{jan@swi-prolog.org} 18 } 19 20\maketitle 21 22\begin{abstract} 23The SWI-Prolog SSL (Secure Socket Layer) library implements a pair of 24\jargon{filtered streams} that realises an SSL encrypted connection on 25top of a pair of Prolog \jargon{wire} streams, typically a network 26socket. SSL provides public key based encryption and digitally signed 27identity information of the \jargon{peer}. The SSL library is well 28integrated with SWI-Prolog's HTTP library for both implementing HTTPS 29servers and communicating with HTTPS servers. It is also used by the 30\href{http://www.swi-prolog.org/pack/list?p=smtp}{smtp pack} for 31accessing secure mail agents. Plain SSL can be used to realise secure 32connections between e.g., Prolog agents. 33\end{abstract} 34 35\pagebreak 36\tableofcontents 37\pagebreak 38 39 40\section{Introduction} 41\label{sec:ssl-intro} 42 43Raw TCP/IP networking is dangerous for two reasons: 44 45\begin{enumerate} 46\item It is hard to tell whether the party you think you are talking 47 to is indeed the right one. 48\item Anyone with access to a subnet through which your data flows can 49`tap' the wire and listen for sensitive information such as passwords, 50credit card numbers, etc. 51\end{enumerate} 52 53Transport Layer Security~(TLS) and its predecessor Secure Socket 54Layer~(SSL), which are both often collectively called~SSL, solve both 55problems. SSL~uses: 56 57\begin{itemize} 58\item certificates to establish the \textit{identity} of the peer 59\item \textit{encryption} to make it useless to tap into the wire. 60\end{itemize} 61 62SSL allows agents to talk in private and create secure web services. 63 64The SWI-Prolog \pllib{ssl} library provides an API to turn a pair of 65arbitrary Prolog \jargon{wire} streams into SSL powered encrypted 66streams. Note that secure protocols such as secure HTTP simply run the 67plain protocol over (SSL) encrypted streams. 68 69The \pllib{crypto} library provides additional predicates related to 70cryptography and authentication, secure hashes and elliptic~curves. 71 72Cryptography is a difficult topic. If you just want to download 73documents from an HTTPS server without worrying much about security, 74http_open/3 will do the job for you. As soon as you have higher security 75demands we strongly recommend you to read enough background material to 76understand what you are doing. See \secref{ssl-security} for some 77remarks regarding this implementation. This 78\href{http://www.tldp.org/HOWTO/SSL-Certificates-HOWTO/index.html}{The 79Linux Documentation Project page} provides some additional background 80and tips for managing certificates and keys. 81 82 83\input{ssllib.tex} 84 85\input{crypto.tex} 86 87\section{XML cryptographic libraries} 88\label{sec:ssl-xml-libs} 89 90The SSL package provides several libraries dealing with cryptographic 91operations of XML documents. These libraries depend on the \const{sgml} 92package. These libraries are part of this package because the 93\const{sgml} package has no external dependencies and will thus be 94available in any SWI-Prolog installation while configuring and building 95this \const{ssl} package is much more involved. 96 97\input{saml.tex} 98\input{xmlenc.tex} 99\input{xmldsig.tex} 100 101\section{SSL Security} 102\label{sec:ssl-security} 103 104Using SSL (in this particular case based on the OpenSSL implementation) 105to connect to SSL services (e.g., an \verb$https://$ address) easily 106gives a false sense of security. This section explains some of the 107pitfalls.\footnote{We do not claim to be complete, just to start warning 108you if security is important to you. Please make sure you 109understand (Open)SSL before relying on it.}. As stated in the 110introduction, SSL aims at solving two issues: tapping information from 111the wire by means of encryption and make sure that you are talking to 112the right address. 113 114Encryption is generally well arranged as long as you ensure that the 115underlying SSL library has all known security patches installed and you 116use an encryption that is not known to be weak. The Windows version and 117MacOS binaries of SWI-Prolog ships with its own binary of the OpenSSL 118library. Ensure this is up-to-date. On systems that ship with the 119OpenSSL library SWI-Prolog uses the system version. This applies notably 120for all Linux packages. Check the origin and version of the OpenSSL 121libraries and verify there are no more recent security patches regularly 122if security is important to you. The OpenSSL library version as reported 123by SSLeay_version() is available in the Prolog flag 124\const{ssl_library_version} as illustrated below on Ubuntu 14.04. 125 126\begin{code} 127?- [library(ssl)]. 128?- current_prolog_flag(ssl_library_version, X). 129X = 'OpenSSL 1.0.1f 6 Jan 2014'. 130\end{code} 131 132Whether you are talking to the right address is a complicated 133issue. The core of the validation is that the server provides a 134\jargon{certificate} that identifies the server. This certificate is 135digitally \jargon{signed} by another certificate, and ultimately by a 136\jargon{root certificate}. (There may be additional links in this 137chain as well, or there may just be one certificate signed by itself) 138Verifying the peer implies: 139 140\begin{enumerate} 141 \item Verifying the chain or digital signatures until a trusted 142 root certificate is found, taking care that the chain does not 143 contain any invalid certificates, such as certificates which have 144 expired, are not yet valid, have altered or forged signatures, 145 are valid for the purposes of SSL (and in the case of an issuer, 146 issuing child certificates) 147 \item Verifying that the signer of a certificate did not \jargon{revoke} 148 the signed certificate. 149 \item Verifying that the host we connected to is indeed the host 150 claimed in the certificate. 151\end{enumerate} 152 153The default https client plugin (\pllib{http/http_ssl_plugin}) 154registers the system trusted root certificate with OpenSSL. This is 155achieved using the option 156\term{cacerts}{[system(root_certificates)]} of ssl_context/3. The 157verification is left to OpenSSL. To the best of our knowledge, the 158current (1.0) version of OpenSSL \textbf{only} implements step (1) of 159the verification process outlined above. This implies that an attacker 160that can control DNS mapping (host name to IP) or routing (IP to 161physical machine) can fake to be a secure host as long as they manage 162to obtain a certificate that is signed from a recognised 163authority. Version 1.0.2 supports hostname checking, and will not 164validate a certificate chain if the leaf certificate does not match 165the hostname. 'Match' here is not a simple string comparison; 166certificates are allowed (subject to many rules) to have wildcards in 167their SubjectAltName field. Care must also be taken to ensure that the 168name we are checking against does not contain embedded NULLs. If 169SWI-Prolog is compiled against a version of OpenSSL that does NOT have 170hostname checking (ie 1.0.0 or earlier), it will attempt to do the 171validation itself. This is not guaranteed to be perfect, and it only 172supports a small subset of allowed wildcards. If security is 173important, use OpenSSL 1.0.2 or higher. 174 175After validation, the predicate ssl_peer_certificate/2 can be used to 176obtain the peer certificate and inspect its properties. 177 178\section{CRLs and Revocation} 179\label{sec:crl} 180Certificates must sometimes be revoked. Unfortunately this means that 181the elegant chain-of-trust model breaks down, since the information 182you need to determine whether a certificate is trustworthy no longer 183depends on just the certificate and whether the issuer is trustworthy, 184but now on a third piece of data - whether the certificate has been 185revoked. These are managed in two ways in OpenSSL: CRLs and 186OCSP. SWI-Prolog supports CRLs only. (Typically OCSP responders are 187configured in such a way as to just consult CRLs anyway. This gives 188the illusion of up-to-the-minute revocation information because OCSP 189is an interactive, online, real-time protocol. However the information 190provided can still be several \emph{weeks} out of date!) 191 192To do CRL checking, pass require_crl(true) as an option to the 193ssl_context/3 (or http_open/3) option list. If you do this, a 194certificate will not be validated unless it can be \emph{checked} for 195on a revocation list. There are two options for this: 196 197First, you can pass a list of filenames in as the option crl/1. If the 198CRL corresponds to an issuer in the chain, and the issued certificate 199is not on the CRL, then it is assumed to not be revoked. Note that 200this does NOT prove the certificate is actually trustworthy - the CRL 201you pass may be out of date! This is quite awkward to get right, since 202you do not necessarily know in advance what the chain of certificates 203the other party will present are, so you cannot reasonably be expected 204to know which CRLs to pass in. 205 206Secondly, you can handle the CRL checking in the cert_verify_hook when 207the Error is bound to unknown_crl. At this point you can obtain the 208issuer certificate (also given in the hook), find the CRL distribution 209point on it (the crl/1 argument), try downloading the CRL (the URL can 210have literally any protocol, most commonly HTTP and LDAP, but 211theoretically anything else, too, including the possibility that the 212certificate has no CRL distribution point given, and you are expected 213to obtain the CRL by email, fax, or telegraph. Therefore how to 214actually obtain a CRL is out of scope of this document), load it 215using load_crl/2, then check to see whether the certificate currently 216under scrutiny appears in the list of revocations. It is up to the 217application to determine what to do if the CRL cannot be obtained - 218either because the protocol to obtain it is not supported or because 219the place you are obtaining it from is not responding. Just because 220the CRL server is not responding does not mean that your certificate 221is safe, of course - it has been suggested that an ideal way to extend 222the life of a stolen certificate key would be to force a denial of 223service of the CRL server. 224 225 226\subsubsection{Disabling certificate checking} 227\label{sec:disable-certificate} 228 229In some cases clients are not really interested in host validation of 230the peer and whether or not the certificate can be trusted. In these 231cases the client can pass \term{cert_verify_hook}{cert_accept_any}, 232calling cert_accept_any/5 which accepts any certificate. Note that 233this will accept literally ANY certificate presented - including ones 234which have expired, have been revoked, and have forged 235signatures. This is probably not a good idea! 236 237 238\subsubsection{Establishing a safe connection} 239\label{sec:ssl-safe-connection} 240 241Applications that exchange sensitive data with e.g., a backend server 242typically need to ensure they have a secure connection to their 243peer. To do this, first obtain a non-secure connection to the peer (eg 244via a TCP socket connection). Then create an SSL context via 245ssl_context/3. For the client initiating the connection, the role is 246'client', and you should pass options host/1 and cacerts/1 247at the very least. If you expect the peer to have a certificate which 248would be accepted by your host system, you can pass 249\term{cacerts}{[system(root_certificates)]}, otherwise you will need 250a copy of the CA certificate which was used to sign the peer's 251certificate. Alternatively, you can pass cert_verify_hook/1 to write 252your own custom validation for the peer's certificate. Depending on 253the requirements, you may also have to provide your /own/ certificate 254if the peer demands mutual authentication. This is done via the 255certificate_file/1, key_file/1 and either password/1 or 256pem_password_hook/1. 257 258Once you have the SSL context and the non-secure stream, you can call 259ssl_negotiate/5 to obtain a secure stream. ssl_negotiate/5 will raise 260an exception if there were any certificate errors that could not be 261resolved. 262 263The peer behaves in a symmetric fashion: First, a non-secure 264connection is obtained, and a context is created using ssl_context/3 265with the role set to server. In the server case, you must provide 266certificate_file/1 and key_file/1, and then either password/1 or 267pem_password_hook/1. If you require the other party to present a 268certificate as well, then peer_cert(true) should be provided. If the 269peer does not present a certificate, or the certificate cannot be 270validated as trusted, the connection will be rejected. 271 272By default, revocation is not checked. To enable certificate 273revocation checking, pass require_crl(true) when creating the SSL 274context. See \secref{crl} for more information about revocations. 275 276 277\section{Example code} 278\label{sec:ssl-examples} 279 280Examples of a simple server and client (\file{server.pl} and 281\file{client.pl} as well as a simple HTTPS server (\file{https.pl}) can 282be found in the example directory which is located in 283\file{doc/packages/examples/ssl} relative to the SWI-Prolog installation 284directory. The \file{etc} directory contains example certificate files 285as well as a \file{README} on the creation of certificates using OpenSSL 286tools. 287 288\subsection{Accessing an HTTPS server} 289\label{sec:ssl-https-client} 290 291Accessing an \verb$https://$ server can be achieved using the code 292skeleton below. The line \verb$:- use_module(library(http/http_ssl_plugin)).$ 293can actually be omitted because the plugin is dynamically loaded by 294http_open/3 if the \const{https}~scheme is detected. 295See \secref{ssl-security} for more information about security aspects. 296 297\begin{code} 298:- use_module(library(http/http_open)). 299:- use_module(library(http/http_ssl_plugin)). 300 301 ..., 302 http_open(HTTPS_url, In, []), 303 ... 304\end{code} 305 306 307\subsection{Creating an HTTPS server} 308\label{sec:ssl-https-server} 309 310The SWI-Prolog infrastructure provides two main ways to launch an 311HTTPS~server: 312 313\begin{itemize} 314\item Using \const{library(http/thread_httpd)}, the server is started 315 in HTTPS~mode by adding an option~\const{ssl/1} to 316 http_server/2. The argument of \const{ssl/1} is an option list that 317 is passed as the third argument to ssl_context/3. 318\item Using \const{library(http/http_unix_daemon)}, an HTTPS~server is 319 started by using the command line argument~\const{--https}. 320\end{itemize} 321 322Two items are typically specified as, respectively, options or 323additional command~line arguments: 324 325\begin{itemize} 326\item \textbf{server certificate}. This identifies the server and 327 acts as a \jargon{public key} for the encryption. 328\item \textbf{private key} of the server, which must be kept secret. 329 The key \textit{may} be protected by a password. If this is the 330 case, the server must provide the password by means of the 331 \const{password} option, the \const{pem_password_hook} callback 332 or, in case of the Unix daemon, via the \const{--pwfile} or 333 \const{--password} command line options. 334\end{itemize} 335 336Here is an example that uses the self-signed demo certificates 337distributed with the SSL package. As is typical for publicly 338accessible HTTPS~servers, this version does \textit{not} require a 339certificate from the~client: 340 341\begin{code} 342:- use_module(library(http/thread_httpd)). 343:- use_module(library(http/http_ssl_plugin)). 344 345https_server(Port, Options) :- 346 http_server(reply, 347 [ port(Port), 348 ssl([ certificate_file('etc/server/server-cert.pem'), 349 key_file('etc/server/server-key.pem'), 350 password("apenoot1") 351 ]) 352 | Options 353 ]). 354\end{code} 355 356There are two \jargon{hooks} that let you extend HTTPS servers with 357custom definitions: 358 359\begin{itemize} 360\item \texttt{http:ssl_server_create_hook(+SSL0, -SSL, +Options)}: 361 This extensible predicate is called exactly \textit{once}, after 362 creating an HTTPS~server with Options. If this predicate succeeds, 363 \texttt{SSL} is the~context that is used for negotiating all new 364 connections. Otherwise, \texttt{SSL0} is used, which is the context 365 that was created with the given options. 366\item \texttt{http:ssl_server_open_client_hook(+SSL0, -SSL, 367 +Options)}: This predicate is called before \textit{each} connection 368 that the server negotiates with a client. If this predicate 369 succeeds, \texttt{SSL} is the context that is used for the 370 new~connection. Otherwise, \texttt{SSL0} is~used, which is the 371 context that was created when launching the~server. 372\end{itemize} 373 374Important use cases of these hooks are running dual-stack RSA/ECDSA 375servers, updating certificates while the server keeps running, and 376tweaking SSL~parameters for connections. Use ssl_set_options/3 to 377create and configure copies of existing contexts in these hooks. 378 379The example file \file{https.pl} also provides a server that 380\textit{does} require the client to show its certificate. This 381provides an additional level of security, often used to allow a 382selected set of clients to perform sensitive tasks. 383 384Note that a single Prolog program can call http_server/2 with different 385parameters to provide services at several security levels as described 386below. These servers can either use their own dispatching or commonly 387use http_dispatch/1 and check the \const{port} property of the request 388to verify they are called with the desired security level. If a service 389is approached at a too low level of security, the handler can deny 390access or use HTTP redirect to send the client to to appropriate 391interface. 392 393\begin{itemize} 394 \item A plain HTTP server at port 80. This can either be used for 395 non-sensitive information or for \jargon{redirecting} to a more 396 secure service. 397 \item An HTTPS server at port 443 for sensitive services to the 398 general public. 399 \item An HTTPS server that demands for a client key on a selected 400 port for administrative tasks or sensitive machine-to-machine 401 communication. 402\end{itemize} 403 404 405\subsection{HTTPS behind a proxy} 406\label{sec:https-proxy} 407 408The above expects Prolog to be accessible directly from the internet. 409This is becoming more popular now that services are often deployed 410using \jargon{virtualization}. If the Prolog services are placed behind 411a reverse proxy, HTTPS implementation is the task of the proxy server 412(e.g., Apache or Nginx). The communication from the proxy server to the 413Prolog server can use either plain HTTP or HTTPS. As plain HTTP is 414easier to setup and faster, this is typically preferred if the network 415between the proxy server and Prolog server can be trusted. 416 417Note that the proxy server \emph{must} decrypt the HTTPS traffic because 418it must decide on the destination based on the encrypted HTTP header. 419\jargon{Port forwarding} provides another option to make a server 420running on a machine that is not directly connected to the internet 421visible. It is not needed to decrypt the traffic using port forwarding, 422but it is also not possible to realise \jargon{virtual hosts} or 423\jargon{path-based} proxy rules. 424 425Virtual hosts for HTTPS are available via \jargon{Server Name 426 Indication}~(SNI). This is a TLS extension that allows servers to 427host different domains from the same IP address. See the sni_hook/1 428option of ssl_context/3 for more information. 429 430\section{Compatibility of the API} 431\label{sec:compatibility} 432Previous versions of the library used plain Prolog terms to represent 433the certificate objects as lists of fields. Newer versions of the 434library preserve the raw underlying structures as opaque handles to 435allow for more complicated operations to be performed on them. Any old 436code which obtains fields from the certificate using memberchk/2 437should be modified to use certificate_field/2 instead. For example, 438\begin{code} 439 memberchk(subject(Subject), Certificate) 440\end{code} 441will instead need to be 442\begin{code} 443 cerficate_field(Certificate, subject(Subject)) 444\end{code} 445 446Note that some of the fields do not match up exactly with their 447previous counterparts (key is now public_key, for example). 448 449\section{Acknowledgments} 450\label{sec:ssl-acknowledgments} 451 452The development of the SWI-Prolog SSL interface has been sponsored by 453\href{http://www.sss.co.nz}{Scientific Software and Systems Limited}. 454The current version contains contributions from many people. Besides 455the mentioned authors, \href{https://www.metalevel.at}{Markus Triska} 456has submitted several patches, and improved and documented the 457integration of this package with the HTTP infrastructure. 458 459%\bibliographystyle{plain} 460%\bibliography{ssl} 461 462\printindex 463 464\end{document} 465 466