1\documentclass[11pt]{article}
2\usepackage{times}
3\usepackage{pl}
4\usepackage{html}
5\sloppy
6\makeindex
7
8\onefile
9\htmloutput{.}					% Output directory
10\htmlmainfile{ssl}				% Main document file
11\bodycolor{white}				% Page colour
12
13\begin{document}
14
15\title{SWI-Prolog SSL Interface}
16\author{\href{https://www.metalevel.at}{Markus Triska}, Jan van der Steen, Matt Lilley and Jan Wielemaker \\[5pt]
17        E-mail: \email{jan@swi-prolog.org}
18       }
19
20\maketitle
21
22\begin{abstract}
23The SWI-Prolog SSL (Secure Socket Layer) library implements a pair of
24\jargon{filtered streams} that realises an SSL encrypted connection on
25top of a pair of Prolog \jargon{wire} streams, typically a network
26socket. SSL provides public key based encryption and digitally signed
27identity information of the \jargon{peer}. The SSL library is well
28integrated with SWI-Prolog's HTTP library for both implementing HTTPS
29servers and communicating with HTTPS servers. It is also used by the
30\href{http://www.swi-prolog.org/pack/list?p=smtp}{smtp pack} for
31accessing secure mail agents. Plain SSL can be used to realise secure
32connections between e.g., Prolog agents.
33\end{abstract}
34
35\pagebreak
36\tableofcontents
37\pagebreak
38
39
40\section{Introduction}
41\label{sec:ssl-intro}
42
43Raw TCP/IP networking is dangerous for two reasons:
44
45\begin{enumerate}
46\item It is hard to tell whether the party you think you are talking
47  to is indeed the right one.
48\item  Anyone with access to a subnet through which your data flows can
49`tap' the wire and listen for sensitive information such as passwords,
50credit card numbers, etc.
51\end{enumerate}
52
53Transport Layer Security~(TLS) and its predecessor Secure Socket
54Layer~(SSL), which are both often collectively called~SSL, solve both
55problems. SSL~uses:
56
57\begin{itemize}
58\item certificates to establish the \textit{identity} of the peer
59\item \textit{encryption} to make it useless to tap into the wire.
60\end{itemize}
61
62SSL allows agents to talk in private and create secure web services.
63
64The SWI-Prolog \pllib{ssl} library provides an API to turn a pair of
65arbitrary Prolog \jargon{wire} streams into SSL powered encrypted
66streams. Note that secure protocols such as secure HTTP simply run the
67plain protocol over (SSL) encrypted streams.
68
69The \pllib{crypto} library provides additional predicates related to
70cryptography and authentication, secure hashes and elliptic~curves.
71
72Cryptography is a difficult topic. If you just want to download
73documents from an HTTPS server without worrying much about security,
74http_open/3 will do the job for you. As soon as you have higher security
75demands we strongly recommend you to read enough background material to
76understand what you are doing. See \secref{ssl-security} for some
77remarks regarding this implementation. This
78\href{http://www.tldp.org/HOWTO/SSL-Certificates-HOWTO/index.html}{The
79Linux Documentation Project page} provides some additional background
80and tips for managing certificates and keys.
81
82
83\input{ssllib.tex}
84
85\input{crypto.tex}
86
87\section{XML cryptographic libraries}
88\label{sec:ssl-xml-libs}
89
90The SSL package provides several libraries dealing with cryptographic
91operations of XML documents. These libraries depend on the \const{sgml}
92package. These libraries are part of this package because the
93\const{sgml} package has no external dependencies and will thus be
94available in any SWI-Prolog installation while configuring and building
95this \const{ssl} package is much more involved.
96
97\input{saml.tex}
98\input{xmlenc.tex}
99\input{xmldsig.tex}
100
101\section{SSL Security}
102\label{sec:ssl-security}
103
104Using SSL (in this particular case based on the OpenSSL implementation)
105to connect to SSL services (e.g., an \verb$https://$ address) easily
106gives a false sense of security. This section explains some of the
107pitfalls.\footnote{We do not claim to be complete, just to start warning
108you if security is important to you. Please make sure you
109understand (Open)SSL before relying on it.}. As stated in the
110introduction, SSL aims at solving two issues: tapping information from
111the wire by means of encryption and make sure that you are talking to
112the right address.
113
114Encryption is generally well arranged as long as you ensure that the
115underlying SSL library has all known security patches installed and you
116use an encryption that is not known to be weak. The Windows version and
117MacOS binaries of SWI-Prolog ships with its own binary of the OpenSSL
118library. Ensure this is up-to-date. On systems that ship with the
119OpenSSL library SWI-Prolog uses the system version. This applies notably
120for all Linux packages. Check the origin and version of the OpenSSL
121libraries and verify there are no more recent security patches regularly
122if security is important to you. The OpenSSL library version as reported
123by SSLeay_version() is available in the Prolog flag
124\const{ssl_library_version} as illustrated below on Ubuntu 14.04.
125
126\begin{code}
127?- [library(ssl)].
128?- current_prolog_flag(ssl_library_version, X).
129X = 'OpenSSL 1.0.1f 6 Jan 2014'.
130\end{code}
131
132Whether you are talking to the right address is a complicated
133issue. The core of the validation is that the server provides a
134\jargon{certificate} that identifies the server. This certificate is
135digitally \jargon{signed} by another certificate, and ultimately by a
136\jargon{root certificate}. (There may be additional links in this
137chain as well, or there may just be one certificate signed by itself)
138Verifying the peer implies:
139
140\begin{enumerate}
141    \item Verifying the chain or digital signatures until a trusted
142    root certificate is found, taking care that the chain does not
143    contain any invalid certificates, such as certificates which have
144    expired, are not yet valid, have altered or forged signatures,
145    are valid for the purposes of SSL (and in the case of an issuer,
146    issuing child certificates)
147    \item Verifying that the signer of a certificate did not \jargon{revoke}
148    the signed certificate.
149    \item Verifying that the host we connected to is indeed the host
150    claimed in the certificate.
151\end{enumerate}
152
153The default https client plugin (\pllib{http/http_ssl_plugin})
154registers the system trusted root certificate with OpenSSL. This is
155achieved using the option
156\term{cacerts}{[system(root_certificates)]} of ssl_context/3. The
157verification is left to OpenSSL. To the best of our knowledge, the
158current (1.0) version of OpenSSL \textbf{only} implements step (1) of
159the verification process outlined above. This implies that an attacker
160that can control DNS mapping (host name to IP) or routing (IP to
161physical machine) can fake to be a secure host as long as they manage
162to obtain a certificate that is signed from a recognised
163authority. Version 1.0.2 supports hostname checking, and will not
164validate a certificate chain if the leaf certificate does not match
165the hostname. 'Match' here is not a simple string comparison;
166certificates are allowed (subject to many rules) to have wildcards in
167their SubjectAltName field. Care must also be taken to ensure that the
168name we are checking against does not contain embedded NULLs. If
169SWI-Prolog is compiled against a version of OpenSSL that does NOT have
170hostname checking (ie 1.0.0 or earlier), it will attempt to do the
171validation itself. This is not guaranteed to be perfect, and it only
172supports a small subset of allowed wildcards. If security is
173important, use OpenSSL 1.0.2 or higher.
174
175After validation, the predicate ssl_peer_certificate/2 can be used to
176obtain the peer certificate and inspect its properties.
177
178\section{CRLs and Revocation}
179\label{sec:crl}
180Certificates must sometimes be revoked. Unfortunately this means that
181the elegant chain-of-trust model breaks down, since the information
182you need to determine whether a certificate is trustworthy no longer
183depends on just the certificate and whether the issuer is trustworthy,
184but now on a third piece of data - whether the certificate has been
185revoked. These are managed in two ways in OpenSSL: CRLs and
186OCSP. SWI-Prolog supports CRLs only. (Typically OCSP responders are
187configured in such a way as to just consult CRLs anyway. This gives
188the illusion of up-to-the-minute revocation information because OCSP
189is an interactive, online, real-time protocol. However the information
190provided can still be several \emph{weeks} out of date!)
191
192To do CRL checking, pass require_crl(true) as an option to the
193ssl_context/3 (or http_open/3) option list. If you do this, a
194certificate will not be validated unless it can be \emph{checked} for
195on a revocation list. There are two options for this:
196
197First, you can pass a list of filenames in as the option crl/1. If the
198CRL corresponds to an issuer in the chain, and the issued certificate
199is not on the CRL, then it is assumed to not be revoked. Note that
200this does NOT prove the certificate is actually trustworthy - the CRL
201you pass may be out of date! This is quite awkward to get right, since
202you do not necessarily know in advance what the chain of certificates
203the other party will present are, so you cannot reasonably be expected
204to know which CRLs to pass in.
205
206Secondly, you can handle the CRL checking in the cert_verify_hook when
207the Error is bound to unknown_crl. At this point you can obtain the
208issuer certificate (also given in the hook), find the CRL distribution
209point on it (the crl/1 argument), try downloading the CRL (the URL can
210have literally any protocol, most commonly HTTP and LDAP, but
211theoretically anything else, too, including the possibility that the
212certificate has no CRL distribution point given, and you are expected
213to obtain the CRL by email, fax, or telegraph. Therefore how to
214actually obtain a CRL is out of scope of this document), load it
215using load_crl/2, then check to see whether the certificate currently
216under scrutiny appears in the list of revocations. It is up to the
217application to determine what to do if the CRL cannot be obtained -
218either because the protocol to obtain it is not supported or because
219the place you are obtaining it from is not responding. Just because
220the CRL server is not responding does not mean that your certificate
221is safe, of course - it has been suggested that an ideal way to extend
222the life of a stolen certificate key would be to force a denial of
223service of the CRL server.
224
225
226\subsubsection{Disabling certificate checking}
227\label{sec:disable-certificate}
228
229In some cases clients are not really interested in host validation of
230the peer and whether or not the certificate can be trusted.  In these
231cases the client can pass \term{cert_verify_hook}{cert_accept_any},
232calling cert_accept_any/5 which accepts any certificate. Note that
233this will accept literally ANY certificate presented - including ones
234which have expired, have been revoked, and have forged
235signatures. This is probably not a good idea!
236
237
238\subsubsection{Establishing a safe connection}
239\label{sec:ssl-safe-connection}
240
241Applications that exchange sensitive data with e.g., a backend server
242typically need to ensure they have a secure connection to their
243peer. To do this, first obtain a non-secure connection to the peer (eg
244via a TCP socket connection). Then create an SSL context via
245ssl_context/3. For the client initiating the connection, the role is
246'client', and you should pass options host/1 and cacerts/1
247at the very least. If you expect the peer to have a certificate which
248would be accepted by your host system, you can pass
249\term{cacerts}{[system(root_certificates)]}, otherwise you will need
250a copy of the CA certificate which was used to sign the peer's
251certificate. Alternatively, you can pass cert_verify_hook/1 to write
252your own custom validation for the peer's certificate. Depending on
253the requirements, you may also have to provide your /own/ certificate
254if the peer demands mutual authentication. This is done via the
255certificate_file/1, key_file/1 and either password/1 or
256pem_password_hook/1.
257
258Once you have the SSL context and the non-secure stream, you can call
259ssl_negotiate/5 to obtain a secure stream. ssl_negotiate/5 will raise
260an exception if there were any certificate errors that could not be
261resolved.
262
263The peer behaves in a symmetric fashion: First, a non-secure
264connection is obtained, and a context is created using ssl_context/3
265with the role set to server. In the server case, you must provide
266certificate_file/1 and key_file/1, and then either password/1 or
267pem_password_hook/1. If you require the other party to present a
268certificate as well, then peer_cert(true) should be provided. If the
269peer does not present a certificate, or the certificate cannot be
270validated as trusted, the connection will be rejected.
271
272By default, revocation is not checked. To enable certificate
273revocation checking, pass require_crl(true) when creating the SSL
274context. See \secref{crl} for more information about revocations.
275
276
277\section{Example code}
278\label{sec:ssl-examples}
279
280Examples of a simple server and client (\file{server.pl} and
281\file{client.pl} as well as a simple HTTPS server (\file{https.pl}) can
282be found in the example directory which is located in
283\file{doc/packages/examples/ssl} relative to the SWI-Prolog installation
284directory. The \file{etc} directory contains example certificate files
285as well as a \file{README} on the creation of certificates using OpenSSL
286tools.
287
288\subsection{Accessing an HTTPS server}
289\label{sec:ssl-https-client}
290
291Accessing an \verb$https://$ server can be achieved using the code
292skeleton below. The line \verb$:- use_module(library(http/http_ssl_plugin)).$
293can actually be omitted because the plugin is dynamically loaded by
294http_open/3 if the \const{https}~scheme is detected.
295See \secref{ssl-security} for more information about security aspects.
296
297\begin{code}
298:- use_module(library(http/http_open)).
299:- use_module(library(http/http_ssl_plugin)).
300
301    ...,
302    http_open(HTTPS_url, In, []),
303    ...
304\end{code}
305
306
307\subsection{Creating an HTTPS server}
308\label{sec:ssl-https-server}
309
310The SWI-Prolog infrastructure provides two main ways to launch an
311HTTPS~server:
312
313\begin{itemize}
314\item Using \const{library(http/thread_httpd)}, the server is started
315  in HTTPS~mode by adding an option~\const{ssl/1} to
316  http_server/2. The argument of \const{ssl/1} is an option list that
317  is passed as the third argument to ssl_context/3.
318\item Using \const{library(http/http_unix_daemon)}, an HTTPS~server is
319  started by using the command line argument~\const{--https}.
320\end{itemize}
321
322Two items are typically specified as, respectively, options or
323additional command~line arguments:
324
325\begin{itemize}
326\item \textbf{server certificate}. This identifies the server and
327  acts as a \jargon{public key} for the encryption.
328\item \textbf{private key} of the server, which must be kept secret.
329  The key \textit{may} be protected by a password. If this is the
330  case, the server must provide the password by means of the
331  \const{password} option, the \const{pem_password_hook} callback
332  or, in case of the Unix daemon, via the \const{--pwfile} or
333  \const{--password} command line options.
334\end{itemize}
335
336Here is an example that uses the self-signed demo certificates
337distributed with the SSL package. As is typical for publicly
338accessible HTTPS~servers, this version does \textit{not} require a
339certificate from the~client:
340
341\begin{code}
342:- use_module(library(http/thread_httpd)).
343:- use_module(library(http/http_ssl_plugin)).
344
345https_server(Port, Options) :-
346        http_server(reply,
347                    [ port(Port),
348                      ssl([ certificate_file('etc/server/server-cert.pem'),
349                            key_file('etc/server/server-key.pem'),
350                            password("apenoot1")
351                          ])
352                    | Options
353                    ]).
354\end{code}
355
356There are two \jargon{hooks} that let you extend HTTPS servers with
357custom definitions:
358
359\begin{itemize}
360\item \texttt{http:ssl_server_create_hook(+SSL0, -SSL, +Options)}:
361  This extensible predicate is called exactly \textit{once}, after
362  creating an HTTPS~server with Options. If this predicate succeeds,
363  \texttt{SSL} is the~context that is used for negotiating all new
364  connections.  Otherwise, \texttt{SSL0} is used, which is the context
365  that was created with the given options.
366\item \texttt{http:ssl_server_open_client_hook(+SSL0, -SSL,
367  +Options)}: This predicate is called before \textit{each} connection
368  that the server negotiates with a client. If this predicate
369  succeeds, \texttt{SSL} is the context that is used for the
370  new~connection.  Otherwise, \texttt{SSL0} is~used, which is the
371  context that was created when launching the~server.
372\end{itemize}
373
374Important use cases of these hooks are running dual-stack RSA/ECDSA
375servers, updating certificates while the server keeps running, and
376tweaking SSL~parameters for connections. Use ssl_set_options/3 to
377create and configure copies of existing contexts in these hooks.
378
379The example file \file{https.pl} also provides a server that
380\textit{does} require the client to show its certificate. This
381provides an additional level of security, often used to allow a
382selected set of clients to perform sensitive tasks.
383
384Note that a single Prolog program can call http_server/2 with different
385parameters to provide services at several security levels as described
386below. These servers can either use their own dispatching or commonly
387use http_dispatch/1 and check the \const{port} property of the request
388to verify they are called with the desired security level. If a service
389is approached at a too low level of security, the handler can deny
390access or use HTTP redirect to send the client to to appropriate
391interface.
392
393\begin{itemize}
394    \item A plain HTTP server at port 80.  This can either be used for
395    non-sensitive information or for \jargon{redirecting} to a more
396    secure service.
397    \item An HTTPS server at port 443 for sensitive services to the
398    general public.
399    \item An HTTPS server that demands for a client key on a selected
400    port for administrative tasks or sensitive machine-to-machine
401    communication.
402\end{itemize}
403
404
405\subsection{HTTPS behind a proxy}
406\label{sec:https-proxy}
407
408The above expects Prolog to be accessible directly from the internet.
409This is becoming more popular now that services are often deployed
410using \jargon{virtualization}. If the Prolog services are placed behind
411a reverse proxy, HTTPS implementation is the task of the proxy server
412(e.g., Apache or Nginx). The communication from the proxy server to the
413Prolog server can use either plain HTTP or HTTPS. As plain HTTP is
414easier to setup and faster, this is typically preferred if the network
415between the proxy server and Prolog server can be trusted.
416
417Note that the proxy server \emph{must} decrypt the HTTPS traffic because
418it must decide on the destination based on the encrypted HTTP header.
419\jargon{Port forwarding} provides another option to make a server
420running on a machine that is not directly connected to the internet
421visible. It is not needed to decrypt the traffic using port forwarding,
422but it is also not possible to realise \jargon{virtual hosts} or
423\jargon{path-based} proxy rules.
424
425Virtual hosts for HTTPS are available via \jargon{Server Name
426  Indication}~(SNI). This is a TLS extension that allows servers to
427host different domains from the same IP address. See the sni_hook/1
428option of ssl_context/3 for more information.
429
430\section{Compatibility of the API}
431\label{sec:compatibility}
432Previous versions of the library used plain Prolog terms to represent
433the certificate objects as lists of fields. Newer versions of the
434library preserve the raw underlying structures as opaque handles to
435allow for more complicated operations to be performed on them. Any old
436code which obtains fields from the certificate using memberchk/2
437should be modified to use certificate_field/2 instead. For example,
438\begin{code}
439  memberchk(subject(Subject), Certificate)
440\end{code}
441will instead need to be
442\begin{code}
443  cerficate_field(Certificate, subject(Subject))
444\end{code}
445
446Note that some of the fields do not match up exactly with their
447previous counterparts (key is now public_key, for example).
448
449\section{Acknowledgments}
450\label{sec:ssl-acknowledgments}
451
452The development of the SWI-Prolog SSL interface has been sponsored by
453\href{http://www.sss.co.nz}{Scientific Software and Systems Limited}.
454The current version contains contributions from many people.  Besides
455the mentioned authors, \href{https://www.metalevel.at}{Markus Triska}
456has submitted several patches, and improved and documented the
457integration of this package with the HTTP infrastructure.
458
459%\bibliographystyle{plain}
460%\bibliography{ssl}
461
462\printindex
463
464\end{document}
465
466