1# OpenLDAP: pkg/openldap-guide/admin/intro.sdf,v 1.45.2.9 2010/04/13 20:22:33 kurt Exp
2# Copyright 1999-2010 The OpenLDAP Foundation, All Rights Reserved.
3# COPYING RESTRICTIONS APPLY, see COPYRIGHT.
4H1: Introduction to OpenLDAP Directory Services
5
6This document describes how to build, configure, and operate
7{{PRD:OpenLDAP}} Software to provide directory services.  This
8includes details on how to configure and run the Standalone
9{{TERM:LDAP}} Daemon, {{slapd}}(8).  It is intended for new and
10experienced administrators alike.  This section provides a basic
11introduction to directory services and, in particular, the directory
12services provided by {{slapd}}(8).  This introduction is only
13intended to provide enough information so one might get started
14learning about {{TERM:LDAP}}, {{TERM:X.500}}, and directory services.
15
16
17H2: What is a directory service?
18
19A directory is a specialized database specifically designed for
searching and browsing, in addition to supporting basic lookup
21and update functions.
22
23Note: A directory is defined by some as merely a database optimized
24for read access.  This definition, at best, is overly simplistic.
25
26Directories tend to contain descriptive, attribute-based information
27and support sophisticated filtering capabilities.  Directories
28generally do not support complicated transaction or roll-back schemes
29found in database management systems designed for handling high-volume
30complex updates.  Directory updates are typically simple all-or-nothing
31changes, if they are allowed at all.  Directories are generally
32tuned to give quick response to high-volume lookup or search
33operations. They may have the ability to replicate information
34widely in order to increase availability and reliability, while
35reducing response time.  When directory information is replicated,
36temporary inconsistencies between the replicas may be okay, as long
37as inconsistencies are resolved in a timely manner.
38
39There are many different ways to provide a directory service.
40Different methods allow different kinds of information to be stored
41in the directory, place different requirements on how that information
42can be referenced, queried and updated, how it is protected from
43unauthorized access, etc.  Some directory services are {{local}},
44providing service to a restricted context (e.g., the finger service
45on a single machine). Other services are global, providing service
46to a much broader context (e.g., the entire Internet).  Global
47services are usually {{distributed}}, meaning that the data they
48contain is spread across many machines, all of which cooperate to
49provide the directory service. Typically a global service defines
50a uniform {{namespace}} which gives the same view of the data no
51matter where you are in relation to the data itself.
52
53A web directory, such as provided by the {{Open Directory Project}}
54<{{URL:http://dmoz.org}}>, is a good example of a directory service.
55These services catalog web pages and are specifically designed to
56support browsing and searching.
57
While some consider the Internet {{TERM[expand]DNS}} (DNS) to be an
example of a globally distributed directory service, DNS is neither
browsable nor searchable.  It is more properly described as a
globally distributed {{lookup}} service.
62
63
64H2: What is LDAP?
65
66{{TERM:LDAP}} stands for {{TERM[expand]LDAP}}.  As the name suggests,
67it is a lightweight protocol for accessing directory services,
68specifically {{TERM:X.500}}-based directory services.  LDAP runs
69over {{TERM:TCP}}/{{TERM:IP}} or other connection oriented transfer
services.  LDAP is an {{ORG:IETF}} Standards Track protocol and is
71specified in "Lightweight Directory Access Protocol (LDAP) Technical
72Specification Road Map" {{REF:RFC4510}}.
73
74This section gives an overview of LDAP from a user's perspective.
75
76{{What kind of information can be stored in the directory?}} The
77LDAP information model is based on {{entries}}. An entry is a
78collection of attributes that has a globally-unique {{TERM[expand]DN}}
79(DN).  The DN is used to refer to the entry unambiguously. Each of
80the entry's attributes has a {{type}} and one or more {{values}}.
81The types are typically mnemonic strings, like "{{EX:cn}}" for
common name, or "{{EX:mail}}" for email address. The syntax of
values depends on the attribute type.  For example, a {{EX:cn}}
84attribute might contain the value {{EX:Babs Jensen}}.  A {{EX:mail}}
85attribute might contain the value "{{EX:babs@example.com}}". A
86{{EX:jpegPhoto}} attribute would contain a photograph in the
87{{TERM:JPEG}} (binary) format.
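
The information model described above can be pictured with a small
in-memory sketch.  This is only an illustration in Python, not how
{{slapd}}(8) stores entries; the dictionary layout and the helper name
{{EX:get_values}} are assumptions made for this example, and the
{{EX:jpegPhoto}} bytes are a placeholder rather than a real image.

```python
# A minimal in-memory model of one LDAP entry (illustrative only).
# Each attribute type maps to a list, because an attribute may carry
# one or more values.
entry = {
    "dn": "uid=babs,ou=People,dc=example,dc=com",
    "attributes": {
        "cn": ["Babs Jensen", "Barbara Jensen"],
        "mail": ["babs@example.com"],
        # Binary value; these bytes are a placeholder, not a real photo.
        "jpegPhoto": [b"\xff\xd8\xff\xe0"],
    },
}


def get_values(entry, attr_type):
    """Return all values of an attribute type, or an empty list.

    Attribute type names are compared case-insensitively in this sketch.
    """
    wanted = attr_type.lower()
    for name, values in entry["attributes"].items():
        if name.lower() == wanted:
            return list(values)
    return []


print(get_values(entry, "CN"))
```

The sketch keeps every attribute as a list even when only one value is
present, which mirrors the "one or more values" wording above.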
88
89{{How is the information arranged?}} In LDAP, directory entries
90are arranged in a hierarchical tree-like structure.  Traditionally,
91this structure reflected the geographic and/or organizational
92boundaries.  Entries representing countries appear at the top of
93the tree. Below them are entries representing states and national
94organizations. Below them might be entries representing organizational
95units, people, printers, documents, or just about anything else
96you can think of.  Figure 1.1 shows an example LDAP directory tree
97using traditional naming.
98
99!import "intro_tree.png"; align="center"; \
100	title="LDAP directory tree (traditional naming)"
101FT[align="Center"] Figure 1.1: LDAP directory tree (traditional naming)
102
103The tree may also be arranged based upon Internet domain names.
This naming approach is becoming increasingly popular as it allows
105for directory services to be located using the {{DNS}}.
106Figure 1.2 shows an example LDAP directory tree using domain-based
107naming.
108
109!import "intro_dctree.png"; align="center"; \
110	title="LDAP directory tree (Internet naming)"
111FT[align="Center"] Figure 1.2: LDAP directory tree (Internet naming)
112
113In addition, LDAP allows you to control which attributes are required
114and allowed in an entry through the use of a special attribute
115called {{EX:objectClass}}.  The values of the {{EX:objectClass}}
116attribute determine the {{schema}} rules the entry must obey.
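
As a rough sketch of how {{EX:objectClass}} values drive these rules,
the following Python fragment checks an entry against the {{EX:person}}
object class, which requires {{EX:sn}} and {{EX:cn}} (together with
{{EX:objectClass}} itself) and allows {{EX:userPassword}},
{{EX:telephoneNumber}}, {{EX:seeAlso}} and {{EX:description}}.  The
{{EX:SCHEMA}} table and the {{EX:check_entry}} function are hypothetical
names for this example; real schema checking also handles inheritance,
case-insensitive names, and value syntaxes, which are left out here.

```python
# Hypothetical, simplified schema table: object class -> required/allowed types.
SCHEMA = {
    "person": {
        "must": {"objectClass", "sn", "cn"},
        "may": {"userPassword", "telephoneNumber", "seeAlso", "description"},
    },
}


def check_entry(attributes):
    """Return a list of schema violations for one entry (empty if none)."""
    required, allowed = set(), set()
    for object_class in attributes.get("objectClass", []):
        rules = SCHEMA[object_class]  # unknown classes raise KeyError here
        required |= rules["must"]
        allowed |= rules["must"] | rules["may"]
    present = set(attributes)
    problems = [f"missing required attribute: {a}"
                for a in sorted(required - present)]
    problems += [f"attribute not allowed: {a}"
                 for a in sorted(present - allowed)]
    return problems


good = {"objectClass": ["person"], "cn": ["Babs Jensen"], "sn": ["Jensen"]}
bad = {"objectClass": ["person"], "cn": ["Babs Jensen"],
       "mail": ["babs@example.com"]}
print(check_entry(good))
print(check_entry(bad))
```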
117
118{{How is the information referenced?}} An entry is referenced by
119its distinguished name, which is constructed by taking the name of
120the entry itself (called the {{TERM[expand]RDN}} or RDN) and
121concatenating the names of its ancestor entries. For example, the
122entry for Barbara Jensen in the Internet naming example above has
123an RDN of {{EX:uid=babs}} and a DN of
124{{EX:uid=babs,ou=People,dc=example,dc=com}}. The full DN format is
125described in {{REF:RFC4514}}, "LDAP: String Representation of
126Distinguished Names."
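
The construction can be sketched in a few lines of Python.  The helper
names {{EX:build_dn}} and {{EX:parent_dn}} are made up for this example,
and the code deliberately ignores the escaping rules of {{REF:RFC4514}}
(for instance, a comma inside a value), so it should be read as an
illustration rather than a DN parser.

```python
def build_dn(rdn, ancestor_rdns):
    """Join an entry's RDN with its ancestors' RDNs, nearest ancestor first."""
    return ",".join([rdn] + list(ancestor_rdns))


def parent_dn(dn):
    """Return the parent entry's DN (naive: assumes no escaped commas)."""
    _, _, parent = dn.partition(",")
    return parent


dn = build_dn("uid=babs", ["ou=People", "dc=example", "dc=com"])
print(dn)             # uid=babs,ou=People,dc=example,dc=com
print(parent_dn(dn))  # ou=People,dc=example,dc=com
```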
127
128{{How is the information accessed?}} LDAP defines operations for
129interrogating and updating the directory.  Operations are provided
130for adding and deleting an entry from the directory, changing an
131existing entry, and changing the name of an entry. Most of the
132time, though, LDAP is used to search for information in the directory.
133The LDAP search operation allows some portion of the directory to
134be searched for entries that match some criteria specified by a
135search filter. Information can be requested from each entry that
136matches the criteria.
137
138For example, you might want to search the entire directory subtree
139at and below {{EX:dc=example,dc=com}} for people with the name
140{{EX:Barbara Jensen}}, retrieving the email address of each entry
141found. LDAP lets you do this easily.  Or you might want to search
142the entries directly below the {{EX:st=California,c=US}} entry for
143organizations with the string {{EX:Acme}} in their name, and that
144have a fax number. LDAP lets you do this too. The next section
145describes in more detail what you can do with LDAP and how it might
146be useful to you.
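
The two searches just described can be mimicked with a toy in-memory
directory.  This Python sketch is an assumption-laden illustration: the
sample entries, the fax number, and the {{EX:search}} and
{{EX:in_scope}} helpers are invented for the example, filters are
written as Python functions instead of LDAP filter strings, and matching
is case-sensitive, unlike most real LDAP matching rules.  The comments
show the LDAP filters the predicates stand in for.

```python
# Toy directory: DN -> attributes.  All data here is made up.
DIRECTORY = {
    "dc=example,dc=com": {"dc": ["example"]},
    "ou=People,dc=example,dc=com": {"ou": ["People"]},
    "uid=babs,ou=People,dc=example,dc=com": {
        "cn": ["Barbara Jensen", "Babs Jensen"],
        "mail": ["babs@example.com"],
    },
    "st=California,c=US": {"st": ["California"]},
    "o=Acme Widgets,st=California,c=US": {
        "o": ["Acme Widgets"],
        "facsimileTelephoneNumber": ["+1 555 0100"],
    },
    "o=Acme Tools,st=California,c=US": {"o": ["Acme Tools"]},
}


def in_scope(dn, base, scope):
    """Naive scope test on DN strings (no escaping or normalization)."""
    if scope == "base":
        return dn == base
    if scope == "sub":
        return dn == base or dn.endswith("," + base)
    if scope == "one":
        if not dn.endswith("," + base):
            return False
        # Exactly one RDN must precede the base DN.
        return "," not in dn[: -(len(base) + 1)]
    raise ValueError(f"unknown scope: {scope}")


def search(base, scope, matches, attrs):
    """Return (dn, requested attributes) for each matching entry in scope."""
    results = []
    for dn, entry in DIRECTORY.items():
        if in_scope(dn, base, scope) and matches(entry):
            results.append((dn, {a: entry[a] for a in attrs if a in entry}))
    return results


# Subtree search, filter (cn=Barbara Jensen), returning mail.
people = search("dc=example,dc=com", "sub",
                lambda e: "Barbara Jensen" in e.get("cn", []), ["mail"])

# One-level search, filter (&(o=*Acme*)(facsimileTelephoneNumber=*)).
orgs = search("st=California,c=US", "one",
              lambda e: any("Acme" in v for v in e.get("o", []))
              and "facsimileTelephoneNumber" in e, ["o"])

print(people)
print(orgs)
```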
147
148{{How is the information protected from unauthorized access?}} Some
149directory services provide no protection, allowing anyone to see
150the information. LDAP provides a mechanism for a client to authenticate,
151or prove its identity to a directory server, paving the way for
152rich access control to protect the information the server contains.
153LDAP also supports data security (integrity and confidentiality)
154services.
155
156
157H2: When should I use LDAP?
158
In general, you should use a directory server when you require data to be
centrally managed, stored, and accessible via standards-based methods.

Common examples found throughout the industry include, but are not limited to:
164
165* Machine Authentication
166* User Authentication
167* User/System Groups
168* Address book
169* Organization Representation
170* Asset Tracking
171* Telephony Information Store
172* User resource management
173* E-mail address lookups
174* Application Configuration store
175* PBX Configuration store
176* etc.....
177
178There are various {{SECT:Distributed Schema Files}} that are standards based, but
179you can always create your own {{SECT:Schema Specification}}.
180
There are always new ways to use a directory and apply LDAP principles to address
particular problems, so there is no simple answer to this question.

If in doubt, join the general LDAP forum for non-commercial discussions and
information relating to LDAP at
{{URL:http://www.umich.edu/~dirsvcs/ldap/mailinglist.html}} and ask.
187
188H2: When should I not use LDAP?
189
When you find yourself bending the directory to do what you require,
a redesign may be needed. LDAP may also be the wrong choice if only one
application needs to use and manipulate your data (for a discussion of LDAP vs
RDBMS, please read the {{SECT:LDAP vs RDBMS}} section).
194
195It will become obvious when LDAP is the right tool for the job.
196
197
198H2: How does LDAP work?
199
200LDAP utilizes a {{client-server model}}. One or more LDAP servers
201contain the data making up the directory information tree ({{TERM:DIT}}).
The client connects to a server and asks it a question.  The server
203responds with an answer and/or with a pointer to where the client
204can get additional information (typically, another LDAP server).
205No matter which LDAP server a client connects to, it sees the same
206view of the directory; a name presented to one LDAP server references
207the same entry it would at another LDAP server.  This is an important
208feature of a global directory service.
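
The "answer or pointer" behaviour can be sketched as follows.  This
Python fragment simulates two servers with plain dictionaries instead of
network connections; the server URLs, the {{EX:SERVERS}} table, and the
{{EX:ask}} and {{EX:lookup}} helpers are all hypothetical and exist only
to show a client following a pointer from one server to another.

```python
# Two simulated servers.  Each holds some entries and may point to another
# server for a subtree it does not hold itself.  All names are made up.
SERVERS = {
    "ldap://a.example.com": {
        "entries": {"dc=example,dc=com": {"dc": ["example"]}},
        "referrals": {"ou=People,dc=example,dc=com": "ldap://b.example.com"},
    },
    "ldap://b.example.com": {
        "entries": {
            "uid=babs,ou=People,dc=example,dc=com": {"cn": ["Barbara Jensen"]},
        },
        "referrals": {},
    },
}


def ask(server_url, dn):
    """Ask one simulated server for an entry: an answer, a pointer, or neither."""
    server = SERVERS[server_url]
    if dn in server["entries"]:
        return ("answer", server["entries"][dn])
    for subtree, target in server["referrals"].items():
        if dn == subtree or dn.endswith("," + subtree):
            return ("referral", target)
    return ("noSuchObject", None)


def lookup(start_url, dn, max_hops=5):
    """Follow pointers between servers until one of them answers."""
    url = start_url
    for _ in range(max_hops):
        kind, payload = ask(url, dn)
        if kind != "referral":
            return kind, payload
        url = payload  # the pointer names the next server to ask
    raise RuntimeError("too many referrals")


babs = "uid=babs,ou=People,dc=example,dc=com"
print(lookup("ldap://a.example.com", babs))
print(lookup("ldap://b.example.com", babs))
```

Whichever server the client starts from, the same name leads to the same
entry, which is the property the paragraph above describes.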
209
210
211H2: What about X.500?
212
213Technically, {{TERM:LDAP}} is a directory access protocol to an
214{{TERM:X.500}} directory service, the {{TERM:OSI}} directory service.
215Initially, LDAP clients accessed gateways to the X.500 directory service.
216This gateway ran LDAP between the client and gateway and X.500's
217{{TERM[expand]DAP}} ({{TERM:DAP}}) between the gateway and the
218X.500 server.  DAP is a heavyweight protocol that operates over a
219full OSI protocol stack and requires a significant amount of
220computing resources.  LDAP is designed to operate over
221{{TERM:TCP}}/{{TERM:IP}} and provides most of the functionality of
222DAP at a much lower cost.
223
While LDAP is still used to access X.500 directory services via
gateways, LDAP is now more commonly implemented directly in X.500
servers.
227
228The Standalone LDAP Daemon, or {{slapd}}(8), can be viewed as a
{{lightweight}} X.500 directory server.  That is, it does not
implement X.500's DAP, nor does it support the complete X.500
models.
232
If you are already running an X.500 DAP service and you want to
234continue to do so, you can probably stop reading this guide.  This
235guide is all about running LDAP via {{slapd}}(8), without running
236X.500 DAP.  If you are not running X.500 DAP, want to stop running
237X.500 DAP, or have no immediate plans to run X.500 DAP, read on.
238
239It is possible to replicate data from an LDAP directory server to
an X.500 DAP {{TERM:DSA}}.  This requires an LDAP/DAP gateway.
241OpenLDAP Software does not include such a gateway.
242
243
244H2: What is the difference between LDAPv2 and LDAPv3?
245
LDAPv3 was developed in the late 1990s to replace LDAPv2.
247LDAPv3 adds the following features to LDAP:
248
249 * Strong authentication and data security services via {{TERM:SASL}}
250 * Certificate authentication and data security services via {{TERM:TLS}} (SSL)
251 * Internationalization through the use of Unicode
252 * Referrals and Continuations
253 * Schema Discovery
254 * Extensibility (controls, extended operations, and more)
255
256LDAPv2 is historic ({{REF:RFC3494}}).  As most {{so-called}} LDAPv2
257implementations (including {{slapd}}(8)) do not conform to the
258LDAPv2 technical specification, interoperability amongst
259implementations claiming LDAPv2 support is limited.  As LDAPv2
260differs significantly from LDAPv3, deploying both LDAPv2 and LDAPv3
261simultaneously is quite problematic.  LDAPv2 should be avoided.
262LDAPv2 is disabled by default.
263
264
265H2: LDAP vs RDBMS
266
This question is raised many times, in different forms. The most common,
however, is: {{Why doesn't OpenLDAP drop Berkeley DB and use a relational
database management system (RDBMS) instead?}} In general, the expectation is that
the sophisticated algorithms implemented by a commercial-grade RDBMS would make
{{OpenLDAP}} faster or somehow better while, at the same time, permitting
data to be shared with other applications.
273
274The short answer is that use of an embedded database and custom indexing system
275allows OpenLDAP to provide greater performance and scalability without loss of
276reliability. OpenLDAP uses Berkeley DB concurrent / transactional
277database software. This is the same software used by leading commercial
278directory software.
279
Now for the long answer. We are all regularly confronted with the choice
between RDBMSes and directories. It is a hard choice and no simple answer exists.

It is tempting to think that having an RDBMS backend to the directory solves all
problems. However, such a backend performs poorly because the data models are very
different. Representing directory data with a relational database
requires splitting the data into multiple tables.
287
288Think for a moment about the person objectclass. Its definition requires
289attribute types objectclass, sn and cn and allows attribute types userPassword,
290telephoneNumber, seeAlso and description. All of these attributes are multivalued,
so normalization requires putting each attribute type in a separate table.

Now you have to decide on appropriate keys for those tables. The primary key
might be built from the DN, but this becomes rather inefficient on most
database implementations.
296
The big problem now is that accessing data from one entry requires seeking on
different disk areas. For some applications this may be OK, but in many
applications performance suffers.

The only attribute types that can be put in the main table entry are those that
are mandatory and single-valued. You may also add the optional single-valued
attributes and set them to NULL when they are not present.
304
305But wait, the entry can have multiple objectclasses and they are organized in
306an inheritance hierarchy. An entry of objectclass organizationalPerson now has
307the attributes from person plus a few others and some formerly optional attribute
308types are now mandatory.
309
What to do? Should we have different tables for the different objectclasses?
This way the person would have an entry in the person table, another in
organizationalPerson, etc. Or should we get rid of person and put everything in
the second table?

But what do we do with a filter like (cn=*), where cn is an attribute type that
appears in many, many objectclasses? Should we search all possible tables for
matching entries? Not very attractive.
318
319Once this point is reached, three approaches come to mind. One is to do full
320normalization so that each attribute type, no matter what, has its own separate
321table. The simplistic approach where the DN is part of the primary key is
322extremely wasteful, and calls for an approach where the entry has a unique
323numeric id that is used instead for the keys and a main table that maps DNs to
ids. Even so, this approach is very inefficient when several attribute types from
one or more entries are requested. Such a database, though cumbersome,
can be managed from SQL applications.
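
To make the cost of full normalization concrete, here is a small sketch
using Python's built-in {{EX:sqlite3}} module.  The table names
({{EX:entries}}, {{EX:attr_cn}} and so on), the sample data, and the
{{EX:read_entry}} helper are assumptions for this illustration and do
not describe any particular product.  Note that reading a single entry
takes one lookup in the DN-to-id table plus one query per requested
attribute type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Main table: maps each DN to a compact numeric id.
    CREATE TABLE entries (id INTEGER PRIMARY KEY, dn TEXT UNIQUE NOT NULL);
    -- One table per attribute type, one row per value (multivalued).
    CREATE TABLE attr_cn (entry_id INTEGER REFERENCES entries(id), value TEXT);
    CREATE TABLE attr_sn (entry_id INTEGER REFERENCES entries(id), value TEXT);
    CREATE TABLE attr_telephoneNumber
        (entry_id INTEGER REFERENCES entries(id), value TEXT);
""")

BABS_DN = "uid=babs,ou=People,dc=example,dc=com"
conn.execute("INSERT INTO entries (id, dn) VALUES (1, ?)", (BABS_DN,))
conn.executemany("INSERT INTO attr_cn VALUES (1, ?)",
                 [("Barbara Jensen",), ("Babs Jensen",)])
conn.execute("INSERT INTO attr_sn VALUES (1, 'Jensen')")
conn.execute("INSERT INTO attr_telephoneNumber VALUES (1, '+1 555 0100')")

ATTR_TABLES = ("cn", "sn", "telephoneNumber")


def read_entry(conn, dn, attr_types):
    """Reassemble one entry: one query per requested attribute type."""
    row = conn.execute("SELECT id FROM entries WHERE dn = ?", (dn,)).fetchone()
    if row is None:
        return None
    entry = {}
    for attr in attr_types:
        if attr not in ATTR_TABLES:
            raise ValueError(f"no table for attribute type {attr}")
        # Table names cannot be bound as parameters, hence the whitelist above.
        rows = conn.execute(
            f"SELECT value FROM attr_{attr} WHERE entry_id = ? ORDER BY rowid",
            (row[0],))
        entry[attr] = [value for (value,) in rows]
    return entry


print(read_entry(conn, BABS_DN, ["cn", "sn", "telephoneNumber"]))
```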
327
328The second approach is to put the whole entry as a blob in a table shared by all
329entries regardless of the objectclass and have additional tables that act as
330indices for the first table. Index tables are not database indices, but are
fully managed by the LDAP server-side implementation. However, the database
becomes unusable from SQL, and thus a fully fledged database system provides
little or no advantage. The full generality of the database is unneeded.
It is much better to use something light and fast, like Berkeley DB.
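
The second approach can be sketched in the same way.  In this Python
illustration the whole entry is serialized into one blob, and a separate
table acts as an equality index that the application maintains itself.
The table names {{EX:id2entry}} and {{EX:idx_eq}}, the JSON encoding,
and the helper functions are assumptions chosen for the example; the
point is only that the blob is opaque to SQL, so the relational engine
contributes little beyond key lookup.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Whole entries stored as opaque blobs, keyed by a numeric id.
    CREATE TABLE id2entry (id INTEGER PRIMARY KEY, entry BLOB NOT NULL);
    -- Application-managed equality index: (attribute, value) -> entry id.
    CREATE TABLE idx_eq (attr TEXT, value TEXT, entry_id INTEGER);
""")

INDEXED = ("cn", "mail")  # attribute types this sketch chooses to index


def add_entry(db, entry_id, dn, attributes):
    """Store the entry as one blob and maintain the index rows by hand."""
    blob = json.dumps({"dn": dn, "attributes": attributes}).encode("utf-8")
    db.execute("INSERT INTO id2entry VALUES (?, ?)", (entry_id, blob))
    for attr in INDEXED:
        for value in attributes.get(attr, []):
            db.execute("INSERT INTO idx_eq VALUES (?, ?, ?)",
                       (attr, value.lower(), entry_id))


def find_equal(db, attr, value):
    """Equality search: consult the index table, then decode matching blobs."""
    rows = db.execute(
        "SELECT e.entry FROM idx_eq i JOIN id2entry e ON e.id = i.entry_id "
        "WHERE i.attr = ? AND i.value = ? ORDER BY e.id",
        (attr, value.lower()))
    return [json.loads(blob.decode("utf-8")) for (blob,) in rows]


add_entry(db, 1, "uid=babs,ou=People,dc=example,dc=com",
          {"cn": ["Barbara Jensen", "Babs Jensen"],
           "mail": ["babs@example.com"]})
print(find_equal(db, "cn", "barbara jensen"))
```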
335
A completely different way to see this is to give up any hopes of implementing
the directory data model. In this case, LDAP is used as an access protocol to
data that provides only superficially the directory data model. For instance,
it may be read-only or, where updates are allowed, restrictions may be applied,
such as making attribute types single-valued that would otherwise allow multiple
values, or making it impossible to add new objectclasses to an existing entry or
remove one of those present. The restrictions span the range from allowed restrictions
(that might elsewhere be the result of access control) to outright violations of
the data model. This can, however, be a method to provide LDAP access to preexisting
data that is used by other applications, with the understanding that we don't
really have a "directory".
347
Existing commercial LDAP server implementations that use a relational database
are either of the first kind or the third. I don't know of any implementation
that uses a relational database to do inefficiently what BDB does efficiently.

For those who are interested in the "third way" (exposing existing data from an RDBMS
as an LDAP tree, with some limitations compared to the classic LDAP model, but making
it possible to interoperate between LDAP and SQL applications),
OpenLDAP includes back-sql, the backend that makes this possible. It uses ODBC plus
additional metainformation about translating LDAP queries to SQL queries in your
RDBMS schema, providing different levels of access, from read-only to full
access, depending on the RDBMS you use and your schema.
359
For more information on concepts and limitations, see the {{slapd-sql}}(5) man page,
or the {{SECT: Backends}} section. There are also examples for several
RDBMSes in the {{F:back-sql/rdbms_depend/*}} subdirectories.
363
364
365H2: What is slapd and what can it do?
366
367{{slapd}}(8) is an LDAP directory server that runs on many different
368platforms. You can use it to provide a directory service of your
369very own.  Your directory can contain pretty much anything you want
370to put in it. You can connect it to the global LDAP directory
371service, or run a service all by yourself. Some of slapd's more
372interesting features and capabilities include:
373
374{{B:LDAPv3}}: {{slapd}} implements version 3 of {{TERM[expand]LDAP}}.
375{{slapd}} supports LDAP over both {{TERM:IPv4}} and {{TERM:IPv6}}
376and Unix {{TERM:IPC}}.
377
378{{B:{{TERM[expand]SASL}}}}: {{slapd}} supports strong authentication
379and data security (integrity and confidentiality) services through
380the use of SASL.  {{slapd}}'s SASL implementation utilizes {{PRD:Cyrus
381SASL}} software which supports a number of mechanisms including
382{{TERM:DIGEST-MD5}}, {{TERM:EXTERNAL}}, and {{TERM:GSSAPI}}.
383
384{{B:{{TERM[expand]TLS}}}}: {{slapd}} supports certificate-based
385authentication and data security (integrity and confidentiality)
386services through the use of TLS (or SSL).  {{slapd}}'s TLS
387implementation can utilize either {{PRD:OpenSSL}} or {{PRD:GnuTLS}} software.
388
389{{B:Topology control}}: {{slapd}} can be configured to restrict
390access at the socket layer based upon network topology information.
391This feature utilizes {{TCP wrappers}}.
392
393{{B:Access control}}: {{slapd}} provides a rich and powerful access
394control facility, allowing you to control access to the information
395in your database(s). You can control access to entries based on
396LDAP authorization information, {{TERM:IP}} address, domain name
397and other criteria.  {{slapd}} supports both {{static}} and {{dynamic}}
398access control information.
399
400{{B:Internationalization}}: {{slapd}} supports Unicode and language
401tags.
402
403{{B:Choice of database backends}}: {{slapd}} comes with a variety
404of different database backends you can choose from. They include
405{{TERM:BDB}}, a high-performance transactional database backend;
406{{TERM:HDB}}, a hierarchical high-performance transactional
407backend; {{SHELL}}, a backend interface to arbitrary shell scripts;
408and PASSWD, a simple backend interface to the {{passwd}}(5) file.
409The BDB and HDB backends utilize {{ORG:Oracle}} {{PRD:Berkeley
410DB}}.
411
412{{B:Multiple database instances}}: {{slapd}} can be configured to
413serve multiple databases at the same time. This means that a single
414{{slapd}} server can respond to requests for many logically different
415portions of the LDAP tree, using the same or different database
416backends.
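
One way to picture how a request is matched to one of several
databases is to compare the request's DN with each database's suffix.
The following Python sketch picks the database with the longest matching
suffix.  This is a simplification made for illustration and is not a
description of {{slapd}}'s internal selection logic; the suffixes,
labels, and the {{EX:select_database}} function are hypothetical.

```python
# Hypothetical mapping of database suffix -> label for the serving database.
DATABASES = {
    "dc=example,dc=com": "database A",
    "ou=People,dc=example,dc=com": "database B",
    "o=Acme,c=US": "database C",
}


def select_database(dn):
    """Return the label of the database whose suffix best matches dn.

    "Best" here means the longest matching suffix.  DNs are compared as
    lowercased strings, which ignores real DN normalization rules.
    """
    normalized = dn.lower()
    best = None
    for suffix in DATABASES:
        candidate = suffix.lower()
        if normalized == candidate or normalized.endswith("," + candidate):
            if best is None or len(suffix) > len(best):
                best = suffix
    return DATABASES[best] if best is not None else None


print(select_database("uid=babs,ou=People,dc=example,dc=com"))  # database B
print(select_database("cn=printer1,dc=example,dc=com"))         # database A
print(select_database("dc=org"))                                # None
```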
417
418{{B:Generic modules API}}:  If you require even more customization,
419{{slapd}} lets you write your own modules easily. {{slapd}} consists
420of two distinct parts: a front end that handles protocol communication
421with LDAP clients; and modules which handle specific tasks such as
422database operations.  Because these two pieces communicate via a
423well-defined {{TERM:C}} {{TERM:API}}, you can write your own
424customized modules which extend {{slapd}} in numerous ways.  Also,
425a number of {{programmable database}} modules are provided.  These
426allow you to expose external data sources to {{slapd}} using popular
427programming languages ({{PRD:Perl}}, {{shell}}, and {{TERM:SQL}}).
428
429{{B:Threads}}: {{slapd}} is threaded for high performance.  A single
430multi-threaded {{slapd}} process handles all incoming requests using
431a pool of threads.  This reduces the amount of system overhead
432required while providing high performance.
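
The thread-pool idea can be illustrated with Python's standard
{{EX:concurrent.futures}} module.  {{slapd}} itself is written in C and
this sketch says nothing about its actual implementation; the request
strings, the pool size, and the {{EX:handle_request}} function are
placeholders showing only how a fixed set of worker threads can serve
many requests within one process.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(request):
    """Stand-in for processing one incoming request."""
    return f"result for {request}"


incoming = [f"search #{n}" for n in range(1, 7)]

# A fixed pool of worker threads is reused for every request, instead of
# creating a new thread or process per request.
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() hands requests to idle workers and yields results in input order.
    responses = list(pool.map(handle_request, incoming))

print(responses)
```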
433
434{{B:Replication}}: {{slapd}} can be configured to maintain shadow
435copies of directory information.  This {{single-master/multiple-slave}}
436replication scheme is vital in high-volume environments where a
437single {{slapd}} installation just doesn't provide the necessary availability
438or reliability.  For extremely demanding environments where a
439single point of failure is not acceptable, {{multi-master}} replication
440is also available.  {{slapd}} includes support for {{LDAP Sync}}-based
441replication.
442
443{{B:Proxy Cache}}: {{slapd}} can be configured as a caching
444LDAP proxy service.
445
446{{B:Configuration}}: {{slapd}} is highly configurable through a
447single configuration file which allows you to change just about
448everything you'd ever want to change.  Configuration options have
449reasonable defaults, making your job much easier. Configuration can
450also be performed dynamically using LDAP itself, which greatly
451improves manageability.
452
453