---
title: "Scaling recommendations"
layout: default
canonical: "/puppetdb/latest/scaling_recommendations.html"
---

# Scaling recommendations

[configure_heap]: ./configure.markdown#configuring-the-java-heap-size
[dashboard]: ./maintain_and_tune.markdown#monitor-the-performance-dashboard
[heap]: ./maintain_and_tune.markdown#tune-the-max-heap-size
[threads]: ./maintain_and_tune.markdown#tune-the-number-of-threads
[postgres]: ./configure.markdown#using-postgresql
[pg_ha]: http://www.postgresql.org/docs/current/interactive/high-availability.html
[pg_replication]: http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
[ram]: #bottleneck-java-heap-size
[runinterval]: https://puppet.com/docs/puppet/latest/configuration.html#runinterval

PuppetDB will be a critical component of your Puppet deployment: if it goes down, agent nodes will be unable to request catalogs. Make sure it can handle your site's load and is resilient against failures.

When scaling any service, you can run into several performance and reliability bottlenecks. Deal with each one in turn as it becomes a problem.

## Bottleneck: Database performance

### PostgreSQL speed and availability

PuppetDB will be limited by the performance of your PostgreSQL server.
You can increase performance by making sure your database server has an
extremely fast disk, plenty of RAM, a fast processor, and a fast
network connection to your PuppetDB server. You may also need to look
into database clustering and load balancing.

The default PostgreSQL configuration on your system may also be very
conservative. If so, the
[PgTune](http://pgfoundry.org/projects/pgtune/) tool can suggest
settings that are better suited to the host.

Database administration is beyond the scope of this guide, but the
following links may be helpful:

* ["High availability, load balancing, and replication"][pg_ha], from the PostgreSQL manual
* ["Replication, clustering, and connection pooling"][pg_replication], from the PostgreSQL wiki

## Bottleneck: Java heap size

PuppetDB is limited by the amount of memory available to it, which is [set in the init script's config file][configure_heap]. If PuppetDB runs out of memory, it will start logging `OutOfMemoryError` exceptions and delaying command processing. Unlike many of the bottlenecks listed here, this one is fairly binary: PuppetDB either has enough memory to function under its load, or it doesn't. The exact amount needed will depend on the number of nodes, the similarity of the nodes, the complexity of each node's catalog, and how often the nodes check in.

### Initial memory recommendations

Use one of the following rules of thumb to choose an initial heap size; afterwards, [watch the performance dashboard][dashboard] and [adjust the heap if necessary][heap].

* If you are using PostgreSQL, allocate 128 MB of memory as a base, plus 1 MB for each Puppet node in your infrastructure.
* If you are using the embedded database, allocate at least 1 GB of heap.
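
The PostgreSQL rule of thumb above is simple arithmetic. As a minimal sketch, it can be computed as follows. The function names are illustrative and not part of PuppetDB, and the result is only a starting point to refine against the dashboard. `-Xmx` is the standard JVM flag for maximum heap size.

```python
BASE_HEAP_MB = 128      # base allocation from the rule of thumb above
PER_NODE_HEAP_MB = 1    # additional heap per Puppet node


def initial_heap_mb(node_count: int) -> int:
    """Return a starting heap size in MB for a PostgreSQL-backed PuppetDB."""
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    return BASE_HEAP_MB + PER_NODE_HEAP_MB * node_count


def as_xmx_flag(heap_mb: int) -> str:
    """Format a heap size in MB as a JVM maximum-heap flag."""
    return f"-Xmx{heap_mb}m"


if __name__ == "__main__":
    # Hypothetical site sizes, chosen only to show how the estimate scales.
    for nodes in (100, 1000, 5000):
        heap = initial_heap_mb(nodes)
        print(f"{nodes} nodes: {heap} MB ({as_xmx_flag(heap)})")
```

For example, a site with 1,000 nodes would start at 1,128 MB under this rule.
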

## Bottleneck: Node check-in interval

The more frequently your Puppet nodes check in, the heavier the load on your PuppetDB server.

You can reduce the load by increasing the [`runinterval`][runinterval] setting in every Puppet node's puppet.conf file (or, if you run Puppet agent from cron, by reducing the frequency of the cron task).

The frequency with which nodes should check in will depend on your site's policies and expectations; this is as much a cultural decision as it is a technical one. A possible compromise is to use a wider default check-in interval, but implement MCollective's `puppetd` plugin to trigger immediate runs when needed.
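
To get a rough feel for how the check-in interval affects load, the sketch below estimates the average number of agent runs reaching PuppetDB per minute. It assumes check-ins are spread evenly over the interval, which real sites only approximate. The function name and the node count are illustrative.

```python
def runs_per_minute(node_count: int, runinterval_seconds: int) -> float:
    """Average agent runs per minute, assuming evenly spread check-ins."""
    if runinterval_seconds <= 0:
        raise ValueError("runinterval_seconds must be positive")
    return node_count * 60 / runinterval_seconds


if __name__ == "__main__":
    nodes = 2000  # hypothetical site size
    # 1800 seconds (30 minutes) is Puppet's default runinterval.
    for interval in (1800, 3600, 7200):
        rate = runs_per_minute(nodes, interval)
        print(f"runinterval={interval}s: about {rate:.1f} runs per minute")
```

Doubling the interval halves the average run rate, which is why a wider default interval is an effective way to relieve a busy PuppetDB server.
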

## Bottleneck: CPU cores and number of worker threads

PuppetDB can take advantage of multiple CPU cores to handle the commands in its queue. Each core can run a worker thread. By default, PuppetDB will use half of the cores in its machine.

You can increase performance by running PuppetDB on a machine with many CPU cores and then [tuning the number of worker threads][threads]:

* More threads will allow PuppetDB to keep up with more incoming commands per minute. Watch the queue depth in the performance dashboard to see whether you need more threads.
* Too many worker threads can potentially starve the message queue and web server of resources, which will prevent incoming commands from entering the queue in a timely fashion. Watch your server's CPU usage to see whether the cores are saturated.
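
The two observations above can be combined into a simple decision aid. The sketch below is not PuppetDB code: the function names are hypothetical, and the rounding behavior, the minimum of one thread, and the 90% saturation threshold are all assumptions made for illustration.

```python
import os
from typing import Sequence


def default_worker_threads(cpu_cores: int) -> int:
    """Half of the cores. Rounding down and the minimum of 1 are assumptions."""
    return max(1, cpu_cores // 2)


def tuning_hint(queue_depths: Sequence[int], cpu_utilization: float,
                saturation_threshold: float = 0.9) -> str:
    """Suggest a direction from sampled queue depths and CPU use (0.0 to 1.0)."""
    # Saturated cores take priority: more threads would only add contention.
    if cpu_utilization >= saturation_threshold:
        return "cores saturated: do not add threads"
    # A queue that is deeper at the end of the window is falling behind.
    if len(queue_depths) >= 2 and queue_depths[-1] > queue_depths[0]:
        return "queue growing: consider more threads"
    return "no change suggested"


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print(f"{cores} cores -> {default_worker_threads(cores)} worker threads")
    # Hypothetical dashboard samples taken a few minutes apart.
    print(tuning_hint([10, 40, 120], cpu_utilization=0.55))
```

In practice you would read the queue depth from the performance dashboard and the CPU figure from your usual monitoring tools.
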

## Bottleneck: Single point of failure

Although a single PuppetDB and PostgreSQL server probably _can_ handle all of the load at your site, you may want to run multiple servers for the sake of resilience and redundancy. To configure high-availability PuppetDB, you should:

* Run multiple instances of PuppetDB on multiple servers, and use a reverse proxy or load balancer to distribute traffic between them.
* Configure multiple PostgreSQL servers for high availability or clustering. More information is available at [the PostgreSQL manual][pg_ha] and [the PostgreSQL wiki][pg_replication].
* Configure every PuppetDB instance to use the same PostgreSQL database. (In the case of clustered PostgreSQL servers, the instances may be speaking to different machines, but conceptually they should all be writing to one database.)
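
As a rough illustration of what the load balancer in the first step does, the sketch below rotates requests across the PuppetDB instances that are currently marked healthy. It is a toy model, not a substitute for a real reverse proxy. The function name and hostnames are hypothetical, and how health is determined is left out.

```python
from typing import Dict, List


def pick_backends(backends: List[str], healthy: Dict[str, bool],
                  request_count: int) -> List[str]:
    """Return the backend chosen for each request, rotating over healthy ones."""
    live = [b for b in backends if healthy.get(b, False)]
    if not live:
        raise RuntimeError("no healthy PuppetDB backends available")
    return [live[i % len(live)] for i in range(request_count)]


if __name__ == "__main__":
    # Hypothetical hostnames for three PuppetDB instances.
    backends = ["puppetdb1.example.com", "puppetdb2.example.com",
                "puppetdb3.example.com"]
    health = {"puppetdb1.example.com": True,
              "puppetdb2.example.com": False,  # simulated outage
              "puppetdb3.example.com": True}
    for target in pick_backends(backends, health, 4):
        print(target)
```

Because every instance writes to the same PostgreSQL database, it does not matter which healthy instance receives a given request.
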

## Bottleneck: SSL performance

PuppetDB uses its own embedded SSL processing, which is usually not a performance problem. However, truly large deployments will be able to squeeze out more performance by terminating SSL with Apache or NGINX instead. If you are using multiple PuppetDB servers behind a reverse proxy, we recommend terminating SSL at the proxy server.

Instructions for configuring external SSL termination are currently beyond the scope of this guide. However, we expect that if your site is big enough for this to be necessary, you have probably done it with several other services before.