# PL/Proxy FAQ

## General

### What is PL/Proxy?

PL/Proxy is a compact language for remote calls between PostgreSQL
databases.  The syntax is similar to PL/pgSQL and the language contains
only 4 statements.

With PL/Proxy the user can create proxy functions that have the same
signature as the remote functions to be called.  The function body
describes how the remote connection should be acquired.

When such a proxy function is called, PL/Proxy:

 1. Automatically generates the SQL query to be executed from the function signature
 2. Executes the statements in the function body to get the connection
 3. Uses the function arguments as input parameters to the query
 4. Passes the query result back as the function result
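
For example, a proxy function for hash-partitioned user data could look
like this (the function, argument and cluster names are made up for
illustration):

    CREATE FUNCTION get_user_email(i_username text)
    RETURNS text AS $$
        CLUSTER 'userdb';
        RUN ON hashtext(i_username);
    $$ LANGUAGE plproxy;

When called, it forwards the call to the function of the same name on
the partition selected by hashing `i_username`.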


### Why functions?

Concentrating on just function calls allows PL/Proxy to keep
its code small and also to present the user a simple and compact API.

- The actual query run on the remote database can be generated
  from the PL/Proxy function signature.  The user just needs to
  specify how the connection to the remote database must be acquired.

- There is no need for complex transaction handling, as any
  multi-statement transaction can be put into a function.
  PL/Proxy can just execute all queries in 'autocommit' mode.

- Simple autocommit transactions mean that the connection
  handling is simple and can be done automatically.

Using function-based database access has more general good points:

- It's good to have SQL statements that operate on data
  near the tables.  That makes the life of DBAs easier.
- It makes it possible to optimize and reorganize tables
  transparently to the application.
- It enables DBAs to move parts of the database into another
  database without changing the application interface.
- It is easier to manage security if you don't have to do it on
  the table level.  In most cases you need to control what a user
  can do and on which data, not on which tables.
- All transactions can be run in 'autocommit' mode.
  That means the absolute minimum number of roundtrips (one)
  for each query, and each transaction also takes the shortest
  possible amount of time on the server - remember that the various
  locks that transactions acquire are released only on COMMIT.


### Why not develop it into Remote Parallel PL/SQL?

A huge benefit of PL/Proxy is its compactness and efficiency.
As it does not need to parse the queries going through it, it adds very
little overhead.

Making it a full-blown language for SQL execution would mean
reimplementing PL/pgSQL, PL/Perl, parts of pgpool and more,
which would be a waste of effort.

Also, when the PL/Proxy functions mirror the actual functions,
PL/Proxy becomes an optional component of the setup - the client
apps can bypass PL/Proxy and work directly on the actual database.

This is good for testing and also for live deployment - we let
clients work on smaller databases directly; they are put
behind PL/Proxy only when the load gets too high and we need
to partition a database.


### What can PL/Proxy be used for?

- Remote calls from one database to another, used either inside SQL or other procedures.
  (If used as part of a local transaction, make sure only
   one side is allowed to write to the database; PL/Proxy does not
   guarantee transactionality between 2 databases.)
- Proxy databases for better security and data protection.
- Proxy databases for hiding the complexity of the databases from the application,
  eg. if you have functions distributed between several databases.
- Horizontal partitioning.  Instead of buying more powerful servers you can
  split your data between several servers and then use PL/Proxy to redirect
  function calls into the right partitions.
- Load balancing if you have several read-only replicas of your data.


### How does it compare to dblink?

- PL/Proxy handles connections automatically; dblink forces the user to handle them.

- PL/Proxy has a single place where result column types are specified - the function
  signature.  dblink requires them to be specified in each query (see the sketch
  after this list).

- PL/Proxy makes it easy to run a query on several remote servers in parallel.
  The dblink async API seems to make that possible as well, but the usage is complex.

- dblink allows arbitrarily complex transactions; PL/Proxy runs everything
  in autocommit mode.  As previously discussed, when using functions,
  complex transactions are not needed, and with that tradeoff PL/Proxy can
  offer a much simpler API.
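
To illustrate the second point, compare a plain dblink call, where the
result column types must be repeated at every call site, with a PL/Proxy
variant where they live only in the function signature (the connection
string, table and column names here are made up for the example):

    -- dblink: column types spelled out in every query
    SELECT * FROM dblink('dbname=partdb',
                         'SELECT bdate, balance FROM accounts')
        AS t(bdate date, balance numeric(20,10));

    -- PL/Proxy: types come from the proxy function's OUT parameters
    CREATE FUNCTION get_accounts(OUT bdate date, OUT balance numeric(20,10))
    RETURNS SETOF record AS $$
        CONNECT 'dbname=partdb';
    $$ LANGUAGE plproxy;

(The PL/Proxy version assumes a `get_accounts()` function exists on the
remote side, since PL/Proxy always calls a remote function.)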


### How are PL/Proxy and PgBouncer related?

PL/Proxy version 1 had the PL and the pooler integrated.  But such a design
caused a lot of unnecessary complexity.  For PL/Proxy version 2,
we wrote both the pooler and the PL part from scratch, both designed
to be standalone components.

That allowed both components to be tested and used separately
and resulted in a compact and robust codebase.

So PgBouncer can be used together with PL/Proxy to lessen the connection count
on the partition servers, but such usage is not mandatory.


## Internals

### What are the external dependencies?

PL/Proxy depends only on libpq and the poll(2) and gettimeofday(2) system calls,
so it should be quite portable.


### How are the remote calls done?

First a SELECT query is generated based on the PL/Proxy function
signature.

A function signature of:

    CREATE FUNCTION get_data(IN first_name text, IN last_name text,
                             OUT bdate date, OUT balance numeric(20,10))

results in the following query:

    SELECT bdate::date, balance::numeric(20,10)
      FROM public.get_data($1::text, $2::text);

The casts and explicit `OUT` parameter names are used to survive minor type or
result column order differences between the local and remote databases.

Then the `CLUSTER` statement is processed, optionally
executing a function.  This results in a cluster name.

Then `plproxy.get_cluster_version(<cluster_name>)`
is executed.  This gives the numeric version number for the cluster.
If the resulting version number differs from the version
in the cached cluster, the cache is dropped.

If the cluster information is not cached, the `plproxy.get_cluster_partitions()`
function is executed, resulting in a list of connect strings for that cluster.
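
A sketch of what these configuration functions can look like (the
cluster name and connect strings are made up; a real setup would usually
read them from a configuration table):

    CREATE OR REPLACE FUNCTION plproxy.get_cluster_version(i_cluster text)
    RETURNS int4 AS $$
    BEGIN
        IF i_cluster = 'userdb' THEN
            RETURN 1;   -- bump this whenever the partition list changes
        END IF;
        RAISE EXCEPTION 'Unknown cluster: %', i_cluster;
    END;
    $$ LANGUAGE plpgsql;

    CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(i_cluster text)
    RETURNS SETOF text AS $$
    BEGIN
        IF i_cluster = 'userdb' THEN
            RETURN NEXT 'dbname=userdb_p0 host=10.0.0.1';
            RETURN NEXT 'dbname=userdb_p1 host=10.0.0.2';
            RETURN;
        END IF;
        RAISE EXCEPTION 'Unknown cluster: %', i_cluster;
    END;
    $$ LANGUAGE plpgsql;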

Then the `RUN` statement is processed, optionally executing a
function if requested.  This will tag one or more connections
in the cluster to be used for query execution.

Then the query is sent to the remote server(s) using the libpq async API.
If several remote connections are tagged, the execution
happens in parallel.  PL/Proxy then waits until it has
acquired the result sets from all connections and returns
them to the local backend.


### How does PL/Proxy handle connections?

It opens them lazily - only when needed.  It then keeps them
open until libpq reports an error on them or the connection
lifetime is over - which is 2h by default.
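
The connection lifetime, together with a few other per-cluster settings,
is supplied by the `plproxy.get_cluster_config()` function.  A minimal
sketch, reusing the hypothetical 'userdb' cluster from above:

    CREATE OR REPLACE FUNCTION plproxy.get_cluster_config(
        i_cluster text,
        OUT key text,
        OUT val text)
    RETURNS SETOF record AS $$
    BEGIN
        -- connection_lifetime is given in seconds
        key := 'connection_lifetime'; val := (2 * 60 * 60)::text; RETURN NEXT;
        key := 'query_timeout';       val := '60';                RETURN NEXT;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;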

There is a safety hack used - before sending a query over an already
open connection, a poll(2) call is run on the connection.  If poll()
shows any events, the connection is dropped to avoid using a likely
broken connection.


### Can PL/Proxy survive different settings in the local and remote database?

* `client_encoding`

  If it differs, PL/Proxy sets the `client_encoding` on the remote database
  to be equal to the local one.

* `standard_conforming_strings`

  Query parameters are passed separately, so in general the difference
  should not matter - except when the function uses an explicit SELECT
  that contains literal strings.  The fix is to avoid the use of SELECT.

* `datestyle`, `timezone`

  Currently no handling is done.

* Rest of the parameters

  Cannot be handled.


### Why does PL/Proxy require the number of partitions to be a power of 2?

There is no deep reason, mostly the following points:

- To have minimal sanity-checking on the output of `get_cluster_partitions()`.
- To have a clear way to map hashes to partitions.  As users quite
  likely need to write their own code for splitting and sanity-checking
  their data, the algorithm should be as simple as possible - see the
  sketch below.
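
With a power-of-two partition count, the partition number is simply the
low bits of the hash.  A sketch of that mapping for 8 partitions (the
column alias is just for readability):

    -- hashtext() gives an integer hash; AND-ing with (8 - 1)
    -- keeps the lowest 3 bits, i.e. a partition number 0..7
    SELECT hashtext('some_username') & 7 AS partition_nr;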

There is now a `modular_mapping` option to switch away from the power-of-two requirement.

## Partitioning

### How to partition data?

There are several usage patterns for how PL/Proxy can be used
to distribute load over several servers:

- Vertical partitioning.  Data is divided between separate servers table by table
  and PL/Proxy calls are used to direct calls to the right databases.  In some cases
  wrapper functions that do several remote calls into other databases are needed.
- Horizontal partitioning.  Using the hashtext() function any field can be
  converted into an integer; in the simpler case you can just use your id field.
- Two-level vertical partitioning.  PL/Proxy also allows the cluster name to
  be calculated from the function arguments (see the sketch after this list).
  So it is possible to dedicate different clusters to different categories, or one
  cluster to read queries and a second cluster to write queries, and then do the
  usual hash-based partitioning inside the clusters.
- Read-only replicas.  Load can be divided over read-only replicas.  You can define
  the cluster to have more partitions than you have actual databases and
  use repeating connect strings as weights on the servers.
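
For the two-level case, the `CLUSTER` statement can call a function that
returns the cluster name based on the arguments.  A sketch (the function
names and the `i_readonly` flag are hypothetical):

    CREATE FUNCTION get_balance(i_username text, i_readonly boolean)
    RETURNS numeric AS $$
        -- pick the read or write cluster based on an argument,
        -- then hash-partition inside the chosen cluster
        CLUSTER select_cluster(i_readonly);
        RUN ON hashtext(i_username);
    $$ LANGUAGE plproxy;

Here `select_cluster(boolean)` would be a small local function returning
the cluster name as text.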

In many of these scenarios, good replication software such as Londiste from SkyTools
comes in handy.


### How to spread a single large query over several partitions?

If each partition holds only part of the total data, this
happens automatically - just use `RUN ON ALL;`.

If the partitions are copies of each other, or the query does
not follow the split pattern for some other reason, you need
to use the `SPLIT` command to give each partition its part of the data.
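
A sketch of how that can look (function, argument and cluster names are
made up; the intent is that each partition receives only the array
elements that hash to it):

    CREATE FUNCTION get_balances(i_usernames text[],
                                 OUT username text, OUT balance numeric)
    RETURNS SETOF record AS $$
        CLUSTER 'userdb';
        SPLIT i_usernames;
        RUN ON hashtext(i_usernames);
    $$ LANGUAGE plproxy;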

### How to do aggregated queries?

Aggregation needs to happen in 3 steps:

1. A function on each partition that does the per-partition aggregation.

2. A PL/Proxy function that collects the results of the per-partition aggregation.
   It will return one row per partition.

3. A top-level aggregation that does the final aggregation over the
   result set of the PL/Proxy function.  A regular PL/pgSQL function
   can be used, or this can be done outside the database by the client application.

Note: some aggregations cannot be done naively - eg. `avg()`.
Instead, each partition must return `sum()` and `count()` and the top-level
aggregator calculates the actual average (see the sketch below).
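
A minimal sketch of the three steps for `avg()`, assuming a hypothetical
`accounts` table with a `balance` column and the 'userdb' cluster from
earlier examples.  Step 1 is created on every partition database, steps
2 and 3 on the proxy database:

    -- Step 1: per-partition aggregation (create on each partition)
    CREATE FUNCTION balance_stats(OUT cnt bigint, OUT total numeric)
    AS $$
        SELECT count(*), coalesce(sum(balance), 0)
          FROM accounts;
    $$ LANGUAGE sql;

    -- Step 2: proxy function, returns one row per partition
    CREATE FUNCTION balance_stats(OUT cnt bigint, OUT total numeric)
    RETURNS SETOF record AS $$
        CLUSTER 'userdb';
        RUN ON ALL;
    $$ LANGUAGE plproxy;

    -- Step 3: final aggregation over the per-partition rows
    CREATE FUNCTION balance_avg()
    RETURNS numeric AS $$
        SELECT sum(total) / nullif(sum(cnt), 0)
          FROM balance_stats();
    $$ LANGUAGE sql;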


### How to add partitions?

The simple way would be to collect the data from all partitions
together and then split it again over the new partitions.  But that
is a waste of resources.

A few things to keep in mind to make additions easier:

- Always partition your data into more pieces than you actually need.
  Eg. if you think 2 servers would handle the load, then
  split into 8 partitions, keeping 4 of them
  on a single server.  That way, when the load grows you just
  need to move the databases to separate servers, not rehash
  your data.  That also allows you to load-balance between
  servers of unequal power - keep more partitions on the server
  that has more power.

- Split one partition at a time, splitting it into 2 (preferably 4 or 8) pieces.
  You just need to keep duplicate entries in the partition list
  for the partitions that are not split yet.


### Can I have foreign keys on my data?

Yes, unless the data you want to partition on references
itself.

Another common scenario is that there are some big data
tables that the user wants to partition, but they reference
various smaller common tables that are not partitionable.
In such a situation the common tables should be managed
in a single external database and replicated to each
partition.  That gives a single place to manipulate the data
and correct transactionality when spreading the data out.


### What happens if I do updates in the remote database?

PL/Proxy runs in autocommit mode, so if the remote function succeeds, its changes are
committed immediately.  Special handling is needed if updates are done
in both databases: if the remote call fails, both are rolled back, but if the remote
call succeeds and the local updates fail, only the local updates are rolled back.
Usually PgQ-based solutions are used in these situations.


### How to handle sequences?

The best way is to use separate ID ranges for each partition.

In our case, no code uses serials directly; instead it uses a
wrapper function that combines a unique ID assigned to each database
with a plain sequence.  That way we don't need to manage
sequence ranges explicitly; the only thing we need to do
is assign each database a unique ID.
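
The FAQ does not spell out the exact scheme; a minimal sketch of one
possible wrapper, with made-up names:

    -- Each partition database stores its own unique ID in a one-row
    -- table and combines it with a plain local sequence.
    CREATE TABLE db_info (db_id int NOT NULL);   -- e.g. 0, 1, 2, ... per database
    CREATE SEQUENCE local_seq;

    CREATE FUNCTION next_global_id()
    RETURNS bigint AS $$
        -- db_id in the high bits, local sequence value in the low bits,
        -- so values never collide between partitions
        SELECT (db_id::bigint << 48) | nextval('local_seq')
          FROM db_info;
    $$ LANGUAGE sql;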