<!-- doc/src/sgml/replication-origins.sgml -->
<chapter id="replication-origins">
 <title>Replication Progress Tracking</title>

 <indexterm zone="replication-origins">
  <primary>Replication Progress Tracking</primary>
 </indexterm>
 <indexterm zone="replication-origins">
  <primary>Replication Origins</primary>
 </indexterm>

 <para>
  Replication origins are intended to make it easier to implement
  logical replication solutions on top
  of <link linkend="logicaldecoding">logical decoding</link>.
  They provide a solution to two common problems:
  <itemizedlist>
   <listitem>
    <para>How to safely keep track of replication progress</para>
   </listitem>
   <listitem>
    <para>How to change replication behavior based on the
     origin of a row; for example, to prevent loops in bi-directional
     replication setups</para>
   </listitem>
  </itemizedlist>
 </para>

 <para>
  Replication origins have just two properties, a name and an OID. The name,
  which is what should be used to refer to the origin across systems, is
  free-form <type>text</type>. It should be used in a way that makes conflicts
  between replication origins created by different replication solutions
  unlikely; e.g., by prefixing the replication solution's name to it.
  The OID is used only to avoid having to store the long version
  in situations where space efficiency is important. It should never be shared
  across systems.
 </para>

 <para>
  Replication origins can be created using the function
  <link linkend="pg-replication-origin-create"><function>pg_replication_origin_create()</function></link>;
  dropped using
  <link linkend="pg-replication-origin-drop"><function>pg_replication_origin_drop()</function></link>;
  and seen in the
  <link linkend="catalog-pg-replication-origin"><structname>pg_replication_origin</structname></link>
  system catalog.
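  As a minimal sketch of these three operations, assuming a hypothetical
  origin name <literal>myrepl_node_a</literal> (the
  <literal>myrepl_</literal> prefix merely stands in for a replication
  solution's name and is not a prescribed convention):
<programlisting>
-- Create the origin; the function returns the OID assigned to it.
-- 'myrepl_node_a' is a hypothetical name chosen for illustration.
SELECT pg_replication_origin_create('myrepl_node_a');

-- Look the origin up in the catalog by its name.
SELECT roident, roname
  FROM pg_replication_origin
 WHERE roname = 'myrepl_node_a';

-- Drop the origin again once it is no longer needed.
SELECT pg_replication_origin_drop('myrepl_node_a');
</programlisting>
  The lookup and the drop refer to the origin by name rather than by
  <structfield>roident</structfield>, since the identifier is meaningful
  only on the system that assigned it.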
 </para>

 <para>
  One nontrivial part of building a replication solution is to keep track of
  replay progress in a safe manner. When the applying process, or the whole
  cluster, dies, it needs to be possible to find out up to where data has
  successfully been replicated. Naive solutions to this, such as updating a
  row in a table for every replayed transaction, have problems like run-time
  overhead and database bloat.
 </para>

 <para>
  Using the replication origin infrastructure, a session can be
  marked as replaying from a remote node (using the
  <link linkend="pg-replication-origin-session-setup"><function>pg_replication_origin_session_setup()</function></link>
  function). Additionally, the <acronym>LSN</acronym> and commit
  time stamp of every source transaction can be configured on a
  per-transaction basis using
  <link linkend="pg-replication-origin-xact-setup"><function>pg_replication_origin_xact_setup()</function></link>.
  If that's done, replication progress will persist in a crash-safe
  manner. Replay progress for all replication origins can be seen in the
  <link linkend="view-pg-replication-origin-status">
   <structname>pg_replication_origin_status</structname>
  </link> view. An individual origin's progress, e.g., when resuming
  replication, can be acquired using
  <link linkend="pg-replication-origin-progress"><function>pg_replication_origin_progress()</function></link>
  for any origin or
  <link linkend="pg-replication-origin-session-progress"><function>pg_replication_origin_session_progress()</function></link>
  for the origin configured in the current session.
 </para>

 <para>
  In replication topologies more complex than replication from exactly one
  system to one other system, another problem can be that it is hard to avoid
  replicating replayed rows again. That can lead both to cycles in the
  replication and inefficiencies.
  Replication origins provide an optional
  mechanism to recognize and prevent that. When configured using the functions
  referenced in the previous paragraph, every change and transaction passed to
  output plugin callbacks (see <xref linkend="logicaldecoding-output-plugin"/>)
  generated by the session is tagged with the replication origin of the
  generating session. This allows treating them differently in the output
  plugin, e.g., ignoring all but locally-originating rows. Additionally,
  the <link linkend="logicaldecoding-output-plugin-filter-origin">
  <function>filter_by_origin_cb</function></link> callback can be used
  to filter the logical decoding change stream based on the
  source. While less flexible, filtering via that callback is
  considerably more efficient than doing it in the output plugin.
 </para>
</chapter>