| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| .. | 30-Sep-2020 | - | | |
| aggregation/ | 03-May-2022 | - | | |
| main/ | 30-Sep-2020 | - | 95 | 72 |
| testPcap/ | 03-May-2022 | - | | |
| testWorkloads/ | 03-May-2022 | - | | |
| README.md | 30-Sep-2020 | 6.5 KiB | 101 | 69 |
| assembly.go | 30-Sep-2020 | 24.6 KiB | 752 | 461 |
| auth_test.go | 30-Sep-2020 | 4.2 KiB | 146 | 116 |
| command_op.go | 30-Sep-2020 | 7 KiB | 246 | 186 |
| command_reply.go | 30-Sep-2020 | 5.6 KiB | 200 | 150 |
| connection_stub.go | 30-Sep-2020 | 3 KiB | 98 | 56 |
| cursors.go | 30-Sep-2020 | 10.7 KiB | 322 | 207 |
| cursors_test.go | 30-Sep-2020 | 11.6 KiB | 374 | 267 |
| delete_op.go | 30-Sep-2020 | 2.6 KiB | 107 | 78 |
| execute.go | 30-Sep-2020 | 8.6 KiB | 280 | 209 |
| execute_test.go | 30-Sep-2020 | 1.6 KiB | 68 | 51 |
| filter.go | 30-Sep-2020 | 6.4 KiB | 233 | 189 |
| filter_test.go | 30-Sep-2020 | 11.2 KiB | 454 | 361 |
| getmore_op.go | 30-Sep-2020 | 3.3 KiB | 120 | 80 |
| insert_op.go | 30-Sep-2020 | 2.7 KiB | 105 | 76 |
| killcursors_op.go | 30-Sep-2020 | 2.3 KiB | 81 | 51 |
| log_wrapper.go | 30-Sep-2020 | 1.7 KiB | 77 | 55 |
| message.go | 30-Sep-2020 | 2.6 KiB | 107 | 72 |
| mongo_op_handler.go | 30-Sep-2020 | 12.5 KiB | 389 | 304 |
| mongoreplay.go | 30-Sep-2020 | 1.5 KiB | 45 | 31 |
| mongoreplay_test.go | 30-Sep-2020 | 29.3 KiB | 1,073 | 923 |
| monitor.go | 30-Sep-2020 | 6.1 KiB | 197 | 145 |
| msg_op.go | 30-Sep-2020 | 11.2 KiB | 464 | 371 |
| op.go | 30-Sep-2020 | 3.9 KiB | 133 | 67 |
| opcode.go | 30-Sep-2020 | 2.4 KiB | 76 | 59 |
| packet_handler.go | 30-Sep-2020 | 3.9 KiB | 130 | 106 |
| parallel_file_read_manager.go | 30-Sep-2020 | 3.8 KiB | 139 | 106 |
| pcap_test.go | 30-Sep-2020 | 7.6 KiB | 276 | 227 |
| play.go | 30-Sep-2020 | 7.4 KiB | 217 | 158 |
| play_livedb_test.go | 30-Sep-2020 | 37.3 KiB | 1,139 | 812 |
| play_test.go | 30-Sep-2020 | 3 KiB | 134 | 114 |
| playbackfile.go | 30-Sep-2020 | 6.4 KiB | 237 | 172 |
| query_op.go | 30-Sep-2020 | 4.6 KiB | 188 | 150 |
| raw_op.go | 30-Sep-2020 | 6.1 KiB | 223 | 170 |
| record.go | 30-Sep-2020 | 6.4 KiB | 215 | 167 |
| recorded_op.go | 30-Sep-2020 | 1.6 KiB | 64 | 43 |
| recorded_op_generator_test.go | 30-Sep-2020 | 11.5 KiB | 400 | 330 |
| replay_test.sh | 30-Sep-2020 | 5 KiB | 223 | 189 |
| reply_op.go | 30-Sep-2020 | 4.5 KiB | 174 | 128 |
| sanity_check.sh | 30-Sep-2020 | 1.5 KiB | 72 | 55 |
| stat_collector.go | 30-Sep-2020 | 13.3 KiB | 454 | 340 |
| stat_format.go | 30-Sep-2020 | 5.7 KiB | 182 | 113 |
| unknown_op.go | 30-Sep-2020 | 1.5 KiB | 56 | 33 |
| update_op.go | 30-Sep-2020 | 3.4 KiB | 130 | 98 |
| util.go | 30-Sep-2020 | 14.9 KiB | 626 | 499 |
| version.go | 30-Sep-2020 | 920 | 31 | 19 |

README.md

# mongoreplay
##### Purpose

`mongoreplay` is a traffic capture and replay tool for MongoDB. It can be used to inspect commands being sent to a MongoDB instance, record them, and replay them against another host at a later time.

##### Use cases
- Preview how well your database cluster would handle a production workload in a different environment (storage engine, indexes, hardware, OS, etc.)
- Reproduce and investigate bugs by recording and replaying the operations that trigger them
- Inspect the details of what an application is doing to a MongoDB cluster (i.e. a more flexible version of [mongosniff](https://docs.mongodb.org/manual/reference/program/mongosniff/))

## Quickstart

Make a recording:

    mongoreplay record -i lo0 -e "port 27017" -p playback.bson

Analyze it:

    mongoreplay stat -p playback.bson --report playback_stats.json

Replay it against another server, at 2x speed:

    mongoreplay play -p playback.bson --speed=2.0 --report replay_stats.json --host 192.168.0.4:27018

## Detailed Usage

Basic usage of `mongoreplay` works in two phases: `record` and `play`. Recordings can also be analyzed with the `stat` command.
* The `record` phase captures traffic live from a network interface, or reads a [pcap](https://en.wikipedia.org/wiki/Pcap) file (generated by `tcpdump`), and analyzes it to produce a playback file (in BSON format). The playback file contains a list of all the requests and replies to/from the MongoDB instance that were recorded, along with their connection identifier, timestamp, and other metadata.
* The `play` phase reads in the playback file that was generated by `record` and re-executes the workload against a target host.
* The `stat` command reads a playback file and analyzes it, reporting the latency between each request and its response.

#### Capturing TCP (pcap) data

To create a recording of traffic, use the `record` command as follows:

    mongoreplay record -i lo0 -e "port 27017" -p recording.bson

This will record traffic on the network interface `lo0` to or from port 27017.
The options to `record` are:
* `-i`: The network interface to listen on, e.g. `eth0` or `lo0`. You may be required to run `mongoreplay` with root privileges for this to work.
* `-e`: An expression in Berkeley Packet Filter (BPF) syntax that selects which traffic to record. See http://biot.com/capstats/bpf.html for details on how to construct BPF expressions.
* `-p`: The output file to write the recording to.

#### Recording a playback file from pcap data

Alternatively, you can capture traffic using `tcpdump` and create a recording from a static pcap file. First, capture TCP traffic on the system that the workload you wish to record is targeting. Then, run `mongoreplay record` using the `-f` argument (instead of `-i`) to create the playback file.

    sudo tcpdump -i lo0 -n "port 27017" -w traffic.pcap

    ./mongoreplay record -f traffic.pcap -p playback.bson

The `record` command processes the .pcap file to create a playback file. The playback file contains everything needed to re-execute the workload.

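The two ways of invoking `record` described above (live capture with `-i`/`-e`, or a static pcap file with `-f`) can be sketched as a small Go helper that assembles the argument list. This is only an illustration: `recordArgs` is a hypothetical helper, not part of `mongoreplay`, and it uses only the flags shown in this README. It passes `-e` only for live capture because that is the only combination documented here.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// recordArgs builds the argument list for `mongoreplay record`.
// Exactly one of iface (live capture, -i) or pcapFile (static file, -f)
// must be set. filter is an optional BPF expression (-e); this sketch only
// passes it for live capture, the one combination shown in the README.
func recordArgs(iface, pcapFile, filter, out string) ([]string, error) {
	if (iface == "") == (pcapFile == "") {
		return nil, fmt.Errorf("set exactly one of iface or pcapFile")
	}
	args := []string{"record"}
	if iface != "" {
		args = append(args, "-i", iface)
		if filter != "" {
			args = append(args, "-e", filter)
		}
	} else {
		args = append(args, "-f", pcapFile)
	}
	return append(args, "-p", out), nil
}

func main() {
	args, err := recordArgs("", "traffic.pcap", "", "playback.bson")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	cmd := exec.Command("mongoreplay", args...)
	// Print the command rather than running it; call cmd.Run() to execute.
	fmt.Println(cmd.String())
}
```

Keeping the argument assembly in a pure function makes it easy to check the command line without capturing any traffic.
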
### Using playback files

There are several useful operations that can be performed with the playback file.

##### Re-executing the playback file
The `play` command takes a playback file and executes the operations in it against a target host.

    ./mongoreplay play -p playback.bson --host mongodb://target-host.com:27017

To modify playback speed, add the `--speed` command line flag to the `play` command. For example, `--speed=2.0` will run playback at twice the rate of the recording, while `--speed=0.5` will run playback at half the rate of the recording.

    mongoreplay play -p workload.playback --host staging-mongo-cluster-hostname --speed=2.0

###### Playback speed
The `--speed` argument scales the playback rate relative to the recording; for example, `--speed=2.0` executes the workload at twice the speed at which it was recorded.

###### Logging metrics about execution performance during playback
Use the `--report=<path-to-file>` flag to save detailed metrics about the performance of each operation performed during playback to the specified JSON file. These metrics can be used in later analysis to compare performance and behavior across different executions of the same workload.

##### Inspecting the operations in a playback file

The `stat` command takes a static workload file (BSON) and generates a JSON report showing each operation and some metadata about its execution. The report uses the same format as the output of the `play` command with `--report`.

###### Report format

The data in the JSON reports consists of one record for each request/response pair. Each record has the following format:
```json
{
    "connection_num": 1,
    "latency_us": 89,
    "ns": "test.test",
    "op": "getmore",
    "order": 16,
    "play_at": "2016-02-02T16:24:16.309322601-05:00",
    "played_at": "2016-02-02T16:24:16.310908311-05:00",
    "playbacklag_us": 1585
}
```

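A record like the sample above can be decoded into a Go struct with `encoding/json`, and the reported `playbacklag_us` can be cross-checked against the two timestamps. This is a minimal sketch: the `Record` type and the helpers `parseRecord` and `lagUs` are hypothetical names chosen for illustration, and the struct covers only the fields present in the sample.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Record mirrors the fields of the sample report record.
type Record struct {
	ConnectionNum int64     `json:"connection_num"`
	LatencyUs     int64     `json:"latency_us"`
	Ns            string    `json:"ns"`
	Op            string    `json:"op"`
	Order         int64     `json:"order"`
	PlayAt        time.Time `json:"play_at"`
	PlayedAt      time.Time `json:"played_at"`
	PlaybackLagUs int64     `json:"playbacklag_us"`
}

// parseRecord decodes a single JSON report record.
func parseRecord(data []byte) (Record, error) {
	var r Record
	err := json.Unmarshal(data, &r)
	return r, err
}

// lagUs recomputes the playback lag from the two timestamps,
// truncated to whole microseconds.
func lagUs(r Record) int64 {
	return r.PlayedAt.Sub(r.PlayAt).Microseconds()
}

// sample is the record shown in the README.
const sample = `{
    "connection_num": 1,
    "latency_us": 89,
    "ns": "test.test",
    "op": "getmore",
    "order": 16,
    "play_at": "2016-02-02T16:24:16.309322601-05:00",
    "played_at": "2016-02-02T16:24:16.310908311-05:00",
    "playbacklag_us": 1585
}`

func main() {
	r, err := parseRecord([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%s on %s: latency=%dus, reported lag=%dus, recomputed lag=%dus\n",
		r.Op, r.Ns, r.LatencyUs, r.PlaybackLagUs, lagUs(r))
}
```
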
The fields are as follows:
 * `connection_num`: a key that identifies the connection on which the request was executed. All requests/replies that executed on the same connection have the same value for this field. The value does *not* match the connection ID logged on the server side.
 * `latency_us`: the time difference (in microseconds) between when the client sent the request and when the response from the server was received.
 * `ns`: the namespace that the request was executed on.
 * `op`: the type of operation represented by the request, e.g. "query", "insert", "command", "getmore".
 * `order`: a monotonically increasing key indicating the order in which the operations were recorded and played back. This can be used to reconstruct the ordering of the series of ops executed on a connection, since the order in which they appear in the report file might not match the order of playback.
 * `data`: the payload of the actual operation. For queries, this will contain the actual query that was issued. For inserts, this will contain the documents being inserted. For updates, it will contain the query selector and the update modifier, etc.
 * `play_at`: the time at which the operation was supposed to be executed.
 * `played_at`: the time at which the `play` command actually executed the operation.
 * `playbacklag_us`: the difference (in microseconds) between `played_at` and `play_at`. Higher values generally indicate that the target server is not able to keep up with the rate at which requests need to be executed according to the playback file.

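Because records may appear in the report in a different order than they were played, the `order` and `connection_num` fields can be used together to rebuild each connection's sequence of operations. The sketch below groups records by connection and sorts each group by `order`; the `op` type and `groupByConnection` are hypothetical names, and the sample values are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// op holds the report fields needed to rebuild per-connection ordering.
type op struct {
	ConnectionNum int64
	Order         int64
	Op            string
}

// groupByConnection buckets ops by connection_num and sorts each bucket
// by the order key, smallest first.
func groupByConnection(ops []op) map[int64][]op {
	groups := make(map[int64][]op)
	for _, o := range ops {
		groups[o.ConnectionNum] = append(groups[o.ConnectionNum], o)
	}
	for _, g := range groups {
		sort.Slice(g, func(i, j int) bool { return g[i].Order < g[j].Order })
	}
	return groups
}

func main() {
	// Illustrative records, listed out of order as they might appear in a report.
	ops := []op{
		{ConnectionNum: 1, Order: 18, Op: "getmore"},
		{ConnectionNum: 2, Order: 17, Op: "insert"},
		{ConnectionNum: 1, Order: 16, Op: "query"},
	}
	for _, o := range groupByConnection(ops)[1] {
		fmt.Printf("conn 1, order %d: %s\n", o.Order, o.Op)
	}
}
```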