Introduction

Today we are going to review how to use Wireshark to analyze the MongoDB Wire Protocol and the information that is being sent.

MongoDB Wire Protocol

The MongoDB Wire Protocol is the protocol used for communication between the MongoDB client (application) and the MongoDB server. It defines the structure and rules for exchanging data and commands.

It is a simple socket-based, request-response style protocol that communicates with the database server through a regular TCP/IP socket.
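To make the request-response idea concrete, here is a minimal sketch of the 16-byte standard message header that starts every wire protocol message: four little-endian int32 fields (messageLength, requestID, responseTo, opCode). The helper names and the sample values are hypothetical and chosen only for illustration.

```python
import struct
from collections import namedtuple

MsgHeader = namedtuple("MsgHeader", "message_length request_id response_to op_code")
HEADER = struct.Struct("<iiii")  # four little-endian int32 fields, 16 bytes in total

OP_MSG = 2013  # opcode used by current drivers and servers


def parse_header(data):
    """Decode the standard header found at the start of a wire protocol message."""
    return MsgHeader(*HEADER.unpack_from(data, 0))


def is_reply_to(response, request):
    """A response points back at its request through the responseTo field."""
    return response.response_to == request.request_id


# Hypothetical pair of headers: a request with id 701 and the answer to it.
request = parse_header(HEADER.pack(168, 701, 0, OP_MSG))
response = parse_header(HEADER.pack(38, 1045, 701, OP_MSG))
print(request)
print(is_reply_to(response, request))
```

Pairing responseTo with requestID is how a captured response can be matched to the request that triggered it.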

Wireshark is a popular open source graphical user interface (GUI) tool for analyzing packets. It also ships with a powerful command-line utility called TShark, for people who prefer to work on the Linux command line.

How to read the information sent over the wire

First, we are going to create an EC2 instance in AWS running Amazon Linux 2023:

# Command to read the OS version and extract only PRETTY_NAME
grep PRETTY_NAME /etc/os-release | cut -d '=' -f 2- | tr -d '"'
> Amazon Linux 2023.4.20240513

We are creating the VM so you don't experience any compatibility issues with this tutorial.

Now, we are going to install Wireshark. For more information on how to compile it or install it on other operating systems, check this documentation.

sudo yum install wireshark

Let's add the user ec2-user to the group wireshark

sudo usermod -a -G wireshark ec2-user

Let's review the version of tshark

tshark -v
> TShark (Wireshark) 4.0.8 (Git commit 81696bb74857).

Now, let's list the interfaces

sudo tshark -D

> Running as user "root" and group "root". This could be dangerous.
1. enX0
2. any
3. lo (Loopback)
4. bluetooth-monitor
5. nflog
6. nfqueue
7. ciscodump (Cisco remote capture)
8. dpauxmon (DisplayPort AUX channel monitor capture)
9. sdjournal (systemd Journal Export)
10. sshdump (SSH remote capture)
11. udpdump (UDP Listener remote capture)
12. wifidump (Wi-Fi remote capture)

We will use the interface enX0

sudo tshark -i enX0

And this is the response:

> Running as user "root" and group "root". This could be dangerous.
Capturing on 'enX0'
 ** (tshark:26693) 09:55:56.644283 [Main MESSAGE] -- Capture started.
 ** (tshark:26693) 09:55:56.644340 [Main MESSAGE] -- File: "/var/tmp/wireshark_enX0pxzrYU.pcapng"
    1 0.000000000 172.31.34.215 → 104.30.176.2 SSH 294 Server: Encrypted packet (len=228)
    2 0.002782311 104.30.176.2 → 172.31.34.215 TCP 66 51073 → 22 [ACK] Seq=1 Ack=229 Win=6 Len=0 TSval=1325938519 TSecr=883676608
    3 0.690250824 172.31.34.215 → 104.30.176.2 SSH 334 Server: Encrypted packet (len=268)
    4 0.693046964 104.30.176.2 → 172.31.34.215 TCP 66 51073 → 22 [ACK] Seq=1 Ack=497 Win=6 Len=0 TSval=1325939209 TSecr=883677299
    5 1.210111281 172.31.34.215 → 104.30.176.2 SSH 334 Server: Encrypted packet (len=268)
    6 1.212949679 104.30.176.2 → 172.31.34.215 TCP 66 51073 → 22 [ACK] Seq=1 Ack=765 Win=6 Len=0 TSval=1325939729 TSecr=883677819
    7 1.730222145 172.31.34.215 → 104.30.176.2 SSH 334 Server: Encrypted packet (len=268)
    8 1.733079609 104.30.176.2 → 172.31.34.215 TCP 66 51073 → 22 [ACK] Seq=1 Ack=1033 Win=6 Len=0 TSval=1325940249 TSecr=883678339
    9 2.250139669 172.31.34.215 → 104.30.176.2 SSH 334 Server: Encrypted packet (len=268)
   10 2.252914509 104.30.176.2 → 172.31.34.215 TCP 66 51073 → 22 [ACK] Seq=1 Ack=1301 Win=6 Len=0 TSval=1325940769 TSecr=883678859
   11 2.770154639 172.31.34.215 → 104.30.176.2 SSH 334 Server: Encrypted packet (len=268)

Now we are ready to capture only the packets used by the MongoDB Wire Protocol. One way of doing this is a capture filter that matches only port 27017 and the hostname of the primary node.

sudo tshark -i any -f "tcp port 27017 and host mycluster-shard-00-02.ab9qr.mongodb.net"

When executing this command, we can see what traffic is being sent over the wire. This traffic will be encrypted as MongoDB Atlas only allows encrypted connections to MongoDB.

Now, let's start sending traffic to MongoDB from another tab. One way of doing this is a Python script like the following one, which connects to the primary node of our MongoDB Atlas cluster.

sudo yum install python3-pip
pip3 install pymongo

After this, we can execute the following Python script (or run it in interactive mode)

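import pymongo
from pymongo import MongoClient

# Connect directly to the primary node over TLS (replace the placeholder credentials and hostname)
client = MongoClient("mongodb://myusername:mypassword@mycluster-shard-00-02.ab9qr.mongodb.net:27017/?ssl=true")
db = client["mydatabase"]
collection = db["mycollection"]
# Running a query makes the driver exchange messages with the server
list(collection.find())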

And in the screen where tshark is running you will see this output:

Running as user "root" and group "root". This could be dangerous.
Capturing on 'any'
 ** (tshark:27066) 10:07:25.629520 [Main MESSAGE] -- Capture started.
 ** (tshark:27066) 10:07:25.629577 [Main MESSAGE] -- File: "/var/tmp/wireshark_anygl0U1e.pcapng"
    1 0.000000000  30.12.139.75 → 172.31.34.215 TLSv1.2 1228 Application Data
    2 0.000016835 172.31.34.215 → 30.12.139.75  TCP 68 52832 → 27017 [ACK] Seq=1 Ack=1161 Win=443 Len=0 TSval=3258285286 TSecr=3078370952
    3 0.029801006 172.31.34.215 → 30.12.139.75  TCP 76 47054 → 27017 [SYN] Seq=0 Win=62727 Len=0 MSS=8961 SACK_PERM TSval=3258285315 TSecr=0 WS=128
    4 0.030835489  30.12.139.75 → 172.31.34.215 TCP 76 27017 → 47054 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0 MSS=1460 SACK_PERM TSval=3078370982 TSecr=3258285315 WS=128
    5 0.030851159 172.31.34.215 → 30.12.139.75  TCP 68 47054 → 27017 [ACK] Seq=1 Ack=1 Win=62848 Len=0 TSval=3258285317 TSecr=3078370982
    6 0.031155009 172.31.34.215 → 30.12.139.75  TLSv1 585 Client Hello
    7 0.032127525  30.12.139.75 → 172.31.34.215 TCP 68 27017 → 47054 [ACK] Seq=1 Ack=518 Win=64768 Len=0 TSval=3078370984 TSecr=3258285317
    8 0.057315142  30.12.139.75 → 172.31.34.215 TLSv1.2 3782 Server Hello, Certificate, Server Key Exchange, Certificate Request, Server Hello Done
    9 0.057336389 172.31.34.215 → 30.12.139.75  TCP 68 47054 → 27017 [ACK] Seq=518 Ack=3715 Win=59136 Len=0 TSval=3258285343 TSecr=3078371009
   10 0.058969373 172.31.34.215 → 30.12.139.75  TLSv1.2 206 Certificate, Client Key Exchange, Change Cipher Spec, Encrypted Handshake Message
   11 0.060331099  30.12.139.75 → 172.31.34.215 TLSv1.2 342 New Session Ticket, Change Cipher Spec, Encrypted Handshake Message
   12 0.060589152 172.31.34.215 → 30.12.139.75  TLSv1.2 374 Application Data
   13 0.061877602  30.12.139.75 → 172.31.34.215 TLSv1.2 1244 Application Data
   14 0.063336747 172.31.34.215 → 30.12.139.75  TLSv1.2 333 Application Data
   15 0.063616279 172.31.34.215 → 30.12.139.75  TCP 76 47058 → 27017 [SYN] Seq=0 Win=62727 Len=0 MSS=8961 SACK_PERM TSval=3258285349 TSecr=0 WS=128
   16 0.064586863  30.12.139.75 → 172.31.34.215 TCP 76 27017 → 47058 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0 MSS=1460 SACK_PERM TSval=3078371016 TSecr=3258285349 WS=128
   ...

As mentioned before, the data is encrypted, so to see what the packets contain we can run a local instance of MongoDB (by default its connections are not encrypted) and capture those packets instead.
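As a rough way to tell the two cases apart, the sketch below looks at the first bytes of a TCP payload and guesses whether it starts with a plaintext MongoDB message or with a TLS record. It is only a heuristic under stated assumptions: the payload begins at a message boundary, and the function name and lookup tables are hypothetical helpers, not part of tshark or the driver.

```python
import struct

# TLS record content types (first byte of every TLS record).
TLS_CONTENT_TYPES = {20: "change_cipher_spec", 21: "alert", 22: "handshake", 23: "application_data"}
# A few MongoDB wire protocol opcodes (bytes 12-15 of the message header).
MONGO_OPCODES = {1: "OP_REPLY", 2004: "OP_QUERY", 2012: "OP_COMPRESSED", 2013: "OP_MSG"}


def classify_payload(payload):
    """Guess whether a TCP payload is a plaintext MongoDB message or a TLS record."""
    if len(payload) >= 16:
        (op_code,) = struct.unpack_from("<i", payload, 12)
        if op_code in MONGO_OPCODES:
            return "mongodb:" + MONGO_OPCODES[op_code]
    # TLS records start with a content type followed by major version 3.
    if len(payload) >= 5 and payload[0] in TLS_CONTENT_TYPES and payload[1] == 3:
        return "tls:" + TLS_CONTENT_TYPES[payload[0]]
    return "unknown"


tls_record = bytes.fromhex("1703030010") + bytes(16)      # application data header plus filler
mongo_message = struct.pack("<iiii", 168, 701, 0, 2013)   # header of an OP_MSG message
print(classify_payload(tls_record))
print(classify_payload(mongo_message))
```

With TLS in place only the record header is readable; everything after it, including the MongoDB header, is ciphertext.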

Packet capture on local MongoDB instance

Let's install MongoDB locally and run the same test. To do that, follow the instructions from the MongoDB documentation.

We will install MongoDB 7 Community.

sudo yum install mongodb-mongosh-shared-openssl3
sudo yum install -y mongodb-org
sudo systemctl start mongod

Let's connect to MongoDB locally and generate some data. We could use the Python script or mongosh.

mongosh
use mytest

for (let i = 0; i<=1000; i++) {db.mycollection.insertOne({name: 'David'})}

Immediately you will start to see packets in tshark.

If you run tshark with the following command, it writes every field of every packet to a file. The -T pdml option selects the PDML (XML) output format, which also includes the content of the packets.

sudo tshark -i any -f "tcp port 27017 and host 127.0.0.1" -T pdml > salida.txt

Once the capture has been running for a while, we can filter the file down to the lines that carry a TCP payload, which is where the MongoDB messages are. Packets without a payload (such as plain ACKs) are left out.

cat salida.txt | grep "TCP payload"
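grep is enough for a quick look. If you want the payloads as bytes for further processing, a small parser is an alternative. The sketch below assumes the capture file is complete, well-formed PDML (a capture that was cut off may be missing its closing tag) and that payloads appear as field elements named tcp.payload with a hex value attribute, as in the output shown in this post. The function name and the inline sample are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_tcp_payloads(pdml_source):
    """Yield the raw bytes of every tcp.payload field in a PDML file or file object."""
    for _event, elem in ET.iterparse(pdml_source, events=("end",)):
        if elem.tag == "field" and elem.get("name") == "tcp.payload":
            yield bytes.fromhex(elem.get("value", ""))
        elif elem.tag == "packet":
            elem.clear()  # release the packet once all of its fields were seen


# Tiny hand-written sample standing in for salida.txt.
SAMPLE_PDML = b"""<?xml version="1.0"?>
<pdml>
  <packet>
    <proto name="tcp">
      <field name="tcp.payload" showname="TCP payload (4 bytes)" value="dd070000"/>
    </proto>
  </packet>
</pdml>"""

for payload in iter_tcp_payloads(io.BytesIO(SAMPLE_PDML)):
    print(len(payload), payload.hex())
```

To read the real capture, pass the file name instead of the sample, for example iter_tcp_payloads("salida.txt").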

This is an example of a payload where you can see the document being sent to MongoDB to be inserted, including the ObjectId generated by the driver:

    <field name="tcp.payload" showname="TCP payload (168 bytes)" size="168" pos="68" show="a8:00:00:00:bd:02:00:00:00:00:00:00:dd:07:00:00:00:00:00:00:00:93:00:00:00:02:69:6e:73:65:72:74:00:0d:00:00:00:6d:79:63:6f:6c:6c:65:63:74:69:6f:6e:00:04:64:6f:63:75:6d:65:6e:74:73:00:2e:00:00:00:03:30:00:26:00:00:00:02:6e:61:6d:65:00:06:00:00:00:44:61:76:69:64:00:07:5f:69:64:00:66:4b:28:ef:fa:c7:84:b0:52:95:22:cd:00:00:08:6f:72:64:65:72:65:64:00:01:03:6c:73:69:64:00:1e:00:00:00:05:69:64:00:10:00:00:00:04:87:6c:bd:23:6c:c2:46:f8:98:6b:ee:0b:6f:a9:08:a0:00:02:24:64:62:00:05:00:00:00:74:65:73:74:00:00" value="a8000000bd02000000000000dd07000000000000009300000002696e73657274000d0000006d79636f6c6c656374696f6e0004646f63756d656e7473002e00000003300026000000026e616d650006000000446176696400075f696400664b28effac784b0529522cd0000086f7264657265640001036c736964001e000000056964001000000004876cbd236cc246f8986bee0b6fa908a000022464620005000000746573740000"/>

This corresponds to ObjectId('664b28effac784b0529522cd') which you can see in the value of the payload.
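The payload above can also be decoded by hand. The following sketch uses only the standard library and assumes the message is an uncompressed OP_MSG with a single body section of kind 0, which is what this capture contains. The tiny BSON decoder is a hypothetical helper that handles only the element types present in this payload (string, embedded document, array, binary, ObjectId and boolean); for real work the bson package that ships with PyMongo is the better choice.

```python
import struct

# Hex copied from the value attribute of the tcp.payload field above, split by meaning.
PAYLOAD = bytes.fromhex(
    "a8000000bd02000000000000dd070000"  # header: messageLength, requestID, responseTo, opCode
    "0000000000"                        # flagBits, then section kind 0
    "9300000002696e73657274000d0000006d79636f6c6c656374696f6e00"
    "04646f63756d656e7473002e00000003300026000000"
    "026e616d650006000000446176696400"
    "075f696400664b28effac784b0529522cd0000"
    "086f7264657265640001"
    "036c736964001e000000056964001000000004876cbd236cc246f8986bee0b6fa908a000"
    "022464620005000000746573740000"
)


def read_cstring(buf, pos):
    """Read a NUL-terminated element name; return it with the position after the NUL."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1


def decode_document(buf, pos=0):
    """Decode the BSON document starting at pos (only the types seen in this capture)."""
    (length,) = struct.unpack_from("<i", buf, pos)
    end = pos + length - 1  # index of the document's trailing 0x00
    pos += 4
    doc = {}
    while pos < end:
        etype = buf[pos]
        name, pos = read_cstring(buf, pos + 1)
        if etype == 0x02:  # string: int32 length (including NUL), then the bytes
            (n,) = struct.unpack_from("<i", buf, pos)
            doc[name] = buf[pos + 4:pos + 4 + n - 1].decode("utf-8")
            pos += 4 + n
        elif etype in (0x03, 0x04):  # embedded document or array
            (n,) = struct.unpack_from("<i", buf, pos)
            sub = decode_document(buf, pos)
            doc[name] = list(sub.values()) if etype == 0x04 else sub
            pos += n
        elif etype == 0x05:  # binary: int32 length, subtype byte, data
            (n,) = struct.unpack_from("<i", buf, pos)
            doc[name] = buf[pos + 5:pos + 5 + n].hex()
            pos += 5 + n
        elif etype == 0x07:  # ObjectId: 12 raw bytes
            doc[name] = buf[pos:pos + 12].hex()
            pos += 12
        elif etype == 0x08:  # boolean: one byte
            doc[name] = buf[pos] == 1
            pos += 1
        else:
            raise ValueError(f"unsupported BSON type 0x{etype:02x}")
    return doc


def decode_op_msg(payload):
    """Split an OP_MSG payload into header, flag bits and the kind-0 body document."""
    header = struct.unpack_from("<iiii", payload, 0)
    (flag_bits,) = struct.unpack_from("<I", payload, 16)
    if header[3] != 2013 or payload[20] != 0:
        raise ValueError("expected an OP_MSG message starting with a kind-0 section")
    return {"header": header, "flag_bits": flag_bits, "body": decode_document(payload, 21)}


message = decode_op_msg(PAYLOAD)
print(message["header"])
print(message["body"])
```

The first key of the body is the command name (insert), and the _id of the inserted document is the same ObjectId that appears in the raw hex.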

Roundtrips on a Transaction

Now let's try a transaction and see when the driver sends data to the MongoDB cluster. Transactions require a replica set or a sharded cluster (a standalone mongod rejects them), so run this test against one. The following Python script prints a message before and after each step of the transaction block and waits 15 seconds in between. Meanwhile, in the other tab (where tshark is running), we will see the packets being sent.

import time

from pymongo.read_concern import ReadConcern
from pymongo.write_concern import WriteConcern

# Reuses `client` and `collection` from the connection script shown earlier.
# Every step is followed by a 15-second pause, so the packets that show up in
# tshark can be matched to the step that produced them.
def mytest():
    # Start a session
    print("Start 'start_session' op")
    with client.start_session() as session:
        time.sleep(15)
        print("End 'start_session' op")
        # Start a transaction
        print("Start 'start_transaction' op")
        with session.start_transaction(read_concern=ReadConcern("snapshot"), write_concern=WriteConcern("majority")):
            time.sleep(15)
            print("End 'start_transaction' op")
            try:
                print("Start 'insert_one' op")
                collection.insert_one({'name': 'David'}, session=session)
                time.sleep(15)
                print("End 'insert_one' op")
                print("Start 'commit_transaction' op")
                session.commit_transaction()
                time.sleep(15)
                print("End 'commit_transaction' op")
            except Exception:
                # Roll back on any failure, then let the error propagate
                print("Start 'abort_transaction' op")
                session.abort_transaction()
                time.sleep(15)
                print("End 'abort_transaction' op")
                raise

mytest()

Conclusions

After you execute this script you will see that the driver is not sending any information to the MongoDB cluster except in these situations:

  • When an operation is executed (find_one, insert_one...)
  • When the transaction is committed
  • When the transaction is aborted
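The list above matches how transactions look on the wire: there is no separate command for starting a session or a transaction. Instead, the first operation inside the transaction carries startTransaction: true together with a txnNumber, and the commit is a command of its own. The sketch below labels decoded command documents accordingly. The sample documents are hypothetical and simplified (real ones also carry lsid and other fields); they are not taken from a capture.

```python
def describe_roundtrip(command):
    """Summarize one decoded command document as a short, transaction-aware label."""
    name = next(iter(command))  # the command name is the first key of the document
    parts = [name]
    if "txnNumber" in command:
        parts.append(f"txnNumber={command['txnNumber']}")
    if command.get("startTransaction"):
        parts.append("starts the transaction")
    return ", ".join(parts)


# Hypothetical command bodies, shaped like what a driver sends for the script above.
observed = [
    {"insert": "mycollection", "txnNumber": 1, "startTransaction": True, "autocommit": False},
    {"commitTransaction": 1, "txnNumber": 1, "autocommit": False},
]

for command in observed:
    print(describe_roundtrip(command))
```

Two commands for four driver calls (start_session, start_transaction, insert_one, commit_transaction) is what you should expect to see in the capture.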

Hope you liked it and please let us know any comments or questions through our contact form.