| # | Message | Description | Workflow | Nodes | Format | Answer Format |
|---|---|---|---|---|---|---|
| 0 | Error | Error is a special type of message because it can be sent in reply to any other message, even one that does not usually expect a reply. | | * → * | - | PNumber code, PString message |
| 1 | RequestIdentification | Request a node identification. This must be the first packet for any connection. | | * ⇄ * | PEnum type, PUUID uuid, PAddress address, PString name, PFloat id_timestamp | PEnum type, PUUID my_uuid, PNumber num_partitions, PNumber num_replicas, PUUID your_uuid |
| 2 | Ping | Empty request used as network barrier. | | * ⇄ * | | |
| 3 | CloseClient | Tell peer that it can close the connection if it has finished with us. | | * → * | | - |
| 4 | PrimaryMaster | Ask for the node identifier of the current primary master. | | ctl ⇄ A | | PUUID primary_uuid |
| 5 | NotPrimaryMaster | Notify peer that I'm not the primary master. Attach any extra information to help the peer join the cluster. | | SM → * | PSignedNull primary, [PAddress address] | - |
| 6 | NotifyNodeInformation | Notify information about one or more nodes. | | M → * | PFloat id_timestamp, [(PEnum type, PAddress address, PUUID uuid, PEnum state, PFloat id_timestamp)] | - |
| 7 | Recovery | Ask storage nodes for the data needed by the master to recover. Reused by `neoctl print ids`. | | M ⇄ S; ctl ⇄ A ⇄ M | | PPTID ptid, PTID backup_tid, PTID truncate_tid |
| 8 | LastIDs | Ask for the last OID/TID so that a master can initialize its TransactionManager. Reused by `neoctl print ids`. | | M ⇄ S; ctl ⇄ A ⇄ M | | PTID last_oid, PTID last_tid |
| 9 | PartitionTable | Ask a storage node for the remaining data needed by the master to recover. This is also how the clients get the full partition table on connection. | | M ⇄ S; C ⇄ M | | PPTID ptid, [(PNumber offset, [(PUUID uuid, PEnum state)])] |
| 10 | NotifyPartitionTable | Send the full partition table to admin/storage nodes on connection. | | M → A, S | PPTID ptid, [(PNumber offset, [(PUUID uuid, PEnum state)])] | - |
| 11 | PartitionChanges | Notify about changes in the partition table. | | M → * | PPTID ptid, [(PNumber offset, PUUID uuid, PEnum state)] | - |
| 12 | StartOperation | Tell a storage node to start operation. Before this message, it must only communicate with the primary master. | | M → S | PBoolean backup | - |
| 13 | StopOperation | Notify that the cluster is not operational anymore. Any operation between nodes must be aborted. | | M → S, C | | - |
| 14 | UnfinishedTransactions | Ask for unfinished transactions, which will be replicated when they're finished. | | S ⇄ M | [PNumber offset] | PTID max_tid, [PTID unfinished_tid] |
| 15 | LockedTransactions | Ask for locked transactions, to replay committed transactions that haven't been unlocked. | | M ⇄ S | | {PTID ttid: PTID tid} |
| 16 | FinalTID | Return final tid if ttid has been committed, to recover from certain failures during tpc_finish. | | M ⇄ S; C ⇄ M, S | PTID ttid | PTID tid |
| 17 | ValidateTransaction | Replay a committed transaction that was not unlocked. | | M → S | PTID ttid, PTID tid | - |
| 18 | BeginTransaction | Ask to begin a new transaction. This maps to `tpc_begin`. | | C ⇄ M | PTID tid | PTID tid |
| 19 | FailedVote | Report storage nodes for which the vote failed. True is returned if it's still possible to finish the transaction. | | C ⇄ M | PTID tid, [PUUID uuid] | Error |
| 20 | FinishTransaction | Finish a transaction. Return the TID of the committed transaction. This maps to `tpc_finish`. | | C ⇄ M | PTID tid, [PTID oid], [PTID oid] | PTID ttid, PTID tid |
| 21 | LockInformation | Commit a transaction. The new data is read-locked. | | M ⇄ S | PTID ttid, PTID tid | PTID ttid |
| 22 | InvalidateObjects | Notify about a new transaction modifying objects, invalidating client caches. | | M → C | PTID tid, [PTID oid] | - |
| 23 | UnlockInformation | Notify about a successfully committed transaction. The new data can be unlocked. | | M → S | PTID ttid | - |
| 24 | GenerateOIDs | Ask for new OIDs to create objects. | | C ⇄ M | PNumber num_oids | [PTID oid] |
| 25 | Deadlock | Ask master to generate a new TTID that will be used by the client to solve a deadlock by rebasing the transaction on top of concurrent changes. | | S → M → C | PTID ttid, PTID locking_tid | - |
| 26 | RebaseTransaction | Rebase a transaction to solve a deadlock. | | C ⇄ S | PTID ttid, PTID locking_tid | [PTID oid] |
| 27 | RebaseObject | Rebase an object change to solve a deadlock. | | C ⇄ S | PTID ttid, PTID oid | (PTID serial, PTID conflict_serial, (PBoolean compression, PChecksum checksum, PString data)?)? |
| 28 | StoreObject | Ask to create/modify an object. This maps to `store`. | | C ⇄ S | PTID oid, PTID serial, PBoolean compression, PChecksum checksum, PString data, PTID data_serial, PTID tid | PTID conflict |
| 29 | AbortTransaction | Abort a transaction. This maps to `tpc_abort`. | | C → S; C → M → S | PTID tid, [PUUID uuid] | - |
| 30 | StoreTransaction | Ask to store a transaction. Implies vote. | | C ⇄ S | PTID tid, PString user, PString description, PString extension, [PTID oid] | |
| 31 | VoteTransaction | Ask to vote a transaction. | | C ⇄ S | PTID tid | |
| 32 | GetObject | Ask for a stored object by its OID, optionally at/before a specific tid. This maps to `load/loadBefore/loadSerial`. | | C ⇄ S | PTID oid, PTID at, PTID before | PTID oid, PTID serial_start, PTID serial_end, PBoolean compression, PChecksum checksum, PString data, PTID data_serial |
| 33 | TIDList | Ask for TIDs between a range of offsets. The order of TIDs is descending, and the range is [first, last). This maps to `undoLog`. | | C ⇄ S | PIndex first, PIndex last, PNumber partition | [PTID tid] |
| 34 | TransactionInformation | Ask for transaction metadata. | | C ⇄ S | PTID tid | PTID tid, PString user, PString description, PString extension, PBoolean packed, [PTID oid] |
| 35 | ObjectHistory | Ask for history information for a given object. The order of serials is descending, and the range is [first, last]. This maps to `history`. | | C ⇄ S | PTID oid, PIndex first, PIndex last | PTID oid, [(PTID serial, PNumber size)] |
| 36 | PartitionList | Ask for information about partitions. | | ctl ⇄ A | PNumber min_offset, PNumber max_offset, PUUID uuid | PPTID ptid, [(PNumber offset, [(PUUID uuid, PEnum state)])] |
| 37 | NodeList | Ask for information about nodes. | | ctl ⇄ A | PEnum type | [(PEnum type, PAddress address, PUUID uuid, PEnum state, PFloat id_timestamp)] |
| 38 | SetNodeState | Change the state of a node. | | ctl ⇄ A ⇄ M | PUUID uuid, PEnum state | Error |
| 39 | AddPendingNodes | Mark given pending nodes as running, for future inclusion when tweaking the partition table. | | ctl ⇄ A ⇄ M | [PUUID uuid] | Error |
| 40 | TweakPartitionTable | Ask the master to balance the partition table, optionally excluding specific nodes in anticipation of removing them. | | ctl ⇄ A ⇄ M | [PUUID uuid] | Error |
| 41 | SetClusterState | Set the cluster state. | | ctl ⇄ A ⇄ M | PEnum state | Error |
| 42 | Repair | Ask storage nodes to repair their databases. | | ctl ⇄ A ⇄ M | [PUUID uuid], PBoolean dry_run | Error |
| 43 | RepairOne | Repair is translated to this message, asking a specific storage node to repair its database. | | M → S | PBoolean dry_run | - |
| 44 | ClusterInformation | Notify about a cluster state change. | | M → * | PEnum state | - |
| 45 | ClusterState | Ask for the state of the cluster. | | ctl ⇄ A; A ⇄ M | | PEnum state |
| 46 | ObjectUndoSerial | Ask a storage node for the serial where object data is when undoing a given transaction, for a list of OIDs. | | C ⇄ S | PTID tid, PTID ltid, PTID undone_tid, [PTID oid] | {PTID oid: (PTID current_serial, PTID undo_serial, PBoolean is_current)} |
| 47 | TIDListFrom | Ask for `length` TIDs starting at min_tid. The order of TIDs is ascending. Used by `iterator`. | | C ⇄ S | PTID min_tid, PTID max_tid, PNumber length, PNumber partition | [PTID tid] |
| 48 | Pack | Request a pack at given TID. | | C ⇄ M ⇄ S | PTID tid | PBoolean status |
| 49 | CheckReplicas | Ask the cluster to search for mismatches between replicas, metadata only, and optionally within a specific range. Reference nodes can be specified. | | ctl ⇄ A ⇄ M | {PNumber partition: PUUID source}, PTID min_tid, PTID max_tid | Error |
| 50 | CheckPartition | Ask a storage node to compare a partition with all other nodes. As with CheckReplicas, only metadata is checked, optionally within a specific range. A reference node can be specified. | | M → S | PNumber partition, (PString upstream_name, PAddress address), PTID min_tid, PTID max_tid | - |
| 51 | CheckTIDRange | Ask for some stats about a range of transactions. Used to know if there are differences between a replicating node and a reference node. | | S ⇄ S | PNumber partition, PNumber length, PTID min_tid, PTID max_tid | PNumber count, PChecksum checksum, PTID max_tid |
| 52 | CheckSerialRange | Ask for some stats about a range of object history. Used to know if there are differences between a replicating node and a reference node. | | S ⇄ S | PNumber partition, PNumber length, PTID min_tid, PTID max_tid, PTID min_oid | PNumber count, PChecksum tid_checksum, PTID max_tid, PChecksum oid_checksum, PTID max_oid |
| 53 | PartitionCorrupted | Notify that mismatches were found while checking replicas for a partition. | | S → M | PNumber partition, [PUUID uuid] | - |
| 54 | NotifyReady | Notify that we're ready to serve requests. | | S → M | | - |
| 55 | LastTransaction | Ask for the last committed TID. | | C ⇄ M; ctl ⇄ A ⇄ M | | PTID tid |
| 56 | CheckCurrentSerial | Check if given serial is current for the given oid, and lock it so that this state is not altered until transaction ends. This maps to `checkCurrentSerialInTransaction`. | | C ⇄ S | PTID tid, PTID oid, PTID serial | PTID conflict |
| 57 | NotifyTransactionFinished | Notify that a transaction blocking a replication is now finished. | | M → S | PTID ttid, PTID max_tid | - |
| 58 | Replicate | Notify a storage node to replicate partitions up to given 'tid' and from given sources. | | M → S | PTID tid, PString upstream_name, {PNumber partition: PAddress address} | - |
| 59 | ReplicationDone | Notify the master node that a partition has been successfully replicated from one storage node to another. | | S → M | PNumber offset, PTID tid | - |
| 60 | FetchTransactions | Ask a storage node to send all transaction data we don't have, and reply with the list of transactions we should not have. | | S ⇄ S | PNumber partition, PNumber length, PTID min_tid, PTID max_tid, [PTID tid] | PTID pack_tid, PTID next_tid, [PTID tid] |
| 61 | FetchObjects | Ask a storage node to send object records we don't have, and reply with the list of records we should not have. | | S ⇄ S | PNumber partition, PNumber length, PTID min_tid, PTID max_tid, PTID min_oid, {PTID serial: [PTID oid]} | PTID pack_tid, PTID next_tid, PTID next_oid, {PTID serial: [PTID oid]} |
| 62 | AddTransaction | Send metadata of a transaction to a node that does not have it. | | S → S | PTID tid, PString user, PString description, PString extension, PBoolean packed, PTID ttid, [PTID oid] | - |
| 63 | AddObject | Send an object record to a node that does not have it. | | S → S | PTID oid, PTID serial, PBoolean compression, PChecksum checksum, PString data, PTID data_serial | - |
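
To make the table easier to work with programmatically, the following is a minimal sketch in Python that models a handful of its rows and checks the one ordering rule stated explicitly above: RequestIdentification must be the first packet on any connection. The names `MessageSpec`, `SPECS`, `BY_CODE` and `check_connection` are hypothetical and chosen for illustration only; they are not part of the NEO code base, and only the codes, names and arrow directions are taken from the table.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MessageSpec:
    """One row of the message table (hypothetical helper type)."""
    code: int             # value from the "#" column
    name: str             # value from the "Message" column
    expects_answer: bool  # True for "⇄" rows, False for "→" rows


# A few rows copied from the table; this is not the full protocol.
SPECS: List[MessageSpec] = [
    MessageSpec(0, "Error", False),
    MessageSpec(1, "RequestIdentification", True),
    MessageSpec(2, "Ping", True),
    MessageSpec(3, "CloseClient", False),
    MessageSpec(18, "BeginTransaction", True),
    MessageSpec(22, "InvalidateObjects", False),
]

BY_CODE: Dict[int, MessageSpec] = {spec.code: spec for spec in SPECS}

REQUEST_IDENTIFICATION = 1


def check_connection(codes: List[int]) -> List[str]:
    """Return a list of problems found in a sequence of message codes.

    Only two things are checked: the first packet must be
    RequestIdentification, and every code must be known to this sketch.
    """
    problems: List[str] = []
    if codes and codes[0] != REQUEST_IDENTIFICATION:
        problems.append("first packet must be RequestIdentification")
    for code in codes:
        if code not in BY_CODE:
            problems.append(f"unknown message code {code}")
    return problems


if __name__ == "__main__":
    print(check_connection([1, 2, 18]))  # well-formed: no problems reported
    print(check_connection([2, 1]))      # Ping sent before identification
```

Keeping the arrow direction as a boolean is a simplification: it records whether a row uses "⇄" or "→", but not which node types appear on each side, which a fuller model would need.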