1. Introduction
When a receiver joins an ongoing SRTP [RFC3711] session, out-of-band signaling must provide the receiver with the value of the ROC the sender is currently using. For instance, it can be transferred in the Common Header Payload of a MIKEY [RFC3830] message. In some cases, the receiver will not be able to synchronize his ROC with the one used by the sender, even if it is signaled to him out of band. Examples of where synchronization failure will appear are: 1. The receiver receives the ROC in a MIKEY message together with a key required for a particular continuous service. He does not, however, join the service until after a few hours, at which point the sender's sequence number (SEQ) has wrapped around, and so the sender, meanwhile, has increased the value of ROC. When the user joins the service, he grabs the SEQ from the first seen SRTP packet and prepends the ROC to build the index. If integrity protection is used, the packet will be discarded. If there is no integrity protection, the packet may (if key derivation rate is non-zero) be decrypted using the wrong session key, as ROC is used as input in session key derivation. In either case, the receiver will not have its ROC synchronized with the sender, and it is not possible to recover without out-of-band signaling. 2. If the receiver leaves the session (due to being out of radio coverage or because of a user action), and does not start receiving traffic from the service again until after 2^15 packets have been sent, the receiver will be out of synchronization (for the same reasons as in example 1). 3. The receiver joins a service when the SEQ has recently wrapped around (say, SEQ = 0x0001). The sender generates a MIKEY message and includes the current value of ROC (say, ROC = 1) in the MIKEY message. The MIKEY message reaches the receiver, who reads the ROC value and initializes its local ROC to 1. Now, if an SRTP packet prior to wraparound, i.e., with a SEQ lower than 0 (say, SEQ = 0xffff), was delayed and reaches the receiver as the first SRTP packet he sees, the receiver will initialize its highest received sequence number, s_l, to 0xffff. Next, the receiver will receive SRTP packets with sequence numbers larger than zero, and will deduce that the SEQ has wrapped. Hence, the receiver will incorrectly update the ROC and be out of synchronization. 4. Similarly to (3), since the initial SEQ is selected at random by the sender, it may happen to be selected as a value very close to 0xffff. In this case, should the first few packets be lost, the receiver may similarly end up out of synchronization. These problems have been recognized in, e.g., 3GPP2 and 3GPP, where SRTP is used for streaming media protection in their respective multicast/broadcast solutions [BCMCS][MBMS]. Problem 4 actually exists inherently due to the way SEQ initialization is done in RTP. One possible approach to address the issue could be to carry the ROC in the MKI (Master Key Identifier) field of each SRTP packet. This has the advantage that the receiver immediately knows the entire index for a packet. Unfortunately, the MKI has no semantics in RFC 3711prop (other than specifying master key), and a regular RFC 3711prop compliant implementation would not be able to make use of the information carried in the MKI. Furthermore, the MKI field is not integrity protected; hence, care must be taken to avoid obvious attacks against the synchronization. In this document, a solution is presented where the ROC is carried in the authentication tag of a special integrity transform in selected SRTP packets. The benefit of this approach is that the functionality of fast and robust synchronization can be achieved as a separate integrity transform, using the hooks existing in SRTP. Furthermore, when the ROC is transmitted to the receiver it needs to be integrity protected to avoid persistent denial-of-service (DoS) attacks or transmission errors that could bring the receiver out of synchronization. (A DoS attack is regarded as persistent if it can last after the attacker has left the area; in this particular case, an attacker could modify the ROC in one packet and the victim would be out of synchronization until the next ROC is transmitted). The above discussion leads to the conclusion that it makes sense to carry the ROC inside the authentication tag of an integrity transform.
1.1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
