One document matched: draft-ietf-siprec-req-00.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc3261 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3261.xml">
<!ENTITY rfc2804 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2804.xml">
]>
<rfc category="info"
ipr="pre5378Trust200902"
docName="draft-ietf-siprec-req-00">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="no" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<?rfc strict="no" ?>
<front>
<title abbrev="Requirements for SIPREC">Requirements for SIP-based Media Recording (SIPREC)</title>
<author fullname="Ken Rehor" initials="K.R" role="editor" surname="Rehor">
<organization>Cisco Systems</organization>
<address>
<postal>
<street>170 West Tasman Dr.</street>
<street>Mail Stop SJC30/2/</street>
<city>San Jose</city>
<region>CA</region>
<code>95134</code>
<country>USA</country>
</postal>
<email>krehor@cisco.com</email>
</address>
</author>
<author fullname="Rajnish Jain" initials="R.J" surname="Jain">
<organization>IPC Systems</organization>
<address>
<postal>
<street>777 Commerce Drive</street>
<city>Fairfield</city>
<region>CT</region>
<code>06825</code>
<country>USA</country>
</postal>
<email>rajnish.jain@ipc.com</email>
</address>
</author>
<author fullname="Leon Portman" initials="L.P" surname="Portman">
<organization>NICE Systems</organization>
<address>
<postal>
<street>8 Hapnina</street>
<city>Ra'anana</city>
<code>43017</code>
<country>Israel</country>
</postal>
<email>leon.portman@nice.com</email>
</address>
</author>
<author fullname="Andrew Hutton" initials="A." surname="Hutton">
<organization>Siemens Enterprise Communications</organization>
<address>
<email>andrew.hutton@siemens-enterprise.com</email>
<uri>http://www.siemens-enterprise.com</uri>
</address>
</author>
<date month="May" year="2010"/>
<workgroup>DISPATCH</workgroup>
<keyword>Internet-Draft</keyword>
<abstract>
<t>Session recording is a critical requirement in many business
communications environments such as call centers and financial
trading floors. In some of these environments, all calls must be
recorded for regulatory and compliance reasons. In others, calls may
be recorded for quality control or business analytics.</t>
<t>Recording is typically done by sending a copy of the session media
to the recording devices. This document specifies requirements for
extensions to SIP that will manage delivery of RTP media from an end-point
that originates media (or that has access to it) to a recording device.
This is being referred to as SIP-based Media Recording.</t>
</abstract>
</front>
<middle>
<section title="Requirements notation">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119] and indicate
requirement levels for compliant mechanisms.</t>
</section>
<section title="Introduction">
<t>Session recording is a critical operational requirement in many
businesses, especially where voice is used as a medium for commerce
and customer support. A prime example where voice is used for trade
is the financial industry. The call recording requirements in this
industry are quite stringent. The recorded calls are used for
dispute resolution and compliance. Other businesses such as customer
support call centers typically employ call recording for quality
control or business analytics, with different requirements.</t>
<t>Depending on the country and its regulatory requirements, financial
trading floors typically must record all calls. The recorded media
content must be an exact copy of the actual conversation (i.e.
clipping and loss of media are unacceptable). A new call attempt
would be automatically rejected if the recording device becomes
temporarily unavailable. An existing call would be dropped in the
same situation. In contrast, support call centers typically only
record a subset of the calls, and calls must not fail regardless of
the availability of the recording device.</t>
<t>Furthermore, the scale and cost burdens vary widely, in all markets,
where the different needs for solution capabilities such as media
injection, transcoding, and security-related needs do not conform
well to a one-size-fits-all model. If a standardized solution
supports all of the requirements from every recording market, but
doing so would be expensive for markets with lesser needs, then
proprietary solutions for those markets will continue to propagate.
Care must be taken, therefore, to make a standards-based solution
support optionality and flexibility.</t>
<t>It should be noted that the requirements for the protocol between a
Session Recording Server and Session Recording Client have very similar requirements
(such as codec and transport negotiation, encryption key interchange,
firewall traversal) as compared to regular SIP media sessions. The
choice of SIP for session recording provides reuse of an existing
protocol. This document specifies requirements for using SIP <xref target="RFC3261"/>
to record a media session between a Session Recording Client and a Session Recording Server,
The Session Recording Client is the source of the recorded media. The Session Recording Server is the sink of
recorded media.</t>
<t>The recorded sessions can be any SIP-based RTP media sessions including voice, video, and text (as defined by [RFC4103]).</t>
<t>An archived session recording is typically comprised of the session media content and
the session metadata. The session metadata allows recording archives to be searched and
filtered at a later time. The conveyance of session metadata from the Session Recording Client
to the Session Recording Server may or may not be over SIP.
The requirements for session metadata delivery will be specified in a future revision of
this document or in a separate document.</t>
<t>In some situations it may be advantageous to correlate the Communication Session
with other information, such as an IM or other text chat session.</t>
<t>This document only considers active recording, where the Session Recording
Client purposefully streams media to a Session Recording Server. Passive
recording, where a recording device detects media directly from the
network, is outside the scope of this document. In addition, lawful
intercept is outside the scope of this document.</t>
</section>
<section title="Definitions">
<t>Session Recording Server (SRS): A Session Recording Server (SRS) is a SIP User Agent (UA) that is a
specialized media server or collector that acts as the sink of the recorded media. An
SRS is a logical function that typically archives media for extended durations of time and
provides interfaces for search and retrieval of the archived media.
An SRS is typically implemented as a multi-port device that is capable
of receiving media from several sources simultaneously. An SRS is
typically also the sink of the recorded session metadata. Note that
the term "Server" does not imply the SRS is the server side of a
signaling protocol - the SRS may be the initiator of recording requests, for example.</t>
<t>Session Recording Client (SRC): A Recording Client (SRC) is a SIP User Agent
(UA) that acts as the source of the recorded media, sending it to the SRS. An SRC is a logical
function. Its capabilities may be implemented across one or more
physical devices. In practice, an SRC could be a personal device
(such as a SIP phone), a SIP Media Gateway (MG), a Session Border
Controller (SBC) or a SIP Media Server (MS) integrated with an
Application Server (AS). This specification defines the term SRC such
that all such SIP entities can be generically addressed under one
definition. The SRC itself or another entity working on its behalf
(such as a SIP Application Server) may act as the source of the
recording metadata.</t>
<t>Communication Session (CS): A session created between two or more SIP User Agents (UAs)
that is the target for recording.</t>
<t>Recording Session (RS): The SIP session created between an SRC and SRS for the
purpose of recording a Communication Session.</t>
<figure anchor="recording_session">
<preamble></preamble>
<artwork><![CDATA[
+---------------------------------------------------------------+
| (Session Recording Client) |
| +-------------+ +-----------+ |
| | | Communication Session | | |
| | A |<------------------------->| B | |
| | | | | |
| +-------------+ +-----------+ |
+---------------------------------------------------------------+
|
| Recording
| Session
|
v
+-------------+
| |
| Session |
| Recording |
| Server |
| |
+-------------+
]]></artwork>
<postamble></postamble>
</figure>
<t>Metadata: Data that describes the Communication Session and Recording Session..</t>
<t>SIPREC: The set of SIP extensions that supports recording of Communication Sessions.</t>
<t>Pause/Resume recording:
Pause: Temporarily discontinue capture and storage of the incoming media stream.
Recording may continue via the Resume command.
Resume: Continue capture and storage of the incoming media stream into the same recording session.
</t>
</section>
<section title="Example Deployment Architectures">
<t>A recording system deployment consists of the Recording Client and Recording Server.
Recording Control is bi-directional; Recording Media and Call Metadata are sent from the RC to RS.</t>
<figure anchor="recording_client">
<preamble></preamble>
<artwork><![CDATA[
+-------------+ +--------------+
| | Call Metadata | |
| Session |-------------------------->| Session |
| Recording | | Recording |
| Client | Recording Control | Server |
| (SRC) |<------------------------->| (SRS) |
| | | |
| | Recorded Media | |
| |-------------------------->| |
| | | |
+-------------+ +--------------+
]]></artwork>
<postamble></postamble>
</figure>
</section>
<section title="Use Cases">
<t>Use Case 1: Full-time Recording: One (or more, in the case of redundant recording)
Recording Session for each Communication Session.</t>
<figure anchor="full_time">
<preamble>For example, the diagram below shows the lifecycle of Communication Sessions (CS)
and the relationship to the Recording Sessions (RS)</preamble>
<artwork><![CDATA[
CS |--- CS 1 ---| |--- CS 2 ---| |--- CS 3 ---|
RS |--- RS 1 ---| |--- RS 2 ---| |--- RS 3 ---|
]]></artwork>
<postamble></postamble>
</figure>
<t>Record every call for specific extension/person.
One Recording Session is created per Communication Session.</t>
<t>The need to record all calls is typically due to business process purposes
(such as transaction confirmation or dispute resolution)
or to ensure compliance with governmental regulations.
Applications include enterprise, contact center, and financial trading floors.</t>
<t>Also commonly known as Total Recording.</t>
<t>Use Case 2: Selective Recording: Start a Recording Session when a Communication Session is established.
Only specific calls are recorded, possibly based on business rules.
A Communication Session may or may not have a Recording Session (record only specific calls).</t>
<figure anchor="selective">
<preamble>In this example, Communication Sessions 1 and 3 are recorded but CS 2 is not.</preamble>
<artwork><![CDATA[
CS |--- CS 1 ---| |--- CS 2 ---| |--- CS 3 ---|
RS |--- RS 1----| |--- RS 2 ---|
]]></artwork>
<postamble></postamble>
</figure>
<t>Use Case 3: Dynamic Recording: Start/Stop a Recording Session during a Communication Session.</t>
<t>The Recording Session starts during a Communication Session, either manually via a user-controlled mechanism
(e.g. button on user’s phone) or automatically via an application (e.g. a Contact Center customer service application)
or business event.
A Recording Session either ends during the Communication Session, or when the Communication Session ends.</t>
<figure anchor="dynamic">
<preamble>One or more Recording Sessions per Communication Session:</preamble>
<artwork><![CDATA[
CS |------------- Communication Session -----------|
RS |---- RS 1 ----| |---- RS 2 -----|
]]></artwork>
<postamble></postamble>
</figure>
<t>Also known as Mid-session or Mid-call Recording.</t>
<t>Applications include:</t>
<t>o Enterprise back office recording where only specific calls (based
on privacy) have to be recorded</t>
<t>o A user requests session recording at call setup time or during a call.</t>
<t>Use Case 4: Persistent Recording: A single Recording Session captures one or more Communication Sessions,
in sequence (Fig. 6) or in parallel (Fig. 7).
The recording session is a single RTP stream, therefore consists of a single offer/answer exchange.
There may be mid-session RE-INVITE offer/answer exchanges for codec changes or for moving the RTP streams
to handle failure scenarios.
</t>
<figure anchor="persistent">
<preamble></preamble>
<artwork><![CDATA[
|--- CS 1 ---| |--- CS 2 ---| |--- CS 3 ---|
RS |---------------------- Recording Session ---------------------|
]]></artwork>
<postamble></postamble>
</figure>
<t>A Recording Session records continuously without interruption.
Applications include financial trading desks and emergency (first-responder) service
bureaus. The length of a Persistent Recording Sessions is
independent from the length of the actual Communication Sessions.
Persistent Recording Sessions avoid issues such as media clipping
that can occur due to delays in Recording Session establishment.</t>
<t>The connection and attributes of media in the Recording
Session are not dynamically signaled for each Communication Session before
it can be recorded; however, codec re-negotiation is possible.
Call details and metadata will still be
signaled, but can be post- correlated to the recorded media. There
will still need to be a means of correlating the recorded media
connection/packets to the Communication Session, however this may be on a
permanent filter-type basis, such as based on a SIP AoR of an agent
that is always recorded.</t>
<figure anchor="mixing">
<preamble>In some cases, more than one concurrent Communication Session
(on a single end-user apparatus, e.g. trading floor turret)
is mixed into one Recording Session:</preamble>
<artwork><![CDATA[
|-------- CS 1 -------|
|-------- CS 2 -------|
|-------- CS 3 -------|
RS |----------- Recording Session --------------|
]]></artwork>
<postamble></postamble>
</figure>
<t>Use Case 5: Real-time Recording Controls.</t>
<t>For an active Recording Session, privacy or security reasons may
demand not capturing a specific portion of a conversation. An
example is for PCI (payment card industry) compliance where credit
card info must be protected. One solution is to not record a caller
speaking their credit card information.</t>
<t>Real-time controls may include Pause/Resume, or Mute/Unmute.</t>
<t>Use Case 6: IVR / Voice Portal Recording.</t>
<t> Self-service Interactive Voice Response (DTMF or ASR) applications
may need to be recorded for application performance tuning or to meet
compliance requirements.</t>
<t>Metadata about an IVR session recording must include session information
and may include application context information (e.g. VoiceXML session variables, dialog names, etc.)</t>
<t>Use Case 7: Enterprise Mobility Recording.</t>
<t>Many agents and enterprise workers are not located on company premises.</t>
<t>Examples:</t>
<t>o Home-based agents or enterprise workers.</t>
<t>o Mobile phones of knowledge workers when they conduct work related
(and legally required recording) calls. i.e. insurance agents,
brokers, physicians.</t>
<t>Use Case 8: Geographically distributed or centralized recording.</t>
<t>Global banks with multiple branches up to thousands of small sites.</t>
<t>o Only phones and network infrastructure in branches, no recording
services.</t>
<t>o Internal calls inside or between branches must be recorded.</t>
<t>o Centralized recording system in data centers together with
telephony infrastructure (e.g. PBX).</t>
<t>Obtain media by forking:</t>
<t>o In data centers (inbound/outbound): fork media in gateway.</t>
<t>o At branch locations (internal): fork from end points or from site
level gateways.</t>
<t>o Conference based solution is not applicable because it will require
to reroute all communications via data centers.</t>
<t>Use Case 9: Record complex calls.</t>
<t>Record a call that is associated with another call.</t>
<t>Example:</t>
<t>o Customer in conversation with Agent</t>
<t>o Agent puts customer on hold in order to consult with a Supervisor.</t>
<t>o Agent in conversation with Supervisor.</t>
<t>o Agent disconnects from Supervisor, reconnects with Customer.</t>
<t>o The Supervisor call must be associated with the original customer call.</t>
<t>Use case 10: High-Availability, High Reliability Recording.</t>
<t>If recorder is not available at call setup time:</t>
<t>o Reject request to establish Communication Session.</t>
<t>o Redirect to alternate recorder.</t>
<t>If recorder has failed during a call, take specific action:</t>
<t>o Fail/Transfer call / fail over to a different recorder (may involve
loss of recording during transition time).</t>
<t>o Failover recovery situation: continuation of a call recorded on a
different recorder. Protocol must indicate that a Recording Session
is a continuation of a previous Recording Session.</t>
<t>For some use cases, if recording is not available or failed in a
middle of the call, original conversation has to be terminated or
alerted as well.</t>
<t>Use Case 11: Record multi-channel, multi-media session.</t>
<t>Some applications require the recording of more than one media
stream, possibly of different types. Media is synchronized, either
at storage or at playback.</t>
<t>Speech analytics technologies (e.g. word spotting, emotion detection,
speaker identification) may require speaker-separated recordings for
optimum performance.</t>
<t>Multi-modal Contact Centers may include audio, video, IM or other
interaction modalities.</t>
<t>In trading floors environments, in order to save resources,
it may be preferable to mix multiple concurrent calls (Communication Sessions)
on different handsets/speakers on the same turret into single recording session.</t>
<t>Use Case 12: Real-time media processing.</t>
<t>Recorder must support real-time media processing, such as speech analytics.</t>
<t>Recording and real-time analytics of trading floor interactions
(including video and instant messaging). Real time analytics is
required for automatic intervention (stopping interaction or alert)
if for example, trader is not following regulations.</t>
<t>Speaker separation is required in order to reliably detect who is
saying specific phrases.</t>
</section>
<section title="Requirements">
<t>The following are requirements for the Session Recording Protocol:</t>
<t>o REQ-000 The mechanism MUST provide a means for establishing, maintaining and
clearing Recording Sessions between a Session Recording Client and a
Session Recording Server for the purpose of recording, at the Session Recording Server,
media from Communication Sessions.</t>
<t>o REQ-001 The mechanism MUST support Full-time Recording Sessions.</t>
<t>o REQ-002 The mechanism MUST support Selective Recording Sessions.</t>
<t>o REQ-003 The mechanism MUST support Dynamic (on-demand) Recording
Sessions.</t>
<t>o REQ-004 The mechanism MUST support Persistent Recording Sessions.</t>
<t>o REQ-005 The mechanism MUST support establishing Recording Sessions
from the SRC to the SRS. This requirement typically applies when the
decision on whether a session should be recorded or not resides in
the SRC.</t>
<t>o REQ-006 The mechanism MUST support establishing Recording Sessions
from the SRS to the SRC. This requirement typically applies when the
decision on whether a session should be recorded or not resides in
the RS.</t>
<t>o REQ-007 The mechanism MUST support the recording of IVR sessions.</t>
<t>o REQ-008 The mechanism SHOULD support SRS failover and migration of
Recording Sessions to a working SRS without disconnecting the Communication
Sessions.</t>
<t>o REQ-009 A request for a new Recording Session MUST be rejected by the
SRS if service is unavailable (e.g. system overload, disk full, etc.)</t>
<t>o REQ-010 A request for a new Recording Session MUST be redirected to
an available SRS.</t>
<t>o REQ-011 If no recording resources are available, appropriate error
message MUST be returned.</t>
<t>o REQ-012 The mechanism MUST support the ability for an SRC to deliver mixed audio
streams from a single or multiple Communication Sessions to an SRS.</t>
<t>Note: A mixed audio stream is where several Communication Sessions
are carried in a single Recording Session. A mixed media stream is
typically produced by a mixer function.
The RS MAY be informed about the composition of the mixed streams through session metadata.</t>
<t>o REQ-013 The mechanism MUST support the ability to deliver multiple
media streams for a given Communication Session over separate Recording
Sessions to the SRS.</t>
<t>o REQ-014 The mechanism MUST support the ability to deliver multiple
media streams for a given Communication Session over a single Recording
Session to the SRS.</t>
<t>o REQ-015 The mechanism MUST support the ability to pause and resume
the Recording Session from the SRC.</t>
<t>o REQ-016 The mechanism MUST support the ability to pause and resume
the Recording Session from the SRS.</t>
<t>o REQ-017 The mechanism MUST support the ability to correlate the
request to record a call with the session being recorded.</t>
<t>o REQ-018 The mechanism MUST support the ability to correlate the
request to record specific media sessions with the SIP session and
media to be recorded (Recorded Session).</t>
<t>o REQ-019 The mechanism MUST support the ability to transport
the metadata in the same SIP dialog as the Recording Session.</t>
<t>o REQ-020 The mechanism MUST support the ability to transport
the metadata outside of the Recording Session SIP dialog.</t>
<t>o REQ-021 Metadata format must be agnostic of the transport protocol.</t>
<t>o REQ-022 The mechanism MUST support the ability to correlate metadata
to the Recording Session.</t>
<t>o REQ-023 The mechanism MUST support a means for a SIP UA to request
that a session is not recorded.</t>
<t>o REQ-024 The mechanism MUST provide a means of indicating to the end users
of a Communication Session that the session in which they are participating
is being recorded.</t>
<t>Examples include: inject tones into the Communication Session from the SRC,
play a message at the beginning of a session, a visual indicator on a display, etc.</t>
<t>o REQ-025 A Recording Session may start at call setup.</t>
<t>o REQ-026 A Recording Session may start during a call.</t>
<t>o REQ-027 The mechanism SHOULD support means to avoid clipping media
(leading or trailing samples) when the media is transported from the
SRC to the SRS.</t>
<t>Note: Media clipping can occur due to delays in recording session
establishment. SRC implementations typically buffer some portion of
the media to overcome this problem.</t>
<t>o REQ-028 Recorder MUST provide alarm notifications when a failure
occurs.</t>
<t>o REQ-029 The mechanism MUST support a means of providing security
(confidentiality, integrity and authentication) for the SIPREC.</t>
<t>o REQ-030 The mechanism MUST enable the Recording Session to identify
itself as a SIP session that is established for the purpose of recording.</t>
<t>o REQ-031 The mechanism MUST support the existing SIP security model,
including eavesdropping protection, authorization and authentication.</t>
<t>o REQ-032 If the Communication Session is encrypted,
the Recording Session must not share the same keys.</t>
<t>o REQ-033 The mechanism SHALL support means to relate Recording Session(s)
with Communication Session(s).</t>
</section>
<section title="Security Considerations">
<t>Session recording has substantial security implications, both for the
SIP UA's being recorded, and for the Session Recording Protocol
itself in terms of the SRC and SRS.</t>
<t>For the SIP UA's involved in the Communication Session, the requirements
in this draft enable the UA to identify that a session is being
recorded and for the UA to request that a given session is not
subject to recording.</t>
<t>Since humans don't typically look at or know about protocol signaling
such as SIP, and indeed the SIP session might have originated through
a PSTN Gateway without any ability to pass on in-signaling
indications of recording, users can be notified of recording in the
media itself through voice announcements, a visual indicator on
the endpoint, or other means.</t>
<t>With regards to security implications of the protocol(s), clearly
there is a need for authentication, authorization, eavesdropping
protection, and non-repudiation for the solution. The SRC needs to
know the SRS it is communicating with is legitimate, and vice-versa,
even if they are in different domains. Both the signaling and media
for the SIPREC needs the ability to be authenticated and protected from
eavesdropping and non-repudiation. Requirements are
detailed in the requirements section.</t>
</section>
<section title="IANA Considerations">
<t>This document has no IANA actions.</t>
</section>
<section title="Acknowledgements">
<t>Thanks to Dan Wing, Alan Johnson, Vijay Gurbani
and Cullen Jennings for their help with this document,
and to all the members of the DISPATCH WG mailing list for
providing valuable input to this work.</t>
</section>
<section title="Contributors">
<t>In addition to the editors, the following people provided substantial
technical and writing contributions to this document, listed
alphabetically:</t>
<t>Hadriel Kaplan<vspace/>
Acme Packet<vspace/>
71 Third Ave.<vspace/>
Burlington, MA 01803<vspace/>
USA<vspace/>
hkaplan@acmepacket.com</t>
<t>Henry Lum<vspace/>
Genesys, Alcatel-Lucent<vspace/>
1380 Rodick Road, Suite 200<vspace/>
Markham, Ontario L3R 4G5<vspace/>
Canada<vspace/>
henry.lum@genesyslab.com</t>
<t>Martin Palmer<vspace/>
BT Global Services<vspace/>
Annandale House,1 Hanworth Road,<vspace/>
Sunbury on Thames Middlesex TW16 5DJ<vspace/>
UK<vspace/>
martin.4.palmer@bt.com</t>
<t>Dave Smith<vspace/>
Genesys, Alcatel-Lucent<vspace/>
2001 Junipero Serra Blvd, Daly City, CA 94014<vspace/>
USA<vspace/>
dsmith@genesyslab.com</t>
</section>
</middle>
<back>
<references title='Normative References'>
&rfc2119;
&rfc3261;
&rfc2804;
&rfc4103;
</references>
</back>
</rfc>
| PAFTECH AB 2003-2026 | 2026-04-24 22:12:27 |