One document matched: draft-westerlund-perc-webrtc-use-case-01.xml
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info" docName="draft-westerlund-perc-webrtc-use-case-01"
ipr="trust200902">
<front>
<title abbrev="WebRTC use cases in PERC">WebRTC Use Case and Framework for
Privacy Enhanced RTP Conferencing (PERC)</title>
<author fullname="Magnus Westerlund" initials="M." surname="Westerlund">
<organization>Ericsson</organization>
<address>
<postal>
<street>Farogatan 6</street>
<city>SE-164 80 Kista</city>
<country>Sweden</country>
</postal>
<phone>+46 10 714 82 87</phone>
<email>magnus.westerlund@ericsson.com</email>
</address>
</author>
<author fullname="John Mattsson" initials="J." surname="Mattsson">
<organization>Ericsson</organization>
<address>
<postal>
<street>Farogatan 6</street>
<city>SE-164 80 Kista</city>
<country>Sweden</country>
</postal>
<phone>+46 10 71 43 501</phone>
<email>john.mattsson@ericsson.com</email>
</address>
</author>
<date/>
<abstract>
<t>The work so far on Privacy Enhanced RTP Conferencing, which allows
end-to-end security also in centralized switched RTP based conferences,
has not considered WebRTC in detail. This document looks at the use case
of WebRTC based endpoints, it also considers implications of using
external providers for both conference applications and centralized
media distribution devices. From this a number of challenges have been
determined, and requirements are derived from these. Finally the draft
presents some straw man for possible solutions.</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>This document discusses the implications on PERC WG's work on
providing end-to-end secure centralized RTP conferencing using WebRTC
browsers as endpoints. The WebRTC environment contains a number of
challenges that needs to be considered; these may affect how the final
solution is designed. The authors have also have a strong interest in
enabling usages where significant amount of sourcing of external
resources are possible to perform. Not only the media distribution
devices (MDD) and STUN/TURN resources, but also the core functionalities
of the conference application, such as the find and connect to establish
the conference. However, the control over the end-to-end security needs
to be possible to maintain within a single organization. This
organization needs to maintain control over both who is authorized to
participate in a particular conference, as well as having control over
the end-to-end keys used in that conference.</t>
<t>It needs to be stressed that the use case presented here is far from
the only one where WebRTC endpoints could be used to establish a
multiparty end-to-end secured conference. The authors have chose to
focus on use case that combines WebRTC endpoints, contextual
communication and outsourcing, a use case suitable for a number of
enterprises, businesses, and services.</t>
<t><xref target="sec-use-case"/> goes through a possible use case and
its high level motivation. <xref target="sec-entities-trust"/> discusses
the different trust domains as well as the entities that are considered
in this use case. <xref target="sec-challenges"/> discusses a number of
challenges where several are unique to WebRTC compared to more native
implementations of endpoints. <xref target="sec-requirements"/> derives
a number of requirements. Finally in <xref target="sec-straw-man"/> we
present a straw man solution to some of the challenges we raised.</t>
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in <xref
target="RFC2119">RFC 2119</xref>.</t>
</section>
<section anchor="sec-use-case" title="The Use Case">
<t>This section discusses the representative use case in more detail as
well as discussing relevant background information for this use
case.</t>
<t>A use case (Enterprise Real-time contextual communication) is that an
enterprise has the need for multi-party real-time conferencing with
audio and video that is combined with enterprise's internal data that
can be viewed and manipulated using a web application. The conferencing
is intended to allow multiple employees or external consultants to
discuss the data and manipulate its content as well as present during
the discussion, i.e. a form of contextual communication.</t>
<t>The already existing web application to view and manipulate the
internal data is desirable to be able to re-use and the conference
participants are already using a web browser for this purpose. Thus,
basing the solution on WebRTC appears logical as that will enable
integration of the Real-time communication (RTC) conferencing part with
the existing web application.</t>
<t>The enterprise has no desire to maintain substantial RTC
infrastructure to ensure well working conferencing, and prefers to
source the needed services and components from external providers.
However, the enterprise has strong interests in maintaining control of
the security properties and ensure that its security goals are meet, it
may even have legal requirements on its communication.</t>
<t>The enterprise already has existing methods for authenticating their
employees, consultants and other external parties that have some access
rights to the enterprise's data. It would be highly desirable to be able
to re-use the existing user database and authorization verification.</t>
</section>
<section anchor="sec-entities-trust" title="Entities and Trust">
<t>This section discusses various entities and the trust in these
roles.</t>
<section title="Trust Domains">
<t>The entities belong to three different trust domains:<list
style="hanging">
<t hangText="Trusted:">Entities in the trusted domain are fully
trusted to perform the role and actions put on them. They may have
access to unprotected content and keying material used to protect
content end-to-end.</t>
<t hangText="Semi-trusted:">Semi-trusted entities have no access
to confidential material such as the content and the keying
material used to protect content end-to-end. They are however
trusted to perform basic operations for selective forwarding of
content as well as session establishment.</t>
<t hangText="Untrusted:">Entities and functions in this domain are
not trusted by the other entities participating in the conference
or system. These entities are capable of preventing the conference
from functioning, however.</t>
</list></t>
<t>The entities in the trusted and semi-trusted domains are show in
Figure 1, and described in more detail in Sections 3.2 and 3.3. Note
that part of the "endpoint" is trusted, while other parts are only
semi-trusted. A PERC conference involves more than one conference
participant and may involve several MDDs.</t>
<figure>
<artwork align="center"><![CDATA[
|
+------------------------+ | +------------------------+
| Service Provider | | | Conference Provider |
+------------------------+ | +------------------------+
|
+------------------------+ | +------------------------+
| Conference Initiator | | | Call Processing |
+------------------------+ | +------------------------+
|
+------------------------+ | +------------------------+
| Key Management | | | Media Distribution |
| Function (KMF) | | | Device (MDD) |
+------------------------+ | +------------------------+
|
+------------------------+ | +------------------------+
| Conference Participant | | | STUN/TURN |
+------------------------+ | +------------------------+
|
+-Endpoint------------------------------------------------+
| +--------------------+ : +--------------------+ |
| | Core | : | Conference | |
| | Application | : | Application | |
| +--------------------+ : +--------------------+ |
| +--------------------+ : |
| | User Agent (UA) | : |
| +--------------------+ : |
+---------------------------------------------------------+
|
|
Trusted | Semi-trusted
Figure 1: Entities in the Trusted and Semi-trusted Domains
]]></artwork>
</figure>
</section>
<section title="Trusted Entities">
<t>The trusted entities that we consider in the use case are:<list
style="hanging">
<t hangText="The Service Provider:">The service provider is the
organization that establishes the requirement for the end-to-end
security in its RTC conference. It also has the control over
entities that provide functionalities in the trusted domain for
enabling the conference, such as the Key-Management Function and
the Core Application. This represents the enterprise in the above
use case description.</t>
<t hangText="Conference Initiator:">A human that schedules a
conference. The conference contains a number of invited conference
participants. These and only these participants shall be able to
join this conference.</t>
<t hangText="Conference Participant:">A conference participant is
a human or possibly another type of entity, like a conference room
that has one or more identities that has been invited into the
conference.</t>
<t hangText="User Agent:">The user agent is the WebRTC enabled
endpoint that the conference participant uses to join the
conference. In this document this is assumed to be a browser.</t>
<t hangText="Core Application:">The web application residing both
in the service provider's server as well as the web based
application in the User Agent. Should be able to delegate almost
everything to the Conference Application. Non RTC content part of
a contextual application is generally handled by this
application.</t>
<t hangText="Key Management Function:">The function within the
service provider that generates, stores and distributes the keys
giving access to the conference to User Agents that can provide
identity assertions that it can be verified and matches the
invited participants.</t>
<t hangText="Authorization Module:">A sub-function of the Key
Management Function (KMF) that verifies identity assertions
according to a particular authorization method.</t>
<t hangText="Conference Session Database:">A sub-system of the KMF
that contains information about scheduled conferences and which
identities that have been invited by the Conference Initiator.</t>
</list></t>
</section>
<section title="Semi-trusted Entities">
<t>The semi-trusted entities that we consider in the use case
are:<list style="hanging">
<t hangText="Conference Provider:">Provides the conferencing
service, potentially from within a separate administrative domain
than the service provider. This service contains find and connect
functions to establish the conference, media distribution devices
to selectively forward protected media content to the conference
participants as well as support functions for media path
establishment such as STUN and TURN services.</t>
<t hangText="Conference Application:">The web application residing
both in the conference provider's servers as well in the User
Agent that performs find and connect and other functions to
establish and maintain the conference.</t>
<t hangText="Media Distribution Devices:">A device that performs
switching and forwarding of protected media to the various
conference participants' user agents or to another cascading media
distribution device. The switching is performed based on meta
information about the content that has been provided by the user
agent. It applies security policies that prevent forwarding of
traffic not originating with a user agent that is an invited
participant. This protection is through a hop-by-hop security
mechanism that provides integrity and source authentication as
well as confidentiality of the metadata.</t>
<t hangText="STUN/TURN:">Support functions that help establish the
transport path used to send media between a user agent and the
media distribution device.</t>
</list></t>
</section>
</section>
<section anchor="sec-challenges" title="Challenges">
<t>This section discusses a number of challenges in meeting the goals as
discussed in <xref target="I-D.jones-perc-private-media-reqts"/>.</t>
<!--MW: Below we need to expand and derive the basis for the requirements. -->
<section title="Enable Delegation">
<t>As described above, the Service Provider delegates the
communication service to the Conference Provider. The Conference
Provider may in turn delegate functions like the media distribution
device and STUN/TURN services by sourcing them from other providers.
Further, the infrastructure (servers and network) that these functions
are run on, can also be sourced from external providers. This puts
even higher demands on control and the ability to verify other
entities actions from the perspective of the Service Provider.</t>
<t>The main security goal of providing end-to-end confidentiality
across a centralized conferencing infrastructure is the main enabler
of delegation, as the required trust in large part of the
infrastructure are significantly reduced by freeing them from handle
any content as plain-text. However, that is not sufficient as not only
the content handling needs to be limited to only the entities that are
required to handle it. Also the key-management and authorization parts
of the solution need to consider how they can limit the trust. For
example the find and connect service is a semi-trusted part as it
needs to be capable of establishing the connectivity with the right
entities. However, the key-management and authorization system is the
one that verifies the participants and their right to participate in a
particular conference, and first then provides that participant's
endpoint with the necessary secrets.</t>
<t>The system design needs to minimize the privacy sensitive
information a particular functions needs. Thus, enabling as much
functionality as possible to be outside of the trusted domain.
Important functions in the semi-trusted domain, when so necessary to
ensure secure operation of the system, must be verified by trusted
entities.</t>
</section>
<section title="Dealing with JavaScript related Attacks">
<t>The application, such as JavaScript application, running in the
browser is a potential attack vector. Using various attacks, including
cross-site scripting, the application can be compromised and perform
the actions an attacker dictates. Even if the application running in
the browser is malicious it must not be able to compromise the
security of the conference, only perform denial of service attacks
such as preventing the user from joining the conference.</t>
<t>A compromised application must be prevented from getting access to
content. This will most likely mean that when using the end-to-end
confidentiality, corresponding measures to prevent forwarding (of
plain text content), access to raw data through APIs etc. that the
media confidentiality mode defined in <xref
target="I-D.ietf-rtcweb-alpn"/> have to be applied.</t>
<t>The compromised application will get access to who the peer(s) are
in the conference. This is unpreventable as the application is the
responsible for establishing the communication legs that is creating
the conference. An attacker will also be able to use a compromised
entity to forward protected content to a destination of its
choice.</t>
</section>
<section title="Forcing End-to-End Security">
<t>The Service Provider needs a method to ensure that when the
conference provider application launches the RTCPeerConnections, they
are forced to use end-to-end security with the keys provided by the
KMF the Service Provider designates, and not normal hop-by-hop
security only or end-to-end security with other keys. Thus, the
service provider needs a way of applying policies on an web
application context, or the conference participants must actively
check and understand information in the browser chrome. This first
approach could e.g. be done as the Service Provider web server setting
policies and restrictions that the UA enforces towards the JavaScript.
Policies that are inherited by any child contexts and which can't be
modified by the application in the user agent. A user clicking the
correct link would then be secure. The second approach seems to give
much weaker security as the average user do not look for security
information and do not understand it. A desirable model is that of
HTTPS, as long as an end user enters the correct URL, they are
guaranteed e2e security.</t>
</section>
<section title="Restricting Usage of the e2e Keying Material">
<t>While the RTCPeerConnection must use end-to-end security with the
key provided by the Service Provider, neither the core or conference
applications must be able to extract the key or even use the e2e key
material for anything other than encrypted key transport (EKT) as this
may lead to information leakage by e.g. so called two-time pad. The
user-agent will be required to have a secure key-store for the
duration the key-material is present at the user-agent and valid. When
the validity of the key-material expires the key-material needs to be
disposed of to reduce the risk of retrospective attacks.</t>
</section>
<section title="Enabling Flexible Authentication">
<t>The authorization methods should be flexible and enable different
types of authorization back-ends. It is desirable that the method for
authorization does not need to be implemented as part of the user
agent. Requiring user agent modifications makes deployment of new
authorization method cumbersome and difficult and open up for down
grade attacks due to need for backwards compatibility support.</t>
</section>
<section title="Binding Authorization to Endpoint">
<t>As the authorization will be used to retrieve the group key used to
secure the RTP session end-to-end, it is important that the
authorization is bound to the device and user agent where the user
gave the authorization. Otherwise the conference provider would be
able to move the authorization credentials to another endpoint, use
that endpoint to retrieve the key and export it from that
endpoint.</t>
</section>
<section title="Secure Binding of e2e Source Identifier to User Name or Alias">
<t>In many usages, it is important that the conference participants
can see in real-time who is participating and who is talking. This
requires that the endpoint can map the e2e source identifier to the
user name. The list of participant names as well as the binding to the
e2e source identifiers needs to be authenticated by a trusted party to
prevent attacks where an semi-trusted entity suggests an incorrect
binding between an e2e Source Identifier and a user name.</t>
</section>
<section anchor="sec-challange-group-change"
title="Securing Group Membership Changes">
<t>During an ongoing conference the set of participants participating
in a conference will vary. In some usages a late joining participant
should get access to keying material to decrypt a conference
recording. In other usages it is important that joining participant
can not use the received keying material to perform a retrospective
attack and decrypt the content of the conference from a point prior to
the participant joining. Nor should the participant after having left
the conference be able continue to decrypt the content.</t>
<t>The known solution to this issue is to switch keys, both group key
as well as the transport keys each endpoint uses to protect its
streams. This puts certain requirements on key-management system.
First the key-management system must track the current set of
participants and on changes initiate the change of the group key. This
results in a second requirement that they key distribution method for
the group key can handle asynchronous distribution events in the KMF
to endpoint direction. Thirdly the transport key switching and
distribution needs to handle non-synchronized switching by the
different endpoints to the new group key.</t>
<t>A clear issue is how the KMF can ensure that a participant that is
leaving is correctly accounted for and the key change happens in a
timely fashion after the user left. First of all users may leave the
conference abruptly due to severed communication or an endpoint that
crashes. Secondly, the conference management application is only
semi-trusted. The design will have to make choices on how to balance
protocol complexities, resource consumption and achieved security
properties.</t>
<t>An additional complexity with this mode of operation is that the
conference participant likes to in a secure way know which other
participants that currently are part of the conference. This
information needs to be timely updated, and the current rooster needs
to be authenticated to prevent attacks where participants are fooled
to believe a particular participant has left, but is in fact still in
the conference.</t>
</section>
<section title="Key Revocation">
<t>The conference e2e group key is only required to reside on an
endpoint for the duration its in use. That use is limited by the
duration of the conference. When the conference ends there are no
reasons to retain the key on the endpoint. Thus, when the conference
ends it is desirable to have the key be revoked and deleted from the
endpoint. This should be possible to initiate from the KMF when it
learns that the conference has ended.</t>
<t>The user agent should upon the user closing the browsing context
where the application runs deleting the keys to prevent their
leakage.</t>
</section>
</section>
<section anchor="sec-requirements" title="Requirements">
<!--MW: This section needs to collect the requirements we derive. -->
<t>This section lists a number of derived requirements from the above
challenges. The requirements are:<list style="letters">
<t>The Web application running in the User Agent MUST NOT be able to
compromise the content confidentiality.<list style="numbers">
<t>Including getting access to media content (raw or
unencrypted) in the user agent through API or shared
resources.</t>
</list></t>
<t>The conference provider's application (server as well as in the
user agent) MUST NOT be able to downgrade the intended security
properties and policies established by the service provider and the
core application.</t>
<t>The key material for the end-to-end protection MUST NOT be
possible to extract from any web application. <list style="numbers">
<t>The user agent MUST protect the key-material against
extraction by user or other software running on the same
device.</t>
<t>The key material MUST be bound to the usage its intended to
prevent leakage.</t>
<t>Upon the termination of the conference or the browsing
context containing the application the key material SHALL be
deleted</t>
</list></t>
<t>Different Authorization methods MUST be supported.<list
style="numbers">
<t>It's preferable that authorization methods can be supported
without user agent modifications.</t>
<t>The authorization credentials MUST be bound to endpoint where
the participant provided its credentials.</t>
</list></t>
<t>The design SHOULD support confidentiality where only the current
set of participants has access to the media content.</t>
<t>The conference rooster and the binding to the e2e source
identifies MUST be provided by a trusted party.</t>
</list></t>
</section>
<section anchor="sec-straw-man" title="Solution Straw-man">
<t/>
<section title="High Level Example Message Flows">
<t>The following figure shows a very high level illustration of an
example message flow for Privacy Enhanced RTP Conferencing using
WebRTC.</t>
<figure align="center">
<artwork><![CDATA[
+-----------+ +-------------+ +------------------+
| User/UA | | Conf. Prov. | | Service Provider |
+-----------+ +-------------+ +------------------+
| | |
| Invitation (Service Provider Conference URI) |
|<----------------------------------------------------|
| | |
| Launch Service Provider APP |
|<--------------------------------------------------->|
| User Authentication |
|<--------------------------------------------------->|
| Request Authorization Tokens |
|<--------------------------------------------------->|
| Request e2e Keying Material |
|<--------------------------------------------------->|
| | |
| Launch Conference APP | |
|<------------------------>| |
| Session Setup | |
|<------------------------>| |
| Setup PeerConnection | |
|<------------------------>| |
| DTLS-SRTP | |
|<------------------------>| |
| SRTP | |
|<------------------------>| |
]]></artwork>
</figure>
<t>The Conference Initiator schedules a conference and invitations are
sent out to the conference participants. This could for example be
done via e-mail.</t>
<t>At a later stage, when the conference is about to start, the
Conference Participant enters the conference URI in a browser it
trusts to launch the core web application. The user then authenticates
to the Service Provider Authorization Module (using a authentication
method of choice), and downloads the end-to-end keying material from
the Service Provider KMF.</t>
<t>The UA then launches the Conference Application, negotiates the
session parameters and sets up the PeerConnection. The Conference
provider validates that the participant is authorized by the Service
Provider to join the conference. The hob-by-hop security is provided
by DTLS-SRTP and SRTP (modified to handle end-to-end and hop-by-hop).
The UA enforces the use of end-to-end security with the key provided
by the Service Provider.</t>
<t>All communication except the invitation and the PeerConnection is
to be done over HTTPS.</t>
</section>
<section title="Authentication and Authorization">
<figure>
<artwork><![CDATA[
+-----------+ +-------------+ +------------------+
| User/UA | | Conf. Prov. | | Service Provider |
+-----------+ +-------------+ +------------------+
| | |
| HTTPS OAuth 2.0 Authorization Requests |
|---------------------------------------------------->|
| HTTPS OAuth 2.0 Access Tokens |
|<----------------------------------------------------|
| | |
| HTTPS Key Request (token1) |
|---------------------------------------------------->|
| HTTPS Key Response | |
|<----------------------------------------------------|
| | |
| HTTPS Join (token2) | |
|------------------------->| |
| HTTPS 200 OK | |
|------------------------->| |
]]></artwork>
</figure>
<t>One way to make the end-to-end security solution flexible and
enable integration with different types of Service Provider
authorization back-ends is to use a general authorization framework
such as OAuth 2.0. The User requests access tokens for all the
protected resources from the Service Provider Authorization Module.
The protected resources can be hosted by the Service Provider (e.g.
the Key Management Function) as well as by the Conference provider.
The use of OAuth 2.0 allows the same framework to be used in both
cases.</t>
<!--MW: Should we comment on the possibilty to use the IdP solution to generate an asseration?-->
</section>
<section title="Key Management">
<figure>
<artwork><![CDATA[
+-----------+ +-------------+ +-------------+
| UA | | MDD | | KMF |
+-----------+ +-------------+ +-------------+
| | |
| HTTPS POST (ConferenceID, token) |
|---------------------------------------------------->|
| HTTPS 200 OK (KeyID, e2e Key) |
|<----------------------------------------------------|
| DTLS-SRTP | |
|<------------------------>| |
| SRTP | |
|<------------------------>| |
]]></artwork>
</figure>
<t>The core web application requests the end-to-end keying material
from the Service Provider KMF. The successful HTTP response uses the
HTTP Encryption-Key header <xref
target="I-D.thomson-http-encryption"/> to distribute the end-to-end
keying material to the UA. The new parameter "usage" and its value
"EKT" instructs the UA that the keying material will be used with SRTP
Encrypted Key Transport (EKT). The UA stores the keyid and the keying
material for usage as the EKT Key. Key material received with the
"usage=EKT" parameter SHALL NOT be extractable and SHALL only be used
for EKT. The EKT processing MUST be handled by the UA.</t>
<t>An example successful HTTP response from the KMF is shown
below:</t>
<figure>
<artwork><![CDATA[
HTTP/1.1 200 OK
Encryption-Key: keyid="pegh"; key="lupDujHomwIjlutebgharghmey";
usage="EKT"
Content-Length: 0
]]></artwork>
</figure>
<t>The hob-by-hop keying material is negotiated between the UA and the
MDD using DTLS-SRTP.</t>
</section>
</section>
<section anchor="IANA" title="IANA Considerations">
<t>This document makes no request of IANA.</t>
<t>Note to RFC Editor: this section may be removed on publication as an
RFC.</t>
</section>
<section anchor="Security" title="Security Considerations">
<t>The whole document is about making WebRTC based cloud-based
conferencing viable and trustworthy from a pervasive monitoring
perspective.</t>
</section>
<section anchor="Acknowledgements" title="Acknowledgements">
<t>The authors would like to thank Göran AP Eriksson for
challanging discussions and Russ White for valuable comments.</t>
</section>
</middle>
<back>
<references title="Informative References">
<?rfc include="reference.I-D.jones-perc-private-media-reqts"?>
<?rfc include="reference.I-D.thomson-http-encryption"?>
<?rfc include='reference.I-D.ietf-rtcweb-alpn'?>
<?rfc include='reference.RFC.2119'?>
</references>
</back>
</rfc>
| PAFTECH AB 2003-2026 | 2026-04-24 05:39:16 |