IPS                                          Julian Satran
Internet Draft                               Daniel Smith
Document: draft-ietf-ips-iscsi-02.txt        Kalman Meth
Category: standards-track                    IBM

                                             Constantin Sapuntzakis
                                             Cisco Systems

                                             Matt Wakeley
                                             Agilent Technologies

                                             Paul Von Stamwitz
                                             Adaptec

                                             Randy Haagens
                                             Hewlett-Packard Co.

                                             Efri Zeidner
                                             SANGate

                                             Luciano Dalle Ore
                                             Quantum

                                             Yaron Klein
                                             SANRAD
iSCSI
Julian Satran Standards-Track, Expire June 2001 1
iSCSI December 30, 2000
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 [1].
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or made obsolete by other documents at
any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
The Small Computer Systems Interface (SCSI) is a popular family of
protocols for communicating with I/O devices, especially storage
devices. This memo describes a transport protocol for SCSI that
operates on top of TCP. The iSCSI protocol aims to be fully
compliant with the requirements laid out in the SCSI Architecture
Model - 2 [SAM2] document.
Acknowledgements
Besides the authors, a large group of people contributed to the
creation of this document through their review, comments and valuable
insights - too many to mention them all. Nevertheless, we are
grateful to all of them. We are especially grateful to those that
found the time and patience to participate in our weekly phone
conferences and intermediate meetings in Almaden and Haifa and thus
helped shape this document: Jim Hafner, John Hufferd, Prasenjit
Sarkar, Meir Toledano, John Dowdy, Steve Legg, Alain Azagury (IBM),
Dave Nagle (CMU), David Black (EMC), John Matze (Veritas), Mark
Bakke, Steve DeGroote, Mark Shrandt (NuSpeed), Gabi Hecht (Gadzoox),
Robert Snively (Brocade), Nelson Nachum (StorAge). Many more helped
clean and improve this document within the IPS working group. We are
especially grateful to David Robinson (Sun), Charles Monia, Joshua
Tseng (Nishan), Somesh Gupta, Mallikarjun C., Michael Krause (HP),
Stephen Byan (Genroco), Yaron Klein (SANRAD). And last but not least
thanks to Ralph Weber for keeping us in line with T10 (SCSI)
standardization.
Conventions used in this document
In examples, "I->" and "T->" indicate iSCSI PDUs sent by the
initiator and target respectively.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119.
Table of Contents
Status of this Memo
Abstract
Acknowledgements
Conventions used in this document
1. Overview
1.1 SCSI Concepts
1.2 iSCSI Concepts & Functional Overview
1.2.1 Layers & Sessions
1.2.2 Ordering and iSCSI numbering
1.2.2.1 Command numbering
1.2.2.2 Response/Status numbering
1.2.2.3 Data PDU numbering
1.2.3 iSCSI Login
1.2.4 Text mode negotiation
1.2.5 iSCSI Full Feature Phase
1.2.6 iSCSI Connection Termination
1.2.7 Naming & mapping
1.2.8 Message Framing
1.2.8.1 Framing Justification
1.2.8.2 Markers At Fixed Intervals
1.2.8.3 iSCSI PDU Size
1.2.8.4 Initial marker-less interval
2. iSCSI PDU Formats
2.1 Template Header and Opcodes
2.1.1 Opcode
2.1.2 Opcode-specific fields
2.1.3 Length
2.1.4 LUN
2.1.5 Initiator Task Tag
2.1.6 Header Digest and Data Digest
2.2 SCSI Command
2.2.1 Flags & Task Attributes
2.2.2 AddCDB
2.2.3 CmdRN - Command Reference Number
2.2.4 ExpStatRN - Expected Status Reference Number
2.2.5 Expected Data Transfer Length
2.2.6 CDB - SCSI Command Descriptor Block
2.2.7 Command-Data
2.3 SCSI Response
2.3.1 Byte 1 - Flags
2.3.2 Basic Residual Count
2.3.3 Bidi-Read Residual Count
2.3.4 Command Status
2.3.5 Resp_length - Response length
2.3.6 Sense_length - Length of sense data
2.3.7 Response and/or Sense Data
2.3.8 StatRN - Status Reference Number
2.3.9 ExpCmdRN - next expected CmdRN from this initiator
2.3.10 MaxCmdRN - maximum CmdRN acceptable from this initiator
2.4 SCSI Task Management Command
2.4.1 Function
2.4.2 Referenced Task Tag
2.5 SCSI Task Management Response
2.5.1 Referenced Task Tag
2.6 SCSI Data
2.6.1 F (Final) bit
2.6.2 Length
2.6.3 Target Task Tag
2.6.4 Buffer Offset
2.6.5 Flags
2.6.6 Data numbering (DataRN)
2.7 Text Command
2.7.1 Length
2.7.2 Initiator Task Tag
2.7.3 Text
2.8 Text Response
2.8.1 Length
2.8.2 Initiator Task Tag
2.8.3 Text Response
2.9 Login Command
2.9.1 Version-major and Version-minor
2.9.2 CID
2.9.3 InitCmdRN
2.9.4 Login Parameters
2.10 Login Response
2.10.1 Version-major minor
2.10.2 InitStatRN
2.10.3 Status
2.10.4 TSID
2.10.5 Final bit
2.11 NOP-Out
2.11.1 P - Ping bit
2.11.2 Length
2.11.3 Initiator Task Tag
2.11.4 Target Task Tag
2.11.5 Ping Data
2.12 NOP-In
2.12.1 Target Task Tag
2.13 Logout Command
2.13.1 CID
2.13.2 Reason Code
2.14 Logout Response
2.14.1 Status
2.15 Ready To Transfer (R2T)
2.15.1 Desired Data Transfer Length and Buffer Offset
2.15.2 Target Transfer Tag
2.16 Asynchronous Event
2.16.1 iSCSI Event
2.16.2 SCSI Event Indicator
2.17 Third Party Commands
2.18 Reject
2.19 Reason
3. Login phase
3.1 Login phase start
3.2 Security negotiation
3.3 iSCSI Security
4. iSCSI Error Handling and Recovery
4.1 Connection failure
4.2 Protocol Errors
4.3 Session Errors
4.4 Format errors
4.5 Digest errors
5. Notes to Implementers
5.1 Multiple Network Adapters
5.2 Autosense
6. Security Considerations
6.1 Data Integrity
6.2 Network operations and the Threat Model
6.2.1 Threat Model
6.2.1.1 Passive Attacks
6.2.1.2 Active Attacks
6.2.2 Security Model
6.2.2.1 No Security
6.2.2.2 End-to-End Authentication
6.2.2.3 iSCSI integrity and authentication
6.2.2.4 Encryption
6.2.3 Other Considerations
6.3 Login Process
6.4 Feasibility
7. IANA Considerations
8. References and Bibliography
9. Author's Addresses
Appendix A. iSCSI Security
01 Security keys and values
02 Authentication
03 Salt
04 Challenge
05 Login Phase examples:
Appendix B. Examples
06 Read operation example
07 Write operation example
Appendix C. Login/Text keys (not security related)
08 MaxConnections
09 Target
10 Initiator
11 AccessID
12 UPFrame
13 UseR2T
14 BidiUseR2T
15 DataNumber
16 ImmediateDataLength
17 ITagLength
18 PingMaxReplyLength
19 StartSecure
20 TotalText
21 KeyValueText
22 MaxOutstandingR2T
Full Copyright Statement
1. Overview
1.1 SCSI Concepts
The SCSI Architecture Model-2 [SAM2] describes in detail the
architecture of the SCSI family of I/O protocols. This section
provides a brief background to situate readers in the vocabulary of
the SCSI architecture.
At the highest level, SCSI is a family of interfaces for requesting
services from I/O devices, including hard drives, tape drives, CD and
DVD drives, printers, and scanners. In SCSI parlance, an individual
I/O device is called a "logical unit" (LU).
SCSI is a client-server architecture. Clients of a SCSI interface are
called "initiators". Initiators issue SCSI "commands" to request
service from a logical unit. The "device server" on the logical unit
accepts SCSI commands and executes them.
A "SCSI transport" maps the client-server SCSI protocol to a specific
interconnect. Initiators are one endpoint of a SCSI transport. The
"target" is the other endpoint. A "target" can have multiple Logical
Units (LUs) behind it. Each logical unit has an address within a
target called a Logical Unit Number (LUN).
A SCSI task is a SCSI command or possibly a linked set of SCSI
commands. Some LUs support multiple pending (queued) tasks. The queue
of tasks is managed by the target, though. The target uses an
initiator-provided "task tag" to distinguish between tasks. Only one
command in a task can be outstanding at any given time.
Each SCSI command results in an optional data phase and a required
response phase. In the data phase, information can travel from the
initiator to target (e.g. WRITE), target to initiator (e.g. READ), or
in both directions. In the response phase, the target returns the
final status of the operation, including any errors. A response
terminates a SCSI command. For performance reasons iSCSI allows
"phase-binding" - e.g., command and its associated data may be
shipped together from initiator to target and data and responses may
be shipped together from targets.
Command Descriptor Blocks (CDBs) are the data structures used to
contain the command parameters to be handed by an initiator to a
target. The CDB content and structure are defined by [SAM2] and
device-type specific SCSI standards.
1.2 iSCSI Concepts & Functional Overview
The iSCSI protocol is a mapping of the SCSI remote procedure
invocation model on top of the TCP protocol.
In keeping with similar protocols, the initiator and target divide
their communications into messages. This document will use the term
"iSCSI protocol data unit" (iSCSI PDU) for these messages.
iSCSI transfer direction is defined with regard to the initiator.
Outbound or outgoing transfers are transfers from initiator to target
while inbound or incoming transfers are from target to initiator.
1.2.1 Layers & Sessions
The following conceptual layering model is used in this document to
specify initiator and target actions and how those relate to
transmitted and received Protocol Data Units:
- the SCSI layer builds/receives SCSI CDBs (Command Descriptor
Blocks) and relays/receives them with the remaining command execute
parameters (cf. SAM-2) to/from
- the iSCSI layer, which builds/receives iSCSI PDUs and
relays/receives them to/from
- one or more TCP connections that form an initiator-target
"session".
Communication between initiator and target occurs over one or more
TCP connections. The TCP connections carry control messages, SCSI
commands, parameters and data within iSCSI Protocol Data Units (iSCSI
PDUs). The group of TCP connections linking an initiator with a
target form a session (loosely equivalent to a SCSI I-T nexus). A
session is defined by a session ID (composed of an initiator part and
a target part). TCP connections can be added and removed from a
session. Connections within a session are identified by a connection
ID (CID).
Across all connections within a session, an initiator will see one
"target image". All target identifying elements, like LUN are the
same. In addition, across all connections within a session a target
will see one "initiator image". Initiator identifying elements like
Initiator Task Tag can be used to identify the same entity regardless
of the connection on which they are sent or received.
iSCSI targets and initiators MUST support at least one TCP connection
and MAY support several connections in a session.
1.2.2 Ordering and iSCSI numbering
iSCSI uses Command, Status and Data numbering schemes.
Command numbering is session wide and is used for ordered command
delivery over multiple connections. It can also be used as a
mechanism for command flow control over a session.
Status numbering is per connection and is used to enable recovery
in case of connection failure.
Data numbering is per command and is meant to reduce the amount of
memory needed by a target sending unrecoverable data for command
retry.
Normally, fields in the iSCSI PDUs communicate the reference numbers
between the initiator and target. During periods when traffic on a
connection is unidirectional, iSCSI NOP-message PDUs may be utilized
to synchronize the command and status ordering counters of the target
and initiator.
iSCSI NOP-Out PDUs are used as acknowledgements for data numbering.
1.2.2.1 Command numbering
iSCSI supports ordered command delivery within a session. All
commands (initiator-to-target) are numbered.
Any SCSI activity is related to a task (SAM-2). The task is
identified by the Initiator Task Tag for the life of the task.
Commands in transit from the initiator SCSI layer to the target SCSI
layer are numbered by iSCSI and the number is carried by the iSCSI
PDU as CmdRN (Command-Reference-Number). The numbering is session-
wide. All iSCSI PDUs that have a task association carry this number.
CmdRNs are allocated by the initiator iSCSI within a 32-bit unsigned
counter (modulo 2**32). The value 0 is reserved and used to mean
immediate delivery. Comparisons and arithmetic on CmdRN SHOULD use
Serial Number Arithmetic as defined in [RFC1982] where SERIAL_BITS =
32.
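The comparison rule referenced above can be sketched as follows. This is
only an illustration of RFC 1982 comparison and addition with
SERIAL_BITS = 32; the names serial_lt and serial_add are hypothetical
and do not appear in this document, and the reserved CmdRN value 0 is
not treated specially here.

```python
SERIAL_BITS = 32
MODULUS = 1 << SERIAL_BITS          # 2**32
HALF = 1 << (SERIAL_BITS - 1)       # 2**31

def serial_add(s, n):
    """Add n to serial number s, wrapping modulo 2**32.

    RFC 1982 only defines addition for 0 <= n <= 2**31 - 1.
    """
    if not 0 <= n < HALF:
        raise ValueError("increment out of range for serial arithmetic")
    return (s + n) % MODULUS

def serial_lt(a, b):
    """Return True if a precedes b under RFC 1982 comparison.

    When the two values are exactly 2**31 apart the RFC leaves the
    result undefined; this sketch returns False in that case.
    """
    return a != b and ((b - a) % MODULUS) < HALF
```

With this rule a counter that has wrapped still compares as later than
a value just before the wrap, e.g. serial_lt(0xFFFFFFFF, 1) is true.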
The target may choose to deliver some task management commands for
immediate delivery. The means by which the SCSI layer may request
immediate delivery for a command or by which iSCSI will decide by
itself to mark a PDU for immediate delivery are outside the scope of
this document.
CmdRNs are significant only during command delivery to the target.
Once the device serving part of the target SCSI has received a
command, CmdRN ceases to be significant. During command delivery to
the target, the allocated numbers are unique session wide.
The target iSCSI layer SHOULD deliver the commands to the target SCSI
layer in the order specified by CmdRN.
The initiator and target are assumed to have three counters that
define the allocation mechanism:
- CmdRN - the current command reference number advanced by 1
on each command shipped
- ExpCmdRN - the next expected command by the target -
acknowledges all commands up to it
- MaxCmdRN - the maximum number to be shipped - MaxCmdRN -
ExpCmdRN defines the queuing capacity of the receiving iSCSI
layer.
The target SHOULD NOT transmit a MaxCmdRN that is more than 2**31 - 1
above the last ExpCmdRN. CmdRN can take any value from ExpCmdRN to
MaxCmdRN except 0. The target MUST silently ignore any command
outside this range or duplicates within the range not flagged with
the retry bit (the X bit in the opcode). The target and initiator
counters MUST uphold causal ordering.
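A target-side acceptance check along the lines of the rules above might
look like the sketch below. It assumes modulo 2**32 window arithmetic;
the function names and the "seen" set of already received CmdRNs are
hypothetical bookkeeping, not something this document defines. Commands
marked for immediate delivery (CmdRN 0) bypass the window and are not
modeled.

```python
MOD = 1 << 32
HALF = 1 << 31

def in_window(cmd_rn, exp_cmd_rn, max_cmd_rn):
    """Return True if cmd_rn lies in [exp_cmd_rn, max_cmd_rn] modulo 2**32."""
    if cmd_rn == 0:
        # 0 is reserved for immediate delivery and is never a window value.
        return False
    span = (max_cmd_rn - exp_cmd_rn) % MOD      # queuing capacity
    offset = (cmd_rn - exp_cmd_rn) % MOD        # distance from ExpCmdRN
    return span < HALF and offset <= span

def should_accept(cmd_rn, exp_cmd_rn, max_cmd_rn, seen, retry_bit):
    """Return False when a numbered command is to be silently ignored."""
    if not in_window(cmd_rn, exp_cmd_rn, max_cmd_rn):
        return False
    if cmd_rn in seen and not retry_bit:
        # Duplicate inside the window without the X (retry) bit.
        return False
    return True
```

For example, with ExpCmdRN = 0xFFFFFFFE and MaxCmdRN = 3 the window
wraps past the reserved value 0, so CmdRN 1 is accepted while 0 is not.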
iSCSI initiators MUST implement the command numbering scheme if they
support more than one connection per session (as even sessions with a
single connection may be expanded beyond one connection).
Command numbering for sessions that will only be made up of one
connection is optional. iSCSI initiators utilizing a single
connection for a session and not utilizing command numbering MUST
indicate that they will not support command numbering by setting
InitCmdRN to 0 in the Login command.
Whenever an initiator indicates support for command numbering, by
setting InitCmdRN to a non-zero value at Login, the target MUST
provide ExpCmdRN and MaxCmdRN values that will enable the initiator
to make progress.
1.2.2.2 Response/Status numbering
Responses in transit from the target to the initiator are numbered.
The StatRN (Status Reference Number) is used for this purpose. StatRN
is a counter maintained per connection. ExpStatRN is used by the
initiator to acknowledge status.
To enable command recovery the target MAY maintain enough state to
enable data and status recovery after a connection failure.
A target can discard all the state information maintained for
recovery after the status delivery is acknowledged through ExpStatRN.
A large difference between StatRN and ExpStatRN may indicate a failed
connection.
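The retention rule above can be illustrated with a small per-connection
log. This sketch assumes that StatRN wraps modulo 2**32 and that
ExpStatRN names the next status the initiator expects, so that it
acknowledges every earlier StatRN (by analogy with ExpCmdRN); the class
and method names are hypothetical.

```python
MOD = 1 << 32
HALF = 1 << 31

class StatusLog:
    """Per-connection record of responses kept until acknowledged."""

    def __init__(self, init_stat_rn):
        self.next_stat_rn = init_stat_rn
        self.unacked = {}                 # StatRN -> saved response

    def send(self, response):
        """Number a response with StatRN and keep it for possible recovery."""
        stat_rn = self.next_stat_rn
        self.unacked[stat_rn] = response
        self.next_stat_rn = (stat_rn + 1) % MOD
        return stat_rn

    def acknowledge(self, exp_stat_rn):
        """Discard every saved response numbered before exp_stat_rn."""
        for stat_rn in list(self.unacked):
            if 0 < (exp_stat_rn - stat_rn) % MOD < HALF:
                del self.unacked[stat_rn]

    def outstanding(self):
        """Number of responses sent but not yet acknowledged."""
        return len(self.unacked)
```

A steadily growing outstanding() value corresponds to the large gap
between StatRN and ExpStatRN that may indicate a failed connection.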
Initiators and Targets MUST support the response-numbering scheme
regardless of the support for command recovery.
1.2.2.3 Data PDU numbering
Incoming Data PDUs MAY be numbered by a target to enable fast
recovery of long running READ commands.
Data PDUs are numbered with DataRN. NOP-Out PDUs carrying the same
Initiator Tag as the Data PDUs are used to acknowledge the incoming
Data PDUs with ExpDataRN. Support for Data PDU acknowledgement and
the maximum number of unacknowledged data PDUs are negotiated at
login.
In a PDU carrying both data and status, the field is used for StatRN
and the last set of data blocks is implicitly acknowledged when
Status is acknowledged.
1.2.3 iSCSI Login
The purpose of iSCSI login is to enable a TCP connection for iSCSI
use, authenticate the parties, negotiate the session's parameters,
open a security association protocol and mark the connection as
belonging to an iSCSI session.
A session is used to identify to a target all the connections with a
given initiator that belong to the same I_T nexus. If an initiator
and target are connected through more than one session each of the
initiator and target perceives the other as a different entity on
each session (a different I_T nexus in SAM-2 parlance).
The targets listen on a well-known TCP port for incoming connections.
The initiator begins the login process by connecting to that well-
known TCP port.
As part of the login process, the initiator and target MAY wish to
authenticate each other and set a security association protocol for
the session. This can occur in many different ways and is subject to
negotiation.
Negotiation and security associations executed before the Login
Command are outside the scope of this document although they might
realize a related function (e.g., establish an IPsec or TLS session).
The Login Command starts the iSCSI Login Phase. Within the Login
Phase, negotiation is carried on through parameters of the Login
Command and Response and optionally through intervening Text Commands
and Responses. The Login Response concludes the Login Phase. Once
suitable authentication has occurred, the target MAY authorize the
initiator to send SCSI commands. How the target chooses to authorize
an initiator is beyond the scope of this document. The target
indicates a successful authentication and authorization by sending a
login response with "accept login". Otherwise, it sends a response
with a "login reject", indicating a session is not established.
It is expected that iSCSI parameters will be negotiated after the
security association protocol is established if there is a security
association.
The login message includes a session ID, composed of an initiator
part (ISID) and a target part (TSID). For a new session, the TSID is
null. As part of the response, the target will generate a TSID.
Session specific parameters (e.g., the maximum number of connections
that can be used for this session) can be specified only for the
first login of a session (TSID null). Connection specific parameters
(if any) can be specified for any login. Thus, a session is
operational once it has at least one connection.
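The ISID/TSID handling described above could be sketched as follows.
The sketch assumes that a null TSID is represented by 0 and that the
target allocates TSIDs from a simple counter; the class name, the use
of an exception to model a rejected login, and the connection handles
are all hypothetical.

```python
import itertools

class TargetSessions:
    """Tracks sessions keyed by (ISID, TSID), each holding its connections."""

    NULL_TSID = 0   # assumption: a null TSID is encoded as 0

    def __init__(self):
        self._next_tsid = itertools.count(1)
        self.sessions = {}            # (isid, tsid) -> {cid: connection}

    def login(self, isid, tsid, cid, conn):
        """Handle a login and return the TSID placed in the response."""
        if tsid == self.NULL_TSID:
            # First login of a new session: the target generates a TSID.
            tsid = next(self._next_tsid)
            self.sessions[(isid, tsid)] = {}
        elif (isid, tsid) not in self.sessions:
            raise KeyError("login refers to an unknown session")
        # Any login adds a connection, identified by its CID.
        self.sessions[(isid, tsid)][cid] = conn
        return tsid

    def is_operational(self, isid, tsid):
        """A session is operational once it has at least one connection."""
        return len(self.sessions.get((isid, tsid), {})) >= 1
```

A second login that carries the TSID returned by the first one adds a
connection to the existing session instead of creating a new one.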
Any message except login and text sent on a TCP connection before
this connection gets into full feature phase at the initiator SHOULD
be ignored by the initiator. Any message except login and text
reaching a target on a TCP connection before the full feature phase
MUST be silently ignored by the target.
1.2.4 Text mode negotiation
During login and thereafter some session or connection parameters are
negotiated through an exchange of textual information.
In "list" negotiation, the offering party will send a list of values
for a key in its order of preference.
The responding party will answer with a value from the list.
The value "none" MUST always be used to indicate a missing function.
However, none is a valid selection only if it was explicitly offered
and it MAY be selected by omission (i.e. <key>:none MAY be omitted).
The general format is:
Offer-> <key>:(<value1>,<value2>,...,<valuen>)
Answer-> <key>:<valuex>
In "numerical" negotiations, the offering and responding party state
a numerical value. The result of the negotiation is key dependent
(usually the lower or the higher of the two values).
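The two negotiation styles above can be sketched as below. The text only
requires the responder to answer with a value from the offered list;
picking the first offered value that the responder supports is an
assumption of this sketch, as are the function names and the key name
"ExampleKey" used in the usage note.

```python
def parse_offer(text):
    """Parse '<key>:(<value1>,<value2>,...)' into (key, [values])."""
    key, _, rest = text.partition(":")
    values = rest.strip()
    if values.startswith("(") and values.endswith(")"):
        values = values[1:-1]
    return key.strip(), [v.strip() for v in values.split(",")]

def answer_list(offered, supported):
    """Return the first offered value the responder supports, else None.

    'none' is selectable only when it was explicitly offered, which
    follows from choosing strictly from the offered list.
    """
    for value in offered:
        if value in supported:
            return value
    return None

def format_answer(key, value):
    """Build the '<key>:<valuex>' answer string."""
    return f"{key}:{value}"

def negotiate_numeric(offered, answered, rule=min):
    """Combine two numerical values; the rule (min or max) is key dependent."""
    return rule(offered, answered)
```

For instance, an offer of "ExampleKey:(a,b,none)" answered by a party
supporting only b and none yields "ExampleKey:b".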
1.2.5 iSCSI Full Feature Phase
Once the initiator is authorized to do so, the iSCSI session is in
iSCSI full feature phase. The initiator may send SCSI commands and
data to the various LUs on the target by wrapping them in iSCSI
messages that go over the established iSCSI session.
For SCSI commands that require data and/or parameter transfer, the
(optional) data and the status for a command must be sent over the
same TCP connection that was used to deliver the SCSI command (we
call this "connection allegiance"). Thus if an initiator issues a
READ command, the target must send the requested data, if any,
followed by the status to the initiator over the same TCP connection
that was used to deliver the SCSI command. If an initiator issues a
WRITE command, the initiator must send the data, if any, for that
command and the target MUST return R2T, if any, and the status over
the same TCP connection that was used to deliver the SCSI command.
However, consecutive commands that are part of a SCSI linked commands
task MAY use different connections - connection allegiance is
strictly per-command and not per-task. During iSCSI Full Feature
Phase, the initiator and target MAY interleave unrelated SCSI
commands, their SCSI Data and responses, over the session.
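Connection allegiance as described above amounts to remembering, per
command, the connection on which the command was delivered. The sketch
below keys this by Initiator Task Tag; the class and method names are
hypothetical, and re-binding the tag when a later linked command of the
same task arrives on another connection is one possible reading of the
per-command rule.

```python
class AllegianceTable:
    """Maps each outstanding command to the connection (CID) it was sent on."""

    def __init__(self):
        self._cid_by_tag = {}

    def command_sent(self, task_tag, cid):
        """Record (or re-bind, for a later linked command) the command's CID."""
        self._cid_by_tag[task_tag] = cid

    def may_carry(self, task_tag, cid):
        """True if data, R2T or status for task_tag may travel on cid."""
        return self._cid_by_tag.get(task_tag) == cid

    def command_done(self, task_tag):
        """Forget the binding once the command's status has been delivered."""
        self._cid_by_tag.pop(task_tag, None)
```

Unrelated commands simply occupy different entries, so their data and
responses can be interleaved across the connections of the session.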
Outgoing SCSI data (initiator to target - user data or command
parameters) will be sent as either solicited data or unsolicited
data. Solicited data are sent in response to Ready To Transfer (R2T)
PDUs. Unsolicited data can be part of an iSCSI command PDU
("immediate data") or an iSCSI data PDU. An initiator may send
unsolicited data (immediate or in a separate PDU) up to the SCSI
limit (initial burst size - mode page 02h). All subsequent data have
to be solicited.
Targets operate in either solicited (R2T) data mode or unsolicited
(non R2T) data mode. An initiator MUST always honor an R2T data
request for a valid outstanding command (i.e., carrying a valid
Initiator Task Tag) and provided the command is supposed to deliver
outgoing data and the R2T specifies data within the command bounds.
It is considered an error for an initiator to send unsolicited data
PDUs to a target operating in R2T mode (only solicited data). It is
also an error for an initiator to send more data (whether immediate or
as a separate PDU) than the SCSI limit for initial burst. An
initiator MAY request, at login, to send immediate data blocks of any
size. If the initiator requests a specific block size, the target MUST
indicate the size of immediate data blocks it is ready to accept in
its response. Besides iSCSI, SCSI also imposes a limit on the amount
of unsolicited data a target is willing to accept. The iSCSI
immediate data limit MUST NOT exceed the SCSI limit.
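The limits in this paragraph can be collected into a single check, shown
here as a rough sketch. It assumes that the initial burst limit applies
to the sum of immediate data and unsolicited data PDUs for a command,
and that a target in R2T mode still accepts immediate data; both are
interpretations, and all names are hypothetical.

```python
def unsolicited_data_errors(immediate_len, data_pdu_len, r2t_mode,
                            initial_burst_limit, immediate_limit):
    """Return a list of rule violations for one outgoing command.

    immediate_len        bytes carried in the command PDU itself
    data_pdu_len         bytes sent in unsolicited data PDUs
    r2t_mode             True if the target accepts only solicited data
    initial_burst_limit  SCSI limit on unsolicited data
    immediate_limit      iSCSI immediate data block size from login
    """
    errors = []
    if immediate_limit > initial_burst_limit:
        errors.append("iSCSI immediate data limit exceeds the SCSI limit")
    if r2t_mode and data_pdu_len > 0:
        errors.append("unsolicited data PDU sent to a target in R2T mode")
    if immediate_len > immediate_limit:
        errors.append("immediate data exceeds the negotiated block size")
    if immediate_len + data_pdu_len > initial_burst_limit:
        errors.append("unsolicited data exceeds the initial burst limit")
    return errors
```

An empty list means the command stays within every limit modeled here;
any data beyond those limits would have to be solicited through R2T.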
A target SHOULD NOT silently discard data and request retransmission
through R2T. Initiators MUST NOT perform any score boarding for data
and the residual count calculation is to be performed by the targets.
Incoming data is always implicitly solicited. SCSI Data packets are
matched to their corresponding SCSI commands by using Tags that are
specified in the protocol.
Initiator tags for pending commands are unique initiator-wide for a
session. Target tags are not strictly specified by the protocol - it
is assumed that those will be used by the target to tag (alone or in
combination with the LUN) the solicited data. Target tags are
generated by the target and "echoed" by the initiator. The above
mechanisms are designed to accomplish efficient data delivery and a
large degree of control over the data flow.
iSCSI initiators and targets MUST also enforce some ordering rules to
achieve deadlock-free operation. Unsolicited data MUST be sent on
every connection in the same order in which commands were sent. If
the amount of data exceeds the amount allowed for unsolicited write
data, the specific connection MUST be stalled - i.e., no more
unsolicited data will be sent on this connection until the specific
command has finished sending all its data and has received a
response. However, new commands can be sent on the connection. A
target receiving data out of order or observing a connection
violating the above rules MUST terminate the session.
Each iSCSI session to a target is treated as if it originated from a
different and logically independent initiator.
1.2.6 iSCSI Connection Termination
Connection termination is assumed to be an exceptional event.
Graceful TCP connection shutdowns are done by sending TCP FINs.
Graceful connection shutdowns MUST only occur when there are no
outstanding tasks that have allegiance to the connection. A target
SHOULD respond rapidly to a FIN from the initiator by closing its
half of the connection after waiting for all outstanding tasks that
have allegiance to the connection to conclude and send their status.
Connection termination with outstanding tasks may require recovery
actions.
Connection termination is also required as a prelude to recovery. By
terminating a connection before starting recovery, initiator and
target can avoid having stale PDUs being received after recovery. In
this case, the initiator will send a LOGOUT request on any of the
operational connections of a session indicating what connection
should be terminated.
1.2.7 Naming & mapping
Text string names are used in iSCSI to:
- provide explicitly a transportID for the target to enable the
latter to recognize the initiator because the conventional IP-
address and port pair is inaccurate behind firewalls and NAT
devices (key - initiator)
- provide a targetID for simple configurations hiding several
targets behind an IP-address and port (key - target)
- provide a symbolic address for source and destination targets
in third party commands; those will be mapped into SCSI
addresses by a SCSI aliasing mechanism
The targetID MUST be presented within the login phase.
The names do not require handling within iSCSI - i.e. are opaque
entities within this document. In order to enable implementers to
relate them to other names and name handling mechanisms the following
syntax for names SHOULD be used:
<domain-name>[/modifier]
Where domain-name follows DNS (or dotted IP) rules and the modifier
is an alphanumeric string (N.B. the whole pattern follows the URL
structure)
Some mapped names for third party command use might have to include a
port number. For those the following syntax SHOULD be used:
<domain-name>[:port][/modifier]
The text to address transformation, wherever needed, will be
performed through available name translation services (DNS servers,
LDAP accessible directories etc.).
To enable simple devices to operate without name-to-address
conversion services the following conventions SHOULD be used:
A domain name that contains exactly four numbers separated by
dots (.), where each number is in the range 0 through 255, will
be interpreted as an IPv4 address.
A domain name that contains more than four, but at most 16
numbers separated by dots (.), where each number is in the
range 0 through 255, will be interpreted as an IPv6 address.
Examples of IPv4 addresses/names:
10.0.0.1/diskfarm1
10.0.0.2
Examples of IPv6 addresses/names:
12.5.7.10.0.0.1/tapefarm1
12.5.6.10.0.0.2
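The numeric-name conventions above can be sketched as follows. This
is a minimal illustration, not part of the protocol: the function name
classify_name and the "dns" fallback label are assumptions, and a
name carrying a port number is not handled.

```python
def classify_name(name):
    """Classify the domain-name part of `name` as 'ipv4', 'ipv6' or 'dns'."""
    # Drop the optional /modifier; only the domain-name part is examined.
    domain = name.split("/", 1)[0]
    parts = domain.split(".")
    # Every dot-separated part must be a decimal number in 0..255.
    numeric = all(
        p.isascii() and p.isdigit() and int(p) <= 255 for p in parts
    )
    if numeric and len(parts) == 4:
        return "ipv4"   # exactly four numbers
    if numeric and 4 < len(parts) <= 16:
        return "ipv6"   # more than four, at most 16 numbers
    return "dns"        # assumed fallback: resolve through a name service
```

With the examples above, 10.0.0.1/diskfarm1 is classified as an IPv4
address and 12.5.7.10.0.0.1/tapefarm1 as an IPv6 address.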
For management/support tools as well as naming services that use a
text prefix to express the protocol intended (as in http:// or
ftp://) the following form MAY be used:
iSCSI://<domain-name>[:port][/modifier]
Examples:
iSCSI://diskfarm1.acme.com
iSCSI://computingcenter.acme.com/diskfarm1
iSCSI://computingcenter.acme.com:4002/scanners
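A management tool might split the prefixed form into its parts as
sketched below. The function name parse_iscsi_name and the exact-case
match on the "iSCSI://" prefix are assumptions made for illustration;
the modifier is restricted to alphanumeric characters as stated
earlier in this section.

```python
import re

# iSCSI://<domain-name>[:port][/modifier]
_NAME = re.compile(
    r"^iSCSI://(?P<domain>[^:/]+)"
    r"(?::(?P<port>\d+))?"
    r"(?:/(?P<modifier>[A-Za-z0-9]+))?$"
)

def parse_iscsi_name(text):
    """Return (domain, port or None, modifier or None)."""
    m = _NAME.match(text)
    if m is None:
        raise ValueError("not in iSCSI://<domain-name>[:port][/modifier] form")
    port = m.group("port")
    return m.group("domain"), (int(port) if port else None), m.group("modifier")
```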
When a target has to act as an initiator for a third party command,
it MAY use the initiator name it learned during login as required by
the authentication mechanism to the third party.
To address targets and logical units within a target, SCSI uses a
fixed length (8 bytes) uniform addressing scheme; in this document,
we call those addresses SCSI reference addresses (SRA).
To provide the target with the protocol specific addresses iSCSI
relies on the SCSI aliasing mechanism (work in progress in T10). The
aliasing support enables an initiator to associate protocol specific
addresses with SRAs; the latter can be used in subsequent commands.
For iSCSI, a protocol specific address is a TCP address and a
selector.
1.2.8 Message Framing
1.2.8.1 Framing Justification
iSCSI presents a mapping of the SCSI protocol onto TCP. This
encapsulation is accomplished by sending iSCSI PDUs that are of
varying length. Unfortunately, TCP does not have a built-in mechanism
for signaling message boundaries at the TCP layer. iSCSI overcomes
this obstacle by placing the message length in the iSCSI message
header. This serves to delineate the end of the current message as
well as the beginning of the next message.
In situations where IP packets are delivered in-order from the
network, iSCSI message framing is not an issue (messages are
processed one after the other). In the presence of IP packet
reordering (e.g. frames being dropped), legacy TCP implementations
store the "out of order" TCP segments in temporary buffers until the
missing TCP segments arrive, upon which the data must be copied to
the application buffers. In iSCSI it is desirable to steer the SCSI
data within these out of order TCP segments into the pre-allocated
SCSI buffers rather than store them in temporary buffers. This
decreases the need for dedicated reassembly buffers as well as the
latency and bandwidth related to extra copies.
Unfortunately, when relying solely on the "message length in the
iSCSI message" scheme to delineate iSCSI messages, a missing TCP
segment that contains an iSCSI message header (with the message
length) makes it impossible to find message boundaries in subsequent
TCP segments. The missing TCP segment(s) must be received before any
of the following segments can be steered to the correct SCSI buffers
(due to the inability to determine the iSCSI message boundaries).
Since these segments cannot be steered to the correct location, they
must be saved in temporary buffers that must then be copied to the
SCSI buffers.
To reduce the amount of temporary buffering and copying,
synchronization information (markers) is placed at fixed intervals in
the TCP stream to enable accelerated iSCSI/TCP implementations to
find and delineate iSCSI messages in the presence of IP packet
reordering.
The use of markers is negotiable. Initiator and target MAY indicate
their readiness to receive and/or send markers, during login,
separately for each connection. The default is NO. In certain
environments a sender not willing to supply markers to a receiver
willing to accept markers MAY suffer from a considerable performance
degradation.
1.2.8.2 Markers At Fixed Intervals
At fixed intervals in the TCP byte stream, a "Marker" is inserted.
This Marker indicates the offset to the next iSCSI message header.
The Marker is eight bytes in length, and contains two 32-bit offset
fields that indicate how many bytes to skip in the TCP stream to find
the next iSCSI message header. There are two copies of the offset in
the Marker to handle the case where the Marker straddles a TCP
segment boundary. Each end of the iSCSI session specifies during
login the interval of the Marker it will be receiving, or disables
the Marker altogether. If a receiver indicates that it desires a
Marker, the sender SHOULD provide the Marker at the desired interval.
The marker interval (and the initial marker-less interval) are
counted in terms of the TCP-sequence-number. Anything counted in the
TCP sequence-number is counted for the interval and the initial
marker-less interval.
Markers MUST point to a 4 byte word boundary in the TCP stream - the
last 2 bits of each marker word are reserved and will be considered 0
for offset computation.
Padding iSCSI PDU payloads to 4 byte word boundaries simplifies
marker manipulation.
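The marker layout described in this section can be sketched as
follows. This is an illustration under stated assumptions: the
function names are hypothetical, the two offset words are taken to be
in network byte order like other multi-byte integers in this
document, and the logic for choosing between the two copies when a
marker straddles a TCP segment boundary is not shown.

```python
import struct

def build_marker(offset):
    """Build an 8-byte marker holding two copies of a 32-bit offset."""
    if not 0 <= offset <= 0xFFFFFFFF:
        raise ValueError("offset must fit in 32 bits")
    if offset % 4:
        raise ValueError("markers must point to a 4 byte word boundary")
    return struct.pack(">II", offset, offset)

def parse_marker(marker):
    """Return both offset copies with the two reserved low bits cleared."""
    first, second = struct.unpack(">II", marker)
    # The last 2 bits of each marker word are reserved and treated as 0.
    return first & 0xFFFFFFFC, second & 0xFFFFFFFC
```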
1.2.8.3 iSCSI PDU Size
When a large iSCSI message is sent, the TCP segment(s) containing the
iSCSI header may be lost. The remaining TCP segment(s) up to the
next iSCSI message need to be buffered (in temporary buffers), since
the iSCSI header that indicates which SCSI buffers the data is to be
steered to was lost. To minimize the amount of buffering, it is
recommended that the iSCSI PDU size be restricted to a small value
(perhaps a few TCP segments in length). Each end of the iSCSI session
specifies during login the maximum size of an iSCSI PDU it will
accept.
1.2.8.4 Initial marker-less interval
To enable the connection setup including the login phase negotiation
the negotiated marking will be started at a negotiated boundary in the
stream. The marker-less interval will not be less than 64 kbytes and
the default will be 64 kbytes.
2. iSCSI PDU Formats
All multi-byte integers specified in formats defined in this document
are to be represented in network byte order (i.e., big endian). Any
bits not defined should be set to zero.
2.1 iSCSI PDU length and padding
iSCSI PDUs are padded to an integer number of 4 byte words.
2.2 Template Header and Opcodes
All iSCSI PDUs begin with a 48-byte header. Additional data appears,
as necessary, beginning with byte 48. The fields of Opcode and Length
appear in all iSCSI PDUs. In addition, the Initiator Task tag,
Logical Unit Number, and Flags fields, when used, always appear in
the same location in the header.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| Opcode |X| Opcode-specific fields |
| |P| |
+---------------+---------------+---------------+---------------+
4| Length of Data (after 48 byte Header) |
+---------------+---------------+---------------+---------------+
8| LUN or Opcode-specific fields |
+ +
12| |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag or Opcode-specific fields |
+---------------+---------------+---------------+---------------+
20/ Opcode-specific fields /
+/ /
+---------------+---------------+---------------+---------------+
48| Header digest (optional-constant-length) |
+---------------------------------------------------------------+
+n/ /
+/ Data (optional) /
+---------------------------------------------------------------+
m| Data digest (optional-variable-length) |
+---------------------------------------------------------------+
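A receiver might extract the fixed-position fields of the template
header as sketched below. The names TemplateHeader and
parse_template_header are assumptions for illustration; bytes 8-15
and 16-19 are returned as LUN and Initiator Task Tag even though some
opcodes use them for opcode-specific fields.

```python
import struct
from collections import namedtuple

TemplateHeader = namedtuple(
    "TemplateHeader", "opcode flags length lun task_tag"
)

def parse_template_header(header):
    """Decode the common fields of a 48-byte iSCSI header."""
    if len(header) < 48:
        raise ValueError("an iSCSI header is 48 bytes long")
    opcode = header[0]          # byte 0
    flags = header[1]           # byte 1, X/P bit is bit 7
    (length,) = struct.unpack_from(">I", header, 4)     # bytes 4-7
    lun = bytes(header[8:16])                           # bytes 8-15
    (task_tag,) = struct.unpack_from(">I", header, 16)  # bytes 16-19
    return TemplateHeader(opcode, flags, length, lun, task_tag)
```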
2.2.1 Opcode
The Opcode indicates what type of iSCSI PDU the header encapsulates.
The Opcode is further encoded as follows:
b7 Response
b6-0 Operation
The opcodes are divided into two categories: initiator opcodes and
target opcodes. Initiator opcodes are in PDUs sent by the initiators,
and target opcodes are in PDUs sent by the target. The initiator MUST
NOT send target opcodes and the target MUST NOT send initiator
opcodes. Target opcodes are also called responses and are
distinguished by having the Response bit (bit 7) set to 1.
Valid initiator opcodes defined in this specification are:
0x00 NOP-Out (from initiator to target)
0x01 SCSI Command (encapsulates a SCSI Command Descriptor
Block)
0x02 SCSI Task Management Command
0x03 Login Command
0x04 Text Command
0x05 SCSI Data (for WRITE operation)
0x06 Logout Command
Valid target opcodes are:
0x80 NOP-In (from target to initiator)
0x81 SCSI Response (contains SCSI status and possibly sense
information or other response information)
0x82 SCSI Task Management Response
0x83 Login Response
0x84 Text Response
0x85 SCSI Data (for READ operation)
0x86 Logout Response
0x90 Ready To Transfer (R2T - sent by target to initiator when
it is ready to receive data from initiator)
0x91 Asynchronous Event (sent by target to initiator to
indicate certain special conditions)
0xef Reject
Initiator opcodes 0x70-0x7f and target opcodes 0xf0-0xff are vendor
specific codes.
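The opcode encoding can be illustrated with a few helper functions.
The function names are assumptions; the Response bit is taken to be
b7, as in the encoding list and the opcode values of this section.

```python
RESPONSE_BIT = 0x80     # b7 Response
OPERATION_MASK = 0x7F   # b6-0 Operation

def is_target_opcode(opcode):
    """True for opcodes sent by a target (Response bit set)."""
    return bool(opcode & RESPONSE_BIT)

def operation(opcode):
    """Return the 7-bit Operation part of the opcode."""
    return opcode & OPERATION_MASK

def is_vendor_specific(opcode):
    """True for initiator 0x70-0x7f and target 0xf0-0xff opcodes."""
    return 0x70 <= opcode <= 0x7F or 0xF0 <= opcode <= 0xFF
```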
2.2.2 Opcode-specific fields
These fields have different meanings for different messages.
Bit 7 of the second byte is used as a retry indicator for commands (X
bit) or Poll bit (P bit) and must be 0 in all other iSCSI PDUs.
2.2.3 Length
The Length field indicates the number of bytes, beyond the first 48
bytes, that are being sent together with this message header. The
length includes the header and data digests if any. It is anticipated
that most iSCSI PDUs (not counting data transfer PDUs) will not need
more than the 48 byte header. The length field accounts for proper
iSCSI PDU content; whatever padding is required to reach a 4 byte
boundary in the TCP stream is implied by the protocol but not
accounted for in the length field.
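The relation between the Length field and the number of bytes a PDU
occupies in the TCP stream can be sketched as below. This is an
illustration only: the function names are hypothetical, markers are
assumed to be disabled, and the stream is assumed to start on a PDU
boundary so that padding to a 4 byte boundary can be computed per
PDU.

```python
import struct

HEADER_SIZE = 48

def pdu_wire_size(header):
    """Bytes occupied in the stream: header + Length + implied padding."""
    (length,) = struct.unpack_from(">I", header, 4)
    unpadded = HEADER_SIZE + length
    return unpadded + (-unpadded % 4)   # round up to a 4 byte boundary

def split_pdus(stream):
    """Split a byte string into complete PDUs and leftover bytes."""
    pdus, pos = [], 0
    while len(stream) - pos >= HEADER_SIZE:
        size = pdu_wire_size(stream[pos:pos + HEADER_SIZE])
        if len(stream) - pos < size:
            break                       # PDU not fully received yet
        pdus.append(stream[pos:pos + size])
        pos += size
    return pdus, stream[pos:]
```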
2.2.4 LUN
Some opcodes operate on a specific Logical Unit. The Logical Unit
Number (LUN) field identifies which Logical Unit. If the opcode does
not relate to a Logical Unit, this field either is ignored or may be
used for some other purpose. The LUN field is 64 bits in accordance
with [SAM2]. The exact format of this field can be found in the
[SAM2] document.
2.2.5 Initiator Task Tag
The initiator assigns a Task Tag to each SCSI task that it issues.
This tag is a session-wide unique identifier that can be used to
uniquely identify the Task.
2.2.6 Header Digest and Data Digest
Optional header and data digests protect the integrity and
authenticity of header and data, respectively. The digests, if
present, appear as trailers located, respectively, after the header
and PDU-specific data.
The digest types are negotiated during the login phase.
The separation of the header and data digests is useful in iSCSI
routing applications, where only the header changes when a message is
forwarded. In this case, only the header digest should be re-
calculated.
2.3 SCSI Command
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x01 |X|R|W|0 0|ATTR | Reserved (0) | AddCDB |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| Logical Unit Number (LUN) |
+ +
12| |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Expected Data Transfer Length |
+---------------+---------------+---------------+---------------+
24| CmdRN |
+---------------+---------------+---------------+---------------+
28| ExpStatRN |
+---------------+---------------+---------------+---------------+
32/ SCSI Command Descriptor Block (CDB) /
+/ /
+---------------+---------------+---------------+---------------+
48/ Command Data (Command Dependent) /
+/ /
+---------------+---------------+---------------+---------------+
2.3.1 Flags & Task Attributes
The flags field for a SCSI Command is:
b7 Retry (X)
b6 (R) set to 1 when input data is expected
b5 (W) set to 1 when output data is expected
b3-4 Reserved (MUST be 0)
b0-2 used to indicate Task Attributes
The Task Attributes (ATTR) can have one of the following integer
values (see [SAM2] for details):
0 Untagged
1 Simple
2 Ordered
3 Head of Queue
4 ACA
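Decoding the flags byte of a SCSI Command could look like the sketch
below. The function name and the dictionary keys are assumptions
chosen for illustration; rejecting reserved bits and unknown
attribute values is one possible policy, not a requirement stated
here.

```python
TASK_ATTRIBUTES = {
    0: "Untagged",
    1: "Simple",
    2: "Ordered",
    3: "Head of Queue",
    4: "ACA",
}

def decode_command_flags(flags):
    """Decode byte 1 of a SCSI Command PDU."""
    if flags & 0x18:
        raise ValueError("reserved bits b3-4 must be 0")
    attr = flags & 0x07                 # b0-2 Task Attributes
    if attr not in TASK_ATTRIBUTES:
        raise ValueError("unknown Task Attribute value")
    return {
        "retry": bool(flags & 0x80),    # b7 (X)
        "read": bool(flags & 0x40),     # b6 (R) input data expected
        "write": bool(flags & 0x20),    # b5 (W) output data expected
        "attr": TASK_ATTRIBUTES[attr],
    }
```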
2.3.2 AddCDB
Additional CDB length (over 16) in units of 4 bytes.
2.3.3 CmdRN - Command Reference Number
Enables ordered delivery across multiple connections in a single
session.
2.3.4 ExpStatRN - Expected Status Reference Number
Command responses up to ExpStatRN-1 (mod 2**32) have been received
(acknowledges status) on the connection.
2.3.5 Expected Data Transfer Length
For unidirectional operations, the Expected Data Transfer Length
field states the number of bytes of data involved in this SCSI
operation. For a WRITE operation, the initiator uses this field to
specify the number of bytes of data it expects to transfer for this
operation. For a READ operation, the initiator uses this field to
specify the number of bytes of data it expects the target to transfer
to the initiator. It corresponds to the SAM-2 byte count.
For bi-directional operations, this field states the number of data
bytes involved in the outbound transfer. For bi-directional
operations, an additional field indicating the Expected Bidi-Read
Data Transfer Length follows the (possibly extended) CDB as
shown below:
+---------------+---------------+---------------+---------------+
48/ Additional CDB (if any) /
+/ /
+---------------+---------------+---------------+---------------+
+n| Expected Bidi-Read Data Transfer Length |
+---------------------------------------------------------------+
+4/ Immediate data (optional) /
/ /
+---------------------------------------------------------------+
If no data will be transferred in SCSI Data packets for this SCSI
operation, this field should be set to zero.
Upon completion of a data transfer, the target will inform the
initiator of how many bytes were actually processed (sent or
received) by the target. This will be done through residual counts.
2.3.6 CDB - SCSI Command Descriptor Block
There are 16 bytes in the CDB field to accommodate the commonly used
CDB. Whenever larger CDBs are used, the CDB spillover MAY extend
beyond the 48-byte header.
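One way to split a CDB between the 16-byte header field and the
spillover is sketched below, with AddCDB expressed in units of 4
bytes. The function name split_cdb is hypothetical, and zero-filling
both a short CDB and a spillover that is not a multiple of 4 bytes is
an assumption; this document does not state how such cases are
padded.

```python
def split_cdb(cdb):
    """Return (header_cdb, spillover, add_cdb) for a CDB of any length."""
    header_cdb = cdb[:16].ljust(16, b"\x00")    # 16-byte field in the header
    spillover = cdb[16:]
    spillover += bytes(-len(spillover) % 4)     # assumed zero padding
    add_cdb = len(spillover) // 4               # AddCDB, units of 4 bytes
    if add_cdb > 0xFF:
        raise ValueError("AddCDB must fit in one byte")
    return header_cdb, spillover, add_cdb
```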
2.3.7 Command-Data
Some SCSI commands require additional parameter data to accompany the
SCSI command. This data may be placed beyond the 48-byte boundary of
the iSCSI header. Alternatively, user data (as from a WRITE
operation) can be placed in the same PDU (both cases referred to as
immediate data).
2.4 SCSI Response
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x81 |Rsvd |o|u|O|U| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| Reserved (0) |
+ +
12| |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Basic Residual Count |
+---------------+---------------+---------------+---------------+
24| StatRN |
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36| Command Status| Reserved (0) |
+---------------+---------------+---------------+---------------+
40| Resp_length | Sense_length |
+---------------+---------------+---------------+---------------+
44| Bidi-Read Residual Count |
+---------------+---------------+---------------+---------------+
48/ Response and/or sense Data (optional) /
+/ /
+---------------+---------------+---------------+---------------+
2.4.1 Byte 1 - Flags
b0 (U) set for Residual Underflow. In this case, the Basic
Residual Count indicates how many bytes were not transferred
out of those expected to be transferred.
b1 (O) set for Residual Overflow. In this case, the Basic
Residual Count indicates how many bytes could not be
transferred because the initiator's Expected Data Transfer
Length was too small.
b2 (u) same as b0 but for the read-part of a bi-directional
operation
b3 (o) same as b1 but for the read-part of a bi-directional
operation
b4-7 not used (SHOULD be set to 0)
Bits O and U are mutually exclusive and so are bits o and u.
2.4.2 Basic Residual Count
The Basic Residual Count field is valid only in case either the U bit
or the O bit is set. If neither bit is set, the Basic Residual Count
field SHOULD be zero. If the U bit is set, the Basic Residual Count
indicates how many bytes were not transferred out of those expected
to be transferred. If the O bit is set, the Basic Residual Count
indicates how many bytes could not be transferred because the
initiator's Expected Data Transfer Length was too small.
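The residual computation a target performs can be sketched as
follows. The function name and the parameter name `required` (the
number of bytes the command actually called for) are assumptions for
illustration.

```python
FLAG_U = 0x01   # b0, Residual Underflow
FLAG_O = 0x02   # b1, Residual Overflow

def basic_residual(expected, required):
    """Return (flag bits, Basic Residual Count).

    `expected` is the initiator's Expected Data Transfer Length and
    `required` is the number of bytes the command called for.
    """
    if required < expected:
        return FLAG_U, expected - required  # bytes not transferred
    if required > expected:
        return FLAG_O, required - expected  # bytes that did not fit
    return 0, 0                             # neither bit set, count is zero
```

Because only one branch is taken, the O and U bits are never set
together.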
2.4.3 Bidi-Read Residual Count
The Bidi-Read Residual Count field is valid only in case either the u
bit or the o bit is set. If neither bit is set, the Bidi-Read
Residual Count field SHOULD be zero. If the u bit is set, the Bidi-
Read Residual Count indicates how many bytes were not transferred in
out of those expected to be transferred. If the o bit is set, the
Bidi-Read Residual Count indicates how many bytes could not be
transferred in because the initiator's Expected Bidi-Read Transfer
Length was too small.
2.4.4 Command Status
The Command Status field is used to report the SCSI status of the
command (as specified in [SAM2]).
2.4.5 Resp_length - Response length
2.4.6 Sense_length - Length of sense data
2.4.7 Response and/or Sense Data
iSCSI targets MUST support and enable autosense. If the Command
Status was CHECK CONDITION (0x02), then the Response and/or Sense
Data field will contain sense data for the failed command after the
response data. Some sense codes will relate to iSCSI check
conditions (e.g. excessive number of outstanding commands, immediate
data blocks too large etc.). The Length parameters specify the
number of bytes in each section of this field. If no error occurred,
and no data is needed for the response to the SCSI Command the length
field is zero. If both Response Data and Sense Data are present, the
Response Data precedes the Sense Data.
2.4.8 StatRN - Status Reference Number
StatRN is a reference number that the target iSCSI layer generates
per connection and that in turn enables the initiator to acknowledge
status reception. StatRN is incremented by 1 for every
response/status sent on a connection.
2.4.9 ExpCmdRN - next expected CmdRN from this initiator
ExpCmdRN is a reference number that the target iSCSI returns to the
initiator to acknowledge command reception. It is used to update a
local counter with the same name.
2.4.10 MaxCmdRN - maximum CmdRN acceptable from this initiator
MaxCmdRN is a reference number that the target iSCSI returns to the
initiator to indicate the maximum CmdRN the initiator can send. It is
used to update a local counter with the same name.
MaxCmdRN and ExpCmdRN are processed as follows:
-if the PDU MaxCmdRN is less than the PDU ExpCmdRN (in Serial
Arithmetic Sense and with a difference bounded by 2**31-1) they
are both ignored
-if the PDU MaxCmdRN is less than the current MaxCmdRN (in
Serial Arithmetic Sense and with a difference bounded by 2**31-
1) it is ignored else it updates MaxCmdRN
-if the PDU ExpCmdRN is less than the current ExpCmdRN (in
Serial Arithmetic Sense and with a difference bounded by 2**31-
1) it is ignored else it updates ExpCmdRN
This sequence is required as updates may arrive out of order (they
travel on different TCP connections).
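The update sequence above can be sketched as follows. The class name
CommandWindow and the helper serial_lt are assumptions for
illustration; serial_lt treats a as less than b when b is ahead of a
by 1 to 2**31-1 (mod 2**32).

```python
MOD = 2 ** 32

def serial_lt(a, b):
    """True if a is less than b in the Serial Arithmetic sense."""
    return a != b and (b - a) % MOD < 2 ** 31

class CommandWindow:
    """Initiator-side copies of ExpCmdRN and MaxCmdRN."""

    def __init__(self, exp_cmd_rn, max_cmd_rn):
        self.exp_cmd_rn = exp_cmd_rn
        self.max_cmd_rn = max_cmd_rn

    def update(self, pdu_exp, pdu_max):
        # PDU MaxCmdRN less than PDU ExpCmdRN: both are ignored.
        if serial_lt(pdu_max, pdu_exp):
            return
        # A stale (smaller) MaxCmdRN is ignored; otherwise it updates.
        if not serial_lt(pdu_max, self.max_cmd_rn):
            self.max_cmd_rn = pdu_max
        # A stale (smaller) ExpCmdRN is ignored; otherwise it updates.
        if not serial_lt(pdu_exp, self.exp_cmd_rn):
            self.exp_cmd_rn = pdu_exp
```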
2.5 SCSI Task Management Command
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x02 |0| Function | Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| Logical Unit Number (LUN) |
+ +
12| |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Referenced Task Tag or Reserved (0) |
+---------------+---------------+---------------+---------------+
24| CmdRN |
+---------------+---------------+---------------+---------------+
28| ExpStatRN |
+---------------+---------------+---------------+---------------+
32/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48
2.5.1 Function
The Task Management functions provide an initiator with a way to
explicitly control the execution of one or more Tasks. The Task
Management functions are summarized as follows (for a more detailed
description see the [SAM2] document):
1 Abort Task---aborts the task identified by the Referenced
Task Tag field.
2 Abort Task Set---aborts all Tasks issued by this initiator
on the Logical Unit.
3 Clear ACA---clears the Auto Contingent Allegiance
condition.
4 Clear Task Set---Aborts all Tasks (from all initiators)
for the Logical Unit.
5 Logical Unit Reset
6 Target Warm Reset
7 Target Cold Reset
For the functions above a SCSI Task Management Response MUST be
returned, using the Initiator Task Tag to identify the operation for
which it is responding.
For the <Clear Task Set>, if SCSI control mode enables AE reporting,
the target MUST send an Asynchronous Event to all other attached
initiators to inform them that all pending tasks are cancelled and
then enter the ACA state for any initiator for which it had pending
tasks.
For the <Target Warm Reset> and <Target Cold Reset> functions, the
target cancels all pending operations; both functions are equivalent
to the Target Reset as specified by SAM-2. Provided that SCSI control mode
enables AE reporting, the target MUST send an Asynchronous Event to
all attached initiators notifying them that the target is being
reset.
In addition, for the <Target Warm Reset> the target will enter the
ACA state on all sessions and all LUs on which an AE was sent.
In addition, for the <Target Cold Reset> the target then MUST
terminate all of its TCP connections to all initiators (all sessions
are terminated). However, if the target finds that it cannot send the
required response or AEN it MUST continue the reset operation and it
SHOULD log the condition for later retrieval. The logging operation
MUST be reported through the target MIB.
Further actions on reset functions are specified in the relevant SCSI
documents for the specific class of devices.
2.5.2 Referenced Task Tag
Initiator Task Tag of the task to be aborted - for abort task
2.6 SCSI Task Management Response
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x82 |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| Logical Unit Number (LUN) |
+ +
12| |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Referenced Task Tag or Reserved (0) |
+---------------+---------------+---------------+---------------+
24| StatRN |
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36| Response | Reserved (0) |
+---------------+---------------+---------------+---------------+
40/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48
For the functions <Abort Task, Abort Task Set, Clear ACA, Clear Task
Set, Logical Unit reset, Target Warm Reset>, the target performs the
requested Task Management function and sends a SCSI Task Management
Response back to the initiator. The target provides a Response, which
may take on the following values:
0 Function Complete
1 No Task Found
255 Function Rejected
For the <Target Cold Reset> and <Target Warm Reset> functions, the
target cancels all pending operations. If SCSI control mode enables
AE reporting, the target MUST send an Asynchronous Event to all
attached initiators notifying them that the target has been reset.
For the <Target Cold Reset> the target MUST then close all of its TCP
connections to all initiators (terminates all sessions).
2.6.1 Referenced Task Tag
Initiator Task Tag of the task not found
2.7 SCSI Data
The typical data transfer specifies the length of the data payload,
the Transfer Tag provided by the receiver for this data transfer, and
a buffer offset. The typical SCSI Data packet for WRITE (from
initiator to target) has the following format:
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x05 |F| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| LUN or Reserved (0) |
12| |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Target Task Tag (solicited) or Reserved (0) (unsolicited) |
+---------------+---------------+---------------+---------------+
24| Reserved (0) |
+---------------+---------------+---------------+---------------+
28| ExpStatRN |
+---------------+---------------+---------------+---------------+
32/ Reserved (0) /
/ /
+---------------+---------------+---------------+---------------+
40| Buffer Offset |
+---------------+---------------+---------------+---------------+
44| Reserved (0) |
+---------------+---------------+---------------+---------------+
48/ Payload /
+/ /
+---------------+---------------+---------------+---------------+
The typical SCSI Data packet for READ (from target to initiator) has
the following format:
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x85 |P| (0) |S|O|U| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| Reserved (0) |
+---------------+---------------+---------------+---------------+
12| Reserved (0) |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Residual Count |
+---------------+---------------+---------------+---------------+
24| DataRN /StatRN |
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36| Command Status| Reserved (0) |
+---------------+---------------+---------------+---------------+
40| Buffer Offset |
+---------------+---------------+---------------+---------------+
44| Reserved (0) |
+---------------+---------------+---------------+---------------+
48/ Payload /
+/ /
+---------------+---------------+---------------+---------------+
2.7.1 F (Final) bit
This bit is 1 for the last PDU of immediate data or the last PDU of a
sequence answering a R2T.
2.7.2 Length
The length field specifies the total number of bytes in the following
payload.
2.7.3 Target Task Tag
The Target Task Tag is provided to the target if the transfer is
honoring an R2T. In this case, the Target Task Tag field is a replica
of the Target Task Tag provided with the R2T.
The Target Task Tag values are not specified by this protocol except
that the all-bits-one value (0x'ffffffff') is reserved and means that
the Target Task Tag is not supplied. If the Target Task Tag is
provided, then the LUN field MUST hold a valid value that is
consistent with whatever was specified with the command; otherwise
the LUN field is reserved.
2.7.4 Buffer Offset
The Buffer Offset field contains the offset of the following data
relative to the start of the complete data transfer. The sum of the
Buffer Offset and Length should not exceed the expected transfer
length for the command.
2.7.5 Flags
The last SCSI Data packet sent from a target to an initiator for a
particular SCSI command that completed successfully may optionally
also contain the Command Status for the data transfer. In this case
Sense Data cannot be sent together with the Command Status. If the
command completed with an error, then the response and sense data
must be sent in a SCSI Response packet and must not be sent in a SCSI
Data packet.
b0-1 as in an ordinary SCSI Response
b2 S (status)- set to indicate that the Command Status field
contains status
b3-6 not used (should be set to 0)
b7 P (poll) - set to indicate data acknowledgement is
requested; b7 and b2 are mutually exclusive - if S bit is set P
bit MUST be ignored
If the S bit is set, the additional fields in the SCSI Data packet
(StatRN, Command Status, Residual Count) are meaningful.
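The flag rules above can be illustrated with a small decoder. This is only a sketch: the names (DataInFlags, decode_data_in_flags) are hypothetical, and the bit masks assume that b7 is the most significant bit of the flags byte, as the header diagram suggests.

```python
from dataclasses import dataclass

# Masks for byte 1 of the SCSI Data (READ) header. Bit numbering is an
# assumption taken from the diagram: b7 is the most significant bit.
P_BIT = 0x80  # b7: poll, data acknowledgement requested
S_BIT = 0x04  # b2: Command Status field contains status
O_BIT = 0x02  # b1: as in an ordinary SCSI Response
U_BIT = 0x01  # b0: as in an ordinary SCSI Response


@dataclass
class DataInFlags:
    poll: bool
    status: bool
    o_bit: bool
    u_bit: bool


def decode_data_in_flags(flags: int) -> DataInFlags:
    """Decode the flags byte; P is ignored whenever S is set."""
    status = bool(flags & S_BIT)
    poll = bool(flags & P_BIT) and not status
    return DataInFlags(poll, status, bool(flags & O_BIT), bool(flags & U_BIT))
```

A receiver would consult StatRN, Command Status, and Residual Count only when the decoded status flag is true.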
2.7.6 Data numbering (DataRN)
On inbound data, the target MAY number (sequence) the data packets to
enable shorter recovery on connection failure. If the target numbers
data packets, the initiator MUST acknowledge them by specifying the
next expected packet in a NOP-Out with the same Initiator Task Tag.
Acknowledging NOP PDUs MAY be postponed for up to the number of
incoming data PDUs negotiated at login. An explicit request for
acknowledgement made by setting the P bit MUST be honored.
2.8 Text Command
The Text Command is provided to allow the exchange of information and
for future extensions. It permits the initiator to inform a target of
its capabilities or to request some special operations.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x04 |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Reserved (0) |
+---------------+---------------+---------------+---------------+
24| CmdRN |
+---------------+---------------+---------------+---------------+
28| ExpStatRN |
+---------------+---------------+---------------+---------------+
32/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48/ Text /
+/ /
+---------------+---------------+---------------+---------------+
2.8.1 Length
This is the length, in bytes, of the Text field.
2.8.2 Initiator Task Tag
The initiator assigned identifier for this Text Command.
If the command is sent as part of the Login Phase the Initiator Task
Tag MUST be the same as the one sent with the Login Command.
2.8.3 Text
The initiator sends the target a set of key:value or key:(list) pairs
encoded in UTF-8 Unicode. The key and value are separated by a ':'
(0x3A) delimiter. Many key:value pairs can be included in the Text
block by separating them with null ' ' (0x00) delimiters.
Character strings are represented following the C-language syntax.
Numeric and binary values are represented using either decimal
numbers or the hexadecimal 0x'ffff' notation. The result is adjusted
to the specific key.
Some basic key:value pairs are described in Appendix A & C. The
target responds by sending its response back to the initiator. The
target and initiator can then perform some advanced operations based
on their common capabilities.
Manufacturers may introduce new keys by prefixing them with their
(reversed) domain name, for example the company owning the domain
acme.com can issue:
com.acme.bar.foo.do_something:0000000000000003
Any key that the target does not understand may be ignored without
affecting basic function. Once the target has processed all the
key:value or key:(list) pairs, it responds with the Text Response
command, listing the parameters that it supports. It is recommended
that Text operations that will take a long time should be placed in
their own Text command. If the Text Response does not contain a key
that was requested, the initiator must assume that the key was not
understood by the target.
Targets and initiators may limit the size of the text accepted in a
text command and text response as well as the size of key:value
pairs. Such limits should be indicated at login.
The default limit is 16384 UTF8 characters.
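The Text field encoding described in this section can be sketched as follows. The function names are hypothetical. The sketch assumes that the NUL delimiter only separates pairs (a trailing NUL is tolerated on input) and that the default limit counts characters rather than bytes; neither point is settled by the text above.

```python
from typing import Dict

MAX_TEXT_CHARS = 16384  # default limit quoted in the text


def encode_text(pairs: Dict[str, str]) -> bytes:
    """Encode key:value pairs as UTF-8, separated by NUL (0x00)."""
    items = []
    for key, value in pairs.items():
        if ":" in key or "\x00" in key or "\x00" in value:
            raise ValueError("delimiter character inside key or value")
        items.append(key + ":" + value)
    text = "\x00".join(items)
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError("text exceeds the default size limit")
    return text.encode("utf-8")


def decode_text(data: bytes) -> Dict[str, str]:
    """Split a Text field into pairs; the key ends at the first ':'."""
    pairs = {}
    for item in data.decode("utf-8").split("\x00"):
        if not item:
            continue  # tolerate empty items such as a trailing NUL
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError("missing ':' delimiter in " + repr(item))
        pairs[key] = value
    return pairs
```

For example, encoding UseR2T:no together with the vendor key shown above yields the two strings joined by a single 0x00 byte, and decode_text recovers the same pairs.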
2.9 Text Response
The Text Response message contains the responses of the target to the
initiator's Text Command. The format of the Text field matches that
of the Text Command.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x84 |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Reserved (0) |
+---------------+---------------+---------------+---------------+
24| StatRN |
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48/ Text Response /
+/ /
+---------------+---------------+---------------+---------------+
2.9.1 Length
This is the length, in bytes, of the Text Response field.
2.9.2 Initiator Task Tag
The Initiator Task Tag matches the tag used in the initial Text
Command or the Login Initiator Task Tag.
2.9.3 Text Response
The Text Response field contains responses in the same key:value
format as the Text Command. Appendix C lists some basic Text Commands
and their Responses. If the Text Response does not contain a key
that was requested, the initiator must assume either that the key was
not understood by the target or that the answer is <key>:none; the two
cases MUST be treated as equivalent where applicable.
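One way an initiator might apply this rule is to treat a missing key exactly like an explicit "none" answer. The helper name below is hypothetical, and the response is assumed to be already parsed into a dictionary.

```python
from typing import Dict


def answer_for(key: str, response: Dict[str, str]) -> str:
    """Return the target's answer for key; a missing key reads as 'none'."""
    return response.get(key, "none")
```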
2.10 Login Command
After establishing a TCP connection between an initiator and a
target, the initiator MUST issue a Login Command to gain further
access to the target's resources.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x03 |0| Reserved (0)| Version-major | Version-minor |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| CID | Reserved (0) |
+---------------+---------------+---------------+---------------+
12| ISID |TSID |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Reserved (0) |
+---------------+---------------+---------------+---------------+
24| InitCmdRN or 0 |
+---------------+---------------+---------------+---------------+
28/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48/ Login Parameters in Text Command Format /
+/ /
+---------------+---------------+---------------+---------------+
2.10.1 Version-major and Version-minor
Currently 0.3
2.10.2 CID
A unique id for this connection within the session
2.10.3 InitCmdRN
InitCmdRN is significant only if TSID is zero, in which case it
indicates the starting command reference number for this session; it
SHOULD be zero in all other instances. If it is significant (TSID is
0) and its value is zero, then this is a single-connection session
with no support for command numbering.
2.10.4 Login Parameters
The initiator MAY provide some basic parameters in order to enable
the target to determine if the initiator may in fact use the target's
resources and the initial text parameters for the security exchange.
The format of the parameters is as specified for the Text Command.
Keys and their explanations are listed in the Appendices.
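A minimal sketch of how the Login Command header could be assembled is given below. It assumes network (big-endian) byte order and 16-bit CID, ISID, and TSID fields, which is how the diagram is read here; the function name and argument names are hypothetical.

```python
import struct

LOGIN_OPCODE = 0x03
HEADER_LEN = 48


def build_login_command(cid, isid, tsid, task_tag, init_cmd_rn,
                        text=b"", version=(0, 3)):
    """Pack a 48-byte Login Command header followed by the parameters."""
    header = struct.pack(
        ">BBBBIHHHHIII20x",
        LOGIN_OPCODE,
        0,             # byte 1: reserved
        version[0],    # Version-major
        version[1],    # Version-minor
        len(text),     # Length of the login parameters
        cid,           # bytes 8-9 (assumed 16 bits)
        0,             # bytes 10-11: reserved
        isid,          # bytes 12-13 (assumed 16 bits)
        tsid,          # bytes 14-15 (assumed 16 bits)
        task_tag,      # Initiator Task Tag
        0,             # bytes 20-23: reserved
        init_cmd_rn,   # InitCmdRN or 0
    )
    return header + text
```

The login parameters passed as text are expected to be in the Text Command format, that is, UTF-8 key:value pairs separated by 0x00.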
2.11 Login Response
The Login Response indicates the end of the login phase. Note, if
security is established, the login response is authenticated.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x83 |F| Reserved (0)| Version-major | Version-minor |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
12| ISID |TSID |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Reserved (0) |
+---------------+---------------+---------------+---------------+
24| InitStatRN |
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36| Status | Reserved (0) |
+---------------+---------------+---------------+---------------+
40/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48/ Login Parameters in Text Command Format /
+/ /
+---------------+---------------+---------------+---------------+
2.11.1 Version-major and Version-minor
Indicates the version supported. Assuming versions are backward
compatible, it indicates the highest (compatible) version supported
by the target.
2.11.2 InitStatRN
This is the starting status reference number for this connection.
2.11.3 Status
The Status returned in a Login Response is one of the following:
0 accept login (will now accept SCSI commands)
1 reject login
In the case that the Status is "accept login" the initiator may
proceed to issue SCSI commands. In the case that the Status is
"reject login" the initiator should immediately close down its end of
the TCP connection, thus freeing up the target's port for some other
connection. The target also has the option of immediately closing
down its end of the TCP connection.
2.11.4 TSID
The TSID is an initiator identifying tag set by the target. A 0 in
the returned TSID indicates that either the target supports only a
single connection or that the ISID has already been used as a leading
ISID. In both cases, the target is rejecting the login.
2.11.5 Final bit
Final bit is set to one in the Final Login Response. A Final bit of 0
indicates a "partial" response - more negotiation needed.
TSID must be returned in the partial response and the same value must
be presented with the final response.
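The Login Response fields described in this section could be extracted as sketched below. The sketch assumes big-endian byte order, 16-bit ISID and TSID fields, and the F bit in the most significant bit of byte 1, all read from the diagram; the names LoginResponse and parse_login_response are hypothetical.

```python
import struct
from typing import NamedTuple

LOGIN_RESPONSE_OPCODE = 0x83
HEADER_LEN = 48


class LoginResponse(NamedTuple):
    final: bool
    version: tuple
    isid: int
    tsid: int
    task_tag: int
    init_stat_rn: int
    exp_cmd_rn: int
    max_cmd_rn: int
    status: int
    text: bytes


def parse_login_response(pdu: bytes) -> LoginResponse:
    """Unpack the 48-byte header and return it with the text payload."""
    if len(pdu) < HEADER_LEN:
        raise ValueError("PDU shorter than the 48-byte header")
    (opcode, flags, major, minor, length, isid, tsid, task_tag,
     init_stat_rn, exp_cmd_rn, max_cmd_rn, status) = struct.unpack(
        ">BBBBI4xHHI4xIIIB11x", pdu[:HEADER_LEN])
    if opcode != LOGIN_RESPONSE_OPCODE:
        raise ValueError("not a Login Response")
    text = pdu[HEADER_LEN:HEADER_LEN + length]
    if len(text) != length:
        raise ValueError("payload shorter than the Length field")
    return LoginResponse(bool(flags & 0x80), (major, minor), isid, tsid,
                         task_tag, init_stat_rn, exp_cmd_rn, max_cmd_rn,
                         status, text)
```

An initiator would keep the TSID from a partial response (final is false) and compare it with the TSID carried by the final response.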
2.12 NOP-Out
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x00 |P| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag or Reserved (0) |
+---------------+---------------+---------------+---------------+
20| Target Tag or Reserved (0x'ffffffff') |
+---------------+---------------+---------------+---------------+
24| CmdRN or (0) |
+---------------+---------------+---------------+---------------+
28| ExpStatRN or (0) |
+---------------+---------------+---------------+---------------+
32| ExpDataRN or (0) |
+---------------+---------------+---------------+---------------+
36/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48/ Ping Data (optional) /
+/ /
+---------------+---------------+---------------+---------------+
The NOP-Out with the P bit set acts as a "ping command".
This form of the NOP-Out can be used to verify that a connection is
still active and all its components are operational using in-order
delivery or out-of-order delivery. It may be useful in the case where
an initiator has been waiting a long time for the response to some
command, and the initiator suspects that there is some problem with
the connection. When a target receives the NOP-Out with the P bit
set, it should respond with a NOP-In (the Ping Response), duplicating
as much as possible of the data that was provided in the NOP-Out. If the
initiator does not receive the NOP-In within some time (determined by
the initiator), or if the data returned by the NOP-In is different
from the data that was in the NOP-Out, the initiator may conclude
that there is a problem with the connection. The initiator will then
close the connection and may try to establish a new connection.
The NOP-Out with the P bit not set MUST be used to acknowledge data
received from a target (data-ack) whenever data numbering is used. In
this case, the command carries the same Initiator Task Tag as the data
it acknowledges and the CmdRN field MUST be zero. Duplicate or
obsolete data acknowledgements MUST be silently discarded by the
target.
The NOP-Out can be sent by an initiator because of a NOP-In with the
poll bit set, in which case the Target Tag will copy the NOP-In
value.
2.12.1 P - Ping bit
Request a NOP-In
2.12.2 Length
This is the length of the optional Ping Data.
2.12.3 Initiator Task Tag
An initiator assigned identifier for the operation.
The NOP-Out MUST have the Initiator Task Tag set only if the P bit is
one or the DataRN field is set.
2.12.4 Target Task Tag
A target assigned identifier for the operation.
The NOP-Out MUST have the Target Tag set only if it is issued in
response to a NOP-In with the P bit one, in which case it copies the
Target Tag from the NOP-In PDU.
2.12.5 Ping Data
Binary data that will be reflected in the Ping Response.
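The ping exchange can be sketched as follows. The helper names are hypothetical, byte order is assumed to be big-endian, and the P bit is assumed to be the most significant bit of byte 1. Because a target may reflect only part of the Ping Data, the check below accepts a reply that is a prefix of what was sent; that reading is an assumption of this sketch.

```python
import struct

NOP_OUT_OPCODE = 0x00
NOP_IN_OPCODE = 0x80
HEADER_LEN = 48
NO_TARGET_TAG = 0xFFFFFFFF  # reserved value: Target Tag not supplied


def build_ping_nop_out(task_tag, cmd_rn, exp_stat_rn, ping_data=b""):
    """Build a NOP-Out with the P bit set (a ping command)."""
    header = struct.pack(
        ">BB2xI8xIIIII12x",
        NOP_OUT_OPCODE,
        0x80,            # P bit set
        len(ping_data),  # Length of the optional Ping Data
        task_tag,        # Initiator Task Tag
        NO_TARGET_TAG,   # not answering a NOP-In
        cmd_rn,
        exp_stat_rn,
        0,               # ExpDataRN unused here
    )
    return header + ping_data


def ping_reply_matches(nop_out: bytes, nop_in: bytes) -> bool:
    """Check opcode, Initiator Task Tag, and reflected Ping Data."""
    if len(nop_in) < HEADER_LEN or nop_in[0] != NOP_IN_OPCODE:
        return False
    if nop_in[16:20] != nop_out[16:20]:
        return False
    length = int.from_bytes(nop_in[4:8], "big")
    returned = nop_in[HEADER_LEN:HEADER_LEN + length]
    if len(returned) != length:
        return False
    sent = nop_out[HEADER_LEN:]
    return sent[:len(returned)] == returned
```

An initiator that sees no matching NOP-In within its own timeout, or a mismatch, may treat the connection as suspect.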
2.13 NOP-In
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x80 |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Target Tag or Reserved (0x'ffffffff') |
+---------------+---------------+---------------+---------------+
24| StatRN |
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48/ Return Ping Data /
+/ /
+---------------+---------------+---------------+---------------+
When a target receives the NOP-Out with the P bit set, it MUST
respond with a NOP-In, with the same Initiator Task Tag that was
provided in the Ping Command. It SHOULD also duplicate as much of the
initiator provided Ping Data as allowed by a configurable target
parameter.
A target may issue a NOP-In on its own to test the connection and the
state of the initiator. In this case the Initiator Task Tag MUST be 0
and the Target Tag MUST be set (not 0x'ffffffff') only if the P bit is
1.
2.13.1 Target Task Tag
A target assigned identifier for the operation.
2.14 Logout Command
The Logout command is used to perform a controlled closing of a
connection.
An initiator MAY use a logout command to remove a connection from a
session.
If an initiator intends to start recovery for a failing connection it
MUST use the Logout command to "clean-up" the target end of a failing
connection and enable recovery to start. On sessions with a single
connection, this might imply opening a second connection with the
sole purpose of cleaning-up the first.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x06 |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| CID | Reserved (0) |Reason Code |
+---------------+---------------+---------------+---------------+
12| Reserved (0) |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
48
2.14.1 CID
The connection ID of the connection to be closed (including closing
the TCP stream)
2.14.2 Reason Code
Indicates the reason for Logout:
0 - Remove the connection; the session is closing
1 - Remove the connection for recovery
2 - Remove the connection at the target's request (requested
through an AEN)
2.15 Logout Response
The Logout Response is used by the target to indicate that the
cleanup operation for the failed connection has completed.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x86 |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8/ Reserved (0) /
/ /
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36| Status | Reserved (0) |
+---------------------------------------------------------------+
48
2.15.1 Status
Logout ending status:
0 - connection closed successfully
1 - cleanup failed
2.16 Ready To Transfer (R2T)
When an initiator has submitted a SCSI Command with data passing from
the initiator to the target (WRITE), the target may specify which
blocks of data it is ready to receive. In general, the target may
request that the data blocks be delivered in whatever order is
convenient for the target at that particular instant. This
information is passed from the target to the initiator in the Ready
To Transfer (R2T) message.
In order to allow write operations without R2T, the initiator and
target must have agreed to do so by both sending the UseR2T:no key-
pair attribute to each other (either during Login or through the Text
Command/Response mechanism).
An R2T MAY be answered with one or more iSCSI Data-out PDUs with a
matching Target Task Tag. If an R2T is answered with a single Data
PDU, the Buffer Offset in the Data PDU MUST be the same as the one
specified by the R2T and the data length of the Data PDU must not
exceed the Desired Data Length specified in the R2T. If the R2T is
answered with a sequence of Data PDUs, the Buffer Offset and Length
must be within the range of those specified by the R2T, the last PDU
should have the F bit set to 1, the Buffer Offsets and Lengths for
consecutive PDUs SHOULD form a continuous non-overlapping range, and
the PDUs should be sent in increasing offset order.
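The rules for answering an R2T can be expressed as a simple checker. This is a sketch only: the function name is hypothetical, each Data PDU is reduced to a (buffer_offset, length, final_bit) tuple, and violations of SHOULD-level rules are reported alongside MUST-level ones without distinction.

```python
from typing import List, Tuple


def check_r2t_answer(r2t_offset: int, desired_length: int,
                     pdus: List[Tuple[int, int, bool]]) -> List[str]:
    """Return a list of problems found in the Data PDUs answering one R2T."""
    if not pdus:
        return ["R2T was not answered"]
    problems = []
    end = r2t_offset + desired_length
    # Every PDU must stay within the range requested by the R2T.
    for off, length, _final in pdus:
        if off < r2t_offset or off + length > end:
            problems.append(
                "PDU at offset %d falls outside the R2T range" % off)
    # A single PDU must start exactly at the R2T Buffer Offset.
    if len(pdus) == 1 and pdus[0][0] != r2t_offset:
        problems.append("single Data PDU must start at the R2T Buffer Offset")
    # Consecutive PDUs should be contiguous and in increasing offset order.
    for (off_a, len_a, _fa), (off_b, _lb, _fb) in zip(pdus, pdus[1:]):
        if off_b != off_a + len_a:
            problems.append(
                "PDU at offset %d does not continue the previous PDU" % off_b)
    # The last PDU of the sequence should carry the F bit.
    if not pdus[-1][2]:
        problems.append("last Data PDU does not have the F bit set")
    return problems
```

An empty list means the sequence satisfies every rule checked here.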
The target may send several R2T PDUs and thus have a number of data
transfers pending. The present document does not limit the number of
outstanding data transfers. However, the target SHOULD NOT issue
overlapping R2T requests (i.e., referring to the same data area). All
outstanding R2Ts should have different Target Transfer Tags.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x90 |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| Reserved (0) |
+ +
12| |
+---------------+---------------+---------------+---------------+
16| Initiator Task Tag |
+---------------+---------------+---------------+---------------+
20| Target Task Tag |
+---------------+---------------+---------------+---------------+
24| Reserved (0) |
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36| Desired Data Length |
+---------------+---------------+---------------+---------------+
40| Buffer Offset |
+---------------+---------------+---------------+---------------+
44| Reserved (0) |
| |
+---------------+---------------+---------------+---------------+
48
2.16.1 Desired Data Transfer Length and Buffer Offset
The target specifies how many bytes it wants the initiator to send
because of this R2T message. The target may request the data from
the initiator in several chunks, not necessarily in the original
order of the data. The target, therefore, also specifies a Buffer
Offset indicating the point at which the data transfer should begin,
relative to the beginning of the total data transfer.
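As an illustration, a target could divide a write into R2T requests as sketched below. The names R2T and plan_r2ts, the fixed chunk size, and the sequential tag assignment are assumptions of the sketch, not requirements of the protocol.

```python
from typing import List, NamedTuple


class R2T(NamedTuple):
    target_transfer_tag: int
    buffer_offset: int
    desired_length: int


def plan_r2ts(total_length: int, chunk: int, first_tag: int = 1) -> List[R2T]:
    """Split a transfer into non-overlapping R2Ts with distinct tags."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    plan = []
    offset = 0
    tag = first_tag
    while offset < total_length:
        length = min(chunk, total_length - offset)
        plan.append(R2T(tag, offset, length))
        offset += length
        tag += 1
    return plan
```

The resulting requests cover the transfer exactly once, so the target is free to issue them in whatever order suits it.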
2.16.2 Target Transfer Tag
The target assigns its own tag to each R2T request that it sends to
the initiator. This can be used by the target to easily identify data
it receives. The Target Transfer Tag is copied in the outgoing data
PDUs and is provided by the target and used by the target only. There
is no protocol rule about Target Transfer Tag but it is assumed that
it will be used to tag the response data to the target (alone or in
combination with the LUN).
2.17 Asynchronous Event
An Asynchronous Event may be sent from the target to the initiator
without corresponding to a particular command. The target specifies
the status for the event and sense data.
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0| 0x91 |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8| Logical Unit Number (LUN) |
+ +
12| |
+---------------+---------------+---------------+---------------+
16/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
24| StatRN |
+---------------+---------------+---------------+---------------+
28| ExpCmdRN |
+---------------+---------------+---------------+---------------+
32| MaxCmdRN |
+---------------+---------------+---------------+---------------+
36|SCSI Event Ind |iSCSI Event Ind| Reserved (0) |
+---------------+---------------+---------------+---------------+
40/ Reserved (0) /
/ /
+---------------+---------------+---------------+---------------+
48/ Sense Data /
+/ /
+---------------+---------------+---------------+---------------+
2.17.1 iSCSI Event
Some Asynchronous Events are strictly related to iSCSI while others
are related to SAM-2. The codes returned for iSCSI Asynchronous
Events are:
1 Target is being reset.
2 Target requests Logout on this connection
2.17.2 SCSI Event Indicator
The following values are defined. (See [SAM2] for details):
1 An error condition was encountered after command
completion.
2 A newly initialized device is available to this initiator.
3 All Task Sets are being Reset by another Initiator
5 Some other type of unit attention condition has occurred.
6 An asynchronous event has occurred.
Sense Data accompanying the report identifies the condition. The
Length parameter is set to the length of the Sense Data.
For new device identification an iSCSI target MUST support the Device
Identification page.
Please note that StatRN counts this PDU as an acknowledgeable event,
allowing the initiator and target to synchronize their state.
2.18 Third Party Commands
SCSI allows every addressable entity to be either initiator or target.
In host-to-host communication, each one of them can take on the
initiator role. In typical I/O operations between a host and a
peripheral subsystem, the host plays the initiator role and the
peripheral subsystem plays the target role.
For third-party SCSI commands that involve device-to-device
communication, such as (EXTENDED) COPY and COMPARE, SCSI defines a
copy-manager. The copy-manager takes on the
role of initiator in the device-to-device communication. The copy-
manager is the "original-target" of the command and acts as initiator
for a (variable) number of the devices, called sources and
destinations. Sources and destinations act as targets. The whole
operation is described by one "master CDB" delivered to the copy-
manager and a series of descriptor blocks; each descriptor block
addresses a source and destination target and LU and a description of
the work to be done in terms of blocks or bytes as required by the
device types. The relevant SCSI standards do not require full support
of the (EXTENDED) COPY or COMPARE nor do they provide a detailed
execution model.
To address them an iSCSI copy-manager will use information provided
to it through map commands and the SRAs and flags provided in the
descriptors - allowing for iSCSI and FC sources and destinations.
Enabling a FC copy-manager to support iSCSI sources and destinations
is subject to coordination with T10.
2.19 Reject
Byte / 0 | 1 | 2 | 3 |
/ | | | |
|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
+---------------+---------------+---------------+---------------+
0|0| 0xef |0| Reserved (0) |
+---------------+---------------+---------------+---------------+
4| Length |
+---------------+---------------+---------------+---------------+
8/ Reserved (0) /
+/ /
+---------------+---------------+---------------+---------------+
36| Reason | Reserved (0) |
+---------------+---------------+---------------+---------------+
40| Reserved (0) |
+/ /
+---------------+---------------+---------------+---------------+
48/ Header of Bad Message /
+/ /
+---------------+---------------+---------------+---------------+
96
It may happen that a target receives a message with a format error
(inconsistent fields, reserved fields not 0, nonexistent LUN, etc.) or
a digest error (invalid payload or header). The target returns the
header of the message in error as the data of the response.
2.20 Reason
The reject Reason is coded as follows:
1 - Format Error
2 - Header Digest Error
3 - Payload Digest Error
3. Login phase
The login phase establishes an iSCSI session between initiator and
target. It sets the iSCSI protocol parameters, security parameters,
and authenticates initiator and target to each other.
The login phase is implemented via login and text commands and
responses only. The login command is sent from the initiator to
target in order to start the login phase and the login response is
sent from the target to the initiator to conclude the login phase.
Text messages are used to implement negotiation, establish security
and set operational parameters.
The whole login phase is considered as a single task and has a single
Initiator Task Tag (very much like the linked SCSI commands).
The login phase sequence of commands and responses proceeds as
follows:
- Login command (mandatory)
- Login Partial-Response (optional)
- Text Command(s) and Response(s) (optional)
- Login Final-Response (mandatory)
3.1 Login phase start
The login phase starts with a login request via a login command from
the initiator to the target. The login request includes:
-Protocol version supported by the initiator (currently 0.3)
-Session and connection Ids
-Security Parameters (if security is requested) and
-Protocol parameters
The target can answer in the following ways:
-Login Response with Login Reject (and Final bit 1). This is
an immediate rejection from the target causing the session to
terminate. Causes for rejection are address rejection, local
protection, etc. Login reject with Final bit 0 is a format
error.
-Login Response with Login Accept with session ID and iSCSI
parameters and Final bit 1. In this case, the target does not
support any security or authentication mechanism and starts
with the session immediately (enters full feature phase)
-Login Response with Final bit 0 indicating the start of an
authentication/negotiation sequence. The response includes the
protocol version supported by the target and the security
parameters (not iSCSI parameters, those will be returned only
after security is established to protect them) supported by the
target.
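The possible answers listed above can be summarized as a small decision function on the initiator side. The enum and function names are hypothetical; the status values 0 (accept) and 1 (reject) are those given for the Login Response Status field.

```python
from enum import Enum

STATUS_ACCEPT = 0
STATUS_REJECT = 1


class LoginOutcome(Enum):
    REJECTED = "session terminates"
    FORMAT_ERROR = "reject with Final bit 0 is a format error"
    FULL_FEATURE_PHASE = "accepted, session starts immediately"
    NEGOTIATION = "authentication/negotiation sequence starts"


def classify_login_response(status: int, final: bool) -> LoginOutcome:
    """Map the Status field and Final bit to the next step."""
    if status == STATUS_REJECT:
        return LoginOutcome.REJECTED if final else LoginOutcome.FORMAT_ERROR
    if status == STATUS_ACCEPT:
        if final:
            return LoginOutcome.FULL_FEATURE_PHASE
        return LoginOutcome.NEGOTIATION
    raise ValueError("unknown login status %d" % status)
```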
3.2 Security negotiation
The negotiation proceeds as follows:
-The initiator sends a text command with an ordered list of the
options it supports for each subject (encryption algorithm,
authentication algorithm, iSCSI parameters and so on). The
options are listed from the most preferable (to the initiator)
to the least.
-The target MUST reply with the first option in the list it
supports. The parameters are encoded in Unicode - UTF8 as
key:value (e.g., the encryption option of triple-DES will
appear as encryption:3des-cbc). The initiator MAY send
proprietary options as well. The "none" option MUST be included
in the list, indicating no algorithm supported by the target.
If security is to be established, the initiator MUST NOT send
parameters other than security parameters in the login command.
The general parameters should be negotiated only after security
is established at the desired level. Any operational
parameters sent before establishing a secure context MUST be
reset by both the target and the initiator when establishing
the security context. For a list of security parameters see
Appendix A.
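The selection rule for the target can be sketched as follows. The function name is hypothetical, and the offered options are assumed to be already parsed into an ordered list (the on-the-wire list syntax is not shown here).

```python
from typing import List, Set


def choose_option(offered: List[str], supported: Set[str]) -> str:
    """Return the first offered option that the target supports."""
    for option in offered:  # offered is ordered by initiator preference
        if option in supported:
            return option
    # Not reachable when the initiator includes "none" as required and
    # the target counts "none" among its supported options.
    raise ValueError("no offered option is supported")
```

For example, an initiator offering 3des-cbc followed by none to a target that supports only none would receive encryption:none in reply.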
3.3 iSCSI Security
The security exchange sets the security mechanism and authenticates
the user and the target to each other. The exchange proceeds
according to the algorithms that were chosen in the negotiation phase
and is conducted through the key:value parameters of Text Commands.
The security mechanism includes the following elements:
-Initial authentication - the host and the target authenticate
themselves to each other. A negotiable algorithm, e.g.,
user/password or public key, provides this feature.
-Message integrity - an integrity and authentication digest is
attached to each packet and authenticates it. The algorithm is
negotiable.
-Encryption - data from host to target and from target to host
is encrypted. The user MAY choose to encrypt only part of the
data, e.g., headers only (for complexity reasons). Encryption
MAY use IPsec. The algorithm and its parameters are negotiable.
Using IPsec for encryption or authentication may eliminate the need
for parameter negotiation at the iSCSI level (for example, ISAKMP for
IPsec). However, there is still a need to negotiate for the algorithm
itself.
If security is established in the login phase note that:
-After setting message integrity, each iSCSI message MUST
include the appropriate digest field (i.e., each message after
the one through which the target chose the algorithm).
-If encryption is to be set (e.g., IPsec), it should be set
prior to the login phase.
-The iSCSI parameter negotiation (non-security parameters)
SHOULD start only after security is established. This negotiation
should be carried in text commands.
4. iSCSI Error Handling and Recovery
4.1 Connection failure
For any outstanding SCSI command, it is assumed that iSCSI in
conjunction with SCSI at the initiator is able to keep enough
information to rebuild the command PDU, and that outgoing data
is available (in host memory) for retransmission while the command is
outstanding. It is also assumed that, at a target, iSCSI and
specialized TCP implementations are able to recover unacknowledged
data packets from a closing connection or, alternatively, the target
has means to re-read data from a device server. It is further
assumed that a target will keep the "status & sense" for a command it
has executed while the total number of outstanding commands and
executed commands does not exceed its limit. A target will
sequentially number the delivered responses and thus enable
initiators to tell when a response is missing and which response is
missing.
Under those conditions, iSCSI will be able to keep a session in
operation if it is able to keep/establish at least one TCP connection
between the initiator and target in a timely fashion. Unfortunately,
the maximum admissible recovery time is a function of the target and
for some devices and communications networks recovery may be complex
and may percolate to upper software layers. It is assumed that
targets and/or initiators will recognize a failing connection by
either transport level means (TCP) or by a gap in the command or
response stream that is not filled for a long time, or by a failing
iSCSI NOP-ping (the latter MAY be used periodically by highly reliable
implementations). Initiators and targets MAY also use the keep-alive
option on the TCP connection to enable early link failure detection
on idle links.
The iSCSI recovery involves the following steps:
-abort offending TCP connection(s) (target & initiator) and
recover at target all unacknowledged read-data
-issue a Logout command on a remaining connection or create a
new connection and issue the Logout command
-wait for the Logout response
-if needed, create one or more new TCP connections (within the
same session) and associate all outstanding commands from the
failed connection to the new connection at both initiator and
target.
-the initiator will reissue all outstanding commands with their
original Initiator Task Tag and their original CmdRN if they
are not acknowledged yet or a CmdRN of 0 (not-numbered) if they
were acknowledged; the retry (X) flag in the command PDU will
be set
-upon receiving the new/retry commands the target will resume
command execution; for write commands it means requesting data
retransmission through R2T, for reads retransmitting recovered
data and for "terminated" commands retransmitting the Status &
Sense while retaining the original StatRN. If data recovery is
not possible, the target will either provide data from the
media or redo the operation (if the operation is not idempotent
the device server may fail the operation).
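The reissue step of the recovery sequence above can be sketched as
follows. This is a minimal illustration, not a PDU layout: the
Command and RetryPdu types and the build_retries function are
hypothetical names, and only the fields mentioned in the text
(Initiator Task Tag, CmdRN, retry flag) are modeled.

```python
from dataclasses import dataclass


@dataclass
class Command:
    task_tag: int       # original Initiator Task Tag
    cmd_rn: int         # original CmdRN
    acknowledged: bool  # True if the target already acknowledged the CmdRN


@dataclass
class RetryPdu:
    task_tag: int
    cmd_rn: int         # 0 means "not-numbered"
    retry: bool         # the retry (X) flag


def build_retries(outstanding):
    """Build the retry PDUs for commands outstanding on a failed connection."""
    return [
        RetryPdu(
            task_tag=cmd.task_tag,
            # Keep the original CmdRN unless it was already acknowledged.
            cmd_rn=0 if cmd.acknowledged else cmd.cmd_rn,
            retry=True,
        )
        for cmd in outstanding
    ]


retries = build_retries([Command(0x10, 7, False), Command(0x11, 5, True)])
```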
4.2 Protocol Errors
The authors recognize that mapping framed messages over a "stream"
connection (like TCP) makes the proposed mechanisms vulnerable to
simple software framing errors, and that introducing framing
mechanisms may be onerous for performance and bandwidth. Command
reference numbers and the above mechanisms for connection drop and
reestablishment will help handle this type of mapping error.
4.3 Session Errors
If all the connections of a session fail and can't be reestablished
in a short time or if initiators detect protocol errors repeatedly,
an initiator may choose to terminate a session and establish a new
session. It will terminate all outstanding requests with an iSCSI
error indication before initiating a new session. A target that
detects one of the above errors will take the following actions:
- Reset the TCP connections (close the session).
- Abort all Tasks in the task set for the corresponding
initiator.
4.4 Format errors
Explicit violations of the rules stated in this document are
considered as format errors.
While a session is active, whenever a target receives an iSCSI PDU
with a format error, it MUST answer with a Reject iSCSI PDU with
a Reason-code of Format-error.
While a session is active, whenever an initiator receives an iSCSI PDU
with a format error, for which it has an outstanding task, it MUST
abort the target task and report the error as a SCSI check condition
status with a sense key of 4h (hardware error).
4.5 Digest errors
When a target receives an iSCSI data PDU with a data payload digest
error, it MUST discard it and request retransmission with an R2T.
When a target receives an iSCSI PDU with a header digest error or a
payload digest error in anything but a data iSCSI PDU, it MUST answer
with a Reject iSCSI PDU with a Reason-code of Digest-error.
When an initiator receives an iSCSI data PDU with a data payload
digest error or any other iSCSI PDU with a header or payload digest
error, it MUST discard the PDU and restart the task, provided it can
recognize the Initiator Task Tag. If the initiator cannot recognize
the Initiator Task Tag (e.g., because of a header digest error), the
initiator MUST log out of the connection and restart it (including
restarting all outstanding tasks).
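The digest error rules of this section can be summarized as a small
decision sketch. The function names and the returned action strings
are illustrative only. The sketch also assumes one reading of the
text: a header digest error on a data PDU is answered with a Reject,
since the R2T rule mentions only data payload digest errors.

```python
def target_digest_action(is_data_pdu, header_ok, payload_ok):
    """Action a target takes for a received PDU, per the rules above."""
    if header_ok and payload_ok:
        return "accept"
    if is_data_pdu and header_ok:
        # Only the data payload digest failed on a data PDU.
        return "discard-and-request-R2T"
    # Header digest error, or payload digest error on a non-data PDU.
    return "reject:Digest-error"


def initiator_digest_action(task_tag_recognized):
    """Action an initiator takes for a PDU that failed a digest check."""
    if task_tag_recognized:
        return "discard-and-restart-task"
    return "logout-and-restart-connection"
```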
5. Notes to Implementers
This section notes some of the performance and reliability
considerations of the iSCSI protocol. This protocol was designed to
allow efficient silicon and software implementations. The iSCSI tag
mechanism was designed to enable RDMA at the iSCSI level or lower.
5.1 Multiple Network Adapters
The iSCSI protocol allows multiple connections, not all of which need
go over the same network adapter. If multiple network connections are
to be utilized with hardware support, the allegiance of the iSCSI
command, data, and status to one TCP connection ensures that there is
no need to replicate information across network adapters or otherwise
require them to cooperate.
5.2 Autosense
Autosense refers to the automatic return of sense data to the
initiator in case a command did not complete successfully. iSCSI
mandates support for autosense.
6. Security Considerations
6.1 Data Integrity
We assume that basic level end-to-end data integrity can be
reasonably handled by TCP, by using the standard checksum. For those
applications for which data integrity is of utmost importance iSCSI
will provide an integrity option.
6.2 Network operations and the Threat Model
Historically, native storage systems have not had to consider
security because their environments offered minimal security risks.
That is, these environments consisted of storage devices either
directly attached to hosts or connected via a subnet distinctly
separate from the communications network. The use of storage
protocols, such as SCSI, over IP networks requires that security
concerns be addressed.
6.2.1 Threat Model
Attacks fall into three main areas: passive, active, and denial of
service.
6.2.1.1 Passive Attacks
Often, data transfers will be made through a switched fabric, making
sniffing difficult. In addition, the nature of the data (block
transfers), even if sniffed, would not necessarily be readily
understandable to the attacker. That being said, a determined
attacker, by capturing content and analyzing traffic over time,
could replicate enough of a storage device to make the captured data
meaningful. Certain storage operations which are mostly
unidirectional, such as writing to a tape or reading from a CD-ROM,
are more susceptible to passive attacks since the listener will be
able to replicate most if not all of the operation.
Passive attacks by traffic analysis alone are deemed out of scope
since it is unlikely that the listener will be able to guess any
pertinent information without knowing the content of the messages.
It is also out of scope to detect passive attacks. The protocol must
be able to prevent passive attacks by masking the contents of
messages through some form of encryption.
Finally, it is assumed that a strong authentication mechanism will be
necessary. Therefore, any long-lived passwords or private keys SHOULD
NOT be sent in the clear.
6.2.1.2 Active Attacks
Whereas passive attacks involve SNIFFING, active attacks will
generally involve SPOOFING. If an attacker can successfully
masquerade as a client, he will have total read/write access to those
storage resources assigned to that client. Spoofing as a server is
sometimes more difficult, since many operations involve client reads
of some expected or otherwise understandable data.
Most likely, many of the sessions will be long-lived. This feature
has a dual effect of making these sessions more vulnerable to attack
(hijacking TCP connections, cryptographic attacks), while at the same
time providing mechanisms to detect attacks. An attempt to open a
session while one is already active can be treated as a possible
attack. Both the transport and session layer protocols will have
sequencing that would need to be adhered to by the attacker to avoid
generating errors that could also be treated as a possible attack.
Message modification can be a significant threat to an environment
reliant on the integrity of the data. Message replay, insertion, or
deletion will generally produce errors (such as data
overruns/underruns) that can be recovered successfully. However, they
can reduce performance and, as such, can act as a denial of service.
It is possible that an attacker can modify a message in such a way
that the session becomes uncoordinated, resulting in a teardown of
the session.
6.2.2 Security Model
6.2.2.1 No Security
This mode does not authenticate nor does it encrypt data. This mode
should only be used in environments where there is minimal security
risk and little chance for configuration errors.
6.2.2.2 End-to-End Authentication
This mode protects against unauthorized access to storage
resources either through an active attack (SPOOFING) or configuration
errors. Once the client is authenticated, all messages are sent and
received in the clear. This mode should only be used when there is
minimal risk of man-in-the-middle attacks, eavesdropping, message
insertion, deletion, and modification. For example, this mode can be
used when IPsec is used in security gateways.
6.2.2.3 iSCSI integrity and authentication
The iSCSI protocol provides an authentication mechanism for initiator
and target. This includes login authentication and authentication
trailers for headers and data. No encryption is provided at the iSCSI
protocol level. The implementers may use other protocols (e.g.,
IPsec) for this purpose.
6.2.2.4 Encryption
This mode provides end-to-end encryption (e.g., IPsec). In
addition to authenticating the client, it provides end-to-end data
integrity and protects against man-in-the-middle attacks,
eavesdropping, message insertion, deletion, and modification.
A connection or multiple connections can be protected end-to-end by
using IPsec. In this case, the initiator must use the "Implicit
Authentication" parameter to indicate that IPsec should be used to
specify the Access ID and perform authentication.
6.2.3 Other Considerations
Due to long-lived sessions, is there a need for periodic
authentication after the session is established? For example, should
the client be challenged during keep-alive exchanges in addition to
login?
Due to long-lived sessions with encryption, is there a higher level
of vulnerability to cryptographic attacks?
6.3 Login Process
In some environments, a target will not be interested in
authenticating the initiator. In this case, the target can simply
ignore some or all of the parameters sent in a Login Command, and the
target can simply reply with a basic Login Response indicating a
successful login. Some targets MAY want to perform some kind of
authentication. Various authentication schemes can be used, including
encrypted passwords and trusted certificate authorities. Once the
initiator and target are confident of the identity of the attached
party, the established channel is considered secure.
6.4 Feasibility
The encryption algorithms are computationally complex. Therefore, the
real-time constraints on transmission and reception may make completely
encrypted streams difficult to implement. Working with fast networks
will force the implementers to use one of the following alternatives:
-Hardware implementation
-Partial encryption
The first alternative enables the use of completely encrypted
streams. Although robust, this may be (at least at top speeds)
expensive.
The second alternative does not require specialized hardware, but
will reduce the safety of the system. In most cases, however, the
safety tradeoff is acceptable (e.g., encryption of headers only by
defining an IPsec policy).
Data integrity/authentication through data and header digests can
easily be performed.
7. IANA Considerations
There will be a well-known port for iSCSI connections. This well
known port will be registered with IANA.
8. References and Bibliography
[AC] A detailed proposal for Access Control, Jim Hafner,
T10/99-245
[ALTC] Internet Draft: Alternative checksums (work in
progress)
[CAM] ANSI X3.232-199X, Common Access Method-3 (Cam-3)
[CRC] ISO 3309, High-Level Data Link Control (CRC 32)
[FIPS-180-1] FIPS-Secure Hash Standard
[FIPS-186-2] FIPS-Digital Signature Standard
[Orm96] Orman, H., "The Oakley Key Determination Protocol",
version 1, TR97-92, Department of Computer Science Technical
Report, University of Arizona.
[PKIX-Part1] Housley, R., et al, "Internet X.509 Public Key
Infrastructure, Certificate and CRL Profile", Internet Draft,
draft-ietf-pkix-ipki-part1-11.txt
[RFC793] Transmission Control Protocol, RFC 793
[RFC1122] Requirements for Internet Hosts-Communication Layer,
RFC1122, R. Braden (editor)
[RFC-1766] Alvestrand, H., "Tags for the Identification of
Languages", March 1995.
[RFC1982] Elz, R., Bush, R., "Serial Number Arithmetic", RFC
1982, August 1996.
[RFC2026] Bradner, S., "The Internet Standards Process --
Revision 3", RFC 2026, October 1996.
[RFC-2044] Yergeau, F., "UTF-8, a Transformation Format of
Unicode and ISO 10646", October 1996.
[RFC-2104] Krawczyk, H., Bellare, M., and Canetti, R., "HMAC:
Keyed-Hashing for Message Authentication", February 1997
[RFC-2119] Bradner, S. "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC-2144] Adams, C., "The CAST-128 Encryption Algorithm", May
1997.
[RFC-2234] D. Crocker, P. Overell Augmented BNF for Syntax
Specifications: ABNF
[RFC-2313] B. Kaliski, PKCS #1: RSA Encryption, Version 1.5
[RFC-2434] T. Narten, and H. Alvestrand, "Guidelines for Writing
an IANA Considerations Section in RFCs.", RFC2434, October
1998.
[RFC-2440] Callas, J., et al, "OpenPGP Message Format",
November 1998.
[SAM2] ANSI X3.270-1998, SCSI-3 Architecture Model (SAM-2)
[SBC] ANSI X3.306-199X, SCSI-3 Block Commands (SBC)
[SCSI2] ANSI X3.131-1994, SCSI-2
[Schneier] Schneier, B., "Applied Cryptography Second Edition:
protocols, algorithms, and source code in C", 2nd edition, John
Wiley & Sons, New York, NY, 1996.
[SPC] ANSI X3.301-199X, SCSI-3 Primary Commands (SPC)
[TLS] The TLS Protocol, RFC 2246, T. Dierks et al.
9. Authors' Addresses
Julian Satran
Kalman Meth
IBM, Haifa Research Lab
MATAM - Advanced Technology Center
Haifa 31905, Israel
Phone +972 4 829 6211
Email: Julian_Satran@vnet.ibm.com meth@il.ibm.com
Daniel F. Smith
IBM Almaden Research Center
650 Harry Road
San Jose, CA 95120-6099, USA
Phone: +1 408 927 2072
Email: dfsmith@almaden.ibm.com
Costa Sapuntzakis
Cisco Systems, Inc.
170 W. Tasman Drive
San Jose, CA 95134, USA
Phone: +1 408 525 5497
Email: csapuntz@cisco.com
Randy Haagens
Hewlett-Packard Company
8000 Foothills Blvd.
Roseville, CA 95747-5668, USA
Phone: +1 (916) 785-4578
E-mail: Randy_Haagens@hp.com
Matt Wakeley
Agilent Technologies
1101 Creekside Ridge Drive
Suite 100, M/S RH21
Roseville, CA 95661
Phone: +1 (916) 788-5670
E-Mail: matt_wakeley@agilent.com
Efri Zeidner
SANGate
Israel
efri@sangate.com
Paul von Stamwitz
Adaptec, Inc.
691 South Milpitas Boulevard
Milpitas, CA 95035
Phone: +1(408) 957-5660
E-mail: paulv@corp.adaptec.com
Luciano Dalle Ore
Quantum Corp.
Phone: +1(408) 232 6524
E-mail: lldalleore@snapserver.com
Yaron Klein
SANRAD
24 Raul Valenberg St.
Tel-Aviv, 69719 Israel
Phone: +972-3-7659998
E-mail: klein@sanrad.com
Comments may be sent to Julian Satran
Appendix A. iSCSI Security
01 Security keys and values
The parameters (keys) negotiated for security are:
- digests (header_digest:, data_digest:)
- authentication methods (init_auth:, target_auth:)
- public key algorithm (public_key:)
The following table lists cyclic integrity checksums that can be
negotiated for the digests.
+---------------------------------------------+
| Name | Description |
+---------------------------------------------+
| crc-16 | 16 bit CRC |
+---------------------------------------------+
| crc-CCITT | 16 bit CRC |
+---------------------------------------------+
| crc-32 | 32 bit CRC |
+---------------------------------------------+
| crc-64 | 64 bit CRC |
+---------------------------------------------+
| none | no digest |
+---------------------------------------------+
The generator polynomials for those digests are:
crc-16 - x**16+x**15+x**2+1
crc-CCITT - x**16+x**12+x**5+1
crc-32 - x**32+x**26+x**23+x**22+x**16+x**12+x**11+x**10+
x**8+x**7+x**5+x**4+x**2+x+1
crc-64 -
Digests enable checking end-to-end data integrity (beyond the
integrity checks provided by the link layers and covering the whole
communication path including all elements that may change the network
level PDUs - like routers, switches, proxies etc.).
crc-16 and crc-CCITT are considered adequate for very short blocks
(like PDU headers or very short payloads).
crc-32 and crc-64 are considered adequate for longer blocks.
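As an illustration, the generator polynomials listed above can be
turned into bit masks and used in a bitwise CRC computation. This is
a sketch under stated assumptions: the text does not specify the
initial register value, the bit ordering, or a final XOR, so the code
assumes a zero initial value, most-significant-bit-first processing
and no final XOR. crc-64 is omitted because no polynomial is given
for it. All names are illustrative.

```python
# Exponents of each generator polynomial, excluding the leading x**width term.
GENERATORS = {
    "crc-16": (16, [15, 2, 0]),
    "crc-CCITT": (16, [12, 5, 0]),
    "crc-32": (32, [26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0]),
}


def poly_mask(exponents):
    """Convert a list of exponents into the usual polynomial bit mask."""
    mask = 0
    for exponent in exponents:
        mask |= 1 << exponent
    return mask


def crc_msb_first(data, name, init=0):
    """Bitwise CRC, MSB first, no reflection, no final XOR (assumptions)."""
    width, exponents = GENERATORS[name]
    poly = poly_mask(exponents)
    top_bit = 1 << (width - 1)
    all_ones = (1 << width) - 1
    reg = init
    for byte in data:
        reg ^= byte << (width - 8)      # feed the next byte into the top bits
        for _ in range(8):
            if reg & top_bit:
                reg = ((reg << 1) ^ poly) & all_ones
            else:
                reg = (reg << 1) & all_ones
    return reg
```

With these assumptions, the masks are 0x8005 for crc-16, 0x1021 for
crc-CCITT and 0x04C11DB7 for crc-32. A deployed implementation would
typically use a table-driven or hardware variant of the same
computation.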
Cyclic codes are particularly well suited for hardware
implementations.
Implementations MAY also negotiate some hash functions that may
provide data authentication in addition to integrity as detailed in
the following table:
+-----------------------------------------------------------+
| Name | Description | Definition |
+-----------------------------------------------------------+
| hmac-sha1 | HMAC-SHA1 length=20 | RFC-2104 |
+-----------------------------------------------------------+
| hmac-sha-96 | first 96 bits of HMAC-SHA1 | RFC-2104 |
+-----------------------------------------------------------+
| hmac-md5 | HMAC-MD5 length 16 | RFC-2104 |
+-----------------------------------------------------------+
| hmac-md5-96 | first 96 bits of HMAC-MD5 | RFC-2104 |
+-----------------------------------------------------------+
Other and proprietary algorithms MAY also be negotiated.
The none value is the only one that MUST be supported.
The following table details authentication methods:
+-----------------------------------------------------------+
| Name | Description |
+-----------------------------------------------------------+
| publickey | Public key authentication |
+-----------------------------------------------------------+
| password | Plain text user-password |
+-----------------------------------------------------------+
| challenge | Challenge and response |
+-----------------------------------------------------------+
| none | No authentication |
+-----------------------------------------------------------+
The following table details public key algorithms for authentication:
+-----------------------------------------------------------+
| Name | Description | Definition |
+-----------------------------------------------------------+
| ssh-dss | Simple DSS | [FIPS-186-2] |
+-----------------------------------------------------------+
| rsa | RSA public key | [RFC-2313] |
+-----------------------------------------------------------+
| none | No Public Key | - |
+-----------------------------------------------------------+
Where the public key information is encoded as:
public_key:<name>,<parameters>
For example, if ssh-dss is selected:
public_key:ssh-dss,p,q,g,y
Here the "p", "q", "g", and "y" parameters (encoded as numbers in
Unicode UTF8) form the signature key blob.
Signing and verifying using this key format are done according to the
Digital Signature Standard [FIPS-186-2] using the SHA-1 hash. A
description can also be found in [Schneier].
The dss signature blob is encoded as a string containing "r" followed
by "s" (which are 160-bit integers, without lengths or padding,
unsigned and in network byte order).
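The signature blob layout described above can be sketched as follows.
The function names are illustrative, and the sketch assumes the blob
is handled as 40 raw bytes (two 20-byte big-endian integers); how
those bytes are carried inside a text key:value pair is not covered
here.

```python
DSS_INT_BYTES = 20  # 160 bits


def encode_dss_signature(r, s):
    """Concatenate r and s as unsigned big-endian 160-bit integers."""
    # int.to_bytes raises OverflowError if a value is negative or too large.
    return r.to_bytes(DSS_INT_BYTES, "big") + s.to_bytes(DSS_INT_BYTES, "big")


def decode_dss_signature(blob):
    """Split a 40-byte signature blob back into the integers (r, s)."""
    if len(blob) != 2 * DSS_INT_BYTES:
        raise ValueError("DSS signature blob must be exactly 40 bytes")
    r = int.from_bytes(blob[:DSS_INT_BYTES], "big")
    s = int.from_bytes(blob[DSS_INT_BYTES:], "big")
    return r, s
```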
02 Authentication
The authentication exchange SHOULD authenticate the initiator and
target to each other. Authentication is not mandatory and is
distinct from the data integrity exchange.
Different levels of authentication can be applied such as initiator
authentication, target authentication or both.
The authentication methods to be used are public key, user/password
or challenge/response.
If public key is selected then each party MUST use:
authenticate:<user-id>,<blob>
where user-id is an assigned id of the host-OS for the initiator or
the World-Wide-Name for the target and blob is the public-key blob.
For user/password each party must use:
authenticate:<user-id>,<password>
where user-id is as above and password is a plain-text password.
03 Salt
salt:<number> can be used by different authentication schemes to
prevent replay attacks (a random number - cookie - or a time stamp or
both).
04 Challenge
challenge:<string> and authenticate:<string> MUST be used for
challenge answer schemes.
05 Login Phase examples:
The first example is a "user-password" authentication:
In this example, the result of the negotiation is to use hmac-md5 for
header digest, crc32-2k for data digest and user/password for
initiator authentication. No target authentication is required.
I-> Login header_digest:(hmac-md5,hmac-md5-96,crc32,none)
data_digest:(crc32-2k) init_auth:(public-key,password,none)
target_auth:(none) public_key:((ssh-dss,parameters),none)
T-> Text header_digest:hmac-md5 data_digest:crc32-2k
init_auth:password
I-> Text authenticate:alef,sesam
If the authentication is successful:
T->StartSecure:HERE
...
T-> Login "login accept"
If the authentication was not successful:
T-> Login "login reject"
Note - the Text command including StartSecure:HERE and each PDU after
it will carry a trailer consisting of an hmac-md5 digest for the
header and a crc32 for each 2k of data (or fraction thereof).
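The "one crc32 for each 2k of data (or fraction thereof)" rule in the
note above can be sketched as follows. The sketch assumes that "2k"
means 2048 bytes, and it uses zlib.crc32 from the Python standard
library only as a stand-in checksum: the text does not fix the exact
CRC-32 variant. The function names are illustrative.

```python
import zlib

BLOCK_SIZE = 2048  # assumption: "2k" means 2048 bytes


def digest_count(data_length, block_size=BLOCK_SIZE):
    """Number of per-block digests for a payload (rounding up)."""
    return (data_length + block_size - 1) // block_size


def block_digests(data, block_size=BLOCK_SIZE):
    """One checksum per 2k block; the last block may be shorter."""
    return [
        zlib.crc32(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]
```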
The next example is a "public-key" authentication. The initiator
authenticates itself to the target; no keys are exchanged:
I-> Login header_digest:(hmac-md5,hmac-md5-96,crc32,none)
data_digest:(crc32-2k,none)
init_auth:(publickey,password,none) target_auth:(none)
public_key:((rsa,parameters),(ssh-dss,parameters),none)
T-> Text header_digest:hmac-md5 data_digest:crc32-2k
init_auth:publickey public_key:(ssh-dss,parameters)
I-> Text authenticate:user,blob salt:578913456
NB - here "parameters" stands for the hash of the header
and the salt, i.e., hash(header || salt). The initiator
SHOULD add "salt" to the packet, e.g., add the pair
salt:<random-number> (or timestamp or a mixture) to its
packet to prevent record and replay.
The key distribution may be done by a certificate authority
or other server and is beyond the scope of this document.
If the user was not confirmed, the target sends a login
response message with "login reject" to the initiator. Else,
it can send a login response with "login accept" and MAY
attach a secret:
T->Text StartSecure:HERE secret:
I->Text ... parameters ...EndLogin:HERE
T->Login (accept) ... parameters ...
The next example is another "public-key" authentication. The
initiator authenticates itself to the target. The target
authenticates itself to the initiator and keys are exchanged:
I-> Login header_digest:(hmac-md5,hmac-md5-96,crc32,none)
data_digest:(crc32-2k,none)
init_auth:(publickey,password,none) target_auth:(none)
public_key:((ssh-dss,parameters),(rsa, parameters),none)
T-> Text header_digest:hmac-md5 data_digest:crc32-2k
init_auth:publickey public_key:(ssh-dss,parameters)
target_auth:(publickey,password,none)
public_key:(ssh-dss,parameters),none
I-> Text authenticate:user,blob target_auth:publickey
public_key:(ssh-dss,parameters) salt:20001103172433
where blob stands for hash(header || salt).
Note: the last packet should have the appropriate trailers.
If the initiator was not confirmed, the target sends a login response
message with "login reject" to the initiator. Else, it can continue
with the login process:
T-> Text authenticate:user,blob salt:532678925
where blob stands for hash(header || salt).
Here, the target authenticates itself to the initiator. If the
authentication was successful, the initiator responds with an empty
text command, continuing the login phase. Otherwise, it stops the
login phase.
I->Text
T->Text secret:blob
Where blob is a key encrypted with the initiator's public key.
I->Text StartSecure:HERE... parameters ...
...
T->Login "login accept" ... parameters ...
In the next example the target authenticates the initiator via
challenge and response.
I-> Login header_digest:(hmac-md5,hmac-md5-96,crc32,none)
data_digest:(crc32-2k)
init_auth:(public-key,password,challenge,none) target_auth:(none)
public_key:(ssh-dss,parameters)
T-> Text header_digest:hmac-md5 data_digest:crc32-2k
init_auth:challenge challenge:question
I-> Text authenticate:answer
If authentication is successful, i.e., the answer to the question is
correct, the target may proceed:
T->... parameter negotiation
Or give another challenge:
T-> Text challenge:question2
I-> Text authenticate:answer2
And at the end:
T-> Login "login accept"
If the authentication was not successful:
T-> Login "login reject"
Note - the Text command after authentication and each PDU thereafter
will have in the trailer an hmac-md5 digest for the header and a
crc32 for each 2k of data (or fraction of it).
Appendix B. Examples
06 Read operation example
+------------------+-----------------------+----------------------+
|Initiator Function| Message Type | Target Function |
+------------------+-----------------------+----------------------+
| Command request |SCSI Command (READ)>>> | |
| (read) | | |
+------------------+-----------------------+----------------------+
| | | Prepare Data Transfer|
+------------------+-----------------------+----------------------+
| Receive Data | <<< SCSI Data | Send Data |
+------------------+-----------------------+----------------------+
| Receive Data | <<< SCSI Data | Send Data |
+------------------+-----------------------+----------------------+
| Receive Data | <<< SCSI Data | Send Data |
+------------------+-----------------------+----------------------+
| | <<< SCSI Response |Send Status and Sense |
+------------------+-----------------------+----------------------+
| Command Complete | | |
+------------------+-----------------------+----------------------+
07 Write operation example
+------------------+-----------------------+---------------------+
|Initiator Function| Message Type | Target Function |
+------------------+-----------------------+---------------------+
| Command request |SCSI Command (WRITE)>>>| Receive command |
| (write) | | and queue it |
+------------------+-----------------------+---------------------+
| | | Process old commands|
+------------------+-----------------------+---------------------+
| | | Ready to process |
| | <<< R2T | WRITE command |
+------------------+-----------------------+---------------------+
| Send Data | SCSI Data >>> | Receive Data |
+------------------+-----------------------+---------------------+
| | <<< R2T | |
+------------------+-----------------------+---------------------+
| | <<< R2T | |
+------------------+-----------------------+---------------------+
| Send Data | SCSI Data >>> | Receive Data |
+------------------+-----------------------+---------------------+
| Send Data | SCSI Data >>> | Receive Data |
+------------------+-----------------------+---------------------+
| | <<< SCSI Response |Send Status and Sense|
+------------------+-----------------------+---------------------+
| Command Complete | | |
+------------------+-----------------------+---------------------+
Appendix C. Login/Text keys (not security related)
ISID and TSID collectively form the SSID (session id). A TSID of zero
indicates a leading connection. Only a leading connection login can
carry session-specific parameters, e.g., MaxConnections, the maximum
immediate data length requested, etc.
08 MaxConnections
MaxConnections:<number-from-1-to-65535>
Initiator and target negotiate the maximum number of connections
requested/acceptable.
09 Target
Target:<domainname>[/modifier]
Examples:
Target:disk-array.sj-bldg-h.cisco.com
Target:disk-array.sj-bldg-h.cisco.com/control7
This key is provided by the initiator of the TCP connection to the
remote endpoint. The Target key specifies the domain name of the
target, since that information is not available from the TCP layer.
The target is not required to support this key. The initiator should
send this key in the first login message. The Target key might be
used by the target to select a unit within a multi-unit target.
10 Initiator
Initiator:[domainname[/modifier]]
Examples:
Initiator:sample.foobar.org
Initiator:cluster.foobar.org/machine1
Initiator:
The Initiator key enables the initiator to identify itself to the
remote endpoint. The domain name should be that of the initiator. A
zero-length domain name is interpreted as "other side of TCP
connection". The target may silently ignore this key if it does not
support it.
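As a rough illustration of the Target and Initiator value syntax, the following sketch splits a key line into its domain name and optional modifier. The function and type names are hypothetical, and the sketch does not validate the domain name itself.

```python
from typing import NamedTuple, Optional, Tuple


class EndpointName(NamedTuple):
    domain: str              # "" means "other side of TCP connection"
    modifier: Optional[str]  # text after the first "/", if any


def parse_endpoint(line: str) -> Tuple[str, EndpointName]:
    """Parse 'Target:<domainname>[/modifier]' or 'Initiator:[...]'."""
    key, sep, value = line.partition(":")
    if not sep or key not in ("Target", "Initiator"):
        raise ValueError("not a Target or Initiator key: %r" % line)
    domain, slash, modifier = value.partition("/")
    # Only the Initiator syntax allows a zero-length domain name.
    if key == "Target" and not domain:
        raise ValueError("Target requires a domain name")
    return key, EndpointName(domain, modifier if slash else None)
```

For example, `parse_endpoint("Initiator:")` yields an empty domain, which the text above interprets as the other side of the TCP connection.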
11 AccessID
AccessID:<SCSI-AccessID-value>
Delivers a SCSI AccessID to the target.
12 FMarker
FMarker:<send|receive|send-receive|no>
Examples:
I->FMarker:send-receive
T->FMarker:send-receive
results in Marker being used in both directions while
I->FMarker:send-receive
T->FMarker:receive
results in Marker being used from the initiator to the target but not
from the target to initiator.
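One reading consistent with the two examples above is that markers flow in a given direction only when the sender of that direction offers "send" and the receiver accepts "receive". The sketch below encodes that reading; it is an assumption drawn from the examples, and the function name is hypothetical.

```python
VALID_FMARKER = ("send", "receive", "send-receive", "no")


def marker_directions(initiator_value: str, target_value: str) -> dict:
    """Resolve FMarker values from both sides into per-direction use."""
    for value in (initiator_value, target_value):
        if value not in VALID_FMARKER:
            raise ValueError("invalid FMarker value: %r" % value)

    def sends(value: str) -> bool:
        return value in ("send", "send-receive")

    def receives(value: str) -> bool:
        return value in ("receive", "send-receive")

    return {
        # Initiator must offer to send and target must agree to receive.
        "initiator_to_target": sends(initiator_value) and receives(target_value),
        # Target must offer to send and initiator must agree to receive.
        "target_to_initiator": sends(target_value) and receives(initiator_value),
    }
```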
13 RFMarkInt
RFMarkInt:<number-from-1-to-65535>
Indicates at what interval (in 4 byte words) the receiver wants the
markers. The larger of the numbers (wanted by receiver and offered by
sender) is selected.
14 SFMarkInt
SFMarkInt:<number-from-1-to-65535>
Indicates at what interval (in 4 byte words) the sender offers to
send the markers. The larger of the numbers (wanted by receiver and
offered by sender) is selected.
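The selection rule for RFMarkInt and SFMarkInt amounts to taking the larger value and converting from 4-byte words to bytes. A minimal sketch, with a hypothetical function name:

```python
WORD_BYTES = 4  # intervals are expressed in 4-byte words


def select_marker_interval(receiver_wanted: int, sender_offered: int):
    """Return the selected interval as (words, bytes).

    The larger of the receiver's RFMarkInt and the sender's SFMarkInt
    is selected, as stated in the text.
    """
    for value in (receiver_wanted, sender_offered):
        if not 1 <= value <= 65535:
            raise ValueError("interval out of range: %d" % value)
    words = max(receiver_wanted, sender_offered)
    return words, words * WORD_BYTES
```

With RFMarkInt:2048 and SFMarkInt:4096, the sketch selects 4096 words, i.e. 16384 bytes between markers.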
15 IFMarkInt
IFMarkInt:<number-from-1-to-65535>
Indicates the initial marker-less interval required by the initiator
in both directions.
16 UseR2T
UseR2T:<yes|no>
Examples:
I->UseR2T:no
T->UseR2T:no
The UseR2T key is used to turn off the default use of R2T, thus
allowing an initiator to send data to a target without the target
having sent an R2T to the initiator. The default action is that R2T
is required, unless both the initiator and the target send this
key-pair attribute specifying UseR2T:no. Once UseR2T has been set to
'no', it cannot be set back to 'yes'. Note that only the first
outgoing data item (either immediate data or a separate PDU) can be
sent without being solicited by an R2T.
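The UseR2T rules (required by default, disabled only when both sides send 'no', and never re-enabled afterwards) can be sketched as a small state holder. The class name is hypothetical, and silently ignoring a later 'yes' is an assumption; the text says only that the value cannot be set back.

```python
from typing import Optional


class R2TSetting:
    """Tracks whether R2T is required for outgoing data."""

    def __init__(self) -> None:
        self.required = True  # default action: R2T is required

    def negotiate(self, initiator_value: Optional[str],
                  target_value: Optional[str]) -> bool:
        """Apply one exchange; None means the side did not send the key."""
        if not self.required:
            # Once set to 'no' it cannot be set back to 'yes'.
            return self.required
        if initiator_value == "no" and target_value == "no":
            self.required = False
        return self.required
```

Even with R2T turned off, only the first outgoing data item may be sent unsolicited, so a sender would still need R2Ts for any later data of the same command.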
17 BidiUseR2T
BidiUseR2T:<yes|no>
Examples:
I->BidiUseR2T:no
T->BidiUseR2T:no
The BidiUseR2T key is used to turn off the default use of R2T for
bi-directional commands, thus allowing an initiator to send data to a
target without the target having sent an R2T to the initiator for the
output data (write part) of a bi-directional command (having both the
R and the W bits set). The default action is that R2T is required,
unless both the initiator and the target send this key-pair attribute
specifying BidiUseR2T:no. Once BidiUseR2T has been set to 'no', it
cannot be set back to 'yes'. Note that only the first outgoing data
item (either immediate data or a separate PDU) can be sent without
being solicited by an R2T.
18 DataNumber
DataNumber:<number-from-0-to-65535>
Example:
The DataNumber key is used by targets to turn on the use of input
data packet numbering, thus allowing a target to discard input data
as soon as acknowledged without losing recovery capabilities. By
default, data numbering is off. A nonzero value for DataNumber
indicates both that data numbering is requested and the maximum
number of unacknowledged packets. An initiator MUST support data
numbering if requested.
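A target-side view of DataNumber can be sketched as a counter of unacknowledged input data packets. The class and method names are hypothetical, and the acknowledgement mechanism itself is not modeled; the sketch only shows how a nonzero DataNumber both enables numbering and bounds the outstanding packets.

```python
class DataNumberWindow:
    """Bounds unacknowledged input data packets per the DataNumber key."""

    def __init__(self, data_number: int) -> None:
        if not 0 <= data_number <= 65535:
            raise ValueError("DataNumber out of range: %d" % data_number)
        self.enabled = data_number != 0  # zero leaves numbering off
        self.limit = data_number
        self.unacked = 0

    def can_send(self) -> bool:
        # Without numbering there is no window to enforce.
        return not self.enabled or self.unacked < self.limit

    def on_sent(self) -> None:
        if not self.can_send():
            raise RuntimeError("too many unacknowledged packets")
        if self.enabled:
            self.unacked += 1

    def on_acknowledged(self, count: int = 1) -> None:
        # Acknowledged data can be discarded by the target.
        self.unacked = max(0, self.unacked - count)
```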
19 ImmediateDataLength
ImmediateDataLength:<number>
Initiator and target negotiate the maximum length supported for
immediate data. Default is 2**32-1 bytes.
20 ITagLength
ITagLength:<number-from-16-to-32>
Initiator and target negotiate the significant length of the
initiator tag to be used. Default is 32.
21 PingMaxReplyLength
PingMaxReplyLength:<number>
Initiator and target negotiate the maximum length of data contained
in a ping reply. Default is 4096.
22 StartSecure
StartSecure:HERE
Initiator and target indicate the end-of-authentication/integrity
exchange (start of parameter negotiation if any).
23 TotalText
TotalText:<number-from-512-to-65535>
Initiator and target indicate the total text limit for any Text or
Login command.
24 KeyValueText
KeyValueText:<number-from-256-to-8192>
Initiator and target indicate the total text limit for any key:value
pair.
25 MaxOutstandingR2T
MaxOutstandingR2T:<number-from-1-to-65535>
Initiator and target negotiate the maximum number of outstanding R2Ts
per task. The default is 256.
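The numeric keys in this appendix can be gathered into one table of ranges and defaults for validation. The sketch below is illustrative: the names are hypothetical, None marks a bound or default that the text does not state, and the DataNumber default of 0 is inferred from "by default data numbering is off".

```python
# key: (minimum, maximum, default); None = not stated in this appendix
KEY_RULES = {
    "MaxConnections": (1, 65535, None),
    "RFMarkInt": (1, 65535, None),
    "SFMarkInt": (1, 65535, None),
    "IFMarkInt": (1, 65535, None),
    "DataNumber": (0, 65535, 0),  # default inferred, see lead-in
    "ImmediateDataLength": (None, None, 2**32 - 1),
    "ITagLength": (16, 32, 32),
    "PingMaxReplyLength": (None, None, 4096),
    "TotalText": (512, 65535, None),
    "KeyValueText": (256, 8192, None),
    "MaxOutstandingR2T": (1, 65535, 256),
}


def check_value(key: str, text: str) -> int:
    """Parse a numeric key value and check it against its stated range."""
    low, high, _default = KEY_RULES[key]
    value = int(text)
    if low is not None and value < low:
        raise ValueError("%s below minimum %d" % (key, low))
    if high is not None and value > high:
        raise ValueError("%s above maximum %d" % (key, high))
    return value


def default_for(key: str):
    """Return the stated default, or None when the text gives none."""
    return KEY_RULES[key][2]
```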
Full Copyright Statement
"Copyright (C) The Internet Society (date). All Rights Reserved. This
document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."