rfc2616-sec3.html
34.2 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns='http://www.w3.org/1999/xhtml'>
<head><title>HTTP/1.1: Protocol Parameters</title></head>
<body><address>part of <a rev='Section' href='rfc2616.html'>Hypertext Transfer Protocol -- HTTP/1.1</a><br />
RFC 2616 Fielding, et al.</address>
<h2><a id='sec3'>3</a> Protocol Parameters</h2>
<h3><a id='sec3.1'>3.1</a> HTTP Version</h3>
<p>
HTTP uses a "<major>.<minor>" numbering scheme to indicate versions
of the protocol. The protocol versioning policy is intended to allow
the sender to indicate the format of a message and its capacity for
understanding further HTTP communication, rather than the features
obtained via that communication. No change is made to the version
number for the addition of message components which do not affect
communication behavior or which only add to extensible field values.
The <minor> number is incremented when the changes made to the
protocol add features which do not change the general message parsing
algorithm, but which may add to the message semantics and imply
additional capabilities of the sender. The <major> number is
incremented when the format of a message within the protocol is
changed. See RFC 2145 <a rel='bibref' href='rfc2616-sec17.html#bib36'>[36]</a> for a fuller explanation.
</p>
<p>
The version of an HTTP message is indicated by an HTTP-Version field
in the first line of the message.
</p>
<pre> HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
</pre>
<p>
Note that the major and minor numbers MUST be treated as separate
integers and that each MAY be incremented higher than a single digit.
Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is
lower than HTTP/12.3. Leading zeros MUST be ignored by recipients and
MUST NOT be sent.
</p>
<p>
An application that sends a request or response message that includes
HTTP-Version of "HTTP/1.1" MUST be at least conditionally compliant
with this specification. Applications that are at least conditionally
compliant with this specification SHOULD use an HTTP-Version of
"HTTP/1.1" in their messages, and MUST do so for any message that is
not compatible with HTTP/1.0. For more details on when to send
specific HTTP-Version values, see RFC 2145 <a rel='bibref' href='rfc2616-sec17.html#bib36'>[36]</a>.
</p>
<p>
The HTTP version of an application is the highest HTTP version for
which the application is at least conditionally compliant.
</p>
<p>
Proxy and gateway applications need to be careful when forwarding
messages in protocol versions different from that of the application.
Since the protocol version indicates the protocol capability of the
sender, a proxy/gateway MUST NOT send a message with a version
indicator which is greater than its actual version. If a higher
version request is received, the proxy/gateway MUST either downgrade
the request version, or respond with an error, or switch to tunnel
behavior.
</p>
<p>
Due to interoperability problems with HTTP/1.0 proxies discovered
since the publication of RFC 2068[33], caching proxies MUST, gateways
MAY, and tunnels MUST NOT upgrade the request to the highest version
they support. The proxy/gateway's response to that request MUST be in
the same major version as the request.
</p>
<pre> Note: Converting between versions of HTTP may involve modification
of header fields required or forbidden by the versions involved.
</pre>
<h3><a id='sec3.2'>3.2</a> Uniform Resource Identifiers</h3>
<p>
URIs have been known by many names: WWW addresses, Universal Document
Identifiers, Universal Resource Identifiers [3], and finally the
combination of Uniform Resource Locators (URL) <a rel='bibref' href='rfc2616-sec17.html#bib4'>[4]</a> and Names (URN)
<a rel='bibref' href='rfc2616-sec17.html#bib20'>[20]</a>. As far as HTTP is concerned, Uniform Resource Identifiers are
simply formatted strings which identify--via name, location, or any
other characteristic--a resource.
</p>
<h3><a id='sec3.2.1'>3.2.1</a> General Syntax</h3>
<p>
URIs in HTTP can be represented in absolute form or relative to some
known base URI [11], depending upon the context of their use. The two
forms are differentiated by the fact that absolute URIs always begin
with a scheme name followed by a colon. For definitive information on
URL syntax and semantics, see "Uniform Resource Identifiers (URI):
Generic Syntax and Semantics," RFC 2396 <a rel='bibref' href='rfc2616-sec17.html#bib42'>[42]</a> (which replaces RFCs
1738 <a rel='bibref' href='rfc2616-sec17.html#bib4'>[4]</a> and RFC 1808 <a rel='bibref' href='rfc2616-sec17.html#bib11'>[11]</a>). This specification adopts the
definitions of "URI-reference", "absoluteURI", "relativeURI", "port",
"host","abs_path", "rel_path", and "authority" from that
specification.
</p>
<p>
The HTTP protocol does not place any a priori limit on the length of
a URI. Servers MUST be able to handle the URI of any resource they
serve, and SHOULD be able to handle URIs of unbounded length if they
provide GET-based forms that could generate such URIs. A server
SHOULD return 414 (Request-URI Too Long) status if a URI is longer
than the server can handle (see section <a rel='xref' href='rfc2616-sec10.html#sec10.4.15'>10.4.15</a>).
</p>
<pre> Note: Servers ought to be cautious about depending on URI lengths
above 255 bytes, because some older client or proxy
implementations might not properly support these lengths.
</pre>
<h3><a id='sec3.2.2'>3.2.2</a> http URL</h3>
<p>
The "http" scheme is used to locate network resources via the HTTP
protocol. This section defines the scheme-specific syntax and
semantics for http URLs.
</p>
<p>
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
</p>
<p>
If the port is empty or not given, port 80 is assumed. The semantics
are that the identified resource is located at the server listening
for TCP connections on that port of that host, and the Request-URI
for the resource is abs_path (section <a rel='xref' href='rfc2616-sec5.html#sec5.1.2'>5.1.2</a>). The use of IP addresses
in URLs SHOULD be avoided whenever possible (see RFC 1900 <a rel='bibref' href='rfc2616-sec17.html#bib24'>[24]</a>). If
the abs_path is not present in the URL, it MUST be given as "/" when
used as a Request-URI for a resource (section <a rel='xref' href='rfc2616-sec5.html#sec5.1.2'>5.1.2</a>). If a proxy
receives a host name which is not a fully qualified domain name, it
MAY add its domain to the host name it received. If a proxy receives
a fully qualified domain name, the proxy MUST NOT change the host
name.
</p>
<h3><a id='sec3.2.3'>3.2.3</a> URI Comparison</h3>
<p>
When comparing two URIs to decide if they match or not, a client
SHOULD use a case-sensitive octet-by-octet comparison of the entire
URIs, with these exceptions:
</p>
<pre> - A port that is empty or not given is equivalent to the default
port for that URI-reference;
</pre>
<pre> - Comparisons of host names MUST be case-insensitive;
</pre>
<pre> - Comparisons of scheme names MUST be case-insensitive;
</pre>
<pre> - An empty abs_path is equivalent to an abs_path of "/".
</pre>
<p>
Characters other than those in the "reserved" and "unsafe" sets (see
RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.
</p>
<p>
For example, the following three URIs are equivalent:
</p>
<pre> http://abc.com:80/~smith/home.html
http://ABC.com/%7Esmith/home.html
http://ABC.com:/%7esmith/home.html
</pre>
<h3><a id='sec3.3'>3.3</a> Date/Time Formats</h3>
<h3><a id='sec3.3.1'>3.3.1</a> Full Date</h3>
<p>
HTTP applications have historically allowed three different formats
for the representation of date/time stamps:
</p>
<pre> Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format
</pre>
<p>
The first format is preferred as an Internet standard and represents
a fixed-length subset of that defined by RFC 1123 [8] (an update to
RFC 822 <a rel='bibref' href='rfc2616-sec17.html#bib9'>[9]</a>). The second format is in common use, but is based on the
obsolete RFC 850 <a rel='bibref' href='rfc2616-sec17.html#bib12'>[12]</a> date format and lacks a four-digit year.
HTTP/1.1 clients and servers that parse the date value MUST accept
all three formats (for compatibility with HTTP/1.0), though they MUST
only generate the RFC 1123 format for representing HTTP-date values
in header fields. See section <a rel='xref' href='rfc2616-sec19.html#sec19.3'>19.3</a> for further information.
</p>
<pre> Note: Recipients of date values are encouraged to be robust in
accepting date values that may have been sent by non-HTTP
applications, as is sometimes the case when retrieving or posting
messages via proxies/gateways to SMTP or NNTP.
</pre>
<p>
All HTTP date/time stamps MUST be represented in Greenwich Mean Time
(GMT), without exception. For the purposes of HTTP, GMT is exactly
equal to UTC (Coordinated Universal Time). This is indicated in the
first two formats by the inclusion of "GMT" as the three-letter
abbreviation for time zone, and MUST be assumed when reading the
asctime format. HTTP-date is case sensitive and MUST NOT include
additional LWS beyond that specifically included as SP in the
grammar.
</p>
<pre> HTTP-date = rfc1123-date | rfc850-date | asctime-date
rfc1123-date = wkday "," SP date1 SP time SP "GMT"
rfc850-date = weekday "," SP date2 SP time SP "GMT"
asctime-date = wkday SP date3 SP time SP 4DIGIT
date1 = 2DIGIT SP month SP 4DIGIT
; day month year (e.g., 02 Jun 1982)
date2 = 2DIGIT "-" month "-" 2DIGIT
; day-month-year (e.g., 02-Jun-82)
date3 = month SP ( 2DIGIT | ( SP 1DIGIT ))
; month day (e.g., Jun 2)
time = 2DIGIT ":" 2DIGIT ":" 2DIGIT
; 00:00:00 - 23:59:59
wkday = "Mon" | "Tue" | "Wed"
| "Thu" | "Fri" | "Sat" | "Sun"
weekday = "Monday" | "Tuesday" | "Wednesday"
| "Thursday" | "Friday" | "Saturday" | "Sunday"
month = "Jan" | "Feb" | "Mar" | "Apr"
| "May" | "Jun" | "Jul" | "Aug"
| "Sep" | "Oct" | "Nov" | "Dec"
</pre>
<pre> Note: HTTP requirements for the date/time stamp format apply only
to their usage within the protocol stream. Clients and servers are
not required to use these formats for user presentation, request
logging, etc.
</pre>
<h3><a id='sec3.3.2'>3.3.2</a> Delta Seconds</h3>
<p>
Some HTTP header fields allow a time value to be specified as an
integer number of seconds, represented in decimal, after the time
that the message was received.
</p>
<pre> delta-seconds = 1*DIGIT
</pre>
<h3><a id='sec3.4'>3.4</a> Character Sets</h3>
<p>
HTTP uses the same definition of the term "character set" as that
described for MIME:
</p>
<p>
The term "character set" is used in this document to refer to a
method used with one or more tables to convert a sequence of octets
into a sequence of characters. Note that unconditional conversion in
the other direction is not required, in that not all characters may
be available in a given character set and a character set may provide
more than one sequence of octets to represent a particular character.
This definition is intended to allow various kinds of character
encoding, from simple single-table mappings such as US-ASCII to
complex table switching methods such as those that use ISO-2022's
techniques. However, the definition associated with a MIME character
set name MUST fully specify the mapping to be performed from octets
to characters. In particular, use of external profiling information
to determine the exact mapping is not permitted.
</p>
<pre> Note: This use of the term "character set" is more commonly
referred to as a "character encoding." However, since HTTP and
MIME share the same registry, it is important that the terminology
also be shared.
</pre>
<p>
HTTP character sets are identified by case-insensitive tokens. The
complete set of tokens is defined by the IANA Character Set registry
<a rel='bibref' href='rfc2616-sec17.html#bib19'>[19]</a>.
</p>
<pre> charset = token
</pre>
<p>
Although HTTP allows an arbitrary token to be used as a charset
value, any token that has a predefined value within the IANA
Character Set registry <a rel='bibref' href='rfc2616-sec17.html#bib19'>[19]</a> MUST represent the character set defined
by that registry. Applications SHOULD limit their use of character
sets to those defined by the IANA registry.
</p>
<p>
Implementors should be aware of IETF character set requirements <a rel='bibref' href='rfc2616-sec17.html#bib38'>[38]</a>
[41].
</p>
<h3><a id='sec3.4.1'>3.4.1</a> Missing Charset</h3>
<p>
Some HTTP/1.0 software has interpreted a Content-Type header without
charset parameter incorrectly to mean "recipient should guess."
Senders wishing to defeat this behavior MAY include a charset
parameter even when the charset is ISO-8859-1 and SHOULD do so when
it is known that it will not confuse the recipient.
</p>
<p>
Unfortunately, some older HTTP/1.0 clients did not deal properly with
an explicit charset parameter. HTTP/1.1 recipients MUST respect the
charset label provided by the sender; and those user agents that have
a provision to "guess" a charset MUST use the charset from the
</p>
<p>
content-type field if they support that charset, rather than the
recipient's preference, when initially displaying a document. See
section <a rel='xref' href='rfc2616-sec3.html#sec3.7.1'>3.7.1</a>.
</p>
<h3><a id='sec3.5'>3.5</a> Content Codings</h3>
<p>
Content coding values indicate an encoding transformation that has
been or can be applied to an entity. Content codings are primarily
used to allow a document to be compressed or otherwise usefully
transformed without losing the identity of its underlying media type
and without loss of information. Frequently, the entity is stored in
coded form, transmitted directly, and only decoded by the recipient.
</p>
<pre> content-coding = token
</pre>
<p>
All content-coding values are case-insensitive. HTTP/1.1 uses
content-coding values in the Accept-Encoding (section 14.3) and
Content-Encoding (section <a rel='xref' href='rfc2616-sec14.html#sec14.11'>14.11</a>) header fields. Although the value
describes the content-coding, what is more important is that it
indicates what decoding mechanism will be required to remove the
encoding.
</p>
<p>
The Internet Assigned Numbers Authority (IANA) acts as a registry for
content-coding value tokens. Initially, the registry contains the
following tokens:
</p>
<p>
gzip An encoding format produced by the file compression program
"gzip" (GNU zip) as described in RFC 1952 [25]. This format is a
Lempel-Ziv coding (LZ77) with a 32 bit CRC.
</p>
<p>
compress
The encoding format produced by the common UNIX file compression
program "compress". This format is an adaptive Lempel-Ziv-Welch
coding (LZW).
</p>
<pre> Use of program names for the identification of encoding formats
is not desirable and is discouraged for future encodings. Their
use here is representative of historical practice, not good
design. For compatibility with previous implementations of HTTP,
applications SHOULD consider "x-gzip" and "x-compress" to be
equivalent to "gzip" and "compress" respectively.
</pre>
<p>
deflate
The "zlib" format defined in RFC 1950 [31] in combination with
the "deflate" compression mechanism described in RFC 1951 <a rel='bibref' href='rfc2616-sec17.html#bib29'>[29]</a>.
</p>
<p>
identity
The default (identity) encoding; the use of no transformation
whatsoever. This content-coding is used only in the Accept-
Encoding header, and SHOULD NOT be used in the Content-Encoding
header.
</p>
<p>
New content-coding value tokens SHOULD be registered; to allow
interoperability between clients and servers, specifications of the
content coding algorithms needed to implement a new value SHOULD be
publicly available and adequate for independent implementation, and
conform to the purpose of content coding defined in this section.
</p>
<h3><a id='sec3.6'>3.6</a> Transfer Codings</h3>
<p>
Transfer-coding values are used to indicate an encoding
transformation that has been, can be, or may need to be applied to an
entity-body in order to ensure "safe transport" through the network.
This differs from a content coding in that the transfer-coding is a
property of the message, not of the original entity.
</p>
<pre> transfer-coding = "chunked" | transfer-extension
transfer-extension = token *( ";" parameter )
</pre>
<p>
Parameters are in the form of attribute/value pairs.
</p>
<pre> parameter = attribute "=" value
attribute = token
value = token | quoted-string
</pre>
<p>
All transfer-coding values are case-insensitive. HTTP/1.1 uses
transfer-coding values in the TE header field (section 14.39) and in
the Transfer-Encoding header field (section <a rel='xref' href='rfc2616-sec14.html#sec14.41'>14.41</a>).
</p>
<p>
Whenever a transfer-coding is applied to a message-body, the set of
transfer-codings MUST include "chunked", unless the message is
terminated by closing the connection. When the "chunked" transfer-
coding is used, it MUST be the last transfer-coding applied to the
message-body. The "chunked" transfer-coding MUST NOT be applied more
than once to a message-body. These rules allow the recipient to
determine the transfer-length of the message (section <a rel='xref' href='rfc2616-sec4.html#sec4.4'>4.4</a>).
</p>
<p>
Transfer-codings are analogous to the Content-Transfer-Encoding
values of MIME [7], which were designed to enable safe transport of
binary data over a 7-bit transport service. However, safe transport
has a different focus for an 8bit-clean transfer protocol. In HTTP,
the only unsafe characteristic of message-bodies is the difficulty in
determining the exact body length (section <a rel='xref' href='rfc2616-sec7.html#sec7.2.2'>7.2.2</a>), or the desire to
encrypt data over a shared transport.
</p>
<p>
The Internet Assigned Numbers Authority (IANA) acts as a registry for
transfer-coding value tokens. Initially, the registry contains the
following tokens: "chunked" (section <a rel='xref' href='rfc2616-sec3.html#sec3.6.1'>3.6.1</a>), "identity" (section
<a rel='xref' href='rfc2616-sec3.html#sec3.6.2'>3.6.2</a>), "gzip" (section <a rel='xref' href='rfc2616-sec3.html#sec3.5'>3.5</a>), "compress" (section <a rel='xref' href='rfc2616-sec3.html#sec3.5'>3.5</a>), and "deflate"
(section <a rel='xref' href='rfc2616-sec3.html#sec3.5'>3.5</a>).
</p>
<p>
New transfer-coding value tokens SHOULD be registered in the same way
as new content-coding value tokens (section 3.5).
</p>
<p>
A server which receives an entity-body with a transfer-coding it does
not understand SHOULD return 501 (Unimplemented), and close the
connection. A server MUST NOT send transfer-codings to an HTTP/1.0
client.
</p>
<h3><a id='sec3.6.1'>3.6.1</a> Chunked Transfer Coding</h3>
<p>
The chunked encoding modifies the body of a message in order to
transfer it as a series of chunks, each with its own size indicator,
followed by an OPTIONAL trailer containing entity-header fields. This
allows dynamically produced content to be transferred along with the
information necessary for the recipient to verify that it has
received the full message.
</p>
<pre> Chunked-Body = *chunk
last-chunk
trailer
CRLF
</pre>
<pre> chunk = chunk-size [ chunk-extension ] CRLF
chunk-data CRLF
chunk-size = 1*HEX
last-chunk = 1*("0") [ chunk-extension ] CRLF
</pre>
<pre> chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
chunk-ext-name = token
chunk-ext-val = token | quoted-string
chunk-data = chunk-size(OCTET)
trailer = *(entity-header CRLF)
</pre>
<p>
The chunk-size field is a string of hex digits indicating the size of
the chunk. The chunked encoding is ended by any chunk whose size is
zero, followed by the trailer, which is terminated by an empty line.
</p>
<p>
The trailer allows the sender to include additional HTTP header
fields at the end of the message. The Trailer header field can be
used to indicate which header fields are included in a trailer (see
section <a rel='xref' href='rfc2616-sec14.html#sec14.40'>14.40</a>).
</p>
<p>
A server using chunked transfer-coding in a response MUST NOT use the
trailer for any header fields unless at least one of the following is
true:
</p>
<p>
a)the request included a TE header field that indicates "trailers" is
acceptable in the transfer-coding of the response, as described in
section <a rel='xref' href='rfc2616-sec14.html#sec14.39'>14.39</a>; or,
</p>
<p>
b)the server is the origin server for the response, the trailer
fields consist entirely of optional metadata, and the recipient
could use the message (in a manner acceptable to the origin server)
without receiving this metadata. In other words, the origin server
is willing to accept the possibility that the trailer fields might
be silently discarded along the path to the client.
</p>
<p>
This requirement prevents an interoperability failure when the
message is being received by an HTTP/1.1 (or later) proxy and
forwarded to an HTTP/1.0 recipient. It avoids a situation where
compliance with the protocol would have necessitated a possibly
infinite buffer on the proxy.
</p>
<p>
An example process for decoding a Chunked-Body is presented in
appendix 19.4.6.
</p>
<p>
All HTTP/1.1 applications MUST be able to receive and decode the
"chunked" transfer-coding, and MUST ignore chunk-extension extensions
they do not understand.
</p>
<h3><a id='sec3.7'>3.7</a> Media Types</h3>
<p>
HTTP uses Internet Media Types <a rel='bibref' href='rfc2616-sec17.html#bib17'>[17]</a> in the Content-Type (section
14.17) and Accept (section 14.1) header fields in order to provide
open and extensible data typing and type negotiation.
</p>
<pre> media-type = type "/" subtype *( ";" parameter )
type = token
subtype = token
</pre>
<p>
Parameters MAY follow the type/subtype in the form of attribute/value
pairs (as defined in section 3.6).
</p>
<p>
The type, subtype, and parameter attribute names are case-
insensitive. Parameter values might or might not be case-sensitive,
depending on the semantics of the parameter name. Linear white space
(LWS) MUST NOT be used between the type and subtype, nor between an
attribute and its value. The presence or absence of a parameter might
be significant to the processing of a media-type, depending on its
definition within the media type registry.
</p>
<p>
Note that some older HTTP applications do not recognize media type
parameters. When sending data to older HTTP applications,
implementations SHOULD only use media type parameters when they are
required by that type/subtype definition.
</p>
<p>
Media-type values are registered with the Internet Assigned Number
Authority (IANA [19]). The media type registration process is
outlined in RFC 1590 <a rel='bibref' href='rfc2616-sec17.html#bib17'>[17]</a>. Use of non-registered media types is
discouraged.
</p>
<h3><a id='sec3.7.1'>3.7.1</a> Canonicalization and Text Defaults</h3>
<p>
Internet media types are registered with a canonical form. An
entity-body transferred via HTTP messages MUST be represented in the
appropriate canonical form prior to its transmission except for
"text" types, as defined in the next paragraph.
</p>
<p>
When in canonical form, media subtypes of the "text" type use CRLF as
the text line break. HTTP relaxes this requirement and allows the
transport of text media with plain CR or LF alone representing a line
break when it is done consistently for an entire entity-body. HTTP
applications MUST accept CRLF, bare CR, and bare LF as being
representative of a line break in text media received via HTTP. In
addition, if the text is represented in a character set that does not
use octets 13 and 10 for CR and LF respectively, as is the case for
some multi-byte character sets, HTTP allows the use of whatever octet
sequences are defined by that character set to represent the
equivalent of CR and LF for line breaks. This flexibility regarding
line breaks applies only to text media in the entity-body; a bare CR
or LF MUST NOT be substituted for CRLF within any of the HTTP control
structures (such as header fields and multipart boundaries).
</p>
<p>
If an entity-body is encoded with a content-coding, the underlying
data MUST be in a form defined above prior to being encoded.
</p>
<p>
The "charset" parameter is used with some media types to define the
character set (section 3.4) of the data. When no explicit charset
parameter is provided by the sender, media subtypes of the "text"
type are defined to have a default charset value of "ISO-8859-1" when
received via HTTP. Data in character sets other than "ISO-8859-1" or
its subsets MUST be labeled with an appropriate charset value. See
section <a rel='xref' href='rfc2616-sec3.html#sec3.4.1'>3.4.1</a> for compatibility problems.
</p>
<h3><a id='sec3.7.2'>3.7.2</a> Multipart Types</h3>
<p>
MIME provides for a number of "multipart" types -- encapsulations of
one or more entities within a single message-body. All multipart
types share a common syntax, as defined in section <a rel='xref' href='rfc2616-sec5.html#sec5.1.1'>5.1.1</a> of RFC 2046
</p>
<p>
<a rel='bibref' href='rfc2616-sec17.html#bib40'>[40]</a>, and MUST include a boundary parameter as part of the media type
value. The message body is itself a protocol element and MUST
therefore use only CRLF to represent line breaks between body-parts.
Unlike in RFC 2046, the epilogue of any multipart message MUST be
empty; HTTP applications MUST NOT transmit the epilogue (even if the
original multipart contains an epilogue). These restrictions exist in
order to preserve the self-delimiting nature of a multipart message-
body, wherein the "end" of the message-body is indicated by the
ending multipart boundary.
</p>
<p>
In general, HTTP treats a multipart message-body no differently than
any other media type: strictly as payload. The one exception is the
"multipart/byteranges" type (appendix <a rel='xref' href='rfc2616-sec19.html#sec19.2'>19.2</a>) when it appears in a 206
(Partial Content) response, which will be interpreted by some HTTP
caching mechanisms as described in sections <a rel='xref' href='rfc2616-sec13.html#sec13.5.4'>13.5.4</a> and <a rel='xref' href='rfc2616-sec14.html#sec14.16'>14.16</a>. In all
other cases, an HTTP user agent SHOULD follow the same or similar
behavior as a MIME user agent would upon receipt of a multipart type.
The MIME header fields within each body-part of a multipart message-
body do not have any significance to HTTP beyond that defined by
their MIME semantics.
</p>
<p>
In general, an HTTP user agent SHOULD follow the same or similar
behavior as a MIME user agent would upon receipt of a multipart type.
If an application receives an unrecognized multipart subtype, the
application MUST treat it as being equivalent to "multipart/mixed".
</p>
<pre> Note: The "multipart/form-data" type has been specifically defined
for carrying form data suitable for processing via the POST
request method, as described in RFC 1867 <a rel='bibref' href='rfc2616-sec17.html#bib15'>[15]</a>.
</pre>
<h3><a id='sec3.8'>3.8</a> Product Tokens</h3>
<p>
Product tokens are used to allow communicating applications to
identify themselves by software name and version. Most fields using
product tokens also allow sub-products which form a significant part
of the application to be listed, separated by white space. By
convention, the products are listed in order of their significance
for identifying the application.
</p>
<pre> product = token ["/" product-version]
product-version = token
</pre>
<p>
Examples:
</p>
<pre> User-Agent: CERN-LineMode/2.15 libwww/2.17b3
Server: Apache/0.8.4
</pre>
<p>
Product tokens SHOULD be short and to the point. They MUST NOT be
used for advertising or other non-essential information. Although any
token character MAY appear in a product-version, this token SHOULD
only be used for a version identifier (i.e., successive versions of
the same product SHOULD only differ in the product-version portion of
the product value).
</p>
<h3><a id='sec3.9'>3.9</a> Quality Values</h3>
<p>
HTTP content negotiation (section 12) uses short "floating point"
numbers to indicate the relative importance ("weight") of various
negotiable parameters. A weight is normalized to a real number in
the range 0 through 1, where 0 is the minimum and 1 the maximum
value. If a parameter has a quality value of 0, then content with
this parameter is `not acceptable' for the client. HTTP/1.1
applications MUST NOT generate more than three digits after the
decimal point. User configuration of these values SHOULD also be
limited in this fashion.
</p>
<pre> qvalue = ( "0" [ "." 0*3DIGIT ] )
| ( "1" [ "." 0*3("0") ] )
</pre>
<p>
"Quality values" is a misnomer, since these values merely represent
relative degradation in desired quality.
</p>
<h3><a id='sec3.10'>3.10</a> Language Tags</h3>
<p>
A language tag identifies a natural language spoken, written, or
otherwise conveyed by human beings for communication of information
to other human beings. Computer languages are explicitly excluded.
HTTP uses language tags within the Accept-Language and Content-
Language fields.
</p>
<p>
The syntax and registry of HTTP language tags is the same as that
defined by RFC 1766 [1]. In summary, a language tag is composed of 1
or more parts: A primary language tag and a possibly empty series of
subtags:
</p>
<pre> language-tag = primary-tag *( "-" subtag )
primary-tag = 1*8ALPHA
subtag = 1*8ALPHA
</pre>
<p>
White space is not allowed within the tag and all tags are case-
insensitive. The name space of language tags is administered by the
IANA. Example tags include:
</p>
<pre> en, en-US, en-cockney, i-cherokee, x-pig-latin
</pre>
<p>
where any two-letter primary-tag is an ISO-639 language abbreviation
and any two-letter initial subtag is an ISO-3166 country code. (The
last three tags above are not registered tags; all but the last are
examples of tags which could be registered in future.)
</p>
<h3><a id='sec3.11'>3.11</a> Entity Tags</h3>
<p>
Entity tags are used for comparing two or more entities from the same
requested resource. HTTP/1.1 uses entity tags in the ETag (section
<a rel='xref' href='rfc2616-sec14.html#sec14.19'>14.19</a>), If-Match (section <a rel='xref' href='rfc2616-sec14.html#sec14.24'>14.24</a>), If-None-Match (section <a rel='xref' href='rfc2616-sec14.html#sec14.26'>14.26</a>), and
If-Range (section <a rel='xref' href='rfc2616-sec14.html#sec14.27'>14.27</a>) header fields. The definition of how they
are used and compared as cache validators is in section <a rel='xref' href='rfc2616-sec13.html#sec13.3.3'>13.3.3</a>. An
entity tag consists of an opaque quoted string, possibly prefixed by
a weakness indicator.
</p>
<pre> entity-tag = [ weak ] opaque-tag
weak = "W/"
opaque-tag = quoted-string
</pre>
<p>
A "strong entity tag" MAY be shared by two entities of a resource
only if they are equivalent by octet equality.
</p>
<p>
A "weak entity tag," indicated by the "W/" prefix, MAY be shared by
two entities of a resource only if the entities are equivalent and
could be substituted for each other with no significant change in
semantics. A weak entity tag can only be used for weak comparison.
</p>
<p>
An entity tag MUST be unique across all versions of all entities
associated with a particular resource. A given entity tag value MAY
be used for entities obtained by requests on different URIs. The use
of the same entity tag value in conjunction with entities obtained by
requests on different URIs does not imply the equivalence of those
entities.
</p>
<h3><a id='sec3.12'>3.12</a> Range Units</h3>
<p>
HTTP/1.1 allows a client to request that only part (a range of) the
response entity be included within the response. HTTP/1.1 uses range
units in the Range (section <a rel='xref' href='rfc2616-sec14.html#sec14.35'>14.35</a>) and Content-Range (section <a rel='xref' href='rfc2616-sec14.html#sec14.16'>14.16</a>)
header fields. An entity can be broken down into subranges according
to various structural units.
</p>
<pre> range-unit = bytes-unit | other-range-unit
bytes-unit = "bytes"
other-range-unit = token
</pre>
<p>
The only range unit defined by HTTP/1.1 is "bytes". HTTP/1.1
implementations MAY ignore ranges specified using other units.
</p>
<p>
HTTP/1.1 has been designed to allow implementations of applications
that do not depend on knowledge of ranges.
</p>
</body></html>