Network Working Group                                     T. Berners-Lee
Request for Comments: 1945                                       MIT/LCS
Category: Informational                                      R. Fielding
                                                               UC Irvine
                                                              H. Frystyk
                                                                 MIT/LCS
                                                                May 1996
14
15
16 Hypertext Transfer Protocol -- HTTP/1.0
17
18 Status of This Memo
19
20 This memo provides information for the Internet community. This memo
21 does not specify an Internet standard of any kind. Distribution of
22 this memo is unlimited.
23
24 IESG Note:
25
26 The IESG has concerns about this protocol, and expects this document
27 to be replaced relatively soon by a standards track document.
28
29 Abstract
30
31 The Hypertext Transfer Protocol (HTTP) is an application-level
32 protocol with the lightness and speed necessary for distributed,
33 collaborative, hypermedia information systems. It is a generic,
34 stateless, object-oriented protocol which can be used for many tasks,
35 such as name servers and distributed object management systems,
36 through extension of its request methods (commands). A feature of
37 HTTP is the typing of data representation, allowing systems to be
38 built independently of the data being transferred.
39
40 HTTP has been in use by the World-Wide Web global information
41 initiative since 1990. This specification reflects common usage of
42 the protocol referred to as "HTTP/1.0".
43
44 Table of Contents
45
46 1. Introduction .............................................. 4
47 1.1 Purpose .............................................. 4
48 1.2 Terminology .......................................... 4
49 1.3 Overall Operation .................................... 6
50 1.4 HTTP and MIME ........................................ 8
51 2. Notational Conventions and Generic Grammar ................ 8
52 2.1 Augmented BNF ........................................ 8
53 2.2 Basic Rules .......................................... 10
54 3. Protocol Parameters ....................................... 12
55
56
57
58 Berners-Lee, et al Informational [Page 1]
59 \f
60 RFC 1945 HTTP/1.0 May 1996
61
62
63 3.1 HTTP Version ......................................... 12
64 3.2 Uniform Resource Identifiers ......................... 14
65 3.2.1 General Syntax ................................ 14
66 3.2.2 http URL ...................................... 15
67 3.3 Date/Time Formats .................................... 15
68 3.4 Character Sets ....................................... 17
69 3.5 Content Codings ...................................... 18
70 3.6 Media Types .......................................... 19
71 3.6.1 Canonicalization and Text Defaults ............ 19
72 3.6.2 Multipart Types ............................... 20
73 3.7 Product Tokens ....................................... 20
74 4. HTTP Message .............................................. 21
75 4.1 Message Types ........................................ 21
76 4.2 Message Headers ...................................... 22
77 4.3 General Header Fields ................................ 23
78 5. Request ................................................... 23
79 5.1 Request-Line ......................................... 23
80 5.1.1 Method ........................................ 24
81 5.1.2 Request-URI ................................... 24
82 5.2 Request Header Fields ................................ 25
83 6. Response .................................................. 25
84 6.1 Status-Line .......................................... 26
85 6.1.1 Status Code and Reason Phrase ................. 26
86 6.2 Response Header Fields ............................... 28
87 7. Entity .................................................... 28
88 7.1 Entity Header Fields ................................. 29
89 7.2 Entity Body .......................................... 29
90 7.2.1 Type .......................................... 29
91 7.2.2 Length ........................................ 30
92 8. Method Definitions ........................................ 30
93 8.1 GET .................................................. 31
94 8.2 HEAD ................................................. 31
95 8.3 POST ................................................. 31
96 9. Status Code Definitions ................................... 32
97 9.1 Informational 1xx .................................... 32
98 9.2 Successful 2xx ....................................... 32
99 9.3 Redirection 3xx ...................................... 34
100 9.4 Client Error 4xx ..................................... 35
101 9.5 Server Error 5xx ..................................... 37
102 10. Header Field Definitions .................................. 37
103 10.1 Allow ............................................... 38
104 10.2 Authorization ....................................... 38
105 10.3 Content-Encoding .................................... 39
106 10.4 Content-Length ...................................... 39
107 10.5 Content-Type ........................................ 40
108 10.6 Date ................................................ 40
109 10.7 Expires ............................................. 41
110 10.8 From ................................................ 42
119 10.9 If-Modified-Since ................................... 42
120 10.10 Last-Modified ....................................... 43
121 10.11 Location ............................................ 44
122 10.12 Pragma .............................................. 44
123 10.13 Referer ............................................. 44
124 10.14 Server .............................................. 45
125 10.15 User-Agent .......................................... 46
126 10.16 WWW-Authenticate .................................... 46
127 11. Access Authentication ..................................... 47
128 11.1 Basic Authentication Scheme ......................... 48
129 12. Security Considerations ................................... 49
130 12.1 Authentication of Clients ........................... 49
131 12.2 Safe Methods ........................................ 49
132 12.3 Abuse of Server Log Information ..................... 50
133 12.4 Transfer of Sensitive Information ................... 50
134 12.5 Attacks Based On File and Path Names ................ 51
135 13. Acknowledgments ........................................... 51
136 14. References ................................................ 52
137 15. Authors' Addresses ........................................ 54
138 Appendix A. Internet Media Type message/http ................ 55
139 Appendix B. Tolerant Applications ........................... 55
140 Appendix C. Relationship to MIME ............................ 56
141 C.1 Conversion to Canonical Form ......................... 56
142 C.2 Conversion of Date Formats ........................... 57
143 C.3 Introduction of Content-Encoding ..................... 57
144 C.4 No Content-Transfer-Encoding ......................... 57
145 C.5 HTTP Header Fields in Multipart Body-Parts ........... 57
146 Appendix D. Additional Features ............................. 57
147 D.1 Additional Request Methods ........................... 58
148 D.1.1 PUT ........................................... 58
149 D.1.2 DELETE ........................................ 58
150 D.1.3 LINK .......................................... 58
151 D.1.4 UNLINK ........................................ 58
152 D.2 Additional Header Field Definitions .................. 58
153 D.2.1 Accept ........................................ 58
154 D.2.2 Accept-Charset ................................ 59
155 D.2.3 Accept-Encoding ............................... 59
156 D.2.4 Accept-Language ............................... 59
157 D.2.5 Content-Language .............................. 59
158 D.2.6 Link .......................................... 59
159 D.2.7 MIME-Version .................................. 59
160 D.2.8 Retry-After ................................... 60
161 D.2.9 Title ......................................... 60
162 D.2.10 URI ........................................... 60
163
175 1. Introduction
176
177 1.1 Purpose
178
   The Hypertext Transfer Protocol (HTTP) is an application-level
   protocol with the lightness and speed necessary for distributed,
   collaborative, hypermedia information systems. HTTP has been in use
   by the World-Wide Web global information initiative since 1990. This
   specification reflects common usage of the protocol referred to as
   "HTTP/1.0". This specification describes the features that seem to be
   consistently implemented in most HTTP/1.0 clients and servers. The
   specification is split into two sections. Those features of HTTP for
   which implementations are usually consistent are described in the
   main body of this document. Those features which have few or
   inconsistent implementations are listed in Appendix D.
190
191 Practical information systems require more functionality than simple
192 retrieval, including search, front-end update, and annotation. HTTP
193 allows an open-ended set of methods to be used to indicate the
194 purpose of a request. It builds on the discipline of reference
195 provided by the Uniform Resource Identifier (URI) [2], as a location
196 (URL) [4] or name (URN) [16], for indicating the resource on which a
197 method is to be applied. Messages are passed in a format similar to
198 that used by Internet Mail [7] and the Multipurpose Internet Mail
199 Extensions (MIME) [5].
200
201 HTTP is also used as a generic protocol for communication between
202 user agents and proxies/gateways to other Internet protocols, such as
203 SMTP [12], NNTP [11], FTP [14], Gopher [1], and WAIS [8], allowing
204 basic hypermedia access to resources available from diverse
205 applications and simplifying the implementation of user agents.
206
207 1.2 Terminology
208
209 This specification uses a number of terms to refer to the roles
210 played by participants in, and objects of, the HTTP communication.
211
212 connection
213
214 A transport layer virtual circuit established between two
215 application programs for the purpose of communication.
216
217 message
218
219 The basic unit of HTTP communication, consisting of a structured
220 sequence of octets matching the syntax defined in Section 4 and
221 transmitted via the connection.
222
231 request
232
233 An HTTP request message (as defined in Section 5).
234
235 response
236
237 An HTTP response message (as defined in Section 6).
238
239 resource
240
241 A network data object or service which can be identified by a
242 URI (Section 3.2).
243
244 entity
245
246 A particular representation or rendition of a data resource, or
247 reply from a service resource, that may be enclosed within a
248 request or response message. An entity consists of
249 metainformation in the form of entity headers and content in the
250 form of an entity body.
251
252 client
253
254 An application program that establishes connections for the
255 purpose of sending requests.
256
257 user agent
258
259 The client which initiates a request. These are often browsers,
260 editors, spiders (web-traversing robots), or other end user
261 tools.
262
263 server
264
265 An application program that accepts connections in order to
266 service requests by sending back responses.
267
268 origin server
269
270 The server on which a given resource resides or is to be created.
271
272 proxy
273
274 An intermediary program which acts as both a server and a client
275 for the purpose of making requests on behalf of other clients.
276 Requests are serviced internally or by passing them, with
277 possible translation, on to other servers. A proxy must
278 interpret and, if necessary, rewrite a request message before
287 forwarding it. Proxies are often used as client-side portals
288 through network firewalls and as helper applications for
289 handling requests via protocols not implemented by the user
290 agent.
291
292 gateway
293
294 A server which acts as an intermediary for some other server.
295 Unlike a proxy, a gateway receives requests as if it were the
296 origin server for the requested resource; the requesting client
297 may not be aware that it is communicating with a gateway.
298 Gateways are often used as server-side portals through network
299 firewalls and as protocol translators for access to resources
300 stored on non-HTTP systems.
301
302 tunnel
303
304 A tunnel is an intermediary program which is acting as a blind
305 relay between two connections. Once active, a tunnel is not
306 considered a party to the HTTP communication, though the tunnel
307 may have been initiated by an HTTP request. The tunnel ceases to
308 exist when both ends of the relayed connections are closed.
309 Tunnels are used when a portal is necessary and the intermediary
310 cannot, or should not, interpret the relayed communication.
311
312 cache
313
314 A program's local store of response messages and the subsystem
315 that controls its message storage, retrieval, and deletion. A
316 cache stores cachable responses in order to reduce the response
317 time and network bandwidth consumption on future, equivalent
318 requests. Any client or server may include a cache, though a
319 cache cannot be used by a server while it is acting as a tunnel.
320
321 Any given program may be capable of being both a client and a server;
322 our use of these terms refers only to the role being performed by the
323 program for a particular connection, rather than to the program's
324 capabilities in general. Likewise, any server may act as an origin
325 server, proxy, gateway, or tunnel, switching behavior based on the
326 nature of each request.
327
328 1.3 Overall Operation
329
330 The HTTP protocol is based on a request/response paradigm. A client
331 establishes a connection with a server and sends a request to the
332 server in the form of a request method, URI, and protocol version,
333 followed by a MIME-like message containing request modifiers, client
334 information, and possible body content. The server responds with a
343 status line, including the message's protocol version and a success
344 or error code, followed by a MIME-like message containing server
345 information, entity metainformation, and possible body content.
346
347 Most HTTP communication is initiated by a user agent and consists of
348 a request to be applied to a resource on some origin server. In the
349 simplest case, this may be accomplished via a single connection (v)
350 between the user agent (UA) and the origin server (O).
351
          request chain ------------------------>
       UA -------------------v------------------- O
          <----------------------- response chain
355
356 A more complicated situation occurs when one or more intermediaries
357 are present in the request/response chain. There are three common
358 forms of intermediary: proxy, gateway, and tunnel. A proxy is a
359 forwarding agent, receiving requests for a URI in its absolute form,
360 rewriting all or parts of the message, and forwarding the reformatted
361 request toward the server identified by the URI. A gateway is a
362 receiving agent, acting as a layer above some other server(s) and, if
363 necessary, translating the requests to the underlying server's
364 protocol. A tunnel acts as a relay point between two connections
365 without changing the messages; tunnels are used when the
366 communication needs to pass through an intermediary (such as a
367 firewall) even when the intermediary cannot understand the contents
368 of the messages.
369
          request chain -------------------------------------->
       UA -----v----- A -----v----- B -----v----- C -----v----- O
          <------------------------------------- response chain
373
374 The figure above shows three intermediaries (A, B, and C) between the
375 user agent and origin server. A request or response message that
376 travels the whole chain must pass through four separate connections.
377 This distinction is important because some HTTP communication options
378 may apply only to the connection with the nearest, non-tunnel
379 neighbor, only to the end-points of the chain, or to all connections
380 along the chain. Although the diagram is linear, each participant may
381 be engaged in multiple, simultaneous communications. For example, B
382 may be receiving requests from many clients other than A, and/or
383 forwarding requests to servers other than C, at the same time that it
384 is handling A's request.
385
386 Any party to the communication which is not acting as a tunnel may
387 employ an internal cache for handling requests. The effect of a cache
388 is that the request/response chain is shortened if one of the
389 participants along the chain has a cached response applicable to that
390 request. The following illustrates the resulting chain if B has a
399 cached copy of an earlier response from O (via C) for a request which
400 has not been cached by UA or A.
401
          request chain ---------->
       UA -----v----- A -----v----- B - - - - - - C - - - - - - O
          <--------- response chain
405
406 Not all responses are cachable, and some requests may contain
407 modifiers which place special requirements on cache behavior. Some
408 HTTP/1.0 applications use heuristics to describe what is or is not a
409 "cachable" response, but these rules are not standardized.
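
The shortening effect of a cache can be illustrated with a small Python
sketch. It is not part of this specification: the function name
connections_used and the representation of the chain as a list of
participant names are assumptions made only for this illustration.

```python
def connections_used(chain, cached_at=None):
    """Count the connections a request crosses before it is answered.

    chain     -- participant names ordered from user agent to origin server
    cached_at -- name of the first participant holding an applicable cached
                 response, or None if nobody along the chain has one
    """
    if cached_at is None:
        # No cache hit: the request travels to the origin server (last entry).
        return len(chain) - 1
    # Cache hit: the chain is shortened and ends at the caching participant.
    return chain.index(cached_at)


chain = ["UA", "A", "B", "C", "O"]
print(connections_used(chain))        # 4, the full chain
print(connections_used(chain, "B"))   # 2, B answers from its cache
```

The two results correspond to the four connections of the full chain and
to the shortened chain in which B holds a cached response.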
410
411 On the Internet, HTTP communication generally takes place over TCP/IP
412 connections. The default port is TCP 80 [15], but other ports can be
413 used. This does not preclude HTTP from being implemented on top of
414 any other protocol on the Internet, or on other networks. HTTP only
415 presumes a reliable transport; any protocol that provides such
416 guarantees can be used, and the mapping of the HTTP/1.0 request and
417 response structures onto the transport data units of the protocol in
418 question is outside the scope of this specification.
419
420 Except for experimental applications, current practice requires that
421 the connection be established by the client prior to each request and
422 closed by the server after sending the response. Both clients and
423 servers should be aware that either party may close the connection
424 prematurely, due to user action, automated time-out, or program
425 failure, and should handle such closing in a predictable fashion. In
426 any case, the closing of the connection by either or both parties
427 always terminates the current request, regardless of its status.
428
429 1.4 HTTP and MIME
430
431 HTTP/1.0 uses many of the constructs defined for MIME, as defined in
432 RFC 1521 [5]. Appendix C describes the ways in which the context of
433 HTTP allows for different use of Internet Media Types than is
434 typically found in Internet mail, and gives the rationale for those
435 differences.
436
437 2. Notational Conventions and Generic Grammar
438
439 2.1 Augmented BNF
440
441 All of the mechanisms specified in this document are described in
442 both prose and an augmented Backus-Naur Form (BNF) similar to that
443 used by RFC 822 [7]. Implementors will need to be familiar with the
444 notation in order to understand this specification. The augmented BNF
445 includes the following constructs:
446
455 name = definition
456
457 The name of a rule is simply the name itself (without any
458 enclosing "<" and ">") and is separated from its definition by
459 the equal character "=". Whitespace is only significant in that
460 indentation of continuation lines is used to indicate a rule
461 definition that spans more than one line. Certain basic rules
462 are in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc.
463 Angle brackets are used within definitions whenever their
464 presence will facilitate discerning the use of rule names.
465
466 "literal"
467
468 Quotation marks surround literal text. Unless stated otherwise,
469 the text is case-insensitive.
470
471 rule1 | rule2
472
      Elements separated by a bar ("|") are alternatives,
      e.g., "yes | no" will accept yes or no.
475
476 (rule1 rule2)
477
478 Elements enclosed in parentheses are treated as a single
479 element. Thus, "(elem (foo | bar) elem)" allows the token
480 sequences "elem foo elem" and "elem bar elem".
481
482 *rule
483
484 The character "*" preceding an element indicates repetition. The
485 full form is "<n>*<m>element" indicating at least <n> and at
486 most <m> occurrences of element. Default values are 0 and
487 infinity so that "*(element)" allows any number, including zero;
488 "1*element" requires at least one; and "1*2element" allows one
489 or two.
490
491 [rule]
492
493 Square brackets enclose optional elements; "[foo bar]" is
494 equivalent to "*1(foo bar)".
495
496 N rule
497
498 Specific repetition: "<n>(element)" is equivalent to
499 "<n>*<n>(element)"; that is, exactly <n> occurrences of
500 (element). Thus 2DIGIT is a 2-digit number, and 3ALPHA is a
501 string of three alphabetic characters.
502
511 #rule
512
513 A construct "#" is defined, similar to "*", for defining lists
514 of elements. The full form is "<n>#<m>element" indicating at
515 least <n> and at most <m> elements, each separated by one or
516 more commas (",") and optional linear whitespace (LWS). This
517 makes the usual form of lists very easy; a rule such as
518 "( *LWS element *( *LWS "," *LWS element ))" can be shown as
519 "1#element". Wherever this construct is used, null elements are
520 allowed, but do not contribute to the count of elements present.
521 That is, "(element), , (element)" is permitted, but counts as
522 only two elements. Therefore, where at least one element is
523 required, at least one non-null element must be present. Default
524 values are 0 and infinity so that "#(element)" allows any
525 number, including zero; "1#element" requires at least one; and
526 "1#2element" allows one or two.
527
528 ; comment
529
530 A semi-colon, set off some distance to the right of rule text,
531 starts a comment that continues to the end of line. This is a
532 simple way of including useful notes in parallel with the
533 specifications.
534
535 implied *LWS
536
537 The grammar described by this specification is word-based.
538 Except where noted otherwise, linear whitespace (LWS) can be
539 included between any two adjacent words (token or
540 quoted-string), and between adjacent tokens and delimiters
541 (tspecials), without changing the interpretation of a field. At
542 least one delimiter (tspecials) must exist between any two
543 tokens, since they would otherwise be interpreted as a single
544 token. However, applications should attempt to follow "common
545 form" when generating HTTP constructs, since there exist some
546 implementations that fail to accept anything beyond the common
547 forms.
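
Two of the constructs above, the "<n>*<m>" repetition prefix and the
"<n>#<m>" list form, can be sketched in Python as follows. This is an
illustration under stated assumptions, not a normative parser: the
helper names repetition_to_quantifier and parse_list are hypothetical,
and parse_list splits on every comma, so it does not handle elements
that are quoted strings containing commas.

```python
import re

# Hypothetical helpers, named here only for illustration.
_REPEAT = re.compile(r"([0-9]*)\*([0-9]*)")


def repetition_to_quantifier(prefix):
    """Map a "<n>*<m>" prefix such as "*", "1*" or "1*2" to "{n,m}"."""
    match = _REPEAT.fullmatch(prefix)
    if match is None:
        raise ValueError("not a repetition prefix: %r" % prefix)
    low = match.group(1) or "0"   # the default minimum is 0
    high = match.group(2)         # empty means no upper bound (infinity)
    return "{%s,%s}" % (low, high)


def parse_list(value, minimum=0, maximum=None):
    """Split a "<n>#<m>element" list on commas, dropping null elements."""
    parts = [part.strip(" \t\r\n") for part in value.split(",")]
    elements = [part for part in parts if part]   # null elements do not count
    if len(elements) < minimum:
        raise ValueError("fewer than %d non-null elements" % minimum)
    if maximum is not None and len(elements) > maximum:
        raise ValueError("more than %d elements" % maximum)
    return elements


# "1*2DIGIT": one or two digits.
one_or_two_digits = re.compile("[0-9]" + repetition_to_quantifier("1*2"))
print(bool(one_or_two_digits.fullmatch("42")))   # True
print(parse_list("foo, , bar"))                  # ['foo', 'bar']
```

Mapping the prefix onto a regular-expression quantifier is only one
possible reading aid; a hand-written parser could apply the same bounds
directly.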
548
549 2.2 Basic Rules
550
551 The following rules are used throughout this specification to
552 describe basic parsing constructs. The US-ASCII coded character set
553 is defined by [17].
554
555 OCTET = <any 8-bit sequence of data>
556 CHAR = <any US-ASCII character (octets 0 - 127)>
557 UPALPHA = <any US-ASCII uppercase letter "A".."Z">
558 LOALPHA = <any US-ASCII lowercase letter "a".."z">
567 ALPHA = UPALPHA | LOALPHA
568 DIGIT = <any US-ASCII digit "0".."9">
569 CTL = <any US-ASCII control character
570 (octets 0 - 31) and DEL (127)>
571 CR = <US-ASCII CR, carriage return (13)>
572 LF = <US-ASCII LF, linefeed (10)>
573 SP = <US-ASCII SP, space (32)>
574 HT = <US-ASCII HT, horizontal-tab (9)>
575 <"> = <US-ASCII double-quote mark (34)>
576
577 HTTP/1.0 defines the octet sequence CR LF as the end-of-line marker
578 for all protocol elements except the Entity-Body (see Appendix B for
579 tolerant applications). The end-of-line marker within an Entity-Body
580 is defined by its associated media type, as described in Section 3.6.
581
582 CRLF = CR LF
583
584 HTTP/1.0 headers may be folded onto multiple lines if each
585 continuation line begins with a space or horizontal tab. All linear
586 whitespace, including folding, has the same semantics as SP.
587
588 LWS = [CRLF] 1*( SP | HT )
589
590 However, folding of header lines is not expected by some
591 applications, and should not be generated by HTTP/1.0 applications.
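
A tolerant recipient still has to cope with folded headers. The Python
sketch below shows one way to apply the LWS rule, replacing each run of
linear whitespace (including a fold) with a single SP. The function name
unfold is an assumption, and the input is assumed to be the header
portion of a message only, never the Entity-Body.

```python
import re

# LWS = [CRLF] 1*( SP | HT )
_LWS = re.compile(r"(?:\r\n)?[ \t]+")


def unfold(header_block):
    """Return one string per header field, with each LWS run replaced by SP."""
    # A CRLF followed by SP or HT continues the previous line, so replacing
    # every LWS run with a single space also joins folded lines.
    collapsed = _LWS.sub(" ", header_block)
    return [line for line in collapsed.split("\r\n") if line]


folded = "Subject: a\r\n\tb\r\nFrom:  x@y\r\n"
print(unfold(folded))   # ['Subject: a b', 'From: x@y']
```

A CRLF that is not followed by SP or HT is left alone, so it still ends
the header field.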
592
593 The TEXT rule is only used for descriptive field contents and values
594 that are not intended to be interpreted by the message parser. Words
595 of *TEXT may contain octets from character sets other than US-ASCII.
596
597 TEXT = <any OCTET except CTLs,
598 but including LWS>
599
600 Recipients of header field TEXT containing octets outside the US-
601 ASCII character set may assume that they represent ISO-8859-1
602 characters.
603
604 Hexadecimal numeric characters are used in several protocol elements.
605
606 HEX = "A" | "B" | "C" | "D" | "E" | "F"
607 | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT
608
609 Many HTTP/1.0 header field values consist of words separated by LWS
610 or special characters. These special characters must be in a quoted
611 string to be used within a parameter value.
612
613 word = token | quoted-string
614
623 token = 1*<any CHAR except CTLs or tspecials>
624
625 tspecials = "(" | ")" | "<" | ">" | "@"
626 | "," | ";" | ":" | "\" | <">
627 | "/" | "[" | "]" | "?" | "="
628 | "{" | "}" | SP | HT
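
As an illustration, the token and tspecials rules can be checked
character by character. The Python sketch below is not part of this
specification; the names TSPECIALS and is_token are assumptions.

```python
# The 19 tspecials listed above, including SP and HT.
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')


def is_token(text):
    """True if text matches token = 1*<any CHAR except CTLs or tspecials>."""
    if not text:
        return False                   # at least one character is required
    for ch in text:
        code = ord(ch)
        if code > 127:                 # not a CHAR (US-ASCII only)
            return False
        if code < 32 or code == 127:   # CTL
            return False
        if ch in TSPECIALS:
            return False
    return True


print(is_token("text"))        # True
print(is_token("text/html"))   # False, "/" is a tspecial
```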
629
630 Comments may be included in some HTTP header fields by surrounding
631 the comment text with parentheses. Comments are only allowed in
632 fields containing "comment" as part of their field value definition.
633 In all other fields, parentheses are considered part of the field
634 value.
635
636 comment = "(" *( ctext | comment ) ")"
637 ctext = <any TEXT excluding "(" and ")">
638
639 A string of text is parsed as a single word if it is quoted using
640 double-quote marks.
641
642 quoted-string = ( <"> *(qdtext) <"> )
643
644 qdtext = <any CHAR except <"> and CTLs,
645 but including LWS>
646
647 Single-character quoting using the backslash ("\") character is not
648 permitted in HTTP/1.0.
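
Because backslash quoting is not permitted, a quoted-string can be
validated without any escape handling. The Python sketch below is a
minimal illustration; the name parse_quoted_string is an assumption, and
the check accepts CR, LF and HT anywhere in the body rather than
verifying that they occur only as part of LWS.

```python
def parse_quoted_string(text):
    """Return the qdtext inside a quoted-string, or raise ValueError."""
    if len(text) < 2 or not (text.startswith('"') and text.endswith('"')):
        raise ValueError("not enclosed in double-quote marks")
    body = text[1:-1]
    for ch in body:
        code = ord(ch)
        if ch == '"':
            # No escaping exists, so a double-quote mark cannot appear inside.
            raise ValueError("double-quote mark inside quoted-string")
        if code > 127:
            raise ValueError("qdtext is limited to US-ASCII CHAR")
        if (code < 32 or code == 127) and ch not in "\r\n\t":
            raise ValueError("control character inside quoted-string")
    return body


print(parse_quoted_string('"hello world"'))   # hello world
```

A backslash is treated as an ordinary character, so a backslash followed
by a double-quote mark does not protect the quote mark.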
649
650 3. Protocol Parameters
651
652 3.1 HTTP Version
653
654 HTTP uses a "<major>.<minor>" numbering scheme to indicate versions
655 of the protocol. The protocol versioning policy is intended to allow
656 the sender to indicate the format of a message and its capacity for
657 understanding further HTTP communication, rather than the features
658 obtained via that communication. No change is made to the version
659 number for the addition of message components which do not affect
660 communication behavior or which only add to extensible field values.
661 The <minor> number is incremented when the changes made to the
662 protocol add features which do not change the general message parsing
663 algorithm, but which may add to the message semantics and imply
664 additional capabilities of the sender. The <major> number is
665 incremented when the format of a message within the protocol is
666 changed.
667
668 The version of an HTTP message is indicated by an HTTP-Version field
669 in the first line of the message. If the protocol version is not
670 specified, the recipient must assume that the message is in the
679 simple HTTP/0.9 format.
680
681 HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
682
683 Note that the major and minor numbers should be treated as separate
684 integers and that each may be incremented higher than a single digit.
685 Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is
686 lower than HTTP/12.3. Leading zeros should be ignored by recipients
687 and never generated by senders.
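
The comparison rule can be sketched in Python by parsing the two numbers
as separate integers and comparing them as a pair. The function name
parse_http_version is an assumption; matching "HTTP" without regard to
case follows the general rule for literals in Section 2.1.

```python
import re

# HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
_HTTP_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)", re.IGNORECASE)


def parse_http_version(text):
    """Return (major, minor) as integers; int() ignores leading zeros."""
    match = _HTTP_VERSION.fullmatch(text)
    if match is None:
        raise ValueError("not an HTTP-Version: %r" % text)
    return int(match.group(1)), int(match.group(2))


# Tuples compare element by element, so 2.4 < 2.13 < 12.3 as required.
print(parse_http_version("HTTP/2.4") < parse_http_version("HTTP/2.13"))   # True
print(parse_http_version("HTTP/2.13") < parse_http_version("HTTP/12.3"))  # True
```

Comparing the version as a decimal fraction or as a string would order
these examples incorrectly, which is why the pair of integers is used.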
688
689 This document defines both the 0.9 and 1.0 versions of the HTTP
690 protocol. Applications sending Full-Request or Full-Response
691 messages, as defined by this specification, must include an HTTP-
692 Version of "HTTP/1.0".
693
694 HTTP/1.0 servers must:
695
696 o recognize the format of the Request-Line for HTTP/0.9 and
697 HTTP/1.0 requests;
698
699 o understand any valid request in the format of HTTP/0.9 or
700 HTTP/1.0;
701
702 o respond appropriately with a message in the same protocol
703 version used by the client.
704
705 HTTP/1.0 clients must:
706
707 o recognize the format of the Status-Line for HTTP/1.0 responses;
708
709 o understand any valid response in the format of HTTP/0.9 or
710 HTTP/1.0.
711
712 Proxy and gateway applications must be careful in forwarding requests
713 that are received in a format different than that of the
714 application's native HTTP version. Since the protocol version
715 indicates the protocol capability of the sender, a proxy/gateway must
716 never send a message with a version indicator which is greater than
717 its native version; if a higher version request is received, the
718 proxy/gateway must either downgrade the request version or respond
719 with an error. Requests with a version lower than that of the
720 application's native format may be upgraded before being forwarded;
721 the proxy/gateway's response to that request must follow the server
722 requirements listed above.
723
735 3.2 Uniform Resource Identifiers
736
737 URIs have been known by many names: WWW addresses, Universal Document
738 Identifiers, Universal Resource Identifiers [2], and finally the
739 combination of Uniform Resource Locators (URL) [4] and Names (URN)
740 [16]. As far as HTTP is concerned, Uniform Resource Identifiers are
741 simply formatted strings which identify--via name, location, or any
742 other characteristic--a network resource.
743
744 3.2.1 General Syntax
745
746 URIs in HTTP can be represented in absolute form or relative to some
747 known base URI [9], depending upon the context of their use. The two
748 forms are differentiated by the fact that absolute URIs always begin
749 with a scheme name followed by a colon.
750
751 URI = ( absoluteURI | relativeURI ) [ "#" fragment ]
752
753 absoluteURI = scheme ":" *( uchar | reserved )
754
755 relativeURI = net_path | abs_path | rel_path
756
757 net_path = "//" net_loc [ abs_path ]
758 abs_path = "/" rel_path
759 rel_path = [ path ] [ ";" params ] [ "?" query ]
760
761 path = fsegment *( "/" segment )
762 fsegment = 1*pchar
763 segment = *pchar
764
765 params = param *( ";" param )
766 param = *( pchar | "/" )
767
768 scheme = 1*( ALPHA | DIGIT | "+" | "-" | "." )
769 net_loc = *( pchar | ";" | "?" )
770 query = *( uchar | reserved )
771 fragment = *( uchar | reserved )
772
773 pchar = uchar | ":" | "@" | "&" | "=" | "+"
774 uchar = unreserved | escape
775 unreserved = ALPHA | DIGIT | safe | extra | national
776
777 escape = "%" HEX HEX
778 reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+"
779 extra = "!" | "*" | "'" | "(" | ")" | ","
780 safe = "$" | "-" | "_" | "."
781 unsafe = CTL | SP | <"> | "#" | "%" | "<" | ">"
782 national = <any OCTET excluding ALPHA, DIGIT,
791 reserved, extra, safe, and unsafe>
792
793 For definitive information on URL syntax and semantics, see RFC 1738
794 [4] and RFC 1808 [9]. The BNF above includes national characters not
795 allowed in valid URLs as specified by RFC 1738, since HTTP servers
796 are not restricted in the set of unreserved characters allowed to
797 represent the rel_path part of addresses, and HTTP proxies may
798 receive requests for URIs not defined by RFC 1738.
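
As an informal illustration of the distinction drawn above, the
following Python sketch separates any fragment and then tests for a
leading scheme name followed by a colon. The names split_uri and
is_absolute are assumptions made for this example; the sketch does not
validate the rest of the grammar.

```python
import re

# scheme = 1*( ALPHA | DIGIT | "+" | "-" | "." ), followed by ":"
_SCHEME = re.compile(r"([A-Za-z0-9+.\-]+):")


def split_uri(uri):
    """Return (scheme or None, remainder, fragment or None)."""
    body, sep, fragment = uri.partition("#")
    match = _SCHEME.match(body)
    scheme = match.group(1) if match else None
    remainder = body[match.end():] if match else body
    return scheme, remainder, (fragment if sep else None)


def is_absolute(uri):
    """An absolute URI always begins with a scheme name and a colon."""
    return split_uri(uri)[0] is not None
```

A relative reference such as "/pub/WWW/" yields no scheme, because "/"
is not a legal scheme character.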
799
800 3.2.2 http URL
801
802 The "http" scheme is used to locate network resources via the HTTP
803 protocol. This section defines the scheme-specific syntax and
804 semantics for http URLs.
805
806 http_URL = "http:" "//" host [ ":" port ] [ abs_path ]
807
808 host = <A legal Internet host domain name
809 or IP address (in dotted-decimal form),
810 as defined by Section 2.1 of RFC 1123>
811
812 port = *DIGIT
813
814 If the port is empty or not given, port 80 is assumed. The semantics
815 are that the identified resource is located at the server listening
816 for TCP connections on that port of that host, and the Request-URI
817 for the resource is abs_path. If the abs_path is not present in the
818 URL, it must be given as "/" when used as a Request-URI (Section
819 5.1.2).
820
821 Note: Although the HTTP protocol is independent of the transport
822 layer protocol, the http URL only identifies resources by their
823 TCP location, and thus non-TCP resources must be identified by
824 some other URI scheme.
825
826 The canonical form for "http" URLs is obtained by converting any
827 UPALPHA characters in host to their LOALPHA equivalent (hostnames are
828 case-insensitive), eliding the [ ":" port ] if the port is 80, and
829 replacing an empty abs_path with "/".
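
The three canonicalization steps can be sketched in Python as follows.
The function name canonical_http_url and its regular expression are
assumptions made for this example; the host is matched loosely rather
than checked against RFC 1123.

```python
import re

# http_URL = "http:" "//" host [ ":" port ] [ abs_path ]
_HTTP_URL = re.compile(r"http://([^/:?#]+)(?::(\d*))?(/.*)?", re.IGNORECASE)


def canonical_http_url(url):
    """Return the canonical form of an http URL."""
    match = _HTTP_URL.fullmatch(url)
    if match is None:
        raise ValueError("not an http URL: %r" % (url,))
    host, port, abs_path = match.groups()
    host = host.lower()                  # hostnames are case-insensitive
    if not port or int(port) == 80:      # empty or default port is elided
        authority = host
    else:
        authority = "%s:%s" % (host, port)
    return "http://%s%s" % (authority, abs_path or "/")
```

Only the host is case-folded; the abs_path is left untouched because
the specification does not declare it case-insensitive.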
830
831 3.3 Date/Time Formats
832
833 HTTP/1.0 applications have historically allowed three different
834 formats for the representation of date/time stamps:
835
836 Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
837 Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
838 Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format
839
847 The first format is preferred as an Internet standard and represents
848 a fixed-length subset of that defined by RFC 1123 [6] (an update to
849 RFC 822 [7]). The second format is in common use, but is based on the
850 obsolete RFC 850 [10] date format and lacks a four-digit year.
851 HTTP/1.0 clients and servers that parse the date value should accept
852 all three formats, though they must never generate the third
853 (asctime) format.
854
855 Note: Recipients of date values are encouraged to be robust in
856 accepting date values that may have been generated by non-HTTP
857 applications, as is sometimes the case when retrieving or posting
858 messages via proxies/gateways to SMTP or NNTP.
859
860 All HTTP/1.0 date/time stamps must be represented in Universal Time
861 (UT), also known as Greenwich Mean Time (GMT), without exception.
862 This is indicated in the first two formats by the inclusion of "GMT"
863 as the three-letter abbreviation for time zone, and should be assumed
864 when reading the asctime format.
865
866 HTTP-date = rfc1123-date | rfc850-date | asctime-date
867
868 rfc1123-date = wkday "," SP date1 SP time SP "GMT"
869 rfc850-date = weekday "," SP date2 SP time SP "GMT"
870 asctime-date = wkday SP date3 SP time SP 4DIGIT
871
872 date1 = 2DIGIT SP month SP 4DIGIT
873 ; day month year (e.g., 02 Jun 1982)
874 date2 = 2DIGIT "-" month "-" 2DIGIT
875 ; day-month-year (e.g., 02-Jun-82)
876 date3 = month SP ( 2DIGIT | ( SP 1DIGIT ))
877 ; month day (e.g., Jun 2)
878
879 time = 2DIGIT ":" 2DIGIT ":" 2DIGIT
880 ; 00:00:00 - 23:59:59
881
882 wkday = "Mon" | "Tue" | "Wed"
883 | "Thu" | "Fri" | "Sat" | "Sun"
884
885 weekday = "Monday" | "Tuesday" | "Wednesday"
886 | "Thursday" | "Friday" | "Saturday" | "Sunday"
887
888 month = "Jan" | "Feb" | "Mar" | "Apr"
889 | "May" | "Jun" | "Jul" | "Aug"
890 | "Sep" | "Oct" | "Nov" | "Dec"
891
892 Note: HTTP requirements for the date/time stamp format apply
893 only to their usage within the protocol stream. Clients and
894 servers are not required to use these formats for user
903 presentation, request logging, etc.
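
A recipient that accepts all three formats but generates only the
first might be sketched in Python as below. The function names are
assumptions made for this example. The sketch relies on
datetime.strptime, so it assumes a locale with English day and month
names, and it does not check that the weekday agrees with the date.

```python
from datetime import datetime, timezone

_WKDAY = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
_MONTH = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

_PARSE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # rfc1123-date
    "%A, %d-%b-%y %H:%M:%S GMT",   # rfc850-date (two-digit year)
    "%a %b %d %H:%M:%S %Y",        # asctime-date (GMT is assumed)
)


def parse_http_date(value):
    """Parse any of the three HTTP-date forms into an aware UTC datetime."""
    for fmt in _PARSE_FORMATS:
        try:
            naive = datetime.strptime(value, fmt)
        except ValueError:
            continue
        return naive.replace(tzinfo=timezone.utc)
    raise ValueError("unrecognized HTTP-date: %r" % (value,))


def format_http_date(moment):
    """Format an aware datetime as an rfc1123-date, the preferred form."""
    utc = moment.astimezone(timezone.utc)
    return "%s, %02d %s %04d %02d:%02d:%02d GMT" % (
        _WKDAY[utc.weekday()], utc.day, _MONTH[utc.month - 1],
        utc.year, utc.hour, utc.minute, utc.second)
```

The formatter builds the day and month names from fixed tables so that
generated dates do not depend on the process locale.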
904
905 3.4 Character Sets
906
907 HTTP uses the same definition of the term "character set" as that
908 described for MIME:
909
910 The term "character set" is used in this document to refer to a
911 method used with one or more tables to convert a sequence of
912 octets into a sequence of characters. Note that unconditional
913 conversion in the other direction is not required, in that not all
914 characters may be available in a given character set and a
915 character set may provide more than one sequence of octets to
916 represent a particular character. This definition is intended to
917 allow various kinds of character encodings, from simple single-
918 table mappings such as US-ASCII to complex table switching methods
919 such as those that use ISO 2022's techniques. However, the
920 definition associated with a MIME character set name must fully
921 specify the mapping to be performed from octets to characters. In
922 particular, use of external profiling information to determine the
923 exact mapping is not permitted.
924
925 Note: This use of the term "character set" is more commonly
926 referred to as a "character encoding." However, since HTTP and
927 MIME share the same registry, it is important that the terminology
928 also be shared.
929
930 HTTP character sets are identified by case-insensitive tokens. The
931 complete set of tokens is defined by the IANA Character Set registry
932 [15]. However, because that registry does not define a single,
933 consistent token for each character set, we define here the preferred
934 names for those character sets most likely to be used with HTTP
935 entities. These character sets include those registered by RFC 1521
936 [5] -- the US-ASCII [17] and ISO-8859 [18] character sets -- and
937 other names specifically recommended for use within MIME charset
938 parameters.
939
940 charset = "US-ASCII"
941 | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
942 | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
943 | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
944 | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
945 | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8"
946 | token
947
948 Although HTTP allows an arbitrary token to be used as a charset
949 value, any token that has a predefined value within the IANA
950 Character Set registry [15] must represent the character set defined
959 by that registry. Applications should limit their use of character
960 sets to those defined by the IANA registry.
961
962 The character set of an entity body should be labelled as the lowest
963 common denominator of the character codes used within that body,
964 except that omitting the label is preferred over using the labels
965 US-ASCII or ISO-8859-1.
966
967 3.5 Content Codings
968
969 Content coding values are used to indicate an encoding transformation
970 that has been applied to a resource. Content codings are primarily
971 used to allow a document to be compressed or encrypted without losing
972 the identity of its underlying media type. Typically, the resource is
973 stored in this encoding and only decoded before rendering or
974 analogous usage.
975
976 content-coding = "x-gzip" | "x-compress" | token
977
978 Note: For future compatibility, HTTP/1.0 applications should
979 consider "gzip" and "compress" to be equivalent to "x-gzip"
980 and "x-compress", respectively.
981
982 All content-coding values are case-insensitive. HTTP/1.0 uses
983 content-coding values in the Content-Encoding (Section 10.3) header
984 field. Although the value describes the content-coding, what is more
985 important is that it indicates what decoding mechanism will be
986 required to remove the encoding. Note that a single program may be
987 capable of decoding multiple content-coding formats. Two values are
988 defined by this specification:
989
990 x-gzip
991 An encoding format produced by the file compression program
992 "gzip" (GNU zip) developed by Jean-loup Gailly. This format is
993 typically a Lempel-Ziv coding (LZ77) with a 32-bit CRC.
994
995 x-compress
996 The encoding format produced by the file compression program
997 "compress". This format is an adaptive Lempel-Ziv-Welch coding
998 (LZW).
999
1000 Note: Use of program names for the identification of
1001 encoding formats is not desirable and should be discouraged
1002 for future encodings. Their use here is representative of
1003 historical practice, not good design.
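
The following Python sketch shows one way a recipient might normalize
a content-coding value and select a decoder. The function names are
assumptions made for this example. Only the gzip coding is decoded,
using the standard gzip module; the Python standard library has no
decoder for the "compress" (LZW) format, so the sketch reports that
coding as unsupported.

```python
import gzip

# "gzip" and "compress" are treated as equivalent to the x- forms.
_ALIASES = {"gzip": "x-gzip", "compress": "x-compress"}


def normalize_coding(value):
    """Content-coding values are case-insensitive tokens."""
    token = value.strip().lower()
    return _ALIASES.get(token, token)


def decode_entity_body(content_coding, body):
    """Remove the named content-coding from a bytes body."""
    coding = normalize_coding(content_coding)
    if coding == "x-gzip":
        return gzip.decompress(body)
    raise NotImplementedError("no decoder for content-coding %r" % (coding,))
```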
1004
1015 3.6 Media Types
1016
1017 HTTP uses Internet Media Types [13] in the Content-Type header field
1018 (Section 10.5) in order to provide open and extensible data typing.
1019
1020 media-type = type "/" subtype *( ";" parameter )
1021 type = token
1022 subtype = token
1023
1024 Parameters may follow the type/subtype in the form of attribute/value
1025 pairs.
1026
1027 parameter = attribute "=" value
1028 attribute = token
1029 value = token | quoted-string
1030
1031 The type, subtype, and parameter attribute names are case-
1032 insensitive. Parameter values may or may not be case-sensitive,
1033 depending on the semantics of the parameter name. LWS must not be
1034 generated between the type and subtype, nor between an attribute and
1035 its value. Upon receipt of a media type with an unrecognized
1036 parameter, a user agent should treat the media type as if the
1037 unrecognized parameter and its value were not present.
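
A minimal Python sketch of these rules follows. The function names are
assumptions made for this example. As a simplification, the sketch
splits on every ";" and therefore does not handle a quoted-string
value that itself contains ";".

```python
def parse_media_type(value):
    """Return (type, subtype, params) with case-insensitive parts lowercased."""
    head, *rest = value.split(";")
    type_, sep, subtype = head.strip().partition("/")
    if not sep or not type_ or not subtype:
        raise ValueError("malformed media type: %r" % (value,))
    params = {}
    for item in rest:
        attribute, sep, val = item.strip().partition("=")
        if not sep:
            continue
        val = val.strip()
        if len(val) >= 2 and val[0] == val[-1] == '"':
            val = val[1:-1]              # unwrap a quoted-string
        # Attribute names are case-insensitive; values are kept as sent.
        params[attribute.strip().lower()] = val
    return type_.lower(), subtype.lower(), params


def keep_recognized(params, recognized):
    """Treat unrecognized parameters as if they were not present."""
    return {name: val for name, val in params.items() if name in recognized}
```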
1038
1039 Some older HTTP applications do not recognize media type parameters.
1040 HTTP/1.0 applications should only use media type parameters when they
1041 are necessary to define the content of a message.
1042
1043 Media-type values are registered with the Internet Assigned Numbers
1044 Authority (IANA [15]). The media type registration process is
1045 outlined in RFC 1590 [13]. Use of non-registered media types is
1046 discouraged.
1047
1048 3.6.1 Canonicalization and Text Defaults
1049
1050 Internet media types are registered with a canonical form. In
1051 general, an Entity-Body transferred via HTTP must be represented in
1052 the appropriate canonical form prior to its transmission. If the body
1053 has been encoded with a Content-Encoding, the underlying data should
1054 be in canonical form prior to being encoded.
1055
1056 Media subtypes of the "text" type use CRLF as the text line break
1057 when in canonical form. However, HTTP allows the transport of text
1058 media with plain CR or LF alone representing a line break when used
1059 consistently within the Entity-Body. HTTP applications must accept
1060 CRLF, bare CR, and bare LF as being representative of a line break in
1061 text media received via HTTP.
1062
1071 In addition, if the text media is represented in a character set that
1072 does not use octets 13 and 10 for CR and LF respectively, as is the
1073 case for some multi-byte character sets, HTTP allows the use of
1074 whatever octet sequences are defined by that character set to
1075 represent the equivalent of CR and LF for line breaks. This
1076 flexibility regarding line breaks applies only to text media in the
1077 Entity-Body; a bare CR or LF should not be substituted for CRLF
1078 within any of the HTTP control structures (such as header fields and
1079 multipart boundaries).
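
The tolerant treatment of line breaks in text media can be sketched in
Python as below. The function names are assumptions made for this
example, and the sketch assumes a character set in which octets 13 and
10 represent CR and LF. It must not be applied to header fields or
multipart boundaries, where only CRLF is acceptable.

```python
import re

# CRLF is listed first so that it counts as a single line break.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def split_text_lines(body):
    """Split a text entity body on CRLF, bare CR, or bare LF."""
    return _LINE_BREAK.split(body)


def to_canonical_text(body):
    """Rewrite every line break as CRLF, the canonical form."""
    return _LINE_BREAK.sub(b"\r\n", body)
```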
1080
1081 The "charset" parameter is used with some media types to define the
1082 character set (Section 3.4) of the data. When no explicit charset
1083 parameter is provided by the sender, media subtypes of the "text"
1084 type are defined to have a default charset value of "ISO-8859-1" when
1085 received via HTTP. Data in character sets other than "ISO-8859-1" or
1086 its subsets must be labelled with an appropriate charset value in
1087 order to be consistently interpreted by the recipient.
1088
1089 Note: Many current HTTP servers provide data using charsets other
1090 than "ISO-8859-1" without proper labelling. This situation reduces
1091 interoperability and is not recommended. To compensate for this,
1092 some HTTP user agents provide a configuration option to allow the
1093 user to change the default interpretation of the media type
1094 character set when no charset parameter is given.
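
The default described in this section can be sketched in Python as
follows. The function name effective_charset is an assumption made for
this example, and upper-casing the returned token is a choice made
here to match the preferred names of Section 3.4; charset tokens
themselves are case-insensitive.

```python
def effective_charset(content_type):
    """Return the charset to assume for a Content-Type value, or None."""
    head, *params = content_type.split(";")
    for param in params:
        attribute, sep, value = param.strip().partition("=")
        if sep and attribute.strip().lower() == "charset":
            return value.strip().strip('"').upper()
    # No explicit charset: "text" subtypes default to ISO-8859-1.
    if head.strip().lower().startswith("text/"):
        return "ISO-8859-1"
    return None
```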
1095
1096 3.6.2 Multipart Types
1097
1098 MIME provides for a number of "multipart" types -- encapsulations of
1099 several entities within a single message's Entity-Body. The multipart
1100 types registered by IANA [15] do not have any special meaning for
1101 HTTP/1.0, though user agents may need to understand each type in
1102 order to correctly interpret the purpose of each body-part. An HTTP
1103 user agent should follow the same or similar behavior as a MIME user
1104 agent does upon receipt of a multipart type. HTTP servers should not
1105 assume that all HTTP clients are prepared to handle multipart types.
1106
1107 All multipart types share a common syntax and must include a boundary
1108 parameter as part of the media type value. The message body is itself
1109 a protocol element and must therefore use only CRLF to represent line
1110 breaks between body-parts. Multipart body-parts may contain HTTP
1111 header fields which are significant to the meaning of that part.
1112
1113 3.7 Product Tokens
1114
1115 Product tokens are used to allow communicating applications to
1116 identify themselves via a simple product token, with an optional
1117 slash and version designator. Most fields using product tokens also
1118 allow subproducts which form a significant part of the application to
1127 be listed, separated by whitespace. By convention, the products are
1128 listed in order of their significance for identifying the
1129 application.
1130
1131 product = token ["/" product-version]
1132 product-version = token
1133
1134 Examples:
1135
1136 User-Agent: CERN-LineMode/2.15 libwww/2.17b3
1137
1138 Server: Apache/0.8.4
1139
1140 Product tokens should be short and to the point -- use of them for
1141 advertising or other non-essential information is explicitly
1142 forbidden. Although any token character may appear in a product-
1143 version, this token should only be used for a version identifier
1144 (i.e., successive versions of the same product should only differ in
1145 the product-version portion of the product value).
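
A recipient might split such a field value into its products as in
the following Python sketch. The function name parse_products is an
assumption made for this example, and the sketch assumes the value
holds only product tokens separated by whitespace.

```python
def parse_products(field_value):
    """Return a list of (product, version or None), most significant first."""
    products = []
    for item in field_value.split():
        name, _, version = item.partition("/")
        products.append((name, version or None))
    return products
```

Applied to the first example above, this yields the pairs
("CERN-LineMode", "2.15") and ("libwww", "2.17b3").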
1146
1147 4. HTTP Message
1148
1149 4.1 Message Types
1150
1151 HTTP messages consist of requests from client to server and responses
1152 from server to client.
1153
1154 HTTP-message = Simple-Request ; HTTP/0.9 messages
1155 | Simple-Response
1156 | Full-Request ; HTTP/1.0 messages
1157 | Full-Response
1158
1159 Full-Request and Full-Response use the generic message format of RFC
1160 822 [7] for transferring entities. Both messages may include optional
1161 header fields (also known as "headers") and an entity body. The
1162 entity body is separated from the headers by a null line (i.e., a
1163 line with nothing preceding the CRLF).
1164
1165 Full-Request = Request-Line ; Section 5.1
1166 *( General-Header ; Section 4.3
1167 | Request-Header ; Section 5.2
1168 | Entity-Header ) ; Section 7.1
1169 CRLF
1170 [ Entity-Body ] ; Section 7.2
1171
1172 Full-Response = Status-Line ; Section 6.1
1173 *( General-Header ; Section 4.3
1174 | Response-Header ; Section 6.2
1183 | Entity-Header ) ; Section 7.1
1184 CRLF
1185 [ Entity-Body ] ; Section 7.2
1186
1187 Simple-Request and Simple-Response do not allow the use of any header
1188 information and are limited to a single request method (GET).
1189
1190 Simple-Request = "GET" SP Request-URI CRLF
1191
1192 Simple-Response = [ Entity-Body ]
1193
1194 Use of the Simple-Request format is discouraged because it prevents
1195 the server from identifying the media type of the returned entity.
1196
1197 4.2 Message Headers
1198
1199 HTTP header fields, which include General-Header (Section 4.3),
1200 Request-Header (Section 5.2), Response-Header (Section 6.2), and
1201 Entity-Header (Section 7.1) fields, follow the same generic format as
1202 that given in Section 3.1 of RFC 822 [7]. Each header field consists
1203 of a name followed immediately by a colon (":"), a single space (SP)
1204 character, and the field value. Field names are case-insensitive.
1205 Header fields can be extended over multiple lines by preceding each
1206 extra line with at least one SP or HT, though this is not
1207 recommended.
1208
1209 HTTP-header = field-name ":" [ field-value ] CRLF
1210
1211 field-name = token
1212 field-value = *( field-content | LWS )
1213
1214 field-content = <the OCTETs making up the field-value
1215 and consisting of either *TEXT or combinations
1216 of token, tspecials, and quoted-string>
1217
1218 The order in which header fields are received is not significant.
1219 However, it is "good practice" to send General-Header fields first,
1220 followed by Request-Header or Response-Header fields prior to the
1221 Entity-Header fields.
1222
1223 Multiple HTTP-header fields with the same field-name may be present
1224 in a message if and only if the entire field-value for that header
1225 field is defined as a comma-separated list [i.e., #(values)]. It must
1226 be possible to combine the multiple header fields into one "field-
1227 name: field-value" pair, without changing the semantics of the
1228 message, by appending each subsequent field-value to the first, each
1229 separated by a comma.
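
The unfolding and combining rules of this section can be sketched in
Python as below. The function name parse_header_block is an assumption
made for this example. The sketch combines every repeated field-name
with a comma, which is only meaningful for fields defined as
comma-separated lists; a stricter parser would reject other
repetitions.

```python
def parse_header_block(lines):
    """Parse header lines (CRLF already removed) into {lowercased name: value}."""
    unfolded = []
    for line in lines:
        # A line starting with SP or HT continues the previous field.
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)
    fields = {}
    for line in unfolded:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line: %r" % (line,))
        key = name.strip().lower()       # field names are case-insensitive
        value = value.strip()
        if key in fields:
            fields[key] += ", " + value  # append to the first, comma-separated
        else:
            fields[key] = value
    return fields
```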
1230
1239 4.3 General Header Fields
1240
1241 There are a few header fields which have general applicability for
1242 both request and response messages, but which do not apply to the
1243 entity being transferred. These headers apply only to the message
1244 being transmitted.
1245
1246 General-Header = Date ; Section 10.6
1247 | Pragma ; Section 10.12
1248
1249 General header field names can be extended reliably only in
1250 combination with a change in the protocol version. However, new or
1251 experimental header fields may be given the semantics of general
1252 header fields if all parties in the communication recognize them to
1253 be general header fields. Unrecognized header fields are treated as
1254 Entity-Header fields.
1255
1256 5. Request
1257
1258 A request message from a client to a server includes, within the
1259 first line of that message, the method to be applied to the resource,
1260 the identifier of the resource, and the protocol version in use. For
1261 backwards compatibility with the more limited HTTP/0.9 protocol,
1262 there are two valid formats for an HTTP request:
1263
1264 Request = Simple-Request | Full-Request
1265
1266 Simple-Request = "GET" SP Request-URI CRLF
1267
1268 Full-Request = Request-Line ; Section 5.1
1269 *( General-Header ; Section 4.3
1270 | Request-Header ; Section 5.2
1271 | Entity-Header ) ; Section 7.1
1272 CRLF
1273 [ Entity-Body ] ; Section 7.2
1274
1275 If an HTTP/1.0 server receives a Simple-Request, it must respond with
1276 an HTTP/0.9 Simple-Response. An HTTP/1.0 client capable of receiving
1277 a Full-Response should never generate a Simple-Request.
1278
1279 5.1 Request-Line
1280
1281 The Request-Line begins with a method token, followed by the
1282 Request-URI and the protocol version, and ending with CRLF. The
1283 elements are separated by SP characters. No CR or LF is allowed
1284 except in the final CRLF sequence.
1285
1286 Request-Line = Method SP Request-URI SP HTTP-Version CRLF
1287
1295 Note that the difference between a Simple-Request and the Request-
1296 Line of a Full-Request is the presence of the HTTP-Version field and
1297 the availability of methods other than GET.
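
A server might tell the two request forms apart as in the following
Python sketch. The function name parse_request_line and the shape of
its return value are assumptions made for this example.

```python
import re

# Request-Line = Method SP Request-URI SP HTTP-Version CRLF
_FULL = re.compile(r"(\S+) (\S+) HTTP/(\d+)\.(\d+)")
# Simple-Request = "GET" SP Request-URI CRLF
_SIMPLE = re.compile(r"GET (\S+)")


def parse_request_line(line):
    """Return (method, request_uri, version) for a line without its CRLF.

    version is (major, minor), or None for an HTTP/0.9 Simple-Request.
    """
    if "\r" in line or "\n" in line:
        raise ValueError("CR and LF are only allowed in the final CRLF")
    match = _FULL.fullmatch(line)
    if match:
        method, uri, major, minor = match.groups()
        return method, uri, (int(major), int(minor))
    match = _SIMPLE.fullmatch(line)
    if match:
        return "GET", match.group(1), None
    raise ValueError("malformed request line: %r" % (line,))
```

A line with no HTTP-Version is accepted only when its method is
exactly "GET", which is the sole method of a Simple-Request.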
1298
1299 5.1.1 Method
1300
1301 The Method token indicates the method to be performed on the resource
1302 identified by the Request-URI. The method is case-sensitive.
1303
1304 Method = "GET" ; Section 8.1
1305 | "HEAD" ; Section 8.2
1306 | "POST" ; Section 8.3
1307 | extension-method
1308
1309 extension-method = token
1310
1311 The list of methods acceptable by a specific resource can change
1312 dynamically; the client is notified through the return code of the
1313 response if a method is not allowed on a resource. Servers should
1314 return the status code 501 (not implemented) if the method is
1315 unrecognized or not implemented.
1316
1317 The methods commonly used by HTTP/1.0 applications are fully defined
1318 in Section 8.
1319
1320 5.1.2 Request-URI
1321
1322 The Request-URI is a Uniform Resource Identifier (Section 3.2) and
1323 identifies the resource upon which to apply the request.
1324
1325 Request-URI = absoluteURI | abs_path
1326
1327 The two options for Request-URI are dependent on the nature of the
1328 request.
1329
1330 The absoluteURI form is only allowed when the request is being made
1331 to a proxy. The proxy is requested to forward the request and return
1332 the response. If the request is GET or HEAD and a prior response is
1333 cached, the proxy may use the cached message if it passes any
1334 restrictions in the Expires header field. Note that the proxy may
1335 forward the request on to another proxy or directly to the server
1336 specified by the absoluteURI. In order to avoid request loops, a
1337 proxy must be able to recognize all of its server names, including
1338 any aliases, local variations, and the numeric IP address. An example
1339 Request-Line would be:
1340
1341 GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.0
1342
1351 The most common form of Request-URI is that used to identify a
1352 resource on an origin server or gateway. In this case, only the
1353 absolute path of the URI is transmitted (see Section 3.2.1,
1354 abs_path). For example, a client wishing to retrieve the resource
1355 above directly from the origin server would create a TCP connection
1356 to port 80 of the host "www.w3.org" and send the line:
1357
1358 GET /pub/WWW/TheProject.html HTTP/1.0
1359
1360 followed by the remainder of the Full-Request. Note that the absolute
1361 path cannot be empty; if none is present in the original URI, it must
1362 be given as "/" (the server root).
1363
1364 The Request-URI is transmitted as an encoded string, where some
1365 characters may be escaped using the "% HEX HEX" encoding defined by
1366 RFC 1738 [4]. The origin server must decode the Request-URI in order
1367 to properly interpret the request.
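
The choice between the two Request-URI forms, and the decoding of
escapes, can be sketched in Python as follows. The function names are
assumptions made for this example. Decoding the whole string in one
step is a simplification: an escaped reserved character becomes
indistinguishable from a literal one afterwards, so a careful server
would split the path first.

```python
from urllib.parse import unquote, urlsplit


def request_uri(url, via_proxy):
    """Return the Request-URI a client would send for an http URL."""
    if via_proxy:
        return url                       # absoluteURI form, proxies only
    parts = urlsplit(url)
    path = parts.path or "/"             # abs_path cannot be empty
    return path + ("?" + parts.query if parts.query else "")


def decode_request_uri(value):
    """Decode "%" HEX HEX escapes in a received Request-URI."""
    return unquote(value)
```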
1368
1369 5.2 Request Header Fields
1370
1371 The request header fields allow the client to pass additional
1372 information about the request, and about the client itself, to the
1373 server. These fields act as request modifiers, with semantics
1374 equivalent to the parameters on a programming language method
1375 (procedure) invocation.
1376
1377 Request-Header = Authorization ; Section 10.2
1378 | From ; Section 10.8
1379 | If-Modified-Since ; Section 10.9
1380 | Referer ; Section 10.13
1381 | User-Agent ; Section 10.15
1382
1383 Request-Header field names can be extended reliably only in
1384 combination with a change in the protocol version. However, new or
1385 experimental header fields may be given the semantics of request
1386 header fields if all parties in the communication recognize them to
1387 be request header fields. Unrecognized header fields are treated as
1388 Entity-Header fields.
1389
1390 6. Response
1391
1392 After receiving and interpreting a request message, a server responds
1393 in the form of an HTTP response message.
1394
1395 Response = Simple-Response | Full-Response
1396
1397 Simple-Response = [ Entity-Body ]
1398
1407 Full-Response = Status-Line ; Section 6.1
1408 *( General-Header ; Section 4.3
1409 | Response-Header ; Section 6.2
1410 | Entity-Header ) ; Section 7.1
1411 CRLF
1412 [ Entity-Body ] ; Section 7.2
1413
1414 A Simple-Response should only be sent in response to an HTTP/0.9
1415 Simple-Request or if the server only supports the more limited
1416 HTTP/0.9 protocol. If a client sends an HTTP/1.0 Full-Request and
1417 receives a response that does not begin with a Status-Line, it should
1418 assume that the response is a Simple-Response and parse it
1419 accordingly. Note that the Simple-Response consists only of the
1420 entity body and is terminated by the server closing the connection.
1421
1422 6.1 Status-Line
1423
1424 The first line of a Full-Response message is the Status-Line,
1425 consisting of the protocol version followed by a numeric status code
1426 and its associated textual phrase, with each element separated by SP
1427 characters. No CR or LF is allowed except in the final CRLF sequence.
1428
1429 Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
1430
1431 Since a status line always begins with the protocol version and
1432 status code
1433
1434 "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP
1435
1436 (e.g., "HTTP/1.0 200 "), the presence of that expression is
1437 sufficient to differentiate a Full-Response from a Simple-Response.
1438 Although the Simple-Response format may allow such an expression to
1439 occur at the beginning of an entity body, and thus cause a
1440 misinterpretation of the message if it was given in response to a
1441 Full-Request, most HTTP/0.9 servers are limited to responses of type
1442 "text/html" and therefore would never generate such a response.
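
A client might apply this test to the first octets of a response as
in the following Python sketch. The function name classify_response
and its return value are assumptions made for this example.

```python
import re

# "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP
_STATUS_PREFIX = re.compile(rb"HTTP/(\d+)\.(\d+) (\d{3}) ")


def classify_response(first_octets):
    """Return ("full", status_code) or ("simple", None)."""
    match = _STATUS_PREFIX.match(first_octets)
    if match is None:
        return "simple", None            # treat everything as entity body
    return "full", int(match.group(3))
```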
1443
1444 6.1.1 Status Code and Reason Phrase
1445
1446 The Status-Code element is a 3-digit integer result code of the
1447 attempt to understand and satisfy the request. The Reason-Phrase is
1448 intended to give a short textual description of the Status-Code. The
1449 Status-Code is intended for use by automata and the Reason-Phrase is
1450 intended for the human user. The client is not required to examine or
1451 display the Reason-Phrase.
1452
1463 The first digit of the Status-Code defines the class of response. The
1464 last two digits do not have any categorization role. There are 5
1465 values for the first digit:
1466
1467 o 1xx: Informational - Not used, but reserved for future use
1468
1469 o 2xx: Success - The action was successfully received,
1470 understood, and accepted.
1471
1472 o 3xx: Redirection - Further action must be taken in order to
1473 complete the request
1474
1475 o 4xx: Client Error - The request contains bad syntax or cannot
1476 be fulfilled
1477
1478 o 5xx: Server Error - The server failed to fulfill an apparently
1479 valid request
1480
1481 The individual values of the numeric status codes defined for
1482 HTTP/1.0, and an example set of corresponding Reason-Phrases, are
1483 presented below. The reason phrases listed here are only recommended
1484 -- they may be replaced by local equivalents without affecting the
1485 protocol. These codes are fully defined in Section 9.
1486
1487 Status-Code = "200" ; OK
1488 | "201" ; Created
1489 | "202" ; Accepted
1490 | "204" ; No Content
1491 | "301" ; Moved Permanently
1492 | "302" ; Moved Temporarily
1493 | "304" ; Not Modified
1494 | "400" ; Bad Request
1495 | "401" ; Unauthorized
1496 | "403" ; Forbidden
1497 | "404" ; Not Found
1498 | "500" ; Internal Server Error
1499 | "501" ; Not Implemented
1500 | "502" ; Bad Gateway
1501 | "503" ; Service Unavailable
1502 | extension-code
1503
1504 extension-code = 3DIGIT
1505
1506 Reason-Phrase = *<TEXT, excluding CR, LF>
1507
1508 HTTP status codes are extensible, but the above codes are the only
1509 ones generally recognized in current practice. HTTP applications are
1510 not required to understand the meaning of all registered status
1519 codes, though such understanding is obviously desirable. However,
1520 applications must understand the class of any status code, as
1521 indicated by the first digit, and treat any unrecognized response as
1522 being equivalent to the x00 status code of that class, with the
1523 exception that an unrecognized response must not be cached. For
1524 example, if an unrecognized status code of 431 is received by the
1525 client, it can safely assume that there was something wrong with its
1526 request and treat the response as if it had received a 400 status
1527 code. In such cases, user agents should present to the user the
1528 entity returned with the response, since that entity is likely to
1529 include human-readable information which will explain the unusual
1530 status.
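
The fallback rule for unrecognized status codes can be sketched in
Python as below. The function name interpret_status and its return
value are assumptions made for this example. The set of recognized
codes is the list given in the Status-Code grammar of this section.

```python
_DEFINED = frozenset({200, 201, 202, 204, 301, 302, 304,
                      400, 401, 403, 404, 500, 501, 502, 503})


def interpret_status(code):
    """Return (effective_code, recognized) for a received status code.

    An unrecognized code is treated as the x00 code of its class, and
    recognized is False to signal that the response must not be cached.
    """
    if not 100 <= code <= 599:
        raise ValueError("status code outside the five defined classes")
    if code in _DEFINED:
        return code, True
    return (code // 100) * 100, False
```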
1531
1532 6.2 Response Header Fields
1533
1534 The response header fields allow the server to pass additional
1535 information about the response which cannot be placed in the Status-
1536 Line. These header fields give information about the server and about
1537 further access to the resource identified by the Request-URI.
1538
1539 Response-Header = Location ; Section 10.11
1540 | Server ; Section 10.14
1541 | WWW-Authenticate ; Section 10.16
1542
1543 Response-Header field names can be extended reliably only in
1544 combination with a change in the protocol version. However, new or
1545 experimental header fields may be given the semantics of response
1546 header fields if all parties in the communication recognize them to
1547 be response header fields. Unrecognized header fields are treated as
1548 Entity-Header fields.
1549
1550 7. Entity
1551
1552 Some Full-Request and Full-Response messages transfer an entity. An
1553 entity consists of Entity-Header fields and (usually) an Entity-Body.
1554 In this section, both sender and recipient refer to either the client
1555 or the server, depending on who sends and who receives the entity.
1557
1575 7.1 Entity Header Fields
1576
1577 Entity-Header fields define optional metainformation about the
1578 Entity-Body or, if no body is present, about the resource identified
1579 by the request.
1580
1581 Entity-Header = Allow ; Section 10.1
1582 | Content-Encoding ; Section 10.3
1583 | Content-Length ; Section 10.4
1584 | Content-Type ; Section 10.5
1585 | Expires ; Section 10.7
1586 | Last-Modified ; Section 10.10
1587 | extension-header
1588
1589 extension-header = HTTP-header
1590
1591 The extension-header mechanism allows additional Entity-Header fields
1592 to be defined without changing the protocol, but these fields cannot
1593 be assumed to be recognizable by the recipient. Unrecognized header
1594 fields should be ignored by the recipient and forwarded by proxies.
1595
1596 7.2 Entity Body
1597
1598 The entity body (if any) sent with an HTTP request or response is in
1599 a format and encoding defined by the Entity-Header fields.
1600
1601 Entity-Body = *OCTET
1602
1603 An entity body is included with a request message only when the
1604 request method calls for one. The presence of an entity body in a
1605 request is signaled by the inclusion of a Content-Length header field
1606 in the request message headers. HTTP/1.0 requests containing an
1607 entity body must include a valid Content-Length header field.
1608
For response messages, whether or not an entity body is included with
a message depends on both the request method and the response
code. Responses to the HEAD request method must not include a
body, even though the presence of entity header fields may suggest
that they do. All 1xx (informational), 204 (no content), and
304 (not modified) responses must not include a body. All other
responses must include an entity body or a Content-Length header
field defined with a value of zero (0).
1617
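The response rules above can be sketched as a small predicate. This is
an illustrative sketch only, not part of the specification; the
function name response_may_have_body is hypothetical.

```python
def response_may_have_body(method: str, status: int) -> bool:
    """Return True if a response to `method` with `status` may carry an Entity-Body."""
    # Responses to HEAD never include a body, whatever the headers say.
    if method == "HEAD":
        return False
    # 1xx, 204 and 304 responses must not include a body.
    if 100 <= status < 200 or status in (204, 304):
        return False
    # All other responses carry a body (possibly of length zero).
    return True
```

A client could consult such a predicate before attempting to read any
body bytes from the connection.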
1618 7.2.1 Type
1619
1620 When an Entity-Body is included with a message, the data type of that
1621 body is determined via the header fields Content-Type and Content-
1622 Encoding. These define a two-layer, ordered encoding model:
1623
1631 entity-body := Content-Encoding( Content-Type( data ) )
1632
1633 A Content-Type specifies the media type of the underlying data. A
1634 Content-Encoding may be used to indicate any additional content
1635 coding applied to the type, usually for the purpose of data
1636 compression, that is a property of the resource requested. The
1637 default for the content encoding is none (i.e., the identity
1638 function).
1639
1640 Any HTTP/1.0 message containing an entity body should include a
1641 Content-Type header field defining the media type of that body. If
1642 and only if the media type is not given by a Content-Type header, as
1643 is the case for Simple-Response messages, the recipient may attempt
1644 to guess the media type via inspection of its content and/or the name
1645 extension(s) of the URL used to identify the resource. If the media
1646 type remains unknown, the recipient should treat it as type
1647 "application/octet-stream".
1648
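The fallback order described above might look like the following
sketch. It assumes the recipient guesses only from the URL's name
extension (content inspection is omitted), and it uses Python's
standard mimetypes module as the extension table; the function name
effective_media_type is hypothetical.

```python
import mimetypes

DEFAULT_MEDIA_TYPE = "application/octet-stream"

def effective_media_type(content_type, url_path):
    """Pick the media type of a body: declared type first, then a guess, then the default."""
    if content_type:
        # A Content-Type header was given: use it and do not guess.
        # Parameters such as charset are dropped in this sketch.
        return content_type.split(";", 1)[0].strip().lower()
    # No Content-Type (for example, a Simple-Response): guess from the extension.
    guessed, _coding = mimetypes.guess_type(url_path)
    # If the type remains unknown, treat the body as opaque octets.
    return guessed or DEFAULT_MEDIA_TYPE
```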
1649 7.2.2 Length
1650
1651 When an Entity-Body is included with a message, the length of that
1652 body may be determined in one of two ways. If a Content-Length header
1653 field is present, its value in bytes represents the length of the
1654 Entity-Body. Otherwise, the body length is determined by the closing
1655 of the connection by the server.
1656
1657 Closing the connection cannot be used to indicate the end of a
1658 request body, since it leaves no possibility for the server to send
1659 back a response. Therefore, HTTP/1.0 requests containing an entity
1660 body must include a valid Content-Length header field. If a request
1661 contains an entity body and Content-Length is not specified, and the
1662 server does not recognize or cannot calculate the length from other
1663 fields, then the server should send a 400 (bad request) response.
1664
1665 Note: Some older servers supply an invalid Content-Length when
1666 sending a document that contains server-side includes dynamically
1667 inserted into the data stream. It must be emphasized that this
1668 will not be tolerated by future versions of HTTP. Unless the
1669 client knows that it is receiving a response from a compliant
1670 server, it should not depend on the Content-Length value being
1671 correct.
1672
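The two ways of delimiting a response body can be sketched as follows.
The sketch assumes the headers have already been parsed into a dict
with lower-case field names and that the connection is exposed as a
binary file-like object; both are assumptions of this example, as is
the name read_entity_body.

```python
import io

def read_entity_body(headers, stream):
    """Read a response Entity-Body from `stream`, a binary file-like object."""
    value = headers.get("content-length", "").strip()
    if value.isascii() and value.isdigit():
        # Content-Length present: its value is the body length in bytes.
        return stream.read(int(value))
    # No usable Content-Length: the body runs until the server closes the
    # connection, which a file-like stream reports as end of file.
    return stream.read()

body = read_entity_body({"content-length": "5"}, io.BytesIO(b"helloEXTRA"))
```

As the note above warns, a client that cannot trust the server may
prefer not to rely on the Content-Length branch.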
1673 8. Method Definitions
1674
1675 The set of common methods for HTTP/1.0 is defined below. Although
1676 this set can be expanded, additional methods cannot be assumed to
1677 share the same semantics for separately extended clients and servers.
1678
1687 8.1 GET
1688
1689 The GET method means retrieve whatever information (in the form of an
1690 entity) is identified by the Request-URI. If the Request-URI refers
1691 to a data-producing process, it is the produced data which shall be
1692 returned as the entity in the response and not the source text of the
1693 process, unless that text happens to be the output of the process.
1694
The semantics of the GET method change to a "conditional GET" if the
request message includes an If-Modified-Since header field. A
conditional GET method requests that the identified resource be
transferred only if it has been modified since the date given by the
If-Modified-Since header, as described in Section 10.9. The
conditional GET method is intended to reduce network usage by
allowing cached entities to be refreshed without requiring multiple
requests or transferring unnecessary data.
1703
1704 8.2 HEAD
1705
1706 The HEAD method is identical to GET except that the server must not
1707 return any Entity-Body in the response. The metainformation contained
1708 in the HTTP headers in response to a HEAD request should be identical
1709 to the information sent in response to a GET request. This method can
1710 be used for obtaining metainformation about the resource identified
1711 by the Request-URI without transferring the Entity-Body itself. This
1712 method is often used for testing hypertext links for validity,
1713 accessibility, and recent modification.
1714
1715 There is no "conditional HEAD" request analogous to the conditional
1716 GET. If an If-Modified-Since header field is included with a HEAD
1717 request, it should be ignored.
1718
1719 8.3 POST
1720
1721 The POST method is used to request that the destination server accept
1722 the entity enclosed in the request as a new subordinate of the
1723 resource identified by the Request-URI in the Request-Line. POST is
1724 designed to allow a uniform method to cover the following functions:
1725
1726 o Annotation of existing resources;
1727
1728 o Posting a message to a bulletin board, newsgroup, mailing list,
1729 or similar group of articles;
1730
1731 o Providing a block of data, such as the result of submitting a
1732 form [3], to a data-handling process;
1733
1734 o Extending a database through an append operation.
1735
1743 The actual function performed by the POST method is determined by the
1744 server and is usually dependent on the Request-URI. The posted entity
1745 is subordinate to that URI in the same way that a file is subordinate
1746 to a directory containing it, a news article is subordinate to a
1747 newsgroup to which it is posted, or a record is subordinate to a
1748 database.
1749
1750 A successful POST does not require that the entity be created as a
1751 resource on the origin server or made accessible for future
1752 reference. That is, the action performed by the POST method might not
1753 result in a resource that can be identified by a URI. In this case,
1754 either 200 (ok) or 204 (no content) is the appropriate response
1755 status, depending on whether or not the response includes an entity
1756 that describes the result.
1757
1758 If a resource has been created on the origin server, the response
1759 should be 201 (created) and contain an entity (preferably of type
1760 "text/html") which describes the status of the request and refers to
1761 the new resource.
1762
1763 A valid Content-Length is required on all HTTP/1.0 POST requests. An
1764 HTTP/1.0 server should respond with a 400 (bad request) message if it
1765 cannot determine the length of the request message's content.
1766
1767 Applications must not cache responses to a POST request because the
1768 application has no way of knowing that the server would return an
1769 equivalent response on some future request.
1770
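The status codes named in this section could be selected as in the
sketch below. It assumes the server has already decided whether the
POST created a resource and whether it has an entity describing the
result; the function name post_status and its parameters are
hypothetical.

```python
def post_status(headers, created, has_result_entity):
    """Choose a status code for a POST, following the rules in this section."""
    length = headers.get("content-length", "").strip()
    # A valid Content-Length is required on all HTTP/1.0 POST requests.
    if not (length.isascii() and length.isdigit()):
        return 400
    # A new resource was created on the origin server.
    if created:
        return 201
    # No identifiable resource resulted: 200 with an entity, else 204.
    return 200 if has_result_entity else 204
```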
1771 9. Status Code Definitions
1772
1773 Each Status-Code is described below, including a description of which
1774 method(s) it can follow and any metainformation required in the
1775 response.
1776
1777 9.1 Informational 1xx
1778
This class of status code indicates a provisional response,
consisting only of the Status-Line and optional headers, and is
terminated by an empty line. HTTP/1.0 does not define any 1xx status
codes and they are not a valid response to an HTTP/1.0 request.
However, they may be useful for experimental applications which are
outside the scope of this specification.
1785
1786 9.2 Successful 2xx
1787
1788 This class of status code indicates that the client's request was
1789 successfully received, understood, and accepted.
1790
1799 200 OK
1800
1801 The request has succeeded. The information returned with the
1802 response is dependent on the method used in the request, as follows:
1803
1804 GET an entity corresponding to the requested resource is sent
1805 in the response;
1806
1807 HEAD the response must only contain the header information and
1808 no Entity-Body;
1809
1810 POST an entity describing or containing the result of the action.
1811
1812 201 Created
1813
1814 The request has been fulfilled and resulted in a new resource being
1815 created. The newly created resource can be referenced by the URI(s)
1816 returned in the entity of the response. The origin server should
1817 create the resource before using this Status-Code. If the action
1818 cannot be carried out immediately, the server must include in the
1819 response body a description of when the resource will be available;
1820 otherwise, the server should respond with 202 (accepted).
1821
1822 Of the methods defined by this specification, only POST can create a
1823 resource.
1824
1825 202 Accepted
1826
1827 The request has been accepted for processing, but the processing
1828 has not been completed. The request may or may not eventually be
1829 acted upon, as it may be disallowed when processing actually takes
1830 place. There is no facility for re-sending a status code from an
1831 asynchronous operation such as this.
1832
1833 The 202 response is intentionally non-committal. Its purpose is to
1834 allow a server to accept a request for some other process (perhaps
1835 a batch-oriented process that is only run once per day) without
1836 requiring that the user agent's connection to the server persist
1837 until the process is completed. The entity returned with this
1838 response should include an indication of the request's current
1839 status and either a pointer to a status monitor or some estimate of
1840 when the user can expect the request to be fulfilled.
1841
1842 204 No Content
1843
1844 The server has fulfilled the request but there is no new
1845 information to send back. If the client is a user agent, it should
1846 not change its document view from that which caused the request to
1855 be generated. This response is primarily intended to allow input
1856 for scripts or other actions to take place without causing a change
1857 to the user agent's active document view. The response may include
1858 new metainformation in the form of entity headers, which should
1859 apply to the document currently in the user agent's active view.
1860
1861 9.3 Redirection 3xx
1862
1863 This class of status code indicates that further action needs to be
1864 taken by the user agent in order to fulfill the request. The action
1865 required may be carried out by the user agent without interaction
1866 with the user if and only if the method used in the subsequent
1867 request is GET or HEAD. A user agent should never automatically
1868 redirect a request more than 5 times, since such redirections usually
1869 indicate an infinite loop.
1870
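A user agent might encode the two constraints above (method and
redirect count) as in this sketch. The names MAX_REDIRECTS and
may_auto_redirect are hypothetical.

```python
MAX_REDIRECTS = 5

def may_auto_redirect(next_method, redirects_so_far):
    """Return True if the user agent may follow a redirect without asking the user."""
    # Only GET and HEAD may be redirected without user interaction.
    if next_method not in ("GET", "HEAD"):
        return False
    # Never follow more than five automatic redirections for one request.
    return redirects_so_far < MAX_REDIRECTS
```

When the predicate returns False for a POST, the user agent would need
to obtain confirmation from the user before repeating the request.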
1871 300 Multiple Choices
1872
1873 This response code is not directly used by HTTP/1.0 applications,
1874 but serves as the default for interpreting the 3xx class of
1875 responses.
1876
1877 The requested resource is available at one or more locations.
1878 Unless it was a HEAD request, the response should include an entity
1879 containing a list of resource characteristics and locations from
1880 which the user or user agent can choose the one most appropriate.
1881 If the server has a preferred choice, it should include the URL in
1882 a Location field; user agents may use this field value for
1883 automatic redirection.
1884
1885 301 Moved Permanently
1886
1887 The requested resource has been assigned a new permanent URL and
1888 any future references to this resource should be done using that
1889 URL. Clients with link editing capabilities should automatically
1890 relink references to the Request-URI to the new reference returned
1891 by the server, where possible.
1892
1893 The new URL must be given by the Location field in the response.
1894 Unless it was a HEAD request, the Entity-Body of the response
1895 should contain a short note with a hyperlink to the new URL.
1896
1897 If the 301 status code is received in response to a request using
1898 the POST method, the user agent must not automatically redirect the
1899 request unless it can be confirmed by the user, since this might
1900 change the conditions under which the request was issued.
1901
1911 Note: When automatically redirecting a POST request after
1912 receiving a 301 status code, some existing user agents will
1913 erroneously change it into a GET request.
1914
1915 302 Moved Temporarily
1916
1917 The requested resource resides temporarily under a different URL.
1918 Since the redirection may be altered on occasion, the client should
1919 continue to use the Request-URI for future requests.
1920
1921 The URL must be given by the Location field in the response. Unless
1922 it was a HEAD request, the Entity-Body of the response should
1923 contain a short note with a hyperlink to the new URI(s).
1924
1925 If the 302 status code is received in response to a request using
1926 the POST method, the user agent must not automatically redirect the
1927 request unless it can be confirmed by the user, since this might
1928 change the conditions under which the request was issued.
1929
1930 Note: When automatically redirecting a POST request after
1931 receiving a 302 status code, some existing user agents will
1932 erroneously change it into a GET request.
1933
1934 304 Not Modified
1935
1936 If the client has performed a conditional GET request and access is
1937 allowed, but the document has not been modified since the date and
1938 time specified in the If-Modified-Since field, the server must
1939 respond with this status code and not send an Entity-Body to the
1940 client. Header fields contained in the response should only include
1941 information which is relevant to cache managers or which may have
1942 changed independently of the entity's Last-Modified date. Examples
1943 of relevant header fields include: Date, Server, and Expires. A
1944 cache should update its cached entity to reflect any new field
1945 values given in the 304 response.
1946
1947 9.4 Client Error 4xx
1948
1949 The 4xx class of status code is intended for cases in which the
1950 client seems to have erred. If the client has not completed the
1951 request when a 4xx code is received, it should immediately cease
1952 sending data to the server. Except when responding to a HEAD request,
1953 the server should include an entity containing an explanation of the
1954 error situation, and whether it is a temporary or permanent
1955 condition. These status codes are applicable to any request method.
1956
1967 Note: If the client is sending data, server implementations on TCP
1968 should be careful to ensure that the client acknowledges receipt
1969 of the packet(s) containing the response prior to closing the
1970 input connection. If the client continues sending data to the
1971 server after the close, the server's controller will send a reset
1972 packet to the client, which may erase the client's unacknowledged
1973 input buffers before they can be read and interpreted by the HTTP
1974 application.
1975
1976 400 Bad Request
1977
1978 The request could not be understood by the server due to malformed
1979 syntax. The client should not repeat the request without
1980 modifications.
1981
1982 401 Unauthorized
1983
1984 The request requires user authentication. The response must include
1985 a WWW-Authenticate header field (Section 10.16) containing a
1986 challenge applicable to the requested resource. The client may
1987 repeat the request with a suitable Authorization header field
1988 (Section 10.2). If the request already included Authorization
1989 credentials, then the 401 response indicates that authorization has
1990 been refused for those credentials. If the 401 response contains
1991 the same challenge as the prior response, and the user agent has
1992 already attempted authentication at least once, then the user
1993 should be presented the entity that was given in the response,
1994 since that entity may include relevant diagnostic information. HTTP
1995 access authentication is explained in Section 11.
1996
1997 403 Forbidden
1998
1999 The server understood the request, but is refusing to fulfill it.
2000 Authorization will not help and the request should not be repeated.
2001 If the request method was not HEAD and the server wishes to make
2002 public why the request has not been fulfilled, it should describe
2003 the reason for the refusal in the entity body. This status code is
2004 commonly used when the server does not wish to reveal exactly why
2005 the request has been refused, or when no other response is
2006 applicable.
2007
2008 404 Not Found
2009
2010 The server has not found anything matching the Request-URI. No
2011 indication is given of whether the condition is temporary or
2012 permanent. If the server does not wish to make this information
2013 available to the client, the status code 403 (forbidden) can be
2014 used instead.
2015
2023 9.5 Server Error 5xx
2024
2025 Response status codes beginning with the digit "5" indicate cases in
2026 which the server is aware that it has erred or is incapable of
2027 performing the request. If the client has not completed the request
2028 when a 5xx code is received, it should immediately cease sending data
2029 to the server. Except when responding to a HEAD request, the server
2030 should include an entity containing an explanation of the error
2031 situation, and whether it is a temporary or permanent condition.
2032 These response codes are applicable to any request method and there
2033 are no required header fields.
2034
2035 500 Internal Server Error
2036
2037 The server encountered an unexpected condition which prevented it
2038 from fulfilling the request.
2039
2040 501 Not Implemented
2041
2042 The server does not support the functionality required to fulfill
2043 the request. This is the appropriate response when the server does
2044 not recognize the request method and is not capable of supporting
2045 it for any resource.
2046
2047 502 Bad Gateway
2048
2049 The server, while acting as a gateway or proxy, received an invalid
2050 response from the upstream server it accessed in attempting to
2051 fulfill the request.
2052
2053 503 Service Unavailable
2054
2055 The server is currently unable to handle the request due to a
2056 temporary overloading or maintenance of the server. The implication
2057 is that this is a temporary condition which will be alleviated
2058 after some delay.
2059
2060 Note: The existence of the 503 status code does not imply
2061 that a server must use it when becoming overloaded. Some
2062 servers may wish to simply refuse the connection.
2063
2064 10. Header Field Definitions
2065
2066 This section defines the syntax and semantics of all commonly used
2067 HTTP/1.0 header fields. For general and entity header fields, both
2068 sender and recipient refer to either the client or the server,
2069 depending on who sends and who receives the message.
2070
2079 10.1 Allow
2080
2081 The Allow entity-header field lists the set of methods supported by
2082 the resource identified by the Request-URI. The purpose of this field
2083 is strictly to inform the recipient of valid methods associated with
2084 the resource. The Allow header field is not permitted in a request
2085 using the POST method, and thus should be ignored if it is received
2086 as part of a POST entity.
2087
2088 Allow = "Allow" ":" 1#method
2089
2090 Example of use:
2091
2092 Allow: GET, HEAD
2093
2094 This field cannot prevent a client from trying other methods.
2095 However, the indications given by the Allow header field value should
2096 be followed. The actual set of allowed methods is defined by the
2097 origin server at the time of each request.
2098
2099 A proxy must not modify the Allow header field even if it does not
2100 understand all the methods specified, since the user agent may have
2101 other means of communicating with the origin server.
2102
2103 The Allow header field does not indicate what methods are implemented
2104 by the server.
2105
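A recipient could split an Allow field value into method names as in
the sketch below. It assumes the field value has already been
separated from the field name, and it tolerates empty list elements;
the function name parse_allow is hypothetical.

```python
def parse_allow(value):
    """Split an Allow field value such as "GET, HEAD" into a list of method names."""
    # Elements are separated by commas with optional surrounding whitespace;
    # empty elements are skipped.
    return [item.strip() for item in value.split(",") if item.strip()]

methods = parse_allow("GET, HEAD")
```

Method names are case-sensitive, so the sketch does not change their
case.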
2106 10.2 Authorization
2107
2108 A user agent that wishes to authenticate itself with a server--
2109 usually, but not necessarily, after receiving a 401 response--may do
2110 so by including an Authorization request-header field with the
2111 request. The Authorization field value consists of credentials
2112 containing the authentication information of the user agent for the
2113 realm of the resource being requested.
2114
2115 Authorization = "Authorization" ":" credentials
2116
2117 HTTP access authentication is described in Section 11. If a request
2118 is authenticated and a realm specified, the same credentials should
2119 be valid for all other requests within this realm.
2120
Responses to requests containing an Authorization field are not
cacheable.
2123
2135 10.3 Content-Encoding
2136
2137 The Content-Encoding entity-header field is used as a modifier to the
2138 media-type. When present, its value indicates what additional content
2139 coding has been applied to the resource, and thus what decoding
2140 mechanism must be applied in order to obtain the media-type
2141 referenced by the Content-Type header field. The Content-Encoding is
2142 primarily used to allow a document to be compressed without losing
2143 the identity of its underlying media type.
2144
2145 Content-Encoding = "Content-Encoding" ":" content-coding
2146
2147 Content codings are defined in Section 3.5. An example of its use is
2148
2149 Content-Encoding: x-gzip
2150
2151 The Content-Encoding is a characteristic of the resource identified
2152 by the Request-URI. Typically, the resource is stored with this
2153 encoding and is only decoded before rendering or analogous usage.
2154
2155 10.4 Content-Length
2156
2157 The Content-Length entity-header field indicates the size of the
2158 Entity-Body, in decimal number of octets, sent to the recipient or,
2159 in the case of the HEAD method, the size of the Entity-Body that
2160 would have been sent had the request been a GET.
2161
2162 Content-Length = "Content-Length" ":" 1*DIGIT
2163
2164 An example is
2165
2166 Content-Length: 3495
2167
2168 Applications should use this field to indicate the size of the
2169 Entity-Body to be transferred, regardless of the media type of the
2170 entity. A valid Content-Length field value is required on all
2171 HTTP/1.0 request messages containing an entity body.
2172
2173 Any Content-Length greater than or equal to zero is a valid value.
2174 Section 7.2.2 describes how to determine the length of a response
2175 entity body if a Content-Length is not given.
2176
2177 Note: The meaning of this field is significantly different from
2178 the corresponding definition in MIME, where it is an optional
2179 field used within the "message/external-body" content-type. In
2180 HTTP, it should be used whenever the entity's length can be
2181 determined prior to being transferred.
2182
2191 10.5 Content-Type
2192
2193 The Content-Type entity-header field indicates the media type of the
2194 Entity-Body sent to the recipient or, in the case of the HEAD method,
2195 the media type that would have been sent had the request been a GET.
2196
2197 Content-Type = "Content-Type" ":" media-type
2198
2199 Media types are defined in Section 3.6. An example of the field is
2200
2201 Content-Type: text/html
2202
2203 Further discussion of methods for identifying the media type of an
2204 entity is provided in Section 7.2.1.
2205
2206 10.6 Date
2207
2208 The Date general-header field represents the date and time at which
2209 the message was originated, having the same semantics as orig-date in
2210 RFC 822. The field value is an HTTP-date, as described in Section
2211 3.3.
2212
2213 Date = "Date" ":" HTTP-date
2214
2215 An example is
2216
2217 Date: Tue, 15 Nov 1994 08:12:31 GMT
2218
2219 If a message is received via direct connection with the user agent
2220 (in the case of requests) or the origin server (in the case of
2221 responses), then the date can be assumed to be the current date at
2222 the receiving end. However, since the date--as it is believed by the
2223 origin--is important for evaluating cached responses, origin servers
2224 should always include a Date header. Clients should only send a Date
2225 header field in messages that include an entity body, as in the case
2226 of the POST request, and even then it is optional. A received message
2227 which does not have a Date header field should be assigned one by the
2228 recipient if the message will be cached by that recipient or
2229 gatewayed via a protocol which requires a Date.
2230
2231 In theory, the date should represent the moment just before the
2232 entity is generated. In practice, the date can be generated at any
2233 time during the message origination without affecting its semantic
2234 value.
2235
2236 Note: An earlier version of this document incorrectly specified
2237 that this field should contain the creation date of the enclosed
2238 Entity-Body. This has been changed to reflect actual (and proper)
2247 usage.
2248
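An origin server could produce a Date field value in the form shown in
the example above with Python's standard email.utils module, as in
this sketch. The function name http_date is hypothetical, and the
sketch assumes the caller supplies a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def http_date(moment):
    """Format a timezone-aware datetime as a GMT date string for the Date field."""
    # Convert to UTC first; format_datetime only emits "GMT" for UTC values.
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)

example = http_date(datetime(1994, 11, 15, 8, 12, 31, tzinfo=timezone.utc))
```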
2249 10.7 Expires
2250
2251 The Expires entity-header field gives the date/time after which the
2252 entity should be considered stale. This allows information providers
2253 to suggest the volatility of the resource, or a date after which the
2254 information may no longer be valid. Applications must not cache this
2255 entity beyond the date given. The presence of an Expires field does
2256 not imply that the original resource will change or cease to exist
2257 at, before, or after that time. However, information providers that
2258 know or even suspect that a resource will change by a certain date
2259 should include an Expires header with that date. The format is an
2260 absolute date and time as defined by HTTP-date in Section 3.3.
2261
2262 Expires = "Expires" ":" HTTP-date
2263
2264 An example of its use is
2265
2266 Expires: Thu, 01 Dec 1994 16:00:00 GMT
2267
2268 If the date given is equal to or earlier than the value of the Date
2269 header, the recipient must not cache the enclosed entity. If a
2270 resource is dynamic by nature, as is the case with many data-
2271 producing processes, entities from that resource should be given an
2272 appropriate Expires value which reflects that dynamism.
2273
2274 The Expires field cannot be used to force a user agent to refresh its
2275 display or reload a resource; its semantics apply only to caching
2276 mechanisms, and such mechanisms need only check a resource's
2277 expiration status when a new request for that resource is initiated.
2278
2279 User agents often have history mechanisms, such as "Back" buttons and
2280 history lists, which can be used to redisplay an entity retrieved
2281 earlier in a session. By default, the Expires field does not apply to
2282 history mechanisms. If the entity is still in storage, a history
2283 mechanism should display it even if the entity has expired, unless
2284 the user has specifically configured the agent to refresh expired
2285 history documents.
2286
Note: Applications are encouraged to be tolerant of bad or
misinformed implementations of the Expires header. A value of zero
(0) or an invalid date format should be treated as equivalent to
"expires immediately." Although these values are not legitimate
for HTTP/1.0, a robust implementation is always desirable.
2292
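The caching rule and the tolerance recommended in the note above might
be combined as in this sketch. It assumes both the Expires and Date
field values are available as strings, and it conservatively refuses
to cache when either cannot be parsed (the latter is a choice made for
this example). The names parse_http_date and may_cache are
hypothetical.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

def parse_http_date(value):
    """Return an aware datetime for a date string, or None if it cannot be parsed."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    # Treat a date without zone information as GMT.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed

def may_cache(expires_value, date_value):
    """Return True if an entity with these Expires and Date values may be cached."""
    expires = parse_http_date(expires_value)
    date = parse_http_date(date_value)
    # "0" or any invalid Expires value is treated as "expires immediately".
    if expires is None or date is None:
        return False
    # An Expires date equal to or earlier than Date forbids caching.
    return expires > date
```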
2303 10.8 From
2304
2305 The From request-header field, if given, should contain an Internet
2306 e-mail address for the human user who controls the requesting user
2307 agent. The address should be machine-usable, as defined by mailbox in
2308 RFC 822 [7] (as updated by RFC 1123 [6]):
2309
2310 From = "From" ":" mailbox
2311
2312 An example is:
2313
2314 From: webmaster@w3.org
2315
2316 This header field may be used for logging purposes and as a means for
2317 identifying the source of invalid or unwanted requests. It should not
2318 be used as an insecure form of access protection. The interpretation
2319 of this field is that the request is being performed on behalf of the
2320 person given, who accepts responsibility for the method performed. In
2321 particular, robot agents should include this header so that the
2322 person responsible for running the robot can be contacted if problems
2323 occur on the receiving end.
2324
2325 The Internet e-mail address in this field may be separate from the
2326 Internet host which issued the request. For example, when a request
2327 is passed through a proxy, the original issuer's address should be
2328 used.
2329
2330 Note: The client should not send the From header field without the
2331 user's approval, as it may conflict with the user's privacy
2332 interests or their site's security policy. It is strongly
2333 recommended that the user be able to disable, enable, and modify
2334 the value of this field at any time prior to a request.
2335
2336 10.9 If-Modified-Since
2337
2338 The If-Modified-Since request-header field is used with the GET
2339 method to make it conditional: if the requested resource has not been
2340 modified since the time specified in this field, a copy of the
2341 resource will not be returned from the server; instead, a 304 (not
2342 modified) response will be returned without any Entity-Body.
2343
2344 If-Modified-Since = "If-Modified-Since" ":" HTTP-date
2345
2346 An example of the field is:
2347
2348 If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
2349
2357
2358
2359 A conditional GET method requests that the identified resource be
2360 transferred only if it has been modified since the date given by the
2361 If-Modified-Since header. The algorithm for determining this includes
2362 the following cases:
2363
2364 a) If the request would normally result in anything other than
2365 a 200 (ok) status, or if the passed If-Modified-Since date
2366 is invalid, the response is exactly the same as for a
2367 normal GET. A date which is later than the server's current
2368 time is invalid.
2369
2370 b) If the resource has been modified since the
2371 If-Modified-Since date, the response is exactly the same as
2372 for a normal GET.
2373
2374 c) If the resource has not been modified since a valid
2375 If-Modified-Since date, the server shall return a 304 (not
2376 modified) response.
2377
2378 The purpose of this feature is to allow efficient updates of cached
2379 information with a minimum amount of transaction overhead.
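The three cases above can be sketched as follows. This is an illustration only and not part of the specification: the function name conditional_get_status and its parameters are assumptions made for the example, and the date parsing relies on Python's standard email.utils module, which may not accept every HTTP-date form listed in Section 3.3.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(normal_status, last_modified, ims_header, now):
    """Return the status code for a GET that carries If-Modified-Since.

    normal_status -- status an unconditional GET would yield (hypothetical input)
    last_modified -- timezone-aware datetime of the resource's last change
    ims_header    -- raw If-Modified-Since field value
    now           -- the server's current time (timezone-aware datetime)
    """
    # Case (a): anything other than 200 is returned exactly as for a normal GET.
    if normal_status != 200:
        return normal_status
    try:
        since = parsedate_to_datetime(ims_header)
    except (TypeError, ValueError):
        # Case (a): an unparsable date is invalid, so behave like a normal GET.
        return normal_status
    if since.tzinfo is None:
        # Assumption of this sketch: a date without zone information is GMT.
        since = since.replace(tzinfo=timezone.utc)
    # Case (a): a date later than the server's current time is invalid.
    if since > now:
        return normal_status
    # Case (b): modified since the given date, so send the full response.
    if last_modified > since:
        return 200
    # Case (c): not modified since a valid date.
    return 304
```

A resource whose modification time equals the supplied date is treated as not modified, which matches the wording "has not been modified since".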
2380
2381 10.10 Last-Modified
2382
2383 The Last-Modified entity-header field indicates the date and time at
2384 which the sender believes the resource was last modified. The exact
2385 semantics of this field are defined in terms of how the recipient
2386 should interpret it: if the recipient has a copy of this resource
2387 which is older than the date given by the Last-Modified field, that
2388 copy should be considered stale.
2389
2390 Last-Modified = "Last-Modified" ":" HTTP-date
2391
2392 An example of its use is:
2393
2394 Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT
2395
2396 The exact meaning of this header field depends on the implementation
2397 of the sender and the nature of the original resource. For files, it
2398 may be just the file system last-modified time. For entities with
2399 dynamically included parts, it may be the most recent of the set of
2400 last-modify times for its component parts. For database gateways, it
2401 may be the last-update timestamp of the record. For virtual objects,
2402 it may be the last time the internal state changed.
2403
2404 An origin server must not send a Last-Modified date which is later
2405 than the server's time of message origination. In such cases, where
2406 the resource's last modification would indicate some time in the
2415 future, the server must replace that date with the message
2416 origination date.
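The rule that a Last-Modified date must not lie in the future can be sketched as a simple clamp. The function name last_modified_field is an assumption made for this example, and both arguments are assumed to be timezone-aware datetimes in UTC.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def last_modified_field(mtime, origination):
    """Build a Last-Modified header line, never later than the message date.

    mtime       -- resource modification time (aware datetime in UTC)
    origination -- time of message origination (aware datetime in UTC)
    """
    # A modification time in the future is replaced by the origination date.
    effective = min(mtime, origination)
    # usegmt=True yields the RFC 1123 form ending in "GMT".
    return "Last-Modified: " + format_datetime(effective, usegmt=True)
```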
2417
2418 10.11 Location
2419
2420 The Location response-header field defines the exact location of the
2421 resource that was identified by the Request-URI. For 3xx responses,
2422 the location must indicate the server's preferred URL for automatic
2423 redirection to the resource. Only one absolute URL is allowed.
2424
2425 Location = "Location" ":" absoluteURI
2426
2427 An example is:
2428
2429 Location: http://www.w3.org/hypertext/WWW/NewLocation.html
2430
2431 10.12 Pragma
2432
2433 The Pragma general-header field is used to include implementation-
2434 specific directives that may apply to any recipient along the
2435 request/response chain. All pragma directives specify optional
2436 behavior from the viewpoint of the protocol; however, some systems
2437 may require that behavior be consistent with the directives.
2438
2439 Pragma = "Pragma" ":" 1#pragma-directive
2440
2441 pragma-directive = "no-cache" | extension-pragma
2442 extension-pragma = token [ "=" word ]
2443
2444 When the "no-cache" directive is present in a request message, an
2445 application should forward the request toward the origin server even
2446 if it has a cached copy of what is being requested. This allows a
2447 client to insist upon receiving an authoritative response to its
2448 request. It also allows a client to refresh a cached copy which is
2449 known to be corrupted or stale.
2450
2451 Pragma directives must be passed through by a proxy or gateway
2452 application, regardless of their significance to that application,
2453 since the directives may be applicable to all recipients along the
2454 request/response chain. It is not possible to specify a pragma for a
2455 specific recipient; however, any pragma directive not relevant to a
2456 recipient should be ignored by that recipient.
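A cache or proxy might inspect the Pragma field along the following lines. This is a sketch under stated assumptions: the names parse_pragma and must_bypass_cache are hypothetical, directive names are lower-cased for comparison as a convenience of the example, and an extension value given as a quoted-string containing a comma is not handled.

```python
def parse_pragma(value):
    """Split a Pragma field value into {directive: argument-or-None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty list elements
        name, sep, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def must_bypass_cache(request_headers):
    """True if the request asks to be forwarded toward the origin server."""
    pragma = request_headers.get("Pragma", "")
    return "no-cache" in parse_pragma(pragma)
```

Directives the recipient does not recognize stay in the parsed result but are otherwise ignored; a proxy would still forward the original field unchanged.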
2457
2458 10.13 Referer
2459
2460 The Referer request-header field allows the client to specify, for
2461 the server's benefit, the address (URI) of the resource from which
2462 the Request-URI was obtained. This allows a server to generate lists
2471 of back-links to resources for interest, logging, optimized caching,
2472 etc. It also allows obsolete or mistyped links to be traced for
2473 maintenance. The Referer field must not be sent if the Request-URI
2474 was obtained from a source that does not have its own URI, such as
2475 input from the user keyboard.
2476
2477 Referer = "Referer" ":" ( absoluteURI | relativeURI )
2478
2479 Example:
2480
2481 Referer: http://www.w3.org/hypertext/DataSources/Overview.html
2482
2483 If a partial URI is given, it should be interpreted relative to the
2484 Request-URI. The URI must not include a fragment.
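Both rules can be illustrated with Python's standard urllib.parse module. The function names referer_value and resolve_referer are assumptions made for this sketch; the first shows a client dropping the fragment before sending, the second shows a server interpreting a partial URI relative to the Request-URI.

```python
from urllib.parse import urldefrag, urljoin


def referer_value(source_uri):
    """Client side: the URI to send in Referer, with any fragment removed."""
    return urldefrag(source_uri).url


def resolve_referer(request_uri, referer):
    """Server side: interpret a partial Referer relative to the Request-URI."""
    return urljoin(request_uri, referer)
```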
2485
2486 Note: Because the source of a link may be private information or
2487 may reveal an otherwise private information source, it is strongly
2488 recommended that the user be able to select whether or not the
2489 Referer field is sent. For example, a browser client could have a
2490 toggle switch for browsing openly/anonymously, which would
2491 respectively enable/disable the sending of Referer and From
2492 information.
2493
2494 10.14 Server
2495
2496 The Server response-header field contains information about the
2497 software used by the origin server to handle the request. The field
2498 can contain multiple product tokens (Section 3.7) and comments
2499 identifying the server and any significant subproducts. By
2500 convention, the product tokens are listed in order of their
2501 significance for identifying the application.
2502
2503 Server = "Server" ":" 1*( product | comment )
2504
2505 Example:
2506
2507 Server: CERN/3.0 libwww/2.17
2508
2509 If the response is being forwarded through a proxy, the proxy
2510 application must not add its data to the product list.
2511
2512 Note: Revealing the specific software version of the server may
2513 allow the server machine to become more vulnerable to attacks
2514 against software that is known to contain security holes. Server
2515 implementors are encouraged to make this field a configurable
2516 option.
2517
2527 Note: Some existing servers fail to restrict themselves to the
2528 product token syntax within the Server field.
2529
2530 10.15 User-Agent
2531
2532 The User-Agent request-header field contains information about the
2533 user agent originating the request. This is for statistical purposes,
2534 the tracing of protocol violations, and automated recognition of user
2535 agents for the sake of tailoring responses to avoid particular user
2536 agent limitations. Although it is not required, user agents should
2537 include this field with requests. The field can contain multiple
2538 product tokens (Section 3.7) and comments identifying the agent and
2539 any subproducts which form a significant part of the user agent. By
2540 convention, the product tokens are listed in order of their
2541 significance for identifying the application.
2542
2543 User-Agent = "User-Agent" ":" 1*( product | comment )
2544
2545 Example:
2546
2547 User-Agent: CERN-LineMode/2.15 libwww/2.17b3
2548
2549 Note: Some current proxy applications append their product
2550 information to the list in the User-Agent field. This is not
2551 recommended, since it makes machine interpretation of these
2552 fields ambiguous.
2553
2554 Note: Some existing clients fail to restrict themselves to
2555 the product token syntax within the User-Agent field.
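A recipient could split a User-Agent or Server value into product tokens and comments roughly as shown below. The function name parse_products is hypothetical, nested comments are not handled, and, as the notes above warn, real field values may not follow the product token syntax at all.

```python
import re

# Either a parenthesized comment, or token ["/" version] as in Section 3.7.
_ITEM = re.compile(r"\(([^()]*)\)|([^\s()/]+)(?:/([^\s()]+))?")


def parse_products(value):
    """Return a list of ("product", name, version) and ("comment", text)."""
    items = []
    for m in _ITEM.finditer(value):
        if m.group(1) is not None:
            items.append(("comment", m.group(1)))
        else:
            # The version is None when the product has no "/" part.
            items.append(("product", m.group(2), m.group(3)))
    return items
```

The list order is preserved, so by the convention above the first product entry is the most significant one.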
2556
2557 10.16 WWW-Authenticate
2558
2559 The WWW-Authenticate response-header field must be included in 401
2560 (unauthorized) response messages. The field value consists of at
2561 least one challenge that indicates the authentication scheme(s) and
2562 parameters applicable to the Request-URI.
2563
2564 WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge
2565
2566 The HTTP access authentication process is described in Section 11.
2567 User agents must take special care in parsing the WWW-Authenticate
2568 field value if it contains more than one challenge, or if more than
2569 one WWW-Authenticate header field is provided, since the contents of
2570 a challenge may itself contain a comma-separated list of
2571 authentication parameters.
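The parsing difficulty arises because a comma may separate two challenges, separate two parameters of one challenge, or appear inside a quoted-string. One possible approach, sketched here with the hypothetical function parse_challenges, treats a token followed by whitespace and then "token=" as the start of a new challenge. The scheme name "Other" used in the test of this sketch is invented for illustration.

```python
import re

# token "=" quoted-string, optionally preceded by a separating comma.
_PARAM = re.compile(r'\s*,?\s*([^\s=,"]+)\s*=\s*"((?:[^"\\]|\\.)*)"')
# A scheme name: a token followed by whitespace and then another "token=".
_SCHEME = re.compile(r'\s*,?\s*([^\s=,"]+)\s+(?=[^\s=,"]+\s*=)')


def parse_challenges(value):
    """Return a list of (auth-scheme, {param-name: value}) tuples."""
    challenges = []
    pos = 0
    while pos < len(value):
        m = _SCHEME.match(value, pos)
        if m:
            challenges.append((m.group(1), {}))
            pos = m.end()
            continue
        m = _PARAM.match(value, pos)
        if m and challenges:
            # Parameter names such as "realm" are case-insensitive.
            challenges[-1][1][m.group(1).lower()] = m.group(2)
            pos = m.end()
            continue
        break  # stop at anything this sketch does not understand
    return challenges
```

When several WWW-Authenticate header fields are present, the same function can be applied to each field value and the results concatenated.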
2572
2583 11. Access Authentication
2584
2585 HTTP provides a simple challenge-response authentication mechanism
2586 which may be used by a server to challenge a client request and by a
2587 client to provide authentication information. It uses an extensible,
2588 case-insensitive token to identify the authentication scheme,
2589 followed by a comma-separated list of attribute-value pairs which
2590 carry the parameters necessary for achieving authentication via that
2591 scheme.
2592
2593 auth-scheme = token
2594
2595 auth-param = token "=" quoted-string
2596
2597 The 401 (unauthorized) response message is used by an origin server
2598 to challenge the authorization of a user agent. This response must
2599 include a WWW-Authenticate header field containing at least one
2600 challenge applicable to the requested resource.
2601
2602 challenge = auth-scheme 1*SP realm *( "," auth-param )
2603
2604 realm = "realm" "=" realm-value
2605 realm-value = quoted-string
2606
2607 The realm attribute (case-insensitive) is required for all
2608 authentication schemes which issue a challenge. The realm value
2609 (case-sensitive), in combination with the canonical root URL of the
2610 server being accessed, defines the protection space. These realms
2611 allow the protected resources on a server to be partitioned into a
2612 set of protection spaces, each with its own authentication scheme
2613 and/or authorization database. The realm value is a string, generally
2614 assigned by the origin server, which may have additional semantics
2615 specific to the authentication scheme.
2616
2617 A user agent that wishes to authenticate itself with a server--
2618 usually, but not necessarily, after receiving a 401 response--may do
2619 so by including an Authorization header field with the request. The
2620 Authorization field value consists of credentials containing the
2621 authentication information of the user agent for the realm of the
2622 resource being requested.
2623
2624 credentials = basic-credentials
2625 | ( auth-scheme #auth-param )
2626
2627 The domain over which credentials can be automatically applied by a
2628 user agent is determined by the protection space. If a prior request
2629 has been authorized, the same credentials may be reused for all other
2630 requests within that protection space for a period of time determined
2639 by the authentication scheme, parameters, and/or user preference.
2640 Unless otherwise defined by the authentication scheme, a single
2641 protection space cannot extend outside the scope of its server.
2642
2643 If the server does not wish to accept the credentials sent with a
2644 request, it should return a 403 (forbidden) response.
2645
2646 The HTTP protocol does not restrict applications to this simple
2647 challenge-response mechanism for access authentication. Additional
2648 mechanisms may be used, such as encryption at the transport level or
2649 via message encapsulation, and with additional header fields
2650 specifying authentication information. However, these additional
2651 mechanisms are not defined by this specification.
2652
2653 Proxies must be completely transparent regarding user agent
2654 authentication. That is, they must forward the WWW-Authenticate and
2655 Authorization headers untouched, and must not cache the response to a
2656 request containing Authorization. HTTP/1.0 does not provide a means
2657 for a client to be authenticated with a proxy.
2658
2659 11.1 Basic Authentication Scheme
2660
2661 The "basic" authentication scheme is based on the model that the user
2662 agent must authenticate itself with a user-ID and a password for each
2663 realm. The realm value should be considered an opaque string which
2664 can only be compared for equality with other realms on that server.
2665 The server will authorize the request only if it can validate the
2666 user-ID and password for the protection space of the Request-URI.
2667 There are no optional authentication parameters.
2668
2669 Upon receipt of an unauthorized request for a URI within the
2670 protection space, the server should respond with a challenge like the
2671 following:
2672
2673 WWW-Authenticate: Basic realm="WallyWorld"
2674
2675 where "WallyWorld" is the string assigned by the server to identify
2676 the protection space of the Request-URI.
2677
2678 To receive authorization, the client sends the user-ID and password,
2679 separated by a single colon (":") character, within a base64 [5]
2680 encoded string in the credentials.
2681
2682 basic-credentials = "Basic" SP basic-cookie
2683
2684 basic-cookie = <base64 [5] encoding of userid-password,
2685 except not limited to 76 char/line>
2686
2695 userid-password = [ token ] ":" *TEXT
2696
2697 If the user agent wishes to send the user-ID "Aladdin" and password
2698 "open sesame", it would use the following header field:
2699
2700 Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
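The construction of this header field can be reproduced with a few lines of code. The function names below are assumptions made for the example, and encoding the text as ISO-8859-1 before base64 encoding is likewise an assumption of this sketch rather than something stated in this section.

```python
import base64


def basic_authorization(user_id, password):
    """Build the Authorization header line for the Basic scheme."""
    if ":" in user_id:
        # The first colon separates user-ID from password.
        raise ValueError("user-ID must not contain ':'")
    cookie = base64.b64encode(f"{user_id}:{password}".encode("iso-8859-1"))
    return "Authorization: Basic " + cookie.decode("ascii")


def decode_basic_cookie(cookie):
    """Server side: recover (user-ID, password) from the basic-cookie."""
    text = base64.b64decode(cookie).decode("iso-8859-1")
    user_id, _, password = text.partition(":")
    return user_id, password
```

Because decoding is this simple, the encoded value offers no confidentiality; it only protects the characters from being altered in transit.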
2701
2702 The basic authentication scheme is a non-secure method of filtering
2703 unauthorized access to resources on an HTTP server. It is based on
2704 the assumption that the connection between the client and the server
2705 can be regarded as a trusted carrier. As this is not generally true
2706 on an open network, the basic authentication scheme should be used
2707 with this limitation in mind. In spite of this, clients should implement the scheme in
2708 order to communicate with servers that use it.
2709
2710 12. Security Considerations
2711
2712 This section is meant to inform application developers, information
2713 providers, and users of the security limitations in HTTP/1.0 as
2714 described by this document. The discussion does not include
2715 definitive solutions to the problems revealed, though it does make
2716 some suggestions for reducing security risks.
2717
2718 12.1 Authentication of Clients
2719
2720 As mentioned in Section 11.1, the Basic authentication scheme is not
2721 a secure method of user authentication, nor does it prevent the
2722 Entity-Body from being transmitted in clear text across the physical
2723 network used as the carrier. HTTP/1.0 does not prevent additional
2724 authentication schemes and encryption mechanisms from being employed
2725 to increase security.
2726
2727 12.2 Safe Methods
2728
2729 The writers of client software should be aware that the software
2730 represents the user in their interactions over the Internet, and
2731 should be careful to allow the user to be aware of any actions they
2732 may take which may have an unexpected significance to themselves or
2733 others.
2734
2735 In particular, the convention has been established that the GET and
2736 HEAD methods should never have the significance of taking an action
2737 other than retrieval. These methods should be considered "safe." This
2738 allows user agents to represent other methods, such as POST, in a
2739 special way, so that the user is made aware of the fact that a
2740 possibly unsafe action is being requested.
2741
2751 Naturally, it is not possible to ensure that the server does not
2752 generate side-effects as a result of performing a GET request; in
2753 fact, some dynamic resources consider that a feature. The important
2754 distinction here is that the user did not request the side-effects,
2755 and therefore cannot be held accountable for them.
2756
2757 12.3 Abuse of Server Log Information
2758
2759 A server is in the position to save personal data about a user's
2760 requests which may identify their reading patterns or subjects of
2761 interest. This information is clearly confidential in nature and its
2762 handling may be constrained by law in certain countries. People using
2763 the HTTP protocol to provide data are responsible for ensuring that
2764 such material is not distributed without the permission of any
2765 individuals that are identifiable by the published results.
2766
2767 12.4 Transfer of Sensitive Information
2768
2769 Like any generic data transfer protocol, HTTP cannot regulate the
2770 content of the data that is transferred, nor is there any a priori
2771 method of determining the sensitivity of any particular piece of
2772 information within the context of any given request. Therefore,
2773 applications should supply as much control over this information as
2774 possible to the provider of that information. Three header fields are
2775 worth special mention in this context: Server, Referer and From.
2776
2777 Revealing the specific software version of the server may allow the
2778 server machine to become more vulnerable to attacks against software
2779 that is known to contain security holes. Implementors should make the
2780 Server header field a configurable option.
2781
2782 The Referer field allows reading patterns to be studied and reverse
2783 links drawn. Although it can be very useful, its power can be abused
2784 if user details are not separated from the information contained in
2785 the Referer. Even when the personal information has been removed, the
2786 Referer field may indicate a private document's URI whose publication
2787 would be inappropriate.
2788
2789 The information sent in the From field might conflict with the user's
2790 privacy interests or their site's security policy, and hence it
2791 should not be transmitted without the user being able to disable,
2792 enable, and modify the contents of the field. The user must be able
2793 to set the contents of this field within a user preference or
2794 application defaults configuration.
2795
2796 We suggest, though do not require, that a convenient toggle interface
2797 be provided for the user to enable or disable the sending of From and
2798 Referer information.
2799
2807 12.5 Attacks Based On File and Path Names
2808
2809 Implementations of HTTP origin servers should be careful to restrict
2810 the documents returned by HTTP requests to be only those that were
2811 intended by the server administrators. If an HTTP server translates
2812 HTTP URIs directly into file system calls, the server must take
2813 special care not to serve files that were not intended to be
2814 delivered to HTTP clients. For example, Unix, Microsoft Windows, and
2815 other operating systems use ".." as a path component to indicate a
2816 directory level above the current one. On such a system, an HTTP
2817 server must disallow any such construct in the Request-URI if it
2818 would otherwise allow access to a resource outside those intended to
2819 be accessible via the HTTP server. Similarly, files intended for
2820 reference only internally to the server (such as access control
2821 files, configuration files, and script code) must be protected from
2822 inappropriate retrieval, since they might contain sensitive
2823 information. Experience has shown that minor bugs in such HTTP server
2824 implementations have turned into security risks.
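One way a server that maps URIs onto a Unix-style file system might reject ".." constructs is sketched below. The function name resolve_under_root and the example paths are hypothetical, the document root is assumed not to be "/" itself, and the sketch does not address symbolic links, platform-specific separators, or protection of internal files that live inside the document root.

```python
import posixpath
from urllib.parse import unquote


def resolve_under_root(document_root, request_path):
    """Map a request path to a file path, or None if it escapes the root."""
    # Decode %xx escapes first so that "%2e%2e" is seen as "..".
    decoded = unquote(request_path)
    root = posixpath.normpath(document_root)
    # normpath collapses "." and ".." components without touching the disk.
    candidate = posixpath.normpath(posixpath.join(root, decoded.lstrip("/")))
    if candidate != root and not candidate.startswith(root + "/"):
        return None  # the path would leave the intended tree
    return candidate
```

The check is made on the normalized result rather than by searching the request for "..", since a ".." that stays inside the tree is harmless and an encoded one would evade a textual search.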
2825
2826 13. Acknowledgments
2827
2828 This specification makes heavy use of the augmented BNF and generic
2829 constructs defined by David H. Crocker for RFC 822 [7]. Similarly, it
2830 reuses many of the definitions provided by Nathaniel Borenstein and
2831 Ned Freed for MIME [5]. We hope that their inclusion in this
2832 specification will help reduce past confusion over the relationship
2833 between HTTP/1.0 and Internet mail message formats.
2834
2835 The HTTP protocol has evolved considerably over the past four years.
2836 It has benefited from a large and active developer community--the
2837 many people who have participated on the www-talk mailing list--and
2838 it is that community which has been most responsible for the success
2839 of HTTP and of the World-Wide Web in general. Marc Andreessen, Robert
2840 Cailliau, Daniel W. Connolly, Bob Denny, Jean-Francois Groff, Phillip
2841 M. Hallam-Baker, Hakon W. Lie, Ari Luotonen, Rob McCool, Lou
2842 Montulli, Dave Raggett, Tony Sanders, and Marc VanHeyningen deserve
2843 special recognition for their efforts in defining aspects of the
2844 protocol for early versions of this specification.
2845
2846 Paul Hoffman contributed sections regarding the informational status
2847 of this document and Appendices C and D.
2848
2863 This document has benefited greatly from the comments of all those
2864 participating in the HTTP-WG. In addition to those already mentioned,
2865 the following individuals have contributed to this specification:
2866
2867 Gary Adams                 Harald Tveit Alvestrand
2868 Keith Ball                 Brian Behlendorf
2869 Paul Burchard              Maurizio Codogno
2870 Mike Cowlishaw             Roman Czyborra
2871 Michael A. Dolan           John Franks
2872 Jim Gettys                 Marc Hedlund
2873 Koen Holtman               Alex Hopmann
2874 Bob Jernigan               Shel Kaphan
2875 Martijn Koster             Dave Kristol
2876 Daniel LaLiberte           Paul Leach
2877 Albert Lunde               John C. Mallery
2878 Larry Masinter             Mitra
2879 Jeffrey Mogul              Gavin Nicol
2880 Bill Perry                 Jeffrey Perry
2881 Owen Rees                  Luigi Rizzo
2882 David Robinson             Marc Salomon
2883 Rich Salz                  Jim Seidman
2884 Chuck Shotton              Eric W. Sink
2885 Simon E. Spero             Robert S. Thau
2886 Francois Yergeau           Mary Ellen Zurko
2887 Jean-Philippe Martin-Flatin
2888
2889 14. References
2890
2891 [1] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D.,
2892 Torrey, D., and B. Alberti, "The Internet Gopher Protocol: A
2893 Distributed Document Search and Retrieval Protocol", RFC 1436,
2894 University of Minnesota, March 1993.
2895
2896 [2] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
2897 Unifying Syntax for the Expression of Names and Addresses of
2898 Objects on the Network as used in the World-Wide Web",
2899 RFC 1630, CERN, June 1994.
2900
2901 [3] Berners-Lee, T., and D. Connolly, "Hypertext Markup Language -
2902 2.0", RFC 1866, MIT/W3C, November 1995.
2903
2904 [4] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
2905 Resource Locators (URL)", RFC 1738, CERN, Xerox PARC,
2906 University of Minnesota, December 1994.
2907
2919 [5] Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail
2920 Extensions) Part One: Mechanisms for Specifying and Describing
2921 the Format of Internet Message Bodies", RFC 1521, Bellcore,
2922 Innosoft, September 1993.
2923
2924 [6] Braden, R., "Requirements for Internet hosts - Application and
2925 Support", STD 3, RFC 1123, IETF, October 1989.
2926
2927 [7] Crocker, D., "Standard for the Format of ARPA Internet Text
2928 Messages", STD 11, RFC 822, UDEL, August 1982.
2929
2930 [8] Davis, F., Kahle, B., Morris, H., Salem, J., Shen, T., Wang, R.,
2931 Sui, J., and M. Grinbaum, "WAIS Interface Protocol Prototype
2932 Functional Specification", (v1.5), Thinking Machines
2933 Corporation, April 1990.
2934
2935 [9] Fielding, R., "Relative Uniform Resource Locators", RFC 1808,
2936 UC Irvine, June 1995.
2937
2938 [10] Horton, M., and R. Adams, "Standard for interchange of USENET
2939 Messages", RFC 1036 (Obsoletes RFC 850), AT&T Bell
2940 Laboratories, Center for Seismic Studies, December 1987.
2941
2942 [11] Kantor, B., and P. Lapsley, "Network News Transfer Protocol:
2943 A Proposed Standard for the Stream-Based Transmission of News",
2944 RFC 977, UC San Diego, UC Berkeley, February 1986.
2945
2946 [12] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821,
2947 USC/ISI, August 1982.
2948
2949 [13] Postel, J., "Media Type Registration Procedure", RFC 1590,
2950 USC/ISI, March 1994.
2951
2952 [14] Postel, J., and J. Reynolds, "File Transfer Protocol (FTP)",
2953 STD 9, RFC 959, USC/ISI, October 1985.
2954
2955 [15] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC
2956 1700, USC/ISI, October 1994.
2957
2958 [16] Sollins, K., and L. Masinter, "Functional Requirements for
2959 Uniform Resource Names", RFC 1737, MIT/LCS, Xerox Corporation,
2960 December 1994.
2961
2962 [17] US-ASCII. Coded Character Set - 7-Bit American Standard Code
2963 for Information Interchange. Standard ANSI X3.4-1986, ANSI,
2964 1986.
2965
2975 [18] ISO-8859. International Standard -- Information Processing --
2976 8-bit Single-Byte Coded Graphic Character Sets --
2977 Part 1: Latin alphabet No. 1, ISO 8859-1:1987.
2978 Part 2: Latin alphabet No. 2, ISO 8859-2, 1987.
2979 Part 3: Latin alphabet No. 3, ISO 8859-3, 1988.
2980 Part 4: Latin alphabet No. 4, ISO 8859-4, 1988.
2981 Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988.
2982 Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987.
2983 Part 7: Latin/Greek alphabet, ISO 8859-7, 1987.
2984 Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988.
2985 Part 9: Latin alphabet No. 5, ISO 8859-9, 1990.
2986
2987 15. Authors' Addresses
2988
2989 Tim Berners-Lee
2990 Director, W3 Consortium
2991 MIT Laboratory for Computer Science
2992 545 Technology Square
2993 Cambridge, MA 02139, U.S.A.
2994
2995 Fax: +1 (617) 258 8682
2996 EMail: timbl@w3.org
2997
2998
2999 Roy T. Fielding
3000 Department of Information and Computer Science
3001 University of California
3002 Irvine, CA 92717-3425, U.S.A.
3003
3004 Fax: +1 (714) 824-4056
3005 EMail: fielding@ics.uci.edu
3006
3007
3008 Henrik Frystyk Nielsen
3009 W3 Consortium
3010 MIT Laboratory for Computer Science
3011 545 Technology Square
3012 Cambridge, MA 02139, U.S.A.
3013
3014 Fax: +1 (617) 258 8682
3015 EMail: frystyk@w3.org
3016
3031 Appendices
3032
3033 These appendices are provided for informational reasons only -- they
3034 do not form a part of the HTTP/1.0 specification.
3035
3036 A. Internet Media Type message/http
3037
3038 In addition to defining the HTTP/1.0 protocol, this document serves
3039 as the specification for the Internet media type "message/http". The
3040 following is to be registered with IANA [13].
3041
3042 Media Type name: message
3043
3044 Media subtype name: http
3045
3046 Required parameters: none
3047
3048 Optional parameters: version, msgtype
3049
3050 version: The HTTP-Version number of the enclosed message
3051 (e.g., "1.0"). If not present, the version can be
3052 determined from the first line of the body.
3053
3054 msgtype: The message type -- "request" or "response". If
3055 not present, the type can be determined from the
3056 first line of the body.
3057
3058 Encoding considerations: only "7bit", "8bit", or "binary" are
3059 permitted
3060
3061 Security considerations: none
3062
3063 B. Tolerant Applications
3064
3065 Although this document specifies the requirements for the generation
3066 of HTTP/1.0 messages, not all applications will be correct in their
3067 implementation. We therefore recommend that operational applications
3068 be tolerant of deviations whenever those deviations can be
3069 interpreted unambiguously.
3070
3071 Clients should be tolerant in parsing the Status-Line and servers
3072 tolerant when parsing the Request-Line. In particular, they should
3073 accept any amount of SP or HT characters between fields, even though
3074 only a single SP is required.
3075
3076 The line terminator for HTTP-header fields is the sequence CRLF.
3077 However, we recommend that applications, when parsing such headers,
3078 recognize a single LF as a line terminator and ignore the leading CR.
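The two tolerance recommendations above can be sketched as follows. The function names parse_status_line and split_header_lines are assumptions made for the example, and continuation (folded) header lines are not handled.

```python
def parse_status_line(line):
    """Parse a Status-Line, accepting any run of SP or HT between fields."""
    # split(None, 2) splits on arbitrary whitespace, at most twice, so the
    # Reason-Phrase keeps its internal spaces.
    parts = line.strip().split(None, 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


def split_header_lines(raw):
    """Split a header block on LF, dropping the CR that precedes it."""
    lines = []
    for ln in raw.split("\n"):
        if ln.endswith("\r"):
            ln = ln[:-1]
        if ln:
            lines.append(ln)
    return lines
```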
3079
3087 C. Relationship to MIME
3088
3089 HTTP/1.0 uses many of the constructs defined for Internet Mail (RFC
3090 822 [7]) and the Multipurpose Internet Mail Extensions (MIME [5]) to
3091 allow entities to be transmitted in an open variety of
3092 representations and with extensible mechanisms. However, RFC 1521
3093 discusses mail, and HTTP has a few features that are different from
3094 those described in RFC 1521. These differences were carefully chosen
3095 to optimize performance over binary connections, to allow greater
3096 freedom in the use of new media types, to make date comparisons
3097 easier, and to acknowledge the practice of some early HTTP servers
3098 and clients.
3099
3100 At the time of this writing, it is expected that RFC 1521 will be
3101 revised. The revisions may include some of the practices found in
3102 HTTP/1.0 but not in RFC 1521.
3103
3104 This appendix describes specific areas where HTTP differs from RFC
3105 1521. Proxies and gateways to strict MIME environments should be
3106 aware of these differences and provide the appropriate conversions
3107 where necessary. Proxies and gateways from MIME environments to HTTP
3108 also need to be aware of the differences because some conversions may
3109 be required.
3110
3111 C.1 Conversion to Canonical Form
3112
3113 RFC 1521 requires that an Internet mail entity be converted to
3114 canonical form prior to being transferred, as described in Appendix G
3115 of RFC 1521 [5]. Section 3.6.1 of this document describes the forms
3116 allowed for subtypes of the "text" media type when transmitted over
3117 HTTP.
3118
3119 RFC 1521 requires that content with a Content-Type of "text"
3120 represent line breaks as CRLF and forbids the use of CR or LF outside
3121 of line break sequences. HTTP allows CRLF, bare CR, and bare LF to
3122 indicate a line break within text content when a message is
3123 transmitted over HTTP.
3124
3125 Where it is possible, a proxy or gateway from HTTP to a strict RFC
3126 1521 environment should translate all line breaks within the text
3127 media types described in Section 3.6.1 of this document to the RFC
3128 1521 canonical form of CRLF. Note, however, that this may be
3129 complicated by the presence of a Content-Encoding and by the fact
3130 that HTTP allows the use of some character sets which do not use
3131 octets 13 and 10 to represent CR and LF, as is the case for some
3132 multi-byte character sets.
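For the simple case, the translation to canonical CRLF form is a single substitution. The function name to_canonical_crlf is hypothetical, and the sketch is only valid under the conditions noted above: the body has no Content-Encoding applied and uses a character set in which octets 13 and 10 represent CR and LF.

```python
import re

# CRLF must be tried first so that it is not rewritten as two line breaks.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def to_canonical_crlf(body):
    """Rewrite CRLF, bare CR, and bare LF in a text body as CRLF."""
    return _LINE_BREAK.sub(b"\r\n", body)
```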
3133
3143 C.2 Conversion of Date Formats
3144
3145 HTTP/1.0 uses a restricted set of date formats (Section 3.3) to
3146 simplify the process of date comparison. Proxies and gateways from
3147 other protocols should ensure that any Date header field present in a
3148 message conforms to one of the HTTP/1.0 formats and rewrite the date
3149 if necessary.
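A gateway might rewrite a mail-style date into the fixed-length GMT form as sketched below. The function name to_http_date is an assumption made for the example, and treating a date that carries no usable zone information as GMT is a choice of this sketch, not a rule stated here.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_http_date(value):
    """Rewrite an RFC 822 style date as an RFC 1123 date in GMT."""
    dt = parsedate_to_datetime(value)
    if dt.tzinfo is None:
        # Assumption: no zone information means the time is already GMT.
        dt = dt.replace(tzinfo=timezone.utc)
    # Convert any numeric zone offset to GMT before formatting.
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)
```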
3150
3151 C.3 Introduction of Content-Encoding
3152
3153 RFC 1521 does not include any concept equivalent to HTTP/1.0's
3154 Content-Encoding header field. Since this acts as a modifier on the
3155 media type, proxies and gateways from HTTP to MIME-compliant
3156 protocols must either change the value of the Content-Type header
3157 field or decode the Entity-Body before forwarding the message. (Some
3158 experimental applications of Content-Type for Internet mail have used
3159 a media-type parameter of ";conversions=<content-coding>" to perform
3160 a function equivalent to that of Content-Encoding. However, this parameter
3161 is not part of RFC 1521.)
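
The second option, decoding the Entity-Body before forwarding, might look like the following Python sketch. It handles only the "gzip" and "x-gzip" content-codings, assumes header field names have already been normalized to the capitalization shown, and represents headers as a plain dict. The function name strip_content_encoding is hypothetical.

```python
import gzip


def strip_content_encoding(headers: dict, body: bytes):
    """Return (headers, body) with any gzip content-coding removed.

    Assumptions: headers maps normalized field names to values, and only
    the gzip and x-gzip codings are supported by this sketch.
    """
    coding = headers.get("Content-Encoding", "").strip().lower()
    if not coding:
        # Nothing to undo; return a copy so the caller's dict is untouched.
        return dict(headers), body
    if coding not in ("gzip", "x-gzip"):
        raise ValueError("unsupported content-coding: %r" % coding)
    decoded = gzip.decompress(body)
    forwarded = {k: v for k, v in headers.items() if k != "Content-Encoding"}
    # The length of the Entity-Body has changed, so record the new length.
    forwarded["Content-Length"] = str(len(decoded))
    return forwarded, decoded
```

The Content-Type field is left unchanged, since after decoding it describes the forwarded body directly.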
3162
3163 C.4 No Content-Transfer-Encoding
3164
3165 HTTP does not use the Content-Transfer-Encoding (CTE) field of RFC
3166 1521. Proxies and gateways from MIME-compliant protocols to HTTP must
3167 remove any non-identity CTE ("quoted-printable" or "base64") encoding
3168 prior to delivering the response message to an HTTP client.
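
Removing a non-identity CTE can be sketched in Python as follows. The function name remove_cte is hypothetical, and the sketch assumes the gateway has already separated the CTE value from the body it applies to.

```python
import base64
import quopri

# CTE values that do not transform the data.
_IDENTITY = {"7bit", "8bit", "binary"}


def remove_cte(cte: str, body: bytes) -> bytes:
    """Return body with the named Content-Transfer-Encoding undone."""
    name = cte.strip().lower()   # CTE values are case-insensitive
    if name in _IDENTITY:
        return body
    if name == "base64":
        return base64.b64decode(body)
    if name == "quoted-printable":
        return quopri.decodestring(body)
    raise ValueError("unknown Content-Transfer-Encoding: %r" % cte)
```

After decoding, the gateway would drop the Content-Transfer-Encoding field itself before delivering the message.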
3169
3170 Proxies and gateways from HTTP to MIME-compliant protocols are
3171 responsible for ensuring that the message is in the correct format
3172 and encoding for safe transport on that protocol, where "safe
3173 transport" is defined by the limitations of the protocol being used.
3174 Such a proxy or gateway should label the data with an appropriate
3175 Content-Transfer-Encoding if doing so will improve the likelihood of
3176 safe transport over the destination protocol.
3177
3178 C.5 HTTP Header Fields in Multipart Body-Parts
3179
3180 In RFC 1521, header fields in multipart body-parts are generally
3181 ignored unless the field name begins with "Content-". In HTTP/1.0,
3182 multipart body-parts may contain any HTTP header fields which are
3183 significant to the meaning of that part.
3184
3185 D. Additional Features
3186
3187 This appendix documents protocol elements used by some existing HTTP
3188 implementations, but not consistently and correctly across most
3189 HTTP/1.0 applications. Implementors should be aware of these
3190 features, but cannot rely upon their presence in, or interoperability
3199 with, other HTTP/1.0 applications.
3200
3201 D.1 Additional Request Methods
3202
3203 D.1.1 PUT
3204
3205 The PUT method requests that the enclosed entity be stored under the
3206 supplied Request-URI. If the Request-URI refers to an already
3207 existing resource, the enclosed entity should be considered as a
3208 modified version of the one residing on the origin server. If the
3209 Request-URI does not point to an existing resource, and that URI is
3210 capable of being defined as a new resource by the requesting user
3211 agent, the origin server can create the resource with that URI.
3212
3213 The fundamental difference between the POST and PUT requests is
3214 reflected in the different meaning of the Request-URI. The URI in a
3215 POST request identifies the resource that will handle the enclosed
3216 entity as data to be processed. That resource may be a data-accepting
3217 process, a gateway to some other protocol, or a separate entity that
3218 accepts annotations. In contrast, the URI in a PUT request identifies
3219 the entity enclosed with the request -- the user agent knows what URI
3220 is intended and the server should not apply the request to some other
3221 resource.
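
The difference in the meaning of the Request-URI can be illustrated with a toy in-memory model in Python. This is only a sketch: the class OriginServer, its attributes, and the returned strings are hypothetical and do not correspond to any status codes or interfaces defined by this document.

```python
class OriginServer:
    """Toy model contrasting how PUT and POST interpret the Request-URI."""

    def __init__(self):
        self.resources = {}   # Request-URI -> stored entity
        self.handlers = {}    # Request-URI -> callable that processes entities

    def put(self, uri, entity):
        # PUT: the URI names the enclosed entity itself, so the entity is
        # stored under exactly that URI and no other resource is touched.
        outcome = "replaced" if uri in self.resources else "created"
        self.resources[uri] = entity
        return outcome

    def post(self, uri, entity):
        # POST: the URI names the resource that will process the entity;
        # the entity is handed over as data and is not stored under uri.
        handler = self.handlers[uri]
        return handler(entity)
```

For example, registering a handler under "/annotations" and posting to it passes the entity to that handler, whereas a PUT to "/doc" makes the entity the content of "/doc".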
3222
3223 D.1.2 DELETE
3224
3225 The DELETE method requests that the origin server delete the resource
3226 identified by the Request-URI.
3227
3228 D.1.3 LINK
3229
3230 The LINK method establishes one or more Link relationships between
3231 the existing resource identified by the Request-URI and other
3232 existing resources.
3233
3234 D.1.4 UNLINK
3235
3236 The UNLINK method removes one or more Link relationships from the
3237 existing resource identified by the Request-URI.
3238
3239 D.2 Additional Header Field Definitions
3240
3241 D.2.1 Accept
3242
3243 The Accept request-header field can be used to indicate a list of
3244 media ranges which are acceptable as a response to the request. The
3245 asterisk "*" character is used to group media types into ranges, with
3246 "*/*" indicating all media types and "type/*" indicating all subtypes
3255 of that type. The set of ranges given by the client should represent
3256 what types are acceptable given the context of the request.
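
A server might test a candidate media type against the listed ranges roughly as in the Python sketch below. It ignores any parameters following a ";" in either the range or the media type, which is a simplifying assumption, and the function names range_matches and acceptable are hypothetical.

```python
def range_matches(media_range: str, media_type: str) -> bool:
    """Return True if media_type falls within media_range."""
    r_type, _, r_sub = media_range.split(";")[0].strip().lower().partition("/")
    m_type, _, m_sub = media_type.split(";")[0].strip().lower().partition("/")
    if r_type == "*" and r_sub == "*":
        return True                       # "*/*" covers all media types
    if r_type != m_type:
        return False
    return r_sub == "*" or r_sub == m_sub  # "type/*" covers all subtypes


def acceptable(accept_value: str, media_type: str) -> bool:
    """Return True if any range in an Accept field value covers media_type."""
    ranges = [r for r in accept_value.split(",") if r.strip()]
    return any(range_matches(r, media_type) for r in ranges)
```

With an Accept value of "text/*, image/gif", this sketch treats "text/html" as acceptable and "image/jpeg" as not acceptable.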
3257
3258 D.2.2 Accept-Charset
3259
3260 The Accept-Charset request-header field can be used to indicate a
3261 list of preferred character sets other than the default US-ASCII and
3262 ISO-8859-1. This field allows clients capable of understanding more
3263 comprehensive or special-purpose character sets to signal that
3264 capability to a server which is capable of representing documents in
3265 those character sets.
3266
3267 D.2.3 Accept-Encoding
3268
3269 The Accept-Encoding request-header field is similar to Accept, but
3270 restricts the content-coding values which are acceptable in the
3271 response.
3272
3273 D.2.4 Accept-Language
3274
3275 The Accept-Language request-header field is similar to Accept, but
3276 restricts the set of natural languages that are preferred as a
3277 response to the request.
3278
3279 D.2.5 Content-Language
3280
3281 The Content-Language entity-header field describes the natural
3282 language(s) of the intended audience for the enclosed entity. Note
3283 that this may not be equivalent to all the languages used within the
3284 entity.
3285
3286 D.2.6 Link
3287
3288 The Link entity-header field provides a means for describing a
3289 relationship between the entity and some other resource. An entity
3290 may include multiple Link values. Links at the metainformation level
3291 typically indicate relationships like hierarchical structure and
3292 navigation paths.
3293
3294 D.2.7 MIME-Version
3295
3296 HTTP messages may include a single MIME-Version general-header field
3297 to indicate what version of the MIME protocol was used to construct
3298 the message. Use of the MIME-Version header field, as defined by RFC
3299 1521 [5], should indicate that the message is MIME-conformant.
3300 Unfortunately, some older HTTP/1.0 servers send it indiscriminately,
3301 and thus this field should be ignored.
3302
3311 D.2.8 Retry-After
3312
3313 The Retry-After response-header field can be used with a 503 (service
3314 unavailable) response to indicate how long the service is expected to
3315 be unavailable to the requesting client. The value of this field can
3316 be either an HTTP-date or an integer number of seconds (in decimal)
3317 after the time of the response.
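
A client could interpret the two forms of the field value along the lines of this Python sketch. It assumes that an HTTP-date value uses the RFC 1123 format, that day and month names are parsed under an English or "C" locale, and that the caller supplies the time of the response as a timezone-aware datetime. The function name retry_time is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def retry_time(value: str, response_time: datetime) -> datetime:
    """Return the moment at which the service is expected to be available.

    value is either a decimal number of seconds after response_time or an
    HTTP-date (only the RFC 1123 format is handled in this sketch).
    """
    value = value.strip()
    if value.isascii() and value.isdigit():
        return response_time + timedelta(seconds=int(value))
    parsed = datetime.strptime(value, "%a, %d %b %Y %H:%M:%S GMT")
    return parsed.replace(tzinfo=timezone.utc)
```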
3318
3319 D.2.9 Title
3320
3321 The Title entity-header field indicates the title of the entity.
3322
3323 D.2.10 URI
3324
3325 The URI entity-header field may contain some or all of the Uniform
3326 Resource Identifiers (Section 3.2) by which the Request-URI resource
3327 can be identified. There is no guarantee that the resource can be
3328 accessed using the URI(s) specified.
3329