<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8"/>
<title>XML Information Set (Second Edition)</title>
<style type="text/css">
.xml-def {padding-left: 24pt}
.xml-syntax {padding-left: 24pt}
.deleted {background-color: #FF9999; text-decoration: line-through}
</style>
<link href="http://www.w3.org/StyleSheets/TR/W3C-REC" type="text/css" rel="stylesheet"/>
<meta name="RCSId" content="$Id: Overview.html,v 1.2 2007/10/11 20:43:40 jules Exp $"/>
</head>
<body>
<div class="head">
<a href="http://www.w3.org/">
<img height="48" width="72" alt="W3C" src="http://www.w3.org/Icons/w3c_home" />
</a>
<div align="center">
<h1>XML Information Set<span class="added"> (Second Edition)</span></h1>
<h2 class="nonum">W3C Recommendation 4 February 2004</h2>
</div>
<dl>
<dt>This version:</dt>
<dd>
<a href="http://www.w3.org/TR/2004/REC-xml-infoset-20040204">
http://www.w3.org/TR/2004/REC-xml-infoset-20040204</a>
</dd>
<dt>Latest version:</dt>
<dd>
<a href="http://www.w3.org/TR/xml-infoset">
http://www.w3.org/TR/xml-infoset</a>
</dd>
<dt>Previous version:</dt>
<dd>
<a href="http://www.w3.org/TR/2003/PER-xml-infoset-20031210">
http://www.w3.org/TR/2003/PER-xml-infoset-20031210</a>
</dd>
<dt>Editors:</dt>
<dd>
John Cowan,
<a href="mailto:jcowan@reutershealth.com">jcowan@reutershealth.com</a>
</dd>
<dd>
Richard Tobin,
<a href="mailto:richard@cogsci.ed.ac.uk">richard@cogsci.ed.ac.uk</a>
</dd>
</dl>
<p>
Please refer to the
<a href="http://www.w3.org/2001/10/02/xml-infoset-errata.html">
<strong>errata</strong></a>
for this document, which may include some normative corrections.
</p>
<p>
See also
<a href="http://www.w3.org/2003/03/Translations/byTechnology?technology=xml-infoset">
<strong>translations</strong></a>.
</p>
<p class="copyright">
<a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">
Copyright</a>
©1999-2004
<a href="http://www.w3.org/">
<acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup>
(<a href="http://www.csail.mit.edu/">
<acronym title="Massachusetts Institute of Technology">MIT</acronym></a>,
<a href="http://www.ercim.org/">
<acronym title="European Research Consortium for Informatics and Mathematics">
ERCIM</acronym></a>,
<a href="http://www.keio.ac.jp/">Keio</a>),
All Rights Reserved.
W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">
liability</a>,
<a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">
trademark</a>,
<a href="http://www.w3.org/Consortium/Legal/copyright-documents">
document use</a> and
<a href="http://www.w3.org/Consortium/Legal/copyright-software">
software licensing</a> rules apply.
</p>
</div>
<hr />
<div>
<h2 class="nonum"><a name="abstract">Abstract</a></h2>
<p>This specification provides a set of definitions for use in other
specifications that need to refer to the information in an XML document.
</p>
</div>
<div>
<h2 class="nonum"><a name="status" id="status"/>Status of this Document</h2>
<p><em>This section describes the status of this document at the time of its
publication. Other documents may supersede this document. A list of current
W3C publications and the latest revision of this technical report can be
found in the <a href="http://www.w3.org/TR/">W3C
technical reports index</a> at http://www.w3.org/TR/.</em>
</p>
<p>This document is
a <a href="http://www.w3.org/2003/06/Process-20030618/tr.html#RecsW3C">Recommendation</a> of
the W3C. It has been reviewed by W3C Members and other interested parties,
and has been endorsed by the Director as a W3C Recommendation. It is a stable
document and may be used as reference material or cited as a normative
reference from another document. W3C's role in making the Recommendation
is to draw attention to the specification and to promote its widespread
deployment. This enhances the functionality and interoperability of the Web.
</p>
<p>This document updates the Infoset to cover
<a href="#XML11">XML 1.1</a> and <a href="#Namespaces11">Namespaces 1.1</a>,
clarifies the consequences of certain kinds of invalidity,
and corrects some typographical errors. It is a
product of the <a href="http://www.w3.org/XML/Activity.html">W3C XML Activity</a>.
The English version of this specification is the only normative version. However,
for translations of this document, see <a
href="http://www.w3.org/2003/03/Translations/byTechnology?technology=xml-infoset"
>http://www.w3.org/2003/03/Translations/byTechnology?technology=xml-infoset</a>.
</p>
<p>Documentation of intellectual property possibly relevant to this recommendation
may be found at the Working Group's public <a
href="http://www.w3.org/2002/08/xmlcore-IPR-statements">IPR disclosure page</a>.
</p>
<p>Please report errors in this document to <a
href="mailto:www-xml-infoset-comments@w3.org">www-xml-infoset-comments@w3.org</a>
(public <a href="http://lists.w3.org/Archives/Public/www-xml-infoset-comments/">
archives</a>
are available). The errata list for this Recommendation is available at <a
href="http://www.w3.org/2001/10/02/xml-infoset-errata.html"
>http://www.w3.org/2001/10/02/xml-infoset-errata.html</a>.
</p>
</div>
<div>
<h2 class="nonum"><a name="contents">Contents</a></h2>
<ul style="list-style-type: none;">
<li><a href="#intro">1. Introduction</a></li>
<li>
<a href="#infoitem">2. Information Items</a>
<ul style="list-style-type: none;">
<li><a href="#infoitem.document">2.1 The Document Information Item</a></li>
<li><a href="#infoitem.element">2.2 Element Information Items</a></li>
<li><a href="#infoitem.attribute">2.3 Attribute Information Items</a></li>
<li><a href="#infoitem.pi">2.4 Processing Instruction Information Items</a></li>
<li><a href="#infoitem.rse">2.5 Unexpanded Entity Reference Information Items</a></li>
<li><a href="#infoitem.character">2.6 Character Information Items</a></li>
<li><a href="#infoitem.comment">2.7 Comment Information Items</a></li>
<li><a href="#infoitem.doctype">2.8 The Document Type Declaration Information Item</a></li>
<li><a href="#infoitem.entity.unparsed">2.9 Unparsed Entity Information Items</a></li>
<li><a href="#infoitem.notation">2.10 Notation Information Items</a></li>
<li><a href="#infoitem.namespace">2.11 Namespace Information Items</a></li>
</ul>
</li>
<li><a href="#conformance">3. Conformance</a></li>
<li><a href="#references">Appendix A: References</a></li>
<li><a href="#reporting">Appendix B: XML Reporting Requirements (informative)</a></li>
<li><a href="#example">Appendix C: Example (informative)</a></li>
<li><a href="#omitted">Appendix D: What is not in the Information Set</a></li>
<li><a href="#rdfschema">Appendix E: RDF Schema (informative)</a></li>
</ul>
</div>
<hr />
<div>
<h2><a name="intro">1. Introduction </a></h2>
<p>This specification defines an abstract data set called
the <dfn><strong>XML Information Set</strong></dfn>
(<dfn><strong>Infoset</strong></dfn>).
Its purpose is to provide a consistent set of definitions for use
in other specifications that need to refer to the information in a well-formed
XML document <a href="#XML">[XML]</a>.
</p>
<p>
It does not attempt to be exhaustive; the primary criterion for inclusion
of an information item or property has been that of expected usefulness
in future specifications. Nor does it constitute a minimum set of
information that must be returned by an XML processor.
</p>
<p>
An XML document has an information set if it is well-formed and
satisfies the namespace constraints described
<a href="#intro.namespaces">below</a>.
There is no requirement
for an XML document to be valid in order to have an information set.
</p>
<p>
Information sets may be created by methods (not described in this
specification) other than parsing an XML document.
See <a href="#intro.synthetic">Synthetic Infosets</a> below.
</p>
<p>
An XML document's information set consists of a number of
<dfn><strong>information items</strong></dfn>;
the information set for any well-formed XML document
will contain at least a
<a href="#infoitem.document">document</a> information item
and several others.
An information item is an abstract description of some part of an XML
document: each information item has a set of associated named
<dfn><strong>properties</strong></dfn>. In this specification, the
property names are shown in square brackets, <strong>[thus]</strong>.
The types of information item are listed in
<a href="#infoitem">section 2</a>.
</p>
<p>
The XML
Information Set does not require or favor a specific interface or class of
interfaces. This specification presents the information set as a modified
tree for the sake of clarity and simplicity, but there is no requirement that
the XML Information Set be made available through a tree structure; other
types of interfaces, including (but not limited to) event-based and query-based
interfaces, are also capable of providing information conforming to the XML
Information Set.
</p>
<p>
The terms "information set" and "information
item" are similar in meaning to the generic terms "tree" and "node", as they
are used in computing. However, the former terms are used in this specification
to reduce possible confusion with other specific data models. Information
items do <em>not</em> map one-to-one with the nodes of the DOM or the "tree"
and "nodes" of the XPath data model.
</p>
<p>
In this specification, the words "must",
"should", and "may" assume the meanings specified in
<a href="#RFC2119">[RFC2119]</a>, except that the words do not appear in
uppercase.
</p>
<h3 class="added"><a name="intro.versions">XML Versions</a></h3>
<p class="added">
Different versions of the XML specification may specify different
parsing rules.
<span class="added">The information set of an XML document is defined to
be the one obtained by parsing it according to the rules of the
specification whose version corresponds
to that of the document.</span>
A document which does not specify a
version number is considered to have version 1.0. If an XML
processor accepts a document with a version number that it does not
understand, it will not necessarily be able to produce the correct
information set.
</p>
<h3><a name="intro.namespaces">Namespaces</a></h3>
<p>
XML documents that do not conform to
<a href="#Namespaces">[Namespaces]</a>,
though technically well-formed,
are not considered to have meaningful information sets.
That is, this specification does not define an information
set for documents that have element or attribute names containing colons that
are used in other ways than as prescribed by
<a href="#Namespaces">[Namespaces]</a>.
</p>
<p>
Furthermore, this specification does not define an information set for
documents which use relative URI references in namespace declarations.
This is in accordance with the decision of the W3C XML Plenary Interest
Group described in <a href="#RelNS">[Relative Namespace URI References]</a>.
</p>
<p>
The value of a [namespace name] property is the normalized value of
the corresponding namespace attribute; no additional URI escaping is
applied to it by the processor.
</p>
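<p>
As a non-normative illustration, each of the following two documents is
well-formed but has no information set defined by this specification.
The element names and the URI reference are invented for the example.
</p>
<pre>
&lt;!-- The prefix "a" is never declared, so the document
     does not conform to [Namespaces]. --&gt;
&lt;a:doc&gt;text&lt;/a:doc&gt;

&lt;!-- The namespace declaration uses a relative URI reference. --&gt;
&lt;doc xmlns:p="../ns/p"&gt;&lt;p:item/&gt;&lt;/doc&gt;
</pre>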
<h3><a name="intro.entities">Entities</a></h3>
<p>
An information set describes its XML document with entity
references already expanded, that is, represented by the information
items corresponding to their replacement text. However, there are
various circumstances in which a processor may not perform this
expansion. An entity may not be declared, or may not be retrievable.
A non-validating processor may choose not to read all declarations,
and even if it does, may not expand all external entities. In these
cases an
<a href="#infoitem.rse">unexpanded entity reference</a>
information item is used to represent the entity reference.
</p>
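<p>
A small sketch of the two cases (non-normative; the entity names and the
file name <code>chapter1.xml</code> are assumptions made for illustration):
</p>
<pre>
&lt;!DOCTYPE doc [
  &lt;!ENTITY greeting "Hello"&gt;
  &lt;!ENTITY chapter SYSTEM "chapter1.xml"&gt;
]&gt;
&lt;doc&gt;&amp;greeting; &amp;chapter;&lt;/doc&gt;
</pre>
<p>
The reference to <code>greeting</code> is expanded, so the [children] of
<code>doc</code> begin with character information items for "Hello" and
the following space. A processor that reads <code>chapter1.xml</code>
continues with the information items for its replacement text; a
non-validating processor that does not read it would instead be expected
to supply a single unexpanded entity reference information item at that point.
</p>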
<h3><a name="intro.eol">End-of-Line Handling</a></h3>
<p>
The values of all properties in the Infoset
take account of the end-of-line normalization described in
<a href="#XML">[XML]</a>, 2.11 "End-of-Line Handling".
</p>
<h3><a name="intro.baseURIs">Base URIs</a></h3>
<p>
Several information items have a [base URI] or [declaration base URI] property.
These are computed according to
<a href="#XMLBase">[XML Base]</a>.
Note that retrieval of a resource may involve redirection
at the parser level (for example, in an entity resolver) or below;
in this case the base URI is the final URI used to retrieve the resource
after all redirection.
</p>
<p>
The value of these properties does not reflect any URI escaping that
may be required for retrieval of the resource, but it may include
escaped characters if these were specified in the document, or returned
by a server in the case of redirection.
</p>
<p>
In some cases (such as a document read from a string or a pipe) the
rules in
<a href="#XMLBase">[XML Base]</a>
may result in a base URI being application
dependent. In these cases this specification does not define
the value of the [base URI] or [declaration base URI] property.
</p>
<p>
When resolving relative URIs the [base URI] property should be used in
preference to the values of xml:base attributes; they may be inconsistent
in the case of <a href="#intro.synthetic">Synthetic Infosets</a>.
</p>
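<p>
For example (a non-normative sketch, assuming the document entity was
retrieved from the made-up URI <code>http://example.org/docs/a.xml</code>
without redirection):
</p>
<pre>
&lt;doc&gt;
  &lt;part xml:base="http://example.org/other/"&gt;
    &lt;item/&gt;
  &lt;/part&gt;
&lt;/doc&gt;
</pre>
<p>
Here the [base URI] of the document information item and of the
<code>doc</code> element is <code>http://example.org/docs/a.xml</code>,
while the [base URI] of both <code>part</code> and <code>item</code> is
<code>http://example.org/other/</code>, following the rules of
<a href="#XMLBase">[XML Base]</a>.
</p>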
<h3><a name="intro.null">"Unknown" and "No Value"</a></h3>
<p>
Some properties may have the special value
<dfn><strong>unknown</strong></dfn> or
<dfn><strong>no value</strong></dfn>;
such a property is said to be unknown or to have no value, respectively.
These values are distinct from each other and from all other values.
In particular they are distinct from the empty string, the empty set,
and the empty list, each of which simply has no members.
This specification does not use the term <strong>null</strong> since in some
communities it has particular connotations which may not match those
intended here.
</p>
<h3 class="added"><a name="intro.invalidity">Inconsistencies Resulting from Invalidity</a></h3>
<p class="added">
As noted above, an XML document need not be valid to have an
information set. However, certain kinds of invalidity affect the
values assigned to some properties.
Entities, notations, elements and attributes may be undeclared.
Notations and elements may be multiply declared (multiple declarations
are valid for entities and attributes).
An ID may be undefined or multiply defined.
Such cases are noted where relevant in the Information Item definitions below.
</p>
<h3><a name="intro.synthetic">Synthetic Infosets</a></h3>
<p>
This specification describes the information set resulting from parsing
an XML document. Information sets may be constructed by other means,
for example by use of an API such as the DOM or by transforming an
existing information set.
</p>
<p>
An information set corresponding to a real document will necessarily
be consistent in various ways; for example the [in-scope namespaces]
property of an element will be consistent with the [namespace
attributes] properties of the element and its ancestors. This may not
be true of an information set constructed by other means; in such a case
there will be no XML document corresponding to the information set,
and to serialize it will require resolution of the inconsistencies
(for example, by outputting namespace declarations that correspond to
the namespaces in scope).
</p>
</div>
<div>
<h2><a name="infoitem">2. Information Items</a></h2>
<p>An
information set can contain up to eleven different types of information item,
as explained in the following sections. Every information item has properties.
For ease of reference, each property is given a name, indicated
<strong>[thus]</strong>.
Links to a definition and/or syntax in the XML 1.0
Recommendation <a href="#XML">[XML]</a> are given for each information item.
</p>
<div>
<h3><a name="infoitem.document">2.1. The Document Information Item</a></h3>
<p class="xml-def"><em><strong>XML Definition:
</strong> <a href="http://www.w3.org/TR/REC-xml#dt-xml-doc">document</a> (Section
2, <cite>Documents</cite>)</em></p> <p class="xml-syntax"><em><strong>
XML Syntax:</strong> [1] <a href="http://www.w3.org/TR/REC-xml#NT-document">
Document</a> (Section 2.1, <cite>Well-Formed XML Documents</cite>)</em></p>
<p>There is exactly one <dfn><strong>document information item</strong></dfn>
in the information set, and all other information items are accessible from
the properties of the document information item, either directly or indirectly
through the properties of other information items.</p> <p>The document information
item has the following properties:</p> <ol>
<li><strong>[children]</strong> An ordered list of child information items,
in document order. The list contains exactly one <a href="#infoitem.element">
element</a> information item. The list also contains one <a href="#infoitem.pi">
processing instruction</a> information item for each processing instruction
outside the document element, and one <a href="#infoitem.comment">comment</a> information item for each comment outside
the document element. Processing instructions and comments within the DTD
are excluded. If there is a document type declaration, the list also
contains a <a href="#infoitem.doctype">document type declaration</a>
information item.</li>
<li><strong>[document element]</strong>
The <a href="#infoitem.element">element</a> information item corresponding to the document element.
</li>
<li><strong>[notations]</strong> An unordered set of <a href="#infoitem.notation">
notation</a> information items, one for each notation declared in the DTD.
<span class="added">If any notation is multiply declared, this property
has no value.</span>
</li>
<li><strong>[unparsed entities]</strong> An unordered set of
<a href="#infoitem.entity.unparsed">unparsed entity</a>
information items, one for each unparsed entity declared
in the DTD.
</li>
<li><strong>[base URI]</strong> The base URI of the document entity.
</li>
<li><strong>[character encoding scheme]</strong>
The name of the character encoding scheme in which the document entity
is expressed.
</li>
<li><strong>[standalone]</strong> An indication of the standalone status of
the document, either yes or no. This property is derived
from the optional standalone document declaration in
the XML declaration at the beginning of the document
entity, and has no value if there is no standalone document declaration.</li>
<li><strong>[version]</strong> A string representing the XML version of the
document. This property is derived from the XML declaration optionally present
at the beginning of the document entity, and has no value if there is no
XML declaration.</li>
<li>
<strong>[all declarations processed]</strong> This property is not
strictly speaking part of the infoset of the document. Rather it is
an indication of whether the processor has read the complete DTD.
Its value is a boolean. If it is false, then certain
properties (indicated in their descriptions below) may be unknown.
If it is true, those properties are never unknown.
</li>
</ol></div>
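<div>
<p>
The document information item properties described in section 2.1 can be
illustrated with a short, non-normative sketch. The comment text, the
processing instruction target <code>check</code>, and the element name are
invented for the example.
</p>
<pre>
&lt;?xml version="1.0" standalone="yes"?&gt;
&lt;!-- before the DTD --&gt;
&lt;!DOCTYPE doc [
  &lt;!-- inside the DTD --&gt;
]&gt;
&lt;?check now?&gt;
&lt;doc/&gt;
</pre>
<p>
For this document, [children] would contain, in document order, a comment
information item, a document type declaration information item, a processing
instruction information item, and the element information item for
<code>doc</code>, which is also the value of [document element]. The comment
inside the DTD is not represented. The value of [version] is the string
"1.0", [standalone] is yes, and both [notations] and [unparsed entities]
are sets with no members.
</p>
</div>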
<div>
<h3><a name="infoitem.element">2.2. Element Information Items</a></h3>
<p class="xml-def"><em><strong>XML Definition:</strong> <a href="http://www.w3.org/TR/REC-xml#dt-element">element</a> (Section 3, <cite>
Logical Structures</cite>)</em></p> <p class="xml-syntax"><em><strong>
XML Syntax:</strong> [39] <a href="http://www.w3.org/TR/REC-xml#NT-element">
Element</a> (Section 3, <cite>Logical Structures</cite>)</em></p>
<p>There is an <dfn><strong>element information item</strong></dfn> for each
element appearing in the XML document. One of the element information items
is the value of the [document element] property of the document information
item, corresponding to the root of the element tree, and all
other element information items are accessible by recursively following
its [children] property.</p>
<p>An element information item has the following
properties:</p> <ol>
<li><strong>[namespace name]</strong> The namespace name, if any, of the element
type. If the element does not belong to a namespace, this property
has no value.
</li>
<li><strong>[local name]</strong> The local part of the element-type name.
This does not include any namespace prefix or following colon.</li>
<li><strong>[prefix]</strong> The namespace prefix part of the element-type
name. If the name is unprefixed, this property
has no value. Note that namespace-aware applications should use
the namespace name rather than the prefix to identify elements.
</li>
<li><strong>[children]</strong> An ordered list of child information items,
in document order. This list contains <a href="#infoitem.element">element</a>,
<a href="#infoitem.pi">processing instruction</a>, <a href="#infoitem.rse">
unexpanded entity reference</a>, <a href="#infoitem.character">character</a>,
and <a href="#infoitem.comment">comment</a> information items, one for each
element, processing instruction, reference to an unprocessed external entity,
data character, and comment appearing immediately within the current element.
If the element is empty, this list has no members.</li>
<li><strong>[attributes]</strong> An unordered set of <a href="#infoitem.attribute">
attribute</a> information items, one for each of the attributes (specified
or defaulted from the DTD) of this element. Namespace declarations
do not appear in this set.
If the element has no attributes, this
set has no members.</li>
<li><strong>[namespace attributes]</strong> An unordered set of <a href="#infoitem.attribute">
attribute</a> information items, one for each of the namespace
declarations (specified or defaulted from the DTD) of this element.
<span class="changed">
Declarations of the form xmlns="" and xmlns:name="", which undeclare
the default namespace and prefixes respectively, count as namespace
declarations. Prefix undeclaration was added in
<a href="#Namespaces11">Namespaces in XML 1.1</a>.
</span>
By definition, all namespace attributes (including
those named <code>xmlns</code>, whose [prefix] property
has no value) have a namespace
URI of <code>http://www.w3.org/2000/xmlns/</code>.
If the element has no namespace declarations, this set
has no members.
</li>
<li><strong>[in-scope namespaces]</strong> An unordered set
of <a href="#infoitem.namespace">
namespace</a> information items, one for each of the namespaces
in effect for this element. This set always contains an item with
the prefix <code>xml</code> which is implicitly bound to the
namespace name <code>http://www.w3.org/XML/1998/namespace</code>.
It does not contain an item with the prefix <code>xmlns</code> (used
for declaring namespaces), since
an application can never encounter an element or attribute with that
prefix.
The set will include namespace items corresponding to all of the
members of [namespace attributes],
<span class="changed">
except for any representing declarations of the form xmlns="" or
xmlns:name="", which do not declare a namespace but rather undeclare
the default namespace and prefixes.
</span>
When resolving the prefixes of qualified names this property should be
used in preference to the [namespace attributes] property; they may be
inconsistent in the case of <a href="#intro.synthetic">Synthetic
Infosets</a>.
</li>
<li><strong>[base URI]</strong> The base URI of the element.
</li>
<li><strong>[parent]</strong> The document or element information item which
contains this information item in its [children] property.</li>
</ol></div>
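<div>
<p>
As a non-normative sketch of the element properties described in section 2.2,
consider the following document. The namespace names and element names are
assumptions made up for illustration.
</p>
<pre>
&lt;doc xmlns="http://example.org/ns" xmlns:x="http://example.org/x"&gt;&lt;x:item id="1"/&gt;&lt;/doc&gt;
</pre>
<p>
The element information item for <code>x:item</code> would have the
[namespace name] <code>http://example.org/x</code>, the [local name]
<code>item</code> and the [prefix] <code>x</code>. Its [children] list has
no members, its [attributes] set contains one attribute information item
(for <code>id</code>), and its [namespace attributes] set has no members,
because both declarations appear on <code>doc</code>. Its
[in-scope namespaces] set nevertheless contains three namespace information
items: one for the default namespace, one for the prefix <code>x</code>,
and one for the implicitly bound prefix <code>xml</code>. Its [parent] is
the element information item for <code>doc</code>.
</p>
</div>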
<div>
<h3><a name="infoitem.attribute">2.3. Attribute Information Items</a></h3>
<p class="xml-def"><em><strong>XML Definition:</strong> <a href="http://www.w3.org/TR/REC-xml#dt-attr">attribute</a> (Section 3.1, <cite>
Start-Tags, End-Tags, and Empty-Element Tags</cite>)</em></p>
<p class="xml-syntax"><em><strong>XML Syntax:</strong> [41] <a href="http://www.w3.org/TR/REC-xml#NT-Attribute">Attribute</a> (Section 3.1, <cite>
Start-Tags, End-Tags, and Empty-Element Tags</cite>)</em></p>
<p>There is an <dfn><strong>attribute information item</strong></dfn> for
each attribute (specified or defaulted) of each element in the document,
including those which are namespace declarations. The latter however
appear as members of an element's [namespace attributes] property rather
than its [attributes] property.
</p> <p>Attributes declared in the DTD with no default value
and not specified in the element's start tag are not represented by
attribute information items.</p>
<p>An attribute information item has the
following properties:</p> <ol>
<li><strong>[namespace name]</strong> The namespace name, if any, of the attribute.
Otherwise, this property has no value.
</li>
<li><strong>[local name]</strong> The local part of the attribute name.
This does not include any namespace prefix or following colon.</li>
<li><strong>[prefix]</strong> The namespace prefix part of the attribute
name. If the name is unprefixed, this property
has no value.
Note that namespace-aware applications should use
the namespace name rather than the prefix to identify attributes.
</li>
<li><strong>[normalized value]</strong> The normalized attribute value (see <a href="http://www.w3.org/TR/REC-xml#AVNormalize">3.3.3 Attribute-Value Normalization
</a> <a href="#XML">[XML]</a>).</li>
<li><strong>[specified]</strong> A flag indicating whether this attribute
was actually specified in the start-tag of its element, or was defaulted from
the DTD.</li>
<li><strong>[attribute type]</strong> An indication of the type declared for
this attribute in the DTD. Legitimate values are ID, IDREF, IDREFS, ENTITY,
ENTITIES, NMTOKEN, NMTOKENS, NOTATION, CDATA, and ENUMERATION.
If there is no declaration for the attribute, this property has no value.
If no declaration has been read, but the [all declarations processed]
property of the document information item is false (so there may be an
unread declaration), then the value of this property is unknown.
Applications should treat no value and unknown as equivalent to
a value of CDATA.
<span class="added">The value of this property is not affected by the
validity of the attribute value.</span>
</li>
<li><strong>[references]</strong>
If the attribute type is ID, NMTOKEN, NMTOKENS, CDATA, or ENUMERATION,
this property has no value. If the attribute type is unknown,
the value of this property is unknown. Otherwise (that is,
if the attribute type is IDREF, IDREFS, ENTITY, ENTITIES, or NOTATION),
the value of this property is an ordered list of the
<a href="#infoitem.element">element</a>,
<a href="#infoitem.entity.unparsed">unparsed entity</a>, or
<a href="#infoitem.notation">notation</a>
information items
referred to in the attribute value, in the order that they appear there.
In this case, if the attribute value is syntactically
invalid, this property has no value.
If the type is IDREF or IDREFS and any of the IDs does not appear as
the value of an ID attribute in the document, or if the type is
ENTITY, ENTITIES or NOTATION and no declaration has been read for any
of the entities or the notation, then this property has no value
or is unknown, depending on whether the [all declarations processed]
property of the document information item is true or false.
If the type is IDREF or IDREFS and any of the IDs appears as the
value of more than one ID attribute in the document,
<span class="added">or if the type is NOTATION and there are multiple
declarations for the notation,</span>
then this property
has no value.
</li>
<li><strong>[owner element]</strong> The element information item which contains
this information item in its [attributes] property.</li>
</ol> </div>
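<p>The rules for the [references] property of an attribute information item can be read as a small decision procedure. The following is a minimal sketch, not part of the definition. The names <code>Special</code>, <code>references</code> and <code>declared</code> are hypothetical, the syntax check only counts tokens rather than testing the XML Name production, and an undeclared attribute is assumed to behave like CDATA. When one name is missing and another is ambiguous, the sketch reports whichever problem it meets first, which is a choice made here for illustration.</p>

```python
from enum import Enum


class Special(Enum):
    """Stand-ins for the Infoset's two special property states."""
    NO_VALUE = "no value"
    UNKNOWN = "unknown"


# Attribute types whose values name other information items.
SINGLE = {"IDREF", "ENTITY", "NOTATION"}
PLURAL = {"IDREFS", "ENTITIES"}


def references(attribute_type, normalized_value, declared, all_declarations_processed):
    """Return the [references] list, Special.NO_VALUE or Special.UNKNOWN.

    `declared` (hypothetical) maps a name to the list of items found under
    that name: elements carrying that ID, or entity/notation declarations.
    """
    if attribute_type is Special.UNKNOWN:
        return Special.UNKNOWN
    if attribute_type not in SINGLE | PLURAL:
        return Special.NO_VALUE  # ID, NMTOKEN(S), CDATA, ENUMERATION, undeclared

    names = normalized_value.split()
    # Simplified syntax check: token count only, not the XML Name production.
    if not names or (attribute_type in SINGLE and len(names) != 1):
        return Special.NO_VALUE

    resolved = []
    for name in names:
        candidates = declared.get(name, [])
        if not candidates:
            # Nothing found: definitive only if every declaration was read.
            return Special.NO_VALUE if all_declarations_processed else Special.UNKNOWN
        if len(candidates) > 1:
            return Special.NO_VALUE  # duplicate ID or multiply declared notation
        resolved.append(candidates[0])
    return resolved
```

<p>For example, <code>references("IDREFS", "a b", {"a": ["A"], "b": ["B"]}, True)</code> yields the ordered list of the two referenced items, while an unresolved IDREF yields "unknown" only when [all declarations processed] is false.</p>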
<div>
<h3><a name="infoitem.pi">2.4. Processing Instruction Information Items</a></h3>
<p class="xml-def"><em><strong>XML Definition:
</strong> <a href="http://www.w3.org/TR/REC-xml#dt-pi">processing instruction
</a> (Section 2.6, <cite>Processing Instructions</cite>)</em></p>
<p class="xml-syntax"><em><strong>XML Syntax:</strong> [16] <a href="http://www.w3.org/TR/REC-xml#NT-PI">PI</a> (Section 2.6, <cite>Processing
Instructions</cite>)</em></p> <p>There is a <dfn><strong>
processing instruction information item</strong></dfn> for each processing
instruction in the document. The XML declaration and text declarations for
external parsed entities are not considered processing instructions. </p>
<p>A processing instruction information item has the following properties:
</p> <ol>
<li><strong>[target]</strong> A string representing the target part of the
processing instruction (an XML name).</li>
<li><strong>[content]</strong> A string representing the content of the processing
instruction, excluding the target and any white space immediately following
it. If there is no such content, the value of this property will be an empty
string.</li>
<li><strong>[base URI]</strong> The base URI of the PI.
Note that if an infoset is serialized as an XML document, it will not be
possible to preserve the base URI of any PI that originally appeared at
the top level of an external entity, since there is no syntax for PIs
corresponding to the <code>xml:base</code> attribute on elements.
</li>
<li><strong>[notation]</strong>
The <a href="#infoitem.notation">notation</a>
information item named by the target.
If there is no declaration for a notation with that name,
<span class="added">or there are multiple declarations,</span>
this
property has no value. If no declaration has been read, but the [all
declarations processed] property of the document information item is
false (so there may be an unread declaration), then the value of this
property is unknown.
</li>
<li><strong>[parent]</strong> The document, element, or document type
<span class="changed">declaration</span>
information item which contains this information item in its [children] property.
</li>
</ol> </div>
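<p>As an illustration of the [target] and [content] properties, the sketch below uses Python's <code>xml.dom.minidom</code>, whose <code>target</code> and <code>data</code> attributes are assumed here to correspond to those two properties. The sample document and the helper name <code>processing_instructions</code> are invented for the example. The XML declaration is not reported, and the white space after the target is not part of the content.</p>

```python
from xml.dom import minidom

# Sample input (hypothetical): one PI in the prolog, one empty PI in content.
SOURCE = '<?xml version="1.0"?><?xml-stylesheet   href="a.css"?><doc><?empty?></doc>'


def processing_instructions(node):
    """Yield (target, content) for every PI below node, in document order."""
    for child in node.childNodes:
        if child.nodeType == child.PROCESSING_INSTRUCTION_NODE:
            yield child.target, child.data
        else:
            yield from processing_instructions(child)


pis = list(processing_instructions(minidom.parseString(SOURCE)))
```

<p>Here <code>pis</code> holds one pair for <code>xml-stylesheet</code> and one for <code>empty</code>, the latter with an empty string as its content.</p>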
<div>
<h3><a name="infoitem.rse">2.5. Unexpanded Entity Reference Information Items</a></h3>
<p class="xml-def"><em><strong>
XML Definition:</strong> Section 4.4.3, <cite><a href="http://www.w3.org/TR/REC-xml#include-if-valid">
Included If Validating</a></cite></em></p>
<p>An <dfn><strong>unexpanded entity reference information item</strong></dfn>
serves as a placeholder by which an XML processor
can indicate that it has not expanded an external parsed entity.
There is such an information item for each unexpanded
reference to an external general entity within the content of an
element. A validating XML processor, or a non-validating processor that reads
all external general entities, will never generate unexpanded entity reference
information items for a valid document.</p>
<p>An unexpanded entity reference
information item has the following properties:</p> <ol>
<li><strong>[name]</strong> The name of the entity referenced.</li>
<li><strong>[system identifier]</strong>
The system identifier of the entity, as it appears in the declaration
of the entity, without any additional URI escaping applied by the processor.
If there is no declaration for the entity, this property has no
value. If no declaration has been read, but the [all declarations
processed] property of the document information item is false (so
there may be an unread declaration), then the value of this property
is unknown.
</li>
<li>
<strong>[public identifier]</strong>
The public identifier of the entity, normalized as described in
<a href="http://www.w3.org/TR/REC-xml#dt-pubid">4.2.2 External Entities</a>
<a href="#XML">[XML]</a>.
If there is no declaration for the entity, or the declaration does not
include a public identifier, this property has no value. If no
declaration has been read, but the [all declarations processed]
property of the document information item is false (so there may be an
unread declaration), then the value of this property is unknown.
</li>
<li>
<strong>[declaration base URI]</strong>
The base URI relative to which the system identifier should be resolved
(i.e. the base URI of the resource within which the entity declaration occurs).
This is unknown or has no value in the same circumstances as the
[system identifier] property.
</li>
<li><strong>[parent]</strong> The element information item which contains
this information item in its [children] property.</li>
</ol> </div>
<div>
<h3><a name="infoitem.character">2.6. Character Information Items</a></h3>
<p class="xml-syntax"><em><strong>XML Syntax:</strong>
[2] <a href="http://www.w3.org/TR/REC-xml#NT-Char">Char</a> (Section 2.2, <cite>
Characters</cite>)</em></p> <p>There is a <dfn><strong>character
information item</strong></dfn> for each data character that appears in the
document, whether literally, as a character reference, or within a
CDATA section.
</p>
<p>Each character
is a logically separate information item, but XML applications are free to
chunk characters into larger groups as necessary or desirable.</p> <p>A character
information item has the following properties:</p> <ol>
<li><strong>[character code]</strong> The ISO 10646 character code (in the
range 0 to #x10FFFF, though not every value in this range is a legal XML character
code) of the character.</li>
<li><strong>[element content whitespace]</strong> A boolean indicating whether
the character is white space appearing within element content (see <a href="#XML">
[XML]</a>, 2.10 "White Space Handling"). Note that validating XML processors
are <em>required</em>
to provide this information.
If there is no declaration for the containing element,
<span class="added">or there are multiple declarations,</span>
this property has
no value for white space characters.
If no declaration has been read, but the [all declarations processed]
property of the document information item is false (so there may be an
unread declaration), then the value of this property is unknown for
white space characters.
It is always false for characters that are not white space.
</li>
<li><strong>[parent]</strong> The element information
item which contains this information item in its [children] property.</li>
</ol> </div>
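<p>The following sketch illustrates that the [character code] property does not depend on how a character was written. It assumes Python's <code>xml.etree.ElementTree</code> as the processor. The helper name <code>character_codes</code> and the sample input are hypothetical. A literal character, a character reference and a character inside a CDATA section each contribute one code point, and the parser's single string is an instance of the chunking that applications are free to perform.</p>

```python
import xml.etree.ElementTree as ET


def character_codes(xml_text):
    """Return the code points of the character data directly inside the root."""
    root = ET.fromstring(xml_text)
    return [ord(ch) for ch in (root.text or "")]


# "A" is literal, "B" a character reference, "C" sits in a CDATA section,
# and the last character lies outside the Basic Multilingual Plane.
codes = character_codes("<a>A&#66;<![CDATA[C]]>&#x10000;</a>")
```

<p>The resulting list contains four code points, 0x41, 0x42, 0x43 and 0x10000, with nothing to distinguish the three ways of writing them.</p>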
<div>
<h3><a name="infoitem.comment">2.7. Comment Information Items</a></h3>
<p class="xml-def"><em><strong>XML Definition:</strong> <a href="http://www.w3.org/TR/REC-xml#dt-comment">comment</a> (Section 2.5, <cite>
Comments</cite>)</em></p> <p class="xml-syntax"><em><strong>
XML Syntax:</strong> [15] <a href="http://www.w3.org/TR/REC-xml#NT-Comment">
Comment</a> (Section 2.5, <cite>Comments</cite>)</em></p> <p>
There is a <dfn><strong>comment information item</strong></dfn>
for each XML comment in the original document, except for those appearing
in the DTD (which are not represented).</p>
<p>A comment information item has
the following properties:</p> <ol>
<li><strong>[content]</strong> A string representing the content of the comment.
</li>
<li><strong>[parent]</strong> The document or element
information item which contains this information item in its [children] property.
</li>
</ol> </div>
<div>
<h3><a name="infoitem.doctype">2.8. The Document Type Declaration Information Item</a></h3>
<p class="xml-def"><em><strong>
XML Definition:</strong> <a href="http://www.w3.org/TR/REC-xml#dt-doctype">
document type declaration</a> (Section 2.8, <cite>Prolog and Document Type
Declaration</cite>)</em></p> <p class="xml-syntax"><em><strong>
XML Syntax:</strong> [28] <a href="http://www.w3.org/TR/REC-xml#NT-doctypedecl">
doctypedecl</a> (Section 2.8, <cite>Prolog and Document Type Declaration</cite>)
</em></p> <p>If the XML document has a document type declaration,
then the information set contains a single <dfn><strong>document type declaration
information item</strong></dfn>. Note that entities and notations
are provided as
properties of the document information item, not the document type declaration
information item.</p> <p>A document type declaration information item has
the following properties:</p> <ol>
<li>
<strong>[system identifier]</strong>
The system identifier of the external subset, as it appears in the DOCTYPE
declaration, without any additional URI escaping applied by the processor.
If there is no external subset, this property has no value.
</li>
<li>
<strong>[public identifier]</strong>
The public identifier of the external subset, normalized as described in
<a href="http://www.w3.org/TR/REC-xml#dt-pubid">4.2.2 External Entities</a>
<a href="#XML">[XML]</a>.
If there is no external subset or if it has no public identifier,
this property has no value.
</li>
<li><strong>[children]</strong> An ordered list of
<a href="#infoitem.pi">processing instruction</a> information items
representing processing instructions appearing
in the DTD, in the original document order. Items from the internal DTD subset
appear before those in the external subset.</li>
<li><strong>[parent]</strong> The document information item.</li>
</ol> </div>
<div>
<h3><a name="infoitem.entity.unparsed">2.9. Unparsed Entity Information Items</a></h3>
<p class="xml-def"><em><strong>XML Definition:
</strong> <a href="http://www.w3.org/TR/REC-xml#dt-entity">entity</a> (Section
4, <cite>Physical Structures</cite>)</em></p> <p
class="xml-syntax"><em><strong>XML Syntax:</strong> [71] <a href="http://www.w3.org/TR/REC-xml#NT-GEDecl">
GEDecl</a> (Section 4.2, <cite>Entities</cite>)</em></p>
<p>
There is an <dfn><strong>unparsed entity information item</strong></dfn>
for each unparsed general entity declared in the DTD.
</p>
<p>
An unparsed entity information item has the following properties:
</p>
<ol>
<li>
<strong>[name]</strong>
The name of the entity.
</li>
<li>
<strong>[system identifier]</strong>
The system identifier of the entity, as it appears in the declaration
of the entity, without any additional URI escaping applied by the processor.
</li>
<li>
<strong>[public identifier]</strong>
The public identifier of the entity, normalized as described in
<a href="http://www.w3.org/TR/REC-xml#dt-pubid">4.2.2 External Entities</a>
<a href="#XML">[XML]</a>.
If the entity has no public identifier, this property has no value.
</li>
<li>
<strong>[declaration base URI]</strong>
The base URI relative to which the system identifier should be resolved
(i.e. the base URI of the resource within which the entity declaration occurs).
</li>
<li>
<strong>[notation name]</strong>
The notation name associated with the entity.
</li>
<li>
<strong>[notation]</strong>
The <a href="#infoitem.notation">notation</a>
information item named by the notation name.
If there is no declaration for a notation with that name,
<span class="added">or there are multiple declarations,</span>
this
property has no value. If no declaration has been read, but the [all
declarations processed] property of the document information item is
false (so there may be an unread declaration), then the value of this
property is unknown.
</li>
</ol>
</div>
<div>
<h3><a name="infoitem.notation">2.10. Notation Information Items</a></h3>
<p class="xml-def"><em><strong>XML Definition:</strong> <a href="http://www.w3.org/TR/REC-xml#dt-notation">notation</a> (Section 4.7, <cite>
Notations</cite>)</em></p> <p class="xml-syntax"><em><strong>
XML Syntax:</strong> [82] <a href="http://www.w3.org/TR/REC-xml#NT-NotationDecl">
NotationDecl</a> (Section 4.7, <cite>Notations</cite>)</em></p>
<p>There is a <dfn><strong>notation information item</strong></dfn> for
each notation declared in the DTD.</p> <p>A notation information item has
the following properties:</p> <ol>
<li><strong>[name]</strong> The name of the notation.</li>
<li><strong>[system identifier]</strong> The system identifier of the notation,
as it appears in the declaration of the notation,
without any additional URI escaping applied by the processor.
If no system identifier was specified, this property has no value.</li>
<li><strong>[public identifier]</strong>
The public identifier of the notation, normalized as described in
<a href="http://www.w3.org/TR/REC-xml#dt-pubid">4.2.2 External Entities</a>
<a href="#XML">[XML]</a>.
If the notation has no public identifier,
this property has no value.</li>
<li>
<strong>[declaration base URI]</strong>
The base URI relative to which the system identifier should be resolved
(i.e. the base URI of the resource within which the notation declaration
occurs).
</li>
</ol>
</div>
<div>
<h3><a name="infoitem.namespace">2.11. Namespace Information Items</a></h3>
<p>
Each element in the document has a <dfn><strong>namespace
information item</strong></dfn> for each namespace that is in scope
for that element.
</p> <p>A namespace information item has the following properties:
</p> <ol>
<li><strong>[prefix]</strong> The prefix whose binding this item describes.
Syntactically, this
is the part of the attribute name following the <code>xmlns:</code> prefix.
If the attribute name is simply <code>xmlns</code>, so that the
declaration is of the default namespace, this property
has no value.
</li>
<li><strong>[namespace name]</strong> The namespace name to which the
prefix is bound.</li>
</ol> </div>
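<p>One way to picture how the set of namespace information items for an element arises is to fold the namespace declarations of each start-tag into the bindings inherited from the parent. The sketch below is illustrative only. The function name <code>in_scope_namespaces</code>, the dictionary representation and the example URIs are assumptions, the key <code>None</code> stands for a [prefix] that has no value, and an empty declaration value is treated as removing the binding (for prefixed declarations this is permitted only by Namespaces in XML 1.1).</p>

```python
XML_NAMESPACE = "http://www.w3.org/XML/1998/namespace"


def in_scope_namespaces(attributes, inherited=None):
    """Map each in-scope [prefix] to its [namespace name].

    `attributes` holds the start-tag attributes as {qualified name: value};
    the key None stands for the default namespace, whose [prefix] has no value.
    """
    # The xml prefix is always bound, even without a declaration.
    scope = dict(inherited) if inherited is not None else {"xml": XML_NAMESPACE}
    for name, value in attributes.items():
        if name == "xmlns":
            prefix = None
        elif name.startswith("xmlns:"):
            prefix = name[len("xmlns:"):]
        else:
            continue  # an ordinary attribute, not a namespace declaration
        if value:
            scope[prefix] = value
        else:
            scope.pop(prefix, None)  # an empty value undeclares the binding
    return scope


outer = in_scope_namespaces(
    {"xmlns": "urn:example:a", "xmlns:p": "urn:example:p", "id": "x"}
)
inner = in_scope_namespaces({"xmlns": ""}, inherited=outer)
```

<p>In this example the outer element has three namespaces in scope, and the inner element, which undeclares the default namespace, has two.</p>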
</div>
<div>
<h2><a name="conformance">3. Conformance</a></h2>
<p>
Since the purpose of the Information Set is to provide a set of definitions,
conformance is a property of specifications that use those
definitions, rather than of implementations.
</p>
<p>
Specifications referring to the Infoset must:
</p>
<ul>
<li>
Indicate the information items and properties that are needed to implement
the specification. (This indirectly imposes conformance requirements
on processors used to implement the specification.)
</li>
<li>
Specify how other information items and properties are treated (for
example, they might be passed through unchanged).
</li>
<li>
Note any information required from an XML document that is not defined
by the Infoset.
</li>
<li>
Note any difference in the use of terms defined by the Infoset (this
should be avoided).
</li>
</ul>
<p>
If a specification allows the construction of an infoset that has
inconsistencies as described above under
<a href="#intro.synthetic">Synthetic Infosets</a>
it may describe how
those inconsistencies are to be resolved, and should do so if it
provides for serialization of the infoset.
</p>
</div>
<div>
<h2><a name="references">Appendix A. References</a></h2>
<div>
<h3><a name="references.normative">Normative References</a></h3>
<dl>
<dt><strong><a name="ISO10646" id="ISO10646">ISO/IEC 10646</a></strong></dt>
<dd>ISO (International Organization for Standardization).
<cite>ISO/IEC 10646-1:2000. Information technology —
Universal Multiple-Octet Coded Character Set (UCS) —
Part 1: Architecture and Basic Multilingual Plane</cite> and
<cite>ISO/IEC 10646-2:2001. Information technology —
Universal Multiple-Octet Coded Character Set (UCS) —
Part 2: Supplementary Planes</cite>,
as, from time to time, amended, replaced by a new edition or
expanded by the addition of new parts.
[Geneva]: International Organization for Standardization.
(See <a href="http://www.iso.ch">http://www.iso.ch</a> for the latest version.)
</dd>
<dt><strong><a name="Namespaces">Namespaces</a></strong></dt>
<dd><cite>Namespaces in XML,</cite> W3C, eds. Tim Bray, Dave Hollander, Andrew
Layman. 14 January 1999. Available at <code><a href="http://www.w3.org/TR/REC-xml-names">
http://www.w3.org/TR/REC-xml-names</a></code>.</dd>
<dt class="added"><strong><a name="Namespaces11">Namespaces 1.1</a></strong></dt>
<dd class="added"><cite>Namespaces in XML 1.1,</cite>
W3C, eds. Tim Bray, Dave Hollander, Andrew Layman, Richard Tobin.
4 February 2004.
Available at
<code><a href="http://www.w3.org/TR/xml-names11">
http://www.w3.org/TR/xml-names11</a></code>.</dd>
<dt><strong><a name="RFC2119">RFC2119</a></strong></dt>
<dd><cite>Key words for use in RFCs to Indicate Requirement Levels,</cite>
ed. S. Bradner. March 1997. Available at <code><a href="http://www.ietf.org/rfc/rfc2119.txt">
http://www.ietf.org/rfc/rfc2119.txt</a></code>.</dd>
<dt><strong><a name="XML">XML</a></strong></dt>
<dd><cite>Extensible Markup Language (XML) 1.0 (Third Edition),</cite>
W3C, eds. Tim Bray, Jean Paoli, C.M. Sperberg-McQueen, Eve Maler, François Yergeau. 4 February 2004.
Available at <code><a href="http://www.w3.org/TR/REC-xml">http://www.w3.org/TR/REC-xml</a></code>.
</dd>
<dt class="added"><strong><a name="XML11">XML 1.1</a></strong></dt>
<dd class="added"><cite>Extensible Markup Language (XML) 1.1,</cite>
W3C, eds. Tim Bray, Jean Paoli, C.M. Sperberg-McQueen, Eve Maler, John Cowan, François Yergeau.
4 February 2004.
Available at
<code><a href="http://www.w3.org/TR/xml11">
http://www.w3.org/TR/xml11</a></code>.
</dd>
<dt><strong><a name="XMLBase">XML Base</a></strong></dt>
<dd><cite>XML Base,</cite> W3C, ed. Jonathan Marsh. February 2000. Available at <code><a href="http://www.w3.org/TR/xmlbase">http://www.w3.org/TR/xmlbase</a></code>.
</dd>
</dl>
</div>
<div>
<h3><a name="references.informative">Informative References</a></h3>
<dl>
<dt><strong><a name="DOM">DOM</a></strong></dt>
<dd><cite>Document Object Model (DOM) Level 1 Specification,</cite> W3C, eds. Vidur
Apparao, Steve Byrne, Mike Champion, et al. 1 October 1998. Available
at <code><a href="http://www.w3.org/TR/REC-DOM-Level-1">http://www.w3.org/TR/REC-DOM-Level-1</a></code>.</dd>
<dt><strong><a name="XPointer-Liaison">XPointer-Liaison</a></strong></dt>
<dd><cite>XPointer-Information Set Liaison Statement,</cite> W3C, ed. Steven J.
DeRose. 24 February 1999. Available at <code><a href="http://www.w3.org/TR/NOTE-xptr-infoset-liaison">
http://www.w3.org/TR/NOTE-xptr-infoset-liaison</a></code>.</dd>
<dt><strong><a name="RelNS">Relative Namespace URI References</a></strong></dt>
<dd>
<cite>Results of W3C XML Plenary Ballot on relative URI References
in namespace declarations, 3-17 July 2000,</cite> W3C, eds. Dave Hollander,
C. M. Sperberg-McQueen. 6 September 2000. Available at
<code><a href="http://www.w3.org/2000/09/xppa">http://www.w3.org/2000/09/xppa</a></code>.
</dd>
<dt><strong><a name="RDFNote">RDF Schema for the XML Information Set</a></strong></dt>
<dd>
<cite>RDF Schema for the XML Information Set,</cite> W3C, ed. Richard Tobin. 6 April 2001. Available at
<code><a href="http://www.w3.org/TR/xml-infoset-rdfs">http://www.w3.org/TR/xml-infoset-rdfs</a></code>.
</dd>
</dl></div></div>
<div>
<h2><a name="reporting">Appendix B: XML Reporting Requirements (informative)</a></h2>
<p>Although the XML Recommendation <a href="#XML">[XML]</a> is primarily concerned with XML syntax, it also includes
some specific reporting requirements for XML processors.</p> <p>The reporting
requirements include errors, which are outside the scope of this specification,
and document information. All of the XML requirements for document information
reporting have been integrated into the XML Information Set; numbers in parentheses
refer to sections of the XML Recommendation:</p> <ol>
<li>An XML processor must always provide all characters in a document that
are not part of markup to the application (2.10).</li>
<li>A validating XML processor must inform the application which of the character
data in a document is white space appearing within element content (2.10).
</li>
<li>An XML processor must normalize line-ends to LF before passing
them to the application (2.11).</li>
<li>An XML processor must normalize the value of attributes according to the
rules for attribute-value normalization before passing them to the application (3.3.3).
</li>
<li>An XML processor must pass the names and external identifiers (system
identifiers, public identifiers or both) of declared notations to the application
(4.7).</li>
<li>When the name of an unparsed entity appears as the explicit or default
value of an ENTITY or ENTITIES attribute, an XML processor must provide the
names, system identifiers, and (if present) public identifiers of both the
entity and its notation to the application (4.6, 4.7).</li>
<li>An XML processor must pass processing instructions to the application
(2.6).</li>
<li>An XML processor (necessarily a non-validating one) that does not include
the replacement text of an external parsed entity in place of an entity reference
must notify the application that it recognized but did not read the entity
(4.4.3).</li>
<li>A validating XML processor must include the replacement text of an entity
in place of an entity reference (5.2).</li>
<li>An XML processor must supply the default value of attributes
declared in the DTD for a given element type but not appearing in the element's
start tag (3.3.2).</li>
</ol>
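<p>The line-end requirement in the list above can be sketched as follows, assuming the XML 1.0 rule in which a CR LF pair or a lone CR is translated to a single LF. The function name <code>normalize_line_ends</code> is hypothetical, and Python's <code>xml.etree.ElementTree</code> is used only as one example of a processor that applies the rule before handing text to the application.</p>

```python
import xml.etree.ElementTree as ET


def normalize_line_ends(text):
    """Translate CR LF pairs and lone CR characters to a single LF."""
    # Pairs are replaced first so that each CR LF yields one LF, not two.
    return text.replace("\r\n", "\n").replace("\r", "\n")


raw_content = "one\r\ntwo\rthree\nfour"
expected = normalize_line_ends(raw_content)

# The parser is given the unnormalized text and reports normalized text.
parsed_text = ET.fromstring("<a>" + raw_content + "</a>").text
```

<p>Both <code>expected</code> and <code>parsed_text</code> contain only LF line ends, so the original line termination cannot be recovered from the reported characters.</p>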
<div>
<h2><a name="example">Appendix C: Example (informative)</a></h2>
<p>
Consider the following example XML document:
</p>
<pre><?xml version="1.0"?>
<msg:message doc:date="19990421"
xmlns:doc="http://doc.example.org/namespaces/doc"
xmlns:msg="http://message.example.org/"
>Phone home!</msg:message></pre>
<p>
The information set for this XML document
contains the following information items:
</p>
<ul>
<li>A <a href="#infoitem.document">document</a> information item.</li>
<li>
An <a href="#infoitem.element">element</a> information item
with namespace name "<code>http://message.example.org/</code>",
local part "<code>message</code>",
and prefix "<code>msg</code>".
</li>
<li>
An <a href="#infoitem.attribute">attribute</a> information item with the
namespace name "<code>http://doc.example.org/namespaces/doc</code>",
local part "<code>date</code>",
prefix "<code>doc</code>",
and normalized value "<code>19990421</code>".
</li>
<li>
Three <a href="#infoitem.namespace">namespace</a> information items
for the
<code>http://www.w3.org/XML/1998/namespace</code>,
<code>http://doc.example.org/namespaces/doc</code>, and
<code>http://message.example.org/</code> namespaces.
</li>
<li>
Two <a href="#infoitem.attribute">attribute</a> information items
for the namespace attributes.
</li>
<li>
Eleven <a href="#infoitem.character">character</a> information items
for the character data.
</li>
</ul>
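<p>The items listed above can be checked informally with a DOM parser. The sketch below assumes Python's <code>xml.dom.minidom</code>, which exposes namespace declarations as attributes. The helper <code>is_namespace_attribute</code> and the <code>summary</code> dictionary are names invented for this illustration, and the implicit binding of the <code>xml</code> prefix is added by hand because the parser does not report it.</p>

```python
from xml.dom import minidom

SOURCE = """<?xml version="1.0"?>
<msg:message doc:date="19990421"
xmlns:doc="http://doc.example.org/namespaces/doc"
xmlns:msg="http://message.example.org/"
>Phone home!</msg:message>"""

document = minidom.parseString(SOURCE)
element = document.documentElement


def is_namespace_attribute(attr):
    """True for xmlns and xmlns:prefix attributes."""
    return attr.name == "xmlns" or attr.name.startswith("xmlns:")


attrs = list(element.attributes.values())
namespace_attrs = [a for a in attrs if is_namespace_attribute(a)]
ordinary_attrs = [a for a in attrs if not is_namespace_attribute(a)]
# Join all text children in case the parser delivers the data in chunks.
text = "".join(n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE)

summary = {
    "element": (element.namespaceURI, element.localName, element.prefix),
    "attribute": [(a.namespaceURI, a.localName, a.prefix, a.value)
                  for a in ordinary_attrs],
    "namespace attributes": sorted(a.name for a in namespace_attrs),
    # The xml prefix is bound implicitly, so it adds one namespace item.
    "namespace items": len(namespace_attrs) + 1,
    "character items": len(text),
}
```

<p>The counts in <code>summary</code> correspond to the list: one ordinary attribute, two namespace attributes, three namespaces in scope and eleven characters.</p>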
</div>
<div>
<h2><a name="omitted">Appendix D: What is not in the Information Set</a></h2>
<p>The following information is not represented in the
current version of the XML Information Set (this list is not intended to
be exhaustive):</p> <ol>
<li>The content models of elements, from ELEMENT declarations in the DTD.
</li>
<li>The grouping and ordering of attribute declarations in ATTLIST declarations.
</li>
<li>The document type name.</li>
<li>White space outside the document element.</li>
<li>White space immediately following the target name of a PI.</li>
<li>Whether characters are represented by character references.</li>
<li>The difference between the two forms of an empty element: <code><foo/></code>
and <code><foo></foo></code>.</li>
<li>White space within start-tags (other than significant white space in attribute
values) and end-tags.</li>
<li>The difference between CR, CR-LF, and LF line termination.</li>
<li>The order of attributes within a start-tag.</li>
<li>The order of declarations within the DTD.</li>
<li>The boundaries of conditional sections in the DTD.</li>
<li>The boundaries of parameter entities in the DTD.</li>
<li>Comments in the DTD.</li>
<li>The location of declarations (whether in internal or external subset or
parameter entities).</li>
<li>Any ignored declarations, including those within an IGNORE conditional
section, as well as entity and attribute declarations ignored because previous
declarations override them. </li>
<li>The kind of quotation marks (single or double) used to quote attribute
values.</li>
<li>The boundaries of general parsed entities.</li>
<li>The boundaries of CDATA marked sections.</li>
<li>The default value of attributes declared in the DTD.</li>
</ol>
<div>
<h2><a name="rdfschema">Appendix E: RDF Schema (informative)</a></h2>
<p>
See <a href="#RDFNote">RDF Schema for the XML Information Set</a> for a formal
characterization of the Infoset.
</p>
</div> </div> </div></body>
</html>