

The .txt file contains a checklist of the patent numbers included in the archive, with one patent number per line. For week 4 of 2012, the checklist includes design patent numbers D0652606 through D0653015, plant patent numbers PP022464 through PP022468, reissue patent numbers RE043120 through RE043146, and utility patent numbers 08099794 through 08104093. The .html file includes a header (shown here) summarizing the contents of the archive.
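Because the checklist holds one patent number per line, it can be tallied by patent type with a few lines of Python. The following is only a minimal sketch: the function names `classify` and `summarize_checklist` are hypothetical, and the prefix rules (D for design, PP for plant, RE for reissue, all digits for utility) are inferred from the number ranges quoted above rather than taken from any USPTO specification.

```python
from collections import Counter


def classify(patent_number):
    """Guess the patent type from the prefix of a checklist entry.

    Assumption: prefixes follow the ranges quoted in the text
    (D = design, PP = plant, RE = reissue, digits only = utility).
    Anything else is reported as "other" rather than guessed at.
    """
    number = patent_number.strip().upper()
    if number.startswith("PP"):
        return "plant"
    if number.startswith("RE"):
        return "reissue"
    if number.startswith("D"):
        return "design"
    if number.isdigit():
        return "utility"
    return "other"


def summarize_checklist(lines):
    """Count checklist entries per patent type, skipping blank lines."""
    return Counter(classify(line) for line in lines if line.strip())


# Example with four entries taken from the ranges mentioned above.
sample = ["D0652606\n", "PP022464\n", "RE043120\n", "08099794\n"]
print(summarize_checklist(sample))
```

To run this against a real checklist, pass an open file object instead of the sample list, for example `summarize_checklist(open(path, encoding="utf-8"))`, where `path` points at the .txt file extracted from the archive.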
Notice that there is only one rather large .xml file. This is a concatenated XML file. As explained in the USPTO’s Bulk Data Product FAQs:
It is important to understand that the concatenated XML documents in the ZIP files, which have file extension "XML," are not the same as standard XML files and therefore will not be immediately readable by an ordinary XML parser. Instead, the files must be broken into individual XML documents, by splitting them apart at the XML declarations and/or DOCTYPE declarations.
Thus, unlike the CIPO’s archive, from which one may directly extract separate XML files corresponding to individual Canadian bibliographic patent documents, the USPTO’s concatenated XML files require some further processing. Since XML files consist only of text, and since each separate XML document within the USPTO’s concatenated XML file begins with its own XML declaration (e.g. <?xml version="1.0" encoding="UTF-8"?>), it is relatively straightforward to split the concatenated XML file into separate XML files at those declarations. For week 4 of 2012 this should yield 4,725 separate XML files, as shown in the header above.
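The splitting step might be sketched in Python as follows. This is an illustration under stated assumptions, not the USPTO's recommended tool: it assumes every embedded document starts with the exact declaration quoted above and that the declaration begins a line. The names `split_concatenated_xml` and `write_documents`, and the output file naming scheme, are hypothetical.

```python
from pathlib import Path

# Assumption: every embedded document starts with exactly this declaration.
XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>'


def split_concatenated_xml(lines, declaration=XML_DECL):
    """Yield each embedded XML document as one string.

    `lines` is any iterable of text lines (an open file works), so the
    large concatenated file is never loaded into memory all at once.
    """
    current = []
    for line in lines:
        # A line starting with the declaration opens the next document.
        if line.lstrip().startswith(declaration):
            if any(part.strip() for part in current):
                yield "".join(current)
            current = []
        current.append(line)
    # Emit the final document, ignoring trailing whitespace-only content.
    if any(part.strip() for part in current):
        yield "".join(current)


def write_documents(source, out_dir):
    """Write each embedded document to its own file; return the count."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    with open(source, encoding="utf-8") as handle:
        for count, doc in enumerate(split_concatenated_xml(handle), start=1):
            # Hypothetical naming scheme: doc_00001.xml, doc_00002.xml, ...
            (out / f"doc_{count:05d}.xml").write_text(doc, encoding="utf-8")
    return count
```

Calling `write_documents` with the path of the extracted .xml file and an output directory returns the number of documents written, which can be compared against the count reported in the archive's .html header. Each resulting file can then be handed to an ordinary XML parser.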