Google Patent Data Analytics: 2013

Monday 23 December 2013

Attorney, Agent or Firm: the <agent rep-type="attorney"> attribute

One may be registered to practice before the USPTO as a patent attorney or as a patent agent (see generally 37 CFR §11.6).

In a previous post I considered this small extract from the <us-bibliographic-data-grant> element of the USPTO’s XML publication for United States Patent No. 8332851 and observed that the well known IP firm Fish & Richardson P.C. handled the prosecution of the application from which the ‘851 patent issued. Notice the rep-type="attorney" attribute in the <agent> element depicted here.

Does this mean that the rep-type attribute is populated to specify the attorney vs. agent registration status of the practitioner(s) who prosecuted the application from which the patent in question issued? If the answer is "yes" then we would expect to find some rep-type="agent" attributes within a reasonably large dataset.  Let's explore.

I ran this simple SQL query against a database I constructed from the USPTO’s XML bibliographic data for US patents which issued in 2012.  In plain English, the query says "show me the rep-type attribute for every practitioner record in the database, but ignore records with rep-type='attorney' ".  The query returned zero rows.  Therefore, every practitioner record in the database has rep-type="attorney". In other words, there are no occurrences of rep-type="agent". Had any of the XML documents used to construct the database contained a rep-type="agent" attribute, the query would have returned at least one row.
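The query itself appears only as an image in the original post. As a rough sketch of an equivalent query, assume a hypothetical table named agents with one row per practitioner record and columns patent_number, orgname and rep_type. The table name, the column names and the two sample rows are illustrative assumptions, not the actual schema of the database described above.

```python
import sqlite3

# Hypothetical schema: one row per practitioner record taken from an <agent> element.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agents (patent_number TEXT, orgname TEXT, rep_type TEXT)")
conn.executemany(
    "INSERT INTO agents VALUES (?, ?, ?)",
    [
        ("8332851", "Fish & Richardson P.C.", "attorney"),  # illustrative row
        ("8305996", "Inventek", "attorney"),                # illustrative row
    ],
)

# Show the rep-type of every practitioner record, ignoring rep-type='attorney'.
non_attorney_rows = conn.execute(
    "SELECT patent_number, orgname, rep_type FROM agents WHERE rep_type <> 'attorney'"
).fetchall()

print(len(non_attorney_rows))
```

Note that in SQL a row whose rep_type is NULL is not returned by the <> comparison, so a stricter check would add "OR rep_type IS NULL" to the WHERE clause.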

Does this mean that none of the US patents which issued in 2012 were prosecuted by a registered patent agent as opposed to a registered patent attorney? That seems unlikely, but we need a counter-example in order to reach a definitive conclusion.

Inventek is the Oakland, CA intellectual property service firm of Dr. Dov Rosenfeld, who is a registered US patent agent. This image (click the image to enlarge it) is a screen capture of a visualization I created via this blog's "Assignees & Attorneys" tab by typing "Inventek" in the "Search for Attorney or Agent:" box. The visualization shows that Inventek (Dr. Rosenfeld) prosecuted US patent applications which resulted in the grant of 32 US patents in 2012.

By way of example, one of those patents is US 8305996 which issued on 6 November 2012. The red-underlined portion of this partial image of the ‘996 patent’s cover sheet shows that the corresponding US patent application was prosecuted by Dr. Rosenfeld/Inventek.







This query result set (again, click the image to enlarge it) is based on the database mentioned above and shows some basic details of the 32 US patents which issued in 2012 from applications prosecuted by Dr. Rosenfeld. The "Representative Type" column again corresponds to the rep-type attribute and reveals that every document in the result set has rep-type="attorney". There are no occurrences of rep-type="agent".










It is thus apparent that the rep-type attribute in the USPTO’s bibliographic data is populated as rep-type="attorney" without regard to the practitioner’s registration classification (i.e. attorney vs. agent).  That's not surprising, since a practitioner's registration classification is not required in documents submitted to the USPTO in support of a US patent application.  For example, the Representative Information section of the USPTO's Application Data Sheet (shown here) can be configured to identify a specific practitioner by name and USPTO registration number, but the practitioner's registration classification is not required.  The same is true of the USPTO's Power of Attorney and Customer Number forms.

Monday 16 December 2013

Attorney, Agent or Firm: the <agent> element

Some IP firms have multiple offices. Those firms’ internal docketing systems undoubtedly include detailed particulars of each patent case prosecuted by the firm, with an indication of which one of the firm’s offices is responsible for each case. Multi-office firms can accordingly use data mining techniques to develop metrics representative of various aspects of the operations of their different offices. However, that is feasible only for those with access to the data, i.e. the specific firms which have accumulated the data internally.

Can data mining techniques be applied to bibliographic patent data to develop metrics representative of the operations of different offices of a multi-office IP firm? Not directly.

Consider United States Patent No. 8332851 which issued on 11 December 2012 to SAP AG for an invention of Ostermeier et al. entitled Configuration and Execution of Mass Data Run Objects. The red-underlined portion of this partial image of the ‘851 patent’s cover sheet tells us that the corresponding US patent application was prosecuted by the well known IP firm Fish & Richardson P.C. As of this writing, Fish & Richardson P.C. has offices in Atlanta, Austin, Boston, Dallas, Delaware, Houston, Munich, New York, Silicon Valley, Southern California, the Twin Cities and Washington, DC. Which one of Fish & Richardson’s 11 US offices handled the prosecution of the application from which the ‘851 patent issued?  The cover sheet does not tell us.

Now consider this small extract from the <us-bibliographic-data-grant> element of the USPTO’s XML publication for the ‘851 patent. The <us-bibliographic-data-grant> element encapsulates the <parties> element, which in turn encapsulates the <agents> element (as well as the <applicants> element). The <agent>, <addressbook>, <orgname>, <address> and <country> elements encapsulated and sub-encapsulated by the <agents> element are sparsely populated. We see Fish & Richardson P.C. encapsulated by the <orgname></orgname> tag pair, but none of the other elements encapsulate data apart from the "attorney" attribute in the <agent> element and the "unknown" text encapsulated by the <address><country></country></address> hierarchical tags. Is that "unknown" an aberration or a data glitch? No, it is not.
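The extract itself is an image in the original post. As a hedged sketch, the fragment below is reconstructed from the description above (the element names and values come from the text, but the exact layout is an assumption) and read with Python's standard xml.etree.ElementTree module. The function name read_agents is hypothetical.

```python
import xml.etree.ElementTree as ET

# Reconstructed extract: values are those described in the post, layout is assumed.
AGENTS_XML = """
<agents>
  <agent rep-type="attorney">
    <addressbook>
      <orgname>Fish &amp; Richardson P.C.</orgname>
      <address>
        <country>unknown</country>
      </address>
    </addressbook>
  </agent>
</agents>
"""

def read_agents(xml_text):
    """Return one dict per <agent> element with its rep-type, orgname and country."""
    root = ET.fromstring(xml_text)
    records = []
    for agent in root.iter("agent"):
        records.append({
            "rep_type": agent.get("rep-type"),
            "orgname": agent.findtext("addressbook/orgname"),
            "country": agent.findtext("addressbook/address/country"),
        })
    return records

agent_records = read_agents(AGENTS_XML)
print(agent_records)
```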

This query result set is based on a database I constructed from the USPTO’s bibliographic data for US patents which issued in 2012. As can be seen, the rightmost "country" column (which corresponds to the aforementioned <address><country></country></address> hierarchical tags) contains the text "unknown" for every patent listed here. The same is true for the entire dataset. It is thus apparent that the bibliographic data does not tell us which one of the 11 US offices of Fish & Richardson P.C. handled the prosecution of the application from which the ‘851 patent issued. So which office was it?


The answer is the Dallas, Texas office. This is revealed by looking the case up via the USPTO’s public Patent Application Information Retrieval (i.e. public PAIR) system. Specifically, this portion of the filing transmittal for the application from which the ‘851 patent issued appears on the letterhead of the Fish & Richardson P.C. Dallas, Texas office (and also cites the firm’s Minneapolis office address; presumably to facilitate centralized docketing for all of the firm’s offices).

In summary, the USPTO’s bibliographic data identifies the attorney, agent or firm by name only. No address information (not even a country identifier) is provided, so it is not possible to discriminate between different offices of the same firm solely by reference to the bibliographic data.

Monday 9 December 2013

Correlation of USPTO Art Units with IPC subclasses

What correlation, if any, is there between the art units to which the USPTO allocates patent applications for examination and the International Patent Classification subclasses allocated to the claimed inventions encompassed by the resultant patents? One would expect a fairly strong correlation, given the USPTO’s allocation of US patent classifications to art units and further in view of similarities between the IPC and US patent classification schemes.  For background, see the USPTO’s tabulation of US Patent Classes Arranged by Art Unit and its US patent classification to IPC concordance.

This scatter plot shows, along the horizontal axis, numbers of US patents which issued in 2012 for different USPTO art units; and along the vertical axis, numbers of those patents in terms of their primary IPC subclasses; both of those measures being referenced to the assignee dimension (i.e. the parties to whom the patents issued).

The positive slope trend line reveals a strong positive correlation, as expected. The underlying trend model has an r² value of 0.843717, so r (the correlation value, in accordance with the Pearson correlation) is 0.92. Correlation values close to either +1 or -1 represent strong positive or negative correlation respectively between the measures being compared. Correlation values closer to zero represent weaker correlation, or none at all.
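For a simple linear trend model, the r-squared value is the square of the Pearson correlation value, so r can be recovered by taking a square root. A minimal sketch of that arithmetic, using the r-squared value quoted above (the variable names are illustrative):

```python
import math

r_squared = 0.843717       # r-squared value reported for the trend model
r = math.sqrt(r_squared)   # positive root, because the trend line slopes upward
print(round(r, 2))
```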

Monday 2 December 2013

Office specific data elements

WIPO’s ST.36 Standard recommendation for the processing of patent information using XML includes a provision for so-called office-specific-data elements. These elements can be used to encapsulate details unique to a particular country in patent bibliographic data published for that country. ST.36 also provides alternative mechanisms for accomplishing the same thing, e.g. mixing office-specific elements with international common elements.

Let’s consider an example. United States patent no. 8260535 issued on 4 September 2012 to Bombardier Recreational Products Inc. of Valcourt, Québec, Canada for an invention of Mario Dagenais entitled "Load sensor for a vehicle electronic stability system". The ‘535 patent issued from United States patent application no. 11/864,265 which was filed on 28 September 2007.

As of this writing, an apparently corresponding Canadian patent application is pending, namely CA 2699332 which was published by the Canadian Intellectual Property Office as of 2 April 2009. The ‘332 application is a Canadian national counterpart of international application PCT/US2008/070129 which was filed on 16 July 2008 and claimed priority based upon the US ‘265 application. PCT/US2008/070129 was published on 2 April 2009 as WO2009/042276.

Consider this small extract from the <ca-bibliographic-data> element of the Canadian Intellectual Property Office’s XML publication for the ‘332 application. The <ca-bibliographic-data> element encapsulates the <publication-reference> element, which in turn encapsulates the <document-id> element. The <country>, <doc-number>, <kind> and <date> elements encapsulated by the <document-id> element identify this as Canadian patent application no. 2699332 published 2 April 2009, as aforesaid.

Now consider this corresponding small extract from the <us-bibliographic-data-grant> element of the USPTO’s XML publication for the ‘535 patent. Some structural similarities are evident: the <us-bibliographic-data-grant> element encapsulates the <publication-reference> element, which in turn encapsulates the <document-id> element. The <country>, <doc-number>, <kind> and <date> elements encapsulated by the <document-id> element identify this as United States patent no. 8260535 issued 4 September 2012, as aforesaid.

The foregoing are examples of international common elements.

Now consider this small extract from the <ca-office-specific-bib-data> element of the CIPO’s XML publication for the ‘332 application. Among other things, this encapsulates the <ca-license-for-sale> element.  This is a uniquely Canadian feature whereby a patentee may request that a patent be flagged as available for license or sale when details of that patent are published in the CIPO’s weekly Canadian Patent Office Record. In this case the <ca-license-for-sale> element encapsulates the word "false", indicating that no such flag has been raised in respect of the ‘332 application—not surprising since the ‘332 application has not issued as a Canadian patent as of this writing.

Finally, consider this small extract from the concluding portion of the <us-bibliographic-data-grant> element of the USPTO’s XML publication for the ‘535 patent. Here we have some uniquely American features identifying the primary examiner (Khoi Tran), the USPTO art unit (3664: robotics and vehicle controls) and the assistant examiner (Bhavesh V. Amin) for the ‘535 patent.

We can thus see that the CIPO has opted to implement ST.36 by assigning office-specific-data elements to encapsulate details unique to Canada, whereas the USPTO implements ST.36 by mixing office-specific elements with international common elements.

Monday 25 November 2013

Plant patents

The US is one of the few countries that grants plant patents. Canada grants "plant breeders’ rights" which are administered not by the Canadian Intellectual Property Office but by the Plant Breeders’ Rights Office, which is part of the Canadian Food Inspection Agency.

The USPTO’s group art unit 1661 handles plant patent applications. US plant patents are allocated one of two kind codes, P2 or P3, depending on whether the application from which the patent issued underwent pre-grant publication (P3) or not (P2).

The USPTO’s bibliographic patent data contains a wealth of information for plant patents, including the botanical denomination (i.e. Latin name) of the genus and species of the patented plant; and the plant variety designation (i.e. cultivar name) of the patented plant. For example, US plant patent PP23241 issued on 4 December 2012 for a plant having the botanical denomination Echinacea purpurea and variety designation "Quills and Thrills". According to the patent’s abstract, the plant is "characterized by large inflorescences with quilled ray florets of purple pink, a compact, multicrown habit, a long bloom time, and excellent vigor".


It’s relatively straightforward to feed bibliographic data of this sort directly from a visualization tool into a search engine, to obtain further information. For example, the botanical denomination or the variety designation can be fed into a search engine directly off the visualization to get an image of the plant—as seen here for "Quills and Thrills".


As further examples, either the botanical denomination or the variety designation can be fed into a search engine directly off the visualization to look up the corresponding plant patent(s) in the Google patents database, or to see if the botanical denomination appears in The Plant List (as seen here for Echinacea purpurea), etc.



Click the "Plant Patents" tab above to explore these and other aspects of the USPTO’s 2012 plant patents.

Monday 18 November 2013

The USPTO’s patent bibliographic data—concatenated XML

In a previous post I mentioned that the Canadian Intellectual Property Office’s 2012 XML format patent bibliographic data is provided in a single 188 MB archive from which 58,572 separate XML files can be extracted. 21,592 of those files correspond to Canadian utility patents which issued in 2012. The other files correspond to Canadian laid-open applications (kind code A1), reissue patents (kind code E), re-examined patents (kind code F) and republished versions of previously published files (e.g. to correct errors). The files range in size from about 2 KB to about 39 KB, with an average size of about 3.3 KB.

In contrast, as shown in this portion of Google’s "USPTO Bulk Downloads: Patent Grant Bibliographic Data" web page, the USPTO’s 2012 patent bibliographic data is provided in 52 separate .zip archives—one per week (recall that US patents are issued in batches, on Tuesday of each week throughout the year). Unless you relish the prospect of manually downloading 52 separate archives one at a time, you’ll want to consider using a bulk file download utility.

After downloading one or more of the USPTO’s weekly bibliographic data archives you can extract the contents of each archive. Unlike the CIPO’s archive, which extracts into a multiplicity of XML files—one per Canadian patent bibliographic document—the USPTO’s archives extract into just three files. This example shows details of the three files extracted from the USPTO’s bibliographic data archive for week 4 of 2012, i.e. bibliographic data for US patents which issued on Tuesday, 24 January 2012.

The .txt file contains a checklist of the patent numbers included in the archive, with one patent number per line. For week 4 of 2012 the checklist includes design patent numbers D0652606 through D0653015, plant patent numbers PP022464 through PP022468, reissue patent numbers RE043120 through RE043146 and utility patent numbers 08099794 through 08104093. The .html file includes a header (shown here—click to enlarge) summarizing the contents of the archive.

Notice that there is only one rather large .xml file. This is a concatenated XML file. As explained in the USPTO’s Bulk Data Product FAQs:
It is important to understand that the concatenated XML documents in the ZIP files, which have file extension "XML," are not the same as standard XML files and therefore will not be immediately readable by an ordinary XML parser. Instead, the files must be broken into individual XML documents, by splitting them apart at the XML declarations and/or DOCTYPE declarations.

Thus, unlike the CIPO’s archive from which one may directly extract separate XML files corresponding to individual Canadian bibliographic patent documents, some further processing of the USPTO’s concatenated XML files is required. Since XML files consist only of text and since each separate XML document within the USPTO’s concatenated XML file is prefaced by an XML declaration header (e.g. <?xml version="1.0" encoding="UTF-8"?>) it is relatively straightforward to split the concatenated XML file into separate XML files. For week 4 of 2012 this should yield 4,725 separate XML files, as shown in the header above.
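One possible way to do the splitting is sketched below in Python. It is a sketch under stated assumptions, not the method actually used for this blog: the function name split_concatenated_xml and the tiny sample text are hypothetical, and the whole file is assumed to fit in memory as a single string.

```python
import re

# Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECLARATION = re.compile(r"<\?xml\s[^>]*\?>")

def split_concatenated_xml(text):
    """Split concatenated XML text into a list of individual XML documents."""
    starts = [match.start() for match in XML_DECLARATION.finditer(text)]
    documents = []
    for index, start in enumerate(starts):
        # Each document runs from its declaration to the next one (or to the end).
        end = starts[index + 1] if index + 1 < len(starts) else len(text)
        documents.append(text[start:end].strip())
    return documents

# Tiny stand-in for a weekly file; a real file holds thousands of documents.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n<doc>first</doc>\n'
    '<?xml version="1.0" encoding="UTF-8"?>\n<doc>second</doc>\n'
)
documents = split_concatenated_xml(sample)
print(len(documents))
```

Each string in the resulting list can then be written to its own file or handed to an ordinary XML parser. For very large weekly files, reading line by line and starting a new document at each declaration would use less memory.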

Monday 11 November 2013

Working with XML format patent bibliographic data

The USPTO issues United States patents in batches, on Tuesday of each week throughout the year. The Canadian Intellectual Property Office does the same: Canadian patents are issued in batches, on Tuesday of each week throughout the year.

According to the USPTO’s statistics 253,155 US utility patents were granted in 2012. I’m ignoring reissue, design and plant patents for comparison purposes. Canada does not grant design or plant patents. Instead of design patents, Canada grants industrial design registrations. Instead of plant patents, Canada grants plant breeders’ rights (these are administered not by the CIPO but by the Plant Breeders’ Rights Office, which is part of the Canadian Food Inspection Agency).

A search of the CIPO’s online patent database reveals that 21,592 Canadian utility patents issued in 2012. So, in 2012, the volume of Canadian utility patent grants was about 8.5% of the volume of US utility patent grants. An even greater disparity appears in relation to reissue patents: the USPTO granted 822 reissue patents in 2012, but only 20 Canadian reissue patents were granted in the decade spanning 2001-2011.

The USPTO and the CIPO publish bibliographic data for their respective granted patents in XML format, in accordance with WIPO’s ST.36 standard. The CIPO’s Canadian patent bibliographic data XML files are typically provided in .zip type archive files. For example, the CIPO’s 2012 XML format patent bibliographic data is provided in a 188 MB archive from which 58,572 separate XML files can be extracted. However, those XML files pertain not only to granted utility patents (kind code C) but also to laid-open applications (kind code A1), reissue patents (kind code E) and re-examined patents (kind code F).

Moreover, the CIPO may republish a patent bibliographic data XML file if an error is detected in a previously published version thereof.  For example, the CIPO’s 2012 patent bibliographic data archive includes an XML file for Canadian patent no. 2121906 which issued on 29 April 1993. As shown here, that XML file contains a pair of <ca-date-updated></ca-date-updated> XML tags encapsulating the 31 December 2012 date on which the CIPO republished its XML bibliographic data file for the ‘906 patent (New Year’s Eve 2012 fell on a Monday).  In processing the CIPO’s patent bibliographic data, one must take any such republication into account and perform appropriate update operations on existing data.
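A minimal sketch of one such update operation follows, assuming each parsed file has been reduced to a hypothetical record with a document number and a version date in YYYYMMDD form. The field names, the apply_record function and the placeholder values (including the use of the issue date as the first version date) are illustrative assumptions only.

```python
def apply_record(store, record):
    """Insert a record, or replace the stored one if the incoming record is newer.

    `record` is a hypothetical dict with 'doc_number' and 'version_date' (YYYYMMDD),
    so that plain string comparison orders the dates correctly.
    """
    existing = store.get(record["doc_number"])
    if existing is None or record["version_date"] > existing["version_date"]:
        store[record["doc_number"]] = record
    return store

store = {}
# Placeholder values: an earlier version followed by the republished version.
apply_record(store, {"doc_number": "2121906", "version_date": "19930429", "note": "original"})
apply_record(store, {"doc_number": "2121906", "version_date": "20121231", "note": "republished"})
print(store["2121906"]["note"])
```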

This brief discussion touches on only some issues that one must be cognizant of in processing XML format patent bibliographic data. Next week I’ll discuss another issue specific to the USPTO’s XML format patent bibliographic data.

Monday 4 November 2013

Viewing XML format patent bibliographic data

OK—you have obtained some bibliographic patent data in XML format from one of the available sources (see last week’s post). Now what?

You can inspect individual XML files with an XML file viewer. XML files consist of text only, so a text editor such as the Microsoft Windows Notepad utility will do. Here is a small portion of the USPTO’s XML file for US patent no. 8329177 as viewed in Notepad. You can see the descriptive tag pairs (e.g. <document-id></document-id>) encapsulating the information content, but the tags’ hierarchical structure isn’t readily apparent via Notepad.

The tags’ hierarchical structure is more apparent if we inspect the XML file with a spreadsheet program such as Microsoft Excel. Here is a small portion of the same XML file (i.e. US 8329177) as viewed in Excel.  The file has been converted into the familiar spreadsheet row/column format, with the tags appearing as column headings and the encapsulated information shown in rows beneath the respective headings. The conversion (or "flattening") process repeats information in some cells, as seen here.  The column headers have been narrowed to show more columns.   When viewed in Excel, the XML file for US 8329177 has 53 rows and 131 columns, so the worksheet has a total of 6,943 cells. However, the worksheet is rather sparse: only 2,408 of those cells contain information.

A web browser can also be used to inspect an XML file. Here is the same small portion of the XML file for US 8329177 shown above in Notepad, as viewed in Microsoft Internet Explorer. In this case, the tags’ hierarchical structure is made apparent by color highlighting and by hierarchical indentation levels. The encapsulated information is bolded.  This makes it somewhat easier to browse through the contents of a single XML file—if that is what you want to do.
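The same kind of indented view can be produced with a few lines of code. The sketch below uses Python's standard xml.etree.ElementTree module. The outline function is hypothetical, and the sample fragment is a small stand-in rather than the actual content of the file for US 8329177.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return indented lines showing each tag and any text it encapsulates."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

# Small stand-in fragment; only the patent number is taken from the post.
sample = "<document-id><country>US</country><doc-number>08329177</doc-number></document-id>"
outline_lines = outline(ET.fromstring(sample))
print("\n".join(outline_lines))
```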

None of this is very helpful, unless you are only interested in a particular XML file’s bibliographic patent data content and don’t mind manually inspecting and deciphering the file’s tagged information as outlined above. More generally, what we want to do is examine the information content of a number (preferably a very large number) of bibliographic patent data XML files in parallel. Future posts will delve into that topic.

Monday 28 October 2013

Bibliographic patent data sources

As previously mentioned, many patent offices publish their bibliographic patent data in XML format in accordance with WIPO standard ST.36.  Where can you find the data?  As of the date of this post, the following web pages provide download links or purchase order information for such data.
Note that the data may not be free.  The USPTO makes its data available free of charge, with no usage restrictions.  The CIPO also makes its data available free of charge, but only for non-commercial use.  Each patent office may impose different pricing and/or usage constraints on its data.

Monday 21 October 2013

XML format patent bibliographic data

Many patent offices (e.g. USPTO, EPO, WIPO, SIPO, CIPO) publish their bibliographic patent data in XML format in accordance with WIPO standard ST.36.   Some patent offices (e.g. CIPO) make the data available free of charge, for non-commercial use.   Others (e.g. USPTO) make the data available free of charge with no usage restrictions.

Like HTML, XML (extensible markup language) employs tags to encapsulate information.  Unlike HTML tags, XML tags impart no display characteristics (e.g. fonts) to the tagged information.  Also unlike HTML tags, XML tags are user-definable.  This means that they can be—and usually are—self-describing.  XML tags can also be arranged, e.g. nested to present information hierarchically.  Patent bibliographic data stored in the XML format defined by WIPO's ST.36 standard utilizes self-describing tags which are defined and hierarchically arranged in accordance with the standard.


Consider this extract from the USPTO's XML document for US patent no. 8309744.  Notice the field tags.  For example, the <country></country> tag pair encapsulates the “US” country code, telling us that this document pertains to a US patent.

The <doc-number></doc-number> tag pair encapsulates “08309744”, telling us the document's number.

The <kind></kind> tag pair encapsulates “B2”, telling us that the document is a granted utility patent.

The <date></date> tag pair encapsulates “20121113”, telling us that the patent issued on November 13, 2012.

Those four tag pairs are nested within the <document-id></document-id> tag pair which is in turn nested within the <publication-reference></publication-reference> tag pair.  The information encapsulated by those tag pairs identifies the published document.


The <document-id></document-id>, <country></country>, <doc-number></doc-number> and <date></date> tag pairs are also hierarchically nested within a pair of <application-reference></application-reference> tags.  Since the tags are self-describing, you can easily see that the encapsulated information tells us that the '744 patent issued from US application serial no. 13/081,794 which was filed on April 7, 2011.
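To show how a program reads the same nesting, here is a hedged sketch using Python's standard xml.etree.ElementTree module. The fragment is reconstructed from the values quoted above. Its exact layout, the enclosing <us-bibliographic-data-grant> wrapper and the unpunctuated form of the application number are assumptions, and the document_id helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Reconstructed from the values quoted in the post; layout is assumed.
EXTRACT = """
<us-bibliographic-data-grant>
  <publication-reference>
    <document-id>
      <country>US</country>
      <doc-number>08309744</doc-number>
      <kind>B2</kind>
      <date>20121113</date>
    </document-id>
  </publication-reference>
  <application-reference>
    <document-id>
      <country>US</country>
      <doc-number>13081794</doc-number>
      <date>20110407</date>
    </document-id>
  </application-reference>
</us-bibliographic-data-grant>
"""

def document_id(root, reference_tag):
    """Return the child tags of <reference_tag>/<document-id> as a dict."""
    node = root.find(reference_tag + "/document-id")
    return {child.tag: child.text for child in node}

root = ET.fromstring(EXTRACT)
publication = document_id(root, "publication-reference")
application = document_id(root, "application-reference")
print(publication)
print(application)
```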

The depicted extract is just a small part of the USPTO's XML document publication for US patent no. 8309744.  Anyone familiar with patent information could read the raw XML document and discern its meaning fairly readily.  However, XML documents are not normally intended for human reading.  Their primary purpose is to preserve a document's organization and structure in computer-readable form.  The visualizations presented via this blog were developed by computer processing of XML documents corresponding to the visualized patent publications.

Monday 14 October 2013

Patent bibliographic data basics

Bibliography is the description of books using details such as author, publication date, edition, etc. which collectively constitute bibliographic data. In relation to patents, bibliographic data encompasses details such as country, patent number & issue date; application number & filing date; priority number(s), country(ies) & date(s); invention title; inventor name(s), citizenship & address; assignee name(s), nationality & residence; and much more.

Have a look at the cover sheet of this United States patent. Everything that you see here—plus more information that you do not see here—constitutes this patent’s bibliographic data.

The visualizations presented via this blog make only limited use of the full range of available patent bibliographic data. In general, text and image information (e.g. abstract, description, claims, drawings) is not used. For the most part, information that can be counted is used.

For example, the question “how many patents did firm X prosecute on behalf of assignee Y for inventions handled by USPTO art unit Z ?” is answered by counting the number of patents which satisfy all three of those criteria. Accordingly, patent bibliographic details such as firm names, assignee names and art unit numbers are utilized. But, apart from counting the total number of claims in a patent, neither the text comprising a patent’s abstract, description and claims nor the drawing images are useful for the purposes of the visualizations presented via this blog.
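The counting described above can be sketched in a few lines of Python. The records, the firm and assignee names, and the count_matching function are all hypothetical placeholders. Real records would come from the parsed bibliographic data.

```python
# Hypothetical records: one dict per patent with the three counted details.
patents = [
    {"firm": "Firm X", "assignee": "Assignee Y", "art_unit": "2617"},
    {"firm": "Firm X", "assignee": "Assignee Y", "art_unit": "2624"},
    {"firm": "Firm X", "assignee": "Assignee Q", "art_unit": "2617"},
    {"firm": "Firm X", "assignee": "Assignee Y", "art_unit": "2617"},
]

def count_matching(records, firm, assignee, art_unit):
    """Count the records that satisfy all three criteria."""
    return sum(
        1
        for record in records
        if record["firm"] == firm
        and record["assignee"] == assignee
        and record["art_unit"] == art_unit
    )

matches = count_matching(patents, "Firm X", "Assignee Y", "2617")
print(matches)
```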

Some dates can be useful, especially if they facilitate calculation of meaningful statistics for a large group of documents. For example, the time span between an application’s filing date and the corresponding patent’s issue date provides a useful measure that can be used to address questions such as “What is the average filing-to-issue time in years for US patents which issued in 2012 to assignee X for inventions in IPC subclass G06Q ?”
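As a rough illustration of that calculation, the sketch below averages the filing-to-issue time for two patents discussed elsewhere on this blog. The two patents do not share an assignee or an IPC subclass, so the result only illustrates the arithmetic. The average_years function and the 365.25 days-per-year convention are assumptions.

```python
from datetime import date

# (filing date, issue date) pairs for two patents discussed on this blog.
cases = [
    (date(2011, 4, 7), date(2012, 11, 13)),   # US 8309744
    (date(2007, 9, 28), date(2012, 9, 4)),    # US 8260535
]

def average_years(pairs):
    """Average filing-to-issue time in years, using 365.25 days per year."""
    total_days = sum((issued - filed).days for filed, issued in pairs)
    return total_days / len(pairs) / 365.25

average = average_years(cases)
print(round(average, 2))
```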

In future posts I’ll delve more deeply into other aspects of patent bibliographic data.

Monday 7 October 2013

Bubble charts

Bubble charts are sometimes useful for visualizing data. This example uses color to encode country (mauve = Finland, peach = Israel, green = Italy) and size to encode number of patent documents. The labels identify USPTO art units. Overall, the visualization compares Finland, Israel and Italy in terms of the number of US patents which issued in 2012 to assignees located in those countries and which were allocated by the USPTO to one of five different art units. The five art units are:

  • 2617 (cellular telephony)
  • 2618 (radio/satellite communications)
  • 2624 (image analysis)
  • 2916 (a design patent art unit)
  • 2913 (another design patent art unit)
You can easily see that, for Finland, art unit 2617 is the most significant one of the five. For Italy it’s art unit 2913 and for Israel it’s art unit 2624. In the underlying dataset, the Finland/2617 bubble corresponds to 127 patents, the Italy/2913 bubble corresponds to 54 patents and the Israel/2624 bubble corresponds to 48 patents.

For Finland, the next two most significant art units are 2916 and 2618 in that order, but you need to look closely to determine each bubble's size to get them in the right sequence. The Finland/2916 bubble corresponds to 51 patents and the Finland/2618 bubble corresponds to 46 patents. Difficulty in distinguishing bubble sizes is a downside of bubble charts.

For Israel, the next two most significant art units are 2617 and 2618 in that order, as is reasonably apparent from the bubbles’ respective sizes.

For Italy, the next two most significant art units are 2617 and 2624 in that order, but again you need to look closely to get them in the right order.  The Italy/2617 bubble corresponds to 23 patents and the Italy/2624 bubble corresponds to 18 patents.

The bubble size discrimination problem can be addressed by adding ranking values (e.g. 1, 2, 3...) to the bubbles within each color group, by applying different patterns corresponding to the number of patents represented by each bubble, etc. However, such techniques can distract the viewer without adequately addressing the problem.

Bubble charts are useful if you only want to see an approximation. But, if precision matters, bubble charts may not be the best choice. If you look back at my "Top technology sectors by country" post, you’ll see that I used data bars to compare Finland, Israel and Italy in a different context. Consider whether it’s easier to understand the data bar visualization or the bubble chart visualization.

Monday 30 September 2013

Top technology sectors by country

Which technology sectors are of primary significance to Finland? What about Israel or Italy? This visualization compares those countries, showing the top-5 most significant USPTO art units per country, in terms of numbers of US patents which issued in 2012 to assignees in those countries.

You can immediately see that art unit 2617 (cellular telephony) is the most significant art unit for Finland. For Israel it’s art unit 2624 (image analysis) and for Italy it’s art unit 2913 (one of the USPTO’s several design patent art units). No surprises there. Let’s look further.

For Finland, the next most significant art unit is 2916—another design patent art unit. Is that a surprise? No—the underlying patents pertain to handset design, a vital ingredient in the fiercely competitive cellular telephony sector.  [Click the FAQ tab and read item 6 to see how to check the underlying patents.]


Continuing with Finland, art unit 2618 (radio and satellite communications) is next—another close fit with cellular telephony. But then we have art unit 1741 (tires, adhesive bonding, glass/paper making, plastics shaping & molding). Does that make sense? Indeed it does—consider Finland’s strong pulp & paper technology sector and note that paper making comes within art unit 1741.

For Israel, the next most significant art unit is 2617 (cellular telephony), but note what comes next: art unit 1661 (plant patents). Is that a surprise? No—Israel has a significant flower export market.

No surprises when we look more closely at Italy either—the top-5 art units all pertain to design patents, reflecting Italy’s prominence in industrial design.

You can experiment with top-5 art unit (and top-5 IPC subclass) breakdowns for any countries of interest by clicking the Top Technologies tab above. Notice that, for Italy, the top IPC subclass is "not indicated". Why is that? It’s because there is no IPC classification for design patents.  As noted above, Italy’s top-5 technologies (in terms of USPTO art units and patents issued in 2012) pertain to design patents.
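The top-5 ranking behind this visualization can be sketched in a few lines of Python. This is only an illustration of the counting technique, not the code behind the visualization: the function name `top_art_units` is hypothetical, the input is assumed to be one (assignee country, art unit) pair per patent, and the sample rows are made up rather than taken from the USPTO data.

```python
from collections import Counter, defaultdict


def top_art_units(records, n=5):
    """Return {country: [(art_unit, patent_count), ...]} holding the n largest counts."""
    counts = defaultdict(Counter)
    for country, art_unit in records:
        counts[country][art_unit] += 1
    return {country: c.most_common(n) for country, c in counts.items()}


# Made-up rows of (assignee country, art unit), one per patent; not real USPTO counts.
sample = [
    ("FI", "2617"), ("FI", "2617"), ("FI", "2617"),
    ("FI", "2916"), ("FI", "2916"),
    ("FI", "1741"),
    ("IL", "2624"), ("IL", "2624"),
    ("IL", "2617"),
]

ranking = top_art_units(sample, n=2)
print(ranking["FI"])
print(ranking["IL"])
```

The same grouping works for IPC subclasses: swap the second element of each pair for the subclass, and use a placeholder such as "not indicated" where a design patent has none.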

Monday 23 September 2013

PCT utilization

Do most US patent applications originate as US national phase entries of Patent Cooperation Treaty applications, or are they directly filed in the USPTO without regard to the PCT?   The answer depends on whose applications we’re talking about.   Let’s explore.

Click the Assignees & Attorneys tab to open a visualization of the USPTO’s 2012 patent grant data.   Note the order of the countries listed in the Assignee Country section at the bottom.   You should see US, JP, Null (unknown), KR, DE, TW, FR, CN in that order (i.e. ranked according to the number of patent documents per assignee country).

Note the Filing Type section of the visualization.   Hover your mouse over it for an explanation of the PCT and non-PCT filing types.   Click the PCT row to reconfigure the visualization with data pertaining only to US patents which issued in 2012 from applications that entered the US national phase via the PCT.   The plant and design patent kind codes no longer appear—only the utility patent (and corresponding reissue) kind codes remain.   That makes sense, since the PCT pertains only to utility patent applications.

Note the new ranking order in the Assignee Country section. You should see JP, US, DE, FR, Null (unknown), KR, GB, NL in that order. This tells us that, in the context of US patents which issued in 2012, Japanese assignees were the heaviest users of the PCT national entry mechanism, followed by American, German and French assignees, etc. "Null (unknown)" corresponds to patents for which no Attorney or Agent is indicated in the USPTO’s bibliographic data (this does not necessarily mean that the patents were not prosecuted by an IP firm).

In the Assignee section at the top of the visualization you’ll see some well known Japanese, Korean and German assignees.

Click a data bar in the Assignee Country section to reconfigure the visualization with data pertaining only to that country.   Notice the relative percentages of PCT vs. non-PCT filings for that country, as indicated in the Filing Type section of the visualization.   For example, over 60% of the US patents that issued to Norwegian assignees in 2012 were based on PCT national phase entries.

You can further reconfigure the visualization by clicking a data bar in the Assignee section, or in the Attorney or Agent section, or both, to restrict the visualization to a selected assignee, IP firm, or both.

Click the "Revert All" button (the circular back arrow) at the bottom of the visualization to go back to the initial visualization. Try any other reconfiguration you like in order to see the relative importance of the PCT (in the context of US patents issued in 2012) from the perspective of any selected country, assignee or IP firm, or combinations thereof.
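The PCT vs. non-PCT percentages shown in the Filing Type section amount to a simple per-country ratio. Here is a minimal Python sketch, assuming each patent has already been reduced to an (assignee country, filing type) pair. The function name `pct_share_by_country`, the "PCT" / "non-PCT" labels and the sample rows are all hypothetical and are not drawn from the 2012 data.

```python
from collections import defaultdict


def pct_share_by_country(records):
    """Return {country: percentage of that country's patents filed via the PCT}."""
    totals = defaultdict(int)
    pct_counts = defaultdict(int)
    for country, filing_type in records:
        totals[country] += 1
        if filing_type == "PCT":
            pct_counts[country] += 1
    return {c: 100.0 * pct_counts[c] / totals[c] for c in totals}


# Made-up rows of (assignee country, filing type), one per patent.
sample = [
    ("NO", "PCT"), ("NO", "PCT"), ("NO", "PCT"), ("NO", "non-PCT"),
    ("US", "PCT"), ("US", "non-PCT"), ("US", "non-PCT"), ("US", "non-PCT"),
]

shares = pct_share_by_country(sample)
for country, share in sorted(shares.items()):
    print(f"{country}: {share:.1f}% PCT")
```

Filtering the input rows by assignee or IP firm before calling the function would mimic clicking a data bar in the visualization.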

Tuesday 17 September 2013

Design Patents

Unlike the art units which handle utility patent applications, the USPTO’s design patent art units are not partitioned according to subject matter. However, the USPTO does use the Locarno classification for industrial designs, and Locarno classification information is included in the USPTO's bibliographic design patent data. Click the “Design Patents” tab above to explore the USPTO’s 2012 design patents by Locarno subject matter classification and from other perspectives.

Sunday 15 September 2013

Introduction

Many patent offices (e.g. USPTO, EPO, WIPO, SIPO, CIPO) publish their bibliographic patent data in XML format in accordance with WIPO standard ST.36.   Some patent offices (e.g. CIPO) make the data available free of charge, for non-commercial use.   Others (e.g. USPTO) make the data available free of charge with no usage restrictions.

The United States Patent & Trademark Office issued well over 250,000 patents in 2012. This blog will attempt to illustrate, via interactive working examples based on the USPTO’s 2012 bibliographic patent data, some ways in which multi-dimensional data sources and data analytic/visualization tools can be used to derive powerful “business intelligence” information from the data. This is not the sort of information typically derived via text/keyword searches in conducting novelty, patentability, infringement or validity type searches.
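To give a feel for the raw material, here is a minimal Python sketch that pulls a few bibliographic fields out of a single grant document using the standard library's ElementTree. The `<us-bibliographic-data-grant>` and `<agent rep-type="...">` names appear in the USPTO's XML; the remaining element names and their nesting are my assumptions about the layout, and the embedded sample document (names, numbers, country) is invented purely for illustration. The function name `extract_record` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample document; element nesting is an assumption, values are placeholders.
SAMPLE = """<us-patent-grant>
  <us-bibliographic-data-grant>
    <publication-reference>
      <document-id>
        <country>US</country>
        <doc-number>00000000</doc-number>
        <kind>B2</kind>
      </document-id>
    </publication-reference>
    <agents>
      <agent sequence="01" rep-type="attorney">
        <addressbook><orgname>Example IP Firm</orgname></addressbook>
      </agent>
    </agents>
    <assignees>
      <assignee>
        <addressbook>
          <orgname>Example Oy</orgname>
          <address><country>FI</country></address>
        </addressbook>
      </assignee>
    </assignees>
  </us-bibliographic-data-grant>
</us-patent-grant>"""


def extract_record(xml_text):
    """Return a flat dict of selected bibliographic fields from one grant document."""
    root = ET.fromstring(xml_text)
    biblio = root.find("us-bibliographic-data-grant")
    return {
        "doc_number": biblio.findtext("publication-reference/document-id/doc-number"),
        "kind": biblio.findtext("publication-reference/document-id/kind"),
        "assignee": biblio.findtext("assignees/assignee/addressbook/orgname"),
        "assignee_country": biblio.findtext(
            "assignees/assignee/addressbook/address/country"
        ),
        # One (rep-type, firm name) pair per <agent> element.
        "agents": [
            (agent.get("rep-type"), agent.findtext("addressbook/orgname"))
            for agent in biblio.findall("agents/agent")
        ],
    }


record = extract_record(SAMPLE)
print(record)
```

Records flattened this way can be loaded into database tables and queried along whichever dimensions are of interest. The sketch handles one document at a time, so a bulk file containing many documents would need to be split first.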


To get started, click one of the tabs above.