Google Patent Data Analytics

Thursday, 13 March 2014

Visualization example: Chattanooga patentees

Assignees & Attorneys Visualization
Suppose you’re located in Chattanooga, Tennessee and want a quick snapshot of local interest in US patent protection. Begin by clicking the "Assignees & Attorneys" tab along the top of this page to load the visualization shown here.  Click the image to enlarge it, if desired (use the escape key to return).







Reconfigured Visualization restricted to TN assignees' patents
Note the US Assignees table in the lower right corner of the visualization. Drag that table’s vertical scroll bar and click on TN (i.e. Tennessee) as shown here. The number 633 beside the TN state abbreviation tells us that the underlying database (which I constructed from the USPTO’s bibliographic data for US patents which issued in 2012) contains details of 633 US patents which issued in 2012 to assignees located in Tennessee.

Notice that all of the visualization’s tables were quickly reconfigured when you clicked on TN. The original "Assignees & Attorneys Visualization" provides details of 266,864 US patents which issued in 2012. The reconfigured visualization is restricted to the 633 US patents which issued in 2012 to assignees located in Tennessee.


Vanderbilt University popup
The upper table in the reconfigured visualization shows the names of the Tennessee-based assignees that own those 633 patents. Drag that table’s vertical scroll bar to see all the assignees’ names. The blue data bars adjacent each assignee’s name correspond to the number of patents owned by each assignee. Hover your mouse over any of the data bars to obtain a popup with further details, as shown here for the data bar corresponding to Vanderbilt University.  Again, click the image to enlarge it, if desired.




Lookup Vanderbilt's web site
Click the "Lookup Assignee Web Site" link in the popup to open a web browser pre-configured to run a web search using the assignee’s name, as shown here for Vanderbilt University. If desired, you can click the search result link to visit the official web site of Vanderbilt University and learn more about that institution—which is located in Nashville, Tennessee.




Pitts & Lake, PC popup
The next table in the reconfigured visualization shows the names of the IP firms which prosecuted the US patent applications from which the aforementioned 633 Tennessee-assigned patents issued. Drag that table’s vertical scroll bar to see all of those firms’ names. The green data bars adjacent each firm name correspond to the number of patents prosecuted by each firm. Hover your mouse over any of the data bars to obtain a popup with further details, as shown here for the data bar corresponding to Pitts & Lake, PC.

Lookup Pitts & Lake, PC web site

Click the "Lookup Attorney/Agent web site" link in the popup, to open a web browser pre-configured to run a web search using the IP firm’s name, as shown here for Pitts & Lake, PC. If desired, you can click the appropriate search result link to visit the web site of Pitts & Lake, PC and learn more about that firm—which is based in Knoxville, TN.


kind code explanation popup
Returning to the reconfigured visualization, the lower left Document Kind table shows the kind code breakdown of the 633 Tennessee-assigned patents. Hover your mouse over any number in the table’s right column to obtain a kind code explanation popup, as shown here.


633 Tennessee-assigned patents popup
The reconfigured visualization’s lower central Assignee Country table shows the countries in which the assignees of the 633 Tennessee-assigned patents are located. Of course, only one country appears—the US. Hover your mouse over the US data bar to obtain a popup with further details, as shown here.


expand US assignees (add cities)
Returning to the US Assignees table in the lower right corner of the visualization, hover your mouse over the State column header then click the + symbol which appears adjacent the column header, as seen here. This expands the table by adding a City column.











explore Tennessee
Drag the expanded table’s vertical scroll bar to reveal the city breakdown for Tennessee, as shown here. The numbers adjacent each city correspond to the number of US patents which issued in 2012 to assignees located in each of the indicated Tennessee cities. Notice the entries for "Chatanooga" and "Chattanooga". The first of those reflects a typographical error in the USPTO’s underlying bibliographic data—I’ve made no effort to cleanse the data.
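Misspellings such as "Chatanooga" could be flagged automatically before any cleansing is attempted. The following is only a minimal sketch, assuming the city names have already been pulled out of the database into a plain list. The function name, the 0.9 similarity threshold and the sample list are illustrative assumptions, not part of the visualization.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(names, threshold=0.9):
    """Return pairs of distinct names whose similarity ratio meets the threshold."""
    pairs = []
    for a, b in combinations(sorted(set(names)), 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Illustrative sample only; the real table lists many more Tennessee cities.
cities = ["Chatanooga", "Chattanooga", "Knoxville", "Memphis", "Nashville"]
suspects = near_duplicates(cities)
print(suspects)
```

A flagged pair is only a candidate for review, since two genuinely different cities can have similar names.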



Reconfigured Visualization restricted to Chattanooga assignees' patents


Further drill down reconfiguration of the visualization is possible via any of the tables. For example, click on the "Chattanooga" entry in the US Assignees table. This further reconfigures the visualization, as shown here, by restricting it to the 26 US patents which issued in 2012 to assignees located in Chattanooga, Tennessee. The upper table in the reconfigured visualization shows the names of the Chattanooga-based assignees that own those 26 patents; the next table shows the names of the IP firms which prosecuted the US patent applications from which those 26 patents issued; etc.



At any time, you can click the revert all button (the circular back arrow symbol) at the bottom of the visualization to return to the initial visualization and start over.

As can be seen, a good deal of data mining can be accomplished with a few mouse clicks. The foregoing example is restricted to US patents which issued in 2012 but it will be appreciated that the visualizations can be enhanced by adding more bibliographic data to the underlying database.  Visualizations based on other countries' patent bibliographic data can also be created.

Monday, 13 January 2014

The <agent> sequence attribute

US 8332851
The previous post compared the <agents> element portion of the USPTO’s XML publication for United States Patent No. 8332851 with the <agents> element portion of the Canadian Intellectual Property Office’s XML publication for pending Canadian patent application serial no. 2699332. 
CA 2699332


As seen here, the US element has a sequence="01" attribute, whereas the CA element has a sequence="0" attribute. What does this mean?



I ran this simple SQL query against a database I constructed from the CIPO’s XML bibliographic data for Canadian patent documents published during the decade spanning 2001-2011. The query says "show me the sequence attribute for every practitioner record in the database, but ignore records with sequence='0' ". The query returned zero rows. Therefore, every practitioner record in the database has sequence='0'. This suggests that the CIPO does not utilize the sequence attribute.
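For readers who cannot see the query image, a query of that kind might look roughly like the sketch below. It uses an in-memory SQLite database with a hypothetical agent table; the table name, column names and the second sample row are assumptions made for illustration and do not reflect the actual schema of my database.

```python
import sqlite3

# Hypothetical schema: one row per <agent> element (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent (doc_number TEXT, sequence TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO agent VALUES (?, ?, ?)",
    [
        ("2699332", "0", "Osler, Hoskin & Harcourt LLP"),
        ("placeholder-doc", "0", "Placeholder Agent"),  # made-up sample row
    ],
)

# "Show me the sequence attribute for every practitioner record,
# but ignore records with sequence='0'."
rows = conn.execute(
    "SELECT doc_number, sequence FROM agent WHERE sequence <> '0'"
).fetchall()
print(len(rows))
```

Note that in SQL a comparison against NULL is never true, so a record with a missing sequence attribute stored as NULL would not be returned by this filter; a separate "sequence IS NULL" check would be needed to rule that case out.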


US 8102435
I ran another query against a database I constructed from the USPTO’s XML bibliographic data for US patents which issued in 2012, to locate <agent> elements with sequence attributes other than sequence="01". This revealed some sequence="02" and sequence="03" attributes, but no others. For example, the <agents> element in the USPTO’s XML publication for United States Patent No. 8102435 has three <agent> elements with sequence attributes of "01","02" and "03" respectively, as shown here. Further queries against the same database of US patents issued in 2012 revealed 243,545 patents with only a sequence="01" attribute; 45,461 patents with both sequence="01" and sequence="02" attributes; and 10,940 patents with sequence="01", sequence="02" and sequence="03" attributes. (The database contains records of 266,864 US patents. 23,318 of those patents do not identify an "attorney, agent or firm".)
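The sequence attributes can also be read directly from the XML. The sketch below is a rough illustration using Python's standard ElementTree parser on a fragment modeled on the description above. The firm and practitioner names are placeholders, and the layout inside each addressbook element is an assumption rather than a copy of the actual record for US 8102435.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Placeholder fragment: three <agent> elements with sequence 01, 02 and 03.
AGENTS_XML = """<agents>
  <agent sequence="01" rep-type="attorney">
    <addressbook><orgname>Placeholder Firm LLP</orgname></addressbook>
  </agent>
  <agent sequence="02" rep-type="attorney">
    <addressbook><last-name>Placeholder</last-name><first-name>A.</first-name></addressbook>
  </agent>
  <agent sequence="03" rep-type="attorney">
    <addressbook><last-name>Placeholder</last-name><first-name>B.</first-name></addressbook>
  </agent>
</agents>"""


def agent_sequences(agents_xml):
    """Return the sequence attribute of every <agent> in an <agents> fragment."""
    root = ET.fromstring(agents_xml)
    return [agent.get("sequence") for agent in root.iter("agent")]


sequences = agent_sequences(AGENTS_XML)
print(sequences)

# Tally how many agents each document lists (a single sample document here).
agents_per_document = Counter(len(agent_sequences(x)) for x in [AGENTS_XML])
print(agents_per_document)
```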

Is it surprising that the <agent> elements in the USPTO’s XML bibliographic data have sequence attributes of "01", "02" or "03", but no others? No, it is not. As shown here, the USPTO’s PTOL-85B issue fee transmittal form provides for printing, on the front page of a US patent, the names of up to 3 registered patent attorneys or agents; or the name of a single firm and the names of up to 2 registered patent attorneys or agents.

Is it surprising that the CIPO does not utilize the sequence attribute in its XML bibliographic data for Canadian patent publications? Not really. Section 6 of the Canadian Patent Rules requires the CIPO to communicate only with the "authorized correspondent" in relation to a Canadian patent application. Rule 2 defines "authorized correspondent" in terms such that only one person (or firm) may be the authorized correspondent at any particular time. There is accordingly no need for the CIPO to keep track of more than one patent agent per application and thus no need for utilization of the <agent> element’s sequence attribute.


CA 2741562
What about cases that are filed and prosecuted pro se by one or more inventors without the assistance of a patent attorney or agent? No <agent> element will be found in the USPTO’s XML bibliographic data for a US patent which does not identify an "attorney, agent or firm", which makes sense. The situation in Canada is different. Since April 2008, the CIPO has  used ‘NA’ (presumably an acronym for "no agent") in the <agent> element as the "name" of the agent in a pro se situation, as seen in this example from CA 2741562.

CA 2602045
The CIPO’s XML bibliographic data for documents published before April 2008 contains <agent> elements with self-closing or empty element <name/> tags in pro se situations, as seen in this example from CA 2602045.
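Anyone processing the CIPO data therefore has to treat both conventions as "no agent". Here is a minimal sketch of one way to do that, assuming the <name> element sits somewhere inside the <agent> element; the helper name and the sample fragments are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def agent_name(agent_xml):
    """Return the agent's name, or None when the record looks pro se."""
    name = ET.fromstring(agent_xml).find(".//name")
    text = (name.text or "").strip() if name is not None else ""
    # 'NA' (April 2008 onward) and an empty <name/> (earlier) both mean no agent.
    return None if text in ("", "NA") else text


# Illustrative fragments; the <addressbook> wrapper is an assumption.
na_record = '<agent sequence="0" rep-type="agent"><addressbook><name>NA</name></addressbook></agent>'
empty_record = '<agent sequence="0" rep-type="agent"><addressbook><name/></addressbook></agent>'
firm_record = (
    '<agent sequence="0" rep-type="agent"><addressbook>'
    "<name>Osler, Hoskin &amp; Harcourt LLP</name></addressbook></agent>"
)

print(agent_name(na_record), agent_name(empty_record), agent_name(firm_record))
```

One caveat: an agent whose name really is "NA" would be misread as pro se by this sketch.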

Monday, 6 January 2014

Canada’s <agent> element

In the two previous posts we saw that the <agent> element in the USPTO’s bibliographic data identifies the “attorney, agent or firm” for a US patent by name only.  No address information is provided—not even a country identifier—so it’s impossible to discriminate between different offices of the same firm solely by reference to the bibliographic data.

We also saw that the rep-type attribute of the <agent> element in the USPTO’s bibliographic data is populated as rep-type="attorney", without regard to the practitioner’s registration classification (i.e. attorney vs. agent).

How does Canada’s patent bibliographic data compare with the USPTO’s data in relation to the <agent> element and its rep-type attribute?

US 8332851
The previous posts considered this <agents> element extract from the USPTO’s XML publication for United States Patent No. 8332851.






CA 2699332
Another earlier post considered extracts from the Canadian Intellectual Property Office’s XML publication for pending Canadian patent application serial no. 2699332.  Here is the <agents> element portion of the CIPO’s XML publication for the ‘332 application.

Comparing the <agent> elements of US 8332851 and CA 2699332 reveals:
  • the US element has a sequence="01" attribute, whereas the CA element has a sequence="0" attribute;
  • the US element has a rep-type="attorney" attribute, whereas the CA element has a rep-type="agent" attribute;
  • the US element has an <orgname></orgname> tag pair encapsulating the firm name “Fish & Richardson P.C.”, whereas the CA element has a <name></name> tag pair encapsulating the firm name “Osler, Hoskin & Harcourt LLP”;
  • the US element has an <address><country></country></address> tag pair encapsulating the word “unknown”, whereas in the CA element that tag pair encapsulates the country code “CA”.
I will leave the sequence attribute to a future post.
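The comparison listed above can be reproduced with a few lines of code. The sketch below is illustrative only: the two fragments are reconstructed from the attribute and tag names mentioned in this post, and the <addressbook> wrapper in the Canadian fragment is an assumption. The summarize helper is hypothetical.

```python
import xml.etree.ElementTree as ET

US_AGENT = """<agent sequence="01" rep-type="attorney">
  <addressbook>
    <orgname>Fish &amp; Richardson P.C.</orgname>
    <address><country>unknown</country></address>
  </addressbook>
</agent>"""

CA_AGENT = """<agent sequence="0" rep-type="agent">
  <addressbook>
    <name>Osler, Hoskin &amp; Harcourt LLP</name>
    <address><country>CA</country></address>
  </addressbook>
</agent>"""


def summarize(agent_xml):
    """Collect the four points of comparison from one <agent> element."""
    agent = ET.fromstring(agent_xml)
    # The firm name sits in <orgname> (US) or <name> (CA).
    firm = agent.findtext(".//orgname") or agent.findtext(".//name")
    return {
        "sequence": agent.get("sequence"),
        "rep-type": agent.get("rep-type"),
        "firm": firm,
        "country": agent.findtext(".//address/country"),
    }


us_summary = summarize(US_AGENT)
ca_summary = summarize(CA_AGENT)
print(us_summary)
print(ca_summary)
```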

Recall that the rep-type attribute in the USPTO’s bibliographic data is populated as rep-type="attorney", without regard to the practitioner’s registration classification (i.e. attorney vs. agent).  Does the rep-type="agent" attribute in the CA element shown here mean that the CIPO’s bibliographic data reflects the practitioner’s registration classification (i.e. lawyer vs. agent) for a particular Canadian patent?  Let’s explore.

I ran this simple SQL query against a database I constructed from the CIPO’s XML bibliographic data for Canadian patent documents published during the decade spanning 2001-2011.  The query says “show me the rep-type attribute for every practitioner record in the database, but ignore records with rep-type='agent' ”.  The query returned zero rows.  Therefore, every practitioner record in the database has rep-type="agent". That is not surprising—all patent practitioners who become qualified to practice before the CIPO are registered as agents, irrespective of whether they also happen to be lawyers admitted to practice in one or more Canadian provinces.  Indeed, many—but not all—registered Canadian patent agents are also duly admitted to practice law in one or more Canadian provinces.  There is no such thing as a registered Canadian patent attorney and therefore there are no occurrences of rep-type='attorney' (or anything besides rep-type='agent') in any of the CIPO’s XML documents.

Now consider the “CA” country code encapsulated by the <address><country></country></address> tag pair.  I ran another simple SQL query (shown here) against the database mentioned in the previous paragraph.  The query says “show me the country code for every practitioner address record in the database, but ignore records for which the country code is 'CA' ”.  The query returned zero rows.  Therefore, every practitioner address record in the database has the 'CA' country code.

It can thus be seen that the <agent> element in the CIPO’s bibliographic data has the same limitations as the <agent> element in the USPTO’s bibliographic data.  The IP firm responsible for a particular Canadian patent or application is identified by name only—no address information for the firm is provided—so it’s impossible to discriminate between different offices of the same firm, solely by reference to the bibliographic data.
 



 

Monday, 23 December 2013

Attorney, Agent or Firm: the <agent rep-type="attorney"> attribute

One may be registered to practice before the USPTO as a patent attorney or as a patent agent (see generally 37 CFR §11.6).

In a previous post I considered this small extract from the <us-bibliographic-data-grant> element of the USPTO’s XML publication for United States Patent No. 8332851 and observed that the well known IP firm Fish & Richardson P.C. handled the prosecution of the application from which the ‘851 patent issued. Notice the rep-type="attorney" attribute in the <agent> element depicted here.

Does this mean that the rep-type attribute is populated to specify the attorney vs. agent registration status of the practitioner(s) who prosecuted the application from which the patent in question issued? If the answer is "yes" then we would expect to find some rep-type="agent" attributes within a reasonably large dataset.  Let's explore.

I ran this simple SQL query against a database I constructed from the USPTO’s XML bibliographic data for US patents which issued in 2012.  In plain English, the query says "show me the rep-type attribute for every practitioner record in the database, but ignore records with rep-type='attorney' ".  The query returned zero rows.  Therefore, every practitioner record in the database has rep-type="attorney". In other words, there are no occurrences of rep-type="agent", as one would expect if any of the XML documents used to construct the database had a rep-type="agent" attribute.
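An equivalent cross-check is to tally the distinct rep-type values instead of filtering one out. The sketch below uses an in-memory SQLite database; the table name, column names and the simplified sample rows are assumptions for illustration, not my actual schema.

```python
import sqlite3

# Hypothetical schema: one row per <agent> element (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent (doc_number TEXT, rep_type TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO agent VALUES (?, ?, ?)",
    [
        ("8332851", "attorney", "Fish & Richardson P.C."),
        ("placeholder-doc", "attorney", "Placeholder Firm"),  # made-up sample row
    ],
)

# Count the records for each distinct rep-type value.
tally = conn.execute(
    "SELECT rep_type, COUNT(*) FROM agent GROUP BY rep_type ORDER BY rep_type"
).fetchall()
print(tally)
```

If the attribute were ever populated as "agent", a second row would appear in the tally.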

Does this mean that none of the US patents which issued in 2012 were prosecuted by a registered patent agent as opposed to a registered patent attorney? That seems unlikely, but we need a counter-example in order to reach a definitive conclusion.

Inventek is the Oakland, CA intellectual property service firm of Dr. Dov Rosenfeld, who is a registered US patent agent. This image (click the image to enlarge it) is a screen capture of a visualization I created via this blog's "Assignees & Attorneys" tab by typing "Inventek" in the "Search for Attorney or Agent:" box. The visualization shows that Inventek (Dr. Rosenfeld) prosecuted US patent applications which resulted in the grant of 32 US patents in 2012.

By way of example, one of those patents is US 8305996 which issued on 6 November 2012. The red-underlined portion of this partial image of the ‘996 patent’s cover sheet shows that the corresponding US patent application was prosecuted by Dr. Rosenfeld/Inventek.







This query result set (again, click the image to enlarge it) is based on the database mentioned above and shows some basic details of the 32 US patents which issued in 2012 from applications prosecuted by Dr. Rosenfeld. The "Representative Type" column again corresponds to the rep-type attribute and reveals that every document in the result set has rep-type="attorney". There are no occurrences of rep-type="agent".










It is thus apparent that the rep-type attribute in the USPTO’s bibliographic data is populated as rep-type="attorney" without regard to the practitioner’s registration classification (i.e. attorney vs. agent).  That's not surprising, since a practitioner's registration classification is not required in documents submitted to the USPTO in support of a US patent application.  For example, the Representative Information section of the USPTO's Application Data Sheet (shown here) can be configured to identify a specific practitioner by name and USPTO registration number, but the practitioner's registration classification is not required.  The same is true of the USPTO's Power of Attorney and Customer Number forms.

Monday, 16 December 2013

Attorney, Agent or Firm: the <agent> element

Some IP firms have multiple offices. Those firms’ internal docketing systems undoubtedly include detailed particulars of each patent case prosecuted by the firm, with an indication of which one of the firm’s offices is responsible for each case. Multi-office firms can accordingly use data mining techniques to develop metrics representative of various aspects of the operations of their different offices. However, that is feasible only for those with access to the data, i.e. the specific firms which have accumulated the data internally.

Can data mining techniques be applied to bibliographic patent data to develop metrics representative of the operations of different offices of a multi-office IP firm? Not directly.

Consider United States Patent No. 8332851 which issued on 11 December 2012 to SAP AG for an invention of Ostermeier et al. entitled Configuration and Execution of Mass Data Run Objects. The red-underlined portion of this partial image of the ‘851 patent’s cover sheet tells us that the corresponding US patent application was prosecuted by the well known IP firm Fish & Richardson P.C. As of this writing, Fish & Richardson P.C. has offices in Atlanta, Austin, Boston, Dallas, Delaware, Houston, Munich, New York, Silicon Valley, Southern California, the Twin Cities and Washington, DC. Which one of Fish & Richardson’s 11 US offices handled the prosecution of the application from which the ‘851 patent issued?  The cover sheet does not tell us.

Now consider this small extract from the <us-bibliographic-data-grant> element of the USPTO’s XML publication for the ‘851 patent. The <us-bibliographic-data-grant> element encapsulates the <parties> element, which in turn encapsulates the <agents> element (as well as the <applicants> element). The <agent>, <addressbook>, <orgname>, <address> and <country> elements encapsulated and sub-encapsulated by the <agents> element are sparsely populated. We see Fish & Richardson P.C. encapsulated by the <orgname></orgname> tag pair, but none of the other elements encapsulate data apart from the "attorney" attribute value in the <agent> element and the "unknown" text encapsulated by the <address><country></country></address> hierarchical tags. Is that "unknown" an aberration or a data glitch? No, it is not.

This query result set is based on a database I constructed from the USPTO’s bibliographic data for US patents which issued in 2012. As can be seen, the rightmost "country" column (which corresponds to the aforementioned <address><country></country></address> hierarchical tags) contains the text "unknown" for every patent listed here. The same is true for the entire dataset. It is thus apparent that the bibliographic data does not tell us which one of the 11 US offices of Fish & Richardson P.C. handled the prosecution of the application from which the ‘851 patent issued. So which office was it?


The answer is the Dallas, Texas office. This is revealed by looking the case up via the USPTO’s public Patent Application Information Retrieval (i.e. public PAIR) system. Specifically, this portion of the filing transmittal for the application from which the ‘851 patent issued appears on the letterhead of the Fish & Richardson P.C. Dallas, Texas office (and also cites the firm’s Minneapolis office address; presumably to facilitate centralized docketing for all of the firm’s offices).

In summary, the USPTO’s bibliographic data identifies the attorney, agent or firm by name only. No address information (not even a country identifier) is provided, so it is not possible to discriminate between different offices of the same firm solely by reference to the bibliographic data.

Monday, 9 December 2013

Correlation of USPTO Art Units with IPC subclasses

What correlation, if any, is there between the art units to which the USPTO allocates patent applications for examination and the International Patent Classification subclasses allocated to the claimed inventions encompassed by the resultant patents? One would expect a fairly strong correlation, given the USPTO’s allocation of US patent classifications to art units and further in view of similarities between the IPC and US patent classification schemes.  For background, see the USPTO’s tabulation of US Patent Classes Arranged by Art Unit and its US patent classification to IPC concordance.

This scatter plot shows, along the horizontal axis, numbers of US patents which issued in 2012 for different USPTO art units; and along the vertical axis, numbers of those patents in terms of their primary IPC subclasses; both of those measures being referenced to the assignee dimension (i.e. the parties to whom the patents issued).

The positive slope trend line reveals a strong positive correlation, as expected. The underlying trend model has an r² value of 0.843717, so r (the correlation value, in accordance with the Pearson correlation) is 0.92. Correlation values close to either +1 or -1 represent strong positive or negative correlation respectively between the measures being compared. Correlation values closer to zero represent weaker correlation, or none at all.
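The step from the r-squared value to the correlation value is a square root, which can be checked in a couple of lines. This sketch assumes a simple linear trend model with a single predictor, where r is the square root of r-squared and takes the sign of the slope.

```python
import math

r_squared = 0.843717       # value reported for the trend model
r = math.sqrt(r_squared)   # positive root, because the trend line slopes upward
print(round(r, 2))
```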

Monday, 2 December 2013

Office specific data elements

WIPO’s ST.36 Standard recommendation for the processing of patent information using XML includes a provision for so-called office-specific-data elements. These elements can be used to encapsulate details unique to a particular country in patent bibliographic data published for that country. ST.36 also provides alternative mechanisms for accomplishing the same thing, e.g. mixing office-specific elements with international common elements.

Let’s consider an example. United States patent no. 8260535 issued on 4 September 2012 to Bombardier Recreational Products Inc. of Valcourt, Québec, Canada for an invention of Mario Dagenais entitled "Load sensor for a vehicle electronic stability system". The ‘535 patent issued from United States patent application no. 11/864,265 which was filed on 28 September 2007.

As of this writing, an apparently corresponding Canadian patent application is pending, namely CA 2699332 which was published by the Canadian Intellectual Property Office as of 2 April 2009. The ‘332 application is a Canadian national counterpart of international application PCT/US2008/070129 which was filed on 16 July 2008 and claimed priority based upon the US ‘265 application. PCT/US2008/070129 was published on 2 April 2009 as WO2009/042276.

Consider this small extract from the <ca-bibliographic-data> element of the Canadian Intellectual Property Office’s XML publication for the ‘332 application. The <ca-bibliographic-data> element encapsulates the <publication-reference> element, which in turn encapsulates the <document-id> element. The <country>, <doc-number>, <kind> and <date> elements encapsulated by the <document-id> element identify this as Canadian patent application no. 2699332 published 2 April 2009, as aforesaid.

Now consider this corresponding small extract from the <us-bibliographic-data-grant> element of the USPTO’s XML publication for the ‘535 patent. Some structural similarities are evident: the <us-bibliographic-data-grant> element encapsulates the <publication-reference> element, which in turn encapsulates the <document-id> element. The <country>, <doc-number>, <kind> and <date> elements encapsulated by the <document-id> element identify this as United States patent no. 8260535 issued 4 September 2012, as aforesaid.
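Because both offices use the same common elements here, one small routine can read either record. The sketch below is an illustration under stated assumptions: the fragments are reconstructed from the element names discussed above, the kind codes are placeholders, and the YYYYMMDD date format is assumed. The document_id helper is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

CA_EXTRACT = """<ca-bibliographic-data>
  <publication-reference>
    <document-id>
      <country>CA</country>
      <doc-number>2699332</doc-number>
      <kind>A1</kind>
      <date>20090402</date>
    </document-id>
  </publication-reference>
</ca-bibliographic-data>"""

US_EXTRACT = """<us-bibliographic-data-grant>
  <publication-reference>
    <document-id>
      <country>US</country>
      <doc-number>8260535</doc-number>
      <kind>B2</kind>
      <date>20120904</date>
    </document-id>
  </publication-reference>
</us-bibliographic-data-grant>"""


def document_id(xml_text):
    """Return the <document-id> children as a dict, with the date parsed."""
    doc = ET.fromstring(xml_text).find("publication-reference/document-id")
    fields = {child.tag: child.text for child in doc}
    # Assumed format: eight digits, year first.
    fields["date"] = datetime.strptime(fields["date"], "%Y%m%d").date()
    return fields


ca_doc = document_id(CA_EXTRACT)
us_doc = document_id(US_EXTRACT)
print(ca_doc["country"], ca_doc["doc-number"], ca_doc["date"].isoformat())
print(us_doc["country"], us_doc["doc-number"], us_doc["date"].isoformat())
```

The same function works on both extracts because only the root element name differs between them.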

The foregoing are examples of international common elements.

Now consider this small extract from the <ca-office-specific-bib-data> element of the CIPO’s XML publication for the ‘332 application. Among other things, this encapsulates the <ca-license-for-sale> element.  This is a uniquely Canadian feature whereby a patentee may request that a patent be flagged as available for license or sale when details of that patent are published in the CIPO’s weekly Canadian Patent Office Record. In this case the <ca-license-for-sale> element encapsulates the word "false", indicating that no such flag has been raised in respect of the ‘332 application—not surprising since the ‘332 application has not issued as a Canadian patent as of this writing.

Finally, consider this small extract from the concluding portion of the <us-bibliographic-data-grant> element of the USPTO’s XML publication for the ‘535 patent. Here we have some uniquely American features identifying the primary examiner (Khoi Tran), the USPTO art unit (3664: robotics and vehicle controls) and the assistant examiner (Bhavesh V. Amin) for the ‘535 patent.

We can thus see that the CIPO has opted to implement ST.36 by assigning office-specific-data elements to encapsulate details unique to Canada, whereas the USPTO implements ST.36 by mixing office-specific elements with international common elements.