Following the recent API changes, the patent endpoint is the main way to retrieve data, and the other endpoints supply additional information. Note that an API key is required.
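As a minimal sketch of supplying the key: to the best of our knowledge the package reads it from the `PATENTSVIEW_API_KEY` environment variable, so treat that name as an assumption and check the package README. The value `"my_api_key"` is a placeholder.

```r
# Assumption: the package looks up the key in the PATENTSVIEW_API_KEY
# environment variable. "my_api_key" is a placeholder, not a real key.
Sys.setenv(PATENTSVIEW_API_KEY = "my_api_key")

# Confirm the variable is visible to the current R session
key_is_set <- nzchar(Sys.getenv("PATENTSVIEW_API_KEY"))
key_is_set
```

Setting the variable in `.Renviron` instead would keep the key out of scripts that get shared or committed.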
Patent endpoint
Which patents have been cited by more than 500 US patents?
library(patentsview)
fields <- c("patent_id", "patent_title", "patent_date")
search_pv(
  query = qry_funs$gt(patent_num_times_cited_by_us_patents = 500),
  fields = fields
)
#> $data
#> #### A list with a single data frame on patents level:
#>
#> List of 1
#> $ patents:'data.frame': 1000 obs. of 3 variables:
#> ..$ patent_id : chr [1:1000] "10045769" ...
#> ..$ patent_title: chr [1:1000] "Circular surgical staplers with foldable an"..
#> ..$ patent_date : chr [1:1000] "2018-08-14" ...
#>
#> $query_results
#> #### Distinct entity counts across all downloadable pages of output:
#>
#> total_hits = 13,995
How many distinct inventors are represented by these highly cited patents?
search_pv(
  query = qry_funs$gt(patent_num_times_cited_by_us_patents = 500),
  fields = c("patent_id", "inventors")
)
#> $data
#> #### A list with a single data frame (with list column(s) inside) on patents level:
#>
#> List of 1
#> $ patents:'data.frame': 1000 obs. of 2 variables:
#> ..$ patent_id: chr [1:1000] "10045769" ...
#> ..$ inventors:List of 1000
#>
#> $query_results
#> #### Distinct entity counts across all downloadable pages of output:
#>
#> total_hits = 13,995
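The call above returns the inventors as a list column but does not itself produce the count. One way to get it, sketched here on a small mock data frame rather than a live API result, is to pull `inventor_id` out of each nested data frame and de-duplicate. The mock values (`p1`, `inv-a`, and so on) are invented for illustration, and the sketch assumes `inventors.inventor_id` was among the requested fields.

```r
# Mock stand-in for pv_out$data$patents; a real result would come from
# search_pv() with "inventors.inventor_id" among the requested fields.
patents <- data.frame(patent_id = c("p1", "p2", "p3"), stringsAsFactors = FALSE)
patents$inventors <- list(
  data.frame(inventor_id = c("inv-a", "inv-b"), stringsAsFactors = FALSE),
  data.frame(inventor_id = "inv-b", stringsAsFactors = FALSE),
  data.frame(inventor_id = c("inv-c", "inv-a"), stringsAsFactors = FALSE)
)

# Collect every inventor_id from the nested data frames, then de-duplicate
all_ids <- unlist(lapply(patents$inventors, function(df) df$inventor_id))
n_distinct_inventors <- length(unique(all_ids))
n_distinct_inventors
```

With real data, only the first page of 1000 patents is returned by default, so the count would cover all 13,995 hits only if every page is downloaded (for example with `all_pages = TRUE`).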
Where geographically have Microsoft inventors been coming from over the past few years?
# Write the query
query <- with_qfuns(
  and(
    gte(patent_date = "2022-07-26"), # Dates are in yyyy-mm-dd format
    begins(assignees.assignee_organization = "Microsoft")
  )
)
# Build the field list from the inventors group. The primary key (patent_id)
# is needed for unnest_pv_data()
inv_fields <- get_fields(endpoint = "patent", groups = "inventors")
inv_fields <- c("patent_id", inv_fields)
inv_fields
#> [1] "patent_id" "inventors.inventor_id"
#> [3] "inventors.inventor_city" "inventors.inventor_country"
#> [5] "inventors.inventor_name_first" "inventors.inventor_name_last"
#> [7] "inventors.inventor_sequence" "inventors.inventor_state"
# Pull the data
pv_out <- search_pv(query, fields = inv_fields, all_pages = TRUE, size = 1000)
# Unnest the inventor list column
unnest_pv_data(pv_out$data, "patent_id")
#> List of 2
#> $ inventors:'data.frame': 17937 obs. of 11 variables:
#> ..$ patent_id : chr [1:17937] "11397055" ...
#> ..$ inventor : chr [1:17937] "https://search.patentsview.org/api"..
#> ..$ inventor_id : chr [1:17937] "fl:tz_ln:lin-53" ...
#> ..$ inventor_name_first : chr [1:17937] "Tzu-Yuan" ...
#> ..$ inventor_name_last : chr [1:17937] "Lin" ...
#> ..$ inventor_gender_code: chr [1:17937] "M" ...
#> ..$ inventor_location_id: chr [1:17937] "13f05eea-16c8-11ed-9b5f-1234bde3cd"..
#> ..$ inventor_city : chr [1:17937] "San Jose" ...
#> ..$ inventor_state : chr [1:17937] "CA" ...
#> ..$ inventor_country : chr [1:17937] "US" ...
#> ..$ inventor_sequence : int [1:17937] 1 2 ...
#> $ patents :'data.frame': 4442 obs. of 1 variable:
#> ..$ patent_id: chr [1:4442] "11397055" ...
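To answer the geographic question from the unnested output, the inventor rows can be tabulated by location. The sketch below uses a small mock data frame in place of the `inventors` element returned by `unnest_pv_data()`. The rows are invented, and only the column names `inventor_country` and `inventor_state` are taken from the output above.

```r
# Mock stand-in for unnest_pv_data(pv_out$data, "patent_id")$inventors
inventors <- data.frame(
  patent_id        = c("p1", "p1", "p2", "p3", "p3"),
  inventor_country = c("US", "IN", "US", "CN", "US"),
  inventor_state   = c("WA", NA, "CA", NA, "WA"),
  stringsAsFactors = FALSE
)

# Count inventor-patent rows per country, most frequent first.
# An inventor named on several patents is counted once per patent.
by_country <- sort(table(inventors$inventor_country), decreasing = TRUE)
by_country

# The same idea restricted to US rows, broken down by state
us_rows <- inventors[which(inventors$inventor_country == "US"), ]
by_state <- sort(table(us_rows$inventor_state), decreasing = TRUE)
by_state
```

To count each person once instead, de-duplicate on `inventor_id` before tabulating.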
Which assignees have an interest in beer?
query <- with_qfuns(
  and(
    contains(patent_title = "beer"),
    eq(assignees.assignee_sequence = 0)
  )
)
fields <- c("patent_id", "patent_title", "assignees.assignee_organization")
res <- search_pv(query = query, fields = fields, endpoint = "patent", size = 1)
str(res$data)
#> List of 1
#> $ patents:'data.frame': 1 obs. of 3 variables:
#> ..$ patent_id : chr "10117451"
#> ..$ patent_title: chr "Beer-flavored beverage"
#> ..$ assignees :List of 1
#> .. ..$ :'data.frame': 1 obs. of 1 variable:
#> .. .. ..$ assignee_organization: chr "KAO CORPORATION"
#> - attr(*, "class")= chr [1:2] "list" "pv_data_result"
Inventor endpoint
Which inventors' most recent patents list Chicago, IL as their location?
pv_out <- search_pv(
  query = '{"_and":[{"_text_phrase": {"inventor_lastknown_city":"Chicago"}},
    {"_text_phrase": {"inventor_lastknown_state":"IL"}}]}',
  endpoint = "inventor",
  fields = c("inventor_id", "inventor_name_first", "inventor_name_last")
)
pv_out
#> $data
#> #### A list with a single data frame on inventors level:
#>
#> List of 1
#> $ inventors:'data.frame': 1000 obs. of 3 variables:
#> ..$ inventor_id : chr [1:1000] "59f9k818v0o6g4eoi880q418l" ...
#> ..$ inventor_name_first: chr [1:1000] "Jeanne A." ...
#> ..$ inventor_name_last : chr [1:1000] "Mervine" ...
#>
#> $query_results
#> #### Distinct entity counts across all downloadable pages of output:
#>
#> total_hits = 13,653
In the new version of the API, the behavior of this endpoint has changed. See the similar example on the legacy inventors endpoint page for its original behavior.
We could also call the new version of the patent endpoint to find inventors who listed Chicago, IL as their location when applying for a patent.
fields <- get_fields("patent", groups = "inventors")
fields <- c("patent_id", fields)
fields
#> [1] "patent_id" "inventors.inventor_id"
#> [3] "inventors.inventor_city" "inventors.inventor_country"
#> [5] "inventors.inventor_name_first" "inventors.inventor_name_last"
#> [7] "inventors.inventor_sequence" "inventors.inventor_state"
query <- '{"_and":[{"_text_phrase": {"inventors.inventor_city":"Chicago"}},
{"_text_phrase": {"inventors.inventor_state":"IL"}}]}'
search_pv(query, fields = fields, endpoint = "patent")
#> $data
#> #### A list with a single data frame (with list column(s) inside) on patents level:
#>
#> List of 1
#> $ patents:'data.frame': 1000 obs. of 2 variables:
#> ..$ patent_id: chr [1:1000] "10045379" ...
#> ..$ inventors:List of 1000
#>
#> $query_results
#> #### Distinct entity counts across all downloadable pages of output:
#>
#> total_hits = 47,499
Note that this query returns every inventor on each matching patent, not just those whose location was Chicago, IL. Also see the Writing Queries vignette for more readable ways to write queries.
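Because co-inventors from other locations come back too, the unnested inventor rows need a client-side filter to isolate the Chicago, IL inventors. Here is a minimal sketch on mock data. The rows are invented, and the column names follow the unnested output shown earlier on this page.

```r
# Mock stand-in for the unnested inventors data frame
inventors <- data.frame(
  patent_id      = c("p1", "p1", "p2"),
  inventor_id    = c("inv-a", "inv-b", "inv-c"),
  inventor_city  = c("Chicago", "Boston", "Chicago"),
  inventor_state = c("IL", "MA", "IL"),
  stringsAsFactors = FALSE
)

# Keep only rows where both city and state match; which() drops NA matches
in_chicago <- inventors$inventor_city == "Chicago" &
  inventors$inventor_state == "IL"
chicago_inventors <- inventors[which(in_chicago), ]
chicago_inventors$inventor_id
```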