Introducing trendyy

trendyy is a package for querying Google Trends. It is built around Philippe Massicotte’s package gtrendsR, which accesses this data wonderfully.

The inspiration for this package was to provide a tidy interface to the trends data.

Getting Started

Installation

You can install trendyy from CRAN using install.packages("trendyy").

Usage

Use trendy() to search Google Trends. The only mandatory argument is search_terms, a character vector of the terms of interest. Note that Google Trends can compare at most five terms at once. If your search_terms vector contains more than five terms, trendy() searches each term individually, which removes the direct comparison between terms that Google Trends otherwise gives you.
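One way to keep some comparability with a longer list is to split the terms into batches of at most five before querying. The following is a minimal sketch in base R, not part of trendyy itself; the placeholder terms and the names batch_size and batches are made up for illustration.

```r
# Hypothetical vector of seven search terms (placeholders, not real queries)
search_terms <- paste("term", 1:7)

# Google Trends compares at most five terms per query
batch_size <- 5

# Assign each term a batch index (1, 1, 1, 1, 1, 2, 2) and split on it
batches <- split(search_terms, ceiling(seq_along(search_terms) / batch_size))

# Number of terms in each batch
lengths(batches)

# Each batch could then be passed to trendy() on its own (network call, not run here):
# batch_trends <- lapply(batches, trendyy::trendy, from = "2019-01-01", to = "2019-05-25")
```

Keep in mind that Google Trends scales hits relative to the terms within a single query, so values from different batches are not directly comparable.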

Additional arguments

  • from: The beginning date of the query in "YYYY-MM-DD" format.
  • to: The end date of the query in "YYYY-MM-DD" format.
  • ... : Any additional arguments to be passed to gtrendsR::gtrends(). It can be useful to indicate the geography of interest. See gtrendsR::countries for a list of possible geographies.
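As a rough sketch of how these arguments fit together, the call below restricts a query to one geography. It assumes that geo (an argument of gtrendsR::gtrends()) is forwarded through ... as described above, and that "US" is a valid geography code; the object name us_trends is made up for illustration. The query needs network access.

```r
library(trendyy)

# Browse the available geographies shipped with gtrendsR
head(gtrendsR::countries)

# Assumption: `geo` is passed on to gtrendsR::gtrends() via `...`,
# and "US" is a valid geography code
us_trends <- trendy(
  c("Joe Biden", "Bernie Sanders"),
  from = "2019-01-01",
  to = "2019-05-25",
  geo = "US"
)

# Interest over time for the restricted geography
get_interest(us_trends)
```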

Accessor Functions

  • get_interest(): Retrieve interest over time
  • get_interest_city(): Retrieve interest by city
  • get_interest_country(): Retrieve interest by country
  • get_interest_dma(): Retrieve interest by DMA
  • get_interest_region(): Retrieve interest by region
  • get_related_queries(): Retrieve related queries
  • get_related_topics(): Retrieve related topics

Example

My interest in this came from the relatively pervasive use of Google Trends in political analysis, so I will compare the top five polling candidates in the 2020 Democratic Primary. As of May 22nd, they were Joe Biden, Kamala Harris, Beto O’Rourke, Bernie Sanders, and Elizabeth Warren.

First, I will create a vector of my desired search terms. Second, I will pass that vector to trendy(), specifying a query date range from January 1st, 2019 until today (May 25th, 2019).

library(trendyy)

candidates <- c("Joe Biden", "Kamala Harris", "Beto O'Rourke", "Bernie Sanders", "Elizabeth Warren")

candidate_trends <- trendy(candidates, from = "2019-01-01", to = Sys.Date())

Now that we have a trendy object, we can print it out to get a summary of the trends.

candidate_trends
## ~Trendy results~
## 
## Search Terms: Joe Biden, Kamala Harris, Beto O'Rourke, Bernie Sanders, Elizabeth Warren
## 
## (>^.^)> ~~~~~~~~~~~~~~~~~~~~ summary ~~~~~~~~~~~~~~~~~~~~ <(^.^<)
## # A tibble: 5 x 5
##   keyword          max_hits min_hits from       to        
##   <chr>               <int>    <int> <date>     <date>    
## 1 Bernie Sanders        100        3 2019-01-06 2020-02-23
## 2 Beto O'Rourke           4        1 2019-01-06 2020-02-23
## 3 Elizabeth Warren       18        2 2019-01-06 2020-02-23
## 4 Joe Biden              32        2 2019-01-06 2020-02-23
## 5 Kamala Harris          43        1 2019-01-06 2020-02-23

To retrieve the trend data, use get_interest(). Note that the result is a tibble, so it works well with dplyr.

get_interest(candidate_trends)
## # A tibble: 300 x 7
##    date                 hits geo   time          keyword  gprop category   
##    <dttm>              <int> <chr> <chr>         <chr>    <chr> <chr>      
##  1 2019-01-06 00:00:00     3 world 2019-01-01 2… Joe Bid… web   All catego…
##  2 2019-01-13 00:00:00     2 world 2019-01-01 2… Joe Bid… web   All catego…
##  3 2019-01-20 00:00:00     3 world 2019-01-01 2… Joe Bid… web   All catego…
##  4 2019-01-27 00:00:00     3 world 2019-01-01 2… Joe Bid… web   All catego…
##  5 2019-02-03 00:00:00     3 world 2019-01-01 2… Joe Bid… web   All catego…
##  6 2019-02-10 00:00:00     4 world 2019-01-01 2… Joe Bid… web   All catego…
##  7 2019-02-17 00:00:00     7 world 2019-01-01 2… Joe Bid… web   All catego…
##  8 2019-02-24 00:00:00     4 world 2019-01-01 2… Joe Bid… web   All catego…
##  9 2019-03-03 00:00:00     5 world 2019-01-01 2… Joe Bid… web   All catego…
## 10 2019-03-10 00:00:00     8 world 2019-01-01 2… Joe Bid… web   All catego…
## # … with 290 more rows
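Because the result is a tibble, it can go straight into a dplyr pipeline. The sketch below assumes candidate_trends is the trendy object created above and that hits is numeric, as in the printed output; the column names mean_hits and peak_date are made up for illustration.

```r
library(trendyy)
library(dplyr)

# Assumes `candidate_trends` was created with trendy() as shown above
candidate_trends %>%
  get_interest() %>%
  group_by(keyword) %>%
  summarise(
    mean_hits = mean(hits),            # average relative popularity per candidate
    peak_date = date[which.max(hits)]  # date on which each candidate peaked
  ) %>%
  arrange(desc(mean_hits))
```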

Plotting Interest

library(dplyr)    # provides %>%
library(ggplot2)

candidate_trends %>% 
  get_interest() %>% 
  ggplot(aes(date, hits, color = keyword)) +
  geom_line() +
  geom_point(alpha = .2) +
  theme_minimal() +
  theme(legend.position = "bottom") +
  labs(x = "", 
       y = "Relative Search Popularity",
       title = "Google Search Popularity")

It is also possible to view the related search queries for a given set of keywords using get_related_queries().

candidate_trends %>% 
  get_related_queries() %>% 
  group_by(keyword) %>% 
  sample_n(2)
## # A tibble: 10 x 5
## # Groups:   keyword [5]
##    subject  related_queries value                   keyword     category   
##    <chr>    <chr>           <chr>                   <chr>       <chr>      
##  1 Breakout rising          nevada caucus results   Bernie San… All catego…
##  2 28       top             buttigieg               Bernie San… All catego…
##  3 Breakout rising          jay inslee              Beto ORour… All catego…
##  4 46       top             bernie sanders          Beto ORour… All catego…
##  5 Breakout rising          bernie sanders elizabe… Elizabeth … All catego…
##  6 24       top             klobuchar               Elizabeth … All catego…
##  7 Breakout rising          biden ukraine           Joe Biden   All catego…
##  8 20       top             joe biden ukraine       Joe Biden   All catego…
##  9 Breakout rising          kamala harris logo      Kamala Har… All catego…
## 10 Breakout rising          kamala harris drops out Kamala Har… All catego…