The {op3r}
package was created to bring the novel Open Podcast Prefix Project (OP3) API to R
users through an streamlined and straight-forward interface. A growing
number of podcasts adopting the Podcastion 2.0 namespace are opting in
to the OP3 service as a robust and transparent source of statistics
associated with their podcast.
The API endpoints supported by {op3r}
require an API
token. To obtain your own set, create a free developer account at https://op3.dev/api/docs
and save the token as an environment variables called
OP3_API_TOKEN
within a project-level or default
user-directory .Renviron
file. Below is an example (change
to match your token value):
OP3_API_TOKEN="abcd123"
Once the package is installed, you can verify if authentication is working as expected:
The API endpoints offered by the OP3 service offer the following general capabilities:
The examples below demonstrate each of these perspectives. The podcast used for the single-podcast examples is the R Weekly Highlights Podcast hosted by Eric Nantz and Michael Thomas. First we load additional R packages used for analysis and visualization:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
Each podcast using the OP3 project is assigned a universally unique identifier (UUID). Many of the functions tailored to specific podcasts rely on this identifier as a parameter. To find this identifier for a given podcast, you can use either of the following methods:
op3_show()
function and supply either the
podcast’s GUID or RSS feed URL for the required show_id
parameter:rweekly_show_rss <- "https://serve.podhome.fm/rss/bb28afcc-137e-5c66-b231-4ffad7979b44"
op3_show(show_id = rweekly_show_rss)
op3_get_show_uuid()
function with either the
podcast GUID or RSS feed URL for show_id
:Within the podcast industry, many hosts and hosting companies
gravitate towards download metrics. The
op3_downloads_show()
function obtains download statistics
for the previous 30 days of the time the function is executed (for
additional details on how OP3 calculates downloads refer to the official
documentation):
# using the OP3 UUID for R Weekly Highlights
show_id <- "c008c9c7cfe847dda55cfdde54a22154"
op3_downloads_show(show_id = show_id)
If you are interested in the downloads per week captured in the
metrics, you can run the same function with
nest_downloads = FALSE
:
op3_downloads_show(show_id = show_id, nest_downloads = FALSE) |>
select(weekNumber, weeklyDownloads)
Additional download metrics are available based on applications used
to play the podcast. The op3_top_show_apps()
function
obtains metrics over the last three calendar months from the time of
execution:
Each of the functions demonstrated above are convenient ways to obtain derived metrics already pre-computed behind the scenes in the corresponding OP3 API endpoints. The common source of information used in each of them are redirect logs (hits), a collection of metadata dynamically generated each time a client such as a podcast application or web browser renders the URL for a podcast episode. To illustrate, here is the most recent log metadata at the time this vignette was compiled:
On top of these logs being available across all shows, we can filter
the logs returned for a particular podcast by setting the
url
parameter to be either a direct link to a podcast
episode, or a version of the URL with a wildcard placeholder
*
to signal multiple episodes.
How can we obtain these episode download URLs? You certainly could
obtain the podcast’s RSS feed and manually inspect each episode entry to
find the direct link to an episode. However, another R package called {podindexr}
contains many functions to obtain metadata for podcasts registered on
the Podcast Index. For example,
the episodes_byurl()
function takes a podcast RSS feed as
input to obtain a collection of metadata associated with up to 1,000
episodes of a podcast. Here is the metadata associated with the last
three episodes of the R Weekly Highlights podcast:
# remotes::install_github("rpodcast/podindexr")
library(podindexr)
episode_df <- episodes_byurl(rweekly_show_rss, max = 3)
glimpse(episode_df)
The enclosureUrl
field gives us the direct link to each
podcast episode:
As seen in the output above, we can create a wildcard version of the
URL by replacing the MP3 file name at the end with a *
character. Now we can run a revised call to op3_hits()
using this URL as a parameter: