The Open Movie Database (OMDb) has an API that lets you query lots of movie metadata (title, IMDb rating, plot summary, etc.). The webpage has instructions and example queries. This lab will give you practice querying an API.

library(jsonlite)
library(stringr)
library(tidyverse)

Example query

The code below shows you how to query the OMDb API based on a movie title.

# URL for API request based on a movie title
# (OMDb now requires an API key: append &apikey= followed by your own key)
url <- "http://www.omdbapi.com/?t=Logan&y=&plot=full&r=json"

# query the API to get some json
raw_json <- readLines(url)

# turn the json into a nice list
movie <- fromJSON(raw_json)
movie
## $Title
## [1] "Logan"
##
## $Year
## [1] "2017"
##
## $Rated
## [1] "R"
##
## $Released
## [1] "03 Mar 2017"
##
## $Runtime
## [1] "137 min"
##
## $Genre
## [1] "Action, Drama, Sci-Fi"
##
## $Director
## [1] "James Mangold"
##
## $Writer
## [1] "James Mangold (story by), Scott Frank (screenplay), James Mangold (screenplay), Michael Green (screenplay)"
##
## $Actors
## [1] "Hugh Jackman, Patrick Stewart, Dafne Keen, Boyd Holbrook"
##
## $Plot
## [1] "In 2029 the mutant population has shrunk significantly and the X-Men have disbanded. Logan, whose power to self-heal is dwindling, has surrendered himself to alcohol and now earns a living as a chauffeur. He takes care of the ailing old Professor X whom he keeps hidden away. One day, a female stranger asks Logan to drive a girl named Laura to the Canadian border. At first he refuses, but the Professor has been waiting for a long time for her to appear. Laura possesses an extraordinary fighting prowess and is in many ways like Wolverine. She is pursued by sinister figures working for a powerful corporation; this is because her DNA contains the secret that connects her to Logan. A relentless pursuit begins - In this third cinematic outing featuring the Marvel comic book character Wolverine we see the superheroes beset by everyday problems. They are ageing, ailing and struggling to survive financially. A decrepit Logan is forced to ask himself if he can or even wants to put his remaining powers to good use. It would appear that in the near-future, the times in which they were able put the world to rights with razor sharp claws and telepathic powers are now over."
##
## $Language
## [1] "English, Spanish"
##
## $Country
## [1] "USA"
##
## $Awards
## [1] "N/A"
##
## $Poster
## [1] "https://images-na.ssl-images-amazon.com/images/M/MV5BMjI1MjkzMjczMV5BMl5BanBnXkFtZTgwNDk4NjYyMTI@._V1_SX300.jpg"
##
## $Ratings
##                    Source  Value
## 1 Internet Movie Database 8.5/10
## 2         Rotten Tomatoes    93%
## 3              Metacritic 77/100
##
## $Metascore
## [1] "77"
##
## $imdbRating
## [1] "8.5"
##
## $imdbVotes
## [1] "181,361"
##
## $imdbID
## [1] "tt3315342"
##
## $Type
## [1] "movie"
##
## $DVD
## [1] "N/A"
##
## $BoxOffice
## [1] "N/A"
##
## $Production
## [1] "20th Century Fox"
##
## $Website
## [1] "http://www.foxmovies.com/movies/logan"
##
## $Response
## [1] "True"
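
Every field in the parsed list comes back as a character string, so values such as the rating or runtime need converting before you can compute with them. A minimal sketch, using a hand-copied subset of the response above as an inline JSON string so that it runs without a network call; the variable names are illustrative only:

```r
library(jsonlite)

# a hand-copied subset of the Logan response shown above
sample_json <- '{"Title":"Logan","Year":"2017","Runtime":"137 min",
                 "imdbRating":"8.5","imdbVotes":"181,361","Response":"True"}'

movie <- fromJSON(sample_json)

# fields are accessed by name and are all character strings
rating <- as.numeric(movie$imdbRating)
year   <- as.integer(movie$Year)

# strip the unit and the thousands separator before converting
runtime <- as.numeric(sub(" min", "", movie$Runtime, fixed = TRUE))
votes   <- as.numeric(gsub(",", "", movie$imdbVotes, fixed = TRUE))
```

Whether to convert inside your function or afterwards on the finished data frame is a design choice; the lab does not require either.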

Question 1

1a. Write a function that takes a list of movie titles, queries OMDb for each movie, and returns a data frame with the columns indicated in the function body below.

# given some movie titles returns a data frame with metadata about each movie
get_movie_data_from_title <- function(titles){

    # start and end of query url
    url_start <-
    url_end <-

    # movie data frame to return
    # get the data for each column
    movie_df <- tibble(title = titles,
                       imdbRating = NA,
                       imdbID = NA,
                       Year = NA,
                       Rated = NA,
                       Runtime = NA,
                       Genre = NA)


    # for each title in the list, query the database and extract the metadata
    # (seq_along() also behaves correctly when titles is empty)
    for(i in seq_along(titles)){
        title <- titles[i]

        # replace all spaces in title with +
        modified_title <-

        # create url for movie (hint: paste())
        url <-

        # query the API to get some json
        raw_json <-

        # turn the json into a nice list
        movie <-

        # Response is the string "False" if the title was not found (e.g. misspelled)
        if(movie$Response == "True"){

            # fill in the value for each column
            movie_df[i, 'imdbRating'] <-
        }

    }

    return(movie_df)
}
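
The `Response` field arrives as the string "True" or "False" rather than as an R logical, so comparing against the string explicitly is the least fragile check. A minimal sketch, assuming a failed lookup returns JSON of the shape shown in `not_found_json` (copied by hand, so treat the exact error text as an assumption); `found()` is a hypothetical helper, not part of the lab template:

```r
library(jsonlite)

# hypothetical helper: TRUE only when the API reports a successful lookup
found <- function(movie) {
    identical(movie$Response, "True")
}

# inline stand-ins for a successful and an unsuccessful API response
ok_json        <- '{"Title":"Logan","imdbRating":"8.5","Response":"True"}'
not_found_json <- '{"Response":"False","Error":"Movie not found!"}'

found(fromJSON(ok_json))
found(fromJSON(not_found_json))
```

Using `identical()` also returns `FALSE` when the `Response` field is missing altogether, instead of raising an error inside `if()`.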

1b. Test your function by querying five movies.

titles <- c("Harry Potter and the Sorcerer's Stone", "Logan")
get_movie_data_from_title(titles)
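
The first test title contains an apostrophe as well as spaces. Replacing spaces with `+` is what the exercise asks for; a more general alternative is to percent-encode the whole title with `URLencode()` from base R's utils package. A sketch of that alternative, where `build_title_url()` is a hypothetical helper and the query parameters are copied from the example URL earlier in this lab:

```r
# hypothetical helper: build a by-title query URL with a percent-encoded title
build_title_url <- function(title) {
    # reserved = TRUE also encodes reserved characters such as & and ?
    encoded <- URLencode(title, reserved = TRUE)
    paste0("http://www.omdbapi.com/?t=", encoded, "&y=&plot=full&r=json")
}

url <- build_title_url("Harry Potter and the Sorcerer's Stone")
url
```

For a title with nothing to encode, such as "Logan", the helper reproduces the example URL exactly.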

Question 2

2a. Modify the above function to take a list of IMDb IDs instead. Hint: this should be a matter of modifying the URL; you can get an example from the OMDb website under the By ID section.

get_movie_data_from_imdb_id <- function(imdb_ids){

}
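
As a sketch of the hint, the by-ID form of the query swaps the `t=` (title) parameter for `i=` (IMDb ID). The helper name `build_id_url()` is hypothetical, and the remaining parameters are assumed to carry over unchanged from the title example:

```r
# hypothetical helper: build a by-ID query URL
build_id_url <- function(imdb_id) {
    # IMDb IDs such as "tt3315342" contain no characters that need encoding
    paste0("http://www.omdbapi.com/?i=", imdb_id, "&plot=full&r=json")
}

build_id_url("tt3315342")
```

Because `paste0()` is vectorized, passing a character vector of IDs returns one URL per ID.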

2b. Test your function by querying five movies. Hint: You can find a movie’s IMDb ID on its IMDb page.

imdb_ids <- c('tt0241527')
get_movie_data_from_imdb_id(imdb_ids)