CBC case 2015 solutions PDF

Title CBC case 2015 solutions
Author Stefano Pozzi
Course Statistica per l'economia
Institution Università degli Studi dell'Insubria
Pages 5
File Size 362.9 KB
File Type PDF

Summary

Notes ...


Description

Colonial Broadcasting Company CASE

Along with ABN and BBS, Colonial Broadcasting Company (CBC) is one of the three major American television networks. The dataset Colonial_Broadcasting_case.syz contains a sample of 88 TV movies broadcast by the three networks. These are the variables included in the dataset:
- Ratings: Nielsen rating for the movie
- Competition: average of the Nielsen ratings received by the two competing networks during the movie’s broadcast
- Top_movie: dummy variable, 1=successful movie, 0=unsuccessful movie
- Previous Ratings: Nielsen rating for the program immediately preceding the movie on the same network
- Budget: amount spent to produce the film (in millions of dollars)
- Fact: dummy variable, 1=based on true events (fact-based), 0=fictional
- Stars: dummy variable, 1=the movie has one or more “star” actors/actresses, 0=no stars in the movie

1. CBC is interested in analyzing the factors that explain the success of a film relative to its competitors. A movie is defined as successful when its rating is higher than the average rating of the competitors’ programs. Use a logistic regression model to study the relationship between the probability that the film is successful and the following variables: Fact, Previous Ratings and Budget. Comment on the meaning of the estimated coefficients and their statistical significance.

The regression coefficients can be interpreted as follows:
- Constant: for a fictional movie (Fact dummy = 0), with both Previous Ratings and Budget equal to zero, the estimated log-odds is -15.216, so the probability of success is very low (close to zero).
- Fact: keeping the other variables constant, the estimated log-odds of success is 1.841 higher for a fact-based movie than for a fictional movie.
- Previous Ratings: keeping the other variables constant, the estimated log-odds of success increases by 0.429 when the previous rating increases by one point.
- Budget: keeping the other variables constant, the estimated log-odds of success increases by 0.244 when the budget increases by one million dollars.

All the estimated coefficients are statistically different from zero (each p-value is low), so there is a significant (and positive) relationship between each independent variable and the probability that the film is successful.

Alternatively, it is possible to interpret the relationship between the probability of success and the independent variables using the estimated odds ratios:
- Fact: keeping the other variables constant, the odds for a fact-based movie are estimated to be 6.305 times the odds for a fictional movie.
- Previous Ratings: keeping the other variables constant, the odds for a movie that has a previous rating of (x*+1) points are estimated to be 1.536 times the odds for a movie that has a previous rating of x* points.
- Budget: keeping the other variables constant, the odds for a movie that has a budget of $(x*+1) million are estimated to be 1.277 times the odds for a movie that has a budget of $(x*) million.
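The odds ratios quoted above are the exponentials of the estimated coefficients. The following is a minimal Python sketch of that step, assuming the rounded coefficients reported in the text (so the results can differ from the reported odds ratios in the third decimal place). The dictionary and variable names are illustrative and do not come from the original software output.

```python
import math

# Estimated logit coefficients as reported in the text (rounded to 3 decimals).
coefficients = {
    "Fact": 1.841,
    "Previous Ratings": 0.429,
    "Budget": 0.244,
}

# An odds ratio is exp(coefficient): the multiplicative change in the odds
# for a one-unit increase in the variable, other variables held constant.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.3f}")
```

An odds ratio above 1 corresponds to a positive coefficient, which is why all three variables are read as raising the odds of success.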

2. Generate the confusion matrix using the default cut-off value. What is the true positive rate? Is the default cut-off (0.5) suitable in this specific case?

The true positive rate (sensitivity) is equal to 35/42 = 0.833. In the sample we observe 42 successful movies out of 88, so the a-priori probability (the relative frequency of the response category in the observed sample) is equal to 42/88 = 0.4773. A cut-off value of 0.5, even though it is not exactly equal to the a-priori probability, appears to be a rather suitable choice. As a matter of fact, changing the cut-off value to 0.4773 does not change the confusion matrix obtained with 0.5.
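The two ratios used in this answer can be checked with a few lines of Python. This is only a sketch of the arithmetic: the counts (35 correctly classified successes, 42 successful movies, 88 movies in total) are the ones stated in the text, while the function name is a hypothetical helper.

```python
def sensitivity(true_positives: int, actual_positives: int) -> float:
    """Share of the actually successful movies that the model classifies as successful."""
    return true_positives / actual_positives


# Counts taken from the text: 35 of the 42 successful movies are predicted correctly.
tpr = sensitivity(35, 42)

# A-priori probability: relative frequency of successful movies in the sample of 88.
prior = 42 / 88

print(f"True positive rate: {tpr:.3f}")
print(f"A-priori probability: {prior:.4f}")
```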

3. With 95% confidence, can we assume that a movie based on a true story, that costs 35 million dollars and is preceded by a program with a rating of 20, will have a probability higher than 90% of being a successful film?

Converting the point estimate and the 95% confidence interval for the odds into probabilities, we obtain:

Point estimate: 42.464 / (1 + 42.464) = 0.9770
LCL: 4.684 / (1 + 4.684) = 0.8241
UCL: 385.004 / (1 + 385.004) = 0.9974

Since the probability of 90% lies inside this interval (the lower limit, 0.8241, is below 0.90), we cannot state with 95% confidence that this type of film has a probability of success above 90%.
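The conversion from odds to probability used here is p = odds / (1 + odds). Because this transformation is increasing, the limits of the interval for the odds map directly onto the limits for the probability. A minimal Python sketch, assuming the three odds values reported above; the function and variable names are illustrative.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds into the corresponding probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


# Odds values reported in the text: point estimate and 95% confidence limits.
point = odds_to_probability(42.464)
lcl = odds_to_probability(4.684)
ucl = odds_to_probability(385.004)

print(f"Point estimate: {point:.4f}")
print(f"95% interval: [{lcl:.4f}, {ucl:.4f}]")

# The claim "probability above 90%" is supported at the 95% level only if
# the whole interval lies above 0.90, i.e. if the lower limit exceeds 0.90.
supported = lcl > 0.90
print(f"Probability above 90% supported: {supported}")
```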

4. If the previous rating remains constant (assume it is equal to 15), which is more likely to be successful on average: a film based on a true story that costs $25 million, or a fictional film that costs $45 million?

Comparing the estimated probabilities of success in the two situations, we observe that the fictional movie with a budget of $45 million has a higher probability of becoming a successful movie (about 90.1%) than the fact-based movie with a budget of $25 million (keeping Previous Ratings constant), which has a probability of 30.2%.
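The two probabilities can be approximately reproduced from the coefficients of the model in point 1. The sketch below assumes the rounded coefficients reported in the text, so the results may differ slightly from the percentages above (which presumably come from unrounded estimates). The function and variable names are hypothetical.

```python
import math

# Rounded coefficients reported in point 1.
INTERCEPT = -15.216
B_FACT = 1.841
B_PREVIOUS = 0.429
B_BUDGET = 0.244


def success_probability(fact: int, previous_rating: float, budget: float) -> float:
    """Estimated probability of success from the logistic model."""
    log_odds = (INTERCEPT + B_FACT * fact
                + B_PREVIOUS * previous_rating + B_BUDGET * budget)
    # Inverse logit: p = 1 / (1 + exp(-log_odds)).
    return 1.0 / (1.0 + math.exp(-log_odds))


# Fact-based movie, $25 million budget, previous rating of 15.
p_fact = success_probability(fact=1, previous_rating=15, budget=25)

# Fictional movie, $45 million budget, previous rating of 15.
p_fiction = success_probability(fact=0, previous_rating=15, budget=45)

print(f"Fact-based, $25M: {p_fact:.3f}")
print(f"Fictional, $45M: {p_fiction:.3f}")
```

In this comparison the extra $20 million of budget adds 20 × 0.244 = 4.88 to the log-odds, which more than offsets the 1.841 lost by not being fact-based.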

5. How do you evaluate the inclusion of the variable Stars in the logistic regression model estimated in point 1? Is the model more effective with or without the Stars variable?

After estimating the new model including the variable “Stars”, we can observe that:
- the first model (without Stars) exhibits a lower AIC than the second one (with Stars);
- the pseudo R-squared measures and the ROC take almost the same values in the two models;
- the variable Stars is not statistically significant in the second model.

Therefore, the model without the Stars variable can be considered better than the model with it, because it is simpler while keeping the same performance.

