DAP Instructions and Questions PDF

Title DAP Instructions and Questions
Course Small Group Comm
Institution Cuesta College


COMM 215 Data Analysis Project Fall 2021

*** Please follow the instructions in this outline ***

Key Points
- Worth 8%
- To be completed (preferably) in groups of 5 people (all must be from the same section)
- Due on Tuesday, Dec. 07, 2021, before 24:00:00.
- Must be typed and saved as a Word document or PDF and submitted via upload on Moodle in the DAP folder.
- Use font size 10pt, Times New Roman, single-spaced
- Cover page must include the following for each group member:
  o Last Name
  o First Name
  o Student ID

Make sure to include 1 page that has:
- an Introduction: The introduction explains what you are doing and why, within the context of your analysis. Give a description of the nature and scope of the case, the importance of the problem, and the need for its resolution.
- a Conclusion: Draw an inference about your results. Tell me what you think you found! The conclusion is where I am going to know if you have learned your material. Show me what you know by giving me insightful and relevant explanations of your results, as well as any recommendations you may have. Be sure to include the reasons and logic behind your statements (why did you say what you said). Provide a story here!

A discussion of your results is to be included with your answers for each question. The discussion presents the results for the questions asked below (in the order in which they are asked).

NOTE: Make sure your DAP submission is in line with these 3 simple guidelines: NEAT, CLEAR, and WELL-ORGANIZED work


Relevant information on some properties sold in the United States between 2004 and 2010 was collected. Information on the customers who purchased these properties was also collected. Your role as a team of data scientists is to analyze the collected data. Here is a description of the 14 variables providing information on 160 observations in the data set:

Column   Description
A        Categorical Data: ID number of the property sold
B        Categorical Data: Year that the property was sold
C        Categorical Data: Type of property that was sold
D        Ratio Data: Area in square feet of the property sold
E        Ratio Data: Price of the property
F        BLANK
G        Categorical Data: Customer ID of the customer who purchased the property
H        Ratio Data: Age of the customer at the time of purchase
I        Categorical Data: Age Interval to which the customer belonged at the time of purchase
J        Categorical Data: Gender of the customer
K        Categorical Data: State is the US state where the customer lives
L        Categorical Data: Purpose of the property purchase
M        Categorical/Ordinal Data: Deal Satisfaction is the customer’s level of satisfaction with the purchase deal (“1” = Very dissatisfied; “2” = Dissatisfied; “3” = Neutral; “4” = Satisfied; “5” = Very satisfied)
N        Categorical Data: Mortgage indicates whether the customer purchased with a mortgage or not
O        Categorical Data: Source is the source of information from which the customer found out about the property for sale


PART I – DESCRIPTIVE STATISTICS

1. Create a frequency distribution table for each of the following 5 variables:
- Area of property (column D)
- Price of property (column E)
- Age interval of customer (column I)
- State where customer lives (column K)
- Deal Satisfaction (column M)
For all ratio-level variables included in this list, calculate the mean, median, mode, standard deviation, and range.

2. For each variable in Question 1, build a bar chart or histogram. Summarize the overall shape of the distribution.

3. Inspect the frequency distributions and graphs of each variable. Choose appropriate measures of central tendency and dispersion for ordinal-level and ratio-level variables. Additionally, for ratio-level variables, check for skewness. Write a sentence or two describing each variable.
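
As a rough illustration of the Part I calculations, the following Python sketch builds a frequency table and computes the requested summary measures using only the standard library. The function names, the class width of 500, and the sample values are hypothetical and are not taken from the project data set. Skewness is measured here with Pearson's second coefficient, 3 × (mean − median) / standard deviation, which is only one common choice; the measure expected in the course may differ.

```python
from collections import Counter
import statistics


def frequency_table(values):
    """Return (category, count, relative frequency) rows, sorted by category."""
    counts = Counter(values)
    n = len(values)
    return [(cat, cnt, cnt / n) for cat, cnt in sorted(counts.items())]


def binned_frequency_table(values, width):
    """Group ratio-level values into classes [lower, lower + width)."""
    lowers = [int(v // width) * width for v in values]
    return frequency_table(lowers)


def describe(values):
    """Mean, median, mode(s), sample standard deviation, range, skewness."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "mean": mean,
        "median": median,
        "modes": statistics.multimode(values),
        "stdev": sd,
        "range": max(values) - min(values),
        # Pearson's second skewness coefficient (an assumed choice).
        "skewness": 3 * (mean - median) / sd,
    }


# Hypothetical values, for illustration only.
satisfaction = [5, 3, 4, 5, 1, 5, 4, 3]
areas = [700, 800, 800, 900, 1300]

for row in frequency_table(satisfaction):
    print(row)
print(binned_frequency_table(areas, 500))
print(describe(areas))
```

A positive skewness value, as in this small example where the mean exceeds the median, suggests a longer right tail, which a histogram should confirm.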

PART II – ESTIMATION

1. For the variables Area (column D), Price (column E), and Age of customer (column H), determine the 95% confidence interval for each mean value. For each variable, write a statement reporting the variable that you are estimating, the interval that you obtain, and the confidence level and sample size used.

2. For the variable Purpose (column L), compute the proportion of the sample found in the category “Home” and in the category “Investment”. Construct a 90% confidence interval for the proportion of “Investment” purchases. Repeat this exercise for the following variables:
- Deal Satisfaction (column M), with a 90% confidence interval for the proportion of customers being very satisfied (“5”)
- Mortgage (column N), with a 90% confidence interval for the proportion of customers who purchased with a mortgage (“yes”)
- Source (column O), with a 90% confidence interval for the proportion of customers who used a website to find their property
For each variable, write a statement reporting the confidence interval for the selected category of the variable that you are estimating, the confidence level, and the sample size used.
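
The two kinds of interval in Part II can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names and sample values are hypothetical, and the default critical value is the standard normal z from `statistics.NormalDist`, which is a large-sample approximation. If the course expects a t critical value (n − 1 degrees of freedom) for the mean, it can be read from a table and passed in through the `crit` parameter.

```python
import math
import statistics
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value (about 1.96 for 0.95)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def mean_ci(values, confidence=0.95, crit=None):
    """Interval for a mean: x_bar +/- crit * s / sqrt(n)."""
    n = len(values)
    if crit is None:
        crit = z_critical(confidence)  # large-sample approximation
    x_bar = statistics.mean(values)
    margin = crit * statistics.stdev(values) / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def proportion_ci(values, category, confidence=0.90):
    """Interval for a proportion: p +/- z * sqrt(p * (1 - p) / n)."""
    n = len(values)
    p = sum(1 for v in values if v == category) / n
    margin = z_critical(confidence) * math.sqrt(p * (1 - p) / n)
    return p, (p - margin, p + margin)


# Hypothetical values, for illustration only.
ages = [30, 40, 50, 60, 70]
purposes = ["Home"] * 60 + ["Investment"] * 40

# With only 5 values, a t-table value (2.776 for 4 df at 95%) is passed in.
print(mean_ci(ages, crit=2.776))
print(proportion_ci(purposes, "Investment", confidence=0.90))
```

With 160 observations the z and t critical values are close, but they are not identical, so the report should state which one was used.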

PART III – SIGNIFICANCE TESTING

1. Using the chi-square test for independence, determine if the variables Deal Satisfaction and Source are independent or not. Write out your hypothesis statements and conduct a test on your null hypothesis at a level of significance of 0.05. What is your conclusion?
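
A chi-square test for independence compares each observed cell count with the count expected under independence (row total × column total / n). The Python sketch below computes the test statistic and degrees of freedom from two paired lists of categories. The function name, the category labels, and the counts are hypothetical; the small lookup table holds standard 5% critical values for 1 to 8 degrees of freedom, and any other case would need a fuller chi-square table.

```python
from collections import Counter

# Upper 5% chi-square critical values for 1 to 8 degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488,
                5: 11.070, 6: 12.592, 7: 14.067, 8: 15.507}


def chi_square_independence(rows, cols):
    """Return (statistic, degrees of freedom) for two paired categorical lists."""
    n = len(rows)
    observed = Counter(zip(rows, cols))
    row_totals = Counter(rows)
    col_totals = Counter(cols)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            # Counter returns 0 for cells that never occur.
            stat += (observed[(r, c)] - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical 2 x 2 example, for illustration only.
satisfaction = ["high"] * 40 + ["low"] * 40
source = (["Website"] * 30 + ["Agency"] * 10
          + ["Website"] * 10 + ["Agency"] * 30)

stat, df = chi_square_independence(satisfaction, source)
reject_h0 = stat > CHI2_CRIT_05[df]
print(stat, df, reject_h0)
```

Here the null hypothesis is that the two variables are independent; it is rejected at the 0.05 level when the statistic exceeds the critical value for the computed degrees of freedom.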


PART IV – ANALYZING STRENGTH AND SIGNIFICANCE OF RELATIONSHIPS

1. Create a scatter plot displaying the relationship between the variables Area in square feet (column D) and Price of the property (column E). Compute the correlation between these 2 variables. Repeat this exercise for the following pairs of variables:
- Age Interval (column I) and Price (column E)
- Age (column H) and Area (column D)
Write a statement on the strength and direction of the relationship displayed in each plot. Does it agree with the correlation coefficient found for each pair?

2. a. Construct a simple regression model that describes the relationship between the Price of a property (dependent variable) and the Area of a property (independent variable). What is the relationship? (i.e., provide an equation.) Also:
- determine the 95% confidence interval of the model’s slope
- calculate the expected price for a property that has an area of 1200 sq. feet, and calculate a 95% confidence interval for the mean value of the dependent variable

b. Now construct another simple regression model that describes the relationship between the Area of a property (dependent variable) and the Age of the customer at the time of purchase (independent variable). What is this relationship? (i.e., provide an equation.)
- determine the 95% confidence interval of the model’s slope

Write a report that presents and analyzes the relationships in parts (a) and (b) of this section. Discuss the significance of each relationship and discuss the 95% confidence interval of each model’s slope. Which of these 2 models is a better predictive model? Support your answer with the results obtained, the scatter plots produced in Question 1 of this section, and any other information needed to justify your choice.
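
The correlation and simple regression calculations in Part IV can be sketched in Python as follows. This is an illustrative sketch only: the function names and the five data points are hypothetical, prices are assumed to be in thousands, and the default critical value is a normal approximation. For small samples a t value with n − 2 degrees of freedom should be passed through `crit`, as done in the example.

```python
import math
from statistics import NormalDist, mean


def fit_simple_regression(x, y, confidence=0.95, crit=None):
    """Least-squares fit of y = intercept + slope * x, with a slope interval."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Standard error of the estimate, with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    if crit is None:
        # Large-sample normal approximation (an assumption of this sketch).
        crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    slope_margin = crit * s_e / math.sqrt(sxx)
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx, "s_e": s_e, "crit": crit,
        "slope": slope, "intercept": intercept,
        "r": sxy / math.sqrt(sxx * syy),  # Pearson correlation coefficient
        "slope_ci": (slope - slope_margin, slope + slope_margin),
    }


def mean_response_ci(fit, x0):
    """Predicted mean of y at x0 and its confidence interval."""
    y_hat = fit["intercept"] + fit["slope"] * x0
    spread = math.sqrt(1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"])
    margin = fit["crit"] * fit["s_e"] * spread
    return y_hat, (y_hat - margin, y_hat + margin)


# Hypothetical values, for illustration only (price in thousands).
areas = [800, 1000, 1200, 1400, 1600]
prices = [160, 210, 230, 290, 310]

# t-table value for 3 degrees of freedom at 95% confidence.
fit = fit_simple_regression(areas, prices, crit=3.182)
print(fit["slope"], fit["intercept"], fit["r"], fit["slope_ci"])
print(mean_response_ci(fit, 1200))
```

If the slope interval excludes zero, the relationship is significant at the matching level. Comparing the correlation coefficients and the widths of the slope intervals is one way to support the choice between the two models.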
