Recall that tab-delimited data are easily imported into R with the following command:
DF <- read.table('YOUR_FILE_PATH/alternative currency data.txt',header=TRUE,sep='\t')
If you don’t use this exact command, be sure to at least name your data table DF. You can then run attach(DF) so that the variable names described earlier can be used directly. Lastly, run Date <- as.Date(Date) to convert the date variable into R’s Date class, which R can plot and compare as dates.
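As a rough sketch of the setup steps above, the following writes a tiny made-up tab-delimited file and imports it the same way. The file contents are invented for illustration, and the sketch assumes the dates are stored as year-month-day text; the course file may differ. It also refers to columns as DF$Date rather than using attach, which is simply an alternative style and not a requirement.

```r
# Illustration only: a two-row stand-in file, NOT the course data.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Date\talt_cur\tbitcoin",
             "2014-01-01\t10\t200",
             "2014-01-02\t12\t180"), tmp)

# Same import call as in the assignment, pointed at the stand-in file.
DF <- read.table(tmp, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Assumes ISO-style dates (YYYY-MM-DD); other layouts need a format= argument.
DF$Date <- as.Date(DF$Date)

str(DF)   # check that Date is now of class Date and the counts are numeric
```

Running str(DF) after importing the real file is a quick way to confirm that every column was read with the type you expect.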
When submitting your answers, Blackboard will require you to enter the exact characters specified as the solutions. So, if your answer is a whole number, simply write the number with no decimal places. If your answer contains decimals, round it to two decimal places. Avoid rounding before you reach your final answer, as this will likely give you the wrong result. If your answer is nonnumeric, write it in all lowercase letters. If you make a simple mistake with these instructions, please contact your TA to receive partial credit.
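To see why early rounding matters, here is a small sketch with hypothetical numbers (slope and x are made-up names and values, not quantities from the data):

```r
# Hypothetical numbers, chosen only to show how early rounding changes an answer.
slope <- 2.346
x     <- 3

late  <- round(slope * x, 2)             # round once, at the very end
early <- round(round(slope, 2) * x, 2)   # round an intermediate value first

late
early
```

The two results differ in the second decimal place, which would be enough for Blackboard to mark the answer wrong.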
Suppose we only want to consider using the page views of one cryptocurrency (CC) to predict the page views of the phrase, Alternative currency (alt_cur). Which CC would be the worst choice to make predictions?
Does this CC (the answer to question 1) significantly predict alt_cur?
Now, consider the variable which best predicts alt_cur. Call it CCbest. What is the residual standard error associated with the regression of alt_cur on CCbest?
Plot CCbest against the residuals from the regression of alt_cur on CCbest. Do the residuals appear to be approximately normally distributed?
Run a regression predicting alt_cur from litecoin. What percent of the total variation in alt_cur can be explained by litecoin? Don’t include the percent sign but remember to round to two decimals.
Plot Date versus the residuals of the regression from question 5. Do the residuals follow the standard linear regression assumptions?
In what month do you find the most extreme residual? Don’t report the associated year. Answer in lowercase.
Which (if any) CC should not be used to predict alt_cur in a linear model due to statistical insignificance? Choose one of bitcoin, dogecoin, litecoin or none.
Which CC is least correlated with the other two CCs? If all are equally correlated, write none.
Calculate a new variable with the command, bit_rate <- 24*60^2/bitcoin. For a given date, the associated bit_rate observation represents the average number of seconds between searches for bitcoin on Wikipedia. Without doing any formal tests, would you say Date is linearly related to bitcoin, bit_rate, both or neither?
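The questions above rely on a handful of standard R tools. The sketch below shows where each quantity can be found, using simulated data in a data frame called sim (a hypothetical name). The simulated values are invented and say nothing about the real data set; only the column names follow the assignment.

```r
# Simulated stand-in data: values are invented and do NOT reflect the course data.
set.seed(1)
n <- 100
sim <- data.frame(
  Date     = seq(as.Date("2014-01-01"), by = "day", length.out = n),
  bitcoin  = rpois(n, 5000) + 1,   # +1 keeps the divisor in bit_rate nonzero
  dogecoin = rpois(n, 800),
  litecoin = rpois(n, 600)
)
sim$alt_cur <- 50 + 0.01 * sim$bitcoin + rnorm(n, sd = 5)

# Simple regression of alt_cur on a single CC.
fit <- lm(alt_cur ~ litecoin, data = sim)
s   <- summary(fit)

s$coefficients        # estimates, standard errors, t values and p-values
s$sigma               # residual standard error
100 * s$r.squared     # percent of total variation explained

# Residual diagnostics.
res <- resid(fit)
plot(sim$litecoin, res)   # predictor against residuals
plot(sim$Date, res)       # residuals over time
tolower(months(sim$Date[which.max(abs(res))]))   # month of the most extreme residual

# All three CCs in one model, and their pairwise correlations.
summary(lm(alt_cur ~ bitcoin + dogecoin + litecoin, data = sim))$coefficients
cor(sim[, c("bitcoin", "dogecoin", "litecoin")])

# Average number of seconds between searches on a given day.
sim$bit_rate <- 24 * 60^2 / sim$bitcoin
```

With the real data, replace sim with DF (or use the attached variable names) and keep full precision until the final rounding step.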