Prepare a question about this code and post it on the RStudio Community thread listed in the assignment.
1. Ask a clear question about the code in English. Tell us what you expect and what is happening.
2. Use the reprex package to create an example in R.
Your reprex should be something that I can copy and paste on my machine and run right away.
Do NOT just share a screenshot of your RStudio session.
Consider these questions while preparing your reprex:
What do we expect this code to do, and what is happening?
Are the data accessible? Do we need to use the diabetes dataset, or will a built-in dataset do?
Is every part of this code necessary to show the problem?
Is there anything else missing?
diabetes <- read.csv("https://raw.githubusercontent.com/malcolmbarrett/au-stats412-612-01-reading_data/master/diabetes.csv")
just_height <- diabetes[, "height"]
ggplot(just_height, aes(x = height)) + geom_histogram()
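One way to work through the checklist above is to shrink the example to a built-in dataset and inspect each intermediate object before plotting. The following is a minimal sketch, assuming the built-in `women` dataset (which also has a `height` column) is an acceptable stand-in for `diabetes`; it is not the intended answer to the assignment, only an illustration of how to check what a subsetting step returns.

```r
# Assumption: the built-in `women` dataset stands in for `diabetes`.
# It needs no download, so anyone can run this immediately.
just_height <- women[, "height"]

# Inspect what the subsetting step actually produced.
is_df_single_bracket <- is.data.frame(just_height)

# Keeping the data frame structure requires drop = FALSE.
height_df <- women[, "height", drop = FALSE]
is_df_with_drop_false <- is.data.frame(height_df)

print(is_df_single_bracket)
print(is_df_with_drop_false)
```

Running a snippet like this through `reprex::reprex()` would render the code together with its output, ready to paste into the forum post.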

Create a report that summarizes and visualizes data that evaluate a topic of your own choice from the National Health and Nutrition Examination Survey (NHANES).
Section 1: Determine one category in “Laboratory Data” that you want to analyze
◦ Read the description of variables and search for information about how those measurements are used to assess individuals’ health (and maybe how the measure is calculated, if of interest and if not too complex)
◦ For example, you may pick Metals – Urine
Section 1: Search for references
◦ How the categories/variables are used in practice
◦ For example, a certain measurement is used to diagnose a certain disease or screen for organ failure
◦ Write an introduction describing the variable and the topic you want to examine
◦ Cite your references accordingly
◦ Do not copy and paste
Section 2: Data preprocessing
◦ Combine the data using “join” syntax
◦ For example:
◦ Demographics + Laboratory
◦ Demographics + Examination
◦ Examination + Laboratory
◦ Laboratory + Questionnaire
◦ Determine the types of your variables of interest
◦ Check how many missing values are in your variables of interest
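The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the required solution: the two data frames are made-up stand-ins for a Demographics file and a Laboratory file, `urine_lead` is a hypothetical column name, and the join uses base R `merge()` on `SEQN`, the respondent identifier NHANES files share. With dplyr, `left_join()` would play the same role.

```r
# Toy stand-ins for two NHANES files; all values are made up for illustration.
demo <- data.frame(
  SEQN = c(1, 2, 3, 4),
  age  = c(34, 51, 27, 62),
  sex  = factor(c("F", "M", "F", "M"))
)
lab <- data.frame(
  SEQN       = c(1, 2, 4, 5),
  urine_lead = c(0.42, NA, 1.10, 0.75)  # hypothetical variable name
)

# Left join: keep every demographics row and attach lab values where SEQN matches.
combined <- merge(demo, lab, by = "SEQN", all.x = TRUE)

# Determine the type of each variable of interest.
var_types <- sapply(combined, class)

# Count missing values per variable.
n_missing <- colSums(is.na(combined))

print(var_types)
print(n_missing)
```

A left join keeps respondents who have no laboratory record, so missing values introduced by the join show up in the same count as values missing in the original file.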

Week 5 Lab Assignment
Tree-based Models
Tree-based models are useful for classification and regression problems where one has a set of predictor variables X and a single response Y.
Why do we use tree-based models?
They are often easier to interpret and discuss than linear models. Also, we do not have to worry about missing values, variable transformations, or recoding categorical variables.
In this R lab assignment, we will use the rpart package in R. See Chapter 9 of the HOML book for the details.
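As a rough illustration of the workflow, the sketch below fits a classification tree with `rpart()`. It assumes the built-in `iris` dataset as a stand-in, since the lab's own data are not shown here; the factor response `Species` is passed directly, with no recoding.

```r
library(rpart)

# Assumption: iris stands in for the lab data.
# method = "class" requests a classification tree; the factor response
# is used as-is, without dummy coding.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Predict class labels for the training data and compute accuracy.
pred <- predict(fit, newdata = iris, type = "class")
train_accuracy <- mean(pred == iris$Species)

# Printing the fit shows the split rules, which is what makes trees easy to discuss.
print(fit)
print(train_accuracy)
```

For a numeric response, `method = "anova"` would fit a regression tree with the same call structure.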

# Exercise #1: List the Nine_neighbors
# your code here
Nine_balance<-arrange(names, income)[
Exercise 2: Use knn function in class package and predict labels in the test data with knn when k=5. Use set.seed(4230) and name the p
# Exercise #3: knn results when k=10
# your code here
# Exercise #4: Performance measure
# your code here
Exercise 5: Write a function to find the optimal k (the k value which minimizes the classification error) and call it optimal_k. In other words, at which value of k does the k_class_error take the minimum value?
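The exercises above can be sketched as follows. This is only one possible approach under stated assumptions: the assignment's data are not shown, so the built-in `iris` dataset and a random 100/50 split stand in for the training and test sets, and the argument list of `optimal_k` and the candidate range `1:20` are choices made for this illustration. Only `knn()` from the class package, `set.seed(4230)`, and the names `optimal_k` and `k_class_error` come from the exercise text.

```r
library(class)

# Assumption: iris with a random 100/50 split stands in for the lab data.
set.seed(4230)  # seed given in the exercise text
idx <- sample(nrow(iris), 100)
train_x <- iris[idx, 1:4]
test_x  <- iris[-idx, 1:4]
train_y <- iris$Species[idx]
test_y  <- iris$Species[-idx]

# Predict test labels with k = 5 and measure the classification error.
pred_k5 <- knn(train = train_x, test = test_x, cl = train_y, k = 5)
error_k5 <- mean(pred_k5 != test_y)

# Hypothetical signature: returns the k with the smallest test classification error.
optimal_k <- function(train_x, test_x, train_y, test_y, k_values = 1:20) {
  k_class_error <- sapply(k_values, function(k) {
    pred <- knn(train = train_x, test = test_x, cl = train_y, k = k)
    mean(pred != test_y)
  })
  # which.min() returns the first minimum, so ties go to the smallest k.
  k_values[which.min(k_class_error)]
}

best_k <- optimal_k(train_x, test_x, train_y, test_y)
print(error_k5)
print(best_k)
```

Because `knn()` breaks voting ties at random, setting the seed before calling it keeps the predictions reproducible.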

The problem set has 26 multiple-choice questions; the file is attached below. Please put all the answers in one MS Word file. Some of the questions require R. Please submit a text file with all the R commands that you use to answer those questions. After you accept this post, I will send you the data you will need by email, because I am not able to upload it here.
The questions cover:
1. Measurement error and missing values.
2. Proxy Variables.
3. Computer Problem on Heteroskedasticity.
And other topics.
Please check the file for the remaining topics and details.
Please take this question ONLY if you are good at R and econometrics.
Two files must be submitted: 1. An MS Word file with all 26 answers. 2. A text file with all R commands used.
Let me know if you have any questions. Thanks!