We are going to explore some more features of datasets, namely the factor data type. Factors can be very problematic, so we need to understand them well. Many functions may or may not want you to enter variables as factors. Let's use this homework as an opportunity to explore this in more detail. You can also look at section 6.1 of Matloff. The goal here is to help you deal better with problems when you encounter them.
R comes with a bunch of datasets pre-loaded, and many packages come with their own datasets as well.
Type data() to see what is loaded. Spend a minute here thinking about how you could use these packages to help yourself learn R. Then load the car package and load the Greene dataset.
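A minimal sketch of this step, assuming the car package is installed (the guard skips that part if it is not). In recent versions of car the datasets appear to live in a companion package, carData, which library(car) attaches for you; treat that detail as an assumption about your installed version. The object names loaded and base_sets are only illustrative.

```r
# data() returns an object whose $results matrix has one row per dataset.
loaded <- data()$results
head(loaded[, c("Package", "Item", "Title")])

# Restrict the listing to a single package; 'datasets' ships with base R.
base_sets <- data(package = "datasets")$results
nrow(base_sets)

# Load car and the Greene data, skipping this part if car is not installed.
if (requireNamespace("car", quietly = TRUE)) {
  library(car)
  data(Greene)
  str(Greene)  # shows which columns are factors and which are numeric
}
```

Running str() first is a quick way to see which variables are already factors before you start recoding.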
Create two new variables, continent and continentF, one of which is a factor and one of which is of class character. Come up with a strategy to recode nation into the two new continent variables. Be explicit about the levels of the factor and why you chose them. How would these levels make a difference if you added further data?
The variable decision is a factor with two levels. Convert this into a dummy variable that is numeric with 0s and 1s. Why might you want to do this?
Using the with function and the table function, explore several contingency tables. Write up your results. If you want a fancier version of crosstables, explore the CrossTable function in the descr package.
Tabulate all of the factor and character data in the dataset. That is, one line of code should yield six tables covering all the non-numeric variables. Hint: think about the apply family of functions.
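The exercises above lean on a few base R idioms: recoding through a lookup vector, building a factor with explicit levels, turning a two-level factor into 0/1, and tabulating every non-numeric column at once. The sketch below shows them on a small made-up data frame, not on Greene. The data values and the names toy, lookup, all_continents and decision01 are hypothetical, so read it as one possible approach rather than the intended solution.

```r
# Toy data frame standing in for the real data (all values are made up).
toy <- data.frame(
  nation   = c("Kenya", "China", "Poland", "China", "Ghana"),
  decision = factor(c("yes", "no", "no", "yes", "no"), levels = c("no", "yes")),
  score    = c(0.2, 0.5, 0.1, 0.9, 0.4),
  stringsAsFactors = FALSE
)

# Recode via a named lookup vector: names are nations, values are continents.
lookup <- c(Kenya = "Africa", Ghana = "Africa", China = "Asia", Poland = "Europe")
toy$continent <- unname(lookup[toy$nation])  # class character

# Declare every level you expect, even ones absent from the current data;
# table() then reports them with a count of 0.
all_continents <- c("Africa", "Americas", "Asia", "Europe", "Oceania")
toy$continentF <- factor(toy$continent, levels = all_continents)

# Two-level factor to numeric 0/1: compare against the level that should be 1.
toy$decision01 <- as.numeric(toy$decision == "yes")

# A contingency table; with() avoids repeating toy$ for each variable.
with(toy, table(continentF, decision))

# One table per non-numeric column, in a single line.
tabs <- lapply(toy[!sapply(toy, is.numeric)], table)
```

Comparing table(toy$continent) with table(toy$continentF) shows the practical difference: the character version only knows the values it has seen, while the factor keeps a row for every declared level.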