Gathering Data

In this activity, you'll define a research question, collect data from your classmates, and clean the data so it can be entered into Snap!.

In a later activity, you'll input the data in Snap! and create your own visualization of the results, a pictograph like this one:
Image description: a pictograph with ice cream cones. The vertical axis is labeled 'Number of students', and the categories along the horizontal axis are 'MintChip' (three ice cream cones), 'Vanilla' (four ice cream cones), and 'Orange' (one ice cream cone).
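
The later activity builds the pictograph in Snap!, but the idea behind it can be previewed in a few lines of Python. This is only a rough sketch: the list of responses is made up to match the counts in the pictograph described above, and the function names `tally` and `text_pictograph` are invented for this illustration.

```python
from collections import Counter

# Made-up responses, chosen to match the pictograph described above:
# three MintChip, four Vanilla, one Orange.
responses = [
    "MintChip", "Vanilla", "Vanilla", "Orange",
    "MintChip", "Vanilla", "MintChip", "Vanilla",
]


def tally(answers):
    """Count how many times each answer appears."""
    return Counter(answers)


def text_pictograph(counts, symbol="*"):
    """Build one line per category, with one symbol per student."""
    return [f"{category}: {symbol * count}" for category, count in counts.items()]


counts = tally(responses)
for line in text_pictograph(counts):
    print(line)
```

Each symbol stands for one student, just as each ice cream cone does in the pictograph.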

Every day, thousands of professional data analysts go through the process of collecting and understanding data.
Define the question → Collect the data → Clean the data → Analyze the data → Visualize and share findings

Defining Your Research Question

First, come up with a question that has a relatively small number of possible answers (2-12) and for which each person will only have one response. Here are some ideas:

  1. Decide on a research question. (Make sure it's one for which each person will only have one response.)
  2. List some possible responses. There should be a relatively small number (2-12). (Note that a question like "What is your favorite book?" is not a great choice for this project because there are a large number of possible responses.)

Collecting Your Data

  1. Ask each classmate your question, record their responses, and paste a link to the results.

Cleaning Your Data

Reference: Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says
Data cleaning is a crucial part of the data analysis process. In the survey referenced above, data scientists reported spending about 60% of their time cleaning and organizing data!

So, what makes data messy? That depends on the kind of data collected, but in this case, there might be variation in the ways people respond to the question. You'll need to find and remove those differences so the data can be analyzed by a computer. Here are some examples:

  1. Go through your data and check whether any of it needs to be cleaned up. You'll have to make some decisions about how you want to organize the data.
  2. Save your cleaned-up data so you can find it again later.
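
If you are curious how a computer could help with the cleanup steps above, here is a minimal Python sketch. Everything in it is hypothetical: the raw answers, the `CANONICAL` table of accepted spellings, and the function names are assumptions made for illustration, not part of the activity. It treats answers that differ only in capitalization or spacing as the same response and sets aside anything it doesn't recognize so that a person can decide what to do with it.

```python
# Hypothetical raw answers, as classmates might have typed them.
raw_responses = ["Vanilla", "vanilla ", "MINT CHIP", "mint chip", "Mintchip", "orange", ""]

# Hypothetical decisions about which spellings count as the same answer.
# Keys are lowercase with spaces removed; values are the labels to keep.
CANONICAL = {
    "vanilla": "Vanilla",
    "mintchip": "MintChip",
    "orange": "Orange",
}


def clean_response(raw):
    """Return the agreed-upon label for one answer, or None if unrecognized."""
    key = raw.strip().lower().replace(" ", "")
    return CANONICAL.get(key)


def clean_all(raw_list):
    """Split answers into cleaned labels and answers that need a human decision."""
    cleaned = []
    needs_review = []
    for raw in raw_list:
        label = clean_response(raw)
        if label is None:
            needs_review.append(raw)
        else:
            cleaned.append(label)
    return cleaned, needs_review


cleaned, needs_review = clean_all(raw_responses)
print(cleaned)
print(needs_review)
```

The decisions still belong to you: the table of accepted spellings is where your choices about how to organize the data get written down.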
In this activity, you defined a research question, collected responses to your question, and cleaned the data for entry into Snap!.