STAT 412/612 Week 12: Homework

forcats and lubridate

YOUR NAME

2020-04-10

 


 

Instructions

  • Submit your R Markdown file and your PDF, knitted directly from R Markdown, on Blackboard. Include only the code needed to answer the questions.
  • Learning outcomes:
      – Manipulate factors with forcats.
      – Manipulate dates with lubridate.

Question 1: Capital Bikeshare Data

  1. Load in the data containing trip information from the Capital Bikeshare program. Also load in the station information. Rename variables that have spaces in their names.

          trip data 

          station data

          Note: These data were originally from http://data.codefordc.org/group/transportation.
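The renaming step can be sketched on a toy tibble. This is only an illustration: the column names `Start date` and `Member Type` are hypothetical stand-ins, since the actual file and its column names are not shown here.

```r
library(dplyr)
library(tibble)
library(stringr)

# Toy stand-in for the trip data; these column names are assumptions.
trips_raw <- tibble(
  `Start date`  = "3/31/2016 23:59",
  `Member Type` = "Registered"
)

# Replace every space with an underscore and lower-case each column name.
trips_clean <- trips_raw %>%
  rename_with(~ str_to_lower(str_replace_all(.x, " ", "_")))

names(trips_clean)
```

Renaming by rule rather than one column at a time keeps the code short when many names contain spaces; `rename()` with backticked names is an equally valid choice for a handful of columns.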

  2. Parse the date-time information from the trip data. Recall that the times are recorded in the America/New_York time zone, not in UTC. Specify that in your parser.

## # A tibble: 6 x 9
##   duration start_time          end_time            start_station_n~
##      <dbl> <dttm>              <dttm>                         <dbl>
## 1   301295 2016-03-31 23:59:00 2016-04-01 00:04:00            31280
## 2   557887 2016-03-31 23:59:00 2016-04-01 00:08:00            31275
## 3   555944 2016-03-31 23:59:00 2016-04-01 00:08:00            31101
## 4   766916 2016-03-31 23:57:00 2016-04-01 00:09:00            31226
## 5   139656 2016-03-31 23:57:00 2016-03-31 23:59:00            31011
## 6   967713 2016-03-31 23:57:00 2016-04-01 00:13:00            31266
## # ... with 5 more variables: start_station <chr>, end_station_number <dbl>,
## #   end_station <chr>, bike_number <chr>, member_type <chr>
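As a minimal sketch of passing a time zone to a lubridate parser: the raw string format below (month/day/year hour:minute) is an assumption, and the real file may need a different parser such as `ymd_hms()`.

```r
library(lubridate)

# Hypothetical raw timestamp; the actual format in the file may differ.
raw_time <- "3/31/2016 23:59"

# tz tells the parser which zone the recorded clock time belongs to.
start_time <- mdy_hm(raw_time, tz = "America/New_York")

tz(start_time)

# The same clock time read as UTC refers to a different instant.
start_time == mdy_hm(raw_time, tz = "UTC")
```

Setting `tz` at parse time matters because the default is UTC, which would shift every trip by several hours relative to local time.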

  3. Calculate the average number of trips for each weekday (Sunday, Monday, Tuesday, ...), counting only days that have trips. There are several days with no trips.

  • Save the resulting days of week and corresponding average number of trips as a data frame called sumdf and print it out.
  • It should look like this:
```
## # A tibble: 7 x 2
##   wday  mean_num_trips
##   <ord>          <dbl>
## 1 Sun            5111.
## 2 Mon            6057.
## 3 Tue            6617.
## 4 Wed            6846.
## 5 Thu            7309.
## 6 Fri            6358.
## 7 Sat            6027
```
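The two-stage aggregation (count trips per calendar day, then average those counts per weekday) can be sketched on toy data. The five timestamps below are made up for illustration, and `toy_trips` and `toy_sumdf` are hypothetical names.

```r
library(dplyr)
library(tibble)
library(lubridate)

# Toy trips: three on Thu 2016-03-31, one on Fri 2016-04-01, one on Thu 2016-04-07.
toy_trips <- tibble(
  start_time = ymd_hm(
    c("2016-03-31 08:00", "2016-03-31 09:00", "2016-03-31 23:59",
      "2016-04-01 00:04", "2016-04-07 12:00"),
    tz = "America/New_York"
  )
)

toy_sumdf <- toy_trips %>%
  mutate(day = as_date(start_time)) %>%       # calendar day of each trip
  count(day, name = "num_trips") %>%          # trips per day (only days with trips appear)
  mutate(wday = wday(day, label = TRUE)) %>%  # ordered factor, Sunday first
  group_by(wday) %>%
  summarize(mean_num_trips = mean(num_trips))

toy_sumdf
```

Because `count()` only produces rows for days that occur in the data, days without trips never enter the average, which is the condition the question asks for.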

  4. Reproduce this plot in R:

 


 

  5. In a stunning show of contempt, the IEEE Computer Society decided to add a new weekday called “Fooday” with abbreviation “Foo”. Fooday was declared the first day of the week (ahead of Sunday).

On the first Fooday ever, people used Capital Bikeshare in record numbers, yielding 15567 trips. Add Fooday as the first level to the wday variable in sumdf and add its average number of trips (now 15567, since there has only been one Fooday so far).

Hint: Create a new data frame that contains the Fooday trips and use bind_rows().

Your final data frame should look like this:

```
## # A tibble: 8 x 2
##   wday  mean_num_trips
##   <fct>          <dbl>
## 1 Foo           15567
## 2 Sun            5111.
## 3 Mon            6057.
## 4 Tue            6617.
## 5 Wed            6846.
## 6 Thu            7309.
## 7 Fri            6358.
## 8 Sat            6027
```
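One possible way to follow the `bind_rows()` hint, sketched on a two-row stand-in for sumdf (the names `toy_sumdf`, `foo_df`, and `toy_with_foo` are hypothetical). Converting the ordered factor to character before binding is a design choice made here to avoid relying on how factor and character columns combine.

```r
library(dplyr)
library(tibble)
library(forcats)

# Two-row stand-in for sumdf with an ordered weekday factor.
toy_sumdf <- tibble(
  wday = factor(c("Sun", "Mon"), levels = c("Sun", "Mon"), ordered = TRUE),
  mean_num_trips = c(5111, 6057)
)

# One-row data frame holding the new weekday.
foo_df <- tibble(wday = "Foo", mean_num_trips = 15567)

# Put Fooday first, then let the levels follow the row order.
toy_with_foo <- bind_rows(
  foo_df,
  mutate(toy_sumdf, wday = as.character(wday))
) %>%
  mutate(wday = fct_inorder(factor(wday)))

levels(toy_with_foo$wday)
```

`fct_inorder()` sets the levels by first appearance, so placing the Fooday row first makes "Foo" the first level; `fct_relevel(wday, "Foo")` would be another way to reach the same ordering.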

  6. In another stunning show of contempt, the IEEE Computer Society decided to change the abbreviations from three letters to two letters. Change the levels of wday so that each day uses only two-letter abbreviations. Your final data frame should look like this:

## # A tibble: 8 x 2
##   wday  mean_num_trips
##   <fct>          <dbl>
## 1 Fo            15567
## 2 Su             5111.
## 3 Mo             6057.
## 4 Tu             6617.
## 5 We             6846.
## 6 Th             7309.
## 7 Fr             6358.
## 8 Sa             6027
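A short sketch of relabeling factor levels with forcats, using a three-level stand-in (`wday_toy` is a hypothetical name). Truncating to the first two characters is safe here only because the eight abbreviations stay unique after truncation.

```r
library(forcats)

# Stand-in factor with three of the eight weekday levels.
wday_toy <- factor(c("Foo", "Sun", "Mon"), levels = c("Foo", "Sun", "Mon"))

# Apply a function to every level; the level order is preserved.
wday_short <- fct_relabel(wday_toy, ~ substr(.x, 1, 2))

levels(wday_short)
```

`fct_recode()` is the more explicit alternative when the new labels cannot be derived from the old ones by a rule.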

  7. In the stations data frame, it seems that installDate is populated by the number of milliseconds since January 1, 1970, 00:00:00 (in the America/New_York time zone). Parse this into a date-time and make a histogram of the install dates. It should look something like this:
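A minimal sketch of the conversion, taking the description literally (an origin at midnight in America/New_York). The value of `install_ms` is made up for illustration. If the values are instead standard Unix timestamps counted from midnight UTC, `as_datetime(install_ms / 1000, tz = "America/New_York")` would be the matching call.

```r
library(lubridate)

# Hypothetical installDate value, in milliseconds since the origin.
install_ms <- 1285000000000

# Origin as described in the question: midnight on 1970-01-01, New York time.
origin_ny <- ymd_hms("1970-01-01 00:00:00", tz = "America/New_York")

# Date-times count seconds, so convert milliseconds before adding.
install_time <- origin_ny + install_ms / 1000

install_time
```

The resulting date-time column can then be passed to a histogram geometry such as `ggplot2::geom_histogram()`.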
