An Easy Way to Report your Results using R Markdown

by Vivian Lew

The headers above were created by using

===================================

and

-----------------------------------

with text above each one.

We use three backticks (grave accents) followed by curly brackets containing r, i.e. {r}, to create an R “chunk”, or we can use the dropdown menu (Code > Insert Chunk) to do the same. I'm going to read your data that I just downloaded:

CENSUS <- read.csv("~/Downloads/SVHCENSUS.csv")
event <- read.csv("~/Downloads/2013 event report.csv", sep = ";", dec = ",", 
    stringsAsFactors = FALSE)

You can hide code by using echo=FALSE (see the original file for an example)

##                 DATETIME        COUNT             DOW      
##  2013-11-03 01:00:00:   2   Min.   : 0.0   Friday   :1248  
##  2013-01-01 00:00:00:   1   1st Qu.: 4.0   Monday   :1248  
##  2013-01-01 01:00:00:   1   Median : 7.0   Saturday :1248  
##  2013-01-01 02:00:00:   1   Mean   : 7.2   Sunday   :1248  
##  2013-01-01 03:00:00:   1   3rd Qu.:10.0   Thursday :1248  
##  2013-01-01 04:00:00:   1   Max.   :27.0   Tuesday  :1272  
##  (Other)            :8753                  Wednesday:1248  
##       HOUR            DAY      
##  Min.   : 0.00   Min.   : 1.0  
##  1st Qu.: 5.75   1st Qu.: 8.0  
##  Median :11.50   Median :16.0  
##  Mean   :11.50   Mean   :15.7  
##  3rd Qu.:17.25   3rd Qu.:23.0  
##  Max.   :23.00   Max.   :31.0  
## 

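If you are wondering why there are two rows for November 3rd 01:00 and no row for March 10th 02:00, it's due to daylight saving time: clocks spring forward in March (so 02:00 never occurs) and fall back to standard time in November (so 01:00 occurs twice). So for example: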

CENSUS[as.POSIXlt(as.character(CENSUS$DATETIME)) %in% c(as.POSIXlt("2013-03-10 01:00:00"), 
    as.POSIXlt("2013-03-10 02:00:00"), as.POSIXlt("2013-03-10 03:00:00")), ]
##                 DATETIME COUNT    DOW HOUR DAY
## 1634 2013-03-10 01:00:00     3 Sunday    1  10
## 1635 2013-03-10 03:00:00     2 Sunday    3  10
CENSUS[as.POSIXlt(as.character(CENSUS$DATETIME)) == as.POSIXlt("2013-11-03 01:00:00"), 
    ]
##                 DATETIME COUNT    DOW HOUR DAY
## 7345 2013-11-03 01:00:00     0 Sunday    1   3
## 7346 2013-11-03 01:00:00     2 Sunday    1   3
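
To see where the missing and duplicated hours come from, here is a small sketch that steps through both clock changes one hour at a time. The time zone "America/Los_Angeles" is my assumption (the data do not state one); any US zone that observes daylight saving time should show the same pattern.

```r
# Assumption: the hospital clock follows a US zone with daylight saving time.
tz <- "America/Los_Angeles"

# Step forward in true one-hour increments across each clock change.
spring <- seq(as.POSIXct("2013-03-10 00:00:00", tz = tz), by = "hour", length.out = 4)
fall   <- seq(as.POSIXct("2013-11-03 00:00:00", tz = tz), by = "hour", length.out = 4)

# Wall-clock labels: 02:00 is skipped in March, 01:00 appears twice in November.
spring_labels <- format(spring, "%H:%M")
fall_labels   <- format(fall, "%H:%M")
print(spring_labels)
print(fall_labels)
```

This is why an hourly file for 2013 can hold the same label twice on one day and skip a label on another, even though no hour of real time is missing.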

I tend to not load any packages when I program in R primarily because my memory is beginning to fail me and I can't remember which package contains which function… so I use basic functions. as.POSIXlt will force the conversion of a character (NOTE: CHARACTER, NOT FACTOR) object into a date/time representation in R. A useful package for you may be lubridate but everything you need can be accomplished in base R.
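
The as.character() wrapper in the code above matters because, in versions of R before 4.0, read.csv turns text columns into factors by default. Here is a minimal sketch with made-up values (not taken from the CENSUS file), using an explicit format and a fixed time zone so the result does not depend on local settings:

```r
# Toy factor standing in for a date-time column that was read in as a factor.
dt_factor <- factor(c("2013-03-10 01:00:00", "2013-03-10 03:00:00"))

# Convert to character first, then parse with an explicit format.
dt <- as.POSIXlt(as.character(dt_factor), format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# A POSIXlt object stores components (hour, mday, ...) that can be pulled out directly.
hours <- dt$hour
print(hours)
```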

If you need to compute time differences, you can subtract the date-times directly, or use the handy difftime function. I use the head function to test my code on the first 6 observations, then save the full result to a new object and run some stats:

req_DATETIME <- as.POSIXlt(paste(event$REQ_DATE, event$REQ_TIME, sep = " "), 
    format = "%m/%d/%y %H:%M")
start_DATETIME <- as.POSIXlt(paste(event$START_DATE, event$START_TIME, sep = " "), 
    format = "%m/%d/%y %H:%M")
difftime(head(start_DATETIME[event$EVENT_NAME == "Urine Collect"]), head(req_DATETIME[event$EVENT_NAME == 
    "Urine Collect"]), units = "mins")
## Time differences in mins
## [1]  NA 145   4   6  NA  NA
## attr(,"tzone")
## [1] ""
UC <- difftime(start_DATETIME[event$EVENT_NAME == "Urine Collect"], req_DATETIME[event$EVENT_NAME == 
    "Urine Collect"], units = "mins")
mean(UC, na.rm = TRUE)
## Time difference of 62.2 mins
quantile(UC, c(0.01, 0.05, 0.25, 0.5, 0.75, 0.9, 0.95), na.rm = TRUE)
## Time differences in mins
##    1%    5%   25%   50%   75%   90%   95% 
##   0.0   2.0  16.0  38.0  88.0 146.0 188.3
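
To make the difference between "just math" and difftime concrete, here is a small sketch with one illustrative pair of times (the variable names are mine, not from the event file). Plain subtraction lets R choose the units, while difftime lets you fix them:

```r
# Illustrative request and start times (UTC avoids daylight-saving surprises).
req   <- as.POSIXct("2013-01-01 01:57", format = "%Y-%m-%d %H:%M", tz = "UTC")
start <- as.POSIXct("2013-01-01 04:22", format = "%Y-%m-%d %H:%M", tz = "UTC")

# Plain subtraction: R picks the units based on the size of the gap.
auto_diff <- start - req

# difftime(): the units are whatever you ask for.
mins_diff <- difftime(start, req, units = "mins")

print(units(auto_diff))
print(as.numeric(mins_diff))
```

Fixing the units is the safer habit when you plan to average or compare many differences, since automatically chosen units can vary from one calculation to the next.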

You can also embed plots, for example:

UC_DOW <- weekdays(req_DATETIME[event$EVENT_NAME == "Urine Collect"])
boxplot(as.numeric(UC) ~ factor(UC_DOW, c("Monday", "Tuesday", "Wednesday", 
    "Thursday", "Friday", "Saturday", "Sunday"), labels = c("Monday", "Tuesday", 
    "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")), ylim = c(-60, 
    360))

[Figure: boxplot of the urine-collect time difference (minutes) by day of week, Monday through Sunday]

This part was added 3/10/2014.

If you use the difftime function, the result is a difftime object, not a numeric variable. Some functions do not recognize these types of objects, so you might need to convert it to get a valid result. Example:

hist(log(as.numeric(UC)), xlab = "log of time difference", main = "Histogram of Logged Time Difference converted to numeric")
## Warning: NaNs produced

[Figure: Histogram of Logged Time Difference converted to numeric]
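
The warning above appears because log() returns NaN for negative inputs, which suggests some start times were recorded before their request times. One possible way to handle this, sketched here with made-up numbers, is to keep only the positive differences before taking logs; whether dropping those rows is appropriate depends on your analysis.

```r
# Made-up differences in minutes, including a missing, a zero, and a negative value.
d <- as.difftime(c(145, 4, 6, NA, 0, -3), units = "mins")

# Convert to plain numbers, then keep strictly positive, non-missing values.
x <- as.numeric(d, units = "mins")
positive <- x[!is.na(x) & x > 0]

# log() is now safe: no NaN and no -Inf.
logged <- log(positive)
print(logged)
```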

So you know what time someone arrived; how do you match it back to the correct census count in the emergency room? One quick and dirty way (there are others) is to trick R by using formats. First I'll add a few variables to the event dataframe:

event$req_DATETIME <- req_DATETIME
event$start_DATETIME <- start_DATETIME
event$difftime <- difftime(event$start_DATETIME, event$req_DATETIME, units = "mins")

Then I'll keep only the variables I'd like for this analysis. I'll also prepare it for merging with the CENSUS dataframe by creating an Arrival variable that is consistent in format with the DATETIME variable in the CENSUS dataframe.

a <- event[, c(2, 4, 10, 11, 16, 20, 22, 23, 25, 26, 28, 29, 34, 35, 36)]
a$Arrival <- as.POSIXlt(paste(a$ARR_DATE, a$ARR_TIME, sep = " "), format = "%m/%d/%y %H:%M")

The trick I use is to truncate the time variable via the format function, which “rounds” the arrival time in the event dataframe (recorded to the minute) down to the hour it fell into. This is consistent with the hourly accounting in the CENSUS dataframe. We can examine the results (I'll just use urine collection as an example):

a$Arr <- format(a$Arrival, format = "%m/%d/%y %H")

junk <- as.POSIXlt(as.character(CENSUS$DATETIME), format = "%Y-%m-%d %H:%M")

CENSUS$Arr <- format(junk, format = "%m/%d/%y %H")

b <- merge(a, CENSUS, by = "Arr")

head(na.omit(b[b$EVENT_NAME == "Urine Collect", ]))
##             Arr    FIN_NBR      DOB ARR_DATE ARR_TIME   ACUITY
## 13  01/01/13 00 1005912520  12/1/68   1/1/13     0:16 3-Urgent
## 34  01/01/13 00 1005917461  11/6/92   1/1/13     0:55 3-Urgent
## 37  01/01/13 00 1005917461  11/6/92   1/1/13     0:55 3-Urgent
## 82  01/01/13 02 1005917503   9/7/93   1/1/13     2:33 3-Urgent
## 114 01/01/13 04 1005917529  4/11/37   1/1/13     4:11 3-Urgent
## 131 01/01/13 04 1005917545 12/11/58   1/1/13     4:16 3-Urgent
##        EVENT_NAME REQ_DATE REQ_TIME START_DATE START_TIME COMPLETE_DATE
## 13  Urine Collect   1/1/13     1:57     1/1/13       4:22        1/1/13
## 34  Urine Collect   1/1/13     1:00     1/1/13       1:04        1/1/13
## 37  Urine Collect   1/1/13     1:00     1/1/13       1:06        1/1/13
## 82  Urine Collect   1/1/13     2:36     1/1/13       3:45        1/1/13
## 114 Urine Collect   1/1/13     4:24     1/1/13       4:56        1/1/13
## 131 Urine Collect   1/1/13     4:41     1/1/13       6:08        1/1/13
##     COMPLETE_TIME        req_DATETIME      start_DATETIME difftime
## 13           4:22 2013-01-01 01:57:00 2013-01-01 04:22:00 145 mins
## 34           1:04 2013-01-01 01:00:00 2013-01-01 01:04:00   4 mins
## 37           1:06 2013-01-01 01:00:00 2013-01-01 01:06:00   6 mins
## 82           3:45 2013-01-01 02:36:00 2013-01-01 03:45:00  69 mins
## 114          4:56 2013-01-01 04:24:00 2013-01-01 04:56:00  32 mins
## 131          6:08 2013-01-01 04:41:00 2013-01-01 06:08:00  87 mins
##                 Arrival            DATETIME COUNT     DOW HOUR DAY
## 13  2013-01-01 00:16:00 2013-01-01 00:00:00     3 Tuesday    0   1
## 34  2013-01-01 00:55:00 2013-01-01 00:00:00     3 Tuesday    0   1
## 37  2013-01-01 00:55:00 2013-01-01 00:00:00     3 Tuesday    0   1
## 82  2013-01-01 02:33:00 2013-01-01 02:00:00     5 Tuesday    2   1
## 114 2013-01-01 04:11:00 2013-01-01 04:00:00     5 Tuesday    4   1
## 131 2013-01-01 04:16:00 2013-01-01 04:00:00     5 Tuesday    4   1
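
The same hour-key trick can be tried on a tiny made-up example before running it on the real files. The data frames and column names below (events, census, key) are hypothetical stand-ins, not the actual event and CENSUS objects:

```r
fmt_in <- "%Y-%m-%d %H:%M"

# Hypothetical arrivals, recorded to the minute.
events <- data.frame(id = 1:3)
events$arrival <- as.POSIXct(c("2013-01-01 00:16", "2013-01-01 00:55",
                               "2013-01-01 02:33"), format = fmt_in, tz = "UTC")

# Hypothetical hourly census counts.
census <- data.frame(count = c(3, 4, 5))
census$hour_start <- as.POSIXct(c("2013-01-01 00:00", "2013-01-01 01:00",
                                  "2013-01-01 02:00"), format = fmt_in, tz = "UTC")

# Dropping the minutes from the format string truncates each time to its hour.
events$key <- format(events$arrival, "%m/%d/%y %H")
census$key <- format(census$hour_start, "%m/%d/%y %H")

# Each event picks up the count for the hour it arrived in.
merged <- merge(events, census, by = "key")
print(merged[, c("id", "key", "count")])
```

One thing to watch for: a key that appears twice on the census side (such as the repeated 01:00 hour on November 3rd) will match each arrival in that hour to both rows, so it may be worth checking the row count after merging.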

OK, so now we know which time interval each patient fell into, and we can compute the correlation between the count of patients in the ER at the time of arrival and the time it took to complete this test. Your questions will have variants of this code example.
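
A minimal sketch of that correlation, using only the six rows printed above rather than the full merged data (so the number itself is not meaningful). The name b_toy is a hypothetical stand-in for the merged dataframe; cor() expects plain numeric input, so the difftime column is converted first:

```r
# The six urine-collect rows shown above: census count and minutes from request to start.
b_toy <- data.frame(COUNT = c(3, 3, 3, 5, 5, 5))
b_toy$difftime <- as.difftime(c(145, 4, 6, 69, 32, 87), units = "mins")

# Convert the difftime column to numeric minutes before correlating.
wait_mins <- as.numeric(b_toy$difftime, units = "mins")

# use = "complete.obs" drops any pair with a missing value.
r <- cor(wait_mins, b_toy$COUNT, use = "complete.obs")
print(r)
```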

Test Your Knowledge

How would you compute a patient's age in years using their DOB (Date of Birth) and the date when they arrived at the hospital (ARR_DATE)?