Getting the date of each Sunday of a year based on input (year)

I want to write a function that takes a year as input and returns the dates of all Sundays in that year.

library(lubridate)
yearSunday <- function(year) {
  ymd() + weeks(0:51)
}

I need help completing it so that it also works correctly for leap years.

Answers


A faster approach would be to

  • generate a sequence of the first seven days of the year
  • determine which of those days is a Sunday
  • generate a sequence from that first Sunday to the end of the year, incrementing by 7.

Below, I've compared Ronak's yearSunday to an alternate yearSunday2. yearSunday2 runs about 4 times as fast.

yearSunday <- function(year) {
 dates <- seq(as.Date(paste(year, "-01-01", sep = "")), as.Date(paste(year, "-12-31", sep = "")), by="+1 day")
 days <- weekdays(dates) == "Sunday"
 dates[days]
}

yearSunday2 <- function(year){
  start <- seq(as.Date(paste0(year, "-01-01")), 
               as.Date(paste0(year, "-01-07")),
               by = 1)
  seq(start[which(weekdays(start) == "Sunday")],
      as.Date(paste0(year, "-12-31")), 
      by = 7)
}


library(microbenchmark)
microbenchmark(
  yearSunday(2015),
  yearSunday2(2015)
)

Unit: microseconds
              expr      min       lq      mean    median       uq      max neval cld
  yearSunday(2015) 1499.074 1513.149 1543.4064 1527.8120 1543.207 2616.634   100   b
 yearSunday2(2015)  357.173  372.569  386.5511  383.7125  398.375  444.854   100  a 
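
The first two steps above can also be collapsed into a single piece of arithmetic. The following is only a sketch, and the name year_sundays_wday is hypothetical. It relies on as.POSIXlt()$wday, which numbers weekdays 0 (Sunday) to 6 (Saturday), so unlike weekdays() it does not depend on the session locale.

```r
# Hypothetical helper (not from the answers above): find the first Sunday
# by weekday arithmetic, then step through the year one week at a time.
year_sundays_wday <- function(year) {
  first_day <- as.Date(paste0(year, "-01-01"))
  last_day  <- as.Date(paste0(year, "-12-31"))
  # wday is 0 for Sunday, 1 for Monday, ..., 6 for Saturday
  wday <- as.POSIXlt(first_day)$wday
  # Days to add to reach the first Sunday (0 if January 1 is a Sunday)
  first_sunday <- first_day + (7 - wday) %% 7
  seq(first_sunday, last_day, by = "week")
}

head(year_sundays_wday(2015), 3)
# [1] "2015-01-04" "2015-01-11" "2015-01-18"
```

January 1, 2015 was a Thursday (wday 4), so the offset is (7 - 4) %% 7 = 3 days, which gives January 4.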

Try this:

yearSunday <- function(year) {
 dates <- seq(as.Date(paste(year, "-01-01", sep = "")), as.Date(paste(year, "-12-31", sep = "")), by="+1 day")
 days <- weekdays(dates) == "Sunday"
 dates[days]
}

yearSunday(2015)
#[1] "2015-01-04" "2015-01-11" "2015-01-18" "2015-01-25" "2015-02-01" "2015-02-08" "2015-02-15"
#[8] "2015-02-22" "2015-03-01" "2015-03-08" "2015-03-15" "2015-03-22" "2015-03-29" "2015-04-05"
#[15] "2015-04-12" "2015-04-19" "2015-04-26" "2015-05-03" "2015-05-10" "2015-05-17" "2015-05-24"
#[22] "2015-05-31" "2015-06-07" "2015-06-14" "2015-06-21" "2015-06-28" "2015-07-05" "2015-07-12"
#[29] "2015-07-19" "2015-07-26" "2015-08-02" "2015-08-09" "2015-08-16" "2015-08-23" "2015-08-30" 
#[36] "2015-09-06" "2015-09-13" "2015-09-20" "2015-09-27" "2015-10-04" "2015-10-11" "2015-10-18"
#[43] "2015-10-25" "2015-11-01" "2015-11-08" "2015-11-15" "2015-11-22" "2015-11-29" "2015-12-06"
#[50] "2015-12-13" "2015-12-20" "2015-12-27"

This automatically takes care of leap years, because the daily sequence includes February 29 whenever it exists:

yearSunday(2016)
#[1] "2016-01-03" "2016-01-10" "2016-01-17" "2016-01-24" "2016-01-31" "2016-02-07" "2016-02-14"
#[8] "2016-02-21" "2016-02-28" "2016-03-06" "2016-03-13" "2016-03-20" "2016-03-27" "2016-04-03"
#[15] "2016-04-10" "2016-04-17" "2016-04-24" "2016-05-01" "2016-05-08" "2016-05-15" "2016-05-22"
#[22] "2016-05-29" "2016-06-05" "2016-06-12" "2016-06-19" "2016-06-26" "2016-07-03" "2016-07-10"
#[29] "2016-07-17" "2016-07-24" "2016-07-31" "2016-08-07" "2016-08-14" "2016-08-21" "2016-08-28"
#[36] "2016-09-04" "2016-09-11" "2016-09-18" "2016-09-25" "2016-10-02" "2016-10-09" "2016-10-16"
#[43] "2016-10-23" "2016-10-30" "2016-11-06" "2016-11-13" "2016-11-20" "2016-11-27" "2016-12-04"
#[50] "2016-12-11" "2016-12-18" "2016-12-25"
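
A related detail is that the number of Sundays is not always 52, so a fixed offset such as weeks(0:51) from the question can miss the last one. As a rough check, the sketch below counts the Sundays in a few years; count_sundays is a hypothetical helper, and it uses format(dates, "%w") (where "0" means Sunday) so it does not depend on the locale.

```r
# Hypothetical helper: count the Sundays in a year by scanning every day.
count_sundays <- function(year) {
  dates <- seq(as.Date(paste0(year, "-01-01")),
               as.Date(paste0(year, "-12-31")),
               by = "day")
  # "%w" formats the weekday as "0" (Sunday) through "6" (Saturday)
  sum(format(dates, "%w") == "0")
}

sapply(c(2012, 2015, 2016, 2017), count_sundays)
# [1] 53 52 52 53
```

Both 2012 (a leap year) and 2017 (not a leap year) start on a Sunday, so each contains 53 Sundays.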

EDIT

If we need a faster approach, we can find the first Sunday of the year and then step forward in 7-day increments:

yearSunday <- function(year) {
  dates <- as.Date(paste(year, "-01-01", sep = "")) + 0:6
  days <- weekdays(dates) == "Sunday"
  seq(dates[days], as.Date(paste(year, "-12-31", sep = "")), by = "+7 day")
}

It is slightly faster than Benjamin's answer:

library(microbenchmark)
microbenchmark(
  yearSunday(2015),
  yearSunday2(2015)
)

# Unit: microseconds
#               expr     min      lq     mean  median      uq      max neval
#   yearSunday(2015) 270.277 279.343 308.9434 294.054 312.529  623.005   100
#  yearSunday2(2015) 357.176 379.927 416.4793 397.717 411.402 1295.960   100
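
The benchmarks only compare speed, so it may be worth confirming that the daily scan and the weekly step return the same dates. The sketch below does that over an arbitrarily chosen range of years. The names sundays_daily and sundays_weekly are hypothetical, and both use as.POSIXlt()$wday instead of weekdays() so the check does not depend on an English locale.

```r
# Hypothetical re-implementations of the two strategies discussed here.

# Strategy 1: generate every day of the year, keep the Sundays.
sundays_daily <- function(year) {
  dates <- seq(as.Date(paste0(year, "-01-01")),
               as.Date(paste0(year, "-12-31")),
               by = "day")
  dates[as.POSIXlt(dates)$wday == 0]
}

# Strategy 2: find the first Sunday in the first 7 days, then step by week.
sundays_weekly <- function(year) {
  first_week <- as.Date(paste0(year, "-01-01")) + 0:6
  first_sunday <- first_week[as.POSIXlt(first_week)$wday == 0]
  seq(first_sunday, as.Date(paste0(year, "-12-31")), by = "week")
}

# TRUE for a year when both strategies return the same dates
same_result <- function(year) {
  a <- sundays_daily(year)
  b <- sundays_weekly(year)
  length(a) == length(b) && all(a == b)
}

all(vapply(1990:2030, same_result, logical(1)))
```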
