Daily Average to monthly total in R

Date        DE      VE
12/1/2016   93.387  0.095
11/1/2016   77.968  0.095
10/1/2016   65.184  0.095
9/1/2016    63.984  0.095
8/1/2016    67.657  0.095
%m/%d/%Y

DE and VE are daily averages. How to convert from daily average to monthly total in R based on the actual days in that month? Total for 12/2016 =93.387*31. Need to calculate the monthly total for all 10*12 months from 2006-01 to 2016-12.

Answers


To find the number of days in a month you can use the days_in_month function in the lubridate package.

The argument takes a datetime object so you have to convert your Date column to a known date/datetime-based class (i.e. "POSIXct, POSIXlt, Date, chron, yearmon, yearqtr, zoo, zooreg, timeDate, xts, its, ti, jul, timeSeries, and fts objects").

Then you can just mutate your df with the multiplicated daily averages.

library(lubridate)
library(dplyr)

myDf <- read.table(text = "Date        DE      VE
12/1/2016   93.387  0.095
11/1/2016   77.968  0.095
10/1/2016   65.184  0.095
9/1/2016    63.984  0.095
8/1/2016    67.657  0.095", header = TRUE)

mutate(myDf, Date = as.Date(Date, format = "%m/%d/%Y"), 
       monthlyTotalDE = DE * days_in_month(Date),
       monthlyTotalVE = VE * days_in_month(Date))

#             Date     DE    VE monthlyTotalDE monthlyTotalVE
#     1 2016-12-01 93.387 0.095       2894.997          2.945
#     2 2016-11-01 77.968 0.095       2339.040          2.850
#     3 2016-10-01 65.184 0.095       2020.704          2.945
#     4 2016-09-01 63.984 0.095       1919.520          2.850
#     5 2016-08-01 67.657 0.095       2097.367          2.945

EDIT

In mutate if you use a new column name, it will append this column to the data frame. If you want to avoid to add columns, you have to keep the columns names that already exist, it will overwrite these columns e.g.

mutate(myDf, Date = as.Date(Date, format = "%m/%d/%Y"), 
       DE = DE * days_in_month(Date), 
       VE = VE * days_in_month(Date))

#         Date       DE    VE
# 1 2016-12-01 2894.997 2.945
# 2 2016-11-01 2339.040 2.850
# 3 2016-10-01 2020.704 2.945
# 4 2016-09-01 1919.520 2.850
# 5 2016-08-01 2097.367 2.945

If you have a lot of columns to compute, I suggest you to use mutate_each, it's very powerfull and will save you the pain to do it manualy with mutate or the loss of performance by doing a traditional loop.

Use vars to include/exclude variables in mutate.

You can exclude variables manualy using the variable name prececed by a minus : vars = -Date or use a vector to exclude several variables vars = c(Date, DE).

Or you can also use special specification functions as in dplyr::select, see ?dplyr::select for more informations.

Warning : If you use vars to include variables, don't explicit the named argument vars = in your function if you want to keep the column names.

one_of(c("DE", "VE")), DE:VE... To drop variables, use - before the function : -contains("Date")

myDf %>%
mutate(Date = as.Date(Date, format = "%m/%d/%Y")) %>%
mutate_each(funs(. * days_in_month(Date)),
            vars = -Date)

#         Date       DE    VE
# 1 2016-12-01 2894.997 2.945
# 2 2016-11-01 2339.040 2.850
# 3 2016-10-01 2020.704 2.945
# 4 2016-09-01 1919.520 2.850
# 5 2016-08-01 2097.367 2.945

Need Your Help

Place labels on Pie Chart

r ggplot2

I would like to place my labels on the piechart, not in the legend. In other words, I would like the name of the party to appear centered on the graph rather than to the right. How would I do this?

PHP with lengthy loop creating erratic server error

php loops sleep

I have a PHP script that pulls down a bunch of RSS feeds. To prevent overloading the publishers' servers, I use the PHP Sleep function to slow things down.