Examples and Recipes

library(clock)
library(magrittr)

This vignette shows common examples and recipes that might be useful when learning about clock. Where possible, both the high- and low-level APIs are shown.

Many of these examples are adapted from the date C++ library’s Examples and Recipes page.

The current local time

zoned_time_now() returns the current time in a particular time zone. It will display up to nanosecond precision, but the exact precision is OS dependent (on a Mac, this displays microsecond-level information at nanosecond resolution).

Using "" as the time zone string will try and use whatever R thinks your local time zone is (i.e. from Sys.timezone()).

zoned_time_now("")
#> <zoned_time<nanosecond><America/New_York (current)>[1]>
#> [1] "2021-02-10T15:54:29.875011000-05:00"

The current time somewhere else

Pass a time zone name to zoned_time_now() to get the current time somewhere else.

zoned_time_now("Asia/Shanghai")
#> <zoned_time<nanosecond><Asia/Shanghai>[1]>
#> [1] "2021-02-11T04:54:29.875011000+08:00"

Set a meeting across time zones

Say you need to set a meeting with someone in Shanghai, but you live in New York. If you set a meeting for 9am, what time is that for them?

my_time <- year_month_day(2019, 1, 30, 9) %>%
  as_naive_time() %>%
  as_zoned_time("America/New_York")

my_time
#> <zoned_time<second><America/New_York>[1]>
#> [1] "2019-01-30T09:00:00-05:00"

their_time <- zoned_time_set_zone(my_time, "Asia/Shanghai")

their_time
#> <zoned_time<second><Asia/Shanghai>[1]>
#> [1] "2019-01-30T22:00:00+08:00"

High level API

my_time <- as.POSIXct("2019-01-30 09:00:00", "America/New_York")

date_time_set_zone(my_time, "Asia/Shanghai")
#> [1] "2019-01-30 22:00:00 CST"

Force a specific time zone

Say your co-worker in Shanghai (from the last example) accidentally logged on at 9am their time. What time would this be for you?

The first step to solve this is to force my_time to have the same printed time, but use the Asia/Shanghai time zone. You can do this by going through naive-time:

my_time <- year_month_day(2019, 1, 30, 9) %>%
  as_naive_time() %>%
  as_zoned_time("America/New_York")

my_time
#> <zoned_time<second><America/New_York>[1]>
#> [1] "2019-01-30T09:00:00-05:00"

# Drop the time zone information, retaining the printed time
my_time %>%
  as_naive_time()
#> <naive_time<second>[1]>
#> [1] "2019-01-30T09:00:00"

# Add the correct time zone name back on,
# again retaining the printed time
their_9am <- my_time %>%
  as_naive_time() %>%
  as_zoned_time("Asia/Shanghai")

their_9am
#> <zoned_time<second><Asia/Shanghai>[1]>
#> [1] "2019-01-30T09:00:00+08:00"

Note that a conversion like this isn’t always possible due to daylight saving time issues, in which case you might need to set the nonexistent and ambiguous arguments of as_zoned_time().
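
For example, 2:30am didn’t exist on 2021-03-14 in New York, because clocks jumped from 2:00 straight to 3:00. Here is a minimal sketch of handling that with the nonexistent argument (the date is just an illustrative choice):

gap <- year_month_day(2021, 3, 14, 2, 30, 00) %>%
  as_naive_time()

# The default, nonexistent = "error", would fail here.
# Rolling forward resolves to the first valid time after the gap (3:00am).
as_zoned_time(gap, "America/New_York", nonexistent = "roll-forward")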

What time would this have been for you in New York?

zoned_time_set_zone(their_9am, "America/New_York")
#> <zoned_time<second><America/New_York>[1]>
#> [1] "2019-01-29T20:00:00-05:00"

High level API

my_time <- as.POSIXct("2019-01-30 09:00:00", "America/New_York")

my_time %>%
  as_naive_time() %>%
  as.POSIXct("Asia/Shanghai") %>%
  date_time_set_zone("America/New_York")
#> [1] "2019-01-29 20:00:00 EST"

Finding the next Monday (or Thursday)

Given a particular day precision naive-time, how can you compute the next Monday? This is very easily accomplished with time_point_shift(). It takes a time point vector and a “target” weekday, and shifts the time points to that target weekday.

days <- as_naive_time(year_month_day(2019, c(1, 2), 1))

# A Tuesday and a Friday
as_weekday(days)
#> <weekday[2]>
#> [1] Tue Fri

monday <- weekday(clock_weekdays$monday)

time_point_shift(days, monday)
#> <naive_time<day>[2]>
#> [1] "2019-01-07" "2019-02-04"

as_weekday(time_point_shift(days, monday))
#> <weekday[2]>
#> [1] Mon Mon

You can also shift to the previous instance of the target weekday:

time_point_shift(days, monday, which = "previous")
#> <naive_time<day>[2]>
#> [1] "2018-12-31" "2019-01-28"

If you happen to already be on the target weekday, the default behavior returns the input unchanged. However, you can also choose to advance to the next instance of the target.

tuesday <- weekday(clock_weekdays$tuesday)

time_point_shift(days, tuesday)
#> <naive_time<day>[2]>
#> [1] "2019-01-01" "2019-02-05"
time_point_shift(days, tuesday, boundary = "advance")
#> <naive_time<day>[2]>
#> [1] "2019-01-08" "2019-02-05"

While time_point_shift() is built into clock, it can be useful to discuss the arithmetic going on in the underlying weekday type that powers this function. To do so, we will build some parts of time_point_shift() from scratch.

The weekday type represents a single day of the week and implements circular arithmetic. Let’s see the code for a simple version of time_point_shift() that just shifts to the next target weekday:

next_weekday <- function(x, target) {
  x + (target - as_weekday(x))
}

next_weekday(days, monday)
#> <naive_time<day>[2]>
#> [1] "2019-01-07" "2019-02-04"

as_weekday(next_weekday(days, monday))
#> <weekday[2]>
#> [1] Mon Mon

Let’s break down how next_weekday() works. The first step takes the difference between two weekday vectors. It does this using circular arithmetic. Once we get past the 7th day of the week (whatever that may be), it wraps back around to the 1st day of the week. Implementing weekday arithmetic in this way means that the following nicely returns the number of days until the next Monday as a day-based duration:

monday - as_weekday(days)
#> <duration<day>[2]>
#> [1] 6 3

This difference can be added to our day precision days vector to get the date of the next Monday:

days + (monday - as_weekday(days))
#> <naive_time<day>[2]>
#> [1] "2019-01-07" "2019-02-04"

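The circular behavior is easiest to see with two bare weekdays. As a minimal sketch, going from Saturday forward to Monday wraps around the end of the week:

saturday <- weekday(clock_weekdays$saturday)

# 2 days forward, not -5: the subtraction wraps around the week boundary
monday - saturday
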
The current implementation will return the input if it is already on the target weekday. To use the boundary = "advance" behavior, you could implement next_weekday() as:

next_weekday2 <- function(x, target) {
  x <- x + duration_days(1L)
  x + (target - as_weekday(x))
}

a_monday <- as_naive_time(year_month_day(2018, 12, 31))
as_weekday(a_monday)
#> <weekday[1]>
#> [1] Mon

next_weekday2(a_monday, monday)
#> <naive_time<day>[1]>
#> [1] "2019-01-07"

High level API

In the high level API, you can use date_shift():

monday <- weekday(clock_weekdays$monday)

x <- as.Date(c("2019-01-01", "2019-02-01"))

date_shift(x, monday)
#> [1] "2019-01-07" "2019-02-04"

# With a date-time
y <- as.POSIXct(
  c("2019-01-01 02:30:30", "2019-02-01 05:20:22"), 
  "America/New_York"
)

date_shift(y, monday)
#> [1] "2019-01-07 02:30:30 EST" "2019-02-04 05:20:22 EST"

Note that shifting a POSIXct to another weekday could generate nonexistent or ambiguous times due to daylight saving time, which would have to be handled by supplying the nonexistent and ambiguous arguments to date_shift().
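
As a sketch, those arguments can be passed straight through date_shift() (the particular strategies shown are just illustrative choices):

date_shift(
  y,
  monday,
  nonexistent = "roll-forward",
  ambiguous = "earliest"
)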

Generate sequences of dates and date-times

clock implements S3 methods for the seq() generic function for the calendar and time point types it provides. The precision that you can generate sequences for depends on the type.

  • year-month-day: Yearly or monthly sequences
  • year-quarter-day: Yearly or quarterly sequences
  • sys-time / naive-time: Weekly, daily, hourly, …, subsecond sequences

When generating sequences, the type and precision of from determine the result. For example:

ym <- seq(year_month_day(2019, 1), by = 2, length.out = 10)
ym
#> <year_month_day<month>[10]>
#>  [1] "2019-01" "2019-03" "2019-05" "2019-07" "2019-09" "2019-11" "2020-01"
#>  [8] "2020-03" "2020-05" "2020-07"
yq <- seq(year_quarter_day(2019, 1), by = 2, length.out = 10)

This allows you to generate sequences of year-months or year-quarters without having to worry about the day of the month/quarter becoming invalid. You can then set the day of the results to get a day precision calendar. For example, to get the last days of the month/quarter for this sequence:

set_day(ym, "last")
#> <year_month_day<day>[10]>
#>  [1] "2019-01-31" "2019-03-31" "2019-05-31" "2019-07-31" "2019-09-30"
#>  [6] "2019-11-30" "2020-01-31" "2020-03-31" "2020-05-31" "2020-07-31"

set_day(yq, "last")
#> <year_quarter_day<January><day>[10]>
#>  [1] "2019-Q1-90" "2019-Q3-92" "2020-Q1-91" "2020-Q3-92" "2021-Q1-90"
#>  [6] "2021-Q3-92" "2022-Q1-90" "2022-Q3-92" "2023-Q1-90" "2023-Q3-92"

You won’t be able to generate day precision sequences with calendars. Instead, you should use a time point.

from <- as_naive_time(year_month_day(2019, 1, 1))
to <- as_naive_time(year_month_day(2019, 5, 15))

seq(from, to, by = 20)
#> <naive_time<day>[7]>
#> [1] "2019-01-01" "2019-01-21" "2019-02-10" "2019-03-02" "2019-03-22"
#> [6] "2019-04-11" "2019-05-01"

If you use an integer by value, it is interpreted as a duration at the same precision as from. You can also use a duration object that can be cast to the same precision as from. For example, to generate a sequence spaced out by 90 minutes for these second precision end points:

from <- as_naive_time(year_month_day(2019, 1, 1, 2, 30, 00))
to <- as_naive_time(year_month_day(2019, 1, 1, 12, 30, 00))

seq(from, to, by = duration_minutes(90))
#> <naive_time<second>[7]>
#> [1] "2019-01-01T02:30:00" "2019-01-01T04:00:00" "2019-01-01T05:30:00"
#> [4] "2019-01-01T07:00:00" "2019-01-01T08:30:00" "2019-01-01T10:00:00"
#> [7] "2019-01-01T11:30:00"

High level API

In the high level API, you can use date_seq() to generate sequences. This doesn’t have all of the flexibility of the seq() methods above, but is still extremely useful and has the added benefit of switching between calendars, sys-times, and naive-times automatically for you.

If an integer by is supplied with a date from, it defaults to a daily sequence:

date_seq(date_build(2019, 1), by = 2, total_size = 10)
#>  [1] "2019-01-01" "2019-01-03" "2019-01-05" "2019-01-07" "2019-01-09"
#>  [6] "2019-01-11" "2019-01-13" "2019-01-15" "2019-01-17" "2019-01-19"

You can generate a monthly sequence by supplying a month precision duration for by.

date_seq(date_build(2019, 1), by = duration_months(2), total_size = 10)
#>  [1] "2019-01-01" "2019-03-01" "2019-05-01" "2019-07-01" "2019-09-01"
#>  [6] "2019-11-01" "2020-01-01" "2020-03-01" "2020-05-01" "2020-07-01"

If you supply to, be aware that all components of to that are more precise than the precision of by must match from exactly. For example, the day components of from and to don’t match here, so the sequence isn’t defined.

date_seq(
  date_build(2019, 1, 1),
  to = date_build(2019, 10, 2),
  by = duration_months(2)
)
#> Error in `date_seq()`:
#> ! All components of `from` and `to` more precise than "month" must
#>   match.
#> ℹ `from` is "2019-01-01".
#> ℹ `to` is "2019-10-02".

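As a sketch, lining the day component of to up with from makes the request well defined:

date_seq(
  date_build(2019, 1, 1),
  to = date_build(2019, 10, 1),
  by = duration_months(2)
)
# the first of every other month, from January through September
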
date_seq() also catches invalid dates for you, forcing you to supply the invalid argument to specify how to handle them.

jan31 <- date_build(2019, 1, 31)
dec31 <- date_build(2019, 12, 31)

date_seq(jan31, to = dec31, by = duration_months(1))
#> Error in `invalid_resolve()`:
#> ! Invalid date found at location 2.
#> ℹ Resolve invalid date issues by specifying the `invalid` argument.

By specifying invalid = "previous" here, we can generate month end values.

date_seq(jan31, to = dec31, by = duration_months(1), invalid = "previous")
#>  [1] "2019-01-31" "2019-02-28" "2019-03-31" "2019-04-30" "2019-05-31"
#>  [6] "2019-06-30" "2019-07-31" "2019-08-31" "2019-09-30" "2019-10-31"
#> [11] "2019-11-30" "2019-12-31"

Compare this with the automatic “overflow” behavior of seq(), which is often a source of confusion.

seq(jan31, to = dec31, by = "1 month")
#>  [1] "2019-01-31" "2019-03-03" "2019-03-31" "2019-05-01" "2019-05-31"
#>  [6] "2019-07-01" "2019-07-31" "2019-08-31" "2019-10-01" "2019-10-31"
#> [11] "2019-12-01" "2019-12-31"

Grouping by months or quarters

When working on a data analysis, you might be required to summarize certain metrics at a monthly or quarterly level. With calendar_group(), you can easily summarize at the granularity that you care about. Take this vector of day precision naive-times in 2019:

from <- as_naive_time(year_month_day(2019, 1, 1))
to <- as_naive_time(year_month_day(2019, 12, 31))

x <- seq(from, to, by = duration_days(20))

x
#> <naive_time<day>[19]>
#>  [1] "2019-01-01" "2019-01-21" "2019-02-10" "2019-03-02" "2019-03-22"
#>  [6] "2019-04-11" "2019-05-01" "2019-05-21" "2019-06-10" "2019-06-30"
#> [11] "2019-07-20" "2019-08-09" "2019-08-29" "2019-09-18" "2019-10-08"
#> [16] "2019-10-28" "2019-11-17" "2019-12-07" "2019-12-27"

To group by month, first convert to a year-month-day:

ymd <- as_year_month_day(x)

head(ymd)
#> <year_month_day<day>[6]>
#> [1] "2019-01-01" "2019-01-21" "2019-02-10" "2019-03-02" "2019-03-22"
#> [6] "2019-04-11"

calendar_group(ymd, "month")
#> <year_month_day<month>[19]>
#>  [1] "2019-01" "2019-01" "2019-02" "2019-03" "2019-03" "2019-04" "2019-05"
#>  [8] "2019-05" "2019-06" "2019-06" "2019-07" "2019-08" "2019-08" "2019-09"
#> [15] "2019-10" "2019-10" "2019-11" "2019-12" "2019-12"

To group by quarter, convert to a year-quarter-day:

yqd <- as_year_quarter_day(x)

head(yqd)
#> <year_quarter_day<January><day>[6]>
#> [1] "2019-Q1-01" "2019-Q1-21" "2019-Q1-41" "2019-Q1-61" "2019-Q1-81"
#> [6] "2019-Q2-11"

calendar_group(yqd, "quarter")
#> <year_quarter_day<January><quarter>[19]>
#>  [1] "2019-Q1" "2019-Q1" "2019-Q1" "2019-Q1" "2019-Q1" "2019-Q2" "2019-Q2"
#>  [8] "2019-Q2" "2019-Q2" "2019-Q2" "2019-Q3" "2019-Q3" "2019-Q3" "2019-Q3"
#> [15] "2019-Q4" "2019-Q4" "2019-Q4" "2019-Q4" "2019-Q4"

If you need to group by a multiple of months / quarters, you can do that too:

calendar_group(ymd, "month", n = 2)
#> <year_month_day<month>[19]>
#>  [1] "2019-01" "2019-01" "2019-01" "2019-03" "2019-03" "2019-03" "2019-05"
#>  [8] "2019-05" "2019-05" "2019-05" "2019-07" "2019-07" "2019-07" "2019-09"
#> [15] "2019-09" "2019-09" "2019-11" "2019-11" "2019-11"

calendar_group(yqd, "quarter", n = 2)
#> <year_quarter_day<January><quarter>[19]>
#>  [1] "2019-Q1" "2019-Q1" "2019-Q1" "2019-Q1" "2019-Q1" "2019-Q1" "2019-Q1"
#>  [8] "2019-Q1" "2019-Q1" "2019-Q1" "2019-Q3" "2019-Q3" "2019-Q3" "2019-Q3"
#> [15] "2019-Q3" "2019-Q3" "2019-Q3" "2019-Q3" "2019-Q3"

Note that the returned calendar vector is at the precision we grouped by, not at the original precision with, say, the day of the month / quarter set to 1.

Additionally, be aware that calendar_group() groups “within” the component that is one unit of precision larger than the precision you specify. So, when grouping by "day", this groups by “day of the month”, which can’t cross the month or year boundary. If you need to bundle dates together by something like 60 days (i.e. crossing the month boundary), then you should use time_point_floor().
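
To make that “within” behavior concrete, here is a sketch of grouping the day of the month by 2: the last day of a 31-day month starts its own group rather than spilling into February.

calendar_group(year_month_day(2019, 1, 29:31), "day", n = 2)
# days 29 and 30 share a group; day 31 is grouped alone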

High level API

In the high level API, you can use date_group() to group Date vectors by one of their 3 components: year, month, or day. Since month precision dates can’t be represented with Date vectors, date_group() sets the day of the month to 1.

x <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = 20)

date_group(x, "month")
#>  [1] "2019-01-01" "2019-01-01" "2019-02-01" "2019-03-01" "2019-03-01"
#>  [6] "2019-04-01" "2019-05-01" "2019-05-01" "2019-06-01" "2019-06-01"
#> [11] "2019-07-01" "2019-08-01" "2019-08-01" "2019-09-01" "2019-10-01"
#> [16] "2019-10-01" "2019-11-01" "2019-12-01" "2019-12-01"

You won’t be able to group by "quarter", since this isn’t one of the 3 components that the high level API lets you work with. Instead, this is a case where you should convert to a year-quarter-day, group on that type, then convert back to Date.

x %>%
  as_year_quarter_day() %>%
  calendar_group("quarter") %>%
  set_day(1) %>%
  as.Date()
#>  [1] "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01"
#>  [6] "2019-04-01" "2019-04-01" "2019-04-01" "2019-04-01" "2019-04-01"
#> [11] "2019-07-01" "2019-07-01" "2019-07-01" "2019-07-01" "2019-10-01"
#> [16] "2019-10-01" "2019-10-01" "2019-10-01" "2019-10-01"

This is actually equivalent to date_group(x, "month", n = 3). If your fiscal year starts in January, you can use that instead. However, if your fiscal year starts in a different month, say, June, you’ll need to use the approach from above like so:

x %>%
  as_year_quarter_day(start = clock_months$june) %>%
  calendar_group("quarter") %>%
  set_day(1) %>%
  as.Date()
#>  [1] "2018-12-01" "2018-12-01" "2018-12-01" "2019-03-01" "2019-03-01"
#>  [6] "2019-03-01" "2019-03-01" "2019-03-01" "2019-06-01" "2019-06-01"
#> [11] "2019-06-01" "2019-06-01" "2019-06-01" "2019-09-01" "2019-09-01"
#> [16] "2019-09-01" "2019-09-01" "2019-12-01" "2019-12-01"

Flooring by days

While calendar_group() can group by “component”, it isn’t useful for bundling together sets of time points that can cross month/year boundaries, like “60 days” of data. For that, you are better off flooring by rolling sets of 60 days.

from <- as_naive_time(year_month_day(2019, 1, 1))
to <- as_naive_time(year_month_day(2019, 12, 31))

x <- seq(from, to, by = duration_days(20))
time_point_floor(x, "day", n = 60)
#> <naive_time<day>[19]>
#>  [1] "2018-12-15" "2018-12-15" "2018-12-15" "2019-02-13" "2019-02-13"
#>  [6] "2019-02-13" "2019-04-14" "2019-04-14" "2019-04-14" "2019-06-13"
#> [11] "2019-06-13" "2019-06-13" "2019-08-12" "2019-08-12" "2019-08-12"
#> [16] "2019-10-11" "2019-10-11" "2019-10-11" "2019-12-10"

Flooring operates on the underlying duration, which for day precision time points is a count of days since the origin, 1970-01-01.

unclass(x[1])
#> $lower
#> [1] 2147483648
#> 
#> $upper
#> [1] 17897
#> 
#> attr(,"clock")
#> [1] 1
#> attr(,"precision")
#> [1] 4

The 60 day counter starts at this origin, which means that any times in the interval [1970-01-01, 1970-03-02) are all floored to 1970-01-01. At 1970-03-02, the counter restarts.

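You can check that boundary directly (a quick sketch):

boundary <- as_naive_time(year_month_day(1970, c(3, 3), c(1, 2)))

# 1970-03-01 still floors to the origin; 1970-03-02 starts the next 60 day bucket
time_point_floor(boundary, "day", n = 60)
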
If you would like to change this origin, you can provide a time point to start counting from with the origin argument. This is mostly useful if you are flooring by weeks and you want to change the day of the week that the count starts on. Since 1970-01-01 is a Thursday, flooring by 14 days defaults to returning all Thursdays.

x <- seq(as_naive_time(year_month_day(2019, 1, 1)), by = 3, length.out = 10)
x
#> <naive_time<day>[10]>
#>  [1] "2019-01-01" "2019-01-04" "2019-01-07" "2019-01-10" "2019-01-13"
#>  [6] "2019-01-16" "2019-01-19" "2019-01-22" "2019-01-25" "2019-01-28"

thursdays <- time_point_floor(x, "day", n = 14)
thursdays
#> <naive_time<day>[10]>
#>  [1] "2018-12-27" "2018-12-27" "2018-12-27" "2019-01-10" "2019-01-10"
#>  [6] "2019-01-10" "2019-01-10" "2019-01-10" "2019-01-24" "2019-01-24"

as_weekday(thursdays)
#> <weekday[10]>
#>  [1] Thu Thu Thu Thu Thu Thu Thu Thu Thu Thu

You can use origin to change this to floor to Mondays.

origin <- as_naive_time(year_month_day(2018, 12, 31))
as_weekday(origin)
#> <weekday[1]>
#> [1] Mon

mondays <- time_point_floor(x, "day", n = 14, origin = origin)
mondays
#> <naive_time<day>[10]>
#>  [1] "2018-12-31" "2018-12-31" "2018-12-31" "2018-12-31" "2018-12-31"
#>  [6] "2019-01-14" "2019-01-14" "2019-01-14" "2019-01-14" "2019-01-28"

as_weekday(mondays)
#> <weekday[10]>
#>  [1] Mon Mon Mon Mon Mon Mon Mon Mon Mon Mon

High level API

You can use date_floor() with Date and POSIXct types.

x <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = 20)

date_floor(x, "day", n = 60)
#>  [1] "2018-12-15" "2018-12-15" "2018-12-15" "2019-02-13" "2019-02-13"
#>  [6] "2019-02-13" "2019-04-14" "2019-04-14" "2019-04-14" "2019-06-13"
#> [11] "2019-06-13" "2019-06-13" "2019-08-12" "2019-08-12" "2019-08-12"
#> [16] "2019-10-11" "2019-10-11" "2019-10-11" "2019-12-10"

The origin you provide should be another Date. For week precision flooring with Dates, you can specify "week" as the precision.

x <- seq(as.Date("2019-01-01"), by = 3, length.out = 10)

origin <- as.Date("2018-12-31")

date_floor(x, "week", n = 2, origin = origin)
#>  [1] "2018-12-31" "2018-12-31" "2018-12-31" "2018-12-31" "2018-12-31"
#>  [6] "2019-01-14" "2019-01-14" "2019-01-14" "2019-01-14" "2019-01-28"

Day of the year

To get the day of the year, convert to the year-day calendar type and extract the day with get_day().

x <- year_month_day(2019, clock_months$july, 4)

yd <- as_year_day(x)
yd
#> <year_day<day>[1]>
#> [1] "2019-185"

get_day(yd)
#> [1] 185

High level API

x <- as.Date("2019-07-04")

x %>%
  as_year_day() %>%
  get_day()
#> [1] 185

Computing an age in years

To get the age of an individual in years, use calendar_count_between().

x <- year_month_day(1980, 12, 14:16)
today <- year_month_day(2005, 12, 15)

# Note that the month and day of the month are taken into account!
# (Time of day would also be taken into account if there was any.)
calendar_count_between(x, today, "year")
#> [1] 25 25 24

High level API

You can use date_count_between() with Date and POSIXct types.

x <- date_build(1980, 12, 14:16)
today <- date_build(2005, 12, 15)

date_count_between(x, today, "year")
#> [1] 25 25 24

Computing number of weeks since the start of the year

lubridate::week() is a useful function that returns “the number of complete seven day periods that have occurred between the date and January 1st, plus one.”

There is no direct equivalent to this, but it is possible to replicate it with calendar_start() and time_point_count_between().

x <- year_month_day(2019, 11, 28)

# lubridate::week(as.Date(x))
# [1] 48

x_start <- calendar_start(x, "year")
x_start
#> <year_month_day<day>[1]>
#> [1] "2019-01-01"

time_point_count_between(
  as_naive_time(x_start),
  as_naive_time(x),
  "week"
) + 1L
#> [1] 48

You could also peek at the lubridate::week() implementation to see that this is just:

doy <- get_day(as_year_day(x))
doy
#> [1] 332

(doy - 1L) %/% 7L + 1L
#> [1] 48

High level API

This is actually a little easier in the high level API because you don’t have to think about switching between types.

x <- date_build(2019, 11, 28)

date_count_between(date_start(x, "year"), x, "week") + 1L
#> [1] 48

Compute the number of months between two dates

How can we compute the number of months between these two dates?

x <- year_month_day(2013, 10, 15)
y <- year_month_day(2016, 10, 13)

This is a bit of an ambiguous question because “month” isn’t very well-defined, and there are several interpretations we could take.

We might want to ignore the day component entirely, and just compute the number of months between 2013-10 and 2016-10.

calendar_narrow(y, "month") - calendar_narrow(x, "month")
#> <duration<month>[1]>
#> [1] 36

Or we could include the day of the month, and say that 2013-10-15 to 2014-10-15 defines 1 month (i.e. you have to hit the same day of the month in the next month).

calendar_count_between(x, y, "month")
#> [1] 35

With this you could also compute the number of days remaining between these two dates.

x_close <- add_months(x, calendar_count_between(x, y, "month"))
x_close
#> <year_month_day<day>[1]>
#> [1] "2016-09-15"

x_close_st <- as_sys_time(x_close)
y_st <- as_sys_time(y)

time_point_count_between(x_close_st, y_st, "day")
#> [1] 28

Or we could compute the time between these two dates in seconds, and divide that by the average number of seconds in 1 proleptic Gregorian month.

# Days between x and y
days <- as_sys_time(y) - as_sys_time(x)
days
#> <duration<day>[1]>
#> [1] 1094

# In units of seconds
days <- duration_cast(days, "second")
days <- as.numeric(days)
days
#> [1] 94521600

# Average number of seconds in 1 proleptic Gregorian month
avg_sec_in_month <- duration_cast(duration_months(1), "second")
avg_sec_in_month <- as.numeric(avg_sec_in_month)

days / avg_sec_in_month
#> [1] 35.94324

High level API

x <- date_build(2013, 10, 15)
y <- date_build(2016, 10, 13)

To ignore the day of the month, first shift to the start of the month, then you can use date_count_between().

date_count_between(date_start(x, "month"), date_start(y, "month"), "month")
#> [1] 36

To utilize the day field, do the same as above but without calling date_start().

date_count_between(x, y, "month")
#> [1] 35

There is no high level equivalent to the average length of one proleptic Gregorian month example.
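
If you really need that last calculation, one option is to drop down to sys-time yourself and reuse the low level approach (a sketch, using the x and y Dates from above):

secs <- duration_cast(as_sys_time(y) - as_sys_time(x), "second")
avg_month <- duration_cast(duration_months(1), "second")

as.numeric(secs) / as.numeric(avg_month)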

Computing the ISO year or week

The ISO 8601 standard outlines an alternative calendar that is specified by the year, the week of the year, and the day of the week. It also specifies that the start of the week is considered to be a Monday. This ends up meaning that the actual ISO year may be different from the Gregorian year, and is somewhat difficult to compute “by hand”. Instead, you can use the year_week_day() calendar if you need to work with ISO week dates.

x <- date_build(2019:2026)
y <- as_year_week_day(x, start = clock_weekdays$monday)

data.frame(x = x, y = y)
#>            x          y
#> 1 2019-01-01 2019-W01-2
#> 2 2020-01-01 2020-W01-3
#> 3 2021-01-01 2020-W53-5
#> 4 2022-01-01 2021-W52-6
#> 5 2023-01-01 2022-W52-7
#> 6 2024-01-01 2024-W01-1
#> 7 2025-01-01 2025-W01-3
#> 8 2026-01-01 2026-W01-4
get_year(y)
#> [1] 2019 2020 2020 2021 2022 2024 2025 2026
get_week(y)
#> [1]  1  1 53 52 52  1  1  1

# Last week in the ISO year
set_week(y, "last")
#> <year_week_day<Monday><day>[8]>
#> [1] "2019-W52-2" "2020-W53-3" "2020-W53-5" "2021-W52-6" "2022-W52-7"
#> [6] "2024-W52-1" "2025-W52-3" "2026-W53-4"

The year-week-day calendar is a fully supported calendar, meaning that all of the calendar_*() functions work on it:

calendar_narrow(y, "week")
#> <year_week_day<Monday><week>[8]>
#> [1] "2019-W01" "2020-W01" "2020-W53" "2021-W52" "2022-W52" "2024-W01" "2025-W01"
#> [8] "2026-W01"

There is also an iso_year_week_day() calendar available, which is identical to year_week_day(start = clock_weekdays$monday). That ISO calendar actually existed first, before it was generalized to support any start weekday.

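As a quick sketch of the equivalence, converting the same dates through both calendars gives the same field values; only the printed calendar type differs:

as_iso_year_week_day(x)
as_year_week_day(x, start = clock_weekdays$monday)
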
Computing the Epidemiological year or week

Epidemiologists following the US CDC guidelines use a calendar that is similar to the ISO calendar, but defines the start of the week to be Sunday instead of Monday. year_week_day() supports this as well:

x <- date_build(2019:2026)
iso <- as_year_week_day(x, start = clock_weekdays$monday)
epi <- as_year_week_day(x, start = clock_weekdays$sunday)

data.frame(x = x, iso = iso, epi = epi)
#>            x        iso        epi
#> 1 2019-01-01 2019-W01-2 2019-W01-3
#> 2 2020-01-01 2020-W01-3 2020-W01-4
#> 3 2021-01-01 2020-W53-5 2020-W53-6
#> 4 2022-01-01 2021-W52-6 2021-W52-7
#> 5 2023-01-01 2022-W52-7 2023-W01-1
#> 6 2024-01-01 2024-W01-1 2024-W01-2
#> 7 2025-01-01 2025-W01-3 2025-W01-4
#> 8 2026-01-01 2026-W01-4 2025-W53-5
get_year(epi)
#> [1] 2019 2020 2020 2021 2023 2024 2025 2025
get_week(epi)
#> [1]  1  1 53 52  1  1  1 53

Converting a time zone abbreviation into a time zone name

You might run into date-time strings of the form "2020-10-25 01:30:00 IST", which contain a time zone abbreviation rather than a full time zone name. Because time zone maintainers change the abbreviations they use over time, and because multiple time zones sometimes use the same abbreviation, it is generally impossible to parse strings of this form without more information. That said, if you know which time zone this abbreviation goes with, you can parse the time with zoned_time_parse_abbrev(), supplying the zone.

x <- "2020-10-25 01:30:00 IST"

zoned_time_parse_abbrev(x, "Asia/Kolkata")
#> <zoned_time<second><Asia/Kolkata>[1]>
#> [1] "2020-10-25T01:30:00+05:30"
zoned_time_parse_abbrev(x, "Asia/Jerusalem")
#> <zoned_time<second><Asia/Jerusalem>[1]>
#> [1] "2020-10-25T01:30:00+02:00"

If you don’t know what time zone this abbreviation goes with, then generally you are out of luck. However, there are low-level tools in this library that can help you generate a list of possible zoned-times this could map to.

Assuming that x is a naive-time with its corresponding time zone abbreviation attached, the first thing to do is to parse this string as a naive-time.

x <- naive_time_parse(x, format = "%Y-%m-%d %H:%M:%S IST")
x
#> <naive_time<second>[1]>
#> [1] "2020-10-25T01:30:00"

Next, we’ll develop a function that attempts to turn this naive-time into a zoned-time, iterating through all of the time zone names available in the time zone database. These time zone names are accessible through tzdb_names(). By using the low-level naive_time_info(), rather than as_zoned_time(), to look up zone-specific information, we’ll also get back information about the UTC offset and time zone abbreviation in use at that time. By matching this abbreviation against our input abbreviation, we can generate a list of zoned-times that use the abbreviation we care about at that particular instant in time.

naive_find_by_abbrev <- function(x, abbrev) {
  if (!is_naive_time(x)) {
    rlang::abort("`x` must be a naive-time.")
  }
  if (length(x) != 1L) {
    rlang::abort("`x` must be length 1.")
  }
  if (!rlang::is_string(abbrev)) {
    rlang::abort("`abbrev` must be a single string.")
  }
  
  # Look up the zone specific information for `x` in every zone at once,
  # keeping track of which zone produced each row
  zones <- tzdb_names()
  info <- naive_time_info(x, zones)
  info$zones <- zones
  
  c(
    compute_uniques(x, info, abbrev),
    compute_ambiguous(x, info, abbrev)
  )
}

compute_uniques <- function(x, info, abbrev) {
  info <- info[info$type == "unique",]
  
  # If the abbreviation of the unique time matches the input `abbrev`,
  # then that candidate zone should be in the output
  matches <- info$first$abbreviation == abbrev
  zones <- info$zones[matches]
  
  lapply(zones, as_zoned_time, x = x)
}

compute_ambiguous <- function(x, info, abbrev) {
  info <- info[info$type == "ambiguous",]

  # Of the two possible times,
  # does the abbreviation of the earliest match the input `abbrev`?
  matches <- info$first$abbreviation == abbrev
  zones <- info$zones[matches]
  
  earliest <- lapply(zones, as_zoned_time, x = x, ambiguous = "earliest")
  
  # Of the two possible times,
  # does the abbreviation of the latest match the input `abbrev`?
  matches <- info$second$abbreviation == abbrev
  zones <- info$zones[matches]
  
  latest <- lapply(zones, as_zoned_time, x = x, ambiguous = "latest")
  
  c(earliest, latest)
}

candidates <- naive_find_by_abbrev(x, "IST")
candidates
#> [[1]]
#> <zoned_time<second><Asia/Calcutta>[1]>
#> [1] "2020-10-25T01:30:00+05:30"
#> 
#> [[2]]
#> <zoned_time<second><Asia/Kolkata>[1]>
#> [1] "2020-10-25T01:30:00+05:30"
#> 
#> [[3]]
#> <zoned_time<second><Eire>[1]>
#> [1] "2020-10-25T01:30:00+01:00"
#> 
#> [[4]]
#> <zoned_time<second><Europe/Dublin>[1]>
#> [1] "2020-10-25T01:30:00+01:00"
#> 
#> [[5]]
#> <zoned_time<second><Asia/Jerusalem>[1]>
#> [1] "2020-10-25T01:30:00+02:00"
#> 
#> [[6]]
#> <zoned_time<second><Asia/Tel_Aviv>[1]>
#> [1] "2020-10-25T01:30:00+02:00"
#> 
#> [[7]]
#> <zoned_time<second><Israel>[1]>
#> [1] "2020-10-25T01:30:00+02:00"

While it looks like we got 7 candidates, in reality we only have 3. Asia/Kolkata, Europe/Dublin, and Asia/Jerusalem are our 3 candidates. The others are aliases of those 3 that have been retired but are kept for backwards compatibility.

Looking at the code, there are two ways to add a candidate time zone name to the list.

If there is a unique mapping from {naive-time, zone} to sys-time, then we check if the abbreviation that goes with that unique mapping matches our input abbreviation. If so, then we convert x to a zoned-time with that time zone.

If there is an ambiguous mapping from {naive-time, zone} to sys-time, which is due to a daylight saving time fallback, then we check the abbreviation of both the earliest and latest possible times. If either matches, then we convert x to a zoned-time using that time zone and the information about which of the two ambiguous times was used.

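If you want to inspect the information behind these two branches for a single zone, you can call naive_time_info() directly (a sketch; Europe/Dublin should report an ambiguous type here, since its clocks fall back at this naive-time):

naive_time_info(x, "Europe/Dublin")
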
This example is particularly interesting, since each of the 3 candidates came from a different path. The Asia/Kolkata one is unique, the Europe/Dublin one is ambiguous but the earliest was chosen, and the Asia/Jerusalem one is ambiguous but the latest was chosen:

as_zoned_time(x, "Asia/Kolkata")
#> <zoned_time<second><Asia/Kolkata>[1]>
#> [1] "2020-10-25T01:30:00+05:30"
as_zoned_time(x, "Europe/Dublin", ambiguous = "earliest")
#> <zoned_time<second><Europe/Dublin>[1]>
#> [1] "2020-10-25T01:30:00+01:00"
as_zoned_time(x, "Asia/Jerusalem", ambiguous = "latest")
#> <zoned_time<second><Asia/Jerusalem>[1]>
#> [1] "2020-10-25T01:30:00+02:00"

When is the next daylight saving time event?

Given a particular zoned-time, when will it next be affected by daylight saving time? For this, we can use a relatively low level helper, zoned_time_info(). It returns a data frame of information about the current daylight saving time transition points, along with information about the offset, the current time zone abbreviation, and whether daylight saving time is currently active.

x <- zoned_time_parse_complete("2019-01-01T00:00:00-05:00[America/New_York]")

info <- zoned_time_info(x)

# Beginning of the current DST range
info$begin
#> <zoned_time<second><America/New_York>[1]>
#> [1] "2018-11-04T01:00:00-05:00"

# Beginning of the next DST range
info$end
#> <zoned_time<second><America/New_York>[1]>
#> [1] "2019-03-10T03:00:00-04:00"

So on 2018-11-04, at the second occurrence of the 1 o’clock hour, daylight saving time was turned off. On 2019-03-10 at 3 o’clock, daylight saving time will be turned on again. That is the first moment in time right after a daylight saving time gap of 1 hour, which you can see by subtracting 1 second (in sys-time):

# Last moment in time in the current DST range
info$end %>%
  as_sys_time() %>%
  add_seconds(-1) %>%
  as_zoned_time(zoned_time_zone(x))
#> <zoned_time<second><America/New_York>[1]>
#> [1] "2019-03-10T01:59:59-05:00"

High level API

date_time_info() exists in the high level API to do a similar thing. It is basically the same as zoned_time_info(), except the begin and end columns are returned as R POSIXct date-times rather than zoned-times, and the offset column is returned as an integer rather than as a clock duration (since we try not to expose high level API users to low level types).

x <- date_time_parse("2019-01-01 00:00:00", zone = "America/New_York")

date_time_info(x)
#>                 begin                 end offset   dst abbreviation
#> 1 2018-11-04 01:00:00 2019-03-10 03:00:00 -18000 FALSE          EST