How can I find the first and last occurrences of an element in a data.frame?

I have searched exhaustively for a direct R translation for the FIRST. and LAST. pointers in SAS DATA steps but can't seem to find one. For those not familiar with SAS, FIRST. is a boolean that identifies the first appearance of a given element in a table and LAST. is a boolean that identifies the last appearance. For instance, consider the following sorted table:

V1    V2    V3
1     1     1
1     1     2
1     2     3
1     2     4
2     3     5
2     3     6
2     4     7
2     4     8
3     5     9
3     5     10
3     6     11
3     6     12

Because SAS DATA steps read tables line by line, I can use a statement like:

IF FIRST.V1 THEN DO ...

FIRST.V1 will return TRUE if and only if the current row holds the first occurrence of its value in V1. In other words, it will return TRUE for V1[1] (the first appearance of '1'), V1[5] (the first appearance of '2'), and V1[9] (the first appearance of '3'). The LAST. pointer works analogously, but flags the final appearance of each value.

Is there anything in R that emulates this?

Answers


You can do this with duplicated, reversing the vector with rev to get LAST:

> v1 <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5)
> data.frame(v1, FIRST = !duplicated(v1), LAST = rev(!duplicated(rev(v1))))
   v1 FIRST  LAST
1   1  TRUE FALSE
2   1 FALSE FALSE
3   1 FALSE  TRUE
4   2  TRUE FALSE
5   2 FALSE  TRUE
6   3  TRUE FALSE
7   3 FALSE FALSE
8   3 FALSE FALSE
9   3 FALSE  TRUE
10  4  TRUE FALSE
11  4 FALSE  TRUE
12  5  TRUE  TRUE
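
The same idea can be applied to the table from the question. This is only a minimal sketch: it rebuilds that table by hand as a data frame named df (the name and the FIRST.*/LAST.* column names are my own choices, not anything SAS or R provides), and it uses the fromLast argument of duplicated in place of the double rev. Passing both key columns is an assumption about what you want for V2, namely the behaviour of BY V1 V2 in SAS.

```r
# The sorted table from the question, rebuilt by hand.
df <- data.frame(
  V1 = rep(1:3, each = 4),
  V2 = rep(1:6, each = 2),
  V3 = 1:12
)

# FIRST.V1 / LAST.V1: duplicated() scans from the top by default
# and from the bottom when fromLast = TRUE.
df$FIRST.V1 <- !duplicated(df$V1)
df$LAST.V1  <- !duplicated(df$V1, fromLast = TRUE)

# FIRST.V2 / LAST.V2 in the sense of "BY V1 V2": check the (V1, V2)
# pair, so a V2 value counts as new whenever its pair is new.
keys <- df[c("V1", "V2")]
df$FIRST.V2 <- !duplicated(keys)
df$LAST.V2  <- !duplicated(keys, fromLast = TRUE)

# Row numbers where each flag is TRUE.
first_v1_rows <- which(df$FIRST.V1)  # 1 5 9
last_v1_rows  <- which(df$LAST.V1)   # 4 8 12
```

Note that duplicated looks at the whole vector, not at consecutive runs, so this matches FIRST./LAST. only when the data are sorted or grouped by the key columns, as they are in the question.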
