How to filter the dataframe based on a particular column or group?


Answers


This is not a dplyr solution, but the task is easy with awk on the shell.

I use two IDs in V2 for the demo.

cat file
         V1        V2   V3      V4    V5   V6   V7    V8
   m.Bra004793 Bra004793  887  887.00 21.74 0.45 0.29 16.40
 m.Bra004793.1 Bra004794  907  907.00 20.52 0.42 0.27 15.11
 m.Bra004793.2 Bra004793 1006 1006.00 16.39 0.30 0.19 10.81
 m.Bra004793.3 Bra004794  988  988.00 56.56 1.05 0.67 38.02
 m.Bra004793.4 Bra004793 1097 1097.00 32.69 0.54 0.35 19.67

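Here is the awk command. It prints the header first, then keeps, for each ID in column 2, the row with the largest value in column 8. The order of the groups in the output is not guaranteed, because awk does not define the iteration order of for (i in max).

awk 'NR==1{print; next} !($2 in max) || $8+0 > max[$2] {max[$2]=$8+0; l[$2]=$0} END{for (i in max) print l[i]}' file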

         V1        V2   V3      V4    V5   V6   V7    V8
 m.Bra004793.4 Bra004793 1097 1097.00 32.69 0.54 0.35 19.67
 m.Bra004793.3 Bra004794  988  988.00 56.56 1.05 0.67 38.02
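
If a stable group order matters, one possible variant is to sort first and let awk keep only the first row per group. This is a sketch, not the original answer's method: the file name demo.txt and the function name max_per_group are made up for illustration, and the data are the demo rows above with single-space separators.

```shell
#!/bin/sh
# Demo data copied from the answer above (two group IDs in V2).
# demo.txt is a hypothetical file name used only for this sketch.
cat > demo.txt <<'EOF'
V1 V2 V3 V4 V5 V6 V7 V8
m.Bra004793 Bra004793 887 887.00 21.74 0.45 0.29 16.40
m.Bra004793.1 Bra004794 907 907.00 20.52 0.42 0.27 15.11
m.Bra004793.2 Bra004793 1006 1006.00 16.39 0.30 0.19 10.81
m.Bra004793.3 Bra004794 988 988.00 56.56 1.05 0.67 38.02
m.Bra004793.4 Bra004793 1097 1097.00 32.69 0.54 0.35 19.67
EOF

# Print the header, then sort the data rows by group (column 2) and by
# column 8 in descending numeric order; awk prints only the first row
# it sees for each group, which is that group's maximum.
max_per_group() {
    head -n 1 "$1"
    tail -n +2 "$1" | LC_ALL=C sort -k2,2 -k8,8nr | awk '!seen[$2]++'
}

max_per_group demo.txt
```

Because the rows are sorted before filtering, the groups come out in the sort order of column 2 rather than in awk's unspecified array order.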

I assume V2 is your grouping ID, since otherwise there would be no maximum to choose from (every value in V1 is unique). In that case, a data.table solution is:

library(data.table)
df = data.table(read.table(header = TRUE, text = "
V1          V2   V3      V4    V5   V6   V7    V8
m.Bra004793   Bra004793  887  887.00 21.74 0.45 0.29 16.40
m.Bra004793.1 Bra004793  907  907.00 20.52 0.42 0.27 15.11
m.Bra004793.2 Bra004793 1006 1006.00 16.39 0.30 0.19 10.81
m.Bra004793.3 Bra004793  988  988.00 56.56 1.05 0.67 38.02
m.Bra004793.4 Bra004793 1097 1097.00 32.69 0.54 0.35 19.67
"))

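# Add a helper column holding the maximum of V8 within each V2 group
df[, best := max(V8), by = V2]
# Keep the rows that reach their group's maximum (tied rows are all kept)
df[V8 == best, ]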
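
Note that filtering on V8 == best keeps every row tied for the group maximum. For comparison, the same tie-keeping behavior can be sketched on the shell with a two-pass awk. This is only an illustration under assumptions: demo_dt.txt and max_rows_with_ties are hypothetical names, and the data are the rows from the data.table example with single-space separators.

```shell
#!/bin/sh
# Same rows as the data.table example above (a single group in V2).
# demo_dt.txt is a hypothetical file name used only for this sketch.
cat > demo_dt.txt <<'EOF'
V1 V2 V3 V4 V5 V6 V7 V8
m.Bra004793 Bra004793 887 887.00 21.74 0.45 0.29 16.40
m.Bra004793.1 Bra004793 907 907.00 20.52 0.42 0.27 15.11
m.Bra004793.2 Bra004793 1006 1006.00 16.39 0.30 0.19 10.81
m.Bra004793.3 Bra004793 988 988.00 56.56 1.05 0.67 38.02
m.Bra004793.4 Bra004793 1097 1097.00 32.69 0.54 0.35 19.67
EOF

# The file is read twice.
# First pass (NR==FNR): record the largest column-8 value per column-2 group,
# skipping the header line.
# Second pass: print the header and every row whose column 8 equals the
# recorded maximum of its group, so tied rows are all kept.
max_rows_with_ties() {
    awk 'NR==FNR { if (FNR>1 && (!($2 in m) || $8+0 > m[$2])) m[$2]=$8+0; next }
         FNR==1 || $8+0 == m[$2]' "$1" "$1"
}

max_rows_with_ties demo_dt.txt
```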

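If you only need the row(s) with the overall maximum of V8, ignoring the grouping in V2, base R subsetting may be enough (test is your data frame):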

test[test$V8==max(test$V8),]
