Mean of the n last or first
A small post today that could be very useful. A colleague ask me how to compute the mean of the 5 oldest trees in a populations. The following code that I have found in stackoverflow.
library(data.table)
treeage <- data.table(age = rnorm(50,200,50), group = rep(LETTERS[1:5],each = 10)) #dataset with trees age in 5 different group
subset_mean_three <- treeage[order(rank(age), group)][,tail(.SD,3),by=group] #subset the 3 oldest trees for each unique group
group_age <- subset_mean_three [, .(age_mean= mean(age )), by = group] #Compute the mean of the tree age from the 3 oldest trees per group
merge(treeage,group_age)[1:15] #adding the mean age of the 3 oldest trees in the treeage data table
As well you can ask for the n quantile age distribution or the youngest trees ages with the following code:
#For the quantile :
subset_quantile <- treeage[, .(quantile_age = quantile(age,0.90)),by=group] #subset the 90 qauntile age for each unique group
#For the 3 youngest trees:
subset_mean_young <- treeage[order(rank(age), group)][,head(.SD,3),by=group] #subset the 3 youngest trees for each unique group
group_age_young <- subset_mean_young [, .(age_mean_young = mean(age )), by = group] #compute the mean of the 3 youngest trees for each groups
#merge results
Reduce(merge,list(treeage,group_age,subset_quantile,group_age_young))[1:15] #Merging all the tables together
group age age_mean quantile_age age_mean_young 1: A 173.3567 303.9826 291.0337 170.4187 2: A 262.4292 303.9826 291.0337 170.4187 3: A 233.8751 303.9826 291.0337 170.4187 4: A 211.6698 303.9826 291.0337 170.4187 5: A 193.1739 303.9826 291.0337 170.4187 6: A 282.6023 303.9826 291.0337 170.4187 7: A 144.7256 303.9826 291.0337 170.4187 8: A 366.9163 303.9826 291.0337 170.4187 9: A 201.2530 303.9826 291.0337 170.4187 10: A 206.6334 303.9826 291.0337 170.4187 11: B 265.5331 292.0307 279.6926 185.2296 12: B 228.0602 292.0307 279.6926 185.2296 13: B 200.2322 292.0307 279.6926 185.2296 14: B 219.2227 292.0307 279.6926 185.2296 15: B 203.7675 292.0307 279.6926 185.2296
You can also see how to merge multiple data table together using the
function Reduce(merge, list(DT1, DT2, DT3, ...))
finding also in stackoverflow.