R - Count adjacent NAs in a data.frame column
I want to add a column "na_count" that counts the adjacent NAs in the column value, like this:

    value  na_count
        8         0
        2         0
       NA         4
       NA         4
       NA         4
       NA         4
        5         0
        9         0
        1         0
       NA         2
       NA         2
        5         0
       NA         3
       NA         3
       NA         3
        8         0
        5         0
       NA         1
Is there perhaps a way to do this with dplyr window functions?
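To try the answers below, the example data can be rebuilt as a data frame. This is a minimal sketch: the name df1 is taken from the code in the answers, and including the expected na_count column is an assumption based on the first answer dropping that column before recomputing it.

```r
# Example data from the question; `df1` is the name used in the answers.
# The expected `na_count` column is included here (an assumption) because
# the dplyr answer starts by removing it.
df1 <- data.frame(
  value    = c(8, 2, NA, NA, NA, NA, 5, 9, 1, NA, NA, 5, NA, NA, NA, 8, 5, NA),
  na_count = c(0, 0, 4, 4, 4, 4, 0, 0, 0, 2, 2, 0, 3, 3, 3, 0, 0, 1)
)
```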
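Here is an option using dplyr (as the author asked for). We create a grouping column by taking the difference of the logical vector (!is.na(value)), comparing its absolute value with 1, and taking the cumsum, so that each run of consecutive NA or non-NA values gets its own group. Then we create 'na_count' by multiplying the logical vector (is.na(value)) by the number of elements in the group (n()).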
    library(dplyr)
    df1 %>%
      select(-na_count) %>%  # removing the existing column as it is not needed
      group_by(grp = cumsum(c(TRUE, abs(diff(!is.na(value))) == 1))) %>%
      mutate(na_count = is.na(value) * n()) %>%
      ungroup() %>%
      select(-grp)
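To see what the grouping column looks like, the same steps can be sketched in base R on a plain vector. The intermediate names (not_na, changed, grp_size) and the use of ave() to stand in for n() are assumptions for illustration, not part of the dplyr code above.

```r
value <- c(8, 2, NA, NA, NA, NA, 5, 9, 1, NA, NA, 5, NA, NA, NA, 8, 5, NA)

# TRUE where a value is present, FALSE where it is NA
not_na <- !is.na(value)

# diff() of a logical vector is -1, 0 or 1; abs(...) == 1 marks every
# position where a run of NAs or non-NAs starts
changed <- abs(diff(not_na)) == 1

# Prepend TRUE for the first element, then cumsum() numbers the runs
grp <- cumsum(c(TRUE, changed))
# grp: 1 1 2 2 2 2 3 3 3 4 4 5 6 6 6 7 7 8

# Size of each run, repeated for every element of the run
# (this plays the role of n() inside group_by/mutate)
grp_size <- ave(seq_along(value), grp, FUN = length)

# Non-NA positions are multiplied by FALSE (0), NA positions keep the run size
na_count <- is.na(value) * grp_size
# na_count: 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
```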
Or we can convert the 'data.frame' to a 'data.table' (setDT(df1)), group by the rleid of the logical vector (is.na(value)), get the number of rows in each group (.N), multiply it by the logical vector, and extract the 'V1' column.
    library(data.table)  # v1.9.6+
    setDT(df1)[, .N * is.na(value), rleid(is.na(value))]$V1
    #[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
If we need to create a new column,

    setDT(df1)[, na_count := .N * is.na(value), rleid(is.na(value))]
Or we can use rle (run-length encoding) from base R. We take the rle of whether 'value' is NA (is.na(df1$value)), which gives a list, use within.list to change the 'values' that are TRUE to the corresponding 'lengths', and return an atomic vector with inverse.rle.
    inverse.rle(within.list(rle(is.na(df1$value)), {
      values[values] <- lengths[values]
    }))
    #[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
Or a more compact version is

    inverse.rle(within.list(rle(is.na(df1$value)), values <- lengths * values))
    #[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
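If the within.list call is hard to read, the same rle idea can be spelled out step by step. This is only a sketch: the helper name count_adjacent_na is hypothetical, and it modifies the rle object directly instead of going through within.list.

```r
# Hypothetical helper: for each element of x, return the length of the run
# of adjacent NAs it belongs to, or 0 if the element is not NA.
count_adjacent_na <- function(x) {
  r <- rle(is.na(x))               # runs of TRUE (NA) and FALSE (not NA)
  r$values <- r$lengths * r$values # TRUE runs become their length, FALSE runs 0
  inverse.rle(r)                   # expand the runs back to the original length
}

x <- c(8, 2, NA, NA, NA, NA, 5, 9, 1, NA, NA, 5, NA, NA, NA, 8, 5, NA)
count_adjacent_na(x)
#[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
```

With a data frame like the one in the question, the result could be assigned as a column, e.g. df1$na_count <- count_adjacent_na(df1$value).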