R - count adjacent NAs in a data.frame column


I want to add a column "na_count" that counts the adjacent NAs in the column value, like this:

    value na_count
        8        0
        2        0
       NA        4
       NA        4
       NA        4
       NA        4
        5        0
        9        0
        1        0
       NA        2
       NA        2
        5        0
       NA        3
       NA        3
       NA        3
        8        0
        5        0
       NA        1

Is there perhaps a way to do this with dplyr window functions?

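Here is an option using dplyr (as the author asked for). We create a grouping column by taking the difference of the logical vector (!is.na(value)), comparing its absolute value with 1, and taking the cumsum, so that a new group starts wherever the NA status changes. We then create 'na_count' by multiplying the logical vector (is.na(value)) by the number of elements in the group (n()).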

    library(dplyr)
    df1 %>%
       select(-na_count) %>% # remove the existing column (not needed)
       group_by(grp = cumsum(c(TRUE, abs(diff(!is.na(value))) == 1))) %>%
       mutate(na_count = is.na(value) * n()) %>%
       ungroup() %>%
       select(-grp)
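To see what the grouping column contains, here is a minimal sketch in base R. It assumes df1 holds the example data from the question (rebuilt here by hand), and it computes grp outside of dplyr purely for illustration; ave stands in for the grouped n().

```r
# Example data rebuilt from the question (assumed to be called df1)
df1 <- data.frame(value = c(8, 2, NA, NA, NA, NA, 5, 9, 1,
                            NA, NA, 5, NA, NA, NA, 8, 5, NA))

# A new group starts wherever the NA status flips between neighbours
not_na <- !is.na(df1$value)
grp <- cumsum(c(TRUE, abs(diff(not_na)) == 1))
grp
#[1] 1 1 2 2 2 2 3 3 3 4 4 5 6 6 6 7 7 8

# Group size for every row, zeroed out for the non-NA rows
na_count <- is.na(df1$value) * ave(seq_along(grp), grp, FUN = length)
na_count
#[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
```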

Or we can convert the 'data.frame' to a 'data.table' (setDT(df1)). Grouped by the rleid of the logical vector (is.na(value)), we get the number of rows (.N), multiply it by the logical vector, and extract the 'V1' column.

    library(data.table) # v1.9.6+
    setDT(df1)[, .N * is.na(value), rleid(is.na(value))]$V1
    #[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1

If we need to create a new column instead,

    setDT(df1)[, na_count := .N * is.na(value), rleid(is.na(value))]
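The grouping in the data.table version relies on rleid, which assigns one integer id to each run of identical consecutive values. A rough base R equivalent, offered only as a sketch (x and run_id are illustrative names that do not appear in the original), could look like this:

```r
# Example values from the question
x <- c(8, 2, NA, NA, NA, NA, 5, 9, 1, NA, NA, 5, NA, NA, NA, 8, 5, NA)

# One id per run of consecutive TRUE/FALSE values of is.na(x)
r <- rle(is.na(x))
run_id <- rep(seq_along(r$lengths), r$lengths)
run_id
#[1] 1 1 2 2 2 2 3 3 3 4 4 5 6 6 6 7 7 8

# tabulate() gives the size of each run; indexing by run_id expands it
# back to one value per row, and is.na(x) zeroes the non-NA rows
na_count <- tabulate(run_id)[run_id] * is.na(x)
na_count
#[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
```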

Or we can use rle (run length encoding) from base R. We get the rle of the NA elements of 'value' (is.na(df1$value)) as a list, use within.list to change the 'values' that are TRUE to the corresponding 'lengths', and return an atomic vector with inverse.rle.

    inverse.rle(within.list(rle(is.na(df1$value)),
                    {values[values] <- lengths[values] }))
    #[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1

Or a more compact version is

    inverse.rle(within.list(rle(is.na(df1$value)), values <- lengths * values))
    #[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
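To make the rle approach easier to follow, the sketch below spells out the intermediate run-length encoding step by step and then stores the result as a column. It assumes df1 holds the example data from the question; r is an illustrative name, and the within.list call is replaced here by a plain assignment to r$values.

```r
# Example data rebuilt from the question (assumed to be called df1)
df1 <- data.frame(value = c(8, 2, NA, NA, NA, NA, 5, 9, 1,
                            NA, NA, 5, NA, NA, NA, 8, 5, NA))

r <- rle(is.na(df1$value))
# r$lengths holds the run sizes:    2 4 3 2 1 3 2 1
# r$values flags the NA runs:       FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE

# Replace each flag by the run length for NA runs and by 0 otherwise
r$values <- r$lengths * r$values

# Expand back to one value per row and attach it to the data frame
df1$na_count <- inverse.rle(r)
df1$na_count
#[1] 0 0 4 4 4 4 0 0 0 2 2 0 3 3 3 0 0 1
```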
