Frequency of Characters in Strings as columns in data frame using R

- September 15, 2010

I have a data frame 'initial' in the following format:

    > head(initial)
          strings
    1     a,a,b,c
    2       a,b,c
    3 a,a,a,a,a,b
    4     a,a,b,c
    5       a,b,c
    6 a,a,a,a,a,b

and the data frame I want is 'final':

    > head(final)
          strings a b c
    1     a,a,b,c 2 1 1
    2       a,b,c 1 1 1
    3 a,a,a,a,a,b 5 1 0
    4     a,a,b,c 2 1 1
    5       a,b,c 1 1 1
    6 a,a,a,a,a,b 5 1 0

To generate the data frames, the following code can be used; it keeps the number of rows high:

    initial <- data.frame(strings = rep(c("a,a,b,c", "a,b,c", "a,a,a,a,a,b"), 100))
    final <- data.frame(strings = rep(c("a,a,b,c", "a,b,c", "a,a,a,a,a,b"), 100),
                        a = rep(c(2, 1, 5), 100),
                        b = rep(c(1, 1, 1), 100),
                        c = rep(c(1, 1, 0), 100))

What is the fastest way I can achieve this? Any help is appreciated.

We can use base R methods for this task. Split the 'strings' column (strsplit(...)), set the names of the output list to the sequence of rows, stack to convert it to a data.frame with key/value columns, get the frequency table, convert that to a 'data.frame', and cbind it with the original dataset.

    cbind(df1, as.data.frame.matrix(
                 table(
                   stack(
                     setNames(
                       strsplit(as.character(df1$strings), ','), 1:nrow(df1))
                   )[2:1])))
    #          strings a b c d
    #1         a,b,c,d 1 1 1 1
    #2     a,b,b,d,d,d 1 2 0 3
    #3 a,a,a,a,b,c,d,d 4 1 1 2
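To make the nested call easier to follow, here is the same pipeline unrolled one step at a time. This is only a sketch: the definition of df1 is not shown in the post, so it is reconstructed here from the printed output, and the intermediate names (parts, long, counts, result) are made up for illustration.

```r
# Assumed definition of df1, reconstructed from the printed output
df1 <- data.frame(strings = c("a,b,c,d", "a,b,b,d,d,d", "a,a,a,a,b,c,d,d"),
                  stringsAsFactors = FALSE)

# 1. Split each string on commas: a list with one character vector per row
parts <- strsplit(as.character(df1$strings), ",")

# 2. Name each list element by its row number so the row survives stacking
parts <- setNames(parts, seq_len(nrow(df1)))

# 3. Stack into a long data.frame: 'values' is the character, 'ind' is the row
long <- stack(parts)

# 4. Cross-tabulate row ('ind') against character ('values')
counts <- table(long[2:1])

# 5. Turn the table into a data.frame of counts and bind it to the original
result <- cbind(df1, as.data.frame.matrix(counts))
print(result)
```

Printing long after step 3 is a quick way to see the key/value layout that table() works on.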

Or we can use mtabulate after splitting the column.

    library(qdapTools)
    cbind(df1, mtabulate(strsplit(as.character(df1$strings), ',')))
    #          strings a b c d
    #1         a,b,c,d 1 1 1 1
    #2     a,b,b,d,d,d 1 2 0 3
    #3 a,a,a,a,b,c,d,d 4 1 1 2
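If installing qdapTools is not an option, the per-row tabulation can be approximated in base R. The sketch below is not how mtabulate is implemented; it is a substitute that counts each row against a shared set of levels. The df1 definition is again an assumption reconstructed from the printed output, and lev, parts, counts and result are illustrative names.

```r
# Assumed definition of df1, reconstructed from the printed output
df1 <- data.frame(strings = c("a,b,c,d", "a,b,b,d,d,d", "a,a,a,a,b,c,d,d"),
                  stringsAsFactors = FALSE)

parts <- strsplit(as.character(df1$strings), ",")

# All distinct characters, sorted, so every row is counted against the same columns
lev <- sort(unique(unlist(parts)))

# Count each row's characters; tabulate() returns 0 for levels that are absent
counts <- do.call(rbind, lapply(parts, function(p) {
  tabulate(factor(p, levels = lev), nbins = length(lev))
}))
colnames(counts) <- lev

result <- cbind(df1, as.data.frame(counts))
print(result)
```

Because each row is counted on its own, the rows stay in their original order without any extra factor handling.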

Update

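For the new dataset 'initial', the second method works. If we need to use the first method and keep the rows in the correct order, convert 'ind' to factor class with the levels specified as the unique elements of 'ind'. table() orders its rows by factor level, so if the levels of 'ind' end up sorted as character strings ("1", "10", "100", ...), the counts will not line up with the rows of 'initial'.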

    df1 <- stack(setNames(strsplit(as.character(initial$strings), ','),
                          seq_len(nrow(initial))))
    # keep the levels in order of appearance so table() preserves the row order
    df1$ind <- factor(df1$ind, levels=unique(df1$ind))
    cbind(initial, as.data.frame.matrix(table(df1[2:1])))
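Since the question supplies the expected 'final' data frame, one way to sanity-check the updated first method is to compare its counts against 'final' column by column. This is a sketch under the question's own data; long, result and same are names chosen here for illustration.

```r
# Data as given in the question
initial <- data.frame(strings = rep(c("a,a,b,c", "a,b,c", "a,a,a,a,a,b"), 100))
final <- data.frame(strings = rep(c("a,a,b,c", "a,b,c", "a,a,a,a,a,b"), 100),
                    a = rep(c(2, 1, 5), 100),
                    b = rep(c(1, 1, 1), 100),
                    c = rep(c(1, 1, 0), 100))

# First method, with 'ind' levels fixed to order of appearance
long <- stack(setNames(strsplit(as.character(initial$strings), ","),
                       seq_len(nrow(initial))))
long$ind <- factor(long$ind, levels = unique(long$ind))
result <- cbind(initial, as.data.frame.matrix(table(long[2:1])))

# table() yields integer counts while 'final' holds doubles, so compare values
same <- sapply(c("a", "b", "c"), function(col) {
  all(as.numeric(result[[col]]) == final[[col]])
})
print(same)
```

If any element of same is FALSE, the most likely cause is the row order of the table, which is what the factor conversion is there to prevent.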
