Summarize observations in R
I have been working on the following CSV file, http://www3.amherst.edu/~nhorton/r2/datasets/batting.csv, for my own practice.
I am not sure how to do the following:
- Summarize the observations for the same team (teamid) in the same year by adding up the component values. That is, I should end up with one record per team per year, and this record should have the year, team name, total runs, total hits, total x2b, …, total hbp.
Here is the code I have so far. It gives me only one team per year, yet I need all teams for each year with their totals (e.g., for 1980 I need all teams with totalruns, totalhits, ...; for 1981, all teams with totalruns, totalhits, ...; and so on).
newdata1 <- read.csv("http://www3.amherst.edu/~nhorton/r2/datasets/batting.csv")
# Row indices are grouped by year only, which is why each year yields a single row
id <- split(1:nrow(newdata1), newdata1$yearid)
a2 <- data.frame(
  yearid    = sapply(id, function(i) newdata1$yearid[i[1]]),
  teamid    = sapply(id, function(i) newdata1$teamid[i[1]]),
  totalruns = sapply(id, function(i) sum(newdata1$r[i], na.rm = TRUE)),
  totalhits = sapply(id, function(i) sum(newdata1$h[i], na.rm = TRUE)),
  totalx2b  = sapply(id, function(i) sum(newdata1$x2b[i], na.rm = TRUE)),
  totalx3b  = sapply(id, function(i) sum(newdata1$x3b[i], na.rm = TRUE)),
  totalhr   = sapply(id, function(i) sum(newdata1$hr[i], na.rm = TRUE)),
  totalbb   = sapply(id, function(i) sum(newdata1$bb[i], na.rm = TRUE)),
  totalsb   = sapply(id, function(i) sum(newdata1$sb[i], na.rm = TRUE)),
  totalgidp = sapply(id, function(i) sum(newdata1$gidp[i], na.rm = TRUE)),
  totalibb  = sapply(id, function(i) sum(newdata1$ibb[i], na.rm = TRUE)),
  totalhbp  = sapply(id, function(i) sum(newdata1$hbp[i], na.rm = TRUE))
)
a2
Perhaps try something like:
library("dplyr")
newdata1 %>%
  group_by(yearid, teamid) %>%
  summarize_each(funs(sum(., na.rm = TRUE)), r, h, x2b, x3b, hr, bb, sb, gidp, ibb, hbp)
Naturally, this is only useful if you're comfortable with the dplyr library. It is also a guess, made without looking at the data closely.
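If you would rather stay in base R, the same idea of grouping by both year and team can be sketched with aggregate(). This is only a minimal sketch on a made-up toy data frame: the values are invented for illustration, and the column names simply mirror the ones used in the question, so they may need adjusting to the real file.

```r
# Toy data standing in for batting.csv; all values here are made up.
toy <- data.frame(
  yearid = c(1980, 1980, 1980, 1981, 1981),
  teamid = c("BOS", "BOS", "NYA", "BOS", "NYA"),
  r      = c(10, 5, 7, 3, NA),
  h      = c(20, NA, 15, 8, 12)
)

# One row per (yearid, teamid) pair. na.pass keeps rows that contain NAs,
# so that sum(..., na.rm = TRUE) can skip missing values column by column.
totals <- aggregate(
  cbind(r, h) ~ yearid + teamid,
  data = toy,
  FUN = sum, na.rm = TRUE,
  na.action = na.pass
)
totals
```

The key difference from the code in the question is that the grouping uses yearid and teamid together, rather than yearid alone.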
Also, instead of listing each column you wish to sum over, you can alternatively do
summarize_each(funs(sum(., na.rm = TRUE)), -column_to_exclude1, -column_to_exclude2)
and so forth.
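Note that summarize_each() and funs() are deprecated in newer dplyr releases. Assuming dplyr 1.0.0 or later, the exclusion approach might be written with across() as sketched below. The toy data frame and its playerid column are made up for illustration and are not taken from the actual file.

```r
library(dplyr)

# Toy data standing in for batting.csv; all values here are made up.
toy <- data.frame(
  playerid = c("p1", "p2", "p3", "p4", "p5"),
  yearid   = c(1980, 1980, 1980, 1981, 1981),
  teamid   = c("BOS", "BOS", "NYA", "BOS", "NYA"),
  r        = c(10, 5, 7, 3, NA),
  h        = c(20, NA, 15, 8, 12)
)

# Sum every column except playerid; the grouping columns (yearid, teamid)
# are left out of across() automatically.
totals <- toy %>%
  group_by(yearid, teamid) %>%
  summarise(across(-playerid, ~ sum(.x, na.rm = TRUE)), .groups = "drop")
totals
```

Any non-numeric column that is not a grouping column has to be excluded this way, since sum() would fail on it.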