r - How do I compare two columns and delete the non-overlapping elements?
I have two columns in two data frames; the longer one includes the elements of the other column. I want to delete the elements in the longer column that do not overlap with the other, along with the corresponding rows. I identified the "difference" using:
diff <- setdiff(gdp$country, tfpg$country)
and tried to use two loops to get it done:
for (i in 1:28) { for(j in 1:123) {if(diff[i] == gdp$country[j]) {gdp <- gdp[-c(j),]}}}
where 28 is the number of rows I want to delete (the length of diff), and 123 is the length of the longer column. This does not work, and I get the error message:
Error in if (diff[i] == gdp$country[j]) { : missing value where TRUE/FALSE needed
So how do I fix this? Or is there a better way to do this?
Thank you very much.
I have a data frame called "gdp" here:
country  wto   y1990    y1991    y1992
austria  1995  251540   260197   265644
belgium  1995  322113   328017   333038
cyprus   1995  14436    14537    15898
denmark  1995  177089   179392   182936
finland  1995  149584   140737   136058
france   1995  1804032  1822778  1851937
There are 123 rows. I want to delete the rows with the country names specified in the vector:
diff ["austria","china",...,"yemen"]
There is a better way! What you're describing is the equivalent of a left join, or an inner join. In R, the way to achieve this is with the merge command:
## S3 method for class 'data.frame'
merge(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
      sort = TRUE, suffixes = c(".x",".y"),
      incomparables = NULL, ...)
In your case:
merge(gdp, tfpg, by = intersect('country', 'country'))
With the default all = FALSE, this keeps only the rows whose country appears in both data frames.
For example:
x = data.frame(foo = c(1,2,3,4,5), bar=c("a","b","c","d","e"))
y = data.frame(baz = c(6,7,8,9), bar=c("a","c","e","f"))
z = merge(x,y,by=intersect('bar','bar'))
which gives
  bar foo baz
1   a   1   6
2   c   3   7
3   e   5   8
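If you only want to drop the rows from gdp without pulling in any columns from tfpg, a plain logical subset with %in% may be enough. The sketch below is not taken from the question's real data: gdp here is a cut-down copy of the sample rows shown in the question, and the contents of tfpg are made up for illustration. It also shows why the original loop probably failed: every deleted row shortens gdp, so the hard-coded upper bound of 123 ends up indexing past the last row, which returns NA.

```r
# Cut-down copy of the question's sample rows (assumption: real data has 123 rows)
gdp <- data.frame(
  country = c("austria", "belgium", "cyprus", "denmark", "finland", "france"),
  wto     = 1995,
  y1990   = c(251540, 322113, 14436, 177089, 149584, 1804032)
)

# Hypothetical contents for the shorter data frame
tfpg <- data.frame(country = c("belgium", "denmark", "france"))

# Why the loop breaks: indexing one past the last row yields NA,
# and if (NA == "some country") raises "missing value where TRUE/FALSE needed"
past_end <- gdp$country[nrow(gdp) + 1]
is.na(past_end)

# Keep only the rows of gdp whose country also appears in tfpg
gdp_common <- gdp[gdp$country %in% tfpg$country, ]

# Same result, phrased as "drop everything in the set difference"
to_drop  <- setdiff(gdp$country, tfpg$country)
gdp_drop <- gdp[!(gdp$country %in% to_drop), ]
```

Both gdp_common and gdp_drop keep the original columns of gdp and the original row order, whereas merge sorts by the key and appends the columns of the second data frame.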