r - Build data frame from multiple rvest elements
I'm trying to do some web scraping of journal article metadata (title, authors, abstract, etc.). I have a list of pages I need to navigate, and each page has the information I need (except for the table of contents pages in the list). I built a function to piece together each part of the page into a list, and I'm trying to go through each page and end up with a data frame of the results.

Here is what I have:

    article.links <- c("http://onlinelibrary.wiley.com/doi/10.1002/jee.20116/abstract",
                       "http://onlinelibrary.wiley.com/doi/10.1002/jee.20120/abstract",
                       "http://onlinelibrary.wiley.com/doi/10.1002/jee.20117/abstract")

    pager <- function(page) {
      new.row = vector("list", 4)
      page <- read_html(page)
      # doi
      new.row[1] <- page %>% html_node("#doi") %>% html_text()
      # title
      new.row[2] <- page %>% html_node(".maintitle") %>% html_text()
      # authors
      new.row[3] <- page %>% html_node("#authors") %>% html_text()
      # abstract
      new.row[4] <- page %>% html_node("#abstract") %>% html_text()
      return(unlist(new.row))
    }

When I run pager.test(article.links.test[1]) I get the results I expect for one entry. I'm not quite sure how to build a data frame from a series of results, though. I tried a loop with rbind to put the rows together, but when I try it on all of the rows it throws errors about the entries being generated:

    # this doesn't seem to work
    abstracts <- data.frame()
    for (key in 1:length(article.links.test)) {
      abstracts <- rbind(abstracts2, pager.test(article.links.test[key]))
    }

How can I scrape the elements from each of the pages in the list and combine the results into a data frame?
You can use lapply to build one single-row data frame per page, then rbind the rows together:

    options(stringsAsFactors = FALSE)
    library(rvest)

    article.links <- c("http://onlinelibrary.wiley.com/doi/10.1002/jee.20116/abstract",
                       "http://onlinelibrary.wiley.com/doi/10.1002/jee.20120/abstract",
                       "http://onlinelibrary.wiley.com/doi/10.1002/jee.20117/abstract")

    # Return a one-row data frame for a single article page
    pager <- function(page) {
      doc <- read_html(url(page))
      data.frame(doi      = doc %>% html_node("#doi") %>% html_text(),
                 title    = doc %>% html_node(".maintitle") %>% html_text(),
                 authors  = doc %>% html_node("#authors") %>% html_text(),
                 abstract = doc %>% html_node("#abstract") %>% html_text())
    }

    # Apply pager to every link, then stack the resulting rows
    ans <- do.call(rbind, lapply(article.links, pager))
    str(ans)

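The same lapply plus do.call(rbind, ...) pattern can be tried without hitting the network. The sketch below is only an illustration: the `pages` vector and the `parse_page` helper are hypothetical stand-ins, with inline HTML strings in place of the downloaded article pages, and only two of the four fields to keep it short.

```r
library(rvest)

# Hypothetical stand-ins for downloaded pages, so the sketch runs offline.
pages <- c(
  '<html><body><p id="doi">10.1000/a</p><h1 class="maintitle">First</h1></body></html>',
  '<html><body><p id="doi">10.1000/b</p><h1 class="maintitle">Second</h1></body></html>'
)

# Parse one page and return a one-row data frame (hypothetical helper).
parse_page <- function(html) {
  doc <- read_html(html)
  data.frame(
    doi   = doc %>% html_node("#doi") %>% html_text(),
    title = doc %>% html_node(".maintitle") %>% html_text(),
    stringsAsFactors = FALSE
  )
}

# One data frame per page, collected in a list ...
rows <- lapply(pages, parse_page)

# ... then stacked into a single data frame in one call.
result <- do.call(rbind, rows)
str(result)
```

Collecting the rows in a list and binding them once avoids growing a data frame inside a loop, and because every element is already a data frame with the same column names, rbind has nothing to guess about.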