R - rvest: scraping data from a button-style tag?
Following up on more delving into scraping data off sites: I'm trying to pull data off the SeatGeek site into a few columns. I'm having trouble accessing the pricing and link data specifically. The following code runs, but I can't get accurate data for pricing and links. $65 keeps repeating, even though the numbers are different on each button. Any ideas? I appreciate the help!
# ticket scraper
library(rvest)

tix_link = paste("https://seatgeek.com/new-york-knicks-tickets#events")

tix_info = tix_link %>%
  read_html() %>%
  html_nodes(".event-listing-title span")

link_date = read_html(tix_link)
link_date = html_nodes(link_date, ".event-listing-date")

link_time = read_html(tix_link)
link_time = html_nodes(link_time, ".event-listing-time")

link_price = read_html(tix_link)
link_price = html_node(link_price, ".event-listing-button")

link_info = read_html(tix_link)
link_info = html_node(link_info, "span")

# convert to data frame
ticket_deals = data.frame(deals = html_text(tix_info),
                          date = html_text(link_date),
                          time = html_text(link_time),
                          price = html_text(link_price),
                          correpsonding_link = html_attr(link_info, "href"))

head(ticket_deals)

                                     deals       date
1       dallas mavericks @ new york knicks \n nov 14
2        detroit pistons @ new york knicks \n nov 16
3          atlanta hawks @ new york knicks \n nov 20
4 portland trail blazers @ new york knicks \n nov 22
5      charlotte hornets @ new york knicks \n nov 25
6  oklahoma city thunder @ new york knicks \n nov 28
                time price
1  \n mon 7:30 pm \n   $65
2  \n wed 7:30 pm \n   $65
3 \n sun 12:00 pm \n   $65
4  \n tue 7:30 pm \n   $65
5  \n fri 7:30 pm \n   $65
6  \n mon 7:30 pm \n   $65
  correpsonding_link
1               <NA>
2               <NA>
3               <NA>
4               <NA>
5               <NA>
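A likely cause, sketched under assumptions rather than checked against the live page: `html_node()` returns only the first match, so `data.frame()` recycles that single `$65` down every row, whereas `html_nodes()` returns one node per button. The `<NA>` links probably come from reading `href` off a `span`, which carries no such attribute. The markup, prices, and hrefs below are invented for illustration, and it is an assumption that the button on the real page is an `<a>` element holding the link.

```r
library(rvest)

# Hypothetical markup standing in for the live page. The real SeatGeek
# structure may differ, so treat these selectors as assumptions.
page <- read_html('
<ul>
  <li class="event-listing">
    <span class="event-listing-date">nov 14</span>
    <a class="event-listing-button" href="/knicks/1">$65</a>
  </li>
  <li class="event-listing">
    <span class="event-listing-date">nov 16</span>
    <a class="event-listing-button" href="/knicks/2">$48</a>
  </li>
</ul>')

# html_node() (singular) keeps only the first matching button,
# so a data frame built from it repeats the same price on every row.
first_only <- html_text(html_node(page, ".event-listing-button"))

# html_nodes() (plural) keeps every matching button, one per event.
buttons <- html_nodes(page, ".event-listing-button")
prices  <- html_text(buttons, trim = TRUE)

# Read href from the same nodes, assuming the button itself is the link.
links <- html_attr(buttons, "href")

ticket_deals <- data.frame(
  date  = html_text(html_nodes(page, ".event-listing-date"), trim = TRUE),
  price = prices,
  link  = links,
  stringsAsFactors = FALSE
)

print(ticket_deals)
```

Passing `trim = TRUE` to `html_text()` also strips the stray `\n` and padding that show up in the date and time columns above.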