java - How to access the subclass using jsoup
I want to access the webpage https://www.google.com/trends/explore#q=ice%20cream and extract the data shown in the center line graph. The HTML looks like this (I paste only the part I use):
<div class="center-col">
  <div class="comparison-summary-title-line">...</div>
  ...
  <div id="reportcontent" class="report-content">
    <!-- tag handles report titles component -->
    ...
    <div id="report">
      <div id="reportmain">
        <div class="timesection">
          <div class="primaryband timeband">...</div>
          ...
          <div aria-lable="one-chart" style="position: absolute; ...">
            <svg ....>
            ...
            <script type="text/javascript">
              var chartdata = {...}
The data I need is stored in the script part (the last line). My idea is to get the class "report-content" first and then select the script. My code is as follows:
String html = "https://www.google.com/trends/explore#q=ice%20cream";
Document doc = Jsoup.connect(html).get();
Elements center = doc.getElementsByClass("center-col");
Elements report = doc.getElementsByClass("report-content");
System.out.println(center);
System.out.println(report);
When I print the "center-col" class, I get the content of all its subclasses except "report-content", and when I print "report-content", the result looks like this:
<div id="reportcontent" class="report-content"></div>
I also tried this:
Element report = doc.select("div.report-content").first();
But it still does not work at all. How can I get the data in the script here? I appreciate any help!
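The report content is most likely filled in by JavaScript after the page loads, and jsoup does not execute scripts, so the div comes back empty. Try this URL instead: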
https://www.google.com/trends/trendsreport?hl=en&q=${keywords}&tz=${timezone}&content=1
where
${keywords} is the URL-encoded, space-separated keyword list
${timezone} is the URL-encoded timezone in Etc/GMT* form
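To illustrate what the two encoded placeholders end up looking like, here is a minimal sketch that builds the URL with the standard library only. The class and method names (`TrendsUrlBuilder`, `buildUrl`) are made up for illustration, and whether the endpoint still responds is not checked here. Note that `URLEncoder` turns the space into `+` and the literal `+` of the timezone into `%2B`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of jsoup or of any Google API.
class TrendsUrlBuilder {
    // Endpoint copied from the answer; its availability is an assumption.
    private static final String BASE = "https://www.google.com/trends/trendsreport";

    /** Builds the report URL, form-encoding both placeholder values as UTF-8. */
    static String buildUrl(String keywords, String timezone) {
        return BASE + "?hl=en"
                + "&q=" + URLEncoder.encode(keywords, StandardCharsets.UTF_8)
                + "&tz=" + URLEncoder.encode(timezone, StandardCharsets.UTF_8)
                + "&content=1";
    }

    public static void main(String[] args) {
        // prints https://www.google.com/trends/trendsreport?hl=en&q=ice+cream&tz=Etc%2FGMT%2B2&content=1
        System.out.println(buildUrl("ice cream", "Etc/GMT+2"));
    }
}
```

The `encode(String, Charset)` overload used here needs Java 10 or later; on older versions the `encode(String, String)` form with "UTF-8" does the same job but throws a checked exception.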
Sample code:
import java.net.URLEncoder;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

String myKeywords = "ice cream";
String myTimezone = "Etc/GMT+2";
String url = "https://www.google.com/trends/trendsreport?hl=en&q=" + URLEncoder.encode(myKeywords, "UTF-8") + "&tz=" + URLEncoder.encode(myTimezone, "UTF-8") + "&content=1";
Document doc = Jsoup.connect(url).timeout(10000).get();
// The script element that directly follows the chart div holds the data
Element scriptElement = doc.select("div#timeseries_graph_0-time-chart + script").first();
if (scriptElement == null) {
    throw new RuntimeException("Unable to locate trends data.");
}
String jsCode = scriptElement.html();
// parse jsCode to extract chartdata...
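The last step, pulling the data out of the script text, is left open above. One possible approach, sketched here under assumptions, is to find the variable declaration and copy everything up to the matching closing brace. The class name `ScriptVarExtractor` and the sample script are invented for illustration, since the real script content is elided in the question. The sketch skips braces inside string literals but does not handle JavaScript comments or regex literals, and it takes the first `{` after the first occurrence of `var <name>`.

```java
// Hypothetical helper; uses only the standard library.
class ScriptVarExtractor {
    /**
     * Returns the {...} literal that follows "var varName" in jsCode,
     * or null if the declaration or a balanced literal is not found.
     */
    static String extractObjectLiteral(String jsCode, String varName) {
        int decl = jsCode.indexOf("var " + varName);
        if (decl < 0) return null;
        int start = jsCode.indexOf('{', decl);
        if (start < 0) return null;

        int depth = 0;
        char quote = 0; // 0 means "not inside a string literal"
        for (int i = start; i < jsCode.length(); i++) {
            char c = jsCode.charAt(i);
            if (quote != 0) {
                if (c == '\\') i++;              // skip the escaped character
                else if (c == quote) quote = 0;  // closing quote
            } else if (c == '"' || c == '\'') {
                quote = c;                       // opening quote
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return jsCode.substring(start, i + 1);
            }
        }
        return null; // braces never balanced
    }

    public static void main(String[] args) {
        // Made-up sample input, not real Google Trends output.
        String jsCode = "var chartdata = {\"rows\": [{\"label\": \"a}b\", \"v\": 42}]}; draw(chartdata);";
        System.out.println(extractObjectLiteral(jsCode, "chartdata"));
    }
}
```

If the extracted literal turns out to be valid JSON, it can then be handed to a JSON parser; if it contains JavaScript-only syntax such as `new Date(...)`, it would need further cleanup first.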