java - How to extract a "registration" URL from mail content
I am successfully reading the content of a Gmail email using JavaMail and can store it in a String. Now I want to get a specific registration URL out of that content. The string contains plenty of tags and hrefs, but I only want to extract the URL behind the hyperlink on the words "click here", which appears in the statement below:
"please <a class="h5" href="https://newstaging.mobilous.com/en/user-register/******" target="_blank">click here</a> complete registration".
The URL behind the "click here" hyperlink is:
href="https://newstaging.mobilous.com/en/user-register/******" target="_blank"
I have tried using the following code:
package email;

import java.util.ArrayList;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.mail.Folder;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.NoSuchProviderException;
import javax.mail.Session;
import javax.mail.Store;

public class EmailAccess {

    public static void check(String host, String storeType, String user, String password) {
        try {
            // create the properties field
            Properties properties = new Properties();
            properties.put("mail.imap.host", host);
            properties.put("mail.imap.port", "993");
            properties.put("mail.imap.starttls.enable", "true");
            properties.setProperty("mail.imap.socketFactory.class", "javax.net.ssl.SSLSocketFactory");
            properties.setProperty("mail.imap.socketFactory.fallback", "false");
            properties.setProperty("mail.imap.socketFactory.port", String.valueOf(993));
            Session emailSession = Session.getDefaultInstance(properties);

            // create the IMAP store object and connect to the server
            Store store = emailSession.getStore("imap");
            store.connect(host, user, password);

            // create the folder object and open it
            Folder emailFolder = store.getFolder("INBOX");
            emailFolder.open(Folder.READ_ONLY);

            // retrieve the messages from the folder in an array and print them
            Message[] messages = emailFolder.getMessages();
            System.out.println("messages.length---" + messages.length);
            int n = messages.length;
            for (int i = 0; i < n; i++) {
                Message message = messages[i];
                ArrayList<String> links = new ArrayList<String>();
                if (message.getSubject().contains("thank signing appexe")) {
                    String desc = message.getContent().toString();
                    // System.out.println(desc);
                    Pattern linkPattern = Pattern.compile(" <a\\b[^>]*href=\"[^>]*>(.*?)</a>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
                    Matcher pageMatcher = linkPattern.matcher(desc);
                    while (pageMatcher.find()) {
                        links.add(pageMatcher.group());
                    }
                } else {
                    System.out.println("email:" + i + " not wanted email");
                }
                for (String temp : links) {
                    if (temp.contains("user-register")) {
                        System.out.println(temp);
                    }
                }
                /*System.out.println("---------------------------------");
                System.out.println("email number " + (i + 1));
                System.out.println("subject: " + message.getSubject());
                System.out.println("from: " + message.getFrom()[0]);
                System.out.println("text: " + message.getContent().toString());*/
            }

            // close the store and folder objects
            emailFolder.close(false);
            store.close();
        } catch (NoSuchProviderException e) {
            e.printStackTrace();
        } catch (MessagingException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // TODO Auto-generated method stub
        String host = "imap.gmail.com";
        String mailStoreType = "imap";
        String username = "rameshakur@gmail.com";
        String password = "*****";
        check(host, mailStoreType, username, password);
    }
}
On executing it I got this output:
<a class="h5" href="https://newstaging.mobilous.com/en/user-register/******" target="_blank">
How can I extract just the href value, i.e. https://newstaging.mobilous.com/en/user-register/******
Please suggest. Thanks.
You're close. You're using group(), but you've got a couple of issues. Here's some code that should work, replacing the bit you've got:
Pattern linkPattern = Pattern.compile(" <a\\b[^>]*href=\"([^\"]*)[^>]*>(.*?)</a>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Matcher pageMatcher = linkPattern.matcher(desc);
while (pageMatcher.find()) {
    links.add(pageMatcher.group(1));
}
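All I did was change the pattern so that it explicitly looks for the end-quote of the href attribute, and then wrap the portion of the pattern that matches the string you're looking for in parentheses, which makes it capture group 1.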
I also added an argument to the pageMatcher.group() method. It needs one to return the captured group; with no argument it returns the entire match.
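To make the difference between group() and group(1) concrete, here is a minimal sketch that runs the pattern above against the sample sentence from the question. The class name GroupDemo and the helper groups() are made up for illustration; only java.util.regex is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original code.
class GroupDemo {
    // Same pattern as in the answer: group 1 is the href value, group 2 the link text.
    static final Pattern LINK = Pattern.compile(
            " <a\\b[^>]*href=\"([^\"]*)[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns {whole match, href value, link text} for the first link, or an empty array.
    public static String[] groups(String html) {
        Matcher m = LINK.matcher(html);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] { m.group(), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String html = "please <a class=\"h5\" "
                + "href=\"https://newstaging.mobilous.com/en/user-register/******\" "
                + "target=\"_blank\">click here</a> complete registration";
        String[] g = groups(html);
        System.out.println("group():  " + g[0]); // the whole anchor element
        System.out.println("group(1): " + g[1]); // only the URL
        System.out.println("group(2): " + g[2]); // only the link text
    }
}
```

Here group() hands back the full anchor tag, which is why the original code printed markup, while group(1) is just the URL inside the quotes.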
To tell the truth, you could just use this pattern instead (along with the .group(1) change):
Pattern linkPattern = Pattern.compile("href=\"([^\"]*)", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
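As a rough sketch of how the shorter pattern could be wrapped up, the helper below collects every double-quoted href value and keeps only those containing a given substring, mirroring the "user-register" check in the question. The class name HrefExtractor and the method extractHrefs are hypothetical names chosen for this example. It assumes hrefs are always double-quoted; for arbitrary HTML a real parser would be more robust than a regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original code.
class HrefExtractor {
    // Captures everything between href=" and the next double quote.
    // DOTALL is omitted here because the pattern contains no '.'.
    private static final Pattern HREF =
            Pattern.compile("href=\"([^\"]*)", Pattern.CASE_INSENSITIVE);

    // Returns all href values in html that contain the given substring.
    public static List<String> extractHrefs(String html, String mustContain) {
        List<String> urls = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String url = m.group(1);
            if (url.contains(mustContain)) {
                urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "please <a class=\"h5\" "
                + "href=\"https://newstaging.mobilous.com/en/user-register/******\" "
                + "target=\"_blank\">click here</a> complete registration";
        for (String url : extractHrefs(html, "user-register")) {
            System.out.println(url);
        }
    }
}
```

In the question's loop, this would replace the pattern, matcher and filtering code with a single call such as extractHrefs(desc, "user-register").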