html - Python Scrapy: how to get only immediate children
So I have HTML like this:
<div class="content">
    <div class="infobox">
        <p> text </p>
        <p> more text </p>
    </div>
    <p> text again </p>
    <p> more text </p>
</div>
and I am using the selector '.content p::text'.
I thought this would give me only the immediate children. I wanted to extract "text again" and "even more text", but it is also getting the text from the paragraphs inside the other div. How can I prevent this from happening? I want the text only from the paragraphs that are immediate children of the div with class .content.
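Scrapy supports an extended set of CSS selectors as well as XPath selectors. In this case, you're using CSS selectors. A space between two selectors, as in '.content p', matches descendants at any depth, which is why the paragraphs inside the nested div are included. The CSS relationship selector you want is >, the child combinator, which denotes a direct parent/child relationship, as in: .content > p::text
Scrapy's selectors are described in the section titled "Selectors" in the documentation.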