How to select only table rows with specific content inside
I'm scraping an email that has many table rows, most of which I want to exclude. The table rows I need look exactly like this:
<tr> <td class="quantity"> not empty </td> <td class="description"> not empty </td> <td class="price"> not empty </td> </tr>
None of the table rows have a class or id. Moreover, there are unwanted <table> rows that contain cells with these classes but no values, so I need only the table rows that have all 3 of these cell classes, with all 3 cells holding non-empty values. I'm not sure of the syntax for this:
body = Nokogiri::HTML(email)
wanted_rows = body.css('tr').select { not sure how to encapsulate the logic here }
This is straightforward with XPath:
wanted_rows = body.xpath('//tr[td[(@class = "quantity") and normalize-space()] and td[(@class = "description") and normalize-space()] and td[(@class = "price") and normalize-space()]]')
The normalize-space() calls are the same as normalize-space(.) != "", i.e. they check that the current node (the td) contains something other than whitespace.