XPath - How to get image source from xml


Tag: xpath

Hello, I have this XML:

        <title> Something for title»</title>
        <link>some url</link>
        <description><![CDATA[<div class="feed-description"><div class="feed-image"><img src="pictureUrl.jpg" /></div>text for desc</div>]]></description>
        <pubDate>Thu, 11 Jun 2015 16:50:16 +0300</pubDate>

I tried to get the img src with the path //description//div[@class='feed-description']//div[@class='feed-image']//img/@src, but it doesn't work.

Is there any solution?


A CDATA section escapes its contents. In other words, CDATA prevents its contents from being parsed as markup when the rest of the document is parsed. So the <div>s in there are not seen as XML elements, only as flat text. The <description> element has no element children ... only a single text child. As such, XPath can't select any <div> descendant of <description> because none exists in the parsed XML tree.
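A quick way to see this is to parse the snippet and inspect the <description> element. The sketch below uses Python's standard xml.etree.ElementTree. The <item> wrapper is an assumption, added only so the fragment has a single root element.

```python
import xml.etree.ElementTree as ET

# The <item> wrapper is an assumption; the question shows only the child elements.
FEED = """<item>
  <title>Something for title</title>
  <link>some url</link>
  <description><![CDATA[<div class="feed-description"><div class="feed-image"><img src="pictureUrl.jpg" /></div>text for desc</div>]]></description>
</item>"""

item = ET.fromstring(FEED)
description = item.find("description")

# The CDATA content arrives as one run of text, not as child elements.
child_count = len(list(description))
text = description.text

print(child_count)               # 0
print(text.startswith("<div"))   # True: the markup survives only as characters
print(item.find(".//img"))       # None: there is no <img> element in the tree
```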

What to do?

If your XPath environment supports XPath 3.0, you could use parse-xml() to turn the flat text into a tree, then use XPath to select //div[@class='feed-description']//div[@class='feed-image']//img/@src from the resulting tree.
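Where XPath 3.0 isn't available, the same two-step idea (parse the text, then query the result) can often be done in the host language instead. Here is a minimal Python sketch, assuming the CDATA content is itself well-formed XML. That holds for the sample in the question but not for arbitrary HTML. The <item> wrapper is again an assumption.

```python
import xml.etree.ElementTree as ET

# The <item> wrapper is an assumption so that the fragment has one root element.
FEED = """<item>
  <description><![CDATA[<div class="feed-description"><div class="feed-image"><img src="pictureUrl.jpg" /></div>text for desc</div>]]></description>
</item>"""

outer = ET.fromstring(FEED)

# Step 1: take the flat text of <description> and parse it as its own document.
inner = ET.fromstring(outer.findtext("description"))

# Step 2: query the new tree. ElementTree supports only a subset of XPath,
# so the attribute is read with .get() rather than selected with /@src.
img = inner.find(".//div[@class='feed-image']/img")
src = img.get("src") if img is not None else None

print(src)  # pictureUrl.jpg
```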

Otherwise, your best workaround may be to use primitive string-processing functions like substring-before() and substring-after(), or the regular-expression functions matches() and replace(), which require XPath 2.0. Of course, many people will tell you not to use regular expressions to analyze markup like XML and HTML. For good reason: in the general case, it's very difficult to do it right (with regexes or with plain string searches). But for very restricted cases where the input is highly predictable, and in the absence of better tools, it can be the best tool for a less-than-ideal job.

For example, for the data shown in your question, you could use

substring-before(substring-after(//description, 'img src="'), '"')

In this case, the inner call substring-after(//description, 'img src="') returns pictureUrl.jpg" /></div>text for desc</div>, of which the substring before " is pictureUrl.jpg.
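The two steps can also be traced outside XPath. The Python helpers below are hypothetical stand-ins that mimic the two XPath functions for non-empty markers, including returning an empty string when the marker is absent. They are not part of any library.

```python
def substring_after(s: str, marker: str) -> str:
    """Mimic XPath substring-after(): text following the first marker, or ''."""
    _, found, tail = s.partition(marker)
    return tail if found else ""


def substring_before(s: str, marker: str) -> str:
    """Mimic XPath substring-before(): text preceding the first marker, or ''."""
    head, found, _ = s.partition(marker)
    return head if found else ""


# String value of <description> from the question.
description = ('<div class="feed-description"><div class="feed-image">'
               '<img src="pictureUrl.jpg" /></div>text for desc</div>')

inner = substring_after(description, 'img src="')
result = substring_before(inner, '"')

print(inner)   # pictureUrl.jpg" /></div>text for desc</div>
print(result)  # pictureUrl.jpg
```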

This isn't really robust; for example, it'll fail if there's a space between src and =. But if the exact formatting is predictable, you'll be OK.
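To illustrate the fragility, and how a regular expression can tolerate small formatting differences, here is a Python sketch. The pattern and the extract_src helper name are assumptions made for illustration, and this is still not a general HTML parser.

```python
import re

# Allows optional whitespace around '=' and either quote style.
IMG_SRC = re.compile(r"""<img\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)


def extract_src(text: str) -> str:
    """Return the first img src found in text, or '' if there is none."""
    match = IMG_SRC.search(text)
    return match.group(2) if match else ""


tight = '<div class="feed-image"><img src="pictureUrl.jpg" /></div>'
spaced = '<div class="feed-image"><img src = "pictureUrl.jpg" /></div>'

# The plain string search depends on the exact text 'img src="'.
print('img src="' in tight)    # True
print('img src="' in spaced)   # False

print(extract_src(tight))      # pictureUrl.jpg
print(extract_src(spaced))     # pictureUrl.jpg
```

The same trade-off applies as with the XPath string functions: the pattern handles only the variations it was written for, so it suits predictable feeds rather than arbitrary markup.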

