Parsing XML with emacs elisp and finding a nested attribute


Tag: xml,parsing,emacs,lisp,elisp

For two days I have been working on the following problem: I have some XML which looks like this:

	<f form="paradāra"><s stem=""/><m meaning="anothers wife; adultery"/></f>
	<f form="abhimarśeṣu"><s stem="" meaning=""/><m meaning=""/></f>
	<f form="pravṛttān"><s stem="" meaning=""/><m meaning=""/></f>
	<f form="mahipatis"><s stem="" meaning=""/><m meaning=""/></f>
	<f form="udvejana"><s stem="udvejana" meaning="agitation, fear"/><m meaning=""/></f>
	<f form="karais"><na><ins/><pl/><mas/></na><s stem="kara#1" meaning="action"/><m meaning="by action"/></f>
	<f form="daṇḍais"><na><ins/><pl/><mas/></na><na><ins/><pl/><neu/></na><s stem="daṇḍa" meaning="punishment"/><m meaning="by punishment"/></f>
	<f form="cihnayitvā"><s stem="" meaning="having marked"/><m meaning=""/></f>
	<f form="pravāsayet"><v><cj><ca/></cj><sys><prs><md><op/></md><para/></prs></sys><np><sg/><trd/></np></v><s stem="pravas"/><m meaning="to put on, dress"/></f>

Now I convert this into S-expressions by running (xml-parse-region). It returns something like this:

((grammar nil "
" (l nil "
" (f ((form . "paradāra")) (s ((stem . ""))) (m ((meaning . "anothers wife; adultery")))) "

" (f ((form . "abhimarśeṣu")) (s ((stem . "") (meaning . ""))) (m ((meaning . "")))) "
" (f ((form . "pravṛttān")) (s ((stem . "") (meaning . ""))) (m ((meaning . "")))) "
" (f ((form . "mahipatis")) (s ((stem . "") (meaning . ""))) (m ((meaning . "")))) "
") "
" (l nil "
" (f ((form . "udvejana")) (s ((stem . "udvejana") (meaning . "agitation, fear"))) (m ((meaning . "")))) "

" (f ((form . "karais")) (na nil (ins nil) (pl nil) (mas nil)) (s ((stem . "kara#1") (meaning . "action"))) (m ((meaning . "by action")))) "

" (f ((form . "daṇḍais")) (na nil (ins nil) (pl nil) (mas nil)) (na nil (ins nil) (pl nil) (neu nil)) (s ((stem . "daṇḍa") (meaning . "punishment"))) (m ((meaning . "by punishment")))) "

" (f ((form . "cihnayitvā")) (s ((stem . "") (meaning . "having marked"))) (m ((meaning . "")))) "

" (f ((form . "pravāsayet")) (v nil (cj nil (ca nil)) (sys nil (prs nil (md nil (op nil)) (para nil))) (np nil (sg nil) (trd nil))) (s ((stem . "pravas"))) (m ((meaning . "to put on, dress")))) "

") "

What I want to do now is extract all the subnodes which start with (s ... ) and collect them in a separate buffer, like: (s ((stem . "udvejana") (meaning . "agitation, fear"))) What would the code look like? Should I walk the tree recursively? Yesterday I got as far as being able to walk the first (l ... ) node, but due to a blackout I lost the code. I hope somebody has some suggestions! Greetings,



You just need basic recursion:

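(defun rec-filter (predicate seq &optional acc)
  "Collect every cons cell in the tree SEQ for which PREDICATE returns non-nil."
  (cond ((null seq) acc)
        ((consp seq)
         (append (rec-filter predicate (car seq) nil)   ; matches inside the head
                 (rec-filter predicate (cdr seq) nil)   ; matches inside the tail
                 (if (funcall predicate seq)            ; then SEQ itself
                     (cons seq acc)
                   acc)))
        (t acc)))                                       ; atoms never match

;; `parsed-xml' is a placeholder for the list returned by (xml-parse-region).
(rec-filter
 (lambda (x) (eq (car x) 's))
 parsed-xml)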
;; =>
;; ((s ((stem . "")))
;;  (s ((stem . "")
;;      (meaning . "")))
;;  (s ((stem . "")
;;      (meaning . "")))
;;  (s ((stem . "")
;;      (meaning . "")))
;;  (s ((stem . "udvejana")
;;      (meaning . "agitation, fear")))
;;  (s ((stem . "kara#1")
;;      (meaning . "action")))
;;  (s ((stem . "daṇḍa")
;;      (meaning . "punishment")))
;;  (s ((stem . "")
;;      (meaning . "having marked")))
;;  (s ((stem . "pravas"))))
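
The question also asks for the matches to end up in a separate buffer. If the nesting is fixed (a root node containing l nodes, which contain f nodes, which contain s nodes, as in the parsed output above), the accessors that ship with xml.el can do this without a generic tree walk. The following is only a sketch under that assumption; the function names `my-s-nodes` and `my-s-nodes-to-buffer` and the buffer names in the usage note are made up for illustration.

```elisp
(require 'xml)  ; xml-parse-region, xml-get-children, xml-get-attribute

(defun my-s-nodes (root)
  "Return every <s> node under ROOT, assuming the layout ROOT > l > f > s."
  (let (result)
    (dolist (l (xml-get-children root 'l))
      (dolist (f (xml-get-children l 'f))
        (dolist (s (xml-get-children f 's))
          (push s result))))
    ;; `push' builds the list backwards, so restore document order.
    (nreverse result)))

(defun my-s-nodes-to-buffer (xml-buffer out-name)
  "Parse XML-BUFFER and print its <s> nodes, one per line, into buffer OUT-NAME."
  ;; `xml-parse-region' returns a list of top-level nodes; take the first one.
  (let ((root (car (with-current-buffer xml-buffer
                     (xml-parse-region (point-min) (point-max))))))
    (with-current-buffer (get-buffer-create out-name)
      (erase-buffer)
      (dolist (s (my-s-nodes root))
        (prin1 s (current-buffer))   ; write the node in readable Lisp syntax
        (insert "\n"))
      (current-buffer))))
```

Assuming the XML lives in a buffer named "test.xml", (my-s-nodes-to-buffer "test.xml" "*s-nodes*") fills the buffer *s-nodes*. To get at a nested attribute instead of the whole node, (xml-get-attribute s 'stem) returns the value of stem as a string.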

