Is there a way to get XML::Twig to understand a UTF-16-encoded XML file?

Question:

Tag: xml,perl,xml-twig

Is there a way to get XML::Twig to understand a UTF-16-encoded XML file?

The code to read the file is what was stated in the tutorials:

use warnings;
use strict;

use XML::Twig;

# ...

my $twig = XML::Twig->new(
  twig_handlers => { ... },
  pretty_print  => 'indented',
  keep_encoding => 1,
);

# ...

$twig->parsefile('myXmlFile.xml');  # <= line 71

The error is:

error parsing tag '<RIBBON>' at /usr/lib/perl5/vendor_perl/5.14/x86_64-cygwin-threads/XML/Parser/Expat.pm line 470
 at ../../cv32/res/convert-xml-string2.pl line 71
 at ../../cv32/res/convert-xml-string2.pl line 71

The XML starts off like so:

<?xml version="1.0" encoding="utf-16"?>

Changing my opening code as Borodin suggests, it still doesn't work:

# parse the XML file
open(my $xmlIn, '<:encoding(UTF-16)', $xmlFile) or die "Couldn't open xml file '$xmlFile'. $!";
$twig->parse($xmlIn); # <= line 72

The error becomes:

encoding specified in XML declaration is incorrect at line 1, column 30, byte 30 at /usr/lib/perl5/vendor_perl/5.14/x86_64-cygwin-threads/XML/Parser.pm line 187
 at ../../cv32/res/convert-xml-string2.pl line 72

Answer:

Apparently, the XML parser used by XML::Twig (XML::Parser) doesn't support UTF-16. You need to convert the XML document to a supported encoding (e.g. UTF-8) first.

For example,

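use XML::LibXML qw( );

# Slurp the file as raw bytes; $qfn is the path to the XML file.
my $xml;
{
   open(my $fh, '<:raw', $qfn)
      or die $!;
   local $/;
   $xml = <$fh>;
}

# Let XML::LibXML decode the UTF-16 document and re-serialize it as UTF-8.
{
   my $doc = XML::LibXML->new()->parse_string($xml);
   $doc->setEncoding('UTF-8');
   $xml = $doc->toString();
}

$twig->parse($xml);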

A lighter solution would be to detect (or expect) UTF-16, decode the document (using Encode's decode), use a regex to adjust the encoding declaration, then encode the document as UTF-8 (using Encode's encode).
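
The lighter approach might look roughly like the sketch below. It is untested against the asker's file and makes two assumptions: the input starts with a byte order mark (Encode's 'UTF-16' decoder needs one to choose the byte order), and the XML declaration sits at the very start of the document. The helper name utf16_xml_to_utf8 is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Hypothetical helper: takes the raw bytes of a UTF-16 XML document and
# returns UTF-8 bytes whose XML declaration says encoding="UTF-8".
sub utf16_xml_to_utf8 {
    my ($bytes) = @_;

    # 'UTF-16' reads the byte order mark to pick the byte order and strips it.
    # Assumption: a BOM is present; without one, decode() dies.
    my $text = decode('UTF-16', $bytes);

    # Rewrite only the encoding pseudo-attribute of the XML declaration,
    # keeping whichever quote character the document used.
    $text =~ s/\A(<\?xml\b[^>]*?\bencoding\s*=\s*)(["'])[^"']*\2/$1${2}UTF-8$2/;

    return encode('UTF-8', $text);
}
```

With $xml read through the ':raw' layer as in the example above, the call would then be $twig->parse(utf16_xml_to_utf8($xml)). This avoids a second XML parse, at the cost of trusting a regex to find the declaration.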

