json,xml,r,web-scraping , Getting table in R from HTMLInternalDocument object


Getting table in R from HTMLInternalDocument object

Question:

Tag: json,xml,r,web-scraping

I have to download several tables from website, table id is "tabela", I tried various functions XML::readHTMLTable, XML::xmlTreeParse, but only rvest package loads it :

 require(rvest)
 url="http://www.pse.pl/index.php?modul=21&id_rap=2&data=2013-01-01"
 wpkd <- html(url)
 class(wpkd)
[1] "HTMLInternalDocument" "HTMLInternalDocument" "XMLInternalDocument"  "XMLAbstractDocument" 
 str(wpkd)
Classes 'HTMLInternalDocument', 'HTMLInternalDocument', 'XMLInternalDocument', 'XMLAbstractDocument' <externalptr>

now I would like to extract table with "tabela" id or save wpkd as plain text and try low level extraction. Structure of wpkd isn't recognised properly :

> wpkd %>% xml_structure()
{DTD}
<html>
  <head>
    <title> {text}
    <meta [http-equiv, content]>
    <meta [name, content]>
    <meta [name, content]>
    <link [rel, type, title, href]>
    <meta [name, content]>
    <meta [name, content]>
    <meta [http-equiv, content]>
    <meta [http-equiv, content]>
    <link [rel, type, href]>
    <link [rel, href, type]>
    <link [rel, type, href]>
    <link [rel, href, type, media]>
    <link [rel, href, type, media]>
    <link [rel, href, type, media]>
    <script [src]>
    <script [src]>
Error: Unknown input XMLInternalCommentNode/XMLInternalNode/XMLAbstractNode

Answer:

Given that the "header" is not uniform (spanned TRs) here is one way to do it (it's not the only way):

library(rvest)
library(magrittr)
library(dplyr)

pg <- html("http://www.pse.pl/index.php?modul=21&id_rap=2&data=2013-01-01")

# small function to extract by column

get_col <- function(doc, i) {
  skip <- ifelse(i==8, -1, -2) # last column is "wonky"
  doc %>% 
    html_nodes(xpath=sprintf("//table[@id='tabela']/tr/td[%d]", i)) %>% 
    extract(-1:skip) %>% # skip the useless "TR"s
    html_text()
}

# manually build data frame, which actually gives you better column names

data.frame(time=pg %>% get_col(1),
           demand=pg %>% get_col(2),
           capacity_jwcd=pg %>% get_col(3),
           capacity_njwcd=pg %>% get_col(4),
           generation_jwcd=pg %>% get_col(5),
           generation_njwcd=pg %>% get_col(6),
           reserve_over=pg %>% get_col(7),
           reserve_below=pg %>% get_col(8),
           stringsAsFactors=FALSE) -> energy

glimpse(energy)

## Observations: 24
## Variables:
## $ time             (chr) "1", "2", "3", "4", "5", "6", "7", "8", "9", "10...
## $ demand           (chr) "14 650", "14 000", "13 325", "12 850", "12 575"...
## $ capacity_jwcd    (chr) "21 032", "21 032", "21 032", "21 032", "21 032"...
## $ capacity_njwcd   (chr) "8 918", "8 918", "8 918", "8 918", "8 918", "8 ...
## $ generation_jwcd  (chr) "7 085", "6 446", "5 777", "5 307", "5 031", "4 ...
## $ generation_njwcd (chr) "7 565", "7 554", "7 548", "7 543", "7 544", "7 ...
## $ reserve_over     (chr) "1 328", "1 269", "1 209", "1 166", "1 141", "1 ...
## $ reserve_below    (chr) "-1 328", "-1 269", "-1 209", "-1 166", "-1 141"...               

You'll need to do type conversions on your own, though (and you would even if you used one of the auto-table functions provided they worked).


Related:


XElement.Value is stripping XML tags from content


c#,.net,xml,xml-parsing,xelement
I have the following XML: <Message> <Identification>c387e36a-0d79-405a-745c-7fc3e1aa8160</Identification> <SerializedContent> {"Identification":"81d090ca-b913-4f15-854d-059055cc49ff","LogType":0,"LogContent":"{\"EntitiesChanges\":\" <audit> <username>acfc</username> <date>2015-06-04T15:15:34.7979485-03:00</date> <entities> <entity> <properties> <property> <name>DepId</name> <current>2</current> </property>...

How to extract efficientely content from an xml with python?


python,xml,python-2.7,pandas,lxml
I have the following xml: <?xml version="1.0" encoding="UTF-8" standalone="no"?><author id="user23"> <document><![CDATA["@username: That boner came at the wrong time ???? http://t.co/5X34233gDyCaCjR" HELP I'M DYING ]]></document> <document><![CDATA[Ugh ]]></document> <document><![CDATA[YES !!!! WE GO FOR IT. http://t.co/fiI23324E83b0Rt ]]></document> <document><![CDATA[@username Shout out to me???? ]]></document> </author> What is the most efficient...

List view not returning to original state after clearing search


java,android,xml,android-activity,android-listfragment
I'm trying to get my list to show all my items again whenever I cancel a search from my search view but for some strange reason, the list gets stuck with the results only from the previous search. Does anyone know what is wrong with my code and how to...

Response 200 OK but Jquery shows error?


javascript,jquery,json
I am stuck in kind of a weird problem where I am trying to make a JSONP request to the Open Weather history API to get the weather information for the previous years on the a particular date. fetchHistoryUrl = 'http://78.46.48.103/data/3.0/days?day=0622&mode=list&lat=39.77476949&lon=-100.1953125'; $.ajax({ method: 'get', url: fetchHistoryUrl, success: function(response){ console.log(response); },...

Serializing a java bean into a cookie: Is it bad?


java,json,cookies
In the organization that i work for, there was a serious debate about the following. Scenario: There is a POJO with 6 different properties all are of type Strings. These values need to be persisted as cookies so that it can be picked back when someone does a booking on...

odoo v8 - Field(s) `arch` failed against a constraint: Invalid view definition


python,xml,view,odoo,add-on
I want to create a new view with a DB-view. When I try to install my app, DB-view was created then I get error: 2015-06-22 12:59:10,574 11988 ERROR odoo openerp.addons.base.ir.ir_ui_view: Das Feld `datum` existiert nicht Fehler Kontext: Ansicht `overview.tree.view` [view_id: 1532, xml_id: k. A., model: net.time.overview, parent_id: k. A.] 2015-06-22...

Codeigniter Select JSON, Insert JSON


json,codeigniter,select,insert,routing
I have very simple users database: user_id, user_name, user_email My model this: class Users extends CI_Model { private $table; private $table_fields; private $table_fields_join; function __construct() { parent::__construct(); $this->table = 'users'; $this->table_fields = array( $this->table.'.user_id', $this->table.'.user_name', $this->table.'.user_email' ); $this->table_fields_join = array(); } function select(){ $this->db->select(implode(', ', array_merge($this->table_fields, $this->table_fields_join)));...

Uncaught error: Invalid type for google table column


javascript,json,google-maps,google-visualization
I recently started using google visualization due to its simplicity. But while implementing in my recent project i seem to have run into a bit of trouble. The program shown is used to pull data from a JSON object and utilize it to display data in the google tables. Now...

convert from json to array


javascript,php,json
I am trying to get data to display in a table. I don't know what I am doing wrong, but when I get the data from my page it is an array of single characters. I could parse this myself but would prefer to know what I am doing wrong....

Fatal error catched by register_shutdown_function and update json_encode


php,json,fatal-error
I can catch a fatal error with register_shutdown_function('shutdown') function shutdown(){ $error = error_get_last(); $result['message'] = $error['message']; } and I have an echo json_encode($result); Unfortunately this echo json_encode shows nothing because json is not updated with the message of the fatal error. What can I do to fix this?...

Tagging values in HTML document for automated extraction


html,xml,html5
We have a series of documents that are being converted to HTML for web access. The documents are operating instructions that list actions people have to do as well as distinct requirements. We wanted to put a tag around each requirement so it can be automatically extracted using some code....

group siblings by identifying the first node of a certain type in sequence


xml,xslt,xpath
Not sure if that description is the best...but given this xml: <?xml version="1.0"?> <root> <type1 num="1" first="1"/> <type1 num="2" /> <type2 num="3" /> <type2 num="4" /> <type1 num="5" first="2"/> <type1 num="6" /> <type2 num="7" /> <type2 num="8" /> <type1 num="9" first="3"/> <type1 num="10" /> <type2 num="11" /> <type2 num="12" />...

XML Schema 1.0 “All” with multiple same elements?


xml,schema
What I want to validate is the XML looks like below: <A></A> <B></B> <C></C> <D></D> <E></E> <E></E> A,B,C,D just have zero or one. And they don't have sequence. It could be D,C,B,A. And in the very end, there are one or more E element(s). I have tried multiple ways, but...

XSLT How to remove style from div and td tags


xml,xslt
I am new to XSLT. I got stuck while removing style attributes from div, td or li tags. Input XML: <?xml version="1.0" encoding="UTF-8"?> <div xmlns="http://www.w3.org/1999/xhtml"> <table style="BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; WIDTH: 606px; BORDER-COLLAPSE: collapse; WORD-WRAP: break-word; TABLE-LAYOUT: fixed; BORDER-TOP: medium none; BORDER-RIGHT: medium none" class="MsoNormalTable msoUcTable" tabIndex="-1" border="1"...

why i don't get return value javascript


javascript,jquery,html,json,html5
When i debug my code i can see i have value but i don't get value to createCheckBoxPlatform FN function createCheckBoxPlatform(data) { var platform = ""; $.each(data, function (i, item) { platform += '<label><input type="checkbox" name="' + item.PlatformName + ' value="' + item.PlatformSK + '">' + item.PlatformName + '</label>' +...

Link to another resource in a REST API: by its ID, or by its URL?


json,api,rest,api-design,hateoas
I am creating some APIs using apiary, so the language used is JSON. Let's assume I need to represent this resource: { "id" : 9, "name" : "test", "customer_id" : 12, "user_id" : 1, "store_id" : 3, "notes" : "Lorem ipsum example long text" } Is it correct to refer...

Nested JSON structure to build side-by-side d3.js charts


javascript,json,d3.js
I'm currently stuck on how to traverse a JSON structure (I created) in order to create side-by-side donut charts. I think I've created a bad structure, but would appreciate any advice. I'm working from Mike Bostock's example here: http://bl.ocks.org/mbostock/1305337 which uses a .csv file for source data. In that example...

XSLT for-each statement not iterating proper amount of times


xml,xslt
I am having trouble with my XSLT for-each statements. When I run the XML through the XSLT, it only comes up with the first iteration of the list, and then stops. It doesn't post the values either. Here is the XML code. <?xml version="1.0" encoding="UTF-8"?> <template> <L> <Q>Hey</Q> <Q>There</Q> <Q>Thank...

How to calculate max string-length of a node-set?


xml,xslt,xslt-1.0,libxslt
I am trying to use XSLT to turn an XML document into plain text tables for human consumption. I am using xsltproc, which only implements XSLT 1.0 (so max is from EXSLT actually). I tried the below, but the commented-out definition fails because string-length returns only a single value (the...

Parse JSON output from AlamoFire


json,swift,nsurlrequest
I am using the latest Alamofire to manage a GET http request to my server. I am using the following to GET and parse the JSON: Alamofire.request(.GET, "*******") .responseJSON {(request, response, JSON, error) in if let statusesArray = JSON as? NSArray{ if let aStatus = statusesArray[0] as? NSDictionary{ //OUTPUT SHOWN...

HTMLPurifier without XML declaration


php,xml,htmlpurifier
I am using HTMLPurifier on PHP to clean some dirty HTML, as follows: $H=new HTMLPurifier() $content_text_fixHTML = $H->purify($content_text); Note: Omited encoding set up, because it is UTF-8 But, it will output the XML encoding declaration at the top. <?xml encoding="utf-8" ?> I do not want it. How do I prevent...

access the json encoded object returned by php in jquery


php,jquery,ajax,json
I want to post some data to php function by ajax, then get the encoded json object that the php function will return, then I want to get the information (keys and values) from this object, but I don't know how, here is my code: $.ajax({ url: "functions.php", dataType: "JSON",...

Why i get can not resolve method error in class android?


android,json
I use this tutorial: JSON and part of code to this: protected String doInBackground(String... urls) { TextView lbl=(TextView) findViewById(R.id.provCODE); person = new Person(); person.setName(lbl.getText().toString()); //person.setCountry("1853"); //person.setTwitter("1892"); return POST(urls[0],person); } but in the this line: person.setName(lbl.getText().toString()); i get this error: Can not resolve method. How can i solve that?what happen? my...

XMLPullParser black diamond question marks with certain characters


android,xml,character-encoding,xmlpullparser,questionmark
I'm making an android app, that needs to fetch and parse XML. The class for that was made following the instructions from here http://www.tutorialspoint.com/android/android_rss_reader.htm and the fetcher method looks like this: public void fetchXML() { Thread thread = new Thread(new Runnable() { @Override public void run() { try { URL...

Load XML to list using LINQ [duplicate]


c#,xml,linq
This question already has an answer here: XDocument to List of object 1 answer I have following XML: <?xml version="1.0" encoding="utf-8"?> <start> <Current CurrentID="5"> <GeoLocations> <GeoLocation id="1" x="78492.61" y="-80973.03" z="-4403.297"/> <GeoLocation id="2" x="78323.57" y="-81994.98" z="-4385.707"/> <GeoLocation id="3" x="78250.57" y="-81994.98" z="-4385.707"/> </GeoLocations> <Vendors> <Vendor id = "1" x="123456" y="456789" z="0234324"/>...

XSL transformation outputting multiple times and other confusion


xml,xslt,xpath
I'm attempting to transform a section of an XML document (which is mostly HTML) with a templated piece of markup should a particular pattern be matched. I'm inexperienced with XSLT (I've only used xpath, really) and online documentation is sparse so I'm struggling with it... To the following XML document:...

Create n:m objects using json and sequelize?


javascript,json,node.js,sequelize.js
I am trying to learn sequelize, but am having trouble getting a n:m object created. So far I have my 2 models that create 3 tables (Store, Product, StoreProducts) and the models below: models/Store.js module.exports = (sequelize, DataTypes) => { return Store = sequelize.define('Store', { name: { type: DataTypes.STRING, },...

Replacing elements in an HTML file with JSON objects


javascript,json,replace
I am writing some Javascript to: Read in and Parse a JSON file Read in an HTML file (created as a template file) Replace elements in the HTML file with elements from the JSON file. It is only replacing the first element obj.verb. Can someone please help me figure out...

Error when building an XDocument


c#,xml,linq,xpath,linq-to-xml
Using the following example xml containing one duplicate: <Persons> <Person> <PersonID>7506</PersonID> <Forename>K</Forename> <Surname>Seddon</Surname> <ChosenName /> <MiddleName /> <LegalSurname /> <Gender>Male</Gender> </Person> <Person> <PersonID>6914</PersonID> <Forename>Clark</Forename> <Surname>Kent</Surname> <ChosenName>Clark</ChosenName> <MiddleName />...

do calculation inside JSONArray in Java


java,arrays,json
I have a simple issue but cannot solve it as I am not very good at algorithm! I have an JSONArray in this form: [{"values":[{"time":1434976493,"value":"50"}, {"time":1434976494,"value":"100"}],"counter":"counter1"}, {"values":[{"time":1434976493,"value":"200"}, {"time":1434976494,"value":"300"}],"counter":"counter2"}, {"values":[{"time":1434976493,"value":"400"}, {"time":1434976494,"value":"600"}],"counter":"total"}] What I want to do is to get the integer value for counter 1 and counter 2 and then divide...

Deserializing JSON into c# class


c#,json,couchdb
I have a JSON in the following format (from couchDB view) {"rows":[ {"key":["2015-04-01","524","http://www.sampleurl.com/"],"value":1}, {"key":["2015-04-01","524","http://www.sampleurl2.com/"],"value":2}, {"key":["2015-04-01","524","http://www.sampleurl3.com"],"value":1} ]} I need to create a "service" to get this data from couchDB and insert it on SQL Server (for generating reports..) in a efficient way. My first bet was to bulk insert this json...

JQuery mutiple post in the same time


javascript,php,jquery,json
I have three functions ,and every function post to special php page to get data.. and every function need some time because every php script need some time .. function nb1() { $.post("p1.php", { action: 1 }, function(data) { console.log(data); }, "json") .fail(function(data) { console.log("error"); }); } function nb2() {...

Insert data in collection at Meteor's startup


javascript,json,meteor,data,startup
I would like to insert data at Meteor's startup. (And after from a JSON file) At startup, I create a new account and I would like to insert data and link it to this account once this one created. This is the code that creates the new account at startup:...

About sorting based on the counting of subelements


xml,xslt
i have an xml document with properties that belong to agencies: <agency name="Century 42" num="Century42" mail="[email protected]"/> <property agency="Century42" ....> ... I would like to print the info of all agencies. The agencies should be sorted by the number of properties that they own. I tried this but it does not...

Use JSON file to insert data in database


javascript,json,mongodb,meteor,data
I'm using my JSON file like this to insert data in my collection : var content = JSON.parse(Assets.getText('test.json')); console.log('inserting...'); Profiles.insert({ user: id, data:content }; But I would like to have a "data's tree" like that : [ user: "rtegert23423131", firstname:"test", surname:"test2", // ... ] Not like that : [ user:...

KendoUI Grid - Complex JSON with inconsistent keys


javascript,json,kendo-ui,kendo-grid
I know this is probably been asked before, but having tried to find an answer - I am guessing either I am not comprehending some of the answers right, or I am looking at the problem all wrong. I am using a complex SLC loopback query - and the api...

Parsing Google Custom Search API for Elasticsearch Documents


json,python-2.7,elasticsearch,google-search-api
After retrieving results from the Google Custom Search API and writing it to JSON, I want to parse that JSON to make valid Elasticsearch documents. You can configure a parent - child relationship for nested results. However, this relationship seems to not be inferred by the data structure itself. I've...

Parsing XML array using Jquery


javascript,jquery,xml,jquery-mobile
I have stuck up with an issue of passing XML using Jquery. I am getting empty array while traversing to jquery.Please help me how to get datas from XML array. I have mentioned my code below. XML <?xml version="1.0" encoding="UTF-8"?> <json> <json> <CustomerName>999GIZA MID INSURANCEAND SERVICES PVT LTD</CustomerName> <mobiLastReceiptDate>null</mobiLastReceiptDate> </json>...

type conversion performance optimizable?


c#,xml,csv,optimization,type-conversion
The following snippet converts xml data to csv data in a data processing application. element is a XElement. I'm currently trying to optimize the performance of the application and was wondering if I could somehow combine the two operations going on below: Ultimately I still want access to the string...

Convert contents of an XmlNodeList to a new XmlDocument without looping


c#,xml,xpath,xmldocument,xmlnodelist
I have Xml that I filter using XPath (a query similar to this): XmlNodeList allItems = xDoc.SelectNodes("//Person[not(PersonID = following::Person/PersonID)]"); This filters all duplicates from my original Persons Xml. I want to create a new XmlDocument instance from the XmlNodeList generated above. At the minute, the only way I can see...

Can't save json data to variable (or cache) with angularjs $http.get


json,angularjs,web-services,rest
I have weird angularjs problem. I'm trying to fetch data from Rest Webservice. It works fine, but I can't save json data to object. My code looks like: services.service('customerService', [ '$http', '$cacheFactory', function($http, $cacheFactory) { var cache = $cacheFactory('dataCache'); var result = cache.get('user'); this.getById = function(id){ $http.get(urlList.getCustomer + id).success(function(data, status,...

xpath query seem to be failing


xml,xpath
Can someone tell me the reason for the failure of this xpath query: //Pages/*[(name() = 'Home') or following-sibling::Home] on this xml structure: <Pages> <copyright>me inc. 2015,</copyright> <author>Me</author> <lastUpdate>2/1/1999</lastUpdate> <Home>--------------------</Home> <About>--------------------</About> <Contact>------------------</Contact> </Pages> only returned copyright element, which is against my targets (To grab all element between Pages and...

Check for duplicates in JSON


javascript,jquery,json,duplicates
I have a json object: var object1 = [{ "value1": "1", "value2": "2", "value3": "3", }, { "value1": "1", "value2": "5", "value3": "7", }, { "value1": "6", "value2": "9", "value3": "5", }, { "value1": "6", "value2": "9", "value3": "5", }] Now I want to take each record out of that...

C# XML: System.InvalidOperationException


c#,xml
I have been learning C#'s XML with a project however I keep getting the InvalidOperationException. I have put the code below XmlTextWriter writer = new XmlTextWriter(path, System.Text.Encoding.UTF8); writer.WriteStartDocument(true); writer.Formatting = Formatting.Indented; writer.Indentation = 4; writer.WriteStartElement("User Info"); writer.WriteStartElement("Name"); writer.WriteString(userName); writer.WriteEndElement(); writer.WriteStartElement("Tutor Name"); writer.WriteString(tutorName); writer.WriteEndElement();...

Ruby- get a xml node value


ruby,xml
can someone help me in extracting the node value for the element "Name". Type 1: I am able to extract the "name" value for the below xml by using the below code <Element> <Details> <ID>20367</ID> <Name>Ram</Name> <Name>Sam</Name> </Details> </Element> doc = Nokogiri::XML(response.body) values = doc.xpath('//Name').map{ |node| node.text}.join ',' puts values...

Get XML node value when previous node value conditions are true (without looping)


xml,vb.net,linq-to-xml
Sample XML - <?xml version="1.0"?> <Root> <PhoneType dataType="string"> <Value>CELL</Value> </PhoneType> <PhonePrimaryYn dataType="string"> <Value>Y</Value> </PhonePrimaryYn> <PhoneNumber dataType="string"> <Value>555-555-5554</Value> </PhoneNumber> <PhonePrimaryYn dataType="string"> <Value>Y</Value> </PhonePrimaryYn> <PhoneType dataType="string"> <Value>HOME</Value> </PhoneType>...

Fixed element in android?


android,xml,android-fragments
I am using a FAB(Floating action button) and a ViewPager that has a list inside a fragment. The ViewPager stops due to the FAB block and each are blocks the ViewPager being on top of the FAB activity_main.xml <LinearLayout xmlns:android="http://schemas.android.com/apk/res/android" xmlns:tools="http://schemas.android.com/tools" xmlns:fab="http://schemas.android.com/apk/res-auto" android:layout_width="match_parent" android:layout_height="match_parent" android:orientation="vertical" android:fitsSystemWindows="true">...

Collect strings after a foreach loop


c#,xml,foreach
Is it possible to collect the strings after a foreach loop? For example: StringCollection col = new StringCollection(); XmlNodeList skillNameNodeList=SkillXML.GetElementsByTagName("name"); foreach (XmlNode skillNameNode in skillNameNodeList) { skillsName=skillNameNode.Attributes["value"].Value; } col.Add(skillsName); //Return System.Collections.Specialized.StringCollection I want to collect each skillsName and put them in a collection or a list so that I can...

Deserializing Json data to c# for use in GridView - Error Data source is an invalid type


c#,json,gridview,serialization,.net-4.5
I am relatively new to working with C# (v5.0) and Json data (Newtonsoft.Json v6.0) and am seeking assistance in resolving an error when attempting to populate a .Net 4.5 GridView control. My sample Json data as returned by a web service is: { PersonDetails: [ { Book: "10 ", FirstName:...