How to make xml to csv parsing/conversion faster?

Question:

Tag: c#,xml,parsing,csv

I'm currently using the snippet below to convert XML data (not well-formed) to CSV format, with some processing in between. It only converts those elements in the XML data that contain an integer from the list testList (List<int> testList), and it only writes to the file once that match has been made. I need to use this algorithm on files that are several GB in size. Currently it processes a 1 GB file in ~7.5 minutes. Can someone suggest any changes I could make to improve performance? I've fixed everything I could, but it won't get any faster. Any help will be appreciated!

Note: Message.TryParse is an external parsing method that I have to use and can't exclude or change.

Note: StreamElements is just a customized XmlReader that improves performance.

foreach (var element in StreamElements(p, "XML"))
{
    string joined = string.Concat(element.ToString().Split().Take(3))
        + string.Join(" ", element.ToString().Split().Skip(3));
    List<string> listX = new List<string>();
    listX.Add(joined.ToString());
    Message msg = null;
    if (Message.TryParse(joined.ToString(), out msg))
    {
        var values = element.DescendantNodes().OfType<XText>()
            .Select(v => Regex.Replace(v.Value, "\\s+", " "));

        foreach (var val in values)
        {
            for (int i = 0; i < testList.Count; i++)
            {
                if (val.ToString().Contains("," + testList[i].ToString() + ","))
                {
                    var line = string.Join(",", values);
                    sss.WriteLine(line);
                }
            }
        }
    }
}

Answer:

I'm seeing some things you could probably improve:
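
A few candidates can be read straight off the snippet in the question: element.ToString() is called twice per element, listX is filled but never read, values is a lazy query that gets re-run (regex included) every time string.Join enumerates it, and the "," + testList[i] + "," strings are rebuilt for every text node. The following is only a minimal sketch of the value-matching part with those points addressed, not a drop-in replacement. CsvLineBuilder, BuildNeedles and TryBuildLine are hypothetical names, and it assumes one output line per element is enough (the original writes the line again for every additional match).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper, not part of the original code.
public static class CsvLineBuilder
{
    // A single compiled Regex instance, reused for every text node.
    private static readonly Regex Whitespace = new Regex(@"\s+", RegexOptions.Compiled);

    // Build the ",<n>," search strings once, up front, instead of per text node.
    public static string[] BuildNeedles(IEnumerable<int> testList)
    {
        return testList.Select(n => "," + n + ",").ToArray();
    }

    // Returns the CSV line for the element, or null when no text node matches.
    public static string TryBuildLine(XElement element, string[] needles)
    {
        // Materialize once so the query and the regex do not run again in string.Join.
        List<string> values = element.DescendantNodes()
            .OfType<XText>()
            .Select(t => Whitespace.Replace(t.Value, " "))
            .ToList();

        foreach (string val in values)
        {
            foreach (string needle in needles)
            {
                if (val.Contains(needle))
                {
                    // Assumption: stop at the first match and emit the line once.
                    return string.Join(",", values);
                }
            }
        }
        return null;
    }
}
```

In the loop from the question this would be used roughly as: build needles once before the foreach, then inside the TryParse branch call TryBuildLine(element, needles) and write the result to sss when it is not null.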

But before focusing on stuff like this, you really need to identify what's taking the time in your code. My guess is that it's almost all spent in these two places:

  1. Reading from the XML stream
  2. Writing to sss

If I'm right, then anything else you focus on is going to be premature optimization. Spend some time testing what happens if you comment out various parts of your foreach loop, to see where the time is actually being spent.
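
If commenting parts out is awkward, one alternative is to accumulate elapsed time per phase with System.Diagnostics.Stopwatch. This is a rough sketch under assumptions: PhaseTimer and its members are hypothetical names, and the phase labels are whatever you choose to pass in.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: sums the time spent in each named phase across many calls.
public sealed class PhaseTimer
{
    private readonly Dictionary<string, Stopwatch> _phases = new Dictionary<string, Stopwatch>();

    // Runs the action and adds its elapsed time to the named phase.
    public void Measure(string phase, Action action)
    {
        Stopwatch sw;
        if (!_phases.TryGetValue(phase, out sw))
        {
            sw = new Stopwatch();
            _phases[phase] = sw;
        }

        sw.Start();
        try
        {
            action();
        }
        finally
        {
            sw.Stop(); // Stopwatch keeps accumulating across Start/Stop pairs.
        }
    }

    // Total time recorded for a phase, or zero if it was never measured.
    public TimeSpan Elapsed(string phase)
    {
        Stopwatch sw;
        return _phases.TryGetValue(phase, out sw) ? sw.Elapsed : TimeSpan.Zero;
    }

    // Prints one line per phase with its accumulated time in seconds.
    public void Report()
    {
        foreach (KeyValuePair<string, Stopwatch> pair in _phases)
        {
            Console.WriteLine("{0}: {1:F2} s", pair.Key, pair.Value.Elapsed.TotalSeconds);
        }
    }
}
```

Inside the loop you would wrap, for example, the Message.TryParse call and the sss.WriteLine call in separate Measure calls and call Report() at the end. Time spent reading the XML is harder to wrap this way because it happens inside the enumerator, so running the loop once with an empty body gives a read-only baseline to compare against.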

