

How to parse a very large file in F# using FParsec

parsing,f#,bigdata,large-files,fparsec
I'm trying to parse a very large file using FParsec. The file's size is 61GB, which is too big to hold in RAM, so I'd like to generate a sequence of results (i.e. seq<'Result>), rather than a list, if possible. Can this be done with FParsec? (I've come up with...
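The usual fix is to produce results lazily, one record at a time, rather than materializing a list. Below is a minimal Python sketch of that lazy-sequence idea (the FParsec specifics, such as feeding a `CharStream` in chunks, are not shown here):

```python
def parse_records(lines):
    """Yield one parsed record at a time instead of building a full list.

    Mirrors running a record-level parser repeatedly over a stream:
    memory use stays constant regardless of file size.
    """
    for line in lines:
        line = line.strip()
        if line:
            yield tuple(line.split(","))

# An open file handle works the same way as this small demo list:
records = list(parse_records(["a,1", "b,2", ""]))
```

Because the generator is consumed on demand, only the current record ever lives in memory, which is exactly the property a `seq<'Result>` gives in F#.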

Fastest way to read very large text file in C#

c#,wpf,large-files
I have a very basic question. I have several text files with data, each several GB in size. I have a C# WPF application which I'm using to process similar data files, but nowhere close to that size (probably around 200-300 MB right now). How can I efficiently read...
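The standard approach is to stream the file in fixed-size chunks (in C#, `StreamReader` or `FileStream` with a buffer plays this role). A minimal Python sketch of the idea:

```python
import io

def read_in_chunks(fileobj, chunk_size=1 << 20):
    """Yield fixed-size chunks so only one buffer is in memory at a time."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Demo on an in-memory stream; a real file object behaves identically.
chunks = list(read_in_chunks(io.StringIO("abcdef"), chunk_size=4))
```

Tuning the buffer size (1 MB here) trades syscall overhead against memory; the total memory footprint never exceeds one chunk.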

How to extract specific lines from a huge data file?

text-files,extract,large-files,data-files
I have a very large data file, about 32 GB. The file is made up of about 130k lines, each of which mainly contains numbers but also a few characters. The task I need to perform is very clear: I have to extract 20 lines and write them to a new...
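Since only a handful of lines are needed, a single sequential pass that stops as soon as the last wanted line is found avoids reading the remaining gigabytes. A small Python sketch:

```python
def extract_lines(lines, wanted):
    """Single pass over the file; stop early once every wanted line is seen."""
    wanted = set(wanted)
    found = {}
    for number, line in enumerate(lines, start=1):
        if number in wanted:
            found[number] = line
            if len(found) == len(wanted):
                break
    return [found[n] for n in sorted(found)]

picked = extract_lines(["alpha", "beta", "gamma", "delta"], [2, 4])
```

Passing an open file handle instead of the demo list keeps memory use at one line at a time.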

long text file to SAS Dataset

import,sas,large-files
I am trying to load a large text file (report) as a single cell in a SAS dataset, but because of multiple spaces and formatting the data is getting split into multiple cells. Data l1.MD; infile 'E:\Sasfile\f1.txt' truncover; input text $char50. @; run; I have 50 such files to upload, so keeping...

Java CRC (Adler) with large files

java,checksum,large-files
I have the following situation: a directory tree with big files (about 5000 files, each ~4 GB). I need to find duplicates in this tree. I tried the CRC32 and Adler32 classes built into Java, but they are VERY slow (about 3-4 minutes per file). The code was like...
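Two things usually fix this: read the file in large buffered blocks rather than byte by byte (in Java, wrapping the stream in a `BufferedInputStream` or reading into a big `byte[]`), and group files by size first so only same-size candidates get hashed at all. A Python sketch of the incremental, buffered checksum:

```python
import hashlib

def digest_chunks(chunks):
    """Checksum incrementally over buffered chunks; reading in big blocks
    (rather than byte by byte) is what makes large-file hashing fast."""
    h = hashlib.sha1()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()

def file_digest(path, chunk_size=1 << 20):
    with open(path, "rb") as f:
        return digest_chunks(iter(lambda: f.read(chunk_size), b""))

# Two chunkings of the same bytes give the same digest:
same = digest_chunks([b"ab", b"c"]) == digest_chunks([b"abc"])
```

With size-based pre-grouping, most of the 5000 files never need to be read at all.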

Retrieving File Data Stored in Buffer

c++,performance,buffer,ifstream,large-files
I'm new to the forum, but not to this website. I've been searching for weeks on how to process a large data file quickly using C++11. I'm trying to write a function that takes the trace file name, opens the file, and processes the data. The trace...

Converting very large files from xml to csv

c#,xml,csv,converter,large-files
Currently I'm using the following code snippet to convert a .txt file with XML data to .CSV format. My question is this: currently this works perfectly with files that are around 100-200 MB, and the conversion time is very low (1-2 minutes max). However, I now need this to work...
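For files too large to load as one DOM, the fix is a streaming parser that emits each record and then discards it (in C#, `XmlReader` serves this purpose). A Python sketch using `iterparse`, where the record tag and field names are placeholders for whatever the real schema uses:

```python
import csv
import io
import xml.etree.ElementTree as ET

def xml_to_csv(xml_source, csv_file, record_tag, fields):
    """Stream records with iterparse and clear each element after writing,
    so memory stays flat no matter how large the input file is."""
    writer = csv.writer(csv_file)
    writer.writerow(fields)
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            writer.writerow([elem.findtext(field, "") for field in fields])
            elem.clear()

xml = io.BytesIO(b"<root><row><a>1</a><b>2</b></row>"
                 b"<row><a>3</a><b>4</b></row></root>")
out = io.StringIO()
xml_to_csv(xml, out, "row", ["a", "b"])
```

The `elem.clear()` call is the crucial step: without it, the tree accumulates every record and memory grows with file size again.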

Powershell random shuffle/split large text file

powershell,large-files,large-data
Is there a fast implementation in PowerShell to randomly shuffle and split a text file with 15 million rows using a 15%-85% split? Many sources mention how to do it using Get-Content, but Get-Content and Get-Random are slow for large files: Get-Content "largeFile.txt" | Sort-Object {Get-Random} | Out-File "shuffled.txt" I was looking...
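Sorting on a random key per line is what makes the pipeline slow; a single in-place shuffle followed by a slice is much cheaper. A Python sketch of that shape (for files that don't fit in memory, the same idea can be applied to line offsets instead of the lines themselves):

```python
import random

def shuffled_split(lines, fraction, seed=None):
    """Shuffle once, then slice; one in-place shuffle plus a slice is far
    cheaper than sorting the whole file on a per-line random key."""
    rng = random.Random(seed)
    lines = list(lines)
    rng.shuffle(lines)
    cut = int(len(lines) * fraction)
    return lines[:cut], lines[cut:]

small, large = shuffled_split([str(i) for i in range(100)], 0.15, seed=1)
```

Every input line lands in exactly one of the two outputs, so the split is a true partition.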

Problems rendering large html files (links are wrong, images not displaying)

jquery,html,css,image,large-files
I created a script which takes XML input and creates HTML-based reports for ease of viewing. All of the reports are based on the same HTML template. Some reports are small (50KB), some are larger (7MB). I have a problem with the larger pages not rendering images...

Python - Opening and changing large text files

python,replace,out-of-memory,large-files
I have a ~600MB Roblox-type .mesh file, which reads like a text file in any text editor. I have the following code: mesh = open("file.mesh", "r").read() mesh = mesh.replace("[", "{").replace("]", "}").replace("}{", "},{") mesh = "{"+mesh+"}" f = open("p2t.txt", "w") f.write(mesh) It returns: Traceback (most recent call last): File...
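The `read()` call loads all ~600 MB (and the chained `replace` calls copy it several more times), which is what blows the memory limit. A sketch of the same transformation done in streaming chunks; the only subtlety is a `"}{"` pair that straddles a chunk boundary, handled with a one-character carry:

```python
import io

def convert_mesh(src, dst, chunk_size=1 << 20):
    """Stream the replacements chunk by chunk instead of read()-ing 600 MB.

    A one-character carry handles a "}{" pair that straddles a chunk
    boundary, so the output matches the whole-file version exactly.
    """
    dst.write("{")
    carry = ""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        text = (carry + chunk).replace("[", "{").replace("]", "}")
        text = text.replace("}{", "},{")
        carry = text[-1]       # hold back the last char for the next round
        dst.write(text[:-1])
    dst.write(carry)
    dst.write("}")

# Demo with a tiny chunk size to exercise the boundary handling:
src, dst = io.StringIO("[1][2]"), io.StringIO()
convert_mesh(src, dst, chunk_size=3)
```

With real files, `src` and `dst` would be `open("file.mesh")` and `open("p2t.txt", "w")`; memory use stays at one chunk regardless of file size.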

Python: histogram/binning data from 2 arrays

python,histogram,large-files
I have two arrays of data: one is radius values and the other is the corresponding intensity reading at each radius. E.g. a small section of the data, where the first column is radius and the second is the intensities: 29.77036614 0.04464427 29.70281027 0.07771409 29.63523525 0.09424901 29.3639355 1.322793 29.29596385 2.321502 29.22783249...
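A common way to bin one array by the values of another is `numpy.histogram` with the `weights=` argument: one call sums the intensities per radius bin, a second call counts samples per bin, and their ratio gives the per-bin mean. A sketch using a few of the sample values above (the bin count here is arbitrary):

```python
import numpy as np

# A few of the (radius, intensity) pairs from the question.
radius = np.array([29.77, 29.70, 29.64, 29.36, 29.30])
intensity = np.array([0.045, 0.078, 0.094, 1.323, 2.322])

# Sum of intensities per radius bin, plus counts for the per-bin mean.
edges = np.linspace(radius.min(), radius.max(), 3)  # 2 bins
sums, _ = np.histogram(radius, bins=edges, weights=intensity)
counts, _ = np.histogram(radius, bins=edges)
means = sums / np.maximum(counts, 1)  # guard against empty bins
```

Both `histogram` calls are vectorized, so this scales to large arrays without a Python-level loop.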

Transferring large application to Android Wear through Android Studio

android,debugging,testing,android-wear,large-files
I am developing a large application for Android Wear through Android Studio (~200 MB). Trying to test the application on my LG G Watch R through "Debugging over Bluetooth" is taking a lot of time to send the large app to the Watch. Are there any alternatives / faster methods...

PHP read a large file line by line and string replace

php,string,replace,large-files
I'd like to read a large file line by line, perform string replacement, and save the changes to the file; in other words, rewriting one line at a time. Is there any simple solution in PHP/Unix? The easiest way that came to my mind would be to write the lines into a...
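The temp-file pattern hinted at in the question is the standard answer: write rewritten lines to a temporary file in the same directory, then rename it over the original (in PHP the same shape uses `fgets`/`fwrite` and `rename`). A Python sketch:

```python
import os
import tempfile

def replace_in_file(path, old, new):
    """Rewrite line by line through a temp file in the same directory,
    then swap it in atomically, so only one line is in memory at a time
    and the original is never left half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path) as src, tempfile.NamedTemporaryFile(
            "w", dir=directory, delete=False) as tmp:
        for line in src:
            tmp.write(line.replace(old, new))
    os.replace(tmp.name, path)

# Demo on a throwaway file:
demo = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
demo.write("foo bar\nfoo baz\n")
demo.close()
replace_in_file(demo.name, "foo", "qux")
with open(demo.name) as f:
    result = f.read()
os.unlink(demo.name)
```

Keeping the temp file on the same filesystem is what makes the final rename atomic rather than a slow copy.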

re-number observations in a large data frame

r,performance,naming,large-files
I'm working with a data frame of over 500,000 observations, and this is my first time doing code optimization. I have a very simple problem that is just killing me on time, and I was looking for a faster solution. My data frame "d" has a column for the observation...

Is there a mercurial command which can generate a clone without largefiles?

python,mercurial,large-files
Since I believe there is no way to strip largefiles out of a repository, I'm looking for a way to either: (1) clone to (create) a new repo that contains at least all the same files, even without history (export tip revision only), deleting all largefiles if necessary; or (2) achieve similar results...