
How to remove the [1]s, [[1]]s and double quotes from a csv data in R?

Question:

Tag: regex,r,csv,format,read.csv

I've a CSV file. It contains the output of some previous R operations, so it is filled with the index numbers (such as [1], [[1]]). When it is read into R, it looks like this, for example:

        V1
1                                                                                                           [1] 789
2                                                                                                             [[1]]
3                                                           [1] "PNG"        "D115"    "DX06"    "Slz"
4                                                                                                           [1] 787
5                                                                                                             [[1]]
6                                                                       [1] "D010"           "HC"
7                                                                                                           [1] 949
8                                                                                                             [[1]]
9                                                                       [1] "HC" "DX06"          

(I don't know why there is so much wasted space between the row number and the output data.)

I need the above data to appear as follows (without [1] or [[1]] or " " and with the data placed beside its corresponding number, like):

789 PNG,D115,DX06,Slz
787 D010,HC
949 HC,DX06

(Ideally, 789 and its corresponding data PNG,D115,DX06,Slz should be separated by a tab, and likewise for each row.)

How to achieve this in R?


Answer:

We could create a grouping variable ('indx') and split the 'V1' column by it, after removing the bracketed index at the beginning of each string ([1], [[1]]) as well as the double quotes within it. Assuming that we need the first column as the numeric element and the second column as the non-numeric part, we can use regex to replace each run of whitespace with ',' (as shown in the expected result) and then rbind the list elements.

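 # grouping index: each record starts one line before a '[[1]]' line
 indx <- cumsum(c(grepl('\\[\\[', df1$V1)[-1], FALSE))
 # drop the quotes and the leading '[1]'/'[[1]]', then build one row per record;
 # as.numeric() also discards the space left in front of the number
 do.call(rbind, lapply(split(gsub('"|^.*\\]', '', df1$V1), indx),
         function(x) data.frame(ind = as.numeric(x[1]),
             val = gsub('\\s+', ',', gsub('^\\s+|\\s+$', '', x[-1][x[-1] != ''])))))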

 #   ind               val
 #1  789 PNG,D115,DX06,Slz
 #2  787           D010,HC
 #3  949           HC,DX06
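
To see what each piece of the one-liner does, the steps can be run separately. This is only a sketch on a shortened copy of the data; the names v, starts, cleaned and groups are illustrative and not part of the code above.

```r
# Shortened sample with the same shape as the 'V1' column
v <- c('[1] 789', '[[1]]', '[1] "PNG"        "D115"    "DX06"    "Slz"',
       '[1] 787', '[[1]]', '[1] "D010"           "HC"')

# Step 1: a '[[1]]' line is always the second line of a record, so shifting
# the match one position to the left flags the first line of each record;
# cumsum() then turns the flags into a running record number
starts <- c(grepl('\\[\\[', v)[-1], FALSE)
indx <- cumsum(starts)

# Step 2: remove the double quotes and everything up to the last ']'
cleaned <- gsub('"|^.*\\]', '', v)

# Step 3: one character vector per record (number, empty string, values)
groups <- split(cleaned, indx)
str(groups)
```

Each element of groups is then what the anonymous function receives as x: x[1] holds the number and x[-1] holds the (possibly empty) remaining strings.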

data

 df1 <- structure(list(V1 = c("[1] 789", "[[1]]", 
 "[1] \"PNG\"        \"D115\"    \"DX06\"    \"Slz\"", 
 "[1] 787", "[[1]]", "[1] \"D010\"           \"HC\"", "[1] 949", 
 "[[1]]", "[1] \"HC\" \"DX06\"")), .Names = "V1", 
 class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", 
 "7", "8", "9"))
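
The question also asks for the number and its values to be separated by a tab. One possible way, assuming the result has been stored in a data frame (called res here, a hypothetical name, and rebuilt by hand so the snippet runs on its own), is write.table with sep = "\t":

```r
# Stand-in for the result of the do.call(rbind, ...) step above
res <- data.frame(ind = c(789, 787, 949),
                  val = c("PNG,D115,DX06,Slz", "D010,HC", "HC,DX06"),
                  stringsAsFactors = FALSE)

# Write tab-separated lines without quotes, row names or a header;
# a temporary file is used here, replace it with a real path as needed
out_file <- tempfile(fileext = ".txt")
write.table(res, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Check what was written
readLines(out_file)
```

Setting quote = FALSE keeps the double quotes from reappearing in the output file.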
