Split by a word (case insensitive)

Question:

Tag: python,regex,string,split

If I want to take

"hi, my name is foo bar"

and split it on "foo", with the split being case-insensitive (so it splits on any of "foO", "FOO", "Foo", etc.), what should I do? Keep in mind that although I want the split itself to be case-insensitive, I also DO want to preserve the original case of the rest of the string.

So if I have:

test = "hi, my name is foo bar"

print(test.split('foo'))

print(test.upper().split("FOO"))

I would get

['hi, my name is ', ' bar']
['HI, MY NAME IS ', ' BAR']

respectively.

But what I want is:

['hi, my name is ', ' bar']

every time. The goal is to preserve the case of the original string, except for the word I am splitting on, which should match regardless of case.

So if my test string was:

"hI MY NAME iS FoO bar"

my desired result would be:

['hI MY NAME iS ', ' bar']

Answer:

You can use the re.split function with the re.IGNORECASE flag (or re.I for short):

>>> import re
>>> test = "hI MY NAME iS FoO bar"
>>> re.split("foo", test, flags=re.IGNORECASE)
['hI MY NAME iS ', ' bar']
>>>
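
If the word to split on comes from user input, it may contain characters that are special in a regular expression (for example "." or "+"). A minimal sketch of a helper that handles this is below; the name split_on_word and its parameters are made up for illustration and are not part of the re module. It assumes you want a literal, case-insensitive match, so it escapes the word with re.escape before compiling it. Note that flags is passed by keyword: the third positional argument of re.split is maxsplit, not flags.

```python
import re


def split_on_word(text, word, maxsplit=0):
    """Split text on a literal word, ignoring case (hypothetical helper)."""
    # re.escape makes characters such as "." or "+" match literally.
    pattern = re.compile(re.escape(word), flags=re.IGNORECASE)
    # maxsplit=0 means "no limit", the same default re.split uses.
    return pattern.split(text, maxsplit=maxsplit)


print(split_on_word("hI MY NAME iS FoO bar", "foo"))
# ['hI MY NAME iS ', ' bar']
print(split_on_word("1 C++ 2 c++ 3", "c++"))
# ['1 ', ' 2 ', ' 3']
```

The pieces returned keep their original case because only the pattern match ignores case; the input string itself is never upper- or lower-cased.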
