groupby on sparse matrix with scipy

python,numpy,matrix,scipy,sparse
I build a scipy sparse matrix S with sklearn.preprocessing.OneHotEncoder(). The matrix S has 10^6 rows for 500 columns. I also have a numpy array A with 10^6 values as follows: A = [1,1,2,2,2,3,4,5,6,6,7,8,8,8,...] I want to do a group by on the sparse matrix S following the groups written in...
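
One possible approach, sketched under the assumption that the desired aggregation is a per-group sum (the excerpt is cut off before it says): build a sparse indicator matrix and multiply. The small `S` and `A` below are hypothetical stand-ins for the real data.

```python
import numpy as np
from scipy import sparse

# Hypothetical small stand-ins for S (rows x features) and the group labels A.
S = sparse.csr_matrix(np.array([[1, 0], [0, 1], [1, 1], [0, 0], [2, 0]]))
A = np.array([1, 1, 2, 2, 3])

# Map arbitrary labels to consecutive codes 0..n_groups-1.
groups, codes = np.unique(A, return_inverse=True)

# Indicator matrix G of shape (n_groups, n_rows): G[g, i] = 1 if row i is in group g.
n_rows = S.shape[0]
G = sparse.csr_matrix(
    (np.ones(n_rows, dtype=S.dtype), (codes, np.arange(n_rows))),
    shape=(len(groups), n_rows),
)

# Sparse product: row g of the result is the sum of all rows of S in group g.
grouped = G.dot(S)
print(grouped.toarray())
```

The product stays sparse throughout, so the 10^6-row matrix never has to be densified.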

Problems interpolating and evaluating numpy array at arbitrary points with Scipy

python,scipy,interpolation
I am trying to replicate some of the functionality of Matlab's interp2. I know somewhat similar questions have been asked before, but none apply to my specific case. I have a distance map (available at this Google drive location): https://drive.google.com/open?id=0B6acq_amk5e3X0Q5UG1ya1VhSlE&authuser=0 Values are normalized in the range 0-1. Size is 200...

Python Pandas: rolling_kurt vs. scipy.stats.kurtosis

python,pandas,scipy,kurtosis
I am trying to figure out why the following code returns different values for the sample's kurtosis: import pandas import scipy e = pandas.DataFrame([1, 2, 3, 4, 5, 4, 3, 2, 1]) print "pandas.rolling_kurt:\n", pandas.rolling_kurt(e, window=9) print "\nscipy.stats.kurtosis:", scipy.stats.kurtosis(e) The output I am getting: pandas.rolling_kurt: 0 0 NaN 1 NaN...

Sparse random matrix in Python with different range than [0,1]

python,random,scipy,sparse-matrix
I need to generate a sparse random matrix in Python with all values in the range [-1,1] with uniform distribution. What is the most efficient way to do this? I have a basic sparse random matrix: from scipy import sparse from numpy.random import RandomState p = sparse.rand(10, 10, 0.1, random_state=RandomState(1))...
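
A minimal sketch of one way to do this: the stored values of a sparse matrix live in its `.data` array, so values drawn uniformly from [0, 1) can be rescaled in place to [-1, 1). Only the stored (non-zero) entries are affected; the implicit zeros stay zero.

```python
import numpy as np
from scipy import sparse

# Same construction as in the question: 10x10, density 0.1, values uniform in [0, 1).
p = sparse.rand(10, 10, 0.1, format="csr", random_state=np.random.RandomState(1))

# Affine rescaling of the stored entries only: [0, 1) -> [-1, 1).
p.data = 2.0 * p.data - 1.0

print(p.data.min(), p.data.max())
```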

Finding roots with scipy.optimize.root

python,numpy,scipy
I am trying to find the root y of a function called f using Python. Here is my code: def f(y): w,p1,p2,p3,p4,p5,p6,p7 = y[:8] t1 = w - 0.500371726*(p1**0.92894164) - (-0.998515304)*((1-p1)**1.1376649) t2 = w - 8.095873128*(p2**0.92894164) - (-0.998515304)*((1-p2)**1.1376649) t3 = w - 220.2054377*(p3**0.92894164) - (-0.998515304)*((1-p3)**1.1376649) t4 = w - 12.52760758*(p4**0.92894164)...

Calculating distances between unique Python array regions?

python,arrays,numpy,scipy,distance
I have a raster with a set of unique ID patches/regions which I've converted into a two-dimensional Python numpy array. I would like to calculate pairwise Euclidean distances between all regions to obtain the minimum distance separating the nearest edges of each raster patch. As the array was originally a...

Scipy.optimize.root does not converge in Python while Matlab fsolve works, why?

python,numpy,scipy
I am trying to find the root y of a function called f using Python. Here is my code: def f(y): w,p1,p2,p3,p4,p5,p6 = y[:7] t1 = w - 0.99006633*(p1**0.5) - (-1.010067)*((1-p1)) t2 = w - 22.7235687*(p2**0.5) - (-1.010067)*((1-p2)) t3 = w - 9.71323491*(p3**0.5) - (-1.010067)*((1-p3)) t4 = w - 2.43852877*(p4**0.5)...

matplotlib argrelmax doesn't find all maxes

python,scipy,sampling
I have a project where I'm sampling analog data and attempting to analyze with matplotlib. Currently, my analog data source is a potentiometer hooked up to a microcontroller, but that's not really relevant to the issue. Here's my code arrayFront = RunningMean(array(dataFront), 15) arrayRear = RunningMean(array(dataRear), 15) x = linspace(0,...

Undefined symbols in Scipy and Scikit-learn on RedHat

python,scipy,scikit-learn,atlas
I'm trying to install Scikit-Learn on a 64-bit Red Hat Enterprise 6.6 server on which I don't have root privileges. I've done a fresh installation of Python 2.7.9, Numpy 1.9.2, Scipy 0.15.1, and Scikit-Learn 0.16.1. The Atlas BLAS installation on the server is 3.8.4. I can install scikit-learn, but when...

Python complex coupled ODEs error

python,scipy,ode
At the moment, I am trying to solve a system of coupled ODEs with complex terms. I am using scipy.integrate.ODE, I have successfully solved a previous problem involving a coupled ODE system with only real terms. In that case I used odeint, which is not suitable for the problem I...

SciPy Conjugate Gradient Optimisation not invoking callback method after each iteration

python,optimization,machine-learning,scipy,theano
I followed the tutorial here in order to implement Logistic Regression using theano. The aforementioned tutorial uses SciPy's fmin_cg optimisation procedure. Among the important arguments to the aforementioned function are: f the object/cost function to be minimised, x0 a user supplied initial guess of the parameters, fprime a function which...

What's the most pythonic way to load a matrix in ijv/coo/triplet format?

python,pandas,scipy,scikit-learn
My input file is in ijv/coo/triplet format with string column names, eg: Apple,Google,1 Apple,Banana,5 Microsoft,Orange,2 Should result in this 2x3 matrix: [[1,5,0], [0,0,2]] I can read it manually by putting the column names to dictionaries and create a scipy sparse coo_matrix with that dict mapping to IDs. I would like...
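
A sketch of one way to avoid hand-written dictionaries, assuming the triplets have already been read into a list: `np.unique(..., return_inverse=True)` yields both the label vocabulary and the integer codes. Note that `np.unique` sorts the labels, so the column order here is alphabetical (Banana, Google, Orange) rather than order of first appearance as in the question's expected output.

```python
import numpy as np
from scipy.sparse import coo_matrix

# The three triplets from the question, as (row label, column label, value).
triplets = [("Apple", "Google", 1), ("Apple", "Banana", 5), ("Microsoft", "Orange", 2)]
row_names, col_names, values = zip(*triplets)

# Sorted unique labels plus, for each entry, the index of its label.
row_labels, i = np.unique(row_names, return_inverse=True)
col_labels, j = np.unique(col_names, return_inverse=True)

m = coo_matrix((values, (i, j)), shape=(len(row_labels), len(col_labels)))
print(row_labels, col_labels)
print(m.toarray())
```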

Can't use scipy stats function on nested list

python,numpy,statistics,scipy,nested-lists
I've been trying to scipy.mstats.zscore a dataset that is intentionally organized into a nested list, and it gives: TypeError: unsupported operand type(s) for /: 'list' and 'long' which probably suggests that scipy.stats doesn't work for nested lists. What can I do about it? Does a for loop affect the nature...

a simple command — “np.save”, maybe i misunderstood

python,scipy
Just working through the example for numpy.save -- http://docs.scipy.org/doc/numpy/reference/generated/numpy.save.html Examples from tempfile import TemporaryFile outfile = TemporaryFile() x = np.arange(10) np.save(outfile, x) AFTER this command (highlighted), why can't I find an output file called "outfile" in the current directory? Sorry if this sounds stupid. outfile.seek(0) # Only needed here...

python / numpy - group matrix elements and build dictionary

python,numpy,matrix,scipy
I have two numpy square matrices called M1 and M2 as: M1 = np.matrix('0 1 2 3; 4 5 6 7; 8 9 10 11; 12 13 14 15') M2 = np.matrix('100 200; 300 400') I would like to group 2x2 elements of M1 assigning those elements to the corresponding...

Distances between coordinate pairs in pandas

python,pandas,scipy
What is the best way to find the number of points (rows) that are within a distance of a given point in this pandas dataframe: x y 0 2 9 1 8 7 2 1 10 3 9 2 4 8 4 5 1 1 6 2 3 7 10...

scipy: interpolation, cubic & linear

python,scipy,interpolation
I'm trying to interpolate my set of data (the first column is the time, the third column is the actual data): import numpy as np import matplotlib.pyplot as plt from scipy.interpolate import interp1d data = np.genfromtxt("data.csv", delimiter=" ") x = data[:, 0] y = data[:, 2] xx = np.linspace(x.min(), x.max(), 1000) y_smooth...

Smoothing of graph gives a huge difference in the range

python,python-2.7,matplotlib,scipy
I am trying to plot a smooth curve using the x,y coordinates above. However, the graph I get is outside the range of my data. The snippet of my code is here. import numpy as np import matplotlib.pyplot as plt from scipy.interpolate import spline ylist = [0.36758563074352546, 0.27634194831013914,...

Fitting a sum to data in Python

python,scipy,curve-fitting,data-fitting
Given that the fitting function is of type: I intend to fit such function to the experimental data (x,y=f(x)) that I have. But then I have some doubts: How do I define my fitting function when there's a summation involved? Once the function defined, i.e. def func(..) return ... is...

Difference in x,y parameters for scipy interpolate RectBivariateSpline and interp2d

python,scipy,interpolation
If I want to interpolate the data below: from scipy.interpolate import RectBivariateSpline, interp2d import numpy as np x1 = np.linspace(0,5,10) y1 = np.linspace(0,20,20) xx, yy = np.meshgrid(x1, y1) z = np.sin(xx**2+yy**2) with interp2d this works: f = interp2d(x1, y1, z, kind='cubic') however if I use RectBivariateSpline with the same x1,...
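
A sketch of the likely shape issue, assuming the rest of the (truncated) question is about a dimension mismatch: `np.meshgrid` with default indexing returns arrays of shape `(len(y1), len(x1))`, while `RectBivariateSpline(x, y, z)` expects `z` with shape `(len(x), len(y))`, so the transposed array is passed here.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

x1 = np.linspace(0, 5, 10)
y1 = np.linspace(0, 20, 20)
xx, yy = np.meshgrid(x1, y1)   # both have shape (len(y1), len(x1)) == (20, 10)
z = np.sin(xx**2 + yy**2)      # shape (20, 10)

# RectBivariateSpline wants z[i, j] = value at (x[i], y[j]), i.e. shape (10, 20).
spline = RectBivariateSpline(x1, y1, z.T)

# With the default smoothing s=0 the spline interpolates the grid values.
val = spline(x1[3], y1[7])[0, 0]
print(val, z[7, 3])
```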

Fast way to find the mean of a function?

python,scipy
I'm writing code to evaluate the mean of functions it is passed, but where the functional form is not known beforehand. I have code below that does work, using scipy.integrate.quad, but it is rather slow. I was wondering does anybody know of a faster way? import numpy as np from...

Python: Reading Fortran Binary file using numpy or scipy

python,numpy,scipy
I am trying to read a fortran file with headers as integers and then the actual data as 32 bit floats. Using numpy's fromfile('mydatafile', dtype=np.float32) it reads in the whole file as float32 but I need the headers to be in int32 for my output file. Using scipy's FortranFile it...

Normalizing matrix row scipy matrix

python-2.7,scipy
I wish to normalize each row of a sparse scipy matrix, obtained from a networkx directed graph. import networkx as nx import numpy as np G=nx.random_geometric_graph(10,0.3) M=nx.to_scipy_sparse_matrix(G, nodelist=G.nodes()) from __future__ import division print(M[3]) (0, 1) 1 (0, 5) 1 print(M[3].multiply(1/M[3].sum())) (0, 1) 0.5 (0, 5) 0.5 this is ok, I...
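
A minimal sketch of normalizing all rows at once, using a small hypothetical matrix in place of the networkx adjacency matrix: left-multiplying by a diagonal matrix of inverse row sums scales each row, and rows that sum to zero are left untouched.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for the adjacency matrix M (one all-zero row on purpose).
M = sparse.csr_matrix(np.array([[0, 1, 1], [0, 0, 0], [2, 0, 2]], dtype=float))

# Row sums as a flat array.
row_sums = np.asarray(M.sum(axis=1)).ravel()

# Inverse row sums, leaving 0 where a row is empty to avoid division by zero.
inv = np.zeros_like(row_sums)
nonzero = row_sums != 0
inv[nonzero] = 1.0 / row_sums[nonzero]

# diag(inv) * M scales row i by inv[i]; the result stays sparse.
M_norm = sparse.diags(inv).dot(M)
print(M_norm.toarray())
```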

What are the loc and scale parameters in scipy.stats.maxwell?

python,numpy,scipy,distribution
The maxwell-boltzmann distribution is given by . The scipy.stats.maxwell distribution uses loc and scale parameters to define this distribution. How are the parameters in the two definitions connected? I also would appreciate if someone could tell in general how to determine the relation between parameters in scipy.stats and their usual...

Accessing elements in coo_matrix

python,scipy
This is a very simple question. For SciPy sparse matrices like coo_matrix, how does one access individual elements? To give an analogy to Eigen linear algebra library. One can access element (i,j) using coeffRef as follows: myMatrix.coeffRef(i,j) ...
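
As a rough illustration: the COO format stores plain (row, column, value) triplets and does not support `m[i, j]` indexing, so a common route is to convert to CSR (or DOK) first. The matrix below is a made-up example.

```python
from scipy.sparse import coo_matrix

# Made-up 2x3 matrix with two stored entries: (0, 1) -> 1.0 and (1, 2) -> 2.0.
m = coo_matrix(([1.0, 2.0], ([0, 1], [1, 2])), shape=(2, 3))

# CSR supports element access; entries that are not stored read as 0.
csr = m.tocsr()
print(csr[0, 1], csr[1, 2], csr[0, 0])
```

For many scattered reads or writes, `m.todok()` offers dictionary-like access instead.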

Logarithm matrix python

python,numpy,scipy,bioinformatics
I am trying to compute the logarithm of a matrix, which is given by logm (scipy.linalg). I wrote this code in Python: from scipy.linalg import logm, expm from Bio import SeqIO import numpy as np from numpy.linalg import svd from numpy import eye np.set_printoptions(linewidth=10000) my_file = open("matrice/mammifere_muscle.list.imv") #read two lines of...

how to calculate the norm of a vector in a large mxnx3 array?

python,arrays,scipy,vectorization,linear-algebra
Suppose I have an array of the shape (m,n,3), where m and n refer to the y and x coordinates of a point, and the 3 numbers in each point refer to a three-dimensional vector. (A similar situation is an image with height m and width n, and 3 refers...
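
A short sketch of a vectorized way to get one norm per point: `np.linalg.norm` accepts an `axis` argument, so the reduction runs over the last (length-3) axis only. The array contents are made up.

```python
import numpy as np

# Made-up (m, n, 3) field: all zeros except one vector (3, 4, 0) at point [1, 2].
A = np.zeros((2, 3, 3))
A[1, 2] = [3.0, 4.0, 0.0]

# Euclidean norm along the last axis gives an (m, n) array of magnitudes.
norms = np.linalg.norm(A, axis=2)
print(norms.shape, norms[1, 2])
```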

Is it possible to use an Anaconda Python 3 environment together with Pycharm?

python-3.x,scipy,pycharm,anaconda
My basic problem is that I want to install scipy on a Windows machine for Python 3 and use Pycharm as my development environment. The suggestion from the Scipy Documentation as well as several StackOverflow posts (Installing NumPy and SciPy on 64-bit Windows (with Pip), Trouble installing SciPy on windows,...

python: finding the value of a random variable for a cdf

python,statistics,scipy,normal-distribution,cdf
I apologize in advance if this is poorly worded. If I have a stdDev = 1, mean = 0, scipy.stats.cdf(-1, loc = 0, scale = 1) will give me the probability that a normally distributed random variable will be <= -1, and that is 0.15865525393145707. Given 0.15865..., how do I...
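
For illustration, the inverse of the CDF is exposed in `scipy.stats` as the percent point function `ppf`; the sketch below uses `scipy.stats.norm` for the normal distribution described in the question.

```python
from scipy.stats import norm

# Forward direction: P(X <= -1) for a standard normal variable.
p = norm.cdf(-1, loc=0, scale=1)

# Inverse direction: the value x such that P(X <= x) == p.
x = norm.ppf(p, loc=0, scale=1)
print(p, x)
```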

what is the best method to extract highly correlated variables within the given threshold

python,numpy,pandas,scipy
I have one data frame, and the pairwise correlations were calculated: >>> df1 = pd.read_csv("/home/zebrafish/Desktop/stack.csv") >>> df1.corr() GA PN PC MBP GR AP GA 1.000000 0.070541 0.259937 -0.452661 0.115722 0.268014 PN 0.070541 1.000000 0.512536 0.447831 -0.042238 0.263601 PC 0.259937 0.512536 1.000000 0.331354 -0.254312 0.958877 MBP -0.452661 0.447831 0.331354 1.000000 -0.467683 0.229870...

doing algebra with an MxNx3 array using vectorization in python?

arrays,vector,scipy,vectorization,linear-algebra
Suppose I have an MxNx3 array A, where the first two indexes refer to the coordinates of a point, and the last index (the number '3') refers to the three components of a vector. e.g. A[4,7,:] = [1,2,3] means that the vector at point (7,4) is (1,2,3). Now I need to...

applying sobel filter on image

python,scipy,edge-detection,sobel
I am trying to use sobel filter on an image of a wall but it doesn't work. My code is : im=scipy.misc.imread('IMG_1479bis.JPG') im = im.astype('int32') dx=ndimage.sobel(im,1) dy=ndimage.sobel(im,0) mag=np.hypot(dx,dy) mag*=255.0/np.max(mag) cv2.imshow('sobel.jpg', mag) I really don't understand where is my mistake. Any help would be appreciated ! Thanks in advance !...

Python p value from t-statistic giving nan

python,statistics,scipy,p-value
I have some t-values and degrees of freedom and want to find the p-values from them (it's two-tailed). In the real world I would use a t-test table in the back of a Statistics textbook; however, I am using stdtr or stats.t.sf function in python. Both of them work fine...

Using a subset of Pandas dataframe with Scipy Kmeans?

python,pandas,scipy
I have a data frame that I import using df = pd.read_csv('my.csv',sep=','). In that CSV file, the first row is the column name, and the first column is the observation name. I know how to select a subset of the Panda dataframe, using: df.iloc[:,1::] which gives me only the numeric...

Python 2D Gaussian Fit with NaN Values in Data

python,numpy,scipy,gaussian
I'm very new to Python but I'm trying to produce a 2D Gaussian fit for some data. Specifically, stellar fluxes linked to certain positions in a coordinate system/grid. However not all of the positions in my grid have corresponding flux values. I don't really want to set these values to...

Interpolation with Delaunay Triangulation (n-dim)

python,scipy,interpolation,delaunay
I would like to use Delaunay Triangulation in Python to interpolate the points in 3D. What I have is # my array of points points = [[1,2,3], [2,3,4], ...] # my array of values values = [7, 8, ...] # an object with triangulation tri = Delaunay(points) # a set...

How to use meshgrid with large arrays in Matplotlib?

python,arrays,matplotlib,scipy,scikit-learn
I have trained a machine learning binary classifier on a 100x85 array in sklearn. I would like to be able to vary 2 of the features in the array, say column 0 and column 1, and generate contour or surface graph, showing how the predicted probability of falling in one...

How can I make my plot smoother in Python?

python,numpy,plot,scipy,smooth
I have a function called calculate_cost which calculates the performance of supplier for different S_range (stocking level). The function works but the plots are not smooth, is there a way to smooth it in Python? import numpy import scipy.stats import scipy.integrate import scipy.misc import matplotlib import math import pylab from...

Scipy ValueError: Total size of new array must be unchanged

python,numpy,scipy
I am currently using Scipy 0.7.2 with Numpy 1.4.1. My Python version is 2.6.6. I have written a simple code to read a coo sparse matrix from a .mtx file as follows: data = scipy.io.mmread('matrix.mtx') On running the code, I got the following error: Traceback (most recent call last): File...

“failed with error code 1” while installing scipy

python,scipy
I have Python 2.7.9 on windows 7 64-bits. I'm trying to install scipy using pip. I used pip install scipy but I get the following error : Command "C:\Python27\python.exe -c "import setuptools, tokenize;__file__='c:\\us ers\\admin\\appdata\\local\\temp\\pip-build-xpl5cw\\scipy\\setup.py';exec(compil e(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file __, 'exec'))" install --record c:\users\admin\appdata\local\temp\pip-b68pfc-reco rd\install-record.txt --single-version-externally-managed --compile" failed with error...

Outlier detection using recursive curve fitting and error elimination

python-2.7,scipy,curve-fitting,outliers,best-fit-curve
Is there any way to do anomaly detection in dataset using recursive curve fitting and removing points having the most mean square error with respect to the curve, upto an acceptable threshold? I am using the scipy.optimize.curve_fit function for python 2.7, and I need to work with python preferably. ...

Solving nonlinear differential first order equations using Python

python,math,numpy,matplotlib,scipy
I would like to solve a nonlinear first order differential equation using Python. For instance, df/dt = f**4 I wrote the following program, but I have an issue with matplotlib, so I don't know if the method I used with scipy is correct. from scipy.integrate import odeint import numpy as...

Finding 2 largest eigenvalues of large-sparse matrix in Python [closed]

python,numpy,scipy,sparse-matrix,eigenvalue
I want to find the 1st and 2nd largest eigenvalues of a big, sparse and symmetric matrix (in python). scipy.sparse.linalg.eigsh with k=2 gives the second largest eigenvalue with respect to the absolute value - so it's not a good solution. In addition, I can't use numpy methods because my...
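
A small sketch of one option: `eigsh` takes a `which` argument, and `which="LA"` requests the largest algebraic eigenvalues rather than the largest in magnitude (the default `"LM"`). The diagonal test matrix is made up so that the largest-magnitude eigenvalue (-10) is not among the two largest algebraic ones (8 and 9).

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

# Made-up symmetric sparse matrix with known eigenvalues on its diagonal.
d = np.array([-10.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
A = diags(d).tocsr()

# "LA" = largest algebraic; the default "LM" would pick -10 because |-10| is largest.
vals = np.sort(eigsh(A, k=2, which="LA", return_eigenvectors=False))
print(vals)
```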

Multiply high order matrices with numpy

python,numpy,matrix,scipy
I created this toy problem that reflects my much bigger problem: import numpy as np ind = np.ones((3,2,4)) # shape=(3L, 2L, 4L) dist = np.array([[0.1,0.3],[1,2],[0,1]]) # shape=(3L, 2L) ans = np.array([np.dot(dist[i],ind[i]) for i in xrange(dist.shape[0])]) # shape=(3L, 4L) print ans """ prints: [[ 0.4 0.4 0.4 0.4] [ 3. 3....

Will numpy.roots() ever return n different floats when a polynomial only has

numpy,scipy
I think the title says it all, but just to be specific, say I have some list of numbers named "coeffs". Assuming the polynomial with said coefficients has exactly k unique roots, will the following code ever set number_of_unique_roots to be a number greater than k? import numpy as np...

Check if two scipy.sparse.csr_matrix are equal

python,numpy,scipy
I want to check if two csr_matrix are equal. If I do: x.__eq__(y) I get: raise ValueError("The truth value of an array with more than one " ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). This, However, works well: assert...
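
One way to sketch such a check: comparing two sparse matrices with `!=` yields a sparse boolean matrix of the positions that differ, so equality means the shapes match and that matrix has no stored entries. The helper name `sparse_equal` is hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix

def sparse_equal(x, y):
    """Return True if two sparse matrices have the same shape and entries."""
    # (x != y) is sparse; nnz counts the positions where the two differ.
    return x.shape == y.shape and (x != y).nnz == 0

a = csr_matrix(np.array([[1, 0], [0, 2]]))
b = csr_matrix(np.array([[1, 0], [0, 2]]))
c = csr_matrix(np.array([[1, 0], [3, 2]]))
print(sparse_equal(a, b), sparse_equal(a, c))
```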

scipy fftconvolve claims input parameters don't have same dimensionality. What am I parsing?

python,scipy
I'm trying to create a class which uses fftconvolve from scipy.signal to convolve some data with a gaussian inside a method of the class instance. However, every time I create an instance and call the method enlarge_smooth (which happens upon right arrow key press), I get an error from fftconvolve stating:...

How to fit datasets so that they share some (but not all) parameter values

python,numpy,scipy,curve-fitting,data-fitting
Say I want to fit two arrays x_data_one and y_data_one with an exponential function. In order to do that I might use the following code (in which x_data_one and y_data_one are given dummy definitions): import numpy as np from scipy.optimize import curve_fit def power_law(x, a, b, c): return a *...

List of Elements to Boolean Array

python,arrays,numpy,data-structures,scipy
Say my list is the following: ['cat','elephant'] How can I efficiently convert my list into an array of boolean elements, where each index represents whether a given animal (of 10^n animals) is present in my list? That is, if cat is present index x is true and if elephant is...
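
A minimal sketch, assuming the full set of animals is available as an array (the four-element `vocabulary` below is a made-up stand-in for the 10^n animals): `np.isin` tests each vocabulary entry for membership in the list.

```python
import numpy as np

# Made-up vocabulary; index x in the result corresponds to vocabulary[x].
vocabulary = np.array(["bird", "cat", "dog", "elephant"])
my_list = ["cat", "elephant"]

# Element-wise membership test, returning a boolean array aligned with vocabulary.
present = np.isin(vocabulary, my_list)
print(present)
```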

scipy linalg deterministic/non-deterministic code

python,random,scipy
I'm running this SVD solver from scipy with the below code: import numpy as np from scipy.sparse.linalg import svds features = np.arange(9,dtype=np.float64).reshape((3,3)) for i in range(10): _,_,V = svds(features,2) print i,np.mean(V) I expected the printed mean value to be the same each time, however it changes and seems to cycle...

Python and conflicting module names

python,scipy,ubuntu-14.04
It seems that if a file is called io.py and it imports scipy.ndimage, the latter somehow ends up failing to find its own submodule, also called io: $ echo "import scipy.ndimage" > io.py $ python io.py Traceback (most recent call last): File "io.py", line 1, in <module> import scipy.ndimage File...

How do I plug distance data into scipy's agglomerative clustering methods?

numpy,machine-learning,scipy,hierarchical-clustering
So, I have a set of texts I'd like to do some clustering analysis on. I've taken a Normalized Compression Distance between every text, and now I have basically built a complete graph with weighted edges that looks something like this: text1, text2, 0.539 text2, text3, 0.675 I'm having tremendous...

Python: using X and Y values to draw a picture

python,numpy,module,scipy
I have a series of methods that take an image 89x22 pixels (although the size, theoretically, is irrelevant) and fits a curve to each row of pixels to find the location of the most significant signal. At the end, I have a list of Y-values, one for each row of...

Integration not successful in Python QuTiP

python,numpy,scipy,integrate
I have been trying to use QuTiP to solve a quantum mechanics matrix differential equation (a Lindblad equation). Here is the code: from qutip import * from matplotlib import * import numpy as np hamiltonian = np.array([[215, -104.1, 5.1, -4.3 ,4.7,-15.1 ,-7.8 ], [-104.1, 220.0, 32.6 ,7.1, 5.4, 8.3, 0.8],...

Redraw plot in same window with scipy / voronoi_plot_2d

python,numpy,matplotlib,scipy,voronoi
I'm trying to make a Voronoi plot update in real time as the generating points change position. My problem is how to reuse the same figure, since currently I get a new window each time I call voronoi_plot_2d. See code: #!/usr/bin/env python import numpy as np import time from scipy.spatial...

How to include all points into error-less triangulation mesh with scipy.spatial.Delaunay?

python,scipy,delaunay
I am testing scipy.spatial.Delaunay and not able to solve two issues: the mesh has errors the mesh doesn't include all points Code and image of plot: import numpy as np from scipy.spatial import Delaunay,delaunay_plot_2d import matplotlib.pyplot as plt #input_xyz.txt contains 1000 pts in "X Y Z" (float numbers) format points...

Is it possible to enforce edges (constrained delaunay triangulation) in scipy.spatial's Delaunay?

python,scipy,triangulation,delaunay
I am experimenting with scipy.spatial's implementation of Qhull's Delaunay triangulation. Is it possible to generate the triangulation in a manner that preserves the edges defined by the input vertices? (EDIT: i.e. a constrained Delaunay triangulation.) As can be done with the triangle package for Python. For example, in the picture...

How to solve nonlinear equation with Python with three unknowns and hundreds of solutions?

python,numpy,scipy,nonlinear-optimization
I am trying to use python to find the values of three unknowns (x,y,z) in a nonlinear equation of the type: g(x) * h(y) * k(z) = F where F is a vector with hundreds of values. I successfully used scipy.optimize.minimize where F only had 3 values, but that failed...

Double integral with variable boundaries in python Scipy + sympy (?)

python,scipy,sympy,integral
The full mathematical problem is here. Briefly I want to integrate a function with a double integral. The inner integral has boundaries 20 and x-2, while the outer has boundaries 22 and 30. I know that with Scipy I can compute the double integral with scipy.integrate.nquad. I would like to...

scipy.integrate.quad gives wrong result on large ranges

python,python-3.x,scipy,integration,quad
I am trying to integrate over the sum of two 'half' normal distributions. scipy.integrate.quad works fine when I try to integrate over a small range but returns 0 when I do it for large ranges. Here's the code: mu1 = 0 mu2 = 0 std1 = 1 std2 = 1...

Numpy and dot products of multiple vector pairs: how can it be done?

python,numpy,matrix,scipy
I want to get dot product of N vector pairs (a_vec[i, :], b_vec[i, :]). a_vec has shape [N, 3], bvec has the same shape (N 3D vectors). I know that it can be easily done in cycle via numpy.dot function. But cannot it be done somehow simpler and faster?...
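
A sketch of two loop-free equivalents for row-wise dot products, using made-up data: an element-wise product summed along axis 1, or `np.einsum` with the subscript string `"ij,ij->i"`.

```python
import numpy as np

# Made-up data: N = 2 pairs of 3D vectors.
a_vec = np.array([[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]])
b_vec = np.array([[4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])

# Option 1: multiply element-wise, then sum over the vector components.
dots_sum = (a_vec * b_vec).sum(axis=1)

# Option 2: einsum multiplies matching (i, j) entries and sums over j.
dots_einsum = np.einsum("ij,ij->i", a_vec, b_vec)
print(dots_sum, dots_einsum)
```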

How configure Stanford QNMinimizer to get similar results as scipy.optimize.minimize L-BFGS-B

java,optimization,machine-learning,scipy,stanford-nlp
I want to configure the QN-Minimizer from the Stanford Core NLP library to get optimization results nearly similar to the scipy optimize L-BFGS-B implementation, or get a standard L-BFGS configuration that is suitable for most things. I set the standard parameters as follows: The python example I want to copy: scipy.optimize.minimize(neuralNetworkCost,...

Effectively change dimension of scipy.sparse.csr_matrix [duplicate]

python,python-2.7,numpy,scipy,sparse
I have a function that takes a csr_matrix and does some calculations on it. The behavior of these calculation requires the shape of this matrix to be specific (say NxM). The input...

Linear programming with scipy.optimize.linprog

python,numpy,scipy
I've just checked a simple linear programming problem with scipy.optimize.linprog: 1*x[1] + 2*x[2] -> max 1*x[1] + 0*x[2] <= 5 0*x[1] + 1*x[2] <= 5 1*x[1] + 0*x[2] >= 1 0*x[1] + 1*x[2] >= 1 1*x[1] + 1*x[2] <= 6 And got a very strange result, I expected that x[1]...
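
A sketch of how this problem could be posed, under the assumption that the surprise comes from `linprog` always minimizing: to maximize, the objective coefficients are negated and the sign of the reported optimum is flipped back. The single-variable constraints are expressed here as `bounds`, which is a modelling choice of this sketch and not something taken from the question.

```python
from scipy.optimize import linprog

# linprog minimizes c @ x, so maximizing x1 + 2*x2 means minimizing -x1 - 2*x2.
c = [-1.0, -2.0]

# x1 + x2 <= 6 is the only constraint coupling both variables.
A_ub = [[1.0, 1.0]]
b_ub = [6.0]

# 1 <= x1 <= 5 and 1 <= x2 <= 5 expressed as variable bounds.
bounds = [(1.0, 5.0), (1.0, 5.0)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x, -res.fun)  # flip the sign to recover the maximized objective
```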

Solve broadcasting error without for loop, speed up code

python,for-loop,numpy,matrix,scipy
I may be misunderstanding how broadcasting works in Python, but I am still running into errors. scipy offers a number of "special functions" which take in two arguments, in particular the eval_XX(n, x[,out]) functions. See http://docs.scipy.org/doc/scipy/reference/special.html My program uses many orthogonal polynomials, so I must evaluate these polynomials at distinct...

Scipy's correlate function is slow

python,numpy,scipy
I have compared the different methods for convolving/correlating two signals using numpy/scipy. It turns out that there are huge differences in speed. I compared the following methods: correlate from the numpy package (np.correlate in plot) correlate from the scipy.signal package (sps.correlate in plot) fftconvolve from scipy.signal (sps.fftconvolve in plot) Now...

How to solve this convex optimization?

scipy,convex-optimization,convex,quadratic-programming
It is simple, I know, but I have little understanding of convex optimization yet. Problem definition: the objective function is ||b - Aw||_2 (the 2-norm), with a vector of unknowns [w1, w2, ..., wn] and a data matrix A (m x n), where each row has n components ([ai1, ai2, ..., ain]), m...

Python scipy fsolve “mismatch between the input and output shape of the 'func' argument”

python,scipy
This is my first time posting on stackoverflow, so if I don't use the correct stackoverflow etiquette, I'm sorry. Before I go into my problem, I've searched the relevant threads on stackoverflow with the same problem: input/output error in scipy.optimize.fsolve Python fsolve() complains about shape. Why? fsolve - mismatch between...

how to group by mode in python?

python,pandas,scipy
I am trying to find which category each item belongs to, based on the mode, using the pandas data frame below: data ITEM CATEGORY 1 red saree actual 2 red saree actual 3 glass lbh 4 glass lbh 5 red saree actual 6 red saree lbh 7 glass actual 8 bottle...

Can someone explain mdict in python (scipy.io), for example in scipy.io.savemat()?

python,matlab,io,scipy
I have been working on loading some files in python, and then once the files are loaded I want to export them into a .mat file and do the rest of the processing in MATLAB. I understand that I can do this with: import scipy.io as sio # load some...
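
As a rough illustration of `mdict`: it is an ordinary dictionary whose keys become the MATLAB variable names and whose values are the arrays to store. The variable names `signal` and `rate` and the temporary file path below are made up for this sketch.

```python
import os
import tempfile

import numpy as np
import scipy.io as sio

# Write to a throwaway directory; any writable path ending in .mat would do.
path = os.path.join(tempfile.mkdtemp(), "example.mat")

# Keys are the variable names MATLAB will see after load('example.mat').
mdict = {"signal": np.arange(5), "rate": 44100}
sio.savemat(path, mdict)

# Reading it back shows what was stored (MATLAB-style 2-D arrays).
loaded = sio.loadmat(path)
print(loaded["signal"], loaded["rate"])
```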

efficiently calculating double integral

python,scipy,montecarlo
Good day. I am calculating the following integral using scipy: from scipy.stats import norm def integrand(y, x): # print "y: %s x: %s" % (y,x) return (du(y)*measurment_outcome_belief(x, 3)(y))*fv_belief(item.mean, item.var)(x) return dblquad( integrand, norm.ppf(0.001, item.mean, item.var), norm.ppf(0.999, item.mean, item.var), lambda x: norm.ppf(0.001, x, 3), lambda x: norm.ppf(0.999, x, 3))[0] I have belief state...

how to get derivatives from 1D interpolation

python,scipy
Is there a way to get scipy's interp1d (in linear mode) to return the derivative at each interpolated point? I could certainly write my own 1D interpolation routine that does, but presumably scipy's is internally in C and therefore faster, and speed is already a major issue. I am ultimately...

print surface fit equation in python

python,numpy,matplotlib,scipy,least-squares
I'm trying to fit a surface model to a 3D data-set (x,y,z) using matplotlib. Where z = f(x,y). So, I'm going for the quadratic fitting with equation: f(x,y) = ax^2+by^2+cxy+dx+ey+f So far, I have successfully plotted the 3d-fitted-surface using least-square method using: # best-fit quadratic curve A = np.c_[np.ones(data.shape[0]), data[:,:2],...

numpy how find local minimum in neighborhood on 1darray

python,numpy,scipy
I've got a list of sorted samples. They're sorted by their sample time, where each sample is taken one second after the previous one. I'd like to find the minimum value in a neighborhood of a specified size. For example, given a neighborhood size of 2 and the following sample...

Constrained Optimization with Scipy for a nonlinear function

python,numpy,scipy
I am trying to maximize x^(0.5)y^(0.5) st. x+y=10 using scipy. I can't figure out which method to use. I would really appreciate it if someone could guide me on this. ...
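
A sketch of one possible setup: `scipy.optimize.minimize` with `method="SLSQP"` accepts equality constraints, and maximizing is done by minimizing the negated objective. The starting point and the small positive lower bounds (to keep the square root well defined) are choices made for this sketch.

```python
import numpy as np
from scipy.optimize import minimize

def neg_objective(v):
    """Negative of x^0.5 * y^0.5, so that minimizing it maximizes the original."""
    x, y = v
    return -np.sqrt(x * y)

# Equality constraint x + y = 10, written as fun(v) == 0.
constraint = {"type": "eq", "fun": lambda v: v[0] + v[1] - 10.0}

# Keep both variables strictly positive so the square root stays real.
bounds = [(1e-9, 10.0), (1e-9, 10.0)]

res = minimize(neg_objective, x0=[2.0, 8.0], method="SLSQP",
               bounds=bounds, constraints=[constraint])
print(res.x, -res.fun)
```

By symmetry the analytic optimum is x = y = 5 with objective value 5, which gives a quick sanity check on the numerical result.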

How do I find and remove white specks from an image using SciPy/NumPy?

python,image-processing,numpy,scipy
I have a series of images which serve as my raw data which I am trying to prepare for publication. These images have a series of white specks randomly throughout which I would like to replace with the average of some surrounding pixels. I cannot post images, but the following...

How does scipy.stats handle NaNs?

python,numpy,statistics,scipy,missing-data
I am trying to do some statistics in Python. I have data with several missing values, filled with np.nan, and I am not sure whether I should remove them manually or whether scipy can handle them. So I tried both: import scipy.stats, numpy as np a = [0.75, np.nan, 0.58337, 0.75, 0.75,...

stdtr in python giving nan for p-value while doing t-test

python,statistics,scipy,p-value
I am using the following code to perform t-test: def t_stat(na,abar,avar,nb,bbar,bvar): logger.info("T-test to be performed") logger.info("Set A count = %f mean = %f variance = %f" % (na,abar,avar)) logger.info("Set B count = %f mean = %f variance = %f" % (nb,bbar,bvar)) adof = na - 1 bdof = nb -...

Comparing datasets to nonstandard probability distributions in Python

python,statistics,scipy,probability
I have a few large sets of data which I have used to create non-standard probability distributions (using numpy.histogram to bin the data, and scipy.interpolate's interp1d function to interpolate the resulting curves). I have also created a function which can sample from these custom PDFs using the scipy.stats package. My...

Reshaping after Interpolation

python,numpy,grid,scipy,reshape
After interpolating data to a target grid I am not able to reshape my data to match the original shape. The original shape of my data is 900x900 being rows x columns. After the interpolation I have a 1-D array of interpolated values in the new size of the...

Specifying greater than inequality in scipy

python,scipy,linear-programming
I've solved a simple LP problem where all constraints are "less than or equal to". I used scipy.optimize.linprog for those. The problem is when one or more of the constraint equations is "greater than or equal to". How do I specify that? I need to use the two-phase approach provided...
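
A minimal sketch of the usual trick: `linprog` only takes `A_ub @ x <= b_ub`, so a constraint `a @ x >= b` is passed as `-a @ x <= -b`. The tiny problem below (minimize x0 + x1 subject to x0 + 2*x1 >= 4, with the default non-negative variables) is made up for illustration.

```python
from scipy.optimize import linprog

# Objective: minimize x0 + x1.
c = [1.0, 1.0]

# x0 + 2*x1 >= 4, multiplied by -1 to fit the "<=" form: -x0 - 2*x1 <= -4.
A_ub = [[-1.0, -2.0]]
b_ub = [-4.0]

# Default bounds are (0, None) for every variable.
res = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(res.x, res.fun)
```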

Fisher's Exact in scipy as new column using pandas

pandas,scipy,ipython-notebook
Using ipython notebook, a pandas dataframe has 4 columns: numerator1, numerator2, denominator1 and denominator2. Without iterating through each record, I am trying to create a fifth column titled FishersExact. I would like the value of the column to store the tuple returned by scipy.stats.fisher_exact using values (or some derivation of...

Race condition with scipy.weave.inline

python,parallel-processing,scipy,race-condition
Recently I've begun to receive SyntaxErrors when running parallel neural-network simulations using brian2. These are being raised by calls to scipy.weave.inline when it tries to evaluate lines of code in a cache file. The full description of the problem and my guess at its cause are here. And here is...

Scipy sparse matrix from list of list with integers

python,scipy,sparse-matrix
How to make a scipy sparse matrix from a list of lists with integers (or strings)? [[1,2,3], [1], [1,4,5]] Should become: [[1, 1, 1, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 1, 1]] But then in scipy's compressed sparse format?...
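A possible sketch for the integer case, assuming each inner list holds 1-based column positions as in the example above. It builds the CSR arrays (`data`, `indices`, `indptr`) directly; string entries would first need a mapping to integer columns, which is not shown.

```python
import numpy as np
from scipy.sparse import csr_matrix

rows = [[1, 2, 3], [1], [1, 4, 5]]

# Column index of every stored entry (values are assumed to be 1-based).
indices = np.concatenate(rows) - 1
# indptr[i]:indptr[i + 1] delimits the entries that belong to row i.
indptr = np.cumsum([0] + [len(r) for r in rows])
data = np.ones(len(indices), dtype=int)

S = csr_matrix((data, indices, indptr), shape=(len(rows), indices.max() + 1))
print(S.toarray())
```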

Why is there a difference in magnitude response between scipy.filtfilt and scipy.lfilter?

python,scipy,signal-processing
I was trying to filter a signal using the scipy module of python and I wanted to see which of lfilter or filtfilt is better. I tried to compare them and I got the following plot from my mwe import numpy as np import scipy.signal as sp import matplotlib.pyplot as...

array of minimum Euclidean distances between all points in array

python,numpy,scipy,distance
I have this numpy array with points, something like [(x1,y1), (x2,y2), (x3,y3), (x4,y4), (x5,y5)] What I would like to do, is to get an array of all minimum distances. So for point 1 (x1, y1), I want the distance of the point closest to it, same for point 2 (x2,y2),...
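One way this could be done is with a k-d tree: querying each point for its two nearest neighbours returns the point itself (distance 0) and then the closest other point. The coordinates below are hypothetical, and the sketch assumes there are no duplicate points (a duplicate would yield a distance of 0).

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical points; replace with the real (x, y) array.
points = np.array([(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 7.0)])

tree = cKDTree(points)
# k=2: the first neighbour is the point itself, the second is the closest other point.
distances, neighbours = tree.query(points, k=2)
min_distances = distances[:, 1]
print(min_distances)
```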

scipy.mstats.theilslopes error in confidence limit if data have missing values

python,statistics,scipy
If one uses the scipy.mstats.theilslopes routine on a data set with missing values, the lower and upper bounds for the slope estimate are incorrect. The upper bound is often/always(?) NaN, while the lower bound is simply wrong. This happens because the theilslopes routine computes an index into...

Can one train estimators in a scikit-learn pipeline simultaneously?

python,machine-learning,scipy,scikit-learn,pipeline
Is it possible to do the following in scikit-learn? We train an estimator A using the given mapping from features to targets, then we use the same data (or mapping) to train another estimator B, then we use outputs of the two trained estimators (A and B) as inputs for...

numpy.repeat() to create block-diagonal indices?

python,numpy,scipy
I am trying to figure out how to speed up the following Python code. Basically, the code builds the matrix of outer products of a matrix C and stores it as a block-diagonal sparse matrix. I use numpy.repeat() to build indices into the block diagonal. Profiling the code revealed that...

Efficient way to fill 2d array in Python

python,performance,optimization,scipy,sparse-matrix
I have 3 arrays: an array "words" of pairs ["id": "word"] of length 5000000, an array "ids" of unique ids of length 13000, and an array "dict" of unique words (dictionary) of length 500000. This is my code: matrix = sparse.lil_matrix((len(ids), len(dict))) for i in words: matrix[id.index(i['id']), dict.index(i['word'])] += 1.0...
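A sketch of one possible speed-up, under the assumption that the slow part is the repeated `list.index` lookup and the per-element `lil_matrix` update. It replaces the lookups with dictionaries and builds the matrix in one step in COO format, where duplicate (row, column) pairs are summed on conversion to CSR. The tiny inputs and the names `vocab`, `row_of`, and `col_of` are hypothetical.

```python
import numpy as np
from scipy import sparse

# Tiny hypothetical stand-ins for the three arrays in the question.
words = [
    {"id": 10, "word": "cat"},
    {"id": 10, "word": "dog"},
    {"id": 20, "word": "cat"},
    {"id": 10, "word": "cat"},
]
ids = [10, 20]
vocab = ["cat", "dog", "emu"]  # called "dict" in the question

# O(1) lookups instead of list.index, which scans the list on every call.
row_of = {value: i for i, value in enumerate(ids)}
col_of = {word: j for j, word in enumerate(vocab)}

rows = np.array([row_of[w["id"]] for w in words])
cols = np.array([col_of[w["word"]] for w in words])
data = np.ones(len(words))

# Duplicate (row, col) pairs are summed when converting COO to CSR.
matrix = sparse.coo_matrix((data, (rows, cols)), shape=(len(ids), len(vocab))).tocsr()
print(matrix.toarray())
```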

Apparent creation of array from another array?

python,arrays,scipy
I have the following code snippet from SciPy: resDat = data[scipy.random.randint(0,N,(N,))] What I am trying to understand is how and why this line works. The randint function seems to return a list of N integer values in the range of the data indices, so what I interpret this line of code...
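The line can be unpacked into two steps: drawing N random indices, then using that integer array to index `data` (NumPy integer-array indexing), which picks one element per index, with repeats allowed. The sketch below uses `numpy.random.randint` as a stand-in, on the assumption that `scipy.random` was an alias for `numpy.random` in older SciPy releases; the data values are hypothetical.

```python
import numpy as np

data = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # hypothetical data
N = len(data)

# N random integers drawn from 0..N-1 (the upper bound is exclusive).
idx = np.random.randint(0, N, (N,))

# Indexing with an integer array selects data[idx[0]], data[idx[1]], ...
# so resDat is a resample of data with replacement.
resDat = data[idx]
print(idx, resDat)
```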

How to smoothen data in Python?

python,numpy,scipy,smooth,smoothing
I am trying to smoothen a scatter plot shown below using SciPy's B-spline representation of 1-D curve. The data is available here. The code I used is: import matplotlib.pyplot as plt import numpy as np from scipy import interpolate data = np.genfromtxt("spline_data.dat", delimiter = '\t') x = 1000 / data[:,...

UnboundLocalError using Kmeans in scipy

python,scipy
I'm trying to learn more about image processing in Python and, as part of the process, am doing some of the exercises in a book that I am reading. In one exercise I'm trying to do kmeans clustering of average pixel color in an image. The code below is pretty...

NumPy and SciPy - Difference between .todense() and .toarray()

python,numpy,scipy
I am wondering if there is any difference (advantage/disadvantage) of using .toarray() vs. .todense() on sparse NumPy arrays. E.g., import scipy as sp import numpy as np sparse_m = sp.sparse.bsr_matrix(np.array([[1,0,0,0,1], [1,0,0,0,1]])) %timeit sparse_m.toarray() 1000 loops, best of 3: 299 µs per loop %timeit sparse_m.todense() 1000 loops, best of 3: 305...
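A short sketch of the main difference in return type, assuming the classic `*_matrix` sparse classes used in the question: `.toarray()` returns a `numpy.ndarray`, while `.todense()` returns a `numpy.matrix`, for which `*` means matrix multiplication rather than element-wise multiplication.

```python
import numpy as np
from scipy import sparse

sparse_m = sparse.bsr_matrix(np.array([[1, 0, 0, 0, 1], [1, 0, 0, 0, 1]]))

as_array = sparse_m.toarray()   # numpy.ndarray
as_matrix = sparse_m.todense()  # numpy.matrix

print(type(as_array), type(as_matrix))

# Same values, different semantics for the * operator.
elementwise = as_array * as_array        # shape (2, 5)
matrix_product = as_matrix * as_matrix.T  # shape (2, 2)
```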

Computing the correlation coefficient between two multi-dimensional arrays

python,arrays,numpy,scipy,correlation
I have two arrays that have the shapes N X T and M X T. I'd like to compute the correlation coefficient across T between every possible pair of rows n and m (from N and M, respectively). What's the fastest, most pythonic way to do this? (Looping over N...
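A vectorized sketch of one way to get the N x M matrix of Pearson correlations without looping: center each row, then divide the matrix of dot products by the product of the row norms. The function name `row_corr` and the random inputs are illustrative assumptions.

```python
import numpy as np

def row_corr(A, B):
    """Pearson correlation between every row of A (N x T) and every row of B (M x T)."""
    A_c = A - A.mean(axis=1, keepdims=True)
    B_c = B - B.mean(axis=1, keepdims=True)
    # Numerator: dot products of centered rows, shape (N, M).
    num = A_c @ B_c.T
    # Denominator: outer product of the centered row norms.
    den = np.sqrt((A_c ** 2).sum(axis=1)[:, None] * (B_c ** 2).sum(axis=1)[None, :])
    return num / den

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 50))  # hypothetical N x T data
B = rng.normal(size=(3, 50))  # hypothetical M x T data
corr = row_corr(A, B)
print(corr.shape)
```

The result can be checked against the off-diagonal block of `np.corrcoef(A, B)`, which computes the full (N + M) x (N + M) matrix.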

More efficient solution? Dictionary as sparse vector

python,performance,numpy,scipy,sparse
I have two dictionaries that I use as sparse vectors: dict1 = {'a': 1, 'b': 4} dict2 = {'a': 2, 'c': 2} I wrote my own __add__ function to get this desired result: dict1 = {'a': 3, 'b': 4, 'c': 2} It is important that I know the strings 'a',...
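For the addition itself, one option from the standard library is `collections.Counter`, whose `+` operator sums values key by key. This is only a sketch of that alternative, not the asker's `__add__` implementation; note that `Counter` addition drops entries whose summed value is zero or negative.

```python
from collections import Counter

dict1 = {'a': 1, 'b': 4}
dict2 = {'a': 2, 'c': 2}

# Counter addition sums the values of matching keys and keeps unmatched keys.
total = Counter(dict1) + Counter(dict2)
result = dict(total)
print(result)
```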

Efficiently select random non-zero column from each row of sparse matrix in scipy

python,numpy,scipy
I'm trying to efficiently select a random non-zero column index for each row of a large sparse SciPy matrix. I can't seem to figure out a vectorized way of doing it, so I'm resorting to a very slow Python loop: random_columns = np.zeros((sparse_matrix.shape[0])) for i,row in enumerate(sparse_matrix): random_columns[i] = (np.random.choice(row.nonzero()[1]))...
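A possible vectorized sketch using the CSR layout directly: `indptr` gives where each row's stored entries start, so drawing one random offset per row and indexing into `indices` picks a stored column per row. It assumes the matrix is in CSR format, every row has at least one stored entry, and no explicit zeros are stored. The small matrix is hypothetical.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

# Hypothetical matrix; every row has at least one non-zero entry.
sparse_matrix = sparse.csr_matrix(
    np.array([[0, 3, 0, 1],
              [5, 0, 0, 0],
              [0, 2, 2, 2]])
)

# Number of stored entries in each row.
counts = np.diff(sparse_matrix.indptr)
# One random offset per row, in the range 0..counts[i]-1.
offsets = rng.integers(0, counts)
# Row i's entries live at indices[indptr[i] : indptr[i + 1]].
random_columns = sparse_matrix.indices[sparse_matrix.indptr[:-1] + offsets]
print(random_columns)
```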

Wrong Exponential Power Plot - How to improve curve fit

python,scipy,curve-fitting
Unfortunately the power fit with scipy does not return a good fit. I tried passing p0 as an input argument with close values, which did not help. I would be very glad if someone could point out my problem. # Imports from scipy.optimize import curve_fit import numpy...

Getting a pdf from scipy.stats in a generic way

python,scipy,distribution
I am running some goodness of fit tests using scipy.stats in Python 2.7.10. for distrName in distrNameList: distr = getattr(distributions, distrName) param = distr.fit(sample) pdf = distr.pdf(???) What do I pass into distr.pdf() to get the values of the best-fit pdf on the list of sample points of interest, called...
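A sketch of one generic pattern: `fit` returns a tuple of shape parameters followed by `loc` and `scale`, and that tuple can be unpacked straight into `pdf`. The sample, the evaluation points `x`, and the list of distribution names are hypothetical, and the distributions are looked up on `scipy.stats` here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=0.5, size=500)      # hypothetical sample
x = np.linspace(sample.min(), sample.max(), 50)        # points of interest

fitted_pdfs = {}
for distrName in ["norm", "expon"]:
    distr = getattr(stats, distrName)
    # param = (shape parameters..., loc, scale)
    param = distr.fit(sample)
    # Unpack the fitted parameters positionally into pdf.
    fitted_pdfs[distrName] = distr.pdf(x, *param)

print({name: values[:3] for name, values in fitted_pdfs.items()})
```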

how to solve many overdetermined systems of linear equations using vectorized codes?

multidimensional-array,scipy,vectorization,linear-algebra,least-squares
I need to solve a system of linear equations Lx=b, where x is always a vector (3x1 array), L is an Nx3 array, and b is an Nx1 vector. N usually ranges from 4 to something like 10. I have no problems solving this using scipy.linalg.lstsq(L,b) However, I need to...
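One way this might be vectorized, assuming all systems share the same N so they can be stacked into a single array: `numpy.linalg.pinv` accepts a stack of matrices, and applying each pseudo-inverse to its right-hand side gives the least-squares solution of every system at once. The stacked shapes and random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 5, 6                       # K systems, each with N equations (hypothetical sizes)
L = rng.normal(size=(K, N, 3))    # stacked coefficient matrices
b = rng.normal(size=(K, N))       # stacked right-hand sides

# pinv works on the last two axes, giving an array of shape (K, 3, N).
L_pinv = np.linalg.pinv(L)
# Apply each pseudo-inverse to its own right-hand side: result has shape (K, 3).
x = np.einsum("kij,kj->ki", L_pinv, b)

# Reference: solve each system separately for comparison.
x_loop = np.array([np.linalg.lstsq(L[k], b[k], rcond=None)[0] for k in range(K)])
print(np.allclose(x, x_loop))
```

If N differs between systems, the stacking assumption fails and the systems would need to be grouped by size first.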