I have something like this:

```
import matplotlib.pyplot as plt
import numpy as np
a=[0.05, 0.1, 0.2, 1, 2, 3]
plt.hist((a*2, a*3), bins=[0, 0.1, 1, 10])
plt.gca().set_xscale("symlog", linthresh=0.1)  # spelled linthreshx before matplotlib 3.3
plt.show()
```

which gives me the following plot:

As one can see, the bar widths are not equal. In the linear part (from 0 to 0.1) everything is fine, but beyond that the bar widths are still computed on a linear scale while the axis is logarithmic. This gives uneven widths for the bars and the spaces in between (the ticks are not in the middle of the bars).

Is there any way to correct this?

Answer:

Inspired by http://stackoverflow.com/a/30555229/635387 I came up with the following solution:

```
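import matplotlib.pyplot as plt
import numpy as np

d = [0.05, 0.1, 0.2, 1, 2, 3]

def LogHistPlot(data, bins):
    totalWidth = 0.8  # fraction of each bin slot covered by all bars together
    colors = ("b", "r", "g")
    width = totalWidth / len(data)  # float division, also under Python 2
    for i, values in enumerate(data):
        heights = np.histogram(values, bins)[0]
        # Bin k starts at x = k; each data set is shifted by one bar width.
        left = np.arange(len(heights)) + i * width
        # align="edge": `left` is the left edge (the default is "center" since matplotlib 2.0).
        plt.bar(left, heights, width, align="edge", color=colors[i], label=str(i))
    # Label the integer positions with the actual bin edges.
    plt.xticks(range(len(bins)), bins)
    plt.legend(loc="best")

LogHistPlot((d * 2, d * 3, d * 4), [0, 0.1, 1, 10])
plt.show()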
```

Which produces this plot:

The basic idea is to drop the plt.hist function, compute the histogram with numpy and plot it with plt.bar. Then you can simply use a linear x-axis, on which bin k occupies the slot from k to k+1, so the bar width calculation becomes trivial. Finally, the tick labels are replaced by the bin edges, which gives the appearance of a logarithmic scale. And you no longer have to deal with the symlog linear/logarithmic botchery.
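To make the width calculation concrete, here is a minimal sketch of just the layout step, without any plotting. The helper name `grouped_bar_layout` and its `total_width` parameter are made up for this illustration; only `np.histogram` and `np.arange` are real library calls.

```python
import numpy as np


def grouped_bar_layout(datasets, bins, total_width=0.8):
    """Return (lefts, heights, width) for side-by-side bars on an index axis.

    Hypothetical helper: bin k occupies the slot [k, k + 1) on the x-axis,
    no matter how wide the bin is in data units, so all bars share one width.
    """
    width = total_width / len(datasets)
    lefts, heights = [], []
    for i, values in enumerate(datasets):
        counts, _ = np.histogram(values, bins)
        heights.append(counts)
        # Shift each data set by one bar width inside the slot.
        lefts.append(np.arange(len(counts)) + i * width)
    return lefts, heights, width


d = [0.05, 0.1, 0.2, 1, 2, 3]
lefts, heights, width = grouped_bar_layout((d * 2, d * 3), [0, 0.1, 1, 10])
print(width)       # width shared by every bar
print(heights[0])  # counts per bin for the first data set
print(lefts[1])    # left edges of the second data set's bars
```

Each `lefts[i]`, `heights[i]` pair can then be passed to plt.bar together with `width` and `align="edge"`, and the integer positions relabelled with the bin edges via plt.xticks, as in the answer above.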

python,numpy,matplotlib,draw,imshow

I have a large data set I want to be able to "zoom" in on. What I really want is for the data to be rebinned based on the selection and then update the data in the graph. So the graph will show different limits but maintain the same resolution....

python,numpy,correlation

I have a large number (M) of time series, each with N time points, stored in an MxN matrix. Then I also have a separate time series with N time points that I would like to correlate with all the time series in the matrix. An easy solution is to...

python-3.x,matplotlib,histogram

I want to plot a histogram showing the frequencies of various actions at different intervals. I want to bin the occurrences of actions into 10 minute intervals. binwidth = 10*60 #10 minutes times = array([ 1.43431325e+09, 1.43431325e+09, 1.43431329e+09, 1.43431330e+09, 1.43431333e+09, 1.43431334e+09, 1.43431345e+09, 1.43431346e+09, 1.43431346e+09, 1.43431346e+09, 1.43431349e+09, 1.43431350e+09, 1.43431350e+09, 1.43431351e+09, 1.43431354e+09,...

python,matplotlib

Does anyone know how to show the labels of the minor ticks on a logarithmic scale with Python/Matplotlib? Thanks!...

python,numpy

A = np.array([[1,2,3],[3,4,5],[5,6,7]]) X = np.array([[0, 1, 0]]) for i in xrange(np.shape(X)[0]): for j in xrange(np.shape(X)[1]): if X[i,j] == 0.0: A = np.delete(A, (j), axis=0) I am trying to delete j from A if in X there is 0 at index j. I get IndexError: index 2 is out of...

python,image,numpy

I already achieved the goal described in the title but I was wondering if there was a more efficient (or generally better) way to do it. First of all let me introduce the problem. I have a set of images of different sizes but with a width/height ratio less than...

python,numpy,matrix,scipy

I want to get dot product of N vector pairs (a_vec[i, :], b_vec[i, :]). a_vec has shape [N, 3], bvec has the same shape (N 3D vectors). I know that it can be easily done in cycle via numpy.dot function. But cannot it be done somehow simpler and faster?...

python,list,numpy,multidimensional-array

I have a list which contains 1000 integers. The 1000 integers represent 20X50 elements of dimensional array which I read from a file into the list. I need to walk through the list with an indicator in order to find close elements to each other. I want that my indicator...

python,numpy,scipy

I've just checked a simple linear programming problem with scipy.optimize.linprog: 1*x[1] + 2x[2] -> max 1*x[1] + 0*x[2] <= 5 0*x[1] + 1*x[2] <= 5 1*x[1] + 0*x[2] >= 1 0*x[1] + 1*x[2] >= 1 1*x[1] + 1*x[2] <= 6 And got a very strange result, I expected that x[1]...

python,matplotlib,pyqt4

I'm implementing an image viewer using matplotlib. The idea is that changes being made to the image (such as filter application) will update automatically. I create a Figure to show the initial image and have added a button using pyQt to update the data. The data does change, I have...

python,numpy

I have a 3D color image im (shape 512 512 3), and a 2D array mask(512 512). I want to annotate this color image by the mask: im = im[mask>threshold] + im[mask<threshold] * 0.2 + (255,0,0) * [mask<threshold]. How do I write this in Python efficiently?...

python,regex,numpy,cheminformatics

I'm trying to extract data from the text output of a cheminformatics program called NWChem. I've already extracted the part of the output that I'm interested in (the vibrational modes); here is the string that I have extracted: s = ''' 1 2 3 4 5 6 P.Frequency -0.00 0.00 0.00...

numpy,scipy

I think the title says it all, but just to be specific, say I have some list of numbers named "coeffs". Assuming the polynomial with said coefficients has exactly k unique roots, will the following code ever set number_of_unique_roots to be a number greater than k? import numpy as np...

python,numpy,matrix,factorial

I'd like to know how to calculate the factorial of a matrix elementwise. For example, import numpy as np mat = np.array([[1,2,3],[2,3,4]]) np.the_function_i_want(mat) would give a matrix mat2 such that mat2[i,j] = mat[i,j]!. I've tried something like np.fromfunction(lambda i,j: np.math.factorial(mat[i,j])) but it passes the entire matrix as argument for np.math.factorial....

python,matplotlib,margins

I'm trying to plot a horizontal stacked bar chart but get annoyingly big margins on top and bottom. I would like to get rid of that or control the size. Here is an example code and fig: from random import random Y = ['A', 'B', 'C', 'D', 'E','F','G','H','I','J', 'K'] y_pos...

python,sql,matplotlib,plot

from sqlalchemy import create_engine import _mssql from matplotlib import pyplot as plt engine = create_engine('mssql+pymssql://**:****@127.0.0.1:1433/AffectV_Test') connection = engine.connect() result = connection.execute('SELECT Campaign_id, SUM(Count) AS Total_Count FROM Impressions GROUP BY Campaign_id') for row in result: print row connection.close() The above code generates an array: (54ca686d0189607081dbda85', 4174469) (551c21150189601fb08b6b64', 182) (552391ee0189601fb08b6b73', 237304) (5469f3ec0189606b1b25bcc0',...

python,numpy

When doing: import numpy A = numpy.array([1,2,3,4,5,6,7,8,9,10]) B = numpy.array([1,2,3,4,5,6]) A[7:7+len(B)] = B # A[7:7+len(B)] has in fact length 3 ! we get this typical error: ValueError: could not broadcast input array from shape (6) into shape (3) This is 100% normal because A[7:7+len(B)] has length 3, and not a...

python-2.7,matplotlib,computer-science,floating-point-conversion

I am finding errors with the last line of the for loop where I am trying to update the curve value to a list containing the curve value iterations. I get errors like "can only concatenate tuple (not "float) to tuple" and "tuple object has no attribute 'append'". Does anyone...

python,python-2.7,numpy

I have three arrays lat=[15,15.25,15.75,16,....30] long=[91,91.25,91.75,92....102] data= array([[ 0. , 0. , 0. , ..., 0. , 0. , 0. ], [ 0. , 0. , 0. , ..., 0. , 0. , 0. ], [ 0. , 0. , 0. , ..., 0. , 0. , 0. ], ...,...

python,pandas,matplotlib

I have this plot that you'll agree is not very pretty. Other plots I made so far had some color and grouping to them out of the box. I tried manually setting the color, but it stays black. What am I doing wrong? Ideally it'd also cluster the same tags...

python,numpy

I have a vector orig which is a p dimensional vector Now, I sampled c elements from this vector (with replacement), lets call it sampled_vec. So basically,sampled_vec has elements from orig Now, I want to find out the indices of these elements (in sampled_vec) from orig. Probably, an example would...

python,matplotlib

I am using matplotlib to graph a curve with a non-numeric x-axis. I would like there to be some space between the y-axis and the start of the plot. This code implements a subplot with a gap (on the left) & a subplot with a gap using the set_xlim method...

python,python-3.x,numpy,pandas,datetime64

I have two big csv files with different number of rows which I am importing as follows: tdata = pd.read_csv(tfilepath, sep=',', parse_dates=['date_1']) print(tdata.iloc[:, [0,3]]) TBA date_1 0 0 2010-01-04 1 9 2010-01-05 2 0 2010-01-06 3 8 2010-01-07 4 0 2010-01-08 5 0 2010-01-09 pdata = pd.read_csv(pfilepath, sep=',', parse_dates=['date_2']) print(pdata.iloc[:,...

python,numpy

I have a list as follows: l = [['A', 'C', 'D'], ['B', 'E'], ['A', 'C', 'D'], ['A', 'C', 'D'], ['B', 'E'], ['F']] The result should be: [['A', 'C', 'D'], ['B', 'E'], ['F']] The order of elements is also not important. I tried as: print list(set(l)) Does numpy has better way...

python,memory,numpy

If we convert a large array containing 0 and 1 as boolean to another array containing 0 and 1 as float, the size of array would be almost 10 times larger. What is the best way (if any) to handle this issue in python (Numpy) if we need this conversion?

python,numpy,kernel-density

Can one explain why after estimation of kernel density d = gaussian_kde(g[:,1]) And calculation of integral sum of it: x = np.linspace(0, g[:,1].max(), 1500) integral = np.trapz(d(x), x) I got resulting integral sum completely different to 1: print integral Out: 0.55618 ...

python,numpy,scipy,curve-fitting,data-fitting

Say I want to fit two arrays x_data_one and y_data_one with an exponential function. In order to do that I might use the following code (in which x_data_one and y_data_one are given dummy definitions): import numpy as np from scipy.optimize import curve_fit def power_law(x, a, b, c): return a *...

python,numpy,matplotlib,graph,plot

I am trying to read one input file of below format. Where Col[1] is x axis and Col[2] is y axis and col[3] is some name. I need to plot multiple line graphs for separate names of col[3]. Eg: Name sd with x,y values will have one line graph and...

python,python-2.7,matplotlib,plot,histogram

I want to draw a histogram and a line plot at the same graph. However, to do that I need to have my histogram as a probability mass function, so I want to have on the y-axis a probability values. However, I don't know how to do that, because using...

python,python-2.7,python-3.x,numpy,shapely

I have one list of data as follows: from shapely.geometry import box data = [box(1,2,3,4), box(4,5,6,7), box(1,2,3,4)] sublists = [A,B,C] The list 'data' has following sub-lists: A = box(1,2,3,4) B = box(4,5,6,7) C = box(1,2,3,4) I have to check if sub-lists intersect. If intersect they should put in one tuple;...

python,numpy,pandas,scipy

I have one data frame and pairwise correlation were calculated >>> df1 = pd.read_csv("/home/zebrafish/Desktop/stack.csv") >>> df1.corr() GA PN PC MBP GR AP GA 1.000000 0.070541 0.259937 -0.452661 0.115722 0.268014 PN 0.070541 1.000000 0.512536 0.447831 -0.042238 0.263601 PC 0.259937 0.512536 1.000000 0.331354 -0.254312 0.958877 MBP -0.452661 0.447831 0.331354 1.000000 -0.467683 0.229870...

python,numpy,matplotlib,plot

I have two arrays. One is the raw signal of length (1000, ) and the other one is the smooth signal of length (100,). I want to visually represent how the smooth signal represents the raw signal. Since these arrays are of different length, I am not able to plot...

python,arrays,numpy

I have a 3D numpy array that looks like this: X = [[[10 1] [ 2 10] [-5 3]] [[-1 10] [ 0 2] [ 3 10]] [[ 0 3] [10 3] [ 1 2]] [[ 0 2] [ 0 0] [10 0]]] At first I want the maximum along...

python,matplotlib

I would like to make a heatmap for a matrix of data such that all positions that are 1 will be red, all positions that are 2 will be white, and etc. with an arbitrary specification. Ideally this should handle the case where all of the values are the same,...

python,matplotlib,plot,google-visualization,heatmap

I am trying to plot a heatmap on top of an image. What I did: import matplotlib.pyplot as plt import numpy as np import numpy.random import urllib #downloading an example image urllib.urlretrieve("http://tekeye.biz/wp-content/uploads/2013/01/small_playing_cards.png", "/tmp/cards.png") #reading and plotting the image im = plt.imread('/tmp/cards.png') implot = plt.imshow(im) #generating random data for the histogram...

python,opencv,numpy

I'm trying to send live video frames that I capture with my camera to a server and process them. I'm using opencv for image processing and python for the language. Here is my code client_cv.py import cv2 import numpy as np import socket import sys import pickle cap=cv2.VideoCapture(0) clientsocket=socket.socket(socket.AF_INET,socket.SOCK_STREAM) clientsocket.connect(('localhost',8089))...

python,matplotlib,plot,fill

I'm trying to get access to the shaded region of a matplotlib plot, so that I can remove it without doing plt.cla() [since cla() clears the whole axis including axis label too] If I were plotting I line, I could do: import matplotlib.pyplot as plt ax = plt.gca() ax.plot(x,y) ax.set_xlabel('My...

python,arrays,numpy,concatenation

I am trying to concatenate two arrays: a and b, where a.shape (1460,10) b.shape (1460,) I tried using hstack and concatenate as: np.hstack((a,b)) c=np.concatenate(a,b,0) I am stuck with the error ValueError: all the input arrays must have same number of dimensions Please guide me for concatenation and generating array c...

python,numpy,theano

I have a function f in theano which takes two parameters, one of them optional. When I call the function with the optional parameter being None the check inside f fails. This script reproduces the error: import theano import theano.tensor as T import numpy as np # function setup def...

python,arrays,numpy,scipy,distance

I have a raster with a set of unique ID patches/regions which I've converted into a two-dimensional Python numpy array. I would like to calculate pairwise Euclidean distances between all regions to obtain the minimum distance separating the nearest edges of each raster patch. As the array was originally a...

python,csv,matplotlib,graph,plot

I am trying to plot a graph with colored markers before and after threshold value. If I am using for loop for reading the parsing the input file with time H:M I can plot and color only two points. But for all the points I cannot plot. Input akdj 12:00...

python,numpy,sympy

With numpy, I am able to select an arbitrary set of items from an array with a list of integers: >>> import numpy as np >>> a = np.array([1,2,3]) >>> a[[0,2]] array([1, 3]) The same does not seem to work with sympy matrices, as the code: >>> import sympy as...

python,numpy

My code: #!/usr/bin/python import numpy as np names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe']) data = np.random.randn(7, 4) + 0.8 print (data) mask2= ((names != 'Joe') == 7.0) d2 = data[mask2] print (d2) d3 = data[names != 'Joe'] = 7.0 print (d3) Actually,my intention was to get the...

python,data,matplotlib,analysis

i have a lot of files and i want to open, read data1.txt and data2.txt file and then data1.txt file 22. column "x_coordinate" and data2.txt file 23. column "y_coordinate" scatter plot. how can i ? with open('data1.txt') as f: with open('data2.txt') as f2: data1 = f.readlines() data2 = f2.readlines() f1.xArr=[]...

python,matplotlib

I have a python code that produces the following figures. I would like to do an annotation with ellipses to surround the curves as the figure mentions. N.B. The figure is produced using MATLAB and I cannot do it in python-matplotlib. Thanks....

python,numpy,pandas

I am using pandas to factorize an array consisting of two types of strings. I want to make sure that one of the strings "XYZ" is always coded as a 0 and the other string "ABC" is always coded as 1. Is it possible to do this? I looked up...

python,numpy,matrix,modeling

I'm working on setting some boundary conditions for a water table model, and I am able to set the entire first row to a constant value, but not the entire first column. I am using np.zeros((11,1001)) to make an empty matrix. Does anyone know why I am successful at defining...