Logarithmic multi-sequence plot with equal bar widths

Question:

Tag: numpy,matplotlib,histogram

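I have something like this:

import matplotlib.pyplot as plt
import numpy as np

a = [0.05, 0.1, 0.2, 1, 2, 3]
# a*2 and a*3 repeat the list, giving two datasets of different sizes
plt.hist((a*2, a*3), bins=[0, 0.1, 1, 10])
# Matplotlib >= 3.3 spells this keyword linthresh; older releases used linthreshx
plt.gca().set_xscale("symlog", linthresh=0.1)
plt.show()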

which gives me the following plot: [image: log histogram]

As one can see, the bar widths are not equal. In the linear part (from 0 to 0.1) everything is fine, but beyond it the bar widths are still computed on a linear scale while the axis is logarithmic. This gives uneven widths for the bars and for the spaces between them, and the tick is not in the middle of the bars.
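To illustrate the effect, here is a small sketch, not Matplotlib's actual bar geometry. It assumes a hypothetical layout in which two bars split the bin [1, 10] evenly in data units, and the helper name `log_axis_widths` is made up for this example. It compares the widths of the bars in data units with the room they take on a log10 axis.

```python
import math


def log_axis_widths(edges):
    """Width of each interval [edges[k], edges[k + 1]] as drawn on a log10 axis."""
    return [math.log10(hi) - math.log10(lo) for lo, hi in zip(edges[:-1], edges[1:])]


# Hypothetical layout: two bars that split the bin [1, 10] evenly in data units.
bar_edges = [1.0, 5.5, 10.0]
data_widths = [hi - lo for lo, hi in zip(bar_edges[:-1], bar_edges[1:])]
drawn_widths = log_axis_widths(bar_edges)

print(data_widths)   # both bars have the same width in data units
print(drawn_widths)  # the left bar takes far more room on the log axis
```

Equal widths in data units therefore cannot look equal once the axis is logarithmic; the widths would have to be equal in log space instead.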

Is there any way to correct this?


Answer:

Inspired by http://stackoverflow.com/a/30555229/635387 I came up with the following solution:

import matplotlib.pyplot as plt
import numpy as np

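d = [0.05, 0.1, 0.2, 1, 2, 3]


def LogHistPlot(data, bins):
    totalWidth = 0.8  # fraction of each bin slot taken up by the group of bars
    colors = ("b", "r", "g")
    # Float division, so the width is not truncated to 0 on Python 2
    width = totalWidth / float(len(data))
    for i, values in enumerate(data):
        heights = np.histogram(values, bins)[0]
        # Bin k starts at position k on the linear axis; shift each dataset by one bar width
        left = np.arange(len(heights)) + i * width
        # align="edge" treats `left` as the left bar edge (the default is "center" since Matplotlib 2.0)
        plt.bar(left, heights, width, align="edge", color=colors[i], label=str(i))
    # Label the integer positions with the bin edges (only needs to be done once)
    plt.xticks(range(len(bins)), bins)
    plt.legend(loc='best')


LogHistPlot((d*2, d*3, d*4), [0, 0.1, 1, 10])

plt.show()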

Which produces this plot: [image: correct logarithmic histogram with multiple datasets]

The basic idea is to drop plt.hist, compute the histogram with NumPy, and draw it with plt.bar. You can then use a linear x-axis on which every bin occupies one unit, which makes the bar width calculation trivial. Finally, the tick labels are replaced by the bin edges, so the axis reads like a logarithmic scale. You no longer have to deal with the symlog linear/logarithmic workaround at all.
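The same idea can be sketched with the layout computation separated from the drawing, which makes the bar positions easy to inspect. This is only one possible arrangement: the function names `grouped_bar_layout` and `draw_grouped_bars` and the parameter `total_width` are hypothetical, and `ax` is assumed to be a Matplotlib Axes (for example `plt.gca()`).

```python
import numpy as np


def grouped_bar_layout(datasets, bin_edges, total_width=0.8):
    """Return (heights, lefts, width) for equal-width grouped bars.

    heights[i, k] is the count of dataset i in bin k. lefts[i, k] is the left
    edge of the matching bar on a linear axis where bin k starts at position k.
    (Hypothetical helper written for this illustration.)
    """
    heights = np.array([np.histogram(values, bins=bin_edges)[0] for values in datasets])
    width = total_width / len(datasets)
    slots = np.arange(len(bin_edges) - 1)
    offsets = width * np.arange(len(datasets))
    lefts = slots[np.newaxis, :] + offsets[:, np.newaxis]
    return heights, lefts, width


def draw_grouped_bars(ax, datasets, bin_edges, total_width=0.8):
    """Draw the layout on a Matplotlib Axes and label the ticks with the bin edges."""
    heights, lefts, width = grouped_bar_layout(datasets, bin_edges, total_width)
    for i in range(len(datasets)):
        # align="edge" so that lefts[i] really are the left edges of the bars
        ax.bar(lefts[i], heights[i], width, align="edge", label=str(i))
    ax.set_xticks(list(range(len(bin_edges))))
    ax.set_xticklabels([str(edge) for edge in bin_edges])
    ax.legend(loc="best")
    return heights
```

With the data from the question, `draw_grouped_bars(plt.gca(), (d*2, d*3), [0, 0.1, 1, 10])` followed by `plt.show()` would draw two datasets with bars of identical width in every bin.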

