How to get the exact page content with wget when the response code is 404

Question:

Tag: python-3.x,curl,web-scraping,wget

I have two URLs: one works, and the other points to a deleted page. The working URL is fine, but for the deleted-page URL the server responds with 404 and wget returns nothing instead of the page content.

Working URL:

import os
def curl(url):
    data = os.popen('wget -qO- %s '% url).read()
    print (url)
    print (len(data))
    #print (data)

curl("https://www.reverbnation.com/artist_41/bio")

Output:

https://www.reverbnation.com/artist_41/bio
80067

Deleted-page URL:

import os
def curl(url):
    data = os.popen('wget -qO- %s '% url).read()
    print (url)
    print (len(data))
    #print (data)

curl("https://www.reverbnation.com/artist_42/bio")

Output:

https://www.reverbnation.com/artist_42/bio
0

I get a length of 0, but the live page has some content in it.

How can I receive the exact page content with wget or curl?


Answer:

wget has a switch called "--content-on-error":

--content-on-error
           If this is set to on, wget will not skip the content when the server responds with a http status code that indicates error.

So just add it to your code and you will get the content of the 404 pages too:

import os

def curl(url):
    # --content-on-error keeps the response body even when the server
    # answers with an error status such as 404
    data = os.popen('wget --content-on-error -qO- %s' % url).read()
    print(url)
    print(len(data))
    # print(data)

curl("https://www.reverbnation.com/artist_42/bio")
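The same idea can be sketched with subprocess instead of os.popen, which avoids building a shell string from the URL. This is only a minimal sketch: the helper names build_wget_command and fetch are made up for illustration, and it assumes a wget build that supports --content-on-error is on the PATH.

```python
import subprocess


def build_wget_command(url):
    """Return the wget argument list (hypothetical helper).

    Passing a list rather than a shell string means the URL is never
    interpreted by a shell.
    """
    return ["wget", "--content-on-error", "-q", "-O", "-", url]


def fetch(url):
    """Run wget and return (exit_code, body_bytes) (hypothetical helper).

    wget still exits with a non-zero status when the server reports an
    error, so the exit code is returned alongside the body instead of
    being treated as a failure.
    """
    result = subprocess.run(build_wget_command(url), stdout=subprocess.PIPE)
    return result.returncode, result.stdout


# Example call (needs network access and wget installed):
# code, body = fetch("https://www.reverbnation.com/artist_42/bio")
# print(code, len(body))
```

Returning the exit code lets the caller tell a normal page from an error page while still keeping the body. As for the curl half of the question: curl prints the response body for a 404 by default, unless -f/--fail is given.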
