SQL to remove duplicated rows

Question:

Tag: postgresql-9.3

I've written a SQL statement that keeps only one row (the one with the minimum id) for each duplicated product_code. The problem is that the statement is very inefficient and takes absolutely ages to run, so I'm hoping there is a more efficient way to write it.

The dataset is structured as:

id  product_code  cat_desc      product_desc  
1   2352345       423           COCA COLA   
2   8967896       457           FANTA   
3   6456466       435           SPARKLING WATER 
4   3562314       457           STILL WATER 

The statement is:

DELETE
FROM raw_products_inter
WHERE id IN (SELECT id
             FROM raw_products_inter outer_table
             WHERE product_code IN (SELECT product_code
                                    FROM raw_products_inter
                                    GROUP BY 1
                                    HAVING COUNT(id) > 1)
             AND   id NOT IN (SELECT MIN(id)
                              FROM raw_products_inter inner_table
                              WHERE inner_table.product_code = outer_table.product_code))

Answer:

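You should be able to boost performance with an EXISTS condition. It deletes every row for which another row with the same product_code and a lower id exists, so only the minimum id per product_code survives: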

DELETE
  FROM raw_products_inter P
 WHERE EXISTS (
          -- another row with the same product_code but a smaller id
          SELECT *
            FROM raw_products_inter OP
           WHERE OP.product_code = P.product_code
             AND OP.id < P.id
       );
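
To try this out safely, here is a minimal sketch that runs the same EXISTS delete against a scratch copy and then checks that no duplicates remain. The table name raw_products_inter_demo, the column types, the sample rows, and the index are assumptions made for illustration and are not from the question. An index on (product_code, id) may help the correlated lookup, but whether the planner uses it depends on your data.

```sql
-- Hypothetical scratch table so the sketch does not touch real data.
-- Column types are assumed; the question does not state them.
CREATE TEMP TABLE raw_products_inter_demo (
    id           integer PRIMARY KEY,
    product_code integer,
    cat_desc     integer,
    product_desc text
);

INSERT INTO raw_products_inter_demo VALUES
    (1, 2352345, 423, 'COCA COLA'),
    (2, 8967896, 457, 'FANTA'),
    (3, 2352345, 423, 'COCA COLA'),    -- duplicate of id 1
    (4, 3562314, 457, 'STILL WATER'),
    (5, 8967896, 457, 'FANTA');        -- duplicate of id 2

-- Assumed helper index (hypothetical name) for the correlated subquery.
CREATE INDEX raw_products_inter_demo_code_id_idx
    ON raw_products_inter_demo (product_code, id);

-- Same technique as the answer: remove every row that has a
-- lower-id sibling with the same product_code.
DELETE
  FROM raw_products_inter_demo p
 WHERE EXISTS (
          SELECT 1
            FROM raw_products_inter_demo op
           WHERE op.product_code = p.product_code
             AND op.id < p.id
       );

-- Sanity check: lists any product_code that still occurs more than once.
SELECT product_code, COUNT(*) AS copies
  FROM raw_products_inter_demo
 GROUP BY product_code
HAVING COUNT(*) > 1;
```

On the real table it may be worth wrapping the DELETE in a transaction (BEGIN ... COMMIT) so the result can be inspected, and rolled back if needed, before it becomes permanent.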
