prepare statements and casting

POSTGRESQL 2015. 4. 21. 16:32
A SQL statement that runs fine with a literal constant can fail once it is executed through a PREPARE statement. This is because PostgreSQL does not apply implicit type casts when a SQL statement is executed with bind variables.

 

Result when executed with a constant value:
test=# create table test_table(col1 numeric);
CREATE TABLE
test=# insert into test_table values(1);
INSERT 0 1
test=# explain select col1 from test_table where col1 = '1';
                                  QUERY PLAN                                   
-------------------------------------------------------------------------
 Seq Scan on test_table  (cost=10000000000.00..10000000026.38 rows=7 width=32)
   Filter: (col1 = 1::numeric)
(2 rows)

 

Result when executed through a PREPARE statement:
test=# prepare foo(varchar) as
test-# select col1 from test_table where col1 = $1;
ERROR:  operator does not exist: numeric = character varying
LINE 2: select col1 from test_table where col1 = $1;
                                               ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
test=#

 

Result with an explicit cast added to the PREPARE statement:
test=# prepare foo(varchar) as
select col1 from test_table where col1 = $1::numeric;
PREPARE
test=# execute foo('1');
 col1 
------
    1
(1 row)

Computing password hash values with md5

POSTGRESQL 2014. 11. 26. 11:17
1. Create the pgcrypto extension.
testdb=# create extension pgcrypto;
CREATE EXTENSION
testdb=# create table user_info(user_id varchar(30), password varchar(100));
CREATE TABLE
testdb=# insert into user_info values('id1', crypt('id1_password', gen_salt('md5')));
INSERT 0 1
testdb=# insert into user_info values('id2', crypt('id2_password', gen_salt('md5')));  
INSERT 0 1

 

2. When querying the data, write the SQL as follows.
testdb=# select user_id from user_info where user_id='id1' 
            and password = crypt('id1_password', password);
 user_id 
---------
 id1
(1 row)

testdb=# 
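The trick in the query above is that crypt(password, stored_hash) reuses the salt embedded in the stored hash, so the result equals the stored value only when the password is right. A minimal Python sketch of the same salted-verify pattern (using hashlib directly for illustration; this hash layout is hypothetical and not interchangeable with pgcrypto's MD5-crypt format):

```python
import hashlib
import os

def make_hash(password: str) -> str:
    # store the salt alongside the digest, the way crypt() embeds it in "$1$salt$..."
    salt = os.urandom(8).hex()
    digest = hashlib.md5((salt + password).encode()).hexdigest()
    return salt + "$" + digest

def verify(password: str, stored: str) -> bool:
    # re-hash the candidate password with the salt pulled from the stored value
    salt, digest = stored.split("$")
    return hashlib.md5((salt + password).encode()).hexdigest() == digest

stored = make_hash("id1_password")
print(verify("id1_password", stored))  # True
print(verify("wrong", stored))         # False
```

This is why the WHERE clause passes the stored password column itself as crypt()'s second argument: the salt travels inside the stored value.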

[9.4 beta] Changing parameters with the ALTER SYSTEM statement

POSTGRESQL 2014. 10. 29. 17:11
I tested the ALTER SYSTEM statement newly added in 9.4.
In short, the parameter names and values changed by ALTER SYSTEM are saved to the postgresql.auto.conf file,
and are applied by reloading via the SQL function pg_reload_conf().
The context attribute defined in the pg_settings system view still appears to apply as usual.

1. Change parameter values with ALTER SYSTEM.
(The changes do not take effect until pg_reload_conf() or pg_ctl reload is run.)
pg_test=# select context,name,setting from pg_settings
where name in ('enable_seqscan','authentication_timeout','max_connections');
  context   |          name          | setting 
------------+------------------------+---------
 sighup     | authentication_timeout | 60
 user       | enable_seqscan         | on
 postmaster | max_connections        | 100
(3 rows)

pg_test=# ALTER SYSTEM set authentication_timeout=30;
ALTER SYSTEM
pg_test=# ALTER SYSTEM set enable_seqscan = off;
ALTER SYSTEM
pg_test=# ALTER SYSTEM set max_connections = 1000;
ALTER SYSTEM
pg_test=# select context,name,setting from pg_settings
where name in ('enable_seqscan','authentication_timeout','max_connections');
  context   |          name          | setting 
------------+------------------------+---------
 sighup     | authentication_timeout | 60
 user       | enable_seqscan         | on
 postmaster | max_connections        | 100
(3 rows)
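Even before the reload, the three ALTER SYSTEM commands above have already been written to postgresql.auto.conf in the data directory. The file should look roughly like this (a sketch; exact formatting can vary by version, though the warning header is written by the server itself):

```
# Do not edit this file manually!
# It will be overwritten by ALTER SYSTEM command.
authentication_timeout = '30'
enable_seqscan = 'off'
max_connections = '1000'
```

This is why pg_settings still shows the old values here: the file is written immediately, but it is only read on reload (or restart, for postmaster-context parameters).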


2. Run a reload via the pg_reload_conf() function and check the changed values of the parameters.
(However, if pg_settings.context is postmaster, the server must be restarted for the change to apply.)
pg_test=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

pg_test=# select context,name,setting from pg_settings
where name in ('enable_seqscan','authentication_timeout','max_connections');
  context   |          name          | setting 
------------+------------------------+---------
 sighup     | authentication_timeout | 30
 user       | enable_seqscan         | off
 postmaster | max_connections        | 100
(3 rows)


3. Using RESET (or SET ... TO DEFAULT), you can restore a parameter to the value stored in postgresql.conf.
pg_test=# ALTER SYSTEM reset authentication_timeout;
ALTER SYSTEM
pg_test=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)
pg_test=# select context,name,setting from pg_settings
where name in ('enable_seqscan','authentication_timeout','max_connections');
  context   |          name          | setting 
------------+------------------------+---------
 sighup     | authentication_timeout | 60
 user       | enable_seqscan         | off
 postmaster | max_connections        | 100
(3 rows)


4. Using RESET ALL, you can restore every parameter to the value stored in postgresql.conf.
pg_test=# ALTER SYSTEM reset all;
ALTER SYSTEM

pg_test=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

pg_test=# select context,name,setting from pg_settings
where name in ('enable_seqscan','authentication_timeout','max_connections');
  context   |          name          | setting 
------------+------------------------+---------
 sighup     | authentication_timeout | 60
 user       | enable_seqscan         | on
 postmaster | max_connections        | 100
(3 rows)


※ Parameters whose context is internal, such as data_directory and block_size, cannot be changed.
pg_test=# alter system set data_directory='/data';
ERROR:  parameter "data_directory" cannot be changed


table inheritance

POSTGRESQL 2014. 10. 28. 10:40
postgreSQL allows a table, at creation time, to inherit from one or more other tables.
Calling the inheriting table the child and the inherited-from table the parent: with the INHERITS option of CREATE TABLE, the child table inherits the parent's columns and its CHECK / NOT NULL constraints.
Below are examples of creating and querying child tables.

1. Create the parent table.
postgres=# CREATE TABLE cities (
postgres(#     name            text,
postgres(#     population      float,
postgres(#     altitude        int     -- in feet
postgres(# );
CREATE TABLE
postgres=# \d+ cities
                       Table "public.cities"
   Column   |       Type       | Modifiers | Storage  | Description 
------------+------------------+-----------+----------+-------------
 name       | text             |           | extended | 
 population | double precision |           | plain    | 
 altitude   | integer          |           | plain    | 
Has OIDs: no


2. Create the child table capitals, inheriting from the cities table, using the INHERITS clause.
postgres=# CREATE TABLE capitals (
postgres(#     state           char(2)
postgres(# ) INHERITS (cities);
CREATE TABLE
postgres=# \d+ capitals;
                      Table "public.capitals"
   Column   |       Type       | Modifiers | Storage  | Description 
------------+------------------+-----------+----------+-------------
 name       | text             |           | extended | 
 population | double precision |           | plain    | 
 altitude   | integer          |           | plain    | 
 state      | character(2)     |           | extended | 
Inherits: cities
Has OIDs: no


3. SELECT/UPDATE/DELETE statements against the parent table also cover its child tables; the ONLY keyword restricts the statement to the one named table.
postgres=# insert into cities values('kms',10.23,1000);
INSERT 0 1
postgres=# insert into capitals values('kms',10.23,1000,'01');
INSERT 0 1
postgres=# select * from cities;
 name | population | altitude 
------+------------+----------
 kms  |      10.23 |     1000
 kms  |      10.23 |     1000
(2 rows)
postgres=# select * from only cities;
 name | population | altitude 
------+------------+----------
 kms  |      10.23 |     1000
(1 row)


4. If another child table is created with an inherited table as its parent, queries against the top-level parent cover every table in the hierarchy.
postgres=# create table streets(col1 char(1)) inherits(capitals);
CREATE TABLE
postgres=# insert into streets values ('kms',20.24,2100,02,'1');
INSERT 0 1

postgres=# select * from only cities;
 name | population | altitude 
------+------------+----------
 kms  |      10.23 |     1000
(1 row)

postgres=# select * from only capitals;
 name | population | altitude | state 
------+------------+----------+-------
 kms  |      10.23 |     1000 | 01
(1 row)

postgres=# select * from streets;
 name | population | altitude | state | col1 
------+------------+----------+-------+------
 kms  |      20.24 |     2100 | 2     | 1
(1 row)

postgres=# select * from cities;
 name | population | altitude 
------+------------+----------
 kms  |      10.23 |     1000
 kms  |      10.23 |     1000
 kms  |      20.24 |     2100
(3 rows)


※ For readability, when writing a query against a parent table it is recommended to append * to the table name, making explicit that inherited tables are included.
Example: select * from cities*;

max_connections and kernel parameters

POSTGRESQL 2014. 10. 27. 19:41
After increasing PostgreSQL's max_connections value, you may see the following error when starting the cluster (instance).
FATAL:  could not create semaphores: No space left on device
[2012-10-24 16:49:55.313 KST][][][][][20731]DETAIL:  Failed system call was semget(5433033, 17, 03600).
[2012-10-24 16:49:55.313 KST][][][][][20731]HINT:  This error does *not* mean that you have run out of disk space.
 It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number
 of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's
consumption of semaphores by reducing its max_connections parameter (currently 510).
 The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.
This error occurs when the max_connections parameter exceeds what the semaphore-related kernel parameter SEMMNI allows; it can be resolved by raising the kernel limit or by lowering PostgreSQL's max_connections value.
Below is how to modify the kernel parameter (root privileges required).

1. Check the kernel parameters: look at the "max number of arrays" entry under Semaphore Limits.
$ ipcs -l

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536


2. Modify the kernel parameter
$ vi /etc/sysctl.conf
kernel.sem=250 256000 32 128    ## the fourth value is SEMMNI; this is the one to raise (maximum number of semaphore arrays).
$ sysctl -p /etc/sysctl.conf    ## reload with the modified values.
Formula for sizing the parameter:
SEMMNI = ceil((max_connections + autovacuum_max_workers + 4) / 16)
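The formula is easy to check. For the max_connections = 510 in the error message above (assuming the default autovacuum_max_workers = 3), the sketch below computes the minimum number of semaphore sets, rounding up as the PostgreSQL kernel-resources documentation does:

```python
import math

def min_semmni(max_connections: int, autovacuum_max_workers: int = 3) -> int:
    # PostgreSQL allocates semaphores in sets of 16, hence the division by 16
    return math.ceil((max_connections + autovacuum_max_workers + 4) / 16)

print(min_semmni(510))   # 33 sets needed, above the default SEMMNI shown below... depends on distro
print(min_semmni(1000))  # 63
```

So with the default SEMMNI of 128 shown by ipcs -l above, 510 connections should fit; the error typically means other applications (or other PostgreSQL clusters) are already consuming semaphore sets on the host.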

postgresql DBLINK example

POSTGRESQL 2014. 10. 27. 11:18
postgreSQL's object model allows two or more databases to run inside a single cluster.
Direct access between databases is not possible, however, so to reach another database you must use dblink, or the foreign data wrapper contrib module supported from version 9.3 onward.
Below is an example using the dblink module.

1. Build and install the dblink library
$ cd .src/contrib/dblink
$ make
$ make install


2. Create the extension
test=# create extension dblink;
CREATE EXTENSION


3. Create and close a dblink connection
test=# select dblink_connect('connection_name','hostaddr=HOST_IP port=PORT dbname=DBNAME user=USERNAME password=PASSWORD');
 dblink_connect 
----------------
 OK
(1 row)
-- query (the AS clause specifies a type for each column of the SELECT result)
test=# select * from dblink('db2', 'select * from test') as t1(c1 char);
 c1 
----
 d
(1 row)
-- disconnect
test=# select dblink_disconnect('db2');
 dblink_disconnect 
-------------------
 OK
(1 row)


Debugging a postgresql backend process with gdb

POSTGRESQL 2014. 10. 26. 23:57
Here is a simple way to debug with gdb while writing a C function. (If you are not familiar with gdb commands, see optimizer.snu.ac.kr/yanne/ToolManuals/gdb.doc.)

1. Before compiling, change the debug settings via configure.
(Without this, I often could not inspect variables, getting the "value optimized out" message.)
./configure --enable-cassert --enable-debug CFLAGS="-ggdb -O0"

2. Start a new backend process and check its pid.
test=# select pg_backend_pid();
 pg_backend_pid 
----------------
           3193
(1 row)

3. Attach gdb to the pid and start debugging
$ gdb -p 3193
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
.
Attaching to process 3193
Reading symbols from /usr/local/pgsql93/bin/postgres...done.
Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libssl.so.10
Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libcrypto.so.10
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgssapi_krb5.so.2
Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libk5crypto.so.3
Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libz.so.1
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libkrb5support.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5support.so.0
Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkeyutils.so.1
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libselinux.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
0x0000003f002e98c2 in recv () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6_5.2.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-15.el6_5.1.x86_64 libcom_err-1.41.12-18.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 openssl-1.0.1e-16.el6_5.14.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb)

4. Set a breakpoint
(gdb) b [name of the function to debug]
Breakpoint 1 at 0x57a733: file parser.c, line 42.
(gdb) 

5. Call the function to start debugging
test=# select [function_to_debug]; -- this blocks until the program is resumed in gdb.




Checking table/index bloat

POSTGRESQL 2014. 10. 26. 23:22

table/index bloat is, as the name says, a table's or index's physical size being abnormally inflated. Table bloat incurs unnecessary I/O cost and wastes storage, making it one of the main things to manage in PostgreSQL; it can occur when autovacuum is turned off or its related parameters are poorly tuned. Below are SQL queries that check the degree of index/table bloating.

1. index_bloat_check.sql

WITH btree_index_atts AS (
    SELECT nspname, relname, reltuples, relpages, indrelid, relam,
        regexp_split_to_table(indkey::text, ' ')::smallint AS attnum,
        indexrelid as index_oid
    FROM pg_index
    JOIN pg_class ON pg_class.oid=pg_index.indexrelid
    JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
    JOIN pg_am ON pg_class.relam = pg_am.oid
    WHERE pg_am.amname = 'btree'
    ),
index_item_sizes AS (
    SELECT
    i.nspname, i.relname, i.reltuples, i.relpages, i.relam,
    s.starelid, a.attrelid AS table_oid, index_oid,
    current_setting('block_size')::numeric AS bs,
    8 AS maxalign,
    24 AS pagehdr,
    /* per tuple header: add index_attribute_bm if some cols are null-able */
    CASE WHEN max(coalesce(s.stanullfrac,0)) = 0
        THEN 2
        ELSE 6
    END AS index_tuple_hdr,
    /* data len: we remove null values save space using it fractionnal part from stats */
    sum( (1-coalesce(s.stanullfrac, 0)) * coalesce(s.stawidth, 2048) ) AS nulldatawidth
    FROM pg_attribute AS a
    JOIN pg_statistic AS s ON s.starelid=a.attrelid AND s.staattnum = a.attnum
    JOIN btree_index_atts AS i ON i.indrelid = a.attrelid AND a.attnum = i.attnum
    WHERE a.attnum > 0
    GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9
),
index_aligned AS (
    SELECT maxalign, bs, nspname, relname AS index_name, reltuples,
        relpages, relam, table_oid, index_oid,
      ( 2 +
          maxalign - CASE /* Add padding to the index tuple header to align on MAXALIGN */
            WHEN index_tuple_hdr%maxalign = 0 THEN maxalign
            ELSE index_tuple_hdr%maxalign
          END
        + nulldatawidth + maxalign - CASE /* Add padding to the data to align on MAXALIGN */
            WHEN nulldatawidth::integer%maxalign = 0 THEN maxalign
            ELSE nulldatawidth::integer%maxalign
          END
      )::numeric AS nulldatahdrwidth, pagehdr
    FROM index_item_sizes AS s1
),
otta_calc AS (
  SELECT bs, nspname, table_oid, index_oid, index_name, relpages, coalesce(
    ceil((reltuples*(4+nulldatahdrwidth))/(bs-pagehdr::float)) +
      CASE WHEN am.amname IN ('hash','btree') THEN 1 ELSE 0 END , 0 -- btree and hash have a metadata reserved block
    ) AS otta
  FROM index_aligned AS s2
    LEFT JOIN pg_am am ON s2.relam = am.oid
),
raw_bloat AS (
    SELECT current_database() as dbname, nspname, c.relname AS table_name, index_name,
        bs*(sub.relpages)::bigint AS totalbytes, otta as expected,
        CASE
            WHEN sub.relpages <= otta THEN 0
            ELSE bs*(sub.relpages-otta)::bigint END
            AS wastedbytes,
        CASE
            WHEN sub.relpages <= otta
            THEN 0 ELSE bs*(sub.relpages-otta)::bigint * 100 / (bs*(sub.relpages)::bigint) END
            AS realbloat,
        pg_relation_size(sub.table_oid) as table_bytes,
        stat.idx_scan as index_scans
    FROM otta_calc AS sub
    JOIN pg_class AS c ON c.oid=sub.table_oid
    JOIN pg_stat_user_indexes AS stat ON sub.index_oid = stat.indexrelid
),
format_bloat AS (
SELECT dbname as database_name, nspname as schema_name, table_name, index_name,
        round(realbloat) as bloat_pct, round(wastedbytes/(1024^2)::NUMERIC) as bloat_mb,
        round(totalbytes/(1024^2)::NUMERIC,3) as index_mb,
        round(table_bytes/(1024^2)::NUMERIC,3) as table_mb,
        index_scans
FROM raw_bloat
)
-- final query outputting the bloated indexes
-- change the where and order by to change
-- what shows up as bloated
SELECT *
FROM format_bloat
WHERE ( bloat_pct > 50 and bloat_mb > 10 )
ORDER BY bloat_mb DESC;


2. table_bloat_check.sql
-- new table bloat query
-- still needs work; is often off by +/- 20%
WITH constants AS (
    -- define some constants for sizes of things
    -- for reference down the query and easy maintenance
    SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 8 AS ma
),
no_stats AS (
    -- screen out table who have attributes
    -- which dont have stats, such as JSON
    SELECT table_schema, table_name
    FROM information_schema.columns
        LEFT OUTER JOIN pg_stats
        ON table_schema = schemaname
            AND table_name = tablename
            AND column_name = attname
    WHERE attname IS NULL
        AND table_schema NOT IN ('pg_catalog', 'information_schema')
    GROUP BY table_schema, table_name
),
null_headers AS (
    -- calculate null header sizes
    -- omitting tables which dont have complete stats
    -- and attributes which aren't visible
    SELECT
        hdr+1+(sum(case when null_frac <> 0 THEN 1 else 0 END)/8) as nullhdr,
        SUM((1-null_frac)*avg_width) as datawidth,
        MAX(null_frac) as maxfracsum,
        schemaname,
        tablename,
        hdr, ma, bs
    FROM pg_stats CROSS JOIN constants
        LEFT OUTER JOIN no_stats
            ON schemaname = no_stats.table_schema
            AND tablename = no_stats.table_name
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
        AND no_stats.table_name IS NULL
        AND EXISTS ( SELECT 1
            FROM information_schema.columns
                WHERE schemaname = columns.table_schema
                    AND tablename = columns.table_name )
    GROUP BY schemaname, tablename, hdr, ma, bs
),
data_headers AS (
    -- estimate header and row size
    SELECT
        ma, bs, hdr, schemaname, tablename,
        (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
        (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM null_headers
),
table_estimates AS (
    -- make estimates of how large the table should be
    -- based on row and page size
    SELECT schemaname, tablename, bs,
        reltuples, relpages * bs as table_bytes,
    CEIL((reltuples*
            (datahdr + nullhdr2 + 4 + ma -
                (CASE WHEN datahdr%ma=0
                    THEN ma ELSE datahdr%ma END)
                )/(bs-20))) * bs AS expected_bytes
    FROM data_headers
        JOIN pg_class ON tablename = relname
        JOIN pg_namespace ON relnamespace = pg_namespace.oid
            AND schemaname = nspname
    WHERE pg_class.relkind = 'r'
),
table_estimates_plus AS (
-- add some extra metadata to the table data
-- and calculations to be reused
-- including whether we cant estimate it
-- or whether we think it might be compressed
    SELECT current_database() as databasename,
            schemaname, tablename, reltuples as est_rows,
            CASE WHEN expected_bytes > 0 AND table_bytes > 0 THEN
                TRUE ELSE FALSE END as can_estimate,
            CASE WHEN expected_bytes > table_bytes THEN
                TRUE ELSE FALSE END as is_compressed,
            CASE WHEN table_bytes > 0
                THEN table_bytes::NUMERIC
                ELSE NULL::NUMERIC END
                AS table_bytes,
            CASE WHEN expected_bytes > 0 
                THEN expected_bytes::NUMERIC
                ELSE NULL::NUMERIC END
                    AS expected_bytes,
            CASE WHEN expected_bytes > 0 AND table_bytes > 0
                AND expected_bytes <= table_bytes
                THEN (table_bytes - expected_bytes)::NUMERIC
                ELSE 0::NUMERIC END AS bloat_bytes
    FROM table_estimates
),
bloat_data AS (
    -- do final math calculations and formatting
    select current_database() as databasename,
        schemaname, tablename, can_estimate, is_compressed,
        table_bytes, round(table_bytes/(1024^2)::NUMERIC,3) as table_mb,
        expected_bytes, round(expected_bytes/(1024^2)::NUMERIC,3) as expected_mb,
        round(bloat_bytes*100/table_bytes) as pct_bloat,
        round(bloat_bytes/(1024::NUMERIC^2),2) as mb_bloat,
        table_bytes, expected_bytes
    FROM table_estimates_plus
)
-- filter output for bloated tables
SELECT databasename, schemaname, tablename,
    --can_estimate, is_compressed,
    pct_bloat, mb_bloat,
    table_mb
FROM bloat_data
-- this where clause defines which tables actually appear
-- in the bloat chart
-- example below filters for tables which are either 50%
-- bloated and more than 20mb in size, or more than 25%
-- bloated and more than 4GB in size
WHERE ( pct_bloat >= 50 AND mb_bloat >= 10 )
    OR ( pct_bloat >= 25 AND mb_bloat >= 1000 )
ORDER BY pct_bloat DESC;
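The heart of table_bloat_check.sql is the expected-size estimate: expected_bytes = ceil(reltuples * row_width / (bs - 20)) * bs, compared against the actual size relpages * bs. A toy recomputation with made-up numbers (row_width here stands in for the full per-row size including header and alignment padding; the figures are hypothetical, not from a real catalog):

```python
import math

def expected_table_bytes(reltuples: float, row_width: float, block_size: int = 8192) -> int:
    # roughly 20 bytes of page header per block are unusable for tuples
    usable = block_size - 20
    return math.ceil(reltuples * row_width / usable) * block_size

def bloat_pct(relpages: int, reltuples: float, row_width: float, block_size: int = 8192) -> float:
    table_bytes = relpages * block_size
    expected = expected_table_bytes(reltuples, row_width, block_size)
    # like the SQL, treat tables smaller than expected as 0% bloated
    return max(table_bytes - expected, 0) * 100 / table_bytes

# hypothetical table: 100k rows of ~100 bytes each, stored on 2500 pages
print(round(bloat_pct(relpages=2500, reltuples=100_000, row_width=100)))  # 51
```

As the query's own comment warns, this estimate is often off by +/- 20%, so it is a screening tool, not an exact measurement.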


※ Source: https://github.com/pgexperts/pgx_scripts



Installing pg_trgm

POSTGRESQL 2014. 10. 26. 18:48
Among PostgreSQL's additional modules, pg_trgm makes it possible to use an index scan for LIKE '%string%' searches.

Before installation, the optimizer chooses a full-scan method, as below.
testdb=# explain analyze select * from test_table where test_column like '%TEXT%';
                                                     QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on test_table  (cost=0.00..288431.08 rows=264 width=298) (actual time=952.905..1119.831 rows=6 loops=1)
   Filter: ((test_column)::text ~~ '%TEXT%'::text)
 Total runtime: 1119.854 ms
(3 rows)


To support indexing in cases like the above:
1. Build and install the contrib module.
$ make
$ make install


2. Create the extension in the target database, then create the trigram index.
testdb=# create extension pg_trgm;
CREATE EXTENSION
testdb=# CREATE INDEX trgm_idx ON test_table USING gin (test_column gin_trgm_ops);
CREATE INDEX


After installation, you can see that an index scan method is chosen for the same SQL.
testdb=# explain analyze select * from test_table where test_column like '%TEXT%';
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test_table  (cost=22.42..1245.43 rows=313 width=298) (actual time=0.021..0.029 rows=6 loops=1)
   Recheck Cond: ((test_column)::text ~~ '%TEXT%'::text)
   ->  Bitmap Index Scan on trgm_idx  (cost=0.00..22.35 rows=313 width=0) (actual time=0.014..0.014 rows=6 loops=1)
         Index Cond: ((test_column)::text ~~ '%TEXT%'::text)
 Total runtime: 0.051 ms
(5 rows)


※ Limitation: for pattern searches of fewer than 3 characters, the index search cost rises sharply and can exceed the cost of a full table scan.
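The 3-character limitation follows from how pg_trgm works: it indexes three-character substrings, and a '%ab%' pattern contains no complete trigram the index can look up. A rough Python sketch of trigram extraction, following pg_trgm's documented convention of padding a word with two leading and one trailing blank (this mimics, not reuses, the extension's C code):

```python
def trigrams(word):
    # pg_trgm lowercases and pads each word with two leading and one trailing blank
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

print(sorted(trigrams("cat")))  # ['  c', ' ca', 'at ', 'cat']
```

For LIKE '%TEXT%' the planner can demand the trigrams found wholly inside "text" ('tex', 'ext') from the GIN index; a two-character pattern yields none of these interior trigrams, so the index degenerates into a full scan of the posting lists.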


prepare statements and casting

POSTGRESQL 2014. 10. 26. 18:12

After porting prepared statements from another DBMS to postgreSQL, you may run into errors you have not seen before (mostly about a missing operator between two different types). This is because PostgreSQL does not apply implicit type casts when a SQL statement is executed with bind variables through PREPARE.



First, in the ordinary case the automatic cast can be seen working, as below...

test=# create table test_table(col1 numeric);
CREATE TABLE
test=# insert into test_table values(1);
INSERT 0 1
test=# explain select col1 from test_table where col1 = '1';
                                  QUERY PLAN                                   
-------------------------------------------------------------------------
 Seq Scan on test_table  (cost=10000000000.00..10000000026.38 rows=7 width=32)
   Filter: (col1 = 1::numeric)
(2 rows)


But executed with a bind variable, it fails with an error saying no = operator exists between the numeric and varchar types.
test=# prepare foo(varchar) as
test-# select col1 from test_table where col1 = $1;
ERROR:  operator does not exist: numeric = character varying
LINE 2: select col1 from test_table where col1 = $1;
                                               ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
test=# 


This is resolved by specifying the cast explicitly.
test=# prepare foo(varchar) as
select col1 from test_table where col1 = $1::numeric;
PREPARE
test=# execute foo('1');
 col1 
------
    1
(1 row)
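The rule at work: a quoted literal like '1' has an unknown type that PostgreSQL resolves against the column, but a bind parameter arrives with a declared type (varchar here), and no numeric = varchar operator exists. A loose analogy in Python's own strict typing (illustration only; Python quietly returns False where PostgreSQL raises an error, but the underlying rule, no implicit cast between the types, is the same):

```python
from decimal import Decimal

col1 = Decimal("1")  # stands in for the numeric column

# comparing across types fails to match, like the missing operator
print(col1 == "1")           # False: Decimal and str never compare equal

# the fix mirrors $1::numeric -- convert the parameter explicitly
print(col1 == Decimal("1"))  # True
```

In client code the same fix applies: write the placeholder as $1::numeric (or cast on the application side) rather than relying on an implicit conversion.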

