PostgreSQL数据库pg_hba.conf文件学习笔记

8月 30,2018 留下评论 PostgreSQL

对于PostgreSQL数据库的pg_hba.conf文件从以下几个方面来学习。

1 pg_hba.conf的作用

pg_hba.conf文件是数据库判断客户端/应用程序能否正常访问数据库的唯一依据。(HBA stands for host-based authentication.)

略微牵强一点儿的横向比较的话，其作用类似于Oracle数据库的listener.ora文件。

pg的客户端访问数据库的时候，需要提供数据库服务器主机名或者IP地址，端口号，数据库名，数据库用户名，以及数据库用户密码或者数据库认证方式。而这些信息是存在于pg_hba.conf文件中的。同样，Oracle客户端访问数据库的过程中，也需要提供Oracle数据库主机名或者IP地址，端口号，访问协议，以及服务名。而这些信息是记录在listener.ora文件中的。

2 pg_hba.conf的位置

默认情况下，该文件位于数据库服务器的data路径下，名为pg_hba.conf。当然，这些都不是绝对的，尤其对于Unix/Linux环境下，文件的扩展名只是一种约定的惯例，并不像windows平台下，但凡.exe代表可执行文件等。

可以在psql命令行中，通过超级用户执行show hba_file来获取pg_hba.conf的具体路径。

postgres=# show hba_file ;
              hba_file               
-------------------------------------
 /orabackup/pg9.6.4/data/pg_hba.conf
(1 row)

postgres=#

如果是没有权限的普通用户执行该命令，可能会有下述错误：

ai=> show hba_file;
ERROR:  must be superuser to examine "hba_file"
ai=>

如果需要修改或者调整配置文件pg_hba.conf的路径的话，需要通过修改postgresql.conf文件中hba_file参数来实现，并且要提前做好pg_hba.conf文件的备份，以及需要重启数据库才能使修改生效。

3 pg_hba.conf的内容

该文件是一个普通的文本类型文件，可以直接用文本编辑器编辑修改；
以#开头行的是注释行；
以行为单位，每一行是一个有效的记录，不允许不支持跨行的记录存在；
每一个生效的行记录都包含：连接类型、数据库名、用户名、允许的客户端IP地址或地址段、认证方式，这5个字段。且，各字段之间以空格分开。类似如下：

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
host    all             all             172.0.0.0/8             md5

这里local开头的行，表示的是数据库允许通过主机服务器的Unix socket通信来访问和连接数据库。如果没有local开头的行，则意味着数据库不接受socket通信的连接访问方式。说白了，就是没有这一行的话，或注释掉改行的话，则无法通过简单的psql来访问数据库：

[postgres@localhost data]$ psql
psql: FATAL:  no pg_hba.conf entry for host "[local]", user "postgres", database "postgres"
[postgres@localhost data]$

第1个host开头的行，意味着，允许接受客户端为数据库主机环路地址127.0.0.1上，通过提供数据库IP地址，端口，数据库名、用户名的方式来访问数据库，认证方式为trust，即不需要提供密码的方式来访问数据库。且，可以以数据库中已经存在的任意用户来访问任意数据库，如：

[postgres@localhost data]$ psql -h localhost -p 5432 -d postgres -U postgres
psql (9.6.4)
Type "help" for help.

postgres=# \q
[postgres@localhost data]$ psql -h 127.0.0.1 -p 5432 -d postgres -U ai
psql (9.6.4)
Type "help" for help.

postgres=>

第2个host开头的行，意味着，允许从IP范围是172.0.0.0到172.254.254.254之间的任意地址，以数据库中已经存在的用户，发起到任意数据库的连接请求，但是必须要提供用户口令，因为认证方式是是通过md5加密的。如：

[postgres@localhost data]$ psql -h 172.16.11.35 -p 5432 -d postgres -U enterprisedb
Password for user enterprisedb: 
psql (9.6.4, server 9.3.11.33)
Type "help" for help.

postgres=#

从IP为172.18.1.12的客户端，发起到172.16.11.35服务端的连接请求，在提供名为enterprisedb的数据库用户密码之后，顺利连接到服务器。

连接类型，除了local、host之外，还有hostssl和hostnossl方式；
数据库和用户名字段的all，表示可以访问任意数据库，以及以任意数据库用户发起对于数据库的访问，如果需要对于特定的数据库和特定的用户作限制的话，则应该明确写出库名和数据库用户名；
数据库在做连接验证的过程中，对于该文件是从上向下查找的，如果找到一条记录匹配客户端发起的数据库连接请求，则参照该记录来验证，不再继续向下找，验证成功则访问数据库成功，反之失败，如下：

pg_hba.conf含有下述2条记录：

# IPv4 local connections:
host    all             all             127.0.0.1/32            md5 
host    all             all             127.0.0.1/32            trust

那么，此时，数据库选择第一条记录来验证，就必须的正确的输入数据库用户postgres的口令，否则会报错。如下：

[postgres@localhost data]$ psql -h 127.0.0.1 -p 5432 -d postgres -U postgres
Password for user postgres: 
psql: FATAL:  password authentication failed for user "postgres"
[postgres@localhost data]$

显然，对于这次验证失败，数据库并没有选择第2条验证方位为trust的无需密码的方式来验证；

对于数据库名字段上，如果需要匹配多个数据库的话，可以用逗号分隔开多个数据库的库名；
数据库字段名上，如果出现replication的话，并不表示对于名为replication的数据库名授权验证，而是验证一个来自物理备库的连接请求；
address字段上，如下：

172.20.143.89/32 for a single host，只匹配172.20.143.89这一个IP地址；
172.20.143.0/24 for a small network，匹配从172.20.143.0到172.20.143.254这个地址段；
10.6.0.0/16 for a larger one，匹配从10.6.0.0到10.6.254.254地址段；
::1/128 for a single host ，匹配IPv6的本地环路地址，等价于IPv4的127.0.0.1；
0.0.0.0/0 represents all IPv4 addresses，匹配所有IPv4的地址；
::0/0 represents all IPv6 addresses，匹配所有IPv6地址；

4 修改pg_hba.conf的内容

通过文本编辑器，按照其固定格式，直接修改完该文件之后，需要执行pg_ctl reload使其生效。

5 个人收获

从基础知识点上搞清楚了该文件的作用；
明白了该文件中local那一行的真正具体含义；
学会了该文件和postgrres.conf文件中的listen_address参数一起搭配来控制，通过网络来访问数据库的控制和限制；如果listen_addresses =’localhost’时，则该数据库不允许任何通过网络方式来的异机访问，只允许通过数据库服务器本机连接本机上的数据库；
学到了一点儿关于网络掩码的设置，如172.20.143.0/24的具体含义。

更为详细的信息，可以进一步查看官方文档：https://www.postgresql.org/docs/current/static/auth-pg-hba-conf.html

PostgreSQL数据库中explain和explain analyze的区别

8月 27,2018 留下评论

对于PostgreSQL数据库，可以直接在命令行上使用“explain your_statements”的方式来查看SQL语句的执行计划。

更为具体详细的用法，可以在psql命令行上使用\h explain来查看获取更为具体用法。

这里，需要注意的是，explain analyze 会分析SQL语句的执行计划，并真正执行SQL语句。而，默认情况下，PostgreSQL数据库的事务是自动开启、提交的。那么，对于DML语句来说，如果要使用explain analyze来查看其执行计划的话，就得格外注意了。如下：

postgres=# \c ai ai
You are now connected to database "ai" as user "ai".
ai=> \d p_table
           Table "public.p_table"
 Column |         Type          | Modifiers 
--------+-----------------------+-----------
 id     | integer               | not null
 name   | character varying(32) | 
Indexes:
    "p_table_pkey" PRIMARY KEY, btree (id)
Referenced by:
    TABLE "c_table" CONSTRAINT "c_table_id_fkey" FOREIGN KEY (id) REFERENCES p_table(id) ON UPDATE CASCADE ON DELETE CASCADE

ai=> select * from p_table ;
 id |    name    
----+------------
  3 | onlyou.com
(1 row)

ai=> select * from c_table ;
 id |  info  
----+--------
  3 | xiamen
(1 row)

ai=> explain delete from p_table ;
                           QUERY PLAN                           
----------------------------------------------------------------
 Delete on p_table  (cost=0.00..17.10 rows=710 width=6)
   ->  Seq Scan on p_table  (cost=0.00..17.10 rows=710 width=6)
(2 rows)

ai=> select * from p_table ;
 id |    name    
----+------------
  3 | onlyou.com
(1 row)

ai=> explain analyze delete from p_table ;
                                                QUERY PLAN                                                
----------------------------------------------------------------------------------------------------------
 Delete on p_table  (cost=0.00..17.10 rows=710 width=6) (actual time=0.074..0.074 rows=0 loops=1)
   ->  Seq Scan on p_table  (cost=0.00..17.10 rows=710 width=6) (actual time=0.013..0.014 rows=1 loops=1)
 Planning time: 0.050 ms
 Trigger for constraint c_table_id_fkey: time=2.637 calls=1
 Execution time: 2.735 ms
(5 rows)

ai=> select * from p_table ;
 id | name 
----+------
(0 rows)

ai=>

explain analyze delete from p_table ;查看一条删除操作SQL的执行计划，其结果就是，真正的把表里的数据删除了，且提交了。
对于所有的DML操作，如果只是想通过explain analyze来查看其执行计划的话，需要在执行之前，显示开启事务，然后查看执行计划，最后回滚事务。

postgres=# begin;
BEGIN
postgres=# explain analyze xxxxxxxxx;
postgres=# rollback;
ROLLBACK
postgres=#

如何在Linux上源码安装PostgreSQL数据库？

8月 21,2018 留下评论 Linux, PostgreSQL

零说明

本文记录在CentOS/RHEL Linux平台上通过源码安装PostgreSQL数据库、配置数据库的过程和步骤，及注意事项。

一前提条件

要求OS上已经安装下述软件。如果没有，可以直接通过yum来安装。

1 gmake或者make要求至少3.80版本以上

[root@localhost ~]# make -v
GNU Make 3.82
Built for x86_64-redhat-linux-gnu
Copyright (C) 2010  Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
[root@localhost ~]# gmake -v
GNU Make 3.82
Built for x86_64-redhat-linux-gnu
Copyright (C) 2010  Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
[root@localhost ~]# which make
/usr/bin/make
[root@localhost ~]# which gmake
/usr/bin/gmake
[root@localhost ~]# ll /usr/bin/make
-rwxr-xr-x. 1 root root 182776 11月  6 2016 /usr/bin/make
[root@localhost ~]# ll /usr/bin/gmake
lrwxrwxrwx. 1 root root 4 1月   8 2018 /usr/bin/gmake -> make
[root@localhost ~]#

这里，看到gmake版本为3.82。gmake是一个链接文件，其指向了make。其实，在CentOS/RHEL平台上，默认情况下，gmake等同于make。

2 C编译器

[root@localhost ~]# which gcc
/usr/bin/gcc
[root@localhost ~]# gcc -v
使用内建 specs。
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
目标：x86_64-redhat-linux
配置为：../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
线程模型：posix
gcc 版本 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) 
[root@localhost ~]#

如果没有的话，可以直接安装一个最新的gcc即可。

3 tar

[root@localhost ~]# which tar
/usr/bin/tar
[root@localhost ~]# tar --version
tar (GNU tar) 1.26
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
这是自由软件：您可以自由地更改并重新发布它。
在法律所允许的范围内不含任何担保。

由 John Gilmore 和 Jay Fenlason 所写。
[root@localhost ~]#

用于解压缩源码格式的安装压缩包文件。

 如果下载的源文件格式类似于postgresql-9.4.10.tar.gz，则用tar -zxvf postgresql-9.4.10.tar.gz解压；
如果下载的源文件格式类似于postgresql-9.4.10.tar.bz2，则用tar jxvf postgresql-9.4.10.tar.bz2解压；
注意，第二种格式文件的解压，命令行选项中不带减号-。

4 GNU readline library

该库文件默认启用。用于在psql命令行下，可以通过键盘的上下箭头调出历史命令以及编辑之前的命令。如果不需要此功能的话，可以在configure的时候，带上--without-readline选项。

5 zlib compression library

该库文件默认启用。启用该选项则意味着，使用pg_dump/pg_restore对数据库进行备份/恢复的时候，不支持压缩？？？如果不需要此功能的话，可以在configure的时候，带上–without-zlib选项。

经过我初步验证测试，暂时还没搞清楚该库文件的具体功能和作用。暂时记下这个问题，留待我后续进一步验证一下？？？

https://dan.langille.org/2013/06/10/using-compression-with-postgresqls-pg_dump/

http://peter.eisentraut.org/blog/2015/12/07/check-your-pg-dump-compression-levels/

二源码安装

1 创建postgres用户

[root@dev-middleware ~]# groupadd postgres
[root@dev-middleware ~]# useradd -g postgres postgres
[root@dev-middleware ~]# passwd postgres
更改用户 postgres 的密码 。
新的 密码：
无效的密码： 它基于字典单词
重新输入新的 密码：
passwd：所有的身份验证令牌已经成功更新。
[root@dev-middleware ~]#

2 创建数据库安装路径

假定将来要把数据库软件安装在/data/postgres/10.4/路径下，数据库的数据存放在/data/postgres/10.4/data路径下。
这里，先把/data/postgres/10.4/路径创建出来即可，/data/postgres/10.4/data路径不需提前创建。

[root@dev-middleware ~]# mkdir -p /data/postgres/10.4/
[root@dev-middleware ~]# chown -R postgres:postgres /data/postgres/10.4/
[root@dev-middleware ~]#

3 获取源码

到PostgreSQL官方网站，获取源码格式的数据库安装文件。

[root@dev-middleware ~]# su - postgres
[postgres@dev-middleware ~]$ pwd
/home/postgres
[postgres@dev-middleware ~]$ wget https://ftp.postgresql.org/pub/source/v10.4/postgresql-10.4.tar.gz
--2018-08-06 11:25:51--  https://ftp.postgresql.org/pub/source/v10.4/postgresql-10.4.tar.gz
正在解析主机 ftp.postgresql.org (ftp.postgresql.org)... 87.238.57.227, 217.196.149.55, 204.145.124.244, ...
正在连接 ftp.postgresql.org (ftp.postgresql.org)|87.238.57.227|:443... 已连接。
已发出 HTTP 请求，正在等待回应... 200 OK
长度：26652442 (25M) [application/x-gzip]
正在保存至: “postgresql-10.4.tar.gz”

100%[======================================================================================================================================================================================================>] 26,652,442  5.83MB/s 用时 6.4s   

2018-08-06 11:26:03 (3.98 MB/s) - 已保存 “postgresql-10.4.tar.gz” [26652442/26652442])

[postgres@dev-middleware ~]$ ll
总用量 26028
-rw-rw-r-- 1 postgres postgres 26652442 5月   8 05:06 postgresql-10.4.tar.gz
[postgres@dev-middleware ~]$

4 解压源码

如果下载的源码文件格式为类似于postgresql-9.4.10.tar.bz2，则用tar jxvf postgresql-9.4.10.tar.bz2解压。

解压后，会生成一个新的postgresql-10.4路径，该路径下存放的就是源码格式的安装文件。

[postgres@dev-middleware ~]$ pwd
/home/postgres
[postgres@dev-middleware ~]$ ll
总用量 26028
-rw-rw-r-- 1 postgres postgres 26652442 5月   8 05:06 postgresql-10.4.tar.gz
[postgres@dev-middleware ~]$ 
[postgres@dev-middleware ~]$ tar -zxvf postgresql-10.4.tar.gz 
...
...
postgresql-10.4/GNUmakefile.in
postgresql-10.4/.gitattributes
postgresql-10.4/aclocal.m4
postgresql-10.4/configure.in
postgresql-10.4/INSTALL
[postgres@dev-middleware ~]$

5 执行configure

执行configure的命令行选项，–prefix参数，表示把PostgreSQL安装在哪个路径下。默认情况下，不带该参数时，则会安装在/usr/local/pgsql路径下。

[postgres@dev-middleware ~]$ pwd
/home/postgres
[postgres@dev-middleware ~]$ ll
总用量 26032
drwxrwxr-x 6 postgres postgres     4096 5月   8 05:06 postgresql-10.4
-rw-rw-r-- 1 postgres postgres 26652442 5月   8 05:06 postgresql-10.4.tar.gz
[postgres@dev-middleware ~]$ cd postgresql-10.4
[postgres@dev-middleware postgresql-10.4]$ ll
总用量 676
-rw-r--r--  1 postgres postgres    457 5月   8 04:51 aclocal.m4
drwxrwxr-x  2 postgres postgres   4096 5月   8 05:03 config
-rwxr-xr-x  1 postgres postgres 498962 5月   8 04:51 configure
-rw-r--r--  1 postgres postgres  76909 5月   8 04:51 configure.in
drwxrwxr-x 55 postgres postgres   4096 5月   8 05:03 contrib
-rw-r--r--  1 postgres postgres   1192 5月   8 04:51 COPYRIGHT
drwxrwxr-x  3 postgres postgres   4096 5月   8 05:03 doc
-rw-r--r--  1 postgres postgres   3638 5月   8 04:51 GNUmakefile.in
-rw-r--r--  1 postgres postgres    284 5月   8 04:51 HISTORY
-rw-r--r--  1 postgres postgres  71584 5月   8 05:06 INSTALL
-rw-r--r--  1 postgres postgres   1682 5月   8 04:51 Makefile
-rw-r--r--  1 postgres postgres   1212 5月   8 04:51 README
drwxrwxr-x 16 postgres postgres   4096 5月   8 05:06 src
[postgres@dev-middleware postgresql-10.4]$ ./configure --prefix=/data/postgres/10.4/
...
...
config.status: linking src/include/port/linux.h to src/include/pg_config_os.h
config.status: linking src/makefiles/Makefile.linux to src/Makefile.port
[postgres@dev-middleware postgresql-10.4]$

6 执行make world

这里，也可以简单执行make就可以build安装文件的，但是，make world的意思是指把PostgreSQL相关的文档，HTML，以及其它的一些模块都会一起编译进去。比如，在有的环境下遇到无法使用uuid或者使用不了gin index的原因，就是在编译的时候，没有包含这些对应的模块。处理这种的问题的方法也不复杂，只需要进到当时安装时的那个源文件路径下，重新执行make world，然后make install-world。

建议，在初始安装的时候，就直接用make world。

[postgres@dev-middleware postgresql-10.4]$ make world
..
..
make[2]: 离开目录“/home/postgres/postgresql-10.4/contrib/vacuumlo”
make[1]: 离开目录“/home/postgres/postgresql-10.4/contrib”
PostgreSQL, contrib, and documentation successfully made. Ready to install.
[postgres@dev-middleware postgresql-10.4]$

7 执行make install-world

[postgres@dev-middleware postgresql-10.4]$ make install-world 
...
...
/bin/mkdir -p '/data/postgres/10.4/bin'
/bin/install -c  vacuumlo '/data/postgres/10.4/bin'
make[2]: 离开目录“/home/postgres/postgresql-10.4/contrib/vacuumlo”
make[1]: 离开目录“/home/postgres/postgresql-10.4/contrib”
PostgreSQL, contrib, and documentation installation complete.
[postgres@dev-middleware postgresql-10.4]$

8 初始化数据库

[postgres@dev-middleware postgresql-10.4]$ /data/postgres/10.4/bin/initdb -D /data/postgres/10.4/data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "zh_CN.UTF-8".
The default database encoding has accordingly been set to "UTF8".
initdb: could not find suitable text search configuration for locale "zh_CN.UTF-8"
The default text search configuration will be set to "simple".

Data page checksums are disabled.

creating directory /data/postgres/10.4/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /data/postgres/10.4/bin/pg_ctl -D /data/postgres/10.4/data -l logfile start

[postgres@dev-middleware postgresql-10.4]$

9 启动数据库

[postgres@dev-middleware postgresql-10.4]$ cd
[postgres@dev-middleware ~]$ ps -ef|grep postgres
root     19521 10727  0 11:25 pts/0    00:00:00 su - postgres
postgres 19522 19521  0 11:25 pts/0    00:00:00 -bash
postgres 19882 19522  0 17:39 pts/0    00:00:00 ps -ef
postgres 19883 19522  0 17:39 pts/0    00:00:00 grep --color=auto postgres
[postgres@dev-middleware ~]$  /data/postgres/10.4/bin/pg_ctl -D /data/postgres/10.4/data -l /home/postgres/pg.log start
waiting for server to start.... done
server started
[postgres@dev-middleware ~]$ ps -ef|grep postgres
root     19521 10727  0 11:25 pts/0    00:00:00 su - postgres
postgres 19522 19521  0 11:25 pts/0    00:00:00 -bash
postgres 21158     1  0 17:39 pts/0    00:00:00 /data/postgres/10.4/bin/postgres -D /data/postgres/10.4/data
postgres 21160 21158  0 17:39 ?        00:00:00 postgres: checkpointer process   
postgres 21161 21158  0 17:39 ?        00:00:00 postgres: writer process   
postgres 21162 21158  0 17:39 ?        00:00:00 postgres: wal writer process   
postgres 21163 21158  0 17:39 ?        00:00:00 postgres: autovacuum launcher process   
postgres 21164 21158  0 17:39 ?        00:00:00 postgres: stats collector process   
postgres 21165 21158  0 17:39 ?        00:00:00 postgres: bgworker: logical replication launcher   
postgres 21168 19522  0 17:39 pts/0    00:00:00 ps -ef
postgres 21169 19522  0 17:39 pts/0    00:00:00 grep --color=auto postgres
[postgres@dev-middleware ~]$

10 修改环境变量

修改postgres用户的~/.bash_profile，PATH添加/data/postgres/10.4/bin，同时，添加PGDATA=/data/postgres/10.4/data环境变量。

PATH=$PATH:$HOME/.local/bin:$HOME/bin:/data/postgres/10.4/bin

export PATH

PGDATA=/data/postgres/10.4/data
export PGDATA

使其环境变量生效：

[postgres@dev-middleware ~]$ source ~/.bash_profile 
[postgres@dev-middleware ~]$

11 修改数据库参数

a 修改pg_hba.conf文件

在/data/postgres/10.4/data/pg_hba.conf文件中，添加下面的一行：

# IPv4 local connections:
host    all             all             0.0.0.0/0               md5

其，表示当前数据库服务器，对使用IPV4地址的客户端机器，允许其通过任意IP地址，通过任意数据库用户发起对该数据库服务器上的任意数据库的访问，但是必须得提供数据库用户密码。其认证方式，是通过MD5加密认证的。

一句话简介：对于使用IPV4地址的客户端，数据库服务器不做限制，可以通过任意用户访问所有数据库。

b 修改postgresql.conf 文件

在/data/postgres/10.4/data/postgresql.conf 文件中，修改#listen_addresses = ‘localhost’为#listen_addresses = ‘*’。

然后，reload使其生效。

原因是，如果不修改的话，则客户端只能通过数据库服务器的环路地址127.0.0.1来访问数据库，不能使用服务器的公网IP地址来访问。

如下，修改之前：

[postgres@dev-middleware ~]$ psql
psql (10.4)
Type "help" for help.

postgres=# show listen_addresses ;
 listen_addresses 
------------------
 localhost
(1 row)

postgres=# \q
[postgres@dev-middleware ~]$ psql -h 127.0.0.1 -p 5432 -d postgres
psql (10.4)
Type "help" for help.

postgres=# show listen_addresses ;
 listen_addresses 
------------------
 localhost
(1 row)

postgres=# \q
[postgres@dev-middleware ~]$ psql -h 172.16.11.44 -p 5432 -d postgres
psql: could not connect to server: 拒绝连接
        Is the server running on host "172.16.11.44" and accepting
        TCP/IP connections on port 5432?
[postgres@dev-middleware ~]$

修改listen_addresses = ‘*’，且重启数据库使其生效之后：

[postgres@dev-middleware ~]$ vi /data/postgres/10.4/data/postgresql.conf 
...
...

listen_addresses = '*'         # what IP address(es) to listen on;
                                        # comma-separated list of addresses;
                                        # defaults to 'localhost'; use '*' for all
                                        # (change requires restart)
#port = 5432                            # (change requires restart)
max_connections = 100                   # (change requires restart)
#superuser_reserved_connections = 3     # (change requires restart)
#unix_socket_directories = '/tmp'       # comma-separated list of directories
                                        # (change requires restart)
#unix_socket_group = ''                 # (change requires restart)
#unix_socket_permissions = 0777         # begin with 0 to use octal notation
                                        # (change requires restart)
"/data/postgres/10.4/data/postgresql.conf" 658L, 22755C written
[postgres@dev-middleware ~]$ pg_ctl restart -m fast
waiting for server to shut down....2018-08-06 18:31:40.294 CST [12290] LOG:  received fast shutdown request
2018-08-06 18:31:40.306 CST [12290] LOG:  aborting any active transactions
2018-08-06 18:31:40.307 CST [12290] LOG:  worker process: logical replication launcher (PID 12297) exited with exit code 1
2018-08-06 18:31:40.314 CST [12292] LOG:  shutting down
2018-08-06 18:31:40.347 CST [12290] LOG:  database system is shut down
 done
server stopped
waiting for server to start....2018-08-06 18:31:40.427 CST [14493] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2018-08-06 18:31:40.427 CST [14493] LOG:  listening on IPv6 address "::", port 5432
2018-08-06 18:31:40.433 CST [14493] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2018-08-06 18:31:40.487 CST [14494] LOG:  database system was shut down at 2018-08-06 18:31:40 CST
2018-08-06 18:31:40.498 CST [14493] LOG:  database system is ready to accept connections
 done
server started
[postgres@dev-middleware ~]$

可以通过数据库服务器的实际IP地址，访问数据库，

[postgres@dev-middleware ~]$ psql 
psql (10.4)
Type "help" for help.

postgres=# show listen_addresses ;
 listen_addresses 
------------------
 *
(1 row)

postgres=# \q
[postgres@dev-middleware ~]$ psql -h 127.0.0.1 -p 5432 -d postgres
psql (10.4)
Type "help" for help.

postgres=# show listen_addresses ;
 listen_addresses 
------------------
 *
(1 row)

postgres=# \q
[postgres@dev-middleware ~]$ psql -h 172.16.11.44 -p 5432 -d postgres -U postgres
Password for user postgres: 
psql (10.4)
Type "help" for help.

postgres=# show listen_addresses ;
 listen_addresses 
------------------
 *
(1 row)

postgres=# \q
[postgres@dev-middleware ~]$

12 删除PostgreSQL数据库软件

postgres用户可以进入到之前解压的源码所在的路径，通过执行make uninstall来卸载已经安装的数据库软件。如下：

[postgres@dev-middleware ~]$ pwd
/home/postgres
[postgres@dev-middleware ~]$ ll
总用量 26036
-rw------- 1 postgres postgres      939 8月   6 18:29 pg.log
drwxrwxr-x 6 postgres postgres     4096 8月   6 11:29 postgresql-10.4
-rw-rw-r-- 1 postgres postgres 26652442 5月   8 05:06 postgresql-10.4.tar.gz
[postgres@dev-middleware ~]$ make uninstall
make: *** 没有规则可以创建目标“uninstall”。 停止。
[postgres@dev-middleware ~]$ cd postgresql-10.4
[postgres@dev-middleware postgresql-10.4]$ make uninstall 
...
...
make[1]: 进入目录“/home/postgres/postgresql-10.4/config”
rm -f '/data/postgres/10.4/lib/pgxs/config/install-sh'
rm -f '/data/postgres/10.4/lib/pgxs/config/missing'
make[1]: 离开目录“/home/postgres/postgresql-10.4/config”
[postgres@dev-middleware postgresql-10.4]$

13 重新编译安装PostgreSQL数据库软件

可以通过前面第4步的，重新解压源码，用新解压出来的源码文件，进行configure，make world,make install-world。

还有一种方式，就是把之前第5步骤执行configure之后的文件，恢复到configure之前的状态，然后通过执行configure，make world,make install-world。这个命令是make distclean。

如下是第二种方式的记录：

[postgres@dev-middleware postgresql-10.4]$ pwd
/home/postgres/postgresql-10.4
[postgres@dev-middleware postgresql-10.4]$ ll
总用量 1088
-rw-r--r--  1 postgres postgres    457 5月   8 04:51 aclocal.m4
drwxrwxr-x  2 postgres postgres   4096 5月   8 05:03 config
-rw-rw-r--  1 postgres postgres 373348 8月   6 11:29 config.log
-rwxrwxr-x  1 postgres postgres  39415 8月   6 11:29 config.status
-rwxr-xr-x  1 postgres postgres 498962 5月   8 04:51 configure
-rw-r--r--  1 postgres postgres  76909 5月   8 04:51 configure.in
drwxrwxr-x 55 postgres postgres   4096 5月   8 05:03 contrib
-rw-r--r--  1 postgres postgres   1192 5月   8 04:51 COPYRIGHT
drwxrwxr-x  3 postgres postgres   4096 5月   8 05:03 doc
-rw-rw-r--  1 postgres postgres   3638 8月   6 11:29 GNUmakefile
-rw-r--r--  1 postgres postgres   3638 5月   8 04:51 GNUmakefile.in
-rw-r--r--  1 postgres postgres    284 5月   8 04:51 HISTORY
-rw-r--r--  1 postgres postgres  71584 5月   8 05:06 INSTALL
-rw-r--r--  1 postgres postgres   1682 5月   8 04:51 Makefile
-rw-r--r--  1 postgres postgres   1212 5月   8 04:51 README
drwxrwxr-x 16 postgres postgres   4096 8月   6 11:29 src
[postgres@dev-middleware postgresql-10.4]$ make distclean
...
...
make[1]: 离开目录“/home/postgres/postgresql-10.4/src”
rm -rf tmp_install/
rm -f config.cache config.log config.status GNUmakefile
[postgres@dev-middleware postgresql-10.4]$ ll
总用量 676
-rw-r--r--  1 postgres postgres    457 5月   8 04:51 aclocal.m4
drwxrwxr-x  2 postgres postgres   4096 5月   8 05:03 config
-rwxr-xr-x  1 postgres postgres 498962 5月   8 04:51 configure
-rw-r--r--  1 postgres postgres  76909 5月   8 04:51 configure.in
drwxrwxr-x 55 postgres postgres   4096 5月   8 05:03 contrib
-rw-r--r--  1 postgres postgres   1192 5月   8 04:51 COPYRIGHT
drwxrwxr-x  3 postgres postgres   4096 5月   8 05:03 doc
-rw-r--r--  1 postgres postgres   3638 5月   8 04:51 GNUmakefile.in
-rw-r--r--  1 postgres postgres    284 5月   8 04:51 HISTORY
-rw-r--r--  1 postgres postgres  71584 5月   8 05:06 INSTALL
-rw-r--r--  1 postgres postgres   1682 5月   8 04:51 Makefile
-rw-r--r--  1 postgres postgres   1212 5月   8 04:51 README
drwxrwxr-x 16 postgres postgres   4096 8月   6 19:10 src
[postgres@dev-middleware postgresql-10.4]$

三小结

本文用于记录如何在Linux平台上，通过编译源码的方式，来安装配置PostgreSQL数据库，作一个简单记录。更为详细的信息可以参考官方文档。

PostgreSQL数据库排序的小结

8月 14,2018 留下评论

一 OS环境

 [root@localhost ~]# cat /etc/redhat-release 
CentOS release 6.5 (Final)
[root@localhost ~]# uname -rm
2.6.32-431.el6.x86_64 x86_64
[root@localhost ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         48216      47094       1122          0        286      44795
-/+ buffers/cache:       2012      46204
Swap:        24175       1712      22463
[root@localhost ~]#

二数据库环境

ai=> select version();
                                                 version                                                  
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit
(1 row)

ai=> \dt+ t_btree 
                    List of relations
 Schema |  Name   | Type  | Owner |  Size  | Description 
--------+---------+-------+-------+--------+-------------
 public | t_btree | table | ai    | 346 MB | 
(1 row)

ai=> select count(*) from t_btree ;
  count   
----------
 10000000
(1 row)

ai=>

三 external merge

ai=> show shared_buffers ;
 shared_buffers 
----------------
 128MB
(1 row)

ai=> show work_mem ;
 work_mem 
----------
 4MB
(1 row)

ai=> explain (analyze,verbose,buffers,costs,timing) select * from t_btree order by id;
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1580363.83..1605363.83 rows=10000000 width=4) (actual time=8397.774..10610.457 rows=10000000 loops=1)
   Output: id
   Sort Key: t_btree.id
   Sort Method: external merge  Disk: 136936kB
   Buffers: shared hit=15871 read=28377, temp read=50518 written=50518
   ->  Seq Scan on public.t_btree  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.028..861.348 rows=10000000 loops=1)
         Output: id
         Buffers: shared hit=15871 read=28377
 Planning time: 2.056 ms
 Execution time: 11135.532 ms
(10 rows)

ai=>

这里的排序方法采用的external mege，意味着数据量太大了，在内存里排序排不下(work_mem=4MB对于当前场景的排序，不够用)，只好交换到磁盘上排序，磁盘排序完成之后，再把排序结果交换到内存中。效率最低。

shared_buffers相当于Oracle数据库的database buffer cache，单纯加大该参数，即使所有的数据都能从shared buffers命中，对该排序依然无效，如下：

ai=> explain (analyze,verbose,buffers,costs,timing) select id from t_btree order by id;
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1580363.83..1605363.83 rows=10000000 width=4) (actual time=8032.685..10126.444 rows=10000000 loops=1)
   Output: id
   Sort Key: t_btree.id
   Sort Method: external merge  Disk: 136936kB
   Buffers: shared hit=44248, temp read=50518 written=50518
   ->  Seq Scan on public.t_btree  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.021..792.516 rows=10000000 loops=1)
         Output: id
         Buffers: shared hit=44248
 Planning time: 0.062 ms
 Execution time: 10655.508 ms
(10 rows)

ai=> select 44248*8;
 ?column? 
----------
   353984
(1 row)

ai=> select 44248*8/1024.0;
       ?column?       
----------------------
 345.6875000000000000
(1 row)

ai=> show shared_buffers ;
 shared_buffers 
----------------
 4GB
(1 row)

ai=> show work_mem ;
 work_mem 
----------
 4MB
(1 row)

ai=>

shared_buffers=4GB，work_mem=4MB时。执行计划中，Buffers: shared hit=44248，内存命中44248个内存块儿，每个内存页8KB，换算之后，约为346MB，足够缓存该表的全部数据，即所有数据都从内存中读取。但是，排序方法依然是external merge。

四 quicksort排序

ai=> show shared_buffers ;
 shared_buffers 
----------------
 512MB
(1 row)

ai=> show work_mem ;
 work_mem 
----------
 4MB
(1 row)

ai=> set work_mem ='800MB';
SET
ai=> show work_mem ;
 work_mem 
----------
 800MB
(1 row)

ai=> explain (analyze,verbose,buffers,costs,timing) select * from t_btree order by id;
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1306922.83..1331922.83 rows=10000000 width=4) (actual time=15001.049..17213.924 rows=10000000 loops=1)
   Output: id
   Sort Key: t_btree.id
   Sort Method: quicksort  Memory: 741817kB
   Buffers: shared hit=15903 read=28345
   ->  Seq Scan on public.t_btree  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.060..920.913 rows=10000000 loops=1)
         Output: id
         Buffers: shared hit=15903 read=28345
 Planning time: 0.054 ms
 Execution time: 17756.761 ms
(10 rows)

ai=> explain (analyze,verbose,buffers,costs,timing) select * from t_btree order by id;
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1306922.83..1331922.83 rows=10000000 width=4) (actual time=6075.205..7945.550 rows=10000000 loops=1)
   Output: id
   Sort Key: t_btree.id
   Sort Method: quicksort  Memory: 741817kB
   Buffers: shared hit=16063 read=28185
   ->  Seq Scan on public.t_btree  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.060..879.729 rows=10000000 loops=1)
         Output: id
         Buffers: shared hit=16063 read=28185
 Planning time: 0.070 ms
 Execution time: 8466.822 ms
(10 rows)

ai=> explain (analyze,verbose,buffers,costs,timing) select * from t_btree order by id;
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1306922.83..1331922.83 rows=10000000 width=4) (actual time=5400.587..7208.614 rows=10000000 loops=1)
   Output: id
   Sort Key: t_btree.id
   Sort Method: quicksort  Memory: 741817kB
   Buffers: shared hit=16095 read=28153
   ->  Seq Scan on public.t_btree  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.059..847.957 rows=10000000 loops=1)
         Output: id
         Buffers: shared hit=16095 read=28153
 Planning time: 0.068 ms
 Execution time: 7724.861 ms
(10 rows)

ai=>

说明：

1 默认情况下work_mem=4MB，不满足在内存中完成该排序操作，通过向上调整work_mem=800MB ，使其能够在内存中完成该排序，这里的800MB是逐渐上调得到的一个值。为了学习研究用，实际生产环境，调整该参数值时，要注意该值work_mem*max_connections不能超过总物理内存大小；

2 调整完work_mem=800MB后，多次观察该SQL的执行计划，可以发现随着Buffers: shared hit=16095 read=28153的变化，其Execution time逐渐降低。

This will definitely be faster than external merge, since all of the data is brought into memory and then the sorting is done in memory itself.

五 top-N heapsort

ai=> show work_mem ;
 work_mem 
----------
 4MB
(1 row)

ai=> show shared_buffers ;
 shared_buffers 
----------------
 128MB
(1 row)

ai=> explain (analyze,verbose,buffers,costs,timing) select id from t_btree order by id limit 10;
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=360344.40..360344.43 rows=10 width=4) (actual time=1784.116..1784.117 rows=10 loops=1)
   Output: id
   Buffers: shared hit=3 read=44248
   ->  Sort  (cost=360344.40..385344.40 rows=10000000 width=4) (actual time=1784.113..1784.113 rows=10 loops=1)
         Output: id
         Sort Key: t_btree.id
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=3 read=44248
         ->  Seq Scan on public.t_btree  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.023..856.626 rows=10000000 loops=1)
               Output: id
               Buffers: shared read=44248
 Planning time: 0.396 ms
 Execution time: 1784.149 ms
(13 rows)

ai=>

说明:

1 这里SQL加上了limit 10，则执行计划选择了top-N heapsort，同时，内存只用了25kB。

2 PostgreSQL数据库维护了一个heap的内存结构，该heap有一个上限大小。排序过程大致如下：PostgreSQL先根据limit大小，顺序的把数据放入heap中，等到heap被填满之后，再去读取下一个数据，判断其值是否小于heap中已有的最大值？如果大于最大值，则直接丢弃这个数据，返回heap中的结果集，排序完成。如果小于最大值，则把这个值放入heap中的最大值处，并把heap中之前的那个最大值移除heap，继续读取下一个值，重复该过程，直到读取的新值不再大于heap中的最大值为止。

3 由于SQL中有order by id，才会涉及到排序操作，如果没有这个order by从句的话，那么无需涉及排序，直接返回表中读取的前10条记录；

4 这里无需对整个表进行排序，只需获取最小的10条数据，并对其进行排序即可。

六 index 排序

ai=> create index idx_t_btree_id on t_btree(id);
CREATE INDEX
ai=> show work_mem ;
 work_mem 
----------
 4MB
(1 row)

ai=> show shared_buffers ;
 shared_buffers 
----------------
 128MB
(1 row)

ai=> explain (analyze,verbose,buffers,costs,timing) select * from t_btree order by id limit 10;
                                                                      QUERY PLAN                                                                       
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..0.87 rows=10 width=4) (actual time=0.030..0.045 rows=10 loops=1)
   Output: id
   Buffers: shared hit=13
   ->  Index Only Scan using idx_t_btree_id on public.t_btree  (cost=0.43..436680.67 rows=10000033 width=4) (actual time=0.028..0.039 rows=10 loops=1)
         Output: id
         Heap Fetches: 10
         Buffers: shared hit=13
 Planning time: 0.120 ms
 Execution time: 0.066 ms
(9 rows)

ai=> explain (analyze,verbose,buffers,costs,timing) select * from t_btree order by id desc limit 10;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..0.87 rows=10 width=4) (actual time=0.024..0.038 rows=10 loops=1)
   Output: id
   Buffers: shared hit=12 read=1
   ->  Index Only Scan Backward using idx_t_btree_id on public.t_btree  (cost=0.43..436680.67 rows=10000033 width=4) (actual time=0.022..0.035 rows=10 loops=1)
         Output: id
         Heap Fetches: 10
         Buffers: shared hit=12 read=1
 Planning time: 0.213 ms
 Execution time: 0.061 ms
(9 rows)

ai=>

说明：

实际情况下，通过index来访问数据，并不是严格意义的排序，只不过是index本身已经是排好序的数据集，数据库只需直接从index读取数据即可，并不需要执行额外的排序操作。这里，id字段上有一个B-tree index，无论是按照id的升序、降序排序，都可以直接从index读取数据。

七小结

1 此前，并没有认真的关注过external merge排序操作，原来是因为内存空间不够，work_mem内存区不足以用于排序的操作，故而数据库不得不采用将数据交换到磁盘排序，然后把排序结果交换回内存；

2 quick sort是一种成熟的排序方法，大学的数据结构课程学过，早已还给老师了，找时间回顾一下该知识点；

3 heap sort方法对于自己是一种新的排序，通过小结，也大致清楚了其排序的实现方式。

八参考

https://madusudanan.com/blog/all-you-need-to-know-about-sorting-in-postgres/

Archive 2018年8月30日