解决PostgreSQL存储过程中调用c编写的动态库出错的问题

现象,存储过程调用外部c写的动态库函数decode解密偶尔会出错,原因:传递给decode的密文不能保障一定是合法解密结果。

解密函数如下:

text mydecode(text t) {

char p = decodeBytes((char ) VARDATA(t), VARSIZE(t) - VARHDRSZ);

text new_t = (text ) palloc(strlen(p)+VARHDRSZ);

SET_VARSIZE(new_t, strlen(p)+VARHDRSZ);

memcpy((void ) VARDATA(new_t),(void)p,strlen(p));

free(p);

return new_t;

}

对PG几乎也是一窍不通,下了它的文档,就直接看C Language Functions,大概了解了一下。

上面的写法是c语言函数的版本-0调用风格,已经不提倡了。

text 是一个结构,length是存储文本长度,data是起始指针:

typedef struct { integer length; char data[1]; } text;

palloc用来分配内存,用了一些宏VARDATA, VARHDRSZ, SET_VARSIZE方便书写,不易出错。

(VARHDRSZsizeof(int4) 一样, 但是我们认为用宏 VARHDRSZ 表示附加尺寸是用于变长类型的更好的风格。)

找个测试环境,在.bash_profile中增加PG环境,PATH=$PATH:$HOME/bin:/usr/local/pgsql/bin/

编译代码,超简单的makefile:

MODULES = codetest

PGXS := $(shell pg_config –pgxs)

include $(PGXS)

引用了pg的makefile,这个真好,只需要修改MODULES的就行了。这里pg_config可以用于查看pg的配置:

[postgres@bi ~]$ pg_config –pgxs

/usr/local/pgsql/lib/pgxs/src/makefiles/pgxs.mk

然后将编译后生成的动态库复制到PG的lib目录下。

在数据库中创建函数和调用如下:

CREATE OR REPLACE FUNCTION myencode(text)
RETURNS text AS

‘codetest.so’, ‘myencode’

LANGUAGE c IMMUTABLE STRICT;

弄几个测试例子:


select mydecode(‘’);

select mydecode(‘z255’);

select mydecode(‘z2551’);

select mydecode(‘)7zED5A=’);

函数调用时会出错,数据库连接都断开了。查看后端的日志

[postgres@bi pg_log]$ tail -f /var/applog/pg_log/postgresql-2012-05-10_000000.csv

2012-05-10 13:29:50.765 CST,,,5825,,4f9dc310.16c1,126,,2012-04-30 06:39:12 CST,,0,LOG,00000,”server process (PID 20433) was terminated by signal 11: Segmentation fault“,,,,,,,,

2012-05-10 13:29:50.765 CST,,,5825,,4f9dc310.16c1,127,,2012-04-30 06:39:12 CST,,0,LOG,00000,”terminating any other active server processes”,,,,,,,,

2012-05-10 13:29:50.767 CST,,,5825,,4f9dc310.16c1,128,,2012-04-30 06:39:12 CST,,0,LOG,00000,”all server processes terminated; reinitializing”,,,,,,,,

2012-05-10 13:29:51.219 CST,,,20479,,4fab524f.4fff,1,,2012-05-10 13:29:51 CST,,0,LOG,00000,”database system was interrupted; last known up at 2012-05-10 13:27:38 CST”,,,,,,,,

2012-05-10 13:29:51.228 CST,,,20479,,4fab524f.4fff,2,,2012-05-10 13:29:51 CST,,0,LOG,00000,”database system was not properly shut down; automatic recovery in progress”,,,,,,,,

2012-05-10 13:29:51.269 CST,,,20479,,4fab524f.4fff,3,,2012-05-10 13:29:51 CST,,0,LOG,00000,”record with zero length at BF/1C615314”,,,,,,,,

2012-05-10 13:29:51.269 CST,,,20479,,4fab524f.4fff,4,,2012-05-10 13:29:51 CST,,0,LOG,00000,”redo is not required”,,,,,,,,

2012-05-10 13:29:51.278 CST,,,20479,,4fab524f.4fff,5,,2012-05-10 13:29:51 CST,,0,LOG,00000,”checkpoint starting: shutdown immediate”,,,,,,,,

2012-05-10 13:29:51.294 CST,,,20479,,4fab524f.4fff,6,,2012-05-10 13:29:51 CST,,0,LOG,00000,”checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.000 s, total=0.025 s”,,,,,,,,

2012-05-10 13:29:51.304 CST,,,20482,,4fab524f.5002,1,,2012-05-10 13:29:51 CST,,0,LOG,00000,”autovacuum launcher started”,,,,,,,,

2012-05-10 13:29:51.306 CST,,,5825,,4f9dc310.16c1,129,,2012-04-30 06:39:12 CST,,0,LOG,00000,”database system is ready to accept connections”,,,,,,,,

2012-05-10 13:29:52.232 CST,”postgres”,”postgres”,20484,”172.16.15.49:61169”,4fab5250.5004,1,”startup”,2012-05-10 13:29:52 CST,,0,FATAL,42704,”unrecognized configuration parameter “”application_name”””,,,,,,,,

应该还是的内存地址的引用出了问题,原代码中缺少写日志(要写了日志文件调试会好很多)。看了代码decodeBytes 可能会返回NULL,这导致了后续的strlen()函数出错。所以简单地修改了方法,增加空指针判断,如下:

if (p == NULL) {

p = malloc(1);

*p = ‘’;

}

问题得以解决。


BTW. 好久未看代码了,没感觉。