If a data block to be skipped over is less than 4kB, just read the
data instead of using fseeko(). Experimentation shows that this
avoids useless kernel calls --- possibly quite a lot of them, at
least with current glibc --- while not incurring any extra I/O,
since libc will read 4kB at a time anyway. (There may be platforms
where the default buffer size is different from 4kB, but this
change seems unlikely to hurt in any case.)

We don't expect short data blocks to be common in the wake of
66ec01dc4 and related commits. But older pg_dump files may well
contain very short data blocks, and that will likely be a case
to be concerned with for a long time.

While here, do a little bit of other cleanup in _skipData.
Make "buflen" be size_t not int; it can't really exceed the
range of int, but comparing size_t and int variables is just
asking for trouble. Also, when we initially allocate a buffer
for reading skipped data into, make sure it's at least 4kB to
reduce the odds that we'll shortly have to realloc it bigger.

Author: Dimitrios Apostolou <jimis@gmx.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2edb7a57-b225-3b23-a680-62ba90658fec@gmx.net
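
For illustration only, the skip policy described in this message can be
sketched outside of pg_dump roughly as follows. This is a minimal
sketch, not the committed code: skip_block and SKIP_THRESHOLD are
hypothetical names, plain malloc() stands in for pg_malloc(), failures
are reported by return value rather than pg_fatal(), and 4kB is an
assumed stdio buffer size that may differ on some platforms.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L		/* for fseeko() */
#endif

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Assumed stdio buffer size; the real value is platform-dependent. */
#define SKIP_THRESHOLD ((size_t) 4 * 1024)

/*
 * Hypothetical helper, not pg_dump code: skip blk_len bytes of fp.
 * *buf and *buflen describe a scratch buffer reused across calls.
 * Returns true on success, false on seek, read, or allocation failure.
 */
static bool
skip_block(FILE *fp, size_t blk_len, bool has_seek,
		   char **buf, size_t *buflen)
{
	/* Long block on a seekable file: let fseeko() jump over it. */
	if (has_seek && blk_len >= SKIP_THRESHOLD)
		return fseeko(fp, (off_t) blk_len, SEEK_CUR) == 0;

	/* Short block, or no seeking possible: read and discard the data. */
	if (blk_len > *buflen)
	{
		/* Allocate at least SKIP_THRESHOLD bytes to limit regrowth. */
		size_t		newlen = blk_len > SKIP_THRESHOLD ? blk_len : SKIP_THRESHOLD;
		char	   *newbuf;

		free(*buf);
		*buf = NULL;
		*buflen = 0;
		newbuf = malloc(newlen);
		if (newbuf == NULL)
			return false;
		*buf = newbuf;
		*buflen = newlen;
	}
	return fread(*buf, 1, blk_len, fp) == blk_len;
}
```

A caller would start with buf = NULL and buflen = 0, call skip_block()
once per data block, and free(buf) when done. Keeping buflen as size_t
means the comparison against blk_len never mixes signed and unsigned
types.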
lclContext *ctx = (lclContext *) AH->formatData;
size_t blkLen;
char *buf = NULL;
- int buflen = 0;
+ size_t buflen = 0;
blkLen = ReadInt(AH);
while (blkLen != 0)
{
- if (ctx->hasSeek)
+ /*
+ * Seeks of less than stdio's buffer size are less efficient than just
+ * reading the data, at least on common platforms. We don't know the
+ * buffer size for sure, but 4kB is the usual value. (While pg_dump
+ * currently tries to avoid producing such short data blocks, older
+ * dump files often contain them.)
+ */
+ if (ctx->hasSeek && blkLen >= 4 * 1024)
{
if (fseeko(AH->FH, blkLen, SEEK_CUR) != 0)
pg_fatal("error during file seek: %m");
if (blkLen > buflen)
{
free(buf);
- buf = (char *) pg_malloc(blkLen);
- buflen = blkLen;
+ buflen = Max(blkLen, 4 * 1024);
+ buf = (char *) pg_malloc(buflen);
}
if (fread(buf, 1, blkLen, AH->FH) != blkLen)
{