Top 10 PostgreSQL Performance Optimization Tips

PostgreSQL performance optimization is a widely discussed issue on the most forums. Although PostgreSQL is shipped with a solid default configuration aimed to fit most setups, it requires fair amount of performance optimization to offer the best. There are basically two major aspects of PostgreSQL database performance optimization.

This involves enhancing the use of the hardware and configuration setting, and optimizing the performance queries sent to database. Well, it's not possible for the PostgreSQL developers to tune the default configuration for everyone. We prepared a list of top 10 PostgreSQL performance optimization tips that combines both the above aspects.

1. Hardware and Configuration changes

When you are running queries you need to look at how much your CPU and memory is being taxed. To increase the speed and memory run postmaster with various flags to increase the speed and memory. Having said so much, it needs to be added that if your query plan is not feasible the hardware and configuration has nothing to do with it.

2. Choosing the file system

If you are using an Operating System like Linux that has multiple file system, choose the one that will be the best from the performance point of view. There no single opinion among PostgreSQL users about which file system is best.

There are diverse opinion over Ext2, Ext3, ReiserFS, and XFS. Although Ext2 is said to be faster on some setups the recovery issues are a major concern. Essentially the benchmark would be a combination of file system, disk/array configuration, OS version, and database table size and distribution.

Overtly, you must stick to the file system that is best supported by your distribution, like for instance Ext3 for Red Hat Linux, ReiserFS for SuSE Linux and don't forget XFS known for it's large file support.

3. Magic Block

In order to ensure that a dynamically loaded object file is not loaded into an incompatible server, PostgreSQL whether that file includes a magic block with the appropriate contents. It allows the server to identify the incompatibilities like the code compiled for a different major version of PostgreSQL. In PostgreSQL versions like 8.2 you need to include the magic block. In order to include the magic block you need to write this in one of the module source files. Before that include the header

fmgr.h:

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Note: The #ifdef test can be removed if the code doesn't need to compile against pre-8.2 PostgreSQL releases.

4. Try the Auto Vacumm daemon

PostgreSQL databases need periodic maintenance known as vacuuming. In the PostgreSQL 8.1 and versions above there is a separate optional server process called the auto vacuum deamon. It's automates the the execution of vacuum and analyze commands. When the auto vacuum daemon is enabled, it runs periodically and checks for tables that have had a large number of updated, inserted or deleted tuples. To use the autovacuum daemon stats_start_collector and stats_row_level should be set to true, as it checks the use of row-level statistics collection facility. Further, when choosing the value of superuser_reserved_connections the user must allow a slot for auto vacuum process.

Basically the process will vacuum the database when needed. This would allow not require the you to change the corn setting for vacuum frequency. Overall, it would result in better database performance by eliminating overdue vacuum issues.

5. Partial Indices

In order to force use an index you need to employ the true partial index. You can assume that table2 below has no rows where field2=0. To store the clause field2<>0 in pg-index you can use the actions below. When you see the predicate it always uses the partial index. In this case, it is used as a full index to trick it.

create index i on table2(field2) where field2 <> 0;

select * from table2 where field2<>0;

6. External programs

To improve the performance of queries in Postgresql you need to break out a query into a series of small, explicit nested loops in a C, Perl, or other client program. This would actually improve the performance, especially if you require a subset or results/tables.

7. Monitor Query Progress

When doing with long queries you must be looking to monitor the progress of queries. There is an easy trick by Alan Williams that allows you to monitor the progress of long running queries. Just add to the query a sequence - (select nextval('sq_test'),…) and then use the currval('sq_test') to watch how far the query has progressed.

8. Sort memory

In case you have queries with order-by or group-by clauses that requires sorting large data set increasing the sort memory can help. Using this parameter you can set maximum limits on the memory that a database connection can use to perform sorts. However, its important to take into account that this parameter is per sort, per connection. Especially, database with many users must consider the shortcomings before setting the parameter. The best approach would be to set this parameter per connection as and when required. it would be low for the simple queries and higher for the large, complex queries and data dumps.

9. Stored Procedures

To control the query execution more explicitly you need to write a stored procedure. What you need to do is break out SQL into small cursors instead of a single large cursor. You might run up against the same problem.

10. Views

Adding a step to the query planner might impact the query speed according to the views. However, adding more clauses to the view should be avoided as it may affect the query plan in a bad way. More queries make it confusing for the user.
Enhanced by Zemanta

Share/Bookmark

No comments: