Weblog » Tuning Berkeley DB for high performance web applications
Extensive stress tests of the new Porcupine version using different configurations for Berkeley DB have shown some interesting results. Here are some tips for configuring Berkeley DB in order to get the most out of it when used for web applications.
1. Use the DB_TXN_NOWAIT flag for transactional operations. This option minimizes the number of write locks held at any given time, since transactions do not block waiting for a lock to become available; instead, when a lock is not available the transaction is immediately aborted, releasing all the locks it currently holds, and is then retried. This option also eliminates the need for a deadlock detector, because no transaction ever blocks waiting for a lock held by another, resulting in a deadlock-free database.
Of course this approach requires that you wrap all of your transactional operations in a special wrapper (a Python decorator) that retries each transaction a given number of times whenever a lock is not granted, using some kind of exponential or linear backoff between retries.
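Such a decorator might look like the following sketch. It is written against a stand-in `LockNotGrantedError` exception (in the real Porcupine code this would be bsddb's `DBLockNotGrantedError`); the names and parameters are illustrative, not the actual Porcupine implementation:

```python
import functools
import random
import time

class LockNotGrantedError(Exception):
    """Stand-in for bsddb's DBLockNotGrantedError, raised when a
    DB_TXN_NOWAIT transaction cannot acquire a lock immediately."""

def transactional(max_retries=10, base_delay=0.01):
    """Retry a transactional operation with randomized exponential
    backoff whenever a lock is not granted."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except LockNotGrantedError:
                    # The abort has already released all locks;
                    # back off a little and try the whole txn again.
                    time.sleep(base_delay * (2 ** attempt) * random.random())
            raise LockNotGrantedError('transaction retry limit reached')
        return wrapper
    return decorator
```

The randomized backoff helps de-synchronize competing transactions so that they do not all retry at the same instant.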
2. Drop the D out of ACID for transactions that need not be durable by using the DB_TXN_NOSYNC flag. Such transactions include those updating session state, or even better all transactions, if you have some kind of synchronous replication to multiple nodes for failover capabilities.
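One convenient way to apply this environment-wide, without touching code, is a `DB_CONFIG` file in the environment's home directory (the flag can also be set per environment via `DB_ENV->set_flags`, or per transaction when the transaction is begun):

```
# DB_CONFIG in the environment home directory:
# commits no longer synchronously flush the log to disk
set_flags DB_TXN_NOSYNC
```

Note that with DB_TXN_NOSYNC a crash can lose the most recently committed transactions, which is exactly the durability trade-off being made here.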
3. For transactional cursors, use degree 2 isolation whenever possible by passing the DB_READ_COMMITTED flag. This increases overall write throughput, because data items read by such a cursor can be modified or deleted before the cursor's transaction commits.
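As a sketch using the bsddb3 (pybsddb) bindings, with an assumed environment home directory and database file name, the flag is passed when the cursor is created:

```python
from bsddb3 import db  # third-party pybsddb bindings

env = db.DBEnv()
env.open('db_home', db.DB_CREATE | db.DB_INIT_MPOOL | db.DB_INIT_LOCK |
         db.DB_INIT_LOG | db.DB_INIT_TXN)

store = db.DB(env)
store.open('items.db', None, db.DB_BTREE,
           db.DB_CREATE | db.DB_AUTO_COMMIT)

txn = env.txn_begin()
# Degree 2 isolation: read locks are dropped as the cursor moves on,
# so writers are not blocked until this transaction commits.
cursor = store.cursor(txn, db.DB_READ_COMMITTED)
rec = cursor.first()
while rec is not None:
    key, value = rec
    rec = cursor.next()
cursor.close()
txn.commit()
store.close()
env.close()
```

The trade-off is that a row read twice within the same transaction may have changed in between, which is acceptable for most read-mostly web requests.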
4. If you have many requests competing for updates on the same resource, consider using a semaphore to limit the maximum number of concurrent transactions per process. Tests have shown that under these conditions and without a semaphore, single-process server instances achieve better throughput than multi-process instances. This happens because on a single-process server the number of concurrent transactions stays low enough that transactions do not compete heavily with one another, resulting in fewer retries.
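A minimal sketch of this throttle, assuming a threaded server and a hypothetical per-process limit of four concurrent update transactions:

```python
import threading

# Hypothetical cap on concurrent update transactions per process;
# tune this against your own contention measurements.
MAX_CONCURRENT_TXNS = 4
_txn_slots = threading.BoundedSemaphore(MAX_CONCURRENT_TXNS)

def run_update(operation):
    """Run a transactional update while holding one of the limited
    slots, so at most MAX_CONCURRENT_TXNS updates run at once."""
    with _txn_slots:
        return operation()
```

Requests beyond the limit simply queue on the semaphore instead of piling into the lock table and triggering retries.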
5. I guess this last one is well known: keep the data and log files on separate disks.
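This too can be done with a `DB_CONFIG` file in the environment home directory; the path below is only an example and should point at a directory on a different physical disk than the data files:

```
# DB_CONFIG in the environment home directory (data files live here);
# write-ahead log files go to a separate physical disk
set_lg_dir /disk2/bdb-logs
```

Separating the mostly sequential log writes from the random data-page I/O lets both disks do what they are good at.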