Tuesday, November 20, 2012

Better tracing of Npgsql connection pool usage in the works

Sometimes, Npgsql users receive errors when working with connection pooling. The problem appears when they try to open a new connection and receive the following message: 

"Timeout while getting a connection from pool."

This happens when an attempt is made to get a connection from a pool whose connections are all in use. Most of the time this problem is difficult to track down because it generally shows up when the system is in production and not during development. Of course. :)

Some time ago, I received a report from Miłosz Kubański about such a problem. I told him I would work on a way to get more information so we could check what was happening.

In order to help us find the problem, I added a little "hack" to Npgsql: whenever an error occurs while getting a connection from the pool, Npgsql logs a stacktrace of the allocation of every connection that was in the pool. In theory, allocated connections should not be held on to, so by taking a "snapshot" of those allocations we could get some hints about which code allocated each connection and check it for potentially missing releases.
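To make the idea concrete, here is a minimal sketch in Python. It is not the Npgsql code: the `TracingPool` class, its method names, and the string stand-ins for connections are all assumptions made for illustration. It only shows the technique described above: remember the call stack at each allocation, and dump those stacks when a caller times out waiting for a connection.

```python
import queue
import threading
import traceback


class PoolTimeoutError(Exception):
    """Raised when no connection becomes free within the timeout."""


class TracingPool:
    """Hypothetical pool that remembers where each connection was allocated."""

    def __init__(self, size, timeout=0.1):
        self._timeout = timeout
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")  # stand-ins for real connections
        self._lock = threading.Lock()
        self._allocated_at = {}  # connection -> stack trace of the caller

    def acquire(self):
        try:
            conn = self._free.get(timeout=self._timeout)
        except queue.Empty:
            # Every connection is in use: report who is holding each one.
            raise PoolTimeoutError(
                "Timeout while getting a connection from pool.\n" + self.snapshot()
            ) from None
        # Record the caller's stack; [:-1] drops the frame of acquire() itself.
        stack = "".join(traceback.format_stack()[:-1])
        with self._lock:
            self._allocated_at[conn] = stack
        return conn

    def release(self, conn):
        with self._lock:
            self._allocated_at.pop(conn, None)
        self._free.put(conn)

    def snapshot(self):
        """Return the allocation stack of every connection still checked out."""
        with self._lock:
            items = list(self._allocated_at.items())
        return "\n".join(f"{conn} allocated at:\n{stack}" for conn, stack in items)
```

With a pool like this, a code path that acquires a connection and never releases it shows up by name in the timeout message, which is the kind of hint the log was meant to provide.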

Last week I sent Miłosz this modified version of Npgsql so he could give it a try and send me back the log. Yesterday he sent it to me. I picked two stacktraces from it and asked him whether the code behind them might be missing a close call.

Miłosz replied saying there was indeed a missing close call. It was an exception inside a very complex system.

Based on this feedback, I think this "hack" has proven its worth and should be very useful in future cases, so I'll work on bringing this change to a "non-hack" status and adding it to the Npgsql code. I hope this helps Npgsql users find possible causes of connection pooling problems.

When this feature is available I'll let you know.

Wednesday, October 10, 2012

Where is VS.Net design time support?

You may already know that work on VS.Net design time support started a long time ago and hasn't received much attention since then.

Now that Npgsql release 2.0.12 is out, I want to put more effort into finishing a version which adds design time support. I think this is the biggest missing feature in Npgsql and I want to fix that. Npgsql users deserve to be able to use VS.Net design time support to help them create better apps which access PostgreSQL databases.

Although I can't give a concrete timeframe for when it will be available, I want you to know that I'm focused on this feature and it is not stalled. I hope to be able to give you more information soon.

Stay tuned.

Monday, January 09, 2012

ConnectionPool performance improvements

Hi, all!

Today I committed a change to Npgsql which will improve connection pool performance. This change was motivated by Andrew's bug report, where he noticed that a lot of threads were waiting to get a new connection from the pool.

In order to keep the pool consistent, Npgsql has to lock access to it. Andrew's problem appeared on a busy server where a lot of threads were trying to get a new connection from the pool. They had to wait in line. And obviously this isn't good.

Until this change, Npgsql took one big lock around all the code needed to work with the pool, and more! As Andrew noticed in his bug report, I/O operations were being done inside this lock, which added even more delay to getting a connection from the pool.

So, to fix that, I rewrote the connection pool logic to remove this big lock and break it down into smaller ones, taken only when really needed. All the I/O operations were moved out of the locks. This way, other threads waiting to get a new connection from the pool don't need to wait for those expensive operations to finish.
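The general pattern can be sketched as follows, in Python. This is not the actual Npgsql implementation: the `NarrowLockPool` class and the `open_connection` callable are hypothetical names used only for illustration. The lock protects just the pool's bookkeeping, and the slow step of opening a connection runs after the lock has been released.

```python
import threading


class NarrowLockPool:
    """Hypothetical pool: the lock guards bookkeeping only, never I/O."""

    def __init__(self, max_size, open_connection):
        self._max_size = max_size
        self._open_connection = open_connection  # slow I/O (assumed callable)
        self._lock = threading.Lock()
        self._idle = []   # connections ready for reuse
        self._count = 0   # connections that exist or are being opened

    def acquire(self):
        # Short critical section: reuse an idle connection or reserve a slot.
        with self._lock:
            if self._idle:
                return self._idle.pop()
            if self._count >= self._max_size:
                raise RuntimeError("pool exhausted")
            self._count += 1

        # The expensive part runs with the lock released, so other threads
        # can enter acquire() and release() in the meantime.
        try:
            return self._open_connection()
        except Exception:
            with self._lock:
                self._count -= 1  # give the reserved slot back
            raise

    def release(self, conn):
        with self._lock:
            self._idle.append(conn)
```

Reserving the slot under the lock before opening the connection is what keeps the pool from growing past its maximum size even though several threads may be opening connections at the same time.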

I ran a lot of tests and could confirm that when I break into the code with the debugger, threads are spread throughout the connection pool code as expected instead of waiting in line on the big lock.

As this change is somewhat critical to Npgsql usage, I'd like to ask you to download the code, compile it, give it a try and see if everything is working OK or even better than before. I expect busy servers to be able to increase their raw throughput because their threads will have to wait less to get connections from the pool.

As always, please let me know if you have any problems. All feedback is very welcome!