Post by Jay Harper
Post by John DeSoi
Post by Jay Harper
I've done all sorts of research trying to track down this issue
and at the moment I'm trying to figure out what would cause a
CLOSE_WAIT and how it can be fixed. It appears that CLOSE_WAIT
can be the result of a keep alive connection that is not
properly closed by the client, but the "Use Keep Alive
Connections" checkbox is NOT selected in Preferences, and
changing the "Inactive Web Process Timeout" setting has no
effect. On top of everything I changed the proxy settings and
all requests from Apache should be HTTP 1.0 with no keep alive
connections, but I need to check that I implemented that correctly.
I don't think CLOSE_WAIT has anything specifically to do with
keep-alive connections. Here is the description from the netstat
documentation:
CLOSE_WAIT: The socket connection has been closed by the remote peer,
and the system is waiting for the local application to close its half
of the connection.
So to me it sounds like the client side has closed, but the
server side has not.
As I read more, it seems you're right... CLOSE_WAIT isn't necessarily
about keep-alive connections. The problem is continuing even though
Apache is now only doing HTTP 1.0 connections and 4D is configured
not to do keep-alive connections.
I'm now wondering whether this is what happens when Apache times
out (I had the proxy timeout set to 30 seconds). So, I've upped
that setting and will wait to see if it has any effect.
In the meantime, does anyone know how to lower the CLOSE_WAIT
setting for OS X? Apparently the default is 2 minutes.
It's not something that you should try to change. The timeout period
is used by the OS as a fail-safe; it is the responsibility of the
local application to properly and promptly close its own network
connections when done. It sounds like you've either uncovered a bug
in 4D's web server, or you have a bug in your application code that
is preventing 4D from properly closing the connection.
The CLOSE_WAIT status is part of the normal life cycle of a TCP
connection. TCP is a full-duplex protocol with each side maintaining
an independent state for the connection. CLOSE_WAIT indicates that
the remote side of the connection (in your case, Apache) has signaled
its intention to close. It is waiting for you to send any remaining
data and close your side of the connection as well.
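To make "close your side of the connection" concrete, here is a minimal Python sketch at the socket level. It is not 4D code, and the names handle_connection and demo are made up for illustration. The point it shows: when the remote side closes, a read returns an empty result, and the local side stays in CLOSE_WAIT until the application closes its own socket.

```python
import socket
import threading

def handle_connection(conn: socket.socket) -> bytes:
    """Read until the peer signals end-of-data, then close our half."""
    received = b""
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:       # empty read: the peer has sent its FIN,
                break           # so our side is now in CLOSE_WAIT
            received += chunk
    finally:
        conn.close()            # sends our FIN; without this the socket
                                # would linger in CLOSE_WAIT
    return received

def demo() -> bytes:
    """Loopback demo: a client sends a few bytes and closes first."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def client() -> None:
        with socket.create_connection(("127.0.0.1", port)) as c:
            c.sendall(b"hello")
        # leaving the with-block closes the client socket

    t = threading.Thread(target=client)
    t.start()
    conn, _ = listener.accept()
    data = handle_connection(conn)
    t.join()
    listener.close()
    return data
```

The try/finally is the important part: the close happens even if the request handling raises an error partway through.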
When trying to debug a TCP connection issue, it is helpful to refer
to the connection state diagram (Figure 6, page 23) of the TCP
specification
(RFC 793, located at http://www.rfc-editor.org/rfc/rfc793.txt). This
diagram is divided vertically into two halves. The upper half
describes the state flow for opening a connection, the lower half for
closing the connection. In the middle is the ESTABLISHED state
(status 8 for ITK and 4DIC users), where TCP connections spend most
of their time.
There are typical paths through this diagram, depending on whether
you are acting as the client or the server. For example, when opening
a connection, a server will perform a passive open (creating a
LISTEN) and wait to be contacted (top center and left of the
diagram). A client performs an active open and attempts to contact a
listening server (top right). Once both sides have completed what is
called the three-way handshake, they both enter the ESTABLISHED state
where most of the data exchange takes place.
At any time, either side may close the connection. The path through
the remainder of the diagram depends on which side initiated the
close. If you close the connection, you follow the bottom-left path,
through the FIN_WAIT-1 and FIN_WAIT-2 states. If the other side
closes the connection, you move through the CLOSE_WAIT and LAST-ACK
states. Both eventually end up at CLOSED.
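The closing half of the diagram can be written down as a small lookup table. This is only a sketch of the transitions described here, not a TCP implementation; the event names ("close", "recv_fin", "recv_ack", "timeout") and the walk helper are my own labels, and the state names follow the spelling used in this post.

```python
# Close-side transitions of the RFC 793 state diagram,
# keyed by (current state, event).
CLOSE_TRANSITIONS = {
    # We close first (bottom-left path).
    ("ESTABLISHED", "close"): "FIN_WAIT-1",
    ("FIN_WAIT-1", "recv_ack"): "FIN_WAIT-2",
    ("FIN_WAIT-2", "recv_fin"): "TIME_WAIT",
    ("TIME_WAIT", "timeout"): "CLOSED",
    # The other side closes first (bottom-right path).
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "close"): "LAST-ACK",
    ("LAST-ACK", "recv_ack"): "CLOSED",
    # Both sides close at the same moment.
    ("FIN_WAIT-1", "recv_fin"): "CLOSING",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
}

def walk(state: str, events: list) -> list:
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state = CLOSE_TRANSITIONS[(state, event)]
        path.append(state)
    return path

active_close = walk("ESTABLISHED", ["close", "recv_ack", "recv_fin", "timeout"])
passive_close = walk("ESTABLISHED", ["recv_fin", "close", "recv_ack"])
```

Note that the only way out of CLOSE_WAIT in the table is the local "close" event, which is the application's job, not the OS's.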
With HTTP requests, it is usually the server that initiates the
close. After receiving an incoming request, it responds with the
appropriate data, then promptly closes the connection and creates a
replacement LISTEN for the next request. Here is a typical path
through the diagram:
1. Server creates new listen by performing passive open. Server
state: LISTEN; Client state: n/a
2. Server waits for incoming connection.
3. Client attempts to connect to server by performing active open.
Server state: LISTEN; Client state: SYN_SENT
4. Three-way handshake begins with the server receiving the client's
SYN packet. It sends an acknowledgment and its own SYN. Server state:
SYN_RCVD; Client state: SYN_SENT.
5. Handshake continues with the client receiving the server's SYN and
sending an acknowledgment of it. The client then moves to the
established state where it can begin sending and/or receiving data.
Server state: SYN_RCVD; Client state: ESTABLISHED.
6. Handshake concludes with the server's receipt of the client's
acknowledgment. Server is ready to talk and there is now a full-
duplex data connection in place. Server state: ESTABLISHED; Client
state: ESTABLISHED.
7. Client sends data (such as an HTTP request). Server state:
ESTABLISHED; Client state: ESTABLISHED.
8. Server reads data, sends other data in response. Server state:
ESTABLISHED; Client state: ESTABLISHED.
9. Both sides may continue to send more data if desired without
destroying the TCP connection. This is how HTTP/1.1 persistent
connections and the HTTP/1.0 keep-alive extension work since HTTP is
layered on top of TCP.
10. Server decides it is finished and begins closing the connection
by sending the close signal FIN. Note that the connection isn't
simply dropped at this point; it may be necessary for the OS to
retransmit a lost packet. Server state: FIN_WAIT-1; Client state:
ESTABLISHED.
11. Client receives the server's FIN and sends an acknowledgment.
Server state: FIN_WAIT-1; Client state: CLOSE_WAIT.
12. Server receives the acknowledgment and waits for the client to
close its side of the connection. Server state: FIN_WAIT-2; Client
state: CLOSE_WAIT.
...I must pause briefly here to point out something very important:
At this point, the TCP connection still exists, and in fact there is
still a half-duplex data connection from client to server. The reason
the connection doesn't go away at this point is because only one side
(the server) has indicated it will not be sending any more data. It
is still possible for the client to send more data and for the server
to receive it. This doesn't usually happen with HTTP, but may be
common with other TCP-based protocols. I will come back to this below...
13. Client is done with the connection and sends its own FIN. Server
state: FIN_WAIT-2; Client state: LAST-ACK.
14. Server receives the client's FIN and sends an acknowledgment. It
then moves into a sometimes confusing state called TIME_WAIT. I won't
go into TIME_WAIT here unless somebody really wants to know. Suffice
it to say that you rarely need to worry about TIME_WAIT and for all
intents and purposes, the connection is now closed on the server-
side. At some point, the OS will move the connection to CLOSED.
Server state: TIME_WAIT -> CLOSED; Client state: LAST-ACK.
15. Server creates new listen for next incoming connection. Server
state: LISTEN; Client state: n/a.
16. Original client receives the acknowledgment of its close. All
done. Server state: CLOSED; Client state: CLOSED.
There is no requirement that the server be the one to initiate the
connection close; the client can begin to close the connection
immediately after sending the request. It will wait for its requested
data, followed by the server's close. This happens frequently with
HTTP applications and is often a "gotcha" for people writing their
own web servers with ITK. The path through the diagram is similar,
but the roles reverse at the end:
Steps 1-7 are the same as above.
8. Before waiting for the server's response, the client indicates it
will not be sending any additional data by beginning to close the
connection. Remember that this is only a half-close, just the client-
to-server data connection. The server-to-client data connection is
still open. Server state: ESTABLISHED; Client state: FIN_WAIT-1.
9. Server reads all available data then receives the client's FIN.
Server sends acknowledgment of the FIN but continues to work on the
request. Server state: CLOSE_WAIT; Client state: FIN_WAIT-1.
10. Client receives the expected acknowledgment and continues to
receive whatever other data the server sends. Server state:
CLOSE_WAIT; Client state: FIN_WAIT-2.
...Here is where CLOSE_WAIT usually occurs in HTTP. At this point,
the server has the client's request and is working on fulfilling it.
The client has closed its side of the data connection, signaling that
it won't send any more data. However, this does not mean it is
unwilling to receive data. This is why the state is called
CLOSE_WAIT: it will be closed soon, but the client must wait until
the server is done with the request. For HTTP servers, it is
perfectly valid to send data to the client at both the ESTABLISHED
and CLOSE_WAIT states. For HTTP clients, you can receive data at any
of the ESTABLISHED, FIN_WAIT-1, or FIN_WAIT-2 states. I will cover
the case where the client is unwilling to receive any more data below...
11. Server works on request and finishes sending the requested data.
Server state: CLOSE_WAIT; Client state: FIN_WAIT-2.
12. Server indicates it is done sending data by closing the
connection. Server state: LAST-ACK; Client state: FIN_WAIT-2.
13. Client receives the server's FIN and sends an acknowledgment. It
then moves through TIME_WAIT and on to CLOSED. The connection is now
effectively closed on the client-side. Server state: LAST-ACK; Client
state: TIME_WAIT -> CLOSED.
14. Server receives acknowledgment of its FIN. All done. Server
state: CLOSED; Client state: CLOSED.
15. Server creates new listen for next incoming connection. Server
state: LISTEN; Client state: n/a.
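The client-initiated half-close above can be reproduced on the loopback interface with a short Python sketch. The function names are invented for the example, and the "server" here simply reads until it sees the client's FIN; a real HTTP server would normally find the end of the request from the headers rather than waiting for end-of-data.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read the whole request, answer, close."""
    conn, _ = listener.accept()
    with conn:
        request = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:           # client's FIN arrived: CLOSE_WAIT
                break
            request += chunk
        # Sending is still valid while in CLOSE_WAIT.
        conn.sendall(b"response to " + request)
    # Leaving the with-block closes our side: LAST-ACK, then CLOSED.

def half_close_exchange(request: bytes) -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(request)
    client.shutdown(socket.SHUT_WR)     # half-close: send FIN, keep receiving
    response = b""
    while True:
        chunk = client.recv(4096)
        if not chunk:                   # server's FIN: nothing more to read
            break
        response += chunk
    client.close()
    server.join()
    listener.close()
    return response
```

If serve_once never reached the end of its with-block (the socket-level equivalent of a handler that never returns), the server socket would sit in CLOSE_WAIT and the client in FIN_WAIT-2.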
There is a special case where both sides decide to close the
connection simultaneously. This is rare, but in this case, both send
the closing FIN and move to the FIN_WAIT-1 state. Both will receive
the FIN and send an acknowledgment, moving to the CLOSING state.
After receiving each acknowledgment, both move through TIME_WAIT to
CLOSED.
The last thing to mention is the circumstance where one side chooses
to close the connection but is unwilling to receive any more data.
Because TCP was designed to be a robust protocol, both sides need to
agree that the connection should be closed before it is truly closed.
Most well-behaved HTTP clients will simply read in but discard the
server's response, waiting for the closing FIN packet to be received.
This allows both sides to close the TCP connection cleanly. If the
client is unwilling or unable to do that, it may send what's called a
RST (reset) packet which is essentially a "kill it now" order. Upon
sending the RST, the client force-closes its TCP connection. Upon
receiving the RST, the server force-closes its TCP connection. The
status instantly jumps to CLOSED and send and receive attempts will
return an error indicating the connection was killed.
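For comparison, here is a rough Python sketch of the force-close case. It assumes a Linux or macOS style socket layer, where setting SO_LINGER with a zero timeout makes close() discard the connection and send a RST instead of a FIN; other platforms pack the option differently, so treat this as an illustration rather than portable code. The name reset_demo is made up.

```python
import socket
import struct

def reset_demo() -> str:
    """Force-close one end and report what the other end sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()

    # struct linger { l_onoff = 1, l_linger = 0 }: close() sends RST.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                      struct.pack("ii", 1, 0))
    client.close()

    try:
        data = conn.recv(4096)
        outcome = "data" if data else "eof"   # a normal FIN would give "eof"
    except ConnectionResetError:
        outcome = "reset"                     # the RST killed the connection
    finally:
        conn.close()
        listener.close()
    return outcome
```

On the platforms assumed above, the receiving side gets a "connection reset" error rather than a clean end-of-data, which matches the description of both ends jumping straight to CLOSED.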
Getting back to Jay's situation, it appears that the Apache proxy is
generating a request, sending it to the 4D web server, and either
half-closing immediately, or giving up after some timeout period.
Whatever the case, it is not force-closing the TCP connection (as
Apache is a well-behaved HTTP client); instead it is waiting for 4D to
perform its side of the close. This is why you are seeing so many connections
in the CLOSE_WAIT state; either your application code or 4D is not
finishing the job. I would suspect that if you ran netstat on the
Apache proxy server, you would see the same number of connections in
the FIN_WAIT-2 state.
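If you want to compare the counts on both machines, a small Python sketch like the following can tally connection states from netstat output. It assumes the state is the last column of each tcp line, as in plain "netstat -an" output; the sample lines are invented for the example, and the exact state spelling (FIN_WAIT_2 versus FIN_WAIT2) varies by OS.

```python
from collections import Counter

def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state, taking the last column as the state."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: proto, recv-q, send-q, local address, foreign address, state
        if len(fields) >= 6 and fields[0].lower().startswith("tcp"):
            counts[fields[-1]] += 1
    return counts

# Made-up sample in the style of macOS "netstat -an" output.
SAMPLE = """\
Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)
tcp4       0      0  10.0.0.5.8080      10.0.0.9.51234     CLOSE_WAIT
tcp4       0      0  10.0.0.5.8080      10.0.0.9.51240     CLOSE_WAIT
tcp4       0      0  10.0.0.5.8080      10.0.0.9.51302     ESTABLISHED
tcp4       0      0  *.8080             *.*                LISTEN
"""

sample_counts = count_tcp_states(SAMPLE)
```

To run it against a live machine, feed it the text captured from "netstat -an" (for example via subprocess.run with capture_output=True and text=True) on the 4D server and on the Apache proxy, then compare the CLOSE_WAIT count on one with the FIN_WAIT-2 count on the other.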
I must admit a bit of ignorance here with respect to 4D's built-in
web server. I toyed with it for a weekend back in the 4D 6.7 days,
but since then I have worked only on built-from-scratch web servers,
one using ITK, another using a custom plug-in, where I had a great
deal of control over the underlying HTTP and TCP connections. 4D's
automatic handling of the web connections is a blessing in that there
is less code for you to write, but a curse to debug because you are
at the complete and utter mercy of whatever 4D decides to let you see
(which, in this case, is very little).
4D's web server uses a classic master-slave design. The web server
process serves as the master, delegating new connection requests to
its slaves. The slave is then responsible for the entirety of the
connection. Because you are seeing a lot of TCP connections at the
CLOSE_WAIT state, I would suspect that you have one or more processes
that never return from the On Web Connection method. 4D itself is
responsible for closing the underlying TCP connection and will only
do so when it believes the web request is complete, that is to say,
when On Web Connection finishes execution.
You probably also have a user that is compounding the situation: The
user requests some resource. It doesn't show up. So they click
"Reload" in their browser. It doesn't show up. So they click "Reload"
in their browser. And so on. With each new request, another slave
process gets tied up. Eventually, all slaves are busy doing...
something, and the web server cannot delegate new incoming requests.
It effectively goes deaf and you "crash."
Here are some things to try and/or look for:
Turn on the web server's log file. It's enabled in the Web > Advanced
pane of the database preferences. It's not the greatest log file, and
entries are only written after the hit completes (and then only after
a buffer is filled), but it might give you an idea of the user's
requests that lead up to the crash.
Since I suspect that On Web Connection is not completing, create your
own log file that simply records when that method starts and stops
for each process. I'd also include the request URL, provided in $1.
At the next freeze, examine the log file to verify that every process
has indeed completed. If so, you're probably looking at a bug in 4D
because you have no control beyond that point.
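Once you have such a log, checking it by hand gets tedious, so here is a Python sketch of the check. The log format is entirely my assumption: one tab-separated line per event with a timestamp, the process name, START or STOP, and the URL. Adjust it to whatever your 4D logging method actually writes.

```python
def unfinished_requests(log_text: str) -> dict:
    """Return {process: (timestamp, url)} for every START without a STOP."""
    open_requests = {}
    for line in log_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 4:
            continue                        # skip blank or malformed lines
        timestamp, process, event, url = parts
        if event == "START":
            open_requests[process] = (timestamp, url)
        elif event == "STOP":
            open_requests.pop(process, None)
    return open_requests

# Hypothetical log excerpt: Web_2 starts a request and never finishes it.
SAMPLE_LOG = (
    "10:00:01\tWeb_1\tSTART\t/index.html\n"
    "10:00:01\tWeb_1\tSTOP\t/index.html\n"
    "10:00:05\tWeb_2\tSTART\t/report\n"
    "10:00:09\tWeb_3\tSTART\t/index.html\n"
    "10:00:09\tWeb_3\tSTOP\t/index.html\n"
)

stuck = unfinished_requests(SAMPLE_LOG)
```

Any process left in the result at the time of a freeze is one whose On Web Connection call started but never logged its end, and the URL tells you which request to look at first.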
You could also create a different log file and write the complete
contents of the HTTP request header, provided in $2, out at the top
of every On Web Connection call. This file could grow to be quite
large, so be sure to rotate it regularly--perhaps daily. The next
time you encounter your freeze, look at the last few entries in the
log; they should give you an idea of the specific requests that were
in process at the time.
Another possibility is that you are experiencing some sort of denial-
of-service attack. Any one of the log file options described above
can help you identify that.
I'd also recommend opening the Runtime Explorer window on the server
and leaving it open. That way you'll be able to see if any of the
invisible kernel processes (at least the ones that are 4D processes,
like the Web Server) are chewing up the CPU.
Hope this helps!
--
Willie Alberty, Owner
Spenlen Media
willie-***@public.gmane.org
http://www.spenlen.com/
**********************************************************************
4th Dimension Internet Users Group (4D iNUG)
**********************************************************************