The lost time

During a briefing at the beginning of a tuning assignment the IT staff of a Swiss company describes their problem: “Our new application runs in Switzerland and also as a local installation of our subsidiary in the US. Both installations are served by our local database in Switzerland . The Step in question in this application is running with us 10 Seconds in Switzerland, in the US, it takes about 2 minutes. We have a trace of the application, but we can not find the problem. Granted, there are thousands of SQL statements that we work through, but each one is fast and their total time is about 10 seconds.”

What do you think is the cause of the problem? How would you verify your assumption? What could be a short-term solution? How should a long-term solution look like?

4 thoughts on “The lost time

  1. Hi Lothar,
    at first i would be interested in how they have traced it, but based on the description i would assume that they just looked at the “db time” waits (= 10 seconds in sum) and ignored the “idle” waits, even though they count as application response time.

    >> What do you think is the cause of the problem?
    I would consider the network round trip time for packet transfer (e.g. for content of FETCH calls) as root cause as the local application installation in Switzerland is fast, but the remote installation in the US is slow.

    “SQL*Net message to client” and “SQL*Net more data to client” waits track only the time for writing the data from SDU buffer to TCP socket buffer. The time (of a TCP packet) that is needed for sending the data to the client is already counted as “SQL*Net message from client”.

    >> How would you verify your assumption?
    Perform an extended SQL trace for both locations and parse the raw trace with help of a profiler like “Method R profiler” or “TVD$XTAT”. Afterwards i would check the wait histogram for “SQL*Net message from client”, if most of the time is spent there (as i would expect).

    >> What could be a short-term solution?
    Maybe the US location can also use the local app installation in Switzerland by using WTS, VNC or whatever.

    >> How should a long-term solution look like?
    Distributed database landscape, Advanced Network Compression, Re-write app that only the preprocessed content is displayed (e.g. in browser) but the database/client transfer happens locally (example of such an app would be Enovia V6).

    I hope that my assumption was in the right direction :-))

    Regards
    Stefan

    Liked by 1 person

  2. Hi Lothar,

    > What do you think is the cause of the problem?
    From the problem description and without additional 10046 trace data, my guess is that 110 seconds (from the total of 120) are lost in many small network round-trips caused by many SQL executions – you mentioned that there are thousands of SQL-s in the trace file.

    > How would you verify your assumption?
    That might be verified by checking the 10046 trace resource profile. If there are many short “SQL*Net message from client” wait events and their total duration is approximately 110 seconds, that might confirm the hypothesis.

    > What could be a short-term solution?
    If the problem is really caused by many network round-trips, then it would be necessary to determine why do these round-trips occur. Is that because of a too small arraysize/fetchsize so that larger resultsets need more network round-trips for the data to be transferred? Is that because array/bulk processing is not used where it could be, so that many similar short SQL statements are unnecessarily executed? Is that because of too many unnecessary parses? etc.

    > How should a long-term solution look like?
    If the problem is really caused by many network round-trips, the goal should be to minimize them based on the root-cause analysis mentioned in the previous point. If their number can’t be minimized, the alternative is to make their duration shorter. That might be achieved by shortening the physical distance between the database and the application server (maybe using replication/Data Guard if the nature of the application allows it or by moving the application server closer to the database).

    Regards,
    Jure Bratina

    Like

Leave a comment