Discussion in 'Server Operation' started by edge, Aug 23, 2006.

    When I use the command top I get the following results.

    top - 23:11:59 up 11 days, 12:51,  1 user,  load average: 56.43, 48.66, 47.98
    Tasks: 502 total,  85 running, 417 sleeping,   0 stopped,   0 zombie
     Cpu0 : 71.4% us, 24.8% sy,  0.0% ni,  3.8% id,  0.0% wa,  0.0% hi,  0.0% si
     Cpu1 : 96.5% us,  3.5% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
     Cpu2 : 90.2% us,  6.0% sy,  0.0% ni,  3.8% id,  0.0% wa,  0.0% hi,  0.0% si
     Cpu3 : 83.6% us, 15.5% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.9% si
    Mem:   3632568k total,  3423552k used,   209016k free,    37252k buffers
    Swap:  2714944k total,    21724k used,  2693220k free,  2518884k cached
    The part that worries me a bit is the us value (I think it stands for "user mode"). Should this be low, or is it fine for it to be as high as in my results?
    It should be low. Your load average is very high (> 40). Normally, it should be below 1! :eek:
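    A rough way to put that number in context is to divide the load average by the number of logical CPUs; as a common rule of thumb, values around 1 per CPU or below mean the run queue is keeping up. The sketch below does that for the figures in the top output above. load_per_cpu is a made-up helper name, and the commented-out live variant assumes a Linux /proc filesystem.

```shell
#!/bin/sh
# Sketch: relate a load average to the number of logical CPUs.
# load_per_cpu is a hypothetical helper, not a standard tool.
load_per_cpu() {
    # $1 = load average, $2 = number of logical CPUs
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.1f\n", load / cpus }'
}

# Figures from the top output above: load 56.43 on 4 logical CPUs.
load_per_cpu 56.43 4    # prints 14.1

# On a live Linux box the inputs could be read like this (assumption):
#   load_per_cpu "$(cut -d ' ' -f 1 /proc/loadavg)" "$(grep -c '^processor' /proc/cpuinfo)"
```

    Anything well above 1 per CPU, like the 14 here, means many runnable tasks are waiting for a CPU.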
    Hmm this is no good..

    So what's next? How can I see what is causing this?

    The system is still nice and fast!


    I've restarted ColdFusion and all looks "ok" again for now.

    top - 23:59:52 up 11 days, 13:39,  1 user,  load average: 0.36, 14.00, 32.08
    Tasks: 263 total,   1 running, 261 sleeping,   0 stopped,   1 zombie
     Cpu0 : 10.0% us,  0.7% sy,  0.0% ni, 89.4% id,  0.0% wa,  0.0% hi,  0.0% si
     Cpu1 :  7.3% us,  0.3% sy,  0.0% ni, 92.4% id,  0.0% wa,  0.0% hi,  0.0% si
     Cpu2 :  6.6% us,  2.3% sy,  0.0% ni, 90.8% id,  0.0% wa,  0.0% hi,  0.3% si
     Cpu3 :  6.3% us,  4.3% sy,  0.0% ni, 87.7% id,  1.0% wa,  0.0% hi,  0.7% si
    Mem:   3632568k total,  3537420k used,    95148k free,    40672k buffers
    Swap:  2714944k total,    21724k used,  2693220k free,  2629740k cached
    So it seems ColdFusion is eating up your resources...
    You can check with the top command which processes are using the most resources.
    Yes it looks like ColdFusion is the problem,

    Some days ago I updated to 7.02. The ColdFusion server is really much faster now (even with the top results shown above), but for some reason all the results are way off the scale.
    I'm not sure what to do now... Go back to the old 7.01 version (slow server), or keep it as it is right now with the high results.
    It does not look like it's affecting the other applications on the server at all.

    Hmmm what extra command do I need to give to see this?
    Usually top is the command to use: it shows the process list, and you can see which processes are the most CPU (system / user) or memory intensive.
    You can sort the list by whichever resource you choose.
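    Inside top (the procps version), pressing Shift+P sorts by CPU and Shift+M by memory. For a one-off listing you can also sort the output of ps aux yourself. A minimal sketch, assuming the usual ps aux column layout (column 3 = %CPU, column 4 = %MEM); sort_ps_by is a hypothetical helper name:

```shell
#!/bin/sh
# sort_ps_by COLUMN [COUNT]: read "ps aux" output on stdin, keep the header
# line, and print the COUNT rows with the largest value in COLUMN.
sort_ps_by() {
    col="$1"
    count="${2:-10}"
    {
        # Pass the header through unchanged, then sort the remaining rows
        # numerically, in descending order, on the chosen column.
        IFS= read -r header
        printf '%s\n' "$header"
        sort -k "${col},${col}nr" | head -n "$count"
    }
}

# Usage on a live system:
#   ps aux | sort_ps_by 3 5    # five biggest CPU users
#   ps aux | sort_ps_by 4 5    # five biggest memory users
```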

    To simply see which processes are running, just run:
    ps aux or
    ps saxe (long listing here)

    Identify the process name for ColdFusion, and if your machine has mod_coldfusion running in Apache, check whether httpd is the cause of the load ...
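    To compare how much CPU the ColdFusion processes and httpd take in total, you can add up the %CPU column of ps aux per command. A small sketch, assuming the standard ps aux layout where column 3 is %CPU and column 11 is the first word of the command line; cpu_by_command is a hypothetical helper, and the exact process name ColdFusion shows up under depends on your install:

```shell
#!/bin/sh
# cpu_by_command: read "ps aux" output on stdin and print the summed %CPU
# per command (first word of the command line), largest total first.
cpu_by_command() {
    awk 'NR > 1 { cpu[$11] += $3 }
         END { for (c in cpu) printf "%.1f %s\n", cpu[c], c }' | sort -rn
}

# Usage on a live system:
#   ps aux | cpu_by_command | head -n 5
```

    Keep in mind that ps reports %CPU averaged over each process's lifetime, so top is the better tool for seeing what is happening right now.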

    But top should show the list of running processes, ordered by the resource you pick.

    You have a quad CPU machine, so the load should really be almost zero when no thread is running.

    Thanks for the info keybd_user,

    I've downgraded the CFMX7 server to the old version again, and now I'm adding the 'updates' one by one. As soon as CFMX7 causes trouble again, I'll give your commands a go!

    I loved the old ColdFusion 5 (not Java based). That version ran perfectly on my Debian server!.. Updates are not always good :/

    re: quad CPU machine.

    Yes I know, and I'm really starting to think that this is the problem! Next time I'm in the datacenter I might disable hyperthreading and see if it's any better!
    Hmm back to slow coldfusion pages again :(

    I'm not sure if it's coldfusion (I'm now on the old 7,0,1,116466 version), or if it's apache2 that is causing the slow pages.
    When I restart apache2 (version 2.0.54), the coldfusion pages are nice and fast again.. It's the same thing when I restart coldfusion.. Nice and fast (the way it should be)

    I'm (as far as I know) not using mod_coldfusion.. Could this be the problem?
    I've never used CF (too many scripting languages in my life already :) so my help will be limited, but it really seems like a problem with the incremental update ... or with the configuration for that upgrade.

    Your server rocks! (I also have a Dual Opteron machine, --> Dual Dual Cores rock :)
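    I also noticed that your free memory is low ... only 95MB, although about 2.6GB of the used memory is page cache (the cached figure in top), which the kernel can reclaim.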
    If your server is live, it could also be that the machine is being hit by some sort of attack ... check the CF logs ...

    No.. No attack or something like that on/to my server. The Apache on port 81 is still nice and fast when the Apache on 80 is not. Also the logfiles (snort) are nice and clean.

    The server does get a lot of users per day (about 50.000), and makes heavy use of ImageMagick.

    re: my server..

    I'm also loving the dual server. It's a Dell Poweredge dual Xeon 2.8 with at the moment 4 GB of mem. in it. I've also added a 3ware RAID1 card with 2 SATA 250GB HDD's.. I guess the HDD's with the 3ware card could also be the bottleneck!

    The server is using the 2.6.8-3-686-smp kernel (not 64bit as Coldfusion will not work on that)

    I've had just about every OS on the server, and so far Debian seems to be the best for it, but I might move back to Fedora (Redhat) again, as this is officially supported by ColdFusion.

    Why is Coldfusion so important to me? I'm loving the language for it, and it's so nice to use with database stuff!
    I have a dual-CPU, dual-core Opteron 265 machine (4GB DDR ECC 400MHz, 2x 500GB SATA II HDDs with room for 2 more), and best of all ---> it all fits in a 1U rack mount.
    I also use a 4-channel 3ware 9550, and they cannot be the bottleneck.
    A SCSI 10.000 RPM disk can give you about 40MB/s of sustained throughput at best.
    A SCSI 15.000 RPM disk can give you a top throughput of, say, 45MB/s.
    Of course, for "granular" small accesses the SCSI controller is far superior to SATA.
    It also depends on the HDD make and model. I once placed a Maxtor in an array of IBM SCSI disks with exactly the same parameters on a RAID 5 controller.
    The green LED on the Maxtor was always on! While the others made simple "blips", the Maxtor's green access LED was almost always lit ... Why? A much slower access time, and therefore a constant lag as the access times accumulated.
    Both the differential and the integral response of an HDD are important.

    But on an internet server SATA is more than capable of handling the requests (a single 100Mbps ethernet connection and slow ADSL accesses to the server), in a non-intranet environment.
    Especially if the RAID helps a little by reading from multiple disks (your setup has only two HDDs) ...
    Of course, in a distribution of timing events every single delay counts.
    But unless your application has huge database access and manipulates an enormous amount of data on disk, things should go pretty well.

    Let's imagine a simple, constant distribution of events, say your users access the server only during 8 hours:
    50.000 users / 8 hours = 6250 / hour = 104.17 / minute = 1.74 users / s, or roughly 2 users / s
    Usual web pages take only 150KB ...
    Data transfer from your disk is at least 35MB/s, which means that with 2 users / s each user would have to pull about 17.5MB of data to saturate it.
    Do you think this is the case?
    There is a huge margin here ...
    And web users' bandwidth cannot sustain such speeds anyway.
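    The same back-of-the-envelope estimate as a small script, in case you want to plug in your own numbers. The inputs (50.000 users in 8 hours, 150KB pages, 35MB/s disk throughput) are the figures from this post; the helper names are made up for the example:

```shell
#!/bin/sh
# requests_per_second USERS HOURS: average rate if visits are spread evenly.
requests_per_second() {
    awk -v users="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", users / (hours * 3600) }'
}

# demand_mb_per_s RATE PAGE_KB: data demand in MB/s for RATE pages per second.
demand_mb_per_s() {
    awk -v rate="$1" -v kb="$2" 'BEGIN { printf "%.2f\n", rate * kb / 1024 }'
}

# saturating_mb_per_request DISK_MB_S RATE: size each request would need
# to have in order to use up the whole disk throughput.
saturating_mb_per_request() {
    awk -v disk="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", disk / rate }'
}

requests_per_second 50000 8        # prints 1.74
demand_mb_per_s 2 150              # prints 0.29
saturating_mb_per_request 35 2     # prints 17.5
```

    So at 2 pages per second the disk sees roughly 0.3MB/s of demand against 35MB/s available, and each request would need to move well over 15MB to saturate it.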

    Also, if you use kernel 2.6.8 you should have installed the 3ware drivers provided by AMCC on the 3ware site, as they were only included in the mainline kernel as of version 2.6.14.
    Maybe a kernel update would be a good idea ... you get the new drivers, but if you do this do not forget to flash the latest firmware to the controller.

    Do not get the wrong idea, I also like CF.
    There are many good sites made with CF. They seem to be fast and handle all those goodies from Macromedia (Flash and all) very well.
    The new (Java-based) version does have the advantage of letting you use a database connection pool. (Very important to avoid bottlenecks.)
    (Java Rules!)

    Hello Pedor,

    Thank you for the info.. You like your numbers :)

    Nice setup.. I'm working on a new 1U server at the moment.. This is going to be my nr. 4 now :)
    Yes I have.. I even have a nice "web" interface that I can log in to, and do all kinds of nice things..
    As far as I can see there is no newer "official" kernel for Debian!
    apt-cache search kernel-image | grep smp is showing the 2.6.8-3-686-smp as the highest version... Or.. am I missing something here?

    Thanks again for the info!
    You could use Debian's testing and/or unstable branches in your sources.list. They contain newer kernels. I recommend to use apt-pinning for this so that none of your other packages will be taken from testing/unstable:
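    A minimal sketch of what such a pinning setup can look like, assuming stable stays your default release. The priorities 700/650/600 are conventional example values rather than required ones, and write_pin_prefs with its scratch path is a hypothetical helper that writes the file somewhere harmless so you can review it before copying it to /etc/apt/preferences:

```shell
#!/bin/sh
# write_pin_prefs FILE: write example apt-pinning preferences to FILE.
# Stable gets the highest priority, so packages come from stable unless
# another release is asked for explicitly.
write_pin_prefs() {
    cat > "$1" <<'EOF'
Package: *
Pin: release a=stable
Pin-Priority: 700

Package: *
Pin: release a=testing
Pin-Priority: 650

Package: *
Pin: release a=unstable
Pin-Priority: 600
EOF
}

# Write to a scratch file for review (path is an assumption for this example).
write_pin_prefs "${TMPDIR:-/tmp}/apt-preferences.example"
```

    With testing (and/or unstable) also listed in sources.list, run apt-get update; a plain apt-get install should keep using stable, and apt-get install -t testing <package> pulls just that package (for example a newer kernel image) from testing.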

