Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1017776.1
Update Date:2011-05-31
Keywords:

Solution Type  Troubleshooting Sure

Solution  1017776.1 :   Handling Microsoft Windows Hangs on x64 Sun[TM] Microsystems Systems


Related Items
  • Sun Blade X6220 Server Module
  • Sun Blade X8420 Server Module
  • Sun Fire X4200 M2 Server
  • Sun Java Workstation W2100z
  • Sun Ultra 20 Workstation
  • Sun Blade X6450 Server Module
  • Sun Fire X4440 Server
  • Sun Fire V20z Compute Grid Rack System
  • Sun Ultra 20 M2 Workstation
  • Sun Fire X2200 M2 Server
  • Sun Blade X8450 Server Module
  • Sun Fire X4600 Server
  • Sun Blade X8440 Server Module
  • Sun Fire X4100 Server
  • Sun Fire X4500 Server
  • Sun Blade X6250 Server Module
  • Sun Fire V20z Server
  • Sun Java Workstation W1100z
  • Sun Ultra 40 Workstation
  • Sun Blade X6420 Server Module
  • Sun Fire V40z Server
  • Sun Fire X4100 M2 Server
  • Sun Fire X4540 Server
  • Sun Ultra 40 M2 Workstation
  • Sun Fire X2100 M2 Server
  • Sun Fire X2100 Server
  • Sun Fire X4200 Server
  • Sun Fire X4600 M2 Server
Related Categories
  • GCS>Sun Microsystems>Servers>x64 Servers

PreviouslyPublishedAs
228996






Summary

This document addresses system stability failures that result in a hang state.

In this context, a hang means that the (hardware) system continues to run but does not respond to any user action, either locally or remotely.

The screen may be completely frozen, or the mouse cursor may be the only component that still responds to user interaction.

In some cases the mouse pointer is displayed, but you cannot move it.

In other cases the computer responds neither to mouse movements nor to keyboard input.

The system does not offer a prompt or display any message.

NOTE: This document does not cover unwanted reboots or a BSoD (Blue Screen of Death). Its main aim is to show how to collect system information when a hang occurs.

Symptoms

- System Hangs
- System does not respond
- Frozen screen
- Can't move any windows



Steps to Follow

Step 1. Clarify the kind of hang you are facing

First, verify which kind of hang you are facing, and whether your system has actually hung.

Refer to <Document: 1012991.1> "How to check if your x64 platform "system hang" actually is a system hang"

  • Do you have an unrecoverable system deadlock?

A hang state is considered unrecoverable when all of these conditions are met:

  • Blank screen or completely frozen screen

  • Mouse cursor does not move

  • The Caps Lock key does not respond (its LED does not toggle on or off)

  • Keyboard input has no effect

  • No sound comes from the system board bell

  • The operating system does not answer any ping from the network

  • The operating system does not answer any remote query to its web and network services

  • For Windows 2003, the out-of-band management Special Administration Console (SAC) is also unreachable, both locally and remotely.
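
The two network conditions in the list above can be checked from another machine. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the ping flags shown are those of the Windows and Linux (iputils) ping tools, and which TCP port to probe depends on the services actually configured on the server.

```python
import platform
import socket
import subprocess


def host_answers_ping(host, timeout_s=2):
    """Return True if the system ping tool reports a reply to one echo request.

    Caveat: some ping implementations may exit with status 0 even when the
    reply is an "unreachable" message, so treat a True result as a hint only.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        # Linux iputils syntax (assumption); other systems may differ.
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    try:
        done = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL,
                              timeout=timeout_s + 3)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return done.returncode == 0


def tcp_service_answers(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

For example, tcp_service_answers("192.0.2.10", 3389) would probe the Remote Desktop port of a server at a placeholder address. A firewall can make both checks fail on a healthy system, so a negative result alone does not prove a hang.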

  • Do you have a recoverable system state?

Explorer (explorer.exe) is responsible for the main Windows GUI: it provides the desktop shell.

If Explorer is "not responding", you may be unable to move any windows, but the mouse cursor may still respond.

The Caps Lock key should work properly (its LED toggles on and off).

The CTRL+ALT+DEL hotkey should also launch the Task Manager successfully.

The operating system should answer pings (unless they are blocked by a firewall).

The web and network services should answer remote queries as expected (if any such services are configured).

For Windows 2003 versions, the out-of-band management Special Administration Console (SAC) should also be reachable locally or remotely.
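
The two checklists in this step can be condensed into a small decision rule. The Python sketch below is a hypothetical aid, not part of the original procedure: the class, field, and function names are invented for illustration. It reports "unrecoverable" only when the screen is frozen or blank and no sign of life remains.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """What the operator observed on the suspect system (hypothetical model)."""
    screen_frozen_or_blank: bool
    mouse_cursor_moves: bool
    caps_lock_led_toggles: bool
    keyboard_input_works: bool
    system_bell_sounds: bool
    answers_ping: bool
    answers_remote_services: bool
    sac_reachable: bool = False  # only meaningful on Windows 2003


def classify_hang(obs):
    """Return "unrecoverable" only if every condition of the checklist holds."""
    signs_of_life = [
        obs.mouse_cursor_moves,
        obs.caps_lock_led_toggles,
        obs.keyboard_input_works,
        obs.system_bell_sounds,
        obs.answers_ping,
        obs.answers_remote_services,
        obs.sac_reachable,
    ]
    if obs.screen_frozen_or_blank and not any(signs_of_life):
        return "unrecoverable"
    # At least one sign of life: Step 3 may apply.
    return "possibly recoverable"
```

A single positive observation, such as a Caps Lock LED that still toggles, is enough to move the case from Step 2 to Step 3.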

Step 2. What to do if there is an "Unrecoverable system deadlock"

The only option is to reboot the system manually into Windows Safe Mode, in order to collect system information about the last issue and to set some parameters for troubleshooting.

To restart your system, refer to <Document: 1008398.1> "How to Reset an x64 System"

Once the system restarts, force it to boot in Safe Mode by pressing the F8 key after the BIOS initialization has completed.

You can then follow the instructions in Step 4, "Collect system information and configure it for troubleshooting".

If you get a "Blue Screen of Death" (BSoD) (crash/panic) while rebooting, refer to

<Document: 1017889.1> "Analyzing System panics on x64 platforms running Microsoft Windows"

Step 3. What to do when you have a "Recoverable system state"

The first action to try is the CTRL+ALT+DEL keyboard sequence, which gives access to the Graphical Identification and Authentication (GINA) dialog, where you can click the "Task Manager" button.

Here are more details on the steps to follow:

1. Be sure to verify your system state by following the instructions in

<Document: 1012991.1> "How to check if your x64 platform "system hang" actually is a system hang"

2. For any Windows version use CTRL+ALT+DEL to run the "Task Manager".

3. If the "Task Manager" appears, go to the "Applications" tab and check the list for programs marked "not responding".

4. Try "End Task" on each "not responding" program.

5. You can also go to the "Processes" tab and use "End Process" to kill any "not responding" process.

The most common action is to kill every Explorer.exe process and then relaunch at least one Explorer task: on the "Applications" tab, click the "New Task" button, type "explorer", and press Enter.
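
When a command prompt or remote shell is still usable, the same Explorer restart can be attempted from the command line. The Python sketch below is an assumption-laden illustration rather than part of the original procedure: the function name is hypothetical, and it relies on the tasklist and taskkill tools, which ship with Windows XP Professional and Windows Server 2003 but may be missing on other editions. By default it only returns the planned commands.

```python
import subprocess

# List processes that Windows reports as not responding.
LIST_HUNG = ["tasklist", "/FI", "STATUS eq NOT RESPONDING"]
# Forcefully end every explorer.exe process.
KILL_EXPLORER = ["taskkill", "/F", "/IM", "explorer.exe"]
# Start a fresh Explorer shell.
START_EXPLORER = ["explorer.exe"]


def restart_explorer(dry_run=True):
    """Return the planned commands; run them only when dry_run is False.

    Running the commands is meaningful on Windows only.
    """
    plan = [LIST_HUNG, KILL_EXPLORER, START_EXPLORER]
    if not dry_run:
        subprocess.run(LIST_HUNG, check=False)
        subprocess.run(KILL_EXPLORER, check=False)
        # Do not wait for Explorer: it keeps running as the desktop shell.
        subprocess.Popen(START_EXPLORER)
    return plan
```

Calling restart_explorer() prints nothing and changes nothing; restart_explorer(dry_run=False) performs the kill and relaunch.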

For Microsoft Windows 2003 only, you may also be able to use the out-of-band Special Administration Console (SAC).

See the "Microsoft Windows 2003 specific references" section at the end of this document for further information.

If the issue happens again, you should:

  • Reboot into Safe Mode (and follow the instructions in Step 4 of this document).

  • Contact your OS support service provider with the collected data for analysis.

Step 4. Collect system information and configure it for troubleshooting

To understand what is going on, or if the system remains unstable, reboot the operating system into Safe Mode using one of the methods described below.

  • Try to reboot the system:

    If possible, use the Start menu in the bottom left corner and select the Shut Down option (for Windows 2003 you also have to select the Restart option), then click the OK button.

    If the main Windows GUI is unavailable or not responding, you can try to open the Windows Task Manager by pressing CTRL+ALT+DEL.

Go to the Users tab, select your logged-on user, and click the "Logoff" button. When you reach the GINA logon screen, click the "Shutdown" button to restart your system.

Alternatively, go to the Applications tab of the Task Manager, click the "New Task" button, and enter the following shell command:

    shutdown /f /r
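
If the restart has to be scripted, for example from an administration host, the command line can be assembled programmatically. The Python sketch below is an illustration only: the function name and the server name in the usage note are hypothetical. The /r (restart), /f (force applications to close), /t (delay in seconds) and /m (remote computer) switches belong to the Windows shutdown tool.

```python
def build_restart_command(delay_s=0, remote_host=None):
    """Build a forced-restart command line for the Windows shutdown tool."""
    cmd = ["shutdown", "/r", "/f", "/t", str(delay_s)]
    if remote_host:
        # /m \\computer targets another machine on the network.
        cmd += ["/m", "\\\\" + remote_host]
    return cmd


def as_text(cmd):
    """Join the argument list into the text typed at a prompt."""
    return " ".join(cmd)
```

With no arguments the result corresponds to "shutdown /r /f /t 0"; build_restart_command(30, "server01") adds a 30-second delay and targets a machine with the placeholder name server01. A fully hung system will not process either form.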

  • Force a hardware reboot:

If none of these actions works, refer to

<Document: 1008398.1> "How to Reset an x64 System"

  • Getting to Safe Mode:

While rebooting, wait until the BIOS screen has passed and the black Microsoft Windows screen showing "Starting Windows" appears, then quickly press the F8 key to stop auto-booting and enter the advanced boot options menu.

Then select the "Debugging Mode" or "Safe Mode" option and press Enter to boot it.

If you cannot get to Safe Mode, consult the Microsoft Knowledge Base article "Safe-Mode Boot Switches for Windows Boot.ini File", article ID: 239780

Once you are logged on to Windows, collect the information as outlined in <Document: 1010936.1> "Microsoft Windows and Linux operating systems: How to obtain troubleshooting information".

If you get a BSoD (crash/panic) while booting, refer to

<Document: 1017889.1> "Analyzing System panics on x64 platforms running Microsoft Windows"

The methods described next require setup in advance, before a basic reboot:

  • Manually crash the hung system in order to get a dump to analyze offline.

  • Get a complete memory dump for further analysis.

  • Boot the system in debugging mode and, when it hangs, break in with the kernel debugger.

Here are instructions for several ways to set up your Windows installation to collect system information:

  • Some servers come with an NMI button that you need to press. To enable it:

In the Registry, under the following key:

HKLM\System\CurrentControlSet\Control\CrashControl

set the value NMICrashDump (DWORD) to 1.

Then reboot Windows normally. When the hang occurs, press the NMI button to generate a crash and its dump file.
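
One way to apply the NMI setting consistently on several servers is to generate a registration (.reg) file and import it with regedit. The Python sketch below is a minimal illustration: the function name is hypothetical, and it assumes that NMICrashDump is a DWORD value under the CrashControl key.

```python
def nmi_crashdump_reg(enabled=True):
    """Return the text of a .reg file that sets NMICrashDump to 1 or 0."""
    value = 1 if enabled else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl]",
        '"NMICrashDump"=dword:%08x' % value,
        "",
    ]
    # .reg files conventionally use Windows (CRLF) line endings.
    return "\r\n".join(lines)
```

The returned text can be saved to a file (for example under the placeholder name nmi.reg) and reviewed before it is imported. A reboot is still required for the setting to take effect.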

  • A specific technique for Windows 2003 versions:

If the SAC is available, go to it whenever a system hang occurs and use the crashdump command.

  • Using the keyboard (PS/2 and USB):

You can set up your Windows installation so that a memory dump can be generated from the keyboard while the system is hung.

Read this Microsoft Knowledge Base article:

"Windows feature lets you generate a memory dump file by using the keyboard", article ID: 244139 http://support.microsoft.com/default.aspx?scid=kb;EN-US;244139

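The keyboard method relies on a registry value that must be set before the hang occurs. The Python sketch below shows one possible way to set it. It is an illustration under assumptions: the function names are hypothetical; the value name CrashOnCtrlScroll and the driver keys (i8042prt for PS/2, kbdhid for USB) are those commonly associated with article 244139 and should be verified against the article; USB support depends on the Windows version and service pack.

```python
try:
    import winreg  # standard library, available on Windows only
except ImportError:
    winreg = None

# Driver parameter keys under HKEY_LOCAL_MACHINE (assumed, see lead-in).
KEYBOARD_DRIVER_KEYS = {
    "PS/2": r"SYSTEM\CurrentControlSet\Services\i8042prt\Parameters",
    "USB": r"SYSTEM\CurrentControlSet\Services\kbdhid\Parameters",
}
VALUE_NAME = "CrashOnCtrlScroll"


def planned_changes():
    """Return (keyboard type, key path, value name, DWORD data) tuples."""
    return [(kind, path, VALUE_NAME, 1)
            for kind, path in KEYBOARD_DRIVER_KEYS.items()]


def apply_changes():
    """Write the values; needs administrator rights and a later reboot."""
    if winreg is None:
        raise RuntimeError("winreg is only available on Windows")
    for _kind, path, name, data in planned_changes():
        with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, path, 0,
                                winreg.KEY_SET_VALUE) as key:
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, data)
```

After the reboot, the crash is triggered by holding the right CTRL key and pressing SCROLL LOCK twice.
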
  • Install WinDbg on a laptop or any computer close to the faulty system. After the hang, connect the host debugger system to the target with a serial cable and run WinDbg (the target must have been booted with kernel debugging enabled).

Then press Ctrl+Break in WinDbg (or click Debug->Break); in the console debugger KD the equivalent is Ctrl+C. This breaks into the target system.

Once you have broken into the target system, run the .crash command to force the crash.

For all Windows versions, see the Microsoft Knowledge Base article:

ID 315263 "How to read the small memory dump files that Windows creates for debugging" http://support.microsoft.com/kb/315263

The Microsoft WinDbg software and its documentation are available at http://www.microsoft.com/whdc/devtools/debugging/default.mspx

More help on WinDbg is available from the Microsoft support website and from the help pages built into the software.
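
For the serial-cable method, the target and the host must agree on the port speed. The Python sketch below simply assembles the two matching settings. It is an illustration under assumptions: the function name and the default COM ports and baud rate are placeholders; the /debug, /debugport and /baudrate switches apply to boot.ini on Windows XP and Windows Server 2003; and "windbg -k com:port=...,baud=..." is the host-side form for a serial kernel debugging session.

```python
def serial_debug_settings(target_port="COM1", host_port="COM1", baud=115200):
    """Return matching boot.ini switches (target) and WinDbg command (host)."""
    return {
        # Appended to the operating system line in boot.ini on the target.
        "boot_ini_switches": "/debug /debugport=%s /baudrate=%d"
                             % (target_port, baud),
        # Run on the host once the serial cable is connected.
        "host_command": "windbg -k com:port=%s,baud=%d" % (host_port, baud),
    }
```

The target must be rebooted with these switches before the hang occurs; they cannot be added to a system that is already hung.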

  • See Microsoft Knowledge Base article ID 303021

"How to Generate a Memory Dump File When a Server Stops Responding (Hangs)" http://support.microsoft.com/kb/303021/en-us

At this point, if you followed each troubleshooting step above, you have collected the information needed for further troubleshooting. For additional support contact Sun Support.

Once you have generated the panic dump, refer to <Document: 1017889.1> "Analyzing System panics on x64 platforms running Microsoft Windows" for further actions.

At this point, if you have validated that each troubleshooting step above is true for your environment, and the issue still exists, further troubleshooting is required. For additional support contact your local Sun Support Service.

Some important links:

<Document: 1010936.1> "Microsoft Windows and LINUX operating systems: How to obtain troubleshooting information"

Windows feature lets you generate a memory dump file by using the keyboard available at http://support.microsoft.com/kb/244139

How to use ADPlus to troubleshoot "hangs" and "crashes" available at http://support.microsoft.com/kb/286350/en-us

Microsoft Windows 2003 specific references

Special Administration Console (SAC) and SAC commands

http://technet2.microsoft.com/windowsserver/en/library/2acd37af-5439-4789-924c-14e1040cf5a01033.mspx?mfr=true

Emergency Management Services in Windows Server 2003

http://technet2.microsoft.com/windowsserver/en/library/7ea7180d-6ea3-48dc-8b64-8b53884b11d01033.mspx?mfr=true

Using Emergency Management Services

http://technet2.microsoft.com/windowsserver/en/library/5446416a-1d6e-442f-979f-1ce0981099f81033.mspx?mfr=true

Management Software for Out-of-Band Connections

http://technet2.microsoft.com/windowsserver/en/library/7ea7180d-6ea3-48dc-8b64-8b53884b11d01033.mspx?mfr=true

Special Administration Console (SAC) and SAC commands

http://technet2.microsoft.com/windowsserver/en/library/9de126ac-db8b-4d20-8aef-3b5b6110ba9f1033.mspx?mfr=true

Microsoft Sysinternals Tools

http://www.microsoft.com/technet/sysinternals/default.mspx





Product
Sun Fire X4600 Server
Sun Fire X4600 M2 Server
Sun Fire X4540 Server
Sun Fire X4500 Server
Sun Fire X4440 Server
Sun Fire X4200 M2 Server
Sun Fire X4200 Server
Sun Fire X4100 Server
Sun Fire X4100 M2 Server

Internal Comments
This document contains normalized content and is managed by the Domain Lead(s) of the
respective domains. To notify content owners of a knowledge gap
contained in this document, and/or prior to updating this document,
please contact the domain engineers that are managing this document
via the "Document Feedback" alias(es) listed below: Normalization team alias: tsc-emea-x64@sun.com
Domain Lead: anthony.mcnamara@sun.com

Additional information

Troubleshooting Server Hangs:
http://blogs.technet.com/askperf/archive/2007/09/25/troubleshooting-server-hangs-part-one.aspx
Basic Troubleshooting Toolkit:
http://blogs.technet.com/askperf/archive/2007/05/25/basic-troubleshooting-toolkit.aspx
Preparing to Troubleshoot:
http://blogs.technet.com/askperf/archive/2007/01/10/preparing-to-troubleshoot-part-one.aspx
"GINA" is the WinLogon GUI; the name stands for "graphical identification and authentication".
See http://en.wikipedia.org/wiki/Graphical_identification_and_authentication
x64, normalized, hang, Microsoft, Windows, 2003, 32 bits, 64 bits, W2k3, XP, OS, W32, X86, CRASH, HUNG, Freeze
Previously Published As
91512

Change History
Date: 2007-12-24
User Name: 31620
Action: Approved

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.