Tuesday, June 21, 2011

Apache troubleshooting steps for Siteminder

Problem Definition 1: Problem starting the Web Agent

Please do the following:


1) Remove all stale semaphores:

Run ipcs to list the semaphores and shared memory segments, then run ipcrm -s semaphoreid to clean them up.

2) Shut down the LLAWP, using the command with this syntax:
LLAWP path_to_WebAgent.conf -web_server_type -shutdown

For example:
LLAWP /usr/apache/conf/WebAgent.conf -APACHE20 -shutdown


3) Then restart the web server.
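
Step 1 can be partly scripted. The sketch below is an illustration, not a vendor tool: it parses the text printed by ipcs -s and builds the matching ipcrm -s commands for one owner. It assumes the Linux column layout (key, semid, owner, perms, nsems); Solaris prints different columns, so the parsing would need adjusting there. The function name, the sample output and the owner "nobody" are hypothetical.

```python
def stale_semaphore_commands(ipcs_output, owner):
    """Return one 'ipcrm -s <semid>' command per semaphore array owned by `owner`."""
    commands = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows have five columns and start with a hex key;
        # banner and header lines do not, so they are skipped.
        if len(fields) == 5 and fields[0].startswith("0x") and fields[2] == owner:
            commands.append(f"ipcrm -s {fields[1]}")
    return commands


# Hypothetical 'ipcs -s' output in the Linux layout, for illustration only.
SAMPLE_IPCS = """\
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 32768      nobody     600        1
0x00000000 65537      root       600        1
0x4d00a1b2 98306      nobody     600        3
"""

for command in stale_semaphore_commands(SAMPLE_IPCS, "nobody"):
    print(command)
```

The commands are only printed, so they can be reviewed before anything is removed. Run them only while the web server and the LLAWP are stopped.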

Unable to process SMIDENTITY cookie

Solution:

SMIDENTITY is a persistent cookie used for Affiliate agents and Anonymous authentication schemes. If you are using either of these, then this cookie is required.
If you are not using either of these, then there is no need for this cookie. The Siteminder policy server generates this cookie if you have "Enable User Tracking" turned on under Tools-->Global Settings in the Siteminder UI. Because it is a persistent cookie, it stays in the browser of anyone who has ever accessed a webagent that points to a policy server with this setting enabled. Deleting the cookie manually from the browser's persistent cookie store makes the error stop showing up. Unfortunately, the error will occur for every user who has such a cookie stuck in their browser.
If you are not using Affiliates or anonymous auth schemes, then you should disable this setting on all policy servers in your environment.
Note that this warning will never cause access failures for users, except when they are accessing an anonymous realm.

What information is present in the SMSESSIONSPEC in the SMSESSION cookie?

Solution:

The SMSESSIONSPEC is an encrypted ticket that contains information related to the user session.

If the session is validated by the Policy Server, the session spec changes and the SMSESSION cookie changes with it. If only the "SessionGracePeriod" expires, the session cookie is recomputed with the newly received key, but the session spec remains the same.

Only the Policy Server knows how to decode the information in the SMSESSIONSPEC.

The SMSESSIONSPEC contains the following data:

  • SessionVersion
  • SessionStartTime
  • SessionLastTime
  • SessionMaxTimeout
  • SessionIdleTimeout
  • SessionLevel
  • SessionId
  • SessionIp
  • SessionDn
  • SessionDirOid
  • SessionDirName
  • SessionUnivId
  • SessionType
  • SessionAnonymous
  • SessionImpersonatorName
  • SessionLoginName
  • SessionPersistent
  • SessionDrift
  • SessionImpersonatorDirName
  • SessionAuthContext

Troubleshooting approaches for LLAWP shutdown issues occurring on Solaris

Description:

This article provides initial troubleshooting approaches for LLAWP shutdown issues, especially on Solaris.

Solution:

1. Ensure proper semaphore and shared memory tuning. Instructions can be found, e.g., at https://support.ca.com/irj/portal/anonymous/redirArticles?reqPage=search&searchID=TEC485876 and in CA Site.

2. Issues have been seen when the configured ServerPath values were too similar (differing in only one character), so that identical IPC keys were generated.
Please double-check this and test by changing one of the webagents' ServerPath values to, e.g., the log directory.

3. The most common cause of this LLAWP shutdown error is that the LLAWP does not shut down because other child processes are still running.
If the web server is restarted, rather than stopped and then started with enough time in between for the LLAWP to shut down as well, there may even be duplicate LLAWP processes.

Please also check any startup and shutdown scripts to make sure no kill command is used for the web server, since that would leave the LLAWP running.

4. Double-check that the necessary Solaris 10 patches are installed:

- IBM HTTP Server patch IBM PQ 71734 for IBM HTTP Server 1.3.19.4 and 1.3.19.5

- 119963-08 (need this patch to avoid a runtime issue with Web Agent installation binaries)

You can check on patch versions by logging in as root and executing the following command:
'showrev -p | grep patchid'

5. If the LLAWP process does not shut down properly when shutting down the web server, shut down the LLAWP from the command line.
This shuts down the running worker process associated with a WebAgent.conf file.

To shut down the LLAWP, use the command with this syntax:
LLAWP path_to_WebAgent.conf -web_server_type -shutdown

For example:
LLAWP /usr/apache/conf/WebAgent.conf -APACHE20 -shutdown

Note: Configuration file names and version strings that contain spaces should be surrounded by quotes, such as "value with spaces." The LLAWP process will take a few seconds to shut down.

Use the command line to shut the LLAWP down instead of the kill -9 command, so that the process cleans up shared system resources used by the Web Agent.

6. Double-check the permissions of the user account used to launch the web server, to ensure it can read and write the webagent-related files without running into permission problems.

7. Upgrade the webagent to a more current release to benefit from the latest enhancements and fixes.
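
Steps 3 and 5 above lend themselves to a small helper. The following is a minimal sketch under stated assumptions, not a CA-supplied utility. The first function builds the documented LLAWP shutdown command and quotes a configuration path that contains spaces. The second looks for duplicate LLAWP processes in a process listing. It assumes the listing looks like 'ps -eo pid,args' output, that the WebAgent.conf path is the first argument of each LLAWP process, and that paths contain no spaces. The function names and the sample listing are hypothetical.

```python
import os
import shlex
from collections import defaultdict


def llawp_shutdown_command(conf_path, server_type):
    """Build 'LLAWP path_to_WebAgent.conf -web_server_type -shutdown'."""
    args = ["LLAWP", conf_path, f"-{server_type}", "-shutdown"]
    # shlex.join quotes any argument that contains spaces.
    return shlex.join(args)


def duplicate_llawp(ps_output):
    """Map each WebAgent.conf path to its LLAWP PIDs when more than one exists."""
    pids_by_conf = defaultdict(list)
    for line in ps_output.splitlines():
        fields = line.split()
        # Expect: PID, executable, first argument (assumed to be the conf path).
        if (len(fields) >= 3 and fields[0].isdigit()
                and os.path.basename(fields[1]) == "LLAWP"):
            pids_by_conf[fields[2]].append(int(fields[0]))
    return {conf: pids for conf, pids in pids_by_conf.items() if len(pids) > 1}


# Hypothetical process listing, for illustration only.
SAMPLE_PS = """\
  PID COMMAND
 4101 /usr/apache/bin/httpd -k start
 4102 LLAWP /usr/apache/conf/WebAgent.conf -APACHE20
 4188 LLAWP /usr/apache/conf/WebAgent.conf -APACHE20
"""

print(llawp_shutdown_command("/usr/apache/conf/WebAgent.conf", "APACHE20"))
print(duplicate_llawp(SAMPLE_PS))
```

If duplicates are reported, shut the LLAWP down with the generated command rather than with kill -9, for the reason given in step 5.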

Very frequent handshake errors

PROBLEM:

[24389/145951][Wed Dec 02 2009 14:31:49][CServer.cpp:1392][ERROR] Bad security handshake attempt. Handshake error: 3152

[24389/145951][Wed Dec 02 2009 14:31:49][CServer.cpp:1399][ERROR] Handshake error: Failed to receive client hello. Socket error 131
[24389/145951][Wed Dec 02 2009 14:31:49][CServer.cpp:1487][ERROR] Failed handshake with 10.10.28.33:56037
[24389/146023][Wed Dec 02 2009 14:31:49][CServer.cpp:1392][ERROR] Bad security handshake attempt. Handshake error: 3152
[24389/146023][Wed Dec 02 2009 14:31:49][CServer.cpp:1399][ERROR] Handshake error: Failed to receive client hello. Socket error 131
[24389/146023][Wed Dec 02 2009 14:31:49][CServer.cpp:1487][ERROR] Failed handshake with 10.10.28.33:56038
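
Before changing any settings, it can help to measure how often each client fails. The sketch below counts "Failed handshake with" entries per client IP address in log text shaped like the excerpt above. The function name is hypothetical, and the pattern only covers IPv4 addresses in the format shown.

```python
import re
from collections import Counter

# Matches e.g. "Failed handshake with 10.10.28.33:56037" and captures the IP.
FAILED_HANDSHAKE = re.compile(
    r"Failed handshake with (\d{1,3}(?:\.\d{1,3}){3}):\d+"
)


def failed_handshakes_by_client(log_text):
    """Count failed handshakes per client IP address."""
    return Counter(FAILED_HANDSHAKE.findall(log_text))


# Two lines taken from the log excerpt above.
SAMPLE_LOG = (
    "[24389/145951][Wed Dec 02 2009 14:31:49][CServer.cpp:1487][ERROR] "
    "Failed handshake with 10.10.28.33:56037\n"
    "[24389/146023][Wed Dec 02 2009 14:31:49][CServer.cpp:1487][ERROR] "
    "Failed handshake with 10.10.28.33:56038\n"
)

for client, count in failed_handshakes_by_client(SAMPLE_LOG).most_common():
    print(client, count)
```

A single address that dominates the counts points at one web server or Agent host; an even spread suggests a Policy Server or network-wide cause.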

Solution:

Here are some parameters you need to check in order to reduce or eliminate the error messages:

* If there is significant network latency or potential Policy Server overload, you may be hitting the Web Agent "RequestTimeout" limit. This is set in the HCO, in SmHost, or in both: the SmHost value applies while the webagent is starting up and connecting to the Policy Server listed in SmHost, and the HCO value applies once the webagent has the Policy Server details from SmHost and is connecting to the Policy Server listed in the HCO. The default is 60 seconds. If the Policy Server takes longer than this (latency included), the Agent resets the connection and tries again. However, this means that users would be waiting a full 60 seconds for a response.

* The Web Agent parameter "AgentWaitTime" (set in 'WebAgent.conf') may allow you to overcome network latency problems during Agent startup. A description of the usage of this parameter follows:

  AgentWaitTime: Specifies the number of seconds that the Web Agent waits for the Lowlevel Agent Worker process (LLAWP) to become available. When the interval expires, the Web Agent tries to connect to the Policy Server. Setting this parameter may help resolve agent start-up errors related to LLAWP connections. We recommend starting with the default value and then increasing the interval by five seconds at a time until the agent starts successfully. (Default: 5 seconds, Upper Limit: 45 seconds)

  'AgentWaitTime' is used whenever the Web Agent makes new connections to the Policy Server. If you have a rather high 'MaxSocketsPerPort' setting (e.g. 60 connections) in the HCO, the issue may occur frequently in your environment during runtime as well as at startup.

  Example: If you have primary and secondary policy servers, use a value between 30 and 40. A reasonable first try is 30: add "AgentWaitTime=30" to your 'WebAgent.conf' file and restart the web server. You should see the new value take effect at startup in the Web Agent error log, where all parameters are listed.

* Are you running Apache in Prefork or Worker mode (thread model)? You can tell by running "httpd -V" (capital 'V'). If in Prefork, each incoming request would require its own Apache process, and the Agent would need to make a set of connections for every process as well. Apache Prefork mode limits the Agent to one thread per process, and has serious implications for efficiency (increased Agent to Policy Server connections, etc.).
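
Two of the checks above can be sketched in a few lines. The first function adds or updates the AgentWaitTime line in the text of a WebAgent.conf file and rejects values above the 45-second upper limit quoted above. The second reads the MPM name from "httpd -V" output. This is a hedged illustration only: the function names and sample texts are hypothetical, the plain NAME=value form follows the "AgentWaitTime=30" example above, and the "Server MPM:" line is assumed to appear in the output of your Apache build.

```python
import re

AGENT_WAIT_LINE = re.compile(
    r"^[ \t]*AgentWaitTime[ \t]*=.*$", re.MULTILINE | re.IGNORECASE
)


def set_agent_wait_time(conf_text, seconds):
    """Return conf_text with AgentWaitTime set to `seconds` (at most 45)."""
    if not 0 < seconds <= 45:
        raise ValueError("AgentWaitTime must be between 1 and 45 seconds")
    new_line = f"AgentWaitTime={seconds}"
    if AGENT_WAIT_LINE.search(conf_text):
        # Replace the existing setting instead of adding a second one.
        return AGENT_WAIT_LINE.sub(new_line, conf_text, count=1)
    separator = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return conf_text + separator + new_line + "\n"


def apache_mpm(httpd_v_output):
    """Return the lower-cased MPM name from 'httpd -V' output, or None."""
    match = re.search(r"^Server MPM:\s*(\S+)", httpd_v_output, re.MULTILINE)
    return match.group(1).lower() if match else None


# Hypothetical file content and command output, for illustration only.
SAMPLE_CONF = 'HostConfigFile="/usr/apache/conf/SmHost.conf"\nEnableWebAgent="YES"\n'
SAMPLE_HTTPD_V = "Server version: Apache/2.2.15 (Unix)\nServer MPM:     Prefork\n"

print(set_agent_wait_time(SAMPLE_CONF, 30))
print(apache_mpm(SAMPLE_HTTPD_V))
```

The configuration helper works on text only, so the result can be compared with the original file before it is written back and the web server is restarted.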