Archive for the ‘Oracle posts’ Category

h1

10g Enterprise Manager – HTTP server not starting

May 4, 2010

Yesterday when I was logging into our OEM, the webpage was down. So I went into the server and check the the services.

oem-01:/opt/oracle/oem10g/oms10g/opmn/bin> opmnctl status

Processes in Instance: EnterpriseManager0.oem-01
——————-+——————–+———+———
ias-component      | process-type       |     pid | status
——————-+——————–+———+———
DSA                | DSA                |     N/A | Down
HTTP_Server        | HTTP_Server        |     N/A | Down
LogLoader          | logloaderd         |     N/A | Down
dcm-daemon         | dcm-daemon         |     N/A | Down
OC4J               | home               |   19765 | Alive
OC4J               | OC4J_EM            |   19764 | Alive
OC4J               | OC4J_EMPROV        |   19766 | Alive
WebCache           | WebCache           |   19767 | Alive
WebCache           | WebCacheAdmin      |   19768 | Alive

The output above shows that the HTTP server is down so I thought bringing it up would fix the issue. So I tried starting it using  opmnctl command (opmnctl startproc ias-component=HTTP_Server) but it did not start. So I thought, maybe bouncing all the components would help (opmnctl stopall then opmnctl startall). Unfortunately, it did not and it says it has issues bringing up the HTTP server but no error code! So I checked oem-01:/opt/oracle/oem10g/oms10g/opmn/logs/HTTP_Server~1 file but found no error and the last message was just:

/opt/oracle/oem10g/oms10g/Apache/Apache/bin/apachectl startssl: execing httpd

Digging through the log files under /opt/oracle/oem10g/oms10g/Apache/Apache/logs I found access_log file. Tried to open this file but I got an error saying “Tmp file too large” so I tailed it found a message something to the effect of “number too large when starting the HTTP server”. Hmmmm…this is when I check the filesystem if it is full, but no, its only 97% used! Then I checked size of the file, and voila, its 2GB! I immediately knew that this must be the issue, although at the back of my mind, 2gb file size limit shouldn’t be the problem, but I copied this file over to /tmp (as backup) and zeroed out the existing file (cp -p /dev/null access_log). I then started the HTTP server and then its went up with no issues! Problem solved!

h1

oracle.sysman.emSDK.emd.comm.CommException

January 21, 2010

These are the basic things that I did to troubleshoot the issue (oracle.sysman.emSDK.emd.comm.CommException: rac02.localhost:cn=rac02).

1. Remove the Agent that was automatically discovered by OMS.
a. remove the components (listener, asm, database) of the Agent per node
b. remove the agent from the Grid. you need to shutdown MA first on the node

2. Edit the targets.xml file for each node under $MA_HOME/rac/sysman/emd.
a. put in the correct/fully-qualified host name (i.e. instead of rac01, put rac01.localhost)
b. put in the username and password for dbsnmp id as shown below.

<Property NAME=”UserName” VALUE=”dbsnmp”/>

<Property NAME=”password” VALUE=”<clear_text_password>” ENCRYPTED=”FALSE”/>

3. Secure the agent by executing ‘emctl secure agent’ on each node.

4. Edit emd.properties ($MA_HOME/rac/sysman/config) to provide the fully-qualified host name for OMS location and Agent location (i.e. EMD_URL=https://rac03.localhost:3872/emd/main instead of EMD_URL=https://rac03:3872/emd/main)

5. Restart the MAs on each node and wait for OMS to discover them again. If all of the components were not discovered, add them using the Grid GUI interface.

h1

Perl DBI core dump…ora-12154

September 8, 2009

It’s been a while since I posted here but I think this post is worth your while.

PROBLEM:

Recently, we’ve encountered a problem in one of our test systems. Whenever a user uses perl to connect to the database, the script crashes and creates a core dump. However, when using SQL*Plus or tnsping, no error is encountered or core dump created. Our environment is running on Solaris 10 (x86-64) and Oracle 10g.

TROUBLESHOOTING:  

At first we looked at the environment settings and found that everything is set accordingly. Next thing I did is to run a truss command and as expected it gave me the file (missing sqlnet.ora) it is looking for before it crashes. So I created the sqlnet.ora under $ORACLE_HOME/network/admin directory, ran the perl script and voila (!), no more core dump.

However, I encountered another problem. The script does not connect to the database and the error is a misleading error (ORA-12154) as tnsping works, SQL*Plus connection works with no issue!

lab-ora:/tmp/conn.pl

DBI connect(‘lab_ora’,’devusr’,…) failed: ORA-12154: TNS:could not resolve the connect identifier specified (DBD ERROR: OCIServerAttach) at conn.pl line 4 Database connection not made: ORA-12154: TNS:could not resolve the connect identifier specified (DBD ERROR: OCIServerAttach) at conn.pl line 4.

So this time, I enabled a sqlnet.ora trace by specifying the following parameters in sqlnet.ora:
 
TRACE_LEVEL_CLIENT = SUPPORT
TRACE_DIRECTORY_CLIENT = /tmp
TRACE_LEVEL_CLIENT parameter values are OFF, USER, ADMIN, SUPPORT. By default, or if not specified, it’s value is OFF. Set the value depending on how much detailed you want your trace file to have. (For details and description please refer to Metalink Note 216912.1 ). I’ve set ours to SUPPORT hoping I would get the information I need to fix this misleading error.

TRACE_DIRECTORY_CLIENT can be set to any valid directory in your system and the user that own’s your oracle client should have a write privilege on that directory, in this case, its /tmp.

After several trace files, it got more confusing and misleading as the trace file indicates that it cannot find the TNS “path”. So I came to think that the problem might related to a corrupted OCI driver. Thus we decided to reinstall the oracle client.

We tested the perl script after the oracle client was reinstalled and we’re back to square one because we have the core dump again. So I recreated the sqlnet.ora file and the core dump went away. But, we are now back to the ora-12154 error!

THE FIX:

Upon googling pages and pages of forums and support issues, I chanced upon this document from Metalink (Note: 388631.1 – Solaris x86-64: Running 32-bit Applications Connecting to Database Using TNS Naming Adapter Fails With Segmentation Fault (SIGSEGV) or ORA-12154).

Apparently, we are hitting a bug! So I immediately requested for an outage for us to install the patch for this bug.

Installing the patch has fixed the misleading ora-12154 error and even if you remove (rename) sqlnet.ora, core dumps are not being created when you use perl to connect to the database.

And that made my day….woohoo…!

h1

Index Usage Monitoring

August 5, 2009

So last night, i got a couple of pages because some of our tablespaces for indexes are filling up. Good thing, these tablespaces did not fill up until I came to the office.

We have this one multi-tb partitioned table that has lots of indexes (normal and bmi) that i think only a few of them are really being used so i decided to do the following:

1. increase the max extent of these tablespaces to 250gb so the alerts will go away. anyway, we have enough space in the filesystem to accomodate this.

2. figure out which indexes can be dropped. i know that we should be careful enough when dropping indexes as it might impact performance which is the reason why i wrote a script that captures which indexes are used and not used. this script will run for a month before we can talk to the application owner and recommend that the unused indexes be dropped.

Dropping unused indexes will free up a lot of space, will provide a shorter window for index maintenance, and faster data loading time.

The scripts that I wrote are as shown below (i’m fond of writing scripts that are reusable):

1. runsql.sh

#!/usr/bin/ksh -x
sqlexec=`sqlplus /nolog <<SQLBLK
conn ${CONNSTR}
set echo off feed on head off lines 80
spool ${LOGDIR}/${DT}_${1}.log
@${SQLDIR}/${1}
spool off
set echo off
exit
SQLBLK`
exit

The above script is reusable, just make sure you have the values for the environment variables it needs. its also one way of securing SQL*Plus connections as it hides the connect string from a ps -ef command.

2. chk_idx_usage.sh

#!/usr/bin/ksh -x
export DT=`date ‘+%Y%m%d%H%M%S’`
export ORACLE_HOME=/opt/Oracle/product/10.2.0
export PATH=${ORACLE_HOME}/bin:${PATH}:.
export LOGDIR=/home/oracle/admin/log
export SQLDIR=/home/oracle/admin/schema/oradb
export PSWD=`cat /home/oracle/admin/.oradbmon`
export CONNSTR=oradbmon/${PSWD}@oradb
/home/oracle/admin/bin/runsql.sh gsx_idx_usage
exit

3. idx_usage.sql

select min(p.timestamp)||’,’||max(p.timestamp)||’,’||p.object_name||’,’||p.operation||’,’||p.options||’,’||count(1)
from dba_hist_sql_plan p,
   dba_hist_sqlstat s
where p.object_owner = ‘Myschema’
and p.operation like ‘%INDEX%’
and p.operation <> ‘INDEX BUILD’
and p.sql_id = s.sql_id
and p.object_name in (select index_name from dba_indexes where table_name=’HUGE_TABLE’)
group by p.object_name,
   p.operation,
   p.options
/

I patterned my query from Burleson’s article found in the web and customized it for my particular need. I did not query for ‘INDEX REBUILD’ operations as this will include all indexes even if they are not being used. Also, I concatenated the result set so I can put them in a spreadsheet someday in case I need to figure out a trend.

h1

ORA-16826: apply service state is inconsistent with the DelayMins property

August 4, 2009

16826, 0000, “apply service state is inconsistent with the DelayMins property”
// *Cause:  This warning was caused by one of the following reasons:
//          1. Apply service was started without specifying the real time apply
//             option or without the NODELAY option while DelayMins is zero.
//          2. Apply service was started with the real-time apply option or
//             with the NODELAY option while DelayMins is greater than zero.
// *Action: Reenable the standby database to allow the broker to restart
//          the apply service with the apply options that are consistent
//          with the specified value of the DelayMins property.

well, this error has convinced me to really start out this blog because I don’t want to spend a lot of time researching just to find out that it only takes less than a minute to fix it. i’ve encountered this error before but forgot how to troubleshoot it thus the blog.

if ever you have this error, the simple fix would be to get on data guard manager line-mode (dgmgrl, i prefer command lines rather than gui’s because it gives more control over what’s happening) and disable the configuration and reenable it. see below for details

 

prim-oradb:dgmgrl
DGMGRL for Solaris: Version 10.2.0.3.0 – 64bit Production

Copyright (c) 2000, 2005, Oracle. All rights reserved.

Welcome to DGMGRL, type “help” for information.
DGMGRL> connect sys
Password:
Connected.
DGMGRL> show configuration

Configuration
  Name:                dg_iq_oradb
  Enabled:             YES
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    prim_oradb – Primary database
    std_oradb – Physical standby database

Current status for “dg_iq_oradb”:
Warning: ORA-16608: one or more databases have warnings
DGMGRL> show database verbose ‘prim_oradb’

Database
  Name:            prim_oradb
  Role:            PRIMARY
  Enabled:         YES
  Intended State:  ONLINE
  Instance(s):
    oradb

  Properties:
    InitialConnectIdentifier        = ‘prim_oradb’
    LogXptMode                      = ‘ASYNC’
    Dependency                      = ”
    DelayMins                       = ‘0’
    Binding                         = ‘OPTIONAL’
    MaxFailure                      = ‘0’
    MaxConnections                  = ‘1’
    ReopenSecs                      = ‘300’
    NetTimeout                      = ‘180’
    LogShipping                     = ‘ON’
    PreferredApplyInstance          = ”
    ApplyInstanceTimeout            = ‘0’
    ApplyParallel                   = ‘AUTO’
    StandbyFileManagement           = ‘MANUAL’
    ArchiveLagTarget                = ‘0’
    LogArchiveMaxProcesses          = ‘2’
    LogArchiveMinSucceedDest        = ‘1’
    DbFileNameConvert               = ”
    LogFileNameConvert              = ”
    FastStartFailoverTarget         = ”
    StatusReport                    = ‘(monitor)’
    InconsistentProperties          = ‘(monitor)’
    InconsistentLogXptProps         = ‘(monitor)’
    SendQEntries                    = ‘(monitor)’
    LogXptStatus                    = ‘(monitor)’
    RecvQEntries                    = ‘(monitor)’
    HostName                        = ‘prim_oradb’
    SidName                         = ‘oradb’
    LocalListenerAddress            = ‘(ADDRESS=(PROTOCOL=tcp)(HOST=prim-oradb.com)(port=1522))’
    StandbyArchiveLocation          = ‘/opt/Oraarch/arch/oradb/’
    AlternateLocation               = ”
    LogArchiveTrace                 = ‘0’
    LogArchiveFormat                = ‘arch_%t_%s_%r.log’
    LatestLog                       = ‘(monitor)’
    TopWaitEvents                   = ‘(monitor)’

Current status for “prim_oradb”:
SUCCESS

DGMGRL> show database verbose ‘std_oradb’

Database
  Name:            std_oradb
  Role:            PHYSICAL STANDBY
  Enabled:         YES
  Intended State:  ONLINE
  Instance(s):
    oradb

  Properties:
    InitialConnectIdentifier        = ‘std_oradb’
    LogXptMode                      = ‘ASYNC’
    Dependency                      = ”
    DelayMins                       = ‘0’
    Binding                         = ‘OPTIONAL’
    MaxFailure                      = ‘0’
    MaxConnections                  = ‘1’
    ReopenSecs                      = ‘300’
    NetTimeout                      = ‘180’
    LogShipping                     = ‘ON’
    PreferredApplyInstance          = ”
    ApplyInstanceTimeout            = ‘0’
    ApplyParallel                   = ‘AUTO’
    StandbyFileManagement           = ‘MANUAL’
    ArchiveLagTarget                = ‘0’
    LogArchiveMaxProcesses          = ‘2’
    LogArchiveMinSucceedDest        = ‘1’
    DbFileNameConvert               = ”
    LogFileNameConvert              = ”
    FastStartFailoverTarget         = ”
    StatusReport                    = ‘(monitor)’
    InconsistentProperties          = ‘(monitor)’
    InconsistentLogXptProps         = ‘(monitor)’
    SendQEntries                    = ‘(monitor)’
    LogXptStatus                    = ‘(monitor)’
    RecvQEntries                    = ‘(monitor)’
    HostName                        = ‘std-oradb’
    SidName                         = ‘oradb’
    LocalListenerAddress            = ‘(ADDRESS=(PROTOCOL=tcp)(HOST=std-oradb.com)(port=1522))’
    StandbyArchiveLocation          = ‘/opt/Oraarch/arch/oradb/’
    AlternateLocation               = ”
    LogArchiveTrace                 = ‘1’
    LogArchiveFormat                = ‘arch_%t_%s_%r.log’
    LatestLog                       = ‘(monitor)’
    TopWaitEvents                   = ‘(monitor)’

Current status for “std_oradb”:
Warning: ORA-16826: apply service state is inconsistent with the DelayMins property

DGMGRL> disable configuration
Disabled.
DGMGRL> show configuration

Configuration
  Name:                dg_iq_oradb
  Enabled:             NO
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    prim_oradb – Primary database
    std_oradb – Physical standby database

Current status for “dg_iq_oradb”:
DISABLED

DGMGRL> enable configuration
Enabled.

DGMGRL> show configuration

Configuration
  Name:                dg_iq_oradb
  Enabled:             YES
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    prim_oradb – Primary database
    std_oradb – Physical standby database

Current status for “dg_iq_oradb”:
Warning: ORA-16610: command ‘Broker automatic health check’ in progress
DGMGRL> show configuration

Configuration
  Name:                dg_iq_oradb
  Enabled:             YES
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    prim_oradb – Primary database
    std_oradb – Physical standby database

Current status for “dg_iq_oradb”:
SUCCESS

DGMGRL>

Follow

Get every new post delivered to your Inbox.