Tuesday, December 21, 2010

Oracle block browser and editor (BBED)

Oracle BBED is there from Oracle7 to Oracle10g and is designed for internal use only.

Check the above links for more info on this tool:

http://www.dba-oracle.com/t_bbed.htm
http://www.dba-oracle.com/forensics/t_forensics_bbed.htm

http://orafaq.com/papers/dissassembling_the_data_block.pdf
http://www.oracle-internals.com/?p=17

Also check this : http://www.freelists.org/post/oracle-l/is-it-the-bbed



This is a very dangerous tool to use it in production with out Oracle Support guidance and suggestion. Also the tool is password protected so you must find the password to run it.

Thursday, December 16, 2010

Building an Oracle RAC 11g R2 on Linux - (RHEL 5.5)

A very nice article about build Oracle 11gR2 RAC on Linux
http://www.idevelopment.info/data/Oracle/DBA_tips/Oracle11gRAC/CLUSTER_12.shtml

Wednesday, December 15, 2010

How is 'skip tablespace' handled by the RMAN duplicate command

Once RMAN opens the Auxilary database it will attempt to drop all tablespaces which are skipped.
RMAN does the drop of these tablespaces in reverse sorted tablespace names to be skipped.
Depending on what is in these tablespaces, or if it is the system default tablespace, the 'drop
tablespace' command can fail.

Tuesday, December 14, 2010

RMAN Tablespace Point-in-Time Recovery and ORA-19602

Bug: 7016765 10.2.0.3.0 RDBMS 10.2.0.3.0 RMAN
Abstract: ORA-19602 WHEN RUNNING RMAN'S TSPITR
RMAN TSPITR does not always work using datafilecopies. Not an issue with backupsets.
This is fix in 10.2.0.5

Path 7016765 have been ported for Linux x86_64. You can also check on Metalink patch search for other platforms.

This bug also states for the 10.2.0.4 version and if you hit this there is no workaround until you apply 10.2.0.5.So in order to avoid this bug , you must manual duplicate the database to another instance and applying incomplete recovery and then do a transaportable tablespace set export/import to the target database for tbs you want to perform TSPITR.

My scenario is based on a 3 Node Oracle 10gR2 (10.2.0.4) RAC database.

A tablespace called MYTBS must be recovered back in time before an accidentally "drop user cascade" occurred. The user dropped at 14:00 and itried to use TSPITR but RMAN failed with ORA-19602 as stated in the beginning of the article.

The only valid workaround to this very annoying issue (believe me , i have fall into this bug on my production 24x7 system) is to use the duplicate database feature doing a whole database point in time recovery skipping all the tablespaces you except SYSTEM,SYSAUX,UNDO and the tablespace you want to recover.

In this example ORACLE_BASE=/u01/app/oracle

1) Create an AUX init.ora file like you do in manual TSPITR having the following entries.

# FILE: initAUX.ora
audit_file_dest =/u01/app/oracle/admin/AUX/adump
background_dump_dest =/u01/app/oracle/admin/AUX/bdump
core_dump_dest =/u01/app/oracle/admin/AUX/cdump
user_dump_dest =/u01/app/oracle/admin/AUX/udump
db_name =AUX
instance_name =AUX
db_unique_name=AUX
db_create_file_dest=/data/ttbs
db_recovery_file_dest=/data/ttbs
db_recovery_file_dest_size=100000M # Resize according to the size on the mount point you are duplicating
db_block_size = 8192
compatible=10.2.0.4.0
db_block_size=8192
db_files=1000
large_pool_size=1M
shared_pool_size=110M
remote_login_passwordfile=exclusive
undo_management=AUTO
undo_retention=900
undo_tablespace=UNDO2 # Must be the same as the target instance

I have placed my file in the [oracle@labdb01 AUX]$ORACLE_HOME/dbs in order sqlplus to use it by default.

2) Create the following directories and the password file
[oracle@labdb01 AUX]$ mkdir -p /data/ttbs/AUX
[oracle@labdb01 AUX]$ mkdir -p /u01/app/oracle/admin/AUX/adump
[oracle@labdb01 AUX]$ mkdir -p /u01/app/oracle/admin/AUX/bdump
[oracle@labdb01 AUX]$ mkdir -p /u01/app/oracle/admin/AUX/udump
[oracle@labdb01 AUX]$ mkdir -p /u01/app/oracle/admin/AUX/cdump

[oracle@labdb01 AUX]$ orapwd file=[oracle@labdb01 AUX]$ORACLE_HOME/dbs/orapwAUX password=pass

3) Startup nomount the AUX instance
[oracle@labdb01 AUX]$ export ORACLE_SID=AUX
[oracle@labdb01 AUX]$ sqlplus / as sysdba
SQL> create spfile from pfile;

File created.

SQL> startup nomount
ORACLE instance started.

Total System Global Area 205520896 bytes
Fixed Size 2082752 bytes
Variable Size 146802752 bytes
Database Buffers 50331648 bytes
Redo Buffers 6303744 bytes
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
[oracle@labdb01 AUX]$

4) Create an rman script for the duplication (dup.rcv)
Keep in mind that SYSTEM,SYSAUX and UNDO tablespaces can not be skipped.
Also the recovery set of tablespaces must consist of all dependable objects e.g table and indexes , tables/partitions, lob segments , lob indexes.
Check also the v$archived_log in order to find the exact time to use.

RUN
{
sql 'alter system archive log current';
allocate channel C1 device type disk;
allocate auxiliary channel C2 device type disk;
set until time "to_date('14/12/2010 13:20:00','DD/MM/YYYY HH24:MI:SS')";
DUPLICATE TARGET DATABASE TO AUX
SKIP TABLESPACE USERS;
}

5) Connect to rman and run the script
NOTE: if your target db is using a catalog then add the string catalog user/pass@catalog_sid

[oracle@labdb01 ~]$ rman target sys@testdb1 auxiliary /
RMAN>@dup.rcv

NOTE: You cannot skip the default temporary tablespace from the recovery set.

6) Begin transport tablespace procedure.

on AUX :

SQL>begin
SQL>sys.dbms_tts.transport_set_check('MYTBS',TRUE);
SQL>end;
SQL>/

PL/SQL procedure successfully completed.

SQL>select * from sys.transport_set_violations;

If this query returns rows then fix them before proceed.

SQL>alter tablespace mytbs read only;

SQL> create or replace directory exp_dir as '/data/ttbs/AUX';

Directory created.

SQL> grant read,write on directory exp_dir to system;

Grant succeeded.

SQL>exit;

[oracle@labdb01 ~]$ expdp system dumpfile=mytbs.dmp directory=EXP_DIR TRANSPORT_TABLESPACES=MYTBS TRANSPORT_FULL_CHECK=Y

If the expdp fails then :

SQL>select reason,count(*) from sys.pluggable_set_check where ts1_name like '%MYTBS%' group by reason;

And resolve each reason !!!

Copy the dmp file and the datafile(s) of the tabelspace to the desired target db directory.
[oracle@labdb01 ~]$ cp /data/ttbs/AUX/mytbs.dmp /data/ttbs/testdb/
[oracle@labdb01 ~]$ cp /data/ttbs/AUX/datafile/o1_mf_mytbs_6jgv6obd_.dbf /data/ttbs/testdb/

IF you are recovering form a user error like our scenario drop the tabelspace from the target !!
Re create the dropped user. Remember to grant his roles and give him his object grants.
My user had the dba role granted so.



[oracle@labdb01 ~]$ sqlplus sys@testdb as sysdba
SQL> drop tablespace mytbs including contents and datafiles cascade constraints;
SQL> grant dba to agis identified by agis;
SQL> alter user agis default tablespace users;
SQL> exit;

Go to the directory you copied the files:

[oracle@labdb01 ~]$ cd /data/ttbs/testdb/

I have a directory object TTS_DIR that points to the correct O/S directory. If not create one as in previous section
[oracle@labdb01 testdb]$ export ORACLE_SID=testdb
[oracle@labdb01 testdb]$ impdp system dumpfile=mytbs.dmp logfile=imp.log DIRECTORY=TTS_DIR TRANSPORT_DATAFILES='/data/ttbs/testdb/o1_mf_mytbs_6jgv6obd_.dbf'

If you want you can omit to gather statistics again using : EXCLUDE=TABLE_STATISTICS EXCLUDE=INDEX_STATISTICS on the impdp

At this point the tablespace mytbs has been restored bur its read only and unknown to the control file.


[oracle@labdb01 testdb]$ sqlplus agis/agis

SQL*Plus: Release 10.2.0.4.0 - Production on Tue Dec 14 15:40:17 2010

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> select table_name from user_tables;

TABLE_NAME
------------------------------
TEST

SQL> select count(*) from test;

COUNT(*)
----------
49643

BUT !!!

[oracle@labdb01 testdb]$ rman target /

Recovery Manager: Release 10.2.0.4.0 - Production on Tue Dec 14 15:41:02 2010

Copyright (c) 1982, 2007, Oracle. All rights reserved.

connected to target database: TESTDB (DBID=2520745547)

RMAN> report schema;

using target database control file instead of recovery catalog
Report of database schema

List of Permanent Datafiles
===========================
File Size(MB) Tablespace RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1 490 SYSTEM *** /data/ttbs/testdb/system01.dbf
2 90 UNDOTBS1 *** /data/ttbs/testdb/undotbs01.dbf
3 360 SYSAUX *** /data/ttbs/testdb/sysaux01.dbf
4 5 USERS *** /data/ttbs/testdb/users01.dbf
5 25 UNDOTBS2 *** /data/ttbs/testdb/undotbs02.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1 166 TEMP 32767 /data/ttbs/testdb/temp01.dbf


Now its the time to rename the datafile if you want or make it writable in order to inform the controlfile.

On the target instance
[oracle@labdb01 testdb]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Tue Dec 14 15:43:34 2010

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> select file_id,file_name from dba_data_files where tablespace_name='MYTBS';

FILE_ID
----------
FILE_NAME
--------------------------------------------------------------------------------
6
/data/ttbs/testdb/o1_mf_mytbs_6jgv6obd_.dbf


SQL> alter tablespace mytbs offline;
SQL> !mv /data/ttbs/testdb/o1_mf_mytbs_6jgv6obd_.dbf /data/ttbs/testdb/mytbs.dbf
SQL> alter tablespace mytbs rename datafile '/data/ttbs/testdb/o1_mf_mytbs_6jgv6obd_.dbf' to '/data/ttbs/testdb/mytbs.dbf';

Tablespace altered.

SQL> select file_id,file_name from dba_data_files where tablespace_name='MYTBS';

FILE_ID
----------
FILE_NAME
--------------------------------------------------------------------------------
6
/data/ttbs/testdb/mytbs.dbf


SQL> alter tablespace mytbs online;

Tablespace altered.

SQL> alter tablespace mytbs read write;

Tablespace altered.

SQL> exit

Backup your tabelspace imemdiate after the restore !!!

7) Drop the AUX database
[oracle@labdb01 ~]$ export ORACLE_SID=AUX
[oracle@labdb01 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Tue Dec 14 15:56:22 2010

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> shutdown abort;
ORACLE instance shut down.
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
[oracle@labdb01 ~]$ rm -fr /data/ttbs/AUX
[oracle@labdb01 ~]$


We keep the password file and the init file in order to use the same configuration for other point int time recovery scenarios.

If you database uses ASM you must put the tabelspace in ASM.

From sqlplus
SQL> alter database tablespace mytbs offline;

Use RMAN to transfer the tablespace to ASM

rman target /
RMAN>
RMAN> backup as copy tablespace mytbs format '+DATA';
RMAN> switch tablespace mytbs to copy;
RMAN> recover tablespace mytbs;

From sqlplus

SQL> alter database tablespace mytbs online;

Database altered.

And you are done !!!

References:

Metalink Documnet IDs:
340848.1
368029.1
153981.1
388424.1
371556.1

Friday, November 26, 2010

Oracle enqueue wait tips

Oracle enqueue wait tips from Burleson Consulting site

http://www.dba-oracle.com/t_enqueue_wait_types_.htm

Wednesday, November 17, 2010

Installation of Oracle 10gR2 database on RedHat Enterprise Linux/Oracle Enterprise Linux 5 (x86_64)

As root

1) Check Hardware Requirements

# grep MemTotal /proc/meminfo >= 1Gb
# grep SwapTotal /proc/meminfo
Between 1024 MB and 2048 MB 1.5 times the size of RAM
Between 2049 MB and 8192 MB Equal to the size of RAM
More than 8192 MB 0.75 times the size of RAM

# df -h
400 MB of disk space in the /tmp directory
Between 1.5 GB and 3.5 GB of disk space for the Oracle software, depending on the installation type
1.2 GB of disk space for a preconfigured database that uses file system storage (optional)
To determine the amount of free disk space on the system, enter the following command

2) Kernel Requirements

# uname -a >= 2.6.18-8

3) Disable SELinux or switch it to "Permissive" mode

#/usr/sbin/sestatus
SELinux status: disabled
or
#/usr/sbin/getenforce
Disabled

If you see the previous output SELinux is disabled.

If SELInux is enabled then run :

Commands, as root:
======================
getenforce (returns "Enforcing")
setenforce 0
getenforce (returns "Permissive")

End to restore it to enforcing
setenforce 1
getenforce (returns "Enforcing")

For more info check Oracle Metalink Note: 454196.1

4) Installing Required packages

Use rpm -qa | grep package_name for the following packages

binutils-2.17.50.0.6-2 (x86_64)
compat-db-4.2.52-5.1 (x86_64)
compat-libstdc++-296-2.96-138 (i386)
compat-libstdc++-33-3.2.3-61(x86_64)
compat-libstdc++-33-3.2.3-61 (i386)
control-center-2.16.0-14 (x86_64)
gcc-4.1.1-52 (x86_64)
gcc-c++-4.1.1-52 (x86_64)
glibc-2.5-12 (x86_64)
glibc-2.5-12 (i686)
glibc-common-2.5-12 (x86_64)
glibc-devel-2.5-12 (x86_64)
glibc-devel-2.5-12 (i386)
glibc-headers-2.5-12 (x86_64)
ksh-20060214-1.4 (x86_64)
libaio-0.3.106-3.2 (x86_64)
libgcc-4.1.1-52 (i386)
libgcc-4.1.1-52 (x86_64)
libgnome-2.16.0-6 (x86_64)
libgnomeui-2.16.0-5 (x86_64)
libgomp-4.1.1-52 (x86_64)
libstdc++-4.1.1-52 (x86_64)
libstdc++-devel-4.1.1-52 (x86_64)
libXp-1.0.0-8 (i386)
make-3.81-1.1 (x86_64)
sysstat-7.0.0-3 (x86_64)

The version of the above packages is the lower needed. So if you are in higher you are OK

e.g

# rpm -qa --queryformat "%{NAME}-%{VERSION}-%{RELEASE} (%{ARCH})\n" | grep binutils
binutils-2.17.50.0.6-14.el5 (x86_64)

5) Set the kernel parameters

kernel.shmall = physical RAM size / pagesize (getconf PAGESIZE) -- If the defaults are greater then leave it.
kernel.shmmax = 1/2 of physical RAM, but not greater than 4GB -- If the defaults are greater then leave it.
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 512 x processes (for example 65536 for 128 processes)
net.ipv4.ip_local_port_range =9000 65500
a.) The runInstaller (OUI) checks may expect this to be the old guidance of “1024 65000”. The new guidance from Oracle development is “9000 65500”. Please allow the runInstaller (OUI) to proceed with the new guidance from Oracle development.
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 1048576

# vi /etc/sysctl.conf

Set the values and then run :

# sysctl -p

6) Add Oracle user and groups and create the ORACLE_BASE directory for Oracle Software and database.

# groupadd oinstall
# groupadd dba
# useradd -g oinstall -G dba oracle
# passwd oracle

e.g
# mkdir -p /u01/app/oracle
# chown -R oracle:oinstall /u01/app/oracle

7) Set the session limits for Oracle User

Set the following to the /etc/security/limits.conf
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536

Set the following to the /etc/pam.d/login
session required pam_limits.so

Add the following to /etc/profile if Oracle user will use the bash shell.

if [ $USER = "oracle" ]; then
ulimit -u 16384
ulimit -n 65536
fi

As oracle user
8) Login as Oracle and change the .bash_profile
Add the following in order to begin the installation
export ORACLE_BASE=/u01/app/oracle
export ORACLE_SID=orcl # You can not set the ORACLE_SID if you do not want write now
export DISPLAY=YOUR_XTERM_IP:0 in order to you an X session for the installation
unset ORACLE_HOME
$ . ./.bash_profile

to apply the new settings

9) Put the software for 10gR2 to a desired server directory.

10201_database_linux_x86_64.cpio.gz
10201_companion_linux_x86_64.cpio.gz
p6810189_10204_Linux-x86-64.zip

Unzip the database and run
$ gunzip 10201_database_linux_x86_64.cpio.gz
$ cpio -idmv < 10201_database_linux_x86_64.cpio
$ cd database
$./runInstaller –ignoreSysPrereqs

Then unzip the companion and run
$ gunzip 10201_companion_linux_x86_64.cpio.gz
$ cpio -idmv < 10201_companion_linux_x86_64.cpio
$ cd companion
$./runInstaller –ignoreSysPrereqs

Finally unzip the 10.2.0.4 Patch Set and run (At the time this post was written only the 10.2.0.4 patch was available):
$ unzip p6810189_10204_Linux-x86-64.zip
$ cd Disk1
$ ./runInstaller –ignoreSysPrereqs


10) Create the database
Add the following to the .bash_profile
export ORACLE_HOME=[YOUR_ORACLE_HOME]
export PATH=$ORACLE_HOME/bin:.:$PATH

Run
$ . ./.bash_profile
$ dbca

Create your desired database and you finished
After that run

$ netca

To create a listener

ATTENTION if your host name is more than 32 characters then you are about to have the following situation:

If you have choose to install EM and during its creation you get from dbca
ORA-06502: PL/SQL: numeric or value error: character string buffer too small

then apply the following workaround:

1) Ignore error and continue with database creation
2) After successful finish of dbca do :

Verify the case using sqlplus
SQL> select length(host_name) from v$instance where rownum=1;

LENGTH(HOST_NAME)
-----------------
36

Here the length is 36 (>32) and for 10.2 this is a bug and em repository creation fails.
So
a) Make a backup copy of the file

$ cp $ORACLE_HOME/sysman/admin/emdrep/sql/core/latest/self_monitor/self_monitor_post_creation.sql $ORACLE_HOME/sysman/admin/emdrep/sql/core/latest/self_monitor/self_monitor_post_creation.sql_bak

b) Change the l_host_name variable in self_monitor_post_creation.sql to have 128 characters
The l_host_name variable needs to be changed twice in the script.
$ vi $ORACLE_HOME/sysman/admin/emdrep/sql/core/latest/self_monitor/self_monitor_post_creation.sql

c) Drop and recreate the DB Control configuration files and repository
$ $ORACLE_HOME/bin/emca -deconfig dbcontrol db -repos drop
$ $ORACLE_HOME/bin/emca -config dbcontrol db -repos create

Check your em by using the URL mentioned.


REFERENCES
Metalink Note ID 421308.1, 335118.1

Thursday, November 11, 2010

Flushing and closing trace files using oradebug

Lots of times we erase trace files on the $ORACLE_BASE/admin directory but the database is still open. The processes that have created these files have the inode of the file opened and thus removing the file any new information goes to it but we can not see the file until the process rotates it or close its.

We can use the oradebug utility to manual flush and close the file.

e.g

[oracle@labdb01 udump]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Thu Nov 11 12:29:35 2010

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> SELECT
s.sid, s.serial#, p.pid
FROM v$session s, v$process p
WHERE s.paddr=p.addr
and s.program like '%LMD%'; 2 3 4 5

SID SERIAL# PID
---------- ---------- ----------
1101 1 6

SQL> oradebug setorapid 6
Unix process pid: 17652, image: oracle@labdb01 (LMD0)
SQL> oradebug close_trace
Statement processed.
SQL> oradebug flush
Statement processed.
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

In the above example we use the query to find the PID for the oracle lmd0 process.
The we use the oradebug in order to flush the data and close the file.

We can also use the O/S PID.

$ ps -ef | grep $ORACLE_SID | grep lmd

Would return the PID for Oracle background process LMD.

Now:

$ sqlplus / as sysdba
SQL>oradebug setospid
SQL>oradebug close_trace
SQL>oradebug flush
SQL>exit
Now check the file in the bdump location, a new file would have been created

Wednesday, November 10, 2010

Oracle 10G Composite RANGE-LIST partitioning

Composite RANGE-LIST partitioned table maintenace.

1) Create

create
table mytable
(
PROC varchar2(30 byte),
ACTION varchar2(100 byte),
START_TIME date,
END_TIME date,
)
tablespace USERS enable row movement
partition by range (start_time)
subpartition by list (action)
subpartition template
(
subpartition added values ('ADD') tablespace users ,
subpartition removed values ('REMOVE') tablespace users
)
(
partition nov10 values less than (to_date('01/12/10','DD/MM/YYYY')) tablespace users,
partition dec10 values less than (to_date('01/01/11','DD/MM/YYYY')) tablespace users,
partition jan11 values less than (to_date('01/02/11','DD/MM/YYYY')) tablespace users
);

create index my_idx on mytable (ACTION) local;

All partitions are stored in the same tablespace and a local index is created too.

ORA-01450 maximum key length (num) exceeded

I faced this error when i was rebuilding and index online but without the online the index rebuilds fine.

This is caused by Bug:2525767. The online rebuild of the index creates a journal table and index and the table contains more columns in its index with total length greater than number reported in ORA-01450 error message.This is a feature of online rebuild.

Maximum key length is calculated with respect to the database block size. It
means that current value of the initialization parameter db_block_size is not
large enough so that the internal journal IOT can be created without errors.

Workaround to rebuild the index offline.

Monday, November 1, 2010

Oracle Event Codes

In the following link you can find all Oracle event Codes.

http://www.kevinloney.com/free/events.htm

WARNING: inbound connection timed out (ORA-3136)

The "WARNING: inbound connection timed out (ORA-3136)" in the alert log indicates that the client was not able to complete it's authentication within the SQLNET.INBOUND_CONNECT_TIMEOUT time.(Defaults to 60 secs after 10.2)

Take look of the following

1. Check whether local connection on the database server is successful & quick.
2. If local connections are quick ,then check for underlying network delay with the help of your network administrator.
3. Check whether your Database performance has degraded in anyway.
4. Check alert log for any critical errors for eg, ORA-600 or ORA-7445 and get them resolved first.
These critical errors might have triggered the slowness of the database server.

Wednesday, October 13, 2010

Oracle External Table Examples

If you have some temporary data to import to the database you can use an external table instead.

Lets amuse that you have the following file.

myfile.txt
1
2
3
....

1) You must create a directory and grant read writes to your user.

create or replace directory my_dir as '/home/oracle';

grant read on directory my dir to my_user;

2) Put your file there and


create table mytable_load
(
a int
)
organization external
(
type oracle_loader default directory my_dir
access parameters (RECORDS DELIMITED BY NEWLINE FIELDS)
location ('myfile.txt')
);


... and you have your table.

This is an example of adding a window csv file created by Excel , having Headings and the Windows new line as record terminator:

create table mytable_load
(
a int,
b varchar2(100 char),
c date
)
organization external
(
type oracle_loader default directory my_dir
access parameters (RECORDS DELIMITED BY '\r\n' skip 1 FIELDS TERMINATED BY ';')
location ('myfile.csv')
);

Thursday, October 7, 2010

ORA-600 [17018] And ORA-21522

In your alert log you find out : You have

ORA-00600: internal error code, arguments: [17018], [0], [], [], [], [], [], []

In the trace file you find out that :

Error on rollback: ORA-22303: type "SYS"."AQ$_RECIPIENTS" not found
ORA-21522: attempted to use an invalid connection in OCI (object mode only)
*** 2010-10-05 07:23:34.517
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [17018], [0], [], [], [], [], [], []

The call stack has one of the following:

kglrls kxsReleaseLookupLoc kxsUnlock kksCloseCursor opicca
or
kglrls kxsReleaseParentLoc kxsFreeXsc kksCloseCursor opicca

The cause of this problem has been identified and verified in unpublished Bug 5444620, which is a duplicate of unpublished Bug 5675310.

There is no known workaround for this bug which is fixed in 11.1
According to development, it is not feasible to fix this issue in the 10.2

Tuesday, September 28, 2010

Un Indexed Foreign Keys

Un indexed foreign keys is meaning a bad schema design. If you do not have index on foreign keys you will probably face situations of
1) deadlocks (http://agstamy.blogspot.com/2010/09/oracle-deadlocks.html)
2) But query plans when joining parent/child table
3) Full table scans causing locks on child tables while updating the parent table pk
4) Problem with cascade delete

There is always an overhead for too many indexes specially on tables that may be high transactional but you will not face the above situations. Bellow are too interesting scripts.

First one you can find the foreign keys referencing a list of parent tables.

select table_name,constraint_name
from dba_constraints
where owner='MY_OWNER'
and r_constraint_name in (
select constraint_name from dba_constraints
where owner='MY_OWNER'
and table_name in ('MY_TABLE','MY_TABLE2',......)
and constraint_type in ('P'));

You can then go and add indexes on the tables FKs columns.

A more automated script by Tomas Kyte.

SELECT
'create index '||owner||'.'||SUBSTR(table_name,INSTR(table_name,'_',1,1)+1)||'_idx'||TO_CHAR(rnk)||' on '||owner||'.'||table_name||'('||col1||');'
FROM (
SELECT
status,owner,table_name,col1,col2
,rnk+1 rnk
FROM(
SELECT status,owner,table_name,col1,col2
,ROW_NUMBER() OVER (PARTITION BY owner ORDER BY owner) rnk
FROM (
SELECT DECODE( b.table_name, NULL, '****', 'ok' ) Status, owner,
a.table_name, a.COLUMNS col1, b.COLUMNS col2
FROM
( SELECT SUBSTR(a.table_name,1,30) table_name, a.owner,
SUBSTR(a.constraint_name,1,30) constraint_name,
MAX(DECODE(position, 1, SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position, 2,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position, 3,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position, 4,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position, 5,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position, 6,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position, 7,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position, 8,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position, 9,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position,10,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position,11,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position,12,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position,13,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position,14,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position,15,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(position,16,', '||SUBSTR(column_name,1,30),NULL)) COLUMNS
FROM dba_cons_columns a, dba_constraints b
WHERE a.owner LIKE 'MY_OWNER'
AND a.owner=b.owner
AND a.constraint_name = b.constraint_name
AND b.constraint_type = 'R'
GROUP BY a.owner,SUBSTR(a.table_name,1,30), SUBSTR(a.constraint_name,1,30) ) a,
( SELECT SUBSTR(table_name,1,30) table_name, SUBSTR(index_name,1,30) index_name,
MAX(DECODE(column_position, 1, SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position, 2,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position, 3,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position, 4,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position, 5,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position, 6,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position, 7,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position, 8,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position, 9,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position,10,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position,11,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position,12,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position,13,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position,14,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position,15,', '||SUBSTR(column_name,1,30),NULL)) ||
MAX(DECODE(column_position,16,', '||SUBSTR(column_name,1,30),NULL)) COLUMNS
FROM dba_ind_columns
WHERE index_owner LIKE 'MY_OWNER'
GROUP BY SUBSTR(table_name,1,30), SUBSTR(index_name,1,30) ) b
WHERE a.table_name = b.table_name (+)
AND b.COLUMNS (+) LIKE a.COLUMNS || '%'
) WHERE status !='ok' ORDER BY 2
))
;


Any one can use its own naming conversion for the indexes by altering the first statement and the x on the rnk+x rnk column of the second select.

Oracle Deadlocks

Recently , i had a j2ee application that suddenly caused deadlocks ORA-00060 on a test environment. After that i had to refresh my memory about deadlocks.

In general a deadlock occurs when a session (A) wants a resource held by another session (B) , but that session also wants a resource held by the first session (A). There can be more than 2 sessions involved but the idea is the same.

You will find an entry in the alert.log pointing to the trace file that has all the info regarding the deadlock , the deadlock graph,, the sessions involved , the sql statements , the rwoid of the objects waited and other info.

The most common lock types seen in deadlock graphs are TX and TM locks.It is the lock type and modes which help determine what situation has caused the deadlock.


Lock Mode
Type Requested Probable Cause
~~~~ ~~~~~~~~~ ~~~~~~~~~~~~~~
TX X (mode 6) Application row level conflict.
Avoid by recoding the application to ensure
rows are always locked in a particular order.

TX S (mode 4) There are a number of reasons that a TX lock
may be requested in S mode usually caused by waits
E.g

Waits due to Row being locked by an active Transaction
Waits due to Unique or Primary Key Constraint enforcement
Waits due to Insufficient 'ITL' slots in the Block
Waits due to rows being covered by the same BITMAP index fragment


TM SSX (mode 5) This is usually related to the existence of
or foreign key constraints where the columns
S (mode 4) are not indexed on the child table.

ID1 of a TM lock indicates which object is being locked.
The TM lock id in the form TM-AAAAAAAA-BBBBBBBB by converting AAAAAAAA from hexadecimal to a decimal number DDDD we have the object_id. Then locate the object using SELECT * FROM dba_objects WHERE object_id= DDDD;


We also have deadlocks that concerns library cache locks with error ora-04020. The first thing is to check if they are invalid objects used, and try to recompile/validate them.
Secondly, check if DDL commands are executed by different sessions at the same time, and serialize this processing.

Great info in the web
http://yong321.freeshell.org/computer/deadlocks.txt
http://www.oratechinfo.co.uk/deadlocks.html

Always
asktom.oracle.com

And
Metalink Note IDs : 62365.1 , 62354.1, 33453.1, 166924.1

Thursday, September 9, 2010

Getting Milliseconds from Oracle Timestamps

http://www.dba-oracle.com/t_timestamp_math_elapsed_times.htm

An example procedure based on the link calculations.

create table test (a int, b int, primary key (a,b));

declare
startt timestamp;
endt timestamp;
ms int;
begin
dbms_output.enable(1000000);
startt:=systimestamp;
for i in 1 .. 1001
loop
insert into test values (i,i+1);
commit;
end loop;
endt:=systimestamp;
select sum(
(extract(hour from endt)-extract(hour from startt))*3600+
(extract(minute from endt)-extract(minute from startt))*60+
extract(second from endt)-extract(second from startt))*1000 into ms from dual;
dbms_output.put_line(ms);
end;
/

drop table test purge;

Creating materialized view in other schema reports error ORA-01031

While creating materialized view in other schema reports error ORA-01031.

The user you are using has the SELECT ANY TABLE,CREATE MATERIALIZED VIEW,CREATE ANY VIEW,CREATE ANY MATERIALIZED VIEW,CREATE ANY TABLE,CREATE SESSION.

This is due a bug. Grant explicitly the create table privilege to the user has the tables the materialized view relies.

Metalink Note Id: 749112.1

Friday, September 3, 2010

Reload Oracle JVM Oracle 10g

In order to reload the Oracle JVM in an Oracle Database 10g do the following:

1) shutdown the database

2) create the following script to remove the installed JVM:

spool full_rmjvm.log
set echo on
connect / as sysdba
startup mount
alter system set "_system_trig_enabled" = false scope=memory;
alter system enable restricted session;
alter database open;
@?/rdbms/admin/catnoexf.sql
@?/rdbms/admin/catnojav.sql
@?/xdk/admin/rmxml.sql
@?/javavm/install/rmjvm.sql
truncate table java$jvm$status;
select * from obj$ where obj#=0 and type#=0;
delete from obj$ where obj#=0 and type#=0;
commit;
select owner, count(*) from all_objects
where object_type like '%JAVA%' group by owner;
select obj#, name from obj$
where type#=28 or type#=29 or type#=30 or namespace=32;
select o1.name from obj$ o1,obj$ o2
where o1.type#=5 and o1.owner#=1 and o1.name=o2.name and o2.type#=29;
shutdown immediate
set echo off
spool off
exit


3) create the following script to reinstall JVM
spool full_jvminst.log;
set echo on
connect / as sysdba
startup mount
alter system set "_system_trig_enabled" = false scope=memory;
alter database open;
select obj#, name from obj$
where type#=28 or type#=29 or type#=30 or namespace=32;
@?/javavm/install/initjvm.sql
select count(*), object_type from all_objects
where object_type like '%JAVA%' group by object_type;
@?/xdk/admin/initxml.sql
select count(*), object_type from all_objects
where object_type like '%JAVA%' group by object_type;
@?/xdk/admin/xmlja.sql
select count(*), object_type from all_objects
where object_type like '%JAVA%' group by object_type;
@?/rdbms/admin/catjava.sql
select count(*), object_type from all_objects
where object_type like '%JAVA%' group by object_type;
@?/rdbms/admin/catexf.sql
select count(*), object_type from all_objects
where object_type like '%JAVA%' group by object_type;
shutdown immediate
set echo off
spool off
exit

4) Compile all invalid objects

@?/rdbms/admin/utlrp.sql

5) Check for the jvm installed

select version, status from dba_registry where comp_id='JAVAVM';

and / or

select owner, count(*) from all_objects where object_type like '%JAVA%' group by owner;

6) If you previously have installed InterMedia (ORDSYS), UltraSearch (WKSYS) Data Mining (DMSYS), Spatial (MDSYS) or (EXFSYS) then create and runt the following script

spool jvm_refresh.log
connect / as sysdba
@?/ord/im/admin/iminst.sql;
connect / as sysdba
@?/dm/admin/dminst1.sql SYSAUX TEMP ;
connect / as sysdba
@?/md/admin/mdinst.sql;
connect / as sysdba
@?/dm/admin/dminst2.sql;
connect / as sysdba
@?/ultrasearch/admin/wk0deinst.sql SYS change_on_install "";
connect / as sysdba
@?/ultrasearch/admin/wk0install.sql SYS change_on_install change_on_install
SYSAUX TEMP "" PORTAL false;
connect / as sysdba
alter user WKSYS account unlock identified by change_on_install;
@?/ultrasearch/admin/wk0config.sql change_on_install
(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)
(HOST=)(PORT=1521))
(CONNECT_DATA=(SERVER=DEDICATED)
(SERVICE_NAME=.)))
false " ";
spool off


7) Recompile all invalid objects as in step 4

Wednesday, September 1, 2010

Oracle Database 11g: The Top New Features for DBAs and Developers

Oracle Database 11g:
The Top New Features for DBAs and Developers

http://www.oracle.com/technetwork/articles/sql/index-099021.html

The best series in order to understand the new features of 11g Database.

Thursday, July 29, 2010

RMAN Tablespace Point-in-Time Recovery

Recovery Manager (RMAN) automatic tablespace point-in-time recovery (commonly abbreviated TSPITR) enables you to quickly recover one or more tablespaces in an Oracle database to an earlier time, without affecting the state of the rest of the tablespaces and other objects in the database.

Check online Oracle documentation for TSPITR limitations and how tos.

Known bugs :

TS_PITR_CHECK SHOWS MISLEADING MESSAGES WHEN STATISTICS ARE LOCK (BUG 5192382)

Workaround

1) Apply patch 5192382
2) Unlock affected table statistics.
exec DBMS_STATS.UNLOCK_TABLE_STATS ('table_owner','table_name');

TS_PITR_CHECK REPORTS "MASTER TABLE USED FOR SNAPSHOTS/REPLICATION NOT "

No workaorund

Apply patch 5192382

Wednesday, July 21, 2010

Ora-30078 Error Creating A Partition Table With Timestamp As The Range

Partitions on column with datatype timestamp with or without tiem zone information on Oracle 9i and up(If this capability is implemented in the 11g i will edit the article).

You have the following table test and you want to partition it on the d column.
You write the following SQL:

create table test(
a varchar2(5) not null,
d timestamp not null
) partition by range (d) (
partition p1 values less than (to_timestamp('2010-08-15','YYYY-MM-DD')),
partition p_max values less than (maxvalue)
)
;

ORA-30078: partition bound must be TIME/TIMESTAMP WITH TIME ZONE literals

In order to create the table that you must write the SQL as follows:

create table test(
a varchar2(5) not null,
d timestamp not null
) partition by range (d) (
partition p1 values less than (timestamp '2010-08-15 00:00:00'),
partition p_max values less than (maxvalue)
)
;

create table succeeded.

Be careful of the nls_timestamp_format setting of your database.

If your column d was timestamp with time zone the statement would fail with :

ORA-03001: unimplemented feature

In 11g

But if you use the local time zone for column d

create table test(
a varchar2(5) not null,
d timestamp with local time zone not null
) partition by range (d) (
partition p1 values less than (timestamp '2010-08-15 00:00:00 +00:00'),
partition p_max values less than (maxvalue)
)
;

create table succeeded.

Be carefull that when you use the local time zone :
1 Data is normalized to the database time zone when it is stored in the database.
2 When the data is retrieved, users see the data in the session time zone.

Wednesday, July 14, 2010

ORA-12012

ORA-12012: error on auto execute of job [num]

If this is a job created with the dbms_job package then you can find information using the following select

select * from dba_jobs where job=[num];

If you have created the job using the dbms_scheduler then execute the following:

select obj# , object_name
from sys.scheduler$_job a, dba_objects b
where a.obj# = b.object_id
and a.obj#=[num];


The using the EM Dbconsole you can find more details about the job.

Friday, June 4, 2010

ORA-959 Tablespace '_$deleted$0$0' Does Not Exist or users has '_$deleted$0$0' as default tablespace.

You definitely have seen this if you use OEM tablespace reorganize feature using rename tablespace method.

The case is that a tablespace is renamed to one that use to exist on the database. Here are the steps that will cause this problem.

create tablespace aa
create tablespace bb
drop tablespace aa
rename bb to aa

Then why it occurs is because when a tablespace is dropped, its ts$ entry is set to an invalid state, but it is not deleted. When a rename is done, if the target name already has a ts$ entry, then during the rename, because we cannot have two ts$ entries with the same name, the _$deleted$ construct is introduced. The change is not taken care of properly, and is flushed to other dictionary tables causing this problem.

It also counts for temporary tbs.

Confirmed on to 10.2.0.3 and 10.2.0.4

Solution

1) Apply Patch 5769963

2) Using SQL

Alter the user to have as default/temp tablespace the new tbs
If an object is offended such and index partition or table partition default tbs then alter the object.
Check
http://agstamy.blogspot.com/2009/12/ora-00959-tablespace-deleted00-does-not.html

References
Oracle Metalink Notes 604648.1, 789023.1

Tuesday, June 1, 2010

IBM Technical Articles for AIX

IBM Technical Articles for AIX

http://www.ibm.com/developerworks/views/aix/library.jsp

Thursday, May 27, 2010

Backup Job fails with Agent unreachable when agent in one of rac nodes is down.

Please run the below procedure connecting as SYSMAN to the repository database

BEGIN
EMD_RAC.REGISTER_AVAILABILITY( 'rac_database', 'Response', 'Status', 'oracle_database','3.1');
COMMIT;
END;

Bounce the DB Control.

The Cpu & Active Sessions Charts Show "No Data Is Currently Available" For Cluster Database

On the EM dbconsole home page the Cpu & Active Sessions Charts Show "No Data Is Currently Available" For Cluster Database. In emagent.trc you find the following lines:

emagent.trc shows:-
2008-07-02 09:06:10 Thread-4124015504 WARN TargetManager: Query returned 0 rows (only one expected) for the dynamic property DGPrimaryDBName
2008-07-02 09:06:10 Thread-4124015504 WARN TargetManager: The execution of the query for DGPrimaryDBName in {RACDB, rac_database} is missing the following required property values : DGPrimaryDBName

This is can happen after a RAC node reboot.

To fix this issue you must apply patch for BUG:6005707

The following workaround is available:

$ORACLE_HOME/bin/emctl reload agent dynamicproperties [targetname]:[targettype]

eg.

emctl reload agent dynamicproperties RACDB:rac_database

Wait 5 minutes , after this the charts should be visible.

Thursday, May 20, 2010

10gR2 RAC ORA-00600 kgeade_is_0

Full Error on alert.log :
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []

Line from trace file :
select tablespace_id, rfno, allocated_space, file_size, file_maxsize, changescn_base, changescn_wrap, flag from GV$FILESPACE_USAGE where inst_id != :inst and (changescn_wrap >= :
w or (changescn_wrap = :w and changescn_base >= :b))

Having the initialization parameter PARALLEL_EXECUTION_MESSAGE_SIZE set to a different value on one or more instances in the cluster could cause an error when querying the gv$ views.

Solution

Make the parameter same across all RAC instances.

ALTER SYSTEM SET parallel_execution_message_size=2148 SCOPE=SPFILE SID='*';

Restart instance having this errors !!!

Tuesday, May 18, 2010

Oracle Blogs on the WEB

One of the coolest blogs about Oracle in the Web.

http://jarneil.wordpress.com/

Thursday, May 13, 2010

Oracle 10g RAC articles

A nice collection of articles concerning Oracle 10g RAC from database journal.

http://www.databasejournal.com/article.php/3559996

Thursday, April 22, 2010

CRS-1006 : No more members to consider

When trying to start a service that in crs_stat -t seems offline.

ora.DB.MYSRV.DB1.srv ONLINE OFFLINE on node1

Trying to stop the service saids that the service is stopped.
Trying to remove the service saids that the service is running.

The you have to stop the service manually from each instance.

On each instance do.

sqlplus / as sysdba
show parameter service

-- If you see you service listed there then do

exec DBMS_SERVICE.STOP_SERVICE('MYSERVICE','MYINSTANCE');

show parameter service

-- The service must have been disappeared now.

After you complete the following procedure then start the service again on the preferred instance.


References

http://forums.oracle.com/forums/thread.jspa?messageID=3626152
http://surachartopun.com/2009/03/srvctl-start-service-prkp-1030-failed.html

Thursday, April 15, 2010

11g new features Articles

A very nice collection of Articles for the 11g new features in ORACLE-BASE.

http://www.oracle-base.com/articles/11g/Articles11g.php

Tuesday, March 9, 2010

ORA-00005

Usually this error comes from Application side in the form :

ORA-00005:
Message 5 not found; product=RDBMS; facility=ORA
ORA-06512: at "SYS.DBMS_UTILITY", line 70
ORA-06512: at line 1

and its related with the timezones that are supported on the database.
Oracle has 2 Timezone files on the $ORACLE_HOME/oracore/zoneinfo/ directory.

timezone.dat -- small file
timezlrg.dat -- large file

From 9.2.0.5 onwards, $ORACLE_HOME/oracore/zoneinfo/timezlrg.dat is the default timezone file and will be used if ORA_TZFILE parameter is not set.

if you use on the application side a timezone that is not in the timezlrg.dat then this error is raised.

You can query this view by to see the timezones with:

select *
from v$timezone_names;

Thursday, February 11, 2010

Finding CRS Master Node

As Metalink Document 220970.1(RAC: Frequently Asked Questions) states:
"For the cluster synchronization service (CSS), the master can be found by searching $ORACLE_HOME/log/cssd/ocssd.log" Where ORACLE_HOME is the CRS Home.

e.g
Use olsnodes -n to find out the node numbers:

nodea 1
nodeb 2
nodec 3

Then on the directory mentioned check for each node the latest entry :
e.g from nodec (3)
grep MASTER ocsdd.log
ocssd.log:[ CSSD]2010-02-05 03:26:03.320 [3861] >TRACE: clssgmEstablishMasterNode: MASTER for 63 is node(2) birth(58)

This tell as that for incarnation 63 the master node is 2

But is this the last incarnation ??? You must also check the other nodes to see.

For more accurate information (always combined with the previous method) check the crsd.log

cd $ORACLE_HOME/log/crsd/crsd.log

2010-02-05 02:32:32.139: [ OCRMAS][3348]th_master:13: I AM THE NEW OCR MASTER at incar 67. Node Number 3

This means that we had another incarnation and the master node is 2.

Confusing a little beat. This is happening because the logs rotate or are write in a matter of events happening between the nodes.

So for my opinion please check the crsd.log from each node and the master is the one in the latest incarnation.

e.g

From node1 crsd.log :
2010-02-02 13:05:44.424: [ OCRMAS][3348]th_master:13: I AM THE NEW OCR MASTER at incar 62. Node Number 1

From node2 crsd.log :
2010-01-18 09:58:19.932: [ OCRMAS][3352]th_master:13: I AM THE NEW OCR MASTER at incar 53. Node Number 2

From node3 crsd.log :
2010-02-05 02:32:32.139: [ OCRMAS][3348]th_master:13: I AM THE NEW OCR MASTER at incar 67. Node Number 3

So the master node is 3 !!!

You can also check with :

ocrconfig -showbackup

where the latest backup is, that's the master node. But remember that by default ocrbackups are taken every 4 hours.

Friday, February 5, 2010

Collecting CRS diagnostic files 1OgR2

In order to collect all crs all at once ou can run the following command :

As root :

diagcollection.pl --collect --crs --crshome [$ORA_CRS_HOME]

e.g

diagcollection.pl --collect --crs --crshome /u01/app/oracle/product/crs

Tuesday, January 19, 2010

ORA-27061: waiting for async I/Os failed on AIX

When you see this error and the database crashes , hangs or does not shutdown/startup then there is a problem with Async I/O servers on AIX.

Syptomps

The following messages in the alert log
-------------------------------------------------------------------
.............
Mon Oct 8 22:57:05 2007
Errors in file /dwrac/BDUMP/dwhp_p009_1941690.trc:
ORA-27061: waiting for async I/Os failed <<<<
IBM AIX RISC System/6000 Error: 22: Invalid argument

Users affected:
* systems running aix 5.3 with the bos.rte.aio fileset at the 5.3.0.60 or 5.3.0.61 level.
* this problem is known to affect oracle.

Problem description:
When the maximum requests (maxreqs) is exceeded, lio_listio will update the control blocks, but may not return an error to indicate that some i/os were not started. this results in an unexpected einval from aio_nwait later when these i/os are waited for.

* errors similar to the following may be logged by oracle:
* ora-27061: waiting for async i/os failed
* ibm aix risc system/6000 error: 22: invalid argument



There are 3 actions to try :

1) Increase the number of aios and the maxreqs

Check aio the current settings:
aioo -a
or
lsattr –El aio0

Set maxreqs at least 8192.

chdev -P -l aio0 -a maxreqs=8192
or
aioo -p -o maxreqs=8192 (AIX 5.3 and up)

If this does not resolve the problem increase the number of min/max aio servers

chdev -P -l aio0 -a maxservers=[new value] -a minservers=[new value]

To make changes persistance on reboot

aioo -p -o maxservers=[new value] -o minservers=[new value]

To apply new values online (AIX 5.3 and up)

2) Decrease the parallel_min_servers and parallel_max_servers on the failing instance.

FOR RAC

ALTER SYSTEM SET parallel_min_servers = [NUM] SCOPE=BOTH SID='MYINSTANCE1';
ALTER SYSTEM SET parallel_max_servers = [NUM] SCOPE=BOTH SID='MYINSTANCE1';

FOR SINGLE DB
ALTER SYSTEM SET parallel_min_servers = [NUM] SCOPE=BOTH;
ALTER SYSTEM SET parallel_max_servers = [NUM] SCOPE=BOTH;

3) IBM FIX for AIX 5.3 ML6 and UP
Install fix for apar iz03260



REFERENCES
Metalink Note 464768.1
http://www-01.ibm.com/support/docview.wss?uid=isg1IZ03260

Friday, January 15, 2010

Oracle 10g REDO LOGS sizing

The size of the redo log files can influence performance, because the behavior of
the dbw and arc processes depend on their sizes.
Larger redo log files provide better performance but instance recovery time is increased. Sou you must find a good balance depending the type of your system and your instance recovery time needed.

Also be careful that undersized log files increase checkpoint activity and increase CPU usage.Checkpoint frequency is affected by several factors, including log file size and
the setting of the FAST_START_MTTR_TARGET.

If the FAST_START_MTTR_TARGET parameter is set to limit the instance recovery time,
Oracle automatically tries to checkpoint as frequently as necessary.
Under this condition, the size of the log files should be large enough to avoid
additional checkpointing due to under sized log files.

In Oracle 10g the redo logfile size can tuned as per recommendation specified by column optimal_logfile_size of v$instance_recovery or in the EM Dbconsole under Administration Page under Storage on the Redo Log Groups Link on the Actions List.

In order to enable this feature you must set the parameter "fast_start_mttr_target" in order to activate the advisory and the population of the column optimal_logfile_size.

The key columns in v$instance_recovery view are :

ACTUAL_REDO_BLKS
The current actual number of redo blocks required for recovery.

TARGET_MTTR
Effective MTTR (mean time to recover) target value in seconds. The TARGET_MTTR value is calculated based on the value of the FAST_START_MTTR_TARGET parameter and is usually an approximation of the parameter's value. However, if the FAST_START_MTTR_TARGET parameter value is very small (for example, one second), or very large (for example, 3600 seconds), the calculation will produce a target value dictated by system limitations. In such cases, the TARGET_MTTR value will be the shortest calculated time, or the longest calculated time that recovery is expected to take.

ESTIMATED_MTTR
The current estimated mean time to recover (MTTR) based on the number of dirty buffers and log blocks (0 if FAST_START_MTTR_TARGET is not specified). Basically, this value tells you how long you could expect recovery to take based on the work your system is doing right now.

OPTIMAL_LOGFILE_SIZE
The recommended optimal redolog file size for the current setting of FAST_START_MTTR_TARGET.

select ACTUAL_REDO_BLKS,TARGET_MTTR,ESTIMATED_MTTR,OPTIMAL_LOGFILE_SIZE
from v$instance_recovery;

The target is to reduce ACTUAL_REDO_BLKS by setting an appropriate value of FAST_START_MTTR_TARGET and finding the OPTIMAL_LOGFILE_SIZE for the work your system does.