Thursday, December 20, 2007

Resolve hanging mview refresh job (UNIX)

The above steps must be followed in order to resolve a hanging mview refresh job.

select * from dba_jobs_runing in order to find the job number(JOB_ID) and the sid (SID)
of the hanging job.

exec dbms_job.broken(JOB_ID,TRUE); in order to broke the job

select a.spid, b.sid, b.username
from v$process a, v$session b
where a.addr = b.paddr and sid=SID;
in order to find the process id (SPID)

On the O/S kill the process: kill -9 SPID

And after that run the job .

dbms_job.run(JOB_ID);

Same process is working on windows too , except the concept how to kill the process on windows.

Wednesday, December 19, 2007

Friday, December 14, 2007

Drop temporary tablespace hang

I issued a drop tablespace temp and it hang. If you are in the same situation check the above:

select username,session_num,session_addr from v$sort_usage
where tablespace='[Name of the temp tablespace you want to drop]';

if this query return rows use the session_num to find the sid from the v$session.

select sid,serial#,status from v$session where serial#=[session_num];

Alternatively use the session_addr.

select sid,serial#,status from v$session where saddr='[session_addr]';

and then kill the session at will.

alter system kill session '[sid],[serial#]' immediate;

If in RAC then check each instance !!!

Thursday, December 13, 2007

Monitoring Redo Latches

The following query caclulates 2 ratios for redo copy latch & redo allocation latch.

If either of the 2 ratios is above 1% then we have latch contention.

SELECT substr(ln.name, 1, 20) "Latch"
, round(misses/gets,2) "misses/gets ratio",
case when immediate_gets=0 then 0 else
round(immediate_misses/immediate_gets,2)
end "immediate misses/gets ratio"
FROM v$latch l, v$latchname ln
WHERE ln.name in ('redo allocation', 'redo copy')
and ln.latch# = l.latch#;

Tuesday, December 11, 2007

Materialized Views & Refresh Groups

mviews & refresh groups queries

The follwoing query (9i and onwards) shows which mview is refreshed the current time.

select currmvowner, currmvname
from v$mvrefresh;


begin
DBMS_MVIEW.EXPLAIN_MVIEW('MVIEW_NAME');
end;
/

Analyzes mview and by querying the MV_CAPABILITIES_TABLE you can see the details of the analysis e.g if mview is fast refreshable.

Refresh Groups

All refresh groups with its mviews

select r.rowner, r.rname, r.job, c.name,
r.next_date next_refresh, r.broken
from all_refresh r, all_refresh_children c
where r.job = c.job;

select o.owner, o.object_name mview, username, s.sid
from v$lock l, dba_objects o, v$session s
where o.object_id=l.id1 and
l.type='JI' and
l.lmode=6 and
s.sid=l.sid and
o.object_type='TABLE';

If u use refresh groups the above query shows the locks on the base tables of mview so you can see if the refresh group is running.

Monday, December 3, 2007

DAY to DAY

Last days , i am concerning on monitoring my databases.

I am checking for space problems , performance problems and etc.

Wednesday, November 28, 2007

Online Redefinition - SOLVED

Applied EM Grid Control 10.2.0.4 patch set and issue fixed.

Because the 10.2.0.4 patch set is not available on all platforms if the EM Grid Control is 10.2.0.1
then apply 10.2.0.2 patch set and install the patch 4438550.

I don't know if this issue happens on 10.2.03 version. If problems exists then apply 10.2.0.4 if available or contact oracle support.

ORA-00059: maximum number of DB_FILES exceeded

Today when i was trying to create a datafile i got this error :

ORA-00059: maximum number of DB_FILES exceeded

The parameter DB_FILES by default has the value of 200 datafiles. My RAC database was created with the maxdatafiles parameter 1024 so i had to modified the db_files parameter.

You need to have the same value in all instances of rac database and you muste restart the database after this change.

alter system set db_files=[VALUE] scope=spfile sid='*';

To set the parameter to the specified value.

Tuesday, November 27, 2007

Oracle 10g New Features

Currently , i am reviewing Oracle 10g new features. I will list some of my favorites.

1) Cross Platform Transportable Tablespaces

2) Automatic Shared Memory Management

3) AWR , ADDM , SQL_TUNNING ADVISORS

4) Flashback database

There are others too , but these are my favorites.

Monday, November 26, 2007

Online Redefinition

I did some tests. It seems that is a problem of the Enterprise Manager Grid Control.

When i tried to move a table online to an other tablespace and the table had primary key the job finished successfully. When the table had no primary key and i used the rowid method , i had the same error.

Thursday, November 22, 2007

Online Redefinition

I just tried to do online redefinition on one of my tablespaces objects but failed on the first object with the following error:
ORA-00904: "M_ROW$": INVALID IDENTIFIER

According to

Oracle® Database Administrator's Guide
10g Release 2 (10.2)
Part Number B14231-02

If you used rowids for the redefinition and your COMPATIBLE initialization
parameter is set to 10.1 or lower, set to UNUSED the hidden column M_ROW$$
that is now in the redefined table.
ALTER TABLE table_name SET UNUSED (M_ROW$$);
If COMPATIBLE is 10.2 or higher, this hidden column is automatically set to
UNUSED for you when redefinition completes.

But my problem is that i was reorganizing an entire tablespace through enterprise manager
and got this error on output and aborted the job.

I am investigating this issue with oracle support.


TIP : Grid Control Stop All processes

In a Oracle Grid Control Server 10gR2 in order to stop at once all managed processes u can :

cd on $OMS_HOME/opmn/bin and issue opmnctl stopall .

To start the managed processes from the same path issue : opmnctl startall

Wednesday, November 21, 2007

10.2.0.3 Bundle Patch 13 on Windows

I just patched a 10.2.0.3 ORACLE_HOME on Windows 2003 R2 64bit using bundle patch 13.

The patch needs the new opatch in order to be applied.

Also REMEMBER to stop the Distributed Transaction Coordinator service because it uses oci.dll and opatch will fail.

Tuesday, November 20, 2007

RAC 10gR2 Install on IBM Power and NETAPP

The following installation was performed on IBM Blade System (PPC64) on RHEL 4 AS U4. Having clustware files and datafiles on NFS(NETAPP FILERS) without using ASM.

RAC INSTALL ON 2 NODES (node1,node2)

ON both nodes
export DISPLAY=192.168.167.115:0
xconsole (for TEST)
groupadd -g 500 dba
useradd -u 200 -g dba -G dba oracle
passwd oracle (oracle)
pgrep ssh
which scp if not in /usr/local/bin create link : cd /usr/local/bin ; ln -s /usr/bin/scp scp
which ssh if not in /usr/local/bin create link : cd /usr/local/bin ; ln -s /usr/bin/ssh ssh
login as oracle
mkdir .ssh
chmod 700 .ssh
ssh-keygen -t rsa
ssh-keygen -t dsa
On previous steps do not use passphrase otherwise Management Agent 10gR2 will show cluster down in grid control.
on node1:
cd .ssh
ssh node1 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh node1 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
ssh node2 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh node2 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
scp authorized_keys node2:/home/oracle/.ssh/
on node2:
cd .ssh
ssh node1 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh node1 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
ssh node2 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh node2 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
scp authorized_keys node1:/home/oracle/.ssh/

on both nodes
chmod 600 ~/.ssh/authorized_keys
exec /usr/bin/ssh-agent $SHELL
/usr/bin/ssh-add
vi /home/oracle/.ssh/config ... add
Host *
ForwardX11 no
vi .bashrc ... add
if [ -t 0 ]; then
stty intr ^C
fi
vi .bash_profile ... add
umask 022
grep MemTotal /proc/meminfo
grep SwapTotal /proc/meminfo
as root : /sbin/service nscd start
as oracle :/DB/orasoft/crs/Disk1/cluvfy/runcluvfy.sh comp nodecon -n node1,node2 -verbose
IGNORE : ERROR:Could not find a suitable set of interfaces for VIPs. SEE METALINK DOCUMENTS : 338924.1,316583.1
MUST RUN VIPCA manually !!!!!!!!!!!!
cat /etc/issue
Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
uname -r
2.6.9-42.0.10.EL
Check packages !!!

Remember in some packages both 32bit & 64bit required !!!!

ON NETAPP NFS CONFIGURATION GPFS NOT REQUIRED !!!!

Configure Oracle Parameters

On RedHat systems, the default ulimits for individual users are set in /etc/security/limits.conf. As a root user, add the following entries using root users:
# Oracle specific settings
oracle soft nofile 4096
oracle hard nofile 65536
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft memlock 3145728
oracle hard memlock 3145728

Add the following parameters for the shared memory and semaphores to the “/etc/sysctl.conf “ file using root user:
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 141
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 1048576
net.core.wmem_default = 262144
net.core.rmem_max = 1048576
net.core.wmem_max = 262144

2. Add or edit the following line in the /etc/pam.d/login file, if it does not
already exist:
session required pam_limits.so

For the Bourne, Bash, or Korn shell, add the following lines to the
/etc/profile
if [ $USER = "oracle" ]; then
if [ $SHELL = "/bin/ksh" ]; then
ulimit -p 16384
ulimit -n 65536
else
ulimit -u 16384 -n 65536
fi
fi

as root on each node :

mkdir /orabin
chown -R oracle:dba /orabin
chmod -R 755 /orabin
mkdir /oraInventory
chown -R oracle:dba /oraInventory
chmod -R 755 /oraInventory
cd /orabin
mkdir crs
chown -R oracle:dba crs
chmod -R 755 crs
mkdir database
chown -R oracle:dba database
chmod -R 755 database

cd /DB

[root@node1 DB]# mkdir oradata
[root@node1 DB]# mkdir oralogs
[root@node1 DB]# mkdir ora10g

chown -R oracle:dba /DB/oradata
chmod -R 755 /DB/oradata
chown -R oracle:dba /DB/oralogs
chmod -R 755 /DB/oralogs
chmod -R 755 /DB/ora10g
chown -R oracle:dba /DB/ora10g
chmod -R 755 /DB/ora10g

ORACLE_BASE=/orabin (local)
ORA_DB_HOME=/orabin/database
ORA_CRS_HOME=/orabin/crs
Modify .bash_profile on each node

Check for hangcheck timer , required on 10g RAC
lsmod | grep hang
if not then
run : /sbin/insmod /lib/modules/2.6.9-42.0.10.EL/kernel/drivers/char/hangcheck-timer.ko hangcheck_tick=30 hangcheck_margin=180
and add the line in /etc/rc.d/rc.local

REMEMBER this is a kernel module , and if kernel is updated you must re-insert the correct module.

TIP : ON EACH SERVER ON /etc/hosts
put 127.0.0.1 localhost as first entry !!!!!
e.g
127.0.0.1 localhost.localdomain localhost
XXX.XXX.XXX.XXX node1 -- Public IP
XXX.XXX.XXX.XXX node1-i -- Cluster Interconnect IP
XXX.XXX.XXX.XXX node1-v -- Virtual IP
XXX.XXX.XXX.XXX node2 -- Public IP
XXX.XXX.XXX.XXX node2-i -- Cluster Interconnect IP
XXX.XXX.XXX.XXX node2-v -- Virtual IP

vi /etc/hosts.equiv
node1 oracle
node2 oracle
node1-i oracle
node2-i oracle

Install Oracle Clustware

CRITICAL : Check if LINUX FIREWALL IS UP !!!!!!!

on both nodes as root /etc/rc.d/init.d/iptables status if enabled then stop

Set ORACLE_HOME to ORA_CRS_HOME

runcluvfy.sh stage -pre crsinst -n node1,node2 -osdba dba -orainv dba
runcluvfy.sh comp nodecon -n node1,node2
runcluvfy.sh stage -post hwos -n node1,node2
runcluvfy.sh comp sys -n node1,node2 -p crs -osdba dba -orainv dba -verbose
WARNING:
Package cvuqdisk not installed.
node1,node2

Ignore NOT NEEDED ON NETAPP
Ignore Vip error and glibc error if glibc greater exists !!!

as oracle

./runinstaller

/orabin/oraInventory for invenotry ... specific in each node !!!
OraCrs10g_home1
/orabin/oracrs

Ignore warnning if u use the same interface (eth0)

Use OCR and voting are files on shared location (NFS) using normal reduduncy !!!

Ignore error Remote attach home and command on :
....../runInstaller -attachHome -noClusterEnabled ORACLE_HOME=/orabin/crs ORACLE_HOME_NAME=OraCrs10g CLUSTER_NODES=node1,node2 CRS=true "INVENTORY_LOCATION=/orabin/oraInventory" LOCAL_NODE=node2
beacuse home has allready attached !!!

as root@node1: run orainstRoot.sh then as root@node2
as root@node1 run root.sh
u should see :
CSS is active on these nodes.
node1
CSS is inactive on these nodes.
node2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
as root@node2 run root.sh
u should see this:
CSS is active on these nodes.
node1
node2
CSS is active on all nodes.
Oracle CRS stack installed and running under init(1M)
Ignore vipca error anu run it from failling node !!!

[oracle@node1 Disk1]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.node1.gsd application ONLINE ONLINE node1
ora.node1.ons application ONLINE ONLINE node1
ora.node1.vip application ONLINE ONLINE node1
ora.node2.gsd application ONLINE ONLINE node2
ora.node2.ons application ONLINE ONLINE node2
ora.node2.vip application ONLINE ONLINE node2

[oracle@node2 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.node1.gsd application ONLINE ONLINE node1
ora.node1.ons application ONLINE ONLINE node1
ora.node1.vip application ONLINE ONLINE node1
ora.node2.gsd application ONLINE ONLINE node2
ora.node2.ons application ONLINE ONLINE node2
ora.node2.vip application ONLINE ONLINE node2

Install Oracle 10gR2 RAC Software
Set ORACLE_HOME= ORA_DB_HOME on both nodes

as oracle install database software only ... cluster install ... on linking error apply patch 4767801 (p4767801_10201_IBMPower.zip)
on node running the installer
unzip and
cp 4767801/stubs64/* $ORACLE_HOME/lib/stubs
cp 4767801/stubs32/* $ORACLE_HOME/lib32/stubs

retry linking ... OK

ignorer remote attach home error

run root sh on both nodes starting from node1

Create database ....

run dbca as oracle

if entrprise manager configuration fails
then
stop dbconsole on each node
on blade4
cd /orabin/database/sysman/admin/emdrep/bin/
./RepManager node1 1521 rac1 -sys_password oracle -action drop
emca -config dbcontrol db -repos create -cluster


http://node1-v:1158/em

(OR FOLLOW metalink doc.id=395162.1)

To ensure TAF from clients 2 methods :
1) Provide failover method on client tnsnames.oracle
racsrv =
(DESCRIPTION=
(ADDRESS=(PROTOCOL=TCP)(HOST=node1-v)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=node2-v)(PORT=1521))
(LOAD_BALANCE=ON)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = racsrv)
(FAILOVER_MODE=
(TYPE=SELECT)
(METHOD=BASIC)
(RETRIES=180)
(DELAY=5)
)
)
)
and connect racsrv service !!!

2) Using
execute dbms_service.modify_service(service_name => 'racsrv' , aq_ha_notifications => true, failover_method => dbms_service.failover_method_basic , failover_type => dbms_service.failover_type_select, failover_retries => 180 , failover_delay => 5 , clb_goal => dbms_service.clb_goal_long);

and tnsnames.ora service
racsrv =
(DESCRIPTION= (LOAD_BALANCE=ON(FAILOVER=ON)
(ADDRESS=(PROTOCOL=TCP)(HOST=node1-v)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=node2-v)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = racsrv)
)
)

SQL TO CHECK FAILOVER
from client

sqlplus system@racsrv

select instance_name,host_name,null as failover_type,null as failover_method
,null as failed_over
from v$instance
union
select null,null,failover_type,failover_method,failed_over
from v$session where username='SYSTEM'
/

and shutdown abort instance which connected !!! and re run !!!



Hello,

My name is Agis Stamatopoulos and i am working as Oracle DBA on Telecomunications industry.
In this blog i will post information of my day to day experience with Oracle Technology.