Wednesday, August 6, 2008

Physical Standby Database Tips

I have started a new blog for standby database (dataguard stuff) as i understand that it is in itself a huge area. Please refer http://rohitguptastandbydbtips.blogspot.com/ for my experiences with managing standby databases

Keep posting your comments and suggestions to help me improve and learn

-Rohit

Thursday, July 17, 2008

Basics of ASM (Automatic Storage Management) - I

Here are some basics about using ASM storage. Installation and configuration is not in scope of this blog :-) The tips listed in this blog are helpful when you are working in an enviornment where ASM is already in use and you are completely novice with ASM

Using ASM

1. To use ASM storage you need to install an ASM instance.

2. ASM storage can be viewed and accessed using the command line utility called ASMCMD. To use ASMCMD, you need to first source the environment for ASM instance.

Following is a small demo of using ASMCMD:

oracle@ipw-dev-db:~> . oraenv
ORACLE_SID = [mydb] ? +ASM

oracle@ipw-dev-db:~> asmcmd

ASMCMD [+] > ls -l
State Type Rebal Unbal Name
MOUNTED EXTERN N N DATA/

ASMCMD [+] > cd +DATA

ASMCMD [+DATA] > ls -l
Type Redund Striped Time Sys Name
Y MYDB/
Y MYDB_1/

ASMCMD [+DATA] >

3. Commands that can be used in ASMCMD are:
mkdir, ls, rm, cd

4. ASM assigns its own names to the datafiles

5. ASM storage is entirely different from a normal filesystem storage.

Viewing ASM information

ASM information can be viewed by using dynamic performance views like v$asm_diskgroup. You need to login to the ASM instance to view this. Using ASMCMD or normal OS commands (like df –h), you can not directly know the vital information like size of the ASM diskgroup, free space etc. You have to query some relevant views to get this information. For example:

oracle@ipw-dev-db:~> sqlplus '/as sysdba'

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Jul 17 01:27:08 2008

Copyright (c) 1982, 2006, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning and Data Mining options

SQL> select free_mb, total_mb from v$asm_diskgroup;

FREE_MB TOTAL_MB
---------- ----------
181331 664697

SQL>

Moving datafiles in ASM storage

1. Shutdown and start up the database in MOUNT state. Datafile movement should be done only in MOUNT state.

2. Connect to RMAN and target database without any recovery catalog

RMAN> Connect target /

3. Assuming that I am moving a datafile of database MYDB from ‘+DATA/mydb_1/datafile’ to ‘+DATA/mydb/datafile’

RMAN> copy datafile '+DATA/MYDB_1/DATAFILE/users.488.658972917' to '+DATA';
RMAN> switch datafile '+DATA/MYDB_1/DATAFILE/users.488.658972917' to copy;

4. You can move system and sysaux datafiles also similarly because the database is mounted and not open.

5. Switch command is responsible for updating the controlfile with the new location

Happy learning !!!! More to come (ofcourse as soon as i learn something worth sharing :-))

- Rohit

Monday, June 2, 2008

Common Oracle installation errors/issues/problems on Linux - 1

Well!!!! This is not something i have been fortunate enough to find at a one single place in any book or umpteen tutorials over the internet. There are very good installation documents like
http://www.puschitz.com/InstallingOracle9i.shtml and Metalink note: 184821.1, to name a few, which provide excellent step by step installation procedures but they don't really address the common installation problems consolidated at one single location. Ofcourse, one reason is that some problems are not generic and faced during one particular installation only. Following is an attempt towards the same (with highest of regards to all available notes and tutorials).



Generic installation problems on Linux:


1. Installation can fail during linking phase with errors like "errors in invoking target Install_isqlplus of makefile /u01/app/oracle/product/9.2.0.7/sqlplus/lib/ins_sqlplus.mk"

Reason/Resolution: Linking problems are usually associated with incorrect version of gcc packages of your OS version.

For 9i installation, gcc version should be 3.2.3 and for 10g installation, it should be 3.4.6. You can check the version by the command "gcc -v". Usually, 3.4.6 is the default.

For activating correct gcc version for 9i installation on 32bit OS (i386):

$ mv /usr/bin/gcc /usr/bin/gcc.orig
$ mv /usr/bin/g++ /usr/bin/g++.orig
$ ln -s /usr/bin/i386-redhat-linux-gcc32 /usr/bin/gcc
$ ln -s /usr/bin/i386-redhat-linux-g++32 /usr/bin/g++

For activating correct gcc version for 9i installation on 64bit OS (x86_64):

$ mv /usr/bin/gcc /usr/bin/gcc.orig
$ mv /usr/bin/g++ /usr/bin/g++.orig
$ ln -s /usr/bin/x86_64-redhat-linux-gcc32 /usr/bin/gcc
$ ln -s /usr/bin/x86_64-redhat-linux-g++32 /usr/bin/g++

Refer to Metalink Note: 353529.1 and 169706.1 for installation pre-requisites


2. "There is no non-empty value for variable s_jservPort under section Ports in file /u01/app/oracle/product/9.2.0/Apache/ports.ini"

Reason/Resolution: This problem is usually encountered when you are making a second attempt for installing the software after a failed previous installation. This is an ignorable error. If you open the file : /u01/app/oracle/product/9.2.0/Apache/ports.ini , you will see that the "s_jservPort " might be defined above the "Ports" section . We need to just place this variable under "ports" section. In case you are not using IAS or Grid Control, you can safely ignore this error or do the settings manually as mentioned above. In any case there should be no operational impacts on the database.


3. Errors in writing few files like "error in writing to file /u01/app/oracle/product/9.2.0/Apache/Apache/conf/ssl.key/server.key"

Reason/Resolution: Again, This problem is usually encountered when you are making a second attempt for installing the software after a failed previous installation. The files mentioned in these errors are actually created during the previous attempt and can not be overwritten because they are created as read only while installation. So to proceed with the installation, you need to change the permissions on these files (using chmod) to make them writable. Even better solution is that before starting the installation again, remove the Oracle_Home completely which was created and populated during the previous attempt for installation, and create a fresh and empty directory for Oracle_Home


4. "Error occurred during initialization of VM
Unable to load native library: /tmp/OraInstall2003-10-25_03-14-57PM/jre/lib/i386/libjava.so: symbol __libc_wait, version GLIBC_2.0 not defined in file libc.so.6 with link time reference
"

Reason/Resolution: To resolve the __libc_wait symbol issue, download the p3006854_9204 patch p3006854_9204_LINUX.zip from http://metalink.oracle.com/. See bug 3006854 for more information. To apply the patch, run

su - root
# unzip p3006854_9204_LINUX.zip
Archive: p3006854_9204_LINUX.zip
creating: 3006854/
inflating: 3006854/rhel3_pre_install.sh
inflating: 3006854/README.txt
# cd 3006854
# sh rhel3_pre_install.sh
Applying patch...
Patch successfully applied
#


5. OUI Hangs at 18% - "Copying naeet.o"

Reason/Resolution: The reason is that environment variable LD_ASSUME_KERNEL has not been set. Check the metalink notes:

Note: 360142.1: When Running OUI, OUI Hangs at 18% Copying naeet.o
Note: 377217.1: What should the value of LD_ASSUME_KERNEL be set to for Linux?


Problems specific to 9i RAC installation:


1. On starting the ORACM service, you can get the error
ocmstart.sh: Error: Restart is too frequent
ocmstart.sh: Info: Check the system configuration and fix the problem.
ocmstart.sh: Info: After you fixed the problem, remove the timestamp file
ocmstart.sh: Info: "/u01/app/oracle/product/9.2.0.7/oracm/log/ocmstart.ts"

Reason/Resolution: To resolve this, remove the file $ORACLE_HOME/oracm/log/osmstart.ts and then you should be able to start the service.



2. During installation of CM patch set (like 9207 patchset or 9208 patchset), following error: "error in writing to file '/u01/app/oracle/product/9.2.0.7/oracm/bin/oracm (text file busy)"

Reason/Resolution: This error occurs if you are trying to install the CM patch set without stopping the ORACM service. ORACM services on both nodes should be stopped before installing the CM patch set.



3. After installing Cluster manager, ORACM service should be started on all nodes to proceed with the RDBMS installation. I have faced this situation personally that service does not starts on both nodes. For example, if it is a 2 nodes RAC, service could be started on one node only. Starting the service on one node kills the service on other node.

Reason/Resolution: The service has to be started as root and requires the LD_ASSUME_KERNEL to be set correctly. So i had set the LD_ASSUME_KERNEL properly as "oracle" user and when switching to "root" user to start the service, i was doing "su -" instead of "su". "su -" does not carries over the environment settings and hence value of LD_ASSUME_KERNEL was not carried over to the user "root"



4. When trying to apply the 9208 CM patch set, all nodes were not considered by the installation. Following error was found in the installation logs:

"Cluster nodes cannot be retrieved from the vendor clusterware (/tmp/OraInstall2008-03-20_12-12-02AM/oui/bin/lsnodes.bin: error while loading shared libraries: libcmdll.so: cannot open shared object file: No such file or directory). This system will not be considered as a vendor clusterware"

Also, "lsnodes" command, which can be used to verify all the nodes in the RAC was failing.

Reason/Resolution: Actually the correct order to be followed to install the 9208 RAC should be:

--> 9204 CM
--> 9204 RDBMS
--> 9208 CM Patchset
--> 9208 RDBMS patchset.

The reason for the above error and "lsnodes" failing is that $ORACLE_HOME/lib32 directory does not exist. The file libcmdll.so mentioned in the error is located inside the lib32 directory and lib32 is created only after the installation of RDBMS software and not the CM. So if you don't follow the correct order and try to install 9208 CM patchset after 9204 CM, you'll get this error and "lsnodes" will also not work to show all the nodes in the cluster (which you assume should be there after you have installed 9204 CM successfully). Instead after 9204, you should be installing 9204 RDBMS software. Then if you apply the 9208 patchset, this error won't be seen and "lsnodes" will also work.


5. Always check the inventory on all nodes to verify that correct versions of CM and RDBMS patchsets have been applied on all nodes. Inventory can be verified by launching OUI. The abnormality in versions is more prevalent in CM patchsets where if you apply 9208 CM patchset on one node, the CM version on other node is still 9204. However, this can be true for RDBMS patchsets also. So in such cases, you need to apply the patchset on other node also separately (ideally all the installation in RAC happens from a single node only and other nodes are updated automatically) to have the correct version. You can check the version of CM on each node by following command after starting the ORACM service:

$ head -1 $ORACLE_HOME/oracm/log/cm.log


This was the first set of real life installation problems i have faced. I have more to come soon. So stay tuned and keep checking this space for more updates.
"Suggestions are more than welcome" :-)


Cheers,
-Rohit

Tuesday, May 6, 2008

How to clone a database using a cold backup

There is a very good documentation available on cloning at http://www.samoratech.com/TopicOfInterest/swCloneDB.htm
With due respect to the above documentation, I am trying to be more descriptive in approach. These are purely based on my experience.

1. Decide carefully about the location and server for the new database's datafiles, redo files and control files. Make sure you create all the directories where you want to keep the datafiles, redofiles and control files(like /u02/oradata/$ORACLE_SID), alert log (/u01/app/oracle/admin/$ORACLE_SID/bdump), trace files(/u01/app/oracle/admin/$ORACLE_SID/udump), parameter file(/u01/app/oracle/admin/$ORACLE_SID/pfile)

2. Generate the script for creating the control files for the destination database. To do this, while logged in as SYS to the source database, do ALTER DATABASE BACKUP CONTROLFILE TO TRACE. This will generate a trace file at /u01/app/oracle/admin/source_SID/udump containing the CREATE CONTROLFILE script. Edit the trace file to keep only the relevant script for creating the new control file and save with a suitable name. The Control file script should contain RESETLOGS because we are going to create a brand new cloned database. For example, the first 2 lines of the contro file script should look like:
STARTUP NOMOUNT pfile=/u01/app/oracle/admin/$ORACLE_SID/pfile/$ORACLE_SID.ora
CREATE CONTROLFILE set DATABASE "DD707" resetlogs noarchivelog


3. To know the name and locations of all datafiles to be copied from source to destination

SQL>select file_name from dba_data_files

4. To know the name and locations of all redo files to be copied from source to destination
SQL>select member from v$logfile

5. Shutdwn the source database
SQL>SHUTDOWN IMMEDIATE

6. Copy all the datafiles and redo log files of the source database to the location holding the destination database's datafiles and redo log files. If it is on the same server, use "cp". If the destination is on a different server, use"scp". For example, change your directory to location of source database's datafiles $cd /u02/oradata/$ORACLE_SID and then do
"$ scp -p xyz.dbf oracle@:/u01/app/oracle/product/10.2.0/dbs/xyz.dbf"

7. Copy the CREATE CONTROLFILE script generated in Step#2 to a secure location (like /u01/app/oracle/admin/dest_SID/udump).

8. Copy the source database's init$ORACLE_SID.ora file as destination database's init$ORACLE_SID.ora at /u01/app/oracle/admin/$ORACLE_SID/pfile.

9. In the copied init.ora, make sure you edit all relevant parameters which are specific to the source database (like udump, bdump, DB_NAME, INSTANCE_NAME etc). These should be specific to destination database.

10. Create a soft link at $ORACLE_HOME/dbs for the parameter file. For example, change the directory to $ORACLE_HOME/dbs
$cd $ORACLE_HOME/dbs and then do "$ ln -s /u01/app/oracle/admin/$ORACLE_SID/pfile/init$ORACLE_SID.ora init$ORACLE_SID.ora"
where $ORACLE_SID is the destination SID.

Please note that if in Step#8, you copy the pfile directly to $ORACLE_HOME/dbs on the destination, you can skip the Step#10.

11. Now edit tnsnames, listener and oratab files on the destnation server to have entries for the new destination database. All 3 are usually kept at /var/opt/oracle.

12. Now set your environment for the new SID.
$ . oraenv

13. Log in to the idle instance (for det database) as SYS
SQL> sqlplus '/as sysdba'
You should see the message: logged into idle instance

14. Here run the command for creating the CONTROLFILE. Assuming that you copied the script in Step#5 to /u01/app/oracle/admin/$ORACLE_SID/udump and name of the script is ctrlnew.sql, do the foll:
SQL>@/u01/app/oracle/admin/$ORACLE_SID/udump/ctrlnew.sql
Remember that Control file can be created in the NOMOUNT state only. Your CREATE CONTROLFILE command should have the first command as STARTUP NOMOUNT pfile=
If successful, you would get a message "controlfile created"


15. Now open the database with resetlogs
SQL>ALTER DATABASE OPEN RESETLOGS
If this successfuly runs and says "database altered" your cloned destination database is ready to be used.


-Rohit

Thursday, May 1, 2008

How to change SPFILE parameters for a RAC database

Well, this is a simple task as it sounds. But believe me when i was supposed to do this for the first time in my career i had a hard time searching for exact solution. Here i am just sharing my experience in doing this task
There are definitely 2 known methods available to perform this task

METHOD 1:
Simply issue the following SQL statement from any of the nodes
ALTER SYSTEM SET = scope=spfile;
There are 3 possible values for the 'scope' clause in this statement:
1. MEMORY: The change is immediate but will not be available after next startup or reboot of the instance.
2. SPFILE: The change will be effective in SPFILE only and will be available after next startup or reboot
3. BOTH: The change is effective for both MEMORY and SPFILE and will be available after next startup also.
Default is BOTH
You could also specify another clause called 'sid' at the end of the above ALTER statement which is specifically meant for a RAC database. This is to specify the instance where you want to make that change. For example,
ALTER SYSTEM SET = scope=spfile sid='*';
means that this particular change in the parameter will happen on all instances after rebooting them (Default is '*')

METHOD 2:
Another method to change a parameter in the spfile is to export it to a pfile, change it and then create a new spfile. Let me detail out the various steps involved:
1. On one instance, create a pfile from the existing spfile: SQL> CREATE PFILE FROM SPFILE. This will create a pfile called initSID.ora at /$ORACLE_HOME/dbs
2. Edit the resulting PFILE initSID.ora in a vi editor (Add/alter the required parameter). You should use *. so that this parameter value is applied to all instances
3. Now shutdown all the instances
4. startup the instance (and hence the database) where you created and altered the pfile using this pfile only

STARTUP PFILE=$ORACLE_HOME/dbs/initSID.ora
Do not start other instances yet
5. Now through this instance only, create a new spfile (which can be at a common location being accesses by all instances)

CREATE SPFILE=‘commom_location/spfile.ora’ FROM PFILE=‘$ORACLE_HOME/dbs/initSID.ora’;
This will overwrite the existing spfile with the spfile which has the new/altered parameter
6. Now shutdown this instance again
7. Now startup normally all the 3 instances without PFILE or SPFILE option:

STARTUP
By default startup will now consider the new spfile
8. To confirm that the new parameters have been set/removed, issue following sql statement from all 3 instances
SHOW PARAMETER

The method would not be much different if this is a single instance database instead of a RAC database (In Step#7 you would be starting only that single instance)

-Rohit