Monday, August 24, 2009

RAC FAQ

Oracle Real Application Clusters (RAC) FAQ - previously called Oracle Parallel Server (OPS):

What is RAC and how is it different from non RAC databases?

RAC stands for Real Application Clusters. It allows multiple nodes in a clustered system to mount and open a single database that resides on shared disk storage. Should a single system (node) fail, the database service will still be available on the remaining nodes.

A non-RAC database is only available on a single system. If that system fails, the database service will be down (single point of failure).

Creating Real Application Clusters Data Dictionary Views with CATCLUST.SQL

Because you manually created your Real Application Clusters database, you must run the CATCLUST.SQL script to create Real Application Clusters-related views and tables. You must have SYSDBA privileges to run this script.
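
For example, from SQL*Plus on one of the instances (this mirrors the script location used later in this FAQ):

SQL> CONNECT / AS SYSDBA
SQL> START ?/rdbms/admin/catclust.sql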

Can any application be deployed on RAC?

Most applications can be deployed on RAC without any modifications and still scale linearly (well, almost).

However, applications with 'hot' rows (the same row being accessed by processes on different nodes) will not work well. This is because data blocks will constantly be moved from one Oracle Instance to another. In such cases the application needs to be partitioned based on function or data to eliminate contention.

Do you need special hardware to run RAC?

RAC requires the following hardware components:

  • A dedicated network interconnect - might be as simple as a fast network connection between nodes; and
  • A shared disk subsystem.

Example systems that can be used with RAC:

  • Windows Clusters
  • Linux Clusters
  • Unix Clusters like SUN PDB (Parallel DB).
  • IBM z/OS in SYSPLEX
  • HP Serviceguard extension for RAC

How many OCR and voting disks should one have?

For redundancy, one should have at least two OCR disks and three voting disks (raw disk partitions). These disk partitions should be spread across different physical disks.
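
To see what is currently configured, one can use the clusterware utilities (10gR2 syntax; the output obviously depends on the installation):

$ ocrcheck                     # shows the OCR location(s), size and integrity
$ crsctl query css votedisk    # lists the configured voting disks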

How does one convert a single instance database to RAC?

Oracle 10gR2 introduces a utility called rconfig (located in $ORACLE_HOME/bin) that will convert a single instance database to a RAC database.

$ cp $ORACLE_HOME/assistants/rconfig/sampleXMLs/ConvertToRAC.xml racconv.xml
$ vi racconv.xml       # edit the sample XML to describe your environment
$ rconfig racconv.xml

One can also use DBCA or Enterprise Manager to convert a database to RAC mode.

For prior releases, follow these steps:

  • Shut Down your Database:
SQL> CONNECT SYS AS SYSDBA
SQL> SHUTDOWN NORMAL
  • Enable RAC - On Unix this is done by relinking the Oracle software.
  • Make the software available on all computer systems that will run RAC. This can be done by copying the software to all systems or to a shared clustered file system.
  • Each instance requires its own set of Redo Log Files (called a thread). Create additional log files:
SQL> CONNECT SYS AS SYSDBA
SQL> STARTUP EXCLUSIVE
 
SQL> ALTER DATABASE ADD LOGFILE THREAD 2
  2    GROUP 4 ('RAW_FILE1') SIZE 500K,
  3    GROUP 5 ('RAW_FILE2') SIZE 500K,
  4    GROUP 6 ('RAW_FILE3') SIZE 500K;
 
SQL> ALTER DATABASE ENABLE PUBLIC THREAD 2;
  • Each instance requires its own set of undo segments (or its own undo tablespace). Create an undo tablespace for each new instance (see the example after these steps) and set the following parameters in that instance's parameter file:
UNDO_MANAGEMENT = auto
UNDO_TABLESPACE = undots2
  • Edit the SPFILE/INIT.ORA files and number the instances 1, 2,...:
CLUSTER_DATABASE = TRUE           (PARALLEL_SERVER = TRUE prior to Oracle9i).  
INSTANCE_NUMBER = 1
THREAD = 1
UNDO_TABLESPACE = undots1          (or ROLLBACK_SEGMENTS if you use UNDO_MANAGEMENT=manual)
# Include %T for the thread in the LOG_ARCHIVE_FORMAT string.
# Set LM_PROCS to the number of nodes * PROCESSES
# etc....
  • Create the dictionary views needed for RAC by running catclust.sql (previously called catparr.sql):
SQL> START ?/rdbms/admin/catclust.sql
  • On all the computer systems, startup the instances:
SQL> CONNECT / as SYSDBA
SQL> STARTUP;
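
As an illustration of the undo step above, here is a minimal sketch of creating an undo tablespace for a second instance (the datafile name and size are placeholders):

SQL> CREATE UNDO TABLESPACE undots2
  2    DATAFILE '/u02/oradata/racdb/undots2_01.dbf' SIZE 500M;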

How does one stop and start RAC instances?

There is no difference between the way you start a normal database and a RAC database, except that a RAC database needs to be started from multiple nodes. The CLUSTER_DATABASE=TRUE (PARALLEL_SERVER=TRUE prior to Oracle 9i) parameter needs to be set before a database can be started in cluster mode.

In Oracle 10g one can use the srvctl utility to start instances and listeners across the cluster from a single node. Here are some examples:

$ srvctl status database -d RACDB            # show the status of all RACDB instances
$ srvctl start database -d RACDB             # start all instances
$ srvctl start instance -d RACDB -i RACDB1   # start a single instance
$ srvctl start instance -d RACDB -i RACDB2
$ srvctl stop database -d RACDB              # stop all instances
$ srvctl start asm -n node2                  # start ASM on node2

Before Oracle 8.0, use the following command sequence from each node (using the old server manager):

SVRMGR> connect INTERNAL
SVRMGR> set retries 5
SVRMGR> startup parallel retry
(or: SVRMGR> startup shared)

You can also use the SET INSTANCE instanceN command to switch between instances (if defined in TNSNAMES.ORA).

Can I test if a database is running in RAC mode?

Use the DBMS_UTILITY package to determine if a database is running in RAC mode or not. Example (run from SQL*Plus with SET SERVEROUTPUT ON to see the output):

BEGIN
  IF dbms_utility.is_cluster_database THEN
      dbms_output.put_line('Running in SHARED/RAC mode.');
  ELSE
      dbms_output.put_line('Running in EXCLUSIVE mode.');
  END IF;
END;
/

For Oracle 8i and prior releases:

BEGIN
  IF dbms_utility.is_parallel_server THEN
      dbms_output.put_line('Running in SHARED/PARALLEL mode.');
  ELSE
      dbms_output.put_line('Running in EXCLUSIVE mode.');
  END IF;
END;
/

Another method is to look at the database parameters. For example, from SQL*Plus:

SQL> show parameter CLUSTER_DATABASE

If the value of CLUSTER_DATABASE is FALSE, the database is not running in RAC mode.

How can I keep track of active instances?

You can keep track of active RAC instances by executing one of the following queries:

SELECT * FROM SYS.V_$ACTIVE_INSTANCES;
SELECT * FROM SYS.V_$THREAD;

To list the active instances from PL/SQL, use DBMS_UTILITY.ACTIVE_INSTANCES().
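
ACTIVE_INSTANCES is a procedure with OUT parameters. A small PL/SQL sketch to print the list (run with SET SERVEROUTPUT ON):

SET SERVEROUTPUT ON
DECLARE
  v_instances dbms_utility.instance_table;  -- records of (inst_number, inst_name)
  v_count     NUMBER;
  i           PLS_INTEGER;
BEGIN
  dbms_utility.active_instances(v_instances, v_count);
  dbms_output.put_line('Active instances: ' || v_count);
  i := v_instances.FIRST;
  WHILE i IS NOT NULL LOOP
    dbms_output.put_line(v_instances(i).inst_number || ' - ' ||
                         v_instances(i).inst_name);
    i := v_instances.NEXT(i);
  END LOOP;
END;
/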

How does one setup SQL*Net for RAC?

See Transparent Application Failover.

Can one see how connections are distributed across the nodes?

Select from gv$session. Some examples:

SELECT inst_id, count(*) "DB Sessions" FROM gv$session
 WHERE type = 'USER' GROUP BY inst_id;

With login time (hour):

SELECT inst_id, TO_CHAR(logon_time, 'DD-MON-YYYY HH24') "Hour when connected", count(*) "DB Sessions"
  FROM gv$session
 WHERE type = 'USER'
 GROUP BY inst_id, TO_CHAR(logon_time, 'DD-MON-YYYY HH24')
 ORDER BY inst_id, TO_CHAR(logon_time, 'DD-MON-YYYY HH24');

What is pinging and why is it so bad?

Starting with Oracle 9i, RAC can transfer blocks from one instance to another across the interconnect (cache fusion). This method is much faster than the old "pinging" method, where one instance had to write the block to disk before another instance could read it.
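
To get a rough idea of how much cache fusion traffic is crossing the interconnect, one can look at the global cache statistics (the statistic names below are the 10g ones; 9i uses the older "global cache ..." names):

SELECT inst_id, name, value
  FROM gv$sysstat
 WHERE name IN ('gc cr blocks received', 'gc current blocks received')
 ORDER BY inst_id, name;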

Oracle 8i and below:

Pinging is the process whereby one instance requests another to write a set of blocks from its SGA to disk so that the requesting instance can obtain them in exclusive mode. This method of moving data blocks from one instance's SGA to another is extremely slow. The challenge of tuning RAC/OPS is to minimize pinging activity.

Real Application Clusters

From Oracle FAQ


Oracle Real Application Clusters (RAC) is a database clustering solution that allows more than one instance to mount and open an Oracle database. RAC can only be used on special clustered systems with shared disk storage and a private network interconnect.

A normal Oracle installation consists of a single Oracle instance that accesses a database on the same computer system. In contrast, RAC allows multiple instances on different computer systems (nodes in a cluster) to access the same database files simultaneously. Communication between instances is managed by the Distributed Lock Manager (DLM). To address the possibility of two or more instances attempting to modify the same information simultaneously, Oracle uses up to ten additional background processes, named LCK0 through LCK9, to lock the resources in use by these instances.
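
To list the lock- and cache-related background processes that are actually running on the current instance, a query along these lines can be used (a sketch; the LIKE filter is just a convenience):

SELECT name, description
  FROM v$bgprocess
 WHERE paddr <> '00'
   AND name LIKE 'L%'
 ORDER BY name;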

RAC is available with Oracle Enterprise Edition, and under certain conditions, with Standard Edition as well. These restrictions (for Standard Edition) include:

  • Must use Oracle Clusterware (no third-party clusterware allowed);
  • Must use ASM to store database files; and
  • Can only use a max of 4 CPU sockets in the cluster (either 2 nodes with 2 CPUs each, or 4 nodes with 1 CPU each).

History

RAC was previously called Oracle Parallel Server (OPS). Oracle re-branded OPS as RAC when they released Oracle 9i.

Benefits

Running Oracle in RAC mode can benefit you in the following ways:

  • High availability - If some nodes fail, the remaining nodes will still be available for processing requests. Failover support is available from Oracle 8 with Oracle's Transparent Application Failover (TAF) technology and, from Oracle 10g, with server-side load balancing as well.
  • Speed-up (reduced response time) - RAC normally adds some overhead; however, some applications, mainly DSS workloads, can achieve faster response times by running operations in parallel across instances.
  • Scale-up (increased transaction volume) - RAC can be used to provide increased application scalability (mainly used for OLTP applications).

RAC Storage Options

The database's data and control files are shared between the instances. However, each instance must have its own UNDO and REDO:

  • UNDO: Each instance must have its own UNDO_TABLESPACE
  • REDO: Each instance must have its own REDO LOG files (called a thread)
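
A quick way to confirm this on a running cluster is to look at the redo threads and the undo setting of each instance (a sketch using the GV$ views discussed below):

SELECT thread#, group#, bytes/1024/1024 AS mb
  FROM v$log
 ORDER BY thread#, group#;

SELECT inst_id, value AS undo_tablespace
  FROM gv$parameter
 WHERE name = 'undo_tablespace';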

Shared files can be stored on:

  • RAW volumes
  • ASM
  • Clustered file systems (like OCFS)

Oracle Clusterware

Oracle Clusterware (previously called Cluster Ready Services or CRS) provides Cluster Management Services and High Availability Services to Oracle and 3rd party software that wants to hook into it. For example, if you kill your LGWR process, CRS will detect the failure and automatically restart the database.

Oracle Clusterware eliminates the need for 3rd party clusterware software like Sun Cluster, IBM HACMP and HP Serviceguard. Oracle Clusterware is provided at no additional cost with the 10g database.

Oracle clusterware needs to be installed on all nodes of the cluster before installing the database software. During installation, you will be prompted to configure a virtual IP, voting disk and cluster registry.

Virtual IP

An IP address that will fail over to another node in the cluster when a failure is detected. This allows connected sessions to fail over to a surviving node (almost) immediately when a problem is experienced.

RAC requires one or more private interconnects, and two or more public network interfaces.

Voting Disk

A Voting Disk is a shared disk device or file used to determine node availability (establishes quorum) and resolve split-brain scenarios. All instances write to the voting disk (check in) to indicate that they are still active. This is required as instances may not always be able to communicate across the network with each other.

The voting disk must be stored in a raw partition or regular file in a clustered filesystem (like OCFS, but not on ASM!).

The voting disk, like the OCR, should be multiplexed and backed up to protect against media failures.
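
On 10gR2 the configured voting disks can be listed and backed up roughly as follows (the device and backup file names are placeholders; dd is the backup method documented for this release):

$ crsctl query css votedisk
$ dd if=/dev/raw/raw3 of=/u01/backup/votedisk.bak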

Cluster Registry

The Cluster Registry (OCR) is used to store cluster wide settings and status information. For example: node names, IP and VIP addresses, voting disk locations, node applications, database names, instance names, listener names, etc.

The Oracle Cluster Registry (OCR) is a binary file that is maintained by the CRS daemon. The OCR must be stored in a raw partition or regular file in a clustered filesystem (not on ASM!).

To see what is in the OCR, use the "ocrdump" command to dump its contents to a flat file.
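
For example (10g utilities; the dump file name is just an example):

$ ocrcheck                  # verify OCR integrity and show its location(s)
$ ocrdump /tmp/ocr.dmp      # dump the OCR contents to a readable text file
$ ocrconfig -showbackup     # list the automatic OCR backups taken by CRS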

Clusterware processes

When Oracle Clusterware is started, the following processes will be running:

  • crsd – Cluster Ready Services Daemon
  • cssd – Cluster Synchronization Services Daemon
  • evmd – Event Manager Daemon

The corresponding executables are located in $ORA_CRS_HOME/bin/.
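
A quick health check of these daemons (10g syntax):

$ crsctl check crs                  # checks the CSS, CRS and EVM daemons
$ ps -ef | egrep 'crsd|cssd|evmd'   # see the daemon processes themselves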

Stop and start

Oracle Clusterware is started at boot with the /etc/rc.d/init.d/init.crs script.

Commands to manually start or stop:

/etc/init.d/init.crs start 
/etc/init.d/init.crs stop 

Commands to manually disable or enable start at boot time:

/etc/init.d/init.crs disable 
/etc/init.d/init.crs enable 
 

Check status of components registered in the OCR:

crs_stat

Status in tabular format:

crs_stat -t

Starting and stopping resources

Stops all RAC instances:

$ srvctl stop database -d myracdb

Stops Listener, VIP, GSD, ONS:

$ srvctl stop nodeapps -n racnode1
 

Starts ASM on racnode1 and all required dependencies:

$ srvctl start asm -n racnode1

Starts one instance and all required dependencies:

$ srvctl start instance -d myracdb -i mydb1

For more info, read article srvctl.

RAC parameters

Some of the INIT.ORA/SPFILE parameters required for RAC:

  • CLUSTER_DATABASE=TRUE -- start the database in RAC mode
  • INSTANCE_NUMBER=n -- a unique instance number
  • THREAD=n -- each instance must have its own thread of redo logs
  • UNDO_TABLESPACE=... -- each instance must have its own undo tablespace
  • Etc.
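
In an SPFILE the per-instance parameters are normally prefixed with the instance (SID) name, for example (instance, database and tablespace names are illustrative):

*.cluster_database=true
RACDB1.instance_number=1
RACDB2.instance_number=2
RACDB1.thread=1
RACDB2.thread=2
RACDB1.undo_tablespace='UNDOTBS1'
RACDB2.undo_tablespace='UNDOTBS2'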

GV$ views

The V$ views only show details from the currently connected instance. Use the GV$ views to query values across all instances (note the INST_ID column).

For example: instead of using V$THREAD, query GV$THREAD to see all threads across all nodes.

Configure archiving

Since each instance has its own thread of redo logs, each instance needs to archive its own redo logs. For convenience, it is recommended that each instance archives to a shared cluster file system.
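
A minimal sketch of archive settings that keep the thread number in the archived log file names (the directory is a placeholder; %t is the thread, %s the sequence and %r the resetlogs id, all required from 10g):

SQL> ALTER SYSTEM SET log_archive_dest_1='LOCATION=/u03/oraarch/racdb' SCOPE=spfile SID='*';
SQL> ALTER SYSTEM SET log_archive_format='%t_%s_%r.arc' SCOPE=spfile SID='*';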

Configure Services

One can use DBCA or srvctl to configure a service with "primary" instances and "available" instances. Client connections will be routed to the "primary" instances. When one of the "primary" instances becomes unavailable, one of the "available" instances will be promoted to "primary".

NOTE: while fail-over to an "available" instance happens automatically, you need to manually relocate the service back to the "primary" instance when it becomes available again.
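
For example, such a service can be created and managed with srvctl (10g syntax; the database, service and instance names are placeholders):

$ srvctl add service -d RACDB -s OLTP -r RACDB1 -a RACDB2        # RACDB1 preferred, RACDB2 available
$ srvctl start service -d RACDB -s OLTP
$ srvctl status service -d RACDB -s OLTP
$ srvctl relocate service -d RACDB -s OLTP -i RACDB2 -t RACDB1   # move the service back manually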
