TTS in Combination with RMAN Backups on Data Guard

At a customer’s site we recently upgraded a database from 10.2.0.5 to 11.2.0.3 using Transportable Tablespaces (TTS). This worked flawlessly, but we ran into an issue when taking backups at the Data Guard location of this database.

We followed the normal procedure for taking backups on the Data Guard standby and using them for the primary:

      • Use a RMAN catalog
      • Register the Primary database
        RMAN> REGISTER DATABASE;
      • Configure the DB Unique Names
        RMAN> CONFIGURE DB_UNIQUE_NAME DB CONNECT IDENTIFIER 'DB_PRIM';
        RMAN> CONFIGURE DB_UNIQUE_NAME DB_DG CONNECT IDENTIFIER 'DB_DG';
        
        RMAN> LIST DB_UNIQUE_NAME OF DATABASE;
        
        List of Databases
        DB Key  DB Name  DB ID            Database Role    Db_unique_name
        ------- ------- ----------------- ---------------  ------------------
        1       DB       336860753        PRIMARY          DB
        1       DB       336860753        STANDBY          DB_DG

At this point, we can take backups on DB_DG and make them available to DB by changing the unique name in the catalog:

RMAN> change backup for db_unique_name DB_DG reset db_unique_name to DB;

However, for this one database we could not do anything inside RMAN. Every command failed during the implicit resync of the recovery catalog:

RMAN> show all; 
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of show command at 04/26/2013 13:58:11
RMAN-03014: implicit resync of recovery catalog failed
RMAN-03009: failure of partial resync command on default channel at 04/26/2013 13:58:11
RMAN-20999: internal error

RMAN> backup database; 
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 04/26/2013 14:05:11
RMAN-03014: implicit resync of recovery catalog failed
RMAN-03009: failure of partial resync command on default channel at 04/26/2013 14:05:11
RMAN-20999: internal error

This turns out to be Bug 13000553 (Metalink Id. 13000553.8), which occurs when you:

  • take backups on the Data Guard standby
  • use TTS
  • add a datafile to the tablespace you transported

At the time of writing there is no fix for this. The only workaround is to take the RMAN backups on the primary database.

Converting a windows 2003R2 VM from VMWare vCenter to Oracle VM 3.x

With more and more VMware customers choosing Oracle VM as the virtualisation platform for running Oracle software, the need arose at a client to convert some of the Windows VMs on vCenter to OVM. Surprisingly, I didn’t find a guide in the OVM 3 manual. I knew OVM 2.2 had a chapter about V2V, but only P2V is covered in the OVM 3 manual.

This is the procedure I successfully followed:

Pre-export

  • Apply the fix for kb31408 if you are using SCSI devices in your VM. (This is also documented in Metalink note 754071.1.)
  • Uninstall VMware Tools.
  • Stop the VM. (Downtime starts here.)
  • Make sure there are no snapshots on the VM.

Export

  • Select the VM in vCenter and, in the menu bar, click File > Export > OVF.
  • When asked whether you want a single OVA file, answer yes.
  • Place the exported OVA file on the HTTP server you use to import into OVM Manager. (I use httpd on the OVM Manager host.)

Import

  • Follow the usual steps to import an OVA template into OVM.
    • Import as assembly
    • Create VM Template
    • Create VM
    • Start VM (TIP: open the console before you boot, so you can follow the boot sequence). If you receive a blue screen, chances are that you didn’t apply the fix for kb31408 correctly.

Post-Import

  • Reconfigure your network, because the MAC address of the network card in the VM has changed.
  • Reactivate Windows (because your hardware has changed).
  • Install the paravirtual drivers (version 3.0.1) and reboot. (Downtime ends here.)

After about 45 minutes we had a running VM on OVM identical to the one we had on vCenter. Most of that time was spent in the vCenter export.

ASMLIB – Slow running of oracleasm scandisks

Recently I had a client where oracleasm scandisks took over 10 minutes to run. This meant a reboot took 12 minutes instead of 2, and every HA test that involved disks incurred the same 10-minute wait.

The Problem

This client had an enormous number of logical volumes per server, and this was the only difference from other systems where I had successfully used ASMLib. So I checked how these were defined at the OS level.

It turned out the devices were created in /dev/mapper and were visible in /proc/partitions, but not all of them had a device node in /dev. In the listings below, dm-0 to dm-12 and dm-35 to dm-46 appear in /proc/partitions but are missing from /dev.

(root) # cat /proc/partitions|grep dm-
253 0 2097152 dm-0
253 1 4194304 dm-1
253 2 524288 dm-2
253 3 4194304 dm-3
253 4 524288 dm-4
253 5 524288 dm-5
253 6 6291456 dm-6
253 7 4194304 dm-7
253 8 524288 dm-8
253 9 16777216 dm-9
253 10 4194304 dm-10
253 11 2097152 dm-11
253 12 16777216 dm-12
253 13 17846400 dm-13
253 14 267696000 dm-14
253 15 267696000 dm-15
253 16 267696000 dm-16
253 17 267696000 dm-17
253 18 2973120 dm-18
253 19 107078400 dm-19
253 20 107078400 dm-20
253 21 267696000 dm-21
253 22 62462400 dm-22
253 23 2973120 dm-23
253 24 267691031 dm-24
253 25 107073161 dm-25
253 26 267691031 dm-26
253 27 62462336 dm-27
253 28 267691031 dm-28
253 29 107073161 dm-29
253 30 2971993 dm-30
253 31 17840118 dm-31
253 32 2971961 dm-32
253 33 267691031 dm-33
253 34 267691031 dm-34
253 35 524288 dm-35
253 36 10485760 dm-36
253 37 4194304 dm-37
253 38 52428800 dm-38
253 39 15728640 dm-39
253 40 2097152 dm-40
253 41 18874368 dm-41
253 42 524288 dm-42
253 43 1048576000 dm-43
253 44 10485760 dm-44
253 45 4194304 dm-45
253 46 15728640 dm-46
253 47 17846400 dm-47
253 48 267696000 dm-48
253 49 267696000 dm-49
253 50 267696000 dm-50
253 51 267696000 dm-51
253 52 2973120 dm-52
253 53 107078400 dm-53
253 54 107078400 dm-54
253 55 267696000 dm-55
253 56 62462400 dm-56
253 57 2973120 dm-57
253 58 2971961 dm-58
253 59 107073161 dm-59
253 60 107073161 dm-60
253 61 62462336 dm-61
253 62 267691031 dm-62
253 63 267691031 dm-63
253 64 17840118 dm-64
253 65 267691031 dm-65
253 66 2971993 dm-66
253 67 267691031 dm-67
253 68 267691031 dm-68
253 69 2935808 dm-69
253 70 2931831 dm-70

(root) # ls -l /dev/dm-*
brw-rw---- 1 root root 253, 13 Mar 3 14:17 /dev/dm-13
brw-rw---- 1 root root 253, 14 Mar 3 14:17 /dev/dm-14
brw-rw---- 1 root root 253, 15 Mar 3 14:17 /dev/dm-15
brw-rw---- 1 root root 253, 16 Mar 3 14:17 /dev/dm-16
brw-rw---- 1 root root 253, 17 Mar 3 14:17 /dev/dm-17
brw-rw---- 1 root root 253, 18 Mar 3 14:17 /dev/dm-18
brw-rw---- 1 root root 253, 19 Mar 3 14:17 /dev/dm-19
brw-rw---- 1 root root 253, 20 Mar 3 14:17 /dev/dm-20
brw-rw---- 1 root root 253, 21 Mar 3 14:17 /dev/dm-21
brw-rw---- 1 root root 253, 22 Mar 3 14:17 /dev/dm-22
brw-rw---- 1 root root 253, 23 Mar 3 14:17 /dev/dm-23
brw-rw---- 1 root root 253, 24 Mar 3 14:17 /dev/dm-24
brw-rw---- 1 root root 253, 25 Mar 3 14:17 /dev/dm-25
brw-rw---- 1 root root 253, 26 Mar 3 14:17 /dev/dm-26
brw-rw---- 1 root root 253, 27 Mar 3 14:17 /dev/dm-27
brw-rw---- 1 root root 253, 28 Mar 3 14:17 /dev/dm-28
brw-rw---- 1 root root 253, 29 Mar 3 14:17 /dev/dm-29
brw-rw---- 1 root root 253, 30 Mar 3 14:17 /dev/dm-30
brw-rw---- 1 root root 253, 31 Mar 3 14:17 /dev/dm-31
brw-rw---- 1 root root 253, 32 Mar 3 14:17 /dev/dm-32
brw-rw---- 1 root root 253, 33 Mar 3 14:17 /dev/dm-33
brw-rw---- 1 root root 253, 34 Mar 3 14:17 /dev/dm-34
brw-rw---- 1 root root 253, 47 Mar 3 14:50 /dev/dm-47
brw-rw---- 1 root root 253, 48 Mar 3 14:50 /dev/dm-48
brw-rw---- 1 root root 253, 49 Mar 3 14:50 /dev/dm-49
brw-rw---- 1 root root 253, 50 Mar 3 14:50 /dev/dm-50
brw-rw---- 1 root root 253, 51 Mar 3 14:50 /dev/dm-51
brw-rw---- 1 root root 253, 52 Mar 3 14:50 /dev/dm-52
brw-rw---- 1 root root 253, 53 Mar 3 14:50 /dev/dm-53
brw-rw---- 1 root root 253, 54 Mar 3 14:50 /dev/dm-54
brw-rw---- 1 root root 253, 55 Mar 3 14:50 /dev/dm-55
brw-rw---- 1 root root 253, 56 Mar 3 14:50 /dev/dm-56
brw-rw---- 1 root root 253, 57 Mar 3 14:50 /dev/dm-57
brw-rw---- 1 root root 253, 58 Mar 3 14:50 /dev/dm-58
brw-rw---- 1 root root 253, 59 Mar 3 14:50 /dev/dm-59
brw-rw---- 1 root root 253, 60 Mar 3 14:50 /dev/dm-60
brw-rw---- 1 root root 253, 61 Mar 3 14:50 /dev/dm-61
brw-rw---- 1 root root 253, 62 Mar 3 14:50 /dev/dm-62
brw-rw---- 1 root root 253, 63 Mar 3 14:50 /dev/dm-63
brw-rw---- 1 root root 253, 64 Mar 3 14:50 /dev/dm-64
brw-rw---- 1 root root 253, 65 Mar 3 14:50 /dev/dm-65
brw-rw---- 1 root root 253, 66 Mar 3 14:50 /dev/dm-66
brw-rw---- 1 root root 253, 67 Mar 3 14:50 /dev/dm-67
brw-rw---- 1 root root 253, 68 Mar 3 14:50 /dev/dm-68
brw-rw---- 1 root root 253, 69 Mar 3 14:58 /dev/dm-69
brw-rw---- 1 root root 253, 70 Mar 3 14:58 /dev/dm-70

I knew oracleasm uses /proc/partitions as its master list of devices to check, so I suspected that a timeout occurred each time it tried to access one of the non-existent devices. This turned out to be correct.
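The comparison between the two listings above can be automated. The following is a minimal sketch, not part of the original troubleshooting: the function name missing_dm_nodes is hypothetical, and it assumes the four-column /proc/partitions layout shown earlier, with the device name in the last column.

```shell
#!/bin/sh
# List device-mapper entries that appear in a partitions table but have
# no matching node under a device directory.
# Usage: missing_dm_nodes [partitions_file] [dev_dir]
# (hypothetical helper; defaults point at the live system)
missing_dm_nodes() {
    partitions_file="${1:-/proc/partitions}"
    dev_dir="${2:-/dev}"
    # The fourth column of /proc/partitions is the kernel device name.
    awk '$4 ~ /^dm-[0-9]+$/ { print $4 }' "$partitions_file" |
    while read -r name; do
        # -e is true for any existing path; print only the names without a node.
        [ -e "$dev_dir/$name" ] || echo "$name"
    done
}
```

Calling missing_dm_nodes without arguments checks the running system. On the server described here it would be expected to print the dm devices that have no node in /dev.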

The Solution

Oracle Linux 5 and Red Hat 5 do not create the device nodes for LVM2 devices by default. It took me some time to find this, but the udev rules contain an explicit ignore rule.

(root) # cat /etc/udev/rules.d/90-dm.rules
KERNEL=="dm-[0-9]*", ACTION=="add", OPTIONS+="ignore_device"

When we disable this by commenting it out and retrigger the udev rules, our devices get created.

(root) # cat /etc/udev/rules.d/90-dm.rules
#KERNEL=="dm-[0-9]*", ACTION=="add", OPTIONS+="ignore_device"

(root) # udevtrigger 

(root) # ls -ltr /dev/dm-*
brw-rw---- 1 root root 253, 15 Mar 3 14:17 /dev/dm-15
brw-rw---- 1 root root 253, 14 Mar 3 14:17 /dev/dm-14
brw-rw---- 1 root root 253, 19 Mar 3 14:17 /dev/dm-19
brw-rw---- 1 root root 253, 20 Mar 3 14:17 /dev/dm-20
brw-rw---- 1 root root 253, 16 Mar 3 14:17 /dev/dm-16
brw-rw---- 1 root root 253, 21 Mar 3 14:17 /dev/dm-21
brw-rw---- 1 root root 253, 17 Mar 3 14:17 /dev/dm-17
brw-rw---- 1 root root 253, 22 Mar 3 14:17 /dev/dm-22
brw-rw---- 1 root root 253, 23 Mar 3 14:17 /dev/dm-23
brw-rw---- 1 root root 253, 25 Mar 3 14:17 /dev/dm-25
brw-rw---- 1 root root 253, 24 Mar 3 14:17 /dev/dm-24
brw-rw---- 1 root root 253, 30 Mar 3 14:17 /dev/dm-30
brw-rw---- 1 root root 253, 27 Mar 3 14:17 /dev/dm-27
brw-rw---- 1 root root 253, 26 Mar 3 14:17 /dev/dm-26
brw-rw---- 1 root root 253, 31 Mar 3 14:17 /dev/dm-31
brw-rw---- 1 root root 253, 33 Mar 3 14:17 /dev/dm-33
brw-rw---- 1 root root 253, 32 Mar 3 14:17 /dev/dm-32
brw-rw---- 1 root root 253, 34 Mar 3 14:17 /dev/dm-34
brw-rw---- 1 root root 253, 18 Mar 3 14:17 /dev/dm-18
brw-rw---- 1 root root 253, 28 Mar 3 14:17 /dev/dm-28
brw-rw---- 1 root root 253, 29 Mar 3 14:17 /dev/dm-29
brw-rw---- 1 root root 253, 13 Mar 3 14:17 /dev/dm-13
brw-rw---- 1 root root 253, 53 Mar 3 14:50 /dev/dm-53
brw-rw---- 1 root root 253, 47 Mar 3 14:50 /dev/dm-47
brw-rw---- 1 root root 253, 55 Mar 3 14:50 /dev/dm-55
brw-rw---- 1 root root 253, 49 Mar 3 14:50 /dev/dm-49
brw-rw---- 1 root root 253, 52 Mar 3 14:50 /dev/dm-52
brw-rw---- 1 root root 253, 54 Mar 3 14:50 /dev/dm-54
brw-rw---- 1 root root 253, 51 Mar 3 14:50 /dev/dm-51
brw-rw---- 1 root root 253, 56 Mar 3 14:50 /dev/dm-56
brw-rw---- 1 root root 253, 48 Mar 3 14:50 /dev/dm-48
brw-rw---- 1 root root 253, 57 Mar 3 14:50 /dev/dm-57
brw-rw---- 1 root root 253, 50 Mar 3 14:50 /dev/dm-50
brw-rw---- 1 root root 253, 64 Mar 3 14:50 /dev/dm-64
brw-rw---- 1 root root 253, 61 Mar 3 14:50 /dev/dm-61
brw-rw---- 1 root root 253, 66 Mar 3 14:50 /dev/dm-66
brw-rw---- 1 root root 253, 60 Mar 3 14:50 /dev/dm-60
brw-rw---- 1 root root 253, 58 Mar 3 14:50 /dev/dm-58
brw-rw---- 1 root root 253, 59 Mar 3 14:50 /dev/dm-59
brw-rw---- 1 root root 253, 68 Mar 3 14:50 /dev/dm-68
brw-rw---- 1 root root 253, 65 Mar 3 14:50 /dev/dm-65
brw-rw---- 1 root root 253, 63 Mar 3 14:50 /dev/dm-63
brw-rw---- 1 root root 253, 67 Mar 3 14:50 /dev/dm-67
brw-rw---- 1 root root 253, 62 Mar 3 14:50 /dev/dm-62
brw-rw---- 1 root root 253, 69 Mar 3 14:58 /dev/dm-69
brw-rw---- 1 root root 253, 70 Mar 3 14:58 /dev/dm-70
brw-r----- 1 root disk 253, 46 Mar 4 11:34 /dev/dm-46
brw-r----- 1 root disk 253, 45 Mar 4 11:34 /dev/dm-45
brw-r----- 1 root disk 253, 42 Mar 4 11:34 /dev/dm-42
brw-r----- 1 root disk 253, 40 Mar 4 11:34 /dev/dm-40
brw-r----- 1 root disk 253, 36 Mar 4 11:34 /dev/dm-36
brw-r----- 1 root disk 253, 44 Mar 4 11:34 /dev/dm-44
brw-r----- 1 root disk 253, 39 Mar 4 11:34 /dev/dm-39
brw-r----- 1 root disk 253, 41 Mar 4 11:34 /dev/dm-41
brw-r----- 1 root disk 253, 43 Mar 4 11:34 /dev/dm-43
brw-r----- 1 root disk 253, 37 Mar 4 11:34 /dev/dm-37
brw-r----- 1 root disk 253, 12 Mar 4 11:34 /dev/dm-12
brw-r----- 1 root disk 253, 38 Mar 4 11:34 /dev/dm-38
brw-r----- 1 root disk 253, 35 Mar 4 11:34 /dev/dm-35
brw-r----- 1 root disk 253, 4 Mar 4 11:34 /dev/dm-4
brw-r----- 1 root disk 253, 10 Mar 4 11:34 /dev/dm-10
brw-r----- 1 root disk 253, 7 Mar 4 11:34 /dev/dm-7
brw-r----- 1 root disk 253, 3 Mar 4 11:34 /dev/dm-3
brw-r----- 1 root disk 253, 6 Mar 4 11:34 /dev/dm-6
brw-r----- 1 root disk 253, 1 Mar 4 11:34 /dev/dm-1
brw-r----- 1 root disk 253, 11 Mar 4 11:34 /dev/dm-11
brw-r----- 1 root disk 253, 0 Mar 4 11:34 /dev/dm-0
brw-r----- 1 root disk 253, 9 Mar 4 11:34 /dev/dm-9
brw-r----- 1 root disk 253, 5 Mar 4 11:34 /dev/dm-5
brw-r----- 1 root disk 253, 8 Mar 4 11:34 /dev/dm-8
brw-r----- 1 root disk 253, 2 Mar 4 11:34 /dev/dm-2

After these actions, oracleasm scandisks now runs in a few seconds.

(root) # time oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...

real 0m0.480s
user 0m0.134s
sys 0m0.259s
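If the same change has to be rolled out to several servers, the manual edit of the rules file can be scripted. This is only a sketch: the function name disable_dm_ignore_rule is hypothetical, and it assumes udev only reads files ending in .rules, so the backup copy it leaves next to the original is ignored.

```shell
#!/bin/sh
# Comment out any active udev rule line that sets ignore_device for dm-* kernels.
# Usage: disable_dm_ignore_rule rules_file   (hypothetical helper)
disable_dm_ignore_rule() {
    rules_file="$1"
    # Keep a backup copy so the change can be reverted.
    # Assumption: udev skips files that do not end in .rules.
    cp -p "$rules_file" "$rules_file.bak" || return 1
    # Prefix matching lines that are not yet commented out with '#'.
    sed '/^[[:space:]]*KERNEL=="dm-.*ignore_device/ s/^/#/' "$rules_file.bak" > "$rules_file"
}
```

On the system from this post the call would be disable_dm_ignore_rule /etc/udev/rules.d/90-dm.rules, followed by udevtrigger as shown earlier. Running the function a second time leaves the already commented line unchanged.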

 

 

Oracle VM Disaster Recovery

A lot of my clients ask me about Disaster Recovery in an OVM setup. I hope this new Oracle event gives us some more insight. You can register here. I’ll certainly check it out. The white paper the event is based on can be found here.

Consistency between the primary and DR site is not handled in this paper. According to Oracle this is a task for the application (e.g. Data Guard) or the storage layer (e.g. EMC RecoverPoint). The white paper covers all the tasks needed to make sure VMs can be seen and started on the DR site. I had hoped more would be possible given the tight integration of UEK and OVM. But of course, is that really needed when consistency solutions already exist at the storage and application level?

I believe VMware has some solutions for this. Anyone care to elaborate on those?

 

 

Configuring Kerberos for Oracle Databases 11.2 with win2008R2 AD

In this blog entry I try to document how to enable Kerberos. This procedure was actually created and followed during a project at one of my customers.

The Infrastructure

AD
- windows 2008R2 server
- domain : milkyway.space.com
- Kerberos installed and enabled
- DES encryption default disabled

Server :
- moon.milkyway.space.com
- database : crater
- version : 11.2.0.3.4

Client
- windows 7 enterprise edition
- 11.2.0.3 client

The Procedure

  • On the AD server
    • Create a service account in Active Directory that the database server moon uses to validate the Kerberos tickets. This user does not need any specific rights, but enable “password never expires”. We called this account “ssoval”.
    • Make sure that you deselect the setup option “Use DES Encryption” and select the option “Do not require Kerberos PreAuthentication” for this user.
    • Make sure that the SPN is set to the correct realm
      setspn -A oracle/moon.milkyway.space.com@MILKYWAY.SPACE.COM ssoval
      (“oracle” is just the name of the service. We reuse this name in the Kerberos configuration to point here. It has no connection to the service_names of the database.)
    • Extract a keytab file for this user so we don’t need to enter a password to create tickets
      ktpass -princ oracle/moon.milkyway.space.com@MILKYWAY.SPACE.COM -crypto all -pass ssoval -mapuser ssoval -out v5srvtab
    • Put this file on the database server. I’ve put it in /etc/v5srvtab.
  • On the Database Server “moon”
    • Make sure the Advanced Security Option is installed. This is a separately licensed option on top of Enterprise Edition.
    • Generate a Kerberos ticket. It is used to connect to the Kerberos server for ticket validation.
      $ORACLE_HOME/jdk/bin/kinit -k -t /etc/v5srvtab oracle/moon.milkyway.space.com
      (You might want to create a cron job for this so that you always have a valid ticket.)
    • Adjust the sqlnet.ora
      SQLNET.KERBEROS5_CONF=/etc/krb5.conf
      SQLNET.KERBEROS5_KEYTAB=/etc/v5srvtab
      SQLNET.KERBEROS5_CONF_MIT=TRUE
      SQLNET.AUTHENTICATION_KERBEROS5_SERVICE=oracle
      SQLNET.AUTHENTICATION_SERVICES=(BEQ,KERBEROS5)
    • Create the /etc/krb5.conf file
      [libdefaults]
      default_realm = MILKYWAY.SPACE.COM
      [realms]
      MILKYWAY.SPACE.COM = {
      kdc = DC1.MILKYWAY.SPACE.COM:88
      kdc = DC2.MILKYWAY.SPACE.COM:88
      }
      [domain_realm]
      .milkyway.space.com = MILKYWAY.SPACE.COM
      milkyway.space.com = MILKYWAY.SPACE.COM
  • On the Database “crater”
    • Clear OS_AUTHENT_PREFIX
      SQL> alter system set OS_AUTHENT_PREFIX='' scope=spfile;
    • Disable remote_os_authent (a static parameter, so it also needs scope=spfile)
      SQL> alter system set remote_os_authent=false scope=spfile;
    • Restart the database
  • On the Windows Clients
    • Make sure ASO is installed.
    • Adjust the sqlnet.ora
      SQLNET.AUTHENTICATION_SERVICES= (BEQ,KERBEROS5)
      SQLNET.KERBEROS5_CONF =c:\kerberos\krb5.conf
      SQLNET.KERBEROS5_CONF_MIT = true
      SQLNET.KERBEROS5_CC_NAME=OSMSFT://
      This last line is important for Windows clients because it reuses the tickets that already exist on the system as a result of your AD login, which is what enables the SSO login. Keep in mind that the Oracle tool okinit will fail with an OSD error if this cache is set when you try to request tickets manually.
    • Create the c:\kerberos\krb5.conf file, identical to the one on the server except for the port numbers
      [libdefaults]
      default_realm = MILKYWAY.SPACE.COM
      [realms]
      MILKYWAY.SPACE.COM = {
      kdc = DC1.MILKYWAY.SPACE.COM
      kdc = DC2.MILKYWAY.SPACE.COM
      }
      [domain_realm]
      .milkyway.space.com = MILKYWAY.SPACE.COM
      milkyway.space.com = MILKYWAY.SPACE.COM
    • Make sure the services file in c:\windows\system32\drivers\etc lists “kerberos5” as the first alias for port 88
      kerberos 88/tcp kerberos5 krb5 kerberos-sec #Kerberos
      kerberos 88/udp kerberos5 krb5 kerberos-sec #Kerberos

Now you are ready to use Kerberos Authentication.
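Because the server and client krb5.conf files differ only in the KDC entries, they can be generated from one small helper instead of being maintained by hand. The sketch below is an illustration only: gen_krb5_conf is a hypothetical function name, and it emits exactly the three sections used in this post and nothing more.

```shell
#!/bin/sh
# Print a minimal krb5.conf for one realm.
# Usage: gen_krb5_conf REALM dns_domain kdc [kdc ...]   (hypothetical helper)
gen_krb5_conf() {
    realm="$1"
    domain="$2"
    shift 2
    printf '[libdefaults]\ndefault_realm = %s\n' "$realm"
    printf '[realms]\n%s = {\n' "$realm"
    # One kdc line per remaining argument; pass host:port or just host.
    for kdc in "$@"; do
        printf 'kdc = %s\n' "$kdc"
    done
    printf '}\n[domain_realm]\n'
    # Map both the domain itself and its subdomains to the realm.
    printf '.%s = %s\n%s = %s\n' "$domain" "$realm" "$domain" "$realm"
}
```

For the server file in this setup the call would be gen_krb5_conf MILKYWAY.SPACE.COM milkyway.space.com DC1.MILKYWAY.SPACE.COM:88 DC2.MILKYWAY.SPACE.COM:88, redirected to /etc/krb5.conf. For the client file, pass the KDC names without the port.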

Example for a user “Bjorn”

  • Create a user Bjorn on the AD server in domain MILKYWAY.SPACE.COM
    Make sure that you:

    • deselect Setup option “Use DES Encryption”
    • select option “Do not require Kerberos PreAuthentication”

    The username is case sensitive, so make sure you have the correct case.

  • Create a user Bjorn on the database crater
    SQL> create user BJORN IDENTIFIED EXTERNALLY as 'Bjorn@MILKYWAY.SPACE.COM';
    SQL> grant create session to BJORN;
  • Log in to the Windows desktop and connect to the database over TNS, for example:
    C:\> sqlplus /@crater
    CONNECTED

    SQL> show user
    USER is "BJORN"

    SQL> select sys_context('userenv','session_user') from dual;
    SYS_CONTEXT('USERENV','SESSION_USER')
    -----------------------------------------
    BJORN

    SQL> select sys_context('userenv','external_name') from dual;
    SYS_CONTEXT('USERENV','EXTERNAL_NAME')
    -----------------------------------------------------------------
    Bjorn@MILKYWAY.SPACE.COM

Troubleshooting

  • KDC has no support for encryption type : Oracle releases before 11gR2 only support DES encryption. The company where I performed this setup did not want to enable this legacy encryption type (and rightly so), so only connections from 11.2 or higher clients to 11.2 or higher databases work in this setup.
  • Cannot find KDC for requested realm : Make sure your services file is correctly formatted and kerberos5 is the first protocol in the list for port 88.

Special thanks to Antonio Mata Gomez from Oracle Belgium for his support in this project.

Best Practices for Oracle Linux for Production Systems

I felt this needed a blog post because most of the customers I meet just install their Red Hat/Oracle Linux environment and start using it out of the box in production. I believe this list should be included in every post-installation procedure.

  • Hostname : Make sure it’s an FQDN, especially when you connect to other systems over NFS. If your hostname is not an FQDN, locks will not be freed on the NFS server when you reboot.
  • Support : If you have support, make sure you register your system with ULN.
  • Update : Update your system with yum or up2date to the latest version.
  • Hugepages : If you are running Oracle Databases, this is a must. Metalink note 361468.1
  • Ipmitool : This allows control over the hardware from inside the OS. It can be very useful for cluster setups or automated scripts that collect information.
  • Kexec (kdump) : This allows the system to dump the kernel memory to disk whenever a kernel panic occurs. Instead of rebooting or hanging, the system boots into a separate kernel whose only task is to dump the memory to disk in the form of a vmcore file. This file can later be analysed with the crash utility. Don’t forget to test it!
  • Magic SysRq key : This enables keystrokes on the console that force the kernel to do all sorts of things (show locks, reboot without FS corruption, …). It is often used to dump a kernel stack trace to /var/log/messages and to reboot a system after soft hangs (hangs on the console with Num Lock flashing). It is enabled by default in OL5, but in OL6 you need to enable it manually. Also, make sure you know the keystrokes before you need them.
  • OSWatcher BB : Oracle’s monitoring tool. It can show you whether there were spikes just before, or leading up to, a crash. Metalink note 301137.1
  • vncserver : Provides an X11 environment over VNC. It is faster than X11 over the network and lets you continue where you left off when you lose your connection during an installation or configuration.
  • oratop : Utility for near real-time monitoring of databases, both RAC and single instance. Metalink note 1500864.1
  • dstat : Lets you view all of your system resources in real time.
  • Rlwrap : Saves you time ;)

If anyone is interested in how to perform some of these tasks, let me know and I’ll consider writing some blog entries about them. Most procedures can be found in the manuals or on the official pages of the tools. Keep in mind that this list also applies to Oracle Engineered Systems (ODA, Exadata, …).
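As a rough illustration of the Hugepages item in the list above, the number of huge pages can be estimated from the SGA size. This is a simplified sketch and not the method from the Metalink note: the function name hugepages_needed is hypothetical, the SGA size in MB is supplied by the caller, and other users of shared memory on the host are ignored.

```shell
#!/bin/sh
# Estimate vm.nr_hugepages for a given amount of shared memory.
# Usage: hugepages_needed sga_mb [hugepage_kb]   (hypothetical helper)
hugepages_needed() {
    sga_mb="$1"
    # Default to the huge page size the kernel reports in /proc/meminfo.
    hugepage_kb="${2:-$(awk '/^Hugepagesize:/ { print $2 }' /proc/meminfo)}"
    # Convert MB to kB and round up so the whole SGA fits.
    echo $(( (sga_mb * 1024 + hugepage_kb - 1) / hugepage_kb ))
}
```

For example, hugepages_needed 8192 2048 prints 4096, the number of 2 MB pages needed to hold an 8 GB SGA. Treat the result as a lower bound and validate it against the note before setting vm.nr_hugepages.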