Lost all voting disks on ASM in Oracle 11gR2

    • #2011
      Pankaj Sharma
      Participant

      Hi,

      This is regarding Oracle 11gR2 RAC. I have a 2-node RAC on RHEL5. My voting file is on an ASM diskgroup named “+DATA1”, which is created on the device “/dev/sdc1”. The details are below:

      [root@rac1 ~]# oracleasm querydisk -p ASMDISK2

      Disk "ASMDISK2" is a valid ASM disk

      /dev/sdc1: LABEL="ASMDISK2" TYPE="oracleasm"

      I just wanted to test what actually happens when all voting disks are lost, so I zeroed out the partition using the command below:

      [root@rac1 ~]# dd if=/dev/zero of=/dev/sdc1

      dd: writing to `/dev/sdc1': No space left on device

      10474318+0 records in

      10474317+0 records out

      5362850304 bytes (5.4 GB) copied, 161.053 seconds, 33.3 MB/s
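As a quick sanity check that the whole partition was overwritten, the byte total reported by dd should equal the number of complete records written multiplied by dd's default block size of 512 bytes (the command above did not pass `bs=`). A minimal sketch of that arithmetic, with the figures copied from the output above and variable names chosen only for illustration:

```shell
# Figures copied from the dd output above.
records_out=10474317   # complete records written ("10474317+0 records out")
block_size=512         # dd default block size when bs= is not given

# Total bytes written to /dev/sdc1.
bytes_written=$((records_out * block_size))
echo "$bytes_written"
```

The result agrees with the "5362850304 bytes" figure that dd printed.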

      My disktimeout was 200 seconds, but I waited approximately 8-10 minutes. Then I queried the voting disk, and it still shows as ONLINE on the DATA1 diskgroup.

      [root@rac1 ~]# crsctl query css votedisk

      ##  STATE    File Universal Id                File Name Disk group

       1. ONLINE   06d8300bb6af4fd3bfa2e1e5cda9c440 (ORCL:ASMDISK2) [DATA1]

      Located 1 voting disk(s).

      Also, below are some findings in this situation:

      1) ASMCA no longer shows the diskgroup DATA1; it appears to have been deleted.

      2) The cluster alert log does not display any message regarding the loss of the voting disk.

      3) cssd.log also does not display any message regarding the loss of the voting disk.

      4) The cluster is still up and running, and it is working properly.

      As per the documentation, Oracle 11gR2 provides “rebootless node fencing” when all voting disks fail. But it should at least display some message in the log files. And how is “crsctl query css votedisk” still reporting the voting disk as ONLINE?

      I am not able to understand this situation. Could you please explain the reason?

      Thanks in advance.

    • #2167
      Saurabh Sood
      Member

      Hi Pankaj,

      Can you provide me with the output of the following command:

      $ kfed read /dev/sdc1

      Thanks,

      Saurabh Sood

    • #2168
      Amit Bansal
      Keymaster

      Pankaj,

      Is this a normal redundancy diskgroup?

      Regards

      Amit

    • #2169
      Pankaj Sharma
      Participant

      Hi,

      Sorry for the late reply.

      Amit: This is an external redundancy diskgroup.

      Saurabh: Last time I recovered the voting disk by restoring it. But to answer your question, I have simulated the failure again so that we can find the reason for this unexpected behaviour of the clusterware.

      This time my voting disk was on ASMDISK3, and the corresponding OS device is “/dev/sdd1”.

      [BEFORE FAILURE]

      Below is the output of the command “kfed read /dev/sdd1”:

      [root@rac1 ~]# kfed read /dev/sdd1

      kfbh.endian: 1 ; 0x000: 0x01

      kfbh.hard: 130 ; 0x001: 0x82

      kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD

      kfbh.datfmt: 1 ; 0x003: 0x01

      kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0

      kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0

      kfbh.check: 711410310 ; 0x00c: 0x2a674286

      kfbh.fcn.base: 0 ; 0x010: 0x00000000

      kfbh.fcn.wrap: 0 ; 0x014: 0x00000000

      kfbh.spare1: 0 ; 0x018: 0x00000000

      kfbh.spare2: 0 ; 0x01c: 0x00000000

      kfdhdb.driver.provstr: ORCLDISKASMDISK3 ; 0x000: length=16

      kfdhdb.driver.reserved[0]: 1145918273 ; 0x008: 0x444d5341

      kfdhdb.driver.reserved[1]: 860574537 ; 0x00c: 0x334b5349

      kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000

      kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000

      kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000

      kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000

      kfdhdb.compat: 186646528 ; 0x020: 0x0b200000

      kfdhdb.dsknum: 0 ; 0x024: 0x0000

      kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL

      kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER

      kfdhdb.dskname: ASMDISK3 ; 0x028: length=8

      kfdhdb.grpname: DATA2 ; 0x048: length=5

      kfdhdb.fgname: ASMDISK3 ; 0x068: length=8

      kfdhdb.capname: ; 0x088: length=0

      kfdhdb.crestmp.hi: 32961169 ; 0x0a8: HOUR=0x11 DAYS=0x14 MNTH=0xc YEAR=0x7db

      kfdhdb.crestmp.lo: 252092416 ; 0x0ac: USEC=0x0 MSEC=0x1a8 SECS=0x30 MINS=0x3

      kfdhdb.mntstmp.hi: 32961387 ; 0x0b0: HOUR=0xb DAYS=0x1b MNTH=0xc YEAR=0x7db

      kfdhdb.mntstmp.lo: 2314231808 ; 0x0b4: USEC=0x0 MSEC=0x18 SECS=0x1f MINS=0x22

      kfdhdb.secsize: 512 ; 0x0b8: 0x0200

      kfdhdb.blksize: 4096 ; 0x0ba: 0x1000

      kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000

      kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80

      kfdhdb.dsksize: 5114 ; 0x0c4: 0x000013fa

      kfdhdb.pmcnt: 2 ; 0x0c8: 0x00000002

      kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001

      kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002

      kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002

      kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000

      kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000

      kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000

      kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000

      kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000

      kfdhdb.grpstmp.hi: 32961169 ; 0x0e4: HOUR=0x11 DAYS=0x14 MNTH=0xc YEAR=0x7db

      kfdhdb.grpstmp.lo: 251921408 ; 0x0e8: USEC=0x0 MSEC=0x101 SECS=0x30 MINS=0x3

      kfdhdb.vfstart: 96 ; 0x0ec: 0x00000060

      kfdhdb.vfend: 128 ; 0x0f0: 0x00000080

      kfdhdb.spfile: 0 ; 0x0f4: 0x00000000

      kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000

      kfdhdb.ub4spare[0]: 0 ; 0x0fc: 0x00000000

      kfdhdb.ub4spare[1]: 0 ; 0x100: 0x00000000

      kfdhdb.ub4spare[2]: 0 ; 0x104: 0x00000000

      kfdhdb.ub4spare[3]: 0 ; 0x108: 0x00000000

      kfdhdb.ub4spare[4]: 0 ; 0x10c: 0x00000000

      kfdhdb.ub4spare[5]: 0 ; 0x110: 0x00000000

      kfdhdb.ub4spare[6]: 0 ; 0x114: 0x00000000

      kfdhdb.ub4spare[7]: 0 ; 0x118: 0x00000000

      kfdhdb.ub4spare[8]: 0 ; 0x11c: 0x00000000

      kfdhdb.ub4spare[9]: 0 ; 0x120: 0x00000000

      kfdhdb.ub4spare[10]: 0 ; 0x124: 0x00000000

      kfdhdb.ub4spare[11]: 0 ; 0x128: 0x00000000

      kfdhdb.ub4spare[12]: 0 ; 0x12c: 0x00000000

      kfdhdb.ub4spare[13]: 0 ; 0x130: 0x00000000

      kfdhdb.ub4spare[14]: 0 ; 0x134: 0x00000000

      kfdhdb.ub4spare[15]: 0 ; 0x138: 0x00000000

      kfdhdb.ub4spare[16]: 0 ; 0x13c: 0x00000000

      kfdhdb.ub4spare[17]: 0 ; 0x140: 0x00000000

      kfdhdb.ub4spare[18]: 0 ; 0x144: 0x00000000

      kfdhdb.ub4spare[19]: 0 ; 0x148: 0x00000000

      kfdhdb.ub4spare[20]: 0 ; 0x14c: 0x00000000

      kfdhdb.ub4spare[21]: 0 ; 0x150: 0x00000000

      kfdhdb.ub4spare[22]: 0 ; 0x154: 0x00000000

      kfdhdb.ub4spare[23]: 0 ; 0x158: 0x00000000

      kfdhdb.ub4spare[24]: 0 ; 0x15c: 0x00000000

      kfdhdb.ub4spare[25]: 0 ; 0x160: 0x00000000

      kfdhdb.ub4spare[26]: 0 ; 0x164: 0x00000000

      kfdhdb.ub4spare[27]: 0 ; 0x168: 0x00000000

      kfdhdb.ub4spare[28]: 0 ; 0x16c: 0x00000000

      kfdhdb.ub4spare[29]: 0 ; 0x170: 0x00000000

      kfdhdb.ub4spare[30]: 0 ; 0x174: 0x00000000

      kfdhdb.ub4spare[31]: 0 ; 0x178: 0x00000000

      kfdhdb.ub4spare[32]: 0 ; 0x17c: 0x00000000

      kfdhdb.ub4spare[33]: 0 ; 0x180: 0x00000000

      kfdhdb.ub4spare[34]: 0 ; 0x184: 0x00000000

      kfdhdb.ub4spare[35]: 0 ; 0x188: 0x00000000

      kfdhdb.ub4spare[36]: 0 ; 0x18c: 0x00000000

      kfdhdb.ub4spare[37]: 0 ; 0x190: 0x00000000

      kfdhdb.ub4spare[38]: 0 ; 0x194: 0x00000000

      kfdhdb.ub4spare[39]: 0 ; 0x198: 0x00000000

      kfdhdb.ub4spare[40]: 0 ; 0x19c: 0x00000000

      kfdhdb.ub4spare[41]: 0 ; 0x1a0: 0x00000000

      kfdhdb.ub4spare[42]: 0 ; 0x1a4: 0x00000000

      kfdhdb.ub4spare[43]: 0 ; 0x1a8: 0x00000000

      kfdhdb.ub4spare[44]: 0 ; 0x1ac: 0x00000000

      kfdhdb.ub4spare[45]: 0 ; 0x1b0: 0x00000000

      kfdhdb.ub4spare[46]: 0 ; 0x1b4: 0x00000000

      kfdhdb.ub4spare[47]: 0 ; 0x1b8: 0x00000000

      kfdhdb.ub4spare[48]: 0 ; 0x1bc: 0x00000000

      kfdhdb.ub4spare[49]: 0 ; 0x1c0: 0x00000000

      kfdhdb.ub4spare[50]: 0 ; 0x1c4: 0x00000000

      kfdhdb.ub4spare[51]: 0 ; 0x1c8: 0x00000000

      kfdhdb.ub4spare[52]: 0 ; 0x1cc: 0x00000000

      kfdhdb.ub4spare[53]: 0 ; 0x1d0: 0x00000000

      kfdhdb.acdb.aba.seq: 0 ; 0x1d4: 0x00000000

      kfdhdb.acdb.aba.blk: 0 ; 0x1d8: 0x00000000

      kfdhdb.acdb.ents: 0 ; 0x1dc: 0x0000

      kfdhdb.acdb.ub2spare: 0 ; 0x1de: 0x0000

      [root@rac1 ~]#
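To relate the header fields above to where the voting file sits on the disk, the sketch below pulls a few values out of kfed-style text and does the arithmetic. It assumes, as is commonly described for 11.2 but not confirmed in this thread, that `kfdhdb.vfstart` and `kfdhdb.vfend` are allocation-unit numbers bounding the voting file and that `kfdhdb.ausize` is the allocation-unit size in bytes. The helper name `kfed_field` is hypothetical, and the sample lines are copied from the output above.

```shell
# kfed_field NAME: read kfed output on stdin and print the decimal value
# of the named field (the token right after "NAME:").
kfed_field() {
    awk -v name="$1:" '$1 == name { print $2; exit }'
}

# Sample lines copied from the kfed output above.
sample='kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.vfstart: 96 ; 0x0ec: 0x00000060
kfdhdb.vfend: 128 ; 0x0f0: 0x00000080'

ausize=$(printf '%s\n' "$sample" | kfed_field kfdhdb.ausize)
vfstart=$(printf '%s\n' "$sample" | kfed_field kfdhdb.vfstart)
vfend=$(printf '%s\n' "$sample" | kfed_field kfdhdb.vfend)

# Assumed interpretation: byte offset of the first voting-file AU,
# and the number of AUs between the two markers.
vf_offset=$((vfstart * ausize))
vf_aus=$((vfend - vfstart))
echo "voting file region: offset=${vf_offset} bytes, span=${vf_aus} AUs"
```

On a live system the same helper could be fed directly, for example `kfed read /dev/sdd1 | kfed_field kfdhdb.vfstart`.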

      [AFTER FAILURE]

      [root@rac1 ~]# kfed read /dev/sdd1

      kfbh.endian: 0 ; 0x000: 0x00

      kfbh.hard: 0 ; 0x001: 0x00

      kfbh.type: 0 ; 0x002: KFBTYP_INVALID

      kfbh.datfmt: 0 ; 0x003: 0x00

      kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0

      kfbh.block.obj: 0 ; 0x008: TYPE=0x0 NUMB=0x0

      kfbh.check: 0 ; 0x00c: 0x00000000

      kfbh.fcn.base: 0 ; 0x010: 0x00000000

      kfbh.fcn.wrap: 0 ; 0x014: 0x00000000

      kfbh.spare1: 0 ; 0x018: 0x00000000

      kfbh.spare2: 0 ; 0x01c: 0x00000000

      ERROR!!!, failed to get the oracore error message

      [root@rac1 ~]# oracleasm querydisk -p ASMDISK3

      Disk "ASMDISK3" defines an unmarked device
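The before/after difference in the two kfed dumps comes down to the `kfbh.type` line. A minimal sketch of comparing that field from saved kfed output is shown below; the helper name `header_type` is hypothetical, and the two sample lines are copied from the dumps above.

```shell
# header_type: read kfed output on stdin and print the symbolic block type,
# i.e. the last token on the kfbh.type line.
header_type() {
    awk '$1 == "kfbh.type:" { print $NF; exit }'
}

# kfbh.type lines copied from the BEFORE and AFTER dumps above.
before='kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD'
after='kfbh.type: 0 ; 0x002: KFBTYP_INVALID'

before_type=$(printf '%s\n' "$before" | header_type)
after_type=$(printf '%s\n' "$after" | header_type)

# Report when the header block type differs between the two dumps.
if [ "$before_type" != "$after_type" ]; then
    echo "header changed: $before_type -> $after_type"
fi
```

This only confirms that the on-disk ASM header was overwritten. It says nothing about what the running clusterware still holds in memory or reports through crsctl.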

      The findings are still the same as before: no error messages, and the voting disk is still ONLINE.

      Let me know if you want any other information.

      Thanks

      Pankaj
