In Oracle database 11g, a new feature/enhancement has potential of bringing down the application due to multiple sessions hitting database with wrong password.
As per Bug 7715339 – Logon failures causes “row cache lock” waits – Allow disable of logon delay [ID 7715339.8]
In 11g there is an intentional delay between allowing failed logon attempts to retry. After 3 successive failures a sleep delay is introduced starting
at 3 seconds and extending to 10 seconds max. During each delay the user X row cache lock is held in exclusive mode preventing any concurrent logon attempt as user X (and preventing any other operation which would need the row cache lock for user X).
Let’s simulate this using a test case.I am using a 11.2.0.3 database.Open two or more database server connections
Connection 1
create user amit identified by amit; grant connect to amit; alter user amit profile MONITORING_PROFILE;
MONITORING_PROFILE has failed_login_attempts set to unlimited.
Create two scripts
$cat 1.sh sqlplus -s amit/test123 <<EOF EOF $cat lock_db.sh #!/bin/bash for i in {1..1000} do nohup sh 1.sh & done
I have set loop to 1000 which you can increase depending on your process limit. This is test database with process parameter set to 150. Execute script
sh lock_db.sh
Now if you go to Connection 2, you will see following issues
a) Database will not allow fresh connections for short time as you will get ORA-00020: maximum number of processes (150) exceeded errors.
After that database connections will be possible
b) You cannot connect to AMIT user for considerable time even with correct password. All sessions will be waiting on library cache lock
select event,count(*) from V$session group by event order by 2; EVENT COUNT(*) ---------------------------------------------------------------- ---------- Streams AQ: waiting for time management or cleanup tasks 1 VKTM Logical Idle Wait 1 Space Manager: slave idle wait 1 SQL*Net message from client 1 ges remote message 1 ASM background timer 1 GCR sleep 1 smon timer 1 pmon timer 1 Streams AQ: qmn coordinator idle wait 1 asynch descriptor resize 1 Streams AQ: qmn slave idle wait 1 PING 1 class slave wait 2 gcs remote message 2 DIAG idle wait 2 wait for unread message on broadcast channel 2 rdbms ipc message 18 library cache lock 101 select distinct ses.ksusenum sid, ses.ksuseser serial#, ses.ksuudlna username,KSUSEMNM module, ob.kglnaown obj_owner, ob.kglnaobj obj_name ,lk.kgllkcnt lck_cnt, lk.kgllkmod lock_mode, lk.kgllkreq lock_req , w.state, w.event, w.wait_Time, w.seconds_in_Wait from x$kgllk lk, x$kglob ob,x$ksuse ses , v$session_wait w where lk.kgllkhdl in (select kgllkhdl from x$kgllk where kgllkreq >0 ) and ob.kglhdadr = lk.kgllkhdl and lk.kgllkuse = ses.addr and w.sid = ses.indx order by seconds_in_wait desc / SID SERIAL# USERNAME MODULE OBJ_OWNER OBJ_NAME LCK_CNT LOCK_MODE LOCK_REQ STATE EVENT WAIT_TIME SECONDS_IN_WAIT ---------- ---------- ---------- ----------------------------------- --------------- --------------- ---------- ---------- ---------- ------------------- ------------------------------ ---------- --------------- 153 77 testbox01 84 0 0 3 WAITING library cache lock 0 3 36 101 testbox01 84 0 0 3 WAITING library cache lock 0 3 16 137 testbox01 84 0 0 3 WAITING library cache lock 0 3 143 203 testbox01 84 0 0 3 WAITING library cache lock 0 3 26 261 testbox01 84 1 3 0 WAITED KNOWN TIME library cache lock 119 3
You will notice that seconds_in_wait value will increase . Also ‘alter system kill session’ takes long time to kill session (it seems as if hung). As per bug ,any operation involving row cache lock for that user will not be allowed. I confirmed by trying to reset profile to Default and session was stuck.
Till all these sessions are cleared from database, you will not be allowed to login. This can severy impact applications as fresh connection would not be possible. Pre-existing sessions will work
If you repeat same experiment in 10g, you will not face similar issue.
To disable this behavior i.e no sleep delay set following event
event=”28401 trace name context forever, level 1″
You can set it dynamically too using
alter system set events ‘28401 trace name context forever, level 1’;
To go back to original behavior
alter system set events ‘28401 trace name context off’;
Bug 7715339 mentions that if FAILED_LOGIN_ATTEMPTS is set to some value, then further attempts to log in will then correctly fail immediately with no delay
I tested this and found that this is not entirely true as my script was able to again spawn 101 sessions which were waiting on library cache lock again. Fresh connection with correct password was again stuck. DBA_USERS view was showing that account is locked. Though here the session cleared in quick time as compared to test case involving failed_login_attempts to unlimited.
Interesting ! Good article Amit !
Nice one.Thanks for sharing 🙂
Regards,
Anand
Thanks Suvv and Anand for your comments…
Thank you for posting this. It’s been a vexing issue for us.
good article, was very useful