Retry count has no effect in thin mode when using an address list and running into a timeout · Issue #420 · oracle/python-oracledb (original) (raw)
- What versions are you using?
Database: 19.0.0.0.0
platform.platform: Linux-5.4.0-148-generic-x86_64-with-glibc2.31
sys.maxsize > 2**32: True
platform.python_version: 3.12.6
oracledb.version: 2.4.1
I have not seen anything in the changelog that suggests that this has been addressed in 2.5.0
- Is it an error or a hang or a crash?
error - What error(s) or behavior you are seeing?
We have a list of alternative IP addresses for an Oracle database. When one of them fails on connect with a timeout and retry_count is set to something greater than zero, we expect that the connection is attempted again with one of the other addresses. This works as expected in thick mode, but fails in thin mode: In thin mode, an exception is raised after the first attempt.
import oracledb
import time
# connection details have been redacted
working_ip = '[IP of Oracle database]'
broken_ip = '[IP that produces a timeout]'
service_name = '[Oracle service name]'
user = '[my user]'
password = '[my_password]'
dsn = f'(DESCRIPTION=(RETRY_COUNT=3)(CONNECT_TIMEOUT=2)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST={broken_ip})(PORT=33001))(ADDRESS=(PROTOCOL=TCP)(HOST={working_ip})(PORT=33001))(LOAD_BALANCE=yes))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME={service_name})))'
for i in range(10):
start = time.monotonic()
success = False
try:
with oracledb.connect(dsn=dsn, user=user, password=password, tcp_connect_timeout=2, retry_count=3) as con:
cursor = con.cursor()
cursor.execute('select * from dual')
r = cursor.fetchall()
success = True
except Exception as ex:
print(ex)
print(f'success: {success}, time: {time.monotonic() - start}')
This gives the following result:
DPY-6005: cannot connect to database (CONNECTION_ID=zcENDxX4Xx2538CUve3T2Q==).
timed out
success: False, time: 2.003529854118824
success: True, time: 0.15504495054483414
success: True, time: 0.15987224131822586
DPY-6005: cannot connect to database (CONNECTION_ID=MF8hAFPk2DrulkI9T3Xf1Q==).
timed out
success: False, time: 2.001703642308712
DPY-6005: cannot connect to database (CONNECTION_ID=v6Zs3XftGMih6hB45cJklA==).
timed out
success: False, time: 2.0028985515236855
success: True, time: 0.1483408883213997
DPY-6005: cannot connect to database (CONNECTION_ID=pa6I5LldNbNobwc2RXNuUg==).
timed out
success: False, time: 2.0020845904946327
DPY-6005: cannot connect to database (CONNECTION_ID=FE43mMP9hjofXzAcIm5P+g==).
timed out
success: False, time: 2.0028276592493057
DPY-6005: cannot connect to database (CONNECTION_ID=jJKdT0p8T+nOOH5j0Qu0ww==).
timed out
success: False, time: 2.002876453101635
success: True, time: 0.40784119069576263
So it looks like sometimes it suceeds on first try, but sometimes it fails on first try and immediately raises an exception, instead of trying a second time. Since it takes 2 seconds until the exception is raised (instead of 8), we can exclude that the first IP is tried multiple times
Running the same code in thick mode gives the expected result: Sometimes it succeeds on first try, sometimes it runs into a timeout after 2 seconds, but succeeds on second try
success: True, time: 0.18361841142177582
success: True, time: 0.19042250514030457
success: True, time: 2.1918497011065483
success: True, time: 2.1891980096697807
success: True, time: 0.18908774107694626
success: True, time: 2.182748392224312
success: True, time: 0.18147382885217667
success: True, time: 0.18108884245157242
success: True, time: 2.1901468336582184
success: True, time: 0.17620021849870682
Removing (LOAD_BALANCE=yes) results in the connection always failing after 2 seconds in thin mode and the connection always succeeding after 2 seconds in thick mode.
Putting (RETRY_COUNT=3) in the DSN or passing it directly as parameter to connect() does not seem to make a difference
- Does your application call init_oracle_client()?
The problem occurs in thin mode. If we call init_oracle_client, it works as expected