A common theme that I run across is the lack of exception handling within production code. The amount of man hours wasted tracking down resolutions on production issues gets lost amongst the obscure error message, or the error message that sends everyone down the wrong path. It seems so obvious to those of us charged with incident management, well constructed exception handling will save time and frustration.
A typical fire fighting scenario without well written exception handling begins with the end user calling the help desk with an error message “ORA-01858 a non-numeric character was found where a numeric was expected” . Not too obscured and fairly straight forward on the surface. So we pull up the code and the line number where the error states its being generated. The line number points to a straight insert statement.
Insert into table_detail (table_seq, desc, message) values (vTbl_seq, vdesc, vmessage);
Only one column is defined as a number:
The only number column is populated via a sequence which is a number. So where is the error being introduced?
In further review of the code, the actual error message is being inserted into the audit_log (which is used to track the progress of different batch processes) and the process continues to the next step which is an insert into the table_detail. This insert raises the ora-01858 because its trying to insert previous variables since new values were not populated since the file could not be read. The actual error that triggered these events and the reason the file could not be read ora-29283 Invalid File Operation. To further complicate the troubleshooting neither the line number nor the directory in question is included and for this particular process there are few different selections.
To recap, there is an error which is logged in the audit_log (which will never be seen by the end user unless they review the audit_log), the process continues but the next step fails and raises the ora-01858. Now that we have the right error message, we know that we can review the permissions within the DBA_TAB_PRIVS for each of the directories, as well as the permissions on the OS directory. The time spent troubleshooting and debugging the code approximately 2 hours.
If the original error message had included the actual oracle error number, text and line numbers as well as some additional information like “Invalid File Operation for Directory FILE_DIR and Filename Table_Load.txt Line 512”, then the troubleshooting efforts would have equaled 5 minutes maximum for one individual instead of the multiple people and 2 hours.
Production is not the place to debug errors. The exceptional handling should account for errors in a straight forward manner allowing production support personnel to efficiently and effectively troubleshoot incidents without involving developers. In my opinion the only type of exception handling worse than obscure messages is WHEN EXCEPTION THEN NULL.
Tagged: Exception Handling