The magic of bad design

Home › Uncategorized › The magic of bad design

boneist August 21, 2012 Uncategorized Leave a comment (18)

The other day, I was investigating an issue in our process that takes a list of users and (effectively) merges them into the database. As part of that, I was trying to understand how the whole process worked.

This is the basic structure of the table:

Name          Null?    Type                        
------------- -------- ----------------------------
ID            NOT NULL NUMBER(12)                  
LOGIN_NAME             VARCHAR2(25)                
CREATED                DATE                        
LAST_UPDATED           DATE

I could see data in the table:

   ID LOGIN_NAME CREATED             LAST_UPDATED       
----- ---------- ------------------- -------------------
 1451 user_1451  06/08/2008 00:10:54 10/10/2008 01:01:34
 1452 user_1452  07/08/2008 00:11:18 23/02/2009 15:23:53
 1453 user_1453  12/08/2008 00:09:51 04/07/2012 03:09:08
 1454 user_1454  14/08/2008 00:09:58 10/06/2009 05:28:24

And this is the code that actually inserts the user information (I’ve removed the updates from this, as they’re not relevant to my mystery!):

PROCEDURE SynchroniseUsers (pUserList user_info_tab)
IS
BEGIN
  INSERT INTO USERS (ID, LOGIN_NAME)
  SELECT ID_SEQ.Nextval, theUsers.column_value
  FROM   THE (SELECT CAST(l_userTable AS VARCHARTAB) FROM dual) theUsers
  WHERE  NOT EXISTS (SELECT /*+ index(U USERS_LOGIN) */ 1
                     FROM USERS U
                     WHERE U.login_name = theUsers.column_value);
END;
/

“Hmm,” I thought, “I can’t see where the CREATED or LAST_UPDATED columns are being updated.” So, I hunted through the rest of the code in the package – there was an UPDATE of the LAST_UPDATED column elsewhere, so I wasn’t concerned about that, but I could not find any mention of where the CREATED column was being updated.

“Aha! Must be in a trigger!” … alas, no.

I was absolutely stumped; by now, I’ve spent the best part of 30 mins trying to work out where this column gets updated (the code involved is not the best code on the planet, as I’m sure you can judge from the above procedure!). Eventually, after staring at it some more and scratching my head, it hit me: default column values *smacks forehead*

Sure enough, when I go into the table structure:

CREATE TABLE USERS
(
  ID NUMBER(12) NOT NULL,
  LOGIN_NAME VARCHAR2(25 BYTE),
  CREATED DATE DEFAULT SYSDATE, -- here is the culprit!
  LAST_UPDATED DATE
);

If whoever had designed this ~~pile of…~~ code had included the CREATED column in the INSERT statement in the first place, I wouldn’t have had to waste my time trying to work out what the heck was going on! That is why you should always be explicit in your coding habits – sure, have default values so that things don’t slip through the cracks, but make sure the table is well designed and that any code doing inserts, updates, etc has ALL the necessary columns included.

Don’t rely on the side effects of doing things automagically, it just leads to maintenance headaches!

« Previous | Next »

18 Comments.[ Leave a comment ]

Norman Dunbar August 21, 2012 at 11:41 am

Hi Dawn.

If whoever had designed this ~~pile of…~~ code had included the CREATED column in the INSERT statement in the first place, I wouldn’t have had to waste my time trying to work out what the heck was going on!

Do you mean, had not specified a default AND included the CREATED column in the INSERT….?

Because the way I read it, I assumed you were leaving the default but specifying the CREATED value in the INSERT – which confused me a bit, until the penny dropped.

I was confused because if you specify the column name, you have to supply a value or NULL and then you don’t get the default value used – because you didn’t default it.

But I realise now that you are having a rant about default values, aren’t you? 😉

Cheers,
Norm.

Reply
- boneist August 21, 2012 at 11:49 am
  
  I meant, because the original coder had not included the CREATED column in the INSERT statement, it was being set automatically via the default values column.
  
  I’m not against default values, but I think if you’re doing INSERTs into that table as part of a procedure, then you jolly well ought to be explicit and include that column in the insert! What happens if someone came along and removed the default value from the table?
  
  So my rant is towards the lazy/bad design on the part of the original developer. And against relying on side-effects that happen automagically. I’ve nothing against default values per se!
  
  Reply
  - Norman Dunbar August 21, 2012 at 11:58 am
    
    Hi Boneist.
    
    Thanks.
    
    I meant, because the original coder had not included the CREATED column in the INSERT statement, it was being set automatically via the default values column.
    
    I see now, even better than before. You wanted an explicit SYSDATE value supplied for CREATED in the INSERT statement as opposed to relying on the implicit default value. Makes sense.
    
    Mind you, I usually use a trigger to do that sort of thing myself, if I have to have a CREATED date on a table. I do likewise with a trigger for any deletes/updates as well, to fill in the who and when for any changes. Obviously deletes are not really deletes in that case, just a flag set.
    
    And against relying on side-effects that happen automagically. I’ve nothing against default values per se!
    
    I agree on the part about someone altering the table to remove the default, that would cause problems, but I assume you have change control and proper (regression) testing in place?
    
    And I’m not seeing where using a trigger or a default is all that different, both are supplying values where none were supplied?
    
    And finally, if you have nothing against default values, then you cannot also be against relying on things happening automagically! 😉
    
    (Or am I missing the obvious again? It happens!)
    
    Cheers,
    Norm.
    
    Reply
    - boneist August 21, 2012 at 12:07 pm
      
      To me, default values are a “what if something slips through the net” solution – ie. someone comes along and bypasses all the code by manually adding a row into the table.
      
      I’m not a massive fan of triggers and avoid them wherever possible. I much prefer having everything in the pl/sql – that way, you only need to look in one place to work out what the heck is going on, rather than “Ok, so my package adds a row, but that column gets set via a default value, and this thing happens because of the trigger, so now I need to go look over there…”.
      
      YMMV, of course, especially if you’re a DBA, not a developer! *{;-)
      
      Reply
      - Norman Dunbar August 21, 2012 at 12:53 pm
        
        Hi again!
        
        Your first paragraph could easily interchange trigger for default. And while it would be nice to have the PL/SQL doing everything correctly, that’s going down the lines of “my application is the only way to put data into the database” – and we know what that leads to!
        
        You need to protect your data, so the “business rules” go as close to the data as possible. If it is a possibility that there will be data loaded without passing through the application, then I have to say that triggers and/or defaults are almost mandatory!
        
        And I say that as both a developer and DBA – the DBA side says “I must protect the integrity and security of the data at all times” and the developer says “I agree!”.
        
        I’m afraid the DBA side has had to deal with to many vendors who think that the application is king, when it is the data that is the company’s asset. And to that end, the data has to be valid and protected at all times. The vendors I’ve dealt with all say the same things “Oh, there’s no way that anyone can put data directly into the tables, it has to be from the application”.
        
        Even a Java developer (joke) can see through that one!
        
        So, yes, my developer side agrees that it is a lot easier when everything is done in one place, but the DBA side of me says that the data are really important, and has to be protected. So, triggers and/or defaults are a necessity.
        
        And my developer side also says, that there’s no reason why the developer couldn’t have supplied an explicit SYSDATE even though there’s a default.
        
        Cheers,
        Norm.
      - boneist August 21, 2012 at 1:31 pm
        
        Hmm, you raise an interesting point – to my mind, the PL/SQL code inside the database, coupled with appropriately permissioned separate object-owner / user accounts means that your only way into the data is via the PL/SQL code, aside from manual tweaks of the data by support. That way, the PL/SQL code *is* closely tied to the data, and is only circumvented under unusual circumstances.
        
        Data is important though (my raison d’être!), hence why I’m ok with default values as a safety net, and – to a certain degree – triggers (but only if absolutely necessary). Constraints, etc., are absolutely vital – of course!
        
        And my developer side also says, that there’s no reason why the developer couldn’t have supplied an explicit SYSDATE even though there’s a default.
        
        Exactly! I would automatically include that as part of the insert, or at least include a comment as to why I wasn’t!
Dominic Brooks August 21, 2012 at 1:04 pm

Ah, the good old THE pseudo-function … meaningfully named or what?

Reply
- boneist August 21, 2012 at 1:33 pm
  
  Lol, I wondered if someone would spot that; clearly, I didn’t tidy up this code when we moved from 9i to 10g, so now I’m changing it wherever possible to use TABLE – or even better, to use FORALL…INSERT…!
  
  Reply
Norman Dunbar August 21, 2012 at 1:51 pm

Hmm, you raise an interesting point – to my mind, the PL/SQL code inside the database, coupled with appropriately permissioned separate object-owner / user accounts means that your only way into the data is via the PL/SQL code, aside from manual tweaks of the data by support. That way, the PL/SQL code *is* closely tied to the data, and is only circumvented under unusual circumstances.

Yes, like anyone logging in as the data owner using Toad, SQL*Plus or, if you must, SQL Developer (hope Jeff isn’t reading this, if he is, Hi Jeff!)

Or a script from the vendor which nicely bypasses the application – as I have suffered from to my cost!

Or a [junior] DBA loggining in a SYSDBA and running an “insert into your_schema.your_table(…) values (…);”.

With a default and/or a trigger, the data are safe. With only the application running the show, the data are now not ok, and depending on whether the application expects everything to be as it would have loaded it, then all hell might break loose when said data are read by the application. Another one I’ve had joyful experience of in the past.)

Anyway, I’ll leave you alone now!

Cheers,
Norm.

PS. I did notice the THE in the code but deliberately ignored it because I’ve never actually understood what it’s all about, nor have I ever found a description that actually explained what it was all about! Ignorant DBAs eh? 😉

Reply
- boneist August 21, 2012 at 2:10 pm
  Having a default value is not going to stop someone from writing their own insert statement containing a different value for that column and therefore completely bypassing it. And if someone’s in as the data-owner, there’s nothing to stop them from dropping the triggers, etc, etc.
  
  It’s six of one, half-a-dozen of the other; you can set your database up to be as secure as possible, but there will always be ways that a disgruntled db developer (or DBA!) could screw things up for you… (although IME, disgruntled DBAs tend to be far more subtle in their ways to really mess you up! *{;-) ).
  
  As for the THE, that’s an old-fashioned, deprecated way of converting an array into something that can be queried directly by the SQL engine.
  
  These days, you’d use the TABLE function, so the above example would be converted to:
```
PROCEDURE SynchroniseUsers (pUserList user_info_tab)
IS
BEGIN
  INSERT INTO USERS (ID, LOGIN_NAME)
  SELECT ID_SEQ.Nextval, theUsers.column_value
  FROM   TABLE(CAST(l_userTable AS VARCHARTAB)) theUsers
  WHERE  NOT EXISTS (SELECT /*+ index(U USERS_LOGIN) */ 1
                     FROM USERS U
                     WHERE U.login_name = theUsers.column_value);
END;
/
```
  which is a bit more self-explanatory.
  Reply
  - Norman Dunbar August 21, 2012 at 2:27 pm
    
    Thanks Dawn. THE = TABLE. Nice.
    
    Now, don’t get me started on hints! 😉
    
    Cheers,
    Norm.
    
    PS. I’m not disgruntled – yet! My data are safe!
    
    Reply
    - boneist August 21, 2012 at 2:48 pm
      
      Oh yeah, the hint… that’s something else I was wondering if someone would pick up on! I suspect I’ll remove it, assuming it doesn’t affect performance badly.
      
      I always, always try very hard to stay on the good side of any DBA I have to work with, having once worked with a total jobsworth (I suspect she’d had her fingers badly burnt in the past, but it certainly didn’t make my job any easier). I once ended up having to spend 2 weeks researching and documenting the fact that the new 9i CREATE DIRECTORY command was not, in fact, available for 8i, and that yes, we would need the utl_file_dir parameter amended to include our directory. Painful.
      
      Reply
      - Norman Dunbar August 21, 2012 at 2:57 pm
        
        DBAs and Developers *should* be working together, always. They have so much to learn from each other.
        
        Keeping them in separate cages is cruel!
        
        Cheers,
        Norm.
      - Dominic Brooks August 22, 2012 at 11:28 am
        
        Why remove the hint? What’s the problem? I assume it’s related to the defaut block-size related cardinality for the collection and maybe that was causing a FTS of USERS.
        
        We know nothing about your versions so isn’t it 11.1.0.7 when dynamic sampling of collections kicks in?
        
        Assuming that this default is the root cause of the index hint, in older versions I would have favoured something like a cardinality hint to adjust the default statistics … but we all have our preferences … and change those preferences reasonably frequently.
        
        But I suppose it’s highly probably that this code was originally developed under the RBO so….
        
        “These days, you’d use the TABLE function”
        Indeed. And maybe a MERGE.
        
        The merits of a NOT EXISTS (SELECT) over just capturing a dup_val_on_index exception if there was a UK on login_name – so many things we could discuss…
David Fitzjarrell August 21, 2012 at 3:10 pm

“DBAs and Developers *should* be working together, always. They have so much to learn from each other.

Keeping them in separate cages is cruel!

Cheers,
Norm.”

But sometimes necessary …

David

Reply
- Norman Dunbar August 21, 2012 at 3:29 pm
  
  David,
  
  necessary only if the Developers are Java ones, and insisting on reinventing the wheel yet again. Or using Hibernate.
  
  🙂
  
  Cheers,
  Norm.
  
  Reply
  - David Fitzjarrell August 21, 2012 at 4:10 pm
    
    I haven’t had the pleasure of working with any other kind for years — 1995 was the last year I worked with any but Java/Hibernate developers.
    
    The joys of being a career DBA.
    
    David
    
    Reply
  - Bob Jarvis March 27, 2015 at 2:03 pm
    
    I think that calling out Hibernate is unfair. Please, let’s try to do better. Ahem… Using *any* framework for hiding database access behind a pile of non-standard “gosh I wonder what this does” code is grounds for damnatio memoriae.
    
    Reply

Boneist's Oracle Blog

The ramblings of an Oracle developer