Certified at last!

Today saw me trundling off to dear old Birmingham for the second time in 2 weeks (the first time being for the UKOUG Tech17 conference) and doing a fair impression of a nervous wreck.

That’s because I was heading to take both the SQL (1z0-061) and PL/SQL (1z0-144) Oracle certification exams.

I don’t like doing exams, mainly because I’m rubbish at revising. Seriously: the most revision I did throughout my entire university career consisted of 10-15 minutes before each exam. Thankfully, the Maths degree I did (and got!) involved practising things as part of the coursework, so I learnt by doing rather than by rote memorisation.

I was very worried that I’d fail these exams because the questions in the exam (certainly in the practice exams) don’t necessarily reflect anything like what you’d do in real life.

For example, I would never consider using NATURAL JOINs in any of my code, and I’d certainly flag it up if I ever saw it in someone else’s meant-for-production code! Why guess at what join conditions you want, rather than specifying them explicitly? Especially when that opens you up to a potential bug (what if someone adds a new column that ends up being included in the join condition?). Even the JOIN … USING form is not something I’d recommend, because the columns named in the USING clause can’t be qualified with a table alias, so you end up aliasing some columns but not others; why make life hard for yourself when you can use JOIN … ON and simply qualify every column?
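To illustrate, with a couple of made-up tables (hypothetical names, obviously):

-- emp(dept_id, ename) and dept(dept_id, dept_name) share only DEPT_ID
select e.ename, d.dept_name
from   emp e natural join dept d;

-- Fine today... but if someone later adds, say, a LAST_UPDATED column
-- to BOTH tables, the join silently becomes
-- (dept_id AND last_updated) -- almost certainly a bug!

-- The explicit form is immune to new columns:
select e.ename, d.dept_name
from   emp e
join   dept d on d.dept_id = e.dept_id;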

Anyway, I digress. I passed both exams relatively convincingly, fortunately. It would have been embarrassing if I’d failed them, given how long I’ve worked with SQL and PL/SQL!

The main reason I’m writing this post is because Tim Hall basically asked me for an interview, and I thought I would humour him (because he definitely didn’t have a good idea. Nope! No siree! *{;-) ). And also, it’s been waaay too long since I last updated this blog!

On with the interview:

You’re a very experienced Oracle developer. What made you decide to do some Oracle certification exams at this point in your career?

I can’t take the credit for this; basically my department was allocated some budget to get everyone certified to either Associate (most of the developers) or Professional (the team leads) in 2017.

Before you say “didn’t you leave it a bit late?” I had planned on doing it at the beginning of October, but emergency surgery (I’m fine now!) the week before I was due to do it meant I had to postpone it. Today was the day!

What materials (books, blogs, training, practice questions) did you use to prepare for the exams?

The books I used were Roopesh Ramklass’s OCA Oracle Database 12c SQL Fundamentals I Exam Guide and Matthew Morris’s Study Guide for 1Z0-144: Oracle Database 11g: Program with PL/SQL: Oracle Certification Prep.

Roopesh’s book included a CD with some practice questions, so I worked my way through those. Matthew Morris’s book had a link (and discount) to a test you had to pay for, which I took and failed (this did *not* help my nerves one little bit!) and had to retake.

Plus, of course, there was the official Oracle training course for the PL/SQL exam (although I only managed the first 3 days of the online course!), plus the official practice exams provided by Transcender.

How long did you spend revising specifically for the exams?

I’ve been working with SQL for nearly 19 years and PL/SQL for at least 13 years, so you could say that I’ve been working towards the exams for a while now.

However, in terms of actually boning up for the SQL and PL/SQL exams? I’d already done a bit before life interfered – I’d read the first 3 chapters of Roopesh’s SQL book and done 3/5ths of the official PL/SQL training (I ran out of time on that one; I can’t blame the surgery for that!) plus the official practice exams.

Then I had a break of about two-and-a-bit-months, until last Monday (four days before the exams) when work very kindly gave me four days off for revision before the exams. I finished reading the SQL book over Monday and Tuesday (finishing up with the practice exams from the CD), and then read the PL/SQL book over Wednesday and Thursday (finishing up with the practice exams from the website linked to the book).

As revision goes, I couldn’t recommend my technique to anyone. I should have actually practised things more (e.g. actually creating triggers on tables, etc) but that would have taken far longer than the four days I had. Also, the SQL book didn’t cover some topics, such as LISTAGG (which I knew about) and the new FETCH FIRST… etc functionality (which I hadn’t realised would be tested), so perhaps a wider reading selection would have helped cover these extra topics.

Now you’ve passed the exams, how do you feel about the certification process?

I feel relieved, to be honest! These exams have been hanging over me for at least the past 6 months, and now they are done!

More seriously, I think they’re probably a good way to get beginners to learn a broad subset of SQL and PL/SQL, but like most text books, the examples given/questions in the exams are terribly contrived and are nothing like what you’d actually do in real life.

I think for our department (which consists of lots of PL/SQL developers, not all of whom are that familiar with SQL and/or PL/SQL) it was a good idea – at least everyone should now be at a similar level of knowledge.

I think some of the questions on the actual exams were very ambiguous. I had to spend time analysing the requirement and trying to work out what the heck they meant. Did they want employees earning over X salary that were hired in the last N months that have a manager, or was it employees with a manager plus employees hired within N months who earn over X salary? Or something else? *scratches head* I’m still not sure on that one, but I’m bloomin’ sure I wouldn’t provide an answer using set operations, which is what all the suggested answers used!

I’m convinced at least one of the questions in each exam didn’t have an answer. *That’s* not helpful. I think someone really needs to go through the exam and point out all these inconsistencies to Oracle.

Also I’m confused about why regexp and analytic functions were completely missing from the SQL exam and prep; there was one question where I could have ticked all four boxes, except it wouldn’t allow me to. I left the box that would have been covered by regexp_count unticked since that wasn’t in the training gumph.

Are you likely to do any more certifications?

Part of me is like “Noooooo, are you kidding? Why would I put myself through this again?!” but the completist side of me thinks that maybe I should go the whole hog and try for the OCP. Whether work would have enough budget for that (seeing as I’m not one of the team leads), I don’t know. I’m still debating whether I should ask or not.

Do you think you’ve benefited from the process?

I get an extra three letters to add after my name? I guess that’s a benefit.

I don’t think I learnt much of anything new, or at least nothing that I couldn’t google as part of my day-job (I mean, who remembers the syntax for creating a DDL trigger, if you only ever do it once in a blue moon?). Possibly the only thing I learnt was about PLSQL_OPTIMIZE_LEVEL, which I may bring up at work on Monday, if we aren’t already set at the highest level!
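(For the curious, checking and changing that setting looks something like this – the procedure name below is made up, of course:)

-- See the optimisation level each existing unit was compiled with
select name, type, plsql_optimize_level
from   user_plsql_object_settings;

-- Levels run from 0 to 3; the default is 2, and 3 additionally
-- enables automatic inlining of subprograms (11g onwards)
alter procedure my_proc compile plsql_optimize_level = 3;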

Personally, I don’t feel the need to “prove my worth” by having a piece of paper stamped with “OCA”, since I have other means of proving my capabilities, but I do think this might be of use to new SQL and PL/SQL developers. It’s not something that should be taken to indicate the person is an expert, though – experience is needed long before you can call yourself that! To be fair, I did see that mentioned on the Transcender practice exam page.

Dropping constraint… what about the index?

In a recent release that I’ve been working on, we had to drop and recreate some of the unique/primary key constraints (due to columns being added into the constraint) on some of our tables.

Prior to working in this team, I had only ever come across explicitly creating the indexes for a table, similar to:

create table test_explicit_index (col1 number,
                                  col2 varchar2(30),
                                  col3 date);

create unique index col1_exp_pk on test_explicit_index (col1);

alter table test_explicit_index add constraint col1_exp_pk primary key (col1);

We can see the index exists:

select index_name, table_name, uniqueness, status
from   user_indexes where table_name = 'TEST_EXPLICIT_INDEX';

INDEX_NAME   TABLE_NAME           UNIQUENESS STATUS  
------------ -------------------- ---------- --------
COL1_EXP_PK  TEST_EXPLICIT_INDEX  UNIQUE     VALID   

If we drop the constraint, we can see that the index still exists:

alter table test_explicit_index drop constraint col1_exp_pk;

select index_name, table_name, uniqueness, status
from   user_indexes where table_name = 'TEST_EXPLICIT_INDEX';

INDEX_NAME   TABLE_NAME           UNIQUENESS STATUS  
------------ -------------------- ---------- --------
COL1_EXP_PK  TEST_EXPLICIT_INDEX  UNIQUE     VALID   

Therefore, in my release scripts, I would always run a drop index statement after dropping a constraint supported by an index.
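In other words, the release scripts contained pairs along the lines of:

alter table test_explicit_index drop constraint col1_exp_pk;

-- the index survives the constraint drop, so remove it explicitly
drop index col1_exp_pk;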

All was well with the world, until I moved to a team who created constraints that implicitly created the index, something like:

create table test_implicit_index (col1 number,
                                  col2 varchar2(30),
                                  col3 date);

alter table test_implicit_index add constraint col1_imp_pk primary key (col1) using index;

Once again, we can see that the index exists:

select index_name, table_name, uniqueness, status
from   user_indexes where table_name = 'TEST_IMPLICIT_INDEX';

INDEX_NAME   TABLE_NAME           UNIQUENESS STATUS  
------------ -------------------- ---------- --------
COL1_IMP_PK  TEST_IMPLICIT_INDEX  UNIQUE     VALID 

But what happens when we drop the constraint?

alter table test_implicit_index drop constraint col1_imp_pk;

col index_name format a12;
col table_name format a20;

select index_name, table_name, uniqueness, status
from   user_indexes where table_name = 'TEST_IMPLICIT_INDEX';

no rows selected.

The index is automatically dropped!

Again, all was well in the world – I no longer needed to explicitly drop the index first… great!

That is, until I encountered “ORA-00955: name is already used by an existing object” when trying to recreate the constraint with the extra columns. After researching, I discovered that although the constraint wasn’t present, the index still was. Oh no! Inconsistently created objects!

It was at this point, I remembered that Neil Chandler had written a blog entry about how to tell if the index is going to be dropped when dropping the corresponding constraint. Turns out, there isn’t an easy way – you have to use the sys.ind$ table which, of course, I don’t have access to!

“Ah well,” I thought, “I’ll simply have to code a bit of PL/SQL to drop the index if it exists,” something along the lines of:

begin
  execute immediate 'drop index col1_exp_pk';
exception
  when others then
    -- ignore only ORA-01418: specified index does not exist
    if sqlcode != -1418 then
      raise;
    end if;
end;
/

However, I wasn’t really happy with this solution; it seemed… awkward. After doing some additional research, it turns out that Oracle have catered for dropping the index at the same time as the constraint, as mentioned in the documentation!

So now, in my release scripts, instead of a clunky PL/SQL procedure, I can just encapsulate the index drop in one command:

create table test_explicit_index2 (col1 number,
                                   col2 varchar2(30),
                                   col3 date);

create unique index col1_exp2_pk on test_explicit_index2 (col1);

alter table test_explicit_index2 add constraint col1_exp2_pk primary key (col1);

-- **** Ensure the index is dropped along with the constraint: ****
alter table test_explicit_index2 drop constraint col1_exp2_pk drop index;

select index_name, table_name, uniqueness, status
from   user_indexes where table_name = 'TEST_EXPLICIT_INDEX2';

no rows selected.

Problem solved!

Moving from Procedural to Set-Based Thinking

Most people who are new to working with databases tend to think procedurally. That is: I have a table that I need to update, so:

  • Find the first row
  • Update the relevant columns with the relevant values
  • Find the next row
  • Update the relevant columns with the relevant values
  • Find the last row
  • Update the relevant columns with the relevant values

People who are familiar with databases and set-based thinking would simply do:

  • Update all the rows in the table, setting the relevant columns to the relevant values

I think you’ll agree that this set-based approach (i.e. taking the whole set of data and applying the changes in one go) is much neater than the procedural, step-by-step approach! It will also be faster, since you don’t have to spend time context switching between – in the case of Oracle – the PL/SQL and SQL “engines”.

Having helped out on the SQL and PL/SQL Oracle forum, I’ve seen lots of different people ask how to fix a syntax error in their horrendously procedural, row-by-row (aka slow-by-slow or, in SQL Server land, RBAR – row-by-agonising-row) code, only for people (myself included!) to say “don’t do that, do it like this: <set-based code>”.

I remember having difficulty with the concept of set-based when I first started working with PL/SQL way back when, so I thought I’d put together a bit of a guide as to how to spot when you can use set-based processing and, more importantly, how to convert from the procedural code.

Remember, as the great Tom Kyte says, “you should do [the work] in a single SQL statement if at all possible”.

Spotting Procedural Code

In PL/SQL, the number one flag for procedural code that can be converted to set-based code is: code looping through a cursor (yes, even if you’re taking advantage of bulk processing (BULK COLLECT/FORALL)!).

It’s not always possible to convert procedural code to set-based, of course – if the work you’re doing against each row is not DML on another table (e.g. you’re sending a mail, or writing out to a file), then there’s nothing you can do to make it set-based. (There are other options that may be available to you in this scenario, but that’s not the subject of this post.)

Ok, so I’ve found some procedural code… what next?

Let’s take an example that popped up recently on the forum. Rewriting to remove the poster’s syntax and logic errors, here’s the procedure they intended to end up with:

create or replace procedure valid_proc (start_date in date,
                                        end_date in date)  
is  
  cursor c1 is   
    select *
    from emp  
    where hiredate between start_date and end_date;
begin
  for c_row in c1
  loop
    insert into emp_back (empno,
                          ename,
                          job,
                          mgr,
                          hiredate,
                          sal,
                          comm,
                          deptno)  
    values (c_row.empno,
            c_row.ename,
            c_row.job,
            c_row.mgr,
            c_row.hiredate,
            c_row.sal,
            c_row.comm,
            c_row.deptno);  
  end loop;
  commit;
end;  
/

The poster started off by asking what was wrong with his code, and immediately, what leapt to my mind was: “You’re using row-by-row processing! Why not do it in a single insert?” If you look through the post to reply #16, you can see that BluShadow had the same thought and has provided the solution, which I present, slightly reformatted, here:

create or replace procedure valid_proc(start_date in date,
                                       end_date in date) as
begin
  insert into emp_back (empno,
                        ename,
                        job,
                        mgr,
                        hiredate,
                        sal,
                        comm,
                        deptno)
  select empno,
         ename,
         job,
         mgr,
         hiredate,
         sal,
         comm,
         deptno
  from  emp
  where  hiredate between start_date and end_date;

  commit; -- if appropriate to business/transaction logic
end; 
/

The steps to get from the original, row-by-row procedure to the set-based procedure are:

  1. Look at the cursor
  2. Look at the insert … values () statement
  3. Think “all the rows being selected in the cursor are being inserted by the insert statement – this calls for an insert-as-select statement!”
  4. Make sure the select statement in the cursor lists the columns being inserted by the insert statement
  5. Remove the values() clause of the insert statement and replace with the amended select statement from the cursor

For those who are new to set-based thinking, I believe that step 3 is the hardest to grasp – part of it is familiarity with INSERT / UPDATE / DELETE / MERGE statements, but part of it is experience (which following the SQL and PL/SQL forum will help you get!)

But my procedure is much more complicated than that! I laugh at your single loop and give you my code with nested cursor loops!

First off, congratulations for reinventing the wheel – you have succeeded in replicating the NESTED LOOP join. Did you really intend to do that? If so, you’ve successfully hobbled Oracle’s optimizer, to the possible detriment of your code (what if one of the other join methods, such as a hash join, would have been more efficient?). Whoops…!

Let’s look at a simple example of this type of processing:

declare 
  cursor c1 is select col_id, col_name from temp1; 
  cursor c2 is select col_id, col_name from temp2; 
begin 
  for i in c1 loop 
    for j in c2 loop 
      if i.col_name != j.col_name then 
        insert into temp1 values(j.col_id,j.col_name); 
      end if; 
    end loop; 
  end loop; 
  commit;
end;
/

The above code can be read as:

  1. For each row in cursor c1, find all rows in cursor c2 where the col_name in c1 doesn’t equal the col_name in c2
  2. Then insert the col_id and col_name from c2 into the temp1 table

Step 1 is basically describing a join between the two cursors, and we know that a join can be written in a single SQL statement:

select c1.col_id c1_col_id,
       c1.col_name c1_col_name,
       c2.col_id c2_col_id,
       c2.col_name c2_col_name
from   temp1 c1,
       temp2 c2
where  c1.col_name != c2.col_name;

Step 2 just inserts all the rows that we just selected, so the final statement becomes:

insert into temp1 (col_id, col_name)
select c2.col_id c2_col_id,
       c2.col_name c2_col_name
from   temp1 c1,
       temp2 c2
where  c1.col_name != c2.col_name;

If you need to, you can then slot the above statement into a PL/SQL procedure.

In reality, most of the code that I’ve seen involving nested cursor loops is not as simple as the above example, but with a bit of patience, it’s possible to untangle the nested cursors and, at the very least, turn it into a single cursor. If you can’t merge the cursor and the code that’s doing the work into a single SQL statement (e.g. sending a mail), at least you will have improved the performance by reducing the code to a single cursor loop!

Okay, so now I know a bit about refactoring existing procedures… what if I have to write my own set-based procedure from scratch?

The very first step is to try to build a SQL statement that matches the requirements. Once you have that (which you can build up in stages, depending on how complex the logic is), you can then plug it into an INSERT, UPDATE, MERGE or DELETE statement. You may have to modify the SELECT to fit the final statement, but once you have that base statement, you’re 90% of the way there.

The trick is to work out how to write that initial SELECT statement, which may involve aggregated queries, joins, analytic functions, etc. If you’re not very comfortable with SQL, then I suggest you invest some time in learning more about the various things you can do in SQL – it’s a very powerful language that will be faster at getting results from the database than anything you could write yourself in PL/SQL.
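As a sketch of that approach (reusing the emp/emp_back tables from earlier, with made-up matching logic): build the SELECT first, check its results, then wrap it – here in a MERGE:

merge into emp_back tgt
using (select empno, ename, job, mgr, hiredate, sal, comm, deptno
       from   emp
       where  hiredate between :start_date and :end_date) src
on (tgt.empno = src.empno)
when matched then
  update set tgt.sal  = src.sal,
             tgt.comm = src.comm
when not matched then
  insert (empno, ename, job, mgr, hiredate, sal, comm, deptno)
  values (src.empno, src.ename, src.job, src.mgr,
          src.hiredate, src.sal, src.comm, src.deptno);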

My Agenda for UKOUG 2012 Conference

It’s a bit late arriving, but this year I couldn’t just do a copy-and-paste job that would put it into a similar table format as last year’s, so I’ve had to edit it by hand. *yawn*

Anyway, here it is:

Sunday

OakTable registration opening times 11:00 – 12:00

12:15 – 13:00 8-bit Oak Roots
Bernie Dugggs
Imagine Software
Hall:
13:05 – 13:50 DUDE, Where’s My Other Data?
Kurt Van Meerbeeck
Consultant
Hall:
13:50 – 14:35 Late Lunch
14:35 – 15:20 Guiding Practices for Gathering Optimiser Statistics (or Not)
Martin Widlake
ORA600
Hall:
15:25 – 16:10 Secret Talk About Possible Future Release of Oracle Database
Christian Antognini
Trivadis AG
Hall:

Monday

Registration opening times 08:00 – 18:00

Exhibition opening times 11:15 – 19:30

09:15 – 09:30 Welcome and Introduction
09:30 – 10:20 Opening Oracle Keynote
Dermot O’Kelly & Dr. Andrew Sutherland
Oracle
Hall: 1
10:30 – 11:15 KEYNOTE: Oracle’s Latest Generation of Database Technology
Mr Tom Kyte, OakTable
Oracle
Hall: 1
11:15 – 11:50 Exhibition open/Coffee break
11:50 – 12:35 Creating Test Cases
Mr Jonathan Lewis, ACE Director & OakTable
JL Computer Consultancy
Hall: 5
12:45 – 13:45 SQL Tips, Techniques and Traps to Avoid
Mr Carl Dudley, ACE Director
University of Wolverhampton
Hall: 1
13:00 – 13:45 PL/SQL: Stop Making the Same Performance Mistakes
Dr Tim Hall, ACE Director & OakTable
ORACLE-BASE.com
Hall: 5
13:45 – 14:30 Lunch
14:30 – 15:30 Auditing Techniques for Oracle Database 11g
Mr Carl Dudley, ACE Director
University of Wolverhampton
Hall: 5
15:30 – 15:50 Coffee break
15:50 – 16:35 How to Create in 5 Minutes a SQL Tuning Test Case Using SQLTXPLAIN
Mr Carlos Sierra
Oracle
Hall: 1
16:45 – 17:30 Shareable Cursors
Mr Christian Antognini OakTable
Bayvale Consulting Services Ltd
Hall: 1

Exhibition Welcome Drinks 18:30 – 19:30

SIG Focus Pubs: 19:30 – 22:30

Tuesday

Registration opening times 08:00 – 18:30

Exhibition opening times 08:00 – 17:00

09:00 – 09:45 “To Iterate is Human, to Recurse Divine”
Mr Ben Burrell
Hansard Global
Hall: 5
09:55 – 10:55 Oracle Optimizer: Harnessing The Power of Optimizer Hints
Ms Maria Colgan, OakTable
Oracle
Hall: 1
09:55 – 10:40 My Year With Exadata
Dr Jason Arneil
e-DBA
Hall:
10:55 – 11:15 Coffee break
11:15 – 12:00 A Deep Dive into the SQL Monitor Report
Mr Graham Wood
Oracle
Hall: 5
12:10 – 13:10 Controlling Execution Plans (Without Touching the Code)
Mr Kerry Osborne, ACE Director & OakTable
Enkitec
Hall: 1
13:10 – 13:55 Lunch
13:55 – 14:40 How the Query Optimizer Learns from its Mistakes
Christian Antognini
Trivadis AG
Hall: 8b
13:55 – 14:40 5 SQL and PL/SQL Things in the Latest Generation of Database Technology
Mr Tom Kyte, OakTable
Oracle
Hall: 1
14:40 – 15:10 Coffee break
15:10 – 15:55 6-Node Active-Active Oracle GoldenGate: Experiences and Lessons Learned
Mr Luke Davies
Pythian
Hall: 9
16:05 – 17:05 Cardinality Feedback
Mr Peter Brink
Credit Suisse
Hall: 5
16:05 – 17:05 What’s New in Security in the Latest Generation of Database Technology
Mr Tom Kyte, OakTable
Oracle
Hall: 1

Evening social – ‘Las Vegas Lite’: 18:30 – 19:00

Evening social – ‘Viva Las Vegas!’: 19:00 – late

Wednesday

Registration opening times 08:30 – 14:00

Exhibition opening times 08:30 – 15:30

09:55 – 10:55 Oracle Optimizer: An Insider’s View of How the Oracle Optimizer Works
Ms Maria Colgan, OakTable
Oracle
Hall: 5
10:55 – 11:15 Coffee break
11:15 – 12:00 The MODEL Clause Explained
Mr Tony Hasler
Anvil Computer Services
Hall: 5
12:10 – 13:10 PGA Memory Management Revisited
Mr Joze Senegacnik, ACE Director & OakTable
DbProf d.o.o.
Hall: 7b
13:10 – 13:55 Lunch
13:55 – 14:55 Exadata and the Oracle Optimizer: The Untold Story
Ms Maria Colgan, OakTable
Oracle
Hall: 9
14:55 – 15:15 Coffee break
15:15 – 16:00 Troubleshooting the Most Complex Performance Issues I’ve Seen
Mr Tanel Poder, ACE Director & OakTable
Enkitec
Hall: 5

There are some tough clashes in there! Those UKOUG schedulers sure know how to torture people! *shakes fist* *{;-)

The magic of bad design

The other day, I was investigating an issue in our process that takes a list of users and (effectively) merges them into the database. As part of that, I was trying to understand how the whole process worked.

This is the basic structure of the table:

Name          Null?    Type                        
------------- -------- ----------------------------
ID            NOT NULL NUMBER(12)                  
LOGIN_NAME             VARCHAR2(25)                
CREATED                DATE                        
LAST_UPDATED           DATE        

I could see data in the table:

   ID LOGIN_NAME CREATED             LAST_UPDATED       
----- ---------- ------------------- -------------------
 1451 user_1451  06/08/2008 00:10:54 10/10/2008 01:01:34
 1452 user_1452  07/08/2008 00:11:18 23/02/2009 15:23:53
 1453 user_1453  12/08/2008 00:09:51 04/07/2012 03:09:08
 1454 user_1454  14/08/2008 00:09:58 10/06/2009 05:28:24

And this is the code that actually inserts the user information (I’ve removed the updates from this, as they’re not relevant to my mystery!):

PROCEDURE SynchroniseUsers (pUserList user_info_tab)
IS
BEGIN
  INSERT INTO USERS (ID, LOGIN_NAME)
  SELECT ID_SEQ.Nextval, theUsers.column_value
  FROM   THE (SELECT CAST(pUserList AS VARCHARTAB) FROM dual) theUsers
  WHERE  NOT EXISTS (SELECT /*+ index(U USERS_LOGIN) */ 1
                     FROM USERS U
                     WHERE U.login_name = theUsers.column_value);
END;
/

“Hmm,” I thought, “I can’t see where the CREATED or LAST_UPDATED columns are being updated.” So, I hunted through the rest of the code in the package – there was an UPDATE of the LAST_UPDATED column elsewhere, so I wasn’t concerned about that, but I could not find any mention of where the CREATED column was being updated.

“Aha! Must be in a trigger!” … alas, no.

I was absolutely stumped; by this point I’d spent the best part of 30 minutes trying to work out where this column was being set (the code involved is not the best code on the planet, as I’m sure you can judge from the above procedure!). Eventually, after staring at it some more and scratching my head, it hit me: default column values. *smacks forehead*

Sure enough, when I go into the table structure:

CREATE TABLE USERS
(
  ID NUMBER(12) NOT NULL,
  LOGIN_NAME VARCHAR2(25 BYTE),
  CREATED DATE DEFAULT SYSDATE, -- here is the culprit!
  LAST_UPDATED DATE
);

If whoever had designed this pile of… code had included the CREATED column in the INSERT statement in the first place, I wouldn’t have had to waste my time trying to work out what the heck was going on! That is why you should always be explicit in your coding habits – sure, have default values so that things don’t slip through the cracks, but make sure the table is well designed and that any code doing inserts, updates, etc has ALL the necessary columns included.
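In sketch form, all the original insert needed was to mention the column explicitly:

-- Explicit is better: the column default still acts as a safety net,
-- but the next maintainer can see exactly where CREATED comes from
INSERT INTO USERS (ID, LOGIN_NAME, CREATED)
VALUES (ID_SEQ.NEXTVAL, 'some_user', SYSDATE);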

Don’t rely on the side effects of doing things automagically, it just leads to maintenance headaches!

Dev or DBA? Where does the responsibility lie…

… which could be the title of a really great post about a big prod issue debriefing, but sadly, all you’re getting is this post instead!

At the UKOUG 2011 conference, I attended a presentation by Michael Salt about indexes. During the course of the presentation, he said something along the lines of “DBAs should be responsible for indexes (including creating them)” (that’s not a direct quote; I can’t remember exactly what he said, unfortunately!). I more or less recoiled, because as a database developer, I would expect to be the person worrying about whether I should be creating an index on a column (or columns), as part of the design/coding phase. Or even as part of the “why is this query going soooo sloooooooowly?!” prod issue investigation phase.

When I mentioned this on Twitter and Facebook, one of my friends said it was because I was a control freak. He’s not wrong – I am a control freak – but I think that’s irrelevant in this case; the time for thinking about index creation is at design phase, and that (at least, to my mind) is firmly in the developer’s area.

Don’t get me wrong; DBAs are fantastic, knowledgeable people (I have to say that – I know far more DBAs than Oracle devs! Anyway, it’s true, at least for most of the DBAs I know) and I’m more than willing to take up my DBA’s advice whenever he’s got some to dispense, but DBAs are always busy looking after far more systems than I work on, so I try hard not to take up more of their time than I have to.

It seems that I’m in a minority over where the dev responsibility ends and the DBA’s starts – things like schema / app design, index creation, SQL optimisation etc are dev activities, whereas things like backups, restores, db creations, GoldenGate setup and tinkering, db tuning, giving advice on poorly performing areas of the code, etc etc are more DBA concerns.

Is it just me? Who should deal with thinking about indexes? Devs or DBAs? What are your thoughts?

UKOUG 2011 – Social stuff

I go to the UKOUG conferences to learn stuff from the presentations, but I also go to meet people. If I didn’t really know anyone, I wouldn’t go! (As evidenced by the fact that I didn’t go until 2 years ago, which was not long after I’d met Doug Burns and let him talk me into going! Doug knows *everyone*, I swear!).

Here are the Social highlights of this year’s conference for me (I could detail every meeting, but even I would get bored of reading all that! Needless to say, everyone I met was brilliant, and the discussions etc that we had were entertaining.):

People I met/met up with during the conference (in no particular order!):

  • David (@Farkough on Twitter), who is an old friend (and fellow spod!(*))
  • Martin Widlake (@MDWidlake on Twitter). A fellow Oracle London Beer conspirator
  • Neil Chandler (@ChandlerDBA on Twitter). A fellow Oracle London Beer conspirator
  • Doug Burns (@orcldoug on Twitter). Yet another fellow Oracle London Beer conspirator (they get around, right?!)
  • David Kurtz (@davidmkurtz on Twitter).
  • Niall Litchfield (@nlitchfield on Twitter). An occasional Oracle London Beer attendee
  • Patrick Hurley (@phurley on Twitter).
  • Piet de Visser (@pdevisser on Twitter).
  • Tony Sleight (? I think that was his surname!)
  • “Big” Dave Roberts (blog).
  • Karen, Tony, Kalyan, and Aftab (? I think I’ve probably got the wrong name here; hope somebody will correct me!) – ex-colleagues of Martin Widlake’s.
  • Jonathan Lewis (blog). Another occasional Oracle London Beer attendee.
  • Connor McDonald (@connor_mc_d on Twitter).
  • Kyle Hailey (@dboptimizer on Twitter).
  • Marco Gralike (@MGralike on Twitter).
  • Alex Gorbachev (@alexgorbachev on Twitter).
  • Debra Lilley (@debralilley on Twitter).
  • Lisa Dobson (@lj_dobson on Twitter).
  • Alex Nuijten (@alexnuijten on Twitter).
  • Tim Hall (@oraclebase on Twitter).
  • Ben Burrell, aka Munky (@munkyben on Twitter).
  • Martin Bach (@MartinDBA on Twitter).

(If I’ve forgotten to add anyone, it’ll be my shoddy memory, and not down to them!)

Highlights

  • Going out for a meal on the Monday night with David (@Farkough), Martin W and Neil, and having a laugh
  • Going on to the Tap and Spile following the focus pubs on the Monday night with Niall, Martin, Neil, Dave R and Tony, with Munky popping his head in occasionally. I shall never look at carrots in the same way again!
  • Going out for a meal on the Tuesday night with Martin W, Dave R and Martin W’s ex-colleagues, then heading off to the masked ball, and bumping into Connor, Kyle, Doug, Niall and Tony amongst others
  • Ending up at the Jury’s bar until the wee small hours of Monday evening, and talking lots of random stuff with various people, including particle physics, and also getting told off by Marco about my shoddy XML db design *{;-)
  • The Oak Talks, whilst not strictly a social thing, were fun and entertaining. I hope they have them again next year!

(*) We’re both members of the same BBS: http://www.mono.org/ – still accessible via telnet!)

UKOUG2011 – Day 3 (Wednesday)

After a late night (um, < 4 hours sleep...) I was up bright and... well, just early, now I come to think of it! I checked out, retrieved the coat (phew!) I hadn't realised I'd left in the hotel bar until just before I was about to leave the hotel room for the final time, and had breakfast. Then it was off to the first presentation of the day...

Statistics On Partitioned Objects – Doug Burns

I have a feeling that I’ve seen this presentation before – it’s possible that I haven’t seen the entire presentation, but I believe Doug did a couple of work presentations around this subject. Anyway, whether it was those or his blog entries on the subject, a lot of it felt familiar. Doug is an engaging speaker, and the subject is an interesting one. I’m not sure whether it’s applicable to the way our databases are designed, but I will be recalling this talk (amongst others) when I think of ways to improve our stats gathering when we go to 11g.

Using GoldenGate To Minimize Database Upgrade Risk – Marc Fielding

I didn’t fully understand what this talk was going to be about when I signed up for it, but having just worked with and upgraded GoldenGate following a bug in the version we use, plus the upcoming 11g upgrade, I thought it would be a good thing to go to.

Ultimately, I don’t think it’s applicable in our situation this time, but it was about using GoldenGate to replicate data changes from, for example, a 10g database to the same database running under 11g. Potentially very useful, so a good idea to have in the back of my mind, should I ever come across a situation where it could be used.

The talk was aimed more at DBAs than devs, I think, but I was pleased that I understood most of the GG terms he used – our recent issues at least consolidated my GG knowledge, so it wasn’t all bad!

Optimizer Statistics – A Fresh Approach – Connor McDonald

Yes, the keener eyed/memoried amongst you will spot that I attended this on Sunday. Well, I did have another session lined up to go to, but by this time, my brain couldn’t handle much! Connor was just as amazing as he was on Sunday (no surprise there, then!), and I was able to just sit back and be entertained! (Thanks Connor!)

Faster, Leaner, More Efficient Databases – Index Organised Tables – Martin Widlake

Martin has done a series of blog entries on IOTs and this proved to be an interesting talk. I’ve used IOTs in the past, and will perhaps again, now that we’ve upgraded GG to the latest version (the version we had previously couldn’t handle IOTs; I’m not sure whether the new one does, but I’d hope so!).

The performance stuff was especially interesting, so I’ll have to go away and see how it might be applied in our databases. A good talk to have gone to.

What Shape Is Your Data? – Niall Litchfield

This was the last talk of the conference, and it was interesting. Niall took us through some examples of how data might be organised and what can be done to best optimize it.

Truth be told, I was flagging a little by this point, so didn’t retain as much of the talk as I’d have liked – I’ll definitely be downloading the slides, when they appear!!

UKOUG2011 – Day 2 (Tuesday)

So, Tuesday dawned bright and … er, well I didn’t go in for the first session of the day (not that I’d found one I wanted to go in for anyway, so nothing lost there!) but I was definitely there for the second session!

Who’s Afraid Of Analytic Functions? – Alex Nuijten

Alex is someone I’ve known from the OTN SQL & PL/SQL forum for quite some time now, although I’ve only met him in person a couple of times. He’s a massive proponent of analytic functions (as am I – seriously, they *rock*!) and I couldn’t miss his analytics talk. Sorry Niall – I would have loved to have gone to yours too, but this one just pipped it!

This was one of the few sessions that I came out of thinking that maybe I wasn’t actually so dumb after all! As with everything, there are lots of ways of doing the same thing, and although I’m fairly comfortable with analytic functions, there are still ways that I approach things that aren’t the most efficient. Some of Alex’s examples highlighted that for me!

If you aren’t familiar or comfortable with analytic functions yet, I would highly recommend you go see one of Alex’s talks on the subject!

A Year In Purgatory – Diary Of An 11.2 Upgrade – Connor McDonald

This was a presentation that very nearly wasn’t given! Debra Lilley had seen it when Connor did it in Perth, and was determined to get him to do it at UKOUG. Given that he’d already used up both slots for other presentations, this one was going to be saved for next year but, due to someone else having to pull out at the last minute, Debra got her wish!

All I can say about this is poor, poor Connor and his team (I assume there was a team?!). He took us through all the mishaps that occurred in the past year or so when upgrading to an 11g RAC environment. As per his normal presentation style there was plenty of humour and entertainment, but I’m amazed that he’s got any hair left after the nightmare he described!

Troubleshooting The Most Complex Oracle Performance Problem I’ve Ever Seen – Tanel Poder

This was an interesting talk in that for most of it, I was scratching my head as to why it was an Oracle problem! It turned out to be a problem that Doug Burns had encountered (sorry, no spoilers here!), and whilst I basically followed things, I think it was more aimed at a DBA than someone like me. Still, it’s interesting to pick up little tips and tricks on the O/S side of things. And of course, the more I know about the underneath areas of Oracle, the easier it is to talk to my DBAs!

It’s All In The Index – Michael Salt

Poor Michael only just made his presentation, by the sounds of it – massive train problems! However, he made it and produced a useful talk on indexes and how they can be used to improve the performance of poorly running SQL statements.

I didn’t always agree with what Michael said (add an index to aid queries on tables where the data is stored in the wrong datatype? Fix the table/data, not mitigate the symptoms! Meanwhile, back in the real world, I know that’s not always possible!), but he did throw up an interesting off-the-cuff remark that completely stopped me in my tracks! I’ll be blogging on this shortly!

Beating The Oracle Optimizer – Jonathan Lewis

This was a tour of how to optimise a two-table query in 7 different ways – some of which I would never have come up with in a million years! An excellent presentation, as always.

This also happened to be the one that Jonathan asked me to be his alarm clock! I was so worried about missing the 10min / 5min warnings that he’d requested, I was checking the time every other minute or so throughout his presentation… only for him to finish 3 minutes before I was due to give him his 10 minute warning! By this time, I was panicking that the time on both my watch and my mobile phone was wrong or something, whereas Jonathan had apparently been thinking that I’d be waving at him any second for quite a while! Hehehe; I got all worked up for no reason!

Challenges And Chances Of The 11g Query Optimizer – Christian Antognini

This was a bit of a whistlestop tour through the 11g Query Optimizer changes and what we could expect to see once we’ve upgraded to 11g. Some of these I’d already seen covered in earlier presentations, such as SQL Plan Management, but others I was completely unaware of! A useful session to go to if, like me, you’re contemplating upgrading to 11g.

UKOUG2011 – Day 1 (Monday)

Opening Keynote – Mark Sunday

Before Mark came on, we had Debra Lilley and Lisa Dobson give a short talk about the changes in structure in the UKOUG organisation, which was interesting – I wasn’t aware that there’d been such big changes.

Mark came on and gave a talk about how fab Oracle was, and … well, that was my take-home message *{;-) I suspect I may not have been part of the ideal target audience!

SQL Tuning – Kyle Hailey

I’ve seen a couple of presentations in previous UKOUG conferences that were similar to what Kyle presented here – one by Jonathan Lewis on diagramming SQL statements and one by Wolfgang Breitling on tuning SQL by cardinality. What I particularly liked about this one is that Kyle used a program to do the diagramming – it can be done by hand, but boy, do I now want that program! I’ll have to research it and see if I can navigate my work’s purchasing maze to get it, assuming we don’t already have it!

I also need to get Dan Tow’s book, which Kyle had based his talk upon.

Partitioning 101 – Connor McDonald

Yes, another entertaining Connor show! This time, all about partitions. I knew most of the pre-11g stuff, but not too much about the new 11g features (I’d heard about the new interval partitions, but not the rest). Connor took us through the different types of partitions and gotchas, tips etc. I shall definitely be revisiting partitions when we go to 11g!

Oracle Optimizer – Upgrading to Oracle Database 11g Without Pain – Maria Colgan The Optimizer Lady

Maria took us through the steps required to successfully upgrade to 11g with minimal pain due to the optimizer changes. A lot of this revolved around SQL Plan Baselines, which sound useful, but I’m not sure I fully “got” it, or how it could apply in our 11g migrations. Yet more things I’ll be adding into the “Things To Consider” pile for our migration!

With all the steps Maria was covering, I was beginning to panic about our migration, until right at the very end, someone asked a question of the rest of the audience: “Has anyone had a bad upgrade to 11g?” which got a tumbleweed! Someone replied and said it had gone smoothly for them, and another response was that they’d not heard of a bad upgrade, so that somewhat put my mind at rest!
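Purely for my own notes, the bare bones of SQL Plan Baseline capture look something like this (a sketch cobbled together from the docs, not something I’ve been able to test yet, so treat with caution):

```sql
-- Capture plans automatically as statements are executed (11g)
ALTER SESSION SET optimizer_capture_sql_plan_baselines = TRUE;

-- ... run the critical statements here, then switch capture back off ...
ALTER SESSION SET optimizer_capture_sql_plan_baselines = FALSE;

-- Or load the plan for one specific statement from the cursor cache
DECLARE
  l_plans PLS_INTEGER;
BEGIN
  l_plans := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(sql_id => '&sql_id');
END;
/

-- Check what has been captured
SELECT sql_handle, plan_name, enabled, accepted
FROM   dba_sql_plan_baselines;
```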

Database I/O Performance Measuring and Planning – Alex Gorbachev

I suspect this was aimed mostly towards DBAs, but it sounded interesting, so I thought I’d go along anyway. Some of the technicalities went over my head, but it was useful – I can at least raise the discussion of I/O capacity planning with my DBAs and at least have some vague understanding of what I’m talking about! *{:-)

Performance and Stability with Oracle 11g SQL Plan Management – Doug Burns

Doug went into SQL Plan Management a bit more in depth than Maria – whom he insisted on calling “The Optimizer Lady” throughout his talk! – did, but it wasn’t too in-depth, which was perfect for the time of day (being the last presentation before the final keynote of a very long and full day), and it also served as excellent consolidation for Maria’s earlier presentation.

UKOUG2011 – Day 0 (Oak Table Sunday)

First, a note about how I’m going to write up this year’s UKOUG Conference. There was a looooot happening this conference, both socially and learning-wise, so I’m going to break things up – first, I’ll write up my thoughts on the presentations (over several blog entries!) and then I’ll write up my networking/social/fun thoughts.

So anyway, this year, UKOUG and the Oak Table guys got together and put on an extra day’s worth of sessions (which were repeated later on in the conference). Despite National Express East Anglia/Network Rail’s best efforts (Cambridge train station was closed!) I managed to get to Birmingham in time for the start of OakTable Sunday – mainly due to the fact that my other half very kindly drove me to Milton Keynes so that I could catch the train from there! This had the bonus side effect of me not having to get up early enough to catch the coach from Cambridge station at 5:50am!

Anyway, on with the presentation write-ups:

Tales From The OakTable – Mogens Nørgaard

I’ve never seen Mogens before, though I’ve heard plenty about him, and it was somewhat of a shock! He was incredibly laid back, and started off by saying that we were probably not going to learn anything over the conference, or at most, about 5% would stick with us after the conference.

He treated us to a few stories about the OakTable and its members, encouraging stories from Oakies in the audience too. I wasn’t quite sure what to make of this, really, as it was a very strange talk! Perhaps if I’d seen Mogens present before, then it wouldn’t have been a surprise!

About Multiblock Reads – Frits Hoogland

This was about the changes to multiblock reads in 11g, and how you cannot rely on scattered reads anymore – sometimes it will choose to do direct path reads, which means that you lose the advantages of query results being cached. This could be a good thing or not, but it was something I was unaware of (even though Doug has apparently already blogged about it!). Given that we will hopefully be migrating our 10g databases to 11g next year, I’m very glad I went to this session!

Optimizer Statistics – A Fresh Approach – Connor McDonald

Having seen Connor present at the 2009 UKOUG conference, I was determined that I was going to see as many of his presentations as I could. Thankfully, he presents on topics that are relevant to me, but even if he didn’t, I’d go along and see him in action! Honestly, if you’ve never seen him present, should you ever get the chance to do so, do! He is *a* *maze* *ing*!

Connor talked about statistics, both object and database stats – how you should gather them, and so on. There were some ideas in there that I’d never considered before (pending stats), but I’ll definitely go back to work and think about our stats strategies etc!

Statistics Gathering and Histograms – DOs and DON’Ts – Wolfgang Breitling

Following on from Connor’s presentation was Wolfgang’s. He went into more detail on histograms and was of the same opinion as Connor – don’t use histograms if you don’t need to; set them explicitly. Another talk I shall consider when looking at our stats gathering!
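If I’ve understood the advice correctly, applying it would look something like this (my own sketch with a made-up table name, not anything from the slides):

```sql
-- My reading of the "no histograms unless you need them" advice:
-- SIZE 1 gathers basic column stats but no histograms.
-- (MY_TABLE is a hypothetical name, obviously!)
begin
  dbms_stats.gather_table_stats(
    ownname    => user,
    tabname    => 'MY_TABLE',
    method_opt => 'FOR ALL COLUMNS SIZE 1');
end;
/
```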

Performance and High Availability Panel Session

Finally, the OakTable Sunday was topped off by a panel session of various OakTable members who took questions from the audience and wanted to know about the longest prod outage that people were brave enough to own up to! Thankfully, I’ve never caused a prod outage, but someone had had an overnight outage and someone else had had 18 hours although I don’t think it was caused by them. Connor won on the monetary outage front – he managed to hose a 10 million Aussie dollar truck through a bit of dodgy code!

Writing “complex” SQL – a how to, pt1b

As a corollary to the previous entry in this series, we can take the question further.

When I first heard the requirement:

I need to display the numbers 1-12 in 3 rows of four columns, like so:

1 2 3 4
5 6 7 8
9 10 11 12

my first reaction was “really? Just the numbers 1 to 12? You won’t be given other numbers or values to display in that particular way?”

The answer to that was “no”, so the solution I gave in the last post did the trick.

What if the answer had been “yes – I need to display <some other list of twelve bits of info> in that format”?

Thankfully, we’ve already done most of the work – we know how to display twelve things in 3 rows of 4 columns. The trick with the new requirement is identifying which values go in which place.

We’d do that by artificially labelling the rows with a number from 1 – 12.

Let’s take our sample data as:

with sample_data as (select 12 col1 from dual union all
                     select -1 col1 from dual union all
                     select 3 col1 from dual union all
                     select 13 col1 from dual union all
                     select -20 col1 from dual union all
                     select 31 col1 from dual union all
                     select 0 col1 from dual union all
                     select -9 col1 from dual union all
                     select -2 col1 from dual union all
                     select 18 col1 from dual union all
                     select 7 col1 from dual union all
                     select 4 col1 from dual)
select col1
from   sample_data
order by col1;

which gives us:

      COL1
----------
       -20
        -9
        -2
        -1
         0
         3
         4
         7
        12
        13
        18
        31

Reaching into our SQL toolbox, we’ll use the ROW_NUMBER() analytic function to assign a number for each row:

with sample_data as (select 12 col1 from dual union all
                     select -1 col1 from dual union all
                     select 3 col1 from dual union all
                     select 13 col1 from dual union all
                     select -20 col1 from dual union all
                     select 31 col1 from dual union all
                     select 0 col1 from dual union all
                     select -9 col1 from dual union all
                     select -2 col1 from dual union all
                     select 18 col1 from dual union all
                     select 7 col1 from dual union all
                     select 4 col1 from dual)
select col1,
       row_number() over (order by col1) rn
from   sample_data
order by col1;

which gives us:

      COL1         RN
---------- ----------
       -20          1
        -9          2
        -2          3
        -1          4
         0          5
         3          6
         4          7
         7          8
        12          9
        13         10
        18         11
        31         12

Aha, now we’re back in familiar territory – we’re back to the numbers 1-12 again!

So, amending the solution from the previous post slightly, we can add it in to give us the results we want:

with sample_data as (select 12 col1 from dual union all
                     select -1 col1 from dual union all
                     select 3 col1 from dual union all
                     select 13 col1 from dual union all
                     select -20 col1 from dual union all
                     select 31 col1 from dual union all
                     select 0 col1 from dual union all
                     select -9 col1 from dual union all
                     select -2 col1 from dual union all
                     select 18 col1 from dual union all
                     select 7 col1 from dual union all
                     select 4 col1 from dual),
         results as (select col1,
                            ceil(row_number() over (order by col1)/4) grp,
                            mod(row_number() over (order by col1), 4) col_pos
                     from   sample_data)
select max(case when col_pos = 1 then col1 end) col1,
       max(case when col_pos = 2 then col1 end) col2,
       max(case when col_pos = 3 then col1 end) col3,
       max(case when col_pos = 0 then col1 end) col4
from   results
group by grp
order by grp;

giving us:

      COL1       COL2       COL3       COL4
---------- ---------- ---------- ----------
       -20         -9         -2         -1
         0          3          4          7
        12         13         18         31

If you had to adjust the format of the columns – e.g. you wanted the results in 4 rows of 3 columns, or 6 rows of 2 columns, etc. – then you’d have to amend the number you’re using in the ceil and mod functions in the “results” subquery to reflect the number of columns you’re expecting.

Then, in the main query, you’d have to extend or reduce the number of columns accordingly.

I haven’t restricted the number of rows to 12 in my queries above, because my sample data had at most 12 rows. If your requirement was to only display the first (or last!) 12 values of a larger set of data, you’d have to restrict the rows accordingly. I’d simply expose the row_number() as a column in the “results” subquery and call it something like “rn”, then I’d use “and rn <= 12” in the main query.
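For example, here’s what I think the 4-rows-of-3-columns variant would look like, with the rn filter thrown in for good measure (same sample data, just 3s where the 4s were):

```sql
with sample_data as (select 12 col1 from dual union all
                     select -1 col1 from dual union all
                     select 3 col1 from dual union all
                     select 13 col1 from dual union all
                     select -20 col1 from dual union all
                     select 31 col1 from dual union all
                     select 0 col1 from dual union all
                     select -9 col1 from dual union all
                     select -2 col1 from dual union all
                     select 18 col1 from dual union all
                     select 7 col1 from dual union all
                     select 4 col1 from dual),
         results as (select col1,
                            row_number() over (order by col1) rn
                     from   sample_data)
select max(case when mod(rn, 3) = 1 then col1 end) col1,
       max(case when mod(rn, 3) = 2 then col1 end) col2,
       max(case when mod(rn, 3) = 0 then col1 end) col3
from   results
where  rn <= 12  -- only matters if the data could have more than 12 rows
group by ceil(rn/3)
order by ceil(rn/3);
```

which should give us:

      COL1       COL2       COL3
---------- ---------- ----------
       -20         -9         -2
        -1          0          3
         4          7         12
        13         18         31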

My UKOUG 2011 Conference Agenda

Sunday

OakTable registration opening times 11:00 – 12:00

Exhibition registration opening times – 15:00 – 18:00

12:15 – 13:15 OakTable Sunday – Tales From The OakTable
Mr Mogens Nørgaard, ACE Director
Miracle AS
Hall:
14:35 – 15:20 OakTable Sunday – Join Techniques
Mr Christian Antognini, OakTable
Trivadis AG
Hall:
15:30 – 16:15 OakTable Sunday – Statistics Gathering and Histograms – DOs and DON'Ts
Mr Wolfgang Breitling, OakTable
Centrex Consulting Corporation
Hall:
16:15 – 16:40 Tea break
16:40 – 17:30 Performance & High Availability Panel Session

Monday

Registration opening times 08:00 – 20:00

Exhibition opening times 10:00 – 20:45

09:00 – 09:20 UKOUG Introduction
09:20 – 09:50 Opening Keynote

Mark Sunday, Oracle

10:00 – 10:45 SQL Tuning
Mr Kyle Hailey, ACE
Delphix
Hall:
10:45 – 11:05 Coffee break
11:05 – 12:05 Partitioning 101
Mr Connor McDonald, ACE & OakTable
Independent
Hall:
12:15 – 13:15 Oracle Optimizer – Upgrading to Oracle Database 11g without Pain
Ms Maria Colgan
Oracle
Hall:
13:15 – 14:30 Lunch
14:30 – 15:15 High Availability Options for Modern Oracle Infrastructures
Mr Simon Haslam, ACE Director
Veriton Limited
Hall:
15:25 – 16:10 Implementing Effective Oracle Database Auditing in 3 Steps
Ms Tammy Bednar
Oracle
Hall:
16:35 – 17:35 Database I/O Performance: Measuring and Planning
Mr Alex Gorbachev, ACE & OakTable
Pythian
Hall:
17:45 – 18:30 Performance and Stability with Oracle 11g SQL Plan Management
Mr Doug Burns, ACE Director & OakTable
Bayvale Consulting Services Ltd
Hall:
18:45 – 19:45 Learning about Life through Business and Software

Cary Millsap, Method R Corporation

Exhibition Welcome Drinks 19:45 – 20:45

Focus Pubs and Inspiring Presentation Awards 20:45 – late

Tuesday

Registration opening times 08:00 – 20:00

Exhibition opening times 08:00 – 18:30

10:10 – 10:55 First Things First: Getting the basics right
Mr Niall Litchfield
Maxima Managed Services
Hall:
10:10 – 10:55 Who’s afraid of Analytic Functions?
Mr Alex Nuijten, ACE Director
AMIS
Hall:
10:55 – 11:15 Coffee break
11:15 – 12:15 Analysing Your Data with Analytic Functions
Mr Carl Dudley, ACE
University of Wolverhampton
Hall:
11:15 – 12:15 Execution Plan Interpretation
Mr Joze Senegacnik, ACE Director & OakTable
DbProf d.o.o.
Hall:
12:25 – 13:25 Oracle Optimizer – Best Practices for Managing Optimizer Statistics
Ms Maria Colgan
Oracle
Hall:
12:25 – 13:25 Tuning, Refactoring, and Instrumenting Code for Developers
Mr Barry McGillin
Oracle
Hall:
12:25 – 13:25 Troubleshooting the Most Complex Oracle Performance Problem I've Ever Seen
Mr Tanel Poder, ACE Director
Enkitec Europe
Hall:
13:25 – 14:40 Lunch
14:40 – 15:25 ASH Outliers: Detecting Unusual Events in Active Session History
Mr John Beresniewicz, OakTable
Oracle
Hall:
15:35 – 16:20 Can people be identified in the database?
Mr Pete Finnigan, OakTable
PeteFinnigan.com Ltd
Hall:
16:20 – 16:40 Tea break
16:40 – 17:40 Beating the Oracle Optimizer
Mr Jonathan Lewis, ACE Director & OakTable
JL Computer Consultancy
Hall:
17:50 – 18:35 Challenges and Chances of the 11g Query Optimizer
Mr Christian Antognini, OakTable
Trivadis AG
Hall:

Evening social and networking 20:00 – late

Wednesday

Registration opening times 08:30 – 13:30

Exhibition opening times 08:30 – 14:30

09:00 – 10:00 Statistics on Partitioned Objects
Mr Doug Burns, ACE Director & OakTable
Bayvale Consulting Services Ltd
Hall:
10:10 – 10:55 Using Oracle GoldenGate to Minimize Database Upgrade Risk
Mr Marc Fielding
Pythian
Hall:
11:20 – 12:20 Latch Internals in 2011
Mr Andrey Nikolaev
RDTeX
Hall:
12:30 – 13:30 Avoiding Execution Plan instability with SQL Plan Baselines
Mr Chris Dunscombe
Christallize Ltd
Hall:
14:25 – 15:10 Faster, Leaner, More Efficient databases – Index Organised Tables
Mr Martin Widlake, OakTable
ORA600
Hall:
14:25 – 15:10 Six Easy Pieces: Essentials of Database Tuning for Packaged Applications
Mr Mike Swing
TruTek
Hall:
15:20 – 16:05 Is This The Future of Load and Performance Testing?
Mr Chris Jones
Oracle
Hall:
16:15 – 17:00 What shape is your data?
Mr Niall Litchfield
Maxima Managed Services
Hall:

Writing “complex” SQL – a how to, pt1

Not so long ago, someone on our work Oracle IM channel asked a question, and then several people were amazed at the solution provided. I’m not sure why; it doesn’t seem so difficult to me, but it strikes me that the practice I’ve had via the SQL & PL/SQL OTN Forum probably has a lot to do with that.

So, anyway, I thought I would start a little series on writing complicated SQL, taking you through the steps I would typically take in order to solve a requirement.

I’ll start with that simple request from my colleague:

I need to display the numbers 1-12 in 3 rows of four columns, like so:

1 2 3 4
5 6 7 8
9 10 11 12

Looking in my SQL toolbox, the very first thing I reach for is how to create the list of numbers from 1 to 12:

with nums as (select level rn
              from   dual
              connect by level <= 12)
select *
from   nums;

which gives us:

   RN
-----
    1
    2
    3
    4
    5
    6
    7
    8
    9
   10
   11
   12

Next thing, I know I need to pivot the results (I should add that I don't have 11g yet, so can't use the nice new functionality!), so now I know I need two things:

  • how to identify the row a number belongs to
  • how to identify the column a number belongs to

Let's take the first item - identifying the row each number belongs to. We know that the numbers 1-4 will go on the first row, 5-8 on the second and 9-12 on the third. I can achieve that by dividing each number by 4 and then taking the ceiling value of it, like so:

with nums as (select level rn, ceil(level/4) grp
              from   dual
              connect by level <= 12)
select *
from   nums;

which gives us:

   RN        GRP
----- ----------
    1          1
    2          1
    3          1
    4          1
    5          2
    6          2
    7          2
    8          2
    9          3
   10          3
   11          3
   12          3

Next, I need to work out the column position each number belongs in - that's where the mod() function comes in handy!

with nums as (select level rn, ceil(level/4) grp, mod(level, 4) col_pos
              from   dual
              connect by level <= 12)
select *
from   nums;

which gives us:

   RN        GRP    COL_POS
----- ---------- ----------
    1          1          1
    2          1          2
    3          1          3
    4          1          0
    5          2          1
    6          2          2
    7          2          3
    8          2          0
    9          3          1
   10          3          2
   11          3          3
   12          3          0

So, having identified the row and the column position, we can now use pivoting to display the numbers in the desired format:

with nums as (select level rn, ceil(level/4) grp, mod(level, 4) col_pos
              from   dual
              connect by level <= 12)
select max(case when col_pos = 1 then rn end) col1,
       max(case when col_pos = 2 then rn end) col2,
       max(case when col_pos = 3 then rn end) col3,
       max(case when col_pos = 0 then rn end) col4
from   nums
group by grp;

Giving us:

 COL1       COL2       COL3       COL4
----- ---------- ---------- ----------
    1          2          3          4
    5          6          7          8
    9         10         11         12

Et voilà!
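As an aside, for those of you already on 11g, I believe the new PIVOT clause would do that final step for you. A sketch only, since I can’t test it myself yet:

```sql
-- 11g-only sketch: PIVOT does the max(case ...) work for us
with nums as (select level rn, ceil(level/4) grp, mod(level, 4) col_pos
              from   dual
              connect by level <= 12)
select col1, col2, col3, col4
from   nums
pivot (max(rn) for col_pos in (1 as col1, 2 as col2, 3 as col3, 0 as col4))
order by grp;
```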

My CLOBs are trunca

As the title of this blog entry indicates, I encountered a problem yesterday with my CLOBs somehow ending up being truncated. I scratched my head, pored over the documentation and still came up blank.

I felt sure the problem was something to do with the multibyte character set my database (10.2.0.4) was set up with, but everything in the documentation indicated that DBMS_LOB subprograms worked in terms of characters, not bytes, when CLOBs were involved.

I managed to recreate my issue:

declare
    l_finalSQL VARCHAR2(32767 char);
    l_innerSQL CLOB := 'a';
    l_outerSQL CLOB := 'b';
    l_userid   INT;
    gOutersql varchar2(113) := 'ccccccccccccccccccccccccccccccccccccccccc'||
                               'ccccccccccccccccccccccccccccccccccccccc'||
                               'ccccccccccccccccccccccccccccccccc';
begin
  for i in 1..1342
  loop
    l_outersql := l_outersql||'b';
  end loop;

  dbms_output.put_line('l_outersql = '||length(l_outersql));

  for i in 1..10860
  loop
    l_innersql := l_innersql||'a';
  end loop;

  dbms_output.put_line('l_innersql = '||length(l_innersql));

  l_userid := 123;

  dbms_output.put_line('l_userid = '||length(l_userid));

  l_outerSQL := l_outerSQL||' FROM ('||l_innerSQL||gOUTERSQL||
                                      TO_CHAR(l_userid);

  dbms_output.put_line('appended l_outersql = '||
                                dbms_lob.getlength(l_outersql));

  IF dbms_lob.getlength(l_outerSQL) <= 32767 THEN
    l_finalSQL := dbms_lob.substr(l_outerSQL, 32767, 1);
  else raise too_many_rows;
  end if;

  dbms_output.put_line('l_finalsql = '||length(l_finalsql));
end;
/

which produced the following output:

l_outersql = 1343
l_innersql = 10861
l_userid = 3
appended l_outersql = 12327
l_finalsql = 10922

Note the 12327 vs 10922 – the expected and actual lengths of the l_finalSQL string.

I could not spot what the problem was for the life of me, so I asked for help on my work’s Oracle chat channel (don’t think I didn’t have a chuckle about that given my previous blog entry!). Thankfully, someone else had the presence of mind to check My Oracle Support, and it turned out that I have most likely tripped over bug 9020537. This isn’t fixed until 12.1 (!) or 11.2.0.2 (Server Patch Set) or 10.2.0.5.3 (Patch Set Update).

Unfortunately, upgrading to any of these versions is not an option for us in the timescale I needed my code to be working, and there is no workaround (boo!) so I’ve had to rewrite my code to work around the issue. Annoying! (Even more so when what I’m doing is to work around the fact that EXECUTE IMMEDIATE doesn’t handle CLOBs until 11g…)
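For anyone in a similar pre-11g bind, one workaround I’ve seen suggested for the EXECUTE IMMEDIATE limitation (not necessarily what I did, and given the substr bug above it would need careful testing on a multibyte database!) is to chop the statement into a DBMS_SQL.VARCHAR2A table and parse that instead. A simplified sketch, assuming no bind variables:

```sql
declare
  l_sql    clob := 'begin null; end;';  -- imagine something > 32k here
  l_pieces dbms_sql.varchar2a;
  l_cursor integer;
  l_chunk  constant pls_integer := 1000;
  l_rc     integer;
begin
  -- split the CLOB into chunks small enough for VARCHAR2A elements
  for i in 0 .. ceil(dbms_lob.getlength(l_sql) / l_chunk) - 1
  loop
    l_pieces(i + 1) := dbms_lob.substr(l_sql, l_chunk, (i * l_chunk) + 1);
  end loop;

  l_cursor := dbms_sql.open_cursor;
  dbms_sql.parse(l_cursor,
                 l_pieces,
                 l_pieces.first,
                 l_pieces.last,
                 false,              -- don't insert newlines between pieces
                 dbms_sql.native);
  l_rc := dbms_sql.execute(l_cursor);
  dbms_sql.close_cursor(l_cursor);
end;
/
```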

Oracle Communities

When I first started working with Oracle, as a lowly system tester/support person, I had no idea there was such a thing as an Oracle community, beyond the developers and DBAs that I had contact with. In fact, it wasn’t until I started my current job (4 years ago) that I realised there was even a hint of a community! That’s probably a combination of how I became an Oracle developer and the structure of the company I worked for previously.

I consider myself a part of several Oracle communities:

  • OTN Forums
  • Work chat channels
  • the informal London pub Oracle massive*
  • the Oracle Blogging community
  • Twitter

and each one adds to my knowledge in different ways. I count myself very, very lucky that I work at a company which encourages communities via the IM chat program that we use – there are lots of channels dedicated to various areas, not just Oracle-based (eg. Unix, java, c# ….).

I think these channels are not only a vital source of help and knowledge, they’re a great way of networking across the company. I now have a little network of like-minded DBAs and devs, spanning continents, that I can be cheeky and ask for help from (and vice versa, of course!), which has been very helpful when I’ve been stuck, etc!

Pooling knowledge and forging ties is at the heart of every community and by participating, you can’t help but continue to keep learning new things or new ways to apply things you already know.

I would definitely encourage new (and old!) Oracle developers/DBAs to participate in at least one Oracle community; it’ll be well worth their while to do so!

* OMG! I sat next to Jonathan Lewis the last time there was a pub outing. O. M. G!! (This is not me name dropping, unlike a certain Doug I know *{;-) – this is me being totally amazed at the company I found myself in, even though I’ve met JL several times before now! Someone pinch me….)

Help! My view isn’t filtering early enough!

Yesterday, I came across an extremely odd problem with a view that I had created. The problem I had was that I was joining the view to another table, based on the primary key of the main table in the view, yet it was doing a FTS of that table and then doing the filtering.

This made the query take ~18 minutes, yet I was convinced that the filtering ought to be pushed inside the view. Nothing I tried would make that happen: not a push_pred, no_merge or no_unnest hint; nothing.

It was actually a conversation with Doug Burns about the trace file I generated that spat out a clue – thanks Doug!

I’ve managed to recreate the issue, so here goes:

create table t1 as 
select level col1,
       dbms_random.string('A', 5) col2,
       mod(level, 20) col3
from   dual
connect by level <= 10000;

alter table t1 add constraint t1_pk primary key (col1);

create table t2 as
select level col1,
       ceil(level/2) col2,
       dbms_random.string('A', 5) col3
from   dual
connect by level <= 20000;

alter table t2 add constraint t2_pk primary key (col1);

alter table t2 add constraint t2_t1_fk foreign key (col2) references t1 (col1);

create table t3 as
select level col1,
       level * 2 col2
from   dual
connect by level <= 19;

alter table t3 add constraint t3_pk primary key (col1);

begin
  dbms_stats.gather_table_stats(user, 'T1');
  dbms_stats.gather_table_stats(user, 'T2');
  dbms_stats.gather_table_stats(user, 'T3');
end;
/

create view t1_t3_view
as
with t3a as (select col1,
                    decode(col2, 4, 90, 8, 45, col2) col2
             from   t3)
select t1.col1,
       t1.col2,
       t1.col3,
       t3a.col2 t3_col2
from   t1,
       t3a
where  t1.col3 = t3a.col1 (+);

The following query does push the filtering into the view (see the use of the T1_PK index):

select *
from   t1_t3_view
where  col1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
SQL_ID  ca9pw03r3tua1, child number 0
-------------------------------------
select * from   t1_t3_view where  col1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Plan hash value: 1448310819             
---------------------------------------------------------------------------------------------
| Id  | Operation                      | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |            |       |       |    10 (100)|          |
|   1 |  VIEW                          | T1_T3_VIEW |    10 | 20410 |    10   (0)| 00:00:01 |
|*  2 |   HASH JOIN OUTER              |            |    10 |   180 |    10   (0)| 00:00:01 |
|   3 |    INLIST ITERATOR             |            |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID| T1         |    10 |   120 |     7   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN         | T1_PK      |    10 |       |     6   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL           | T3         |    19 |   114 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
   2 - access("T1"."COL3"="COL1")
   5 - access(("T1"."COL1"=1 OR "T1"."COL1"=2 OR "T1"."COL1"=3 OR "T1"."COL1"=4 OR
              "T1"."COL1"=5 OR "T1"."COL1"=6 OR "T1"."COL1"=7 OR "T1"."COL1"=8 OR "T1"."COL1"=9 OR
              "T1"."COL1"=10)

whereas the following query does not (note the FTS on T1):

select *
from   t1_t3_view
where  col1 in (select col2 from t2
                where col1 <= 10);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
SQL_ID  95rxxkfy26rg3, child number 0
-------------------------------------
select * from   t1_t3_view where  col1 in (select col2 from t2
         where col1 <= 10)
Plan hash value: 637444478
-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |       |       |    15 (100)|          |
|*  1 |  HASH JOIN RIGHT SEMI        |            |    10 | 20500 |    15   (7)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T2         |    10 |    90 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T2_PK      |    10 |       |     2   (0)| 00:00:01 |
|   4 |   VIEW                       | T1_T3_VIEW | 10000 |    19M|    12   (9)| 00:00:01 |
|*  5 |    HASH JOIN RIGHT OUTER     |            | 10000 |   175K|    12   (9)| 00:00:01 |
|   6 |     TABLE ACCESS FULL        | T3         |    19 |   114 |     3   (0)| 00:00:01 |
|   7 |     TABLE ACCESS FULL        | T1         | 10000 |   117K|     8   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
   1 - access("COL1"="COL2")
   3 - access("COL1"<=10)
   5 - access("T1"."COL3"="COL1")

After much scratching of my head (and no small amount of swearing!) I eventually worked out how to run a 10053 trace on the query. The following is what Doug picked up on (and to be fair, so had I – only I hadn't got round to doing anything about it beyond making a mental note):

**************************
Predicate Move-Around (PM)
**************************
PM: Considering predicate move-around in SEL$1 (#0).
PM:   Checking validity of predicate move-around in SEL$1 (#0).
CBQT: copy not possible because linked to with clause
 in SEL$3 (#0)
CBQT: copy not possible because view
 in SEL$1 (#0)
CBQT bypassed for SEL$1 (#0): Cannot copy query block.
CBQT: Validity checks failed for 95rxxkfy26rg3.

If I change the view so that the factored subquery becomes an in-line view instead – bingo:

create view t1_t3_view2
as
select t1.col1,
       t1.col2,
       t1.col3,
       t3a.col2 t3_col2
from   t1,
       (select col1,
               decode(col2, 4, 90, 8, 45, col2) col2
        from   t3) t3a
where  t1.col3 = t3a.col1 (+);

PLAN_TABLE_OUTPUT                             
----------------------------------------------------------------------------------------------
SQL_ID  9jj0atfw7050d, child number 0
-------------------------------------
select * from   t1_t3_view2 where  col1 in (select col2 from t2
where col1 <= 10)
Plan hash value: 2918082440
----------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |       |       |       |    13 (100)|          |
|*  1 |  HASH JOIN OUTER               |       |    10 |   270 |    13  (16)| 00:00:01 |
|   2 |   NESTED LOOPS                 |       |    10 |   210 |     9  (12)| 00:00:01 |
|   3 |    SORT UNIQUE                 |       |    10 |    90 |     3   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID| T2    |    10 |    90 |     3   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | T2_PK |    10 |       |     2   (0)| 00:00:01 |
|   6 |    TABLE ACCESS BY INDEX ROWID | T1    |     1 |    12 |     1   (0)| 00:00:01 |
|*  7 |     INDEX UNIQUE SCAN          | T1_PK |     1 |       |     0   (0)|          |
|   8 |   TABLE ACCESS FULL            | T3    |    19 |   114 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."COL3"="COL1")
   5 - access("COL1"<=10)
   7 - access("T1"."COL1"="COL2")

So that's one place where a WITH clause may cause problems and, much as I love it, I will now steer clear of WITH clauses in views until I know the issue has been fixed! (Not that I work with views that often; I'm only using one here because we're on 10g, there's a join I need to add into some dynamic SQL, and EXECUTE IMMEDIATE has a 32,767-character limit. I can't wait for 11g, where EXECUTE IMMEDIATE works with CLOBs!)

The trouble with dates

Judging from the number of forum posts on the subject of dates, you would think dates in Oracle are difficult, confusing and just plain awkward to use.

However, far from being difficult, I think dates in Oracle are really, really easy.

The DATE datatype in Oracle holds date and time (to the nearest second) information and stores it in its own internal format. Everyone knows that date+time consists of year, month, day, hours, minutes and seconds, so Oracle gives you a way of specifying which bits are the year, which are the month, and so on. What could be simpler than that?

It’s the TO_DATE() function that allows you to pass in a string and have Oracle read it as a date. Similarly, it’s the TO_CHAR() function that allows you to take an Oracle DATE value and convert it back into a string that you and I can understand.

Let’s look at an example:

to_date('01/01/2011', 'dd/mm/yyyy')
to_char(sysdate, 'dd Month yyyy hh24:mi')

Granted, it’s a wee bit of typing, but it’s not difficult. The list of available format masks can be found here. It’s worth noting that you can also use these format masks with TRUNC and ROUND – for example, if you need to find the quarter start date of a given date.
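To make the TRUNC-with-a-format-mask idea concrete outside the database, here's a small Python sketch of what TRUNC(some_date, 'Q') does. (The function name and the Python rendering are mine; Oracle does this natively.)

```python
from datetime import date

def trunc_to_quarter(d):
    """Python analogue of Oracle's TRUNC(some_date, 'Q'):
    snap a date back to the first day of its calendar quarter."""
    first_month_of_quarter = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month_of_quarter, 1)

print(trunc_to_quarter(date(2011, 8, 17)))  # 2011-07-01
print(trunc_to_quarter(date(2011, 1, 31)))  # 2011-01-01
```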

People abuse dates in many ways in Oracle, such as only using 2-digit years (why, I’m not sure. It’s as if Y2K never happened!) or worse, when they want to compare dates, they first convert them to strings.

This is a bad idea in many ways. For one, if you don’t organise your date-as-a-string correctly, comparisons are meaningless – eg. ‘2 Feb 1999’ is greater than ‘1 Jan 2011’, because ‘1’ < '2' in the string comparison.

For another, when you convert a DATE into a string (or number), you hide vital information from the optimizer, which can lead to incorrect cardinality estimates and throw your execution plan off. Why? Well, how many days are there between 1st January 2011 and 31st December 2010? The answer, of course, is 1. But convert those dates to numbers and the question becomes "what is the difference between 20110101 and 20101231?", to which the answer is 8870. 1 vs 8870 – that's a fair amount of difference! By converting dates to strings or numbers, you leave the optimizer no option other than to do the equivalent of sticking its finger in the air and guessing how many rows it's going to retrieve.

Don't be afraid of Oracle's DATE format; it's simple to use and can help you do some very powerful things without much work from you – that's got to be a bonus! (Grouping results into quarters? No problem! Pivoting by month? Not a problem either!)
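The pitfall is easy to demonstrate outside Oracle, too; here's a quick Python illustration, using its datetime type in place of Oracle's DATE:

```python
from datetime import date

# Comparing dates-as-strings orders them lexically, not chronologically:
assert '2 Feb 1999' > '1 Jan 2011'   # '2' > '1' as characters: the "wrong" answer

# Real date values compare chronologically:
assert date(1999, 2, 2) < date(2011, 1, 1)

# And the cardinality point: real dates know the gap is 1 day...
assert (date(2011, 1, 1) - date(2010, 12, 31)).days == 1
# ...while the yyyymmdd-as-number encoding reckons it to be 8870:
assert 20110101 - 20101231 == 8870
```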

Oracle Tips and Tricks – Part 3: Tabibitosan

Tabibitosan is a technique that I picked up from, guess where? Yup, the OTN Forums! It was Aketi Jyuuzou who first introduced me to the idea, and it’s a very useful technique for a certain set of problems. Aketi’s forum post has lots of information on it, but I just wanted to demonstrate a simple example here, to give you a taste of what you can use it for.

So, suppose we have a set of data, which is unique for id and dt:

        ID DT
---------- ----------
         1 01/01/2011
         1 02/01/2011
         1 03/01/2011
         1 04/01/2011
         1 05/01/2011
         1 10/01/2011
         1 11/01/2011
         1 12/01/2011
         1 14/01/2011
         1 15/01/2011
         1 16/01/2011
         2 01/01/2011
         2 02/01/2011
         2 03/01/2011
         2 04/01/2011
         2 06/01/2011
         2 07/01/2011
         2 11/01/2011
         2 12/01/2011
         2 13/01/2011
         2 28/01/2011
         2 30/01/2011
         2 31/01/2011
         2 01/02/2011

and you want to group the results to find the start and end dates of consecutive rows, eg:

        ID MIN_DT     MAX_DT    
---------- ---------- ----------
         1 01/01/2011 05/01/2011
         1 10/01/2011 12/01/2011
         1 14/01/2011 16/01/2011
         2 01/01/2011 04/01/2011
         2 06/01/2011 07/01/2011
         2 11/01/2011 13/01/2011
         2 28/01/2011 28/01/2011
         2 30/01/2011 01/02/2011

Where do we start? Well, tabibitosan works by assigning two numbers to each row in the result set: one derived from the row's own value (here, its date) and one that simply counts the rows consecutively, either over the whole set or within each partition – in our case, per id. Subtracting one from the other gives a value that stays constant across each run of consecutive rows.

Because we’re using dates, we’ll turn each dt into a number by converting it to its Julian day number. We’ll also use the row_number() analytic function to label each row with a consecutive number per id, in ascending dt order. Finally, we’ll subtract one from the other:

with sample_data as (select 1 id, to_date('01/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('02/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('03/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('04/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('05/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('10/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('11/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('12/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('14/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('15/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('16/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('01/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('02/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('03/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('04/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('06/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('07/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('11/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('12/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('13/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('28/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('30/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('31/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('01/02/2011', 'dd/mm/yyyy') dt from dual)
select id,
       dt,
       to_number(to_char(dt, 'j')) main_rn,
       row_number() over (partition by id order by dt) partitioned_rn,
       to_number(to_char(dt, 'j'))
         - row_number() over (partition by id order by dt) grp
from   sample_data;

        ID DT                          MAIN_RN PARTITIONED_RN        GRP
---------- ------------------------ ---------- -------------- ----------
         1 01/01/2011                  2455563              1    2455562
         1 02/01/2011                  2455564              2    2455562
         1 03/01/2011                  2455565              3    2455562
         1 04/01/2011                  2455566              4    2455562
         1 05/01/2011                  2455567              5    2455562
         1 10/01/2011                  2455572              6    2455566
         1 11/01/2011                  2455573              7    2455566
         1 12/01/2011                  2455574              8    2455566
         1 14/01/2011                  2455576              9    2455567
         1 15/01/2011                  2455577             10    2455567
         1 16/01/2011                  2455578             11    2455567
         2 01/01/2011                  2455563              1    2455562
         2 02/01/2011                  2455564              2    2455562
         2 03/01/2011                  2455565              3    2455562
         2 04/01/2011                  2455566              4    2455562
         2 06/01/2011                  2455568              5    2455563
         2 07/01/2011                  2455569              6    2455563
         2 11/01/2011                  2455573              7    2455566
         2 12/01/2011                  2455574              8    2455566
         2 13/01/2011                  2455575              9    2455566
         2 28/01/2011                  2455590             10    2455580
         2 30/01/2011                  2455592             11    2455581
         2 31/01/2011                  2455593             12    2455581
         2 01/02/2011                  2455594             13    2455581

You can see that we now have the same number (the grp column) for rows with consecutive dt’s, and each group has a different grp value within each id. This works because, as each dt increases by 1, so does the row number, so the difference between the two stays the same from one row to the next. As soon as the dt jumps by more than 1, the difference increases, because the row number only ever increases by 1.

Having identified the numbers that separate the rows into consecutive chunks, we can group the result set using them, to find the min and max dt’s for each set:

with sample_data as (select 1 id, to_date('01/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('02/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('03/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('04/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('05/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('10/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('11/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('12/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('14/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('15/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 1 id, to_date('16/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('01/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('02/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('03/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('04/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('06/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('07/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('11/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('12/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('13/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('28/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('30/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('31/01/2011', 'dd/mm/yyyy') dt from dual union all
                     select 2 id, to_date('01/02/2011', 'dd/mm/yyyy') dt from dual),
     tabibitosan as (select id,
                            dt,
                             to_number(to_char(dt, 'j'))
                              - row_number() over (partition by id order by dt) grp
                     from   sample_data)
select id,
       min(dt) min_dt,
       max(dt) max_dt
from   tabibitosan
group by id, grp
order by id, grp;

        ID MIN_DT     MAX_DT    
---------- ---------- ----------
         1 01/01/2011 05/01/2011
         1 10/01/2011 12/01/2011
         1 14/01/2011 16/01/2011
         2 01/01/2011 04/01/2011
         2 06/01/2011 07/01/2011
         2 11/01/2011 13/01/2011
         2 28/01/2011 28/01/2011
         2 30/01/2011 01/02/2011

Neat, huh?

Whenever I see this kind of “grouping consecutive rows together” problem, the tabibitosan method immediately leaps to mind. Thanks Aketi!

Also, note how I’ve used both analytic functions and subquery factoring here. Both are powerful tools in their own right, and you can use them as building blocks to come up with queries that may look fairly complicated but are easy to break down to see what each constituent part is doing. Pack a few tools like these in your SQL toolkit and you can easily combine them to build complex queries.
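For anyone who wants to play with the idea outside the database, here's a rough Python rendering of the same trick (the helper name and data are mine, and only id 1's rows from the sample data are shown):

```python
from datetime import date
from itertools import groupby

# id 1's rows from the sample data above
rows = [(1, date(2011, 1, d)) for d in (1, 2, 3, 4, 5, 10, 11, 12, 14, 15, 16)]

def tabibitosan(rows):
    """Group runs of consecutive dates per id: day-ordinal minus
    per-id row number (the 'grp' value) is constant within a run."""
    result = []
    for id_, id_rows in groupby(sorted(rows), key=lambda r: r[0]):
        dts = [dt for _, dt in id_rows]
        # grp = ordinal day number minus per-id row number
        keyed = [(dt.toordinal() - rn, dt) for rn, dt in enumerate(dts, start=1)]
        for _, run in groupby(keyed, key=lambda k: k[0]):
            run_dts = [dt for _, dt in run]
            result.append((id_, min(run_dts), max(run_dts)))
    return result

for id_, lo, hi in tabibitosan(rows):
    print(id_, lo, hi)
# 1 2011-01-01 2011-01-05
# 1 2011-01-10 2011-01-12
# 1 2011-01-14 2011-01-16
```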

Random thought…

… is it just me, or have any other OTN Forum regulars switched to “!=” as a direct result of Jive’s inability to display “<>“?