Friday, 25 May 2012
Saturday, 19 May 2012
SQLs
My Important SQLs:
--Creating a db link
create database link sbl78_link connect to ora321 identified by ora321
using '(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP) (HOST = hostname.domainname)(PORT = 1521)))
(CONNECT_DATA =
(SERVICE_NAME = service_name)
)
)'
Querying the remote db:
select * from remote_table@dblink_name
2. SELECT
ODI_TABLE.TABLE_NAME,
ODI_TABLE.COLUMN_NAME,
case when ODI_table.longc = siebel.data_length/4 then "pass" else "fail" end result1,
case when ODI_table.dummy_col is NULL then "table not exist" else "pass" end result2
FROM
(select snp_table.table_name TABLE_NAME, snp_col.col_name COLUMN_NAME, snp_col.source_dt, snp_col.longc, 1 dummy_col from snp_col, snp_table, snp_model where
snp_col.i_table= snp_table.i_table and snp_table.i_mod = snp_model.i_mod
and snp_model.i_mod= 17001 and (snp_col.SOURCE_DT ='VARCHAR2' OR snp_col.SOURCE_DT = 'CHAR')
and snp_table.table_name like 'S%') ODI_TABLE
left outer join SBL78_LINK@ALL_TAB_COLUMNS siebel
ON ODI_TABLE.TABLE_NAME= siebel.TABLE_NAME AND ODI_TABLE.COLUMN_NAME = SIEBEL.COLUMN_NAME 1
3. alter tablespace users add datafile 'c:\extend3.dbf' size 15120M
alter database tempfile 'C:\app\fakebideveloper\oradata\orcl\TEMP02.dbf' resize 2048M
4. data validation:
select * from all_tab_columns ALL_TABLES where owner = 'SCHEMA NAME' AND TABLE_NAME ='TABLE NAME'
AND COLUMN_NAME LIKE '%_WID' ORDER BY COLUMN_NAME
5. When max is applied on two date columns then the max works perfectly and returns only one row with max values of each date column , no matter whether the two dates are max for the same record.
create table Test (
start_dt date,
end_dt date,
Amt number(10)
)
insert into Test
values (TO_DATE('01/01/1980', 'DD/MM/YYYY'), TO_DATE('12/03/2050', 'DD/MM/YYYY') , 5000)
--Insert another row with changed dates
SELECT MAX(START_DT), MAX(END_DT) FROM TEST
6. Importing db dump files exported with expdp :
a. First create user for the dmp file to be imported and define a directory and grant read, write privilege
on directory created to the user just created :
CREATE USER supplier2 IDENTIFIED BY supplier2 DEFAULT TABLESPACE users;
ALTER USER supplier2 IDENTIFIED BY supplier2 ACCOUNT UNLOCK;
GRANT DBA to supplier2;
CREATE USER obiee_navteq IDENTIFIED BY obiee_navteq DEFAULT TABLESPACE users;
ALTER USER obiee_navteq IDENTIFIED BY obiee_navteq ACCOUNT UNLOCK;
GRANT DBA to obiee_navteq;
CREATE OR REPLACE DIRECTORY datapump as 'd:\datapump';
GRANT READ, WRITE ON DIRECTORY datapump to supplier2;
GRANT READ, WRITE ON DIRECTORY datapump to obiee_navteq;
ALTER SYSTEM SET SEC_CASE_SENSITIVE_LOGON = FALSE;
exit;
2. Place dump file in the DIRECTORY location specified above:
3. Go to command prompt and enter following:
impdp supplier2/supplier2@XE schemas=supplier2 directory=datapump dumpfile=supplier2.dmp logfile=impdpsupplier2.log
hit enter.
Troubleshooting:
If after step 3 an error is displayed :
UDI-12154: operation generated ORACLE error 12154
ORA-12154: TNS:could not resolve the connect identifier specified
Apparently something is wrong with tnsnames.ora.
In that case :
a. Enter sqlplus on command prompt, press enter. SQLPlus command prompt opens up.
b. Prompt will ask username: username/pwd@sid
e.g. local_source/oracle@xe
if you are not able to connect tnsnames.ora either doesn't have entry for sid xe or some mismatching quote or character is causing issue in tnsnames.ora. Take a backup of tnsnames.ora and try editing it.
6.1 Exporting a database schema using expdp(Oracle 10g onwards):
If your database from which you want to export a schema is on a machine which has obiee installed, chances are your ORACLE_HOME environment variable is pointing to OBIEE's oracle home. Follow these steps to point it temporarily to your database:
a. Open command prompt. type expdp( the location of expdp should be set in PATH in order to invoke it from command line) . If you see an error:
UDE-00013: Message 13 not found; No message file for product=RDBMS, facility=
UDE-00019: You may need to set ORACLE_HOME to your Oracle software directory
go to step b.
b. set ORACLE_HOME to C:\app\fakebideveloper\product\11.1.0\db_1
set ORACLE_HOME=C:\app\fakebideveloper\product\11.1.0\db_1
in the command prompt.
c. Login to the database with a user other than the one you want to export(EBS1213_DW_1) and execute following:
CREATE OR REPLACE DIRECTORY dump_dir AS 'c:\datapump';
GRANT READ, WRITE ON DIRECTORY dump_dir TO EBS1213_DW_1;
d. go to the DB server from where you've to export a schema, and execute the following:
expdp EBS1213_DW_1/oracle@orcl schemas=EBS1213_DW_1 directory=DUMP_DIR dumpfile=EBS1213_DW_1.dmp logfile=expdpEBS1213_DW_1.log
Note: ORACLE_HOME must be set already and the physical directory as mentioned in b. above must already exist in DB server machine.
6.2 Exporting a schema from a remote DB server:
a. Create an entry in your tnsnames.ora(in the client machine):
ORCL_QA =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = xxxxxxxxsvdb.us.oracle.com)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = qa26a)
)
)
b. Create a db link to the remote server:
create public database link remoteloc connect to ORA26109 identified by ORA26109 using 'ORCL_QA';
c.
expdp remote_user/remote_pwd directory=dumpdir logfile=impi.log network_link=remoteloc schemas=scott dumpfile=newscot.dmp
Dropping a dbf file in Oracle db:
Here are some important metadata tables accessible to Users in Oracle db:
This hierarchical data can be represented in a chart as :
Problem 1: finding out the root node of this tree:
Solution: select lname from employee where emp_manager_id is null;
Problem 2: finding out the leaf nodes:
Solution: select lname from employee e where e.emp_id not in (select manager_emp_id from employee emp where emp.manager_emp_id is not null );
Problem3: Flattening the parent -child relationship so that in a single row we can see the entire reporting line.
Solution: This requires knowledge of number of levels in the hierarchy. We'll have to do a series of left outer joins.
select e1.lname emp_name, e2.lname mgr_name, e3.lname, e4.lname from employee e1 left outer join employee e2 on e1.manager_emp_id =e2.emp_id
left outer join employee e3 on e2.manager_emp_id = e3.emp_id left outer join employee e4 on e3.manager_emp_id = e4.emp_id;
For this hierarchy to be consumed by OBIEE10 g the leaf node is repeated upto the last level.
Note: when you say flattened parent-child hierarchy, each row represents path from a node upto the root node, including the path from the root node to itself.
Problem 4: Flattening a parent child relationship to see each manager with it's parent. This is a more generic problem, since the #3 was more from a standpoint of OBIEE 10 g consume able form.
select e1.lname, e2.lname, e3.lname, e4.lname from employee e1 left outer join employee e2 on e1.emp_id = e2.manager_emp_id left outer join employee e3 on e2.emp_id = e3.manager_emp_id left outer join employee e4 on e3.emp_id = e4.manager_emp_id where e1.manager_emp_id is null;
note: if one just wants to see the hierarchy for visual inspection, the query provided as a solution for prob#4 can be used. However if the output of the query has to be in a form to be consumed by OBIEE 10g then the solution to problem#4 is more suitable.
Start With, Connect By and Prior in Oracle SQL
The information in a table can be retrieved in a hierarchical form with the help of Select statement's Start With .... Connect By clause.
[ [START WITH condition1 ] CONNECT BY condition2]
START WITH condition1: condition1 specifies the way to identify the root node of a hierarchy. All rows that satisfy the condition will be considered roots. If you don't specify a START WITH clause all rows are considered root rows(which is usually not desirable). A subquery can be included in Condition1.
CONNECT BY Condition2: Specifies the relationship between. parent rows and child rows in the hierarchy. The relationship is expressed as a comparison expression, where columns from the current row are compared to corresponding parent columns. condition2 must contain the PRIOR operator, which is used to identify columns from the parent row. Condition2 can't contain a subquery.
PRIOR is an oracle sql operator used only in conjunction with Hierarchical queries. When PRIOR is used in a Connect By condition the expression following PRIOR is evaluated for the Parent Row of the current row in the query.
e.g.
select lname, emp_id, manager_emp_id from employee
START WITH manager_emp_id is null
CONNECT BY PRIOR emp_id = manager_emp_id;
As shown above this lists the hierarchical information for visual inspection, compare this information with the diagram of the hierarchy.
Note: the PRIOR keyword doesn't need be listed first. for e.g. the query above can be written as:
select lname, emp_id, manager_emp_id from employee
START WITH manager_emp_id is null
CONNECT BY manager_emp_id = PRIOR emp_id ;
both of them return the same result.
Given that Condition1 specifies the starting point of the hierarchy(or root node), we can get the relative hierarchical position, by making root as the node who's hierarchy we want to see:
e.g.
select lname, emp_id, manager_emp_id from employee
START WITH manager_emp_id i = 'BLAKE'
CONNECT BY manager_emp_id = PRIOR emp_id ;
This gives all the direct/indirect subordinates of BLAKE
We can also have a subquery for Condition1, suppose we want to find out the employees reporting directly/indirectly to the oldest employee:
select lname, emp_id, manager_emp_id from employee
START WITH hire_date = (Select min(hire_date) from employee)
CONNECT BY PRIOR emp_id =manager_emp_id;
Note: beware that the subquery should return only one column and row. If it returns more than one row then you'll have to use IN operator instead of = , but in that case there will be multiple roots.
LOOP/CYCLE in hierarchical data:
If a row is both parent and child of another row then it's a cyclic data. Since the Connect By condition specifies a parent child relationship, it can't contain a loop.
More than One column to define Parent- Child relationship:
When a parent child relationship involves more than one column, you'll have to use PRIOR operator before each column(which should be picked up for parent record).
e.g. select * from assembly
START WITH parent_assembly_type is null and parent_assembly_id is null
CONNECT BY parent_assembly_type = PRIOR assembly_type
and parent_assembly_id = PRIOR assembly_id;
The LEVEL Pseudocolumn
Oracle provides pseudocolumn LEVEL to represent levels in a hierarchical tree. Whenever we use START WITH ... CONNECT BY clauses in a hierarchical query, you can use the pseudocolumn LEVEL to return the level number for reach row returned by the query.
for e.g.
select level, lname, emp_id, manager_emp_id
from employee
START WITH manager_emp_id is null
CONNECT BY manager_emp_id = PRIOR emp_id;
as shown in the resultset, each employee is now associated with a number representing it's level in the organisation.
Problem 5: Complex Hierarchy Operations can be performed much more easily than using standard ANSI sql:
a. Finding the Number of Levels in the hierarchy:
SELECT MAX(LEVEL)
FROM
EMPLOYEE
START WITH manager_emp_id is null
CONNECT BY PRIOR emp_id = manager_emp_id;
o/p: 4
b. Deterimine the number of employees at each level, group the results by Level and count number of employees in each distinct group:
SELECT LEVEL, Count(emp_id)
from Employee
START WITH manager_emp_id is null
CONNECT BY PRIOR emp_id = manager_emp_id
GROUP BY LEVEL;
;
4. data validation:
select * from all_tab_columns ALL_TABLES where owner = 'SCHEMA NAME' AND TABLE_NAME ='TABLE NAME'
AND COLUMN_NAME LIKE '%_WID' ORDER BY COLUMN_NAME
5. When max is applied on two date columns then the max works perfectly and returns only one row with max values of each date column , no matter whether the two dates are max for the same record.
create table Test (
start_dt date,
end_dt date,
Amt number(10)
)
insert into Test
values (TO_DATE('01/01/1980', 'DD/MM/YYYY'), TO_DATE('12/03/2050', 'DD/MM/YYYY') , 5000)
--Insert another row with changed dates
SELECT MAX(START_DT), MAX(END_DT) FROM TEST
6. Importing db dump files exported with expdp :
a. First create user for the dmp file to be imported and define a directory and grant read, write privilege
on directory created to the user just created :
CREATE USER supplier2 IDENTIFIED BY supplier2 DEFAULT TABLESPACE users;
ALTER USER supplier2 IDENTIFIED BY supplier2 ACCOUNT UNLOCK;
GRANT DBA to supplier2;
CREATE USER obiee_navteq IDENTIFIED BY obiee_navteq DEFAULT TABLESPACE users;
ALTER USER obiee_navteq IDENTIFIED BY obiee_navteq ACCOUNT UNLOCK;
GRANT DBA to obiee_navteq;
CREATE OR REPLACE DIRECTORY datapump as 'd:\datapump';
GRANT READ, WRITE ON DIRECTORY datapump to supplier2;
GRANT READ, WRITE ON DIRECTORY datapump to obiee_navteq;
ALTER SYSTEM SET SEC_CASE_SENSITIVE_LOGON = FALSE;
exit;
2. Place dump file in the DIRECTORY location specified above:
3. Go to command prompt and enter following:
impdp supplier2/supplier2@XE schemas=supplier2 directory=datapump dumpfile=supplier2.dmp logfile=impdpsupplier2.log
hit enter.
Troubleshooting:
If after step 3 an error is displayed :
UDI-12154: operation generated ORACLE error 12154
ORA-12154: TNS:could not resolve the connect identifier specified
Apparently something is wrong with tnsnames.ora.
In that case :
a. Enter sqlplus on command prompt, press enter. SQLPlus command prompt opens up.
b. Prompt will ask username: username/pwd@sid
e.g. local_source/oracle@xe
if you are not able to connect tnsnames.ora either doesn't have entry for sid xe or some mismatching quote or character is causing issue in tnsnames.ora. Take a backup of tnsnames.ora and try editing it.
6.1 Exporting a database schema using expdp(Oracle 10g onwards):
If your database from which you want to export a schema is on a machine which has obiee installed, chances are your ORACLE_HOME environment variable is pointing to OBIEE's oracle home. Follow these steps to point it temporarily to your database:
a. Open command prompt. type expdp( the location of expdp should be set in PATH in order to invoke it from command line) . If you see an error:
UDE-00013: Message 13 not found; No message file for product=RDBMS, facility=
UDE-00019: You may need to set ORACLE_HOME to your Oracle software directory
go to step b.
b. set ORACLE_HOME to C:\app\fakebideveloper\product\11.1.0\db_1
set ORACLE_HOME=C:\app\fakebideveloper\product\11.1.0\db_1
in the command prompt.
c. Login to the database with a user other than the one you want to export(EBS1213_DW_1) and execute following:
CREATE OR REPLACE DIRECTORY dump_dir AS 'c:\datapump';
GRANT READ, WRITE ON DIRECTORY dump_dir TO EBS1213_DW_1;
d. go to the DB server from where you've to export a schema, and execute the following:
expdp EBS1213_DW_1/oracle@orcl schemas=EBS1213_DW_1 directory=DUMP_DIR dumpfile=EBS1213_DW_1.dmp logfile=expdpEBS1213_DW_1.log
Note: ORACLE_HOME must be set already and the physical directory as mentioned in b. above must already exist in DB server machine.
6.2 Exporting a schema from a remote DB server:
a. Create an entry in your tnsnames.ora(in the client machine):
ORCL_QA =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = xxxxxxxxsvdb.us.oracle.com)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = qa26a)
)
)
b. Create a db link to the remote server:
create public database link remoteloc connect to ORA26109 identified by ORA26109 using 'ORCL_QA';
c.
expdp remote_user/remote_pwd directory=dumpdir logfile=impi.log network_link=remoteloc schemas=scott dumpfile=newscot.dmp
Dropping a dbf file in Oracle db:
Datafiles are allocated to tablespaces, not users. Dropping the users will remove the tables etc. from the tablespace, but the underlying storage is not affected.
Don't delete the dbf files from the filesystem directly, you'll get your database into a mess. To remove them, find which tablespace the files belong to using the following statement:
select tablespace_name
from dba_data_files
where file_name = <file name with full path>;
You can then remove the file(s) by dropping the tablespace:
drop tablespace <tablespace_name> including contents and datafiles;
Before doing this you should verify that there aren't any objects still allocated to the tablespace for other users however. You can find this by running:
select * from dba_segments
where tablespace_name = <tablespace to drop>;
If this returns anything, move the objects to another tablespace if you want to keep them before dropping.
Dropping Datafiles
You use the
DROP
DATAFILE
and DROP
TEMPFILE
clauses of the ALTER
TABLESPACE
command to drop a single datafile or tempfile. The datafile must be empty. (A datafile is considered to be empty when no extents remain allocated from it.) When you drop a datafile or tempfile, references to the datafile or tempfile are removed from the data dictionary and control files, and the physical file is deleted from the file system or Automatic Storage Management (ASM) disk group.
The following example drops the datafile identified by the alias
example_df3.f
in the ASM disk group DGROUP1
. The datafile belongs to the example
tablespace.ALTER TABLESPACE example DROP DATAFILE '+DGROUP1/example_df3.f';
The next example drops the tempfile
lmtemp02.dbf
, which belongs to the lmtemp
tablespace.ALTER TABLESPACE lmtemp DROP TEMPFILE '/u02/oracle/data/lmtemp02.dbf';
This is equivalent to the following statement:
ALTER DATABASE TEMPFILE '/u02/oracle/data/lmtemp02.dbf' DROP INCLUDING DATAFILES;
The following are restrictions for dropping datafiles and tempfiles:
- The database must be open.
- If a datafile is not empty, it cannot be dropped.If you must remove a datafile that is not empty and that cannot be made empty by dropping schema objects, you must drop the tablespace that contains the datafile.
- You cannot drop the first or only datafile in a tablespace.This means that
DROP
DATAFILE
cannot be used with a bigfile tablespace. - You cannot drop datafiles in a read-only tablespace.
- You cannot drop datafiles in the
SYSTEM
tablespace. - If a datafile in a locally managed tablespace is offline, it cannot be dropped.
Here are some important metadata tables accessible to Users in Oracle db:
DBA_ALL_TABLES ALL_ALL_TABLES USER_ALL_TABLES DBA_PARTIAL_DROP_TABS ALL_PARTIAL_DROP_TABS USER_PARTIAL_DROP_TABS DBA_PART_TABLES ALL_PART_TABLES USER_PART_TABLES DBA_TABLES ALL_TABLES USER_TABLES TAB DBA_TABLESPACES USER_TABLESPACES DBA_TAB_COLUMNS ALL_TAB_COLUMNS USER_TAB_COLUMNS DBA_TAB_COLS ALL_TAB_COLS USER_TAB_COLS DBA_TAB_COMMENTS ALL_TAB_COMMENTS USER_TAB_COMMENTS DBA_TAB_HISTOGRAMS ALL_TAB_HISTOGRAMS USER_TAB_HISTOGRAMS DBA_TAB_MODIFICATIONS ALL_TAB_MODIFICATIONS USER_TAB_MODIFICATIONS DBA_TAB_PARTITIONS ALL_TAB_PARTITIONS USER_TAB_PARTITIONS
7. Changing NLS_DATE_FORMAT in Oracle:
The Oracle nls_date_format environmental variable does much more than control the date display within Oracle. Changing the value of nls_date_format will change the way that the data is displayed in Oracle SQL*Plus:ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY MM DD'; select sysdate from dual; ALTER SESSION SET NLS_DATE_FORMAT = 'HH24:MI:SS'; select sysdate from dual;
Oracle SQL:
Pseudocolumns: Oracle makes available several Phantom columns, known as pseudocolumns, that do not exist in any tables. Rather they are values visible during query execution that can be helpful in certain situations.
e.g.
SYSDATE current date and time ROWNUM sequence number assigned to retrieved rows ROWID unique identifier for a row UID number associated with a user USER userid of current userLevel is also a pseudo column .Usage:
select user, sysdate, cust_nbr from customer;
Dealing with hierarchical data:
Let's see Oracle sql functionality to deal with hierarchical data. Here is Employee table which stores parent child relationship in between employee and his manager as follows:
This hierarchical data can be represented in a chart as :
Problem 1: finding out the root node of this tree:
Solution: select lname from employee where emp_manager_id is null;
Problem 2: finding out the leaf nodes:
Solution: select lname from employee e where e.emp_id not in (select manager_emp_id from employee emp where emp.manager_emp_id is not null );
Problem3: Flattening the parent -child relationship so that in a single row we can see the entire reporting line.
Solution: This requires knowledge of number of levels in the hierarchy. We'll have to do a series of left outer joins.
select e1.lname emp_name, e2.lname mgr_name, e3.lname, e4.lname from employee e1 left outer join employee e2 on e1.manager_emp_id =e2.emp_id
left outer join employee e3 on e2.manager_emp_id = e3.emp_id left outer join employee e4 on e3.manager_emp_id = e4.emp_id;
For this hierarchy to be consumed by OBIEE10 g the leaf node is repeated upto the last level.
Note: when you say flattened parent-child hierarchy, each row represents path from a node upto the root node, including the path from the root node to itself.
Problem 4: Flattening a parent child relationship to see each manager with it's parent. This is a more generic problem, since the #3 was more from a standpoint of OBIEE 10 g consume able form.
select e1.lname, e2.lname, e3.lname, e4.lname from employee e1 left outer join employee e2 on e1.emp_id = e2.manager_emp_id left outer join employee e3 on e2.emp_id = e3.manager_emp_id left outer join employee e4 on e3.emp_id = e4.manager_emp_id where e1.manager_emp_id is null;
note: if one just wants to see the hierarchy for visual inspection, the query provided as a solution for prob#4 can be used. However if the output of the query has to be in a form to be consumed by OBIEE 10g then the solution to problem#4 is more suitable.
Start With, Connect By and Prior in Oracle SQL
The information in a table can be retrieved in a hierarchical form with the help of Select statement's Start With .... Connect By clause.
[ [START WITH condition1 ] CONNECT BY condition2]
START WITH condition1: condition1 specifies the way to identify the root node of a hierarchy. All rows that satisfy the condition will be considered roots. If you don't specify a START WITH clause all rows are considered root rows(which is usually not desirable). A subquery can be included in Condition1.
CONNECT BY Condition2: Specifies the relationship between. parent rows and child rows in the hierarchy. The relationship is expressed as a comparison expression, where columns from the current row are compared to corresponding parent columns. condition2 must contain the PRIOR operator, which is used to identify columns from the parent row. Condition2 can't contain a subquery.
PRIOR is an oracle sql operator used only in conjunction with Hierarchical queries. When PRIOR is used in a Connect By condition the expression following PRIOR is evaluated for the Parent Row of the current row in the query.
e.g.
select lname, emp_id, manager_emp_id from employee
START WITH manager_emp_id is null
CONNECT BY PRIOR emp_id = manager_emp_id;
As shown above this lists the hierarchical information for visual inspection, compare this information with the diagram of the hierarchy.
Note: the PRIOR keyword doesn't need be listed first. for e.g. the query above can be written as:
select lname, emp_id, manager_emp_id from employee
START WITH manager_emp_id is null
CONNECT BY manager_emp_id = PRIOR emp_id ;
both of them return the same result.
Given that Condition1 specifies the starting point of the hierarchy(or root node), we can get the relative hierarchical position, by making root as the node who's hierarchy we want to see:
e.g.
select lname, emp_id, manager_emp_id from employee
START WITH manager_emp_id i = 'BLAKE'
CONNECT BY manager_emp_id = PRIOR emp_id ;
This gives all the direct/indirect subordinates of BLAKE
We can also have a subquery for Condition1, suppose we want to find out the employees reporting directly/indirectly to the oldest employee:
select lname, emp_id, manager_emp_id from employee
START WITH hire_date = (Select min(hire_date) from employee)
CONNECT BY PRIOR emp_id =manager_emp_id;
Note: beware that the subquery should return only one column and row. If it returns more than one row then you'll have to use IN operator instead of = , but in that case there will be multiple roots.
LOOP/CYCLE in hierarchical data:
If a row is both parent and child of another row then it's a cyclic data. Since the Connect By condition specifies a parent child relationship, it can't contain a loop.
More than One column to define Parent- Child relationship:
When a parent child relationship involves more than one column, you'll have to use PRIOR operator before each column(which should be picked up for parent record).
e.g. select * from assembly
START WITH parent_assembly_type is null and parent_assembly_id is null
CONNECT BY parent_assembly_type = PRIOR assembly_type
and parent_assembly_id = PRIOR assembly_id;
The LEVEL Pseudocolumn
Oracle provides pseudocolumn LEVEL to represent levels in a hierarchical tree. Whenever we use START WITH ... CONNECT BY clauses in a hierarchical query, you can use the pseudocolumn LEVEL to return the level number for reach row returned by the query.
for e.g.
select level, lname, emp_id, manager_emp_id
from employee
START WITH manager_emp_id is null
CONNECT BY manager_emp_id = PRIOR emp_id;
as shown in the resultset, each employee is now associated with a number representing it's level in the organisation.
Problem 5: Complex Hierarchy Operations can be performed much more easily than using standard ANSI sql:
a. Finding the Number of Levels in the hierarchy:
SELECT MAX(LEVEL)
FROM
EMPLOYEE
START WITH manager_emp_id is null
CONNECT BY PRIOR emp_id = manager_emp_id;
o/p: 4
b. Deterimine the number of employees at each level, group the results by Level and count number of employees in each distinct group:
SELECT LEVEL, Count(emp_id)
from Employee
START WITH manager_emp_id is null
CONNECT BY PRIOR emp_id = manager_emp_id
GROUP BY LEVEL;
;
Group Operations
Questions such as:
1. who is the highest paid executive in Operations department.
2. What was the sales by Region in the year 2018.
require application of group functions which aggregate data over a set of rows returning a single value.
Aggregate Function syntax: aggregate_function([Distinct | All] expression)
aggregate_function: is the name of the function e.g. SUM, COUNT, MAX, MIN
DISTINCT: specifies that the aggregate function should consider only distinct values of the argument
expression.
expression.
ALL: specifies the aggregate function should consider all values including all duplicate values of the
argument expression. The default is ALL.
expression: specifies a column or any other expression, on which you want to perform the aggregation.
argument expression. The default is ALL.
expression: specifies a column or any other expression, on which you want to perform the aggregation.
Aggregate functions and NULLs
All aggregate functions ignore NULLs except COUNT(*) and GROUPING.
Count(*) counts rows and not values since the concept of NULL does not apply to rows.
Note that Count() does not ignore NULLs.
Question: What is the result of Count(1) or Count(2) or Count(3) ?
select Count(1), Count(2), Count(3) from Employee;
Answer: The parameter to function Count is an expression. The function Count, like any other Aggregate function, counts the Not Null values of this expression for each row. So 1 will be not null for first row, second row, third row so on and so forth for all rows. So all these three expressions will return the row count of the table.
following three SQLs:
select avg(SALE_PRICE) FROM CUST_ORDER;
SELECT SUM(SALE_PRICE)/COUNT(*) FROM CUST_ORDER;
SELECT SUM(SALE_PRICE)/COUNT(SALE_PRICE) FROM CUST_ORDER;
first and third will give the same result while second will give the wrong result as COUNT(*) will also count the rows where SALE_PRICE is NULL.
In some cases you may want an average to be taken over all the rows in a table, not just the rows with non null values for the column in question. In those situations you'll have to use NVL function within the AVG function call to assign 0 (or some other useful value) to the column in place of any NULL values.
SELECT AVG(NVL(SALE_PRICE,0)) FROM CUST_ORDER
result:39.4
all the rows are considered for the average computation and the rows with NULL value for sale_price are assumed to have a 0 value for that column.
Note that Count() does not ignore NULLs.
Question: What is the result of Count(1) or Count(2) or Count(3) ?
select Count(1), Count(2), Count(3) from Employee;
Answer: The parameter to function Count is an expression. The function Count, like any other Aggregate function, counts the Not Null values of this expression for each row. So 1 will be not null for first row, second row, third row so on and so forth for all rows. So all these three expressions will return the row count of the table.
following three SQLs:
select avg(SALE_PRICE) FROM CUST_ORDER;
SELECT SUM(SALE_PRICE)/COUNT(*) FROM CUST_ORDER;
SELECT SUM(SALE_PRICE)/COUNT(SALE_PRICE) FROM CUST_ORDER;
first and third will give the same result while second will give the wrong result as COUNT(*) will also count the rows where SALE_PRICE is NULL.
In some cases you may want an average to be taken over all the rows in a table, not just the rows with non null values for the column in question. In those situations you'll have to use NVL function within the AVG function call to assign 0 (or some other useful value) to the column in place of any NULL values.
SELECT AVG(NVL(SALE_PRICE,0)) FROM CUST_ORDER
result:39.4
all the rows are considered for the average computation and the rows with NULL value for sale_price are assumed to have a 0 value for that column.
Aggregate functions and DISTINCT and ALL
Most Aggregate functions allow the usage of DISTINCT and ALL along with the expression argument. DISTINCT allows you to disregard duplicate values of the expression whereas ALL causes duplicate expression values to be included in the result.
e.g.
select count(distinct cust_nbr), count(cust_nbr), count(all cust_NBR) from cust_order;
result: 4 , 20, 20
Note: that the default ALL takes into account the duplicate values of the expression but does not take into account the NULL values.
e.g.
select count(sale_price) , count(ALL sale_price), count(distinct sale_price) from cust_order;
Result: 14, 14, 4
There are totally 20 records 6 have null values for SALE_PRICE, so the result above.
Since ALL is the default, you can explicitly use ALL with every aggregate function. However, the aggregate functions that take more than one argument as input don't allow the sue of DISTINCT. These include CORR, COVAR_POP, COVAR_SAMP, and all the linear regression functions.
In addition, some functions that take only one argument don't allow the use of DISTINCT. This category includes STTDEV_POP, STDDEV_SAMP, VAR_POP, VAR_SAMP, and GROUPING.
or
SELECT 'Customer' , cust_nbr, Count(order_nbr)
from Cust_order
Group by
Cust_Nbr, 'Customer';
both are valid(reason being, since it is a constant for all the rows it can no way affect the way data is grouped).
d. Scalar functions may be grouped by their underlying column.
SELECT SUBSTR(lname, 1,1 ) , COUNT(*) FROM EMPLOYEE GROUP BY
LNAME;
This query is syntactically correct and allowed. However if you look at the result set the Grouping happens based LNAME. So if there are two LNAME values 'ARMSTRONG' and 'AGNEL' there will be two records for each. In case you want grouping to be done on the basis of just the first character, use SCALAR function in the Group By Clause as follows:
SELECT SUBSTR(lname, 1,1 ) , COUNT(*) FROM EMPLOYEE GROUP BY
SUBSTR(lname, 1,1);
In this case both 'ARMSTRONG', 'AGNEL' will fall in the same group.
e. Concatenated columns may be grouped in either of two ways, both giving the same result.
SELECT MANAGER_EMP_ID|| JOB_ID, COUNT(*) from EMPLOYEE
GROUP BY MANAGER_EMP_ID || JOB_ID;
SELECT MANAGER_EMP_ID||JOB_ID, COUNT(*) from EMPLOYEE
GROUP BY MANAGER_EMP_ID, JOB_ID;
both the queries above give the same result.
f. Sometimes you may want to keep a non aggregate column in the Select list but not in the Group By clause.
for e.g. you want to see line number along with the summarised information. If you try following :
SELECT ROWNUM, CUST_NBR, COUNT(ORDER_NBR) FROM CUST_ORDER
GROUP BY CUST_NBR, ROW_NUM;
Note: ROWNUM is the pseudo column.
Since ROWNUM is in the Group By and each row will be assigned a new number so the result is grouped by ROWNUM. This is not what you wanted. So use following instead:
SELECT ROWNUM, V.* FROM
(SELECT CUST_NBR, COUNT(ORDER_NBR) FROM CUST_ORDER GROUP BY CUST_NBR) V;
g. You are not required to show your GROUP BY columns. However the result can confuse the user.
e.g. SELECT COUNT(ORDER_NBR) FROM CUST_ORDER GROUP BY CUST_NBR
the o/p of the query won't make sense to the user unless the user sees the query along with the output.
Therefore it is recommended that the non aggregate columns be included in select clause as well.
h. Group by clause will create a separate group for NULL values of the columns in the group by clause and aggregate values for it and present.
i. Group by clause when used with Where Clause requires the Where Clause to appear first. Oracle first filters out the records which don't satisfy the condition and then does a Group By.
SELECT CUST_NBR, COUNT(ORDER_NBR) FROM CUST_ORDER
WHERE SALE_PRICE > 25
GROUP BY CUST_NBR;
j. HAVING Clause is closely associated with GROUP BY clause. HAVING Clause is used to put a filter on the Groups created by Group By clause. If a query has a HAVING clause along with a GROUP By clause, the result set will include only the groups that satisfy the condition specified in the HAVING clause.
SELECT CUST_NBR, COUNT(ORDER_NBR) FROM CUST_ORDER
GROUP BY CUST_NBR
HAVING CUST_NBR<6;
The output only includes customers with numbers below 6. That's because the HAVING clause specified CUST_NBR < 6 as a condition. Orders for all Customers were counted but only those groups that matched the specified HAVING condition were returned as the result.
Note: Having clause filters data after summarization whereas Where clause filters data before summarization, so the example above is not appropriate. Instead of Having clause, Where clause should be used in this example.
A more appropriate example will be following:
SELECT CUST_NBR, COUNT(ORDER_NBR)
FROM CUST_ORDER
GROUP BY CUST_NBR
HAVING COUNT(ORDER_NBR) > 2;
This is an appropriate use of HAVING clause because the value of Count(ORDER_NBR) is not known until grouping takes place.
Note: The syntax for the Having clause is similar to that of the WHERE clause. However there is one restriction on the conditions you can write in the HAVING clause.
A HAVING clause can only refer to an expression in the SELECT list or to an expression involving aggregate function. If you specify a condition which is not in Select list or not an Aggregate function you'll get an error:
SELECT CUST_NBR, COUNT(ORDER_NBR)
FROM CUST_ORDER
GROUP BY CUST_NBR
HAVING ORDER_DT < sysdate;
this will result in an error:
ORA-00979: not a group by expression.
However you can use an aggregate function in HAVING clause even if it's not present in Select list.
(The reason for above constraint is very simple, before aggregation, Group by does grouping based on the non aggregate column in the Select(and Group By), after that aggregation happens and post aggregation HAVING clause filter is applied.)
e.g.
select count(distinct cust_nbr), count(cust_nbr), count(all cust_NBR) from cust_order;
result: 4 , 20, 20
Note: that the default ALL takes into account the duplicate values of the expression but does not take into account the NULL values.
e.g.
select count(sale_price) , count(ALL sale_price), count(distinct sale_price) from cust_order;
Result: 14, 14, 4
There are totally 20 records 6 have null values for SALE_PRICE, so the result above.
Since ALL is the default, you can explicitly use ALL with every aggregate function. However, the aggregate functions that take more than one argument as input don't allow the sue of DISTINCT. These include CORR, COVAR_POP, COVAR_SAMP, and all the linear regression functions.
In addition, some functions that take only one argument don't allow the use of DISTINCT. This category includes STTDEV_POP, STDDEV_SAMP, VAR_POP, VAR_SAMP, and GROUPING.
Group By Clause
Usually Aggregation need to be done by first grouping the data in different sets and then getting one aggregated value for each group. Group By clause fills this requirement.
select cust_nbr, count(order_nbr) from cust_order
group by cust_nbr;
a. All non aggregate expressions in Select list should be part of Group By clause.
b. Aggregate functions not allowed in Group By clause.
c. Constants can be omitted from Group By clause.
e.g. Select 'Customer', cust_nbr, Count(order_nbr)
from Cust_order
Group By
Cust_Nbr
SELECT 'Customer' , cust_nbr, Count(order_nbr)
from Cust_order
Group by
Cust_Nbr, 'Customer';
both are valid(reason being, since it is a constant for all the rows it can no way affect the way data is grouped).
d. Scalar functions may be grouped by their underlying column.
SELECT SUBSTR(lname, 1,1 ) , COUNT(*) FROM EMPLOYEE GROUP BY
LNAME;
This query is syntactically correct and allowed. However if you look at the result set the Grouping happens based LNAME. So if there are two LNAME values 'ARMSTRONG' and 'AGNEL' there will be two records for each. In case you want grouping to be done on the basis of just the first character, use SCALAR function in the Group By Clause as follows:
SELECT SUBSTR(lname, 1,1 ) , COUNT(*) FROM EMPLOYEE GROUP BY
SUBSTR(lname, 1,1);
In this case both 'ARMSTRONG', 'AGNEL' will fall in the same group.
e. Concatenated columns may be grouped in either of two ways, both giving the same result.
SELECT MANAGER_EMP_ID|| JOB_ID, COUNT(*) from EMPLOYEE
GROUP BY MANAGER_EMP_ID || JOB_ID;
SELECT MANAGER_EMP_ID||JOB_ID, COUNT(*) from EMPLOYEE
GROUP BY MANAGER_EMP_ID, JOB_ID;
both the queries above give the same result.
f. Sometimes you may want to keep a non aggregate column in the Select list but not in the Group By clause.
for e.g. you want to see line number along with the summarised information. If you try following :
SELECT ROWNUM, CUST_NBR, COUNT(ORDER_NBR) FROM CUST_ORDER
GROUP BY CUST_NBR, ROW_NUM;
Note: ROWNUM is the pseudo column.
Since ROWNUM is in the Group By and each row will be assigned a new number so the result is grouped by ROWNUM. This is not what you wanted. So use following instead:
SELECT ROWNUM, V.* FROM
(SELECT CUST_NBR, COUNT(ORDER_NBR) FROM CUST_ORDER GROUP BY CUST_NBR) V;
g. You are not required to show your GROUP BY columns. However the result can confuse the user.
e.g. SELECT COUNT(ORDER_NBR) FROM CUST_ORDER GROUP BY CUST_NBR
the o/p of the query won't make sense to the user unless the user sees the query along with the output.
Therefore it is recommended that the non aggregate columns be included in select clause as well.
h. Group by clause will create a separate group for NULL values of the columns in the group by clause and aggregate values for it and present.
i. Group by clause when used with Where Clause requires the Where Clause to appear first. Oracle first filters out the records which don't satisfy the condition and then does a Group By.
SELECT CUST_NBR, COUNT(ORDER_NBR) FROM CUST_ORDER
WHERE SALE_PRICE > 25
GROUP BY CUST_NBR;
j. HAVING Clause is closely associated with GROUP BY clause. HAVING Clause is used to put a filter on the Groups created by Group By clause. If a query has a HAVING clause along with a GROUP By clause, the result set will include only the groups that satisfy the condition specified in the HAVING clause.
SELECT CUST_NBR, COUNT(ORDER_NBR) FROM CUST_ORDER
GROUP BY CUST_NBR
HAVING CUST_NBR<6;
The output only includes customers with numbers below 6. That's because the HAVING clause specified CUST_NBR < 6 as a condition. Orders for all Customers were counted but only those groups that matched the specified HAVING condition were returned as the result.
Note: Having clause filters data after summarization whereas Where clause filters data before summarization, so the example above is not appropriate. Instead of Having clause, Where clause should be used in this example.
A more appropriate example will be following:
SELECT CUST_NBR, COUNT(ORDER_NBR)
FROM CUST_ORDER
GROUP BY CUST_NBR
HAVING COUNT(ORDER_NBR) > 2;
This is an appropriate use of HAVING clause because the value of Count(ORDER_NBR) is not known until grouping takes place.
Note: The syntax for the Having clause is similar to that of the WHERE clause. However there is one restriction on the conditions you can write in the HAVING clause.
A HAVING clause can only refer to an expression in the SELECT list or to an expression involving aggregate function. If you specify a condition which is not in Select list or not an Aggregate function you'll get an error:
SELECT CUST_NBR, COUNT(ORDER_NBR)
FROM CUST_ORDER
GROUP BY CUST_NBR
HAVING ORDER_DT < sysdate;
this will result in an error:
ORA-00979: not a group by expression.
However you can use an aggregate function in HAVING clause even if it's not present in Select list.
(The reason for above constraint is very simple, before aggregation, Group by does grouping based on the non aggregate column in the Select(and Group By), after that aggregation happens and post aggregation HAVING clause filter is applied.)
Java Learnings
1. Never try to create two Iterators of the same underlying object:
for e.g
iterator1 = odiJoins.iterator()
iterator2= odiJoins.iterator()
the second iterator will never be able to manipulate objects.
use this instead.:
iterator1=odiJoins.iterator()
//processing using iterator1
iterator1 =odiJoins.iterator()
//again some processing using iterator1
2. Whenver iterator is used
always use
iterator.hasNext() for looping .
ODI SDK NOTES:
for e.g
iterator1 = odiJoins.iterator()
iterator2= odiJoins.iterator()
the second iterator will never be able to manipulate objects.
use this instead.:
iterator1=odiJoins.iterator()
//processing using iterator1
iterator1 =odiJoins.iterator()
//again some processing using iterator1
2. Whenver iterator is used
always use
iterator.hasNext() for looping .
ODI SDK NOTES:
oracle.odi.core.persistence.transaction.support.TransactionCallbackWithoutResult
implements ITransactionCallback
TransactionCallbackWithoutResult is a convenient class for ITransactionCallback interface. It helps in defining a doInTransaction() without the need for a return statement.
doInTransaction(ITransactionStatus pStatus)
doInTransaction() is called by TransactionTemplate.execute(ITransactionCallback)
Interface ITransactionCallback
oracle.odi.core.persistence.transaction.support.ITransactionCallback
Callback interface for transactional code. Used with
TransactionTemplate
's execute method, often as anonymous class within a method implementation.
java.lang.object doInTransaction(ITransactionStatus pStatus)
gets called by TransactionTemplate.execute(ITransactionCallbac) within a transactional context does not need to care about the transaction itself. This method can return an ODI entity or a collection of ODI entities created within the transaction.
Class TransactionTemplate
oracle.odi.core.persistence.transaction.support.TransactionTemplate extends DefaultTransactionDefinition
- All Implemented Interfaces:
- ITransactionDefinition
Class TransactinTemplate simplifies programmatic transaction demarcation and transaction exception handling against an
ITransactionManager
. The method execute(ITransactionCallBack) is key to this class, supporting transactional code that implements ITransactionCallback interface.
Note: this template handles the transaction lifecycle and possible exceptions such that neither the ITransactionCallback implementation nor the calling code need to handle transactions.
Constructor:
TransacitonTemplate(ITransactionManager)
Constructs a new transaction template using the given transaction manager.
TransactionTemplate(ITransactionManager, ITransactionDefinition)
Constructs a new transaction template using the given transaction manager
Method: execute(ITransactionCallback)
execute the action specified by the given callback object within a given transaction.
Returns a result object created by the callback of null.
Interface ITransactionManager
Java Exception Handling
A good read about Java exception handling:
https://today.java.net/article/2003/11/20/three-rules-effective-exception-handling
TransacitonTemplate(ITransactionManager)
Constructs a new transaction template using the given transaction manager.
TransactionTemplate(ITransactionManager, ITransactionDefinition)
Constructs a new transaction template using the given transaction manager
Method: execute(ITransactionCallback)
execute the action specified by the given callback object within a given transaction.
Returns a result object created by the callback of null.
Interface ITransactionManager
oracle.odi.core.persistence.transaction
Transaction management interface.
methods:
1. void commit(ITransactionStatus)
2. ITransactionStatus getTransaction(ITransactionDefinition )
gets the currently Active transaction or creates a new one according to the given transaction definition(containing a propagation behavior). If the given definition is null a default definition will be used (with default propagation behavior and isolation level). Parameters like isolation level and timeout will only be applied to new transactions and will be ignored when participating with active ones.
It returns transaction status object representing the new or current transaction.
3. rollback(ITransactionStatus)
Perform a rollback of the given transaction.
3. rollback(ITransactionStatus)
Perform a rollback of the given transaction.
A good read about Java exception handling:
https://today.java.net/article/2003/11/20/three-rules-effective-exception-handling
Subscribe to:
Posts (Atom)