Sunday, March 16, 2014

HOW TO LOAD MILLIONS OF RECORDS INTO TABLE BY SQLLDR (SQL LOADER)

SQLLDR is Oracle's Inbuilt utility to load data into tables from flat files.

REQUIREMENTS:
a) Datafiile
b) Control File
c) Batch file / Shell Script
4) Logfile ( Optional)

Steps:
1)  create table majestic_million with following column specification:

CREATE TABLE RAJANI.majestic_million
(
  GLOBALRANK      NUMBER,
  TLDRANK         NUMBER,
  DOMAIN          VARCHAR2(200),
  TLD             VARCHAR2(20),
  REFSUBNETS      NUMBER,
  REFIPS          NUMBER,
  IDN_DOMAIN      VARCHAR2(200),
  IDN_TLD         VARCHAR2(50),
  PREVGLOBALRANK  NUMBER,
  PREVTLDRANK     NUMBER,
  PREVREFSUBNETS  NUMBER,
  PREVREFIPS      NUMBER
);


TABLE majestic_million
 Name                                      Null?    Type                      
 ----------------------------------------- -------- ----------------------------
 GLOBALRANK                                         NUMBER                    
 TLDRANK                                            NUMBER                    
 DOMAIN                                             VARCHAR2(200)            
 TLD                                                VARCHAR2(20)              
 REFSUBNETS                                         NUMBER                    
 REFIPS                                             NUMBER                    
 IDN_DOMAIN                                         VARCHAR2(200)            
 IDN_TLD                                            VARCHAR2(50)              
 PREVGLOBALRANK                                     NUMBER                    
 PREVTLDRANK                                        NUMBER                    
 PREVREFSUBNETS                                     NUMBER                    
 PREVREFIPS                                         NUMBER

 2) Store the datafile majestic_million.csv  (CSV File without column name header ) in D:\ drive .

 3) Create a .ctl (Control file ) as majestic_million.ctl  in D:\ drive with following details as per column name of table.

LOAD DATA
INFILE 'D:\majestic_million.csv'
BADFILE 'D:\majestic_million.bad'
DISCARDFILE 'D:\majestic_million.dsc'
INTO TABLE "HR"."MAJESTIC_MILLION"
INSERT
FIELDS TERMINATED BY ','
(GLOBALRANK,
TLDRANK,
DOMAIN,
TLD,
REFSUBNETS,
REFIPS,
IDN_DOMAIN,
IDN_TLD,
PREVGLOBALRANK,
PREVTLDRANK,
PREVREFSUBNETS,
PREVREFIPS)

 4) Create a batch file majestic_million.bat to call sqlplus with the following details.

 sqlldr hr/hr control='D:\majestic_million.ctl' LOG= 'D:\majestic_million.log'

Either you can run by double clicking the batch file or providing the script from command prompt.
C:\>sqlldr hr/hr control='D:\majestic_million.ctl' LOG= 'D:\majestic_million.log'
 














5) Log file name should be explicitly mentioned in the command line or in the batch file.


****--------------------**********************--------------------------
The Logfile will look like this:

SQL*Loader: Release 11.2.0.1.0 - Production on Sun Mar 16 19:51:23 2014

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Control File:   D:\majestic_million.ctl
Data File:      D:\majestic_million.csv
  Bad File:     D:\majestic_million.bad
  Discard File: D:\majestic_million.dsc 
 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array:     64 rows, maximum of 256000 bytes
Continuation:    none specified
Path used:      Conventional

Table "HR"."MAJESTIC_MILLION", loaded from every logical record.
Insert option in effect for this table: INSERT

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
GLOBALRANK                          FIRST     *   ,       CHARACTER            
TLDRANK                              NEXT     *   ,       CHARACTER            
DOMAIN                               NEXT     *   ,       CHARACTER            
TLD                                  NEXT     *   ,       CHARACTER            
REFSUBNETS                           NEXT     *   ,       CHARACTER            
REFIPS                               NEXT     *   ,       CHARACTER            
IDN_DOMAIN                           NEXT     *   ,       CHARACTER            
IDN_TLD                              NEXT     *   ,       CHARACTER            
PREVGLOBALRANK                       NEXT     *   ,       CHARACTER            
PREVTLDRANK                          NEXT     *   ,       CHARACTER            
PREVREFSUBNETS                       NEXT     *   ,       CHARACTER            
PREVREFIPS                           NEXT     *   ,       CHARACTER            


Table "HR"."MAJESTIC_MILLION":
  1000000 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.


Space allocated for bind array:                 198144 bytes(64 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:       1000000
Total logical records rejected:         0
Total logical records discarded:        0

Run began on Sun Mar 16 19:51:23 2014
Run ended on Sun Mar 16 19:52:00 2014

Elapsed time was:     00:00:36.94
CPU time was:         00:00:05.78

************************************

If any errors log file will look like this:

Record 1: Rejected - Error on table "HR"."MAJESTIC_MILLION", column GLOBALRANK.
ORA-01722: invalid number

Record 628413: Rejected - Error on table "HR"."MAJESTIC_MILLION", column DOMAIN.
ORA-12899: value too large for column "HR"."MAJESTIC_MILLION"."DOMAIN" (actual: 114, maximum: 100)


Table "HR"."MAJESTIC_MILLION":
  999999 Rows successfully loaded.
  2 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.


Space allocated for bind array:                 198144 bytes(64 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:       1000001
Total logical records rejected:         2
Total logical records discarded:        0

Run began on Sun Mar 16 19:22:20 2014
Run ended on Sun Mar 16 19:22:58 2014

Elapsed time was:     00:00:38.06
CPU time was:         00:00:06.07

***************************---------------------------

If any errors bad file (majestic_million.bad) will  record the rows like below.

GlobalRank,TldRank,Domain,TLD,RefSubNets,RefIPs,IDN_Domain,.... etc... etc , PrevRefIPs
628412,326424,www.showsiteinf.appspot.com, ...etc ...etc ......626317,325407,652,728 


Hope this helps.

Cheers
Rajani

No comments:

Post a Comment