Project Summary

NEWS: 8/12/2009 QuickLoad 1.3beta released.

Driver

Take a sneak peek at Message Boards for Excel to Database data transfer and you
will find something similar to the following -
How does one load MS-Excel data into Oracle?
How to copy data from Excel to Oracle?
How to get data from Excel to Oracle?
How to read data from excel spreadsheet into an Oracle Table?
How to automate loading of data from Excel spreadsheet into Database?

Not a rocket science though!

Anyone who has worked as a System Maintenence/Support person would have done this one thing at least once - and that is loading data from a file into a database. Most commonly these files are plain text having delimited values. But data analysts prefer working with spreadsheets. Of course a spreadsheet like Excel offers so much more. When this spreadsheet reaches a support person like me or you for some backend manipulation, things can be quite cumbersome at times. Especially if the number of files is high. The 'normal' procedure (with Oracle) is first converting the Excel to a delimited text file. Then creating a database table with required columns and then using SQL Loader utility but not before you have the control file. Imagine doing this for 50 files! I couldn't handle 10, so I sat down one fine Saturday and wrote this. The objective was to have a program which can handle multiple files and which does not require me to create the table or even map the file to table or its columns. As in most cases we need the data in a table before we can do anything with it. So there it is...

Legal

© 2006-2009 Abhinav Srivastava

Licensed under the Apache License, Version 2.0

This distribution of QuickLoad includes Jakarta POI and CLI binary distributions
which are covered under Apache Software License.

http://www.apache.org/licenses/LICENSE-2.0
http://jakarta.apache.org/poi/
http://jakarta.apache.org/commons/cli/

System Requirements

WIN XP/2000/NT, JRE 5.0, JDBC Driver

What it does

The current distribution supports MS-EXCEL to ORACLE/MySQL data load. The program determines the Table name based on the file and worksheet names. If the table already exists, further processing is stopped (definitely a candidate for improvement!). Column names are decided based on the first row of the worksheet which has any value. Column names are unique. From subsequent rows only those cells are selected which have a valid column header (in the first NOT NULL row).

Installation

Unzip the distribution in a directory of your choice.

The extracted folder structure will be like this -

C:\QuickLoad1.1beta
config.properties
EXCEL2DB.CMD
QuickLoad-bin1.1beta-68.jar
QuickLoad-src1.1beta-68.zip
---docs\

Edit EXCEL2DB.CMD and set JDBC_DRIVER_LIB value or leave it blank if the driver is already in the classpath.

Use

Edit config.properties and fill in suitable values for your environment/requirement.

From the directory where the full distribution is unzipped
run EXCEL2DB.CMD config.file input.file log.file

input file is the Excel document that you want loaded in the database.
config file supplies the program its necessary parameters.
logfile is optional. If logfile is not specified anywhere, then
log is redirected to the console.

Wildcards for input file name are supported.

Configuration Parameters

The configuration file should have the following properties set as appropriate.

JDBC_DRIVER_CLASS e.g. oracle.jdbc.driver.OracleDriver

CONNECTION URL e.g. jdbc:oracle:thin:@hostname:1521:sid
Check with your DBA for hostname and SID values.

DB_USER/DB_PASSWORD User Id and Password for the database.

COLUMN_SIZE Size of the table columns. Default 80

TAB_NAME_PREFIX/
COL_NAME_PREFIX
These prefixes are used in Table and Column Names.

BATCH_SIZE No of records to be written to the Database in one batch. Default 50

DB_MANAGER For Oracle use com.as.quickload.db.dbmodule.OracleDBManagerImpl For MySQL use com.as.quickload.db.dbmodule.MySQLDBManagerImpl

FILE_READER For Excel use com.as.quickload.file.filemodule.ExcelFileReaderImpl

DATE_FORMAT Format in which date would be stored in Database, this is a java.text.SimpleDateFormat pattern.

READ_NUM_AS_TEXT Whether numbers stored in Input file is stored as NUMBER(0.00) or STRING Yes (Default) or No

LOG_FILE Log File, use forward slashes e.g. c:/logfile.txt

TRIM Trim Whitespaces around values: Yes (Default) or No

CASE Change the case: Upper, Lower, Keep (Default)

SKIP_HIDDEN_COLS Skip Hidden Columns of the Excel: Yes, No (Default)

Limitations

*Unlike database tables, Excel can have a free form grid. It can also have different datatypes in the same column. This cannot be supported by a normal database table. QuickLoad treats all columns as VARCHAR2 (i.e. Strings).
*Column names and columns themselves are decided from the first NOT NULL row of Excel. Other rows and columns are ignored.
*Cells with datatype TEXT, NUMBER and DATE are supported. Others are stored as
_INDETERMINATE.
*The column size is set in the configuration file. If an excel cell contains longer value, the process would fail.

Future Directions

*Customization of TABLE/COLUMN names.
*Loading delimited text files.
*Who knows a GUI!
*Unix support.
*SQL Server support.

Changes from Release 1.0

#Option to TRIM and Change CASE.
#Option to log onto the Console.

Changes from Release 1.1

#Option to configure MAX Table and Column Lengths
#Reduced occurrences of underscores "_" in table/column names.

Changes from Release 1.2

#Option to skip hidden columns
#Functions are evaluated before loading the data.

FAQ

*I do not see EXCEL2DB.CMD in the distribution.
--
If you do not find EXCEL2DB.CMD in the root of the archive, you can write one yourself with the following content:

@echo off
echo ******************************************************
echo Usage - EXCEL2DB.CMD config.file data.file [logfile] *
echo ******************************************************

SETLOCAL
set JDBC_DRIVER_LIB=
set APP_JAR=[QuickLoad-binXYZ.jar]

set ag3=%3

if NOT ""%ag3% == "" goto SETLOG
if ""%ag3%=="" goto RUN

:SETLOG
set ARG3=-log %ag3%
GOTO RUN

:RUN
JAVA -cp %APP_JAR%;"%JDBC_DRIVER_LIB%";"%CLASSPATH%" com.as.quickload.QLMain -config %1 -data %2 %ARG3%
ENDLOCAL


Once again, this file should be in the root of the directory where you have unzipped the distribution. Unzip only the ZIP file that you downloaded. No other archive needs to be exploded.

*When I try to load the file I see exceptions like
java.lang.reflect.InvocationTargetException
org.apache.poi.hssf.record.RecordFormatException

--
Make sure you do not have any filters or combo-boxes on the Excel document.
You can turn off filters from the excel menu (Data->Filter->Autofilter)

*OutOfMemoryException
--
On the line
JAVA -cp %APP_JAR%;"%JDBC_DRIVER_LIB%";"%CLASSPATH%" com.as.quickload.QLMain -config %1 -data %2 %ARG3%
in excel2db.cmd you can include JVM memory parameters.
e.g. java -Xmx512m ...