Enterprise data can be of very different kinds ranging from flat files to data stored in relational databases with the recent trend of storing data in XML data sources. The extraordinary number of database related products, and their historic evolution, makes this task exacting. The entry of cloud computing has turned this into one of the hottest areas as SSIS has been one of the methods indicated for bringing ground based data to cloud storage in SQL Azure, the next milestone in Microsoft Data Management. The reader may review my book on this site, "Beginners Guide to Microsoft SQL Server Integration Services" to get a jump start on learning this important product from Microsoft.
SQL Azure is one of the three pillars of Microsoft's Azure cloud computing platform. It is a relational database built on SQL Server Technologies maintained on Microsoft's physical site where subscribers like you and me can rent out storage of data. Since the access is over the internet it is deemed to be in the cloud and Microsoft would provide all data maintenance. Some of the key benefits of this 'Database usage as a service' are:
- High Availability
Which in other words means taking away a lot headache from you like worrying about hardware and software (SQL Azure Provisioning takes care of this), replication, DBAs with attitudes etc.
Preparation for this tutorial
You need some preparation to work with this tutorial. You must have a SQL Server 2008 installed to start with. You also need to register yourself with Microsoft to get an invitation to use SQL Azure by registering for the SQL Azure CTP. Getting permission is not immediate and may take days. After you register agreeing to the license terms, you get the permission (You become a subscriber to the service) to use the Azure Platform components (SQL Azure is one of them). After subscribing you can create a database on the SQL Azure instance. You will be the administrator of your instance (Your login will be known as the server level principal equivalent to the landbased sa login), and you can web access the server with a specific connection string provided to you and a strong password which you create. When you access the Azure URL, you provide the authentication to get connected to your instance of the server by signing in. Therein, you can create a database or delete an existing database. You have couple of tools available to work with this product. Read the blog post mentioned in the summary.
Overview of this tutorial
In this tutorial you will be using MS SQL Server Integration Services to create a package that can transfer a table from SQL Server 2008 to SQL Azure for which you have established your credentials. In my case the credentials are:
- Server: tcp:XXXXXX.ctp.database.windows.net
- User ID: YYYYY
- Password: ZZZZZ
- Database: PPPPPP
Here XXXXXX, YYYY,ZZZZZ, and PPPPPP are all the author's personal authentication values and you would get yours when you register as previously mentioned.
Table to be migrated on SQL Server 2008
The table to be migrated on the SQL Server 2008 (Enterprise server, evaluation edition is shown in the next figure). PrincetonTemp is a simple table in the TestNorthwind database on the default instance of the local server on a Windows XP machine, with a few columns and no primary key.
Create a SQL Server Integration Services Package
Open BIDS (a Visual Studio add-in extending support to build database applications with SQL Server) and create a new SQL Server Integration Services project[Use File |New |Project...in the IDE]. Herein the Visual Studio 2008 with SP1 is used. You need to provide a name which for this project is GroundToCloud. The program creates the project for you which you can see in the Solution Explorer. By default it creates a package for you, Package.dtsx. You may rename the package (herein ToAzure.dtsx)and the project folders and file appear as shown.
Add an ADO.NET Source component
Drag and drop a Data Flow Task to the tabbed page Control Flow in the package designer. Into the Data flow tabbed page drag and drop an ADO.NET Source component from the Toolbox. Double click the component you just added, from the pop-up menu choose Edit... The ADO.NET Source editor gets displayed. If there are previously configured connections one of them may show up in this window. We will be creating a new connection and therefore click the New... button to display an empty Configure ADO.NET Connection Manager as shown (again, if there are existing connections they all will show up in this window). A connection is needed in connecting to a source outside the IDE.
Double click the New... button to display the Connection Manager window which is all but empty. Fill in the details for your instance of ground based server as shown (the ones shown are for this article at the author's site). You may test the connection by hitting the Test Connection button.
Clicking the OK buttons on the Connection Manager and the Configure ADO.NET Connection Manager will bring you back to the ADO.NET Source Editor displaying the connection you have just made as shown. A connection string also gets added to the bottom pane of the package designer as well as to the Configure ADO.NET Connection Manager.
Click on the drop-down and pick the table (PrincetonTemp) that needs to be migrated to the cloud based server, SQL Azure. Click OK. The Columns navigation on the left would reveal all the columns in the table if it were to be clicked. The Preview button would return the data returned by a SELECT query on the columns as shown.
Add an ADO.NET Destination Component
Drag and drop an ADO.NET Component from the Toolbox to the Data Flow tabbed page adjacent to the ADO.NET Source you added earlier. Before you configure this establish a Flow to this component from the ADO.NET Source by dragging the dangling green arrow with the mouse and dropping it on the ADO.NET Destination. The dangling line now connects the Source and the Destination.
Double click the ADO.NET Destination component to open the ADO.NET Destination Editor window similar to the source window you configured earlier. The destination, which is the SQL Azure will need a connection. SQL Azure can be accessed by ADO.NET. Invoke the Configure ADO.NET Connection Manager just as you did earlier by hitting the New... button. In the window that gets displayed hit on the New... button to display the Connection Manager page. Accept the default provider. The server name is the name that was shown earlier which should contain the full string starting with "tcp:XXXXXX.ctp.database.windows.net". The authentication for SQL Azure is the one that is provided by Microsoft while you obtained your permission by subscribing to their service. In this case, the USER ID:YYYYY and the PASSWORD:ZZZZZ. You may test and verify the connection by hitting the Test Connection button as before. If the connection was properly made and the Server at Microsoft is up and running (high availability) you should also be able to see the databases on your server. Choose the one into which you want to take your ground based table. Herein a database PubdAzure was chosen. Your ADO.NET Destination Editor would look like the figure shown.
There will be no tables (or views) to choose. Hit the New... button to create a table. You may get a warning as shown. The SQL Azure data types may be considered as a subset of the ground based server and if you have tables other than the one used here, you may have to look up Microsoft documentation for data type matching.
When you click OK then a Create Window gets displayed with the following table statement.
CREATE TABLE "ADO NET Destination 1" (
"Temperature" DOUBLE PRECISION,
"RecordHigh" DOUBLE PRECISION
Herein you can make changes. The only change made was to change the name of the table to PrincetonAzure.
Click on the Mappings on the left to reveal how the incoming data from the source is mapped to the outgoing flow as shown in the next figure. Presently we will not be configuring the error for this tutorial (default is assumed).
Add a Data Viewer to monitor the data flow
The source to destination is completely configured but it is always nice to monitor the data that would be flowing from source to data. Double click the green connecting line to display the Data Flow Editor. Click on the Data Viewers on the left to add a Data Viewer. Here a Grid type data viewer with the name Flowing To AzureDB was added as shown.
Execute the package
This is the easy step. Build the project using menu Build. Right click ToAzure.dtsx and from the drop-down choose Execute Package. The program starts running, after a while the flow gets stopped at the DataViewer as shown.
If you let flow to reach the destination by clicking on the green arrow head on the DataViewer, the program continues and the flow tries to go to the destination. If the package were to be successfully executed the Destination Component and the Source Component in the package designer should turn green. However in this case, the destination component turned red implying that there were errors during execution.
Click on the Progress tab on the designer which has collected all the information from the time the Execute Package was clicked as shown.
The lines that are significant which show the reasons for the failure are shown below as they were not completely shown in the above figure:
[ADO NET Destination ] Error: An exception has occurred during
data insertion, the message returned from the provider is: Heaps
can not be replicated tables. Please create a clustered index for
[SSIS.Pipeline] Error: SSIS Error Code DTS_E_PROCESSINPUTFAILED.
The ProcessInput method on component "ADO NET Destination" (28)
failed with error code 0xC020844B while processing input "ADO
NET Destination Input" (31). The identified component returned an
error from the ProcessInput method. The error is specific to the
component, but the error is fatal and will cause the Data Flow
task to stop running. There may be error messages posted before
this with more information about the failure.
Although the package failed and no data was written to the destination you can verify that table and the table schema have been migrated as shown in this figure from SQL AZURE Manager. The SELECT query on the migrated table (PrincetonAZURE) shows all the columns of PrincetonTemp.
The article described in detail the process of migrating a table from the ground based SQL Server 2008 to the cloud based SQL Azure server instance. The process of migration uses a Microsoft SQL Server Integration Services package which uses ADO.NET Source and a ADO.NET Destination components. There are other options for migrating to SQL Azure. Read my blog post on this and other SQL Azure related posts.
If you have read this article you may be interested to view :
- Best Practices for Microsoft SQL Server 2008 R2 Administration
- SQL Server 2008 R2: Multiserver Management Using Utility Explorer
- Exporting data from MS Access 2003 to MySQL
- Transferring Data from MS Access 2003 to SQL Server 2008