Minitab Cookbook

By Isaac Newton
    What do you get with a Packt Subscription?

  • Instant access to this title and 7,500+ eBooks & Videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. Free Chapter
    Worksheet, Data Management, and the Calculator
About this book

Minitab has been a statistical package of choice across all numerous sectors of industry including education and finance. Correctly using Minitab's statistical tools is an essential part of good decision making and allows you to achieve your targeted results, while displaying fantastic charts and a powerful analysis will also communicate your results more effectively.

"Minitab Cookbook" will take the mystery out of using Minitab and will simplify the steps to produce great results. This book will be hugely beneficial for anyone who knows what statistics or studies they want to run, but who is unsure about just what button to press or what option to select.

In this book, you will learn how to use data from different sources and will be guided through the basics of graphs as well as the basics of hypothesis tests. You will explore the use of non-linear regression, how to construct complex ANOVAs, and even delve into Multivariate statistics.

"Minitab Cookbook" is a great reference on how to create graphs, generate P-values, and how to put data in order. You will explore the basics of charts as well as into the complex depths of Factor analysis, and you will learn everything from the simplest of t-tests to the complexity of mixed model ANOVA. And finally, for all of you who want to write Macros, this book covers the use of Execs, Global, and Local Macros.

Publication date:
February 2014
Publisher
Packt
Pages
338
ISBN
9781782170921

 

Chapter 1. Worksheet, Data Management, and the Calculator

In this chapter, we'll be covering the following recipes:

  • Opening an Excel file in Minitab

  • Opening data from Access using ODBC

  • Stacking several columns together

  • Stacking blocks of columns at the same time

  • Transposing the columns of a worksheet

  • Splitting a worksheet by categorical column

  • Creating a subset of data in a new worksheet

  • Extracting values from a date/time column

  • Calculator – basic functions

  • Calculator – using an if statement

  • Coding a numeric column to text values

  • Cleaning up a text column with the calculator

 

Introduction


This first chapter illustrates the use of a worksheet, data menu, and calculator tools. We start by bringing data in from other applications and then working on it using other data tools within the calculator menu. The emphasis is on getting data into Minitab and reformatting into a preferred structure for later studies.

As you can see in the preceding screenshot, Minitab prefers working in columns. This can be structurally a different approach to Excel, where the data is held in cells. We will start by bringing our data in from Excel and Access, and then move into reformatting the data. It should be noted that copying and pasting into Minitab can be a perfectly acceptable method of moving data from one application to another.

 

Opening an Excel file in Minitab


In this task, we will open a set of data in an Excel file. This can be either in .xls or .xlsx format. We will set options to help with reading the data in the Excel file as the correct type of data.

Getting ready

Preparation for this task is very simple. We only need a set of data saved in an Excel workbook. Any worksheet is suitable, but be aware that the Open Worksheet… command will open every worksheet in an Excel workbook at the same time. The formatting options that we will use here are applied across the entire workbook and cannot have separate format options for every worksheet. Minitab worksheets have a maximum limit of 4000 columns and a practical limit of 10 million rows.

We follow an example here using the Pulse workbook.xlx worksheet.

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

How to do it…

The following instructions detail the steps for importing data from Excel by using the open worksheet command:

  1. Within Minitab, go to the File menu and click on Open Worksheet.

  2. Change the Files of type field to Excel, and navigate to the folder containing the Excel file.

  3. Select the Excel file by clicking on the workbook.

    Note

    Double-clicking will open the workbook, but it is important to use Preview and Options as in the following steps.

  4. Click on Preview to see the file structure, make a note of the row in which column names appear, and the data appears. In the following example, the column names are in the third row and the data starts from the fourth row.

  5. Click on OK and then select Options.

  6. Select Variable Names to indicate the row of the column names. In this example we will use row 3. The first row of data can be set to row 4. The automatic setting will pick the next row with any data in it for the first row of data. Click on OK.

  7. Click on Preview. We can check if this has helped with identifying the type of data. Further alterations to data type can be made. If further alterations need to be made, either change the data type from the drop-down list under each column name or return to Options to see what further changes need to be made.

  8. Click on Open.

How it works…

The Preview screen will display the first 100 rows in the dataset. This can be a useful tool in seeing how the file is going to be opened and then deciding what needs to be changed in options.

Excel files can come in many different formats, and while options cannot correct everything, it is an important first step.

If a dataset contains summarized data rows such as means or standard deviations at the end of the worksheet, it is best to exclude them. This can be performed by limiting the number of rows that Minitab will open.

Another option that is useful is to ignore blank data rows. Any row that is completely empty will be left out as it is unnecessary to include them in a Minitab worksheet.

There's more…

Text files, CSV files, XML files, and more can be opened using the Open Worksheet option. While opening text files, column separators can be identified by using the field definition.

See also

  • The Opening data from Access using ODBC recipe

 

Opening data from Access using ODBC


Here, we will show the instructions to pull data from a table within Access.

Getting ready

The instructions are left generic to enable us to use a suitable Access database. Try using these with your data.

How to do it…

The following instructions detail the steps for importing data from a database into Minitab:

  1. Within Minitab, go to the File menu and click on Query Database(ODBC).

  2. Click on Machine Data Source, and select MS Access Database from the data source list. Then click on OK.

  3. We can select the drive at the bottom of the next screen, and navigate to the folder containing the database.

  4. Select the database and click on OK; fill in the username and password as required.

  5. We will pick a table in the database from the drop-down control and then select the columns required from the selection on the left-hand side.

How it works…

ODBC stands for Open Database Connectivity and is aimed to allow transfer of data between databases independent of operating systems or database systems. Most databases supporting ODBC can be queried this way using Minitab.

Here, Minitab is constructing an ODBC command to ask Access for the data. The command that was sent from Minitab to Access can be seen by going into the History folder.

 

Stacking several columns together


Minitab will prefer data set up as columns, and often, it is better to stack data together, using one column for all the results, and a second or third column to group the information together. Here, we will stack several numeric columns together into one column.

Getting ready

We want to stack several columns together to give a column of results and an indicator column to identify the group they belong to.

We require a worksheet with several columns of the same data type. In this example, we use data from the Party Membership of US Senators file. This data can be found at

http://mathforum.org/workshops/sum96/data.collections/datalibrary/data.set6.html.

This data is in the Excel file format. Download the file senators.xls. If you are copying and pasting data into Minitab, copy lines 3 to line 53. Alternatively, when following the instructions in the Opening an Excel file in Minitab recipe, set row 3 for the variable names.

The first column should be named State. Ensure that in Minitab the data appears as shown in the following screenshot:

How to do it…

The following instructions will stack the party columns together to create a single column for the number of US Senators:

  1. Go to the Data menu and select Stack; then click on Columns.

  2. Select the numeric columns, Democrats, Republicans, and Independents from the left-hand selection. They can be double-clicked on across into the section Stack the following columns:, alternatively select column 2 and drag down to column 4. Then click on the Select button to move the columns as a group.

  3. Name the worksheet Stacked Data, from the New Worksheet section and click on OK.

  4. The worksheet will appear as in the following screenshot. As the columns will not be named from the stack command we need to rename them in the new worksheet. Column 1 Subscripts should be renamed to Party, and column 2 should be renamed to Senators.

The preceding screenshot shows the stacked results that are created in a new worksheet. A useful tip here to avoid confusion while generating new worksheets, is to name the worksheet that will be generated in the stack column's dialog box.

How it works…

By default, the columns are created in a new worksheet, and an additional subscript column is created to differentiate the groups in the data. This can be deselected and will identify the stacked data by numeric values: 1 for the first column, 2 for the second, and so on. Alternatively, the data can be stacked into the same worksheet.

See also

  • The Stacking blocks of columns at the same time recipe

  • The Transposing the columns of a worksheet recipe

 

Stacking blocks of columns at the same time


With some datasets, we may want to keep several columns together while running a stack command rather than running several individual stack commands to stack blocks of columns.

In the previous recipe, the data for senators was stacked, but we didn't keep the information about the states in the new worksheet. Stacking blocks of columns can keep the information about the states with the stacking command.

Getting ready

We will use the senators data from the previous recipe. This can be found from the following location:

http://mathforum.org/workshops/sum96/data.collections/datalibrary/data.set6.html

The Getting ready section of the Stacking blocks of columns at the same time recipe has details on how to import the data.

How to do it…

The following instructions will stack the party membership details and retain information about the state at the same time:

  1. Navigate to Data | Stack and then click on Blocks of Columns.

  2. In the first field, enter Democrats and select the State columns.

  3. In the second field, enter Republicans and select the State columns

  4. In the third field, enter Independents and select the State columns

  5. In the Store stacked data in section, name the new worksheet Stacked Data and click on OK.

  6. As with the stack command, the columns created in the new worksheet are not named. We will need to rename these columns, go to the section above the data in column one and name it Party, column two as Senators, and column three as State.

How it works…

The stack blocks of columns commands will stack columns vertically by order in each row. The number of senators for Democrats will be placed in the second column at the top, those for Republicans will placed below it, and finally, the Independent senators will appear. The State column is placed in each row to ensure that the stacked data has the correct state next to each count of senators in the second column.

The optional selection of Use variable names in subscript column uses the name of the first column in each row to identify where the data has come from in the new worksheet and places this in the first column. Hence, in the stacked worksheet, column 1 lists Democrats, then Republicans, and finally, Independents.

The preceding screenshot shows the worksheet generated up to step 5 in the instructions. The column names are stored in the subscript column, C1. If we do not want the name of the parties to appear in column 1, then unselecting this option will create the subscript column with numbers instead.

As the stack commands do not name the columns, we should use step 6 to ensure that we can identify each column when we use these later in dialogs.

See also

  • The Stacking several columns together recipe

  • The Transposing the columns of a worksheet recipe

 

Transposing the columns of a worksheet


Data is often structured with categories denoted in a row and multiple observations of these dimensions in subsequent columns. This can be a popular format used in Excel worksheets. Here, we will transpose this into the column format that is preferred by Minitab in the following manner:

In the preceding screenshot, we can see that different individuals are listed starting from the second column. Their resting pulse is in first row, and after activity in second row, it is preferable for this data to be transposed.

Getting ready

The data used in this example is in the Pulse workbook.xlx file. To get this data into Minitab, follow the Opening an Excel file in Minitab recipe.

How to do it…

The following instructions will transpose the numeric columns of the Pulse workbook file:

  1. Click on the Data menu and then click on Transpose Columns.

  2. Select the subject columns in the left-hand column panel by selecting C2 and dragging down through the columns till the end of the column list. Then click on Select.

    Tip

    The left-hand pane in the following screenshot will show the available columns that can be selected. We could double-click on each column across, but here it is faster by dragging and selecting to highlight the columns we want. We could also use Shift or Ctrl to allow multiple column selection.

  3. In the Create variable names using column section, enter C1 and click on OK.

How it works…

Columns of different data types cannot be transposed with this command. We cannot transpose column 1 from the original data into the numeric data held by the subject columns. Row one of the original worksheet becomes column one of the transposed worksheet. Mixing data types will generate an error. Instead, variable names should be optionally identified by the row categories if they are given in a column of the original worksheet. Here, we use column one to identify the column names of the transposed data.

See also

  • The Opening an Excel file in Minitab recipe

  • The Stacking several columns together recipe

  • The Stacking blocks of columns at the same time recipes

 

Splitting a worksheet by categorical column


In this recipe, we will create several new worksheets from a grouping column. Split worksheet is a great way of quickly separating data out into separate worksheets.

Getting ready

We will use the file pulse.mtw from the sample Minitab data folder. Here, we will split the worksheet into two new worksheets: one for those who ran and the other for those who didn't run.

To open the pulse file, go to File and then click on Open Worksheet. Click on the button Look in Minitab Sample Data folder. Then find the file pulse.mtw. If the file does not show up, make sure the file type is set to Minitab (*.mtw; *.mpj).

How to do it…

The following instructions will split a worksheet by a categorical column:

  1. Go to the Data menu and click on Split Worksheet.

  2. In the By variables section, select the Ran column, and click on OK.

How it works…

Split worksheet is a very quick command to generate new worksheets for each grouping level of a column. If we had data on temperature measurements over time from different weather stations, the Split worksheet could generate a new worksheet for each weather station to enable us to analyze the data separately.

See also

  • The Creating a subset of data in a new worksheet recipe

 

Creating a subset of data in a new worksheet


Here, we will want to obtain a smaller set of data from a large worksheet. This is useful when we do not want to analyze all the data, or maybe only a small portion is of interest to us. We will create a subset of this worksheet to look only at the most recent years, that is, from 2000 onwards.

Getting ready

The data shown in the following screenshot is from the Met office (http://www.metoffice.gov.uk/climate/uk/stationdata/) and shows weather details for the Oxford weather station on a monthly basis from 1853. We will subset the data to view the results from 2000 onwards. Select the Oxford station data and copy the data into Minitab.

Note

When copying the data into Minitab from the website, only select the information for the year and weather. Copying the column header information will create text columns. Minitab only allows one header for column names, and the dataset here has a header for names and a second header for units. It is advised that we only grab the information for now and rename the columns afterwards.

The Oxford weather (cleaned).mtw file is provided in the code bundle.

How to do it...

The following instructions will generate a new worksheet of weather data 2000 onwards:

  1. Go to the Data menu and click on Subset Worksheet.

  2. Rename the new worksheet as Temperature for 2000 onwards and click on the Condition button.

  3. In the Condition section, double-click on the Year column to move this into the Condition section, and then add >= 2000, as shown in the following screenshot. Then click on OK twice.

How it works…

Subset worksheet can create a new worksheet based on an entered condition, row numbers, or selected data that we have brushed on a chart.

This gives a bit more control than split worksheet, and we only generate one worksheet.

There's more…

With this command, data can be excluded or included very quickly.

Row numbers can be specified in the Row numbers field, either singly, with a space between each value (for example, 2 4 8 9 10), or a range of row numbers can be entered by using a colon (for example, 100:150).

The Condition section works in the same way as the calculator. Multiple conditions can be added with the use of And, Or, and Not. Text values should be referred to within double quotes (for example, "April") and date formatted data should be used as DATE("05/11/2000").

See also

  • The Splitting a worksheet by categorical column recipe

 

Extracting values from a date/time column


Often, it can be useful to find values such as day, month, or hour of the day from a date column. This can help later on while trying to find out about the variation in our results, finding effects based on day of the week or month of the year.

Getting ready

The data column displayed in the following screenshot shows dates in the mm/dd/yyyy format. To run this recipe, type the following data into a new column and label the column as Date. We will then create a new column displaying the months from the dates.

How to do it…

The following steps will create a column for the month from the dates in the worksheet:

  1. Navigate to Data | Extract from Date/Time | Extract to Text.

  2. In the dialog box, enter the date column in the Extract from date/time column section. Enter Month into the Store text column in section.

  3. Tick the Month selection and click on OK.

How it works…

If the worksheet does not contain a column called Month already, it will create a new column for us. The selected fields will then populate this column in the worksheet. The Extract to Text option will create months and days of the week by their names, where the Extract from date/time to Numeric option will create a column of numbers.

 

Calculator – basic functions


The e-calculator tool is analogous to formulas in Excel. In Minitab, we will run calculations on the columns one at a time. Using the weather data from the Oxford station, we will create a new column for the hours of sunlight in a month divided bythe amount of rainfall (in millimeters) for that month.

Getting ready

We will be using data from Oxford weather station; it can be obtained from http://www.metoffice.gov.uk/climate/uk/stationdata/. Select Oxford weather station data and copy the data into Minitab. Only copy the data and not the column headers. Then, rename the columns as Year, Month, T Max, T Min, Airfrost(days), Rain(mm), and Sun(Hrs).

Note

Inspect the data that we copy to ensure that the columns have the right data type. The column header C1-T will indicate text, C1-D will indicate data, and C1 without the hyphen is a numeric column.

Also, note that a few of the values in the data will be recorded as missing; this is where text values are supplied next to the number. A cleaned worksheet is provided in the Oxford weather (cleaned).mtw file.

How to do it…

The following instructions will create a new column for the hours of sunlight to amount of rainfall in a month ratio:

  1. Go to the Calc menu and click on Calculator.

  2. In the dialog box, enter a name for the new column into the Store result in Variable section. We will name it as 'Sun(Hrs)/Rain(mm)'.

  3. Select the section labeled Expression, and then double-click on Sun (Hours) to move it across, enter /, and then move the Rain(mm) column over.

  4. Tick the box labeled Assign as formula and click on OK.

How it works…

The results of this function are generated in the selected column. The checkbox Assign as a formula will create a formula in the worksheet that will keep updating it. By default, this is not selected to avoid problems with circular or self-referencing formulas. When a formula is stored in a column, a green cross is placed at the top of the column to indicate the presence of a formula. The green cross indicates that this is up-to-date, while a red cross would indicate a problem in updating the formula.

There's more…

The calculator in Minitab has a plethora of functions available. We can choose from the function list on the right in the calculator screen. These include several statistical, mathematical, and text functions. They can be filtered with the drop-down list at the top. The help function for the calculator includes a list of functions and how to use them.

See also

  • The Calculator – using an if statement recipe

  • The Cleaning up a text column with the calculator recipe

 

Calculator – using an if statement


If statements can be useful tools in extracting information from a column or in reformatting data. Here, we will recode the values in the pulse worksheet.

In the pulse dataset, the C4 smokes column indicates if an individual is a smoker. 1 indicates a regular smoker and 2 stands for those who do not smoke. We will use an if statement to replace the numeric values for smokers and nonsmokers.

Getting ready

The data used here is one of the example files that come with Minitab. Open the pulse.mtw worksheet from the sample Minitab data folder by going to the File menu and clicking on Open Worksheet. Select the icon Look in Minitab Sample Data folder.

How to do it…

The following instructions will convert the values of 1 and 2 in the Smokes column to Smokes and Non-Smoker

  1. Go to the Calc menu and select Calculator.

  2. In Store result in variable, enter the column Smokes.

  3. In the Expression text box, enter IF('Smokes' = 1, "Smoker", "Non-Smoker").

  4. Click on OK.

How it works…

An if statement is entered in the following format: IF(Logical statement, TRUE, FALSE); the value entered after the first comma being the true for the statement. Numeric values for the response are entered purely as numbers without any double quotes. Text is always entered inside double quotes, for example, "True". Column names are used with single quotes, for example, 'Pulse1'.

Note

Column names can be typed, double-clicked, or selected in the column from the available list on the left of the dialog box. While single quotes are used to denote columns, they are not always necessary. A column name, that is, a single name without special characters or spaces in the name does not need to be referred to with single quotes. The column Ran can be used as Ran or 'Ran'. Selecting a column from the list will automatically enter single quotes where necessary.

There's more…

The IF statements can be generalized to multiple IF conditions very simply. A general IF statement would have the following syntax:

IF(Statement1, True1, Statement2, True2… StatementN, TrueN, False)

When identifying missing values in the worksheet, the code must be used as the first statement. These values can be identified with the code Miss() or '*'.

For example, the following will identify a missing result in the Smokes column as Unknown, 1 as Smoker, and all others as Non-Smoker:

IF('Smokes' = '*', "Unknown", 'Smokes' = 1, "Smoker", "Non-Smoker")

See also

  • The Calculator – basic functions recipe

  • The Coding a numeric column to text values recipe

 

Coding a numeric column to text values


In the pulse worksheet, the Activity column lists the usual level of activity of the students in the study, where 1 denotes slight activity, 2 is moderate activity, and 3 stands for a lot of activity.

We will change the numeric values of Activity from 1, 2, and 3 into Slight, Moderate, and High.

Getting ready

Use File and Open Worksheet to open the pulse.mtw dataset worksheet from the sample Minitab data folder.

How to do it…

The following instructions will convert the values of 1, 2, and 3 in the Activity column to Slight, Moderate, and High:

  1. In the Data menu, navigate to Code | Numeric to Text.

  2. Enter the Activity column into Code data from columns.

  3. Enter the Activity column into the section Store coded data in columns.

  4. In the Original values section, enter 1 in the first line, 2 in the second line, and 3 in the third line, as shown in the following screenshot:

  5. In the New section, enter Slight in the first line, Moderate in the second line, and High in the third line.

  6. Click on OK.

How it works…

The code tools work like an IF statement. The code tools cannot be set as an updating formula in the worksheet like the IF statements can be set from the calculator.

In this example, we are replacing the numeric values in the Activity column and putting the new results back into the same column. This will change the data type of the Activity column from Numeric to Text.

There's more…

Ranges of numbers can be entered using colons. If we wish to change the values from 0 to 30 as low, 30 to 60 as medium, and 60 and above as high, then we would enter these as shown in the following screenshot:

The code tool does not accept symbols such as >, <, or =, and therefore, we can use a large number, such as 100000, to create a greater than statement.

See also

  • The Calculator – using an if statement recipe

 

Cleaning up a text column with the calculator


In the dataset shown in the following screenshot, volunteers are listed with their first and last names:

Throughout the column, there are errors in the capitalization of names. We will apply the correct case to the names and separate the names into new columns for first and last name.

Getting ready

Enter the names in the screenshot into a new column in Minitab. Name the column Volunteer.


How to do it…

The following instructions will use the proper word commands in the calculator to create a column of correctly capitalized surnames:

  1. In the Calc menu, click on Calculator.

  2. In the section Store result in variable, create a new column called Surname.

  3. Filter the function list by selecting the text functions from the drop-down list. Then look for the function Proper. Double-click on the function to move it into the calculator.

  4. Next, find the function Word in the text function list. Double-click on this to highlight and replace the section text in the calculator. This should look like the following screenshot:

  5. Double-click on the column Volunteer to bring it into the expression and replace the section labeled text.

  6. Replace word_num with 2 and delete the delimiters and brackets. The expression should look like the following screenshot:

  7. Click on OK to create a column of surnames capitalized correctly.

How it works…

The text function Proper returns a capital letter at the start of each word. The rest are set to lowercase. The Word function is used to locate a word in a text string. The Word 2 function finds the second word in a cell, using spaces as delimiters by default.

Nesting the Word function inside the Proper command tells Minitab to find the second word for the text in the column Volunteer and then set the first letter in each word as a capital and all others as lowercase.

There's more…

There is a list of calculator functions within the Help menu for the calculator. The text functions include a number of useful tools. Note, Item is similar to the Word function but allows the separator to be specified. For example, A_123_tx could be used with an Item function such as Item(c1,2,"_") to define the separator of the items as an underscore.

It is also worth noting that Find and Replace work in the worksheet as they do in other applications. Press Ctrl + F, and Ctrl + H or they can be found in the Editor menu.

About the Author
  • Isaac Newton

    It was probably inevitable that, after being gifted with the name Isaac, he discovered he was really good at mathematics and science.

    Isaac Newton  studied physics at Leicester University and is one of the few people to have an MPhys in Space Science and Engineering. MPhys degrees later changed to MSci after only two years. Yes, he has heard the joke or comment you are just thinking about. After a short stint of postgraduate studies at Birmingham University, he joined Minitab in 1999, where he has been helping the users of Minitab and taking training courses ever since.

    Apart from introducing Minitab courses and the basic statistical tools, he has the pleasure of teaching reliability statistics, design of experiments, macro writing, and time series, among other subjects. Recently, he was extensively involved in mentoring others in their own projects and assisting them on getting the most out of their data.

    Browse publications by this author
Latest Reviews (1 reviews total)
It's useful and help me a lot of studying
Minitab Cookbook
Unlock this book and the full library FREE for 7 days
Start now