The SWB data structure is straightforward. All information about a dataset is contained in a "data.dict" file (the data dictionary) and in a set of "var" files (var1, var2, ..., varN), numbered in the same sequence as the variables in the data.dict file.
A sample data.dict file is shown below (this file describes five variables extracted from the General Social Survey 1996):
2904
var,1, T, "SPANKING", 7, 3, "FAVOR SPANKING TO DISCIPLINE CHILD"
"226. Do you strongly agree, agree, disagree, or "
"strongly disagree that it is sometimes necessary to "
"discipline a child with a good, hard spanking?"
0, "missing data"
1, "STRONGLY AGREE"
2, "AGREE"
3, "DISAGREE"
4, "STRONGLY DISAGREE"
8, "Do not know"
9, "Not applicable"
var,2, T, "SATJOB", 7, 5,"JOB OR HOUSEWORK"
"180. (IF R IS CURRENTLY WORKING, TEMPORARILY NOT "
"AT WORK, OR KEEPING HOUSE:) On the whole, how "
"satisfied are you with the work you do--would you "
"say you are very satisfied, moderately satisfied, a "
"little dissatisfied, or very dissatisfied? "
0, "missing data"
1, "VERY SATISFIED"
2, "MOD SATISFIED"
3, "A LITTLE DISSATISFIED"
4, "VERY DISSATISFIED"
8, "Do not know"
9, "Not applicable"
var,3, T, "POLABUSE", 5, 3,"CITIZEN SAID VULGAR OR OBSCENE THINGS"
"233A. (IF YES OR NOT SURE TO Q. 233:) Would you "
"approve of a policeman striking a citizen who: Had "
"said vulgar and obscene things to the policeman? "
0, "Missing data"
1, "YES"
2, "No"
8, "Do not know"
9, "Not applicable"
var,4, T, "JOBLOSE", 8, 4, "IS R LIKELY TO LOSE JOB"
"178. (IF R HAS A JOB:) Thinking about the next "
"12 months, how likely do you think it is that you "
"will lose your job or be laid off -- very likely, "
"fairly likely, not too likely, or not at all likely?"
0, "Missing data"
1, "VERY LIKELY"
2, "FAIRLY LIKELY"
3, "NOT TOO LIKELY"
4, "NOT LIKELY "
5, "LEAVING LABOR FORCE"
8, "Do not know"
9, "Not applicable"
var,5, N, "AGE", 0, 1, "AGE OF RESPONDENT"
"24. How old are you? (full years)"
Let us examine the contents of this file.
The first line gives the number of cases (respondents to a survey, records, etc.) in the dataset. In this case, there are 2904 respondents whose answers to the 1996 General Social Survey questionnaire are recorded.
The second line describes the first variable. The actual data for this variable will be contained in a file named "var1".
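This second line has the following comma-separated elements: the keyword var; the sequence number of the variable (it matches the number in the var file name); the variable type (T for a text variable whose values are codes, N for a numeric variable); the short variable name in double quotation marks; two counts, which in the sample above match the number of labeled response codes and the number of comment lines that follow; and the descriptive variable label in double quotation marks.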
The variable description line is followed by the variable comments, if present, on one or several lines. Each line of the comment text should be enclosed in double quotation marks. A list of possible labeled responses follows the comment text in the following format:
0, "missing data"
1, "STRONGLY AGREE"
2, "AGREE"
3, "DISAGREE"
4, "STRONGLY DISAGREE"
8, "Do not know"
9, "Not applicable"
Each line contains a numeric code (as it appears in the "var" files) and a label for this code, separated by a comma. Typically, you use code 0 for missing data in a text variable. Several other categories may describe answers such as "not applicable", "do not know" and similar. It is sometimes convenient to exclude such categories from analysis, and SWB procedures allow you to do just that, recognizing these categories by key words in their labels. In a numeric variable, missing values are indicated by the value -9999.
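The layout described above can be read with a few lines of Python. What follows is a minimal sketch, not the reader used by SWB itself: the names split_fields and parse_data_dict are hypothetical, and the code assumes that the two numbers on each variable line are the count of labeled codes and the count of comment lines, which is what the sample file suggests.

```python
import csv


def split_fields(line):
    """Split one data.dict line on commas, honoring double-quoted strings."""
    return next(csv.reader([line.strip()], skipinitialspace=True))


def parse_data_dict(text):
    """Parse data.dict text into (case_count, list of variable dicts).

    Assumption (inferred from the sample, not from a specification): the two
    integers on a "var" line are the number of labeled codes and the number
    of comment lines that follow it.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_cases = int(lines[0])
    variables = []
    pos = 1
    while pos < len(lines):
        fields = split_fields(lines[pos])
        keyword, number, vtype, name, n_labels, n_comments, label = fields
        if keyword != "var":
            raise ValueError(f"expected a 'var' line, got: {lines[pos]!r}")
        n_labels, n_comments = int(n_labels), int(n_comments)
        pos += 1
        # Comment lines: one quoted string per line
        comments = [split_fields(ln)[0] for ln in lines[pos:pos + n_comments]]
        pos += n_comments
        # Label lines: numeric code, then quoted label
        codes = {}
        for ln in lines[pos:pos + n_labels]:
            code, code_label = split_fields(ln)
            codes[int(code)] = code_label
        pos += n_labels
        variables.append({
            "number": int(number),
            "type": vtype,  # "T" = text (coded) variable, "N" = numeric
            "name": name,
            "label": label,
            "comments": comments,
            "codes": codes,
        })
    return n_cases, variables


SAMPLE = '''2904
var,3, T, "POLABUSE", 5, 3,"CITIZEN SAID VULGAR OR OBSCENE THINGS"
"233A. (IF YES OR NOT SURE TO Q. 233:) Would you "
"approve of a policeman striking a citizen who: Had "
"said vulgar and obscene things to the policeman? "
0, "Missing data"
1, "YES"
2, "No"
8, "Do not know"
9, "Not applicable"
var,5, N, "AGE", 0, 1, "AGE OF RESPONDENT"
"24. How old are you? (full years)"
'''

n_cases, variables = parse_data_dict(SAMPLE)
print(n_cases, [v["name"] for v in variables])
```

The csv module is used here only because it already handles commas inside quoted strings, which occur in the comment lines.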
The data about particular responses are contained in the "var" files. Each var file contains one column of numbers (codes of text variables, or numbers for numeric variables). The number of rows in this file equals the number of cases described by a dataset (in our example, it is 2904).
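As an illustration of how a var file might be loaded, here is a small sketch. The function names are hypothetical, and it assumes that codes of text variables are whole numbers while numeric variables may contain decimals. It also checks that the column has one row per case and shows how the -9999 marker of numeric variables can be skipped.

```python
import tempfile
from pathlib import Path

NUMERIC_MISSING = -9999  # missing-value marker for numeric variables


def read_var_file(path, numeric=False):
    """Read the single column of a var file into a list.

    Assumption: codes of text variables are integers, while numeric
    variables may hold decimals, so they are read as floats.
    """
    convert = float if numeric else int
    with open(path) as handle:
        return [convert(line) for line in handle if line.strip()]


def require_case_count(values, n_cases, filename):
    """Raise ValueError unless the column has exactly one row per case."""
    if len(values) != n_cases:
        raise ValueError(f"{filename}: {len(values)} rows, expected {n_cases}")


def valid_values(values):
    """Drop the -9999 missing-value marker from a numeric column."""
    return [v for v in values if v != NUMERIC_MISSING]


# Demonstration on a throwaway directory with a tiny, made-up numeric column
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder, "var5")
    path.write_text("34\n-9999\n51\n27\n")
    ages = read_var_file(path, numeric=True)
    require_case_count(ages, 4, "var5")
    known = valid_values(ages)
    print(sum(known) / len(known))
```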
For example, if the first respondent answered "Agree" to the first question, the second answered "Disagree", the third answered "agree", the fourth did not give an answer, the fifth answered "strongly disagree", then the first five rows of the var1 file will look like
2
3
2
0
4
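The example above can be traced in code. The sketch below turns the five codes back into labels and counts the answers while leaving out the non-substantive categories. The keyword list is an assumption made for this illustration; the actual key words that SWB looks for are not listed here.

```python
from collections import Counter

# Labels of the SPANKING variable, as given in the sample data.dict
SPANKING_LABELS = {
    0: "missing data",
    1: "STRONGLY AGREE",
    2: "AGREE",
    3: "DISAGREE",
    4: "STRONGLY DISAGREE",
    8: "Do not know",
    9: "Not applicable",
}

# Hypothetical keyword list, chosen for this example only
EXCLUDE_KEYWORDS = ("missing", "do not know", "not applicable")


def decode(codes, labels):
    """Translate numeric codes into their labels."""
    return [labels[code] for code in codes]


def is_excluded(label):
    """True if the label contains one of the exclusion key words."""
    lowered = label.lower()
    return any(keyword in lowered for keyword in EXCLUDE_KEYWORDS)


def frequencies(codes, labels, exclude=True):
    """Count answers per label, optionally skipping excluded categories."""
    counts = Counter(decode(codes, labels))
    if exclude:
        counts = {lab: n for lab, n in counts.items() if not is_excluded(lab)}
    return dict(counts)


# The first five rows of var1 from the example
var1_rows = "2\n3\n2\n0\n4\n"
codes = [int(row) for row in var1_rows.split()]

answers = decode(codes, SPANKING_LABELS)
table = frequencies(codes, SPANKING_LABELS)
print(answers)
print(table)
```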
Thus, creating datasets in SWB format is fairly simple. One thing that we strongly recommend after you have created a dataset is to check its integrity, i.e. to test whether all variables are described consistently in the data.dict and var files, and whether the var files contain any codes that don't have respective labels in the data.dict. To do just that, you might want to download this simple program that we provide for your convenience.
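To make clear what such an integrity check involves, here is a short sketch of the two tests mentioned above. It is not the downloadable program; check_integrity and the dictionary layout it expects are hypothetical, and the data in the demonstration is made up.

```python
def check_integrity(n_cases, variables, columns):
    """Return a list of problems found; an empty list means all is well.

    variables: list of dicts with "number", "name", "type" and "codes"
               (a dict mapping each labeled code to its label).
    columns:   dict mapping a variable number to the values in its var file.
    """
    problems = []
    for var in variables:
        name = var["name"]
        column = columns.get(var["number"])
        if column is None:
            problems.append(f"{name}: file var{var['number']} is missing")
            continue
        # Every var file must have one row per case
        if len(column) != n_cases:
            problems.append(f"{name}: {len(column)} rows, expected {n_cases}")
        # Every code of a text variable must have a label in data.dict
        if var["type"] == "T":
            unlabeled = sorted(set(column) - set(var["codes"]))
            if unlabeled:
                problems.append(f"{name}: codes without labels: {unlabeled}")
    return problems


# Made-up dataset with two deliberate errors
variables = [
    {"number": 1, "name": "POLABUSE", "type": "T",
     "codes": {0: "Missing data", 1: "YES", 2: "No",
               8: "Do not know", 9: "Not applicable"}},
    {"number": 2, "name": "AGE", "type": "N", "codes": {}},
]
columns = {
    1: [1, 2, 7, 9],    # code 7 has no label
    2: [34, -9999, 51], # one row short
}

report = check_integrity(4, variables, columns)
for line in report:
    print(line)
```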
You can also enter your data into some Windows spreadsheet or statistical analysis software, and then convert it to SWB format. Such data conversion is described below.
We have written several routines that convert SPSS, DALSolution and Microsoft Excel files into SWB format (i.e. into a data.dict file and a set of var files). Download these programs and try them out. The set of conversion routines is certainly not exhaustive, and we will be adding more as we go along. We also hope that the simplicity of the SWB format will let you easily write conversion scripts of your own. (In that case, you are more than welcome to share them with us, so that we can make them available to everybody from this page.)
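As a rough idea of what a conversion script of your own could look like, the sketch below builds the contents of data.dict and the var files from comma-separated text. It is not one of the routines offered for download. The names build_swb and write_swb, the layout of the specs list and the sample data are all hypothetical, and the sketch assumes the source data already holds numeric codes and that the two counts on each variable line are the number of labels and the number of comment lines.

```python
import csv
import io
from pathlib import Path


def build_swb(rows, specs):
    """Return a dict {file name: file content} for one dataset.

    rows:  list of dicts, one per case, keyed by variable name.
    specs: list of dicts with "name", "type" ("T" or "N"), "label",
           optional "comments" (list of str) and "codes" ({code: label}).
    """
    dict_lines = [str(len(rows))]
    files = {}
    for number, spec in enumerate(specs, start=1):
        codes = spec.get("codes", {})
        comments = spec.get("comments", [])
        dict_lines.append(
            f'var,{number}, {spec["type"]}, "{spec["name"]}", '
            f'{len(codes)}, {len(comments)}, "{spec["label"]}"'
        )
        dict_lines.extend(f'"{comment}"' for comment in comments)
        dict_lines.extend(
            f'{code}, "{text}"' for code, text in sorted(codes.items())
        )
        # One var file per variable: a single column, one row per case
        files[f"var{number}"] = "".join(
            f'{row[spec["name"]]}\n' for row in rows
        )
    files["data.dict"] = "\n".join(dict_lines) + "\n"
    return files


def write_swb(files, directory):
    """Write the generated files into an existing directory."""
    for filename, content in files.items():
        Path(directory, filename).write_text(content)


# Made-up source data: values are already numeric codes
CSV_TEXT = "POLABUSE,AGE\n1,34\n2,-9999\n9,51\n"
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

specs = [
    {"name": "POLABUSE", "type": "T",
     "label": "CITIZEN SAID VULGAR OR OBSCENE THINGS",
     "comments": ["Would you approve of a policeman striking a citizen?"],
     "codes": {1: "YES", 2: "No", 9: "Not applicable"}},
    {"name": "AGE", "type": "N", "label": "AGE OF RESPONDENT"},
]

files = build_swb(rows, specs)
print(files["data.dict"])
```

Calling write_swb(files, some_directory) would then save the result. Labels that themselves contain double quotation marks are not handled in this sketch.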
Download the SPSS2SWB.sbs script and open it in SPSS. Be sure that your data is loaded into SPSS, then run the script. The script should create a data.dict and var files in the directory where the script resides.