Saturday, March 28, 2009

Fisher

Fisher (1890-1962) was a British statistician and geneticist, one of the founders of modern statistical science, whose work also helped clarify the foundations of Darwin's theory of evolution.

Fisher graduated from the University of Cambridge in mathematics, and it was the analysis of errors in astronomical observations that led him to explore statistical questions. In 1915 and 1918 he published two important papers: the former explored the distribution of the correlation coefficient; the latter showed that continuous genetic variation can be explained by the superposition of many variations that each obey Mendel's laws.

In 1919 he declined a post under K. Pearson and went to work at the Rothamsted agricultural experiment station. There Fisher developed his theory of the analysis of variance (ANOVA), hypothesis testing and experimental design, and put forward the principle of randomization, which allows a scientific experiment to examine many factors at once and reduces sampling bias. In 1925 he published the book "Statistical Methods for Research Workers".

In 1933 he took a post at the University of London, where he did research on the Rh blood groups. From 1943 to 1957 he returned to Cambridge to teach. He was knighted in 1952, and in 1956 he published "Statistical Methods and Scientific Inference".

William Sealy Gosset

Pearson's student

Gosset (William Sealy Gosset) was born in Canterbury, Kent, England, and studied at Winchester College and Oxford University, majoring in chemistry and mathematics. In 1899 Gosset joined the Dublin brewery A. Guinness & Sons, where the production process yielded a large amount of statistical data on the relationship between the characteristics of the raw materials (barley and so on) and product quality. The importance of improving barley quality eventually led him to study the design of field trials, and in 1904 he wrote his first report, "The Application of the 'Law of Error' to the Work of the Brewery".

From 1907 to 1937 Gosset published 22 statistical papers, which were reissued in 1942 under the title "'Student's' Collected Papers".

Paper: "The Probable Error of a Mean" (1908)

t distribution (Student's distribution)

Laid the foundation of small-sample statistical inference

K. Pearson

K. Pearson (1857-1936) was an English mathematician and philosopher, and one of the founders of modern statistics. He graduated from the University of Cambridge in 1879. In 1881 he studied in Germany at the University of Heidelberg and the University of Berlin, and from 1882 onward he received his master's degree and doctorate. In 1884 he was appointed Professor of Applied Mathematics and Mechanics at the University of London.

Galton's student

Contributions: multiple correlation, partial correlation, relieved function, moment estimation, chi-square distribution

Galton (1822-1911), the founder of biostatistics

"The pioneering work of Galton and others on regression analysis, as well as some work on time series analysis, ... are important events in the history of the development of mathematical statistics." (from the "Encyclopedia of China", Mathematics volume)

After his cousin Charles Darwin's masterpiece "On the Origin of Species" appeared, he was moved to apply statistical methods to questions about the heredity and evolution of intelligence. He was the first to apply mathematical methods such as probability and statistics to biological science, and he made the term "biometrics" explicit. The statistical concepts of "correlation" and "regression" were first used by Galton.

Monday, March 23, 2009

What is probability

From God's point of view everything is determined, so the existence of probability as a field of study is a testament to human ignorance. Fortunately, humans are smart enough: we are not left helpless just because things are random, and we use the likelihood of events to decide how to act. For instance, a person before Sugar must repeatedly consider the various possibilities. If people waited until everything was certain before acting, they could probably never do anything, because almost everything is random.

Statistical ideas

Statistical ideas include:
1. The idea of the mean: the general trend of the population as a whole
2. The idea of variation: the differences between the individual units of the population
3. The idea of estimation: inferring the population from a (random) sample
4. The idea of correlation: studying the relationships between things on the basis of a shared nature
5. The idea of fitting: an abstract expression of the relationship between different things
6. The idea of testing: statistical methods are inductive in nature, so their conclusions are probabilistic

Friday, March 20, 2009

SPSS Release History

SPSS 15.0.0 - September 2006
SPSS 16.0.1 - November 2007
SPSS 17.0.0 - 2008

SPSS From Wikipedia

Statistics program
SPSS (originally, Statistical Package for the Social Sciences) was released in its first version in 1968 after being founded by Norman Nie and C. Hadlai Hull. Nie was then a political science postgraduate at Stanford University,[1] and now Research Professor in the Department of Political Science at Stanford and Professor Emeritus of Political Science at the University of Chicago.[2] SPSS is among the most widely used programs for statistical analysis in social science. It is used by market researchers, health researchers, survey companies, government, education researchers, marketing organizations and others. The original SPSS manual (Nie, Bent & Hull, 1970) has been described as 'Sociology's most influential book'.[3] In addition to statistical analysis, data management (case selection, file reshaping, creating derived data) and data documentation (a metadata dictionary is stored with the data) are features of the base software.

Statistics included in the base software:

Descriptive statistics: Cross tabulation, Frequencies, Descriptives, Explore, Descriptive Ratio Statistics
Bivariate statistics: Means, t-test, ANOVA, Correlation (bivariate, partial, distances), Nonparametric tests
Prediction for numerical outcomes: Linear regression
Prediction for identifying groups: Factor analysis, cluster analysis (two-step, K-means, hierarchical), Discriminant
The many features of SPSS are accessible via pull-down menus or can be programmed with a proprietary 4GL command syntax language. Command syntax programming has the benefits of reproducibility and handling complex data manipulations and analyses. The pull-down menu interface also generates command syntax, though the default settings have to be changed to make the syntax visible to the user. Programs can be run interactively, or unattended using the supplied Production Job Facility. Additionally a "macro" language can be used to write command language subroutines and a Python programmability extension can access the information in the data dictionary and data and dynamically build command syntax programs. The Python programmability extension, introduced in SPSS 14, replaced the less functional SAX Basic "scripts" for most purposes, although SaxBasic remains available. In addition, the Python extension allows SPSS to run any of the statistics in the free software package R. From version 14 onwards SPSS can be driven externally by a Python or a VB.NET program using supplied "plug-ins".

SPSS places constraints on internal file structure, data types, data processing and matching files, which together considerably simplify programming. SPSS datasets have a 2-dimensional table structure where the rows typically represent cases (such as individuals or households) and the columns represent measurements (such as age, sex or household income). Only 2 data types are defined: numeric and text (or "string"). All data processing occurs sequentially case-by-case through the file. Files can be matched one-to-one and one-to-many, but not many-to-many.

The graphical user interface has two views which can be toggled by clicking on one of the two tabs in the bottom left of the SPSS window. The 'Data View' shows a spreadsheet view of the cases (rows) and variables (columns). The 'Variable View' displays the metadata dictionary where each row represents a variable and shows the variable name, variable label, value label(s), print width, measurement type and a variety of other characteristics. Cells in both views can be manually edited, defining the file structure and allowing data entry without using command syntax. This may be sufficient for small datasets. Larger datasets such as statistical surveys are more often created in data entry software, or entered during computer-assisted personal interviewing, by scanning and using optical character recognition and optical mark recognition software, or by direct capture from online questionnaires. These datasets are then read into SPSS.

SPSS can read and write data from ASCII text files (including hierarchical files), other statistics packages, spreadsheets and databases. SPSS can read and write to external relational database tables via ODBC and SQL.

Statistical output is to a proprietary file format (*.spo file, supporting pivot tables) for which, in addition to the in-package viewer, a stand-alone reader is provided. The proprietary output can be exported to text or Microsoft Word. Alternatively, output can be captured as data (using the OMS command), as text, tab-delimited text, HTML, XML, SPSS dataset or a variety of graphic image formats (JPEG, PNG, BMP and EMF).

Add-on modules provide additional capabilities. The available modules are:

SPSS Programmability Extension (added in version 14). Allows Python programming control of SPSS.
SPSS Data Validation (added in version 14). Allows programming of logical checks and reporting of suspicious values.
SPSS Regression Models - Logistic regression, ordinal regression, multinomial logistic regression, and mixed models (multilevel models).
SPSS Advanced Models - Multivariate GLM and repeated measures ANOVA (removed from base system in version 14).
SPSS Classification Trees. Creates classification and decision trees for identifying groups and predicting behaviour.
SPSS Tables. Allows user-defined control of output for reports.
SPSS Exact Tests. Allows statistical testing on small samples.
SPSS Categories
SPSS Trends
SPSS Conjoint
SPSS Missing Value Analysis. Simple regression-based imputation.
SPSS Map
SPSS Complex Samples (added in Version 12). Adjusts for stratification and clustering and other sample selection biases.
SPSS Server is a version of SPSS with a client/server architecture. It has some features not available in the desktop version, such as scoring functions.


Versions
Early versions of SPSS were designed for batch processing on mainframes, including for example IBM and ICL versions, originally using punched cards for input. A processing run read a command file of SPSS commands and either a raw input file of fixed format data with a single record type, or a 'getfile' of data saved by a previous run. To save precious computer time an 'edit' run could be done to check command syntax without analysing the data. From version 10 (SPSS-X) in 1983, data files could contain multiple record types.

SPSS version 16.0 runs under Windows, Mac OS 10.5 and earlier, and Linux. The graphical user interface is written in Java. The Mac OS version is provided as a Universal binary, making it fully compatible with both PowerPC and Intel-based Mac hardware.

Prior to SPSS 16.0, different versions of SPSS were available for Windows, Mac OS X and Unix. The Windows version was updated more frequently, and had more features, than the versions for other operating systems.

SPSS version 13.0 for Mac OS X was not compatible with Intel-based Macintosh computers, due to the Rosetta emulation software causing errors in calculations. SPSS 15.0 for Windows needed a downloadable hotfix to be installed in order to be compatible with Windows Vista.

Thursday, March 19, 2009

《Introduction to Nonparametric Regression》

I bought a book, 《Introduction to Nonparametric Regression》,

but I can only understand some of the concepts,

although I am reading the English version.

I don't have any background in Western life,

so I cannot understand it.

In China we do not use SPSS.


Tuesday, March 17, 2009

I bought an SPSS book

Title/Author: SPSS: statistical package for the social sciences / Norman H. Nie ... [et al.]
Edition: 2nd ed.
Place of publication/Publisher/Year: Taipei, Taiwan: 道明; New York: McGraw-Hill, [1978]
Physical description: xxiv, 675 p.: ill.; 28 cm.
ISBN/Price: 0-07-046531-2 (pbk.)
Subject: SPSS (Computer file)
Subject: Social sciences--Statistical methods--Computer programs.
Other author: Nie, Norman H.
Other title: Statistical package for the social sciences.
General note: Taiwan ed.

convert among SAS, Stata and SPSS files


On this page, conversions of different data formats are discussed. In general, the strategies should work with SAS 9.*, SPSS 14+ and Stata 9. If you have Stata 10 and you need to convert your data to other formats, you need to use the saveold command within Stata for saving the data in Stata version 9 format before you convert the data set.

From SAS:
- How do I use a SAS data file in SPSS?
- How do I use a SAS data file in Stata?

From SPSS:
- How do I use a SPSS data file in SAS?
- How do I use a SPSS data file in Stata?

From Stata:
- How do I use a Stata data file in SAS?
- How do I use a Stata data file in SPSS?

Another way to convert data files between SAS, Stata and SPSS is to use programs such as Stat/Transfer or DBMS Copy. For more information on Stat/Transfer, please see our Stat/Transfer page. With the newest version of Stat/Transfer (version 9), you can transfer the SAS version 9.*, Stata 10 and SPSS 16 files. Stat/Transfer allows you to transfer data files to many other file formats, including Statistica, Systat, S-Plus, R, Excel, Access, Minitab, Matlab, Lisrel and JMP. You may need to update your copy of Stat/Transfer to be able to transfer data sets created by the latest version of the software. To update Stat/Transfer, click on the "About" tab (in the upper right corner), and click on the "Check for Updates" pull-down menu and select "Right Now".

Wednesday, March 11, 2009

Interaction contrasts The model including only two-way interactions

Interaction contrasts

The model including only two-way interactions

Let's consider an interaction contrast involving diet and exertype in a mixed model which includes diet, exertype, time and all the two-way interactions. From the graphs we see that exertype level 3 is higher than both levels 1 and 2 for both diets but that level 3 increases at different rates depending on which diet is considered. We would like to contrast exertype level 3 in diet 2 with the other groups in order to see if they are really different from the other groups. In order to accomplish this, we will need to use an interaction contrast where exertype is coded with a reverse Helmert coding contrasting level 3 versus levels 1 and 2 and diet has a contrast coding contrasting level 1 versus level 2. For more information on reverse Helmert coding and other contrast coding systems please refer to chapter 5 in our webbook on regression.

The contrast coding:

exertype level 1    -1/2
exertype level 2    -1/2
exertype level 3     1

diet level 1         1
diet level 2        -1

The diet*exertype interaction is coded as d1e1 d1e2 d1e3 d2e1 d2e2 d2e3, where d# = coding for diet level # and e# = coding for exertype level # and d#e# is the product of the two. The contrast coding for diet is level 1 = 1 and level 2 = -1 and the contrast coding for exertype is the same as in the previous example, namely, level 1 = -1/2, level 2 = -1/2 and level 3 = 1. The order of the factors is determined by the order in which they appear in the mixed command. In this particular case diet appears before exertype in the mixed command, and thus with our coding system for diet and exertype, we would have the following coding for the interaction:

d1e1 = 1*-1/2 = -1/2
d1e2 = 1*-1/2 = -1/2
d1e3 = 1*1 = 1
d2e1 = -1*-1/2 = 1/2
d2e2 = -1*-1/2 = 1/2
d2e3 = -1*1 = -1

In order to more easily understand the coding for the interaction, it might help to visualize it as a matrix which equals the product of the code for diet as a column matrix and the contrast coding of exertype as a row matrix.

                     exertype level 1 = -1/2   exertype level 2 = -1/2   exertype level 3 = 1
diet level 1 = 1     1*-1/2 = -1/2             1*-1/2 = -1/2             1*1 = 1
diet level 2 = -1    -1*-1/2 = 1/2             -1*-1/2 = 1/2             -1*1 = -1
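
The matrix above is the outer product of the two coding vectors. As a minimal sketch (Python is not part of the original FAQ, and the variable names are only illustrative), the coefficients for the diet*exertype term can be obtained by flattening that product row by row, with diet first and exertype second:

```python
from fractions import Fraction as F

# Contrast codings taken from the text above.
diet = [F(1), F(-1)]                    # level 1 versus level 2
exertype = [F(-1, 2), F(-1, 2), F(1)]   # levels 1 and 2 versus level 3

# Outer product: one row per diet level, one column per exertype level.
matrix = [[d * e for e in exertype] for d in diet]

# Flatten row by row to get the order d1e1 d1e2 d1e3 d2e1 d2e2 d2e3.
coefficients = [c for row in matrix for c in row]
print(" ".join(str(c) for c in coefficients))
```

The coefficients sum to zero, as expected for a contrast.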

We don't need to include any other terms besides the interaction diet*exertype since the model does not include any higher order terms that contain diet*exertype. If a term, such as the interaction exertype*time, contains only part of the interaction being tested, then it is not included in the test subcommand. Only those higher order terms, which contain the complete term diet*exertype being tested are included.

mixed pulse by diet exertype time
/fixed = diet exertype time diet*exertype
/repeated = time | subject(id) covtype(ar1)
/test = 'exertype 12v3 & diet 1v2' diet*exertype -.5 -.5 1 .5 .5 -1.

Let's test the same interaction contrast but now we will consider a model that includes all two-way interactions and the three-way interaction. Just as in the previous example we now have to "distribute" the contrast coding of the lower order term across the higher order terms. In other words, the test subcommand will include the contrast coding for the interaction of diet*exertype that we developed in the above example. We also have to include the three-way interaction with the appropriate contrast coding because the three-way interaction includes the term, namely diet*exertype, being tested.

The contrast coding for the three-way interaction is slightly more complicated. The diet*exertype*time interaction is coded as d1e1t1 d1e1t2 d1e1t3 d1e2t1 d1e2t2 d1e2t3 d1e3t1 d1e3t2 d1e3t3 d2e1t1 d2e1t2 d2e1t3 d2e2t1 d2e2t2 d2e2t3 d2e3t1 d2e3t2 d2e3t3. In this code, d# = coding for diet level #, e# = coding for exertype level #, t# = coding for time level # and d#e#t# is the product of the three. The contrast coding for exertype is level 1 = -1/2, level 2 = -1/2 and level 3 = 1; the contrast coding for diet is level 1 = 1 and level 2 = -1. Since we are not implementing a contrast for time we want to code each level of time to be "weighted" equally, so all three levels are coded as 1/3. Note that the coding is very similar to the second example in the contrast section of this FAQ, but that instead of coding diet as 1/2 for each level, we are now giving diet a contrast coding because we are testing an interaction contrast. In this case the coding for the interaction is:

d1e1t1 = 1*-1/2*1/3 = -1/6
d1e1t2 = 1*-1/2*1/3 = -1/6
d1e1t3 = 1*-1/2*1/3 = -1/6
d1e2t1 = 1*-1/2*1/3 = -1/6
d1e2t2 = 1*-1/2*1/3 = -1/6
d1e2t3 = 1*-1/2*1/3 = -1/6
d1e3t1 = 1*1*1/3 = 1/3
d1e3t2 = 1*1*1/3 = 1/3
d1e3t3 = 1*1*1/3 = 1/3

d2e1t1 = -1*-1/2*1/3 = 1/6
d2e1t2 = -1*-1/2*1/3 = 1/6
d2e1t3 = -1*-1/2*1/3 = 1/6
d2e2t1 = -1*-1/2*1/3 = 1/6
d2e2t2 = -1*-1/2*1/3 = 1/6
d2e2t3 = -1*-1/2*1/3 = 1/6
d2e3t1 = -1*1*1/3 = -1/3
d2e3t2 = -1*1*1/3 = -1/3
d2e3t3 = -1*1*1/3 = -1/3

This can be more conveniently visualized as matrices. The coding for time which is the last factor in the mixed command can be thought of as the row matrix that is multiplied by the coding for exertype which is the second to last factor in the mixed command and which can be thought of as the column matrix. The matrix which is the product is then multiplied by the coding for each level of diet which appears before exertype and time in the mixed command.

For diet level 1 = 1:

                          time level 1 = 1/3    time level 2 = 1/3    time level 3 = 1/3
exertype level 1 = -1/2   1*-1/2*1/3 = -1/6     1*-1/2*1/3 = -1/6     1*-1/2*1/3 = -1/6
exertype level 2 = -1/2   1*-1/2*1/3 = -1/6     1*-1/2*1/3 = -1/6     1*-1/2*1/3 = -1/6
exertype level 3 = 1      1*1*1/3 = 1/3         1*1*1/3 = 1/3         1*1*1/3 = 1/3

For diet level 2 = -1:

                          time level 1 = 1/3    time level 2 = 1/3    time level 3 = 1/3
exertype level 1 = -1/2   -1*-1/2*1/3 = 1/6     -1*-1/2*1/3 = 1/6     -1*-1/2*1/3 = 1/6
exertype level 2 = -1/2   -1*-1/2*1/3 = 1/6     -1*-1/2*1/3 = 1/6     -1*-1/2*1/3 = 1/6
exertype level 3 = 1      -1*1*1/3 = -1/3       -1*1*1/3 = -1/3       -1*1*1/3 = -1/3
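
The eighteen coefficients can also be generated mechanically. The following is a small illustrative Python sketch (not from the original FAQ; the names `time_weights` and `three_way` are assumptions made for this example) that multiplies the three codings for every cell, with the last factor varying fastest so that the order matches d1e1t1 d1e1t2 ... d2e3t3:

```python
from fractions import Fraction as F
from itertools import product

diet = [F(1), F(-1)]                    # contrast: level 1 versus level 2
exertype = [F(-1, 2), F(-1, 2), F(1)]   # contrast: levels 1 and 2 versus level 3
time_weights = [F(1, 3)] * 3            # no contrast on time, so equal weights

# product() varies the last factor fastest: d1e1t1, d1e1t2, ..., d2e3t3.
three_way = [d * e * t for d, e, t in product(diet, exertype, time_weights)]
print(" ".join(str(c) for c in three_way))
```

The factors must be passed in the same order in which they appear in the mixed command, because that order determines how the cells are laid out.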

mixed pulse by diet exertype time
/fixed = time diet exertype time*exertype diet*exertype time*diet diet*exertype*time
/repeated = time | subject(id) covtype(cs)
/test = 'exertype 12v3, diet 1v2' diet*exertype -.5 -.5 1 .5 .5 -1
diet*exertype*time -1/6 -1/6 -1/6 -1/6 -1/6 -1/6 1/3 1/3 1/3
1/6 1/6 1/6 1/6 1/6 1/6 -1/3 -1/3 -1/3.

How can I test contrasts and interaction contrasts in a mixed model?


It can be rather tricky to program the test subcommand when there are higher order interactions (e.g., three-way interactions, four-way interactions, etc.) included in the mixed model. Let's look at an example where we are using the mixed command in a repeated measures model. The data set exercise was used in our seminar on repeated measures. The data set consists of people who were randomly assigned to two different diets (low-fat and not low-fat) and three different types of exercise (at rest, walking leisurely and running). Their pulse rate was measured at three different time points during their assigned exercise: at 1 minute, 15 minutes and 30 minutes. We are going to model pulse rates, and the model under consideration will include all three variables in our mixed model: diet, which has two levels; exertype, which has three levels; and time, which also has three levels. Even though time is a repeated factor, we can treat it in the same manner as the other variables when we want to test the various contrasts and interaction contrasts that may be of interest.

We will look at programming the test subcommand for a contrast involving only one variable when the model includes main effects, when the model includes a two-way interaction and when the model includes both two-way and three-way interactions. We will also demonstrate how to program the test subcommand when testing an interaction contrast when the model includes two-way interactions and when the model includes both two-way and three-way interactions.

Contrasts for one variable

The one-way model including only main effects

Let's look at an example where the mixed model includes only the main effects of diet, exertype and time. From the graph we see that there might be a difference between exertype level 3 and the two other levels of exertype. Therefore, we will use a reverse Helmert coding for exertype in the test subcommand in order to test this particular contrast. For more information on reverse Helmert coding and other contrast coding systems please refer to chapter 5 in our webbook on regression.

The contrast coding:

exertype level 1    -1/2
exertype level 2    -1/2
exertype level 3     1


mixed pulse by diet exertype time
/fixed = time diet exertype
/repeated = time | subject(id) covtype(ar1)
/test = 'exertype 12v3' exertype -.5 -.5 1.

The two-way model including the interaction of exertype and diet

Let us look at testing the same contrast, but this time we will use a different model which will include an interaction between exertype and diet. In order to implement the test of the contrast, we have to introduce the idea of distributing the contrast coding of lower order terms on the higher order terms. In other words, when we have higher order terms which include the term being tested, such as the two-way interaction of exertype and diet, then we need to include these higher order terms in the test subcommand with the appropriate contrast coding as well as the contrast coding for the main effect being tested.

The diet*exertype interaction is coded as d1e1 d1e2 d1e3 d2e1 d2e2 d2e3, where d# = coding for diet level # and e# = coding for exertype level # and d#e# is the product of the two. Since we are not contrast coding diet, we just want the "weight" for each level of diet to be equal; hence, we assign both levels of diet the same "weight" of 1/2 because there are two levels of diet. The order of the factors is determined by the order in which they appear in the mixed command. In this particular case, diet appears before exertype in the mixed command, and thus with our coding system for diet and exertype we would have the following coding for the interaction:

d1e1 = 1/2*-1/2 = -1/4
d1e2 = 1/2*-1/2 = -1/4
d1e3 = 1/2*1 = 1/2
d2e1 = 1/2*-1/2 = -1/4
d2e2 = 1/2*-1/2 = -1/4
d2e3 = 1/2*1 = 1/2

In order to more easily understand the coding for the interaction, it might help to visualize it as a matrix which equals the product of the code for diet as a column matrix and the contrast coding of exertype as a row matrix.

                      exertype level 1 = -1/2   exertype level 2 = -1/2   exertype level 3 = 1
diet level 1 = 1/2    1/2*-1/2 = -1/4           1/2*-1/2 = -1/4           1/2*1 = 1/2
diet level 2 = 1/2    1/2*-1/2 = -1/4           1/2*-1/2 = -1/4           1/2*1 = 1/2
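
One way to see why equal weights of 1/2 are the right choice is to check that the interaction coefficients, summed over the diet levels, give back the original exertype contrast. The sketch below is an illustration added for this purpose (Python, with hypothetical variable names), not something taken from the FAQ itself:

```python
from fractions import Fraction as F

exertype = [F(-1, 2), F(-1, 2), F(1)]   # contrast being tested
diet_weights = [F(1, 2), F(1, 2)]       # diet is not contrasted, so equal weights

# One row per diet level, one column per exertype level.
interaction = [[w * e for e in exertype] for w in diet_weights]

# Summing each column over the diet levels recovers the exertype coding.
column_sums = [sum(column) for column in zip(*interaction)]
print([str(c) for c in column_sums])

# Row-by-row order used for the diet*exertype term: d1e1 d1e2 d1e3 d2e1 d2e2 d2e3.
flat = [c for row in interaction for c in row]
print(" ".join(str(c) for c in flat))
```

Because the diet weights sum to 1, the main-effect contrast is spread across the interaction cells without changing its overall size.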

Now the test subcommand will include both exertype with its contrast coding and the interaction of exertype*diet with its contrast coding. Note that the fixed subcommand reflects the change in the model to include the interaction of diet and exertype.

mixed pulse by diet exertype time
/fixed = time diet exertype diet*exertype
/repeated = time | subject(id) covtype(ar1)
/test = 'exertype 12v3' exertype -.5 -.5 1 diet*exertype -1/4 -1/4 1/2 -1/4 -1/4 1/2.

The three-way model including the interaction of exertype*time and the three-way interaction

Let's consider testing the same contrast for exertype but using a model that includes the interaction of exertype and time as well as the three-way interaction of diet*exertype*time. In this case we will need to include three terms in our test subcommand: the main effect of exertype, the interaction diet*exertype and the three-way interaction diet*exertype*time, each with their appropriate contrast coding. The coding of exertype will be the same as in the one-way example.

The exertype*time interaction is coded as e1t1 e1t2 e1t3 e2t1 e2t2 e2t3 e3t1 e3t2 e3t3, where e# = coding for exertype level # and t# = coding for time level # and e#t# is the product of the two. Since we are not contrast coding time, we just want the "weight" for each level of time to be equal; hence, we assign all three levels of time the same "weight" of 1/3. The order of the factors is determined by the order in which they appear in the mixed command. In this particular case exertype appears before time in the mixed command, and thus with our coding system for exertype and time we would have the following coding for the interaction:

e1t1 = 1/3*-1/2 = -1/6
e1t2 = 1/3*-1/2 = -1/6
e1t3 = 1/3*-1/2 = -1/6
e2t1 = 1/3*-1/2 = -1/6
e2t2 = 1/3*-1/2 = -1/6
e2t3 = 1/3*-1/2 = -1/6
e3t1 = 1/3*1 = 1/3
e3t2 = 1/3*1 = 1/3
e3t3 = 1/3*1 = 1/3

In order to more easily understand the coding for the interaction, it might help to visualize it as a matrix which equals the product of the contrast code for exertype as a column matrix and the coding of time as a row matrix.
                          time level 1 = 1/3    time level 2 = 1/3    time level 3 = 1/3
exertype level 1 = -1/2   1/3*-1/2 = -1/6       1/3*-1/2 = -1/6       1/3*-1/2 = -1/6
exertype level 2 = -1/2   1/3*-1/2 = -1/6       1/3*-1/2 = -1/6       1/3*-1/2 = -1/6
exertype level 3 = 1      1/3*1 = 1/3           1/3*1 = 1/3           1/3*1 = 1/3

The contrast coding for the three-way interaction is slightly more complicated. The diet*exertype*time interaction is coded as d1e1t1 d1e1t2 d1e1t3 d1e2t1 d1e2t2 d1e2t3 d1e3t1 d1e3t2 d1e3t3 d2e1t1 d2e1t2 d2e1t3 d2e2t1 d2e2t2 d2e2t3 d2e3t1 d2e3t2 d2e3t3. For this code d# = coding for diet level #, e# = coding for exertype level #, t# = coding for time level # and d#e#t# is the product of the three. Since we are only contrasting levels of exertype and not levels of either diet or time, we want to code each level of diet to be "weighted" equally, so they are both coded as 1/2; likewise, we want to code each level of time to be "weighted" equally, so all three levels of time are coded as 1/3. In this case the coding for the interaction is:

d1e1t1 = 1/2*-1/2*1/3 = -1/12
d1e1t2 = 1/2*-1/2*1/3 = -1/12
d1e1t3 = 1/2*-1/2*1/3 = -1/12
d1e2t1 = 1/2*-1/2*1/3 = -1/12
d1e2t2 = 1/2*-1/2*1/3 = -1/12
d1e2t3 = 1/2*-1/2*1/3 = -1/12
d1e3t1 = 1/2*1*1/3 = 1/6
d1e3t2 = 1/2*1*1/3 = 1/6
d1e3t3 = 1/2*1*1/3 = 1/6

d2e1t1 = 1/2*-1/2*1/3 = -1/12
d2e1t2 = 1/2*-1/2*1/3 = -1/12
d2e1t3 = 1/2*-1/2*1/3 = -1/12
d2e2t1 = 1/2*-1/2*1/3 = -1/12
d2e2t2 = 1/2*-1/2*1/3 = -1/12
d2e2t3 = 1/2*-1/2*1/3 = -1/12
d2e3t1 = 1/2*1*1/3 = 1/6
d2e3t2 = 1/2*1*1/3 = 1/6
d2e3t3 = 1/2*1*1/3 = 1/6

This can be more conveniently visualized as matrices. The coding for time, which is the last factor in the mixed command, can be thought of as the row matrix that is multiplied by the coding for exertype which is the second to last factor in the mixed command and which can be thought of as the column matrix. The matrix which is the product is then multiplied by the coding for each level of diet which appears before exertype and time in the mixed command.

For diet level 1 = 1/2:

                          time level 1 = 1/3       time level 2 = 1/3       time level 3 = 1/3
exertype level 1 = -1/2   1/2*-1/2*1/3 = -1/12     1/2*-1/2*1/3 = -1/12     1/2*-1/2*1/3 = -1/12
exertype level 2 = -1/2   1/2*-1/2*1/3 = -1/12     1/2*-1/2*1/3 = -1/12     1/2*-1/2*1/3 = -1/12
exertype level 3 = 1      1/2*1*1/3 = 1/6          1/2*1*1/3 = 1/6          1/2*1*1/3 = 1/6

For diet level 2 = 1/2:

                          time level 1 = 1/3       time level 2 = 1/3       time level 3 = 1/3
exertype level 1 = -1/2   1/2*-1/2*1/3 = -1/12     1/2*-1/2*1/3 = -1/12     1/2*-1/2*1/3 = -1/12
exertype level 2 = -1/2   1/2*-1/2*1/3 = -1/12     1/2*-1/2*1/3 = -1/12     1/2*-1/2*1/3 = -1/12
exertype level 3 = 1      1/2*1*1/3 = 1/6          1/2*1*1/3 = 1/6          1/2*1*1/3 = 1/6
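
The "distribution" of a main-effect contrast over higher order terms follows one rule: contrast weights for the factor being tested, equal weights for every other factor in the term. A rough Python sketch of that rule follows. The helper `distribute` is hypothetical and written only for this illustration, and the printed lines are coefficient lists, not a complete test subcommand:

```python
from fractions import Fraction as F
from itertools import product
from math import prod

def distribute(*codings):
    """Multiply the per-factor codings for every cell; the last factor varies fastest."""
    return [prod(cell) for cell in product(*codings)]

exertype = [F(-1, 2), F(-1, 2), F(1)]   # contrast being tested
diet_equal = [F(1, 2)] * 2              # equal weights, two diet levels
time_equal = [F(1, 3)] * 3              # equal weights, three time levels

# Factors are listed in the order they appear in the mixed command.
terms = {
    "exertype": distribute(exertype),
    "diet*exertype": distribute(diet_equal, exertype),
    "diet*exertype*time": distribute(diet_equal, exertype, time_equal),
}
for name, coefficients in terms.items():
    print(name, " ".join(str(c) for c in coefficients))
```

Using exact fractions avoids rounding values such as 1/12, which have no finite decimal form.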


mixed pulse by diet exertype time
/fixed = time diet exertype diet*exertype diet*exertype*time
/repeated = time | subject(id) covtype(ar1)
/test = 'exertype 12v3' exertype -.5 -.5 1
diet*exertype -1/4 -1/4 1/2 -1/4 -1/4 1/2
diet*exertype*time -1/12 -1/12 -1/12 -1/12 -1/12 -1/12 1/6 1/6 1/6
-1/12 -1/12 -1/12 -1/12 -1/12 -1/12 1/6 1/6 1/6.

Tuesday, March 10, 2009

How can I test a group of variables in SPSS regression?

Suppose that you want to run a regression model and to test the statistical significance of a group of variables. For example, let's say that you want to predict students' writing score from their reading, math and science scores. The data set with these variables in it can be downloaded by following this link: hsb2.sav .

The SPSS syntax for this would be:

regression
/dependent = write
/method = enter read math science.
Variables Entered/Removed(b)

Model  Variables Entered                             Variables Removed  Method
1      science score, reading score, math score(a)   .                  Enter

a All requested variables entered.
b Dependent Variable: writing score

Model Summary

Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .684(a)  .467      .459               6.97111

a Predictors: (Constant), science score, reading score, math score

ANOVA(b)

Model          Sum of Squares  df   Mean Square  F       Sig.
1  Regression  8353.990        3    2784.663     57.302  .000(a)
   Residual    9524.885        196  48.596
   Total       17878.875       199

a Predictors: (Constant), science score, reading score, math score
b Dependent Variable: writing score

Coefficients(a)

                  Unstandardized Coefficients  Standardized Coefficients
Model             B        Std. Error          Beta                       t      Sig.
1  (Constant)     13.192   3.069                                          4.299  .000
   reading score  .236     .069                .255                       3.410  .001
   math score     .319     .076                .316                       4.222  .000
   science score  .202     .069                .211                       2.918  .004

a Dependent Variable: writing score

Now let's suppose that you wanted to test the combined effect of math and science on writing. The SPSS syntax for doing that is below. Note that the variables listed in the method = test() subcommand are not listed on the method = enter subcommand. In other words, the independent variables are listed only once. Also note that, unlike other SPSS subcommands, you can have multiple method = subcommands within the regression command.

regression
/dependent = write
/method = enter read
/method = test(math science).
Variables Entered/Removed(b)

Model  Variables Entered           Variables Removed  Method
1      reading score(a)            .                  Enter
2      science score, math score   .                  Test

a All requested variables entered.
b Dependent Variable: writing score

Model Summary

Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .597(a)  .356      .353               7.62487
2      .684(b)  .467      .459               6.97111

a Predictors: (Constant), reading score
b Predictors: (Constant), reading score, science score, math score

ANOVA(d)

Model                                         Sum of Squares  df   Mean Square  F        Sig.     R Square Change
1  Regression                                 6367.421        1    6367.421     109.521  .000(a)
   Residual                                   11511.454       198  58.139
   Total                                      17878.875       199
2  Subset Tests (math score, science score)   1986.569        2    993.284      20.439   .000(b)  .111
   Regression                                 8353.990        3    2784.663     57.302   .000(c)
   Residual                                   9524.885        196  48.596
   Total                                      17878.875       199

a Predictors: (Constant), reading score
b Tested against the full model.
c Predictors in the Full Model: (Constant), reading score, science score, math score.
d Dependent Variable: writing score

Coefficients(a)

                  Unstandardized Coefficients  Standardized Coefficients
Model             B        Std. Error          Beta                       t       Sig.
1  (Constant)     23.959   2.806                                          8.539   .000
   reading score  .552     .053                .597                       10.465  .000
2  (Constant)     13.192   3.069                                          4.299   .000
   reading score  .236     .069                .255                       3.410   .001
   math score     .319     .076                .316                       4.222   .000
   science score  .202     .069                .211                       2.918   .004

a Dependent Variable: writing score

Excluded Variables(b)

                                                                   Collinearity Statistics
Model             Beta In   t      Sig.  Partial Correlation       Tolerance
1  math score     .396(a)   5.583  .000  .370                      .561
   science score  .322(a)   4.609  .000  .312                      .603

a Predictors in the Model: (Constant), reading score
b Dependent Variable: writing score
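
The Subset Tests row can be checked by hand from the other numbers in the ANOVA table. The short Python sketch below is an added illustration, not SPSS output. It assumes the usual partial F-test, in which the increase in the regression sum of squares is divided by the number of tested variables and compared with the residual mean square of the full model:

```python
# Values copied from the ANOVA table above.
ss_reg_full = 8353.990       # regression SS with read, math and science
ss_reg_reduced = 6367.421    # regression SS with read only
ss_resid_full = 9524.885     # residual SS of the full model
df_resid_full = 196
ss_total = 17878.875
n_tested = 2                 # math and science

# Sum of squares attributable to the tested subset.
ss_subset = ss_reg_full - ss_reg_reduced

# Partial F statistic and the change in R squared.
f_stat = (ss_subset / n_tested) / (ss_resid_full / df_resid_full)
r2_change = ss_subset / ss_total

print(round(ss_subset, 3), round(f_stat, 3), round(r2_change, 3))
```

Up to rounding, these values agree with the sum of squares, F and R Square Change reported for the subset of math and science.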

If you wanted to test all three variables together, the syntax would be:

regression
/dependent = write
/method = test(read math science).
Variables Entered/Removed(a)

Model  Variables Entered                          Variables Removed  Method
1      science score, reading score, math score   .                  Test

a Dependent Variable: writing score

Model Summary

Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .684(a)  .467      .459               6.97111

a Predictors: (Constant), science score, reading score, math score

ANOVA(c)

Model                                                        Sum of Squares  df   Mean Square  F       Sig.     R Square Change
1  Subset Tests (reading score, math score, science score)   8353.990        3    2784.663     57.302  .000(a)  .467
   Regression                                                8353.990        3    2784.663     57.302  .000(b)
   Residual                                                  9524.885        196  48.596
   Total                                                     17878.875       199

a Tested against the full model.
b Predictors in the Full Model: (Constant), science score, reading score, math score.
c Dependent Variable: writing score

Coefficients(a)

                  Unstandardized Coefficients  Standardized Coefficients
Model             B        Std. Error          Beta                       t      Sig.
1  (Constant)     13.192   3.069                                          4.299  .000
   reading score  .236     .069                .255                       3.410  .001
   math score     .319     .076                .316                       4.222  .000
   science score  .202     .069                .211                       2.918  .004

a Dependent Variable: writing score

You will notice that the output from the first example with the three independent variables on the method = enter subcommand and the output from this example with the three independent variables on the method = test() subcommand are virtually identical. The only difference between them is the line in the ANOVA table that gives the test of the subset, which in this case is all of the variables. The point of this example is that you can put all of the independent variables in the regression on the method = test() subcommand and not use a method = enter subcommand if you like.

Monday, March 9, 2009

Capturing Unobserved Heterogeneity in Structural Equation Models: from Maximum Likelihood to Partial Least Squares

This is a very good paper on SEM. If you want to read this paper, please visit this link: http://www.spsschina.cn/thread-2922-1-1.html