Build Text Speed Model

Let's practice defining models. Remember to name your latent variables with a name that is not in your current dataset; manifest variables, however, must be column names from your dataset.

Use the HolzingerSwineford1939 dataset to create a new model of text speed with the variables x4, x5, and x6, which measure reading comprehension and word meaning, and x7, x8, and x9, which measure speeded counting and addition. The model will have one latent variable that predicts scores on these six manifest variables.

* Name your model text.model.
* Name your latent variable textspeed.
* Use variables x4, x5, x6, x7, x8, and x9 as the manifest variables.

# Load the lavaan library
library(lavaan)

# Look at the dataset
data(HolzingerSwineford1939)
head(HolzingerSwineford1939[ , 7:15])

# Define your model specification
text.model <- 'textspeed =~ x4 + x5 + x6 + x7 + x8 + x9'

You have defined your first model, which has one latent variable and six manifest variables.

------------------------------------------------------------------------------------------------------------------------

Build Political Democracy Model

You can now expand your model specification skills to a new dataset. Create a model of political democracy ratings from 1960 using the PoliticalDemocracy dataset, which includes ratings of politics in developing countries from the 1960s. Variables y1, y2, y3, and y4 measure freedom of the press, freedom of political opposition, election fairness, and effectiveness of the legislature. You should create a model with one latent variable, named poldemo60, and four manifest variables.

* Name your model politics.model.
* Name your latent variable poldemo60.
* Use variables y1, y2, y3, and y4 as the manifest variables.
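Before fitting either model, you can check that a specification string parses the way you intend. lavaan's lavaanify() function expands a model string into the internal parameter table, one row per parameter; a minimal sketch using the text speed specification from the first exercise:

```r
# Load the lavaan library
library(lavaan)

# The text speed specification from the first exercise
text.model <- 'textspeed =~ x4 + x5 + x6 + x7 + x8 + x9'

# Expand the model string into lavaan's parameter table;
# each loading appears as a row with "=~" in the op column
lavaanify(text.model)
```

This is a quick way to catch typos in variable names before you ever fit the model.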
# Load the lavaan library
library(lavaan)

# Look at the dataset
data(PoliticalDemocracy)
head(PoliticalDemocracy)

# Define your model specification
politics.model <- 'poldemo60 =~ y1 + y2 + y3 + y4'

You have set up a one-factor model of political democracy, measured by press freedom, political opposition, election fairness, and legislative effectiveness.

------------------------------------------------------------------------------------------------------------------------

Analyze Text Speed Model

Let's analyze your text speed model from the first lesson. This model included one latent variable, textspeed, represented by six manifest variables from the HolzingerSwineford1939 dataset: x4, x5, and x6 measured reading comprehension, and x7, x8, and x9 measured speeded counting and addition.

We will use the cfa() function to analyze text.model using the data from HolzingerSwineford1939. The summary should indicate the model was identified with 9 degrees of freedom. Examine the latent variable estimates to determine which items measure the latent variable well (high loadings) and which do not (low loadings).

* Use the cfa() function to fit a model called text.fit. Remember to include both model and data arguments!
* Use the summary() function to view the model fit.
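One detail worth knowing before reading the output: by default, cfa() identifies the latent variable by fixing its first loading to 1, which is why the first indicator will show an estimate of exactly 1.000 with no standard error. Passing std.lv = TRUE instead fixes the latent variance to 1 so that every loading is freely estimated; a sketch, assuming the same text speed model:

```r
# Load the lavaan library
library(lavaan)
data(HolzingerSwineford1939)
text.model <- 'textspeed =~ x4 + x5 + x6 + x7 + x8 + x9'

# Identify the model by fixing the latent variance to 1
# instead of the first loading; overall model fit is unchanged
fit.std <- cfa(model = text.model, data = HolzingerSwineford1939,
               std.lv = TRUE)
summary(fit.std)
```

The two identification choices yield the same fit statistics; only the scaling of the loadings and the latent variance differs.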
# Load the lavaan library
library(lavaan)

# Load the dataset and define the model
data(HolzingerSwineford1939)
text.model <- 'textspeed =~ x4 + x5 + x6 + x7 + x8 + x9'

# Analyze the model with cfa()
text.fit <- cfa(model = text.model, data = HolzingerSwineford1939)

# Summarize the model
summary(text.fit)

lavaan 0.6-11 ended normally after 20 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        12
  Number of observations                           301

Model Test User Model:

  Test statistic                               149.786
  Degrees of freedom                                 9
  P-value (Chi-square)                           0.000

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  textspeed =~
    x4                1.000
    x5                1.130    0.067   16.946    0.000
    x6                0.925    0.056   16.424    0.000
    x7                0.196    0.067    2.918    0.004
    x8                0.186    0.062    2.984    0.003
    x9                0.279    0.062    4.539    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .x4                0.383    0.048    7.903    0.000
   .x5                0.424    0.059    7.251    0.000
   .x6                0.368    0.044    8.419    0.000
   .x7                1.146    0.094   12.217    0.000
   .x8                0.988    0.081   12.215    0.000
   .x9                0.940    0.077   12.142    0.000
    textspeed         0.968    0.112    8.647    0.000

------------------------------------------------------------------------------------------------------------------------

Examine Standardized Loadings

You created and summarized the text speed model in previous steps using the HolzingerSwineford1939 dataset, and viewed the model's coefficients with the summary() function. However, the unstandardized coefficients in the Estimate column are often hard to interpret as measures of how well each item represents the latent variable.

In this exercise, add the standardized = TRUE argument to your summary() call to view the standardized loadings. Look at the Std.all column, the completely standardized solution, to see which variables have a poor relationship to the text speed latent variable.

* Use the summary() function on your text.fit model.
* Include the argument to view the standardized loadings.
* Do not include fit.measures arguments in this exercise.

# Load the lavaan library
library(lavaan)

# Load the data and define the model
data(HolzingerSwineford1939)
text.model <- 'textspeed =~ x4 + x5 + x6 + x7 + x8 + x9'

# Analyze the model with cfa()
text.fit <- cfa(model = text.model, data = HolzingerSwineford1939)

# Summarize the model with standardized loadings
summary(text.fit, standardized = TRUE)

lavaan 0.6-11 ended normally after 20 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        12
  Number of observations                           301

Model Test User Model:

  Test statistic                               149.786
  Degrees of freedom                                 9
  P-value (Chi-square)                           0.000

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  textspeed =~
    x4                1.000                               0.984    0.846
    x5                1.130    0.067   16.946    0.000    1.112    0.863
    x6                0.925    0.056   16.424    0.000    0.910    0.832
    x7                0.196    0.067    2.918    0.004    0.193    0.177
    x8                0.186    0.062    2.984    0.003    0.183    0.181
    x9                0.279    0.062    4.539    0.000    0.275    0.273

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .x4                0.383    0.048    7.903    0.000    0.383    0.284
   .x5                0.424    0.059    7.251    0.000    0.424    0.256
   .x6                0.368    0.044    8.419    0.000    0.368    0.308
   .x7                1.146    0.094   12.217    0.000    1.146    0.969
   .x8                0.988    0.081   12.215    0.000    0.988    0.967
   .x9                0.940    0.077   12.142    0.000    0.940    0.926
    textspeed         0.968    0.112    8.647    0.000    1.000    1.000

Looking at Std.all, we can tell that variables x7, x8, and x9 do not measure text speed very well: their standardized loadings are close to zero.

------------------------------------------------------------------------------------------------------------------------

Explore Fit Indices

After reviewing the standardized loadings in the previous exercise, we found that several of the manifest variables may not represent our latent variable well. As a second check on the model, you can examine the fit indices to see whether the model appropriately fits the data.
You can look at both the goodness of fit and badness of fit statistics using the fit.measures argument within the summary() function. Remember that goodness of fit statistics, like the CFI and TLI, should be large (over .90) and close to one, while badness of fit measures, like the RMSEA and SRMR, should be small (less than .10) and close to zero.

* Use the summary() function on your text.fit model.
* Include the argument to view the fit indices.
* Do not include the standardized loadings.

# Load the lavaan library
library(lavaan)

# Load the data and define the model
data(HolzingerSwineford1939)
text.model <- 'textspeed =~ x4 + x5 + x6 + x7 + x8 + x9'

# Analyze the model with cfa()
text.fit <- cfa(model = text.model, data = HolzingerSwineford1939)

# Summarize the model with fit indices
summary(text.fit, fit.measures = TRUE)

lavaan 0.6-11 ended normally after 20 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        12
  Number of observations                           301

Model Test User Model:

  Test statistic                               149.786
  Degrees of freedom                                 9
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                               681.336
  Degrees of freedom                                15
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.789
  Tucker-Lewis Index (TLI)                       0.648

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -2476.130
  Loglikelihood unrestricted model (H1)      -2401.237

  Akaike (AIC)                                4976.261
  Bayesian (BIC)                              5020.746
  Sample-size adjusted Bayesian (BIC)         4982.689

Root Mean Square Error of Approximation:

  RMSEA                                          0.228
  90 Percent confidence interval - lower         0.197
  90 Percent confidence interval - upper         0.261
  P-value RMSEA <= 0.05                          0.000

Standardized Root Mean Square Residual:

  SRMR                                           0.148

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  textspeed =~
    x4                1.000
    x5                1.130    0.067   16.946    0.000
    x6                0.925    0.056   16.424    0.000
    x7                0.196    0.067    2.918    0.004
    x8                0.186    0.062    2.984    0.003
    x9                0.279    0.062    4.539    0.000
Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .x4                0.383    0.048    7.903    0.000
   .x5                0.424    0.059    7.251    0.000
   .x6                0.368    0.044    8.419    0.000
   .x7                1.146    0.094   12.217    0.000
   .x8                0.988    0.081   12.215    0.000
   .x9                0.940    0.077   12.142    0.000
    textspeed         0.968    0.112    8.647    0.000

We can see that our fit indices are poor, with low CFI and TLI and high RMSEA and SRMR values.

------------------------------------------------------------------------------------------------------------------------

Examine Political Democracy

For this final exercise, you will put together all the steps you've completed so far in building a one-factor model. You will examine the standardized loadings and fit indices for the political democracy model, which was analyzed with the cfa() function. You will now use the summary() function with both the standardized and fit.measures arguments to view everything together.

In the Std.all column for the loadings, you should find that the items appear to measure political democracy well, with high values close to one. When you look at the fit indices, however, you should find a mix of good and bad values. Such results are common and indicate that the model may fit the data but still has room for improvement.

* Use the summary() function on your politics.fit model.
* Include the arguments to view both standardized loadings and fit indices.
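If you only want the headline indices rather than the full summary, lavaan's fitMeasures() function returns them as a named vector. A sketch using the text speed model from the earlier exercises:

```r
# Load the lavaan library
library(lavaan)
data(HolzingerSwineford1939)
text.model <- 'textspeed =~ x4 + x5 + x6 + x7 + x8 + x9'
text.fit <- cfa(model = text.model, data = HolzingerSwineford1939)

# Extract only the four indices discussed in these exercises
fitMeasures(text.fit, c("cfi", "tli", "rmsea", "srmr"))
```

Calling fitMeasures() with no second argument returns the full set of available indices.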
# Load the lavaan library
library(lavaan)

# Load the data and define the model
data(PoliticalDemocracy)
politics.model <- 'poldemo60 =~ y1 + y2 + y3 + y4'

# Analyze the model with cfa()
politics.fit <- cfa(model = politics.model, data = PoliticalDemocracy)

# Summarize the model with standardized loadings and fit indices
summary(politics.fit, standardized = TRUE, fit.measures = TRUE)

lavaan 0.6-11 ended normally after 26 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         8
  Number of observations                            75

Model Test User Model:

  Test statistic                                10.006
  Degrees of freedom                                 2
  P-value (Chi-square)                           0.007

Model Test Baseline Model:

  Test statistic                               159.183
  Degrees of freedom                                 6
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.948
  Tucker-Lewis Index (TLI)                       0.843

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)               -704.138
  Loglikelihood unrestricted model (H1)       -699.135

  Akaike (AIC)                                1424.275
  Bayesian (BIC)                              1442.815
  Sample-size adjusted Bayesian (BIC)         1417.601

Root Mean Square Error of Approximation:

  RMSEA                                          0.231
  90 Percent confidence interval - lower         0.103
  90 Percent confidence interval - upper         0.382
  P-value RMSEA <= 0.05                          0.014

Standardized Root Mean Square Residual:

  SRMR                                           0.046

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  poldemo60 =~
    y1                1.000                               2.133    0.819
    y2                1.404    0.197    7.119    0.000    2.993    0.763
    y3                1.089    0.167    6.529    0.000    2.322    0.712
    y4                1.370    0.167    8.228    0.000    2.922    0.878

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .y1                2.239    0.512    4.371    0.000    2.239    0.330
   .y2                6.412    1.293    4.960    0.000    6.412    0.417
   .y3                5.229    0.990    5.281    0.000    5.229    0.492
   .y4                2.530    0.765    3.306    0.001    2.530    0.229
    poldemo60         4.548    1.106    4.112    0.000    1.000    1.000

Our standardized loadings indicate the items measure the latent variable well, but the fit indices are a mix of good values (high CFI, low SRMR) and bad values (low TLI, high RMSEA).
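If you prefer the standardized estimates as a data frame (convenient for filtering or plotting) rather than as extra columns in the summary() output, lavaan's standardizedSolution() function returns them directly; its est.std column corresponds to Std.all. A sketch with the political democracy fit:

```r
# Load the lavaan library
library(lavaan)
data(PoliticalDemocracy)
politics.model <- 'poldemo60 =~ y1 + y2 + y3 + y4'
politics.fit <- cfa(model = politics.model, data = PoliticalDemocracy)

# est.std holds the completely standardized estimates
# (the Std.all column from summary())
standardizedSolution(politics.fit)
```

Because the result is an ordinary data frame, you can, for example, subset it to the loading rows with `subset(standardizedSolution(politics.fit), op == "=~")`.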