Chapter 1: Introduction to PLS-SEM

1. The Research Dilemma

In any marketing research project, an ideal data set should have a large sample size and be normally distributed. Unfortunately, the reality is that many applied research projects have limited participants because of the nature of the project. Insufficient resources and tight project timelines further prevent researchers from obtaining a decent data set for proper statistical analysis, particularly in the structural equation modeling (SEM) of latent variables where LISREL (linear structural relations) and AMOS (analysis of moment structures) have strict data assumptions. Some researchers have taken the risk of drawing incorrect or limited inferences by ignoring the data set requirements, while others have resorted to testing simplified versions of complex hypotheses. This book introduces an emerging multivariate analysis approach called “partial least squares structural equation modeling” (PLS-SEM), which is a good solution to these problems, if it is used properly.

2. A Better Way to Measure Customer Satisfaction

Companies strive to increase their bottom-line performance through increasing customer satisfaction levels. However, a single question (e.g., Are you satisfied with our product?) may provide marketers with little value, because customer satisfaction is multi-dimensional, and this latent variable is not directly observable. A better way to measure satisfaction is to consider survey responses to several manifest variables on a continuous (multi-point) scale. Marketers are often interested in identifying the key operational processes and product attributes that drive customer satisfaction so that they can prioritize resources to improve these areas. SEM is designed for testing theoretically supported linear and additive causal models. It is ideal for examining the relationship between customer satisfaction and other variables.

3. Different Approaches to SEM

There are several distinct approaches to SEM. The first is the widely applied covariance-based SEM (CB-SEM), using software packages such as AMOS, EQS, LISREL, and Mplus. The second is partial least squares (PLS), which focuses on the analysis of variance and can be carried out using ADANCO, PLS-Graph, VisualPLS, SmartPLS, and WarpPLS. It can also be run using the PLS module in the R statistical software environment. The third is a component-based SEM known as Generalized Structured Component Analysis (GSCA); it is implemented through VisualGSCA or a web-based application called GeSCA. A fourth way to perform SEM is Nonlinear Universal Structural Relational Modeling (NEUSREL), using NEUSREL’s Causal Analytics software.

Faced with these various approaches to path modeling, researchers have to weigh the advantages and disadvantages of each to choose the one that best suits their project.

4. CB-SEM

CB-SEM has been widely applied in the social sciences over the past several decades, and it is still the preferred data analysis method for confirming or rejecting theories through hypothesis testing, particularly when the sample size is large, the data are normally distributed, and, most importantly, the model is correctly specified. That is, the appropriate variables are chosen and linked together in the process of converting a theory into a structural equation model (Hair, Ringle, & Sarstedt, 2011; Hwang et al., 2010; Reinartz, Haenlein, & Henseler, 2009). However, many industry practitioners and researchers note that, in reality, it is often difficult to find a data set that meets these requirements. Furthermore, the research objective may be exploratory, in which case little is known about the relationships among the variables. In such cases, marketers can consider PLS.

5. PLS-SEM

PLS is a soft modeling approach to SEM that makes no assumptions about data distribution (Vinzi et al., 2010), which makes PLS-SEM a good alternative to CB-SEM for many researchers. In practice, PLS has proved useful for structural equation modeling in applied research projects, especially when participants are limited and the data distribution is skewed, e.g., when surveying female senior executives or multinational CEOs (Wong, 2011). PLS-SEM has been deployed in many fields, such as behavioral sciences (e.g., Bass et al., 2003), marketing (e.g., Henseler et al., 2009), organization (e.g., Sosik et al., 2009), management information systems (e.g., Chin et al., 2003), and business strategy (e.g., Hulland, 1999).

6. GSCA & Other Approaches

If overall measures of model fit are really important to the researcher, or in projects where many non-linear latent variables exist and have to be accommodated, GSCA may be a better choice than PLS for running structural equation modeling (Hwang et al., 2010). And for data sets that demonstrate significant nonlinearities and moderation effects among variables, the NEUSREL approach may be considered (Frank and Hennig-Thurau, 2008).

However, since GSCA and NEUSREL are relatively new approaches in SEM, the amount of literature for review is relatively limited. Marketers may find it difficult to locate sufficient examples to understand how these emerging SEM approaches can be used in different business research scenarios.

7. Why not LISREL or Amos?

Since the 1970s, marketers have used Scientific Software International’s LISREL and SmallWaters/SPSS’s Amos statistical software packages to build causal models. Although these covariance-based SEM software packages are great for estimating and testing model parameters using maximum likelihood, they have some disadvantages from a user’s perspective. For example, a large sample size of 500 or more participants is usually required to generate stable parameter estimates. The data should be normally distributed; when the assumption of multivariate normality is not met, standard errors must be interpreted with care. The researcher also needs at least three manifest variables per latent variable to avoid identification problems.
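The three-indicator rule can be illustrated with a simple counting argument. The sketch below is not taken from LISREL or Amos; it assumes a standalone single-factor model with uncorrelated error terms and the factor variance fixed to 1, and the function name is hypothetical. It compares the number of unique variances and covariances supplied by the data with the number of free parameters (one loading and one error variance per manifest variable).

```python
def one_factor_degrees_of_freedom(n_indicators):
    """Degrees of freedom of a standalone one-factor model.

    Assumptions (illustrative only): factor variance fixed to 1,
    uncorrelated errors, one loading and one error variance per indicator.
    """
    observed_moments = n_indicators * (n_indicators + 1) // 2
    free_parameters = 2 * n_indicators
    return observed_moments - free_parameters


def identification_status(n_indicators):
    """Classify the model by the sign of its degrees of freedom."""
    df = one_factor_degrees_of_freedom(n_indicators)
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified"
    return "over-identified"


for p in (2, 3, 4):
    print(p, one_factor_degrees_of_freedom(p), identification_status(p))
```

Under these assumptions, two manifest variables leave more unknowns than data points, three give an exact match, and four or more leave degrees of freedom for testing fit. In a larger model with correlated latent variables the count works out differently, so this is a rule of thumb rather than a general proof.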

8. The Birth of PLS-SEM

In the mid-1960s, the renowned econometrician and statistician Herman Wold developed the concept of a predictive causal system called “partial least squares.” This new variance-based SEM approach built on and extended principal component analysis and canonical correlation analysis. Unlike LISREL or AMOS, it is designed to provide flexibility for exploratory modeling. PLS is well known for its soft modeling approach, using ordinary least squares (OLS) multiple regression, which makes no distributional assumptions in the computation of the model parameters. Because PLS fits each part of the model separately, it reduces the number of cases required. Note, however, that a larger sample size always helps to improve parameter estimation and reduce average absolute error rates. PLS favours the outer measurement model that deals with the relations between latent variables and their manifest variables. Statistically speaking, the objective of PLS is to get score values of latent variables for prediction purposes. It is a component-based technique in which latent variables are calculated as exact weighted linear combinations of the manifest variables. This methodology is called “partial” least squares because its iterative procedure involves separating the parameters instead of estimating them simultaneously. Key resampling procedures include bootstrapping, jackknifing and blindfolding.
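The iterative idea described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the algorithm as implemented in SmartPLS or any other package. It assumes the simplest possible model (one exogenous latent variable pointing at one endogenous latent variable, three manifest variables each), reflective "Mode A" weight updates, and a centroid inner weighting scheme. The data are simulated, and all function and variable names are hypothetical.

```python
import numpy as np


def standardize(a):
    """Center each column to mean 0 and scale it to unit variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0)


def pls_two_blocks(x_items, y_items, tol=1e-8, max_iter=300):
    """Hypothetical helper: PLS for one exogenous and one endogenous latent variable."""
    x_items, y_items = standardize(x_items), standardize(y_items)
    n = x_items.shape[0]
    w_x = np.ones(x_items.shape[1])  # starting outer weights
    w_y = np.ones(y_items.shape[1])
    for _ in range(max_iter):
        # Outer approximation: scores are weighted sums of manifest variables.
        score_x = standardize(x_items @ w_x)
        score_y = standardize(y_items @ w_y)
        # Inner approximation (centroid scheme): each latent variable is
        # proxied by its signed neighbour in the structural model.
        sign = 1.0 if score_x @ score_y >= 0 else -1.0
        proxy_x, proxy_y = sign * score_y, sign * score_x
        # Mode A update: one simple OLS regression per manifest variable,
        # which for standardized data is its covariance with the proxy.
        new_w_x = x_items.T @ proxy_x / n
        new_w_y = y_items.T @ proxy_y / n
        change = max(np.abs(new_w_x - w_x).max(), np.abs(new_w_y - w_y).max())
        w_x, w_y = new_w_x, new_w_y
        if change < tol:
            break
    score_x = standardize(x_items @ w_x)
    score_y = standardize(y_items @ w_y)
    return {
        "path": float(score_x @ score_y / n),  # OLS slope of score_y on score_x
        "loadings_x": x_items.T @ score_x / n,
        "loadings_y": y_items.T @ score_y / n,
    }


def bootstrap_path(x_items, y_items, n_boot=200, seed=0):
    """Resample cases with replacement and re-estimate the path each time."""
    rng = np.random.default_rng(seed)
    n = x_items.shape[0]
    paths = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        paths[b] = pls_two_blocks(x_items[idx], y_items[idx])["path"]
    return paths.mean(), paths.std(ddof=1)


# Simulated survey data (assumption: true path 0.6, true loadings 0.8).
rng = np.random.default_rng(42)
n_cases = 500
xi = rng.standard_normal(n_cases)
eta = 0.6 * xi + 0.8 * rng.standard_normal(n_cases)
x_items = 0.8 * xi[:, None] + 0.6 * rng.standard_normal((n_cases, 3))
y_items = 0.8 * eta[:, None] + 0.6 * rng.standard_normal((n_cases, 3))

result = pls_two_blocks(x_items, y_items)
boot_mean, boot_se = bootstrap_path(x_items, y_items)
print("path coefficient:", round(result["path"], 3))
print("loadings (x block):", np.round(result["loadings_x"], 3))
print("bootstrap standard error:", round(boot_se, 3))
```

Each step estimates only one part of the model while holding the rest fixed, which is the sense in which the procedure is "partial." Because the scores are composites that still contain measurement error, the estimated path in a simulation like this will typically come out below the simulated value, and the loadings above it.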

9. Growing Acceptance of PLS-SEM

Although PLS was developed more than five decades ago, it did not gain the attention of the academic community until the late 1990s, because of a lack of PLS software and documentation. In the last two decades, the situation has improved significantly with the launch of graphical PLS software such as PLS-Graph, VisualPLS, SmartPLS, WarpPLS, and ADANCO. The first international PLS conference was conducted in 1999, and the first PLS handbook was published by Springer in 2010. With increased use of the PLS method in top-tier, peer-reviewed journal papers (particularly in the Journal of Management Information Systems) and in the marketing and behavioural science fields, it is a good time to give this innovative approach serious consideration. As PLS has been utilized by researchers in many studies based on the American Customer Satisfaction Index (ACSI), having a good understanding of PLS methodology helps researchers to compare their research results with those of prior studies.

10. Strengths of PLS-SEM

A substantial amount of research on the benefits of the PLS path modelling approach has been published (Bacon, 1999; Hwang et al., 2010; Wong, 2010). Among these benefits are the following:

  • Small sample size requirement
  • Hypotheses that are less probabilistic
  • No assumptions about the distribution of the variables
  • Insensitivity to non-normality, heteroscedasticity, and autocorrelation of the error terms
  • No parameter identification problem
  • No need for observations to be independent
  • Ability to explore the relationship between a latent variable and its manifest variables in both formative and reflective ways
  • Effectiveness in analysing moderation effects and identification of potential moderators
  • Production of scores both for overall and for individual cases
  • Ability to handle large model complexity (up to 100 latent and 1,000 manifest variables)
  • Suitability for research when improper or non-convergent results are likely.

11. Weaknesses of PLS-SEM

Marketing researchers are urged to evaluate PLS’s strengths and weaknesses carefully before adopting the approach. As experts would agree, there is no magic bullet in any particular statistical procedure. Among the weaknesses of PLS are the following:

  • Requirement for high-valued structural path coefficients when using small sample sizes
  • Inability to handle the multicollinearity problem well
  • Inability to provide ways of modelling undirected correlation
  • Possibility of resulting in biased estimates of component loadings and path coefficients, due to a lack of complete consistency in scores on latent variables
  • Possible generation of large mean square errors of loading estimates and large mean square errors of path coefficient estimates.

12. Evolution of PLS-SEM Software

Although PLS was developed in the mid-1960s (Wold, 1973, 1985), advanced yet easy-to-use PLS path modeling software (not to be confused with PLS regression, which is a different technique) was lacking until the mid-2000s. The first generation of PLS-SEM software, commonly used in the 1980s, included LVPLS 1.8, a DOS-based program. The subsequent arrival of PLS-Graph and VisualPLS added a graphical interface, but neither has received significant updates since its initial release. PLS-SEM can also be performed in R, but this requires a certain level of programming knowledge and may therefore not suit marketers who do not have a strong computer science background. The remaining standalone PLS-SEM software packages still in active development include ADANCO, SmartPLS, and WarpPLS. Please refer to Chapter 13 for a full list of available PLS-related software packages.

This book focuses on SmartPLS because it is widely used in the academic community. The software is not only updated regularly but also supported by an active online discussion forum, which provides a good platform for knowledge exchange among its users.

Source: Ken Kwong-Kay Wong (2019), Mastering Partial Least Squares Structural Equation Modeling (PLS-SEM) with SmartPLS in 38 Hours, iUniverse.
