
Journal of Applied Psychology 1991, Vol. 76, No. 1, 106-114

Copyright 1991 by the American Psychological Association, Inc. 0021-9010/91/$3.00

Can Ipsative and Single-Item Measures Produce Erroneous Results in Field Studies of French and Raven's (1959) Five Bases of Power?

An Empirical Investigation

Chester A. Schriesheim
School of Business Administration, University of Miami

Timothy R. Hinkin
McIntire School of Commerce, University of Virginia

Philip M. Podsakoff
Graduate School of Business, Indiana University

Research is presented on the single-item ranking (ipsative) scales that have been the dominant measures used to assess French and Raven's (1959) power bases in previous organizational research. These measures, along with multi-item and single-item Likert rating scales, were administered to 3 independent samples. Two of these samples were also administered measures of job satisfaction, motivation, role clarity and conflict, and organizational commitment; the 3rd sample was given a second administration of the 3 sets of power measures 2 weeks later. Analysis of variance and correlational and chi-square analyses yielded largely consistent results, indicating that single-item and ranking measures have serious psychometric shortcomings and that they produce distorted results in field-collected data. Implications for the interpretation of previous research are considered, as well as suggestions for improved future research.

Power, the inferred potential to exert influence (Yukl, 1989), has long been investigated (Bass, 1981). However, interest in studying power has recently increased, and new research has appeared at different levels of analysis (e.g., Gaski, 1986; House, 1988). Although social or interpersonal power has been studied from many theoretical perspectives, Mintzberg (1983) noted that probably the best known framework is that developed by French and Raven (1959). French and Raven identified five types of social power (reward, coercive, legitimate, expert, and referent), and much of what is known about power in work organizations is based on French and Raven's typology (Podsakoff & Schriesheim, 1985). This is true despite the fact that some theorists have criticized the framework and even though many alternative conceptualizations exist (e.g., Yukl, 1989).

Study Purpose

Although much of what is known about interpersonal power in organizations derives from research in which French and Raven's (1959) framework was used, recent reviews have severely questioned whether valid conclusions may be drawn from this body of research as a whole. Podsakoff and Schriesheim (1985) presented a detailed critique of the three major measures that have been used to operationalize French and Raven's typology in field research (Bachman, Smith, & Slesinger, 1966; Student, 1968; Thamhain & Gemmill, 1974) and noted a number of problems common to these scales. In fact, Podsakoff and Schriesheim (1985) went so far as to suggest that "the existing research does not support drawing confident conclusions about such things as relationships between the five power bases and subordinate outcome variables" (p. 409). Yukl (1989), after conducting an independent literature review, agreed that "the methodological limitations of the [French and Raven] power studies raise serious doubts about the accuracy of the[ir] findings" (p. 35).

This research was supported by grants from the Corporate Affiliate Research Program, University of Miami, and from the Marriott Corporation Summer Research Fellowships, McIntire School of Commerce, University of Virginia.

Correspondence concerning this article should be addressed to Chester A. Schriesheim, Department of Management, School of Business Administration, University of Miami, 414 Jenkins Building, Coral Gables, Florida 33124-9145.

These problems are extremely troublesome because the affected research is central to psychologists' knowledge about social power in organizations. However, to date, no empirical research has been reported on these concerns. Thus, in this article, we (a) briefly summarize several criticisms of the most commonly used measures of French and Raven's (1959) power bases, (b) demonstrate that these measures have poor reliability and convergence with a reasonable set of rival measures (Examination 1), and (c) show that these measures distort relationships with subordinate outcome variables (Examination 2). The basic purpose of this research was both methodological and substantive; we hope that the methodological aspects of these studies will lead to changes in the way in which research on French and Raven's bases of power is commonly conducted and that the substantive parts of this article will help begin building sound knowledge about these bases of power in organizations.

Shortcomings of Existing Field Research

As mentioned previously, only three instruments have been used in almost all field research of French and Raven's (1959) bases of power. The first and most commonly used of these scales was developed by Bachman et al. (1966). The other two instruments, developed by Student (1968) and by Thamhain and Gemmill (1974), are slight modifications of Bachman et al.'s original scales.

In all of these instruments, single items are used to measure the five power bases. The items are presented to respondents as a set, and the respondents are asked to rank order the items according to the items' descriptiveness of why the respondents comply with requests from their supervisors. (See, for example, the top part of Table 1, which presents Bachman et al.'s [1966] items).

Although the three dominant instruments mentioned in the preceding paragraph appear reasonable at first glance, their ranking procedure makes them fully ipsative measures (Guilford, 1954). As a consequence, empirical relationships among the five power bases are distorted: They are forced to be negative and to average -.25 (-1/[k - 1], where k = the number of fully ipsative measures; Hicks, 1970). In addition, the ipsative ranking procedure typically used with these scales produces artifactual correlations with dependent variables because their empirical validities are constrained to average 0.00 (Hicks, 1970). In fact, the problems with ipsative measures are generally considered so severe that

it is necessary to reevaluate thoroughly . . . research that has utilized purely ipsative . . . instruments. . . . While it may be unlikely that all (or most) of the attribute relationships obtained are artifactual, the extent to which such invalid artifactual relationships are produced by ipsative measurement cannot . . . be determined [a priori]. (Hicks, 1970, p. 182)
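The forced negativity is easy to verify by simulation: even when respondents produce completely random rankings, so that no true relationships exist at all, the rank variables are constrained to intercorrelate negatively, averaging -1/(k - 1). The sketch below is our own illustration of this constraint, not part of the original study:

```python
import numpy as np

rng = np.random.default_rng(0)
n_respondents, k = 20000, 5

# Each respondent rank orders k items completely at random: no true structure.
ranks = np.array([rng.permutation(k) + 1 for _ in range(n_respondents)])

corr = np.corrcoef(ranks, rowvar=False)      # k x k correlation matrix
off_diag = corr[~np.eye(k, dtype=bool)]

# Purely by construction, the average intercorrelation is about -1/(k-1) = -.25
print(off_diag.mean())  # approximately -0.25
```

Because every respondent's five ranks must sum to the same constant, any variable's covariance with the remaining four is forced to be negative, regardless of the underlying constructs.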

In their review, Podsakoff and Schriesheim (1985) performed qualitative and chi-square analyses on the patterns of positive, negative, and nonsignificant correlations between the five power bases and dependent variables, separating ipsative and nonipsative studies. On the basis of these analyses, Podsakoff and Schriesheim (1985) concluded that

the results for reward, legitimate, and expert power are significantly influenced by the [ipsative] scaling procedure used. . . . The effects of reward, legitimate, and expert power on subordinate criterion variables are less negative or more positive than would be indicated by . . . the ranking studies [alone]. (p. 405)

Although the ipsative nature of the major power instruments is clearly a serious problem, these scales suffer from other important shortcomings as well. Probably next most important is that they suffer from poor content validity; it seems impossible for single-item measures to adequately sample the content domain of the relatively broad constructs defined by French and Raven (Nunnally, 1978). For example, Bachman et al.'s (1966) reward power item (see Table 1) clearly samples only two of the many positively valent outcomes that French and Raven (1959) suggested are involved in reward power.

Finally, reliance on single-item scales or measures in these instruments also creates additional problems. Single items are generally believed to be unreliable, and internal consistency coefficients cannot be calculated for them (Nunnally, 1978). As a result, relationships between power and various dependent variables may be attenuated because of undetected (and undetectable) measurement error.
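The attenuation argument follows from the classical correction-for-attenuation relation of test theory, r_obs = r_true x sqrt(r_xx r_yy) (Nunnally, 1978). A small illustration, with made-up reliability values rather than figures from the article:

```python
import math

def attenuated_r(r_true: float, rel_x: float, rel_y: float) -> float:
    """Observed correlation implied by classical test theory:
    r_obs = r_true * sqrt(rel_x * rel_y)."""
    return r_true * math.sqrt(rel_x * rel_y)

# A true correlation of .50, measured with a single-item predictor whose
# reliability is .30 and a criterion whose reliability is .80 (hypothetical
# values), would be observed as only about .24.
print(round(attenuated_r(0.50, 0.30, 0.80), 2))  # 0.24
```

With perfectly reliable measures (reliabilities of 1.0) the observed and true correlations coincide, which is why unreliable single items can mask real power-outcome relationships.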

Table 1
Power Scales and Items

Bachman, Smith, and Slesinger's (1966) ranking scales (a)

  4. He/she can give special help and benefits to those who cooperate with him/her. (Reward)
  2. He/she can apply pressure or penalize those who do not cooperate. (Coercive)
  1. He/she has a legitimate right, considering his/her position, to expect that his/her suggestions will be carried out. (Legitimate)
  3. I admire him/her for his/her personal qualities, and want to act in a way that merits his/her respect and admiration. (Referent)
  5. I respect his/her competence and good judgment about things with which he/she is more experienced than I. (Expert)

Single-item Likert scales (b)

  3. He/she can give special help and benefits to those who cooperate with him/her. (Reward)
  2. He/she can apply pressure or penalize those who do not cooperate. (Coercive)
  5. He/she has a legitimate right, considering his/her position, to expect that his/her suggestions will be carried out. (Legitimate)
  1. I admire him/her for his/her personal qualities, and want to act in a way that merits his/her respect and admiration. (Referent)
  4. I respect his/her competence and good judgment about things with which he/she is more experienced than I. (Expert)

Multi-item Likert scales (b)

  1. He/she can determine my pay level. (Reward)
  6. He/she can give me desirable job assignments. (Reward)
  11. He/she can promote me. (Reward)
  16. He/she can provide me with valuable recognition. (Reward)
  21. He/she can verbally praise me. (Reward)
  2. He/she can fire me. (Coercive)
  7. He/she can give me a written reprimand. (Coercive)
  12. He/she can give me undesirable job assignments. (Coercive)
  17. He/she can suspend me without pay. (Coercive)
  22. He/she can give me a verbal reprimand. (Coercive)
  3. He/she is my immediate supervisor. (Legitimate)
  8. He/she has a right to expect me to carry out his/her wishes. (Legitimate)
  13. He/she is a representative of the organization. (Legitimate)
  18. His/her role is sanctioned by the organization. (Legitimate)
  23. He/she has been given the right to make demands of me. (Legitimate)
  4. He/she is someone I want to be like. (Referent)
  9. He/she is a person meriting respect. (Referent)
  14. He/she is someone I admire. (Referent)
  19. He/she is someone with whom I identify. (Referent)
  24. He/she is a nice person. (Referent)
  5. He/she can devise clever solutions to my job-related problems. (Expert)
  10. He/she can provide me with sound job-related advice. (Expert)
  15. He/she can provide me with needed technical knowledge. (Expert)
  20. He/she can share with me his/her considerable experience/training. (Expert)
  25. He/she can give me good technical suggestions. (Expert)

a Items were ranked from most descriptive (5) to least descriptive (1) of the reasons respondents complied with their supervisors' requests.
b Items were rated, on a scale ranging from strongly agree (5) to strongly disagree (1), in terms of how descriptive they were of respondents' reasons for complying with supervisors' requests.


Examination 1: Measurement Adequacy

The issues raised in the preceding section are drawn from earlier analyses of the literature (Podsakoff & Schriesheim, 1985; Yukl, 1989) and have not been the subject of primary empirical research. Because these concerns are serious, we used the first phase of this investigation to examine Bachman et al.'s (1966) ipsative ranking scales for convergence with identically worded one-item Likert-type rating scales and with multi-item Likert-type scales. Also examined was the test-retest reliability of these three measures. As mentioned earlier, of the three major instruments used in field studies of French and Raven's (1959) power bases, Bachman et al.'s (1966) scales are most commonly used (Podsakoff & Schriesheim, 1985); they were therefore used in this research. Although a number of psychometric assessments could be made with respect to these scales, testing for convergent validity (Campbell & Fiske, 1959) seemed reasonable (cf. Schwab, 1980), as did assessing short-term test-retest stability as a way to examine measurement unreliability (Nunnally, 1978).

Method

Samples and Procedure

We used one sample to examine the test-retest reliabilities of the three sets of scales and two additional samples to assess the convergence of the scales.

Test-retest sample. Department secretaries (N = 42 women) working for a large university in the southern United States made up this sample (Sample A). They had an average age of 34.2 years and average education of 2 years past high school. Most had been employed with the university for more than 5 years. All of the respondents had worked for their immediate supervisor for at least 6 months. Of the 51 secretaries originally asked to participate, all completed the first survey administration. Nine, however, either did not fully complete the second administration or could not be matched to their first-administration surveys, so that the final sample size was 42.

Secretaries completed the surveys during normal working hours; the questionnaire sessions were exactly 2 weeks apart. Extreme care was taken to assure respondent anonymity, and the secretaries were told to place a distinctive mark of their own choosing (e.g., a doodle, a number, etc.) in an identification box on the two administrations of the survey. Subsequent interviews with 12 of the secretaries (randomly selected) indicated that the matching procedure did make them feel confident about their anonymity. This sample was administered only the three sets of power scales; when the convergence in these measures was examined, however, the results were extremely close (within .04, in absolute terms, for both administrations) to the averages reported below for the two other samples.

Convergence samples. The first sample that we used to assess scale convergence, labeled in the tables and discussion that follow as Sample B, consisted of 53 research scientists with the Florida Agricultural Extension Service. The 53 respondents represent 80% of the 66 scientists to whom surveys were sent. All respondents had doctorate degrees and were full-time employees; their average age was 44 years. The second convergence sample, Sample C, consisted of 63 full-time restaurant employees. These respondents, whose average age was 23, were drawn from three eating establishments. They represent approximately 93% of those asked to participate.

Measures

Power. Three different measures of French and Raven's (1959) five bases of social power were used, each appearing in a different survey section (and each with different introductory instructions). The first was Bachman et al.'s (1966) measure; the respondents rank ordered these items to describe why they complied with requests from their supervisor. The second measure contained the same items, but respondents used a Likert format to indicate the extent of their agreement with each item as a reason for their compliance with supervisory requests. The third measure was a multi-item Likert-format instrument that was adapted from Hinkin and Schriesheim's (1989) scales and was designed to measure each of the five power bases from the same perspective as Bachman et al.'s (1966) and the single-item scales.

Hinkin and Schriesheim's (1989) scales were adapted for two reasons. First, Hinkin and Schriesheim conceptualized two bases of power differently than French and Raven (1959) did. Hinkin and Schriesheim defined the origin of all five power bases consistently, in terms of the power holder's (O) ability to mediate outcomes for the power target (P). French and Raven (1959), however, defined legitimate power as "power that stems from internalized values in P" (p. 161) and referent power as having "its basis in the identification of P with O" (p. 161). Second, in Hinkin and Schriesheim's scales, respondents are asked to describe the degree to which the power holder can mediate outcomes, whereas in Bachman et al.'s (1966) and the one-item Likert scales, respondents are asked why they comply with requests from the power holder. Thus, had Hinkin and Schriesheim's scales not been modified for this research, differences in the constructs being measured would have been a potential rival explanation for obtained results.

The items used in all three measures are presented in Table 1. Considerable care went into the revision of Hinkin and Schriesheim's (1989) instrument: The items were judged by faculty and student panels for content validity, and item, internal consistency reliability, and factor analyses were conducted (in a sample of 275 business undergraduates who were employed at least 10 hrs. a week). The coefficient alpha internal consistency reliabilities of the multi-item scales in Samples B and C, respectively, were .62 and .66 for reward power, .83 and .78 for coercive power, .65 and .69 for legitimate power, .93 and .83 for referent power, and .93 and .88 for expert power. The alpha reliabilities for Sample A are given in Table 2; all exceeded .70. Thus, although some alpha coefficients were lower than might be desired, they appear acceptable as a set.
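For readers who wish to replicate such analyses, coefficient alpha can be computed directly from a respondent-by-item score matrix. The helper below is our own minimal implementation, offered for illustration rather than as code from the study:

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an (n_respondents x n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Perfectly parallel items (every respondent answers both items identically)
# yield alpha = 1.0
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```

Note that the formula requires at least two items, which is precisely why no internal consistency estimate can be produced for the single-item scales.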

Table 2
Reliability Results for the Three Power-Base Scales in Sample A

                                              Coefficient alpha
Scale/power base      2-week test-retest    1st test    2nd test

Multi-item Likert
  Reward                     .72               .71         .70
  Coercive                   .74               .75         .77
  Legitimate                 .73               .70         .71
  Referent                   .81               .78         .80
  Expert                     .79               .77         .81

Single-item Likert
  Reward                     .39
  Coercive                   .28
  Legitimate                 .40
  Referent                   .43
  Expert                     .41

Ranking (a)
  Reward                     .33
  Coercive                   .22
  Legitimate                 .36
  Referent                   .39
  Expert                     .38

a Bachman, Smith, and Slesinger (1966).


Other measures. A number of additional measures were also administered but are not pertinent to the current analyses. These measures are discussed in the section titled Examination 2.

Analyses

All analyses for this research were conducted at the individual level of analysis. In the analyses for this first examination, we used only data on the three power-base measures. First, test-retest correlations were computed for Sample A. Then, all 15 power scales (five power bases x three instruments) were intercorrelated in Samples B and C (separately) and examined for convergence with LISREL maximum likelihood confirmatory factor analysis (Joreskog & Sorbom, 1984). Because there were serious problems in the data (discussed later), we ended up relying on the multitrait-multimethod (MTMM) approach suggested by Campbell and Fiske (1959), supplemented by the analysis of variance (ANOVA) procedure developed for MTMM analysis by Kavanagh, MacKinney, and Wolins (1971). Although LISREL MTMM analyses would have been preferable, Kavanagh et al.'s approach does permit the direct comparison of results across samples with different levels of error variance and provides summary or overall convergence estimates for each analysis (cf. Schmitt & Stults, 1986).

Using Spearman rank order coefficients for the ipsative scale analyses produced virtually no differences in results from those obtained with Pearson product-moment coefficients. The average change in retest coefficients was only .02; the average change for the convergent-validity coefficients was .01; finally, the average change in dependent-variable correlations (the focus of Examination 2) was only .01. Furthermore, ignoring the signs of the coefficients and computing change in absolute magnitudes yielded average differences of only .03, .02, and .03 for the retest, convergent-validity, and dependent-variable correlations, respectively. Thus, even though Spearman coefficients are technically more appropriate for relationships involving ipsative scales (Guilford, 1954; Siegel, 1956), their use made virtually no difference in the findings of this research. Consequently, only Pearson coefficients are reported, for comparability and generalizability to previously published research (cf. Podsakoff & Schriesheim, 1985).
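The near-equivalence is unsurprising, because Spearman's coefficient is simply Pearson's r applied to rank-transformed scores, and the ipsative scales are already ranks. A compact sketch of both coefficients (our own helper functions, using average ranks for ties):

```python
import numpy as np

def rank_transform(x):
    """Ranks 1..n, with tied values given their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):              # average the ranks of tied values
        ranks[x == v] = ranks[x == v].mean()
    return ranks

def pearson_r(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def spearman_rho(x, y):
    return pearson_r(rank_transform(x), rank_transform(y))

# A monotonic but nonlinear relationship: rho = 1 exactly, while r < 1
x, y = [1, 2, 3, 4], [1, 4, 9, 16]
print(spearman_rho(x, y), pearson_r(x, y))
```

For data that are already ranks with no ties, the rank transform is the identity, so the two coefficients can differ only through the other (non-ipsative) variable in each correlation.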

Results and Discussion

Test-Retest Reliability Analysis

The test-retest reliability estimates for the three scale sets in Sample A are presented in Table 2. Good short-term stability was obtained for the multi-item Likert scales, but the results for the single-item Likert and single-item ipsative (ranking) scales were poor. For the single-item Likert scales, the retest coefficients ranged from .28 to .43; for the ranking scales, they ranged from .22 to .39. This pattern clearly supports the concerns we raised earlier about the possibility that measurement unreliability may have attenuated research results, leading to erroneous conclusions.

LISREL Confirmatory Factor Analyses

Because Samples B and C were small, we used the multiple-group approach to LISREL analysis suggested by Joreskog and Sorbom (1984, Chap. 5), constraining parameter estimates to be equal across the two samples. Unfortunately, however, the computed MTMM matrices were not positive-definite, which precluded their analysis (this was because the eigenvalues of the 15th principal component for both matrixes were negative). Reconstructing the two MTMM matrixes by extracting the first 14 principal components and using these to compute positive-definite MTMMs caused only the correlations among the ranking (ipsative) scale items to change (none of the changes exceeded +/-.01). However, solution convergence still could not be obtained with the reconstructed matrixes (indicating a poor model fit to the data). This was apparently due to the ranking scales; when these were deleted, a satisfactory solution was obtained (goodness-of-fit index = .89; chi-square = 80.50, df = 69, ns). However, because this analysis did not allow us to examine the ranking scales, we used conventional MTMM and ANOVA analyses instead.
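The positive-definiteness diagnosis and principal-components reconstruction described above can be sketched in a few lines. This is our generic illustration of the technique, not the authors' code, and the 3 x 3 matrix is a made-up example:

```python
import numpy as np

def smallest_eigenvalue(corr):
    """Smallest eigenvalue; a negative value means the matrix is not
    positive-definite and cannot be analyzed as a covariance structure."""
    return float(np.linalg.eigvalsh(np.asarray(corr, dtype=float)).min())

def reduced_rank_corr(corr, n_components):
    """Rebuild a correlation matrix from its leading principal components,
    rescaling to a unit diagonal. The result is positive semidefinite."""
    corr = np.asarray(corr, dtype=float)
    vals, vecs = np.linalg.eigh(corr)
    keep = np.argsort(vals)[::-1][:n_components]
    approx = (vecs[:, keep] * vals[keep]) @ vecs[:, keep].T
    d = np.sqrt(np.diag(approx))
    return approx / np.outer(d, d)

# A small "correlation" matrix that is not positive-definite
bad = np.array([[1.0, 0.9, 0.1],
                [0.9, 1.0, 0.9],
                [0.1, 0.9, 1.0]])
print(smallest_eigenvalue(bad))        # negative: inadmissible as is
fixed = reduced_rank_corr(bad, 2)
print(smallest_eigenvalue(fixed))      # ~0: admissible again
```

Dropping the smallest (negative) component and renormalizing is exactly the kind of repair described in the text: only the entries tied to the offending component change appreciably.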

Conventional MTMM and ANOVA Analyses

The convergence evidence from the two MTMMs for Samples B and C are summarized in Table 3. (The full matrixes from which these data are derived are available on request.) The multi-item Likert scales converged fairly well with the single-item Likert scales. In fact, the levels of convergence between the multi-item and single-item Likert scales are not markedly lower than the coefficients usually viewed as indicative of acceptable convergent validity for multi-item scale-to-scale assessments (cf. Gillet & Schwab, 1975). Even higher levels might be expected if both sets of measures were multi-item in nature because the convergent relationships would be less attenuated by the measurement error of the single-item scales (Nunnally, 1978).

Bachman et al.'s single-item ipsative ranking scales converged only moderately to weakly with the single- and multi-item Likert scales. Although most of the coefficients were statistically significant, many would not be considered large enough (by Campbell & Fiske, 1959, among others) to warrant examining these measures further (for discriminant validity). This situation was most pronounced for the ranking-multi-item-Likert convergences, which tended to be rather low (with the exception of referent and expert power). In addition, there was absolutely no evidence of convergent validity for reward power as measured by the ranking and multi-item Likert scales. This is what one might expect given the reviews of Yukl (1989) and Podsakoff and Schriesheim (1985); in both reviews, the validity of empirical findings on reward power in particular was questioned.
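As the note to Table 3 indicates, the averages reported there pool the two sample coefficients through Fisher's z transformation (z = atanh r, averaged, then back-transformed). For example, the reward-power convergences for the two Likert formats, .34 (Sample B) and .51 (Sample C), average to .43. A minimal sketch:

```python
import math

def fisher_average(rs):
    """Average correlations via Fisher's z: z = atanh(r), mean, tanh back."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Reward power, multi-item vs. single-item Likert (Samples B and C, Table 3)
print(round(fisher_average([0.34, 0.51]), 2))   # 0.43
# Coercive power, same comparison
print(round(fisher_average([0.51, 0.64]), 2))   # 0.58
```

Applying the same transformation to any pair of sample coefficients in Table 3 reproduces the corresponding tabled average.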

The results obtained by applying Kavanagh et al.'s (1971) ANOVA approach to the MTMM correlation matrixes summarized in Table 3 are presented in Table 4. In Kavanagh et al.'s design, four major sources of score variance are examined: (a) respondent variance, which indicates the overall amount of convergence (convergent validity) among respondents over sources (the three scales) and traits (the power bases); (b) Respondent x Trait variance, which reveals the degree of discrimination on traits by respondents (discriminant validity); (c) Respondent x Source variance, which represents the amount of source bias in the data; and (d) error.

As suggested by Kavanagh et al. (1971), we also present variance component indices (VCIs) in Table 4. VCIs control for


Table 3
Summary of Scale Convergence Results

                                                 Power base
Scales/sample                Reward   Coercive   Legitimate   Referent   Expert   Average

Multi-item Likert and single-item Likert
  B                           .34**     .51**      .53**        .82**     .70**
  C                           .51**     .64**      .26**        .68**     .59**     .54**
  Average                     .43**     .58**      .40**        .75**     .65**

Single-item Likert and ranking (a)
  B                           .44**     .65**      .48**        .62**     .52**
  C                           .17       .47**      .30**        .70**     .47**     .42**
  Average                     .31**     .56**      .39**        .66**     .50**

Multi-item Likert and ranking (a)
  B                           .11       .24*       .37**        .63**     .65**
  C                           .11       .40**     -.06          .53**     .40**     .30**
  Average                     .11       .32**      .16*         .58**     .53**

Note. Averages were computed and tested for significance with the Fisher z transformation (McNemar, 1969, pp. 157-158).
a Bachman, Smith, and Slesinger (1966).
* p < .05. ** p < .01.

sample differences in error variance and are roughly interpretable as intraclass correlation coefficients (showing the amount of variance accounted for by each ANOVA effect).
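The VCI computation can be illustrated with the Sample B mean squares for the multi-item/single-item comparison in Table 4. The formula below is our reading of the index (an intraclass-type ratio in which the error mean square is subtracted from the effect mean square and the result is scaled by q, the number of observations over which the effect is averaged: traits x sources for the respondent effect, sources for Respondent x Trait, traits for Respondent x Source); it reproduces the printed VCIs:

```python
def vci(ms_effect: float, ms_error: float, q: int) -> float:
    """Intraclass-type variance component index:
    (MS_effect - MS_error) / (MS_effect + (q - 1) * MS_error)."""
    return (ms_effect - ms_error) / (ms_effect + (q - 1) * ms_error)

t, s = 5, 2        # 5 power bases (traits) x 2 scales (sources) per comparison
ms_error = 0.37    # Sample B, multi-item vs. single-item Likert (Table 4)
print(round(vci(2.41, ms_error, t * s), 2))   # respondent (convergence): 0.36
print(round(vci(1.41, ms_error, s), 2))       # respondent x trait: 0.58
print(round(vci(0.68, ms_error, t), 2))       # respondent x source: 0.14
```

The same formula applied to the other mean squares in Table 4 recovers the remaining VCIs within rounding.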

The three nonerror effects were significant for the multi- and single-item Likert scales in both samples. When the magnitudes of these effects were compared (with the VCIs), these scales demonstrated substantial convergent and discriminant validity, especially when compared with their relatively small levels of source bias. In addition, all of the nonerror effects were statistically significant for the single-item-Likert-ranking and multi-item-Likert-ranking comparisons (in both samples). However, the VCIs indicate that the magnitude of convergence was relatively small. These results support the earlier analysis, showing that Bachman et al.'s (1966) single-item ipsative ranking scales are clearly inferior to the single- and multi-item rating scales. The results also clearly indicate that the ipsative scales are fundamentally different from the single- and multi-item Likert scales.

Table 4
Results of Analysis of Variance of the Multitrait-Multimethod Matrixes

Sample B

Multi-item Likert and single-item Likert (M-S)
  Source                  df      SS      MS       F      VCI
  Respondent              52    125.40   2.41    6.51**   .36
  Respondent x Trait     208    293.30   1.41    3.81**   .58
  Respondent x Source     52     35.30   0.68    1.84**   .14
  Error                  208     76.00   0.37

Single-item Likert and ranking (S-R)
  Respondent              52     37.10   0.71    1.73**   .07
  Respondent x Trait     208    371.53   1.79    4.37**   .63
  Respondent x Source     52     36.68   0.71    1.73**   .13
  Error                  208     84.69   0.41

Multi-item Likert and ranking (M-R)
  Respondent              52     43.88   0.84    1.56**   .05
  Respondent x Trait     208    327.12   1.57    2.91**   .49
  Respondent x Source     52     46.00   0.88    1.63**   .11
  Error                  208    113.00   0.54

Sample C

Multi-item Likert and single-item Likert (M-S)
  Respondent              61    172.24   2.82    6.88**   .37
  Respondent x Trait     244    303.92   1.25    3.05**   .51
  Respondent x Source     61     44.76   0.73    1.78**   .14
  Error                  244     99.08   0.41

Single-item Likert and ranking (S-R)
  Respondent              61     47.12   0.77    1.45**   .04
  Respondent x Trait     244    393.70   1.61    3.04**   .50
  Respondent x Source     61     50.84   0.83    1.57**   .10
  Error                  244    128.34   0.53

Multi-item Likert and ranking (M-R)
  Respondent              61     58.03   0.95    1.44*    .04
  Respondent x Trait     244    337.53   1.38    2.09**   .35
  Respondent x Source     61     63.49   1.04    1.58**   .10
  Error                  244    160.95   0.66

Note. M = multi-item Likert scales; S = single-item Likert scales; R = Bachman, Smith, and Slesinger's (1966) ranking scales; VCI = variance component index.
* p < .05. ** p < .01.

Examination 2: Dependent Variable Relationships

The results of Examination 1 support our earlier discussion of measurement shortcomings in field research on French and Raven's (1959) power bases. However, the results did not address the important question of whether the results obtained with Bachman et al.'s (1966) ipsative ranking scales systematically differ from those obtained with the nonipsative rating measures. To assess this, we correlated the three sets of power-base measures with seven commonly used criterion variables. The data used were obtained from Samples B and C at the same time the data used in Examination 1 were collected. For Examination 2, we used simple bivariate correlation supplemented by chi-square analyses (for comparison with the results of Podsakoff & Schriesheim, 1985). The additional measures used in these analyses are briefly described in the following paragraphs.

Method and Measures

Satisfaction

Respondent job satisfaction was assessed with the five-item Supervisor-Human Relations and Supervisor-Technical Ability subscales of the Minnesota Satisfaction Questionnaire (MSQ; Weiss, Dawis, England, & Lofquist, 1967). Global satisfaction was measured with a composite of 12 items from six MSQ subscales: Autonomy (Items 24 and 64), Variety (5 and 25), Co-workers (16 and 36), Recognition (18 and 58), Supervisor-Human Relations (30 and 70), and Supervisor-Technical Ability (15 and 35). The MSQ was carefully developed and refined and has been subjected to multiple validity examinations with generally very positive results (e.g., Gillet & Schwab, 1975). The alpha internal consistency reliabilities were .92 and .78 for Supervisor-Human Relations, .90 and .76 for Supervisor-Technical Ability, and .86 and .78 for global satisfaction in Samples B and C, respectively.

Motivation

Yukl (1989) argued that future studies of French and Raven's (1959) power bases ought to include measures of attitudinal and behavioral compliance. Therefore, we used Patchen's (1965) four-item Work Motivation Scale in this research. Alpha internal consistency reliabilities of .56 and .58 were obtained in Samples B and C, respectively. Although these reliabilities are relatively low, this scale was retained for analysis because of its theoretical importance.

Role Clarity and Conflict

Rizzo, House, and Lirtzman's (1970) Role Ambiguity (reverse-scored for role clarity) and Role Conflict scales were administered in Samples B and C to broaden the base of dependent variables examined. Rizzo et al.'s scales have been used in numerous studies (Van Sell, Brief, & Schuler, 1981), and generally positive evidence exists on their reliability and validity (e.g., Schuler, Aldag, & Brief, 1977). Coefficient alpha reliabilities were in excess of .80 for both samples.

Commitment

Yukl (1989) noted that research on power has not included many theoretically relevant dependent variables and that commitment has not been investigated (but should be). Because of this, we administered Porter, Steers, Mowday, and Boulian's (1974) commitment measure to Samples B and C. This scale has been subjected to much use, and the data pertaining to its reliability and validity are generally positive (Mowday, Steers, & Porter, 1979); in both samples, the alpha reliabilities exceeded .70.

Results and Discussion

Satisfaction

The power-base correlations with satisfaction are presented in Table 5. As shown, the three power measures yielded considerably different results. The multi- and single-item Likert scales were positively or nonsignificantly correlated with reward power, but Bachman et al.'s (1966) ranking measure was negatively or nonsignificantly correlated with reward power. The same general pattern held for legitimate power.

The results for coercive power, shown in Table 5, likewise support the idea that ipsative measures yield systematically different results. Here, the multi-item Likert scales produced only nonsignificant correlations with satisfaction, whereas the single-item Likert scales produced modestly negative or nonsignificant relationships. However, Bachman et al.'s ipsative ranking scales produced a pattern of negative (and large) correlations between coercive power and satisfaction.

All three instruments, on the other hand, were positively related to expert and referent power, although the correlations appear stronger for the single- and multi-item Likert measures.

Role Clarity and Conflict, Motivation, and Commitment

The correlations between the power-base scales and measures of role clarity and conflict, motivation, and organizational commitment are presented in Table 6. Few substantial differences were apparent for role clarity and conflict, although the general direction of the relationships differed between several of the multi-item Likert measures and Bachman et al.'s (1966) ranking scales.

The results for motivation do, however, suggest some scale-format effects. Although the relationships for reward, expert, and referent power did not seem to differ much across scale formats, the relationships between coercive and legitimate power and the multi-item Likert measures were nonsignificant in both samples, whereas the single-item Likert and ranking scales produced substantial proportions of significant negative correlations. These results were expected for the ipsative ranking items but not for the single-item Likert rating items. Perhaps the unreliability of the motivation measure compounded the unreliability of the single-item scales. In any event, although this explanation must remain speculative, it appears that scale effects do exist for the motivation dependent variable.

Scale differences also seem to exist for organizational commitment. As shown in Table 6, the multi- and single-item Likert rating scales yielded either positive or nonsignificant relationships across all five power bases. Bachman et al.'s (1966) ipsative scales, however, yielded significant negative relationships in at least one sample for reward, coercive, and legitimate power.

Comparison of Results With Podsakoff and Schriesheim (1985)

The present results are consistent with and supportive of the idea that the existing ipsative measures of French and Raven's (1959) bases of power may have produced distorted research results, particularly for reward, coercive, and legitimate power.


Table 5
Power-Base Correlations With Satisfaction Measures

                               Global             Supervisor-          Supervisor-
                            satisfaction       technical ability     human relations
Scale/power base          Sample B  Sample C  Sample B  Sample C   Sample B  Sample C

Multi-item Likert
  Reward                    .22       .16       .08       .07        .26*      .03
  Coercive                  .07      -.15      -.10      -.07        .05      -.07
  Legitimate                .06       .04      -.04       .06        .01       .00
  Referent                  .64**     .34**     .70**     .40**      .78**     .41**
  Expert                    .68**     .29**     .79**     .41**      .72**     .33**
Single-item Likert
  Reward                    .18       .03       .07       .06        .09       .00
  Coercive                 -.18      -.19      -.28*     -.07       -.21      -.09
  Legitimate               -.22       .01      -.10      -.01       -.16      -.11
  Referent                  .49**     .10       .65**     .04        .67**     .00
  Expert                    .49**     .26*      .72**     .26*       .62**     .23*
Ranking(a)
  Reward                   -.16       .01      -.26*     -.26*      -.24*     -.15
  Coercive                 -.44**    -.39**    -.53**    -.23*      -.56**    -.24*
  Legitimate               -.23*     -.08      -.23*      .00       -.26*     -.08
  Referent                  .31*      .13       .38**     .12        .45**     .11
  Expert                    .43**     .37**     .54**     .39**      .50**     .38**

Note. (a) Bachman, Smith, and Slesinger (1966). *p < .05. **p < .01.

Moreover, the use of single-item rating scales also may produce distortions, perhaps through increased measurement error. Therefore, to further examine scaling effects, we conducted chi-square analyses similar to those used by Podsakoff and Schriesheim (1985), to provide a quantitative assessment as well as a basis for comparing the present results with theirs.

The results shown in Tables 5 and 6 were first tallied in a cross-tabulation table showing the number of positive, nonsignificant, and negative correlations obtained with the multi-item Likert scales, the single-item Likert scales, and Bachman et al.'s (1966) ranking scales. Then, as recommended by Siegel (1956), Fisher's exact probability test was employed, with cells collapsed as shown in Table 7.
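The tally-and-collapse procedure can be made concrete with a short computation. The counts below are the reward-power entries for the multi-item Likert and ranking scales from Table 7, collapsed in the footnote-c style (positive and nonsignificant combined, tested against negative); the helper function is ours, and the resulting p values illustrate the kind of 2 x 2 test described rather than reproduce the paper's exact reported values, which may reflect a different tail or collapse choice:

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """One- and two-sided Fisher exact p for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    # Hypergeometric probability of a table with the same margins and cell x
    def p(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    one_sided = sum(p(x) for x in range(a, hi + 1))  # P(first cell >= a)
    two_sided = sum(p(x) for x in range(lo, hi + 1) if p(x) <= p_obs + 1e-12)
    return one_sided, two_sided

# Reward-power counts from Table 7 (current study), collapsed:
# multi-item Likert: 14 positive-or-nonsignificant, 0 negative
# ranking:           10 positive-or-nonsignificant, 4 negative
one, two = fisher_exact_2x2(14, 0, 10, 4)
print(round(one, 3), round(two, 3))  # → 0.049 0.098
```

The collapsing step matters because Fisher's exact test applies to 2 x 2 tables, so the three outcome categories (+, 0, -) must be merged into two before testing.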

Table 6
Power-Base Correlations With Role Clarity and Conflict, Motivation, and Commitment

                           Role clarity      Role conflict       Motivation        Commitment
Scale/power base          Samp B  Samp C    Samp B  Samp C    Samp B  Samp C    Samp B  Samp C

Multi-item Likert
  Reward                   .02     .08      -.09     .04       .08     .32**     .06     .32**
  Coercive                 .04     .10       .07     .04       .13     .00      -.06     .00
  Legitimate              -.01     .17       .00     .03      -.21     .05       .13     .05
  Referent                 .14     .02      -.44**  -.07       .14     .19       .48**   .19
  Expert                   .26*   -.12      -.50**  -.02       .06     .14       .53**   .14
Single-item Likert
  Reward                   .08     .06       .05     .02       .20     .11       .06     .17
  Coercive                -.12     .02       .20     .05      -.19    -.25*     -.09    -.06
  Legitimate              -.35**   .15       .17     .03      -.28*   -.32**    -.08    -.08
  Referent                 .01    -.17      -.36**   .05       .14     .24*      .38**   .19
  Expert                   .11    -.14      -.47**  -.23*     -.02     .04       .55**   .24*
Ranking(a)
  Reward                  -.16     .13       .21     .09       .09     .21      -.27*    .06
  Coercive                -.08    -.05       .32**   .19      -.16    -.11      -.31*   -.22*
  Legitimate              -.14    -.04       .05     .02      -.43**  -.40**    -.03    -.25*
  Referent                 .11    -.05      -.17    -.08       .32**   .28*      .15     .19
  Expert                   .19     .03      -.36**  -.23*      .07    -.08       .40**   .20

Note. (a) Bachman, Smith, and Slesinger (1966). *p < .05. **p < .01.


Table 7
Chi-Square Comparisons of Multi-Item Likert, Single-Item Likert, and Ranking Scale Correlations With Dependent Variables

                            Reward       Coercive     Legitimate    Referent      Expert
Scale                      +   0   -    +   0   -     +   0   -     +   0   -    +   0   -

Current study
Multi-item Likert (M)      3  11   0    0  14   0     0  14   0     7   6   1    8   5   1
Single-item Likert (S)     0  14   0    0  12   2     0  11   3     5   8   1    8   4   2
Ranking (R)(a)             0  10   4    1   5   8     0   8   6     5   9   0    7   5   2
M-S                        p<.10(b)     ns            p<.10(b)      ns           ns
S-R                        p<.05(c)     p<.05         ns            ns           ns
M-R                        p<.05        p<.05(c)      p<.05(c)      ns           ns

Results reported by Podsakoff and Schriesheim (1985, Table 3)(d)
M                          5  14   0    0  15   4     8  11   0    14   5   0   12   7   0
R                          5  51   8    1  46  17     2  54   8    32  31   1   26  38   0
M-R                        p<.10        ns            p<.05        ns           ns

Note. Table entries shown are the number of significant (+ or -) and nonsignificant (0) correlations between each power base and the seven dependent variables of Tables 5 and 6 for Samples B and C; the significance levels shown are based on the Fisher exact probability test for 2 x 2 tables (Siegel, 1956).
(a) Bachman, Smith, and Slesinger (1966). (b) For this analysis, negative (-) column entries were combined with nonsignificant (0) entries. (c) For this analysis, positive (+) column entries were combined with nonsignificant (0) entries. (d) From "Field Studies of French and Raven's Bases of Power: Critique, Reanalysis, and Suggestions for Future Research" by P. M. Podsakoff and C. A. Schriesheim, 1985, Psychological Bulletin, 97, p. 405. Copyright 1985 by the American Psychological Association. Adapted by permission.

The Fisher's test results yielded similar patterns of correlations across the multi- and single-item Likert scales for coercive, referent, and expert power. For reward and legitimate power, however, the multi- and single-item distributions of correlations differed (p < .10). The single-item Likert scales and Bachman et al.'s (1966) ranking scales produced significantly different correlation patterns for reward and coercive power, but there was no difference in the correlation patterns for legitimate, referent, or expert power. Finally, the multi-item Likert scales produced significantly different patterns of dependent variable correlations than did Bachman et al.'s ipsative scales for reward, coercive, and legitimate power. Overall, although the distortions obtained in Samples B and C are not exactly the same as those found by Podsakoff and Schriesheim (1985), they are consistent with regard to reward, legitimate, and expert power; they also reveal scale effects for coercive power and suggest differential patterns for the single- and multi-item power rating scales.

Further Discussion

Before concluding, we briefly highlight and discuss two particularly interesting substantive findings (shown in Tables 5 and 6). When measured with reliable multi-item nonipsative scales, reward power tended to have weak positive relationships with affective subordinate outcome variables, whereas coercive power had only nonsignificant relationships with those variables.

These findings suggest that Podsakoff and Schriesheim (1985) were correct in arguing that the discrepancy between the literatures on leader reinforcement behavior and social power in organizations is artifactual. However, it seems worth mentioning that this discrepancy is probably not due to artifacts alone. Leader reinforcement research has produced a pattern of strong positive relationships between contingent leader reward behavior and subordinate affective outcomes but mostly nonsignificant relationships between contingent punishment and subordinate affective outcomes. An exception is the relation between subordinate role clarity and organizational commitment, which Williams and Podsakoff (1988) found to be moderately positive. These patterns do not coincide with the current findings (see Tables 5 and 6), but they may be reconciled by the fact that power is a potential, so that not all leaders necessarily use their power to contingently reward or punish subordinates. Thus, post hoc, it seems reasonable to expect both reward and coercive power to have weaker and less consistent relationships with subordinate outcomes than do contingent reward and punishment behavior, in conformity with Tables 5 and 6.

Conclusion

The results of this study support the assertions of Podsakoff and Schriesheim (1985) and Yukl (1989) that the literature on French and Raven's (1959) bases of power contains seriously distorted relationships with dependent variables. Thus, much additional research is needed in this domain, both to build a literature to replace the existing one and to provide empirical results that can be trusted for future research and theory construction.

The finding that not much trust can be placed in the bulk of the research that has been conducted on French and Raven's (1959) five power bases is disturbing; there is a strong need for


research in which psychometrically sound measures of social power in organizations are used (cf. Schwab, 1980). The multi-item Likert scales used in this investigation could serve as a starting point for developing and validating scales that measure power from the perspective of French and Raven. However, it seems preferable to us to use the more thoroughly developed measures of Hinkin and Schriesheim (1989), particularly given their conceptual clarification and modification of French and Raven's (1959) typology. Hinkin and Schriesheim defined all five power bases so that they are conceptually consistent with respect to their origin (i.e., arising from an agent's ability to mediate valued outcomes for a target). As mentioned earlier, this conceptualization departs from French and Raven's treatment of legitimate and referent power, which they saw as arising from felt legitimacy and identification, respectively. In any event, more care and attention should be devoted to the use of psychometrically sound measures in this domain, to ensure that future substantive knowledge rests on a firm foundation.

In conclusion, the results of this study indicate that researchers know far less about the correlates of French and Raven's (1959) five power bases than was previously thought and that the existing literature should be interpreted with great caution. In addition, much new research appears to be needed to develop a body of sound knowledge on social power in organizations.

References

Bachman, J. G., Smith, C. G., & Slesinger, J. A. (1966). Control, performance, and satisfaction: An analysis of structural and individual effects. Journal of Personality and Social Psychology, 4, 127-136.

Bass, B. M. (1981). Stogdill's handbook of leadership (Rev. ed.). New York: Free Press.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.

French, J., & Raven, B. H. (1959). The bases of social power. In D. Cartwright (Ed.), Studies in social power (pp. 150-167). Ann Arbor: University of Michigan, Institute for Social Research.

Gaski, J. F. (1986). Interrelations among a channel entity's power sources: Impact of the exercise of reward and coercion on expert, referent, and legitimate power sources. Journal of Marketing Research, 23, 62-77.

Gillet, B., & Schwab, D. P. (1975). Convergent and discriminant validities of corresponding Job Descriptive Index and Minnesota Satisfaction Questionnaire scales. Journal of Applied Psychology, 60, 313-317.

Guilford, J. P. (1954). Psychometric methods (2nd ed.). New York: McGraw-Hill.

Hicks, L. E. (1970). Some properties of ipsative, normative, and forced-choice normative measures. Psychological Bulletin, 74, 167-184.

Hinkin, T. R., & Schriesheim, C. A. (1989). Development and application of new scales to measure the French and Raven (1959) bases of social power. Journal of Applied Psychology, 74, 561-567.

House, R. J. (1988). Power and personality in complex organizations. Research in Organizational Behavior, 10, 305-357.

Jöreskog, K. G., & Sörbom, D. (1984). LISREL VI: Analysis of linear structural relationships by maximum likelihood, instrumental variables, and least squares methods. Mooresville, IN: Scientific Software.

Kavanagh, M. J., MacKinney, A. C., & Wolins, L. (1971). Issues in managerial performance: Multitrait-multimethod analyses of ratings. Psychological Bulletin, 75, 34-49.

McNemar, Q. (1969). Psychological statistics (4th ed.). New York: Wiley.

Mintzberg, H. (1983). Power in and around organizations. Englewood Cliffs, NJ: Prentice-Hall.

Mowday, R. T., Steers, R. M., & Porter, L. W. (1979). The measurement of organizational commitment. Journal of Vocational Behavior, 14, 224-247.

Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.

Patchen, M. (1965). Some questionnaire measures of employee motivation and morale. Ann Arbor: University of Michigan, Institute for Social Research.

Podsakoff, P. M., & Schriesheim, C. A. (1985). Field studies of French and Raven's bases of power: Critique, reanalysis, and suggestions for future research. Psychological Bulletin, 97, 387-411.

Porter, L. W., Steers, R. M., Mowday, R. T., & Boulian, P. V. (1974). Organizational commitment, job satisfaction, and turnover among psychiatric technicians. Journal of Applied Psychology, 59, 603-609.

Rizzo, J. R., House, R. J., & Lirtzman, S. I. (1970). Role conflict and ambiguity in complex organizations. Administrative Science Quarterly, 15, 150-163.

Schmitt, N., & Stults, D. M. (1986). Methodology review: Analysis of multitrait-multimethod matrices. Applied Psychological Measurement, 10, 1-22.

Schuler, R. S., Aldag, R. J., & Brief, A. P. (1977). Role conflict and ambiguity: A scale analysis. Organizational Behavior and Human Performance, 20, 119-128.

Schwab, D. P. (1980). Construct validity in organizational behavior. Research in Organizational Behavior, 2, 3-43.

Siegel, S. (1956). Nonparametric statistics. New York: McGraw-Hill.

Student, K. R. (1968). Supervisory influence and work-group performance. Journal of Applied Psychology, 52, 188-194.

Thamhain, H. J., & Gemmill, G. R. (1974). Influence styles of project managers: Some project performance correlates. Academy of Management Journal, 17, 216-224.

Van Sell, M., Brief, A. P., & Schuler, R. S. (1981). Role conflict and ambiguity: Integration of the literature and directions for future research. Human Relations, 34, 43-71.

Weiss, D. J., Dawis, R. V., England, G. W., & Lofquist, L. H. (1967). Manual for the Minnesota Satisfaction Questionnaire (Minnesota Studies in Vocational Rehabilitation No. 22). Minneapolis: University of Minnesota, Industrial Relations Center.

Williams, M. L., & Podsakoff, P. M. (1988). A meta-analysis of attitudinal and behavioral correlates of leader reward and punishment behaviors. In D. F. Ray (Ed.), Southern Management Association proceedings (pp. 161-163). Mississippi State: Southern Management Association.

Yukl, G. A. (1989). Leadership in organizations (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.

Received September 15, 1989
Revision received August 20, 1990
Accepted August 22, 1990