Tuesday, April 01, 2008

The Journal of Social Psychology, 1984, 124, 237-246

THE GREAT ANDROGYNY MYTH: Sex Roles and Mental Health in the Community at Large



JOHN J. RAY & F. H. LOVEJOY

School of Sociology, University of New South Wales, Australia

ABSTRACT

Previous research on androgyny has focused heavily on college-student populations and has used measuring instruments open to several methodological criticisms. The present research used a random postal sample (N = 214) of an entire state and a modified form of a new sex role inventory by Antill, Cunningham, Russell, and Thompson (1981). The modifications were designed to control against acquiescent response bias. Only a minority of the items discriminated the male and female respondents, making scale construction difficult. Two 12-item inventories of male and female orientation were produced but without balancing against acquiescence. Femininity was found to correlate strongly with neuroticism, lack of assertiveness, and lack of self-esteem. Masculine orientation, however, also went with low self-esteem and low assertiveness. It was concluded that the best mental health was shown by undifferentiated respondents (those who tended to say that sexually polarized self-descriptions were inapplicable to them). Those who were androgynous were generally low scorers on the three indices of mental health.


IN A RECENT STUDY with the Bem (1974) sex role inventory, Myers and Gonda (1982) found that the inventory was completely invalid in a most basic sense. When it was applied to groups of respondents other than college students, Myers and Gonda found that items said by Bem to measure general community stereotypes of what is desirable in, for instance, masculinity did not in fact correspond to those stereotypes. Previous work on androgyny based on college-student samples, then, becomes very suspect. Myers and Gonda also suggested that work with other scales constructed with student samples, such as the PAQ, would be similarly flawed (Beere, 1979; Spence & Helmreich, 1978).

A very interesting sex role scale is, therefore, the recent inventory by Antill et al. (1981). This inventory was specifically constructed to avoid several of the methodological criticisms that have been aimed at the Bem inventory (e.g., Pedhazur & Tetenbaum, 1979) but is of the same general type and format. That Antill et al. used at least some non-student raters during scale construction may also make their scales immune to the Myers and Gonda (1982) criticisms.

One possible source of artifact that remains uncontrolled in even the Antill scales, however, is acquiescent response bias. The masculine scale, for example, contains only pro-masculine items. There are no anti-masculine items. There is now extensive evidence that this can be a source of serious artifact (Altemeyer, 1981; Block, 1965; Christie, Havel, & Seidenberg, 1956; Martin, 1964; Ray, 1972a, 1972b, 1979a, 1983, 1984; Ray & Pratt, 1979; Rorer, 1965). Such scales may simply measure a tendency toward careless responding. Balancing the Antill scales to eliminate this problem, then, seemed necessary.

Method

To enable balancing of the Antill scales, any item in either their Form A or Form B that could be replaced by an antonym was so replaced. This meant that 35 Antill items were administered unaltered, and 45 were replaced by their antonyms. Other scales administered were Eagly's (1967) revision of the Janis-Field self-esteem scale, the short Eysenck (1959, 1969) Neuroticism scale, a short Marlowe-Crowne (Greenwald & Satow, 1970) social desirability scale, and the Ray Directiveness scale (Ray, 1976, 1980, 1981; Ray & Lovejoy, 1983).

The measurement of assertiveness, although of obvious importance in this research area, posed a difficulty in that there is no real agreement about what it is (Burkhart, Green, & Harrison, 1969; Law, Wilson, & Crassini, 1979; Lawrence, 1970). It was arbitrarily decided, therefore, to use the 14-item Ray (1980) Directiveness scale. This is generically a dominance scale and was originally written to measure dominance of an authoritarian kind. Its correlates, however, show that it is associated generally with good mental health (Ray, 1979b), and Ray (1981) has pointed out at some length that there are many points of contact between the authoritarianism and assertiveness literatures. Lorr and More (1980) have also identified directiveness as one of the four main subtypes of assertiveness, although they measure it with a scale of their own construction.

All the above scales were included in a single questionnaire mailed out under a university letterhead to 950 people selected at random from the registered voter lists of the Australian state of New South Wales. Because voter registration in Australia is compulsory for all adult citizens and even for many non-citizens, the sampling frame was unusually comprehensive. A total of 214 (23%) usable replies were received, from 88 men and 126 women. Aside from the slight imbalance in the sex ratio, the distribution of demographic characteristics (age, occupation, and education) did not differ significantly from that observed in contemporaneous samples obtained in the Sydney metropolitan area (Sydney is Australia's largest city and accounts for 3 million of the state's 5 million people). Thus, although the response rate was low, it did not result in any noticeable biasing of the sample in terms of basic demographic criteria. The total sample may perhaps seem small, but with such a sample correlations explaining as little as 2% of the variance are significant.
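The closing claim about sample size can be checked with the usual t test for a Pearson correlation. The sketch below is illustrative only: it assumes a two-tailed test at the .05 level, for which the critical t with 212 degrees of freedom is roughly 1.97, and the function name is hypothetical.

```python
import math

def t_for_correlation(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

n = 214                # usable replies in the present sample
r = math.sqrt(0.02)    # a correlation that explains 2% of the variance
t = t_for_correlation(r, n)

# Assumption: two-tailed test at .05, critical t for 212 df is about 1.97
print(round(r, 3), round(t, 2), t > 1.97)
```

Under these assumptions a correlation of about .14 just clears the threshold, which agrees with the statement in the text.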

Results

Given that Antill et al. used some non-student raters in developing their scales, an unexpected and awkward finding (though one in line with what Myers & Gonda, 1982, predicted) was that most of the Antill items, whether originals or their antonyms, did not in fact discriminate between men and women. Only 36 of the 80 items separated the men and women among the respondents by one fourth of a standard deviation or more. In other words, on many of the supposedly male-role and female-role items, men and women were equally likely to get high (or low) scores. This may have some significance in and of itself. If a scale of supposedly male or female sex roles is to have any validity at all, however, it should show men as, on the whole, following male sex roles and women as, on the whole, following female sex roles. Although there may be a great deal of role overlap in the community, it is the aim of sex role research to detect and study those roles that do not totally overlap. Furthermore, if men do not score especially high on an item believed to measure, for example, a male characteristic, the claim that it measures a male characteristic is arbitrary. Although it is true that the Bem inventory discriminates men and women at the total score level rather than at the item level, its approach, as has already been mentioned, is unsatisfactory.
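The item-screening rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the paper does not say which standard deviation was used, so the sketch assumes the item's SD over all respondents, and the function names and response data are made up.

```python
from statistics import mean, pstdev

def standardized_sex_difference(male_scores, female_scores):
    """Male mean minus female mean, expressed in overall-SD units."""
    overall_sd = pstdev(list(male_scores) + list(female_scores))
    if overall_sd == 0:
        return 0.0  # no variation at all, so the item cannot discriminate
    return (mean(male_scores) - mean(female_scores)) / overall_sd

def discriminates(male_scores, female_scores, threshold=0.25):
    """True if the sexes differ by at least `threshold` SDs in either direction."""
    return abs(standardized_sex_difference(male_scores, female_scores)) >= threshold

# Made-up responses to two hypothetical items (assumed 1-5 response format)
item_a = {"men": [4, 5, 3, 4, 5], "women": [2, 3, 3, 2, 4]}
item_b = {"men": [3, 4, 3, 4], "women": [3, 4, 4, 3]}

print(discriminates(item_a["men"], item_a["women"]))
print(discriminates(item_b["men"], item_b["women"]))
```

The sign of the standardized difference indicates which sex scores higher, which is what would sort a retained item into the male-leaning or female-leaning pool.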

Under the circumstances, then, further scale development was confined to the 36 discriminating items mentioned above. Unfortunately, the reduced size of the available pool of items meant that the aim of producing balanced scales could not be realized. An attempt was made to construct two scales using equal proportions of original and reversed Antill items, but this restriction led to scales for masculinity and femininity with reliabilities (alpha) of only .40 and .30, respectively. More conventional one-way-worded scales, therefore, had to be resorted to if any sex role scales that would be applicable to the general population were to be produced. Acquiescence had to be controlled for by means other than balanced scales.
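The reliabilities quoted in this section are coefficient alpha. For readers who want to see what that index involves, a minimal sketch follows; the function name and the small data set are hypothetical, and the calculation is the standard textbook formula rather than anything specific to the present study.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Coefficient alpha.

    responses: one list per respondent, each holding k item scores.
    """
    k = len(responses[0])
    # Variance of each item taken across respondents
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of the total (summed) scale score
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Made-up data: four respondents answering a two-item scale
demo = [[1, 1], [2, 3], [3, 2], [4, 4]]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

Alpha rises as the items correlate more strongly with one another. Mixing original and reversed items that correlate weakly would therefore be expected to depress it, as happened with the balanced scales described above.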

Of the 36 discriminating items, 16 showed higher mean scores among men, and 20 showed higher mean scores among women. Some of these, however, correlated so weakly with one another that the two scales produced to measure masculinity and femininity contained only 12 items each. Their reliabilities (alpha) were .57 and .64, respectively. They did not correlate significantly with each other. Their correlations with other scales are presented in Table 1. Following the emphasis by Taylor and Hall (1982) on the separate study of masculinity and femininity, these results were the ones most heavily relied upon in assessing the implications of the present study. A high score on either male-polarized or female-polarized attributes was related to low self-esteem (high scores indicate low esteem) and low assertiveness. Neuroticism, however, was related only to subscribing to female roles. Because neither the male nor the female scale correlated with social desirability, none of these results can be dismissed as due to social desirability.

The items making up the general-population masculinity scale were as follows: formal, bashful, seldom outspoken, not fussy, crude, insensitive, hard-hearted, mild, inconsiderate, reserved, quiet, and "I seldom if ever cry." The items making up the general-population femininity scale were: need approval, nervous, lacking confidence, "I feel inferior," unathletic, dreamy, little or no mechanical ability, "I change my mind easily," gullible, religious, unwilling to take risks, and excitable in a major crisis.

The simplest measure of androgyny would seem to be a high summed score on the masculinity and femininity scales, obtained only by scoring high on both male and female attributes, which is what generally seems to be meant by psychological androgyny. People with a low summed score would be undifferentiated, and people with a conventional one-role predominance would get middling scores. Such a measure was, therefore, calculated, and high scorers on this measure of androgyny were found to be unassertive (r = -.415, p < .001), neurotic (r = .334, p < .001), and of low self-esteem (r = .594, p < .001). These are high correlations in comparison with much that is reported in the psychological literature and are uniformly unfavorable to the feminist hypothesis as enunciated by Bem (1974) and her successors.

In a third analysis of the data, all subjects were divided into low and high scorers on each of the two scales. High and low were defined as above and below the scale mean, respectively, rather than as above and below the scale median. Although use of the median is more usual, it was felt that dividing respondents into exactly equal groups was too artificial because there might have been an empirical tendency in the sample toward a greater proportion of, for example, male-oriented persons than female-oriented persons. Use of the mean gave the data some influence on the frequencies in the various categories, as can be seen from the variable ns in the different subgroups of Table 2. This table gives the means of the three mental health measures for each of the four possible combinations of dichotomized sex role scores. The conclusions to be drawn from Table 2 do not seem to differ markedly from those already drawn on the basis of the previous analyses. Undifferentiated respondents seem to have the best mental health and androgynous respondents, the worst.
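The fourfold classification used in this analysis can be sketched as below. The sketch is illustrative: the function name and scores are made up, and because the text does not say how a score exactly equal to the mean was treated, such scores are counted here as low, which is an assumption.

```python
from statistics import mean

ORIENTATIONS = {
    (True, False): "masculine",          # high M, low F
    (False, True): "feminine",           # low M, high F
    (True, True): "androgynous",         # high M, high F
    (False, False): "undifferentiated",  # low M, low F
}

def classify(m_scores, f_scores):
    """Label each respondent by a mean split on the two sex role scales."""
    m_mean, f_mean = mean(m_scores), mean(f_scores)
    # Assumption: a score exactly at the mean counts as "low"
    return [ORIENTATIONS[(m > m_mean, f > f_mean)]
            for m, f in zip(m_scores, f_scores)]

# Made-up scale totals for four hypothetical respondents
masculinity = [10, 30, 30, 10]
femininity = [10, 10, 30, 30]
print(classify(masculinity, femininity))
```

Because the cut points are means rather than medians, the four groups need not be equal in size, which is the property the text relies on.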

The method adopted to control for acquiescent response bias derives from Martin's (1964) suggestion that an independent measure of acquiescence can be obtained by summing scores on a balanced scale without doing any reverse scoring. Such a score was obtained from the Directiveness scale for each respondent and correlated with scores on the other (substantively scored) scales. Acquiescence was found to correlate .36 (p < .001) with scores on the Eysenck Neuroticism scale, .37 (p < .001) with scores on the self-esteem inventory, .14 (p < .05) with scores on the Masculinity scale, and .44 (p < .001) with scores on the Femininity scale. This implies that there was a pervasive influence of acquiescence throughout the battery of scales.
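Martin's suggestion amounts to scoring the same balanced scale in two ways. The sketch below illustrates the idea with a hypothetical four-item balanced scale and an assumed 1-5 response format; neither the item keying nor the response format of the actual Directiveness scale is given in the text, and the function names are invented for the example.

```python
def substantive_score(responses, reversed_items, low=1, high=5):
    """Usual scale score: negatively worded items are reverse-keyed first."""
    return sum((low + high - r) if i in reversed_items else r
               for i, r in enumerate(responses))

def acquiescence_score(responses):
    """Martin-style index: raw agreement summed with no reverse scoring."""
    return sum(responses)

# Hypothetical balanced scale: items at positions 1 and 3 are negatively worded
reversed_items = {1, 3}

yea_sayer = [5, 5, 5, 5]   # agrees with everything, whatever the content
directive = [5, 1, 5, 1]   # answers consistently in the directive direction

print(acquiescence_score(yea_sayer), substantive_score(yea_sayer, reversed_items))
print(acquiescence_score(directive), substantive_score(directive, reversed_items))
```

On a balanced scale the indiscriminate agreer ends up at the midpoint of the substantive score but at the top of the acquiescence index, which is why the unreversed sum can serve as a separate measure.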

Partial correlations were carried out to remove the effect of acquiescence from the correlations between the sex role scales and the other personality scales. The initial relationships were, however, so strong that this did not affect any correlation substantially or alter its level of significance. Because the Directiveness scale was balanced against acquiescence, acquiescence could not be a common influence underlying the correlations between assertiveness and sex role.
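The first-order partial correlation used for this kind of control can be written out directly. The sketch below applies the standard formula to the zero-order coefficients reported in the text and Table 1 for femininity, neuroticism, and acquiescence. The partialled values themselves are not reported in the paper, so the result is only an illustrative reconstruction, and the function name is hypothetical.

```python
import math

def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

r_fem_neur = 0.49   # Femininity with Neuroticism (Table 1, total sample)
r_fem_acq = 0.44    # Femininity with acquiescence
r_neur_acq = 0.36   # Neuroticism with acquiescence

adjusted = partial_r(r_fem_neur, r_fem_acq, r_neur_acq)
print(round(adjusted, 3))
```

On these figures the adjusted coefficient stays far above the value of roughly .14 that is needed for significance in a sample of 214.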

Finally, of the 35 Antill items used unaltered for the purposes of the present survey, only 9 fell into the group of 36 valid items found to be usable for scale construction. Hence, the poor validity observed in the body of items available for the present analyses was not due to alterations in the Antill items. In fact, the altered items had a higher success rate in differentiating men and women: 27 of the 45 altered items (60%) discriminated, against 9 of the 35 unaltered items (26%).



TABLE 1

Correlates of the Two Sex Role Scales of Masculinity and Femininity

                        Total sample (N = 214)       Men only (n = 88)    Women only (n = 126)
Variable                 Masculine    Feminine   Masculine    Feminine   Masculine    Feminine

Directiveness                 -.28        -.34        -.04        -.45        -.53        -.27
Neuroticism                   -.04         .49         .11         .42         .09         .45
Social Desirability            .10        -.08        -.03        -.04         .19        -.10
Self-esteem (a)                .24         .62         .37         .56         .46         .60
Sex                           -.45         .35

(a) A high score on the self-esteem scale indicates low self-esteem.



TABLE 2

The Means on Three Mental Health Measures for all Possible Combinations of Scores on the Sex Role Scales

                           Directiveness     Self-esteem (b)      Neuroticism
Orientation (a)        N        M       SD        M       SD        M       SD

Males

Masculine             40    29.80     6.39    41.70     8.02    10.42     4.04
Feminine               5    21.40     3.44    48.80     9.06    12.80     3.65
Androgynous           21    26.61     7.65    49.95     9.37    13.04     4.01
Undifferentiated      22    30.09     5.13    38.54     6.28    10.54     3.50
F ratio                      3.58*             8.26**            2.48

Females

Masculine              7    31.28     5.99    41.14     9.03    12.28     2.65
Feminine              49    28.44     7.48    49.49    10.47    14.44     3.42
Androgynous           33    23.93     5.28    57.93    10.60    13.93     3.24
Undifferentiated      37    30.37     6.78    43.18     7.83    11.48     3.50
F ratio                      6.07*            14.70**            5.91**

Grand mean, males     88    28.63     6.70    43.28     9.15    11.21     4.05
Grand mean, females  126    27.98     7.15    49.38    11.35    13.31     3.60
t ratio                       .66              4.15**            4.13**

(a) Masculine = high M, low F; feminine = low M, high F; androgynous = high M, high F; undifferentiated = low M, low F. (b) A high score on the self-esteem scale indicates low self-esteem.

*p < .05. **p < .01.



Discussion

The present paper has been, in a sense, a deliberate attempt to reinvent the wheel. Very little of what has become accepted practice in this research area was taken on faith. If the various relationships reported in the existing literature were accurate reflections of reality, the attempt at thoroughly independent replication should have succeeded. It did not. The conclusions could hardly be more different. The usual finding to date has been that either androgyny or masculinity confers mental health advantages (Taylor & Hall, 1982). The present work appears to be the first occasion on which undifferentiated respondents have been shown to be most advantaged in mental health terms.

The present findings, however, fit in very well with those of Myers and Gonda (1982) that turning to non-college respondents reveals as totally invalid items carefully checked for validity on college students. Community stereotypes about sex roles turn out to be college student stereotypes only. Given the limited importance of the college student population in the world at large, existing work may at best be trivial.

Even an interpretation of the existing findings as reflecting reality among college students only, however, may be unduly expansive. To accept it would leave us to explain why the situation among college students is not only different from that in the world at large but is, in fact, opposite in important senses. Because it seems rather improbable that reality is so perversely arranged, we must seriously consider another possibility: the Rosenthal (experimenter-expectation) effect. It seems that feminist views are widely respected in academe, even to the point of being normative. Could it not be, therefore, that our students have in previous research simply given back to us what they thought we expected to hear? The possibility of experimenter expectations having affected responses in the present study, on the other hand, appears to be small. Mailed questionnaires seem to be a thoroughly impersonal means of data gathering.

In conclusion, then, it appears that in the population at large it is maladaptive to see oneself in terms defined by sex roles. It is maladaptive to see oneself in ways that are either characteristically male or characteristically female. A fortiori, it must be even more maladaptive to see oneself as androgynous as that word is generally now used in the psychological literature. The finding that undifferentiated personalities are most advantageous is, however, still quite congenial to feminist theory. This is because the finding conflicts with the traditional point of view that had begun to receive some support in recent research (Antill & Cunningham, 1980; Taylor & Hall, 1982), that is, the view that it is male sex roles that are most adaptive.

REFERENCES

Altemeyer, R. A. (1981). Right-wing authoritarianism. Winnipeg: University of Manitoba Press.

Antill, J. K., & Cunningham, J. D. (1980). The relationship of masculinity, femininity, and androgyny to self-esteem. Australian Journal of Psychology, 32, 195-207.

Antill, J. K., Cunningham, J. D., Russell, G., & Thompson, N. L. (1981). An Australian sex-role scale. Australian Journal of Psychology, 33, 169-183.

Beere, C. A. (1979). Women and women's issues: A handbook of tests and measures. San Francisco: Jossey Bass.

Bem, S. (1974). The measurement of psychological androgyny. Journal of Consulting & Clinical Psychology, 42, 155-162.

Block, J. (1965). The challenge of response sets. New York: Appleton Century.

Burkhart, B. R., Green, S. B., & Harrison, W. M. (1969). Measurement of assertive behavior: Construct and predictive validity of self-report, role-playing and in-vivo measures. Journal of Clinical Psychology, 35, 376-383.

Christie, R., Havel, J., & Seidenberg, B. (1956). Is the F scale irreversible? Journal of Abnormal and Social Psychology, 56, 141-158.

Eagly, A. H. (1967). Involvement as a determinant of response to favourable and unfavourable information. Journal of Personality and Social Psychology Monograph, 7(3, Whole No. 643).

Eysenck, H. J. (1959). Manual of the Maudsley Personality Inventory. London: University of London Press.

Eysenck, H. J., & Eysenck, S. B. G. (1969). Personality structure and measurement. London: Routledge.

Greenwald, H. J., & Satow, Y. (1970). A short social desirability scale. Psychological Reports, 27, 131-135.

Law, H. G., Wilson, E., & Crassini, B. (1979). A principal components analysis of the Rathus Assertiveness Schedule. Journal of Consulting and Clinical Psychology, 47, 631-633.

Lawrence, P. S. (1970). The assessment and modification of assertive behavior. Dissertation Abstracts International, 31, 173-971B.

Lorr, M., & More, W. W. (1980). Four dimensions of assertiveness. Multivariate Behavioral Research, 2, 127-138.

Martin, J. (1964). Acquiescence: Measurement and theory. British Journal of Social and Clinical Psychology, 3, 216-225.

Myers, A. M., & Gonda, G. (1982). Empirical validation of the Bem sex-role inventory. Journal of Personality and Social Psychology, 43, 304-318.

Pedhazur, E. J., & Tetenbaum, T. J. (1979). Bem sex-role inventory: A theoretical and methodological critique. Journal of Personality and Social Psychology, 37, 996-1016.

Ray, J. J. (1972a). Are conservatism scales irreversible? British Journal of Social and Clinical Psychology, 11, 346-352.

Ray, J. J. (1972b). A new balanced F scale -- And its relation to social class. Australian Psychologist, 7, 155-166.

Ray, J. J. (1976). Do authoritarians hold authoritarian attitudes? Human Relations, 29, 307-325.

Ray, J. J. (1979a). Is the acquiescent response style not so mythical after all? Some results from a successful balanced F scale. Journal of Personality Assessment, 43, 638-643.

Ray, J. J. (1979b). The authoritarian as measured by a personality scale: Solid citizen or misfit? Journal of Clinical Psychology, 35, 744-746.

Ray, J. J. (1980). Authoritarianism in California 30 years later -- with some cross-cultural comparisons. Journal of Social Psychology, 111, 9-17.

Ray, J. J. (1981). Authoritarianism, dominance and assertiveness. Journal of Personality Assessment, 45, 390-397.

Ray, J. J. (1983). Defective validity of the Machiavellianism scale. Journal of Social Psychology, 119, 291-292.

Ray, J. J. (1983). Reviving the problem of acquiescent response bias. Journal of Social Psychology, 121, 81-96.

Ray, J. J., & Lovejoy, F. H. (1983). The behavioral validity of some recent measures of authoritarianism. Journal of Social Psychology, 120, 91-99.

Ray, J. J., & Pratt, G. J. (1979). Is the influence of acquiescence on "catchphrase" type attitude scale items not so mythical after all? Australian Journal of Psychology, 31, 73-78.

Rorer, L. G. (1965). The great response style myth. Psychological Bulletin, 63, 129-156.

Spence, J. T., & Helmreich, R. L. (1978). Masculinity and femininity: Their psychological dimensions. Austin, TX: University of Texas Press.

Taylor, M. C., & Hall, J. A. (1982). Psychological androgyny: Theories, methods and conclusions. Psychological Bulletin, 92, 347-366.