Acculturation and the labor market in Mexico

*Correspondence: jcanourbina@fsu.edu
1Department of Economics, The Florida State University, 113 Collegiate Loop, 257 Bellamy Building, Tallahassee, FL 32306, USA
Full list of author information is available at the end of the article

Abstract
This paper empirically examines the relationship between self-identification as Indigenous and earnings inequality in the Mexican labor market. Using Mexican Census data and a large set of wage covariates, we find an earnings penalty for self-identification as Indigenous. There is an additional and larger penalty for Indigenous persons who are fluent in an Indigenous language, regardless of Spanish language fluency. Further analyses using the Mexican Family Life Survey reveal that these earnings gaps persist after we also control for an individual's cognitive ability. Ethno-linguistic inequality is particularly strong in smaller cities and among self-employed workers.


Introduction
There is a vast literature in economics that addresses the issue of group differences in labor market outcomes (e.g. Lang, 1986; Neal and Johnson, 1996; Darity Jr. and Mason, 1998; Mason, 2004; Darity Jr. et al., 2006; Lang and Manove, 2011). This paper addresses group differences in labor market outcomes in Mexico in light of the wealth of literature for the United States. Our study focuses on the differences in labor market outcomes between Indigenous workers and those who are not Indigenous. We find that Indigenous workers carry a significant penalty in the labor market and that this penalty increases with the level of attachment to the Indigenous group. In particular, we find an earnings penalty for individuals who self-identify as members of an Indigenous group, and an even higher penalty for those who also speak their Indigenous language.
This result persists after including a large set of controls that include, among others, the individual's education, cognitive ability and fluency in Spanish.
There are many reasons to believe that the situation in the Mexican labor market is similar to that of other Latin American and Caribbean countries where there exist clear ethnic differences in the population. For example, Atal, Ñopo, and Winder (2009) survey gender and ethnic wage gaps in 18 Latin American countries.
However, Mexico is a particularly interesting case for the analysis of group differences in the labor market given the ideological and legal frameworks of this country. First, Mexico celebrates its racial and cultural mixing through the social ideal of mestizaje. Mestizaje is characterized by the belief that out of many Indigenous, African, and European peoples with diverse historical and political economic experiences, Mexico has formed one hybrid and inclusive national identity. In 2001, Mexico's Secretary of State made modifications and additions to the Mexican Constitution to represent the unique character of Mexico and its multi-cultural composition which is based on its Indigenous groups (see SEGOB, 2008a).
Second, starting in 1999 the Criminal Code for Mexico City prohibited "labor discrimination on the basis of age, sex, pregnancy, marital status, race, language, religion, ideology, sexual orientation, skin color, nationality, social origin or position, work or profession, economic position, physical characteristics, disability or state of health" (Commission for Labor Cooperation, 2006, p. 15). In 2002, this Criminal Code further outlawed discrimination on the basis of ethnic origin (see Art. 206 in ALDF, 2002). Similarly, in 2003 Mexico's Secretary of State published the Federal Law to Prevent and Eliminate Discrimination (see SEGOB, 2008b) and in 2014, Articles 3 and 133 of the Federal Work Law of Mexico were modified to prohibit labor discrimination due to ethnic origin.
Third, Mexico has 68 officially recognized Indigenous languages described in the Catalog of National Indigenous Languages published in 2008 (see INALI, 2008). This is "the largest population of speakers of indigenous languages [...] and presents the highest cultural diversity in the Americas, with regard to the number of languages spoken" (Terborg et al., 2006, p. 417). Bilingual education, where the ability to read, write, and speak Spanish is taught through transitional programs in Indigenous languages, has been an official Mexican policy since the 20th century; hence, during 1921-2000 the proportion of bilingual Indigenous persons expanded from 38 percent to over 81 percent of the Indigenous population (Vazquez Carranza, 2009). The 2003 General Law on the Linguistic Rights of Indigenous Peoples outlaws linguistic discrimination in the work, education, and judicial environments. 1
1 In particular, this law states that: (i) All Mexicans have the right to speak their language without restriction of any kind and without any kind of discrimination; (ii) Spanish and Indigenous languages have equal status and both are valid in any public or private sector and in any kind of social activity; (iii) Indigenous people have the right to bilingual and bicultural education in the compulsory levels, respecting and dignifying their cultural identity; (iv) Indigenous people have the right to access the judicial system through Indigenous languages; and (v) The State and its three governmental orders (Federation, States, and Municipalities) will protect, preserve, promote and develop the Indigenous languages through the participation of the Indigenous population and their communities (see Vazquez Carranza, 2009, p. 205).
Therefore, the finding that Indigenous self-identification carries an earnings penalty and that the penalty is higher if the individual is also capable of speaking an Indigenous language is at odds with the ideological and legal environment of Mexico. This finding is also at odds with the theory of human capital, as "Language skills satisfy the three requirements for human capital, that it is productive, costly to produce, and embodied in the person" (Chiswick, 2008, p. 4). Given that Spanish is the dominant language in Mexico, even if Indigenous languages have no value in the labor market, the ability to speak such a language should at a minimum be orthogonal to the labor market outcomes of the individual. Chiswick et al. (2000) found similar results for Bolivia, and they conclude that "Bilingual speakers may be penalized in the labor market because of a poorer proficiency in Spanish" (see Chiswick et al., 2000, p. 365). We address this issue in our study by controlling for whether individuals report speaking Spanish at home. Speaking Spanish at home has a positive effect on earnings, but it does not account for the whole earnings gap.
Evidently, mestizaje is not a one-dimensional ideal; it also associates "progress" with European or white acculturation. Hence, despite the appearance of inclusiveness, blanqueamiento (or whitening) has been and remains an important element of mestizaje (Safa, 2005). Mexico, in that regard, is similar to the US, where Mexican Americans with a dark complexion/Indian phenotype have a 12 percent income penalty relative to Mexican Americans with a light complexion/European phenotype, and where not speaking Spanish raises an individual's earnings by 13 percent, holding constant a Mexican American's English fluency (Mason, 2004). 2
Our study updates the literature on group differences in the Mexican labor market with regard to the Indigenous population (e.g. Panagides, 1994; Patrinos, 1994; Hisamatsu and Ukeda, 2002). This study also substantially improves upon the regression analysis of this important topic. We exploit both the Mexican Census and the Mexican Family Life Survey to address group differences in the labor market. The Mexican Census allows us to study the earnings differential between those who self-identify as Indigenous and those who do not, and the Mexican Family Life Survey allows us to control for cognitive ability and proficiency in Spanish, but at the expense of smaller sample sizes.
The most general finding of this paper is that Indigenous identification carries an earnings penalty compared to non-Indigenous workers and that workers self-identified as Indigenous who preserve their Indigenous roots, as manifested in their ability to speak an Indigenous language, have an even larger earnings penalty. While the earnings penalty for Indigenous self-identification is not a new result, there has been limited effort to identify the mechanism behind this gap (e.g. see Borja-Vega, Lunde, and García Moreno, 2007). We discuss possible channels that can generate the earnings gap observed in the data, such as differences in job search strategies and labor market discrimination. We also explore whether the earnings gap is the result of occupational segmentation, and we find that both wage/salary workers and self-employed workers suffer an earnings penalty but that the penalty for self-employed Indigenous workers is substantially larger. Another result from the analysis is that the earnings gap is mainly a small-community phenomenon: Indigenous workers still suffer an earnings penalty in large cities, but the penalty is substantially larger in small communities.
We explore other possible explanations for the earnings gap such as labor market segmentation by education or ability, the persistence of Indigenous self-identification, and whether the Indigenous language leads Indigenous workers to behave differently towards skills investment and the labor market. 3
The paper is organized as follows. Section 2 presents the results from the Mexican Census. Working with the Mexican Census allows us to work with a large sample drawn from all Mexican states and also allows us to perform the analysis by the type of Indigenous language. Section 2.1 allows us to address some of the concerns with the estimation results from the Census since we can control for cognitive ability. Section 3 discusses alternative channels that could generate the earnings gap observed in the data. Section 4 concludes.

Group Differences in the Mexican Labor Market
In this section, we use data from the 2010 Mexican Census to gauge group differentials in labor market outcomes between the Indigenous and non-Indigenous population. The census data are collected and distributed by the Instituto Nacional de Estadística y Geografía (INEGI, for its acronym in Spanish), which is the Mexican National Statistics Bureau. 4 The 2010 Mexican Census collects two pieces of information that are especially relevant for our study. First, it collects information on whether an individual is capable of speaking an Indigenous language as well as the individual's ability to speak Spanish, which is the dominant language in Mexico. 5 Second, it collects information on whether an individual self-identifies as a member of the Indigenous population. 6 In what follows, we call Mestizos those individuals who do not self-identify as Indigenous, and we call Indigenous those individuals who self-identify as Indigenous.
A first step towards understanding group differentials between Indigenous and Mestizo workers in the Mexican labor market is to understand the geographical location of these groups. Figure 1 shows that the Indigenous population in Mexico is concentrated in the central and southern states of Guerrero, Puebla, Hidalgo, Campeche, Chiapas, Quintana Roo, Oaxaca, and Yucatan. In the state of Guerrero, 22.69 percent of the population is Indigenous; in Puebla, 25.31 percent; in Hidalgo, 30.26 percent; in Campeche, 32.11 percent; in Chiapas, 32.82 percent; in Quintana Roo, 34.79 percent; in Oaxaca, 58.26 percent; and in Yucatan, 63.26 percent. These proportions are quite high compared to the concentration of the African-American population in the United States, which represents about 21 percent of the population in the states with the highest concentrations of African-Americans; coincidentally, those states are also located in the south of the United States.
4 We would like to thank the Department of Economics from the Universidad Autónoma de Nuevo León for providing the 2010 Census data.
5 As mentioned in the Introduction, Mexico has 68 national Indigenous languages (INALI, 2008).
6 Notice that, as in many household surveys, the individual in question might not be the respondent of the questionnaire.
[Insert Figure 1 here]
[...] years of education on average. These differences in educational attainment are consistent with the substantial differences in the labor market. For example, among those who are monolingual in an Indigenous language, 47.88 percent are self-employed; they also work the fewest hours per week and have the lowest monthly earnings.
In terms of the size of residential location, Table 1 reveals a negative correlation between Indigenous cultural attachment and the population size of the place of residence. That is, the vast majority of Monolingual-Indigenous workers live in very small communities, about 43 percent of Bilingual workers live in communities of this type, but only 27 percent of Monolingual-Spanish Indigenous workers live in very small communities.
Next, in terms of occupation, Monolingual-Indigenous are mostly agricultural and fishery and crafts workers, whereas the distribution of Monolingual-Spanish is closer to that of the Mestizos. Finally, in terms of the industry of occupation, Monolingual-Indigenous are concentrated in agriculture, and once again Monolingual-Spanish are distributed within industries in a way similar to Mestizos. Overall, the numbers in Table 1 indicate that Monolingual-Spanish are the most acculturated among the Indigenous groups.
In Table 2 we explore earnings differentials between Indigenous and Mestizos conditional on some of these observable characteristics. The dependent variable is the natural logarithm of monthly earnings. All specifications control for education, age, location size, and state of residence. The results in column (1) indicate that individuals who self-identify as members of an Indigenous group have earnings that are on average 9 percent lower than Mestizos and that individuals who speak an Indigenous language have earnings that are on average 18 percent lower than those who only speak Spanish. That is, the results suggest that the lower the degree of acculturation of the individual the higher the earnings penalty. Human capital covariates are significant and have the expected signs suggesting positive returns to the investment in human capital.
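A compact way to read the specification described above is the following sketch. It is an illustration under stated assumptions, not the authors' exact estimating equation: the symbols $w_i$, $\mathrm{Indigenous}_i$, $\mathrm{IndLang}_i$, and $X_i$ are hypothetical names for monthly earnings, the self-identification indicator, the Indigenous-language indicator, and the vector of controls (education, age, location size, and state of residence).

```latex
% Hypothetical notation; the paper does not print this equation.
\ln(w_i) = \alpha
         + \beta_1\,\mathrm{Indigenous}_i
         + \beta_2\,\mathrm{IndLang}_i
         + X_i'\gamma + \varepsilon_i

% A coefficient on an indicator in a log-earnings regression is measured in
% log points; the implied percent difference in earnings is
\%\Delta w = 100\,\bigl(e^{\beta} - 1\bigr)
```

For small coefficients the log-point value and the percent difference are close; for example, a coefficient of $-0.09$ corresponds to roughly $-8.6$ percent. The text does not state which of the two conventions the reported percentages follow.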
[Insert Table 2 here]
Next, in columns (2) and (3) of Table 2 we also control for occupation and industry.
Typically, for regressions examining group differences in earnings, occupation and industry are considered endogenous variables, since discrimination in occupation and industry of employment might be important paths for generating earnings inequality; hence, occupation and industry are not appropriate regressors. Nevertheless, we include these variables here under the extreme assumption that intergroup differences in occupation and industry are due solely to intergroup differences in ability and employment preferences. This extreme assumption works against finding statistically significant differences in the earnings of Indigenous and Mestizos. However, the estimates on Indigenous self-identification and speaking an Indigenous language remain significant and of the same order of magnitude. The results in columns (4)-(6) indicate that Mestizos who are Monolingual-Spanish have a wage advantage of 6 percent over Bilingual Mestizos. We examined the observable characteristics of both groups of male Mestizos, those who are Monolingual-Spanish and those who are Bilingual, and we found no significant difference in terms of education, work position, occupation, industry, or population size. Moreover, we include these variables as controls in the regression analysis. Although the Bilingual differential for Mestizos is statistically significant, it is fairly small compared to the male language differentials for Indigenous in column (1). One possibility is that the differential reflects finer occupation and industry differences or systematic differences in unobserved ability. Hence we remain agnostic regarding the reason for this differential. One can then argue that we should subtract 6 percent from the wage differentials in column (1) for males; even so, the earnings differential among the Indigenous population remains quite sizable. 7
Finally, notice that the results from Table 3 indicate that the returns to education are lower for Indigenous workers than for Mestizo workers once we include the occupation and industry controls. Borja-Vega et al. (2007) find similar results in their study.
Since linguistic ability is a skill, especially fluency in Spanish, the dominant language, it is not surprising that Monolingual-Indigenous have lower earnings than both Bilingual Indigenous and Monolingual-Spanish. But it is counter-intuitive that bilingual speakers have lower earnings than Monolingual-Spanish. All other things equal, bilingual persons have greater skill than monolingual persons.
The results in both Tables 2 and 3 suggest that workers who speak an Indigenous language have an earnings disadvantage in the Mexican labor market. Now we explore whether this disadvantage is similar for different Indigenous languages. Table 4 presents the results from log-earnings regressions where we distinguish between the different Indigenous languages. As mentioned above, in Mexico there are 68 official Indigenous languages. These 68 languages belong to 11 language families as described in the Catalog of National Indigenous Languages (see INALI, 2008). Here we combine three of these families into one since the proportion of people speaking an Indigenous language in each of these families is very small. The sample only includes those individuals who self-identify as Indigenous, and the omitted category is Monolingual-Spanish. Table 4 also presents the proportion of Indigenous in the sample who speak a language that is affiliated with a particular language family. The estimates indicate that for males the Huave have the largest penalty, followed by the Maya, the Tarasca, and the Yuto-Nahua. Interestingly, these language families, with the exception of the Tarasca, are among the largest Indigenous groups. Notice that the Chontal de Oaxaca have an earnings advantage compared to those who are monolingual in Spanish; however, they represent a very small portion of the Indigenous groups. In terms of the geographical location of these language families: the Huave are mainly concentrated in the state of Oaxaca, the Yuto-Nahua in the states of Veracruz and Puebla, the Maya in the states of Chiapas and Yucatan, the Tarasca in the state of Michoacan, and the Chontal de Oaxaca in Oaxaca.
[Insert Table 4 here]
To conclude, the results from the Mexican Census suggest that the labor market in Mexico values acculturation. That is, self-identification as Indigenous carries a penalty in the labor market, but this penalty is higher if the individual also speaks an Indigenous language. It is possible that the education received by Indigenous workers is of lower quality than that received by Mestizos. Although we have no measures of school quality, we do control for state of birth, which may be correlated with differences in school quality among states. It is also possible that there is an unobserved component common to less acculturated individuals that generates the earnings differentials observed in the data, e.g. unobserved ability. In an attempt to solve this puzzle we turn, in the next section, to the analysis of the Mexican Family Life Survey, a dataset that contains additional information that can shed light on this question.

Measuring Group Differences with the Family Life Survey
In this section we explore group differences in the labor market of Mexico using data from the Mexican Family Life Survey (MxFLS). In particular, we work with three waves of the MxFLS: 2002, 2005-2006, and 2009-2012 (see Teruel, 2006a,b, 2013, for details about this survey). Our sample is limited to persons 16-64 years of age who are not currently attending school.
An advantage of working with the MxFLS is that we have measures of the respondent's cognitive ability. Cognitive ability is measured by Raven Progressive Picture Matrices. For each of 12 matrices, the respondent receives a score of one if the question is answered correctly and a score of zero otherwise. A skill index, $s_{it}$, is created from the item scores $q_{ijt} \in \{0, 1\}$ for question $j = 1, \dots, 12$, for individual $i = 1, \dots, n$ at year $t$. It is important to note that the matrices that compose this test were designed to measure cognitive ability independently of the individual's reading and writing skills and that respondents take the test every time they are visited, so the test score can vary for the same individual. Following Neal and Johnson (1996), we create an age- and education-adjusted standardized cognitive skill index by standardizing the residuals obtained from regressing the test score $s_{it}$ on the individual's age and education.
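The adjustment step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the raw score is assumed to be the number of correct answers out of 12, the regression is assumed to be linear in age and years of education with an intercept, and the function name `adjusted_skill_index` and the simulated inputs are hypothetical.

```python
import numpy as np


def adjusted_skill_index(item_scores, age, education):
    """Standardized residuals of the raw test score on age and education.

    item_scores: (n, 12) array of 0/1 answers (1 = correct).
    Assumption: the raw score is the count of correct answers.
    """
    item_scores = np.asarray(item_scores, dtype=float)
    raw = item_scores.sum(axis=1)

    # Design matrix with an intercept, age, and years of education.
    X = np.column_stack([np.ones_like(raw), age, education])

    # Ordinary least squares fit; residuals are the part of the score
    # not linearly explained by age and education.
    coef, *_ = np.linalg.lstsq(X, raw, rcond=None)
    resid = raw - X @ coef

    # Standardize the residuals to mean 0 and standard deviation 1.
    return (resid - resid.mean()) / resid.std(ddof=1)


# Simulated inputs for illustration only (not MxFLS data).
rng = np.random.default_rng(0)
n = 500
age = rng.integers(16, 65, size=n).astype(float)
education = rng.integers(0, 18, size=n).astype(float)
items = rng.integers(0, 2, size=(n, 12))

z = adjusted_skill_index(items, age, education)
```

By construction the resulting index has mean zero and unit standard deviation in the estimation sample and is uncorrelated with age and education, which is why group means can be read directly in standard-deviation units.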
A disadvantage of working with the MxFLS is that it does not cover all the Mexican territory (see Appendix B) and that we have a smaller sample compared to the Census data. Another disadvantage of working with the MxFLS data is that, given the order of the questions asked in the MxFLS questionnaire and the pattern of valid skips, we cannot perfectly identify Indigenous individuals who only speak an Indigenous language (the Monolingual-Indigenous), as we did with the Census data. That is, from the MxFLS data we know whether individuals speak Spanish in their households, but we do not know if they speak Spanish outside the household. Hence, with the MxFLS data we can only identify three groups of individuals: (i) Mestizos, (ii) Indigenous monolingual in Spanish, and (iii) Indigenous who speak an Indigenous language. The latter group will include Indigenous who only speak an Indigenous language, but we cannot separately identify them from those who are bilingual in Spanish and an Indigenous language. 8
Table 5 presents the summary statistics for the MxFLS data. Similar to what we found in the Census data, Indigenous who only speak Spanish have observable characteristics similar to those of Mestizos. Notice that on average the sample from the MxFLS is about one year less educated than the Census sample for all groups of individuals, although the ranking of average years of education by group is preserved. 9 The differences in terms of the proportion of individuals who are self-employed are not as stark as they were in the Census. Even though the classification used for occupations is somewhat different in the MxFLS and the Census data, the patterns are similar in both data sets: Indigenous workers who only speak Spanish have a distribution of occupations similar to that of Mestizos, and half of the Indigenous who speak an Indigenous language work in agricultural activities compared to only a quarter of the other two groups.
We do not use the industry in which the individual is employed when working with the MxFLS data because this information is missing in the third wave of the MxFLS. 10
[Insert Table 5 here]
Something new in the summary statistics compared to the Census data is that we have a measure of cognitive ability in the MxFLS data. Table 5 indicates that Mestizos are on average 0.07 standard deviations above the mean, Monolingual-Spanish Indigenous are 0.11 standard deviations below the mean, and Indigenous who speak an Indigenous language are 0.30 standard deviations below the mean.
Tables 6 and 7 address the question of whether group differences in cognitive ability explain the earnings gap between these groups. Table 6 presents estimates of a specification similar to the one estimated in Table 2. Comparing columns (1) and (3), we can see that including the standardized test score has barely any effect on the estimates, while still suggesting a positive return to cognitive ability in the labor market. Similar to the estimates with Census data, the results indicate that Indigenous workers suffer an earnings penalty compared to Mestizos, but this penalty is higher if these individuals also speak an Indigenous language. Columns (2) and (4) also condition on occupational controls; even though the estimates of the effect of speaking an Indigenous language are significantly reduced when including occupational controls, they are not really affected by the inclusion of the test score.
We observed a similar effect on the coefficient estimates when including occupational controls in the regressions using Census data. The similarity between the estimates with and without controlling for the test score suggests that systematic differences in cognitive ability cannot explain the earnings gap observed in the Census data and confirmed with the MxFLS data.
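The logic of comparing estimates with and without the test score can be made explicit with the standard omitted-variable relationship. The sketch below is illustrative only: $D_i$ (a group indicator), $a_i$ (the standardized test score), and the coefficient names are hypothetical notation, not symbols used in the tables.

```latex
% Regression that includes the test score a_i:
\ln(w_i) = \alpha + \beta\,D_i + \gamma\,a_i + X_i'\theta + \varepsilon_i

% Auxiliary regression of the test score on the group indicator and controls:
a_i = \pi_0 + \delta\,D_i + X_i'\pi + u_i

% Coefficient on D_i when a_i is omitted from the earnings regression:
\beta^{\text{short}} = \beta + \gamma\,\delta
```

With a positive return to cognitive ability ($\gamma > 0$) and lower average scores for the Indigenous groups conditional on controls ($\delta < 0$), omitting the test score would overstate the penalty by $\gamma\delta$. Estimates that barely move when the score is added indicate that this product is small.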
[Insert Table 6 here]
[Insert Table 7 here]
Columns (5) and (6) address the possibility that poor Spanish proficiency leads to the earnings gap between Mestizo and Indigenous workers, as suggested by Chiswick et al. (2000).
[...] controls that have a significant effect on the estimates. This is done by including a control for whether the respondent speaks Spanish at home. We would expect that Indigenous workers who speak Spanish at home have better Spanish proficiency than those who only speak an Indigenous language at home. The results indicate that whether or not we include speaking Spanish at home there are large earnings penalties associated with Indigenous self-identification and speaking an Indigenous language. Including occupational controls yields similar estimates for columns (2), (4), and (6). However, speaking Spanish at home is associated with a significant earnings gain. Our data are not sufficiently rich to determine if this is a Spanish fluency effect or an acculturation effect.
Next, in Table 7 we only include individuals who self-identify as Indigenous. The omitted category is Indigenous who only speak Spanish. Once again, a comparison of columns (1) and (3) indicates that including our measure of cognitive ability has no effect on the estimates while still indicating a positive return to cognitive ability in the labor market. The estimates from these columns are in line with the results in Table 6: Indigenous who also speak an Indigenous language have earnings that are on average lower than Indigenous who only speak Spanish, and this earnings gap cannot be explained by systematic differences in cognitive ability between these groups. Notice that once we include occupational controls the earnings gap between these two groups is substantially reduced. However, conditional on occupation, there are no real differences between the estimates with and without the test score. The last two columns of Table 7 address differences in Spanish proficiency. It is important to notice that the effect of speaking Spanish at home on the respondent's earnings is stronger when we compare only Indigenous workers.

Indigenous Identity Persistence
Throughout this study we have used Indigenous self-identification as a key variable. It is possible, however, that whether individuals self-identify as Indigenous depends on the evolution of their economic success or failure. To investigate whether this could be biasing the results, we can use the MxFLS data, since this is a panel, and see whether individuals change their responses to the question of Indigenous self-identification depending on their economic condition. To this end, we identified individuals who changed their response to the Indigenous self-identification question. Among them, we distinguish two groups: those who changed their self-identification (i) from Indigenous to Mestizo and (ii) from Mestizo to Indigenous. We also calculated the annual earnings change. Our

Discussion
The empirical evidence drawn from the analysis of the Census data suggests the existence of earnings gaps between more- and less-acculturated groups in Mexico. However, the analysis with the Census data does not allow us to rule out the possibility that an unobserved component, e.g. unobserved ability, is driving these earnings differentials. To ameliorate this problem, we use data from the MxFLS, which allows us to control for individuals' cognitive ability, at the expense of a smaller and less representative sample of the Mexican labor market. The empirical evidence drawn from the MxFLS indicates that systematic differences in cognitive ability are not enough to explain the earnings gap observed in the data. The question remains as to what can explain the earnings gap. In this section, we consider a series of potential explanations.

Language and Behavior
Some behavioral economics research asserts that linguistic factors can affect cognitive processes other than intellectual ability. In particular, Chen (2013) and Sutter et al. (2015) present evidence that language affects intertemporal choice. This line of research is not without major critics; nevertheless, we wish to explore whether its core claims are consistent with the empirical results reported in this study. 11 According to this research, some languages have strong future-time reference (s-FTR) while other languages have weak future-time reference (w-FTR). Persons who speak languages with s-FTR see the future as distant and grammatically separable from the present; hence, future outcomes are strongly discounted relative to present outcomes. The grammatical capacity to reference future outcomes within the present tense raises the importance of future outcomes because they seem closer to the present and more certain to be established; with w-FTR languages the future is less distant.
Using the language of the economics of intertemporal choice, persons who speak an s-FTR language have less patience than persons who speak a w-FTR language. For the latter languages, the future is not grammatically separable from the present. Hence, language influences the extent of intertemporal patience and therefore investment decisions related to health, education, savings, etc. So, children reared within a w-FTR language family may make greater investments in reproducible skills with high time and pecuniary costs in the present but with greater payoffs in the future. Importantly for this study, the effects of a language's future-time reference on economic patience are independent of IQ as measured by the [...] German [the w-FTR language]" (see Sutter et al., 2015, page 4). Hence, a labor market prediction is that bilingual persons will earn less than monolingual w-FTR speakers but more than monolingual s-FTR speakers.
Chen assumes that a person's thoughts are structured by the language spoken at home (see Chen, 2013, page 10). Sutter et al. show that bilingual German children are less future-oriented than monolingual German children, while bilingual Italian children are more future-oriented than monolingual Italian children. As such, the language spoken at home may affect economic behavior even if the person also speaks a language in addition to the language spoken at home. Hence, bilingual-home Indigenous should have a different income penalty than bilingual-literate Indigenous, but the same income penalty as fully bilingual Indigenous. In comparison to Monolingual-Spanish Indigenous, bilingual-home Indigenous and fully bilingual Indigenous will have market penalties or market premia according to whether the grammar of an Indigenous language creates more or less impatience than the grammar of Spanish. Indeed, if Spanish has a relatively stronger FTR than Indigenous languages, then bilingual Indigenous should receive a market premium.
Thus far the language and behavior research is quiet on the relationship between language and other attributes that may affect income attainment, for example, risk-taking, drive and ambition, the ability to overcome obstacles, or language group cultural pressure for supranormal effort and achievement. In any case, an evaluation of the relationship between the relative FTRs of Spanish and Indigenous languages and their impact on income is beyond the scope of this paper. We note, however, that for males Table 4 shows either significant and substantively large income penalties or an insignificant effect for all Indigenous language family-groups. Hence, for the language-behavior explanation of Mestizo-Indigenous income inequality to hold, it would have to be the case that every Indigenous language family-group has an s-FTR language relative to Spanish. Further, all other language effects on cognitive skills other than intelligence must also favor Spanish speakers. Even so, the language-behavior explanation does not explain why Monolingual-Spanish Indigenous males have a 10 percent income penalty.

Discrimination
Next, there exists the possibility that less-acculturated individuals in Mexico are discriminated against in the labor market. Indigenous individuals who speak an Indigenous language have stronger ties to their Indigenous roots than Indigenous individuals who do not speak an Indigenous language. Then, in terms of the Mestizaje with blanqueamiento ideal, Indigenous individuals who speak an Indigenous language are less acculturated than those who only speak Spanish. Aguilar (2011) performs a lab experiment to examine the social and political consequences of Mexicans' different racial appearances. She finds that Mexicans tend to evaluate European-looking Mexicans more positively in social terms. She also finds that people are cognizant of more negative traits attached to Indigenous-looking individuals than to Mestizo or European-looking persons, while attaching more positive traits to the latter. Similarly, Arceo-Gomez and Campos-Vazquez (2014) perform a correspondence study to identify racial discrimination in Mexico throughout the range of phenotypes generated by racial mixing since the Spanish colonization. The authors find that firms in Mexico tend to discriminate against applicants of Indigenous appearance. And so there seems to be some evidence of discrimination against Indigenous individuals in Mexico. The results from the Census point to an earnings advantage for more acculturated individuals (see Tables 2 and 3). The results from the MxFLS are less precise but also suggest an earnings advantage for more acculturated individuals (see Tables 6 and 7).
Notice that, in this study, we use the individual's capacity to speak an Indigenous language as a metric for being more or less acculturated. However, one can argue that this difference in the level of acculturation maps into systematic differences in the way that less-acculturated individuals interact in the labor market, the way they dress, the way they speak Spanish (i.e. their accent), and their phenotype, among other things. As a result, even without explicitly knowing that an Indigenous individual is bilingual, labor market participants may infer that some individuals are less acculturated than others. If less-acculturated individuals are discriminated against, this results in the earnings gap observed in the data.
It is also important to notice that we found that speaking Spanish at home is associated with an earnings gain. While we expect that those respondents who speak Spanish at home should have better Spanish proficiency, it is also true that respondents who speak Spanish at home are more acculturated than respondents who speak an Indigenous language at home.
Finally, notice the effect that occupational controls (i.e. occupation and industry) have on the earnings gap. The results from Sections 2 and 2.1 indicate that without conditioning on occupational outcomes the earnings gap is even wider. It is possible that labor market discrimination against less-acculturated individuals is also reflected in restricted access to certain occupations or industries. The fact that the earnings gap persists even after controlling for occupation and industry suggests that less-acculturated individuals are discriminated against both in terms of their job choices and in terms of the return to their work.

Job Search and Networks
Another mechanism that could generate the earnings gaps observed in the data is differences in job search and labor market networks. Consider the search and matching framework of Mortensen and Pissarides (1994). In this framework wages are determined through Nash bargaining, and the resulting wage is positively related to labor market tightness, which is defined as the ratio of the number of vacancies to the number of unemployed workers. If Mestizo workers have larger networks in the labor market than Indigenous workers, this will result in a more favorable labor market tightness for Mestizo workers, and hence in better bargained wages than those of Indigenous workers. Also, the descriptive statistics in Tables 1 and 5 reveal that more-acculturated Indigenous workers, that is, those who only speak Spanish, have labor market outcomes more similar to those of Mestizos than less-acculturated Indigenous workers do. Hence, if more-acculturated Indigenous workers have better networks than less-acculturated Indigenous workers, the more-acculturated group will face more favorable labor market tightness and bargain better wages than the less-acculturated Indigenous workers.
A model that can be used to explain these earnings gaps in the Mexican labor market is developed in Mortensen and Pissarides (2003), which assumes segmented labor markets for different groups of workers.
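The link between tightness and bargained wages described above can be illustrated with a minimal sketch. It uses the textbook search-and-matching wage equation, w = (1 - beta) * z + beta * (p + c * theta), which is an assumption made here for illustration and not necessarily the specification the cited models or this paper rely on. All parameter values and the two hypothetical segments are invented for the example; they are not estimates.

```python
def tightness(vacancies, unemployed):
    """Labor market tightness: vacancies per unemployed worker."""
    return vacancies / unemployed


def bargained_wage(productivity, outside_income, worker_share, vacancy_cost, theta):
    """Textbook Nash-bargained wage (an assumed functional form).

    The wage is a weighted average of the worker's outside income and
    productivity plus the hiring costs saved, c * theta, so it rises with theta.
    """
    return ((1 - worker_share) * outside_income
            + worker_share * (productivity + vacancy_cost * theta))


# Hypothetical segments: identical workers, different effective tightness.
theta_large_network = tightness(vacancies=30, unemployed=100)
theta_small_network = tightness(vacancies=10, unemployed=100)

params = dict(productivity=1.0, outside_income=0.4, worker_share=0.5, vacancy_cost=0.2)
wage_large_network = bargained_wage(theta=theta_large_network, **params)
wage_small_network = bargained_wage(theta=theta_small_network, **params)
```

In this sketch the two groups are equally productive, so any wage difference comes only from the tightness each group faces, which is the channel the network argument relies on.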
Alternatively, we can consider the equilibrium search framework of Burdett and Mortensen (1998) to understand the earnings gaps observed in the data. Once again, if Mestizos have larger networks than Indigenous workers, Mestizo workers will face higher job-offer arrival rates than Indigenous workers. Similarly, among Indigenous workers, those who are more acculturated would have higher job-offer arrival rates than those who are less acculturated. In recent work, Borja-Vega et al. (2007) test for the effect of social networks based on group identity or ethnic affinity on employment outcomes among Indigenous individuals in Mexico. The authors find that social networks have a positive and significant effect on males being employed as unskilled workers or being self-employed in both rural and urban areas, and they also explore the effect of social networks on other outcomes such as migration and participation in public programs. Similarly, Skoufias, Lunde, and Patrinos (2010) examine the extent to which social networks among Indigenous individuals in Mexico have an effect on investment in human capital, school attendance, teenage work, migration, welfare participation, employment, occupation and employment sector. Their empirical analysis confirms that social networks play an important role in these economic decisions, particularly in rural areas. And so it seems reasonable to think that differences in the size of networks between Indigenous and Mestizo workers could be an important source of the earnings gap observed in the data. 14

12 Under the assumption that the cumulative distribution function (c.d.f.) of log-earnings for Indigenous is above the c.d.f. of log-earnings for Mestizos, the largest difference between the two c.d.f.'s is 0.2353, which is significantly different from zero (p-value of 0.000). However, under the assumption that the c.d.f. of log-earnings for Mestizos is above the c.d.f. of log-earnings for Indigenous, the largest difference is -0.0004, which is not significantly different from zero (p-value of 0.915). This is consistent with the distribution of log-earnings for Mestizos first-order stochastically dominating that of Indigenous.
13 Under the assumption that the cumulative distribution function (c.d.f.) of log-earnings for Bilingual Indigenous is above the c.d.f. of log-earnings for Monolingual-Spanish Indigenous, the largest difference between the two c.d.f.'s is 0.2336, which is significantly different from zero (p-value of 0.000). However, under the assumption that the c.d.f. of log-earnings for Monolingual-Spanish is above the c.d.f. of log-earnings for Bilingual, the largest difference is 0.0000, which is not significantly different from zero (p-value of 1.000). This is consistent with the distribution of log-earnings for Monolingual-Spanish first-order stochastically dominating that of Bilingual.
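The one-sided comparisons of log-earnings distributions reported in footnotes 12 and 13 can be sketched as follows. This is a minimal illustration of the two directional gaps between empirical c.d.f.'s (the statistics behind a one-sided Kolmogorov-Smirnov comparison), assuming that is the test the footnotes describe. It does not compute p-values, and the two small samples are invented for the example.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical c.d.f.: share of observations less than or equal to x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def one_sided_gaps(sample_a, sample_b):
    """Return (largest gap with A's c.d.f. above B's, largest gap with B's above A's).

    The second value is reported as zero or negative, mirroring the sign
    convention used in the footnotes.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    grid = sorted(set(a) | set(b))  # the gaps can only change at observed values
    diffs = [ecdf(a, x) - ecdf(b, x) for x in grid]
    return max(0.0, max(diffs)), min(0.0, min(diffs))


# Hypothetical log-earnings: group A is shifted to the left of group B.
group_a = [1.0, 1.5, 2.0, 2.5]
group_b = [2.0, 2.5, 3.0, 3.5]
gap_a_above, gap_b_above = one_sided_gaps(group_a, group_b)
```

A large first gap together with a second gap near zero is the pattern the footnotes read as the distribution of group B first-order stochastically dominating that of group A.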

Segmentation by Education or Cognitive Ability
Another possibility is that the Mexican labor market may be segmented by years of education or cognitive ability. If so, there may be non-linear and negative income effects associated with Indigenous identity. Theories of labor market segmentation tend to suggest limited mobility between a high-wage primary sector and a low-wage secondary sector, where the secondary sector is also characterized by limited on-the-job training and extremely few career ladders (Lewis, 1954; Gordon et al., 1982). Given the relative absence of on-the-job training and career ladders, secondary jobs require fewer years of education and less ability than primary sector jobs. If Indigenous workers are more likely than Mestizos to be employed in the secondary sector, then we will observe ethno-linguistic income differentials. In this case ethno-linguistic income differentials are the result of ethnic differences in mobility from the secondary to the primary sector and the superior wages, on-the-job training, and career ladders within the primary sector.
To investigate this possibility we estimated a series of specifications, similar to those in Table 6, separately by quartile of years of education and by quartile of cognitive ability using the MxFLS data. The quartiles are based on the Mestizo distributions of years of education and cognitive ability, and the estimates are presented in Tables 8 and 9. The results from Table 8 indicate that the earnings penalty for Indigenous identification only occurs in the lowest education quartile, whereas the penalty for speaking an Indigenous language only occurs in the third and fourth education quartiles. The results from Table 9 indicate that the earnings penalty for Indigenous identification only occurs in the second ability quartile, whereas the penalty for speaking an Indigenous language only occurs in the first and third ability quartiles. The specifications in Tables 8 and 9 control for occupation. The results without these controls are very similar and do not change any of the conclusions.
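The quartile construction described above, with cutpoints taken from the Mestizo distribution and then applied to every respondent, can be sketched as follows. The quantile method and the treatment of values that fall exactly on a cutpoint are assumptions made for illustration, since the text does not state them, and the years-of-education values are invented.

```python
from bisect import bisect_left
from statistics import quantiles


def mestizo_cutpoints(mestizo_values):
    """Three quartile cutpoints computed from the Mestizo subsample only."""
    return quantiles(mestizo_values, n=4, method="inclusive")


def assign_quartile(value, cutpoints):
    """Quartile (1 to 4) for any respondent, using the Mestizo cutpoints.

    Assumption: a value equal to a cutpoint is placed in the lower quartile.
    """
    return bisect_left(cutpoints, value) + 1


# Hypothetical years of education for Mestizo respondents.
mestizo_years = [0, 3, 6, 6, 9, 9, 12, 12, 16]
cuts = mestizo_cutpoints(mestizo_years)

# Apply the same cutpoints to hypothetical Indigenous respondents.
indigenous_years = [0, 6, 7, 12, 16]
indigenous_quartiles = [assign_quartile(y, cuts) for y in indigenous_years]
```

Fixing the cutpoints on the Mestizo distribution means each quartile compares Indigenous and Mestizo workers within the same range of education or ability, rather than within group-specific ranges.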

Employment Status
The statistical analysis has included self-employed persons in the samples used to estimate the log-earnings regressions. The MxFLS data has a very uniform distribution of self-employed among the Mestizo and Indigenous workers. But the Census data, which is more representative of the Mexican population, reveals that about 30 percent of bilingual Indigenous workers and 50 percent of Monolingual-Indigenous workers are self-employed (see Table 1). As has been documented for the case of the United States, there are significant differences in earnings between self-employed and wage/salary workers (see Hamilton, 2000).

If the earnings process differs by occupational segmentation and Mestizo and Indigenous workers have different occupational distributions, then the penalties for Indigenous self-identification and bilingual linguistic ability may represent spurious correlation. Hence, we use the Census data to estimate specifications similar to those in Tables 2 and 3 separately for self-employed and wage/salary workers. 15 The results are presented in Table 10, where all specifications control for occupation and industry; the results without these controls are very similar and do not change any of the conclusions. Columns (1) and (2) present the estimates for a specification identical to that of Table 2, while Columns (3) and (4) present the estimates for a specification identical to that of Table 3. The results from all specifications are in line with the findings using the whole sample, and so we conclude that segmentation into self-employment is not the cause of ethno-linguistic identity wage penalties. Notice, however, that the earnings differentials are higher for self-employed workers than for wage/salary workers. This is particularly important, as the compensation of wage/salary workers must be somewhat restricted by labor regulations or even the status quo. In contrast, the compensation of self-employed workers will be mostly determined by bargaining between the self-employed worker and the buyer of the worker's services, and so this compensation could carry some discount for hiring a less-acculturated worker.

Location Size
Similarly, even when we control for the size of the location of residence of the respondents in the sample, it is possible that the earnings gap is driven by the fact that there is a larger proportion of Indigenous workers living in small communities whereas there is a larger proportion of Mestizos and Monolingual-Spanish Indigenous workers living in large cities. Then, similar to what we did in Table 10, we estimate separate regressions for small communities and large cities. We define small communities as locations with population less than 100,000, and large cities as locations with population larger than 100,000. The results are shown in Table 11. The results from Columns (1) and (2) indicate that the earnings gap between Indigenous and Mestizos is mainly a small-communities phenomenon. In large cities, individuals who self-identify as Indigenous still suffer an earnings gap, but it is only five percentage points, and in these locations speaking an Indigenous language is not associated with an earnings penalty; on the contrary, Indigenous workers who speak an Indigenous language have an earnings advantage of about one percent, although this estimate is borderline significant. On the other hand, the results in Column (1) indicate that it is in small communities that self-identification as Indigenous carries a large earnings gap, and the penalty is higher if the individual speaks an Indigenous language. Similarly, when comparing among Indigenous workers in Columns (3) and (4), we find that the gain from acculturation occurs entirely in small communities. In large cities there is no earnings gain from acculturation. 16 The specifications in Table 11 control for occupation and industry; the results without these controls are very similar and do not change any of the conclusions.

15 Recall that the Census does not allow us to control for differences in cognitive ability, but as seen before, systematic differences in the measure of cognitive ability available in the MxFLS are not enough to explain the gap (see Tables 6 and 7). On the other hand, working with the Census allows us to control for a large set of observable characteristics, provides larger samples, and is more representative of the Mexican population.
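The sample split by location size can be sketched as follows. The 100,000 threshold comes from the definition in the text; the record layout, the field name `population`, and the decision to leave a location of exactly 100,000 unassigned are assumptions for illustration, since the text defines the two groups with strict inequalities.

```python
CUTOFF = 100_000  # population threshold stated in the text


def location_group(population):
    """Classify a location as a small community or a large city."""
    if population < CUTOFF:
        return "small_community"
    if population > CUTOFF:
        return "large_city"
    return None  # exactly 100,000 is not covered by either definition


def split_by_location(records):
    """Split respondent records into the two estimation subsamples."""
    groups = {"small_community": [], "large_city": []}
    for record in records:
        group = location_group(record["population"])
        if group is not None:
            groups[group].append(record)
    return groups


# Hypothetical respondents; 'population' is the size of the place of residence.
sample = [
    {"id": 1, "population": 2_500},
    {"id": 2, "population": 850_000},
    {"id": 3, "population": 99_999},
]
subsamples = split_by_location(sample)
```

Each subsample would then be used to estimate its own regression, as in the separate columns reported for small communities and large cities.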

Policy Implications of the Alternative Explanations
This paper finds a large ethnic identity earnings gap. Our best estimates indicate that Indigenous workers earn about 9.5 percent less than otherwise identical Mestizo workers. These estimates control for cognitive ability, Spanish fluency, and other productivity-linked attributes. There is an additional 22 percent penalty for Indigenous workers who also speak an Indigenous language. Additionally, our findings suggest that the social and labor market mechanisms generating the earnings gap are structural, rather than behavioral. Specifically, our results are inconsistent with the notion that Indigenous language culture discourages skill accumulation and thereby contributes to ethnic inequality in the Mexican labor market.
These results are consistent with several mechanisms: 1) earnings discrimination in the narrowest sense, that is, lower pay for Indigenous workers who are as productive as Mestizo workers; 2) larger or more effective job search networks for Mestizos relative to Indigenous people; 3) segmentation in the labor market with Indigenous workers disproportionately located in what are most likely secondary sector jobs; 4) Indigenous identity and language penalties are substantively higher among self-employed workers; 5) earnings penalties are strongly concentrated among workers located in small communities, that is, locations with population less than 100,000 persons.
Tying all these outcomes together suggests that Indigenous workers lack sufficient job networks to obtain the most desirable employment in larger urban locations, that is, jobs within the primary employment sector and where there is the least earnings inequality. Accordingly, Indigenous workers, especially those with the strongest attachment to Indigenous identity, are disproportionately sorted into self-employment occupations and jobs located in smaller municipalities. All of these results suggest that the earnings gap between Indigenous and Mestizo workers can be reduced through policies that: (i) improve Indigenous workers' access to waged jobs in order to reduce their concentration in self-employment, and (ii) foster the economic development of Indigenous people in small communities.
With regard to improving access to waged jobs, Mexico has already started moving in the right direction. In the last decade, Mexico has established standards for labor equality and non-discrimination (see the labor standard NMX-R-025-SCFI-2009 and more recently the labor standard NMX-R-025-SCFI-2015). Although most of these standards focus on the male-female earnings gap (especially in the earlier versions), rather than on earnings gaps due to ethnic differences, the labor standards for equality exist. Moreover, there have been additions and modifications to the labor laws and the Mexican constitution to prevent discrimination in employment. Our results suggest that more work needs to be done to guarantee that these labor standards and labor regulations are enforced, especially for Indigenous workers.
For example, firms can obtain certification in the most recent Mexican Standards for Labor Equality and Non Discrimination NMX-R-025-SCFI-2015. Certification allows the firm to be recognized as an employer that satisfies employment equality, which can be used by the firm to improve its image and perhaps its position in consumers' preferences. However, certification is voluntary so the labor standards for equality are not really binding. While we believe that making these standards mandatory for every firm would not be viable, the authorities might try to induce firms' certification in a variety of ways.
One possibility is to make certification mandatory for firms that have significant economic transactions with the Federal Government of Mexico. This would be similar to the United States Executive Order 11246, which prohibits discrimination in employment by firms that do over $10,000 in Government business in one year and imposes penalties on firms that do not comply with these regulations. Mandatory certification could also be based on firm size.
Other possibilities include offering tax incentives to firms that are certified in these labor standards. The feasibility and the implementation of these incentives are beyond the scope of this paper.
Similarly, as the results suggest, the earnings gap between Indigenous and Mestizo workers seems to be higher in smaller communities. Hence the economic development of Indigenous people in small communities could help in reducing the earnings gap we observe between Indigenous and Mestizo workers. In Mexico, the Government agency in charge of this is the National Commission for the Development of Indigenous Peoples (CDI for its acronym in Spanish), which has among its objectives "the integral and sustainable development of the indigenous peoples and communities." This agency has developed a series of programs aimed at developing Indigenous communities, such as programs to improve production and productivity, foster education, and improve infrastructure in Indigenous communities. Once again, the Mexican authorities seem to be moving in the right direction, but much more work needs to be done to reduce or eliminate the Indigenous-Mestizo earnings gap.

Conclusion
The results herein reveal that the Mexican labor market penalizes Indigenous identity, that is, even highly acculturated Monolingual-Spanish Indigenous will receive a wage penalty relative to Mestizos. Moreover, the penalty will increase if Indigenous workers also speak an Indigenous language, regardless of the degree of Spanish fluency.
Cognitive ability does have a large, positive, and statistically significant effect on wages. For a one standard deviation increase in the age- and education-adjusted test score, male wages will rise by 6-7 percent. However, including a cognitive ability measure in the wage equation has little or no effect on the ethno-linguistic identity coefficients. Hence, it is possible to obtain consistent coefficient estimates of ethno-linguistic inequality from regressions estimated on data drawn from the Mexican Census, a very large dataset which includes information on earnings, hours, and individual characteristics but which does not include a measure of ability.
It is important to note that the Mexican labor market is characterized by a sizable informal sector (e.g. see Maloney, 1999; Cano-Urbina, 2015, 2016). However, the present study abstracts from that feature of the labor market. The main reason for this abstraction is that the informal sector is an urban phenomenon, and concentrating on urban areas would leave a very important group out of our sample: the monolingual Indigenous, the vast majority of whom live in rural areas, as shown in Table 1. It is left for future research to explore the labor market experience of Indigenous and Mestizos in the urban labor market and to address directly the issue of job informality. The discussion in Section 3 provides many possible channels that could explain the earnings gap between Indigenous and Mestizo workers. In all likelihood, the earnings gap results from a combination of multiple mechanisms, and it is left for future research to explore the empirical relevance of each of these.

APPENDIX

A Mexican Census Data
The 2010 Mexican Census data is the 10 percent sample obtained from the Instituto Nacional We use the answer to these questions to create the indicator variables for Table 2 and the categories of Indigenous and Mestizo used in Table 3.
For individuals who indicate being able to speak an indigenous language, the questionnaire also asks them if they have the ability to speak Spanish. We find that 79.35 percent of the individuals ages 16 to 64 who are not attending school and who report being able to speak an indigenous language can also speak Spanish. We use this information to build the indicator variables for Table 3, which are: monolingual-indigenous, monolingual Spanish, and bilingual. We drop from the sample individuals who are monolingual indigenous but do not self-identify as members of an indigenous group. A comparison of the summary statistics reveals that these individuals are very similar to monolingual-indigenous individuals who self-identify as members of an indigenous group and not to those who do not self-identify as members of an indigenous group. Hence, we believe that these individuals did not understand the question correctly. Only 1,990 individuals are in this situation, and of these only 678 have valid observations and could have been used in the estimation sample. The results are basically identical whether we include or exclude these individuals.
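The construction of the language categories and the sample restriction described above can be sketched as follows. The function and argument names are hypothetical. The sketch also assumes that respondents who do not report speaking an indigenous language are treated as Spanish speakers, since the Spanish question is asked only of indigenous-language speakers.

```python
def language_category(speaks_indigenous, speaks_spanish=None):
    """Language category built from the two Census language questions.

    Assumption: respondents who do not speak an indigenous language are not
    asked the Spanish question and are classified as monolingual Spanish.
    """
    if not speaks_indigenous:
        return "monolingual_spanish"
    if speaks_spanish:
        return "bilingual"
    return "monolingual_indigenous"


def keep_in_sample(self_identifies, category):
    """Apply the drop rule stated in the text.

    Monolingual-indigenous respondents who do not self-identify as members
    of an indigenous group are excluded from the estimation sample.
    """
    return not (category == "monolingual_indigenous" and not self_identifies)


# Hypothetical respondent: speaks only an indigenous language, does not self-identify.
category = language_category(speaks_indigenous=True, speaks_spanish=False)
kept = keep_in_sample(self_identifies=False, category=category)
```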
We also drop observations with missing information in variables used in the analysis such as age, education, income, state of residence, state of birth, occupation, industry and population size (see Tables 1, 2 This catalog identifies 68 different languages, and we used the catalog published in the Diario Oficial de la Federacion to group these 68 languages into the 11 linguistic families used in Table 4 (see INALI, 2008). However, three of these families have very low proportions and so we group them into one category: Algica, Cochimi-Yumana, and Seri. No observations are dropped from the main sample used in Tables 1, 2, and 3 for missing identification of the language.
The summary statistics presented in Table 1 are weighted using the person weight variable available in the Census, wtper, which indicates the number of persons in the actual population represented by the person in the sample.
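A weighted summary statistic of this kind can be sketched as follows. The variable name `wtper` comes from the text; the values below are invented, and the sketch shows only a weighted mean, not the full set of statistics in Table 1.

```python
def weighted_mean(values, weights):
    """Mean of `values` where each observation counts `weights[i]` persons."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical sample: 1 if self-employed, 0 otherwise, with person weights.
self_employed = [1, 0, 0, 1]
wtper = [10, 30, 40, 20]  # persons in the population represented by each respondent

share_self_employed = weighted_mean(self_employed, wtper)
```

With these invented numbers the unweighted share would be one half, while the weighted share is lower because the self-employed respondents represent fewer persons in the population.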

B Mexican Family Life Survey Data
We use the three waves from the Mexican Family Life Survey (MxFLS) which are 2002, 2005 and 2009 (see Teruel, 2006a,b, 2013). The information provided by these surveys is organized in separate data files or "books" but the structure of these books remained constant over the three waves of the survey for the books that are relevant for our study. The following description refers to the particular books used to obtain the variables for the analysis.
First, we use the control book, Book C, to obtain geographical data such as state, city, county, and city size. We also obtain from Book C the month of the visit to the household, which is used to locate the months that are relevant to deflate annual income measures; this is obtained from the back-cover section of Book C. Finally, a third section of Book C is used to obtain some individual data such as the respondent's age, gender, and annual earnings from the previous 12 months. This is the measure of earnings used in the regression analysis in Tables 6 and 7. A small number of observations (21) were deleted because of excessively high annual earnings values.
Individual language, ethnic self-identification, and education data were obtained from Book IIIA. The language and ethnic self-identification variables indicate whether the respondent: (i) speaks Spanish in the household, (ii) reads and/or writes in Spanish, (iii) self-identifies as indigenous, and (iv) speaks an indigenous language. This information is used to build the language variables in Tables 6 and 7.
The education information is very detailed for the highest grade completed up to grade 12 which is high school. We can also identify if the respondent attended the "open system" for middle school (grades 7 to 9) and for high school (grades 10 to 12) and use this information when necessary. 17 The education information for college and graduate degrees is not very detailed and we only know whether the respondent attended and graduated from these degrees, and so we follow a similar strategy as with the open system schools. 18 We use Book IIIA also to obtain the respondent's occupation or position in his/her main job. The categories for the respondent's occupation are: non-agricultural worker, agricultural worker, self-employed, and boss of his/her own company.
The not complete open middle school. Similarly, we assign a highest grade completed of 12 if the respondent finished open high school (9 grades for elementary and middle schools and 3 grades for high school) and a highest grade completed of 9 if the respondent did not complete open high school.
18 We assign a highest grade completed of 16 if the respondent completed college or teacher's college, and 12 if they attended these levels of education but did not get the degree. Similarly, we assign 18 years of education if the respondent completed a graduate degree and 16 if they attended a graduate degree but did not get the degree.
Then, in the regression analysis, there will be fewer than 32 dummies for state of residence, but there are 32 dummies for state of birth since we have observations where individuals were born in a different state from the state of residence.
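The rules stated above for assigning a highest grade completed to open-system, college, and graduate education can be sketched as follows. The level labels and function name are hypothetical. The rule for open middle school is left out because its description is incomplete in the text.

```python
# (grade if completed, grade if attended but not completed), as stated in the text.
GRADE_RULES = {
    "open_high_school": (12, 9),
    "college": (16, 12),
    "teachers_college": (16, 12),
    "graduate": (18, 16),
}


def highest_grade(level, completed):
    """Highest grade completed for levels without detailed grade information."""
    if level not in GRADE_RULES:
        raise ValueError(f"no assignment rule available for level: {level}")
    grade_if_completed, grade_if_not = GRADE_RULES[level]
    return grade_if_completed if completed else grade_if_not


# Hypothetical respondents.
attended_college_no_degree = highest_grade("college", completed=False)
finished_graduate_degree = highest_grade("graduate", completed=True)
```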