California pistachios, on the other hand, are concentrated in the southern part of the San Joaquin Valley. Moreover, they are planted in areas where the climatic conditions are mostly beneficial for them. Few adverse weather events that could be used for analysis exist on record. Therefore, the variance in CP in our range of interest is even more limited.
The California Department of Food and Agriculture, as well as the US Department of Agriculture, usually report average yields at the county level. If the counties are large compared to the growing area, few observations are generated, and the averaging process removes useful extreme observations at the sub-county level. The aggregated reporting problem, together with crop concentration, limits the possibilities of traditional econometric analysis of crop yields. I address this problem here for California pistachios, but the challenge might prove a barrier for research on other crops as well. Consider not only high-value commercial crops concentrated in a few California counties, but also “orphan crops”: local crops which have received less attention from researchers and the private sector, yet generate substantial nutritional value for low-income communities in developing countries. The African Orphan Crops Consortium, an initiative to promote research and use of these crops in Africa, lists 101 crops of interest on its website, many of them perennial. Cullis and Kunert note that orphan crops “…are poorly documented as to their cultivation and use, and are adapted to specific agro-ecological niches and marginal land with weak or no formal seed supply systems”. Research on specific orphan varieties might therefore suffer from the same challenges as California pistachios: biological complexity, concentration of growing acreage, and few data reporting units.
In this chapter, I combine two approaches to estimate the yield response of California pistachios to winter CP count. 
The first approach is a “big data” one: I enhance a California yield panel of five counties with local temperatures at the pistachio growing areas. I use satellite data and temperature readings from local weather stations to create a large data set that can be connected with the yearly yields.
By substantially increasing the number of explanatory variables, this allows for more nuanced observations. The second approach is an aggregate estimation methodology, previously used in the agricultural productivity literature but, to my knowledge, not yet explored in the climate literature. This approach notes that the observed outcome variable aggregates heterogeneous, unobserved sub-unit outcomes of the data generating process. Information about this heterogeneity is used to recover the relationship between temperatures and yields. The result of this exercise is the first successful recovery of the nonlinear yield response to winter chill in commercial pistachio production. I apply my findings to climate predictions in the current growing areas to show the potential impact of climate change on California pistachios over the next 20 years, and predict that a significant decline can be expected.
California pistachios are a high-value crop, with grower revenues of $1.8 billion in 2016. The most common variety is “Kerman”, and almost all the California acreage is planted in five adjacent counties in the southern part of the San Joaquin Valley. In recent years, rising winter daytime temperatures and decreasing fog incidence have lowered winter CP counts. Climatologists have concluded that winter chill counts will continue to dwindle, putting pistachios in danger at their current locations. To better predict the trajectory for this crop and make informed investment and policy decisions, the yield response function to chill must first be assessed. This task has proven quite challenging. The effects of chill thresholds on bloom can be explored in controlled environments, but for various reasons these relationships are not necessarily reflected in commercial yield data. For example, Pope et al. 
report that the threshold level of CP for successful bud breaking in California pistachios was experimentally assessed at 69, but they could not identify a negative response of commercial yields to chill portions at that level or even lower. They use a similar yield panel of California counties, but only have one “representative” CP measure per county-year. Using Bayesian methodologies, they fail to find a threshold CP level for pistachios, and reach the conclusion that “Without more data points at the low amounts of chill, it is difficult to estimate the minimum-chill accumulation necessary for average yield.” The statistical problem of low variation in treatment at the growing area, encountered by Pope et al., is very common in published articles on pistachios.
Simply put, pistachios are not planted in areas with adverse climate. Too few “bad” years are therefore available for researchers to work with when trying to estimate commercial yield responses. An ideal experiment would randomize a chill treatment over entire orchards, but that is not possible. Researchers resort either to small-scale experimental settings, with the limitations mentioned above, or to yield panels, which are usually small in size, length, or both. Zhang and Taylor investigate the effect of chill portions on bloom and yields in two pistachio growing areas in Australia, where the “Sirora” variety is grown. Using data from “selected orchards” over five years, they note that in two years when chill was below 59 portions in one of the locations, bloom was uneven. Yields were observed, and while no statistical inference was made on them, the authors noted that “factors other than biennial bearing influence yield”. Elloumi et al. investigate responses to chill in Tunisia, where the “Mateur” variety is grown. They find highly non-linear effects of chill on yields, but this stems from one observation with a very low chill count. Standard errors are not provided, and the threshold and the behavior around it are not really identified. Kallsen uses a panel of California orchards, with various temperature measures and other control variables, to find the model which best fits the data. Unfortunately, only 3 orchards are included in this study, and the statistical approach mixes a prediction exercise with the estimation goal, potentially sacrificing the latter for the former. Besides the potential over-fitting with this technique, the explanatory variables in the model are not chill portions but temperature hour counts with very few degree levels considered, and no confidence interval is presented. Finally, Benmoussa et al. use data collected at an experimental orchard in Tunisia with several pistachio varieties. 
They reach an estimate for the critical chill for bloom, and find a positive correlation between chill and tree yields, with zero yield following winters with very low chill counts. However, they also have many observations with zero or near-zero yields above their estimated threshold, and the external validity of findings from an experimental plot to commercial orchards is not obvious.
Pistachio growing areas are identified using USDA satellite data with a pixel size of roughly 30 meters. About 30% of pixels identified as pistachios are singular. As pistachios don’t grow in the wild in California, these are probably misidentified pixels. Aggregating to 1km pixels, I keep those pixels with at least 20 acres of pistachios in them. Looking at the yearly satellite data for 2008-2017, I keep those 1km pixels with at least six positive pistachio identifications. These 2,165 pixels are the grid on which I do temperature interpolations and calculations. Observed temperatures for 1984-2017 come from the California Irrigation Management Information System, a network of weather stations located in many counties in California and operated by the California Department of Water Resources. A total of 27 stations are located within 50km of my pistachio pixels. Missing values at these stations are imputed as the temperature at the closest available station plus the average difference between the stations at the week-hour window. Future chill is calculated at the same interpolation points, with data from a CCSM4 model (CEDA). These predictions use an RCP8.5 scenario. This scenario assumes a global mean surface temperature increase of 2°C in 2046-2065. The data are available with predictions starting in 2006, and include daily maximum and minimum temperatures on a 0.94 degree latitude by 1.25 degree longitude grid. Hourly temperatures are calculated from the predicted daily extremes, using the latitude and date. 
I then calibrate these future predictions with a quantile calibration procedure, using a week-hour window.
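The imputation rule for missing station readings described above (the closest available station's temperature plus the average difference between the two stations in the same week-hour window) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in this chapter: the function names, the dictionary layout, and the definition of the window as (week of year, hour of day) are all hypothetical choices made for the example.

```python
from collections import defaultdict
from datetime import datetime


def week_hour(ts):
    """Window key: (week of year counted from 0, hour of day).

    Assumed definition of the "week-hour window"; the text does not spell it out.
    """
    return ((ts.timetuple().tm_yday - 1) // 7, ts.hour)


def impute_from_neighbor(target, neighbor):
    """Fill missing readings at `target` using the closest station `neighbor`.

    Both arguments map datetime -> temperature, with None marking a missing value.
    A missing target reading becomes the neighbor's reading plus the average
    (target - neighbor) difference observed in the same week-hour window.
    """
    # Collect observed differences between the stations, by week-hour window.
    diffs = defaultdict(list)
    for ts, temp in target.items():
        other = neighbor.get(ts)
        if temp is not None and other is not None:
            diffs[week_hour(ts)].append(temp - other)

    # Fill gaps where the neighbor has a reading and the window has history.
    filled = dict(target)
    for ts, temp in target.items():
        other = neighbor.get(ts)
        window = diffs.get(week_hour(ts))
        if temp is None and other is not None and window:
            filled[ts] = other + sum(window) / len(window)
    return filled


# Hypothetical readings at 06:00 on January 3 of three different years.
target = {datetime(2015, 1, 3, 6): 4.0, datetime(2016, 1, 3, 6): 6.0,
          datetime(2017, 1, 3, 6): None}
neighbor = {datetime(2015, 1, 3, 6): 3.0, datetime(2016, 1, 3, 6): 4.0,
            datetime(2017, 1, 3, 6): 2.0}
filled = impute_from_neighbor(target, neighbor)
```

In practice the "closest available station" would be chosen per timestamp among all stations with a reading, so this function would be applied to each gap with whichever neighbor is nearest at that time.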
Past observed and future predicted hourly temperatures in the dormancy season are interpolated at each of the 2,165 pixels, and chill portions are calculated from these temperatures. Erez and Fishman produced an Excel spreadsheet for chill calculations, which I obtain from the University of California Division of Agriculture and Natural Resources, together with instructions for growers. For speed, I code them in an R function. The data above are used for estimation and later for prediction of future chill effects. For the estimation part, I have a yield panel with 165 county-year observations. For each year in the panel, I calculate the share of county pixels that had each CP level. For example: in 2016, Fresno County had 0.4% of its pistachio pixels experiencing 61 CP, 1.8% experiencing 62 CP, 12% experiencing 63 CP, and so on. The support of CP through the panel is [36, 86]. Past county yields are from crop reports published by the California Department of Food and Agriculture. Figure 3.1 presents chill counts and their estimated effects in percent yield change for two time periods: 2000-2018 and 2020-2040. The top left panel shows the chill counts in the warmest quarter of years between 2000 and 2018. The top right panel shows the chill counts in the warmest quarter of years in climate predictions between 2020 and 2040. Chill at the pistachio growing areas is likely to drop substantially within the lifespan of existing trees.
Results from the polynomial regression are presented in Table 3.2. The first coefficient is for an intercept term, and it is estimated at zero with very wide error margins. This makes sense, as centering around the means also gets rid of intercepts. The second coefficient is positive, as we would expect, and statistically significant. The third coefficient is negative, as we would also expect since the returns from chill should decrease at some point, but it is not statistically significant even at the 10% level. 
However, as dropping it would eliminate the decreasing returns feature, I keep it at the cost of having a wide confidence area. With the estimated coefficients, I build the polynomial curve that represents the effect of temperatures on yields. It is presented in Figure 3.2 with a bold dashed line. The 90% confidence area boundaries are the dotted lines bounding it above and below. Note that the upper bound of the confidence area does not curve down like the lower one. This is the manifestation of the third coefficient’s p-value being greater than 0.1. In both cases, the confidence area was calculated by bootstrapping. The data were resampled and the model re-estimated 500 times, producing 500 curves from the resulting parameters. At each CP level, I take the 5th and 95th percentiles of the bootstrapped curve values as the bounds of the confidence area. This approach also deals with the potential spatial correlation in error terms. Another minor issue requiring the bootstrap approach is that the implicit potential yield estimation should change the degrees of freedom in the non-linear regressions when estimating the standard errors. In the lower panel of Figure 3.2, a histogram of positive shares is presented: for each chill portion level, it shows the count of panel observations where the share of that chill portion was positive. The actual shares of the very low and very high portions are usually quite low. This shows the relatively small number of observations with low chill counts. The two yield effect curves look very similar in the relevant chill range. By both estimates, the yield loss is very close to 0 at higher chill portions, and yields start declining substantially somewhere in the upper 60s, as the experimental literature would suggest. Interestingly, the polynomial curve does not exceed zero effect, although it is not mechanically bounded from above like the logistic curve. 
This probably reflects the fact that historically, the average growing conditions have not deviated much from the optimal range. The “within” transformation hence did not shift the potential yield much from the optimum in this case.
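The percentile bootstrap behind the 90% confidence area (resample, re-estimate, evaluate each curve, then take the 5th and 95th percentiles at each CP level) can be sketched as follows. This is a minimal Python illustration, not the estimation code of this chapter. The `fit` argument is a hypothetical placeholder for whichever estimator is used, and `fit_through_origin` is a toy stand-in, not the polynomial or logistic model estimated here. Resampling individual observations with replacement is also an assumption, since the resampling unit is not stated in the text.

```python
import random


def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p in [0, 100]) of an ascending list."""
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def bootstrap_band(observations, fit, cp_grid, n_boot=500, seed=0):
    """Percentile-bootstrap band for an estimated curve.

    `fit(sample)` must return a function mapping a CP level to an estimated
    yield effect. Returns (lower, upper): the 5th and 95th percentiles of the
    bootstrapped curve values at each point of `cp_grid`.
    """
    rng = random.Random(seed)
    n = len(observations)
    curves = []
    for _ in range(n_boot):
        # Resample observations with replacement (assumed resampling unit).
        sample = [observations[rng.randrange(n)] for _ in range(n)]
        curve = fit(sample)
        curves.append([curve(cp) for cp in cp_grid])

    lower, upper = [], []
    for j in range(len(cp_grid)):
        vals = sorted(c[j] for c in curves)
        lower.append(percentile(vals, 5))
        upper.append(percentile(vals, 95))
    return lower, upper


def fit_through_origin(sample):
    """Toy estimator for illustration only: least-squares line through the origin."""
    slope = sum(x * y for x, y in sample) / sum(x * x for x, _ in sample)
    return lambda cp: slope * cp


# Hypothetical (x, y) pairs; any estimator with the same interface can be plugged in.
data = [(x, 2 * x) for x in range(1, 11)]
lower, upper = bootstrap_band(data, fit_through_origin, cp_grid=[40, 60, 80])
```

Because the bounds are taken pointwise from the bootstrapped curves, they need not be symmetric around the point estimate, which is consistent with an upper bound that does not curve down like the lower one.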