
Weed block fabric is commonly used in organic and hydroponic production systems

Fabric was unrolled and pinned by hand to cover the post-row surface between raspberry beds prior to post installation. The fabric remained in place during the experiment and was unpinned and rolled up at the end of the project for potential reuse. Yard waste mulch from local suppliers was delivered to the project sites. Mulch was a woody, < 2-inch screened material with < 20% fine components. Different mulch sources were used at the two sites because the distance between sites and the volume required at each made sourcing from a single supplier impractical. Mulch was delivered by tractor to post rows, where it was spread with rakes to cover the entire post row with a 2- to 3-inch-thick layer. At both locations mulch was applied once prior to post installation and persisted throughout the trial period. Polyacrylamide (PAM), a nontoxic soil-binding polymer, was applied prior to rain events at a rate of 2 pounds per acre. In 2016–2017, PAM was mixed with water and applied with a backpack sprayer, but because of nozzle plugging we instead dispersed dry PAM to post rows in 2017–2018 and observed similar efficacy and increased ease of application.

In the 2016–2017 season, we collected runoff samples by hand within 30 minutes of the beginning of runoff generation, approximately 25 feet away from the ends of each of the treatment post rows. About 250 milliliters of runoff water in each sample were brought from the field sites to the UC Cooperative Extension Ventura County lab and immediately tested for turbidity using a turbidimeter, acidified with sulfuric acid to reach pH 3, and either shipped immediately to the ANR analytical lab at UC Riverside or stored at 4°C until shipment. Levels of nitrogen forms and total nitrogen and phosphorus were determined using a Discrete Analyzer AQ2. In 2017–2018, we collected grab samples as described above. We also collected runoff in 5-gallon buckets installed 25 feet from the end of post rows to intercept the first flush of runoff at soil surface level.

Additionally, we installed suction lysimeters about 30 feet away from the ends of the post rows at 8-inch depth at Santa Maria and at 8- and 24-inch depths at Somis, and collected leachate after rains. In 2017–2018 we also collected sediment from the buckets after runoff occurred, and the sediment samples were dried and weighed at the UCCE Ventura County lab. In April 2018, we took soil samples that were analyzed for soil moisture, nitrate nitrogen and phosphorus content.

We calculated the costs of each treatment for the 1,800-square-foot experiment plot and then extrapolated the costs to a per-acre basis for one tunnel use period (see the sketch below). A tunnel use period covers a 3-year production cycle of raspberry from establishment until termination. Costs of treatments included materials, labor and equipment when applicable. The granular dry PAM formulation applied to soil was used in the analyses. We also adjusted a treatment's costs if it provided a weed control benefit. In addition, some treatments can serve for more than one tunnel use period, so we distributed their costs accordingly.
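The plot-to-acre extrapolation referenced above is scaling plus amortization arithmetic. A minimal sketch, with made-up cost figures (the article's actual budget lines are not reproduced here) and hypothetical function and parameter names:

```python
# Scale plot-level treatment costs to a per-acre, per-tunnel-use-period basis
# (one tunnel use period = one 3-year raspberry production cycle).
SQFT_PER_ACRE = 43_560
PLOT_SQFT = 1_800
SCALE = SQFT_PER_ACRE / PLOT_SQFT  # ~24.2 plot-equivalents per acre

def per_acre_cost(plot_cost, tunnel_periods=1, weed_control_credit=0.0):
    """Scale a plot cost to one acre, amortize it over the number of tunnel
    use periods the material lasts, and credit any weed control savings."""
    return plot_cost * SCALE / tunnel_periods - weed_control_credit

# Illustrative only: a fabric costing $90 per plot that lasts two tunnel use
# periods and offsets $300/acre of hand weeding.
print(round(per_acre_cost(90, tunnel_periods=2, weed_control_credit=300)))  # 789
```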

Not all treatments had runoff during light rains. Barley cover crop and yard waste mulch likely interfered with low flows and aided water retention in post rows. We observed slower flows and greater puddling in post rows with barley or mulch than in other treatments or untreated soil. Soil sampled 3 days after rain in March 2018 at Somis had 8% to 12% greater moisture content at both sampling depths under mulch compared with other treatments. Mulch also conserved more soil moisture than fabric at Santa Maria.

Combined nitrite and nitrate (NOx) levels in runoff samples ranged from 0.29 to 6.48 milligrams per liter over two seasons of sampling. This variability is due to the intensity and frequency of the rains during this period, which also affected the amount of fertigated nitrogen that accumulated between rain events. Fabric and PAM did not reduce nitrate or nitrite in runoff compared with untreated soil on any of the sampling dates at either location in either sampling season, while mulch was equally ineffective in 2016–2017 in reducing NOx in runoff at both locations. During one out of five runoff events in 2016–2017, barley reduced NOx levels in runoff by 48% compared with untreated soil, but not significantly during other rain events of that season.

During two out of five runoff events at Somis in 2017–2018, barley reduced NOx levels in runoff by 71% and 82% and mulch reduced them by 67% and 91% compared with untreated soil, but reductions were not significant at other sampling events. At Santa Maria, none of the treatments had a significant impact on NOx in runoff when compared with untreated soil. All treatments at Somis were effective in reducing ammonium in runoff in 2016–2017 compared with untreated soil, but only barley was effective in 2017–2018. The overall greater average levels of ammonium in 2017–2018 were likely due to use of passive samplers that intercepted the first flush of runoff, which may have had a greater concentration of pollutants than runoff collected later. Ammonium is typically carried on sediments, so lower ammonium would indicate less sediment movement. This suggests that barley cover crop and yard waste mulch can reduce both the concentration of dissolved ammonium nitrogen in runoff and the volume of runoff, leading to potential reductions in nitrogen losses to the environment compared with untreated soil.

Soil under barley and mulch had significantly less nitrate nitrogen compared with other treatments in March 2018 at Somis. At Santa Maria, all treatments except for mulch had 25% to 81% less nitrate nitrogen than untreated soil, although mulch was also similar to all other treatments. Mulch deterioration might have reduced its efficacy at Santa Maria. At Santa Maria, nitrate nitrogen levels in leachate collected at 8-inch depth on all sampling dates ranged from 12 to 27 parts per million in PAM and untreated plots, which was 52% to 80% greater than in other treatments. At Somis a similar trend was observed: nitrate nitrogen levels in leachate under PAM and untreated soil were 7 to 22 ppm, which was 80% to 90% greater than those under barley or mulch. Leachate nitrate concentrations under fabric were not different from those in untreated soil. These results suggest that barley and mulch can reduce nitrate nitrogen in soil and leachate. Mulch and cover crop act as a barrier to runoff water carrying dissolved nitrogen and sediment, and may retain nitrogen for cover crop growth and for residue and mulch decomposition.

Turbidity in the first flush of runoff was reduced 5- to 10-fold by all treatments compared with untreated soil at both locations in 2018. These results were similar to turbidity in grab samples taken in 2017 and 2018, which suggests that all treatments were effective in reducing waterborne sediments leaving the site. Additionally, 75% to 97% less sediment was collected from passive samplers in all treated post rows compared with untreated soil, as shown for March 10, 2018.

The relatively high sediment load in the fabric treatment resulted from deposits of soil on top of the fabric during removal of plastic from raspberry beds. Similar to the March 10 rain event, we observed significantly lower sediment levels after other rains in all treated post rows compared with untreated rows. We also observed fewer erosion channels in treated post rows than in untreated plots at both sites during the trial. Besides the agronomic benefits, retaining soil in the field is also a good pesticide management practice because soil-adsorbed pesticides will stay in the field and not end up in receiving bodies of water. In a previous study, Mangiafico et al. showed that concentrations of the harmful insecticide chlorpyrifos in runoff were linearly related to sample turbidity. This suggests that retaining waterborne sediments on-site is an effective method for mitigating runoff of this pesticide. Preventing soil movement with these post row treatments may also reduce the costs of sediment removal from receiving waterways and the associated environmental impacts.

Phosphorus levels in the first flush of runoff samples were reduced by 24% to 85% in all treatments compared with untreated soil at Somis in 2018, except for PAM on Feb. 27, 2018. The lack of efficacy of PAM on that date may have resulted from deterioration of the PAM seal due to soil disturbance after PAM application and before runoff sample collection. At Somis in 2016–2017 and Santa Maria in 2018, we observed a similar reduction in phosphorus by all post row treatments compared with untreated soil. Since phosphorus is normally adsorbed to soil particles, reductions in turbidity and phosphorus in runoff samples from treated post rows followed a similar trend. Reducing losses of phosphorus from production fields may help prevent eutrophication in receiving waterways when this nutrient is limiting for algal growth.

Since tunnel post rows receive water and retain soil moisture, conditions are favorable for weed growth. At both locations weed barrier fabric provided nearly complete weed control, with only occasional weed germination in areas where soil was deposited on top of the fabric.

Application of PAM did not provide weed control, and weed densities in PAM-treated rows were similar to those in untreated plots. Yard waste mulch provided 81% to 90% weed control at Somis but did not control weeds on two out of three evaluation dates at Santa Maria. The mulch at Santa Maria was much finer than the mulch at Somis and likely decomposed more rapidly, allowing weed growth. Barley cover crop provided 86% and 42% weed control on two evaluation dates at Somis, but after barley was reseeded, high germination of little mallow occurred. Incorporation of barley during reseeding likely disturbed hard-coated weed seeds sufficiently to break dormancy; however, mallow was controlled before seed production when barley was mowed in spring. Barley cover crop at Santa Maria provided 87% and 43% weed control on two out of three evaluation dates. At Somis in 2018, we observed 3.5 times as many volunteer raspberry shoots in post rows with mulch as in other treatments or untreated plots. Unlike weeds, raspberry shoots were able to penetrate mulch and establish, likely benefiting from the greater soil moisture content under it. These results show that weed barrier fabric, mulch and barley can effectively reduce weed control costs in raspberry tunnel post rows, but greater volunteer raspberry shoot management may be required if mulch is used.

Although the diversity and economic size of California's agricultural production may increase its resilience and resistance to perturbations, urbanization, higher temperatures and increasing resource costs are forecast for the next 100 years, and there is great uncertainty as to how producers will respond to a changing climate both within California and globally. Producers may face significant challenges as regional temperatures, precipitation and weather pattern variability, and national and international markets are altered by global climate change. As commodity prices are dependent on global production and demand, any assessment of the impacts of climate change on California agriculture must be done in the context of both regional and global changes in yields. The magnitude and direction of these yield changes will be determined by climatic factors such as temperature, precipitation and weather variability, and by production factors such as biotic responses to elevated atmospheric CO2 concentrations, the availability and application of nutrients, and the ability of producers to adapt to these changes. Furthermore, as global markets develop for carbon trading, opportunities may arise for California agricultural producers to mitigate greenhouse gases. Adjustments in global food and mitigation markets together will therefore significantly shape California agricultural producers' response to climate change. And since agriculture is not only of economic significance but also secures the livelihood of much of the world's population, impacts of climate change on food and farm security are of particular importance. Predicting yields in the coming century requires complex modeling that integrates global and regional climate change models, crop growth models and economic models, with the expectation that climate change will likely affect different regions of the world in distinct ways.

Are geographic and economic mobility linked for workers who get non-farm jobs?

Over the 1990s, more tons of vegetables were produced from the same acreage, while acreage of fruits and nuts rose from 2 million acres in 1990 to 2.4 million acres in 2000, a 19% increase. Many fruit, vegetable and horticultural (FVH) commodities are labor intensive, with labor accounting for 15% to 35% of production costs. Most of the workers employed on FVH farms are immigrants from Mexico, and a significant percentage are believed to be unauthorized. In recent years, several proposals have aimed to reduce unauthorized worker employment in agriculture. In September 2001, Mexican President Vicente Fox called for a U.S.-Mexico labor migration agreement so that "there are no Mexicans who have not entered this country [U.S.] legally." The United States and Mexico appeared close to agreement on a program to legalize farm and other workers before September 11, 2001. However, after the war on terror was declared, the momentum for a new guest-worker program and for the legalization of immigrants already in the country slowed. In summer 2003, there were several new proposals for a migration agreement with Mexico to legalize the status of currently unauthorized workers and allow some to earn immigrant status by working and paying taxes in the United States. There is little agreement, however, on what impacts such a program would have on California's farm labor market.

We used a unique database to examine farm employment trends in California agriculture. The data suggest that: about three individuals are employed for each year-round equivalent job, helping to explain low farm worker earnings; there was a shift in the 1990s from crop farmers hiring workers directly to farmers hiring via farm labor contractors (FLCs); and there is considerable potential to improve farm labor market efficiency by using a smaller total workforce, with each worker employed more hours and achieving higher earnings.

California employers who pay $100 or more in quarterly wages are required to obtain an unemployment insurance (UI) reporting number from the California Employment Development Department (EDD). The EDD then assigns each employer or reporting unit a four-digit Standard Industrial Classification (SIC) or, since 2001, a six-digit North American Industry Classification System code that reflects the employer's major activity. Major activities are grouped in increasing levels of detail; for example, agriculture, forestry and fisheries are classified as a major industrial sector and, within this sector, SIC 01 is assigned to crops, 017 to fruits and nuts and 0172 to grapes. We defined "farm workers" as unique Social Security numbers (SSNs) reported by farm employers to the EDD, and then summed their California jobs and earnings. This enabled us to answer questions such as how many farm and non-farm jobs were associated with a particular SSN or individual in 1 year, and in which commodity or county a person had maximum earnings.

We adjusted the raw data before doing the analysis. Farm employers have reported their employees and earnings each quarter since 1978, when near-universal UI coverage was extended to agriculture. Although it is sometimes alleged that farm employers, especially FLCs, do not report all their workers or earnings, there is no evidence that underreporting of employees or earnings is more common in agriculture than in other industries that hire large numbers of seasonal workers, such as construction. We excluded from the analysis SSNs reported by 50 or more employers in 1 year. We also excluded wage records or jobs that had less than $1 in earnings, or that reported earnings of more than $75,000 in one quarter. These adjustments eliminated from the analysis 2,750 SSNs, 62,571 wage records or jobs and $803 million in earnings. These exclusions were about 0.25%, 2.7% and 6.1% of the respective totals, and are documented more fully in Khan et al.
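A minimal sketch of these exclusions, assuming a flat wage-record table with hypothetical file and column names (the actual EDD layout is not described here):

```python
import pandas as pd

# One row per employer-SSN-quarter wage record (hypothetical file).
wages = pd.read_csv("edd_wage_records.csv")  # columns: ssn, employer_id, sic, quarter, earnings

# Drop SSNs reported by 50 or more employers in one year (likely shared
# numbers or data-entry errors).
employers_per_ssn = wages.groupby("ssn")["employer_id"].nunique()
shared_ssns = employers_per_ssn[employers_per_ssn >= 50].index
wages = wages[~wages["ssn"].isin(shared_ssns)]

# Drop records with under $1 in earnings or over $75,000 in a single quarter.
wages = wages[(wages["earnings"] >= 1) & (wages["earnings"] <= 75_000)]
```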

There is no single explanation for the outlier data we excluded. In some cases, several workers may share one SSN, while in others our suspicion that a SSN had "too many" jobs may reflect data-entry errors. During the 1990s, the Social Security Administration cleaned up SSNs, including threatening to fine and reject tax payments from employers with too many mismatches between SSNs and the names associated with them, which should have reduced the number of SSNs reported by employers. We think the rising number of SSNs reflects more individuals employed in agriculture, not more noise in the data.

Agricultural employment can be measured in three major ways: at a point in time, as an average over time or by counting the total number of individuals employed over some period of time. In the non-farm labor market the three employment concepts yield similar results. If 100 workers are employed during each month and there is no worker turnover from month to month, then point-in-time, average and total employment are all 100. However, agricultural employment during the six summer months may be 150, versus 50 during the six winter months, meaning that point, average and total employment counts differ.

We began with all SSNs reported by agricultural employers, summed the jobs and earnings of these SSNs within each SIC code, and assigned each SSN to the four-digit SIC code in which the worker had the highest earnings (see the sketch below). This means that a SSN reported by a grape employer as well as by an FLC would be considered a grape worker if his highest-earning job was in grapes. The number of individuals or unique SSNs reported by California agricultural employers rose over the past decade — 907,166 in 1991, 966,593 in 1996 and 1,086,563 in 2001.

Farm workers had a total of 1.5 million farm jobs in 1991, 1.7 million in 1996 and 1.8 million in 2001. One-quarter also had at least one non-farm job — about 407,000 workers were both farm and non-farm workers in 1991, 453,000 in 1996 and 697,000 in 2001. The total California earnings of persons employed in agriculture were $11.1 billion in 1991, $12.0 billion in 1996 and $15.8 billion in 2001. The share of total earnings that farm workers received from agricultural employers was 77% in 1991, 77% in 1996 and 71% in 2001, indicating that in the late 1990s farm workers tended to increase their supplemental earnings via non-agricultural jobs.
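The max-earnings assignment rule could be implemented along these lines, continuing the hypothetical `wages` table from the sketch above (an assumed reconstruction, not the study's actual code):

```python
# Sum each worker's earnings within each four-digit SIC code, then assign the
# worker to the SIC in which he or she earned the most.
by_ssn_sic = wages.groupby(["ssn", "sic"])["earnings"].sum().reset_index()
primary = by_ssn_sic.loc[by_ssn_sic.groupby("ssn")["earnings"].idxmax()]

# A worker reported by both a grape grower and an FLC is counted as a grape
# worker if the grape job paid more.
workers_per_sic = primary.groupby("sic")["ssn"].count()
```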

Average earnings per job were highest in livestock, $13,800 per job in 2001. There was little difference between average earnings per job in agricultural services and in crops. Average earnings per job were higher for the non-farm jobs of agricultural workers than for their agricultural jobs.

In 2001, California's farm workers held 2.5 million jobs, including 1.8 million jobs with agricultural employers. These agricultural jobs included 630,000 in crops, 69,000 in livestock and 1.1 million in agricultural services. The agricultural services sector includes both farm and non-farm activities, such as veterinary and lawn and garden services; FLCs accounted for 70% of the employees reported by farm agricultural services. Fruits and nuts accounted for 53% of the crop jobs, dairy for 39% of the livestock jobs and FLCs for 58% of the agricultural services jobs. The major change between 1991 and 2001 was the drop of 54,000 jobs in crop production and the increase of 313,000 jobs in agricultural services.

We placed SSNs in the detailed commodity or SIC code that reflected the worker's maximum reported earnings, and considered workers to be primarily employed in the SIC with maximum earnings. In 2001, there were 877,000 primary farm workers: 322,000 reported by crop employers, 50,000 by livestock employers and 504,000 by agricultural service employers. Fruit and nut employers accounted for 47% of the crop-reported workers, dairy for 40% of the livestock-reported workers and FLCs for 44% of the agricultural services–reported workers. The major change between 1991 and 2001 was the increase in the number of SSNs with their primary job in agriculture — from 758,000 to 877,000. There was a slight drop in the number of workers reported by crop employers, a slight increase in livestock workers and a sharp 135,000 increase in agricultural services workers, anchored by a 59,000 increase in workers reported by FLCs in 2001.

Most farm workers had only one job. In 2001, 53% of the SSNs were reported by only one employer to the EDD, 26% were reported twice, 12% three times, 5% four times and 4% five or more times. During the 1990s, about 65% of farm workers were reported by one agricultural employer only, 17% to 21% by two agricultural employers, 5% by at least two agricultural employers and one non-farm employer, and 9% to 12% by one farm and one non-farm employer. In the three-digit SIC codes representing more detailed commodity sectors, 60% to 83% of the employees had only one job. For example, in 2001, 79% of the employees reported by dairy farms had one dairy farm job, while 7% also had a second agricultural job; 3% had a dairy job, a second farm job and a non-farm job, and 11% had a non-farm job in addition to the dairy job. About two-thirds of the employees of FLCs and farm management companies had jobs with only one such employer; 22% had another farm job; 6% had an FLC job, another farm job and a non-farm job; and 6% had a non-farm job in addition to the FLC job. Even more detailed four-digit SIC codes showed the same pattern: the commodities or SICs most likely to offer year-round jobs, such as dairies and mushrooms, had 70% to 80% of employees working only in that commodity, while commodities or SICs offering more seasonal jobs, such as deciduous tree fruits and FLCs, had 53% to 63% of employees working only in that commodity.

At the four-digit SIC-code level, the five largest SICs accounted for about 45% of the agricultural wages reported. Agricultural employers paid a total of $11 billion in wages in 2001, an average of $10,200 per worker. Earnings were highest for the 64,000 workers primarily employed in livestock, who averaged $14,800, followed by those primarily employed by crop employers and those employed by agricultural farm services, custom harvesters and FLCs. There was considerable variation in earnings among workers in agricultural farm services: workers in soil preparation services averaged $21,100 in 2001, versus $12,700 for crop preparation services for market and $4,400 for FLC employees. The average earnings of primarily farm workers varied significantly, even within detailed four-digit SIC codes — in most cases, the standard deviation exceeded the mean wage. Median earnings were generally less than mean earnings, reflecting that higher-wage supervisors and farm managers pulled up the mean.

If the workers in detailed commodities are ranked from lowest- to highest-paid, the lowest 25% of earners in an SIC category generally earned less than $4,000 a year. For example, among workers primarily employed in vegetables and melons in 2001, the first quartile or 25th percentile of annual earnings was $3,000. This reflects relatively few hours of work — if these workers earned the state's minimum wage of $6.25 an hour in 2001, they worked 480 hours (a conversion sketched below). The 25th percentile earnings cutoff was lowest for those employed primarily by FLCs, only $634, suggesting that FLC employees receiving the minimum wage worked 101 hours. The highest 25th percentile mark was in mushrooms, $9,491, which reflects 1,519 hours at minimum wage. The 75th percentile marks the highest earnings that a non-supervisory worker could normally expect to achieve — 75% of workers reported earning less than this amount and 25% earned more. The 75th percentile varied widely by commodity: $6,172 for those primarily employed by FLCs, $10,572 for those in grapes and $29,465 for those in mushrooms.

The number of individuals and jobs reported by agricultural employers increased in the 1990s, reflecting increased production of labor-intensive fruit and vegetable crops and, the data suggest, more farm workers each working fewer hours. With the state's minimum wage at $6.25 per hour after Jan. 1, 2001, the earnings reported by employers suggest that most farm workers are employed fewer than 1,000 hours per year. FLCs increased their market share in the 1990s, but dependence on them varied by commodity. For example, FLCs rather than citrus growers reported many citrus workers, while dairy employers reported most dairy workers. FLCs are associated with low earnings, which suggest few hours of work — the median earnings reported by FLCs for their employees in 2001 were $2,650, or about 400 hours if workers earned the state's $6.25 minimum wage. California's farm labor market has large numbers of workers searching for seasonal jobs; FLCs are matching an increasing share of these workers with jobs, resulting in lower earnings for FLC employees. Workers who avoid FLCs achieve higher earnings in agriculture or in the non-farm labor market. If FLCs are most likely to hire recently arrived and unauthorized workers, as the National Agricultural Worker Survey suggests, FLCs serve as a port of entry for immigrant farm workers. The impact of guest workers, legalization and earned legalization will depend on the details of any new program.
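The earnings-to-hours conversions quoted in this section are simple divisions by the 2001 minimum wage:

```python
MIN_WAGE_2001 = 6.25  # California minimum wage, $/hour, after Jan. 1, 2001

def implied_hours(annual_earnings, wage=MIN_WAGE_2001):
    """Hours of work implied by reported earnings at the minimum wage."""
    return annual_earnings / wage

print(implied_hours(3_000))  # vegetables/melons 25th percentile -> 480 hours
print(implied_hours(634))    # FLC 25th percentile -> ~101 hours
print(implied_hours(2_650))  # FLC median -> 424 hours (the text rounds to 400)
```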
If the status quo continues, the percentage of unauthorized workers is likely to rise. Alternatively, if there were a legalization program, farm workers might more quickly exit the farm workforce. However, an earned legalization program could slow this exit if workers were required to continue working in agriculture to earn full legal status. The next step in this analysis is to examine the mobility of individual farm workers over time and geography, examining where workers migrate during 1 year and patterns of entrance to and exit from the farm workforce. Do farm workers who increase their earnings by moving to non-farm jobs stay in non-farm jobs, or do they sometimes return to agriculture? Answers to these questions will help to determine the trajectory of the farm labor market.

We geocoded maternal residential addresses listed on the birth certificates using an automated approach

Since most pesticide applications are spatially explicit and usage may vary over years, these studies face the challenge of lacking high spatial and temporal resolution for exposure assessment, on top of the known ecological fallacy. On the other hand, individual-level GIS-based studies that assessed exposure to agricultural pesticides in proximity to residences have shown inconsistent findings. A recent Spanish study explored the possible association between childhood renal tumors and residential proximity to environmental pollution sources by calculating the percentage of total crop surface within a 1-km buffer around each individual's last known residence (an approach sketched below) and found that children living in the proximity of agricultural crops have a higher risk of developing renal cancer. However, US studies of individual measures of agricultural pesticides in proximity to residences mostly suggested no association with childhood cancers, or at best modest associations for certain types of cancers and chemicals or chemical groups. A Texas study measured crop field density within a 1-km buffer of the residence at birth for both cases and controls born in 1990-1998 to study the risk for childhood cancers by sub-type and found no evidence of elevated risk associated with residential proximity to cropland for most childhood cancers, except for modestly positive associations with non-Hodgkin lymphoma, Burkitt lymphoma and other gliomas. A California population-based case-control study of early childhood cancer used mothers' residential addresses at the time of birth to evaluate risks associated with residential proximity to agricultural applications of pesticides during pregnancy and also found no associations with most specific chemicals and chemical groups, except for modestly elevated ORs for leukemia associated with probable and possible carcinogen use and with nearby agricultural applications of organochlorines and organophosphates.
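A minimal sketch of the 1-km buffer crop-percentage metric used in these studies, assuming hypothetical input files and a metric projection; actual studies use curated land-use layers and address-level geocodes:

```python
import geopandas as gpd

# Geocoded residences (points) and crop fields (polygons), reprojected to a
# metric CRS (EPSG:3310, California Albers) so buffer distances are in meters.
homes = gpd.read_file("residences.gpkg").to_crs(epsg=3310)
crops = gpd.read_file("crop_fields.gpkg").to_crs(epsg=3310)

buffers = homes.geometry.buffer(1_000)    # 1-km buffer around each residence
crop_union = crops.geometry.unary_union   # dissolve fields into one geometry

# Percent of each buffer covered by crops: the exposure surrogate.
homes["pct_crop"] = [
    100 * buf.intersection(crop_union).area / buf.area for buf in buffers
]
```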

They suggested that the few elevated risk associations for specific chemicals, including metam sodium and dicofol, in this study are likely due to chance from multiple comparisons. Another Northern California study assessed residential proximity within a half-mile of pesticide applications by linking address histories with reports of agricultural pesticide use, examined the association of first-year-of-life or early childhood pesticide exposures with childhood acute lymphoblastic leukemia (ALL), and suggested that elevated ALL risk was associated with lifetime moderate exposure to certain physicochemical groups of pesticides, including organophosphates, chlorinated phenols and triazines, and with pesticides classified as insecticides or fumigants. These studies differed in the chemicals or chemical groups examined and in their methods of exposure assessment, and therefore produced inconsistent results. In addition, many of the above studies found no or weak associations between all cancer types grouped together and proximity to crop fields, agricultural activities, specific agents or groups of chemicals. This may be problematic because the etiologies of most childhood cancer sub-types remain largely unknown and may not share the same causal pathway mediated by pesticides.

As mentioned before, these studies of ambient pesticide exposures in pregnancy or early childhood often assign exposures based upon the child's or mother's residence. While large-scale record-linkage studies can avoid the selection and recall biases that often affect smaller studies with active subject recruitment, previous record-based studies often relied solely on the maternal residential address at birth, which is readily available on many birth certificates, and/or the residential address at diagnosis, as was done in some childhood cancer studies. Previous evidence suggested that exposure to pesticides before or during pregnancy may harm the developing fetus. Nevertheless, increased risks were seen for the postnatal period as well. For instance, a study of childhood leukemia tried to distinguish between pre-pregnancy, pregnancy and postnatal exposures as critical windows for household pesticide exposure, and found that insecticide exposures early in life appear significant, though the effect is not as strong as for prenatal exposures.

After birth, children may also be more susceptible to the harmful effects of pesticides than adults, as they have more actively dividing cells, providing the rationale for studies of childhood cancers to focus not only on prenatal but also on early-life exposures. The reliance on one address for assessment of exposures in the first year of life or early childhood implicitly assumes that a child's residence remained the same throughout the entire period of interest or, if the family moved, that the exposure levels remained the same. Even if some studies assessed exposures as the children's residential proximity to agricultural fields or pesticide applications at the time of birth, these exposure indicators at birth are assumed to reflect exposures in the prenatal and/or postnatal period, which are believed to be critical windows of exposure for childhood cancers. Consequently, the "one-address" approach may lead to exposure misclassification for those who moved in early childhood, especially for exposures with high spatial heterogeneity. In a 2003-2007 California statewide representative survey, only 14% of all women moved in the 2-7 months post-partum, but with increasing age of the child, the frequency of residential moves also increased. For more than 50% of childhood cancer cases under age 5 diagnosed in California between 1988 and 2005, the address at birth differed from the address at cancer diagnosis, which raises concerns about using residence at birth to assess exposures in early childhood. Exposure misclassification due to moving is a ubiquitous problem encountered by nearly all record-based studies that lack a complete residential history for each child. Previous studies suggested that residential mobility may be associated with certain risk factors for childhood cancers such as maternal age, marital status, parity, family income and other socioeconomic status metrics, resulting in non-differential or even differential misclassification of exposures. Currently, in the US, accurate and complete residential histories are available only to interview-based environmental epidemiological studies, which are often small, with hundreds of subjects, because of the high time and monetary costs of such interviews, and are thus likely underpowered.

Small case-control studies typically asked for individuals' lifetime residential histories, including the beginning and end dates for each address. Large cohort studies follow participants over a long time and often update their addresses periodically from follow-up questionnaires, so they may not know the exact moving dates. Sometimes these studies additionally collect information from US Postal Service change-of-address forms or major credit reporting agencies, but the date associated with each address does not necessarily capture an accurate "move-in" date; rather, it reflects the first known date. While it is not feasible to acquire complete residential histories from interviews for all subjects in large record-based studies as a gold standard to compare against the recorded birth or diagnosis address, databases containing public records of individuals collected by commercial companies have become available in recent years, allowing us to trace individuals without a self-reported residential history. For example, LexisNexis Public Records, Inc., a commercial credit reporting company, provides all known addresses for a set of individuals upon request. If the commercial residential history data have relatively high accuracy, their low cost and broad coverage would provide valuable information to all studies requiring residential history data. The basic service provided by LexisNexis returns the latest three known addresses, while the enhanced service, at a higher cost, returns all known addresses from at least 1995 onward. This database is maintained primarily for the purpose of contacting study participants, not for scientific research, and therefore may not be as accurate as residential histories obtained from interviews or self-administered questionnaires.

Several earlier studies have attempted to compare residential histories reconstructed from LexisNexis records with interview-based residential histories for enrolled subjects and to validate its use for research purposes. A Michigan case-control study of bladder cancer first compared lifetime residential histories collected through written surveys with the 3 residential addresses available from LexisNexis and reported a 71.5% match rate. Their bladder cancer cases were less than 80 years of age upon diagnosis, and controls were selected from similar age groups. Both cases and controls had lived in 1 of 11 counties in Michigan for more than 5 years before recruitment in 2008-2009. Another US-wide study selected a random sample of 1,000 subjects originally enrolled in the National Institutes of Health-American Association of Retired Persons (AARP) Diet and Health Study, with AARP members aged 50-69 years and living in one of six US states or two metropolitan areas at the time of enrollment. The authors found 72% and 87% detailed address match rates with the basic and enhanced services provided by LexisNexis, respectively. The most recent LexisNexis validation study looked into participants in the California Teachers Study, a prospective cohort study initiated in 1995-1996 and originally designed to study breast cancer. These women were aged 22 to 104 years at enrollment and lived throughout California. The study pointed out that though the overall match rate between the two sources of addresses was good, it was diminished among black women and younger women.
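The match rates reported in these validation studies amount to comparing two address lists after standardization. A toy sketch (real studies rely on formal postal standardization or geocoding rather than the crude normalization below):

```python
def normalize(addr):
    """Crude address standardization for comparison purposes only."""
    return " ".join(addr.upper().replace(".", "").replace(",", " ").split())

def match_rate(reference_addresses, lexisnexis_addresses):
    """Share of reference (e.g., interview) addresses found in LexisNexis."""
    ln = {normalize(a) for a in lexisnexis_addresses}
    hits = sum(normalize(a) in ln for a in reference_addresses)
    return hits / len(reference_addresses)

# e.g., match_rate(survey_history, ln_history) would be ~0.715 in the
# Michigan bladder cancer study described above.
```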
In summary, such residential information from LexisNexis, if of high quality, could potentially augment existing address information and help us reconstruct residential histories for subjects in large record-linkage studies and provide more accurate exposure estimates.

However, researchers should be aware that the residential mobility of young children and their mothers may differ from that of the middle-aged or older populations who made up the majority in all three validation studies discussed above and who are believed to have more stable residences. In addition, differences in the distributions of race/ethnicity, socioeconomic status and geographic region may influence residential mobility in various study populations as well.

During the first decade of the 21st century, the estimated rates of preterm birth (PTB) and low birth weight (LBW) peaked at 11%-13% and 7%-8% in the US, respectively. Though survival of infants born preterm and/or at low birthweight has improved in recent decades due to advances in prenatal and neonatal care, these infants are more susceptible to adverse health outcomes such as neurodevelopmental impairment, respiratory and gastrointestinal complications, obesity, diabetes mellitus, hypertension and kidney disease; they also suffer substantially higher infant and childhood mortality rates. California is the largest agricultural state in the United States, with more than 150 million pounds of pesticide active ingredients applied every year. Previous experimental studies show that various pesticides, including organophosphates and pyrethroids, can influence prenatal development in ways related to adverse birth outcomes. Proposed mechanisms include disturbance of placental functions, endocrine disruption, immune regulation and inflammatory mechanisms. Pesticides have been found in indoor residential dust in residences near agricultural fields and may persist for years. However, epidemiologic studies have yielded inconsistent results: while ecological and cross-sectional studies reported positive associations between agricultural pesticide use and PTB and LBW, results from studies assessing self-reported or occupational use of pesticides were mixed. A systematic review of 25 studies examining agriculture-related exposures from residential proximity to pesticide applications suggested weak or no effects on preterm birth and low birth weight, possibly due to the methodological difficulties of exposure assessment. More recent residential proximity studies using simple or aggregate-level exposure assessments provided some evidence for pesticides influencing birth outcomes. Two recent Geographic Information System (GIS)-based studies restricted to the San Joaquin Valley of California reported conflicting results: one found pesticide exposure to increase preterm birth and low birthweight by 5-9% in those highly exposed to chemicals with acute toxicity based on the US EPA Signal Word, while the other, which assessed 543 individual chemicals and 69 physicochemical groupings, found negative associations for spontaneous preterm birth. Nevertheless, various small pesticide biomarker-based studies that measured organochlorines, organophosphates or pyrethroids and their metabolic breakdown products in maternal blood, urine or umbilical cord blood suggested positive associations with preterm birth or lower birthweight, though results varied by the chemicals and outcomes assessed.
Here, we assessed GIS-derived exposures during pregnancy to selected agricultural pesticides applied near maternal residences and the risks of preterm birth and term low birthweight, considering trimester-specific exposure windows in a large sample of births in agricultural regions of California.

Birth addresses with a low geocode quality due to missing or non-geocodeable fields on the birth certificates accounted for ~12% of all addresses geocoded. We then calculated measures of residential ambient pesticide exposure using a GIS-based Residential Ambient Pesticide Estimation System, as previously described. In brief, since 1974 agricultural pesticide applications for commercial use have been recorded in Pesticide Use Reports (PURs) mandated by the CA Department of Pesticide Regulation.
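A minimal sketch of the kind of exposure metric such a system produces: summing pesticide active ingredient applied near the residence within each trimester window. The flat PUR table, its column names and the simple circular buffer are assumptions; the actual Residential Ambient Pesticide Estimation System is more elaborate.

```python
import pandas as pd

def trimester_windows(conception):
    """Three consecutive 13-week windows starting at conception."""
    t0 = pd.Timestamp(conception)
    return [(t0 + pd.Timedelta(weeks=13 * i), t0 + pd.Timedelta(weeks=13 * (i + 1)))
            for i in range(3)]

def ambient_exposure(pur, home_xy, conception, radius_m=1_000):
    """Sum kg of active ingredient applied within radius_m of the residence,
    per trimester. Assumed pur columns: x, y (projected meters), date, kg_ai."""
    dx = pur["x"] - home_xy[0]
    dy = pur["y"] - home_xy[1]
    nearby = pur[dx**2 + dy**2 <= radius_m**2].copy()
    nearby["date"] = pd.to_datetime(nearby["date"])
    return [nearby.loc[nearby["date"].between(start, end), "kg_ai"].sum()
            for start, end in trimester_windows(conception)]
```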

Data-driven statistical approaches can provide complementary insight into these questions

Additionally, studies have found statistically significant negative associations between living in proximity to agriculture and adverse outcomes, but not with pesticide metabolite levels directly. Similarly counterintuitive results have shown that specific chemicals such as methyl bromide or OP pesticides have negative associations with some birth outcomes, but also unexpected positive associations with others.

Large samples provide a powerful opportunity to control for the various demographic and environmental characteristics that may be obscuring the relationship between agricultural pesticide exposure and adverse birth outcomes in surrounding communities. Here we revisit the relationship between pesticide exposure and birth outcomes using a large sample of births, which includes individual-level data on maternal and birth characteristics and pesticide exposure at a small geographical scale. We concentrate on the agriculturally dominated San Joaquin Valley, California. California is the most populous state in the United States, with roughly 12% of annual births. It is also the greatest user of pesticides, with over 85 million kg applied annually, an amount equivalent to roughly 30% of the cumulative active ingredients applied to US agriculture. The San Joaquin Valley is the state's most productive agricultural region, growing an abundance of high-value, high-chemical-input, labor-intensive fruit, vegetable and nut crops. We evaluate pesticide exposure by summing active ingredients of agricultural pesticides applied over gestation, by trimester, and grouped by the United States Environmental Protection Agency's acute toxicity categories (a grouping sketched below), along with several additional robustness checks. For outcomes, we focus on birth weight, gestational age and birth abnormalities.
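For the Signal Word grouping, a sketch along these lines is plausible; which signal words count as "high" acute toxicity, and the per-chemical signal words shown, are illustrative assumptions rather than the study's actual classification:

```python
import pandas as pd

# Hypothetical application records already restricted to the area around one
# residence and its gestation window.
apps = pd.DataFrame({
    "chemical": ["chlorpyrifos", "glyphosate", "methyl bromide"],
    "signal_word": ["WARNING", "CAUTION", "DANGER"],  # illustrative
    "kg_ai": [12.0, 40.0, 5.5],
})

# EPA signal words order acute toxicity: DANGER > WARNING > CAUTION.
HIGH = {"DANGER", "WARNING"}
apps["tox_group"] = apps["signal_word"].map(lambda w: "high" if w in HIGH else "low")

print(apps.groupby("tox_group")["kg_ai"].sum())  # kg applied by toxicity group
```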

Our sample of over 500,000 individual birth observations and fine-scale data on the timing and amount of pesticide applied allow us to detect statistically significant negative effects of pesticide exposure on all birth outcomes, but generally only for pregnancies exposed to the very highest levels of pesticides.

To explore whether inaccuracies in geocoding or spillover of pesticides from surrounding areas contaminated our results, we excluded births to mothers living within 200 m of a Public Land Survey (PLS) Section boundary. We found a similar overall pattern of statistical significance as in the larger sample. Although the magnitude of the coefficients increased, the effects on birth weight and gestational length remained <1%, and the effects on the probability of low birth weight, preterm birth and abnormalities were at most 13% higher for the high-exposure group relative to the low-exposure group. We also estimated the trimester model including pesticide use in the "fourth trimester". As anticipated, exposure during the three months following birth did not have a significant effect on any outcomes observed at birth. This "placebo" analysis indicates that our empirical results are unlikely to be caused by omitted trends or factors that are correlated with both pesticide applications and infant health. To further ensure the robustness of our results and inference, we checked different exposure cutoffs as well as a continuous measure of exposure. The magnitude of effects was small and generally non-significant with the 75th-percentile cutoff. Being in the top one percent of pesticide exposure led to an 11% increased probability of preterm birth, a 20% increased probability of low birth weight, and a ~30 g decrease in birth weight relative to lower exposure. We also evaluated models with different location fixed effects, different assumptions about clustering the standard errors to address spatial and temporal error correlation, different sample exclusion restrictions on gestational age, different calculations of trimester, and models with other environmental contaminants that can affect in utero infant health (a schematic of the basic specification follows).
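A schematic of this kind of fixed-effects specification with the placebo term, using hypothetical variable names and the statsmodels formula interface; the authors' actual estimation code and covariate set are not reproduced in this text:

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per birth: outcome, exposure indicators, controls, and identifiers
# for spatial (PLS Section) and temporal (year-month) fixed effects.
births = pd.read_csv("births.csv")  # hypothetical file and columns

model = smf.ols(
    "birth_weight ~ top5_exposure"          # 1 if in top 5% of kg applied
    " + fourth_trimester_exposure"          # placebo: applications after birth
    " + mother_age + C(race) + C(education)"
    " + C(section_id) + C(year_month)",     # fixed effects
    data=births,
).fit(cov_type="cluster", cov_kwds={"groups": births["section_id"]})
print(model.summary())
```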

Although the exact magnitudes and patterns of significance did change with these different models, all models reported broadly similar effect sizes. Overall, we report over 100 coefficients in the main text, of which 19 are significant. It is noteworthy that in all these tests, only a single significant coefficient in one model has the opposite sign from that expected. The fact that only one of roughly 20 statistically significant coefficients has the wrong sign is consistent with the notion that our empirical estimates are not plagued by omitted-variable bias. Further, since we do not adjust p-values for multiple comparisons, the number of significant effects we report is an upper bound on the "true" number of significant effects. Applying a Bonferroni correction for multiple comparisons that accounts for five outcomes and up to five covariates of interest, the α-level for statistical significance would shrink from 0.05 to as little as 0.002. The only three coefficients that remained statistically significant with this Bonferroni correction were those associated with a single covariate of interest, total pesticide exposure over the gestation. Of these, two were associated with preterm birth and one with log gestation.
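The Bonferroni arithmetic behind the 0.002 threshold is direct:

```python
alpha = 0.05
outcomes = 5     # birth weight, log gestation, low birth weight, preterm, abnormalities
covariates = 5   # up to five exposure covariates of interest per model
print(alpha / (outcomes * covariates))  # 0.002, the corrected significance level
```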

Concerns about the effects of harmful environmental exposure on birth outcomes have existed for decades. Great advances have been made in understanding the effects of smoking and air pollution, among others, yet research on the effects of pesticides has remained inconclusive. While environmental contaminants generally share the ethical and legal problems of evaluating the health consequences of exposure in a controlled setting and the difficulties associated with rare outcomes, pesticides present an additional challenge. Unlike smoking, which is observable, or even air pollution, for which there exists a robust network of monitors, publicly available pesticide use data are lacking for most of the world. As a result, studies have typically been either highly correlative at coarse resolutions or have included a small number of subjects. Both constraints make it difficult to assess whether residential agricultural pesticide exposure has no effect or whether logistical and analytical barriers have obfuscated the identification of important effects. Our study bridges the gap between detail and scale by leveraging vast pesticide and birth data for the San Joaquin Valley, CA. Our study has far stronger statistical power to identify effects than previous studies owing to several hundred thousand birth observations, individual maternal and birth characteristics, and the inclusion of fine-scale regional and temporal fixed effects.

As a result of our statistical design, we have the analytical power to identify extremely small but statistically significant negative effects of pesticide exposure on several birth outcomes, if they occur. Furthermore, our study design and extensive pesticide data enable us to evaluate many details of the nature of pesticide exposure. For example, we can evaluate whether pesticide exposure in different trimesters, or pesticides of different toxicity levels, affected birth outcomes in different ways. Fetal susceptibility to environmental exposure varies through development. Similarly, different chemical toxicities can have different expected health outcomes. Here we focused on aggregate chemicals grouped into high- and low-toxicity pesticides by their EPA Signal Word, which reflects acute toxicity. Acute toxicity does not necessarily indicate impacts from long-term exposure; as such, chemicals suspected of causing negative birth outcomes, such as organophosphates or atrazine, would be classified as low toxicity. Nevertheless, we consistently find effects of less than a 10% increase in adverse outcomes for individuals in the top 5% of exposure regardless of the timing or toxicity of exposure, even though which effects are statistically significant depends on the model.

Pesticide exposure has a highly skewed distribution in the San Joaquin Valley, where over half of births received no pesticide exposure, the top quarter received about 250 kg, and the top 5% received over 16 times that amount. Further, exposure at the top-25% level had virtually no detectable effect, whereas exposure at the top-1% level had effects that were up to double the magnitude of those observed for the top 5% of exposure. In other words, for most births there is no statistically identifiable impact of pesticide exposure on birth outcomes. Yet for individuals in the top 5 percent of exposure, pesticide exposure led to 5-9% increases in adverse outcomes. The magnitudes were further enlarged for the top 1%, where these extreme exposures led to an 11% increased probability of preterm birth, a 20% increased probability of low birth weight, and a ~30 g decrease in birth weight. For perspective, studies of other environmental conditions such as air pollution and extreme heat generally report a 5-10% increase in adverse birth outcomes, but from less extreme exposure. Similar magnitudes of effect are also observed for other, non-exposure conditions of pregnancy. For example, stress during pregnancy may increase the probability of low birth weight by ~6%, while enrollment in supplemental nutrition programs is estimated to reduce the probability of low birth weight by a similar amount. The significance of the negative effects of extreme pesticide exposure on birth outcomes is heightened by the fact that adverse birth outcomes are persistent and costly. Reducing their incidence has obvious benefits for individuals, but also for society.

Healthier babies require less intensive care as infants, have better long-term health and achieve more in terms of earnings and employment. Thus, even small reductions in adverse outcomes can economically offset societal investment in programs such as the supplemental nutrition programs offered to millions of low-income women. Because the negative outcomes are concentrated at the very highest pesticide exposures, policies and interventions that target the extreme right tail of the pesticide exposure distribution could largely eliminate the adverse birth outcomes associated with agricultural pesticide exposure documented in this study. As such, valuable and pressing future directions for research should focus on identifying the extreme pesticide users near human development and on the underlying causes of their extreme quantities of use. These insights are critical to designing appropriate and adaptive interventions for the population living nearby. For instance, crops vary dramatically in their average pesticide use. Commodities such as grapes receive nearly 50 kg ha−1 per year of insecticides alone in the San Joaquin Valley region, while other high-value crops such as pistachios receive barely one third of that amount. Within these broad differences, there are also relevant differences among crops with regard to the chemical composition and seasonal timing of pesticide applications. Finally, not all agricultural fields are in proximity to human settlement. Rather, as we illustrate, areas with consistent births and pesticide use are a small fraction of the San Joaquin Valley. Thus, if extreme pesticide areas and vulnerable populations could be identified, strategies or interventions could be developed to mitigate the likelihood of extreme exposures.

One further difficulty is isolating the roles of individual chemicals and their mixtures in driving the negative outcomes. Doing so is extremely challenging because many chemicals are used in conjunction or in close spatial or temporal windows. A large-scale data-driven approach could provide a starting point from which individual or community-based studies could be built. For example, statewide birth certificate data could enable the identification of potential hot spots of negative birth outcomes, while the Pesticide Use Reports provide a large sample of different pesticide mixtures. This could yield valuable information for targeting more detailed studies of individual exposures and difficult-to-observe outcomes toward the regions and months of highest concern.

There are some important limitations to our study. As with other environmental contaminants, controlled experiments evaluating the effects of pesticide exposure on birth outcomes are impossible due to clear ethical and legal constraints. This presents challenges both for interpretation and for estimation. With regard to interpretation, we cannot observe all individual adaptive responses to pesticide use, such as staying indoors to avoid exposure to pesticides. Further, we can only observe the effects on live births. As a result, our estimates reflect both the direct effect of exposure on live births and the mitigating effects of avoidance behaviors. With regard to identification and estimation, establishing causality without random assignment into pesticide exposure relies on quasi-experimental approaches, such as the panel data models used here with observational data.
While there is no way to formally test whether our methods have eliminated all sources of bias that preclude causal interpretation of the regression coefficients, our results are robust to multiple modeling approaches, including controlling for other environmental contaminants such as ambient concentrations of air pollutants and extreme temperatures. Similarly, we find no significant placebo effects of exposure in the 3 months following birth. Birth records do not fully capture adverse outcomes such as abnormalities that are difficult to observe at birth, nor are they comprehensive with regard to socio-demographics. Measurement error on the outcome variable would not bias our estimates of the effects of pesticide exposure unless it was somehow correlated with pesticide use, yet it could reduce our precision and thus the likelihood of finding statistical significance.

An understanding of the depth to the groundwater table is also needed

As is the case with any model, and with soil survey information in particular, ground-truthing at the field scale is necessary to verify results. We acknowledge limitations to our model. It does not consider proximity to a surface water source, which is an issue especially in areas that are irrigated solely from groundwater wells and are not connected to conveyance systems that supply surface water. The Soil Agricultural Groundwater Banking Index (SAGBI) also does not consider characteristics of the vadose zone or depth to groundwater. In arid regions, deep vadose zones may contain contaminants such as salts or agricultural pollutants that have accumulated over years of irrigation and incomplete leaching. These deep accumulations of contaminants could be flushed into the water table when excess water is applied during groundwater banking events. Furthermore, deep sediment likely contains hydraulically restrictive horizons that have not been documented, creating uncertainty as to where the water travels. Given these issues, SAGBI may be most useful when used in concert with water infrastructure models and hydrogeologic models — which generally do not incorporate soil survey information in a comprehensive way — to develop a fuller assessment of the processes and limitations involved in a potential groundwater banking effort.

Selenium received recognition as an environmental contaminant in the 1980s as a result of the unprecedented events at the Kesterson Reservoir in California, a national wildlife refuge at the time. Large amounts of this trace element had been mobilized through irrigation of selenium-rich soils in the western San Joaquin Valley, transported along with agricultural runoff, and accumulated at the Reservoir.

Toxic selenium concentrations brought about death and deformities for as much as 64% of the wild aquatic birds hatched at the reservoir, including both local and migratory species. Within a few years, the habitat of a variety of fish and waterfowl was classified as a toxic waste site. Today, the Reservoir's ponds are drained and covered beneath a layer of soil fill, yet the mechanisms of selenium release now known as "the Kesterson effect" are still a threat in California and around the world. The environmental and management conditions creating irrigation-induced selenium contamination have been characterized in Theresa Presser's seminal work. In brief, problems arise when seleniferous soils, such as those formed from Cretaceous marine sedimentary deposits along the western side of the San Joaquin basin, are subjected to irrigated agriculture. Salts, including selenium, naturally present in such soils are mobilized through irrigation, and high evaporation rates concentrate them in the root zone. To avoid negative effects on plant growth, subsurface drainage systems are used to export excess salts from the soil. This is particularly necessary in places where deep percolation is inhibited by a shallow impermeable layer. Such subsurface runoff routinely contains selenium in concentrations that exceed the US Environmental Protection Agency designation of toxic waste and thus poses an acute threat to aquatic ecosystems that receive it. The irrigation runoff feeding into the evaporation ponds of the Kesterson reservoir averaged 300 µg Se/L. The discovery of widespread deformities among waterfowl hatched near these ponds in 1983 led to a shift in the perception of selenium. While research had thus far been focused on farm-scale problems related to crop accumulation and toxicity to livestock, it became clear that excessive selenium concentrations in agricultural runoff were a watershed-scale resource protection issue that would greatly complicate irrigation management throughout the western United States. As a result, California has been a hot spot for global research and management of environmental selenium contamination.

As selenium load management in the San Joaquin basin has made significant progress, new major sites of concern, such as the San Francisco Bay-Delta and the Salton Sea, have emerged in California. Current regulatory standards for selenium as an aquatic contaminant are insufficient to protect sensitive ecosystems because they do not account for amplified exposure through bio-accumulation. There are many other pathways of anthropogenic selenium contamination – the San Francisco Bay-Delta, for example, receives half of its input from refineries. However, the diffuse agricultural sources are particularly hard to control, are the principal source of selenium in western US surface waters, and have shaped California's history like no other selenium source. This paper analyzes what can be learned from the last three decades of seleniferous drainage management and regulatory approaches developed in California. In particular, I seek to answer two key questions: 1) What were the greatest achievements and shortfalls of seleniferous drainage management in California? 2) To what extent may the current development of site-specific selenium water quality criteria for the San Francisco Bay and Delta serve as a model for future regulation?

Selenium is a naturally occurring trace element heterogeneously distributed across terrestrial and marine environments. On land, seleniferous soils and those marked by selenium deficiency sometimes occur as close as 20 km from one another. Selenium contamination of natural ecosystems is linked to an array of human activities including irrigated agriculture, mining and smelting of metal ores, as well as refining and combusting of fossil fuels. The bio-spheric enrichment factor, which is computed as the ratio of anthropogenic to estimated "natural" emissions of a substance, was found to be 17 for selenium, highlighting the dominance of the anthropogenic component in the modern selenium cycle. Anthropogenic fluxes are expected to keep increasing in the foreseeable future as energy and resource demands increase. Selenium bio-accumulates, with tissue concentrations in animals and plants typically 1–3 orders of magnitude above those found in water.
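Both ratios mentioned above can be written out explicitly; this is only a restatement of the definitions and magnitudes in the text, not additional data:

```latex
% Enrichment factor: anthropogenic over natural emissions (= 17 for Se).
% Bio-accumulation: tissue concentrations 1-3 orders of magnitude above water.
\[
  EF = \frac{E_{\text{anthropogenic}}}{E_{\text{natural}}} = 17,
  \qquad
  \frac{C_{\text{tissue}}}{C_{\text{water}}} \approx 10^{1}\text{--}10^{3}
\]
```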

Consequently, the predominant selenium uptake pathway for animals is through the consumption of food rather than water. Bio-accumulation and bio-magnification are particularly intense in aquatic ecosystems, and selenium contamination of such habitats is a global concern. In the western United States alone, nearly 400,000 km² of land are susceptible to irrigation-induced contamination by the same mechanisms that led to the demise of the Kesterson Reservoir. Other nations where irrigation-induced selenium contamination has been observed include Canada, Egypt, Israel, and Mexico. The environmental impacts of selenium depend on the element's chemical speciation. The element's primary dissolved forms, selenate and selenite, are mobile and bio-available. They can be sequestered in soils or sediments upon microbial reduction to solid elemental Se or metal selenides, or volatilized to the atmosphere upon reduction to gaseous methylated Se. Both selenate and selenite are toxic at elevated concentrations; selenite, however, was found to be more toxic than selenate in direct exposure studies involving invertebrates and fish, and also to bio-accumulate more readily at the base of aquatic food chains. Additionally, once any dissolved form of selenium is assimilated by an organism, it is converted into highly bio-available organo-selenide species. Exposure studies comparing organo-selenides to selenite in the diets of water birds established lower toxicity thresholds for the former. Organo-selenides are released from decaying organisms and organic matter during decomposition and can then persist in solution or be oxidized to selenite, while the conversion back to selenate does not occur at relevant rates in aquatic environments. Thus, recycling of selenium at the base of aquatic food webs through assimilation and decomposition usually leads to a buildup of the more bio-available and toxic forms over time. This buildup of bio-available selenium species may also explain why tissue concentrations in the upper trophic levels of stagnant or low-flowing ecosystems typically exceed those of fast-flowing ecosystems with comparable selenium inputs but shorter residence times. The complex environmental cycling of selenium has been a major obstacle in creating water quality regulations for this element. Regulatory concentration guidelines vary widely between jurisdictions, and there are significant opportunities for new regulatory approaches. The Californian office of the EPA is currently working on site-specific water quality criteria for the protection of wildlife in the San Francisco Bay and Delta. These criteria are to be based on a modeling approach developed by USGS scientists, capable of translating tissue limits to dissolved concentration limits. There is hope among aquatic toxicologists that California's new site-specific approach may become a model for national standards. For all contaminants regulated since 1985, aquatic life criteria under the Clean Water Act have been defined through separate dissolved concentration limits: a longer-term "continuous" limit and a short-term "maximum" limit.
The selenium criteria that were established in 1987 defined continuous concentration limits of 5 µg/L as acid-soluble selenium, with maximum concentrations not exceeding 20 µg/L more than once every three years for freshwater environments, but allowed up to 71 µg/L with up to one three-year exceedance of 300 µg/L for saltwater environments. These selenium limits became legally binding for 14 states, including California, after promulgation with the 1992 Water Quality Standards.

A central problem with the current criteria is that they were predominantly based on data drawn from direct exposure laboratory studies and thus failed to take into account the more ecologically relevant toxic effects due to bio-accumulation and trophic transfer. The freshwater criteria were based on field data from a contamination event, while the saltwater criteria were purely based on laboratory studies which did not account for bio-accumulation. The resulting difference of more than one order of magnitude between fresh- and saltwater criteria is not supported by field data. In fact, the saltwater criteria have widely been regarded as underprotective of wildlife, including waterfowl. In addition, the freshwater criteria appear underprotective of particularly sensitive ecosystems and species. To be protective of waterfowl in the wetlands of the Central Valley Region, a 2 µg Se/L monthly mean water quality criterion was deemed necessary by the Regional Water Quality Control Board, and this objective was officially approved for the region by the EPA in 1990. For the wetlands of the Central Valley Region, this criterion overrides the statewide criteria promulgated in 1992 and remains in effect today. However, given the wide range of bio-availability between different selenium species and the complex transfer processes between environmental compartments and trophic levels, regulation based solely on dissolved or acid-soluble concentrations has been characterized as inadequate. In response to such criticism, the EPA proposed in 2004 a new tissue-based criterion for selenium with a 7.91 µg/g fish tissue limit to supersede the previous national water quality guidelines for selenium. This limit is based on the lowest level of effect in juvenile bluegill sunfish under simulated overwintering conditions. Whereas there is little doubt that tissue concentrations are more representative of exposure than dissolved concentrations for individual species, it is unclear whether a single fish tissue limit will be protective across entire food webs including a diversity of fish and waterfowl. The proposed tissue-based criteria have to date remained at draft stage due to objection by the US Fish and Wildlife Service. The historic developments that led to the rise of selenium contamination in the San Joaquin Valley can be traced to the passage of the California Water Resources Development Act of 1960. The Act laid the financial foundation for the State Water Plan, providing for the construction of the nation's largest water distribution system and including infrastructure measures for "the removal of drainage water". The State Water Projects funded under this plan began delivering water to 4,000 km² in the southern San Joaquin Valley as of 1968. To prevent salinization and manage agricultural runoff, the Bureau of Reclamation constructed collector drains, a main drainage canal, and a regulating reservoir, Kesterson. Originally, the San Luis Drain was planned to deliver drainage out of the San Joaquin Valley all the way to the San Francisco Bay-Delta; however, the northern part of the drain was never completed. Instead, from the time of the San Luis Drain's completion in 1975 until its temporary closure in 1986, all runoff water channeled through the drain was delivered to the evaporation ponds of the Kesterson Reservoir, which had become part of a newly created national wildlife refuge in 1970.
There, in the early 1980s, high rates of embryo deformity and mortality, as well as large numbers of adult deaths among waterfowl, were identified as caused by the elevated selenium concentrations in the evaporation ponds. This led to the closure of the Reservoir to all runoff inputs in 1986.

Poor knowledge of winds at the field scale also represents a significant limitation

As field size increases, the length of time required to move a packet of air from one side of the field to the other will increase, decreasing the probability that wind speed and direction will remain relatively constant. Furthermore, as the moisture content increases downwind, this would decrease the vapor pressure deficit, potentially reducing rates of ET downwind. Another explanation, suggested by the fact that some crops showed a positive correlation between LST and slope, is that rather than advection of plant-transpired moisture downwind over individual fields, there is instead an accumulation of water vapor over the field. This idea will be explored further in Section 4.1.3. Second, we did not find positive correlations between GV fraction and water vapor slope as postulated in Hypothesis E. If green vegetation is transpiring and adding to the water vapor above a field, we would expect higher fractions of GV to contribute more water vapor, and thus increase the size of the gradient. We found no correlation between water vapor slope and the GV fraction, even when results were segmented by field size and GV fraction. We used 50% GV as the cutoff to demarcate sparsely vegetated fields from highly vegetated fields, consistent with previous studies. However, we found that the average fractional GV coverage of fields that showed good alignment between wind direction and water vapor directionality was around 45%. Therefore, future studies may want to consider a lower GV threshold or a segmentation of fields into multiple GV classes. Finally, we did not find an inverse correlation between water vapor slope and LST in support of Hypothesis G. Either no correlation was found, or the highest water vapor slopes were found with higher-temperature crops. Water vapor patterns were as expected at the field level, in response to wind.

However, water vapor patterns were not as expected in response to the surface properties of field size, GV fraction, and ET rate as expressed by field-scale LST. We had hypothesized that field-level water vapor slopes can be used to infer crop transpiration, but did not find evidence supporting that hypothesis. Rather, our results suggested that water vapor accumulation from transpiration was more dominant than the advection signal at the field level. The rate of ET has been found to remain constant with downwind distance across a field, even if warm, dry air is being advected toward a vegetated field. If plants are transpiring at a constant rate and winds are not strong enough or stable enough in directionality to evenly advect the moisture, the concentration of water vapor above the field would increase relatively evenly throughout the field, leading to a diminished slope. Crops are also more aerodynamically rough than an empty soil field, and the resultant turbulence caused by vegetation creates eddies and atmospheric mixing that may muddle signals of field-level advection discernible above smoother landscapes. The hypothesis of water vapor accumulation is supported by results that found a positive relationship between LST and slope for some crops, a negative relationship between field size and slope, and a weak positive correlation between water vapor intercept and GV fraction in 2013 and 2015. Therefore, the results of this study lead us to a new conceptual understanding that the magnitude of water vapor, as assessed through the intercept of a fitted plane, may be a better indicator of ET than the slope. However, underlying heterogeneity of the landscape and scaling issues, as discussed below, prohibited isolated analyses of intercepts in this study area. There is error within all water vapor estimates regardless of which retrieval method is used, and the estimates vary significantly from model to model. However, Ben-Dor et al. found that, of six different water vapor retrievals, ACORN estimated water content with acceptable accuracy and, importantly for our study, it was one of only two models that accurately discriminated water vapor from liquid water in plants.

Therefore, the positive correlations found in years 2013 and 2015 between water vapor and vegetation fraction are assumed to be a product of coupling between the landscape and the atmosphere, rather than an artifact of the retrieval. Wind direction and magnitude can change significantly within a small period of time, making estimations of wind within the study scene at the time of the flight particularly difficult. Furthermore, a sparse network of meteorological stations may not accurately capture more local variation in wind between the stations. Thus, the IDW wind field we used in this study may not adequately characterize fine spatial or temporal variability in winds at the field scale. Unlike Ogunjemiyo et al., who studied water vapor over a relatively homogeneous area of transpiring poplar trees, this study evaluated water vapor as it varies across a very diverse agricultural landscape with many different crop species, green vegetation covers, and irrigation regimes. As such, Ogunjemiyo's conceptual model illustrated an ideal relationship between water vapor and vegetation at the field scale that may not hold in our complex study area. First, interactions between water vapor occurring over two diverse, adjacent fields may alter the vapor pressure deficit and stomatal response of a single crop field and result in water vapor trends that do not follow Ogunjemiyo's model. The schematic in Fig 15A illustrates one possible interaction in which a transpiring field is upwind of a non-transpiring field. While the transpiring field will act as hypothesized, with the slope and direction of a fitted plane in line with the wind direction, a plane fitted to the fallow field downwind will likely show a slope that is opposite in direction to the wind. The wind carries moist air from the vegetated field onto the fallow field, leading the upwind edge of the fallow field to have higher water vapor concentrations than the edge that is downwind. In the case of the downwind area being another highly transpiring field, the moist, advected air from the upwind field may reduce the transpiration rate of the downwind field at the boundary by decreasing the vapor pressure deficit.

This may lead to an exaggerated water vapor slope over the downwind field. The accumulation of water vapor from one field can therefore lead to shifts in vegetation response that are difficult to account for. Fig 15C illustrates the scenario where a dry, fallow field is upwind of a transpiring field. If the area upwind of a vegetated field is fallow, we would expect the saturation deficit of the dry advecting air to increase the evaporation rate at the boundary unless the vapor pressure deficit is high enough to initialize stomatal closure. A higher ET rate at the upwind side of the field will lessen the expected, observable trend of advection across the field. The transpiration response will be species-dependent. Second, not all fields will interact with the atmosphere in the same ways, due to differences in aerodynamic roughness, which is affected by row spacing, plant height, plant size, orientation, and composition. The aerodynamic roughness of a field will influence how effectively and at what height the transpired water vapor will mix with the atmosphere. Agricultural fields may differ strongly in aerodynamic roughness, and these differences will lead to deviations from the hypothesized water vapor slope and intercept patterns as they vary with crop type. Therefore, we would not expect all fields to show the same relationships between water vapor, wind, and estimated transpiration rates. We would expect aerodynamically rougher surfaces, such as orchards, to generate greater turbulence, generate mixing higher up in the atmosphere, and show greater coupling with the wind than row crops. Depending on the wind speed, orchards may show higher or lower slopes than row crops if their vapor patterns are more tied to wind patterns. In contrast, shorter and smoother row crops such as alfalfa will be less coupled to the atmosphere. Because crops such as orchards are more closely coupled to the atmosphere, they may be more appropriate to study with water vapor imagery. Therefore, isolating the effects of neighboring fields would be beneficial for field-level water vapor analyses, but this was not logistically possible in our study. The study area is a high-producing agricultural area where most fields are bordered by multiple neighbors of varying GV cover, crop type, size, physical characteristics that influence roughness, and ET rate. Further, without LiDAR data from which physical characteristics such as orientation, height and structure could be obtained, it was not possible to model field-scale differences in aerodynamic roughness in this study. This work has aimed to enhance understanding of the impact of GV fraction, field size, crop type and water use on patterns of water vapor.
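Since the slope and intercept of a fitted plane carry much of the analysis in this section, a minimal sketch of fitting such a plane to per-pixel water vapor over one field may help; this is illustrative only, with synthetic data, and is not the study's processing code.

```python
# Fit wv = b0 + bx*x + by*y over a field's pixels by least squares;
# the intercept b0 and the gradient (bx, by) are the quantities discussed.
import numpy as np

def fit_vapor_plane(x, y, wv):
    """x, y: pixel coordinates (m); wv: column water vapor per pixel."""
    A = np.column_stack([np.ones_like(x), x, y])
    (b0, bx, by), *_ = np.linalg.lstsq(A, wv, rcond=None)
    slope = np.hypot(bx, by)                     # gradient magnitude
    angle = np.degrees(np.arctan2(by, bx))       # direction of steepest increase
    return b0, slope, angle                      # intercept, slope, direction

# Synthetic example: vapor rising toward +x, as if advected by wind
rng = np.random.default_rng(0)
x, y = rng.uniform(0, 500, 200), rng.uniform(0, 500, 200)
wv = 20 + 0.004 * x + rng.normal(0, 0.1, 200)
print(fit_vapor_plane(x, y, wv))
```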

Positive findings include the presence of significant vapor gradients over most fields, and regional patterns in water vapor that are consistent with advection. High water use crops also showed a disproportionately higher level of agreement between interpolated wind direction and the direction of water vapor gradients. Field size impacted water vapor slope, although slopes were higher in smaller fields than larger fields, in contrast to expectations. We suspect improved knowledge of winds at the field scale would improve our ability to interpret water vapor gradients. For example, given that a majority of the fields showed statistically significant water vapor slopes, an alternative hypothesis may be that those gradients better represent winds at the field scale than interpolated winds from a sparse network of stations. Finally, we found the intercept of the best-fit surface for water vapor over a field to be more significant than the slope, suggesting that water vapor is accumulating over fields, rather than advecting. Water vapor imagery shows patterns of vapor that are highly variable through space and time and that hold valuable information about land-atmosphere interactions. We suggest there is considerable potential for this imagery, and we explored some of that potential here. To further scientific understanding of water vapor imagery analysis, further studies are necessary to refine observation and quantification of land-surface interactions, as the signal is highly complex and is affected by many factors. While water vapor imagery could potentially be used to parameterize models of land-surface interactions, additional studies in a diversity of landscapes are necessary to define the conditions and scales at which this imagery can be used. Almost 4,000 AVIRIS images have been collected since 2006 and are available for public download. With such a large repository of data collected at different time points, under varied atmospheric conditions, and over diverse surfaces, future research could tease out the conditions under which interactions can best be observed in a more comprehensive way than this study of three snapshots in time could. Further, with future remote sensing missions such as SBG, which will collect hyperspectral imagery at moderate spatial resolutions and enable column water vapor estimates globally, these data streams can be exploited for comparisons of water vapor over large agricultural areas worldwide. These large archives of water vapor observations can also act as a complement to models that estimate water vapor and plant water use by providing validation data. In addition to increasing analysis of similarly complex scenes, future studies would benefit from additional data sources that could isolate the signal of water vapor and validate its link to the surface. Such controls include on-site continuous wind measurements, flux tower measurements of ET, and/or more spatially comprehensive wind data. On-site wind data and ET measurements at a high temporal resolution would both validate trends seen in the water vapor imagery and assist in pinpointing the appropriate temporal scale and time of day for which this analysis is best suited. A mesoscale weather model such as the Weather Research and Forecasting Model might also provide a more accurate fine-scale representation of wind fields than the simple IDW of weather stations used here.
A finer network of weather stations, and/or controlled experiments with meteorological equipment deployed in advance of a flight at specific fields, would also be of benefit. Although more work is needed to refine understanding of the water vapor signal in a complex agricultural environment, the results suggest that this technique could be of use for crop water analyses in agricultural areas that experience less variation in crop type, wind, and field size than the Central Valley of California.
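For readers unfamiliar with the inverse-distance-weighted interpolation referenced above, a minimal sketch follows; the station layout, values, and power parameter are illustrative assumptions, not the study's configuration.

```python
# IDW estimate of a wind component at a field location from sparse stations.
import numpy as np

def idw(xy_stations, values, xy_target, power=2.0):
    """xy_stations: (n, 2) coordinates; values: (n,) measurements.
    Interpolate u and v wind components separately, then recombine."""
    d = np.linalg.norm(xy_stations - xy_target, axis=1)
    if np.any(d == 0):                  # target coincides with a station
        return float(values[d == 0][0])
    w = 1.0 / d ** power                # nearer stations weigh more
    return float(np.sum(w * values) / np.sum(w))

# Example: u-component of wind at a field center from three stations
stations = np.array([[0.0, 0.0], [10_000.0, 0.0], [0.0, 10_000.0]])
u_wind = np.array([2.1, 3.4, 1.8])      # m/s
print(idw(stations, u_wind, np.array([4_000.0, 3_000.0])))
```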

Hypatia presents users with an easy-to-use interface that it makes available via any web browser

In the third column, we show results for training during sunny days followed by prediction during rainy periods. January 2nd, 3rd, and 4th were days without precipitation, followed by three days with 1.29, 1.06, and 1.0 inches of precipitation respectively. The results show that the model trained only on three rainy days had errors slightly higher than when tested on sunny days, while the model trained on sunny days behaved similarly to the models we discussed before, even when tested on rainy days. Part of our future work is to expand test cases to more variable weather conditions. However, these results indicate that the prediction errors are robust to what are essentially "shocks" to the temperature time series in the explanatory weather data and the predicted variables. Because the CPUs were in sealed containers, the effects of precipitation on the CPU series are less pronounced. Still, the errors are largely unaffected by precipitation. Figure 4.7 illustrates the errors when predicting DHT-1 temperature with different subsets of explanatory variables. We observe that if we only rely on the nearby weather station, the error is much higher than for a subset that includes at least one of the CPU temperatures. Farmers today often use only a weather station temperature reading when implementing manual frost prevention practices. Often, though, the weather station they choose to use for the outdoor temperature is even farther away from the target growing block than the station we use in this study. Notice, also, that when the CPU that is directly connected to the DHT is not included, the errors are higher than when it is included. Thus, as one might expect, proximity plays a role in determining the error. However, using only the attached CPU generates a higher MAE than all CPUs and the weather station. Indeed, the best performing model is the one that uses all four CPU temperatures and WU-T measurements as explanatory variables, yielding an MAE < 0.5°F across all time frames.
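A minimal sketch of this kind of multi-input "virtual sensor" regression is shown below; the column names and CSV file are hypothetical stand-ins for the deployment's data, not the dissertation's actual code.

```python
# Predict an outdoor (DHT) temperature from four CPU temperatures plus a
# weather-station series, and report MAE on a held-out period.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("sensor_timeseries.csv")          # hypothetical export, one row per timestep
X_cols = ["cpu1", "cpu2", "cpu3", "cpu4", "wu_t"]  # CPU temps + weather station

train, test = df.iloc[:-288], df.iloc[-288:]       # e.g. hold out a day of 5-minute samples
model = LinearRegression().fit(train[X_cols], train["dht1"])
mae = mean_absolute_error(test["dht1"], model.predict(test[X_cols]))
print(f"MAE: {mae:.2f} F")
```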

Thus using the nearest CPU improves accuracy, but using only the nearest CPU does not yield the most accurate prediction. Finally, while the weather station data does not generate an accurate prediction by itself, including it does improve the accuracy over leaving it out. In summary, our methodology is capable of automatically synthesizing a "virtual" temperature sensor from a set of CPU measurements and externally available weather data. By including all of the available temperature time series, it automatically "tunes" itself to generate the most accurate predictions even when one of the explanatory variables is, by itself, a poor predictor. These predictions are durable, with errors often at the threshold of measurement error on average, and relatively insensitive to seasonal and meteorological effects, as well as typical CPU loads in the frost-prevention setting where we have deployed it as part of an IoT system. There are no studies of which we are aware that use the devices themselves as thermometers. To enable this, we estimate the outdoor temperature from CPU temperatures via linear regression Hastie et al. of temperature time series. Others have shown that doing so is useful for other applications and analyses Guestrin et al., Xie et al., Lane et al., Yao et al. Our work is complementary to these and is unique in that it combines SSA with regression to improve prediction accuracy. As in other work, we leverage edge computing to facilitate low-latency response and actuation for IoT systems Alturki et al., Feng et al. With the prior chapters, we have contributed new methods for clustering correlated, multidimensional data and for synthesizing virtual sensors using the data produced from combinations of other sensors. We next unify these advances into a scalable, open-source, end-to-end system called Hypatia. We design Hypatia to permit multiple analytics algorithms to be "plugged in" and to simplify the implementation and deployment of a wide range of data science applications. Specifically, Hypatia is a distributed system that automatically deploys data analytics jobs across different cloud-like systems. Our goal with Hypatia is to provide low-latency, reliable, and actionable analytics, machine learning model selection, error analysis, data visualization, and scheduling in a unified scalable system. To enable this, Hypatia places this functionality "near" the sensing devices that generate data, at the edge of the network. It then automates the process of distributing the application execution across different computational tiers: "edge clouds" and public/private cloud systems.

Hypatia does so to reduce the response latency of applications so that data-driven decisions can be made by people and devices at the edge more quickly. Such edge decision making is important for a wide range of application domains including agriculture, smart cities, and home automation, where decisions, actuation, and control are all local and make use of information from the surrounding environment. Hypatia automatically deploys and scales tasks on-demand both locally and remotely, if/when there are insufficient resources at the edge. Users can choose the algorithms they need for data analysis and prediction and select the dataset they are interested in. Hypatia iterates through the list of available parameters, training and scoring multiple models for each parameter set. It then selects those with the best score. Such model selection can be used to provide data-driven decision support for users as well as to actuate and control digital and physical systems. In this chapter, we focus on Hypatia support for clustering and regression. The Hypatia scheduler automates distributed deployment across edge and cloud systems to minimize time to completion. It uses the computational and communication requirements of model training, testing, and inference to make placement decisions for independent jobs that comprise a workload. For data-intensive workloads, Hypatia prioritizes the use of the edge cloud. For compute-intensive jobs, Hypatia prioritizes public/private cloud use. Hypatia is an online platform for distributed cloud services that implement common data analytics utilities. It takes advantage of cloud-based, large-scale distributed computation, provides automatic scaling, and implements data management and user interfaces in support of visualization and browser-based user interaction. Hypatia currently supports two key building blocks for popular statistical analysis and machine learning applications: clustering and linear regression. For clustering, Hypatia implements different variants of k-means clustering. The variants include different distance computations, input data scaling, and the six combinations of covariance matrices. Hypatia runs the configuration for successive values of K ranging from 1 to a user-assigned large number, max_k. For each clustering, Hypatia computes a pair of scores based on both the Bayesian Information Criterion Schwarz and the Akaike Information Criterion Akaike. Hypatia allows the user to change the number of independent, randomly seeded runs to account for statistical variation. Finally, it provides ways for the user to graph and visualize both two-dimensional "slices" of all clusterings as well as the relative BIC and AIC scores.

It uses these scores to provide decision support for the user – e.g., presenting the user with the "best" clustering across all variants. For linear regression, Hypatia implements different approaches for analyzing correlated, multidimensional data Golubovic et al. Since we focus on synthesizing new sensors, we are looking for the most important inputs from other sensors that can be used to accurately estimate a synthesized measurement. Hypatia allows users to decide on the number of input variables and which ones to use. They can also specify the start time of the test, the duration of the training and testing periods, and the scoring metric to use. Users also choose whether or not to smooth the input data using different techniques. Finally, to predict outdoor temperature, users can select nearby single-board computers and/or weather stations. Once the user makes these choices or accepts/modifies the defaults, Hypatia creates an experiment with as many tasks as there are parameter choices. Each task produces a linear regression model with coefficients for each input variable and a score that can be used for model selection. As is done for clustering, Hypatia scores the various parameterizations using the scoring metric to provide decision support to users. The user can then use the visualization tools to verify the similarity between input variables and estimated sensor measurements. Hypatia is unique in that it is extensible – different data analytics algorithms can be "plugged in" easily, and automatically deployed with and compared to others. Users can also extend the platform with both scoring and visualization tools. Visualization is particularly important when some of the sensors are faulty and unreliable, or some of the smoothing or filtering techniques do not produce the desired outcome. Figure 5.1 shows such an example, where visualization is used to show growers how soil moisture responds to precipitation and temperature on the east and west sides of a tree in an almond grove at depths of 1 foot and 2 feet. Being able to understand how significant each parameter is to soil moisture provides decision support that can be used to guide irrigation and harvest. To implement Hypatia, we have developed a user-facing web service and a distributed, cloud-enabled backend. Users upload their datasets to the web service front end as files in a common, simple format: comma-separated values (CSV). The user interface also enables users to modify the various algorithms and their parameters, or accept the defaults.

Hypatia considers each parameterization that the user chooses as a "job". Each job consists of multiple tasks that Hypatia deploys. Users can also use the service to check the status of a job or to view the report and results for a job. The status page provides an overview of all the tasks for a job, showing a progress bar for the percentage of tasks completed and a table showing task parameters and outcomes. Hypatia uses a report page to provide its recommendation for both analysis building blocks, clustering and regression. For clustering, the recommendation consists of the number of clusters and the k-means variant that produces the best BIC score. This page also shows the cluster assignments, spatial plots using longitude and latitude, and BIC and AIC score plots. Hypatia also provides cluster labels in CSV files that the user can download. For regression, the report page consists of a map of error analysis for each model, grouped by their parameters. Users can quickly navigate to the model with the smallest error. The software architecture of Hypatia is shown in Figure 5.2. We implement Hypatia using Python v3.6 and integrate a number of open-source software packages and cloud services. At the edge, Hypatia uses a small private cloud that runs Eucalyptus software v4.4 Nurmi et al., Aristotle. The public cloud is Amazon Web Services Elastic Compute Cloud. Hypatia integrates virtual servers from these two cloud systems with different capabilities, which we describe in our empirical methodology. Hypatia is deployed on an edge cloud and a private/public cloud if available. We assume that the edge cloud has limited resources and is located near where data is produced by sensors. The public cloud provides vast resources and is located across a long-haul network with varying performance and perhaps intermittent connectivity. We use N_EC to denote the number of machines available in the edge cloud and N_PC to denote the number of machines available in the public cloud, where N_EC << N_PC. Users submit multiple jobs to the edge system. Each job describes the datasets to be used for training, testing, and inference or analysis. In some jobs we can assume that the entire dataset is needed, while in others we can assume that data can be split and tasks within the job can operate on different parts of the dataset in parallel. Each job has n tasks. In the numerous jobs that we have evaluated over the course of this dissertation, we have observed that, for the applications we have studied, n can range from tens of tasks to millions of tasks. We consider tasks from the same job as having the same "type". To estimate the time each task will take to complete the data transfer and computation, we compute an average t_i per job i across past tasks of the same type. Each task fetches its dataset upon invocation.
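A sketch of this placement heuristic follows; the names are invented and the logic is simplified relative to the real scheduler.

```python
# Hypothetical edge-vs-cloud placement using the per-type average task
# time t_i described above; not Hypatia's actual scheduler.
from statistics import mean

def estimate_t_i(history, job_type):
    """Average duration of past tasks of the same type (t_i)."""
    return mean(history.get(job_type, [1.0]))

def place_job(job, edge_free, cloud_free, history):
    t_i = estimate_t_i(history, job["type"])
    # Data-intensive jobs prefer the edge cloud (data locality); others
    # prefer the far larger public cloud. Overflow spills to the other tier.
    preferred = "edge" if job["data_intensive"] else "cloud"
    if preferred == "edge" and edge_free == 0:
        preferred = "cloud"
    elif preferred == "cloud" and cloud_free == 0:
        preferred = "edge"
    return preferred, t_i

# Example: a data-intensive job with past task durations of ~2 seconds
print(place_job({"type": "regression", "data_intensive": True},
                edge_free=3, cloud_free=100,
                history={"regression": [1.9, 2.1, 2.0]}))
```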

The Centaurus implementation consists of a user-facing web service and distributed cloud-enabled backend

Precision farming integrates cyber infrastructure and computational data analysis to overcome the challenges associated with extracting useful information and actionable insights from the vast amount of information that surrounds the crop life cycle. Precision ag attempts to help growers answer key questions about irrigation and drainage, plant disease, insect and pest control, fertilization, crop rotation, soil health, weather protection, and crop production. Existing precision ag solutions include sensor-software systems for irrigation, mapping, and image capture/processing, intelligent implements, and, more recently, public cloud software-as-a-service solutions that provide visualization and analysis of farm data over time OnFarm, Climate Corporation, MyAgCentral, gThrive, WatrHub, PowWow. Current precision ag technologies fall short in three key ways that have severely limited their impact and widespread use: first, they fail to provide growers with control over the privacy of their data, and second, they lock growers into proprietary, closed, inflexible, and potentially costly technologies and methodologies. In terms of data privacy, extant solutions require that farmers relinquish control over and ownership of their most valuable asset: their data. Farm data reveals private and personal information about grower practices, crop inputs, farm implement use, purchasing and sales details, water use, disease problems, etc., that define a grower's business and competitiveness. Revealing such information to vendors in exchange for the ability to visualize it puts farmers at significant risk Federation, Russo, Vogt. The second limitation of extant precision ag solutions is "lock-in". Lock-in is a well-known business strategy in which vendors seek to create barriers to exit for their customers as a way of ensuring revenue from continued use, new or related products, or add-ons in the future.

In the precision ag sector, this manifests as proprietary, closed, and fragmented solutions that preclude advances in sustainable agriculture science and engineering by anyone other than the companies themselves. Lock-in also manifests as a lack of support for cross-vendor technologies, including observation and sensing devices, farm implements, and data management and analysis applications. Since farmers face many challenges switching vendors once they choose one, the one they choose can charge fees for training, customizations, add-ons, and use of their online resources without limit because of the lack of competition. The third limitation is that most precision ag solutions today employ the centralized approach described above. As solutions become increasingly on-line, the lock-in also requires that farmers upload all of their data to the cloud, giving vendors full control and access, and leaving growers without recourse when vendors go out of business Rodrigues. In addition to these risks, such network communication of potentially terabytes of image and sensor data is expensive and time consuming for many because of the poor network connectivity and costly data rates that are typical of rural areas. Finally, many of these technologies impose high premiums and yearly subscriptions ArcGIS. The goal of our work is to address these limitations and to provide a scalable data analytics platform that facilitates open and scalable precision ag advances. To enable this, we leverage recent advances in the Internet of Things, cloud computing, and data analytics, and extend them to contribute new research that defines a software architecture that tailors each to agricultural settings, applications, and sustainability science. These constituent technologies cannot be used off-the-shelf, however, because they require significant expertise and staffing to set up, manage, and maintain – which are show stoppers for today's growers. We attempt to overcome these challenges with a comprehensive, end-to-end system for scalable agriculture analytics that is open source and that can run anywhere, precluding lock-in. To enable this, we contribute new advances in scalable analytics, low-cost sensing, easy-to-use data visualization, data-driven decision support, and automatic edge-cloud scheduling, all within a single, unified distributed platform. In the next chapter, we begin by focusing on an important analytics building block and tailoring its use for farm management zone identification using soil electrical conductivity data.

Statistical clustering, also known as the separation of measurements into related groups, is a key requirement for solving many analytics problems. Lloyd's algorithm Lloyd, commonly called k-means, is one of the most widely used approaches Duda et al. K-means is an unsupervised learning algorithm, requiring no training or labeling, that partitions data into K clusters based on their "distance" from K centers in a multi-dimensional space. Its basic form is simple to implement and has become an indispensable component of pattern recognition, data mining, image processing, information retrieval, and recommendation applications across fields ranging from marketing and advertising to astronomy and agriculture. While conceptually simple, there is a myriad of k-means algorithm variants based on how distances are calculated in the problem space. Some k-means implementations also require "hyperparameters" that control for the amount of statistical variation in clustering solutions. Identifying which algorithm variant and set of implementation parameters to use in a given analytics setting is often challenging and error-prone for novices and experts alike. In this chapter, we present Centaurus as an approach to simplifying the application of k-means through the use of cloud computing. Centaurus is a web-accessible, cloud-hosted service that automatically deploys and executes multiple k-means variants concurrently, producing multiple models. It then scores the models to select the one that best fits the data – a process known as model selection. It also allows for experimentation with different hyperparameters and provides a set of data and diagnostic visualizations so that users can best interpret its results. From a systems perspective, Centaurus defines a pluggable framework into which clustering algorithms and k-means variants can be added. When users upload their data, Centaurus executes and automatically scales the execution of concurrently executing k-means variants using public or private cloud resources. To perform model selection, Centaurus employs a scoring component based on information criteria. Centaurus computes a score for each result and provides a recommendation of the best clustering to the user. Users can also employ Centaurus to visualize their data, its clusterings, and scores, and to experiment with different parameterizations of the system.
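In the spirit of what Centaurus automates, and only as a sketch (not its actual code), information-criterion model selection over k-means runs might look like the following, using a simple spherical-Gaussian BIC:

```python
# Run k-means for each K, score each clustering with BIC, keep the best.
import numpy as np
from sklearn.cluster import KMeans

def bic_score(X, labels, centers):
    """BIC for a spherical-Gaussian view of a clustering (lower is better)."""
    n, d = X.shape
    k = len(centers)
    rss = ((X - centers[labels]) ** 2).sum()        # within-cluster squared error
    sigma2 = rss / (n * d)                          # shared variance estimate
    loglik = -0.5 * n * d * (np.log(2 * np.pi * sigma2) + 1)
    n_params = k * d + 1                            # centers plus one variance
    return n_params * np.log(n) - 2 * loglik

X = np.random.default_rng(1).normal(size=(300, 2))
models = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
          for k in range(1, 8)]
best = min(models, key=lambda m: bic_score(X, m.labels_, m.cluster_centers_))
print(best.n_clusters)
```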

We implement Centaurus using production-quality, open-source software and validate it using synthetic datasets with known clusters. We also apply Centaurus in the context of a real-world agricultural analytics application and compare its results to the industry-standard clustering approach. The application analyzes fine-grained soil electrical conductivity (EC) measurements, GPS coordinates, and elevation data from a field to produce a "map" of differing soil zones. These zones can then be used by farmers and farm consultants to customize the management of different zones on the farm Fridgen et al., Moral et al., Fortes et al., Corwin & Lesch. We compare Centaurus to the state-of-the-art clustering tool for farm management zone identification and show that Centaurus is more robust, obtains more accurate clusters, and requires significantly less input and effort from its users. In the sections that follow, we provide some background on the use of EC for agricultural zone management. We then describe the general form of the k-means algorithm, the variants for computing covariance matrices, and the scoring method that Centaurus employs. Following this, we present our datasets, an empirical evaluation of Centaurus, research related to Centaurus, and a summary of our contributions. The soil health of a field can vary significantly and change over time due to human activity and forces of nature. To optimize yields, farmers increasingly rely on site-specific farming, in which a field is divided into contiguous regions, called zones, with similar soil properties. Agronomic strategies are then tailored to specific zones to apply inputs precisely, to lower costs and input use, and to ultimately increase yields. Management zone boundaries can be determined with many different procedures: soil surveys with or without other measurements Bell et al., Kitchen et al.; spatial distribution estimates of soil properties obtained by interpolating soil sample data Mausbach et al., Wollenhaupt et al.; fine-grain soil electrical conductivity measurements Mulla et al., Jaynes et al., Sudduth, Rhoades et al., Sudduth et al., Corwin & Lesch, Veris; and a combination of sensing technologies Adamchuk et al. EC-based zone identification is widely used because it addresses many of the limitations of the other approaches: it is inexpensive, it can be repeated over time to capture changes, and it produces useful and accurate estimates of many yield-limiting soil properties including compaction, water holding capacity, and chemical composition.

As a result, EC-based management tools are used extensively for a wide variety of field plants Peeters et al., Aggelopooulou et al., Gili et al. To collect EC data, EC sensors are typically attached to a GPS-equipped tractor or all-terrain vehicle and pulled across a field to collect measurements at multiple depths and at a very fine spatial grain. EC maps generated from this data can either be used to directly define management zones or to inform future, more extensive soil sampling locations Veris, Lund et al. Alternatively, EC values can be clustered into related regions using fast, automated, unsupervised statistical clustering techniques and their variants Bezdek, Murphy, Fridgen et al., Molin & Castro, Fraisse et al. Given the potential and widespread use of EC-based zone identification tools that rely on automated unsupervised algorithms, in this chapter we investigate the impact of using different k-means implementations and deployment strategies for EC-based management zone identification. We consider different algorithm variants, different numbers of randomized runs, and the frequency of degenerate runs – algorithm solutions which are statistically questionable because they include empty clusters, clusters with too few data points, or clusters that share the same cluster center Brimberg & Mladenovic. To compare k-means solutions, we define a model selection framework that uses the Bayesian Information Criterion Schwarz to score and select the best model. Past work has used BIC to score models for the univariate normal distribution Pelleg et al. Our work extends this use to multivariate distributions and multiple k-means variants. The k-means algorithm attempts to find a set of cluster centers that describe the distribution of the points in the dataset by minimizing the sum of the squared distances between each point and its cluster center. For a given number of clusters K, it first assigns the cluster centers by randomly selecting K points from the dataset. It then alternates between assigning points to the cluster represented by the nearest center and recomputing the centers Lloyd, Bishop, while decreasing the overall sum of squared distances Linde et al. The sum of squared distances between data points and their assigned cluster centers provides a way to compare local optima – the lower the sum of the distances, the closer to a global optimum a specific clustering is. Note that it is possible to use distance metrics other than Euclidean distance to capture per-cluster differences in variance, or covariance between data features. Thus, for a given data set, the algorithm can generate a number of different k-means clusterings – one for each combination of starting centers, distance metric, and method used to compute the covariance matrix. Centaurus integrates both Euclidean and Mahalanobis distance. The computation of Mahalanobis distance requires computation of a covariance matrix for the dataset. In addition, each of these approaches for computing the covariance matrix can be Tied or Untied. Tied means that we compute a covariance matrix per cluster, take the average across all clusters, and then use the averaged covariance matrix to compute distance. Untied means that we compute a separate covariance matrix for each cluster, which we use to compute distance.
Using a tied set of covariance matrices assumes that the covariance among dimensions is the same across all clusters, and that the variation in the observed covariance matrices is due to sampling variation. Using an untied set of covariance matrices assumes that each cluster is different in terms of its covariance between dimensions. Users upload their datasets to the web service frontend as files in a simple format: comma-separated values (CSV).
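To make the tied/untied distinction concrete, here is an illustrative sketch (not Centaurus's implementation) of computing the two sets of covariance matrices and the resulting Mahalanobis distances:

```python
# Mahalanobis distance under tied vs. untied covariance matrices.
import numpy as np

def mahalanobis(x, center, cov):
    """Mahalanobis distance from point x to a cluster center."""
    diff = x - center
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

def cluster_covariances(X, labels, k, tied=False):
    """Per-cluster covariance matrices; averaged across clusters if tied."""
    covs = [np.cov(X[labels == j], rowvar=False) for j in range(k)]
    if tied:
        avg = sum(covs) / k         # one shared (averaged) matrix
        covs = [avg] * k
    return covs

# Usage: distance of point x to cluster j's center under each assumption
# covs = cluster_covariances(X, labels, k, tied=True)   # or tied=False
# d = mahalanobis(x, centers[j], covs[j])
```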

A major limitation of this study is that the reference dataset was not entirely accurate

The plotted detections in Figure 10 and Figure 12 are counted in the overall AP and AR metrics as successful detections because the IoU of the detection and reference label is greater than 50%. Yet many pixels that belong to individual center pivot fields are not included in the detection. Across similar scenes, Mask R-CNN does not appear to have the consistent boundary eccentricity bias that FCIS has. Because Mask R-CNN showed better boundary accuracy along with comparably high performance metrics relative to the FCIS model, further comparisons and visual examples comparing accuracy across different field size ranges and with different training dataset sizes used Mask R-CNN instead of FCIS. Many scenes are more complex than the arid landscape with fully cultivated center pivots shaped like full circles in Figure 10. Figure 11 is a representative example of a scene with more complex fields. These include non-center-pivot fields; center pivots containing two halves, quarters, or other fractional portions in different stages of development; and partial pivots, which are semicircular not because the rest of the circle is in a different stage of development or cultivation, but because of another landscape feature that restricts the field's boundary. In this scene, at least 25% of the detections are below the 90% confidence threshold, and many atypical pivots are missed based on this threshold. Figure 12 is a simpler scene that, like Figure 10, has a high density of center pivot fields. In this case the detections more closely match the reference labels and detection confidence scores are higher, either because of the FCIS model's tendency to produce higher confidence scores or because the scene has less variation in center pivot field type. Figure 13 highlights another common issue when testing both models: reference labels are truncated by the tiling grid used to make 128 by 128 sized samples from Landsat 5 scenes.

These tend to be missed detections based on the 90% confidence threshold, since they are not mapped with a high confidence score. There are cases where no high confidence detections above 90% are produced, such as in Figure 14. In this scene, No Data values in the Landsat 5 imagery, partial pivots near non-center-pivot fields, and mixed pivots with indistinct boundaries all result in a scene that is not mapped with high confidence. However, in cases where there is high contrast between center pivot fields and their surrounding environment, they are mapped nearly perfectly by the Mask R-CNN model with high confidence scores. Figures 16 through 18 were selected to illustrate the impact that size range has on detection accuracy, since center pivots can come in various semicircular shapes and sizes. Figure 16 shows that in a scene with no reference labels, no high confidence detections were produced for any size category. The highest confidence score associated with an erroneous detection in this case was at most ~0.66, which is relatively low for both models. Figure 17 shows a case where a large center pivot is mapped accurately, whereas smaller and medium center pivots in the scene are not. This example shows the scale invariance of the Mask R-CNN model in that it can accurately map the large center pivot because it looks similar to a medium-sized center pivot, only larger. On the other hand, smaller center pivots, partial pivots, and mixed pivots are detected with lower confidences or not detected. Figure 18 highlights a case where a large center pivot is not mapped with a high confidence score above 90%. Unlike Figure 17, where the large center pivot is uniform in appearance, the large center pivot in Figure 18 has a mixture of three different land cover types. This indicates that the inaccuracies in large center pivot detection may come from large center pivots that were partially cultivated or divided into multiple portions with different crop types or cultivation stages. Many false negatives in this scene and others are the result of partial pivots, mixed pivots, or pivots that had not yet been annotated in the 2005 dataset.
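The IoU-over-50% matching rule used for the AP and AR metrics above can be sketched as follows; the masks here are synthetic, and this is not the study's evaluation code. It also illustrates how a detection can count as a true positive while missing many pivot pixels.

```python
# Intersection over union for binary instance masks, with a 0.5 threshold.
import numpy as np

def mask_iou(pred, ref):
    """IoU of two boolean masks of equal shape."""
    inter = np.logical_and(pred, ref).sum()
    union = np.logical_or(pred, ref).sum()
    return inter / union if union else 0.0

def is_true_positive(pred, ref, threshold=0.5):
    return mask_iou(pred, ref) >= threshold

# A detection covering only part of a pivot can still pass the threshold,
# which is why boundary quality must be inspected separately.
pred = np.zeros((10, 10), bool); pred[2:8, 2:6] = True
ref = np.zeros((10, 10), bool);  ref[2:8, 2:8] = True
print(mask_iou(pred, ref), is_true_positive(pred, ref))  # ~0.67, True
```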

Small fields are more difficult to detect than large ones, so, as expected, removing 50% of the training data available to train the Mask R-CNN model caused large drops in performance. Having more training data available to improve features that are attuned to detect small fields is particularly important with regard to overall model performance. The metric results for small fields are likely biased toward a worse result because many full pivots overlapped a sample image boundary, leading to small, partial areas of pivot irrigation at scene edges being over-represented after Landsat scenes were tiled into 128×128 image chips. Since these fields have a less distinctive shape, some full pivots at scene edges were missed. However, since small fields make up a minority of the total population of fields and the medium and large categories were more accurate by 20 or more percentage points for both AR and AP, this shows that both the FCIS and Mask R-CNN models can map a substantial majority of pivots with greater than 50% intersection over union. Zhang et al. tested their model on the same geographic locations in a different year and used samples produced from two Landsat scenes to train their model over a 21,000 km² area, versus the Nebraska dataset, which spans 200,520 km². These results extend upon the work by Zhang et al., as the test set is geographically independent from both the training and validation sets and 32 Landsat 5 scenes across a large geographic area were used to train and test the model. Furthermore, Zhang et al.'s approach produces bounding boxes for each field, while Mask R-CNN produces instance segmentations that can be used to count fields and identify pixels belonging to individual fields. While comparing metrics is useful, they do not indicate how performance varies across different landscapes or how well the boundary quality matches the reference, given that a detection is determined to be correct by having an IoU over 50%. While the FCIS model slightly outperformed the Mask R-CNN model in terms of the medium size category average precision, it also exhibited poorer boundary fidelity. Figure 10 demonstrates arbitrarily boxy boundaries that appear to be truncated by the position-sensitive score map voting process that is the final stage of the FCIS model.

The Mask R-CNN model's high confidence detections, those with a confidence score of 0.9 or higher, matched the reference boundaries much more closely, showing that this model can be usefully applied to delineate center pivot agriculture in a complex, humid landscape. While the FCIS model could also be employed with post-processing that assigns a perfect circle to the center of each FCIS detection, this would introduce further errors by overestimating the size and misrepresenting the shape of partial center pivots. The results from Mask R-CNN on medium-sized fields are encouraging because they indicate that the model could potentially generalize well to semi-arid and arid regions outside of Nebraska. In addition, the results from Figures 10, 12, and 15 indicate that where many uniform center pivots are densely colocated and few non-center pivots are present, a higher proportion will be mapped correctly. This is encouraging, since in many parts of the world center pivots are densely colocated or are cultivated in semi-arid or arid environments, where contrast is high. The model can therefore be expected to generalize well outside of Nebraska, though testing it in other regions remains future work. False negatives are present in many heavily cultivated scenes, and in many cases it is ambiguous whether the absence of a high confidence detection reflects the absence of a center pivot or Landsat's inability to resolve fuzzy boundaries between a field and its surrounding environment. A time-series approach similar to Deines et al. could improve detections so that only pivots exhibiting a pattern of increased greenness would be detected in a given year. However, this requires multiple images within a growing season from a Landsat sensor, which are not always available due to clouds, and it is difficult to incorporate into a CNN-based segmentation method because it precludes the use of pretrained networks, which ease the computational burden of training and detection.
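The circle-assignment post-processing mentioned above is not specified in detail in the text; one plausible implementation, sketched here with OpenCV's minimum enclosing circle, shows why a partial pivot would be over-filled. The function name and the synthetic half-pivot mask are assumptions for illustration only.

```python
import cv2  # assumes OpenCV >= 4 for findContours' two return values
import numpy as np

def circle_from_mask(mask: np.ndarray) -> np.ndarray:
    """Replace each detected blob with a filled minimum enclosing circle.

    One plausible reading of the circle-assignment idea: as the text
    notes, a partial (semicircular) pivot then gets inflated to a full
    disk, overestimating its area and misstating its shape.
    """
    contours, _ = cv2.findContours(
        mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )
    out = np.zeros(mask.shape, dtype=np.uint8)
    for contour in contours:
        (cx, cy), radius = cv2.minEnclosingCircle(contour)
        cv2.circle(out, (int(round(cx)), int(round(cy))), int(round(radius)), 1, -1)
    return out.astype(bool)

# Synthetic half pivot on a 128x128 chip: the circled version roughly
# doubles the mapped area.
yy, xx = np.mgrid[:128, :128]
half_pivot = ((yy - 64) ** 2 + (xx - 64) ** 2 <= 40 ** 2) & (xx >= 64)
circled = circle_from_mask(half_pivot)
print(int(half_pivot.sum()), int(circled.sum()))
```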

Another alternative is to develop higher quality annotated datasets that make meaningful semantic distinctions between agriculture in different stages of cultivation. For example, in Figure 11, the brown center pivots that are not detected could instead be labeled "fallow" or "uncultivated", and this information could be used to refine the training samples used to train a model to segment pivots within specific cultivation stages. With 4 hours of training on 8 GPUs, the original implementation of Mask R-CNN achieved 37.1 AP using a ResNet-101-FPN backbone on the COCO dataset, a large improvement over the best FCIS model tested, which achieved 33.6 AP. This amounts to a difference of 3.5 AP percentage points, with Mask R-CNN performing better on the COCO dataset. On the Nebraska dataset, for the medium size category, the difference was 3.2 AP percentage points, with the FCIS model outperforming Mask R-CNN; however, Mask R-CNN outperformed FCIS in AR by 5.1 percentage points. These results indicate that COCO detection baselines are not necessarily reflective of overall metric performance, given that FCIS outperformed Mask R-CNN in the more numerous size category. The improvements on the COCO baseline do, however, reflect the improved boundary accuracy of Mask R-CNN relative to the FCIS model. The AR and AP results on the Nebraska center pivot dataset are higher than those of Rieke, which is to be expected since center pivots are a simpler detection target than the fields in the Denmark dataset, which come in a greater variety of shapes and sizes. What is especially notable is that even though Rieke trained the FCIS model on approximately 11 times the training data used in this study, the AP and AR results on the Nebraska dataset were about 10 to 20 points higher for each size category: the AP difference was 0.42 - 0.28 = 0.14 for the small category, 0.732 - 0.473 = 0.259 for the medium category, and 0.734 - 0.51 = 0.224 for the large category. Rieke used 159,042 samples compared to the 13,625 samples used in this study, with samples equivalently sized at 128×128 pixels. Even though the size categories used in the two studies are not exactly the same, the fact that every category saw substantially better performance for the FCIS model on the Nebraska dataset, despite an order of magnitude less training data, indicates that the relative simplicity of the center pivot detection target played the dominant role in the jump in performance. This is an important lesson for remote sensing researchers looking to use CNN techniques to map fields or other land cover objects: the feasibility of the detection target can matter more than an order of magnitude more training data for a model's ability to generalize. These results are also comparable to those achieved for other detection targets. Wen et al. applied a slightly modified version of Mask R-CNN that produces rotated bounding boxes to segment building footprints from Google Earth imagery in Fujian Province, China. The model was trained on manually labeled annotations across a range of scenes containing buildings with different shapes, materials, and arrangements.
Though an independent test set separate from the validation set was not used, the model was tested on half of the imagery collected, with the other half used for training, providing a large number of samples on which to test the model. The total dataset split between training and testing/validation amounted to 2,000 500×500-pixel images containing 84,366 buildings.

Citizens of Member States do not have standing to bring WTO-based complaints

As of July 2012, the GENERA database listed 583 scientific studies on the safety of GMO crops and their food ingredients. In addition, the experiential evidence of billions of meals consumed around the world since the commercial release of genetically-engineered crops in 1996 supports the safety of genetically-modified foods. Since 1996, there has not been one verified health complaint to humans, animals, or plants from genetically-engineered crops, raw foods, or processed foods. Despite some published attempts to deny this overwhelming scientific evidence, the scientific consensus is clear: genetically-engineered crops, foods, and processed ingredients do not present health and safety concerns for humans, animals, or plants. SPS Agreement Article 3 sets forth provisions that could save Proposition 37. Paragraph 3.2 affirms an SPS measure that conforms to international standards relating to health and safety. However, Paragraph 3.2 does not protect Proposition 37 because there are no international standards that categorize genetically-engineered raw or processed foods as unsafe or unhealthy. Comparing Proposition 37 to the legal standards in the SPS Agreement shows that Proposition 37 almost assuredly does not comply with the SPS Agreement. Indeed, the WTO SPS claim against Proposition 37 is so strong that its proponents probably will not defend it as meeting the legal standards of the SPS Agreement. Despite its textual language and the electoral advertising emphasizing food safety and health concerns, proponents will argue that Proposition 37 cannot properly be characterized as a labeling requirement "directly related to food safety." Proponents of Proposition 37 will seek to have it classified as a technical barrier to trade in order to avoid the SPS Agreement and its scientific evidence standards.

The TBT Agreement applies to technical regulations, including "marking or labelling requirements as they apply to a product, process or production method." Because Proposition 37 imposes mandatory labels, it is a technical regulation under the TBT definitions. TBT Article 2 sets forth several provisions against which to measure technical regulations for compliance with the TBT Agreement. It states, "Members shall ensure that technical regulations are not prepared, adopted or applied with a view to or with the effect of creating unnecessary obstacles to international trade. For this purpose, technical regulations shall not be more trade-restrictive than necessary to fulfill a legitimate objective, taking account of the risks non-fulfillment would create. Such legitimate objectives are, inter alia, … the prevention of deceptive practices; protection of human health or safety, animal or plant life or health, or the environment. …" Article 2.2 expressly lists three legitimate objectives: national security requirements; protection of human health or safety, animal or plant life or health, or the environment; and prevention of deceptive practices. As for health and safety, Proposition 37 does not provide a label giving consumers information about how to use a product safely, a safe consumption level, or any other health and safety data—unless the warning-style label against genetically-modified food is itself considered a valid warning. But, as discussed with regard to the SPS Agreement, no scientific evidence indicates that genetically-modified foods have negative health or safety implications for humans, animals, or the environment. Proposition 37 therefore does not assert a legitimate health and safety objective under TBT Article 2.2. Proposition 37 can, however, be defended as upholding the third legitimate objective—prevention of deceptive practices. Indeed, the Proposition is titled the "California Right to Know Genetically Engineered Food Act," indicating that labels will assist California consumers in knowing what they are purchasing and in avoiding purchases they wish to avoid. Those who would challenge Proposition 37 for noncompliance with TBT Article 2.2 will argue that it is not a protection against deceptive practices. Opponents can point to the structure of the proposed Act and its exemptions as evidence that Proposition 37 will confuse consumers more than it informs them.

Proposition 37 exempts foods that lawfully carry the USDA Organic label. Under the USDA National Organic Program, organic foods can contain traces of unintentional genetically-modified crops or ingredients without losing the organic label. Consequently, California consumers will still be eating unlabeled food products containing genetically-modified crops or ingredients at trace levels, except those products will carry the label "USDA Organic." In other words, opponents of Proposition 37 will argue that Proposition 37 is itself the deceptive labeling practice and thus fails to promote a legitimate objective under TBT Article 2.2. Proponents of Proposition 37 will respond by citing the recent WTO Dispute Resolution Appellate Body report on the challenge by Canada and Mexico against the United States country-of-origin labeling (COOL) requirement for meat. The WTO Panel ruled against COOL on the grounds that it violated TBT Article 2.2 because the COOL law would confuse consumers, but the WTO Appellate Body reversed this ruling and determined that COOL did provide information as a legitimate objective under Article 2.2. Aside from "legitimate objectives," TBT Article 2.2 also requires that technical regulations not be "unnecessary obstacles to international trade" and "not more trade-restrictive than necessary." Opponents of Proposition 37 will argue that it violates these TBT obligations primarily because consumers already have labels that provide the same level of protection from deception. Opponents will point to the existing Non-GMO and USDA-Organic labels, which allow consumers to choose foods with minimal levels of genetically-engineered content. These are voluntary labels that do not impose legal and commercial burdens on other food products in international trade. TBT Article 2.1 also provides a standard against which to measure Proposition 37, stating, "Members shall ensure in respect of technical requirements, products imported from the territory of any Member shall be accorded treatment no less favorable than that accorded like products of national origin and to like products originating in any other country." TBT Article 2.1 thus requires Members to treat "like products" alike and to refrain from favoring either domestic or other international "like products" over the products of the Member bringing the Article 2.1 complaint.

Obviously, proponents of Proposition 37 consider genetically-engineered agricultural products fundamentally different from organic and conventional agricultural products. Proponents will argue that Proposition 37 deals with genetically-engineered agricultural products that constitute a class of products of their own. Opponents of Proposition 37 will respond with two arguments. First, opponents can argue that regulatory agencies around the world have considered genetically-engineered raw agricultural products to be substantially equivalent in every regard to conventional and organic agricultural products. On this view, the substantive qualities of genetically-engineered agricultural products make them "like products," and the process producing the "like products" does not create a separate product classification; opponents will argue "product" over "process" as the appropriate TBT Article 2.1 interpretation. Second, opponents of Proposition 37 will highlight the fact that Proposition 37 imposes labels, testing, and paper-trail tracing on vegetable oils even though the oil has no DNA remnants of the crop from which it came. Soybean oil is soybean oil regardless of what variety of soybean the food processor crushed to produce it. With regard to the TBT Article 2.1 arguments, opponents of Proposition 37 may gain support from the Canada and Mexico WTO complaints against the U.S. COOL law. Both the WTO Panel and the WTO Appellate Body determined that Canadian and Mexican meat was a "like product" to United States meat. Because the imported meat was a "like product," the WTO reports ruled that the U.S. COOL law violated TBT Article 2.1 by imposing discriminatory costs and burdens on meat imported into the United States. TBT Articles 2.4 and 2.5 provide a safe harbor for technical regulations that adopt international standards. However, the Codex Alimentarius Commission, the international standards body for food labels, has not created an international standard that proponents of Proposition 37 can claim as its origin and safe harbor. SPS Agreement Article 11 and TBT Agreement Article 14 are both titled "Consultation and Dispute Settlement." The SPS and TBT Agreements thereby make explicit that Member States to these agreements can complain using the WTO Dispute Settlement Understanding Agreement. For example, Argentina, Brazil, or Canada—all likely to be affected by Proposition 37 through their exports of soybeans and canola, especially for cooking oils—have the treaty right to file a complaint within the WTO dispute resolution system. Bringing a WTO complaint is fraught with difficulties. Members must think politically and diplomatically about whether it is worthwhile to bring a complaint—even a clearly valid complaint. Members must be willing to expend significant resources in preparing, filing, and arguing WTO complaints. Finally, even if a Member prevails in the Panel or Appellate Body reports, that Member must recognize that its WTO remedies are indirect and possibly not fully satisfactory. Although the United States is a Member of the WTO Agreements, the United States, in contrast to Argentina, Brazil, and Canada, is not an exporting Member to California.

Consequently, the United States cannot file a WTO complaint invoking the DSU Agreement against California. But by being a Member of the WTO Agreements, the United States has ratified these treaties as part of the law of the United States, making them the supreme law of the land under the U.S. Constitution. Moreover, under the WTO Agreements, the United States has the duty to ensure that local governments comply with the WTO Agreements. Therefore, the United States has the legal authority to challenge Proposition 37 in order to protect its supreme law of the land and to avoid violating its WTO obligations. Opponents of Proposition 37 are likely to challenge it immediately if California voters adopt it in November 2012. As indicated in the introduction, these opponents are likely to bring challenges on three different grounds under the U.S. Constitution, and they have non-frivolous grounds upon which to pursue these constitutional challenges. Whether these opponents can add a claim challenging Proposition 37 based on alleged violations of the SPS Agreement or the TBT Agreement is much less clear. TBT Agreement Article 14.4 highlights the difficulty opponents will have in bringing a WTO-based challenge: it makes clear that Member States have the standing to bring WTO-based complaints. Proponents of Proposition 37 will challenge the standing of those opponents who seek to challenge Proposition 37, and will seek to have any WTO-based claim dismissed because the opponents do not have a right to make a legal claim based on the WTO. Proponents will argue that standing to bring a WTO-based claim resides solely in exporting Member States or the United States. By contrast, opponents bringing the immediate challenge containing a WTO-based claim will argue that they are not invoking the WTO Agreements directly but are challenging Proposition 37 to enforce the supreme law of the United States. By invoking the supreme law of the United States, opponents will hope to blunt the standing issue and avoid dismissal of the WTO-based claim. Assuming that the United States does not file a lawsuit against California and that other opponents are blocked, by the doctrine of standing, from raising WTO-based challenges, Proposition 37, if adopted in November 2012, would become California law. The first lawsuits related to Proposition 37 would then come through either administrative action or a consumer lawsuit against food companies and grocery stores alleging failure to label or misbranding. When facing administrative actions or consumer lawsuits, food companies and grocery stores will want to respond with all possible legal challenges to Proposition 37, including the defenses that Proposition 37 does not comply with the SPS Agreement and the TBT Agreement. The agency or consumer bringing the lawsuit will argue that the food company or grocery store does not have standing to raise the WTO-based challenges. The plaintiff likely has to concede that the defendant faces an actual injury, but will contest vigorously that the defendant is not within the zone of interests that the WTO Agreements are meant to protect.
In other words, the plaintiff will argue that the WTO Agreements are meant to protect only sovereign interests, not private commercial interests.