Marijuana cultivars are known to have THC levels of 2–24% of inflorescence dry weight, whereas hemp cultivars produce substantially less THC but rather high levels of CBD. THCA and CBDA share the same biosynthetic pathway except for the last step, in which THCA synthase and CBDA synthase produce THCA or CBDA, respectively. Recent evidence suggests that the genes encoding the two synthases are allelic. CBD and THC are enantiomers, but only THC elicits psychotropic effects, whereas CBD may mediate anti-psychotropic effects, a difference highlighting the stereoselectivity of receptors in the human body that bind these compounds. Although THC is classified as a drug without therapeutic value in the United States, its ingestion is widely regarded as having effects, including pain relief and appetite stimulation, that may, among other things, increase the tolerance of cancer patients to chemotherapy. Dronabinol, a synthetic analogue of THC, is approved for use as an appetite stimulant in the United States as a Schedule III drug. Cesamet, another synthetic analogue, is used as an anti-emetic for patients undergoing cancer therapy. The natural product Sativex, which is derived from Cannabis cultivars containing both THC and CBD, is approved for use in the UK to treat pain symptoms associated with multiple sclerosis. Compounds from Cannabis sativa are of undeniable medical interest, and subtle differences in the chemical nature of these compounds can greatly influence their pharmacological properties. For these reasons, a better understanding of the secondary metabolic pathways that lead to the synthesis of bioactive natural products in Cannabis is needed. Knowledge of the genetics underlying cannabinoid biosynthesis is also needed to engineer drug-free and distinctive Cannabis varieties capable of supplying hemp fibre and oil seed.
In this report, RNA from mature glands isolated from the bracts of female inflorescences was converted into cDNA and cloned to produce a cDNA library. DNA from over 2000 clones has been sequenced and characterized. Candidate genes for almost all of the enzymes required to convert primary metabolites into THCA have been identified. Expression levels of many of the candidate genes for the pathways were compared between isolated glands and intact inflorescence leaves.
Seeds from the marijuana cultivar Skunk no. 1 were provided by HortaPharm BV and imported under a US Drug Enforcement Administration (DEA) permit to a registered controlled substance research facility. Plants were grown under hydroponic conditions in a secure growth chamber, yielding cannabinoid levels in mature plants as reported in Datwyler and Weiblen. Approximately 5 g of tissue was harvested from mature female inflorescences 8 weeks after the onset of flowering. Tissue was equally distributed into four 50 ml tubes containing 20 ml of phosphate buffered saline (PBS), prepared as described by Sambrook et al. but made with all potassium salts, and mixed at maximum speed with a Vortex 2 Genie for four repetitions of 30 s mixing followed by 30 s rest on ice, for a total of 2 min of mixing. Material was sieved through four layers of 131 μm plastic mesh, and the flow-through was split into two 50 ml tubes and spun in a centrifuge for 30 s at 500 rpm. Supernatants were decanted and pellets were resuspended in PBS. The suspensions were combined into one tube and pelleted as before. The resulting pellet was diluted into 100 μl of PBS. Five μl were used for cell counting with a haemocytometer, and the total suspension was estimated to contain 70 000 intact glands. Plant residue was incinerated by a DEA-registered reverse distributor.
Quantitative PCR (qPCR) reactions were performed as described previously using primers listed in Supplementary Table 4B at JXB online.
Equivalent quantities of RNA isolated from glands and inflorescence-associated leaves were used to generate the respective single-stranded cDNAs. qPCR reactions containing equal quantities of gland or leaf cDNA were run in duplicate, along with reactions containing standards consisting of 100-fold sequential dilutions of isolated target fragments, on a LightCycler qPCR machine.
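Quantification against dilution standards of this kind generally rests on a straight-line fit of crossing-point cycle against log10 of the standard quantity. The following is a minimal sketch of that idea and not the LightCycler software's own algorithm. The function names and every cycle value are hypothetical, chosen only to mimic a 100-fold dilution series.

```python
import math


def fit_standard_curve(quantities, cycles):
    """Least-squares fit of cycle = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cycles) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cycles))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cycle(cycle, slope, intercept):
    """Invert the fitted line to estimate the starting quantity of a sample."""
    return 10 ** ((cycle - intercept) / slope)


# Hypothetical standards: 100-fold steps in arbitrary relative units.
standards = [1e6, 1e4, 1e2, 1e0]
standard_cycles = [10.08, 16.72, 23.36, 30.00]  # invented for illustration

slope, intercept = fit_standard_curve(standards, standard_cycles)

# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10 ** (-1 / slope) - 1

# A hypothetical sample reading is placed on the curve by interpolation.
sample_quantity = quantity_from_cycle(20.0, slope, intercept)
```

A slope near -3.32 cycles per tenfold change corresponds to product doubling each cycle, which is why the slope is a common sanity check before sample values are read off the curve.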
LightCycler software was used to generate standard curves covering a range of 10⁶, to which gland and leaf data were compared. Two biological replicates were used to generate the means and standard deviations shown in Supplementary Table 4A at JXB online. These values were used to compute the gland-over-leaf ratios and P-values shown in Supplementary Table 4A at JXB online. Raw relative expression data, means, standard deviations, P-values from gland versus leaf t-tests, qPCR primer sequences, and representative real-time qPCR tracings are shown in Supplementary Table 4A at JXB online.
Anatomical study revealed that glands located on mature floral bracts of female plants are the site of enhanced secondary metabolism leading to the production of THCA and other compounds in Cannabis sativa. These glands are located on multicellular stalks and typically are composed of eight cells. The outer gland surface is composed of a smooth capsule covered by a membrane. The capsule contains exudates derived from the gland cells. The weakly attached glands can easily be separated from the bracts and purified as shown in Fig. 1E and F. An EST library was constructed using RNA isolated from purified glands. Over 100 000 ESTs were cloned. Plasmid DNA was isolated and sequenced from over 2000 clones. Because of the directed orientation of cDNA insertion, sequences are expected to represent the coding strand. After the removal of vector-only sequences, poor-quality sequences, and sequences obviously originating from organelles or ribosomal RNA, the remaining sequences were clustered into 1075 unigenes. Overall, 111 of the unigenes were contigs containing two or more closely related ESTs. Only 14 contigs lacked a similar sequence in the NCBI database. Nine hundred and sixty-four of the ESTs were found only once, and of these 710 were similar to sequences in the NCBI database.
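The unigene counts above can be cross-checked with simple bookkeeping. The sketch below uses only the figures quoted in the text. The matched and unmatched totals it derives are not stated in the text and follow only if the 14 unmatched contigs and the 710 matched singletons are read as described here. The variable and function names are illustrative.

```python
# Figures quoted in the text.
TOTAL_UNIGENES = 1075
CONTIGS = 111                 # unigenes built from two or more ESTs
CONTIGS_WITHOUT_MATCH = 14    # contigs lacking a similar NCBI sequence
SINGLETONS = 964              # ESTs found only once
SINGLETONS_WITH_MATCH = 710   # singletons similar to an NCBI sequence


def summarize_unigenes():
    """Derive matched and unmatched unigene totals from the quoted counts."""
    contigs_with_match = CONTIGS - CONTIGS_WITHOUT_MATCH
    singletons_without_match = SINGLETONS - SINGLETONS_WITH_MATCH
    matched = contigs_with_match + SINGLETONS_WITH_MATCH
    unmatched = CONTIGS_WITHOUT_MATCH + singletons_without_match
    return {
        "unigenes": CONTIGS + SINGLETONS,
        "matched": matched,
        "unmatched": unmatched,
        "matched_fraction": matched / (CONTIGS + SINGLETONS),
    }


summary = summarize_unigenes()
# Contigs plus singletons should reproduce the reported unigene total.
consistent = summary["unigenes"] == TOTAL_UNIGENES
```

The contig and singleton counts sum to the reported 1075 unigenes, so the three figures are mutually consistent.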
The three unigenes representing the greatest number of ESTs encoded proteins related to metallothionein, RD22-like BURP domain-containing proteins, and chitin-binding hevein-like proteins. All three of these proteins have functions related to biotic or abiotic stress responses. Gene Ontology analysis was performed on the sequence dataset. An analysis of biological function indicates that 27% of the unigenes encode proteins with metabolic activity. Unigenes with NCBI matches encoding proteins with unknown function comprise 14% of the total, and another 28% are predicted to be involved in various cellular processes such as protein synthesis and protein degradation.
The specific biochemical steps leading to THCA are proposed to begin with a reaction involving a type III PKS enzyme that catalyses the synthesis of olivetolic acid from hexanoyl-CoA and three molecules of malonyl-CoA. Malonyl-CoA is derived from the carboxylation of acetyl-CoA. ESTs encoding acetyl-CoA carboxylase were identified. Hexanoyl-CoA could be produced by more than one pathway in the trichomes. One route would involve the early termination of the fatty acid biosynthetic pathway, yielding hexanoyl-ACP. The hexanoyl moiety would then either be transferred to CoA by the action of an ACP-CoA transacylase or be cleaved by the action of a thioesterase, yielding n-hexanol, which would then be converted into n-hexanoyl-CoA by the action of acyl-CoA synthase. Most of the enzymes needed for this route are represented in the EST database, except for the transacylase and 2,3-trans-enoyl-ACP reductase. A second route to hexanoyl-CoA would involve the production of hexanol from the breakdown of the fatty acid linoleic acid via the lipoxygenase (LOX) pathway. A survey of the sequenced ESTs revealed candidate genes encoding the enzymes needed to synthesize linoleic acid from acetyl-CoA by the typical fatty acid biosynthetic pathway in plastids, followed by the production of hexanol from linoleic acid via the LOX pathway. A third pathway, related to the biosynthesis of branched-chain amino acids, has been proposed to be involved in the production of short-chain and medium-chain fatty acids. However, the enzymes in this pathway [2-isopropylmalate synthase, 3-isopropylmalate dehydratase, 3-isopropylmalate dehydrogenase, and 2-oxoisovalerate dehydrogenase] were not represented in the Cannabis trichome EST library. After the formation of olivetolic acid, a prenyltransferase is proposed to add a prenyl group derived from geranyl diphosphate (GPP) to create cannabigerolic acid. GPP is derived from the fusion of two isoprene units.
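The proposed assembly of cannabigerolic acid can be checked by counting carbons. The sketch below assumes the standard polyketide rule that each malonyl-CoA extension adds two carbons, because one carbon is lost as CO2 on condensation. It also uses the known molecular formulae of olivetolic acid (C12H16O4) and cannabigerolic acid (C22H32O4). These details are not given in the text, and the function names are illustrative.

```python
HEXANOYL_CARBONS = 6          # starter unit, from hexanoyl-CoA
MALONYL_EXTENSIONS = 3        # three molecules of malonyl-CoA
CARBONS_PER_EXTENSION = 2     # assumption: malonyl loses one carbon as CO2
ISOPRENE_UNIT_CARBONS = 5     # isopentenyl / dimethylallyl diphosphate
ISOPRENE_UNITS_IN_GPP = 2     # GPP is built from two isoprene units

OLIVETOLIC_ACID_CARBONS = 12  # from the formula C12H16O4
CBGA_CARBONS = 22             # from the formula C22H32O4


def polyketide_carbons(starter, extensions, per_extension):
    """Carbons in a polyketide before any decarboxylation of the product."""
    return starter + extensions * per_extension


def prenylated_carbons(acceptor, isoprene_units):
    """Carbons after a prenyl group of the given size is added."""
    return acceptor + isoprene_units * ISOPRENE_UNIT_CARBONS


olivetolic = polyketide_carbons(
    HEXANOYL_CARBONS, MALONYL_EXTENSIONS, CARBONS_PER_EXTENSION
)
cbga = prenylated_carbons(olivetolic, ISOPRENE_UNITS_IN_GPP)

# Both counts should agree with the molecular formulae.
balanced = (olivetolic == OLIVETOLIC_ACID_CARBONS) and (cbga == CBGA_CARBONS)
```

The count balances only if the carboxyl carbon is retained in the polyketide product. Olivetol (C11H16O2) has one carbon fewer, which is why an assay product can in principle be distinguished from both compounds by mass.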
Two different biochemical pathways support the synthesis of isoprenoids in plants. Within the list of unigenes, all but one of the enzymatic activities needed to convert pyruvate and glyceraldehyde-3-phosphate into isopentenyl and dimethylallyl diphosphate via the methylerythritol 4-phosphate pathway were represented. This finding is consistent with isotopic studies showing that the GPP cannabinoid precursors are synthesized via this pathway. The formation of GPP is mediated by GPP synthase. Several unigenes related to GPP synthase were identified; however, they were more closely related to other terpene synthases. In particular, CAN36 and CAN55, which possibly were derived from the same gene, and the closely related CAN37, are most similar to hop sesquiterpene synthases HISTS1 and HISTS2, with an average identity of 56% over the first 160 amino acid residues. CAN41 is most similar to hop monoterpene synthase HIMTS2.
The nature of the prenyltransferase is unknown. However, previous studies identified a soluble aromatic geranylpyrophosphate:olivetolate geranyltransferase with the appropriate activity in the extract of young leaves.
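The identity figure quoted above for the terpene synthase candidates is a pairwise percent identity restricted to a window of residues. As a hedged illustration of how such a figure can be computed from an existing pairwise alignment, the sketch below counts identical columns among those where neither sequence has a gap. Alignment tools differ in the denominator they use, so this convention is an assumption, and the toy sequences are invented and unrelated to the CAN unigenes.

```python
def percent_identity(aligned_a, aligned_b, first_n=None):
    """Percent identity over gap-free columns of a pairwise alignment.

    If first_n is given, only columns up to the first_n-th residue of
    aligned_a are considered.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    residues_seen = 0
    for a, b in zip(aligned_a, aligned_b):
        if a != "-":
            residues_seen += 1
            if first_n is not None and residues_seen > first_n:
                break
        if a == "-" or b == "-":
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment; "-" marks a gap.
query = "MKTAYIAK"
subject = "MKSAY-AK"

full_identity = percent_identity(query, subject)
window_identity = percent_identity(query, subject, first_n=5)
```

An average over several pairs, as reported in the text, would simply be the mean of such values across the relevant pairwise comparisons.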
The only EST encoding a predicted prenyltransferase was CAN121. However, the encoded protein is more closely related to members of the membrane-bound chloroplast-localized family of prenyltransferases than to soluble prenyltransferases. The final step in the pathway is catalysed by THCA synthase, which converts cannabigerolic acid to THCA. Two ESTs with sequences identical to the previously reported THCA synthase were identified.
Whereas the nature of the prenyltransferase responsible for the synthesis of cannabigerolic acid is unknown, three unigenes, CAN24, CAN383, and CAN1069, comprising eight, one, and two ESTs, respectively, could encode the PKS activity needed to synthesize olivetolic acid. These were therefore characterized in more detail. All three unigenes were represented by individual ESTs encoding complete PKS polypeptides. These were sequenced and compared to related PKS sequences. CAN1069 was identical to a previously identified Cannabis gene encoding a chalcone synthase (CHS), and is the most closely related of the PKS sequences to other known chalcone synthases from hop and Arabidopsis. The relationship of hop phlorisovalerophenone synthase (VPS), which mediates the conversion of malonyl-CoA and isovaleryl-CoA to phlorisovalerophenone, to CAN24 and CAN383 is less clear. CAN24 and CAN383 show 64.6% identity and are nearly equally similar to hop VPS, at 71.2% and 72.0%, respectively. The enzymatic activities encoded by CAN24 and CAN1069 were explored in detail.
The tagged proteins were purified on a nickel-containing magnetic bead matrix and were assayed for chalcone and olivetol/olivetolic acid synthase activities. Recombinant protein from CAN1069, but not CAN24, produced reaction products when incubated with coumaroyl-CoA and malonyl-CoA. The reaction products were analysed by LC-MS, and peak 2 was found to have a molecular mass and absorption spectrum consistent with naringenin, the major product of chalcone synthases.
Both CAN24 and CAN1069 were capable of using malonyl-CoA and hexanoyl-CoA as reaction substrates, and LC-MS indicated that the products of these enzymes were the same, but neither the molecular mass nor the absorption spectrum of this product matched olivetol or olivetolic acid. Results similar to those for CAN24 were obtained using protein purified from CAN383.
Genes required for THCA production are probably more highly expressed in glands of pistillate inflorescences because this is where THCA is most highly concentrated. To test this hypothesis, the relative expression levels of selected unigenes in isolated glands versus young inflorescence-associated leaves were compared using real-time qPCR. The identity of the genes assayed and the differences in relative expression levels are listed in Table 2 and in Supplementary Table 4A at JXB online. Consistent with this hypothesis, THCA synthase expression was 437 times higher in isolated glands than in leaves. CAN24 expression was 1600 times higher in glands of the inflorescence than in associated leaves. CAN1069, encoding CHS, was also more highly expressed in glands than in leaves. A third PKS, CAN383, was expressed at similar levels in glands and leaves. These results are not explained by poor RNA isolation from leaves, as unigene CAN219, encoding chlorophyll A/B binding protein, showed elevated leaf expression levels.
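Gland-over-leaf ratios and P-values of the kind reported here can be derived from replicate relative expression values. The sketch below assumes a pooled-variance two-sample t-test on two biological replicates per tissue, which gives two degrees of freedom. The text does not state which form of t-test was used. The replicate values are invented for illustration and are not data from this study.

```python
import math
from statistics import mean, variance


def fold_difference(gland, leaf):
    """Ratio of mean relative expression, gland over leaf."""
    return mean(gland) / mean(leaf)


def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance (assumed test form)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error


def two_sided_p_df2(t):
    """Exact two-sided P-value for Student's t with 2 degrees of freedom.

    Valid only for two replicates per group, where df = 2 + 2 - 2 = 2.
    """
    return 1 - abs(t) / math.sqrt(t * t + 2)


# Invented replicate values in arbitrary relative units.
gland_values = [5.0, 7.0]
leaf_values = [1.0, 3.0]

ratio = fold_difference(gland_values, leaf_values)
t_value = pooled_t_statistic(gland_values, leaf_values)
p_value = two_sided_p_df2(t_value)
```

With only two replicates per tissue the test has little power, so a large fold difference can still carry a modest P-value. The closed-form P-value applies only to the two-degree-of-freedom case; larger designs would need a general t distribution.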