Journal:High-throughput methods to identify male Cannabis sativa using various genotyping methods
|Full article title||High-throughput methods to identify male Cannabis sativa using various genotyping methods|
|Journal||Journal of Cannabis Research|
|Author(s)||Torres, Anthony; Pauli, Christopher; Givens, Robert; Argyris, Jason; Allen, Keith; Monfort, Amparo; Gaudino, Reginald J.|
|Author affiliation(s)||Front Range Biosciences, Centre for Research in Agricultural Genomics|
|Primary contact||Email: atorres frontrangebio dot com|
|Volume and issue||4|
|Distribution license||Creative Commons Attribution 4.0 International|
Background: Cannabis sativa is a primarily dioecious angiosperm that exhibits sexual developmental plasticity. Developmental genes for staminate male flowers have yet to be elucidated; however, there are regions of male-associated DNA from Cannabis (MADC) that correlate with the formation of pollen-producing staminate flowers. MADC2 is an example of a polymerase chain reaction-based (PCR-based) genetic marker that has been shown to produce a 390-bp amplicon that correlates with the expression of male phenotypes. We demonstrate applications of a cost-effective high-throughput male genotyping assay and other genotyping applications of male identification in Cannabis sativa.
Methods: In this study, we assessed data from 8,200 leaf samples analyzed for real-time quantitative polymerase chain reaction (qPCR) detection of MADC2 in a commercial testing application offered through Steep Hill Laboratories. Through validation, collaborative research projects, and follow-up retest analysis, we observed a > 98.5% accuracy of detection of MADC2 by qPCR. We also carried out assay development for high-resolution melting analysis (HRM), loop-mediated isothermal amplification (LAMP), and TwistDx recombinase polymerase amplification (RPA) assays using MADC2 for male identification.
Results: We demonstrate a robust high-throughput duplex TaqMan qPCR assay for identification of male-specific genomic signatures using a novel MADC2 qPCR probe. The qPCR cycle quotient (Cq) value representative of MADC2 detection in 3,156 males, the detection of the cannabinoid synthase tissue control for 8,200 samples, and the absence of MADC2 detection in 5,047 non-males demonstrate a robust high-throughput real-time genotyping assay for Cannabis. Furthermore, we demonstrated the viability of using regions near MADC2 with novel primers as alternative assays. Finally, we show proof of concept of several additional commercially viable sex determination methodologies for Cannabis sativa.
Discussion: In industrial applications, males are desirable for their more rapid growth and higher-quality fiber, as well as their ability to pollinate female plants and produce grain. In medicinal applications, female cultivars are more desirable for their ability to produce large amounts of secondary metabolites, specifically the cannabinoids, terpenes, and flavonoids that have various medicinal and recreational properties. In previous studies, traditional PCR and non-high-throughput methods have been reported for the detection of male cannabis, and in our study, we present multiple methodologies that can be carried out in high-throughput commercial cannabis testing.
Conclusion: With these markers developed for high-throughput testing assays, the cannabis industry will be able to easily screen and select for the desired sex of a given cultivar depending on the application.
Keywords: sex marker, high-resolution melt (HRM), genotyping, loop-mediated isothermal amplification (LAMP), male-associated DNA marker in Cannabis 2 (MADC2), DNA, recombinase polymerase amplification (RPA), male Cannabis plant
Cannabis sativa is an annual plant in the Cannabaceae family that contains both monoecious and dioecious cultivars; however, in most commercial industrial and medicinal applications, dioecious cultivars are generally preferred. The species can be divided into two types: fiber type (or hemp), used for its fiber or seed oils, and drug type (marijuana), used for secondary metabolite production. For medicinal or drug-type applications, female (sinsemilla) plants are desired for their increased cannabinoid production compared to past monoecious varieties that would produce self-pollinated seed that detracts from medically valued cannabinoid production. In industrial and breeding applications, male plants are desired for their more rapid growth, high-fiber quality, and ability to pollinate female plants to produce seed or a grain crop.
With primarily dioecious cultivars being grown, the segregation of males (plants bearing pollen-producing staminate flowers) from female plants is essential for large-scale producers and breeders. Thus, the need for a high-throughput screening methodology to rapidly identify these males after seed germination has grown over the past years with the legalization of hemp in the Agriculture Improvement Act of 2018 and simultaneous state-led legalization of recreational cannabis products. Cannabis carries an XY/XX sex determination system; however, it has also been proposed that it may use an X-to-autosome balance system like other flowering plants. With the presence of hermaphroditic and monoecious populations that carry an XX genotype, further research is needed to elucidate the main genetic factors contributing to sex determination, although the presence of a Y chromosome found in our work supports previous work suggesting it as a reliable genetic marker to identify phenotypic males.
Previous research has focused on developing male-associated polymerase chain reaction (PCR) markers; however, at that time, the lack of an assembled Y chromosome and the limited diversity of readily available dioecious germplasm restricted the use of this work in commercial settings. Our research has validated and expanded this work through testing a diverse set of medicinal and recreational varieties from the California, Colorado, and European markets. This work has led us to understand there are two distinct genotypes of this male-associated DNA from Cannabis (MADC) 2 region, which correlate with sexual dimorphism and have been termed “male positives” and “non-male negatives.”
Using a PacBio long-read sequenced male along with targeted sequencing of 48 non-related male genotypes, a male-specific genomic signature was revealed that matches the canonical MADC2. This observation prompted us to investigate multiple methodologies for identifying male Cannabis plants and to evaluate alternative genetic testing platforms, such as high-resolution melting analysis (HRM), loop-mediated isothermal amplification (LAMP), and TwistDx recombinase polymerase amplification (RPA), for their ability to detect variants in MADC2 homology and distinguish male genotypes from non-males. All platforms developed and tested using this locus and nearby loci were able to successfully distinguish the two classes of the male-specific PCR amplicon from each other and from products templated by DNA of plants that bore pistillate flowers, thus providing a convenient, economical, and robust saturation-point assay for male DNA in Cannabis seedlings.
Materials and methods
The Cannabis samples used in our validation work were obtained through anonymous samples submitted to Steep Hill Laboratories for genetic sex testing as part of the GenKit offering. We established an initial set of 20 samples for assay development and validation. The plants were grown from seed, and phenotypic observations for sex were collected as previously described by Faux et al. Subsequently, 8,200 leaf samples were used for this analysis, which includes the main three types of Cannabis (types 1, 2, and 3) from various related and unrelated lineages. These samples, anonymously submitted to Steep Hill Laboratories and presumably of California/USA origin, were assigned unique identifiers, thereby masking any cultivar/customer-related information. Additionally, our collaborators performed a second validation experiment of our HRM and real-time quantitative polymerase chain reaction (qPCR) assays on a set of European Cannabis leaf samples from plants that were cultivated at the Centre for Research in Agricultural Genomics (CRAG) in Barcelona. The plants were phenotyped for sex as previously described by Razumova et al. The European cultivars were obtained from a known diverse lineage set, containing 158 cultivars of various drug-type hemp, fiber-type hemp, and grain-type hemp. We calculated an estimated percent accuracy of MADC2 detection of > 98.5%. The value was calculated from the retest rate and results from the 8,200-sample test set (data not shown) and confirmed through phenotypic characterization of plants through assay validation (Supplemental Table 4) and collaborative research projects (Supplemental Table 2). The Y-chromosome-containing genome assemblies used within our analyses included Jamaican Lion Father (JL_Father) (BioProject no.: PRJNA575581, Medicinal Genomics, Beverly, MA) and Pineapple Banana Bubba Kush (PBBK) (BioProject no.: PRJNA378470, Steep Hill, Berkeley, CA, USA).
Genomic DNA was isolated from Cannabis samples using the Qiagen DNeasy Plant genomic DNA isolation kit (Qiagen, Redwood City, CA, USA) and the Promega Wizard genomic DNA kit (Promega, Madison, WI, USA), each according to the manufacturer’s instructions, as well as a more cost-effective method using Flinders Technology Associates (FTA) PlantSaver cards (GE Healthcare, Chicago, IL, USA) that used a preparation of crude genomic DNA. Crude DNA extract was prepared by Tris/Triton-X pre-treatment of 1-mm raw leaf or leaf-imprinted FTA card sections, as modified from Klimyuk et al., in a modified 96-well format for high-throughput processing. Leaf or FTA sections were placed aseptically in a 96-well microtiter plate, 100 μL of 0.25 M Tris-HCl with 0.25% Triton-X-100 was added to each well, and the plates were incubated at 100 °C for five minutes on the Veriti Thermal Cycler (Applied Biosystems, Waltham, MA, USA). A total of 3 μL of crude genomic DNA extract was used as input for the pre-amplification PCR reaction. DNA isolated from the kit-based extractions did not require a pre-amplification step due to the higher quality and purity of DNA obtained.
A 10-cycle pre-amplification PCR step was introduced to the protocol to reduce the effect of plant materials in subsequent reactions, including plant pigments and potentially real-time qPCR inhibiting compounds often found in leaf extracts. A total of 2.5 μL of crude genomic DNA extract was transferred to a second PCR plate, with each well preloaded with 22.5 μL of pre-amplification PCR master mix prepared per reaction as follows for either qPCR or HRM assays: 12.5 μL 2× Promega Colorless GoTaq (Promega, Madison, WI, USA), 3 μL of 4 μM qPCR mix of (MADC2 Fwd + MADC2 Rev and TC Fwd + TC Rev) or one of the following HRM mixes (Choco Mando Fwd + Choco Mando Rev, Break 1 Fwd + Plus 9 Rev, MADC2 Fwd + MADC2 Rev, or –403 Fwd + –237 Rev) (Table 1), and 7 μL nuclease-free water (Ambion, Austin, TX, USA). The reactions were subjected to the following thermocycler protocol: 1 cycle of 95 °C for 10 minutes, 10 cycles of 95 °C for 40 seconds, 60 °C for two minutes, 72 °C for two minutes, 1 cycle of 72 °C for five minutes, and then an indefinite 4 °C hold.
Pre-amplification reactions were diluted 1:5 with 100 μL nuclease-free water (Ambion, Austin, TX, USA). Diluted pre-amplification reactions were prepared for quantification using the QuantiFluor dsDNA System (Promega, Madison, WI, USA) and quantified using a Quantus Fluorometer (Promega, Madison, WI, USA), as per manufacturer’s instructions. Quantification of the diluted pre-amplification reactions revealed a final working concentration of ~1 ng/μL, and these dilutions were used as non-normalized input into the real-time qPCR or HRM reactions.
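The volume arithmetic behind the pre-amplification step can be sketched as follows. This is a minimal illustration rather than the authors' workflow: the per-reaction volumes and the 100 μL diluent come from the text, while the function names, the 96-reaction batch size, and the 10% pipetting overage are assumptions chosen for the example.

```python
# Per-reaction pre-amplification master mix volumes (uL), as listed in the text.
PREAMP_MIX_UL = {
    "2x GoTaq": 12.5,
    "4 uM primer mix": 3.0,
    "nuclease-free water": 7.0,
}
TEMPLATE_UL = 2.5   # crude extract transferred to each well
DILUENT_UL = 100.0  # water added after pre-amplification


def scale_mix(per_reaction, n_reactions, overage=0.10):
    """Scale per-reaction volumes to a batch.

    The overage fraction is a hypothetical allowance for pipetting loss;
    the source does not state one.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_reaction.items()}


def dilution_factor(sample_ul, diluent_ul):
    """Return the fold dilution: final volume divided by sample volume."""
    return (sample_ul + diluent_ul) / sample_ul


# Total reaction volume per well: master mix plus template.
reaction_ul = sum(PREAMP_MIX_UL.values()) + TEMPLATE_UL
# Batch volumes for one hypothetical 96-well plate.
batch = scale_mix(PREAMP_MIX_UL, n_reactions=96)
# Fold dilution after adding the diluent to the finished reaction.
fold = dilution_factor(reaction_ul, DILUENT_UL)
```

With these inputs, the master mix and template sum to a 25 μL reaction, and adding 100 μL of water gives the fivefold (1:5) dilution stated above.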
Quantitative real-time PCR TaqMan analysis
The qPCR analysis was performed in 10 μL reactions on a LightCycler 480 qPCR (Roche Applied Systems, Pleasanton, CA, USA) using the following protocol: a pre-incubation cycle (95 °C for 20 seconds), 45 amplification cycles (95 °C for one second, 60 °C for 20 seconds, 72 °C for 20 seconds) with a single acquisition mode setting for each cycle at the 60 °C annealing step, followed by a final cooling cycle (40 °C for 30 seconds).
Each reaction contained 5 μL of the diluted pre-amplified template (~1 ng/μL) used as input and 5 μL of TaqMan Master Mix, prepared per reaction as follows: 3.75 μL of FastTq Advanced Reaction Mix (Applied Biosciences, Pleasanton, CA, USA), 2.25 μL sex test primer mix (MADC2 fwd/rev at 3.33 μM and TC Universal CS fwd/rev at 0.167 μM working concentration), and 0.1875 μL of TC Universal CS TaqMan probe/MADC2 TaqMan probe (10 μM working concentration).
qPCR data was analyzed using the LightCycler 480 software AbsQuant/2nd Derivative Max algorithm for calculating Cp values.
Fluorescent signal accumulates in the target FAM and reference HEX wavelength channels as copies of the target DNA element are amplified. Once the fluorescent signal passes an instrument-derived threshold (calculated using the second derivative maximum of the amplification curve), the wavelength channel signal is called positive for the sample, and the cycle at which this occurs is measured and reported as the cycle quotient (Cq) value. The absence of fluorescent signal results in no measurement of a cycle quotient value. Detection of a cycle quotient value for both the MADC2 TaqMan probe (FAM) and the TC Universal CS TaqMan probe (HEX) identifies true genetic males, while the absence of a MADC2 signal together with the presence of a cycle quotient value for the cannabinoid synthase control probe identifies not-male Cannabis individuals.
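The duplex calling rule described above can be written as a short sketch. The function name and the "retest" label are assumptions for illustration; the source does not state how a sample with no tissue-control signal was reported, only that both signals indicate a male and a control-only signal indicates a not-male.

```python
from typing import Optional


def call_sex(madc2_cq: Optional[float], control_cq: Optional[float]) -> str:
    """Classify one sample from its duplex Cq values.

    None means the instrument reported no Cq for that channel.
    madc2_cq is the FAM (MADC2) channel; control_cq is the HEX
    (cannabinoid synthase tissue control) channel.
    """
    if control_cq is None:
        # No tissue-control signal: the reaction cannot be interpreted.
        # "retest" is a hypothetical label, not taken from the source.
        return "retest"
    if madc2_cq is not None:
        # Both MADC2 and tissue control detected.
        return "male"
    # Tissue control detected, MADC2 absent.
    return "not-male"
```

For example, `call_sex(26.1, 25.8)` returns `"male"`, and `call_sex(None, 25.8)` returns `"not-male"`.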
High-resolution melting (HRM) analysis
HRM analysis was performed in 10 μL reactions on a LightCycler 480 qPCR (Roche Applied Systems, Santa Clara, CA, USA) using the following protocol: a pre-incubation cycle (95 °C for 10 minutes), 45 amplification cycles (95 °C for 10 seconds, 60 °C for 15 seconds, 72 °C for 10 seconds), 1 cycle of HRM (95 °C for one minute, 40 °C for one minute, 65 °C for one second, and heating to 95 °C with 25 continuous acquisitions per °C), followed by a final cooling cycle (40 °C for 10 seconds).
Each reaction contained 5 μL of ~1 ng/μL of the diluted pre-amplified template, 5 μL of HRM master mix (prepared per reaction as follows: 3.5 μL 2× HRM master mix containing HRM dye [Roche Applied Systems, Santa Clara, CA, USA], 0.6 μL of 4 μM primer mix, 0.8 μL of 25 mM MgCl2, 1.125 μL of nuclease-free water). Alternatively, if kit-purified DNA was used, 1 μL of purified DNA was used in place of the 5 μL of diluted pre-amplified DNA.
HRM data was analyzed using the LightCycler 480 melt genotyping software. Fluorescence intensity as a function of temperature for each sample was also analyzed using custom R scripts to determine statistical variation of melt curves and clustering of samples into the male and non-male genotypic classes.
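One common way to reduce a melt curve to a single number is to locate the peak of the negative first derivative of fluorescence with respect to temperature. The sketch below illustrates that idea in Python; it is not the authors' R code or the LightCycler algorithm. The 82 °C male peak for the MADC2 primer set is taken from the supplemental descriptions in this article, while the 0.5 °C tolerance, the function names, and the synthetic sigmoid curves are assumptions.

```python
import math


def melt_peak_tm(temps, fluorescence):
    """Return the temperature where -dF/dT is largest (central differences)."""
    best_tm, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps[i + 1] - temps[i - 1]
        )
        if slope > best_slope:
            best_tm, best_slope = temps[i], slope
    return best_tm


def call_from_tm(tm, male_tm=82.0, tolerance=0.5):
    """Call 'male' when the melt peak lies near the conserved male Tm.

    The tolerance is a hypothetical value chosen for illustration.
    """
    return "male" if abs(tm - male_tm) <= tolerance else "non-male"


def synthetic_curve(temps, tm, width=0.8):
    """Idealized sigmoid melt curve, used only to exercise the functions."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


# 65 to 95 degrees C at 25 acquisitions per degree (0.04 degree steps).
temps = [65.0 + 0.04 * i for i in range(751)]
male_tm = melt_peak_tm(temps, synthetic_curve(temps, 82.0))
other_tm = melt_peak_tm(temps, synthetic_curve(temps, 80.0))
male_call = call_from_tm(male_tm)
other_call = call_from_tm(other_tm)
```

Real melt curves are noisier than these synthetic ones and may show shoulders or secondary peaks, so a single-peak call like this would normally be combined with curve-shape clustering of the kind described above.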
DNA amplicon sequencing
Male samples were sequenced using Thermo Fisher’s SeqStudio capillary sequencer for MADC2 with the BigDye Direct (BDD) Cycle Sequencing kit (ThermoFisher, Fremont, CA, USA). M13-tailed primers were designed by adding M13 Fwd and Rev oligonucleotide tails to the MADC2 Fwd and Rev primers, respectively. Two reactions, one for a forward read and one for a reverse read, were prepared in a 96-well plate format with 1 μL of 4 ng/μL genomic DNA used as template, 1.5 μL of 0.8 μM tailed primer mix, 2.5 μL nuclease-free water, and 5 μL of BDD Master Mix to formulate a BDD PCR reaction following the manufacturer’s instructions. The following PCR protocol was carried out on a Veriti Thermal Cycler (Applied Biosystems, Waltham, MA, USA): a hot-start hold (95 °C for 10 minutes), 35 amplification cycles (95 °C for three seconds, 60 °C for 15 seconds, 68 °C for 30 seconds), and a final extension (72 °C for two minutes) with an indefinite 4 °C hold. The sequencing reaction mix was prepared with 2 μL of the BDD sequencing master mix and 1 μL of one sequencing primer, either the BDD M13 Fwd primer or the BDD M13 Rev primer.
Loop-mediated isothermal amplification and isothermal recombinase polymerase amplification
LAMP and RPA sex determination assays were designed in-house using the Steep Hill male Cannabis genome assembly. In the RPA assay designs, biotin-labeled primers were designed for target amplification and detection. A standard reaction was prepared by reconstituting a TwistDx reaction pellet with 30 μL rehydration buffer, 2.1 μL each of Fwd and Rev primer, 0.6 μL TwistMan target probe, 2.1 μL nuclease-free water, and 10 μL of crude DNA extract. A total of 2.5 μL of 280 mM magnesium acetate was added to start the reaction (TwistDx, Cambridge, England). The reactions were incubated at room temperature for 30 minutes and then were applied to a disposable HybriDetect-2 strip and nucleic acid detection device type 3 (Milenia Genline, Germany). The results developed in one to two minutes following application. A male-positive result is two detected bands, while a non-male RPA reaction results in one detected band. In a similar workflow, LAMP primers were designed using Eiken’s LAMP PrimerExplorer V5 (Fujitsu Limited, primerexplorer.jp). Following optimization, a standard reaction was set up with 12.5 μL WarmStart LAMP 2× Master Mix, 2.5 μL LAMP primer mix 10× (16 μM FIP, 16 μM BIP, 2 μM F3, 2 μM B3, 4 μM LF, and 4 μM LB from Table 1), 1 μL DNA, and 8.5 μL nuclease-free water for a 25 μL final volume. Reactions were incubated at 65 °C for 30 minutes and assessed for a visual change in color indicating amplification of the target male amplicon. Positive reactions accumulate amplicons and lower the pH, which changes the color to yellow, while negative reactions do not and remain pink. New England Biolabs standard LAMP protocols were executed according to manufacturer’s instructions (NEB, Ipswich, MA, USA).
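The two visual readouts described above map onto simple lookup rules, sketched here for illustration. The function names and the "invalid" and "indeterminate" labels are assumptions; the source only defines the two-band versus one-band and yellow versus pink outcomes.

```python
def call_rpa_strip(bands_detected: int) -> str:
    """Interpret a lateral-flow strip from the RPA assay by band count."""
    if bands_detected == 2:
        return "male"        # target band plus assay-control band
    if bands_detected == 1:
        return "non-male"    # assay-control band only
    # Zero or unexpected band counts are not described in the source;
    # "invalid" is a hypothetical label.
    return "invalid"


def call_lamp_color(color: str) -> str:
    """Interpret the end-point color of a colorimetric LAMP reaction."""
    normalized = color.strip().lower()
    if normalized == "yellow":
        return "male"        # amplification occurred
    if normalized == "pink":
        return "non-male"    # no amplification
    # Intermediate colors are not described in the source;
    # "indeterminate" is a hypothetical label.
    return "indeterminate"
```

For example, `call_rpa_strip(2)` and `call_lamp_color("Yellow")` both return `"male"`.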
Results and discussion
Real-time qPCR sex determination assay in Cannabis sativa
A total of 8,200 individual Cannabis plants have been genotyped using Steep Hill’s SexID assay, with genetic material isolated either from an early leaf raw sample or an FTA card leaf imprint. The SexID assay is a TaqMan real-time PCR in a duplex with custom-designed molecular fluorescent TaqMan probes: 5′FAM-3′BHQ labeled for MADC2 male detection and 5′HEX-3′BHQ for universal tetrahydrocannabinolic acid (THCA) / cannabidiolic acid (CBDA) / cannabichromenic acid (CBCA) synthase detection as a tissue control/Cannabis-specific reference gene controlling for detection of any Cannabis lineage irrespective of its cannabinoid synthase loci. An early validation set of 20 plants was germinated and characterized in-house, analyzed by the SexID qPCR assay, and compared with the observed phenotype (Supplemental Table 4).
A box plot analysis of Cq (cycle quotient) values for Cannabis samples tested from 2015 to 2019 is reported (Fig. 1A), generated using the box plot function in the R ggplot2 package. Averages and population standard deviation of raw cycle quotient values were calculated (Supplemental Table 3) and reveal concordance between male and non-male cannabinoid synthase tissue control detection and MADC2 detection in males (average is thick black bar in Fig. 1A).
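The summary statistics mentioned above (an average and a population standard deviation of raw Cq values) can be computed as in the following sketch. The function name and the example values are hypothetical and are not taken from the dataset; samples with no reported Cq are represented as `None` and excluded.

```python
from statistics import mean, pstdev


def summarize_cq(cq_values):
    """Return count, mean, and population standard deviation of observed Cq values.

    Entries equal to None (no Cq reported) are skipped.
    """
    observed = [cq for cq in cq_values if cq is not None]
    return {
        "n": len(observed),
        "mean": mean(observed),
        "pop_sd": pstdev(observed),  # divides by n, not n - 1
    }


# Hypothetical Cq values for illustration only.
example = summarize_cq([24.0, 26.0, 28.0, None])
```

The population form (`pstdev`) divides by the number of observations, which matches the "population standard deviation" wording; a sample standard deviation would use `stdev` instead.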
The qPCR cycle quotient (Cq) value representative of MADC2 detection in 3,156 males, the detection of the cannabinoid synthase tissue control for 8,200 samples, and the absence of MADC2 detection in 5,047 non-males demonstrate a robust high-throughput real-time genotyping assay for Cannabis. We suspect that a population skew towards female cannabis in our customer submissions influenced the observed sex ratio, so we did not interpret this dataset as though it were a true breeding population. The 8,200 samples coming to the lab were from various sources, breeding populations, and cultivation operations, not just traditional plant breeders. However, we have highlighted sex test genotype results (by HRM) in true breeding populations of European seeds tested at the CRAG from a given traditional cross, in which we observed 40-45% male and 55-60% female, as demonstrated in Supplemental Table 2.
To confirm sequence specificity for the MADC2 molecular probe, targeted sequencing of MADC2 was performed using Thermo Fisher SeqStudio capillary sequencing. Alignment of the targeted MADC2 amplicon sequences reveals highly specific genomic homology. The sequence at the molecular probe locus in 45 randomly selected male samples is conserved and matches the canonical MADC2 isolated and cloned from male Cannabis by Mandolino; it also shows high homology to the Cannabis sativa MADC2 male-specific sequence submitted to GenBank (JF298280.1), which supports accurate detection of genetically diverged male Cannabis individuals. Direct genomic sequence data available for the male sample Pineapple Banana Bubba Kush (BioProject no.: PRJNA378470) show that this strain’s amplicon was otherwise identical to the consensus MADC2 sequence (Fig. 1B). The male MADC2 amplicon maps to a 1-Mbp Y-chromosome contig in the Jamaican Lion male genome, in addition to a relatively small 16-kbp fragment from the PBBK male genome (Supplemental Figs. 3 and 4), further indicating that the MADC2 male homology is a Y chromosome-associated fragment and that our target TaqMan probe detects individuals carrying this Y-chromosome genotype. The Y chromosome is considered homomorphic and degenerate with little to no homology or overlap with the X-chromosome and likely does not recombine.
Assays as alternative sex ID methodologies
We developed sex-determining-specific designs around the Y-chromosome-specific region of MADC2 and designed several assays that can be used in the field to process and detect male DNA from early vegetative plants. We explored two different isothermal protocols, TwistDx RPA and LAMP, with primer and probe designs for each respective assay. In preliminary analysis using the TwistDx RPA system, we employed a custom-designed TwistDx assay to analyze randomly selected samples of male and non-male Cannabis individuals from the 8,200-sample test set. We designed custom TwistDx primers and a probe to target specific Y-chromosome DNA elements in a single-step reaction at room temperature. In this assay, results are read on a Milenia Genline HybriDetect-2 strip in a nucleic acid detection device type 3 (Milenia Genline, Germany): male samples develop two positive bands, one for the target amplicon and one for the assay control, while non-male samples develop only the assay control band. In our preliminary analysis of a small subset of Cannabis samples, we successfully distinguished male from non-male Cannabis individuals and a water control (Supplemental Fig. 5B). We also designed LAMP primers for MADC2 around the canonical sequence found in diverged males and carried out loop-mediated isothermal amplification using the WarmStart Colorimetric LAMP 2× Master Mix under manufacturer specifications. After 30 minutes of incubation at 65 °C, male MADC2 DNA elements are copied, amplified, and accumulated if they are present; the target amplicon can be detected visually, with a yellow signal positively identifying genetically diverged male individuals and a pink signal indicating non-male Cannabis. In a preliminary analysis of a small sample (n = 5 genetic individuals), tested in duplicate, we successfully distinguished males (yellow) from non-males (pink); two no-template controls were used as negative input (Supplemental Fig. 5A).
While this preliminary analysis demonstrates proof of concept for male detection using these two isothermal amplification technologies, additional testing and validation is needed, though the assays show promise as viable commercial screening assays.
High-resolution DNA melting analysis as a contextually effective proxy for sequencing and qPCR
HRM analysis provides a cost-effective and reproducible method to screen for novel sex-specific variants in a large population, particularly in non-male populations. By producing sex-specific amplicons, such as MADC2, from a variety of germplasm, HRM allows the user to understand genotypic differences through the temperature at which DNA transitions from double stranded (dsDNA) to single stranded (ssDNA). By increasing the temperature in small increments over time, the relative fluorescence signal can be detected, providing a definitive temperature at which the DNA melted apart. These temperature-specific fluorescence signals allow for the discovery of novel variants of males and non-males, or if there are potential off-target amplicons from non-males that potentially provide a false result in a simple presence-absence reaction. Thus, these advantages suggest HRM as a superior method of screening for MADC2 and other sex-related elements, to provide a verification of the nucleotide signature encoding the male-specific element. Specifically, we have shown this differentiation of genotypes via HRM in both the literature-described MADC2 region (Fig. 2A) and other non-previously described male-associated DNA (Fig. 2B), which shows that non-male genotypes are more likely to have variation in this targeted region. This revelation may play a role in understanding sex determination in Cannabis, particularly with hermaphroditic and monoecious genotypes.
While HRM provides a more cost-effective and accurate result than previous methods, the melting temperature of these DNA amplicons is of particular importance in HRM analyses, especially in the considerations needed to interpret the data. The melting temperature (Tm) of any specific DNA sequence is conventionally defined as the temperature at which one of every two copies of that segment of DNA molecules is rendered completely single-stranded by the thermal energy of the system. While the specific Tm of a given DNA fragment is determined by its overall base composition, the pyrimidine:purine ratio specifically plays a particularly important role in addition to the various secondary structures of said amplicon, such as sites of potential formation of cruciforms, z-DNA, or other alternate secondary structures. These thermally relevant structural features can manifest as secondary peaks or shoulders in the melt profile for non-male individuals, sized in proportion to the fraction of the total sequence encompassed by the responsible irregularity and correspondingly detracting from, or slightly shifting the location of, the main peak, while male individuals produce a clearly defined and conserved HRM melting profile peak that is unshifted across different male cultivars. Therefore, HRM profiling for MADC2 was employed to rapidly and economically screen multiple genetic individuals to discriminate genetically conserved male amplicons from the various low-homology versions of MADC2 amplicons found in non-male cultivars. This process was also performed in a diverse set of 158 genetically unique individuals from European cultivars of varying lineages to ensure the robustness and reproducibility of the assay being tested (Supplemental Table 2). Melt profiles of MADC2 were grouped by genotype and used to identify and associate MADC2 genotypes of males and not-males.
The plants were propagated and phenotyped, and example morphologies were photographed for phenotypic reference (Supplemental Fig. 6). Nearby regions to the MADC2 locus, specifically the region amplified by the −403 Fwd and −237 Rev primers, were also investigated and presented a similar pattern of males having a high homology that differs from the more variable non-male genetic signature, showing that not only the MADC2 region is conserved but also nearby loci on the Y chromosome (Supplemental Fig. 2).
Figure 2 demonstrates a clear genotype difference observable in the second derivative melting curve analysis of the high-resolution melting profiles of male and non-male Cannabis plants for two marker sets: (A) HRM sex test (−403 Fwd + −237 Rev) and (B) HRM sex test (MADC2 Fwd + MADC2 Rev). In Fig. 2A, the gray lines represent melting profiles of male genotypes, whereas the red lines represent melting profiles of non-male genotypes. In Fig. 2B, the blue lines represent melting profiles of male genotypes, whereas the red lines represent the melting profiles of non-male genotypes.
In this study, we investigated several methodologies for performing high-throughput sex identification genotyping in Cannabis plants using both novel and literature marker sets. We expanded on Mandolino’s report of MADC2 male specificity and have found our application of male-specific MADC2 sequence and other Y-chromosome-associated sequence for male detection assays to be suitable for the high-performance application of Cannabis testing. Our method is comparatively cost-effective, and the markers described in this study provide a comparable end genotype result of male and non-male detection. Furthermore, we were able to place these markers on the recently assembled Y chromosome and show the conservation, or lack thereof, within the amplicons produced by these widely used markers. The accurate identification of sex is essential to large-scale production and breeding of Cannabis; thus, the multiple methodologies presented here allow for accurate, quick, and cost-effective screening that will enable future development of germplasm and the industry. Our real-time assays can be performed by cannabis testing laboratories performing diagnostic testing, as well as in the field with minimal molecular biology equipment and expertise, allowing for a wide range of users and throughput options.
- Additional file 1: The file contains the following:
Supplemental Figure 1 - High-resolution melting profiles of selected Cannabis sativa male and not-male samples from the SexID Steep Hill high-throughput testing sample set, generated with the MADC2 primer set and showing multiple non-male melt profile genotypes. The samples shown in red represent male calls with a melting peak (Tm) at 82°C, whereas the other colors (purple, blue, maroon) represent non-male genotypes with variable target melting peaks.
Supplemental Figure 2 - High-resolution melting profiles of a region upstream of MADC2. This analysis was performed using primers designated -403 Fwd and -237 Rev, which target an upstream region of MADC2 on the proposed Y chromosome. Curves with an inflection point at 78°C are indicative of a male genotype, and the other curves represent the various non-male genotypes of the homologous region.
Supplemental Figure 3 - Genomic Scaffold and Amplicon Alignment on JL_Father's Y-Chromosome. The alignment shows where the MADC2 region lies on a 1Mbp region of the JL_Father Y-chromosome scaffold. The highlighted region near 523Mbp shows the MADC2 probe binding site, and the black bar represents the amplicon region.
Supplemental Figure 4 - Alignment of 42 male amplicon sequences to the MADC2 genomic region in the JL_Father genome, showing high conservation between male genotypes of diverse drug-type and fiber-type cultivars. The highlighted region represents the probe binding site, which is also conserved between all the genotypes. 011623 contig from PBBK aligned to 1Mbp Y_000295F contig from JL father with TaqMan probe highlighted 2) MADC2 aligned to Y contig from Jamaican Lion Father with TaqMan probe highlighted examined.
Supplemental Figure 5 - A. Preliminary analysis of loop-mediated isothermal PCR for male (yellow) and not-male (pink) identification. B. TwistDx assay of male and not-male DNA samples.
Supplemental Figure 6 - Example phenotypes of representative Cannabis plants from the 158 European cultivars germinated, propagated, and tested at the CRAG. A. Male Cannabis plant expressing a male sex phenotype of immature staminate-bearing flowers. B. Selected female trial plants grown in greenhouse post sex testing. C. Non-male Cannabis plant expressing an alternative female sex phenotype of maturing pistillate-bearing flowers with partial staminate bearing.
Supplemental Table 1 - SexID qPCR Results for 8,595 Cannabis Samples. This table shows the cycle quotient values observed for the SexID assay using the MADC2 Fwd + MADC2 Rev + MADC2 TaqMan probe. The sample IDs represent the Genkit lots in which the samples were received.
Supplemental Table 2 - Results of 158 European cultivars tested by HRM using the MADC2 primers. 71 out of 158, or 45% of the population, were measured with the melting peak at 82°C indicative of the male MADC2 genotype, and variable target melting peaks were observed for the remaining 87 test samples, indicative of a not-male MADC2 genotype.
Supplemental Table 3 - Cycle Thresholds/Quotients for MADC2 and the TC Cannabinoid Synthase Positive Control qPCR Primers. The average Cq value describes the cycle at which a positive detection of the target amplification occurred. The population standard deviation describes the standard deviation observed in that Cq value when run on 3,156 male samples for the MADC2 target and 8,200 samples for the positive control. Our qPCR results show an average Ct value of 26 with a standard deviation of three for both the tissue control and MADC2 qPCR assays, suggesting they are reproducible and within our acceptable range of median Cq +/- 10 for a specific and reproducible qPCR assay.
Supplemental Table 4 - A validation set of 20 samples (M1-20) with two template controls (M21-22) and a no-template control was tested with the SexID qPCR assay as part of early assay validation. The validation plants were grown from seed and phenotyped. The observed sex phenotypes were recorded and compared against the results obtained by the SexID assay. Plants with a positive MADC2 signal corresponded with plants with an observed male phenotype (staminate-bearing flowers).
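As a rough illustration of the summary statistics described for Supplemental Table 3 (average Cq, population standard deviation, and the median Cq +/- 10 acceptance window), a minimal Python sketch might look like the following. The function name `summarize_cq`, the use of `None` for wells with no amplification, and the example values are assumptions made for illustration only; they are not taken from the study's data or from the R code the authors used.

```python
from statistics import mean, median, pstdev


def summarize_cq(cq_values, window=10.0):
    """Summarize detected Cq values and flag those outside median +/- window.

    cq_values: iterable of floats, with None marking "no amplification"
               (an assumed convention for this sketch).
    window:    half-width of the acceptance range around the median Cq.
    """
    # Keep only wells where amplification was detected.
    detected = [cq for cq in cq_values if cq is not None]
    if not detected:
        raise ValueError("no detected Cq values to summarize")

    centre = median(detected)
    # Values further than `window` cycles from the median are flagged.
    outliers = [cq for cq in detected if abs(cq - centre) > window]

    return {
        "n": len(detected),
        "mean": mean(detected),
        "population_sd": pstdev(detected),  # population (not sample) SD
        "median": centre,
        "outliers": outliers,
    }


# Hypothetical Cq values for illustration only; not data from the study.
example_cq = [24.0, 26.0, 28.0, None, 26.0, 39.0]
summary = summarize_cq(example_cq)
print(summary)
```

The population standard deviation (`pstdev`) is used here because the table caption refers to a population standard deviation; a sample standard deviation (`stdev`) would divide by n - 1 instead.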
Abbreviations, acronyms, and initialisms
- BDD: BigDye Direct
- Ct/Cq: cycle threshold/quotient
- DNA: deoxyribonucleic acid
- dsDNA: double-stranded DNA
- FTA: Flinders Technology Associates
- Fwd: forward primer sequence
- HRM: high-resolution melting
- JL: Jamaican Lion
- LAMP: loop-mediated isothermal amplification
- MADC: male-associated DNA from Cannabis sativa
- PBBK: Pineapple Banana Bubba Kush
- PCR: polymerase chain reaction
- qPCR: quantitative polymerase chain reaction
- Rev: reverse primer sequence
- RPA: recombinase polymerase amplification
- ssDNA: single-stranded DNA
- Tm: melting temperature
Acknowledgements
The authors are thankful to Daniela Vergara, who helped shape and revise the manuscript during the validation work. Additionally, the authors would like to acknowledge Front Range Biosciences and Steep Hill Laboratories for enabling and funding this work throughout the years. Finally, the authors would like to thank the Centre for Research in Agricultural Genomics for their collaboration with Front Range Biosciences, which enabled this method to be validated on a diverse set of samples from around the world to ensure the accuracy and reproducibility of the technology.
Author contributions
AT collected the experimental data, designed the novel primers, and drafted the manuscript. CP assisted in primer design, primer refinement and optimization, and manuscript drafting and editing. RMG assisted in the project conceptualization, primer design, and manuscript formation. KA assisted in the creation of the figures for the manuscript and in the processing of the R code used to process the data. JA provided genetic material for the validation of this assay and edited the manuscript. AM advised the method validation and edited the manuscript. RJG guided the original experimental design and edited the manuscript. The authors read and approved the final manuscript.
Funding
This work was funded by two private Cannabis companies, Front Range Biosciences and Steep Hill Laboratories. The assay development and validation were performed at Steep Hill Laboratories, and further validation, analyses, and this publication were funded by and performed at Front Range Biosciences.
Availability of data and materials
All data generated or analyzed during this study are included in this published article and its supplementary information files.
Competing interests
AT, CP, RMG, KA, and RJG were employed by Steep Hill Laboratories, which developed and offered the sex marker analysis as a commercial service. AT, CP, RMG, KA, and RJG transitioned from being employed by Steep Hill to being employed by Front Range Biosciences upon acquisition of Steep Hill Genomics. No role was played by Steep Hill Laboratories or Front Range Biosciences (other than by the authors themselves) in the design and conduct of the study, analysis of the data, writing of the manuscript, and the decision to publish. JA and AM declare that they have no competing interests.
References
- Small, Ernest; Cronquist, Arthur (1 August 1976). "A PRACTICAL AND NATURAL TAXONOMY FOR CANNABIS" (in en). TAXON 25 (4): 405–435. doi:10.2307/1220524. ISSN 0040-0262. https://onlinelibrary.wiley.com/doi/abs/10.2307/1220524.
- McPartland, John M.; Guy, Geoffrey W. (1 December 2017). "Models of Cannabis Taxonomy, Cultural Bias, and Conflicts between Scientific and Vernacular Names" (in en). The Botanical Review 83 (4): 327–381. doi:10.1007/s12229-017-9187-0. ISSN 0006-8101. http://link.springer.com/10.1007/s12229-017-9187-0.
- van Velzen, Robin; Schranz, M Eric (3 August 2021). Van De Peer, Yves. ed. "Origin and Evolution of the Cannabinoid Oxidocyclase Gene Family" (in en). Genome Biology and Evolution 13 (8): evab130. doi:10.1093/gbe/evab130. ISSN 1759-6653. PMC 8521752. PMID 34100927. https://academic.oup.com/gbe/article/doi/10.1093/gbe/evab130/6294932.
- Committee on the Health Effects of Marijuana: An Evidence Review and Research Agenda; Board on Population Health and Public Health Practice; Health and Medicine Division; National Academies of Sciences, Engineering, and Medicine (31 March 2017). The Health Effects of Cannabis and Cannabinoids: The Current State of Evidence and Recommendations for Research. Washington, D.C.: National Academies Press. doi:10.17226/24625. ISBN 978-0-309-45304-2. https://www.nap.edu/catalog/24625.
- Salentijn, Elma M. J.; Petit, Jordi; Trindade, Luisa M. (16 May 2019). "The Complex Interactions Between Flowering Behavior and Fiber Quality in Hemp". Frontiers in Plant Science 10: 614. doi:10.3389/fpls.2019.00614. ISSN 1664-462X. PMC 6532435. PMID 31156677. https://www.frontiersin.org/article/10.3389/fpls.2019.00614/full.
- Flajšman, Marko; Slapnik, Miha; Murovec, Jana (1 November 2021). "Production of Feminized Seeds of High CBD Cannabis sativa L. by Manipulation of Sex Expression and Its Application to Breeding". Frontiers in Plant Science 12: 718092. doi:10.3389/fpls.2021.718092. ISSN 1664-462X. PMC 8591233. PMID 34790210. https://www.frontiersin.org/articles/10.3389/fpls.2021.718092/full.
- Graddy-Lovelace, Garrett; Diamond, Adam; Ichikawa, Nina F. (1 August 2020). "Contextualizing the Farm Bill: questions of food, land and agricultural governance" (in en). Renewable Agriculture and Food Systems 35 (4): 352–357. doi:10.1017/S1742170520000125. ISSN 1742-1705. https://www.cambridge.org/core/product/identifier/S1742170520000125/type/journal_article.
- Yu, Bin; Chen, Xinguang; Chen, Xiangfan; Yan, Hong (1 December 2020). "Marijuana legalization and historical trends in marijuana use among US residents aged 12–25: results from the 1979–2016 National Survey on drug use and health" (in en). BMC Public Health 20 (1): 156. doi:10.1186/s12889-020-8253-4. ISSN 1471-2458. PMC 6998313. PMID 32013937. https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-020-8253-4.
- Mandolino, G.; Carboni, A.; Forapani, S.; Faeti, V.; Ranalli, P. (1 January 1999). "Identification of DNA markers linked to the male sex in dioecious hemp (Cannabis sativa L.)" (in en). Theoretical and Applied Genetics 98 (1): 86–92. doi:10.1007/s001220051043. ISSN 0040-5752. http://link.springer.com/10.1007/s001220051043.
- Dills, A.; Goddard, S.; Miron, J. et al. (2 February 2021). "The Effect of State Marijuana Legalizations: 2021 Update, Policy Analysis no. 908". Cato Institute. doi:10.36009/PA.908. https://www.cato.org/policy-analysis/effect-state-marijuana-legalizations-2021-update.
- McKernan, Kevin J.; Helbert, Yvonne; Kane, Liam T.; Ebling, Heather; Zhang, Lei; Liu, Biao; Eaton, Zachary; McLaughlin, Stephen et al. (5 January 2020) (in en). Sequence and annotation of 42 cannabis genomes reveals extensive copy number variation in cannabinoid synthesis and pathogen resistance genes. doi:10.1101/2020.01.03.894428. http://biorxiv.org/lookup/doi/10.1101/2020.01.03.894428.
- Prentout, Djivan; Razumova, Olga; Henri, Hélène; Divashuk, Mikhail; Karlov, Gennady; Marais, Gabriel AB (28 May 2020) (in en). Development of genetic markers for sexing Cannabis sativa seedlings. doi:10.1101/2020.05.25.114355. http://biorxiv.org/lookup/doi/10.1101/2020.05.25.114355.
- Prentout, Djivan; Razumova, Olga; Rhoné, Bénédicte; Badouin, Hélène; Henri, Hélène; Feng, Cong; Käfer, Jos; Karlov, Gennady et al. (1 February 2020). "An efficient RNA-seq-based segregation analysis identifies the sex chromosomes of Cannabis sativa" (in en). Genome Research 30 (2): 164–172. doi:10.1101/gr.251207.119. ISSN 1088-9051. PMC 7050526. PMID 32033943. http://genome.cshlp.org/lookup/doi/10.1101/gr.251207.119.
- Prentout, D; Stajner, N; Cerenak, A; Tricou, T; Brochier-Armanet, C; Jakse, J; Käfer, J; Marais, Gab (12 March 2021) (in en). Plant genera Cannabis and Humulus share the same pair of well-differentiated sex chromosomes. doi:10.1101/2021.03.11.434957. http://biorxiv.org/lookup/doi/10.1101/2021.03.11.434957.
- Mandolino, Giuseppe; Carboni, Andrea; Bagatta, Manuela; Moliterni, V.M. Cristiana; Ranalli, Paolo (2002). "Occurrence and frequency of putatively Y chromosome linked DNA markers in Cannabis sativa L.". Euphytica 126 (2): 211–218. doi:10.1023/A:1016382128401. http://link.springer.com/10.1023/A:1016382128401.
- Vergara, Daniela; Baker, Halie; Clancy, Kayla; Keepers, Kyle G.; Mendieta, J. Paul; Pauli, Christopher S.; Tittes, Silas B.; White, Kristin H. et al. (1 November 2016). "Genetic and Genomic Tools for Cannabis sativa" (in en). Critical Reviews in Plant Sciences 35 (5-6): 364–377. doi:10.1080/07352689.2016.1267496. ISSN 0735-2689. https://www.tandfonline.com/doi/full/10.1080/07352689.2016.1267496.
- Ming, Ray; Wang, Jianping; Moore, Paul H.; Paterson, Andrew H. (1 February 2007). "Sex chromosomes in flowering plants" (in en). American Journal of Botany 94 (2): 141–150. doi:10.3732/ajb.94.2.141. ISSN 0002-9122. https://onlinelibrary.wiley.com/doi/10.3732/ajb.94.2.141.
- Ming, Ray; Bendahmane, Abdelhafid; Renner, Susanne S. (2 June 2011). "Sex Chromosomes in Land Plants" (in en). Annual Review of Plant Biology 62 (1): 485–514. doi:10.1146/annurev-arplant-042110-103914. ISSN 1543-5008. https://www.annualreviews.org/doi/10.1146/annurev-arplant-042110-103914.
- Punja, Zamir K.; Holmes, Janesse E. (25 June 2020). "Hermaphroditism in Marijuana (Cannabis sativa L.) Inflorescences – Impact on Floral Morphology, Seed Formation, Progeny Sex Ratios, and Genetic Variation". Frontiers in Plant Science 11: 718. doi:10.3389/fpls.2020.00718. ISSN 1664-462X. PMC 7329997. PMID 32670310. https://www.frontiersin.org/article/10.3389/fpls.2020.00718/full.
- Razumova, Olga V.; Alexandrov, Oleg S.; Divashuk, Mikhail G.; Sukhorada, Tatiana I.; Karlov, Gennady I. (1 May 2016). "Molecular cytogenetic analysis of monoecious hemp (Cannabis sativa L.) cultivars reveals its karyotype variations and sex chromosomes constitution" (in en). Protoplasma 253 (3): 895–901. doi:10.1007/s00709-015-0851-0. ISSN 0033-183X. http://link.springer.com/10.1007/s00709-015-0851-0.
- Faux, Anne-Michelle; Berhin, Alice; Dauguet, Nicolas; Bertin, Pierre (1 March 2014). "Sex chromosomes and quantitative sex expression in monoecious hemp (Cannabis sativa L.)" (in en). Euphytica 196 (2): 183–197. doi:10.1007/s10681-013-1023-y. ISSN 0014-2336. https://link.springer.com/10.1007/s10681-013-1023-y.
- Divashuk, Mikhail G.; Alexandrov, Oleg S.; Razumova, Olga V.; Kirov, Ilya V.; Karlov, Gennady I. (21 January 2014). Marais, Gabriel A. B. ed. "Molecular Cytogenetic Characterization of the Dioecious Cannabis sativa with an XY Chromosome Sex Determination System" (in en). PLoS ONE 9 (1): e85118. doi:10.1371/journal.pone.0085118. ISSN 1932-6203. PMC 3897423. PMID 24465491. https://dx.plos.org/10.1371/journal.pone.0085118.
- Techen, Natascha; Chandra, Suman; Lata, Hemant; ElSohly, Mahmoud; Khan, Ikhlas (1 November 2010). "Genetic Identification of Female Cannabis sativa Plants at Early Developmental Stage" (in en). Planta Medica 76 (16): 1938–1939. doi:10.1055/s-0030-1249978. ISSN 0032-0943. http://www.thieme-connect.de/DOI/DOI?10.1055/s-0030-1249978.
- Gray, Dennis J.; Baker, Halie; Clancy, Kayla; Clarke, Robert C.; deCesare, Kymron; Fike, John; Gibbs, Matthew J.; Grotenhermen, Franjo et al. (1 November 2016). "Current and Future Needs and Applications for Cannabis" (in en). Critical Reviews in Plant Sciences 35 (5-6): 425–426. doi:10.1080/07352689.2017.1284529. ISSN 0735-2689. https://www.tandfonline.com/doi/full/10.1080/07352689.2017.1284529.
- Fetterman, Patricia S.; Keith, Elizabeth S.; Waller, Coy W.; Guerrero, Oswaldo; Doorenbos, Norman J.; Quimby, Maynard W. (1 August 1971). "Mississippi-Grown Cannabis sativa L.: Preliminary Observation on Chemical Definition of Phenotype and Variations in Tetrahydrocannabinol Content versus Age, Sex, and Plant Part" (in en). Journal of Pharmaceutical Sciences 60 (8): 1246–1249. doi:10.1002/jps.2600600832. https://linkinghub.elsevier.com/retrieve/pii/S0022354915380333.
- Clarke, Robert Connell; Merlin, Mark (2016). Cannabis: evolution and ethnobotany (First paperback printing ed.). Berkeley Los Angeles London: University of California Press. ISBN 978-0-520-29248-2.
- Small, E.; Beckstead, H. D. (1 June 1973). "Common cannabinoid phenotypes in 350 stocks of Cannabis". Lloydia 36 (2): 144–165. ISSN 0024-5461. PMID 4744553. https://pubmed.ncbi.nlm.nih.gov/4744553.
- Small, Ernest; Beckstead, H. D. (1 September 1973). "Cannabinoid Phenotypes in Cannabis sativa" (in en). Nature 245 (5421): 147–148. doi:10.1038/245147a0. ISSN 0028-0836. https://www.nature.com/articles/245147a0.
- Klimyuk, Victor I.; Carroll, Bernard J.; Thomas, Colwyn M.; Jones, Jonathan D.G. (1 March 1993). "Alkali treatment for rapid preparation of plant material for reliable PCR analysis" (in en). The Plant Journal 3 (3): 493–494. doi:10.1111/j.1365-313X.1993.tb00169.x. https://onlinelibrary.wiley.com/doi/10.1111/j.1365-313X.1993.tb00169.x.
- Wickham, Hadley (2009). Ggplot2: elegant graphics for data analysis. Use R!. New York: Springer. ISBN 978-0-387-98140-6. OCLC 382399721. https://www.worldcat.org/title/mediawiki/oclc/382399721.
- Vossen, R.; van der Stoep, N.; den Dunnen, J.T. (2007). "Transfering PCRs to HRM-assays on the LightCycler 480 System – Examples for BRCA1" (PDF). Biochemica 4: 10–11. https://www.gene-quantification.de/vossen-et-al-ras-hrm-2007.pdf.
- Lorenz, Todd C. (22 May 2012). "Polymerase Chain Reaction: Basic Protocol Plus Troubleshooting and Optimization Strategies" (in en). Journal of Visualized Experiments (63): 3998. doi:10.3791/3998. ISSN 1940-087X. PMC 4846334. PMID 22664923. http://www.jove.com/video/3998/.
- Distefano, Gaetano; Caruso, Marco; La Malfa, Stefano; Gentile, Alessandra; Wu, Shu-Biao (30 August 2012). Niedz, Randall P. ed. "High Resolution Melting Analysis Is a More Sensitive and Effective Alternative to Gel-Based Platforms in Analysis of SSR – An Example in Citrus" (in en). PLoS ONE 7 (8): e44202. doi:10.1371/journal.pone.0044202. ISSN 1932-6203. PMC 3431301. PMID 22957003. https://dx.plos.org/10.1371/journal.pone.0044202.
- Blake, R. D. (1 July 1987). "Cooperative lengths of DNA during melting" (in en). Biopolymers 26 (7): 1063–1074. doi:10.1002/bip.360260706. ISSN 0006-3525. https://onlinelibrary.wiley.com/doi/10.1002/bip.360260706.
Notes
This presentation is faithful to the original, with only a few minor changes to presentation. Some grammar and punctuation were cleaned up to improve readability. In some cases important information was missing from the references, and that information was added. The original article lists references alphabetically, but this version, by design, lists them in order of appearance.