Genetic diversity and phylogenetic relationship of higher termite Globitermes sulphureus (Haviland) (Blattodea: Termitidae)

The subterranean termite Globitermes sulphureus is commonly found in Malaysia, Singapore, Thailand, and Vietnam (Ahmad, 1965; Bordereau et al., 1997; Kuswanto et al., 2015; Lee et al., 2007; Ngee & Lee, 2002). This termite belongs to the higher termites, which harbor only bacteria and archaea in their gut (Bujang et al., 2014). As a wood-feeding termite, this species has been reported to infest the wooden structures of premises (Ab Majid & Ahmad, 2009; Neoh et al., 2011). Moreover, it has also been reported as the primary pest in agricultural sectors such as coconut and oil palm plantations (Lee et al., 2003). G. sulphureus is recognized as a pest of significant economic importance in Southeast Asia (Rust & Su, 2012).

Abstract
The subterranean higher termite Globitermes sulphureus (Blattodea: Termitidae) is a peridomestic forager and is regarded as a significant pest in Southeast Asia. In this study, populations of G. sulphureus from the USM main campus area were investigated based on partial sequences of the mitochondrial COII gene. Genetic diversity was determined using DnaSP v5, while the phylogenetic relationship was inferred with the Neighbor-Joining (NJ) and Maximum Likelihood (ML) methods in Molecular Evolutionary Genetics Analysis (MEGA 7) software. A total of 2 haplotypes, distinguished by two variable sites, were detected among 5 sample sequences. Both phylogenetic trees gave similar topologies and supported the haplotype diversity results. Based on the haplotype diversity and molecular phylogeny, it is proposed that geographic isolation and lack of human activities have contributed to the neutral genetic diversity of G. sulphureus.

Sociobiology An international journal on social insects

COII is a subunit of cytochrome c oxidase in the mitochondria (Frati et al., 1997). The COII gene is considered to evolve faster than the 12S and 16S genes (Yeap et al., 2007). Its rapid evolution has proven helpful for deducing phylogenetic relationships between closely related insect species because of the relatively high degree of variation at the 3' end of the gene (Aly et al., 2012; Singla et al., 2016).
G. sulphureus is a peridomestic forager and mound-building termite (Lee et al., 2003). Its mound is easily identified by its dome shape and dark brown color (Ahmad, 1965). The species itself is easily identified because the soldiers have a bright yellow body (Ab Majid & Ahmad, 2011; Hussin & Ab Majid, 2017; Hussin et al., 2018; Khizam & Ab Majid, 2019). The goal of this study is to determine the genetic diversity and phylogenetic relationship among local populations of G. sulphureus on the Universiti Sains Malaysia (USM) main campus, Penang, using partial sequences of the mitochondrial COII gene. There may also be differences in the termite genome among mounds of G. sulphureus in response to local ecological conditions.

DNA Extraction and PCR Amplification
Ten worker termites from each collection site were rinsed in sterile distilled water and dried with a paper towel. For DNA extraction, the workers' heads were cut from the bodies using sterile dissecting scissors. Genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, Germany) for PCR amplification of the COII gene. The COII gene was amplified using a pair of primers, Forward (5'-CATTGCACCCGCAATCATCC-3') and Reverse (5'-GAATCTGTGGTTTGCTCCTCCGC-3'). The PCR reaction mixture (50 μL) contained 25 μL of 2× TopTaq Master Mix (Qiagen, Germany), giving final concentrations of 1.5 mM MgCl2, 1.25 units of TopTaq DNA polymerase, 1× PCR buffer, and 200 µM of each dNTP, together with 1 μL (10 μM) of each primer, 5 μL of 10× CoralLoad Concentrate (as a substitute for loading dye), 10-100 ng of bulk DNA template, and sterile distilled water. The PCR profile comprised an initial denaturation at 94 °C for 2 min, followed by 40 cycles of denaturation at 94 °C for 45 s, primer annealing at 50 °C for 45 s, and extension at 72 °C for 60 s, with a final extension at 72 °C for 5 min. The PCR product was purified with the MEGAquick-spin™ Total Fragment DNA Purification Kit (iNtRON Biotechnology, South Korea).
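The reaction set-up and cycling profile described above imply some simple bookkeeping, sketched here in Python. The component volumes and hold times are taken from the text; the helper names are illustrative only, and the time estimate ignores ramping between temperatures, which depends on the thermal cycler.

```python
# Bookkeeping for the 50 µL PCR described in the text (illustrative helper names).

REACTION_VOLUME_UL = 50.0

# Fixed-volume components listed in the text (µL per reaction).
FIXED_COMPONENTS_UL = {
    "2x TopTaq Master Mix": 25.0,
    "forward primer, 10 µM": 1.0,
    "reverse primer, 10 µM": 1.0,
    "10x CoralLoad Concentrate": 5.0,
}


def template_plus_water_ul(total_ul=REACTION_VOLUME_UL, fixed=FIXED_COMPONENTS_UL):
    """Volume left for DNA template plus sterile distilled water."""
    return total_ul - sum(fixed.values())


def final_concentration(stock_conc, added_ul, total_ul=REACTION_VOLUME_UL):
    """Final concentration after dilution (same unit as stock_conc)."""
    return stock_conc * added_ul / total_ul


def programmed_time_s(initial_s=120, cycles=40, step_s=(45, 45, 60), final_s=300):
    """Sum of programmed hold times in seconds; ramping time is not included."""
    return initial_s + cycles * sum(step_s) + final_s


if __name__ == "__main__":
    print(template_plus_water_ul())        # µL available for template + water
    print(final_concentration(10.0, 1.0))  # final primer concentration, µM
    print(programmed_time_s() / 60)        # programmed hold time, minutes
```

Under these figures, 18 μL remains for template and water, each primer ends up at 0.2 μM, and the programmed holds add up to 107 min.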

Sequencing and Analysis of COII Gene Sequences
The purified PCR products were sequenced by Sanger sequencing at First Base Laboratories Sdn. Bhd., Malaysia. The raw data were viewed and extracted using FinchTV 1.4 (www.geospiza.com). The forward and reverse sequences were then aligned using T-Coffee (Notredame et al., 2000). The aligned sequences were edited manually to remove low-quality bases before the files were converted to FASTA format. The FASTA files were submitted to a BLASTN search at the NCBI database (https://blast.ncbi.nlm.nih.gov) for comparison with all COII gene sequences available in GenBank. Species were identified based on ≥ 99% sequence similarity with entries in the NCBI database. The partial COII sequences were submitted to GenBank.

Genetic Diversity and Phylogenetic Analysis
All sequences were aligned automatically using the multiple alignment algorithm in ClustalX v2.1 (Larkin et al., 2007; Thompson et al., 1997) with default settings. Genetic characteristics such as haplotype diversity, nucleotide diversity, and the total number of mutations were calculated using DnaSP v5 (Librado & Rozas, 2009). The pairwise genetic distance was calculated using the p-distance method in MEGA 7. Phylogenetic analysis was carried out using Molecular Evolutionary Genetics Analysis (MEGA) version 7 (Kumar et al., 2016). The phylogenetic trees were constructed using the Neighbor-Joining (NJ) method (Saitou & Nei, 1987) based on p-distance (Nei & Kumar, 2000) and the Maximum Likelihood (ML) method based on the Hasegawa-Kishino-Yano model (Hasegawa et al., 1985). Coptotermes gestroi (a lower termite), accession number GU931692.1, was used as an outgroup. Bootstrap analysis with 1000 resamplings was used to assess support for the NJ and ML trees (Felsenstein, 1985).
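The p-distance mentioned above is the proportion of compared sites at which two aligned sequences differ. The following Python sketch illustrates the idea; it is not MEGA's implementation. Skipping gapped or ambiguous sites pair by pair is an assumption made here, and the sequences in the example are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, skip="-N?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguity symbol are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def pairwise_p_distances(named_seqs):
    """p-distance for every unordered pair in a {name: sequence} mapping."""
    return {
        (name_a, name_b): p_distance(seq_a, seq_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(named_seqs.items(), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from this study.
    toy = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAC", "s3": "ACGTACGTTT"}
    for pair, dist in pairwise_p_distances(toy).items():
        print(pair, dist)
```

For scale, two differing sites over about 420 compared sites give a p-distance of 2/420, roughly 0.0048, which rounds to 0.005.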

Results
Five COII gene sequences were used, with an average amplicon size of 420 base pairs (bp) after editing and removal of low-quality bases. BLASTN results confirm that the termite species is G. sulphureus, and the accession numbers for the submitted sequences are written in

Nucleotide Analysis
The average base frequencies for the five COII nucleotide sequences are A = 35.5%, T = 22.9%, G = 16.1%, and C = 25.5%. The total A+T content is 58.4%, higher than the C+G content of 41.6%. The nucleotide diversity, Pi, is 0.00192 (Table 2). Two polymorphic (singleton) sites are observed in the COII nucleotide sequence of the DV sample, at positions 252 and 255 (Table 3). The pairwise genetic distance between partial COII gene sequences ranges from 0.000 to 0.005 (Table 4).

Table 2. Haplotype, haplotype diversity (Hd), nucleotide diversity (Pi), and accession numbers of G. sulphureus from five sample locations.
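Nucleotide diversity can be read as the average number of differences per site over all pairs of sequences. The Python sketch below computes it directly from an alignment. It is an illustration under that definition, not DnaSP's code; it assumes an alignment without gaps, and the toy sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across all sequence pairs.

    Assumes equal-length aligned sequences with no gaps or missing data.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_differences = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_differences / (n_pairs * length)


if __name__ == "__main__":
    # Hypothetical 20-site alignment: four identical sequences and one
    # sequence that differs at two sites (mirrors the 4 + 1 pattern only).
    base = "ACGTACGTACGTACGTACGT"
    variant = "ACGTACGTACGTACGTACTA"
    print(nucleotide_diversity([base] * 4 + [variant]))
```

With five sequences of which one differs from the other four at two sites, four of the ten pairs differ by two sites, so Pi = 8/(10 × L) = 0.8/L for L analysed sites. For L = 420 this is about 0.0019; the reported value of 0.00192 corresponds to roughly 416 to 417 analysed sites, which would fit a few alignment positions being excluded, although the number of sites used by DnaSP is not stated here.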

Haplotype Analysis
We observed two haplotypes, with haplotype 1 consisting of four colonies and haplotype 2 consisting of only one colony (Table 2), distinguished by two variable sites (Table 3). The haplotype diversity (Hd) of the five samples is 0.400 (Table 2). The relationship between haplotypes is further confirmed by the NJ and ML methods as implemented in MEGA 7 (Fig 2 and Fig 3).
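The haplotype diversity reported above can be reproduced from the haplotype counts alone. The Python sketch below uses the standard sample-size-corrected estimator, Hd = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sequences; that this is the estimator applied by DnaSP is an assumption here. The colony labels follow the text, and the haplotype names H1 and H2 are shorthand.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Sample-size-corrected haplotype diversity from a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq_freq)


if __name__ == "__main__":
    # Haplotype assignment as described in the text: DV alone in haplotype 2.
    colonies = {"DV": "H2", "PM": "H1", "BP": "H1", "IK": "H1", "TH": "H1"}
    hd = haplotype_diversity(list(colonies.values()))
    print(round(hd, 3))
```

With frequencies of 4/5 and 1/5, the sum of squared frequencies is 0.68, so Hd = (5/4) × 0.32 = 0.400, in agreement with the value in Table 2.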

Phylogenetic Relationship Inferred From COII Genes
The phylogenetic relationship of G. sulphureus from five locations is analyzed using two approaches: a distance-matrix method, NJ (Fig 2), and a character-based method, ML (Fig 3). Both trees show the five samples as a monophyletic group, with three separate clades once the outgroup species, C. gestroi CG003TW, is included. The first clade comprises haplotype 1, while the second clade comprises haplotype 2. The third clade shows the separation of Termitidae from the Rhinotermitidae family. The trees are also supported by the pairwise genetic distance values (Table 4). Colony DV has a pairwise genetic distance of 0.005 from the other colonies, which places it in the second clade. Meanwhile, all colonies in haplotype 1 have a pairwise genetic distance of 0.000 between them.

Discussion
There are five accessible mounds of G. sulphureus identified on the USM main campus. Colony DV is located at Durian Valley, known as a natural forest habitat within the USM main campus area. This location is protected to preserve its flora and fauna (Meng et al., 2002). Colony PM is located among the bamboo trees near the Minden Field, where students actively play football. Colony BP is located closer to the hostel area, while colony IK is near the main roadside. Lastly, colony TH is located between the hostels and the lake.
Globitermes sulphureus is regarded as a peridomestic forager and was previously found in a disturbed forest area rather than a natural forest area (Aiman Hanis et al., 2014). A disturbed forest area is a place that has been cleared and developed for eco-tourism activities, where various wooden structures and facilities have been built. G. sulphureus is also commonly found in rural and urban areas (Lee et al., 2003; Ab Majid & Ahmad, 2009). Urbanization of plantation areas attracted this termite, causing infestation of the door and window frames of houses (Ngee & Lee, 2002). Therefore, the occurrence of this species in the USM area is not surprising.
Partial COII gene sequences of G. sulphureus isolated from five colonies are used for genetic diversity and phylogenetic analysis. The nucleotide composition of the COII gene sequences from the colonies is calculated. The COII gene sequences show a higher percentage of A+T (58.4%) than G+C (41.6%). This adenine and thymine bias is consistent with data on the mitochondrial COII genes of other higher termite species (genera Microtermes, Microcerotermes, and Odontotermes) and lower termites (genus Coptotermes) (Singla et al., 2016; Yeap et al., 2007). High A+T content is typical of insect mitochondrial DNA (Keller et al., 2007). However, the adenine-thymine percentage recorded in this study is lower than that reported by those studies (59.87-62.0%). This result might arise because partial sequences of the COII gene are used in this study instead of full sequences (675-680 bp).
The pairwise genetic distance in this study ranges from 0.000 to 0.005. This value is lower than those reported previously for genetic distances within the genus Odontotermes (0.025 to 0.072) and the genus Microtermes (0.028 to 0.229) (Singla et al., 2016), and higher than the pairwise genetic distance between Macrotermes carbonarius populations (0.003) (Ab . This result suggests that populations within a species have smaller genetic distances than species within a genus because intraspecific populations are more closely related and share a more recent common ancestor. In the haplotype analysis, two haplotypes are formed, with the DV colony alone in haplotype 2. The DV colony has two variable sites, giving a pairwise genetic distance of 0.005 from the other colonies. This result may reflect the distinct nature of the natural forest zone (DV) compared with the urban zones (IK, PM, BP, and TH). In a previous study, Szalanski et al. (2008) demonstrated that Reticulitermes isolated from Lake Wedington (National Forest) had a higher frequency of rare haplotypes than Reticulitermes isolated from urban areas. Geographic isolation and the lack of human involvement in forested regions contributed to the rare haplotypes of Reticulitermes species. In addition, demographic fluctuations and natural selection can affect the neutral genetic diversity of any particular species (Ellegren & Galtier, 2016).
Two variable sites are detected in this study. This contrasts with another higher termite (M. carbonarius) isolated from the USM main campus, which has only one variable site (Ab . Both variable sites are silent mutations, since the changes in nucleotide bases do not affect the amino acid sequence. The silent mutations in this study can be regarded as synonymous mutations, since the nucleotide bases change only at the third position of the codon. Synonymous mutations usually occur under neutral selection and later become fixed (Fouks & Lattorff, 2016). They were long thought to be without phenotypic consequences but are currently recognized as critical in shaping gene expression, protein folding, cellular function, and the organism's fitness (Plotkin & Kudla, 2011; Zwart et al., 2018). However, the consequences of synonymous mutations for termite fitness remain to be understood.
The phylogenetic trees support the genetic diversity results, with two clusters formed according to the haplotypes. Both trees (NJ and ML) show monophyly of all five colonies. Bootstrap values (> 80%) indicate strong support for both trees. In a phylogenetic tree, closely related organisms are grouped together according to order, family, subfamily, genus, and species (Ab Bourguignon et al., 2014; Singla et al., 2015; Yeap et al., 2007). Since no parsimony-informative site is detected in this study, parsimony analysis is excluded.
More sampling sites and analyses are needed to confirm the geographic distribution and the native or invasive status of G. sulphureus on the USM main campus and in Malaysia. Genotype analysis may explain the breeding pattern at the microsatellite level and reveal the nature of haplotype variability within G. sulphureus.
In conclusion, this study demonstrated the COII gene's ability to differentiate between G. sulphureus populations from a few different USM main campus locations. The genetic diversity analysis shows nucleotide divergence between isolated populations. Phylogenetic analysis supports the haplotype relationship of G. sulphureus. However, geographical area influences on the species' genetic diversity require more sampling sites and further analysis such as microsatellite genotyping.