Genome-wide analysis of 944 133 individuals provides insights into the etiology of haemorrhoidal disease

Objective Haemorrhoidal disease (HEM) affects a large and silently suffering fraction of the population but its aetiology, including suspected genetic predisposition, is poorly understood. We report the first genome-wide association study (GWAS) meta-analysis to identify genetic risk factors for HEM to date.
Design We conducted a GWAS meta-analysis of 218 920 patients with HEM and 725 213 controls of European ancestry. Using GWAS summary statistics, we performed multiple genetic correlation analyses between HEM and other traits, calculated HEM polygenic risk scores (PRS) and evaluated their translational potential in independent datasets. Using functional annotation of GWAS results, we identified HEM candidate genes, whose differential expression and coexpression in HEM tissues were evaluated by RNA-seq analyses. The localisation of expressed proteins at selected loci was investigated by immunohistochemistry.
Results We demonstrate modest heritability and genetic correlation of HEM with several other diseases from the GI, neuroaffective and cardiovascular domains. HEM PRS validated in 180 435 individuals from independent datasets allowed the identification of those at risk and correlated with younger age of onset and recurrent surgery. We identified 102 independent HEM risk loci harbouring genes whose expression is enriched in blood vessels and GI tissues, and in pathways associated with smooth muscles, epithelial and endothelial development and morphogenesis. Network transcriptomic analyses highlighted HEM gene coexpression modules that are relevant to the development and integrity of the musculoskeletal and epidermal systems, and the organisation of the extracellular matrix.
Conclusion HEM has a genetic component that predisposes to smooth muscle, epithelial and connective tissue dysfunction.


INTRODUCTION
Haemorrhoids are normal anal vascular cushions filled with blood at the junction of the rectum and the anus. It is assumed that their main role in humans is to maintain continence, 1 but other functions such as sensing fullness, pressure and perceiving anal contents have been suggested given the sensory innervation. 2 Haemorrhoidal disease (hereafter referred to as HEM) occurs when haemorrhoids enlarge and become symptomatic (sometimes associated with rectal bleeding and itching/soiling) due to the deterioration or prolapse of the anchoring connective tissue, the dilation of the haemorrhoidal plexus or the formation of blood clots. Severe forms of HEM often require surgical treatment and the removal of abnormally enlarged and/or thrombosed haemorrhoids. 1 HEM prevalence increases with age and shows staggering figures worldwide (up to 86% prevalence in some reports), 3 whereby a large proportion of cases remain undetected as asymptomatic or mild enough to be self-treated with over-the-counter treatment. HEM represents a considerable medical and socioeconomic burden with an estimated annual cost of US$800 million in the USA alone, mainly related to the large number of haemorrhoidectomies performed every year. 4 A number of HEM risk factors have been suggested, including human erect position. The tight anal sealing provided by the elaborated haemorrhoidal plexus may have developed during human evolution co-occurring with permanent bipedalism, as shown by our histology comparison of four different mammals (human, gorilla, baboon, mouse; online supplemental figure 1, online supplemental material 1). Other suggested risk factors are a sedentary lifestyle, obesity, reduced dietary fibre intake, spending excess time on the toilet, straining during defecation, strenuous lifting, constipation, diarrhoea, pelvic floor dysfunction, pregnancy and giving natural birth, with several being controversially reported.
The hypothetical model shown in online supplemental figure 2 summarises the contemporary concepts regarding the pathophysiology of HEM development. 5 To date, HEM aetiopathogenesis remains poorly investigated, and neither the exact molecular mechanisms nor the reason(s) why only some people develop HEM are known. Genetic susceptibility may play a role in HEM development, but no large-scale, genome-wide association study (GWAS) for HEM has ever been conducted. To evaluate the contribution of genetic variation to the genetic architecture of HEM, we carried out a GWAS meta-analysis in 218 920 affected individuals and 725 213 population controls of European ancestry.

METHODS
Detailed methods are provided in the online supplemental material 1 and summarised in the online supplemental figure 3.

GWAS meta-analysis and fine-mapping genomic regions
We conducted a GWAS meta-analysis in 944 133 individuals of European ancestry from five large population-based cohorts (23andMe, 6 UK Biobank (UKBB), 7 Estonian Genome Centre at the University of Tartu, 8 Michigan Genomics Initiative, 9 and Genetic Epidemiology Research on Aging (GERA) 10 ) including 218 920 HEM cases and 725 213 controls (online supplemental table 1). HEM cases had higher body mass index (BMI), were significantly older and more often women compared with non-HEM controls; hence, age, sex, BMI (if available) and top principal components from principal component analysis (PCA) were included as covariates in individual GWAS (see 'Methods' section in online supplemental material 1, hereafter referred to as online methods).
After data harmonisation and quality control, 8 494 288 high-quality, common (minor allele frequency >0.01) single nucleotide polymorphisms (SNPs) were included in a fixed-effect inverse variance meta-analysis using the software METAL 11 (online methods). We identified 5480 genome-wide significant associations (P_Meta < 5×10⁻⁸), which were mapped to 102 independent genomic regions using FUMA 12 (online methods). A summary of the association results for 102 lead SNPs is provided in figure 1 and online supplemental table 2. Although genomic inflation was observed (λ=1.3; online supplemental figure 4), this was likely due to polygenicity rather than population stratification, as determined via linkage disequilibrium score regression analysis (LDSC, intercept=1.06; online methods) and based on a normalised λ_1000=1.001. The heritability of HEM was estimated at 5% (SNP-based heritability, h²_SNP, computed with LDSC), whereby the newly identified 102 risk variants explained about 0.9% of the variance (online methods).
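The two calculations named in this paragraph can be illustrated with a minimal Python sketch. It is not METAL and not the authors' pipeline: `ivw_fixed_effect` and `lambda_1000` are hypothetical helper names, and the λ_1000 rescaling shown is the commonly used normalisation to 1000 cases and 1000 controls, assumed here rather than confirmed by the text.

```python
import math

def ivw_fixed_effect(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of per-cohort estimates.

    betas: per-cohort log(OR) estimates for one SNP
    ses:   matching standard errors
    """
    weights = [1.0 / se ** 2 for se in ses]   # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # two-sided p-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

def lambda_1000(lam, n_cases, n_controls):
    """Rescale genomic inflation to a study of 1000 cases and 1000 controls
    (assumed normalisation; the text only reports the resulting value)."""
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale

# Values reported in the text: lambda = 1.3, 218 920 cases, 725 213 controls
print(round(lambda_1000(1.3, 218920, 725213), 3))  # 1.001
```

Under this assumed formula the reported λ=1.3 rescales to about 1.001, in line with the value given above.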

Significance of this study
What is already known about this subject? ► Human haemorrhoidal disease (HEM) is a prevalent anorectal pathology characterised by the symptomatic enlargement and distal displacement of anal cushions. ► As a consequence of scarce research and of being a taboo topic in society, only a few HEM non-genetic risk factors have been suggested, thus the aetiology of the disease still remains unclear. ► Genetic susceptibility to HEM has been suspected but was never systematically investigated.
What are the new findings? ► Here, we report the first well-powered genome-wide association study (GWAS) meta-analysis with a sample size of 944 133 individuals to identify genetic risk factors for HEM. ► We describe 102 novel independent HEM risk loci that are functionally linked to pathways associated with smooth muscles, epithelial and endothelial development and morphogenesis. ► We show genetic correlations of HEM with several other diseases that are classified as GI, neuroaffective and cardiovascular disorders. ► We report significant associations for computed HEM polygenic risk scores, which were validated in independent population-based cohorts. ► We show significant enrichment of HEM genes in tissue coexpression modules responsible for the development of musculoskeletal and epidermal systems, and the organisation of the extracellular matrix. ► Based on our data, we outline HEM as a disorder of impaired neuromuscular motility, smooth muscle contraction and extracellular matrix organisation.
How might it impact on clinical practice in the foreseeable future? ► The results from this GWAS provide new insights on HEM genetic predisposition to smooth muscle, epithelial and connective tissue dysfunction. ► HEM polygenic risk scores identify individuals at risk and correlate with a more severe phenotype.

Figure 1
Annotation of 102 haemorrhoidal disease (HEM) genome-wide association study (GWAS) risk loci. From left to right: Manhattan plot of GWAS meta-analysis results (genome-wide significance level, P_Meta < 5×10⁻⁸, indicated with a vertical dotted red line); Lead single nucleotide polymorphism (SNP): marker associated with the strongest association signal from each locus (also annotated with a red circle in the Manhattan plot); Effect allele: allele associated with reported genetic risk effects (OR), also always the minor allele; OR: with respect to the effect allele; Effect allele frequency: frequency of the effect allele in the discovery dataset; Number of SNPs in 95% credible set: the minimum set of variants from Bayesian fine-mapping analysis that is >95% likely to contain the causal variant; SNP with probability >50%: single variant (if detected) with >50% probability of being causal (coding SNPs highlighted in red); Nearest gene (#genes within locus boundaries): gene closest to the lead SNP (if within 100 kb distance, otherwise 'na') and number of additional genes positionally mapped to the locus using FUMA (online supplemental table 2 and online methods); Signif. DGEx: locus containing HEM genes differentially expressed in RNA Combo-Seq analysis of HEM affected tissue, detected at higher (green) and/or lower (red) level of expression (see online methods).

Endoscopy
Bayesian fine-mapping analysis with FINEMAP 13 identified a total of 3323 SNPs that belong to the 95% credible sets of variants most likely to be causal at each locus (online methods, online supplemental figure 5 and online supplemental table 3). For six loci, we pinpointed the association signal to a single causal variant with greater than 95% certainty, including two missense variants rs2186797 (ANO1) and rs35318931 (SRPX). For another 24 loci, there was evidence that the lead variant is causal with >50% certainty (figure 1).
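As an illustration of what a 95% credible set means, the sketch below builds one from per-SNP posterior probabilities at a single locus. It assumes one causal variant per locus and uses made-up SNP names and probabilities; FINEMAP itself works with causal configurations, so this is a simplification rather than the authors' procedure.

```python
def credible_set(posteriors, coverage=0.95):
    """Smallest set of SNPs whose posterior probabilities sum to >= coverage.

    posteriors: dict mapping SNP id -> posterior probability of being causal
                (assumed to sum to 1 across the locus)
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, pp in ranked:
        chosen.append(snp)
        cumulative += pp
        if cumulative >= coverage:
            break
    return chosen

# Hypothetical locus with four variants
toy_locus = {"snp_a": 0.62, "snp_b": 0.25, "snp_c": 0.09, "snp_d": 0.04}
print(credible_set(toy_locus))  # ['snp_a', 'snp_b', 'snp_c']
```

A locus resolved to a single variant with greater than 95% certainty corresponds to a credible set of size one in this scheme.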

Cross-trait analyses
A lookup of HEM association signals in previous GWAS studies retrieved from GWAS Atlas, 14 GWAS Catalog 15 and via Phenoscanner v2 16 (online methods) revealed that 76/102 loci had been previously associated with diseases and traits across the metabolic, cardiovascular, digestive, psychiatric, environmental and other domains (online supplemental figure 6 and online supplemental table 4).
We then investigated genetic correlations with 1387 other human traits and conditions using LDSC as implemented in CTG-VL 17 (online methods). The strongest correlations (r_g) were observed with diseases and traits from the GI domain, including 'other diseases of anus and rectum' (r_g=0.78, P_FDR=4.94×10⁻⁸), 'fissure and fistula of anal and rectal regions' (r_g=0.58, P_FDR=2.70×10⁻¹²), 'self-reported IBS' (r_g=0.42, P_FDR=1.87×10⁻⁷), use of 'laxatives' (r_g=0.42, P_FDR=2.35×10⁻¹⁴), and 'diverticular disease' (r_g=0.23, P_FDR=6.68×10⁻⁹) among others (figure 2). Given these similarities, we performed a genome-wide pleiotropy analysis for diverticular disease, IBS and HEM, revealing 44 independent genomic regions shared by at least two of these three phenotypes (online supplemental table 5). Other notable correlations were detected for psychiatric and neuroaffective disorders (anxiety, depression and neuroticism), pain-related traits (including abdominal pain and painful gums), diseases of the circulatory system, and diseases of the musculoskeletal system and connective tissue (figure 2 and online supplemental table 6).
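When many traits are tested at once, the raw p-values need multiple-testing adjustment. The sketch below shows the Benjamini-Hochberg procedure as one common way to obtain FDR-adjusted p-values; that P_FDR in the text was computed this way is an assumption, and `benjamini_hochberg` is a hypothetical helper, not a CTG-VL or LDSC function.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy p-values for four hypothetical trait correlations
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```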

Figure 2
Genetic correlation between haemorrhoidal disease and other traits estimated by linkage disequilibrium score regression analysis. Genetic correlations (r g +se) are shown for selected traits, grouped by domain. Only correlations significant after Bonferroni correction were considered (full list available in online supplemental table 6). ICD, International Classification of Diseases.
To gain further insight into potential cross-trait overlaps and to validate the LDSC results, we assessed whether genetically correlated traits and conditions also occur more frequently in patients with HEM by analysing data on diagnoses and medications from UKBB. The results were highly consistent with those obtained via LDSC (figure 3, online supplemental table 6). Compared with controls, patients with HEM additionally suffered more often from diverticular disease, IBS and other functional GI disorders (FGIDs), abdominal pain, hypertension, ischaemic heart disease, depression, anxiety, and diseases of the musculoskeletal system and connective tissue among others. To further consolidate these findings, we analysed an independent population-scale healthcare record dataset comprising 8 172 531 individuals from the Danish National Patient Registry (online methods), obtaining similar results (see inner circle of figure 3). Therefore, based on these largely overlapping observations at the genetic and epidemiological level, HEM appears to be strongly associated with other diseases of the digestive, neuroaffective and cardiovascular domains.
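The effect measures used in this comparison (ORs for UKBB, relative risks for the Danish registry) can both be derived from a 2×2 table. The sketch below uses invented counts and hypothetical function names; it ignores the covariate adjustment a real analysis would need.

```python
import math

def odds_ratio(hem_with, hem_without, ctrl_with, ctrl_without):
    """Odds of the comorbidity in patients with HEM relative to controls."""
    return (hem_with * ctrl_without) / (hem_without * ctrl_with)

def relative_risk(hem_with, hem_without, ctrl_with, ctrl_without):
    """Risk of the comorbidity in patients with HEM relative to controls."""
    risk_hem = hem_with / (hem_with + hem_without)
    risk_ctrl = ctrl_with / (ctrl_with + ctrl_without)
    return risk_hem / risk_ctrl

# Toy counts: 30 of 100 patients with HEM and 20 of 100 controls have the condition
table = (30, 70, 20, 80)
print(math.log(odds_ratio(*table)), math.log(relative_risk(*table)))
```

Positive log values indicate that the condition is more frequent among patients with HEM than among controls.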

Polygenic risk scores (PRS)
We next exploited meta-analysis summary statistics to compute HEM PRS and evaluate their relevance and translational potential in independent datasets. HEM PRS were calculated with PRSice-2 18 (online methods) and their performance tested in three independent datasets: (1) −15 , respectively for the two population-based cohorts HUNT and DBDS, and the German case-control cohort). In HUNT and DBDS, HEM prevalence increased across PRS percentile distributions, with individuals from the top 5% tail being exposed to higher HEM risk compared with the rest of the population (OR=1.68, p=1.55×10 −5 , and OR=1.68, p=6.11×10 −8 , respectively for HUNT and DBDS, figure 4). Higher HEM PRS were also associated with a more severe phenotype as defined by the need for recurrent invasive procedures (OR=1.03, p=8.63×10 −3 in German patients) and a younger age of onset (p=1.90×10 −3 in DBDS; p=4.01×10 −3 in German patients).
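At its core, a PRS is a weighted sum of risk-allele dosages, with weights taken from the GWAS effect sizes. The following sketch shows that sum and a top-5% split; it omits the clumping and p-value thresholding that PRSice-2 performs, and all SNP names, weights and helper functions are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 per SNP).

    SNPs missing from `dosages` are treated as dosage 0 (a simplification).
    """
    return sum(weights[snp] * dosages.get(snp, 0.0) for snp in weights)

def top_tail(scores, fraction=0.05):
    """IDs of the individuals in the top `fraction` of the score distribution."""
    n_top = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_top])

# Hypothetical per-SNP log(OR) weights and one individual's genotype
weights = {"rs_a": 0.10, "rs_b": -0.05}
print(polygenic_score({"rs_a": 2, "rs_b": 1}, weights))

# Twenty toy individuals with increasing scores: the top 5% is one person
scores = {f"id{i}": float(i) for i in range(20)}
print(top_tail(scores))  # {'id19'}
```

Comparing HEM prevalence between the top-tail group and everyone else then gives the kind of OR quoted for the top 5% of the distribution.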

Figure 3
Genetically correlated traits and conditions (online supplemental table 6) were studied for their differential prevalence in UKBB and DNPR, based on data extracted from participants' healthcare records. Significant results are reported, respectively, as ORs (log(OR), UKBB, middle ring) and relative risk (log(RR), DNPR, inner ring) or 'ns' (for non-significant findings). Diseases and traits are categorised according to ICD10 diagnostic codes or self-reported conditions and use of medications from questionnaire data (see online methods). Self-reported traits in UKBB (dark blue colour) were manually mapped to ICD10-codes in DNPR.

Functional annotation of GWAS loci, tissue and pathway enrichment analyses
In order to identify the most likely candidate genes and relevant molecular pathways, we used independent computational pipelines for the functional annotation of GWAS results. This yielded a total of 819 non-redundant HEM-associated transcripts (hereafter referred to as HEM genes; online supplemental table 7) derived from alternative positional (N=540 total from FUMA, 12 DEPICT 22 and MAGMA 23 ) and expression quantitative trait (eQTL, N=562 from FUMA) mapping efforts (online methods). Tissue-specific enrichment analyses (TSEA) of 540 positional candidates led to very similar results for all three approaches, with enrichment of HEM gene expression in blood vessels, colon and other relevant tissues (online supplemental figure 7 and online supplemental table 8). Similarly, gene-set enrichment analyses (GSEA) highlighted common pathways important in the development of vasculature and the intestinal tissue including the gene ontology (GO) terms 'tube morphogenesis and development', 'artery morphogenesis and development', 'epithelium morphogenesis', 'smooth muscle tissue morphogenesis' and others. Additionally, DEPICT detected enrichment for a number of traits from the Mammalian Phenotype Ontology including 'abnormal intestinal morphology', 'rectal prolapse', 'abnormal blood vessel (and artery) morphology', and 'abnormal smooth muscle morphology and physiology' (online supplemental table 8). TSEA and GSEA analyses performed on all 819 transcripts including eQTL genes from FUMA gave rise to similar results although these did not reach statistical significance (not shown).
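One generic way to test whether a candidate gene list is over-represented in a gene set is a one-sided hypergeometric test. The sketch below illustrates that idea only; FUMA, DEPICT and MAGMA each use their own statistical models, and the function name and toy numbers are assumptions.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_in_set, n_candidates, n_overlap):
    """P(overlap >= n_overlap) when n_candidates genes are drawn at random
    from a universe of n_universe genes, n_in_set of which are in the set."""
    upper = min(n_in_set, n_candidates)
    total = comb(n_universe, n_candidates)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_candidates - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy example: 10-gene universe, 5 genes in the pathway, 5 candidates, all 5 overlap
print(hypergeom_enrichment_p(10, 5, 5, 5))
```

A small p-value means the observed overlap would be unlikely if candidate genes were picked at random from the universe.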

Gene expression in HEM tissue and gene-network analyses
The expression of HEM genes was studied in integrated mRNA and microRNA Combo-Seq analysis of enlarged haemorrhoidal tissue from 20 patients with HEM and normal specimens from 18 controls (online methods, online supplemental table 1). HEM genes were examined with regard to their expression status, differential expression between cases and controls, and their connectivity and topology in gene coexpression networks. After normalisation for cell-type heterogeneity in different tissues (online methods, online supplemental figure 8), 720 out of 819 candidate genes were found to be expressed in haemorrhoidal tissue, with 287 (39.9%) of these being among the most strongly expressed transcripts (upper quartile) (online supplemental table 7). Compared with normal tissue from controls, 18 HEM candidate genes from 14 independent loci showed differential expression in haemorrhoidal tissue (P FDR <0.05 and |log 2 fold change|>0.5), with 12 genes showing increased and 6 decreased expression (figure 5A).
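The differential expression call described above combines two thresholds (P_FDR < 0.05 and |log2 fold change| > 0.5). A minimal sketch of that filter, with hypothetical gene names and values, might look like this:

```python
def differentially_expressed(results, fdr_cutoff=0.05, lfc_cutoff=0.5):
    """Split genes passing both thresholds into up- and down-regulated lists.

    results: dict mapping gene -> (log2 fold change, FDR-adjusted p-value)
    """
    up, down = [], []
    for gene, (log2fc, p_fdr) in results.items():
        if p_fdr < fdr_cutoff and abs(log2fc) > lfc_cutoff:
            (up if log2fc > 0 else down).append(gene)
    return up, down

# Hypothetical results for four genes
toy_results = {
    "GENE_UP": (1.2, 0.01),      # passes both, increased expression
    "GENE_DOWN": (-0.8, 0.03),   # passes both, decreased expression
    "GENE_SMALL": (0.3, 0.001),  # significant but effect too small
    "GENE_NS": (2.0, 0.20),      # large effect but not significant
}
print(differentially_expressed(toy_results))  # (['GENE_UP'], ['GENE_DOWN'])
```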

Prioritised HEM genes
In order to identify genes most likely to play a causative role in HEM, we selected candidates based on a scoring approach by prioritising those meeting one or more of the following criteria: (1) linked to a high-confidence fine-mapped variant (posterior probability (PP)>50%), (2) differentially expressed in enlarged haemorrhoidal tissue, (3) highlighted by pathway and tissue/cell-type enrichment DEPICT analysis (online methods), or (4) predicted hub of a WGCNA coexpression HEM module (M1, M4, M7). This reduced the number of candidates from 819 to 100 prioritised genes associated with 58 independent HEM loci (online supplemental table 7). Some notable observations were made in relation to a subset of these prioritised genes, whose associated evidence and known biological function(s) make them remarkably good candidates to play a role in HEM risk.
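The scoring idea can be sketched as counting how many of the four lines of evidence each gene satisfies and keeping those with at least one. The flag names, gene names and the equal weighting of criteria below are assumptions for illustration; the actual scoring scheme is described in the online methods.

```python
# Hypothetical flag names for the four criteria listed above
CRITERIA = (
    "fine_mapped",               # (1) linked to a variant with PP > 50%
    "differentially_expressed",  # (2) differential expression in HEM tissue
    "depict_highlighted",        # (3) highlighted by DEPICT enrichment
    "module_hub",                # (4) predicted hub of module M1, M4 or M7
)

def prioritise(genes, min_score=1):
    """Return {gene: score} for genes meeting at least `min_score` criteria."""
    scores = {
        gene: sum(1 for c in CRITERIA if flags.get(c, False))
        for gene, flags in genes.items()
    }
    return {gene: s for gene, s in scores.items() if s >= min_score}

# Toy evidence table with invented genes
toy_genes = {
    "GENE_A": {"fine_mapped": True, "module_hub": True},
    "GENE_B": {"differentially_expressed": True},
    "GENE_C": {},
}
print(prioritise(toy_genes))  # {'GENE_A': 2, 'GENE_B': 1}
```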
Two genes, ANO1 and SRPX, were both linked to a single coding variant fine-mapped as causal with very high confidence (rs2186797, ANO1, p.Phe608Ser with PP=97.0%, and rs35318931, SRPX, p.Ser413Phe with PP=87.3%, respectively). ANO1 encodes the voltage-gated calcium-activated anion channel anoctamin-1 protein, which is highly expressed in the interstitial cells of Cajal (ICCs) throughout the human GI tract, where it contributes to the control of intestinal motility and peristalsis. 25 The ANO1:p.Phe608Ser (F608S) variant is predicted to destabilise local protein structure and to disturb ANO1-activating phospholipid interactions (online supplemental figure 9). Indeed, site-directed mutagenesis and electrophysiology experiments in vitro showed that the Phe to Ser change at amino acid 608 leads to an increased voltage-dependent instantaneous Cl⁻ current, and a slowing of activation and deactivation kinetics (especially at high Ca²⁺ concentrations) (online supplemental figure 10). The X-linked gene SRPX codes for a Sushi repeat-containing protein whose domain composition implies a role in ECM, and is expressed in various ECM tissues including colon and liver. 26 The SRPX:p.Ser413Phe variant (rs35318931) may potentially destabilise the C-terminal domain of unknown function (DUF4174) (online supplemental figure 11), which is conserved in various ECM proteins (online methods, section In silico variant protein analysis).
At locus 7q22.1, the signal was fine-mapped with very high confidence to rs4556017 (PP=96.2%), which exerts eQTL effects on ACHE and SRRT, both showing high levels of expression in enlarged haemorrhoidal tissue, and found in the M4 and M7 coexpression modules (figure 5), respectively. However, while SRRT encodes a poorly characterised capped-RNA binding protein, ACHE appears to be a much better candidate, as the gene encodes an enzyme that hydrolyses the neurotransmitter acetylcholine at neuromuscular junctions, and corresponds to the Cartwright blood group antigen Yt. 27 One more locus was fine-mapped to single-variant resolution with high confidence, SNP rs10956488 from locus 8q24.21 (PP=96.2%), which is linked to an eQTL for GSDMC. Gasdermin C (encoded by GSDMC) is a poorly characterised member of the gasdermin family of proteins expressed in epithelial cells and in enlarged haemorrhoidal tissue, though the mechanisms of its eventual HEM involvement remain elusive.
Other genes were linked via eQTL to a variant mapped with PP >50% and were prioritised based on additional experimental evidence. Among these, the fine-mapped SNP rs6498573 from locus 16p13.11 (PP=63.2%) is associated with eQTL effects on MYH11 (online supplemental table 10), a gene encoding smooth muscle myosin heavy chain 11 that shows mRNA upregulation in enlarged haemorrhoidal tissue and constitutes a hub for the M1 coexpression module associated with ECM organisation and muscle contraction. Notable prioritised candidates were also observed at loci that were not fine-mapped, including ELN (encoding elastin, a key component of the ECM found in the connective tissue of many organs, highly expressed in enlarged haemorrhoidal tissue and hub gene for the M1 coexpression module), COL5A2 (encoding type V collagen protein; highly expressed in HEM tissue and belonging to the M1 coexpression module), PRDM6 (encoding a putative histone methyltransferase regulating vascular smooth muscle cell contractility, expressed in enlarged haemorrhoidal tissue and hub for the M1 module), and others (online supplemental table 7). Finally, while no candidate genes could be highlighted from the top GWAS hit region on chr12q14.3 (rs11176001), both the second and third strongest GWAS signals were detected at loci linked to genes involved in the determination of blood groups, namely ABO (rs676996) and the Kell Blood Group Complex Subunit-Related Family member XKR9 (rs1838392). Blood group antigens are also encoded at additional loci, including ACHE (rs4556017) and XKR6 (identified by MAGMA). Imputation of human ABO blood types from genotype data revealed the O type to be associated with increased, and the A and B types with decreased, HEM risk in both the UKBB and GERA datasets (online supplemental figure 12).

Localisation of selected HEM gene-encoded proteins
A site-specific analysis of selected candidate proteins for their localisation in anorectal tissues underlined the complex multifactorial nature of cellular components potentially involved in the pathogenesis of HEM. Indeed, analysed candidates displayed a broad spectrum of expression in intestinal mucosal, neuromuscular, immune and anodermal tissues (figure 6, online supplemental figure 13, online supplemental tables 11 and 12). They were also directly colocalised with haemorrhoidal blood vessels, suggesting a putative role connected with the haemorrhoidal vasculature itself.

DISCUSSION
Given the lack of large and systematic epidemiological and molecular studies, and despite its worldwide distribution, HEM still can be regarded as an understudied disease. In this study, we report the largest and most detailed genome-wide analysis of HEM, implemented via a combination of classical GWAS approaches and the use of minimal phenotyping, as recently shown to be effective in boosting sample size for increased statistical power. 28 We demonstrate for the first time that HEM is a partly inherited condition with a weak but detectable heritability estimated at 5% based on SNP data. We identify 102 independent risk loci, which were functionally annotated based on computational predictions and gene expression analysis of diseased and normal tissue. These loci alone explain approximately 0.9% of HEM heritability, which is of similar magnitude with respect to the genetic contribution of other common complex traits. 29 They provide important novel pathophysiological insight, which we discuss below in relation to individual pathways and mechanisms proposed to contribute to HEM aetiology.

ECM, elasticity of the connective tissue and smooth muscle function
The non-vascular components of the anal cushions consist of the transitional epithelium, connective tissue (elastic and collagenous) and the submucosal anal muscle (muscle of Treitz). 5 Treitz's muscle tightly maintains the anal cushions in their normal position, and its deterioration is considered one of the most important pathogenetic factors in the formation of enlarged and prolapsed haemorrhoids (online supplemental figure 2). Anal cushion fixation is further facilitated by elastic and collagenous connective tissue, whose degeneration, due, for instance, to abnormalities in collagen composition, has been involved in HEM aetiology, 30 although without strong molecular evidence.
The coexpression module M1 identified from HEM tissue is linked to ECM organisation and muscle function and is enriched for HEM genes, thus providing novel important evidence for a role of these two interconnected processes in HEM pathogenesis. ELN (lead SNP rs11770437) is one of the HEM prioritised genes, and also a main hub gene for the M1 module. ELN codes for the elastin protein, a key component of elastic fibres that comprise part of the ECM and confer elasticity to organs and tissues including blood vessels. Mutations in the ELN gene have been shown to cause cutis laxa, a disease in which dysfunctional elastin interferes with the formation of elastic fibres, thus weakening connective tissue in the skin and blood vessels. Of note, ELN has also recently been implicated in diverticular disease, 31 a condition characterised by outpouchings of the colonic wall at sites of relative weakness and/or defective elasticity of the connective and muscle layers, as well as in the common skin condition non-syndromic striae distensae (NSD, also known as stretch marks), whose manifestation is due to lost tissue elasticity at affected skin sites. 32 Hence, similar mechanisms may underlie HEM risk due to genetic variation in ELN and, notably, also the SRPX (lead SNP rs35318931) and COL5A2 (lead SNP rs16831319) genes. The SRPX lead SNP is also strongly associated with NSD, and is a coding variant potentially impacting the function of an ECM protein. COL5A2 codes for type V fibril-forming collagen that has regulatory roles during development and growth of type I collagen-positive tissues. Mutations in this gene are known to cause Ehlers-Danlos syndrome, a rare connective tissue disease that affects the skin, joints and blood vessels, and for which a link to HEM has already been postulated. 33
An additional hub gene for the coexpression module M1, MYH11 (lead SNP rs6498573), encodes a smooth muscle myosin protein that is important for muscle contraction and relaxation, and whose dysfunction has been linked to vascular diseases 34 and GI dysmotility. 35 Functional studies in smooth muscle cells showed that overexpression of MYH11 led to a paradoxical decrease of protein levels through increased autophagic degradation followed by disruption of contractile signalling. 36 Our transcriptome analysis also showed MYH11 RNA upregulation in HEM tissue, thus suggesting possible alterations in smooth muscle action that may be relevant, for instance, to the function(s) of Treitz's muscle. Additional evidence for the involvement of the musculoskeletal system may come from the associations detected for GSDMC (lead SNP rs10956488), an uncharacterised gene also shown to be relevant to lumbar disc herniation and back pain, 37 and PRDM6, a histone methyltransferase that acts as a transcriptional repressor of smooth muscle gene expression. Finally, besides individual association signals, we detected significant genetic correlation with several other complex diseases of shared aetiology (hernia, dorsalgia), for which connective tissue and/or muscle alterations are described.

Gut motility
Several lines of evidence link gut motility to the pathophysiology of HEM in this study. In our UKBB analyses, patients with HEM were found to suffer more often from IBS and other dysmotility syndromes than controls, as evidenced also by the increased use of medications including laxatives. These conditions also showed the strongest genetic correlation with HEM among all tested traits, indicating similar genetic architecture and predisposing mechanisms. Moreover, given the important role of the gut-brain axis in IBS and other FGIDs, 38 it is possible that the correlations observed for anxiety, depression and other neuroaffective traits may mediate genetic risk effects at least in part via similar mechanisms also involving gut motility. Constipation and prolonged sitting and straining during defecation are associated with delayed GI transit and reduced peristalsis and are among the proposed HEM risk factors (online supplemental figure 2). Harder stools, due for instance to infrequent defecations, can cause difficulty in bowel emptying and therefore increase pressure and mechanical friction on the haemorrhoidal cushions, leading to excessive engorgement and stretching or tissue damage. The relationship between HEM and gut motility is probably best evidenced by the association signal detected at the ANO1 locus: its lead SNP rs2186797 corresponds to the missense variant ANO1:p.Phe608Ser (F608S) that was fine-mapped with very high confidence and shown to impact anoctamin-1 function. Anoctamin-1 is an ion channel expressed in the ICCs, the pacemakers of the GI tract controlling intestinal peristalsis; it has already been implicated in IBS, 39 and is also expressed in the vasculature. Hence, it represents an ideal candidate to also affect HEM risk via genotype-driven modulation of ICC function.
An additional interesting candidate is ACHE, which shows an eQTL for the fine-mapped lead SNP rs4556017: ACHE codes for an enzyme that hydrolyses the neurotransmitter acetylcholine at neuromuscular junctions and is overexpressed in Hirschsprung's disease, a condition in which gut motility is compromised due to the absence of nerve cells (aganglionosis) in the distal or entire segments of the large bowel. 40 Of interest, expression in the enteric ganglia next to the haemorrhoidal plexus was observed for several proteins encoded by HEM genes in our immunofluorescence experiments.

Vasculature and circulatory system
Previous observations showed that HEM are not varicosities, and that accelerated blood flow velocities are present in afferent vessels of patients with HEM. 41 An impaired drainage or filling of the anal cushion may contribute to cushion slippage and may thus be considered as one of many disease-causing factors, as previously proposed. 41 Our genetic data support the involvement of the vasculature as an important player in HEM pathophysiology. TSEA and GSEA results downstream of HEM GWAS meta-analysis highlight blood vessels and artery morphogenesis among the HEM gene-enriched tissues and GO pathways, respectively. At the same time, moderate genetic correlation is detected for diseases of the circulatory system in the LDSC analyses. We identified a very strong association signal at the ABO locus on chromosome 9, which determines the corresponding ABO blood group type (A, B, AB and O). In addition to red blood cells, ABO antigens are expressed on the surface of many cells and tissues, and have been strongly associated with coronary artery disease, thrombosis, haemorrhage, 42 GI bleeding 43 and other conditions related to the circulatory system. Interestingly, increased risk for coronary artery disease has been reported in patients with HEM in at least some studies 44 and was replicated in our UKBB cross-disease analyses, although amid hundreds of other associations of similar magnitude of effect. We imputed ABO blood types from genotype data in UKBB, and detected increased HEM risk for carriers of the O type. O type has been reported to be protective for coronary artery disease in UKBB, although also predisposing to hypertension. 45 Hence, the potential mechanism(s) by which variation at this locus impacts HEM risk remain elusive at this stage. Blood antigens are nevertheless likely relevant, as other genes involved in the determination of specific blood groups are also among the 102 HEM GWAS hits (Kell blood group locus XKR9, lead SNP rs1838392).
In summary, our data provide important new insight into currently lacking evidence 46 about HEM pathogenesis (online supplemental figure 14). This sets the stage for more detailed genetic and mechanistic follow-up analyses, the search for therapeutically actionable genes and pathways, and the eventual exploitation for the adoption of preventive measures based on computed individual predisposition.