A comparative meta and in silico analysis of differentially expressed genes and proteins in canine and human bladder cancer

Canine and human bladder cancer share several anatomical, morphological and molecular characteristics, and dogs can be considered a model for human bladder cancer. However, the veterinary literature lacks cross-validation analyses between human and canine large-scale data. Therefore, this research aimed to perform a meta-analysis of the previous canine literature on bladder cancer, identifying the genes and proteins evaluated in these studies. In addition, we cross-validated the canine transcriptome data with the human data from The Cancer Genome Atlas (TCGA) to identify potential markers for both species. The meta-analysis was performed using the indexing terms “bladder” AND “carcinoma” AND “dog” in different international databases, and 385 manuscripts were identified in the initial search. After the inclusion criteria were applied, only 25 studies remained. Among these studies, five presented transcriptome data and 20 evaluated only isolated genes or proteins. Among the studies involving isolated protein analysis, HER-2 was the most studied protein (3/20), followed by TAG-72 (2/20), COX-2 (2/2), Survivin (2/2) and CK7 (2/2). In the cross-validation analysis of human and canine transcriptome data, we identified 35 deregulated genes, including ERBB2, TP53, EGFR and E2F2. Our results demonstrate that the previous canine literature on bladder cancer focused on the evaluation of isolated markers with no association with patient survival. In addition, the lack of information regarding tumor muscle invasion is an important limitation when comparing human and canine bladder tumors. Our in silico analysis of canine and human transcriptome data identified several genes with potential as markers for both human and canine bladder tumors, and these genes should be considered in future studies on canine bladder cancer.

Introduction
Transitional cell carcinoma (TCC), also called urothelial carcinoma, is the most common bladder cancer in both humans and dogs, and the two species share clinical, pathological and molecular alterations (Dhawan et al., 2018, Maeda et al., 2018, Ramsey et al., 2017). In the United States, an estimated 81,400 new cases and 17,980 bladder cancer-related deaths were expected (Siegel et al., 2020). The last global cancer statistics (GLOBOCAN) revealed 549,393 new cases and 199,922 bladder cancer-related deaths (Bray et al. 2018). In dogs, urothelial carcinoma is the most common malignant tumor of the bladder, representing 1% of all canine neoplasms (Patrick et al., 2006). In humans, TCC is associated with several factors such as cigarette smoking, occupational exposure (Myasaki and Nishiyama, 2017), arsenic and aromatic compounds (Cha et al., 2018). In household dogs, a case-control study investigated the correlation of cigarette smoke, obesity, topical insecticides and household chemicals with canine bladder cancer development (Glickman et al. 1989). These authors found a high risk of bladder cancer development in obese dogs and in dogs treated with topical insecticides (Glickman et al. 1989). Since dogs and humans share the same environment, dogs could be considered sentinels (Knapp et al., 2014).

Canine and human TCC is usually a locally infiltrative cancer that can extend to the entire bladder, including the submucosa and muscular layers (Grzegółkowski et al., 2016, Andrade et al., 2004). In addition, human bladder carcinoma can invade adjacent tissues and organs such as the ureter, prostatic urethra and prostate gland (Hernández-Fernández et al., 2016). Most human bladder carcinomas are superficial tumors (70% of the cases) and are classified as non-muscle invasive bladder carcinoma (NMIBC) (Grzegółkowski et al., 2016). Since NMIBC has a better prognosis, muscle-invasive bladder carcinoma is considered the greater challenge (Grzegółkowski et al., 2016, Hernández-Fernández et al., 2016). Thus, in the human literature, muscle-invasive bladder carcinoma is the focus of recent studies. In dogs, the assessment of infiltration is not standardized, since in several cases tissue samples come from cystoscopy (Dhawan et al. 2018, Dhawan et al. 2015).

The molecular phenotype of human bladder cancer has been widely studied, and genomic subtypes have been proposed (Jalanko et al. 2020, Inamura et al. 2018).

The study design is summarized in Figure 1. We divided the study methods into three steps: 1) meta-analysis of the previous literature to identify deregulated genes and proteins in canine bladder cancer; 2) in silico analysis of the deregulated genes and proteins to identify prognostic and predictive markers in canine bladder cancer; and 3) selection of five previous studies with transcriptome data, extraction of the genes common to these studies and validation against The Cancer Genome Atlas (TCGA) data.

Meta-analysis
To identify previously published papers for our meta-analysis, we performed a literature search in the PubMed, MEDLINE and Scielo databases using the indexing terms "bladder" AND "carcinoma" AND "dog", with no restriction on the year of publication. We then reviewed the reference sections of the selected manuscripts and manually searched the most relevant journals with a veterinary oncology background to include as many of the available manuscripts as possible.

Afterwards, we screened manuscripts by title and abstract, including scientific articles that evaluated genes or proteins in canine bladder carcinomas. In this step, we excluded review manuscripts, case reports and retrospective studies including only survival analysis. Then, we analyzed each included manuscript in full and selected the scientific papers that evaluated genes or proteins in canine bladder samples in comparison with normal bladder tissue. In this step, we excluded manuscripts using only cell lines, manuscripts that compared bladder carcinomas with other conditions such as cystitis, and manuscripts evaluating only bladder carcinomas with no comparison to normal bladder tissue. Our first evaluation was performed on December 12th, 2019 and it was last updated on April 2nd, 2020.

From the selected manuscripts, we retrieved information on each deregulated gene or protein, the p-value for each gene or protein (comparison between bladder cancer and normal bladder) and survival.

In silico analysis
The in silico analysis of each evaluated gene and protein was performed using freely available online tools. In the first step, we selected only the deregulated genes and used STRING (https://string-db.org/) to map each gene to its respective protein. We then used only proteins for the subsequent analysis. We opted to evaluate only proteins in our in silico study because proteins are easier to use as prognostic and predictive markers.

The deregulated proteins were evaluated grouped (upregulated and downregulated together) or independently (upregulated or downregulated) using the online Search Tool for the Retrieval of Interacting Genes (STRING, https://string-db.org/) to generate protein-protein interaction (PPI) networks. We considered only STRING interactions of high confidence (0.700) and hid the disconnected nodes for better visualization. The interaction sources considered to generate the PPI networks were co-expression, co-occurrence, databases and neighborhood.
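The filtering step described above (keep only interactions at or above the 0.700 confidence cutoff, then hide proteins left without any interaction) can be illustrated with a minimal Python sketch. The protein list and all scores below are invented for illustration and are not STRING output; the function names are hypothetical and this is not the STRING web tool itself.

```python
# Sketch: keep high-confidence edges and hide disconnected nodes.
# Scores are made-up illustrative values, NOT real STRING scores.

HIGH_CONFIDENCE = 0.700  # cutoff stated in the text


def filter_network(proteins, edges, min_score=HIGH_CONFIDENCE):
    """Return (visible_proteins, kept_edges) after applying the cutoff."""
    kept = [(a, b, s) for a, b, s in edges if s >= min_score]
    connected = {p for a, b, _ in kept for p in (a, b)}
    # Proteins with no remaining edge are hidden, as in the text.
    visible = [p for p in proteins if p in connected]
    return visible, kept


def degrees(edges):
    """Count the number of interactions per protein."""
    counts = {}
    for a, b, _ in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


proteins = ["TP53", "ERBB2", "EGFR", "KRT7"]
edges = [
    ("TP53", "ERBB2", 0.91),   # hypothetical score
    ("TP53", "EGFR", 0.88),    # hypothetical score
    ("ERBB2", "EGFR", 0.65),   # below the cutoff, dropped
    ("KRT7", "EGFR", 0.42),    # below the cutoff, dropped
]

visible, kept = filter_network(proteins, edges)
hub = max(degrees(kept).items(), key=lambda item: item[1])
print(visible, hub)
```

Counting the degree of each remaining node is one simple way to identify the most connected protein in such a network.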

Gene ontology
The gene ontology (GO)

Then, the cross-validated proteins were evaluated by TCPA (https://tcpaportal.org/tcpa/index.html) (Juan et al., 2017; Li et al., 2013). We adopted a 5% significance level, considering protein interactions with a p-value lower than 0.05.

In addition, we performed two different analyses using TCPA. First, we used the "visualization" tool to perform a global analysis of the interactions among genes, including negative and positive correlations. From this analysis, we selected genes and pathways differentially expressed in human bladder cancer as possible markers to be used in canine bladder cancer. In the second analysis, we evaluated the overall survival of the 344 human bladder cancer patients according to protein expression level (high versus low). In this analysis, we selected all genes with prognostic value in human bladder cancer. The Kaplan-Meier curves were generated using the TCPA online tool "individual cancer analysis" (https://tcpaportal.org/tcpa/analysis.html).

After reading the title and abstract, we excluded 329 manuscripts and, after reading the full manuscripts, 25 of them met our inclusion criteria (Figure 2). Afterwards, we divided the selected manuscripts into two categories: manuscripts with global transcriptome analysis (N=5) and manuscripts reporting isolated genes or proteins (N=20). A complete list of the selected manuscripts can be found in Table 1.

Regarding the studies involving isolated protein analysis, HER-2 was the most studied protein (3/20), followed by TAG-72 (2/20), COX-2 (2/2), Survivin (2/2) and CK7 (2/2). The remaining proteins were evaluated in only one previous study each (Table 1). In the protein-protein interaction analysis, we identified one interaction network among the proteins, with P53 being the protein with the highest number of interactions. After enrichment analysis using Enrichr, we evaluated the most common ontology processes associated with the previously published proteins and identified several processes related to tyrosine kinase regulation, cell communication and signaling and the MAPK pathway (Supplementary Figure 1 and Supplementary Table 1).
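As a rough illustration of how an ontology term can be scored for enrichment, the sketch below computes a one-sided hypergeometric (Fisher exact) p-value for the overlap between an input protein list and the gene set of a term. This is a generic over-representation test shown under stated assumptions, not a reproduction of the analysis performed here; the term membership and the background size are hypothetical.

```python
from math import comb


def enrichment_p_value(overlap, list_size, term_size, background_size):
    """P(X >= overlap) under the hypergeometric distribution.

    overlap:         input genes that belong to the term
    list_size:       number of input genes
    term_size:       number of background genes annotated to the term
    background_size: total number of genes in the background
    """
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(overlap, min(list_size, term_size) + 1)
    )
    return tail / total


# Hypothetical example: 4 input proteins, a term with 5 members,
# and a toy background of 20 genes (all values are assumptions).
input_genes = {"ERBB2", "EGFR", "TP53", "KRT7"}
term_genes = {"ERBB2", "EGFR", "KDR", "PDGFRB", "SRC"}
background_size = 20

shared = input_genes & term_genes
p_value = enrichment_p_value(
    len(shared), len(input_genes), len(term_genes), background_size
)
print(sorted(shared), round(p_value, 4))
```

With a realistic background of thousands of genes, the same overlap would give a much smaller p-value; the toy background is kept small only so the arithmetic can be checked by hand.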

In silico analysis of canine transcriptome data
In our meta-analysis, we identified five previous studies containing transcriptome data; in the most recent one (Parker et al., 2020), the authors cross-validated their findings with three other published manuscripts (Dhawan et al., 2018, Maeda et al. 2018, Dhawan et al., 2015). Thus, we opted to analyze the transcriptome data from these five previous studies and cross-validate them with TCGA data. In our cross-validation analysis, we identified 61 deregulated genes (Figure 3), including CD55, IL17B, EGFR, CDH17 and CDH26. Moreover, we performed a PPI analysis of these genes and found a high level of interaction among them, with VEGFA, EGFR, TNF and CCND1 as central genes in the interaction network (Figure 3).

We identified 35 deregulated genes in the cross-validation among the five veterinary studies and the TCGA data (Table 2).
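One way to picture this kind of cross-validation is as a set intersection that also checks the direction of deregulation. The Python sketch below assumes that a gene must be reported with the same direction in every canine study and in the human reference; that rule, the function names and all directions in the toy data are assumptions for illustration and are not taken from the studies cited.

```python
def shared_direction(studies):
    """Genes reported in every study with the same direction ('up'/'down')."""
    first, *rest = studies
    return {
        gene: direction
        for gene, direction in first.items()
        if all(other.get(gene) == direction for other in rest)
    }


def cross_validate(canine_studies, human_reference):
    """Keep canine consensus genes that agree with the human reference."""
    canine = shared_direction(canine_studies)
    return {
        gene: direction
        for gene, direction in canine.items()
        if human_reference.get(gene) == direction
    }


# Toy data: gene names appear in the text, directions are invented.
canine_studies = [
    {"ERBB2": "up", "EGFR": "up", "CDH17": "down"},
    {"ERBB2": "up", "EGFR": "up", "CDH17": "up"},
    {"ERBB2": "up", "EGFR": "up"},
]
human_reference = {"ERBB2": "up", "EGFR": "down", "TP53": "up"}

validated = cross_validate(canine_studies, human_reference)
print(validated)
```

A looser rule (for example, agreement in a majority of studies) would yield a longer candidate list at the cost of weaker support per gene.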

Analysis of the 344 human bladder cancer samples
In the analysis of the human bladder cancer samples, we identified several protein interactions, with either positive or negative correlation (Figure 4). Regarding the PPI, we observed a high number of proteins from the tyrosine kinase family, such as EGFR, ERK2, ERBB2 and BRAF. In addition, we identified 28 proteins with prognostic value in human bladder cancer (Table 3). Among these proteins, only two were previously studied in canine bladder cancer (2/26). The top six proteins with prognostic value were Annexin1, TAZ, SF2, SRC, ARID1A and GATA3 (Figure 5).
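For readers unfamiliar with the survival comparison behind such results, the following Python sketch shows a minimal Kaplan-Meier estimator together with a high versus low split of patients by expression level. The median cutoff is an assumption (the cutoff applied by the online tool is not stated in the text), the function names are hypothetical, and the follow-up data are invented.

```python
from statistics import median


def split_high_low(expression):
    """Return (high_idx, low_idx) using the median as an assumed cutoff."""
    cut = median(expression)
    high = [i for i, x in enumerate(expression) if x > cut]
    low = [i for i, x in enumerate(expression) if x <= cut]
    return high, low


def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one event.

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Invented follow-up data for five patients (months, event flag).
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
high, low = split_high_low([1.0, 5.0, 3.0, 2.0])
print(curve, high, low)
```

In practice, one curve would be estimated for each expression group and the two compared with a log-rank test, which is omitted here for brevity.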

Discussion
Molecular knowledge of human bladder cancer is widely described in the literature, and the deposit of previously published data provides the opportunity to reanalyze it, yielding new insights for comparative oncology. Although dogs can be considered models for human bladder cancer, few studies have provided a full description of canine bladder cancer molecular data. Since most canine bladder cancer studies have published isolated assessments of different proteins, the present study extracted these data and evaluated them together, to understand how these proteins could interact with each other and to identify a profile of the previous veterinary studies.

In the meta-analysis, after the first evaluation, a high number of studies were excluded because they had no concomitant normal tissue analysis (N=31/56).

The meta-analysis demonstrated that most veterinary studies (23/25) did not evaluate muscle invasion by the tumor or did not provide clear information on this topic. Thus, as a future direction for canine studies evaluating transitional cell carcinoma of the bladder, we strongly suggest that authors evaluate muscle invasion to provide stronger evidence for dogs as models of human bladder cancer.

Most of the published manuscripts that met the inclusion criteria evaluated one to three proteins or genes, and only five performed large-scale analysis. We extracted protein or gene information from the manuscripts with isolated proteins and evaluated these proteins together to identify the profile of these previous studies. All studies with isolated proteins focused on the evaluation of oncogenes. Interestingly, most of them evaluated tyrosine kinase receptors, such as ERBB2, EGFR, VEGFR and PDGFR (Tsuboi et al. 2019, Walters et al. 2017, Hanazono et al. 2014). Our PPI analysis revealed a high number of interactions among these proteins, even though they were evaluated separately in each study. Given this high level of interaction, future studies can use our PPI analysis to select connected proteins and evaluate their prognostic or predictive value in canine bladder cancer. Interestingly, the ontology analysis of the studies with isolated proteins revealed several terms related to tyrosine kinase activity, phosphorylation and alteration of the ERK1 and ERK2 cascade. Thus, the previous literature focused on the search for targets of small-molecule inhibitors. On the other hand, no small-molecule inhibitor has so far been successfully proposed for the treatment of canine bladder cancer.

The canine transcriptome data cross-validated with TCGA data revealed 35 genes with a high probability of being deregulated in both human and canine bladder cancer. Since these data were obtained from five different canine studies and the 344 human samples from TCGA, they appear consistent and could be used for further investigation. Because it can be difficult to identify a potential marker to test in a future study, our list provides markers with strong potential that are up- or downregulated in both human and canine bladder cancer. One important limitation of our analysis was the absence of muscle invasion information or standardization in canine samples. Thus, we may be missing important genes related to muscle invasiveness, a known indicator of worse prognosis. Nevertheless, choosing a gene for future studies from this list could have a higher potential than a random search. In addition, we analyzed the data from the 344 human bladder cancer samples to identify the genes related to patients' overall survival. Since survival data are usually absent from published studies in veterinary medicine, considering human survival data is a unique opportunity to identify candidates with prognostic potential in veterinary oncology.

Among the 28 proteins with prognostic value, we identified Annexin1, GATA-3 and EGFR. Annexin1 overexpression was previously associated with tumor progression and was considered an independent marker for metastasis-free survival prediction (Li et al., 2010). In addition, Annexin1 expression was also previously associated with chemotherapy relapse and resistance in human bladder cancer (Yu et al., 2014).

analysis of the selected data. CEF-A and RFA checked the meta-analysis and the in silico data independently. RL-A and VG contributed with constructive comments. CEF-A supervised the project. All authors read and approved the final manuscript.

information from these manuscripts. Therefore, we performed two different analyses based on the manuscript data. 2) Manuscripts with information regarding isolated genes or proteins were evaluated together. We extracted the name of the gene or protein and the p-value to perform ontology and protein-protein interaction analysis. At the end of this analysis, we identified different pathways and ontology processes related to the previously published data. 3) The manuscripts with large-scale transcriptome data were evaluated together in a separate analysis. We identified genes commonly deregulated among studies and cross-validated these data with The Cancer Genome Atlas (TCGA). The diagram was generated using BioRender (https://app.biorender.com/).