# xenaPython
Python API for Xena Hub

---------

#### Requirements
    Supports Python 2 and Python 3.


#### Installation
    pip install xenaPython


#### Upgrade Installation
    pip install --upgrade xenaPython


#### Usage
    >>> import xenaPython as xena

#### Examples

##### 1: Query the expression of three identifiers in four samples
    import xenaPython as xena

    hub = "https://toil.xenahubs.net"
    dataset = "tcga_RSEM_gene_tpm"
    samples = ["TCGA-02-0047-01","TCGA-02-0055-01","TCGA-02-2483-01","TCGA-02-2485-01"]
    probes = ['ENSG00000282740.1', 'ENSG00000000005.5', 'ENSG00000000419.12']
    [position, [ENSG00000282740_1, ENSG00000000005_5, ENSG00000000419_12]] = xena.dataset_probe_values(hub, dataset, samples, probes)
    ENSG00000282740_1

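The call in example 1 returns one list of values per probe. The sketch below pairs those values with sample IDs. It assumes each per-probe list is ordered like `samples`, which this README does not state. `values_by_sample` is a hypothetical helper, not part of xenaPython, and the numbers are made up so the block runs without a hub connection.

```python
def values_by_sample(samples, probes, per_probe_values):
    """Return {probe: {sample: value}} from per-probe value lists.

    Assumes per_probe_values[i] holds the values for probes[i], in the
    same order as `samples` (an assumption about the hub response).
    """
    if len(probes) != len(per_probe_values):
        raise ValueError("expected one value list per probe")
    table = {}
    for probe, values in zip(probes, per_probe_values):
        if len(values) != len(samples):
            raise ValueError("value list for %s does not match samples" % probe)
        table[probe] = dict(zip(samples, values))
    return table


# Made-up numbers standing in for a real hub response.
demo_samples = ["TCGA-02-0047-01", "TCGA-02-0055-01"]
demo_probes = ["ENSG00000000005.5", "ENSG00000000419.12"]
demo_values = [[-9.97, -3.46], [4.91, 5.12]]
demo_table = values_by_sample(demo_samples, demo_probes, demo_values)
```

With a real response, the second element returned by `xena.dataset_probe_values(hub, dataset, samples, probes)` would take the place of `demo_values`.
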
##### 2: Query the expression of three genes in four samples, when the dataset has an identifier-to-gene mapping (i.e. a xena probeMap)
    hub = "https://toil.xenahubs.net"
    dataset = "tcga_RSEM_gene_tpm"
    samples = ["TCGA-02-0047-01","TCGA-02-0055-01","TCGA-02-2483-01","TCGA-02-2485-01"]
    genes = ["TP53", "RB1", "PIK3CA"]
    xena.dataset_gene_probe_avg(hub, dataset, samples, genes)

##### 3: If the dataset has no id-to-gene mapping but uses gene names as its identifiers, query gene expression as in example 1; example 2 will not work
    hub = "https://toil.xenahubs.net"
    dataset = "tcga_RSEM_Hugo_norm_count"
    samples = ["TCGA-02-0047-01","TCGA-02-0055-01","TCGA-02-2483-01","TCGA-02-2485-01"]
    probes = ["TP53", "RB1", "PIK3CA"]
    [position, [TP53, RB1, PIK3CA]] = xena.dataset_probe_values(hub, dataset, samples, probes)
    TP53

##### 4: Find the samples in a dataset
    hub = "https://tcga.xenahubs.net"
    dataset = "TCGA.BLCA.sampleMap/HiSeqV2"
    xena.dataset_samples(hub, dataset, 10)    # limit of 10 samples
    xena.dataset_samples(hub, dataset, None)  # no limit: all samples

##### 5: Find the identifiers in a dataset
    hub = "https://tcga.xenahubs.net"
    dataset = "TCGA.BLCA.sampleMap/HiSeqV2"
    xena.dataset_field(hub, dataset)

##### 6: Find the number of identifiers in a dataset
    hub = "https://tcga.xenahubs.net"
    dataset = "TCGA.BLCA.sampleMap/HiSeqV2"
    xena.dataset_field_n(hub, dataset)

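`dataset_samples(hub, dataset, None)` can return a dataset's full sample list, and passing all of it to a single value query may produce a large request. Whether batching is needed for a given hub is an assumption. The sketch below splits the samples into batches and joins the per-probe results. `chunked` and `probe_values_in_batches` are hypothetical helpers, not part of xenaPython. `fake_query` is a stand-in so the block runs offline.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def probe_values_in_batches(query, hub, dataset, samples, probes, batch_size=500):
    """Call `query` once per batch of samples and join the per-probe lists.

    `query` is expected to behave like xena.dataset_probe_values: it
    returns [position, values], with one value list per probe.
    """
    position = None
    merged = [[] for _ in probes]
    for batch in chunked(samples, batch_size):
        position, values = query(hub, dataset, batch, probes)
        for probe_index, probe_values in enumerate(values):
            merged[probe_index].extend(probe_values)
    return [position, merged]


def fake_query(hub, dataset, samples, probes):
    """Stand-in for a hub call: each value is the length of the sample name."""
    return [None, [[len(s) for s in samples] for _ in probes]]


demo = probe_values_in_batches(
    fake_query, "hub", "dataset", ["a", "bb", "ccc"], ["P1", "P2"], batch_size=2
)
```

To use it against a hub, pass `xena.dataset_probe_values` as `query`. The default `batch_size` of 500 is arbitrary.
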
##### 7: Find the hub id and dataset id
    Use the Xena browser datasets tool: https://xenabrowser.net/datapages/

#### Help
    >>> import xenaPython
    >>> help(xenaPython)

Help on package xenaPython:

NAME

    xenaPython - Methods for querying data from UCSC Xena hubs

DESCRIPTION

    Data rows are associated with "sample" IDs.
    Sample IDs are unique within a "cohort".
    A "dataset" is a particular assay of a cohort, e.g. gene expression.
    Datasets have associated metadata, specifying their data type and cohort.

    There are three primary data types: dense matrix (samples by probes),
    sparse (sample, position, variant), and segmented (sample, position, value).


    Dense matrices can be genotypic or phenotypic. Phenotypic matrices have
    associated field metadata (descriptive names, codes, etc.).

    Genotypic matrices may have an associated probeMap, which maps probes to
    genomic locations. If a matrix has a hugo probeMap, the probes themselves
    are gene names. Otherwise, a probeMap is used to map a gene location to a
    set of probes.

FUNCTIONS

    all_cohorts(host, exclude)

    all_datasets(host)

    all_datasets_n(host)
        Count the number of datasets with non-null cohort

    all_field_metadata(host, dataset)
        Metadata for all dataset fields (phenotypic datasets)

    cohort_samples(host, cohort, limit)
        All samples in cohort

    cohort_summary(host, exclude)
        Count datasets per-cohort, excluding the given dataset types

        xena.cohort_summary(xena.PUBLIC_HUBS["pancanAtlasHub"], ["probeMap"])

    dataset_fetch(host, dataset, samples, probes)
        Probe values for given samples

    dataset_field(host, dataset)
        All field (probe) names in dataset

    dataset_field_examples(host, dataset, count)
        Field names in dataset, up to <count>

    dataset_field_n(host, dataset)
        Number of fields in dataset

    dataset_gene_probe_avg(host, dataset, samples, genes)
        Probe average, per-gene, for given samples

    dataset_gene_probes_values(host, dataset, samples, genes)
        Probe values in gene, and probe genomic positions, for given samples

    dataset_list(host, cohorts)
        Dataset metadata for datasets in the given cohorts

    dataset_metadata(host, dataset)
        Dataset metadata

    dataset_probe_signature(host, dataset, samples, probes, weights)
        Computed probe signature for given samples and weight array

    dataset_probe_values(host, dataset, samples, probes)
        Probe values for given samples, and probe genomic positions

        host = xena.PUBLIC_HUBS["pancanAtlasHub"]
        dataset = "EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena"
        samples = xena.dataset_samples(host, dataset, None)
        [position, [foxm1, tp53]] = xena.dataset_probe_values(host, dataset, samples, ["FOXM1", "TP53"])

    dataset_samples(host, dataset, limit)
        All samples in dataset (optional limit)

        samples = xena.dataset_samples(xena.PUBLIC_HUBS["pancanAtlasHub"], "EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena", None)

    dataset_samples_n_dense_matrix(host, dataset)
        All samples in dataset (faster, for dense matrix dataset only)

    datasets_null_rows(host)

    feature_list(host, dataset)
        Dataset field names and long titles (phenotypic datasets)

    field_codes(host, dataset, fields)
        Codes for categorical fields

    field_metadata(host, dataset, fields)
        Metadata for given fields (phenotypic datasets)

    gene_transcripts(host, dataset, gene)
        Gene transcripts

    match_fields(host, dataset, names)
        Find fields matching names (must be lower-case)

    probe_count(host, dataset)

    probemap_list(host)
        Find probemaps

    ref_gene_exons(host, dataset, genes)
        Gene model

    ref_gene_position(host, dataset, gene)
        Gene position from gene model

    ref_gene_range(host, dataset, chr, start, end)
        Gene models overlapping range

    segment_data_examples(host, dataset, count)
        Initial segmented data rows, with limit

    segmented_data_range(host, dataset, samples, chr, start, end)
        Segmented (copy number) data overlapping range

    sparse_data(host, dataset, samples, genes)
        Sparse (mutation) data rows for genes

    sparse_data_examples(host, dataset, count)
        Initial sparse data rows, with limit

    sparse_data_match_field(host, field, dataset, names)
        Genes in sparse (mutation) dataset matching given names

    sparse_data_match_field_slow(host, field, dataset, names)
        Genes in sparse (mutation) dataset matching given names, case-insensitive (names must be lower-case)

    sparse_data_match_partial_field(host, field, dataset, names, limit)
        Partial match genes in sparse (mutation) dataset

    sparse_data_range(host, dataset, samples, chr, start, end)
        Sparse (mutation) data rows overlapping the given range, for the given samples

    transcript_expression(host, transcripts, studyA, subtypeA, studyB, subtypeB, dataset)


DATA

    LOCAL_HUB = 'https://local.xena.ucsc.edu:7223'
    PUBLIC_HUBS = {'gdcHub': 'https://gdc.xenahubs.net', 'icgcHub': 'https...
    excludeType = ['probeMap', 'probemap', 'genePredExt']

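The help text for `dataset_probe_signature` mentions a "probe signature" computed from a weight array but does not define it. The local sketch below treats a signature as a plain weighted sum of probe values per sample. That reading is an assumption made for illustration and is not confirmed by the help text. `weighted_signature` is a hypothetical helper, not part of xenaPython.

```python
def weighted_signature(per_probe_values, weights):
    """Weighted sum across probes for each sample.

    per_probe_values[i] is the list of sample values for probe i and
    weights[i] is that probe's weight. Treating a signature as a plain
    weighted sum is an assumption made for illustration.
    """
    if len(per_probe_values) != len(weights):
        raise ValueError("expected one weight per probe")
    if not per_probe_values:
        return []
    n_samples = len(per_probe_values[0])
    scores = [0.0] * n_samples
    for values, weight in zip(per_probe_values, weights):
        if len(values) != n_samples:
            raise ValueError("all probes need the same number of samples")
        for i, value in enumerate(values):
            scores[i] += weight * value
    return scores


# Two probes, two samples, made-up values and weights.
demo_scores = weighted_signature([[1.0, 2.0], [3.0, 4.0]], [0.5, -1.0])
```

The per-probe lists have the same shape as the second element destructured from `xena.dataset_probe_values` in the examples, so the helper can be applied to that result for a local comparison.
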
#### Contact
    http://xena.ucsc.edu/
    https://groups.google.com/forum/#!forum/ucsc-cancer-genomics-browser
    genome-cancer@soe.ucsc.edu