# Data usage policy

When using this data, you must acknowledge the source by citing the publication "Widespread dose-dependent effects of RNA expression and splicing on complex diseases and traits" (https://doi.org/10.1101/814350).

# GTEx-GWAS integration: Finemapping

This package contains DAP-G results on GTEx v8 eQTL and sQTL data.
See the [DAP-G software](https://github.com/xqwen/dap) for details.
We used only European individuals and variants with MAF > 0.01, restricted to genes annotated as `protein_coding` or `lncRNA`.
The DAP-G `ld_control` parameter was set to 0.75. The results were analyzed in [this preprint](https://www.biorxiv.org/content/10.1101/814350v1).

## Contents

```
finemapping/
|-- README_finemapping.md
|-- dapg_eqtl.tar
`-- dapg_sqtl.tar
```
Unpack each tarball with a command like `tar -xvpf dapg_sqtl.tar`.

For every tissue:

* `{tissue}.variants_pip.txt.gz` contains, for every gene, each variant's posterior inclusion probability (PIP) of being causal.
    * gene: gene id (or intron id)
    * rank: ranking of the variant according to its PIP (see below)
    * variant_id: GTEx variant id
    * pip: posterior inclusion probability of the variant in the causal models
    * log10_abf: approximate Bayes factor (-log10)
    * cluster_id: id of the cluster to which the variant belongs
* `{tissue}.models_variants.txt.gz` contains, for every model considered by DAP-G, the list of variants involved. Most models have a single variant.
* `{tissue}.model_summary.txt.gz` contains, for every analyzed gene, a summary of the models, such as the expected number of causal variants.
    * gene: gene id (or intron id)
    * pes: posterior expected model size (i.e. number of causal variants)
    * pse_se: standard error of the above
    * log_nc: undocumented DAP-G statistic
    * log10_nc: undocumented DAP-G statistic
* `{tissue}.models.txt.gz` contains, for every analyzed gene:
    * gene: gene id (or intron id)
    * model: number (serving as a model name)
    * n: number of variants (0 for null model)
    * pp: posterior inclusion probability of the model
    * ps: posterior score
* `{tissue}.clusters.txt.gz` contains, for every analyzed gene:
    * gene: gene id (or intron id)
    * cluster: number (serving as cluster name)
    * n_snps: number of variants in the cluster
    * pip: posterior inclusion probability
    * average_r2: average correlation within the cluster
* `{tissue}.cluster_correlations.txt.gz`: upper triangular matrix of correlations...
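
As an illustration of how the per-variant file might be consumed, the following minimal Python sketch reads a `{tissue}.variants_pip.txt.gz` file and groups high-PIP variants by gene. It rests on assumptions not stated in this README: that the file is tab-separated, that the columns appear in the order listed above, and that a header line (if any) starts with `gene`. The function names and the 0.5 threshold are hypothetical choices made for the example, not part of the released data or of DAP-G.

```python
import csv
import gzip
from collections import defaultdict

# Column order as listed in this README. The tab delimiter and the optional
# header line are assumptions; check them against the actual files.
PIP_COLUMNS = ["gene", "rank", "variant_id", "pip", "log10_abf", "cluster_id"]


def read_variants_pip(path):
    """Yield one dict per data row of a variants_pip file (hypothetical helper)."""
    with gzip.open(path, "rt", newline="") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            if not fields or fields[0] == PIP_COLUMNS[0]:
                continue  # skip blank lines and a header line, if present
            row = dict(zip(PIP_COLUMNS, fields))
            row["pip"] = float(row["pip"])
            yield row


def high_pip_variants(rows, threshold=0.5):
    """Group variants with PIP >= threshold by gene (threshold is illustrative)."""
    by_gene = defaultdict(list)
    for row in rows:
        if row["pip"] >= threshold:
            by_gene[row["gene"]].append((row["variant_id"], row["pip"]))
    return dict(by_gene)
```

A call such as `high_pip_variants(read_variants_pip("Whole_Blood.variants_pip.txt.gz"))` would then return a dictionary mapping each gene (or intron) id to its high-PIP variants. The tissue name in that file name is only an example of the `{tissue}` placeholder.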