Project: bed-reader

Read and write the PLINK BED format, simply and efficiently.

Project Details

Latest version: 1.0.0
Home Page: https://fastlmm.github.io/
PyPI Page: https://pypi.org/project/bed-reader/

Project Popularity

PageRank: 0.0015248169796761278
Number of downloads: 233414

PyPI

This is the Python README. For Rust, see README-rust.md.

Features:

Fast multi-threaded Rust engine.
Supports all Python indexing methods. Slice data by individuals (samples) and/or SNPs (variants).
Used by PySnpTools, FaST-LMM, and PyStatGen.
Supports PLINK 1.9.

Install

Full version, with all optional dependencies:

pip install "bed-reader[samples,sparse]"

Minimal version, which depends only on NumPy:

pip install bed-reader
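
To see which pieces are already present before choosing between the two installs, a check along these lines may help. It is only a sketch: it assumes the samples extra supplies the pooch module and the sparse extra supplies scipy (module names not stated above), and installed_modules is a hypothetical helper, not part of bed-reader.

```python
import importlib.util


def installed_modules(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed module names: "pooch" for the samples extra, "scipy" for sparse.
status = installed_modules(["numpy", "bed_reader", "pooch", "scipy"])
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If bed_reader and numpy are reported as installed but the other two are missing, the minimal version is what is currently set up.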

Usage

Read genomic data from a .bed file.

>>> import numpy as np
>>> from bed_reader import open_bed, sample_file
>>>
>>> file_name = sample_file("small.bed")
>>> bed = open_bed(file_name)
>>> val = bed.read()
>>> print(val)
[[ 1.  0. nan  0.]
 [ 2.  0. nan  2.]
 [ 0.  1.  2.  0.]]
>>> del bed
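
The nan entries in the printed array mark missing values. As a sketch of how they might be handled downstream, the block below rebuilds that array by hand (so it runs without the sample file) and uses plain NumPy to count missing values and average the remaining ones per SNP. The variable names are illustrative only.

```python
import numpy as np

# The 3 x 4 array printed above: rows are individuals, columns are SNPs.
val = np.array([[1.0, 0.0, np.nan, 0.0],
                [2.0, 0.0, np.nan, 2.0],
                [0.0, 1.0, 2.0, 0.0]])

missing_per_snp = np.isnan(val).sum(axis=0)   # number of nan values per column
mean_per_snp = np.nanmean(val, axis=0)        # column means that skip nan

print(missing_per_snp)
print(mean_per_snp)
```

Ordinary reductions such as val.mean(axis=0) would return nan for any SNP with a missing value, which is why the nan-aware variant is used here.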

Read every second individual and the SNPs (variants) at index positions 20 through 29 (the slice 20:30 excludes its end).

>>> file_name2 = sample_file("some_missing.bed")
>>> bed2 = open_bed(file_name2)
>>> val2 = bed2.read(index=np.s_[::2,20:30])
>>> print(val2.shape)
(50, 10)
>>> del bed2
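
The index follows NumPy slicing rules, with individuals on the first axis and SNPs on the second. As a sketch, the same np.s_ expression applied to a stand-in array gives the same shape. The 100 x 100 size of the stand-in is an assumption made for illustration; only the count of 100 individuals is implied by the outputs shown in this README.

```python
import numpy as np

# Stand-in for the genotype matrix; the 100 x 100 size is an assumption.
stand_in = np.zeros((100, 100), dtype=np.float32)

index = np.s_[::2, 20:30]     # every second row, columns 20 through 29
subset = stand_in[index]

print(index)                  # a tuple of two slice objects
print(subset.shape)
```

Because np.s_ only builds the index object, it can be constructed once and reused across several reads.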

List the first 5 individual (sample) ids, the first 5 SNP (variant) ids, and every unique chromosome. Then, read every genomic value in chromosome 5.

>>> with open_bed(file_name2) as bed3:
...     print(bed3.iid[:5])
...     print(bed3.sid[:5])
...     print(np.unique(bed3.chromosome))
...     val3 = bed3.read(index=np.s_[:,bed3.chromosome=='5'])
...     print(val3.shape)
['iid_0' 'iid_1' 'iid_2' 'iid_3' 'iid_4']
['sid_0' 'sid_1' 'sid_2' 'sid_3' 'sid_4']
['1' '10' '11' '12' '13' '14' '15' '16' '17' '18' '19' '2' '20' '21' '22'
 '3' '4' '5' '6' '7' '8' '9']
(100, 6)
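
The chromosome values are strings, which is why the comparison uses '5' and why np.unique lists '10' before '2'. A minimal sketch of the same boolean-mask selection on hand-made data follows; both arrays are made up for illustration and do not come from the sample file.

```python
import numpy as np

# Made-up chromosome labels for six SNPs; strings, as in the output above.
chromosome = np.array(["1", "5", "10", "5", "2", "5"])
# Made-up values: 3 individuals x 6 SNPs.
values = np.arange(18, dtype=np.float32).reshape(3, 6)

mask = chromosome == "5"          # one boolean per SNP
selected = values[:, mask]        # keep only the chromosome 5 columns

print(np.unique(chromosome))      # sorted as text, not as numbers
print(selected.shape)
```

Comparing against the integer 5 instead of the string '5' would match nothing, so keeping the quotes matters.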
