dvpio.read.omics.read_pg_table
- dvpio.read.omics.read_pg_table(path, search_engine, *, column_mapping=None, measurement_regex=None, reader_provider_kwargs=None, **kwargs)
Read protein group table to the anndata.AnnData format.
Read (features x observations) protein group matrices from proteomics search engines into the anndata.AnnData format (observations x features). By default, raw intensities are returned; this can be changed depending on the search engine.
Supported formats include
- AlphaDIA (alphadia)
- AlphaPept (alphapept, csv+hdf)
- DIANN (diann)
- MaxQuant (maxquant)
- Spectronaut (spectronaut, parquet + tsv)
See dvpio.read.omics.available_reader() for a complete list. See the alphabase.pg_reader module for more information.
- Parameters:
path (str) – Path to the protein group matrix.
search_engine – Name of the search engine output; pass the method name of the corresponding reader. You can list all available readers with the dvpio.read.omics.available_reader() helper function.
column_mapping (Optional[dict[str, Any]] (default: None)) – A dictionary mapping alphabase columns (keys) to the corresponding columns in the other search engine (values). If None, the mapping is loaded from the column_mapping key of the respective search engine in pg_reader.yaml. Passed to alphabase.pg_reader.pg_reader.PGReaderProvider.get_reader().
measurement_regex (Optional[str] (default: None)) – Regular expression that identifies the correct measurement type. Only relevant if the PG matrix contains multiple measurement types. For example, alphapept returns the raw protein intensity per sample in column A and the LFQ-corrected value in A_LFQ. If None, all columns are used. Passed to alphabase.pg_reader.pg_reader.PGReaderProvider.get_reader().
reader_provider_kwargs (Optional[dict] (default: None)) – Passed to alphabase.pg_reader.pg_reader.PGReaderProvider.get_reader().
kwargs (Any) – Passed to spatialdata.models.TableModel.parse().
- Return type:
anndata.AnnData
- Returns:
AnnData object that can be further processed with scVerse packages.
- adata.X
Stores the values of the intensity columns in the report, with shape observations x features.
- adata.obs
Stores observations, with the protein group matrix sample names in the sample_id column.
- adata.var
Stores features and feature metadata.
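The orientation change described above (features x observations in, observations x features out) can be sketched in plain Python. This is not dvpio's implementation: the table contents, protein identifiers, sample names, and the transpose_pg_table helper are all hypothetical and only illustrate how rows and columns swap roles.

```python
# Hypothetical protein group table in features x observations orientation:
# one entry per protein group, one value per sample.
pg_table = {
    "P12345": {"sample_1": 10.0, "sample_2": 12.0},
    "Q67890": {"sample_1": 5.0, "sample_2": 7.0},
}


def transpose_pg_table(table):
    """Return (X, obs_names, var_names) in observations x features orientation."""
    var_names = list(table)                       # features (protein groups)
    obs_names = list(next(iter(table.values())))  # observations (samples)
    # One row per sample, one column per protein group.
    X = [[table[feature][sample] for feature in var_names] for sample in obs_names]
    return X, obs_names, var_names


X, obs_names, var_names = transpose_pg_table(pg_table)
print(obs_names)  # sample names, analogous to the sample_id column in adata.obs
print(var_names)  # protein groups, analogous to the index of adata.var
print(X)          # intensities, analogous to adata.X
```

Each row of X now corresponds to one sample and each column to one protein group, which mirrors the layout of adata.X, adata.obs, and adata.var.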
Example
    from dvpio.read.omics import read_pg_table

    alphadia_path = ...
    adata = read_pg_table(alphadia_path, search_engine="alphadia")

    maxquant_path = ...
    # Read LFQ values from MaxQuant report
    adata = read_pg_table(maxquant_path, search_engine="maxquant", measurement_regex="lfq")
Get available regular expressions
    from alphabase.pg_reader import pg_reader_provider

    alphapept_reader = pg_reader_provider.get_reader("alphapept")
    alphapept_reader.get_preconfigured_regex()
    > {'raw': '^.*(?<!_LFQ)$', 'lfq': '_LFQ$'}
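To see what the two preconfigured alphapept patterns shown above select, the following sketch applies them to a handful of column names using only the standard library. The column names and the select_columns helper are hypothetical, and the use of re.search is an assumption; how alphabase applies the expression internally is not stated here.

```python
import re

# Patterns copied from the preconfigured alphapept regexes.
PRECONFIGURED = {"raw": r"^.*(?<!_LFQ)$", "lfq": r"_LFQ$"}


def select_columns(columns, pattern):
    """Return the column names matched by `pattern` (hypothetical helper)."""
    compiled = re.compile(pattern)
    return [name for name in columns if compiled.search(name)]


# Hypothetical sample columns: raw intensity in `A`, LFQ-corrected value in `A_LFQ`.
columns = ["A", "A_LFQ", "B", "B_LFQ"]

raw_columns = select_columns(columns, PRECONFIGURED["raw"])
lfq_columns = select_columns(columns, PRECONFIGURED["lfq"])
print(raw_columns)
print(lfq_columns)
```

The negative lookbehind in the raw pattern rejects any name ending in _LFQ, so raw_columns holds A and B, while the lfq pattern keeps only A_LFQ and B_LFQ.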
See also