A Discussion of Proteomics Data Processing

R/BIOCONDUCTOR FOR MASS SPECTROMETRY AND PROTEOMICS

Posted by Chevy on January 18, 2023

Background

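There does not yet seem to be a settled consensus on how to process semi-quantitative proteomics results. This post discusses the question and cites related discussions of it.

To be clear up front, this discussion covers the analysis of two types of MassSpec data, TMT-MS and DIA-MS. The quantified data we get back from the company looks somewhat like RNA-seq count data.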

Methods discussion

  1. Wilcox.test (my earlier comment was written in English, so I am pasting it here as is rather than translating it) As reported in a Cell paper published in 2020, Integrative Proteomic Characterization of Human Lung Adenocarcinoma, the Wilcoxon signed-rank test (or the Wilcoxon rank-sum test) with BH adjustment can be used to calculate p-values and adjusted p-values for mass spec data. But considering our replicates (always 2 or 3 per group), a Wilcoxon test is not suitable for our MS data, because a rank-based test cannot reach a small p-value with so few observations.

Compared with the Wilcoxon signed-rank test, Student's t-test assumes that the data come from a normally distributed population; the Wilcoxon signed-rank test makes no such assumption.

Comment:

However, when applied to data with only a few replicates, these approaches (t.test or Wilcox) lack statistical power, because the variance is difficult to estimate.
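To make the power problem concrete, here is a minimal sketch in base R. The intensities are made-up values for a single hypothetical protein measured in 3 vs 3 samples, not real data. Even when the two groups are perfectly separated, the exact two-sided rank-sum test cannot return a p-value below 2/choose(6, 3) = 0.1, so nothing can pass a 0.05 cutoff even before multiple-testing correction.

```r
# Hypothetical log2 intensities for one protein, 3 replicates per group
ctrl  <- c(20.1, 20.4, 19.8)
treat <- c(23.0, 23.5, 22.7)

# Unpaired Wilcoxon rank-sum test; with small n and no ties R computes an exact p-value
w <- wilcox.test(treat, ctrl)
w$p.value   # 0.1

# Smallest two-sided p-value attainable with 3 vs 3 samples:
# only 2 of the choose(6, 3) = 20 possible rank assignments are the most extreme
min_p <- 2 / choose(6, 3)
min_p       # 0.1

# Welch t-test on the same values, for comparison
tt <- t.test(treat, ctrl)
tt$p.value
```

The t-test can go below 0.05 here, but its variance estimate rests on only three values per group, which is the weakness the comment above points to.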

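  2. t.test
    • A fairly straightforward statistical method, but it only assesses whether a difference is real; combined with logFC, it is enough to complete the analysis or draw a volcano plot.
  3. The R package limma, or packages built on limma such as DEqMS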
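The last two options can be sketched roughly as follows. This is only an illustration under stated assumptions: the matrix is simulated (200 hypothetical proteins, 3 vs 3 samples, already on the log2 scale), and names such as `mat`, `group` and `res_t` are made up for the example. The limma part runs only if the package is installed, and it uses the standard `lmFit` / `eBayes` / `topTable` workflow rather than anything specific to a vendor's output format.

```r
set.seed(1)

# Simulated log2 intensity matrix: 200 hypothetical proteins x 6 samples (3 vs 3)
n_prot <- 200
group  <- factor(rep(c("ctrl", "treat"), each = 3))
mat <- matrix(rnorm(n_prot * 6, mean = 20, sd = 0.5), nrow = n_prot,
              dimnames = list(paste0("prot", seq_len(n_prot)),
                              paste0(group, "_", rep(1:3, 2))))
# Shift the first 10 proteins in the treated group so there is something to find
mat[1:10, group == "treat"] <- mat[1:10, group == "treat"] + 2

# Option 2: per-protein Welch t-test, with logFC as the difference of group means
p_val <- apply(mat, 1, function(x) {
  t.test(x[group == "treat"], x[group == "ctrl"])$p.value
})
log_fc <- rowMeans(mat[, group == "treat"]) - rowMeans(mat[, group == "ctrl"])

res_t <- data.frame(logFC   = log_fc,
                    p.value = p_val,
                    adj.p   = p.adjust(p_val, method = "BH"))
head(res_t[order(res_t$p.value), ])

# Option 3: limma moderated t-test, which borrows variance information across proteins
if (requireNamespace("limma", quietly = TRUE)) {
  design <- model.matrix(~ group)   # columns: (Intercept), grouptreat
  fit <- limma::eBayes(limma::lmFit(mat, design))
  res_limma <- limma::topTable(fit, coef = "grouptreat",
                               number = Inf, adjust.method = "BH")
  head(res_limma)
}
```

A basic volcano plot from the t-test table is then `plot(res_t$logFC, -log10(res_t$p.value))`. DEqMS is not shown here; as noted in the list above, it is built on top of limma.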

References