Addendum
This paper covers the topic in great detail:
Embrace R to Boost Your Proteomic Analysis
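Background
There does not yet seem to be a settled consensus on how to handle semi-quantitative proteomics results. This post collects and discusses the relevant arguments and references.
To be clear, the discussion here concerns two types of mass spectrometry data, TMT-MS and DIA-MS. After quantification by a commercial provider, the data we receive look somewhat like RNA-seq count data.
Discussion of methods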
- Wilcox.test (pasted as is from a comment I wrote earlier in English) As reported in the 2020 Cell paper Integrative Proteomic Characterization of Human Lung Adenocarcinoma, the Wilcoxon signed-rank test (for paired samples) or the Wilcoxon rank-sum test (for unpaired samples) with BH adjustment can be used to calculate p-values and adjusted p-values for mass spec data. However, given our number of replicates (usually 2 or 3), the Wilcoxon test is not suitable for our MS data.
Unlike the Wilcoxon signed-rank test, Student's t-test assumes that the data come from a normally distributed population; the Wilcoxon test makes no such assumption.
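Comments:
However, when applied to data with only a few replicates, these approaches (t.test or Wilcoxon) lack statistical power because the variance is difficult to estimate.
- For large proteomics cohorts, the Wilcoxon test is a well-suited method. For example, 思考问题的熊 published a post titled 别再用DEseq2和edgeR进行大样本差异表达基因分析了 ("Stop using DESeq2 and edgeR for large-sample differential expression analysis").
- Data generated in an individual lab usually do not have many samples, so the Wilcoxon test is probably not appropriate for this kind of data.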
- t.test
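- A fairly straightforward method, but it only assesses whether a difference is statistically real. Paired with logFC, it is enough to complete the analysis or to draw a volcano plot.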
- The R package limma, or packages built on limma such as DEqMS
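
Coming back to the Wilcoxon point above, the problem with 2 or 3 replicates can be made concrete with a little arithmetic. The following is only a minimal sketch in Python (standard library only). The function names `min_two_sided_rank_sum_p` and `bh_adjust` are my own and do not come from any package, and the sketch assumes an exact, unpaired rank-sum test with no ties. In R the corresponding calls would be `wilcox.test` and `p.adjust(method = "BH")`.

```python
from math import comb


def min_two_sided_rank_sum_p(n1: int, n2: int) -> float:
    """Smallest two-sided p-value an exact Wilcoxon rank-sum test can give.

    Assumption: no ties. Under the null hypothesis every way of assigning
    ranks to group 1 is equally likely, so the most extreme outcome
    (the two groups completely separated) has one-sided probability
    1 / C(n1 + n2, n1). The two-sided value is twice that, capped at 1.
    """
    return min(1.0, 2.0 / comb(n1 + n2, n1))


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Best attainable p-value for a few replicate designs (group 1 vs group 2).
for n1, n2 in [(2, 2), (3, 3), (10, 10)]:
    print(f"{n1} vs {n2} replicates: {min_two_sided_rank_sum_p(n1, n2):.6f}")

# BH adjustment on a small, made-up vector of raw p-values (illustrative only).
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

With 3 vs 3 replicates the smallest possible p-value is 2/20 = 0.1, and with 2 vs 2 it is 2/6, so no protein can reach p < 0.05 even before adjustment. BH adjustment never lowers a p-value, so nothing can pass after adjustment either. With 10 vs 10 the floor drops to roughly 1e-5, which fits the observation that the test works for large cohorts.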