Software

Over the years, I have led or co-led the development of the following downloadable software for both bioinformatics and big-data analysis. Please cite the related articles if you use any of these programs in your research. The programs are listed chronologically:

  • MACAW. A self-extracting Windows program developed jointly with people at NCBI (not popular anymore). It searches for and aligns a subtly conserved single block-motif among multiple sequences, assuming one occurrence in each sequence. See its companion articles published in Science (Lawrence et al. 1993) and J. Am. Statist. Assoc. (Liu 1994).
  • Gibbs Motif Sampler. Software that searches for multiple motifs with an unknown number of repeats in multiple protein sequences. Its companion articles were published in J. Am. Statist. Assoc. (Liu et al. 1995) and Protein Sci. (Neuwald et al. 1995). A Motif Sampler server for discovering both DNA regulatory binding sites and protein sequence motifs can be accessed from the Wadsworth Lab Bioinformatics Center directed by Dr. Chip Lawrence.
  • PROBE. A UNIX (Sun OS) software tool for block-based multiple protein sequence alignment and for database search to detect remote protein homology. Its companion articles appeared in Nucl. Acids Res. (Neuwald et al. 1997) and J. Am. Statist. Assoc. (Liu et al. 1999).
  • Bayesian Aligner. A Bayesian pairwise alignment tool, also called ‘Bayesian Phylogenetic Footprint.’ Its companion article appeared in Bioinformatics.
  • BioProspector. An improved web-interactive algorithm for finding gene regulatory binding motifs. See the companion article published in the Proceedings of the Pacific Symposium on Biocomputing (Liu, Brutlag, Liu 2001).
  • BLADE v2. A Bayesian LinkAge DisEquilibrium mapping algorithm based on Liu et al. (2001), published in Genome Research. The executable program was produced by Dr. Xin Lu, with a companion publication, Lu, Niu and Liu (2003), in the same journal.
  • HAPLOTYPER. SNP haplotype reconstruction based on the Partition-Ligation method (Niu et al. 2002), published in Am. J. Hum. Genet. Alongside it, we also developed EM-DeCODER for the same purpose.
  • PL-EM. SNP haplotype reconstruction based on the Partition-Ligation method and the EM algorithm (Qin et al. 2002), published in Am. J. Hum. Genet.
  • BPPS (Bayesian Partition and Pattern Selection). BPPS identifies and characterizes subgroups of proteins by partitioning a multiple sequence alignment (MSA) of all the proteins into a hierarchically nested series of sub-MSAs based on correlated residue patterns that are distinctive of each subgroup.
  • MDScan. A new, fast, and accurate algorithm for finding protein-DNA interacting sites (gene regulatory binding motifs) from the 5′ untranslated sequences selected by Chromatin-immunoprecipitation microarray (ChIP-array) and other microarray experiments. Its companion paper was published in Nature Biotechnology, 2002.
  • BMC. A novel Bayesian algorithm for putative motif clustering. See the companion paper published in Nature Biotechnology, 2003.
  • Motif Regressor. An efficient algorithm for integrating sequence motif discovery with measures from mRNA expression microarray or Chromatin-Immunoprecipitation microarray (ChIP-chip) experiments. Its companion paper was published in Proc. Nat’l Acad. Sci. USA, 2003. A more user-friendly download site is maintained by Erin Conlon here.
  • GMS-MP: Gibbs Motif Sampler for Paired Correlation Model. See Zhou & Liu (2004) in Bioinformatics.
  • BioOptimizer: A Bayesian scoring method for comparing and optimizing regulatory motif predictions from AlignACE, BioProspector, CONSENSUS, and MEME. Read details in Jensen & Liu (2004) in Bioinformatics.
  • Smoothing Spline Clustering (SSC) algorithm (Ma, Castillo-Davis, et al. 2006): a data-driven statistical method for clustering time-series gene expression data. A newer version of the software is available here.
  • BEAM (Bayesian Epistasis Association Mapping): A powerful Bayesian inference algorithm for detecting marker interactions in case-control population genetic studies. The companion paper, Zhang and Liu (2007), was published in Nature Genetics.
  • Tmod (Toolbox for Motif Discovery): A Windows-based software suite that incorporates 12 popular sequence motif discovery algorithms, such as MEME, BioProspector, AlignACE, GLAM, and YMF. It helps researchers compare and combine motif-finding results from different algorithms. Compatible with Windows 2000, XP, Vista, and 7. The companion paper, Sun et al. (2009), was published in Bioinformatics.
  • Mining Biological Literature for Protein-Protein Interactions: A web server is available. The companion paper, Chowdhary, Zhang, and Liu (2009), was published in Bioinformatics. Programs and scripts are also available here.
  • HiCNorm: removing biases in Hi-C data via Poisson regression (Hu, Qin, and Liu 2012)
  • BACH: Bayesian 3D constructor for Hi-C data (Hu, Deng, et al. 2013)
  • CLIME: Clustering by Inferred Models of Evolution (Li, Calvo, et al. 2014). Some OMIM results.
  • RABIT: Regression Analysis with Background InTegration
  • SIRI: SIR for variable selection via Inverse modeling
  • DS: Dynamic Slicing for k-Sample Testing
  • CLIC: Clustering by Inferred Co-Expression (Li, Jourdain et al. 2017). Identifies new members of the target pathway and its relevant expression datasets.
  • SODA: Stepwise method for variable and interaction selection for Logistic regression and Index models
  • PhyloAcc: Bayesian detection of changes of conservation of a genomic region (Hu et al. 2019; Sackton et al. 2019)