Description
Statsample is a statistical library for Ruby. It has modules for descriptive and inferential statistics. Compatible with Ruby 1.8.7, 1.9.1, 1.9.2 and ruby-head.
Features:
- Classes for Vector, Datasets (set of Vectors) and Multisets (multiple datasets with same fields and type of vectors), and multiple methods to manipulate them.
- Converters to and from databases (using dbi), csv (standard library) and Excel files (using spreadsheet). Can output to GGobi, MX and Rosuda's Mondrian files.
- Module Codification, to help code open-ended questions.
- Factor Analysis. Module Factor provides classes to extract factors (PCA and PrincipalAxis), rotate the component matrix (Varimax, Equimax and Quartimax) and determine the number of components to retain (Horn's ParallelAnalysis and Velicer's MAP).
- OLS Multiple Regression. Listwise analysis is optimized with the GSL library. Pairwise analysis is executed in pure Ruby, based on a correlation matrix or on actual data.
- Dominance Analysis. Based on the Budescu and Azen papers, the DominanceAnalysis class can report dominance analysis for a sample, and DominanceAnalysis::Bootstrap can execute a bootstrap analysis to determine dominance stability, as recommended by Azen & Budescu (2003).
- Binomial Regression: Logit and probit
- Module Bivariate provides covariance and Pearson, Spearman, point biserial, tau a, tau b, gamma, tetrachoric and polychoric correlations. Includes methods to create correlation (Pearson, tetrachoric, polychoric) and covariance matrices, for internal analysis or export to SPSS.
- Module Crosstab provides functions to create crosstabs for categorical data.
- Class Anova lets you perform One-Way or Two-Way Anova with actual data or direct input of variances and degrees of freedom.
- Module Test has several statistical tests available: F, Mann-Whitney's U, Student's T for 1 sample and 2 independent samples, and Chi-Square.
- Reliability analysis provides functions to analyze scales.
- Class ScaleAnalysis provides statistics for a scale, such as mean, standard deviation, Cronbach's alpha and standardized Cronbach's alpha, and for each item: mean, correlation with total scale, mean if deleted, alpha if deleted.
- Class MultiScaleAnalysis provides a DSL to perform multiple ScaleAnalysis with minimum effort
- Class ICC provides intra-class correlation
- Module SRS (Simple Random Sampling) provides functions to estimate standard error for several types of samples
- Module Graph provides several classes to create beautiful graphs using rubyvis: Histogram, Boxplot and Scatterplot
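As a rough illustration of the reliability statistics listed above, the following sketch computes Cronbach's alpha in plain Ruby, without Statsample. The method names `sample_variance` and `cronbach_alpha` are hypothetical helpers written for this example and are not part of the Statsample API; the sketch also assumes Ruby 2.4 or later (for `Array#sum`) and complete data with no missing values.

```ruby
# Unbiased sample variance (divides by n - 1).
# Hypothetical helper, not a Statsample method.
def sample_variance(values)
  n = values.size
  mean = values.sum.to_f / n
  values.sum { |v| (v - mean)**2 } / (n - 1)
end

# Cronbach's alpha for a scale.
# items: array of items, each an array of scores (one score per respondent).
# Hypothetical helper, not a Statsample method.
def cronbach_alpha(items)
  k = items.size
  # Total scale score for each respondent
  totals = items.transpose.map(&:sum)
  item_variance_sum = items.sum { |item| sample_variance(item) }
  (k.to_f / (k - 1)) * (1 - item_variance_sum / sample_variance(totals))
end

# Three items answered by five respondents (made-up scores)
items = [
  [1, 2, 3, 4, 5],
  [2, 3, 4, 5, 6],
  [1, 3, 3, 5, 5]
]
puts cronbach_alpha(items)
```

Alpha approaches 1 when the items vary together, so that most of the variance of the total score comes from covariance between items rather than from the individual item variances.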
Examples
    require 'statsample'

    # Create a Vector
    a=(1..100).collect { rand(100)}.to_scale

    # Retrieve vector mean and standard deviation
    a.mean
    a.sd

    # Calculate correlation coefficient
    b=(1..100).collect { rand(100)}.to_scale
    Statsample::Bivariate.pearson(a,b)
    # Creates a dataset
    ds={"a"=>a,"b"=>b}.to_dataset

    # Creates a new vector based on previous vectors and add it to the dataset
    ds['c']=ds.collect {|r| r['a']*10+r['b']*5+rand(10) }

    # OLS
    lr=Statsample::Regression.multiple(ds,"c")
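To make the correlation step in the example more concrete, here is a minimal sketch of the Pearson correlation coefficient in plain Ruby. It is only meant to show the quantity being estimated; `pearson_r` is a hypothetical name, not Statsample's implementation, and the sketch assumes Ruby 2.4 or later and two numeric arrays of equal length with no missing values.

```ruby
# Pearson correlation coefficient between two equally sized numeric arrays.
# Hypothetical helper for illustration, not a Statsample method.
def pearson_r(xs, ys)
  raise ArgumentError, "vectors must have the same size" unless xs.size == ys.size

  n = xs.size
  mean_x = xs.sum.to_f / n
  mean_y = ys.sum.to_f / n

  # Sum of cross-products of deviations from the means
  cross = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  # Sums of squared deviations for each variable
  ss_x = xs.sum { |x| (x - mean_x)**2 }
  ss_y = ys.sum { |y| (y - mean_y)**2 }

  cross / Math.sqrt(ss_x * ss_y)
end

# Two random vectors, mirroring the example above
a = (1..100).collect { rand(100) }
b = (1..100).collect { rand(100) }
puts pearson_r(a, b)
```

Because both vectors are drawn independently at random, the printed coefficient should usually be close to zero, while a vector correlated with itself gives 1.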
Download
With rubygems
$ gem install statsample
To install optimization packages
$ sudo gem install statsample-optimization
Download directly on Gemcutter
Resources
- Source code: Available on http://github.com/clbustos/statsample.
- Documentation: API available
- Mailing list: Google groups - Statsample
- Bug report or feature request: http://github.com/clbustos/statsample/issues.
Related projects
- Statsample Bivariate Extension: Polychoric and Tetrachoric correlation support for statsample
- ReportBuilder: an interface to create documents in various formats using one API. Currently, text, rtf and html implementations are available (PDF and LaTeX coming soon)
- Minimization: algorithms for minimization of functions. Only unidimensional minimization.
- Rserve-client: Rserve Ruby client. Allows you to retrieve data from and send data to R
- Rubyvis: Ruby port of Protovis, a graphic library
Other useful projects
- SVG::Graph module: Gem for SVG::Graph (author: Sean Russell)
Manuals
- Como hacer una análisis de dominancia ("How to perform a dominance analysis"; PDF, in Spanish).