**Introduction**

Software packages have become indispensable in statistical data analysis. Modern data analysts need to understand what these packages can do and put them to use on real-world problems.

**Statistical Computing**

Computing is the activity of using computers or computer software. It is an essential component of statistical practice and research. Computational tools form the basis for virtually all applied statistics. Investigations in visualization, model assessment, and model fitting all rely on computation. Computation can also be used for statistical inference.

Statistical computing is, in essence, the analysis of data using computers. Widespread access to computers has made data processing easier, and advanced techniques such as multivariate analysis can now be carried out with little effort. Computers are particularly useful when the same task has to be performed many times or when the data set is large. A computer can execute an iterative algorithm quickly and with far fewer arithmetic slips than manual calculation.

Tools such as simulation, bootstrapping and Markov chain Monte Carlo are widely used to validate procedures and to provide guidance for both practical and theoretical problems.
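As an illustration of one of these tools, the sketch below computes a percentile bootstrap confidence interval for a sample mean using only the Python standard library. The function name `bootstrap_mean_ci`, its parameters, and the sample values are assumptions made up for this example; it is a minimal sketch, not a replacement for the routines in a statistical package.

```python
import random
import statistics


def bootstrap_mean_ci(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 4.4]  # made-up values
low, high = bootstrap_mean_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

The same resampling loop works for other statistics, such as the median, by swapping the function applied to each resample.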

**Statistical packages for analysis**

Numerical Computing: Topics include computer arithmetic, optimisation, numerical linear algebra, random number generation, and simulation.

Computational Inference: Simulation, Markov chain Monte Carlo, bootstrapping, and resampling. Problems these methods can be used to solve include random number generation and Bayesian inference.

Statistical Modeling using Software: Basic instruction in using statistical software for fitting models, interpreting output, and so on. An introduction to a statistical language can be given here.

Visualization: Techniques for plotting and visualizing data. Appropriate use of colour, basic techniques of information visualization.

Programming Languages: Introduction to programming and to languages that can be used, such as Java, Python, and Perl, together with related concepts: object-oriented programming, procedural and scripting languages, compilers, interpreters, and distributed computing.

Data Technologies: Markup languages, database and query languages, web services, output delivery systems.

Algorithm Design and Implementation: Learning basic strategies for designing and implementing algorithms. Software design, unit testing, documentation, and user interface design.
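To make the algorithm design and unit testing topics concrete, here is a small sketch: a one-pass (Welford) computation of the mean and sample variance, checked against Python's `statistics` module. The function and test names are hypothetical, chosen only for this illustration.

```python
import math
import statistics


def running_mean_variance(values):
    """One-pass (Welford) mean and sample variance."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count < 2:
        raise ValueError("need at least two values for a sample variance")
    return mean, m2 / (count - 1)


def test_running_mean_variance():
    # A simple unit test: compare against the standard library results.
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    mean, var = running_mean_variance(data)
    assert math.isclose(mean, statistics.fmean(data))
    assert math.isclose(var, statistics.variance(data))


test_running_mean_variance()
print("all checks passed")
```

The one-pass form avoids storing the data twice and is less prone to rounding problems than the textbook "sum of squares minus square of sum" formula, which is the kind of design consideration this topic covers.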

**Importance of Statistical Computing**

1. Technology is rapidly changing the way things are done, and statistical data analysis is no exception. Manual computation and analysis of statistical data can be laborious and time-consuming, and the results can be inaccurate and misleading. Statistical computing provides software solutions for statistical practice.

2. Data technology: The methods and software used to obtain, manage, manipulate, and store data have changed dramatically. Statistical computing prepares statisticians and data analysts to face the current realities of data analysis and management. Big data technology, for example, is used in real-time analysis of market trends and performance. Statistical computing also builds familiarity with markup languages such as XML, database technologies, and web services. Evolving software technologies, including object-oriented programming, aspect-oriented programming, distributed computing, and other paradigms, play an important role in the development of statistical packages and hence in statistical practice.

3. Computational inference: In many cases, computation and theory are complementary, and better results are obtained by using both. By and large, however, few statisticians are equally capable in both. Reliable, easy-to-use software implementations make these tools more accessible.

**Statistical Computing and Analysis Software**

- SPSS
- TORA
- Minitab
- EViews
- MATLAB
- Excel
- Statistica
- Stata
- SAS
- R

**SPSS**

**How to Use SPSS for Statistical Data Analysis**

**SPSS Windows**

- Data Editor: This window displays the contents of the data file. Here you can create new data files or open and modify existing ones. The Data Editor window opens automatically when you start an SPSS session.
- Viewer Window: This window displays the results of an analysis as text, pivot tables, or charts.

**Running Analysis on SPSS**

*Rules for Naming Variables*

- Must be unique (that is, each variable in a data set must have a different name)
- Can have no more than eight characters in older versions of SPSS (version 12 and later accept names of up to 64 characters)
- Cannot include a full stop, a blank, or special characters such as !, *, and “
- Cannot be a word that SPSS reserves as a keyword (all, and, by, eq, ge, gt, le, lt, ne, not, or, to, with)
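The naming rules above can be checked before data entry with a short script. The sketch below is written in Python and is not part of SPSS; the function name `check_variable_names` is hypothetical. It assumes SPSS's reserved keywords are ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, and WITH, exposes the length limit as a parameter because it differs between SPSS versions, and uses a deliberately strict pattern (a letter followed by letters, digits, or underscores) as a simplification of what SPSS actually accepts.

```python
import re

# Assumed list of SPSS reserved keywords (compared case-insensitively).
RESERVED_WORDS = {
    "ALL", "AND", "BY", "EQ", "GE", "GT", "LE",
    "LT", "NE", "NOT", "OR", "TO", "WITH",
}


def check_variable_names(names, max_length=8):
    """Return a dict mapping each problematic name to a list of issues."""
    problems = {}
    seen = set()
    for name in names:
        issues = []
        key = name.upper()  # treat names as case-insensitive
        if key in seen:
            issues.append("duplicate name")
        seen.add(key)
        if len(name) > max_length:
            issues.append(f"longer than {max_length} characters")
        # Simplified character rule: no blanks, full stops, or symbols.
        if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
            issues.append("invalid characters")
        if key in RESERVED_WORDS:
            issues.append("reserved word")
        if issues:
            problems[name] = issues
    return problems


names = ["age", "income", "household_size", "test score", "by", "age"]
for name, issues in check_variable_names(names).items():
    print(f"{name!r}: {'; '.join(issues)}")
```

Passing `max_length=64` applies the longer limit of recent SPSS versions.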

- Click the Variable View tab and enter the variables
- Click the Data View tab and enter the data
- Then click Analyze on the menu bar

**TORA**

TORA is optimisation software used mainly for operations research (OR) analysis. It handles the following topics:

- Linear equations
- Linear programming
- Integer programming
- Network models
- Project planning
- Queuing analysis
- Zero-sum games
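To give a feel for one of these topics, queuing analysis, the sketch below computes the standard steady-state measures of a single-server M/M/1 queue in Python. It does not use or imitate TORA's interface; the function name `mm1_metrics` and the example rates are assumptions for illustration only.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state measures for an M/M/1 queue (illustrative sketch)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("steady state requires arrival_rate < service_rate")
    rho = arrival_rate / service_rate  # server utilisation
    return {
        "utilisation": rho,
        "mean_in_system": rho / (1 - rho),
        "mean_in_queue": rho**2 / (1 - rho),
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
        "mean_wait_in_queue": rho / (service_rate - arrival_rate),
    }


# Made-up example: 4 arrivals per hour, 5 services per hour.
for measure, value in mm1_metrics(4, 5).items():
    print(f"{measure}: {value:.2f}")
```

The two mean-number measures and the two mean-time measures are linked by Little's law: the mean number in the system equals the arrival rate times the mean time in the system.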


**Minitab**

Minitab Statistical Software is used for statistical data analysis ranging from basic to advanced statistics. It is used in predictive market trend analysis and in the analysis of experimental and medical research data. Its graphical user interface makes it easy to use, even for novices.

Minitab for Windows gives you a data analysis environment that consists of the following:

- A worksheet that contains your data
- A Data window that shows columns of data
- Menus for issuing commands for statistical analysis, data manipulation, and data transformation. Menu items can execute a command or open a dialog box.
- A Session window that displays your results
- Graph window for high-resolution graphs
- An Info window that displays a summary of your worksheet
- A History window that lists the commands you have used in your session. You can re-execute commands by copying them from the History window and pasting them into the Command Line Editor.
- Session commands, an alternative to menu commands, which you can type in the Session window or in the Command Line Editor. You can intersperse menu commands and session commands throughout your session if you wish.
- A pop-up Command Line Editor that lets you quickly edit and re-execute session commands
- Context-sensitive Help for dialog boxes and Session window commands, along with overview information
- A complete macro language that lets you automate repetitive tasks, extend Minitab's functionality, or even design your own session commands

**Steps in Statistical Computation and Analysis Using Minitab**

- Start Minitab
- Enter data into worksheet
- Analyse your data
- Graph your data
- Save and print your results
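The same enter, analyse, graph, and save workflow can be sketched outside Minitab. The Python example below is only an analogy using the standard library, not Minitab session commands; the function names and the scores are made up for illustration, and the "graph" is a plain-text histogram.

```python
import csv
import io
import statistics


def summarise(values):
    """Analyse: basic descriptive statistics."""
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def text_histogram(values, bin_width):
    """Graph: one line of asterisks per bin, labelled by the bin's lower edge."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return [f"{start:>6g} | {'*' * counts[start]}" for start in sorted(counts)]


def to_csv(summary):
    """Save: render the summary as CSV text (write it to a file as needed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["statistic", "value"])
    for name, value in summary.items():
        writer.writerow([name, value])
    return buffer.getvalue()


scores = [52, 67, 71, 58, 90, 85, 77, 63, 69, 74]  # enter data (made-up)
summary = summarise(scores)
print("\n".join(text_histogram(scores, 10)))
print(to_csv(summary))
```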

**EViews**

EViews provides sophisticated data analysis, regression, correlation, time series analysis, and forecasting tools on Windows-based computers. With EViews you can quickly develop a statistical relation from your data then use the relation to forecast future values of the data. Areas where EViews can be applied are scientific data analysis and evaluation, financial analysis, macroeconomic forecasting, simulation, sales forecasting, market trend, and cost analysis.

EViews provides convenient visual ways to enter data series from the keyboard or from disk files, to create new series from existing ones, to display and print series, and to carry out statistical analysis of the relationships among series.
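The idea of estimating a relation from data and then using it to forecast can be illustrated with a simple least-squares line. The Python sketch below is not EViews code; the function names `fit_line` and `forecast` and the sales figures are assumptions made up for this example, and a real forecasting model would usually need more than a straight line.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(intercept, slope, x_new):
    """Use the fitted relation to predict y at a new x."""
    return intercept + slope * x_new


periods = [1, 2, 3, 4, 5, 6]
sales = [102.0, 108.5, 115.2, 119.8, 127.1, 132.4]  # made-up series
intercept, slope = fit_line(periods, sales)
print(f"fitted relation: sales = {intercept:.2f} + {slope:.2f} * period")
print(f"forecast for period 7: {forecast(intercept, slope, 7):.2f}")
```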

**MATLAB**

MATLAB, developed by MathWorks, is a programming environment with a command-line interface (CLI) for mathematical and statistical computation. It is widely applied in science and engineering. Like R, MATLAB is a programming language, so at some point you will need to write your own code. Many toolboxes are available for it that are useful for research questions, such as EEGLAB for analysing EEG data.

MATLAB can be integrated with programming languages such as Python and C++. It can also import and export data in formats used by other data analysis tools, such as Excel, SPSS, and XML files.

**R**

R is open-source statistical software capable of data analysis, visualization, and heavy computation. It is a programming language used through a command-line interface (CLI). Many add-on packages are available for it, and they are widely used in medical research, especially in epidemiology, molecular biology, biostatistics, and meta-analysis. RStudio is an integrated development environment (IDE) that bundles R tools, in much the same way that the Oracle database engine provides the environment in which SQL is used. R is relatively new: its first version was released in 1993, and the RStudio IDE followed in 2011. R can exchange data with spreadsheet and statistical software such as Excel, Stata, and SAS.

**DataMelt**

DataMelt is software for numeric, mathematical, and statistical computation, symbolic calculation, data analysis, and visualization. The DataMelt platform combines the simplicity of scripting languages such as Python, Groovy, and Ruby with the power of hundreds of Java packages.