Tuesday, 19 February 2013

Working with R2MLwiN Part 2




Specifying the model

This is the second part of a series of notes demonstrating the use of R2MLwiN, an R command interface to the multilevel modelling software package MLwiN (see the MLwiN site for how to get access to MLwiN). The first set of notes showed how to get started with R2MLwiN. In these notes, I show how to add predictors (continuous, categorical, and interactions) to the fixed-effects part of a multilevel regression model, and how to fit random slopes. The examples use the ALNT.csv data (see Working with R2MLwiN Part 1 for a description of the data). Though the series is concerned with demonstrating Bayesian estimation using MCMC methods, the examples presented here do not depend on MCMC estimation, and so, to speed up the running of the examples, they use maximum likelihood estimation. It is easy enough to switch between the two procedures: set the EstM estimation option to 1 for MCMC, or to 0 for maximum likelihood.

Sunday, 3 February 2013

Scatterplot with marginal boxplots


Using R and ggplot2 to draw a scatterplot with the two marginal boxplots

Drawing a scatterplot with marginal boxplots (or marginal histograms or marginal density plots) has always been a bit tricky (well, for me anyway). The approach I take here is, first, to draw the three separate plots using ggplot2:
  • the scatterplot;
  • the horizontal boxplot to appear in the top margin;
  • the vertical boxplot to appear in the right margin;
then second, to set widths and heights of the spaces used for axis and tick mark labels, and to combine the three plots using functions from the gtable package. The difficulty has been to ensure that the tick mark labels on the vertical axis in the scatterplot panel and in the top marginal boxplot panel take up the same space. Functions from the gtable package make this a reasonably straightforward process.
To draw the following chart, I borrowed and modified code from here and here. The final code and data are available on GitHub.

plot of chunk ScatterBoxPlot
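
The two-step approach described above can be sketched roughly as follows. This is a minimal illustration, not the code published on GitHub: the built-in mtcars data and the object names (p_scatter, p_top, p_right, combined) are stand-ins, and the sketch assumes a reasonably recent ggplot2 in which simple, unfaceted plots all produce layout tables with the same number of rows and columns.

```r
library(ggplot2)
library(gtable)
library(grid)

# Step 1: draw the three separate plots (mtcars is a stand-in data set).
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Theme that removes axis titles, labels and ticks from the marginal plots.
blank_axes <- theme(axis.title = element_blank(),
                    axis.text  = element_blank(),
                    axis.ticks = element_blank())

# Horizontal boxplot of the x variable, for the top margin.
p_top <- ggplot(mtcars, aes(x = factor(1), y = wt)) +
  geom_boxplot() + coord_flip() + blank_axes

# Vertical boxplot of the y variable, for the right margin.
p_right <- ggplot(mtcars, aes(x = factor(1), y = mpg)) +
  geom_boxplot() + blank_axes

# Step 2: convert the plots to gtables and make the axis spaces match.
g_scatter <- ggplotGrob(p_scatter)
g_top     <- ggplotGrob(p_top)
g_right   <- ggplotGrob(p_right)

# Assumption: all three layout tables have the same dimensions, so the
# scatterplot's column widths and row heights can be copied across.
g_top$widths    <- g_scatter$widths
g_right$heights <- g_scatter$heights

# Combine the three plots in a 2 x 2 layout; the top-right cell stays empty.
combined <- gtable(widths  = unit(c(4, 1), "null"),
                   heights = unit(c(1, 4), "null"))
combined <- gtable_add_grob(combined, g_top,     t = 1, l = 1)
combined <- gtable_add_grob(combined, g_scatter, t = 2, l = 1)
combined <- gtable_add_grob(combined, g_right,   t = 2, l = 2)

grid.newpage()
grid.draw(combined)
```

Copying the scatterplot's widths to the top boxplot is what lines up the two panels horizontally; copying its heights to the right boxplot does the same vertically. The 4:1 size ratios are arbitrary choices for the illustration.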

Thursday, 24 January 2013

Working with R2MLwiN Part 1



Getting started with the R2MLwiN package

With the release of the R2MLwiN package in late 2012, R users have access to another software package for running Bayesian models using Markov chain Monte Carlo (MCMC) methods. R2MLwiN is an R command interface to MLwiN, allowing users to fit multilevel models using MLwiN from within the R environment.

To use R2MLwiN, MLwiN needs to be installed. MLwiN is available from the Centre for Multilevel Modelling at the University of Bristol. See the MLwiN site for availability: MLwiN is free for UK researchers; there is a fully functional 30-day free trial version available; or MLwiN can be purchased from the Centre for Multilevel Modelling. As the name suggests, MLwiN is available for the Windows platform only.

Thursday, 10 January 2013

Getting Access data into R


1. Introduction

These notes give the steps to configure a Windows machine so that R can communicate with Microsoft Access databases. It turns out that the same mechanism can be used to connect with Microsoft Excel workbooks, so the notes include R to Excel communication as well. In R, there are two main ways to connect with Access databases: using the ODBC (Open DataBase Connectivity) facility available on many computers; and using the DBI (DataBase Interface) package in R. These notes deal with ODBC only. The notes also include some details on how the set-up is different on a Mac, but none of the steps have been tested on a Mac. (It is worth mentioning a commercial product, Stat/Transfer (www.stattransfer.com), that simplifies the task of transferring data between data formats, including Access to R.)

ODBC allows a connection to a database to be opened, but that is only half the process. The second half of the process requires the use of SQL (Structured Query Language) to import database tables into R. Thus the notes also provide a brief introduction to SQL, and show how to formulate SQL requests within R and then to send the request through the open connection to the database.

Monday, 19 November 2012

R and SQLite: Part 1

Creating SQLite databases from R


1. Introduction

These notes show how to create an SQLite database from within R. The notes outline two ways in which R can communicate with SQLite databases: using the RSQLite package and using the sqldf package. Both packages use reasonably standard versions of SQL to administer and manage the database, but the two packages differ in the way meta statements are constructed.
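
As a rough illustration of the RSQLite route, the sketch below creates a database, writes a table to it, and sends an SQL request through the open connection. It is a minimal example under some assumptions rather than the code from these notes: it uses the current DBI-style interface to RSQLite, an in-memory database (":memory:") instead of a file on disk, and the built-in mtcars data with a made-up table name, cars.

```r
library(DBI)
library(RSQLite)

# Open a connection. ":memory:" gives a temporary in-memory database;
# supplying a file path instead would create the database on disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Write an R data frame to the database as a table named "cars".
dbWriteTable(con, "cars", mtcars)

# List the tables now held in the database.
tables <- dbListTables(con)

# Send an SQL request through the open connection; the result comes
# back as an R data frame.
four_cyl <- dbGetQuery(
  con,
  "SELECT mpg, cyl, wt FROM cars WHERE cyl = 4 ORDER BY mpg DESC"
)

# Close the connection when finished.
dbDisconnect(con)
```

The sqldf package wraps similar steps behind a single function that runs SQL statements against data frames, which is where the differences in meta statements mentioned above come in.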

Management of SQLite databases requires the use of SQL (Structured Query Language). These notes show how to formulate relevant SQL requests within R and then to send the requests through the open connection to an SQLite database. But for a comprehensive treatment of SQL, and in particular SQLite's flavour of SQL, readers should consult texts such as Allen & Owens (2010) and van der Lans (2009).