Tuesday, 17 September 2013

Ongoing dramas with epicurves date scales

I'm attempting to use ggplot and R for analysing some epidemiologic data,
and I'm continuing to struggle with getting an epidemic curve to appear
properly.
Data is here
attach(epicurve)
head(epicurve)
       onset age
1 21/12/2012  18
2 14/06/2013   8
3 10/06/2013  64
4 28/05/2013  79
5 14/04/2013  56
6  9/04/2013  66
library(ggplot2)
library(scales)
epicurve$onset <- as.Date(epicurve$onset, format="%d/%m/%Y")
ggplot(epicurve, aes(onset)) + geom_histogram() +
  scale_x_date(breaks=date_breaks("1 year"),
               minor_breaks=date_breaks("1 month"),
               labels=date_format("%b-%Y"))
This gives a graph that is fine as far as it goes, but the bin widths do not
correspond to any meaningful time period, and adjusting them is a matter of
trial and error.
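For what it's worth, on a Date axis the binwidth is measured in days, so a value of about 30 gives roughly month-sized bins. The following is only a sketch: it uses the six rows printed by head(epicurve) as stand-in data, and the name p_days is made up for illustration.

```r
library(ggplot2)
library(scales)

# Stand-in data: only the six rows shown by head(epicurve)
epicurve <- data.frame(
  onset = c("21/12/2012", "14/06/2013", "10/06/2013",
            "28/05/2013", "14/04/2013", "9/04/2013"),
  age = c(18, 8, 64, 79, 56, 66)
)
epicurve$onset <- as.Date(epicurve$onset, format = "%d/%m/%Y")

# Dates are stored as a number of days, so binwidth = 30 means 30-day bins.
# These approximate months but do not line up with calendar month boundaries.
p_days <- ggplot(epicurve, aes(onset)) +
  geom_histogram(binwidth = 30) +
  scale_x_date(labels = date_format("%b-%Y"))
```

Printing p_days draws the plot. The bins are still not true calendar months, which is why a month-based grouping is more attractive here.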
For this particular dataset, I'd like to display the cases by month of onset.
One way I worked out how to do this is:
epicurve$monyr <- format(epicurve$onset, "%b-%Y")
epicurve$monyr <- as.factor(epicurve$monyr)
ggplot(epicurve, aes(monyr)) + geom_histogram()
This outputs a graph that I can't post because of the reputation system. The bars
represent something meaningful, but the axis labels are a bomb-site. I
can't format the axes using scale_x_date because they aren't dates and I
can't work out what arguments to pass to scale_x_discrete to give useful
labels.
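One likely cause of the messy axis is that as.factor() sorts the month labels alphabetically rather than chronologically. A possible workaround, sketched below on the six rows from head(epicurve) as stand-in data, is to set the factor levels explicitly from a sequence of months. The names month_start, all_months and p_factor are hypothetical, and geom_bar() is used on the assumption that recent ggplot2 versions expect a continuous x for geom_histogram().

```r
library(ggplot2)

# Stand-in data: only the six rows shown by head(epicurve)
epicurve <- data.frame(
  onset = as.Date(c("21/12/2012", "14/06/2013", "10/06/2013",
                    "28/05/2013", "14/04/2013", "9/04/2013"),
                  format = "%d/%m/%Y")
)

# First day of each case's onset month
month_start <- as.Date(format(epicurve$onset, "%Y-%m-01"))

# Every month from first to last onset, so empty months keep a slot
all_months <- seq(min(month_start), max(month_start), by = "month")

# Levels are given in chronological order instead of alphabetical order
epicurve$monyr <- factor(format(month_start, "%b-%Y"),
                         levels = format(all_months, "%b-%Y"))

p_factor <- ggplot(epicurve, aes(monyr)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE) +   # keep months with zero cases
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
```

With drop = FALSE the months without cases still appear on the axis, which matters for an epidemic curve because gaps in time should stay visible.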
I have a feeling there should be an easier way to do this by doing an
operation on the onset column. Can anyone give me any pointers, please?
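One operation on the onset column that might do this is truncating each date to the first day of its month, which keeps the column as a Date so scale_x_date still applies. This is a sketch under assumptions, not a tested solution for the full dataset: it uses the six rows from head(epicurve) as stand-in data, and the column name month and object p_month are made up for illustration.

```r
library(ggplot2)
library(scales)

# Stand-in data: only the six rows shown by head(epicurve)
epicurve <- data.frame(
  onset = as.Date(c("21/12/2012", "14/06/2013", "10/06/2013",
                    "28/05/2013", "14/04/2013", "9/04/2013"),
                  format = "%d/%m/%Y")
)

# Truncate each onset date to the first of its month; the result is still a Date
epicurve$month <- as.Date(format(epicurve$onset, "%Y-%m-01"))

# geom_bar() counts the cases at each distinct month start,
# and the axis can be formatted as dates in the usual way
p_month <- ggplot(epicurve, aes(month)) +
  geom_bar() +
  scale_x_date(breaks = date_breaks("1 month"),
               labels = date_format("%b-%Y"))
```

Because every case in a month shares the same x value, each bar corresponds to exactly one calendar month, and months without cases show up as gaps on the continuous date axis.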
