Download and Parse DJ/UBS Commodities Indexes

Here is another data downloading and parsing script, this one for the Dow Jones/UBS Commodities Indexes. Compared to the last post, this parser deals with multiple sheets and multiple columns in each sheet. It also constructs monthly series from the daily data, and stores it using a different symbol. Finally, it’s a good example of how a script can be functionalized, maybe for use with cron.

Let’s look at the code…

download.DJUBS <- function (filesroot = "~/Data/DJUBS") {

  # Load needed packages:

  # filesroot: Set the working directory, where there's a .incoming folder that 
  # contains the downloaded spreadsheet.
  # Create and set the working directory if it doesn't exist
  if (!file.exists(filesroot))
    dir.create(filesroot, mode="0777")
  # Create and set the .incoming directory if it doesn't exist
  if (!file.exists(paste(filesroot, "/.incoming", sep="")))
    dir.create(paste(filesroot, "/.incoming", sep=""), mode="0777")
  setwd(paste(filesroot, "/.incoming", sep=""))
  # Remove the old file from .incoming
    system("rm DJUBS_full_hist.xls")
  # Download the xls workbook directly from the web site:
  print("Downloading excel spreadsheet from DJUBS web site...")
    stop(paste("No spreadsheet exists.  Download the spreadsheet to be processed from into ", filesroot, "/.incoming", sep=""))
  sheetnames=c("Excess Return", "Total Return")
  for(sheet in sheetnames){  
    print(paste("Reading", sheet, "sheet... This will take a moment..."))
    x = read.xls("DJUBS_full_hist.xls", sheet=sheet)
    # Add column names, get the descriptions to add as attributes
    colnames(x)=t([2,], FUN=as.character, MARGIN=1), stringsAsFactors=FALSE))
    x.attr = t([1,], stringsAsFactors=FALSE))
    # Get rid of the last line, which contains the disclaimer
    # Remove blank columns between sections
    # Get attributes and labels
    categoryNames = x.attr[,!]
    symbolNames = paste(make.names(colnames(x[,])), ".IDX", sep="")
    symbolNamesMonthly = paste(make.names(colnames(x[,])), ".M.IDX", sep="")
    ISOdates = as.Date(x[,1], "%m/%d/%Y")
    for(i in 2:length(symbolNames)) {
      # check to make sure directories exist for each symbol, first for daily series...
      dir.create(paste(filesroot, symbolNames[i], sep="/"), showWarnings = FALSE, 
      recursive = FALSE, mode = "0777")
      # ... then for monthly series
      dir.create(paste(filesroot, symbolNamesMonthly[i], sep="/"), showWarnings = FALSE, 
      recursive = FALSE, mode = "0777")
    # Parse the columns into individual price objects
    print("Processing columns as symbols...")
    for( i in 2:dim(x)[2]){
      x.xts = as.xts(as.numeric(x[,i]),
      R.xts = Return.calculate(x.xts)
      x.xts = cbind(x.xts, R.xts)
      colnames(x.xts)=c("Close", "Returns")
      xtsAttributes(x.xts) <- list(Description = paste(categoryNames[,i], sheet, "Index"))
      save(x.xts, file=paste(filesroot, symbolNames[i], paste(symbolNames[i], ".rda", sep=""), sep="/"))
      print(paste(symbolNames[i],", ",categoryNames[,i], ", ", sheet, sep=""))
      # Describe the metadata for each index
      instrument(symbolNames[i], currency="USD", multiplier=1, tick_size=.01, start_date=head(index(x.xts),1), description=paste(categoryNames[,i], "Index"), data="CR", source="DJUBS", frequency="Daily", assign_i=TRUE)
      # Construct a monthly series from the daily series
      x.m.xts = to.monthly(Cl(x.xts))
      x.m.xts = cbind(x.m.xts[,4], Return.calculate(x.m.xts[,4]))
      # @ TODO Want to delete the last line off ONLY IF the month is incomplete
      if(tail(index(x.xts),1) != as.Date(as.yearmon(tail(index(x.xts),1)), frac=1)) {
        # That test isn't quite right, but its close.  It won't work on the first
        # day of a new month when the last business day wasn't the last day of 
        # the month.  It will work for the second day.
        x.m.xts = x.m.xts[-dim(x.m.xts)[1],]
      # Index is set to last trading day of the month.  
      # Reset index to last day of the month to make alignment easier with other monthly series.  
      index(x.m.xts)=as.Date(index(x.m.xts), frac=1)
      xtsAttributes(x.m.xts) <- list(Description = paste(categoryNames[,i], sheet, "Index"))
      save(x.m.xts, file=paste(filesroot, symbolNamesMonthly[i], paste(symbolNamesMonthly[i], ".rda", sep=""), sep="/"))
      print(paste(symbolNamesMonthly[i],", ",categoryNames[,i], ", ", sheet, sep=""))
      # Describe the metadata for each index
      instrument(symbolNamesMonthly[i], currency="USD", multiplier=1, tick_size=.01, start_date=head(index(x.xts),1), description=paste(categoryNames[,i], "Index"), data="CR", source="DJUBS", frequency="Monthly", assign_i=TRUE)
  setSymbolLookup.FI(base_dir=filesroot, split_method='common')
  print( "Now, whenever you log in you need to register the instruments.  This")
  print( "might be a line you put into .Rprofile so that it happens automatically:")
  print( "> require(quantmod) # this requires a dev build after revision 560 or so.")
  print( "> setSymbolLookup.FI(base_dir=filesroot, split_method='common')")
  print( "Now you should be able to type:")
  print( "> getSymbols('DJUBSTR.IDX') ")

The data source provides daily data for the indexes, so the script constructs monthly time series for those symbols as well. Once the data is parsed and stored, we can do some analysis. In a fresh R session, I would do the following to prepare the workspace:

setSymbolLookup.FI(base_dir="~/Data/DJUBS", split_method='common')

Now I’ll retrieve the data from disk…


and chart the DJUBS Total Returns Index:

charts.PerformanceSummary(DJUBSTR.IDX[,"Returns"], ylog=TRUE, wealth.index=TRUE, main = "DJUBS Total Returns Index Returns")

Or I can use quantmod’s charts:

chartSeries(Cl(DJUBSTR.IDX), theme="white")

Those are the daily series, but I want to look at real returns during the last couple of decades so I’ll use the monthly series instead:


The code is available on R-Forge, in the /inst/parsers directory of the FinancialInstrument package.

One thought on “Download and Parse DJ/UBS Commodities Indexes

  1. Costas says:


    Great idea. However:

    Maybe I am doing something wrong but when I issue

    chartSeries(Cl(DJUBSTR.IDX), theme=”white”)

    I get a very funny plot for this index which does not correspond to the actual series… It seems it is joining prices from different parts of the spreadsheet…


    charts.PerformanceSummary(DJUBSTR.IDX[,”Returns”], ylog=TRUE, wealth.index=TRUE, main = “DJUBS Total Returns Index Returns”)


    harts.PerformanceSummary(DJUBSTR.IDX[,”Returns”], ylog=TRUE, wealth.index=TRUE, main = “DJUBS Total Returns Index Returns”)

    > sessionInfo()
    R version 2.14.1 (2011-12-22)
    Platform: i686-pc-linux-gnu (32-bit)

    [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] FinancialInstrument_1.0 quantmod_0.3-17
    [3] TTR_0.21-1 Defaults_1.1-1
    [5] gdata_2.11.0 PerformanceAnalytics_1.0.4.4
    [7] xts_0.8-6 zoo_1.7-7

    loaded via a namespace (and not attached):
    [1] grid_2.14.1 gtools_2.7.0 lattice_0.20-0

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Get every new post delivered to your Inbox.

%d bloggers like this: