R/Finance 2013 Is Coming Quickly…

There are about two weeks remaining until R/Finance 2013, being held on May 17th and 18th at UIC in Chicago.  Make sure you register beforehand to ensure you have a spot, and, yes, you do want to come to the conference dinner on Friday.

I am particularly excited about the lineup of keynotes this year, which includes:

  • Sanjiv Das – Santa Clara University; Author of Derivatives: Principles and Practice
  • Attilio Meucci – Chief Risk Officer at Kepos Capital, LP; Author of Risk and Asset Allocation
  • Ryan Sheftel – Managing Director for Electronic Market Making at Credit Suisse
  • Ruey Tsay – University of Chicago; Author of An Introduction to Analysis of Financial Data with R

In addition, the agenda for the two-day conference is quite strong, and I’m anticipating coming away with several pages of things to try from this lineup.

And there are several optional pre-conference sessions this year, some of which are close to sold out – you’ll want to act quickly if you want a seat.  Those cover topics and packages such as quantstrat, data.table, Rcpp, distributed computing, and whatever Jeff Ryan has on his mind (which is always interesting).

Make sure to introduce yourself – I hope to see you there!

Writing from R to Excel with xlsx

Paul Teetor, who is doing yeoman’s duty as one of the organizers of the Chicago R User Group (CRUG), asked recently if I would do a short presentation about a “favorite package”.  I picked xlsx, one of the many packages that provides a bridge between spreadsheets and R.  Here are the slides from my presentation last night; the script is below.

I’ll be honest with you – I use more than one package for reading and writing spreadsheets. But this was a good opportunity for me to dig into some unique features of xlsx and I think the results are worth recommending.

A key feature for me is that xlsx uses the Apache POI API, so Excel isn’t needed.  Apache POI is a mature, separately developed API between Java and Excel 2007.  That project is focused on creating and maintaining Java APIs for manipulating file formats based on the Office Open XML standards (OOXML) and Microsoft’s OLE 2 Compound Document format (OLE2).  As xlsx uses the rJava package to link Java and R, the heavy lifting of parsing XML schemas is being done in Java rather than in R.
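As a rough illustration of that point (this is not the script from the presentation), here is a minimal sketch of a round trip through an .xlsx file without Excel installed. It assumes the xlsx package and a Java runtime for rJava are available; the data frame and the sheet name are made up for the example.

```r
# Minimal sketch; assumes the xlsx package (and Java, for rJava) is installed
library(xlsx)

# Hypothetical example data, not taken from the presentation
returns <- data.frame(
  month  = c("2013-01", "2013-02", "2013-03"),
  fund_a = c(0.012, -0.004, 0.021),
  fund_b = c(0.008, 0.010, -0.003)
)

# Write the data frame to a worksheet in a temporary .xlsx file
out_file <- tempfile(fileext = ".xlsx")
write.xlsx(returns, file = out_file, sheetName = "Returns", row.names = FALSE)

# Read the sheet back to confirm the round trip
check <- read.xlsx(out_file, sheetName = "Returns")
print(check)
```

Setting row.names = FALSE keeps R's row names out of the first spreadsheet column, which is usually what you want when the file is headed to someone who will open it in Excel.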

GSoC and R: Off to the Races

Google Summer of Code has now opened for student applications, and the R Project has once again been selected as a mentoring organization.  I’ve discussed before that a variety of mentors have proposed a number of projects for students to work on during this summer, but I wanted to emphasize some points about the schedule.

The deadline for student submissions is May 03 at 19:00 UTC.  You have to have a credible application in Melange by this time, or the application will not get a slot.  That’s not a lot of time to create or pick an idea, write an application, identify a mentor, sign up for a Melange account, and post your application.  But students can improve their applications once they are posted, so it is worth putting up an incomplete draft if you need to.

Even after the deadline, students will receive questions and advice for improvements from the mentors once the application is up, and should be responsive to those requests.  All of the mentors are involved in voting about which projects will be funded.

Google has extended the amount of time spent determining slots this year, so the ‘behind-the-scenes’ process will be longer than it was in past years.  Accept/reject notices to students will come on May 31st.

Everyone who wants to participate in this year’s Google Summer of Code with R should join the Google Group: gsoc-r@googlegroups.com.

Good luck!

GSoC 2013: At the starting line

Google Summer of Code will be open for students on Monday, April 22.  The R Project has once again been selected as a mentoring organization, and a variety of mentors have proposed a number of projects for students to work on during this summer.  Here’s a bit about the program, and more on the R-related projects that are lining up for students this summer.

About Google Summer of Code

The concept is relatively simple – Google brings together students with mentors to work on open-source projects of their choosing.  Mentors get code written for their project, but no money; students get paid $5,000, equivalent to a nice summer internship.

If you’re a student and you’re interested in something R-related, pick something you want to work on (whether a mentor has submitted an idea you want to pursue, or you have an idea of your own and want a mentor).  With an idea in hand, submit a project application directly to Google.  Google will award a certain number of student slots to the R project, and projects will be ranked and slots allocated by the GSOC-R administrators and mentors.


GSOC 2013: IID Assumptions in Performance Measurement

Google Summer of Code for 2013 has been announced and organizations such as R are beginning to assemble ideas for student projects this summer. If you’re an interested student, there’s a list of project proposals on the R wiki. If you’re considering being a mentor, post a project idea on the site soon (project outlines end up being 1-2 pages of text, plus references); they should be up on the wiki by mid-to-late March. Google will use the listed project outlines as part of their criteria for accepting the R project for another year of GSoC and in their preliminary budgeting of slots.

I’ve posted one project idea so far, one that would extend PerformanceAnalytics’ standard tools for analysis to better deal with various violations of a standard assumption that returns are IID (that is, each observation is drawn from an identical distribution and is independent of other observations).

Observable autocorrelation is one of those violations. A number of different approaches for addressing autocorrelation in financial data have been discussed in the literature. Various authors, such as Lo (2002) and Burghardt et al. (2012), have noted that the effects of autocorrelation can be huge, but are largely ignored in practice. Burghardt observes that the effects are particularly interesting when measuring drawdowns, a widely used performance measure that describes the performance path of an investment. Recently, Bailey and López de Prado (2013) have developed a closed-form solution for estimating drawdown potential, without having to assume IID cashflows.
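To make the two ideas concrete, here is a small base-R sketch that simulates autocorrelated returns, checks for autocorrelation, and computes the drawdown path. It does not implement any of the methods from the papers cited above; the AR(1) coefficient, the volatility, and the mean return are arbitrary values chosen for illustration.

```r
set.seed(42)

# Simulate 120 monthly returns with first-order autocorrelation
# (AR(1) coefficient of 0.4 is an arbitrary choice for illustration)
n   <- 120
rho <- 0.4
r   <- as.numeric(arima.sim(model = list(ar = rho), n = n, sd = 0.02)) + 0.005

# Check for autocorrelation: lag-1 sample autocorrelation and a Ljung-Box test
acf1 <- acf(r, lag.max = 1, plot = FALSE)$acf[2]
lb   <- Box.test(r, lag = 6, type = "Ljung-Box")

# Drawdown path: decline of the wealth index from its running peak
wealth   <- cumprod(1 + r)
drawdown <- wealth / cummax(wealth) - 1
max_dd   <- min(drawdown)

cat("lag-1 autocorrelation:", round(acf1, 3), "\n")
cat("Ljung-Box p-value:    ", signif(lb$p.value, 3), "\n")
cat("maximum drawdown:     ", round(max_dd, 3), "\n")
```

Re-running this with rho set to 0 and comparing the drawdown paths across many seeds is a quick way to see why the IID assumption matters for drawdown-based measures.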

There’s more detail at the project site, including a long list of references. I’d be glad to hear from you if you have any ideas, thoughts, or even code in this vein (or others). Here are a few of the references to get you thinking:

The Paul Tol 21-color salute

You may or may not know that PerformanceAnalytics contains a number of specific color schemes designed for charting data in R (they aren’t documented well, but they show up in some of the chart examples). I’ve been collecting color palettes for years in search of good combinations of attractiveness, relative weight, and distinctiveness, helped along the way by great sites like ColorBrewer and packages like RColorBrewer. I’ve assembled palettes that work for specific purposes, such as the color-focus palettes (e.g., redfocus is red plus a series of dark to light gray colors). Others, such as rich#equal, provide a palette for displaying data that all deserve equal treatment in the chart. Each of these palettes has been designed to create readable, comparable line and bar graphs with specific objectives outlined before each category below.

I use this approach rather than generating schemes on the fly for two reasons: it creates fewer dependencies on libraries that don’t need to be called dynamically, and it guarantees the color used for the n-th column of data.

Oh, and here’s a little utility function for displaying a palette (EDIT: I know I didn’t write it, since it was written by Achim Zeileis and is found in his colorspace package, but I have carried it around for quite a while):

# Function for plotting colors side-by-side
pal <- function(col, border = "light gray", ...){
  n <- length(col)
  plot(0, 0, type="n", xlim = c(0, 1), ylim = c(0, 1),
       axes = FALSE, xlab = "", ylab = "", ...)
  rect(0:(n-1)/n, 0, 1:n/n, 1, col = col, border = border)
}
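
As a quick usage sketch, the snippet below displays a made-up focus-style palette with pal() and then uses the same vector in a line chart. The my_focus colors are hypothetical and are not the actual redfocus values from PerformanceAnalytics; pal() is repeated only so the snippet runs on its own.

```r
# pal() as listed above, repeated here so this snippet is self-contained
pal <- function(col, border = "light gray", ...) {
  n <- length(col)
  plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
       axes = FALSE, xlab = "", ylab = "", ...)
  rect(0:(n - 1) / n, 0, 1:n / n, 1, col = col, border = border)
}

# Hypothetical focus-style palette: one highlight color followed by
# grays running from dark to light
my_focus <- c("red", gray(seq(0.2, 0.8, length.out = 5)))

# Show the six colors side by side
pal(my_focus, main = "Illustrative focus palette")

# Because the palette is a fixed vector, column n is always drawn in my_focus[n]
set.seed(1)
x <- apply(matrix(rnorm(60 * 6), ncol = 6), 2, cumsum)
matplot(x, type = "l", lty = 1, col = my_focus, ylab = "value")
```

The first column stands out in red while the remaining five recede into gray, which is the effect the focus palettes are after.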


Visually Comparing Return Distributions

Here is a spot of code to create a series of small multiples for comparing return distributions. You may have spotted this in a presentation I posted about earlier, but I’ve been using it here and there and am finally satisfied that it is a generally useful view, so I functionalized it.

require(PerformanceAnalytics)
data(edhec)
page.Distributions(edhec[,c("Convertible Arbitrage", "Equity Market Neutral","Fixed Income Arbitrage", "Event Driven", "CTA Global", "Global Macro", "Long/Short Equity")])


R/Finance 2013 Call for Papers

It’s that time of year again – we’ve just posted our Call for Papers for the R/Finance 2013 conference, which focuses on applied finance using R. This is our fifth annual conference, again organized by a group of R package authors and community contributors and hosted by the International Center for Futures and Derivatives (ICFD) at the University of Illinois at Chicago.

The conference will be held this spring in Chicago, IL, on Friday May 17 and Saturday May 18, 2013.

I’m particularly excited about our lineup of speakers this year, which we’ve just finalized:

Sanjiv Das is a Professor of Finance and the Chair of the Finance Department at Santa Clara University’s Leavey School of Business. He is also the author of Derivatives: Principles and Practice, and he’s a senior editor of The Journal of Investment Management and co-editor of The Journal of Derivatives. You’ll find R spread through most of his work and his blog.

Attilio Meucci is the Chief Risk Officer at Kepos Capital, L.P. and author of Risk and Asset Allocation. He is a thought leader in advanced risk and portfolio management, and somewhat rare in the world of financial research in that he regularly posts code along with his working papers, a characteristic that I deeply appreciate. Unfortunately for me and the broader finance community for R, he prefers to code in MATLAB. All of Meucci’s original MATLAB source is available on http://www.symmys.com, but a recent Google Summer of Code project was dedicated to translating some of it to R.

Ryan Sheftel is a Managing Director for Electronic Market Making at Credit Suisse and has been introducing more and more automation for their Treasury bond execution services out to clients. He’s noted publicly that many of CS’ best traders spend a lot of time pounding away writing code – and, perhaps unusually for a senior manager at a large bank, spends time himself coding in R.

Ruey Tsay is a Professor of Econometrics and Statistics at the University of Chicago Booth School of Business. R users may be interested in his new book, An Introduction to Analysis of Financial Data with R, or may already own an edition of Analysis of Financial Time Series, a core book that is well applied in his course on time series analysis at U of C. Also look for companion packages on CRAN.

Hopefully that will whet your appetite enough for you to make plans to attend.

But perhaps you should consider speaking. We’re looking for speakers who focus significantly on applying R (and R packages) to problems in finance. We strongly encourage speakers to provide working R code to accompany the presentation/paper, as our audience enjoys being able to take concrete ideas and apply them to their own problems after the conference.

Ideally, data sets would also be made public for the purposes of reproducibility (though we realize this may be limited due to contracts with data vendors). We tend to give preference to presenters who have released R packages.

As in previous years, we will keep all presentations in one track in a large presentation hall with dual projection screens and a stage. This allows all of our conference participants to see all presentations. Given that we have had well over 200 attendees in prior years and a mix of academics and practitioners, you should plan for this type of large and varied audience.

So, unlike an academic conference where you may be presenting your work to 10-15 people who are highly knowledgeable in your field of expertise, you will be presenting to an audience with more varied skills and interests: think TED talk and not detailed exposition of your theory to experts.

Presentations that have been best received in the past have clearly communicated the motivation for the work, and how it could be applied in practice. Presentations that have been less well received have sought to go through the detailed math behind the theories, or have had an unclear link to R.

Hopefully that will give you a sense of what we’re looking for, assuming you haven’t attended before. This has been a conference I’ve really enjoyed in the past, and I’m sure this year will be no exception. Much of that comes from hanging out with the attendees – and I hope to see you there, too.

xts and GSOC 2012

Josh Ulrich and Jeff Ryan mentored a Google Summer of Code (GSOC) project this summer focused on experimental functionality for xts in collaboration with R. Michael Weylandt, a student in operations research and financial engineering from Princeton. You might recognize Michael from his presentation at R/Finance this year, where he gave a talk entitled “A Short Introduction to Real-Time Portfolio/Market Monitoring with R”.

There were three main objectives of this GSOC project. One was to extend the plotting functionality of xts – to replace the existing plot.xts function with something much more generally useful and to add a barchart.xts primitive that handles stacked bars for time series with negative values. The proof of concept for both of these graphics comes from chart functions in PerformanceAnalytics, but a variety of other improvements were also discussed.

Another objective was to experiment with supporting multiple data types within the same object for time series. The concept here is something like a data.frame, which allows class-specific list elements, aligned on an index. Michael wrote a prototype and definitely moved the ball forward here. Fuller functionality will require more test cases to be written to validate the approach and flush out bugs, as well as to add a number of utility functions such as rbind, cbind, etc.

The third objective was to provide ‘bridge’ functionality for passing xts objects to methods that assume a regular time series, such as AR/ARIMA, Holt Winters, or VAR methods, using something like the zooreg subclass and some translations. Michael provides a number of these for arima, acf, pacf, HoltWinters, and others. These are convenience wrappers for xts users that hand the xts data to the underlying functions; then, where appropriate, the results (such as residuals in the case of arima) are coerced back to xts objects.
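
The bridge idea can be sketched in a few lines. This is not the xtsExtra code; arima_xts is a hypothetical wrapper written for this example, and it assumes only that the xts package is installed. It strips the index, calls stats::arima on the raw values, and then re-attaches the original index to the residuals.

```r
# Sketch only; assumes the xts package is installed (this is not xtsExtra's code)
library(xts)

set.seed(123)
dates <- seq(as.Date("2012-01-01"), by = "month", length.out = 48)
x <- xts(as.numeric(arima.sim(list(ar = 0.5), n = 48)), order.by = dates)

# Hypothetical wrapper: fit arima() on the raw values, then
# coerce the residuals back to xts using the original time index
arima_xts <- function(x, order = c(1, 0, 0), ...) {
  fit <- stats::arima(as.numeric(coredata(x)), order = order, ...)
  list(fit       = fit,
       residuals = xts(as.numeric(residuals(fit)), order.by = index(x)))
}

res <- arima_xts(x)
head(res$residuals)
```

Returning the fit alongside the residuals keeps the full model object available, while the residuals line up with the original dates for merging or plotting against x.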

The result is contained in a supplementary package called xtsExtra, which Michael constructed as a side-pocket for newly developed functionality, any or all of which may end up in the xts package at some point. Beyond Jeff and Josh, Michael opened up to the broader r-sig-finance community to get feedback on xtsExtra, which resulted in several helpful conversations with Jonathan Cornelison, Eric Zivot, Rob Hyndman, Stuart Greenlee, Kenton Russell, Brian Peterson and me.

I want to step back to the first objective for a moment to talk about plot.xts. klr at TimelyPortfolio immediately took to the code and exercised it well – here is a particularly good chart. Here’s another. And another. Oh, and this one! These were great examples, and I think they are suggestive of how the function could be extended even further, perhaps simplifying the interface and extending the panel functionality. That might require some significant re-work, but I think the results will be well worth it. I think Jeff Ryan might have some tricks up his sleeve as well…

We’ll see where some of this speculation goes, but I want to thank Michael again for his commendable efforts this summer! His has been a considerable effort to extend and improve xts in some very useful ways, and I’m looking forward to his continued involvement in this and perhaps other endeavors.

FinancialInstrument Moves to CRAN

I thought I would break up the posts about GSOC (no, I’m not done yet – there are a few more to do) with a quick note about FinancialInstrument.

The FinancialInstrument package provides a construct for defining and storing meta-data for tradable contracts (referred to as instruments, e.g., stocks, futures, options, etc.). The package can be used to create any asset class and derivatives, so it is required for packages like blotter and quantstrat.

FinancialInstrument was originally conceived as blotter was being written. Blotter provides portfolio accounting functionality, accumulating transactions into positions, then into portfolios and an account. Blotter, of course, needs to know something about the instrument being traded.

FinancialInstrument is used to hold the meta-data about an instrument that blotter uses to calculate the notional value of positions and the resulting P&L. FinancialInstrument, however, has plenty of utility beyond portfolio accounting, such as pre-trade pricing, risk management, etc., and was carved out so that others might take advantage of its functionality. Brian Peterson did the heavy lifting there, constructing FinancialInstrument as a meta-data container based on a data design we developed for a portfolio management system years ago.

Utility packages like this are generally thankless work, although incredibly useful and powerful for (potentially several) end applications. They quietly do a bunch of heavy lifting that allows the user interface to be simpler, more powerful and more flexible than it otherwise might be, and allow the developer to focus on the specific application rather than re-inventing already-existing but trapped functionality.

Thankfully, Garrett See has found both the time and motivation to take what was a useful but unfinished package and help Brian carry it across the finish line into CRAN. Garrett also added a great deal of functionality around managing the .instrument namespace, such as ls_instruments() and many other ls_* and rm_* functions. Those ls_* functions get names of instruments of a particular type or denominated in a given currency (or currencies), while rm_* functions remove instruments. Similarly, a series of update_* functions help update instruments from various sources, such as Yahoo!.
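
Here is a small sketch of how those pieces fit together, assuming the FinancialInstrument package is installed. The instruments are defined purely for illustration, and the exact fields stored on each instrument may differ between package versions.

```r
# Sketch; assumes the FinancialInstrument package is installed
library(FinancialInstrument)

# Define a currency first, then instruments denominated in it
currency("USD")
stock("SPY", currency = "USD", multiplier = 1)
stock("IBM", currency = "USD", multiplier = 1)

# List everything that has been defined, then only the stocks
all_names <- ls_instruments()
before    <- ls_stocks()

# Pull the stored meta-data for one instrument
spy <- getInstrument("SPY")
spy$multiplier

# Remove one stock and list again
rm_stocks("IBM")
after <- ls_stocks()
```

The multiplier is the piece of meta-data that blotter needs in order to turn a quantity and a price into a notional value, which is why it is set explicitly even for stocks.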

At this point, FinancialInstrument has a lot of functionality. Let’s take a closer look…