Category Archives: GSOC

GSOC 2014: Let’s do it again!

Google Summer of Code opened for students on Monday, March 10, more than a month earlier than last year.  If you weren’t following the announcements and that deadline caught you wrong-footed, the good news is that students will know their fate by April 21, well before the summer starts.

In other good news, the R Project has once again been selected as a mentoring organization, and a variety of mentors have proposed a number of projects for students to work on during this summer.  If you’re interested, here’s a quick introduction to the GSOC program and a pointer to the R-related project ideas that are lining up for students this summer.

About Google Summer of Code

A quick reminder – Google brings together students with mentors to work on open-source projects of their choosing.  Mentors get code written for their project, but no money; students get paid $5,000, equivalent to a nice summer internship.

If you’re a student and you’re interested in something R-related, pick something you’re interested in working on (whether a mentor has submitted an interesting idea you want to pursue, or if you have an idea and want a mentor).  With an idea in hand, submit a project application directly to Google before Friday, March 21 at 12:00pm PDT.   Google will award a certain number of student slots to the “R Project for Statistical Computing,” and projects will be ranked and slots allocated by the GSOC-R administrators and mentors.

Finance-related Projects

Like last year, there are a few proposed R projects that are finance-related.  I won’t go through them here, but look for the names of past mentors: Jonathan Cornelissen, Kris Boudt, David Ardia, and Doug Martin; as well as new mentors such as Daniele Signori and David Matteson.  This is promising to be a very productive summer…

This year, Brian Peterson and I are looking for a student to work on improving time series visualization for xts time series objects, building on what we learned in successful GSOC projects in 2012 and 2013.  This project will specifically focus on developing multi-panel time series charts that may be created from specified panels with a chart layout. Charts may (eventually) be composed of panels with several different chart types, but the focus this summer is only on time series charts that may be linked via the x- and/or y-axes.

By default, plot.xts will simply chart the time series data in the form it is passed in, using a default panel that is a simple line chart. The following will show a single panel line chart with six lines:

> dim(x.xts) 
[1] 60 6 
> plot(x.xts)

A panel function will be used to define transformations of the data and its display. For example, there might be a function called panel.CumReturns that takes return data, chains together the individual returns, and produces a line chart of cumulative returns through time. This code will show a single panel as defined in panel.CumReturns with six lines:

> dim(x.xts) 
[1] 60 6 
> plot(x.xts, panels=panel.CumReturns)
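
As a rough illustration of the transformation such a panel might apply, here is a minimal base-R sketch of chaining returns into cumulative returns. The helper name cum_returns and the simulated matrix x are assumptions made for this example; they are not part of xts or xtsExtra, and the eventual panel.CumReturns may well work differently.

```r
# Hypothetical helper: chain simple periodic returns into cumulative returns.
# Each column is treated as one asset; rows are consecutive periods.
cum_returns <- function(R) {
  R <- as.matrix(R)
  apply(R, 2, function(r) cumprod(1 + r) - 1)
}

# Simulated stand-in for the 60 x 6 return series used above (illustrative data only).
set.seed(42)
x <- matrix(rnorm(60 * 6, mean = 0.005, sd = 0.03), nrow = 60, ncol = 6,
            dimnames = list(NULL, paste0("Asset", 1:6)))

cr <- cum_returns(x)
dim(cr)
```

Handing cr to a line-drawing call such as matplot(cr, type = "l") would give a six-line chart of the kind described above.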

Multiple panels can be used in a chart.  Say we have a panel.BarVaR function that takes returns and plots only the first series in a bar chart overlaid with an expanding window calculation of VaR for that asset. And we have a panel.Drawdowns function that produces a line chart of drawdowns through time for all time series returns data passed in.  These panels can be passed the same way via an argument. In this case, the layout will simply be divided by the number of panels, for example, divided into thirds in the following:

> dim(x.xts) 
[1] 60 6 
> plot(x.xts, panels = c(panel.CumReturns, panel.BarVaR, panel.Drawdowns)) 

This would result in a three-panel chart, with six data series available to each panel (although a panel function may choose not to draw all of them).
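
To make the calculations behind the other two panels concrete, here is a small base-R sketch. The helper names drawdowns and expanding_var are hypothetical, and the choice of a historical (empirical quantile) VaR at 95% with a 12-observation minimum window is an assumption for illustration; the panel functions described above may use a different VaR estimator.

```r
# Hypothetical helper: drawdown from the running peak of a wealth index
# that starts at 1 and compounds the simple returns in r.
drawdowns <- function(r) {
  wealth <- cumprod(1 + r)
  peak <- cummax(c(1, wealth))[-1]
  wealth / peak - 1
}

# Hypothetical helper: expanding-window historical VaR. At each date t the
# estimate uses all observations from the first through t; NA until min_obs.
expanding_var <- function(r, p = 0.95, min_obs = 12) {
  out <- rep(NA_real_, length(r))
  for (t in seq_along(r)) {
    if (t >= min_obs) {
      out[t] <- quantile(r[1:t], probs = 1 - p, names = FALSE)
    }
  }
  out
}

# Illustrative data only: 60 simulated monthly returns for one asset.
set.seed(42)
r <- rnorm(60, mean = 0.005, sd = 0.03)
dd <- drawdowns(r)
v  <- expanding_var(r)
```

A bar chart of r with v drawn over it, and a line chart of dd, would correspond roughly to the second and third panels.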

There’s much more, but that should whet your appetite.

Functions will likely be included in xtsExtra, an R package that provides supplementary functionality for xts. The package also served as a development platform for the GSoC 2012 and 2013 xts projects, for experimental code that may eventually end up in the xts package.

There are also several other very interesting projects proposed for the broader R Project organization. Take a look – these are in various states of needing students or mentors.

Students, start your proposal…

Students should also take a look at the R Project’s proposal template as a starting point.  Proposals are expected to be very detailed, and may run to ten or more pages.  In short, this is a competitive process and you will need to put your best foot forward.  I should also note that the process is very iterative – you’ll get feedback as time goes on and will be expected to be responsive to the questions people ask.  Project mentors usually also propose a test – some task that they think is representative of the summer’s work that will help demonstrate your skills and fitness for the project.

Or, consider bringing something new to the table.  This is an active, dynamic group of people who have a broad set of interests, and the process can accommodate well-proposed ideas that garner support.

Good luck, and I hope to hear from you soon.


Some belated spring cleaning

A very busy spring has transitioned into a very busy summer, so let me recap a few topics that probably deserve more time than I’ll give them here. Here are the things I’m overdue on, in no particular order:


In the March edition of the Journal of Risk, Kris Boudt, Brian Peterson and I published a paper titled Asset allocation with conditional value-at-risk budgets. You can also see a pre-publication version on SSRN. It was nice to see this finally hit paper – many thanks to my co-authors for all their work on an interesting topic.

Equal CVaR Concentration

Panel 3 of Figure 4 shows the weights through time and contribution of CVaR for a minimum CVaR concentration portfolio.

Dirk Eddelbuettel’s book is finally out. Congrats to him – that’s a nice accomplishment! I tried to steal the pre-print at the R/Finance conference, but Dirk made me buy my own copy.

R/Finance 2013

R/Finance 2013 went very well. This event has already been covered here and here – even with an article in the R Journal – but I thought I’d briefly mention a few highlights.

I thought the keynote speakers were fantastic. Every time I see Attilio Meucci speak, I learn something new about a topic I thought I already knew pretty well. This time, Attilio pulled out several animated visualizations that were very thoughtfully designed – each presented a huge amount of information in a linked way that showed relationships between measures and how they changed dynamically through time. Each of the animations served to underscore the (sometimes simple) intuition behind the complex math. “A quant presentation without equations,” he said. Exceptionally well done – developing the intuition behind these concepts is a significant challenge, even in a room full of quants. No slides, but more on that later.

Revolution’s blog did more justice to Ryan Sheftel’s talk than I’m going to do here. Ryan did an excellent job describing the implementation issues within a large organization, providing a strong dose of reality that I think was appreciated by the audience of practitioners.

Sanjiv Das hit one out of the park as well. Flip through his slides when you have a chance – it was a nice demonstration of how he’s used R in very different projects related to finance. He’s a polymath. I particularly enjoyed his talk on network analysis using SEC and FDIC filings to identify banks that pose systemic risk – a talk that echoed one given by Michael Gordy, a senior economist at the FRB, in 2012.

I have to plump for the hometown, as well. Ruey Tsay always has something interesting up his sleeve, and this presentation was no different. He warned that this is work in progress, but with Y. Hu he’s developing Principal Volatility Components as a way to identify common volatility components among financial assets. That struck me as work that is well worth tracking.

That was more than I had intended to write on the topic, but a few other presentations stood out to me as well: David Matteson’s talk on change points was accompanied by an excellent paper; Samantha Azzarello’s presentation on a Bayesian interpretation of the Taylor Rule was as well; Thomas Harte gave another from-the-trenches viewpoint; and David Ardia, Ronald Hochreiter, Alexios Ghalanos, and many others gave excellent sessions. We were also glad to have several returning speakers – Doug Martin, Kris Boudt, Bernhard Pfaff, Jiahan Li, Bryan Lewis. Lightning talks were also well received, particularly Winston Chang’s demonstration of Shiny. Jan Humme and Brian Peterson did a very nice overview of quantstrat in the pre-conference tutorials. All of the slides for the 2013 conference are here. Great stuff – take a look. Then pencil in the 2014 conference in May of next year…

Other R Conferences

Continuing on the topic of finance-related R conferences, congratulations go to Markus Gesmann and Cass Business School for organizing the inaugural R in Insurance conference this year. If you missed it, as I did, check out the presentations here and plan your travel accordingly for next year.

On the other side of the ledger, this was the last year that Diethelm Wuertz’s R-Metrics conference was held in Meielisalp. I didn’t make this one, but I’ve since heard that next year’s will be held in Paris.

Google Summer of Code 2013

GSOC 2013 has not only started, but is well underway. Of the nineteen R projects going on, six are finance-related. In no particular order:

All this activity is resulting in a tremendous amount of code covering a variety of topics and projects. Thanks to all who are participating – both mentors and students – and to Google for supporting open source! I’ll try to provide more detailed project wrap-ups at the end of the summer.


I’ve made some changes to blotter recently for handling account-level transactions, such as additions and withdrawals (rev. 1485). That should improve the package’s functionality for cash reconciliation. The functionality is pretty rudimentary, but it appears to work. Let me know if you see opportunities for improvement. Blotter is pretty close to CRAN-ready at this point, but requires a final push that is incongruous with good weather.

A very belated thanks to Brian Peterson for pushing out version 1.1 of PerformanceAnalytics to CRAN early this year. I’ve been intending to go over some of the significant changes in that version for months now, but we might have another version out before I get the posting done. Never mind.

GSoC and R: Off to the Races

Google Summer of Code has now opened for student applications, and the R Project has once again been selected as a mentoring organization.  I’ve discussed before that a variety of mentors have proposed a number of projects for students to work on during this summer, but I wanted to emphasize some points about the schedule.

The deadline for student submissions is May 03 at 19:00 UTC.  You have to have a credible application in Melange by this time, or the application will not get a slot.  That’s not a lot of time to create or pick an idea, write an application, identify a mentor, sign up for a Melange account, and post your application.  But students can improve their applications once they are posted, so it is worth putting up an incomplete draft if you need to.

Even after the deadline, students will receive questions and advice for improvements from the mentors once the application is up, and should be responsive to those requests.  All of the mentors are involved in voting about which projects will be funded.

Google has extended the amount of time spent determining slots, so the ‘behind-the-scenes’ process will be longer this year than it was in past years.  Accept/reject notices to students will come on May 31st.

Everyone who wants to participate in this year’s Google Summer of Code with R should join the Google Group:

Good luck!

GSoC 2013: At the starting line

Google Summer of Code will be open for students on Monday, April 22.  The R Project has once again been selected as a mentoring organization, and a variety of mentors have proposed a number of projects for students to work on during this summer.  Here’s a bit about the program, and more on the R-related projects that are lining up for students this summer.

About Google Summer of Code

The concept is relatively simple – Google brings together students with mentors to work on open-source projects of their choosing.  Mentors get code written for their project, but no money; students get paid $5,000, equivalent to a nice summer internship.

If you’re a student and you’re interested in something R-related, pick something you’re interested in working on (whether a mentor has submitted an interesting idea you want to pursue, or if you have an idea and want a mentor).  With an idea in hand, submit a project application directly to Google.   Google will award a certain number of student slots to the R project, and projects will be ranked and slots allocated by the GSOC-R administrators and mentors.


GSOC 2013: IID Assumptions in Performance Measurement

Google Summer of Code for 2013 has been announced and organizations such as R are beginning to assemble ideas for student projects this summer. If you’re an interested student, there’s a list of project proposals on the R wiki. If you’re considering being a mentor, post a project idea on the site soon – project outlines end up being 1-2 pages of text, plus references – and they should be up on the wiki by mid-to-late March. Google will use the listed project outlines as part of their criteria for accepting the R project for another year of GSoC and in their preliminary budgeting of slots.

I’ve posted one project idea so far, one that would extend PerformanceAnalytics’ standard tools for analysis to better deal with various violations of a standard assumption that returns are IID (that is, each observation is drawn from an identical distribution and is independent of other observations).

Observable autocorrelation is one of those violations. A number of different approaches for addressing autocorrelation in financial data have been discussed in the literature. Various authors, such as Lo (2002) and Burghardt et al. (2012), have noted that the effects of autocorrelation can be huge, but are largely ignored in practice. Burghardt observes that the effects are particularly interesting when measuring drawdowns, a widely used performance measure that describes the performance path of an investment. Recently, Bailey and Lopez de Prado (2013) have developed a closed-form solution for estimating drawdown potential, without having to assume IID cashflows.
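
To give a feel for how much autocorrelation can matter, here is a small base-R sketch of one adjustment from that literature: the factor Lo (2002) derives for scaling a one-period Sharpe ratio to q periods when returns are serially correlated. The helper name lo_eta and the simulated AR(1) returns are assumptions for illustration only; this is not code from PerformanceAnalytics or from the proposed project.

```r
# Scaling factor for aggregating a one-period Sharpe ratio over q periods,
# following Lo (2002): eta(q) = q / sqrt(q + 2 * sum_{k=1}^{q-1} (q - k) * rho_k).
# With no autocorrelation (all rho_k = 0) this reduces to sqrt(q).
lo_eta <- function(rho, q) {
  stopifnot(length(rho) >= q - 1)
  k <- seq_len(q - 1)
  q / sqrt(q + 2 * sum((q - k) * rho[k]))
}

# Illustrative data only: simulated monthly returns with AR(1) dependence.
set.seed(1)
n <- 120
ar_r <- 0.005 + as.numeric(arima.sim(model = list(ar = 0.5), n = n, sd = 0.02))

# Sample autocorrelations at lags 1..11 (lag 0 dropped) and a Ljung-Box test.
rho_hat <- acf(ar_r, lag.max = 11, plot = FALSE)$acf[-1]
lb <- Box.test(ar_r, lag = 6, type = "Ljung-Box")

eta_iid <- sqrt(12)             # annualization factor under the IID assumption
eta_adj <- lo_eta(rho_hat, 12)  # factor using the estimated autocorrelations
```

Comparing eta_iid with eta_adj shows how far the usual square-root-of-time annualization can drift when the estimated autocorrelations are not close to zero.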

There’s more detail at the project site, including a long list of references. I’d be glad to hear from you if you have any ideas, thoughts, or even code in this vein (or others). Here are a few of the references to get you thinking:

xts and GSOC 2012

Josh Ulrich and Jeff Ryan mentored a Google Summer of Code (GSOC) project this summer focused on experimental functionality for xts in collaboration with R. Michael Weylandt, a student in operations research and financial engineering from Princeton. You might recognize Michael from his presentation at R/Finance this year, where he gave a talk entitled “A Short Introduction to Real-Time Portfolio/Market Monitoring with R”.

There were three main objectives of this GSOC project. One was to extend the plotting functionality of xts – to replace the existing plot.xts function with something much more generally useful and to add a barchart.xts primitive that handles stacked bars for time series with negative values. The proof of concept for both of these graphics comes from chart functions in PerformanceAnalytics, but a variety of other improvements were also discussed.

Another objective was to experiment with supporting multiple data types within the same object for time series. The concept here is something like a data.frame, which allows class-specific list elements, aligned on an index. Michael wrote a prototype and definitely moved the ball forward here. Fuller functionality will require more test cases to be written to validate the approach and flush out bugs, as well as to add a number of utility functions such as rbind, cbind, etc.

The third objective was to provide ‘bridge’ functionality to pass xts objects to methods that assume a regular time series, such as AR/ARIMA, Holt Winters, or VAR methods, using something like the zooreg subclass and some translations. Michael provides a number of these for arima, acf, pacf, HoltWinters, and others. These are convenience wrappers for xts users that pass the xts data into the underlying functions and then, where appropriate, coerce the results (such as residuals in the case of arima) back to xts objects.
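
The bridge pattern itself is simple enough to sketch in base R. In the example below a Date vector and a numeric vector stand in for an xts object, so that nothing beyond base R is needed; the variable names and the data are assumptions for illustration, and this is not the xtsExtra implementation.

```r
# Hypothetical stand-in for an xts object: monthly values plus a Date index.
idx <- seq(as.Date("2010-01-01"), by = "month", length.out = 48)
set.seed(7)
vals <- as.numeric(arima.sim(model = list(ar = 0.6), n = 48))

# Step 1: build a regular ts from the first index date and a monthly frequency.
start_ym <- as.integer(c(format(idx[1], "%Y"), format(idx[1], "%m")))
y <- ts(vals, start = start_ym, frequency = 12)

# Step 2: call the base function that expects a regular series.
fit <- arima(y, order = c(1, 0, 0))

# Step 3: re-attach the original index to the results.
res <- data.frame(date = idx, residual = as.numeric(residuals(fit)))
head(res, 3)
```

A wrapper for xts users would hide these three steps behind a single call and return the residuals as an xts object rather than a data.frame.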

The result is contained in a supplementary package called xtsExtra, which Michael constructed as a side-pocket for newly developed functionality, any or all of which may end up in the xts package at some point. Beyond Jeff and Josh, Michael opened up to the broader r-sig-finance community to get feedback on xtsExtra, which resulted in several helpful conversations with Jonathan Cornelissen, Eric Zivot, Rob Hyndman, Stuart Greenlee, Kenton Russell, Brian Peterson and me.

I want to step back to the first objective for a moment to talk about plot.xts. klr at TimelyPortfolio immediately took to the code and exercised it well – here is a particularly good chart. Here’s another. And another. Oh, and this one! These were great examples, and I think they are suggestive of how the function could be extended even further, perhaps simplifying the interface and extending the panel functionality. That might require some significant re-work, but I think the results will be well worth it. I think Jeff Ryan might have some tricks up his sleeve as well…

We’ll see where some of this speculation goes, but I want to thank Michael again for his commendable efforts this summer! His has been a considerable effort to extend and improve xts in some very useful ways, and I’m looking forward to his continued involvement in this and perhaps other endeavors.