At work we've been in "crunch mode". To be fair, I work at a startup awaiting some "Series A" funding to be tied up, so we're always in crunch mode. However, the last 90 days have been more intense than expected.
We're wrapping up in the next couple of weeks, and that means I'll have some long winter nights that I can dedicate to tidying up the bots and doing some active trading. As part of that, I want to release my code on GitHub, and then you can all dive in and play along too.
So if you were thinking I'd given up, you were only half-right. I'm just insanely busy. I've not been able to find time to even do some traditional manual trading in the last few months! It's all been a bit frustrating, but the day job has to take priority... for now, at least.
Betting Algorithms
A blog about statistical analysis of sports and betting exchanges, bot building and a software developer's take on betting.
Wednesday, 26 October 2011
Monday, 26 September 2011
Great Article on Quant Trading and HFT
I'm not entirely sure why this article is currently the most popular article on BBC News, as there's no obvious news hook for it. However, it makes some interesting points:
"As one leading actuary says: 'Prices are determined by supply and demand, not by mathematics.'
Could it be, then, that academic statisticians are congenitally unsuited to the job they are being paid to do?
Paul Wilmott, a prominent lecturer in quantitative finance, has questioned whether they are 'capable of thinking beyond maths and formulas'.
'Do they appreciate the human side of finance, the herding behaviour of people, the unintended consequences?'
And if mathematicians do not, there is little chance the computer programmes they create will."
I think this is a little misguided, because the human mob mentality is reflected in historical price movements. Does a trading algorithm need to understand fundamentals, or can it get away with pure technical analysis? I think - and have stated here before - that in the early stages of automated trading, technical analysis is enough, but ultimately historical analysis will help inform that analysis by abstracting the fundamentals.
However, given Wall Street is currently occupied by protestors - which is mostly being ignored by the mainstream media - and we are perhaps days or weeks away from a global financial crash, I'm glad that I'm not trying to code for share trading.
More updates on the bots this week, I promise. I've just been rather hectic. By the end of October, I'll be trading.
Labels: finance, hft, mathematicians, quants, shares, technical analysis, trading
Location: Manchester, UK
Sunday, 4 September 2011
The 'Voltaire' and 'Condamine' Bots
I've started work on code - finally! - and am hoping to release some of it to GitHub in the next week. I'm actually developing a couple of different bots, as I believe in writing software that fits well with the various takes on Unix philosophy, particularly the philosophy of tools dedicated to doing a single job very well, and those tools being able to talk to each other as they need to.
I needed to come up with some names for the first two bots dedicated to data collection and analysis. When this story about Voltaire and Charles Marie de La Condamine turned up in my RSS reader this morning, the choice seemed obvious. The first group of bots will therefore be named after these two rather wonderful characters.
Voltaire
The (now newly-named) Voltaire bot I've started is very simply about grabbing data from multiple sources. I have some code for grabbing historical data from Betfair (awful though it is), but I'm concerned about the lack of granularity. It would be nice to see not just what was matched, but what was offered. And critically, when.
So I have some code that also hits the APIs to log data for specified events, and I'm hoping before release to be able to expand it to "event types". This means that with a single command all markets of a type (e.g. horse racing, soccer, etc.) will be "watched", and when a change occurs log entries are placed in a database for later analysis. I'm also keen to add support for writing directly to Google Docs Spreadsheets, for reasons I'll explain another time.
What this gives us, then, is timestamped data going forward. Over time this will help us build our own collection of detailed information to analyse, but obviously we can't go back in time and get historical data this way.
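To make the "log an entry only when something changes" idea concrete, here is a minimal sketch. It is not the Voltaire code: the ChangeLogger class, the snapshot fields and the "demo-market" id are all made up for illustration, and a plain array stands in for the database.

```ruby
require "time"

# Hypothetical sketch: records a timestamped row only when a market's
# snapshot differs from the last one seen for that market.
class ChangeLogger
  attr_reader :rows

  def initialize
    @last_seen = {} # market id => most recent snapshot
    @rows = []      # stand-in for a database table
  end

  # snapshot is a Hash such as { back: 2.5, lay: 2.52, matched: 1000.0 }.
  # Returns true if a row was logged, false if nothing had changed.
  def record(market_id, snapshot, at = Time.now.utc)
    return false if @last_seen[market_id] == snapshot

    @last_seen[market_id] = snapshot.dup
    @rows << { market_id: market_id, at: at.iso8601(3) }.merge(snapshot)
    true
  end
end

logger = ChangeLogger.new
logger.record("demo-market", { back: 2.5, lay: 2.52, matched: 1000.0 })
logger.record("demo-market", { back: 2.5, lay: 2.52, matched: 1000.0 }) # unchanged, skipped
logger.record("demo-market", { back: 2.48, lay: 2.5, matched: 1250.0 })
puts logger.rows.length
```

In a real collector the snapshots would come from polling the exchange, and the rows would go to a proper database rather than an array. The point of the sketch is only the shape of the data: one row per change, each with its own timestamp.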
You might well ask "Why just betting data? What about sports data?"
I should stress that at the moment I'm only interested in gathering data about price movements, amounts offered and matched and eventual winners. I'm interested in tracking ante-post, day of event and in-play betting activity. But why not sports data?
Firstly, historical data about which state event "runners" were in at any given date or time is easily obtained. Historical betting price data at the level of detail I want isn't. If I want data about any horse, football player or politician on a given date and time (to the minute, even during an event), I can probably find it somewhere, for free or at very low cost. I might change my mind about that down the line, but for now the existing data sets are "good enough".
Secondly, I probably wouldn't use it! In stock market analysis there are two camps: those who look at fundamentals, and technical analysts. Most people use a mixture of the two techniques to make decisions, but when it comes to betting I'm more of a technical analyst kind of a guy. It's those skills I wish to bring initially to my bots, although fundamentals will surely follow in due course.
In the sports betting world, you can see fundamentals as being about form book study and forming a view on likely results before the event, whilst technical analysis is about looking at the maths of price movements and exploiting likely movements to produce a profit. Technical analysis is for the people who take an opinion about how many ticks a price might move in a market, fundamentals are for those who have taken an opinion about the likely winner. I'm in the first camp.
As such, at this time, the data collection and analysis will all be about identifying movements and mathematical "lock-ins". If that's not your bag, maybe check back with me in 6 months - I might be onto analysing fundamentals by then.
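Since "how many ticks a price might move" is the unit of measurement here, a small sketch of counting ticks may help. The price bands below are my understanding of Betfair's increments and should be treated as an assumption to check against their documentation. The LADDER and PRICES constants and the ticks_between helper are hypothetical names, not part of any bot code. Prices are held as whole hundredths to avoid floating-point surprises.

```ruby
# Price ladder in hundredths of a point, so 2.00 is stored as 200.
# ASSUMPTION: these bands are my recollection of Betfair's increments.
# Each entry is [band start, band end (exclusive), step].
LADDER = [
  [101, 200, 1],
  [200, 300, 2],
  [300, 400, 5],
  [400, 600, 10],
  [600, 1000, 20],
  [1000, 2000, 50],
  [2000, 3000, 100],
  [3000, 5000, 200],
  [5000, 10_000, 500],
  [10_000, 100_000, 1000]
].freeze

# Every valid price on the ladder in ascending order, ending at 1000.00.
PRICES = LADDER.flat_map { |from, to, step| (from...to).step(step).to_a }
               .push(100_000).freeze

# Number of ticks from one price to another; negative if the price shortens.
def ticks_between(from_price, to_price)
  from = PRICES.index((from_price * 100).round)
  to = PRICES.index((to_price * 100).round)
  raise ArgumentError, "price not on the ladder" if from.nil? || to.nil?

  to - from
end

puts ticks_between(2.0, 2.1)
puts ticks_between(3.0, 2.5)
```

A move from 3.0 in to 2.5 comes out negative because the price has shortened. Whether that is good or bad news depends on which side of the bet you are already on.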
Condamine
The Condamine bot is at the moment a skeleton piece of code for analysing the data collected by Voltaire. Right now, it doesn't do much, but eventually its job is to find and test techniques and strategies for profitable betting. Condamine v1 will simply be a way to test the profitability of a strategy. However I'm already keen to explore machine learning techniques and write a bot that given a set of data will work out an optimal strategy for itself.
This is ambitious. I'm sure plenty of people have tried it, and perhaps it's over-complicating things, but I want to try it because I eventually want to do something a bit funkier with Condamine than just implementing a standard strategy. I also want the flexibility to be able to test theories of my own. That might evolve into another piece of code all by itself; time will tell.
Right now, however, my goal is to be able to express a strategy (perhaps in a "domain-specific language" or DSL) and to see how that strategy would have performed against the events Voltaire has managed to collect data about. I'm still exploring options in this regard, but the R statistics package is likely to be a major aid to Condamine.
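As a rough illustration of what "expressing a strategy in a DSL" could look like in Ruby, here is a sketch. Every name in it (Strategy, back_when, stake, backtest) is hypothetical, the event data is invented, and commission is ignored. It is meant to show the shape of the idea, not Condamine's actual design.

```ruby
# Hypothetical strategy DSL: none of these names come from a real library.
class Strategy
  attr_reader :name

  def initialize(name, &block)
    @name = name
    instance_eval(&block) # run the DSL block with this strategy as self
  end

  # DSL keyword: the rule deciding whether to back a recorded event.
  def back_when(&rule)
    @entry = rule
  end

  # DSL keyword: the rule deciding how much to stake on it.
  def stake(&rule)
    @stake = rule
  end

  # Replays the strategy over recorded events and returns total profit.
  # Each event is a Hash with :price (decimal odds) and :won (true/false).
  def backtest(events)
    events.sum do |event|
      next 0.0 unless @entry.call(event)

      amount = @stake.call(event)
      event[:won] ? amount * (event[:price] - 1) : -amount
    end
  end
end

odds_on = Strategy.new("back odds-on shots") do
  back_when { |event| event[:price] < 2.0 }
  stake { |_event| 10.0 }
end

# Invented sample data standing in for what Voltaire would collect.
events = [
  { price: 1.5, won: true },
  { price: 1.8, won: false },
  { price: 3.0, won: true }
]
puts odds_on.backtest(events)
```

The attraction of this style is that the strategy reads almost like the sentence you would use to describe it, while the replay logic stays in one place and can be improved without touching any individual strategy.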
All in all, it's about to get quite interesting. Watch this space!
Friday, 2 September 2011
Other bot software worthy of consideration?
I've been rather busy the last fortnight. As my last post indicated, a girlfriend with a broken arm can be quite time-consuming. She's getting better, but she still needs plenty of looking after.
Today I've been planning what bot writing I could do this weekend to pick the momentum up, and which posts I could write for here. The data scraping is harder than expected because Betfair (who provide the best historical data at the moment) make it an absolute pain in the rear to extract all their data automatically. However, I think that has to be our starting place, combined with writing our own data collection bot with time-stamping.
I plan to do that at some point this weekend, and I hope to release a pretty full version as open source in the next week or so.
However, something struck me as I read a post about trialling Fairbot - there are other bots that are worthy of consideration.
I'm going to crack on with my own bot writing simply because I want to do things none of the existing products do. However, if there is interest in a review of a commercial bot out there, I'm happy to do it, and to look at it from the perspective of having it run unaccompanied with a range of strategies. I've reviewed a lot of software over the years, so I might see things others miss. If this intrigues anybody, leave a message in the comments.
Otherwise, I'll crack on: this weekend it's time to finally get historical data out of Betfair and into a usable format, and to start writing our own data gathering bot with some time-stamping for a couple of different markets.
Labels: betfair, betting, betting exchanges, bot, reviews
Wednesday, 17 August 2011
Not my kind of race...
... but I bet a few quid was matched at 1000 on the number 6 before it hit its stride in the last couple of furlongs :-)
You have to hand it to those Japanese...
Sunday, 7 August 2011
Programming Language Choice for Writing a Bot
Programmers, contrary to popular opinion, do not write software. We design it.
Much like how architects do not build buildings, but rather design them in a very specific way, programmers design software blueprints. We then hand those blueprints over to the real builders (in the case of programmers, we call these builders "compilers" or "interpreters"), and they turn the abstract design into the 1s and 0s that the computer is able to use.
Some programmers don't like this analysis, but it's true. "Programming languages" are just different ways of designing software, different ways to draw blueprints, different ways for us to talk about the intended results. Some languages are "low-level" which means you spend most of your time designing a series of processes to control the computer (and you have to think about memory management, and in some cases CPU registers and I/O interrupts and the like), and others are "high-level" where you spend 100% of your time thinking about solving the problem at hand.
It goes without saying that we want to spend our time writing a bot thinking more about the problem at hand than worrying about how to control the computer. However there are a lot of high-level languages out there, and it came as little surprise to me that many people wanted to know what language I would be writing my bots in.
There are many candidates, but for me there is only one winner: Ruby
I write Ruby code for a living, and have done so for about 6 years now. The fact I'm familiar with it is perhaps the most important consideration. If you know a programming language, that's the one you should choose. But sometimes it pays dividends to learn another language for a specific purpose (most professional programmers learn a new language every year), so are there any other candidates?
StefanBelo wrote a comment suggesting I should consider Clojure. The reasoning is that it's a "functional programming language" and that has inherent power, flexibility and succinctness to make incredibly powerful bots in very few lines of code - maybe even just one line of code! Say to most programmers "it's a Lisp dialect" and they'll either get very curious, or run a mile. I'm of the former variety, so I considered it.
But my intent here is not to be clever, and not to write a blog for people who already know how to write code in a functional language, or who know how to code but in a more "traditional" language. My intent is to help those who have no idea where to begin get a grip on getting started, to build my own bots and to share what I learn on the way.
Ruby is a wonderful introductory language, and even better it has some functional programming features built in, so down the line when we want that power and expressiveness, we can get it if we want it. But right now, it's great for even novices to get the hang of.
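For a taste of those functional features, here is a small sketch using nothing but core Ruby. The prices are made-up numbers, and implied_chance is just an illustrative name for turning decimal odds into a percentage.

```ruby
prices = [2.5, 1.8, 3.2, 1.4]

# A lambda is a function held in a variable and passed around like any value.
implied_chance = ->(price) { (100.0 / price).round(1) }

odds_on = prices.select { |price| price < 2.0 }   # filter: keep some items
chances = odds_on.map(&implied_chance)            # transform: apply to each
shortest = prices.reduce { |a, b| a < b ? a : b } # fold: boil down to one

puts odds_on.inspect
puts chances.inspect
puts shortest
```

None of this needs any setup beyond a Ruby install, and each line says what it does without a single explicit loop.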
I think everybody should learn how to program at some point. It is the most self-empowering thing you can do in the 21st century. No other art, science or industry has ever allowed you to have an idea in your head, spend some time at a commodity piece of equipment bought for the price of a week's average wages and to share or sell a product with billions of people at zero cost of reproduction. That's what programming gives you the ability to do.
If you'd like to start learning, and you'd like to start learning Ruby, Chris Pine has some excellent material and his book is worth considering if you're a complete beginner. This blog will never be a programming tutorial, but I'm hoping that by using Ruby and answering questions in the comments, even novices will be able to get a handle on things. You can get yourself a little more clued up now by having a read through that material and maybe take a look at the Ruby Koans.
Good luck if you're just starting out! I'm envious of the fun you're about to have, learning to code Ruby for the first time.
P.S. - my plan was to have data scraping articles up by now. Alas, my girlfriend broke her arm last week and hospital visits have absorbed all my free time. Later this week I should be getting back on track. The response to this blog so far has been fantastic, and so I'm really looking forward to getting started on the meat of this project and sharing it with you all.
Labels: chris pine, koans, language choice, learn to program, programming, ruby
Tuesday, 2 August 2011
Planning the Development of a Bot
Time to get down to writing some bots. As a software developer by trade, that means we need to do some planning.
A bot is simply an automated way to do something you would do manually. It has several advantages we wish to exploit, and if anything we do negates one of these advantages we must reconsider whether it's suitable for a bot to be doing it.
- Speed is of the essence. A bot can obviously act and react much faster than a human. Because of how the APIs work a bot can monitor many markets simultaneously and strike/cancel bets much faster than any human. This has obvious advantages as the earliest mover in a market to make the right decision typically will make the most money.
- It will behave consistently according to design. We have all found systems, techniques and tricks that appear to be profitable. Those with strong discipline might even be able to stick to them. For the rest of us though, there is always a desire to have a quick punt or to break our system based on a hunch. It makes the statistical analysis of what we're doing borderline ridiculous. That's when losses occur. Bots will do what they're programmed to do, and only what they're programmed to do.
- They free you up to do other things. Back in 2002 when I was first trading I was spending 10-12 hours a day at a screen. It was demoralising and boring work, grinding out a meagre existence. I would rather have spent my days enjoying watching the sport without needing to keep an exchange screen open constantly, or doing something entirely different instead. A bot can - and probably should - work completely unaccompanied, perhaps running on a server in a data centre so I don't even need my own computer to be turned on. It should be able to work on markets whilst I am enjoying the fruits of its labour.
- The nature of building a bot forces you to be more systematic and analytical. Rather than going with "it's a hunch" style of betting/trading, you must justify every line of code. Even better, following industry standard programming methods, a bot can be "iteratively" developed, that is to say tweaked and enhanced bit by bit based on data and results. This requires us to go back and revisit our assumptions regularly, and make new ones. It's a healthier way to think about what we're doing here.
A bot has a few things it needs to be able to do to meet these objectives.
It must be able to log in to a betting exchange or bookmaker website (the latter important for arbing bots), select an event or events, determine which selection or selections it will bet on, strike the bets thereby committing capital according to some sort of staking method, monitor those bets and impose a stop loss if necessary, green up when optimal if we believe greening up is the best way to behave, and log all the data about what it's done somewhere so we can go back and check what it was doing in a spreadsheet later.
There are a couple of bits of this which are relatively straightforward, and others where we need to make some decisions. Logging into an exchange at first appears trivial (use the "login" function of the API), but do we want to use the API or do we want to screen scrape? Why have we made that choice? I'll be exploring that at some point and showing what the code for each looks like. If you've never written/read any code before, don't panic! It'll be understandable.
However the other jobs all give us much more to think about. We need to justify every line of code, and to meet the "systematic and analytical" test that a bot should pass, we need to handle event selection, runner selection, staking, stop lossing and greening up based on some sort of scientific analysis.
Next job then, it's time to do some data gathering to test some hypotheses.
The first idea for a bot I am going to test is one I expect to fail at the analysis stage (which is good! If it's going to fail, that's the point in time I want for it to fail, not after the bot is built and running out onto the exchanges with hard cash), but will be relatively simple to implement and therefore a good "starting project" for this blog. It is very simple to state:
Find non-handicap UK/IRE horse races which will go in-play, and back the favourite for 5% of my betting bank in the place market 5-10 minutes before the official off. Then lay it at a price giving a 10% RoI greened up with the "keep in play" option on the bet selected.
It's a very naive system based on the assumption that favourites in non-handicaps are priced fairly, and I currently have no evidence of it being profitable. Each component needs to be tested and verified before I build any code. I'm happy to modify my thoughts based on the results I get back, and I have plenty of other ideas for bots if this one doesn't stand up to analysis: the purpose of this idea is primarily to go through the analysis stages.
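To put some numbers on the system stated above, here is a sketch of the greening-up arithmetic. It assumes that "10% RoI" means 10% of the back stake locked in whatever the result, and it ignores commission and the fact that real prices must sit on the exchange's tick ladder. The green_up helper, the 1000 bank and the 2.2 back price are all made up for illustration. If we back a stake S at price B and later lay at price L, a lay stake of S x B / L leaves the same profit, S x (B / L - 1), on every outcome, so hitting a target RoI of r needs L = B / (1 + r).

```ruby
# Returns the lay price and lay stake that lock in `roi` (e.g. 0.10) on the
# back stake whatever the result.
# ASSUMPTIONS: no commission, and prices are not rounded to valid ticks.
def green_up(back_stake, back_price, roi)
  lay_price = back_price / (1.0 + roi)
  lay_stake = back_stake * back_price / lay_price
  profit = lay_stake - back_stake # identical whether the selection places or not
  { lay_price: lay_price, lay_stake: lay_stake, profit: profit }
end

bank = 1000.0
stake = bank * 0.05 # 5% of the betting bank, as in the system above
plan = green_up(stake, 2.2, 0.10)

puts format("lay %.2f at %.2f to lock in %.2f",
            plan[:lay_stake], plan[:lay_price], plan[:profit])
```

With these numbers the bot would be waiting for the price to come in from 2.2 to 2.0 before it could green up, which gives a feel for how big a move the 10% target is asking for.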
In the next few posts I'll be grabbing data from multiple sources and building/using tools to analyse whether this system is viable. I'll be using statistical analysis, of course, but also some funkier analysis tools like decision trees to refine event selection and the like.
Feel free to ask any questions if you have them.