Broken options class for pandas.io.data - Did Yahoo just change their options site? #212
Comments
Same issue. Did Yahoo do a complete overhaul of their site? I hope not.
|
Bump. |
I ended up here because my code has broken as well (see error below). I'll fork it and try to fix it, but I presume someone else can fix it better and faster...
|
How does this get fixed? By whom? And how long will it take? Any info is appreciated. ALL my shit is busted..lol |
I was hoping to contribute to the fix, but Yahoo is not returning a neatly formatted JSON object (from what I can see). The HTTP response is different and I have not made sense of the changes yet. |
I noticed the Yahoo page is a mess too (from a scraper's perspective), so I moved over to Google for a quick fix. The code below returns a rather clean JSON file that might be a good starting point, but I haven't put it into a pandas data frame yet. A lot of the symbols I was looking up in Yahoo can't be found in Google, though.
import urllib2 as ul |
@nborggren Do you have any code that will parse this? Thanks |
Same issue. All my options data scripts return "Data not available". They were all working when I last checked on 7-6-2016. |
I'm having a little trouble parsing this, but here is what I have so far. I needed to reference http://stackoverflow.com/questions/21104592/json-to-pandas-dataframe and the Stack Overflow answer linked in the docstring below. I have pandas dataframes (two, actually: one for calls and one for puts), but it will take some massaging to put them in the form of Yahoo's data before this catastrophe. Perhaps someone else can try too. Below is what I have so far:
from google_fix import *
shows a dataframe that looks like the image. Below is the google_fix code:
import token, tokenize
from six.moves import cStringIO as StringIO
import json
from pandas.io.json import json_normalize
import urllib2 as ul
def fix_lazy_json(in_text):
    """
    Handle lazy JSON - to fix expecting property name
    this function fixes the json output from google
    http://stackoverflow.com/questions/4033633/handling-lazy-json-in-python-expecting-property-name
    """
    tokengen = tokenize.generate_tokens(StringIO(in_text).readline)
    result = []
    for tokid, tokval, _, _, _ in tokengen:
        # fix unquoted strings
        if (tokid == token.NAME):
            if tokval not in ['true', 'false', 'null', '-Infinity', 'Infinity', 'NaN']:
                tokid = token.STRING
                tokval = u'"%s"' % tokval
        # fix single-quoted strings
        elif (tokid == token.STRING):
            if tokval.startswith("'"):
                tokval = u'"%s"' % tokval[1:-1].replace('"', '\\"')
        # remove invalid commas
        elif (tokid == token.OP) and ((tokval == '}') or (tokval == ']')):
            if (len(result) > 0) and (result[-1][1] == ','):
                result.pop()
        result.append((tokid, tokval))
    return tokenize.untokenize(result)

def Options(symbol):
    # fetch the option chain as lazy JSON and repair it before parsing
    dat = ul.urlopen('https://www.google.com/finance/option_chain?q='+symbol+'&output=json').read()
    dat = json.loads(fix_lazy_json(dat))
    # one dataframe for puts and one for calls
    puts = json_normalize(dat['puts'])
    calls = json_normalize(dat['calls'])
    return calls, puts
|
@nborggren Does it work for you? I tried it and it gives me this:
line 556, in http_error_default
The only thing I added was Options('AAPL') at the end of the code in order to run it. I'm relatively new to Python (and coding). If it works for you, any insight would be appreciated. Thanks |
@Slimdog588 It looks like I introduced a typo when github wrapped the urlopen line. Make sure in the file you copied that the ?q='+symbol+... part in the Options function appears right next to the option_chain part. See if that helps. |
@nborggren thank you sir! |
Thanks @nborggren for citing my code. You might use requests (http://docs.python-requests.org/) because it's currently used by pandas-datareader, so it will be easier to add your code here (and have Python 2 / 3 compatibility):
import requests
url = "https://www.google.com/finance/option_chain"
r = requests.get(url, params={"q": "AAPL", "output": "json"})
content_json = r.text
... |
@nborggren I may be wrong, but it seems that the URL (dat) only has option contracts for one expiry and nothing further out. Do you know of a link to get the entire option chain, including all expiries? Thanks! |
Hi @Slimdog588 thanks for pointing that out. It looks like we'll have to call requests (or urlopen) for each expiry which can be specified with a url like below: https://www.google.com/finance/option_chain?q=AAPL&expd=19&expm=1&expy=2018 I'll work on automating this and I'll get back with updated code. |
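As a rough illustration of the per-expiry URL described above, here is a minimal Python 3 sketch that builds one such URL from a day, month and year. The helper name expiry_url is hypothetical (it is not part of pandas-datareader), and the sketch only constructs the string; it does not assume the endpoint still responds.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.google.com/finance/option_chain"


def expiry_url(symbol, day, month, year):
    """Build the option-chain URL for a single expiry (hypothetical helper)."""
    # A list of pairs keeps the parameters in the same order as the URL quoted above.
    params = [("q", symbol), ("expd", day), ("expm", month), ("expy", year)]
    return BASE_URL + "?" + urlencode(params)


url = expiry_url("AAPL", 19, 1, 2018)
print(url)
```

Looping over a list of expiry dates and calling this helper once per date would give one request per expiry.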
I've made some progress massaging this into a DataFrame like the old Yahoo DataFrame and getting all expirations. Here is what it looks like now: You'll notice that there are a lot more NaNs now, and I had to coerce a lot of the variables to get data types like the Yahoo dataframe. IV and QuoteTime are missing now too. Some ints are now floats, though, and trying to convert them to ints throws an error. I'm hoping @Slimdog588, @emican86, @dscheste, and @femtotrader could have a look and offer additional suggestions before I start trying to push this. I'd also like femtotrader's blessing to push his code along with it. The code is below:
import token, tokenize
from six.moves import cStringIO as StringIO
import json
from pandas.io.json import json_normalize
import requests
import pandas as pd
def fix_lazy_json(in_text):
    """
    Handle lazy JSON - to fix expecting property name
    this function fixes the json output from google
    http://stackoverflow.com/questions/4033633/handling-lazy-json-in-python-expecting-property-name
    """
    tokengen = tokenize.generate_tokens(StringIO(in_text).readline)
    result = []
    for tokid, tokval, _, _, _ in tokengen:
        # fix unquoted strings
        if (tokid == token.NAME):
            if tokval not in ['true', 'false', 'null', '-Infinity', 'Infinity', 'NaN']:
                tokid = token.STRING
                tokval = u'"%s"' % tokval
        # fix single-quoted strings
        elif (tokid == token.STRING):
            if tokval.startswith("'"):
                tokval = u'"%s"' % tokval[1:-1].replace('"', '\\"')
        # remove invalid commas
        elif (tokid == token.OP) and ((tokval == '}') or (tokval == ']')):
            if (len(result) > 0) and (result[-1][1] == ','):
                result.pop()
        result.append((tokid, tokval))
    return tokenize.untokenize(result)
def Options(symbol):
    url = "https://www.google.com/finance/option_chain"
    # first request: nearest expiry plus the list of all expirations
    r = requests.get(url, params={"q": symbol,"output": "json"})
    content_json = r.text
    dat = json.loads(fix_lazy_json(content_json))
    puts = json_normalize(dat['puts'])
    calls = json_normalize(dat['calls'])
    np=len(puts)
    nc=len(calls)
    # one additional request per remaining expiry
    for i in dat['expirations'][1:]:
        r = requests.get(url, params={"q": symbol,"expd":i['d'],"expm":i['m'],"expy":i['y'],"output": "json"})
        content_json = r.text
        idat = json.loads(fix_lazy_json(content_json))
        puts1 = json_normalize(idat['puts'])
        calls1 = json_normalize(idat['calls'])
        # shift the indexes so rows from different expiries do not collide
        puts1.index = [np+k for k in puts1.index]
        calls1.index = [nc+k for k in calls1.index]
        np+=len(puts1)
        nc+=len(calls1)
        puts = puts.append(puts1)
        calls = calls.append(calls1)
    # rename the columns to match the old Yahoo dataframe
    calls.columns = ['Ask','Bid','Chg','cid','PctChg','cs','IsNonstandard','Expiry','Underlying','Open_Int','Last','Symbol','Strike','Vol']
    puts.columns = ['Ask','Bid','Chg','cid','PctChg','cs','IsNonstandard','Expiry','Underlying','Open_Int','Last','Symbol','Strike','Vol']
    calls['Type'] = 'call'
    puts['Type'] = 'put'
    puts.index = [k+len(calls) for k in puts.index]
    opt=pd.concat([calls,puts])
    opt['Underlying'] = symbol
    opt['Underlying_Price'] = dat['underlying_price']
    opt['Root']=opt['Underlying']
    # coerce numeric columns; unparseable values become NaN
    for j in ['Vol','Strike','Last','Bid','Ask','Chg']:
        opt[j] = pd.to_numeric(opt[j],errors='coerce')
    opt['IsNonstandard']=opt['IsNonstandard'].apply(lambda x:x!='OPRA')
    opt = opt.sort_values(by=['Strike','Type'])
    opt.index = range(len(opt))
    col = ['Strike', 'Expiry', 'Type', 'Symbol', 'Last', 'Bid', 'Ask', 'Chg', 'PctChg', 'Vol', 'Open_Int', 'Root', 'IsNonstandard', 'Underlying', 'Underlying_Price', 'cid','cs']
    opt = opt[col]
    return opt
|
@nborggren , Great job! Thanks, it looks great. I'm trying to get it to print out option contracts that have volume of 500 or more, I can't seem to get it the way I did it last time, would you know how? Any help is greatly appreciated. Thanks |
@nborggren list.append(opt[opt.Vol > 500].index.get_level_values('Symbol')) |
I'm not sure how get_level_values works. I would just use df = opt[opt['Vol']>500], which gives you a new dataframe of options with Vol over 500. Works for me. |
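For anyone following along, here is a small self-contained sketch of that boolean-mask filter. The contract symbols and volumes are made up for illustration; only the Symbol and Vol column names come from the code in this thread.

```python
import pandas as pd

# Made-up stand-in for the dataframe returned by Options(symbol).
opt = pd.DataFrame({
    "Symbol": ["AAA", "BBB", "CCC"],
    "Vol": [120, 850, 501],
})

# opt["Vol"] > 500 is a boolean Series; indexing with it keeps only the matching rows.
df = opt[opt["Vol"] > 500]
symbols = df["Symbol"].tolist()
print(symbols)
```

Note that > 500 excludes contracts with a volume of exactly 500. If "500 or more" is what you want, use >= 500 instead.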
@nborggren, works like a charm. Many thanks! For those who want to implement this for multiple stocks, I tried doing this but kept getting a ValueError (length of values does not match length of index). This is my last question, thank you for being so helpful!!!
def launch_program():
launch_program()
**input_file is a .csv that has 1 ticker in it
error: Traceback (most recent call last): |
@Slimdog588 I originally didn't account for the fact that the number of call options might be different from the number of puts. Try recopying the code and running it again. No problem, I'm glad to have this working again too. |
@nborggren I'm scanning through 6500 tickers roughly and I'm getting a few errors for some tickers:
I'm trying to handle these errors like so:
My questions: 1) Is there a better way of handling these exceptions? 2) When I add while j < 13 and do the code below, it says TypeError: cannot concatenate 'str' and 'int' objects for the line j+=1... any ideas?
- symbol in this case is a string
I could easily be wrong, but I think the "No JSON object could be decoded" error could be the result of requesting too much data from Google too quickly. Any thoughts are welcome. Apologies for yet another question, and thanks a million for everything! |
I'm running a list of 2500 symbols or so and need to use the following to handle exceptions which, as far as I can tell, are caused by symbols not having any options (I'm writing to a file so I know which tickers failed):
I haven't had the JSON error you speak of. If it is happening continuously, it might be a loss of connectivity or, as you say, Google. If you wait and try again, it should work in that case, but I didn't think we were violating any terms of use. I'm not sure about your TypeError problem, though it sounds like j is getting redefined as a string somewhere. I should probably add some logic for low-liquidity cases where there might be puts and no calls, or calls and no puts, which might be the source of the 'puts' error. |
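One possible way to structure the error handling discussed above is sketched below. It is not the code either poster used: fetch_all and fake_fetch are hypothetical names, and fake_fetch stands in for Options(symbol) so the sketch runs without network access. The assumption is that a missing 'puts' key shows up as a KeyError (no point retrying) and a JSON decoding failure as a ValueError (possibly transient, so wait and retry).

```python
import time


def fetch_all(symbols, fetch, retries=2, delay=1.0):
    """Call fetch(symbol) for each symbol; return (results, failed)."""
    results = {}
    failed = []  # (symbol, reason) pairs, e.g. to write to a log file
    for symbol in symbols:
        for attempt in range(retries + 1):
            try:
                results[symbol] = fetch(symbol)
                break
            except KeyError as exc:
                # e.g. no 'puts' key: the symbol has no options, so retrying won't help
                failed.append((symbol, "missing key %s" % exc))
                break
            except ValueError as exc:
                # e.g. "No JSON object could be decoded": wait and try again
                if attempt == retries:
                    failed.append((symbol, str(exc)))
                else:
                    time.sleep(delay)
    return results, failed


def fake_fetch(symbol):
    """Stand-in for Options(symbol); raises the two errors discussed above."""
    if symbol == "NOOPT":
        raise KeyError("puts")
    if symbol == "BADJSON":
        raise ValueError("No JSON object could be decoded")
    return {"symbol": symbol}


results, failed = fetch_all(["AAPL", "NOOPT", "BADJSON"], fake_fetch, retries=1, delay=0)
failed_symbols = [symbol for symbol, _ in failed]
print(failed_symbols)
```

Keeping the failed tickers in a list (rather than printing inside the loop) makes it easy to write them to a file afterwards or to rerun only those tickers later.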
Hi all, I just tried to retrieve option quotes with the provided code. It seems to return only quotes for monthly expiration dates. I cannot see the weekly ones. Thank you |
Hi @MaxGally, |
yes, it works. Thank you |
I am not much of a coder, so I can't help fix this bug, but I would like to propose a potential, although likely short-term, solution. I have noticed that while the main Yahoo Finance site has been changed beyond recognition (and, to some degree, reason), the Canadian section remains intact and uses the same design the broken function relies on. Is it possible to adjust the function that pulls all options data for a ticker from Yahoo to use the Canadian Yahoo Finance site instead? See for yourselves: I hope this helps. |
@dscheste Thanks for this tip. I do not know how to edit the pandas source to have it use ca.finance.yahoo.com. https://gist.github.com/stephanschulz/98e1ea9d4d04d8d8fb4b943dcd8ea416 |
Fork this repository using the GitHub interface
Git clone your fork
Create a branch for your fix
Try to modify https://github.com/pydata/pandas-datareader/blob/master/pandas_datareader/yahoo/options.py#L81
Run the unit tests to check that it doesn't break anything and that it fixes the issue
Then commit and push
In GitHub, open a Pull Request |
@femtotrader I think both options.py and data.py need to be touched. /usr/local/lib/python2.7/dist-packages/pandas/io/data.py
and /usr/local/lib/python2.7/dist-packages/pandas_datareader/yahoo/options.py
Even after I changed as per above in both files, nothing gets recognized. |
You don't need to modify Sad to know that http://ca.finance.yahoo.com/ doesn't work with current code. PR are welcome! |
@femtotrader If I just reinstall pandas using pip, will this work? I'm new to GitHub and don't know how to use pull, fork, etc. |
I don't think so... until someone fixes it ;-) |
First step is to fork this repository and git clone yours |
nborggren: I was running a simple command: |
Hi @nemi83, |
Hi @nborggren: |
Hello,
Yahoo seems to have changed their site design, as I can no longer pull any options data with pandas.
Is anyone else experiencing this?
Here's what's called:
And here's what I am getting:
output of
pd.show_versions()