Interpolating a pandas Series with a custom function


Keywords: python


Question: 

Here's a df:

2005-01-24    117.0
2005-02-22      NaN
2005-03-21      NaN
2005-04-18    114.0
2005-05-23      NaN
2005-06-20      NaN
2005-07-18    122.0

The expected output fills each NaN with the mean of the nearest valid values before and after it, like this:

2005-01-24    117.0
2005-02-22    115.5    ((117 + 114) / 2)
2005-03-21    115.5
2005-04-18    114.0
2005-05-23    118.0    ((114 + 122) / 2)
2005-06-20    118.0
2005-07-18    122.0

To my knowledge, df.interpolate() doesn't accept a custom function. I have also tried experimenting with .rolling(2).mean() and re-indexing, with no success.


1 Answer: 

Suppose you have your data in Series s:

import pandas as pd
import numpy as np

s = pd.Series({'2005-01-24': 117.0,
 '2005-02-22': np.nan,
 '2005-03-21': np.nan,
 '2005-04-18': 114.0,
 '2005-05-23': np.nan,
 '2005-06-20': np.nan,
 '2005-07-18': 122.0})

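You can use ffill and bfill to propagate the previous and next valid values into each gap, then take the mean of the two. Existing values are unchanged, because both fills return the original value at those positions.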

s.ffill().add(s.bfill()).div(2)
Out[71]: 
2005-01-24    117.0
2005-02-22    115.5
2005-03-21    115.5
2005-04-18    114.0
2005-05-23    118.0
2005-06-20    118.0
2005-07-18    122.0
dtype: float64
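
If this is needed in more than one place, the same idea can be wrapped in a small helper. The sketch below is only an illustration: the name fill_with_neighbor_mean is made up for this example and is not a pandas function. It also shows one behaviour worth knowing about: a NaN at the very start or end of the Series has a valid neighbour on only one side, so it stays NaN.

```python
import numpy as np
import pandas as pd


def fill_with_neighbor_mean(s: pd.Series) -> pd.Series:
    """Fill each NaN with the mean of the nearest valid values on either side.

    Hypothetical helper name, used only for this illustration.
    """
    prev_valid = s.ffill()  # last valid value at or before each position
    next_valid = s.bfill()  # next valid value at or after each position
    # NaN propagates through add(), so gaps with only one valid side stay NaN.
    return prev_valid.add(next_valid).div(2)


s = pd.Series({'2005-01-24': 117.0,
               '2005-02-22': np.nan,
               '2005-03-21': np.nan,
               '2005-04-18': 114.0,
               '2005-05-23': np.nan,
               '2005-06-20': np.nan,
               '2005-07-18': 122.0})
filled = fill_with_neighbor_mean(s)
print(filled.tolist())  # [117.0, 115.5, 115.5, 114.0, 118.0, 118.0, 122.0]

# Leading and trailing gaps are left as NaN.
edge = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])
edge_filled = fill_with_neighbor_mean(edge)
print(edge_filled.tolist())  # [nan, 1.0, 2.0, 3.0, nan]
```

If the gaps at the edges should be filled as well, chaining a further .ffill().bfill() onto the result is one option; it copies the nearest available value outward.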