Custom PyTensor Op to calculate an analytical gradient
Hello,
I would like to implement a custom PyTensor Op that returns a value, EA, while computing the analytical gradient of this function and passing it to PyMC so it can continue forward. The gradient of the function I want to use has been derived in this paper (Equations A5 and A6).
Here:
\begin{align*} \frac{\partial E}{\partial M} &= \frac{1}{1-e\cos E(t)} \end{align*}
and
\begin{align*} \frac{\partial E}{\partial e} &= \frac{\sin E(t)}{1-e\cos E(t)} \end{align*}
The function I currently have is:
class KeplerSolverOp(Op):
    __props__ = ()

    def make_node(self, M, e):
        M = as_tensor_variable(M)
        e = as_tensor_variable(e)
        return Apply(self, [M, e], [M.type(), e.type()])

    def perform(self, node, inputs, output_storage):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)  # compute eccentric anomaly
        output_storage[0][0] = EA

    def grad(self, inputs, output_grads):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)
        sea = pm.math.sin(EA)
        cea = pm.math.cos(EA)
        temp = 1 - e * cea
        dEA_dM = 1.0 / temp
        dEA_de = (sea * EA) / temp
        output_grads[0] = dEA_dM
        output_grads[1] = dEA_de
        return [dEA_dM, dEA_de]
The problem is that the output of “KeplerSolverOp” is not the individual value I’m expecting, since I was expecting to call it like:
kepler_solver_op = KeplerSolverOp()
eanom = kepler_solver_op(manom, ecc)
What could I be missing?
You defined an Op that has two outputs, M.type() and e.type(), so calling the Op returns two variables.
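For illustration, with two declared outputs the call gives back two variables to unpack; if only the eccentric anomaly is wanted, a single output can be declared instead (a sketch based on the code above):

# With the make_node above, calling the Op returns two variables:
eanom, _ = kepler_solver_op(manom, ecc)

# If only EA is needed, declare a single output in make_node instead:
def make_node(self, M, e):
    M = as_tensor_variable(M)
    e = as_tensor_variable(e)
    return Apply(self, [M, e], [M.type()])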
Thank you for the fast response! If I want the output of my “KeplerSolverOp” to be “EA” from
EA = _calc_ecc_anom(M, e)
and the gradients
\begin{align*} \frac{\partial E}{\partial M} &= \frac{1}{1-e\cos E(t)} \end{align*}
and
\begin{align*} \frac{\partial E}{\partial e} &= \frac{\sin E(t)}{1-e\cos E(t)} \end{align*}
sent to PyMC to continue working, should I then write make_node like this:
def make_node(self, M, e):
    M = as_tensor_variable(M)
    e = as_tensor_variable(e)
    EA = _calc_ecc_anom(M, e)
    return Apply(self, [M, e], [EA])
When I do that I get:
Traceback (most recent call last):
  File "/Users/jrob/Desktop/School/NU/Research/Jason/gj504/pymc_GJ504.py", line 77, in <module>
    raoff, deoff = kepler.calc_orbit(epoch, sma, ecc, inc, aop, pan, tau, plx, mtot, tau_ref_epoch=epoch[0])
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jrob/Desktop/School/NU/Research/Jason/gj504/kepler.py", line 227, in calc_orbit
    f = pt.function([manom,ecc], KeplerSolverOp()(manom,ecc))
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jrob/miniconda3/envs/pymc/lib/python3.11/site-packages/pytensor/graph/op.py", line 293, in __call__
    node = self.make_node(*inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jrob/Desktop/School/NU/Research/Jason/gj504/kepler.py", line 75, in make_node
    return Apply(self, [M, e], [EA])
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jrob/miniconda3/envs/pymc/lib/python3.11/site-packages/pytensor/graph/basic.py", line 158, in __init__
    raise ValueError(
ValueError: All output variables passed to Apply must belong to it.
In make_node you just specify the types of the input and output variables; your output is doing more than that (there shouldn't be a need to call _calc_ecc_anom in it).
Type here means an instance of tensor(shape=..., dtype=...). That's all PyTensor needs to know about the output at the make_node method; the actual computation should be defined in the perform method.
And your gradient must contain only pure PyTensor operations, so you may need to call self if you need the output of the Op to define the gradient.
Perhaps this guide will clarify some of these things: Using a “black box” likelihood function — PyMC example gallery
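To make that division of labor concrete, here is a rough, untested sketch of the structure being described (the helper _calc_ecc_anom_np is a hypothetical NumPy-only version of your solver, and the partial derivatives are the ones quoted above):

import numpy as np
import pytensor.tensor as at
from pytensor.graph.op import Apply, Op

class KeplerSolverOp(Op):
    __props__ = ()

    def make_node(self, M, e):
        # make_node only declares input/output types; no computation here
        M = at.as_tensor_variable(M)
        e = at.as_tensor_variable(e)
        return Apply(self, [M, e], [M.type()])

    def perform(self, node, inputs, output_storage):
        # perform works on concrete NumPy values only
        M, e = inputs
        output_storage[0][0] = np.asarray(_calc_ecc_anom_np(M, e))

    def grad(self, inputs, output_grads):
        # grad builds a symbolic PyTensor expression; calling self() reuses EA
        M, e = inputs
        EA = self(M, e)
        denom = 1 - e * at.cos(EA)
        dEA_dM = 1 / denom
        dEA_de = at.sin(EA) / denom
        (g,) = output_grads
        # chain rule: scale the partials by the incoming output gradient
        return [g * dEA_dM, g * dEA_de]

Calling self(M, e) inside grad simply inserts another node of the same Op into the graph, so the numerical solver is reused rather than re-deriving EA symbolically.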
I haven’t seen this yet. I’ll look through it and check back here when I implement it!
Alrighty, so I reorganized my custom Ops to now look like this:
class KeplerSolverOp(Op):
    __props__ = ()

    def make_node(self, M, e):
        # Setting inputs to function of M and e
        M = at.as_tensor(M)
        e = at.as_tensor_variable(e)
        # Setting input list for Apply
        inputs = [M, e]
        outputs = [M.type()]
        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, outputs):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)  # compute eccentric anomaly
        outputs[0][0] = EA  # may have to add something like pt.value...

    def grad(self, inputs, out_grad):
        M, e = inputs
        grad_wrt_m, grad_wrt_e = kepler_loglikegrad(M, e)
        out_grad = grad_wrt_m, grad_wrt_e
        # print(grad_wrt_m, grad_wrt_e)
        return [grad_wrt_m, grad_wrt_e]

class LogLikeGrad(Op):
    def make_node(self, M, e):
        # Setting inputs to function of M and e
        M = at.as_tensor(M)
        e = at.as_tensor_variable(e)
        # Setting input list for Apply
        inputs = [M, e]
        # There are two outputs with the same type as the data,
        # for the partial derivatives wrt M and e
        outputs = [M.type(), M.type()]
        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, outputs):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)
        sea = pm.math.sin(EA)
        cea = pm.math.cos(EA)
        temp = 1 - e * cea
        dEA_dM = 1.0 / temp
        dEA_de = (sea * EA) / temp
        # return [grad_M, grad_e]
        outputs[0][0] = dEA_dM
        outputs[1][0] = dEA_de

keplergrad = KeplerSolverOp()
kepler_loglikegrad = LogLikeGrad()
It seems to run, but I get a segfault. I don't see a "core" file to look at to even investigate it. I wonder what could be consuming so much RAM (if that's the true cause)?
One thing I have tried is to run
import numpy as np
import pytensor.tensor as at
import pymc as pm
from pytensor.graph.op import Op, Apply
from pytensor.tensor import grad, as_tensor_variable
from pytensor import function
import pytensor as pt

def _calc_ecc_anom(manom, ecc):
    """
    Computes the eccentric anomaly from the mean anomaly.
    Code from Rob De Rosa's orbit solver (e < 0.95 use Newton, e >= 0.95 use Mikkola)

    Args:
        manom (float/np.array): mean anomaly, either a scalar or np.array of any shape
        ecc (float/np.array): eccentricity, either a scalar or np.array of the same shape as manom

    Return:
        eanom (float/np.array): eccentric anomalies, same shape as manom

    Written: Jason Wang, 2018
    """
    # print(type(ecc))
    alpha = (1.0 - ecc) / ((4.0 * ecc) + 0.5)
    beta = (0.5 * manom) / ((4.0 * ecc) + 0.5)
    aux = pm.math.sqrt(beta**2.0 + alpha**3.0)
    z = pm.math.abs(beta + aux)**(1.0/3.0)
    s0 = z - (alpha/z)
    s1 = s0 - (0.078*(s0**5.0)) / (1.0 + ecc)
    e0 = manom + (ecc * (3.0*s1 - 4.0*(s1**3.0)))
    se0 = pm.math.sin(e0)
    ce0 = pm.math.cos(e0)
    f = e0 - ecc*se0 - manom
    f1 = 1.0 - ecc*ce0
    f2 = ecc*se0
    f3 = ecc*ce0
    f4 = -f2
    u1 = -f/f1
    u2 = -f/(f1 + 0.5*f2*u1)
    u3 = -f/(f1 + 0.5*f2*u2 + (1.0/6.0)*f3*u2*u2)
    u4 = -f/(f1 + 0.5*f2*u3 + (1.0/6.0)*f3*u3*u3 + (1.0/24.0)*f4*(u3**3.0))
    return e0 + u4

class KeplerSolverOp(Op):
    __props__ = ()

    def make_node(self, M, e):
        # Setting inputs to function of M and e
        M = at.as_tensor(M)
        e = at.as_tensor_variable(e)
        # Setting input list for Apply
        inputs = [M, e]
        outputs = [M.type()]
        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, outputs):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)  # compute eccentric anomaly
        outputs[0][0] = EA  # may have to add something like pt.value...

    def grad(self, inputs, out_grad):
        M, e = inputs
        grad_wrt_m, grad_wrt_e = kepler_loglikegrad(M, e)
        out_grad = grad_wrt_m, grad_wrt_e
        # print(grad_wrt_m, grad_wrt_e)
        return [grad_wrt_m, grad_wrt_e]

class LogLikeGrad(Op):
    def make_node(self, M, e):
        # Setting inputs to function of M and e
        M = at.as_tensor(M)
        e = at.as_tensor_variable(e)
        # Setting input list for Apply
        inputs = [M, e]
        # There are two outputs with the same type as the data,
        # for the partial derivatives wrt M and e
        outputs = [M.type(), M.type()]
        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, outputs):
        M, e = inputs
        EA = _calc_ecc_anom(M, e)
        sea = pm.math.sin(EA)
        cea = pm.math.cos(EA)
        temp = 1 - e * cea
        dEA_dM = 1.0 / temp
        dEA_de = (sea * EA) / temp
        # return [grad_M, grad_e]
        outputs[0][0] = dEA_dM
        outputs[1][0] = dEA_de

keplergrad = KeplerSolverOp()
kepler_loglikegrad = LogLikeGrad()

M = at.scalar("m")
e = at.scalar("e")
out = keplergrad(M, e)
eval_out = out.eval({M: 2.96466117, e: 0.4781471212028443})
print(eval_out.eval())
and I got no errors and received “3.021801889806251” which is correct…
Also I have tried
out = keplergrad(M,e)
grad_wrt_m, grad_wrt_e = pt.grad(out.sum(), wrt=[M, e])
pt.dprint([grad_wrt_m], print_type=True)
and received
LogLikeGrad.0 [id A] <Scalar(float64, shape=())>
├─ m [id B] <Scalar(float64, shape=())>
└─ e [id C] <Scalar(float64, shape=())>
Your perform method in the gradient Op should contain only plain Python/NumPy code; you're using pm.math, which is not correct. If you want to use those operations, do it in the grad method of the original Op (the grad method, in contrast to perform, uses only PyTensor code).
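In other words, the gradient Op's perform could look something like this (a sketch; _calc_ecc_anom_np stands for a hypothetical NumPy-only port of _calc_ecc_anom, and the expressions follow Equations A5 and A6 quoted above):

def perform(self, node, inputs, outputs):
    # Plain NumPy only here -- no pm.math / PyTensor calls on concrete values
    M, e = inputs
    EA = _calc_ecc_anom_np(M, e)  # hypothetical NumPy version of the solver
    denom = 1.0 - e * np.cos(EA)
    outputs[0][0] = np.asarray(1.0 / denom)         # dE/dM
    outputs[1][0] = np.asarray(np.sin(EA) / denom)  # dE/de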
Perfect! That was the solution. For anybody else that comes across this: the fix is to remove any pm.math functions from the perform methods and to separate the computation into two Ops!
Thank you so much for the help!
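For reference, once the two-Op version with NumPy-only perform methods is in place, one quick way to sanity-check the analytical gradient is a finite-difference comparison (a sketch reusing the keplergrad instance and the at/pt aliases from the script above):

M = at.dscalar("M")
e = at.dscalar("e")
EA = keplergrad(M, e)
dEA_dM, dEA_de = pt.grad(EA, wrt=[M, e])
f = pt.function([M, e], [EA, dEA_dM, dEA_de])

m0, e0, h = 2.96466117, 0.4781471212028443, 1e-6
ea0, g_m, g_e = f(m0, e0)
# Finite-difference approximations of the two partial derivatives
fd_m = (f(m0 + h, e0)[0] - ea0) / h
fd_e = (f(m0, e0 + h)[0] - ea0) / h
print(g_m, fd_m)  # should agree to roughly 1e-5
print(g_e, fd_e)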