I. Introduction
Motivated by feedback control of colloidal self-assembly (SA), this work focuses on learning the solution of nonlinear stochastic optimal control problems over a given fixed time horizon $[0,T]$ of the form
\begin{align*}
&\underset{\boldsymbol{u}\in\mathcal{U}}{\inf}\:\mathbb{E}_{\mu^{\boldsymbol{u}}}\left[\int_{0}^{T}\frac{1}{2}\|\boldsymbol{u}\left(t,\boldsymbol{x}\right)\|_{2}^{2}\:{\mathrm{d}}t\right] \tag{1a}\\
&\text{subject to}\;{\mathrm{d}}\boldsymbol{x} = \boldsymbol{f}\left(t,\boldsymbol{x},\boldsymbol{u}\right){\mathrm{d}}t + \sqrt{2}\:\boldsymbol{g}\left(t,\boldsymbol{x},\boldsymbol{u}\right){\mathrm{d}}\boldsymbol{w} \tag{1b}\\
&\hphantom{\text{subject to}\;}\boldsymbol{x}\left(t=0\right)\sim\mu_{0}\;\left(\text{given}\right),\;\boldsymbol{x}\left(t=T\right)\sim\mu_{T}\;\left(\text{given}\right) \tag{1c}
\end{align*}
where $\mu_{0}$ and $\mu_{T}$ denote the prescribed probability measures over the state space at $t=0$ and $t=T$, respectively. The constraint (1b) is a controlled Itô stochastic differential equation (SDE) with the state vector $\boldsymbol{x}$, the control vector $\boldsymbol{u}$, and the standard Wiener process $\boldsymbol{w}$. For colloidal SA systems, the state vector $\boldsymbol{x}$ represents suitable order parameters. The drift coefficient $\boldsymbol{f}$ is a vector field given by the mapping $\left(t,\boldsymbol{x},\boldsymbol{u}\right)\mapsto\boldsymbol{f}\left(t,\boldsymbol{x},\boldsymbol{u}\right)$, and the diffusion coefficient $\boldsymbol{g}$ is a matrix field given by the mapping $\left(t,\boldsymbol{x},\boldsymbol{u}\right)\mapsto\boldsymbol{g}\left(t,\boldsymbol{x},\boldsymbol{u}\right)$. For the SDE solutions to be well-posed, we will detail suitable smoothness assumptions on $\boldsymbol{f}$ and $\boldsymbol{g}$.
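To make the roles of the terms in (1a) and (1b) concrete, the following is a minimal sketch that simulates a scalar instance of the controlled SDE with the Euler-Maruyama scheme and forms a Monte Carlo estimate of the control effort in (1a) for a user-supplied feedback policy. It is an illustration under stated assumptions, not the method developed in this work: the function name `simulate_cost`, the scalar state, the example drift, diffusion, policy, and initial sampler are all hypothetical, and the terminal constraint $\boldsymbol{x}(t=T)\sim\mu_{T}$ in (1c) is not enforced here. The sketch only returns the terminal samples so that their distribution can be inspected.

```python
import math
import random


def simulate_cost(f, g, u, x0_sampler, T=1.0, n_steps=200, n_paths=500, seed=0):
    """Euler-Maruyama simulation of a scalar instance of (1b).

    Returns a Monte Carlo estimate of E[ int_0^T 0.5*u(t,x)^2 dt ] from (1a)
    together with the simulated terminal states x(T).
    All arguments are illustrative placeholders (assumptions of this sketch).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    total_cost = 0.0
    terminal_states = []
    for _ in range(n_paths):
        x = x0_sampler(rng)                      # draw x(0) from mu_0
        cost = 0.0
        for k in range(n_steps):
            t = k * dt
            ut = u(t, x)                         # feedback control u(t, x)
            cost += 0.5 * ut * ut * dt           # running cost in (1a)
            dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment ~ N(0, dt)
            # One Euler-Maruyama step of (1b), including the sqrt(2) factor.
            x += f(t, x, ut) * dt + math.sqrt(2.0) * g(t, x, ut) * dw
        total_cost += cost
        terminal_states.append(x)
    return total_cost / n_paths, terminal_states


if __name__ == "__main__":
    # Hypothetical example: linear drift, constant diffusion, and an
    # arbitrary (not optimal) linear feedback policy.
    est_cost, x_T = simulate_cost(
        f=lambda t, x, u: -x + u,
        g=lambda t, x, u: 0.5,
        u=lambda t, x: -0.5 * x,
        x0_sampler=lambda rng: rng.gauss(1.0, 0.1),
    )
    print("estimated control effort:", est_cost)
    print("sample mean of x(T):", sum(x_T) / len(x_T))
```

In the problem (1), the policy is not fixed in advance as it is in this sketch: the infimum in (1a) is taken over admissible policies that also steer the state distribution from $\mu_{0}$ to $\mu_{T}$.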