The average β and γ energies released in the β⁻ decay of nuclei play an important role in many fields of nuclear technology and scientific research, such as decay heat and antineutrino spectrum calculations for different kinds of reactors. However, reliable experimental measurements of the average energies are lacking for many nuclei, and theoretical calculations need to be improved to meet the accuracy requirements of technical applications. In this study, the average β, γ, and neutrino energies of β⁻-decay nuclei are investigated with a neural network method based on newly evaluated experimental data for 543 nuclei selected from a total of 1136 β⁻-decay nuclei. In the neural network approach, three different feature groups are used for model training. Each feature group contains one distinguishing feature (one of $T_{1/2}$, $\left( \dfrac{1}{T_{1/2}} \right)^{1/5}$, and $\dfrac{1}{3}Q$) along with five features common to all groups (Z, N, parity of Z, parity of N, and $\Delta Z$). The three distinguishing features are selected on the following physical grounds. 1) The average energy is clearly related to the Q value and is approximately taken as $\dfrac{1}{3}Q$ in the reactor industry; therefore, $\dfrac{1}{3}Q$ is chosen as one feature. 2) The half-life is related to the Q value of the β⁻ decay, so $T_{1/2}$ is considered. 3) According to Sargent's law, $\left( \dfrac{1}{T_{1/2}} \right)^{1/5} \propto Q$, so $\left( \dfrac{1}{T_{1/2}} \right)^{1/5}$ is selected as a more accurate representation of this relationship. For the $T_{1/2}$ feature group, the training results for all three types of average energy are unsatisfactory.
For the other groups, the relative errors of the average β energy are 19.32% and 28.11% for the $\left( \dfrac{1}{T_{1/2}} \right)^{1/5}$ and $\dfrac{1}{3}Q$ feature groups in the training set, and 82% and 56.9% in the validation set, respectively; the relative errors of the average γ energy are 28.9% and 76.9% for the $\left( \dfrac{1}{T_{1/2}} \right)^{1/5}$ and $\dfrac{1}{3}Q$ feature groups in the training set, and both exceed 100% in the validation set; for the average neutrino energy, the relative errors are 27.82% and 35.33% for the $\left( \dfrac{1}{T_{1/2}} \right)^{1/5}$ and $\dfrac{1}{3}Q$ feature groups in the training set, and 76.32% and 37.76% in the validation set, respectively. Based on the accuracy comparison of the three groups, the $\dfrac{1}{3}Q$ feature group is chosen to predict the average energies of nuclei in the fission product region (mass numbers from 66 to 172) that lack reliable experimental data. As a result, predicted average energy data are supplied for 291 nuclei. In addition, the calculated data are compared with the evaluated experimental data across the nuclide chart. The neural network accurately predicts the experimental data for the average β and neutrino energies, which exhibit relatively strong regularity. However, the predicted average γ energy deviates significantly from the experimental data (the relative error in the training set is 76.9%). Large deviations also appear for odd-odd nuclei and nuclei near magic numbers. This study confirms that integrating empirical relationships and physical principles can effectively improve the performance of the neural network, and it also reveals the relationship between data regularity and model generalization capability. These findings provide a basis for using physical mechanisms to optimize machine learning models in the future.
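
As an illustration only, the three feature groups and a relative-error metric of the kind quoted above could be assembled as in the following minimal sketch. The function names, the feature ordering, the units, and the definition of the relative error as a mean absolute percentage deviation are all assumptions made here for illustration; the abstract does not specify them, and $\Delta Z$ is simply passed in as a precomputed input because its definition is not given.

```python
def parity(n: int) -> int:
    """Return 0 for even n and 1 for odd n."""
    return n % 2


def feature_groups(z, n, delta_z, half_life, q_value):
    """Build the three six-element feature vectors described in the text.

    delta_z is taken as a precomputed input (its definition is assumed
    to be supplied by the caller). half_life and q_value may be in any
    consistent units; the choice of units is an assumption of this sketch.
    """
    common = [z, n, parity(z), parity(n), delta_z]
    return {
        "T_half": [half_life] + common,
        # Sargent's law motivates (1 / T_half)^(1/5) as a quantity proportional to Q.
        "inv_T_half_fifth_root": [(1.0 / half_life) ** 0.2] + common,
        # Reactor-industry approximation: average energy is roughly Q / 3.
        "Q_over_3": [q_value / 3.0] + common,
    }


def mean_relative_error(predicted, reference):
    """Mean of |predicted - reference| / |reference|, in percent (assumed metric)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return 100.0 * total / len(predicted)


# Placeholder numbers chosen for easy arithmetic, not evaluated nuclear data.
groups = feature_groups(z=40, n=60, delta_z=0.0, half_life=32.0, q_value=3000.0)
error_percent = mean_relative_error([110.0, 90.0], [100.0, 100.0])
```

Each of the three vectors would then be fed to a separately trained network, so that the only difference between the models is the single distinguishing feature in the first position.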