**Abstract:** 5G cellular networks come with many new features compared to legacy cellular networks, such as the network data analytics function (NWDAF), which enables network operators to either implement their own machine learning (ML) based data analytics methodologies or integrate third-party solutions into their networks. In this paper, the structure and the protocols of NWDAF
that are defined in the 3rd Generation Partnership Project (3GPP)
standard documents are first described. Then, a cell-based synthetic data set for 5G networks is generated based on the fields defined by the 3GPP specifications. Further, anomalies are added to this data set (e.g., a sudden increase in traffic in a particular cell),
and then these anomalies within each cell, subscriber category,
and user equipment are classified. Afterward, three ML models,
namely, linear regression, long short-term memory, and recurrent neural networks, are implemented to study behaviour information
estimation (e.g., anomalies in the network traffic) and network load
prediction capabilities of NWDAF. For the prediction of network load, the three models are used to minimize the mean absolute error, which is computed as the average absolute difference between the model predictions and the actual generated data. For the classification of anomalies, two ML models, namely, logistic regression and extreme gradient boosting, are used to maximize the area under the receiver operating characteristic (ROC) curve. According to the simulation results, neural network algorithms outperform linear regression in
network load prediction, whereas the tree-based gradient boosting algorithm outperforms logistic regression in anomaly detection.
These estimations are expected to improve the performance of 5G networks through NWDAF.

**Index terms:** Handover, machine learning, NWDAF, 5G networks.