Extend TrainedModelSizeStats to include the actual OS-reported (on Linux) memory usage for trained model deployments (pytorch_inference process), instead of relying solely on estimated memory values.
Background
Currently, TrainedModelSizeStats reports:
- model_size_bytes - the size of the model definition
- required_native_memory_bytes - an estimated memory requirement calculated from the model definition length

This estimated value is computed in TransportGetTrainedModelsStatsAction.java:
long estimatedMemoryUsageBytes = totalDefinitionLength > 0L
    ? StartTrainedModelDeploymentAction.estimateMemoryUsageBytes(
        model.getModelId(),
        totalDefinitionLength,
        model.getPerDeploymentMemoryBytes(),
        model.getPerAllocationMemoryBytes(),
        numberOfAllocations
    )
    : 0L;
For anomaly detection jobs, PR #131981 (corresponding to ml-cpp#2846) added actual OS memory reporting via getrusage RSS values. This provides much more accurate information about real memory consumption.
Proposed Changes
Add new fields to TrainedModelSizeStats:
- runtime_native_memory_bytes - current resident set size (RSS) as reported by the OS
- max_runtime_native_memory_bytes - peak resident set size as reported by the OS

Update the Java side to consume these values from the pytorch_inference native process output (requires corresponding ml-cpp changes).
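A minimal sketch of how the extended stats object might look. This is a simplified stand-in, not the actual class: the real TrainedModelSizeStats also implements wire serialization and XContent rendering, and the -1 "not yet reported" sentinel used here is an assumption, not an established contract.

```java
// Illustrative sketch only: a simplified stand-in for TrainedModelSizeStats.
// Field names follow the proposal above; serialization plumbing is omitted.
public final class TrainedModelSizeStatsSketch {

    // Assumed sentinel meaning "the native process has not reported RSS yet".
    public static final long NOT_REPORTED = -1L;

    private final long modelSizeBytes;
    private final long requiredNativeMemoryBytes;
    private final long runtimeNativeMemoryBytes;     // current RSS from the OS
    private final long maxRuntimeNativeMemoryBytes;  // peak RSS from the OS

    public TrainedModelSizeStatsSketch(
        long modelSizeBytes,
        long requiredNativeMemoryBytes,
        long runtimeNativeMemoryBytes,
        long maxRuntimeNativeMemoryBytes
    ) {
        this.modelSizeBytes = modelSizeBytes;
        this.requiredNativeMemoryBytes = requiredNativeMemoryBytes;
        this.runtimeNativeMemoryBytes = runtimeNativeMemoryBytes;
        this.maxRuntimeNativeMemoryBytes = maxRuntimeNativeMemoryBytes;
    }

    public long modelSizeBytes() { return modelSizeBytes; }

    public long requiredNativeMemoryBytes() { return requiredNativeMemoryBytes; }

    public long runtimeNativeMemoryBytes() { return runtimeNativeMemoryBytes; }

    public long maxRuntimeNativeMemoryBytes() { return maxRuntimeNativeMemoryBytes; }

    /** True once the native process has reported at least one RSS sample. */
    public boolean hasRuntimeMemory() {
        return runtimeNativeMemoryBytes > NOT_REPORTED;
    }
}
```

Keeping the runtime fields separate from required_native_memory_bytes lets callers compare the estimate against observed usage without changing the meaning of the existing field.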
Update the stats retrieval logic in TransportGetTrainedModelsStatsAction and related classes to populate and return these values for running deployments.
Consider backward compatibility: the estimated required_native_memory_bytes field should remain for deployments that haven't reported actual usage yet, and for models that aren't currently deployed.
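The fallback described above could be sketched as follows. The helper class name and the -1 sentinel are hypothetical; they do not come from the actual codebase.

```java
// Hypothetical helper illustrating the backward-compatibility fallback:
// prefer the OS-reported RSS once the deployment has sent one, otherwise
// fall back to the estimate derived from the model definition length.
public final class MemoryStatsFallback {

    /** Assumed sentinel meaning "the native process has not reported RSS yet". */
    public static final long NOT_REPORTED = -1L;

    private MemoryStatsFallback() {}

    public static long effectiveMemoryBytes(long runtimeNativeMemoryBytes, long requiredNativeMemoryBytes) {
        return runtimeNativeMemoryBytes > NOT_REPORTED
            ? runtimeNativeMemoryBytes
            : requiredNativeMemoryBytes;
    }
}
```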
Files likely to be modified
- x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/inference/trainedmodel/TrainedModelSizeStats.java - Add new fields
- x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportGetTrainedModelsStatsAction.java - Populate actual memory values
- x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/pytorch/process/PyTorchResultProcessor.java - Process memory stats from native process
- x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/pytorch/results/PyTorchResult.java - Parse memory stats if included in results
- x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/deployment/DeploymentManager.java - Store/expose memory stats

API Changes
The GET _ml/trained_models/<model_id>/_stats API response would include additional fields:
{
  "model_size_stats": {
    "model_size_bytes": 438123456,
    "required_native_memory_bytes": 876246912,
    "runtime_native_memory_bytes": 892452864,
    "max_runtime_native_memory_bytes": 923845632
  }
}
Dependencies
This issue depends on the corresponding ml-cpp changes elastic/ml-cpp#2885 to report actual memory usage from the pytorch_inference process.
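For illustration only, the native process might report memory usage in its results stream roughly like this. The object and field names here are hypothetical; the actual output format is defined by the ml-cpp changes in elastic/ml-cpp#2885.

```json
{
  "memory_usage": {
    "runtime_native_memory_bytes": 892452864,
    "max_runtime_native_memory_bytes": 923845632
  }
}
```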