Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Activation Rebalancing #9140

Open
wants to merge 42 commits into
base: main
Choose a base branch
from

Conversation

ledjon-behluli
Copy link
Contributor

@ledjon-behluli ledjon-behluli commented Sep 11, 2024

This PR introduces Memory-aware Activation Rebalancing (MAR), it also fixes #9135

MAR is designed to ensure a balanced distribution of activations across silos by considering both the number of activations and their memory usage. The goal is to maintain an efficient and balanced system, even as the cluster undergoes dynamic changes. MAR employs the principle of maximum entropy, which involves constraints to balance the total number of activations. The process involves forming pairs of silos, calculating deviations, and iteratively adjusting the activations until equilibrium is achieved.

The full theory and simulations behind MAR can be found here

Here you can find an illustration of MAR working in an imbalanced 4-silo cluster. The same can be run also by navigating to "playground/ActivationRebalancing" and starting the cluster & frontend projects.

image

Implementation Details

Rebalancer

MAR is implement as grain called ActivationRebalancerWorker, and the algorithm runs there. It must be a single grain and it must be active all the time if users decided to activate it (the feature is opt-in). User's can request a report from it any time, resume or suspend its activity for some time (or indefinitely).

[Alias("IActivationRebalancerWorker")]
internal interface IActivationRebalancerWorker : IGrainWithIntegerKey
{
    /// <summary>
    /// Returns the most recent rebalancing report.
    /// </summary>
    [AlwaysInterleave, Alias("GetReport")] ValueTask<RebalancingReport> GetReport();

    /// <summary>
    /// Resumes rebalancing if its suspended, otherwise its a no-op.
    /// </summary>
    [Alias("ResumeRebalancing")] Task ResumeRebalancing();
    /// <summary>
    /// Suspends rebalancing if its running, otherwise its a no-op.
    /// </summary>
    /// <param name="duration">
    /// The amount of time to suspend the rebalancer.
    /// <para><see langword="null"/> means suspend indefinitely.</para>
    /// </param>
    [Alias("SuspendRebalancing")] Task SuspendRebalancing(TimeSpan? duration);
}

Since the rebalancer is a grain that is hosted somewhere in the cluster, it must be kept alive during any circumstance. The rebalancer is under the watch of system targets IActivationRebalancerMonitor and will report to them periodically. If it fails to do so, one of them will contact the rebalancer. If its dead due to its silo crashing, it will be woken up and continue to report back, otherwise if its a network isolation issue, its a no-op from the monitor. When the silo that hosts the rebalancer is gracefully shut-down (rolling deployments and such), the system target that is in the same host as the rebalancer will instruct the runtime to migrate the rebalancer somewhere else. The system target itself dies with the silo. If migration is successful, the rebalancer will bring its current state over to the other silo, and will began right where it left off before its host silo began shutting down This essentially means that if its in between a (potentially) long rebalancing session, progress is preserved.

Monitor

The monitor system targets are defined like below. One of them will start the rebalancer, and the rebalancer will report to all of them. If it fails to do so, one of them will wake it up.

[Alias("IActivationRebalancerMonitor")]
internal interface IActivationRebalancerMonitor : ISystemTarget, IActivationRebalancer
{
    /// <summary>
    /// The period on which the <see cref="IActivationRebalancerWorker"/> must report back to the monitor.
    /// </summary>
    public static readonly TimeSpan WorkerReportPeriod = TimeSpan.FromSeconds(30);

    /// <summary>
    /// Invoked periodically by the <see cref="IActivationRebalancerWorker"/>.
    /// </summary>
    [Alias("Report")] Task Report(RebalancingReport report);
}

The monitor servers also as a proxy for clients to:

  1. Get reports statistics of the rebalancer without incurring remote calls.
  2. Subscribe to report changes.
  3. Control the rebalancer (resume/suspend activity).

The monitor sys target extends the IActivationRebalancer which servers as a a gateway to interface with the activation rebalancer itself.

/// <summary>
/// A gateway to interface with the activation rebalancer.
/// </summary>
/// <remarks>This is available only on the silo.</remarks>
public interface IActivationRebalancer
{
    /// <summary>
    /// Returns the rebalancing report.
    /// <para>The report can lag behind if you choose a session cycle period less than <see cref="IActivationRebalancerMonitor.WorkerReportPeriod"/>.</para>
    /// </summary>
    /// <param name="force">If set to <see langword="true"/> returns the most current report.</param>
    /// <remarks>Using <paramref name="force"/> incurs an asynchronous operation.</remarks>
    ValueTask<RebalancingReport> GetRebalancingReport(bool force = false);

    /// <inheritdoc cref="IActivationRebalancerWorker.ResumeRebalancing"/>
    Task ResumeRebalancing();

    /// <inheritdoc cref="IActivationRebalancerWorker.SuspendRebalancing(TimeSpan?)"/>
    Task SuspendRebalancing(TimeSpan? duration = null);

    /// <summary>
    /// Subscribe to activation rebalancer reports.
    /// </summary>
    /// <param name="listener">The component that will be notified.</param>
    void SubscribeToReports(IActivationRebalancerReportListener listener);

    /// <summary>
    /// Unsubscribe from activation rebalancer reports.
    /// </summary>
    /// <param name="listener">The already subscribed component.</param>
    void UnsubscribeFromReports(IActivationRebalancerReportListener listener);
}

Reports

Users can query the latest reports from their local monitor, but they can lag behind if ActivationRebalancerOptions.SessionCyclePeriod is chosen to be less than IActivationRebalancerMonitor.WorkerReportPeriod (which by default it is). If users want the latest report they can do GetRebalancingReport(force: true) which will contact the rebalancer grain but potentially incur a remote call.

The report structure contains information about the rebalancer itself, and rebalancing statistics for the active silos in the cluster.
Information about each property is provided via XML doc comments.

/// <summary>
/// A report of the current state of the activation rebalancer.
/// </summary>
[GenerateSerializer, Immutable, Alias("RebalancingReport")]
public readonly struct RebalancingReport
{
    /// <summary>
	/// The silo where the rebalancer is currently located.
	/// </summary>
	[Id(0)] public required SiloAddress Host { get; init; }

	/// <summary>
	/// The current status of the rebalancer.
	/// </summary>
	[Id(1)] public required RebalancerStatus Status { get; init; }

	/// <summary>
	/// The amount of time the rebalancer is suspended (if at all).
	/// </summary>
	/// <remarks>This will always be <see langword="null"/> if <see cref="Status"/> is <see cref="RebalancerStatus.Executing"/>.</remarks>
	[Id(2)] public TimeSpan? SuspensionDuration { get; init; }

	/// <summary>
	/// The current view of the cluster's imbalance.
	/// </summary>
	/// <remarks>Range: [0 - 1]</remarks>
	[Id(3)] public required double ClusterImbalance { get; init; }

	/// <summary>
	/// Latest rebalancing statistics.
	/// </summary>
	[Id(4)] public required ImmutableArray<RebalancingStatistics> Statistics { get; init; }
}

/// <summary>
/// Rebalancing statistics for the given <see cref="SiloAddress"/>.
/// </summary>
/// <remarks>
/// Used for diagnostics / metrics purposes. Note that statistics are an approximation.</remarks>
[GenerateSerializer, Immutable, Alias("RebalancingStatistics")]
public readonly struct RebalancingStatistics
{
    /// <summary>
    /// The time these statistics were assembled.
    /// </summary>
    [Id(0)] public required DateTime TimeStamp { get; init; }

    /// <summary>
    /// The silo to which these statistics belong to.
    /// </summary>
    [Id(1)] public required SiloAddress SiloAddress { get; init; }

    /// <summary>
    /// The number of activations that have been dispersed from this silo thus far.
    /// </summary>
    [Id(2)] public required ulong DispersedActivations { get; init; }

    /// <summary>
    /// The number of activations that have been acquired by this silo thus far.
    /// </summary>
    [Id(3)] public required ulong AcquiredActivations { get; init; }
}

/// <summary>
/// The status of the <see cref="IActivationRebalancerWorker"/>.
/// </summary>
[GenerateSerializer]
public enum RebalancerStatus : byte
{
    /// <summary>
    /// It is executing.
    /// </summary>
    Executing = 0,
    /// <summary>
    /// It is suspended.
    /// </summary>
    Suspended = 1
}

Options

The rebalancer can be configured via the standard Options pattern in dotnet, below are the possible options to control the behavior of the rebalancer. See XML doc comments for further details.

public sealed class ActivationRebalancerOptions
{
    /// <summary>
    /// The due time for the rebalancer to start the very first session.
    /// </summary>
    public TimeSpan RebalancerDueTime { get; set; } = DEFAULT_REBALANCER_DUE_TIME;

    /// <summary>
    /// The default value of <see cref="RebalancerDueTime"/>.
    /// </summary>
    public static readonly TimeSpan DEFAULT_REBALANCER_DUE_TIME = TimeSpan.FromSeconds(60);

    /// <summary>
    /// The time between two consecutive rebalancing cycles within a session.
    /// </summary>
    /// <remarks>It must be greater than 2 x <see cref="DeploymentLoadPublisherOptions.DeploymentLoadPublisherRefreshTime"/>.</remarks>
    public TimeSpan SessionCyclePeriod { get; set; } = DEFAULT_SESSION_CYCLE_PERIOD;

    /// <summary>
    /// The default value of <see cref="SessionCyclePeriod"/>.
    /// </summary>
    public static readonly TimeSpan DEFAULT_SESSION_CYCLE_PERIOD = TimeSpan.FromSeconds(15);

    /// <summary>
    /// The maximum, consecutive number of cycles, yielding no significant improvement to the cluster's entropy.
    /// </summary>
    /// <remarks>This value is inclusive, i.e. if this value is 'n', than the 'n+1' cycle will stop the current rebalancing session.</remarks>
    public int MaxStaleCycles { get; set; } = DEFAULT_MAX_STALE_CYCLES;

    /// <summary>
    /// The default value of <see cref="MaxStaleCycles"/>.
    /// </summary>
    public const int DEFAULT_MAX_STALE_CYCLES = 3;

    /// <summary>
    /// The minumum change in the entropy of the cluster that is considered an "improvement".
    /// When a total of n-consecutive stale cycles pass, during which the change in entropy is less than
    /// the quantum, than the current rebalancing session will stop. The "change" is a normalized value
    /// being relative to the maximum possible entropy.
    /// </summary>
    /// <remarks>Allowed range: (0-0.01)</remarks>
    public double EntropyQuantum { get; set; } = DEFAULT_ENTROPY_QUANTUM;

    /// <summary>
    /// The default value of <see cref="EntropyQuantum"/>.
    /// </summary>
    public const double DEFAULT_ENTROPY_QUANTUM = 0.001d;

    /// <summary>
    /// Represents the allowed entropy deviation between the cluster's current entropy, againt the theoretical maximum.
    /// Values lower than or equal to this are practically considered as "maximum", and the current rebalancing session will stop.
    /// </summary>
    /// <remarks>Allowed range is: (0-1)</remarks>
    public double AllowedEntropyDeviation { get; set; } = DEFAULT_ALLOWED_ENTROPY_DEVIATION;

    /// <summary>
    /// The default value of <see cref="AllowedEntropyDeviation"/>.
    /// </summary>
    public const double DEFAULT_ALLOWED_ENTROPY_DEVIATION = 0.01d;

	/// <summary>
	/// Determines whether <see cref="AllowedEntropyDeviation"/> should be scaled dynamically
	/// based on the total number of activations. When set to <see langword="true"/>, the allowed entropy
	/// deviation will increase logarithmically after reaching an activation threshold (10,000 activations),
	/// and will cap at the maximum (0.1 deviation).
	/// </summary>
	/// <remarks>This is in place because a deviation of say 10 activations has far lesser
	/// impact on a total of 100,000 activations, than it does for say 1,000 activations.</remarks>
	public bool ScaleAllowedEntropyDeviation { get; set; } = DEFAULT_SCALE_DEFAULT_ALLOWED_ENTROPY_DEVIATION;

	/// <summary>
	/// The default value of <see cref="ScaleAllowedEntropyDeviation"/>.
	/// </summary>
	public const bool DEFAULT_SCALE_DEFAULT_ALLOWED_ENTROPY_DEVIATION = true;

    /// <summary>
    /// <para>Represents the weight that is given to the number of rebalancing cycles that have passed during a rebalancing session.</para>
    /// Changing this value has a far greater impact on the migration rate than <see cref="SiloNumberWeight"/>, and is suitable for controlling the session duration.
    /// <para>Pick higher values if you want a faster migration rate.</para>
    /// </summary>
    /// <remarks>Allowed range: (0-1]</remarks>
    public double CycleNumberWeight { get; set; } = DEFAULT_CYCLE_NUMBER_WEIGHT;

    /// <summary>
    /// The default value of <see cref="CycleNumberWeight"/>.
    /// </summary>
    public const double DEFAULT_CYCLE_NUMBER_WEIGHT = 0.1d;

    /// <summary>
    /// <para>Represents the weight that is given to the number of silos in the cluster during a rebalancing session.</para>
    /// Changing this value has a far lesser impact on the migration rate than <see cref="CycleNumberWeight"/>, and is suitable for fine-tuning.
    /// <para>Pick lower values if you want a faster migration rate.</para>
    /// </summary>
    /// <remarks>Allowed range: [0-1]</remarks>
    public double SiloNumberWeight { get; set; } = DEFAULT_SILO_NUMBER_WEIGHT;

    /// <summary>
    /// The default value of <see cref="SiloNumberWeight"/>.
    /// </summary>
    public const double DEFAULT_SILO_NUMBER_WEIGHT = 0.1d;
}

Registration

As mentioned above Activation Rebalancing is opt-in and is marked Experimental("ORLEANSEXP002"). Everything is pretty much standard with the difference that users can supply their own implementation of IFailedRebalancingSessionBackoffProvider which is used to determine how long to wait between successive rebalancing sessions, if an aprior session has failed. A session is considered "failed" if n-consecutive number of cycles yielded no significant improvement to the cluster's entropy.

public static class ActivationRebalancerExtensions
{
    /// <summary>
    /// Enables activation rebalancing for the entire cluster.
    /// </summary>
    /// <remarks>
    /// Activation rebalancing attempts to distribute activations around the cluster in such a
    /// way that it optimizes both activation count and memory usages across the silos of the cluster.
    /// <para>You can read more on activation rebalancing <see href="https://www.ledjonbehluli.com/posts/orleans_adaptive_rebalancing/">here</see></para>
    /// </remarks>
    [Experimental("ORLEANSEXP002")]
    public static ISiloBuilder AddActivationRebalancer(this ISiloBuilder builder) =>
        builder.AddActivationRebalancer<FailedSessionBackoffProvider>();

    /// <inheritdoc cref="AddActivationRebalancer(ISiloBuilder)"/>.
    /// <typeparam name="TProvider">Custom backoff provider for determining next session after a failed attempt.</typeparam>
    [Experimental("ORLEANSEXP002")]
    public static ISiloBuilder AddActivationRebalancer<TProvider>(this ISiloBuilder builder)
        where TProvider : class, IFailedRebalancingSessionBackoffProvider =>
        builder.ConfigureServices(service => service.AddActivationRebalancer<TProvider>());

    private static IServiceCollection AddActivationRebalancer<TProvider>(this IServiceCollection services)
        where TProvider : class, IFailedRebalancingSessionBackoffProvider
    {
        services.AddSingleton<ActivationRebalancerMonitor>();
        services.AddFromExisting<IActivationRebalancer, ActivationRebalancerMonitor>();
        services.AddFromExisting<ILifecycleParticipant<ISiloLifecycle>, ActivationRebalancerMonitor>();
        services.AddTransient<IConfigurationValidator, ActivationRebalancerOptionsValidator>();
        
        services.AddSingleton<TProvider>();
        services.AddFromExisting<IFailedRebalancingSessionBackoffProvider, TProvider>();
        if (typeof(TProvider).IsAssignableTo(typeof(ILifecycleParticipant<ISiloLifecycle>)))
        {
            services.AddFromExisting(typeof(ILifecycleParticipant<ISiloLifecycle>), typeof(TProvider));
        }

        return services;
    }
}

Tests

There are tests which cover the following:

  1. Options - checking if the options & the validator work correctly.
  2. Static - activations are placed at the beginning and no further activations are created.
  3. Dynamic - activations are placed at the beginning and then further (less in number) activations are created and distribute in a non-random way to ensure that the placement itself does not contribute to balancing itself.
  4. State Preservation - intentionally shutdown gracefully the silo where the rebalancer is located to test if its migrated and if its state is preserved.
  5. Control - controlling the rebalancer itself (resume/suspend) and tests the report listeners.

orleans_tests

image

This is a graph that I ran at the beginning, it shows convergence with silos having different initial activations and different memory usages while during cycles we add more activations. It might not be perfect, but this was during its initial phases.

image

Microsoft Reviewers: Open in CodeFlow
@ledjon-behluli
Copy link
Contributor Author

this is how the allowed entropy deviation will change given a base rate AllowedEntropyDeviation, knowing that the deviation is less important with higher number of activations, i.e: a deviation of say 10 activations has far lesser impact on a total of 100,000 activations, than it does for say 1,000 activations

image

@ledjon-behluli
Copy link
Contributor Author

Here is plot that shows MAR working on a 5 silo REAL cluster. Initially 4 silos are activated and activations are placed, MAR converges them. Explanations are given in the graph itself.

image

@ledjon-behluli
Copy link
Contributor Author

Here is the same configuration, but this time MAR is configured to act more aggressively

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants