Notes on Performance Optimization in WebApiClient

Preface

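Development of the netcoreapp version of WebApiClient is nearly finished, and the last push is squeezing out performance. Here I walk through the optimization process I went through; you can follow the same pattern in your own projects to make them faster.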

Overall results

With MockResponseHandler taking the real HTTP request out of the picture, here is a performance reference for native HttpClient, WebApiClientCore, and Refit:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18362.836 (1903/May2019Update/19H1)
Intel Core i3-4150 CPU 3.50GHz (Haswell), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=3.1.202
  [Host]     : .NET Core 3.1.4 (CoreCLR 4.700.20.20201, CoreFX 4.700.20.22101), X64 RyuJIT
  DefaultJob : .NET Core 3.1.4 (CoreCLR 4.700.20.20201, CoreFX 4.700.20.22101), X64 RyuJIT
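
| Method                    |      Mean |     Error |    StdDev |
|-------------------------- |----------:|----------:|----------:|
| HttpClient_GetAsync       |  3.945 μs | 0.2050 μs | 0.5850 μs |
| WebApiClientCore_GetAsync | 13.320 μs | 0.2604 μs | 0.3199 μs |
| Refit_GetAsync            | 43.503 μs | 0.8489 μs | 1.0426 μs |

| Method                     |      Mean |     Error |    StdDev |
|--------------------------- |----------:|----------:|----------:|
| HttpClient_PostAsync       |  4.876 μs | 0.0972 μs | 0.2092 μs |
| WebApiClientCore_PostAsync | 14.018 μs | 0.1829 μs | 0.2246 μs |
| Refit_PostAsync            | 46.512 μs | 0.7885 μs | 0.7376 μs |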

After optimization, WebApiClientCore comes within roughly 10 μs of native HttpClient per call and is about three times as fast as Refit.

The benchmark process

Benchmarks let us compare the performance of several methods. Without a benchmarking tool, how would we tell a performance change apart by eye alone?

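BenchmarkDotNet is a powerful benchmarking library for .NET. It gives each benchmarked method an isolated environment, makes it easy to write all kinds of performance tests, and helps avoid many common pitfalls.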
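For readers who have not used it, a minimal BenchmarkDotNet program looks roughly like the sketch below. It assumes the BenchmarkDotNet NuGet package is referenced; the class ContainsBenchmarks and its two methods are hypothetical examples and have nothing to do with WebApiClientCore itself.

```csharp
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class, for illustration only.
[MemoryDiagnoser] // also report allocations per operation
public class ContainsBenchmarks
{
    private readonly string text = "WebApiClientCore.Benchmarks.StringReplaces.WebApiClientCore";

    // Returning the result keeps the JIT from discarding the work.
    [Benchmark(Baseline = true)]
    public bool LowerThenContains() => text.ToLowerInvariant().Contains("core");

    [Benchmark]
    public bool IndexOfIgnoreCase() => text.IndexOf("CORE", StringComparison.OrdinalIgnoreCase) >= 0;
}

public static class BenchmarkEntryPoint
{
    // BenchmarkDotNet builds and runs each benchmark in its own process.
    public static void Main() => BenchmarkRunner.Run<ContainsBenchmarks>();
}
```

Benchmarks should be run from a Release build, for example with `dotnet run -c Release`; BenchmarkDotNet warns when it detects a non-optimized build.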

Total request time comparison

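With BenchmarkDotNet in hand, I could not wait to write a three-way request comparison between the old WebApiClient, native HttpClient, and WebApiClientCore, to see whether the new Core version improved as expected and how much overhead each adds over native HttpClient.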

| Method                     |       Mean |      Error |     StdDev |
|--------------------------- |-----------:|-----------:|-----------:|
| WebApiClient_GetAsync      | 279.479 μs | 22.5466 μs | 64.3268 μs |
| WebApiClientCore_GetAsync  |  25.298 μs |  0.4953 μs |  0.7999 μs |
| HttpClient_GetAsync        |   2.849 μs |  0.0568 μs |  0.1393 μs |
| WebApiClient_PostAsync     |  25.942 μs |  0.3817 μs |  0.3188 μs |
| WebApiClientCore_PostAsync |  13.462 μs |  0.2551 μs |  0.6258 μs |
| HttpClient_PostAsync       |   4.515 μs |  0.0866 μs |  0.0926 μs |

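A quick look at the results made me smile: the Core version is about twice as fast as the old version and close to native.
A closer look gave me a shock: why is the old version's Get request so slow? My guess was that the old version uses Json.NET. I had been burned before by Json.NET performance collapsing when ContractResolver instances are created repeatedly, and even a singleton ContractResolver is expensive the first time it is created. So I changed the comparison to issue one warm-up request first, which is closer to real-world usage. After warm-up, the old WebApiClient's Get request dropped from 279 μs to 39 μs.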
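The effect of a cold first call can be seen with a plain Stopwatch. The sketch below is only an illustration: it uses System.Text.Json as a stand-in (the old WebApiClient uses Json.NET, which is not referenced here), and WarmUpDemo and its Model type are hypothetical names. The timings depend on the machine, so no numbers are claimed.

```csharp
using System;
using System.Diagnostics;
using System.Text.Json;

public static class WarmUpDemo
{
    // Hypothetical model, not the article's Model type.
    public sealed class Model
    {
        public string Id { get; set; } = "id";
    }

    // Times a single invocation in microseconds.
    public static double TimeMicroseconds(Action action)
    {
        var watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds * 1000;
    }

    public static string Serialize() => JsonSerializer.Serialize(new Model());

    public static void Main()
    {
        // The first call pays for one-time setup such as building type metadata.
        var first = TimeMicroseconds(() => Serialize());
        // Later calls reuse the cached metadata.
        var second = TimeMicroseconds(() => Serialize());
        Console.WriteLine($"first call: {first:F1} us, second call: {second:F1} us");
    }
}
```

A single Stopwatch measurement like this is far too noisy to serve as a benchmark; it only shows why a warm-up request belongs in the setup rather than in the measured method.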

Get versus Post in WebApiClientCore

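From the data above, WebApiClientCore's Get request is clearly slower than its Post request (roughly 25 versus 13 microseconds). My interface is defined as follows: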

public interface IWebApiClientCoreApi
{
    [HttpGet("/benchmarks/{id}")]
    Task<Model> GetAsyc([PathQuery]string id);

    [HttpPost("/benchmarks")]
    Task<Model> PostAsync([JsonContent] Model model);
}

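Get only has to turn the id parameter into part of the request URI, whereas Post has to serialize the model to JSON. That points to the [PathQuery] attribute, which handles the parameter, as the slow part. [PathQuery] relies on the UriEditor utility class: it first tries Replace(), and if that does not succeed it calls AddQuery(). The shape of UriEditor is: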

class UriEditor
{ 
    public bool Replace(string name, string? value);
    public void AddQuery(string name, string? value);
}
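The "try Replace(), fall back to AddQuery()" flow described above can be sketched as follows. MiniUriEditor is a hypothetical, string-based stand-in written for this illustration; it is not the library's UriEditor, and the escaping and query handling are simplifying assumptions.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for UriEditor, for illustration only.
public sealed class MiniUriEditor
{
    public string Url { get; private set; }

    public MiniUriEditor(string url) => Url = url;

    // Replaces "{name}" (case-insensitively) with the escaped value.
    public bool Replace(string name, string? value)
    {
        var placeholder = "{" + name + "}";
        if (Url.IndexOf(placeholder, StringComparison.OrdinalIgnoreCase) < 0)
        {
            return false;
        }

        var escaped = Uri.EscapeDataString(value ?? string.Empty);
        Url = Url.Replace(placeholder, escaped, StringComparison.OrdinalIgnoreCase);
        return true;
    }

    // Appends name=value to the query string.
    public void AddQuery(string name, string? value)
    {
        var separator = Url.Contains('?') ? '&' : '?';
        Url = Url + separator + Uri.EscapeDataString(name) + "=" + Uri.EscapeDataString(value ?? string.Empty);
    }

    // The flow used for a path-or-query parameter: path placeholder first, query otherwise.
    public void Apply(string name, string? value)
    {
        if (Replace(name, value) == false)
        {
            AddQuery(name, value);
        }
    }
}

public static class MiniUriEditorDemo
{
    public static void Main()
    {
        var withPlaceholder = new MiniUriEditor("http://webapiclient.com/benchmarks/{id}");
        withPlaceholder.Apply("ID", "a b");
        Console.WriteLine(withPlaceholder.Url);

        var withoutPlaceholder = new MiniUriEditor("http://webapiclient.com/benchmarks");
        withoutPlaceholder.Apply("id", "1");
        Console.WriteLine(withoutPlaceholder.Url);
    }
}
```

With a route template that contains the placeholder, only the Replace() branch runs, which is why that method is the one worth measuring.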

Since the request URI is [HttpGet("/benchmarks/{id}")], this flow never reaches AddQuery(), so the slow method has to be Replace(). The next step is to rework Replace(). Here is the implementation before the change:

/// <summary>
/// Replaces the value of a parameter written in curly braces
/// </summary>
/// <param name="name">Parameter name, without the braces</param>
/// <param name="value">Parameter value</param>
/// <returns>true if a replacement was made</returns>
public bool Replace(string name, string? value)
{
    if (this.Uri.OriginalString.Contains('{') == false)
    {
        return false;
    }

    var replaced = false;
    var regex = new Regex($"{{{name}}}", RegexOptions.IgnoreCase);
    var url = regex.Replace(this.Uri.OriginalString, m =>
    {
        replaced = true;
        return HttpUtility.UrlEncode(value, this.Encoding);
    });

    if (replaced == true)
    {
        this.Uri = new Uri(url);
    }
    return replaced;
}

Performance comparison of Replace improvement options

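Anyone with some experience can see at a glance that Regex is the drag in the code above. The feature needs case-insensitive string replacement, and Regex was the only ready-made option I had at hand. Regex can be used in two ways: by creating a Regex instance, or by calling the static Regex methods.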

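Regex instance versus static method

| Method               |       Mean |    Error |   StdDev |
|--------------------- |-----------:|---------:|---------:|
| ReplaceByRegexStatic |   480.9 ns |  5.50 ns |  5.15 ns |
| ReplaceByRegexNew    | 2,615.8 ns | 41.33 ns | 36.63 ns |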

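One run shows the cause: replacing new Regex with the static Regex call makes this about 5 times faster (2,615.8 ns down to 480.9 ns). The static methods keep recently used patterns in an internal cache, so the pattern does not have to be parsed again on every call.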
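The three ways of using Regex for this kind of replacement can be put side by side as in the sketch below. RegexUsageDemo and its method names are made up for this illustration. The cached-instance variant only applies when the pattern is known up front, which is not the case for Replace(), where the pattern depends on the parameter name; there the static method's internal cache is what helps.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexUsageDemo
{
    private const string Input = "WebApiClientCore.Benchmarks.StringReplaces.WebApiClientCore";

    // A cached instance: constructed once, reused on every call.
    private static readonly Regex CachedCore = new Regex("core", RegexOptions.IgnoreCase);

    // Parses the pattern and builds a new Regex on every call.
    public static string WithNewInstance(string pattern, string replacement) =>
        new Regex(pattern, RegexOptions.IgnoreCase).Replace(Input, replacement);

    // Looks the pattern up in the static Regex cache before constructing anything.
    public static string WithStaticMethod(string pattern, string replacement) =>
        Regex.Replace(Input, pattern, replacement, RegexOptions.IgnoreCase);

    // Reuses the pre-built instance; only valid for the fixed pattern "core".
    public static string WithCachedInstance(string replacement) =>
        CachedCore.Replace(Input, replacement);

    public static void Main()
    {
        Console.WriteLine(WithNewInstance("core", "CORE"));
        Console.WriteLine(WithStaticMethod("core", "CORE"));
        Console.WriteLine(WithCachedInstance("CORE"));

        // The number of patterns the static methods keep cached is configurable.
        Console.WriteLine($"Regex.CacheSize = {Regex.CacheSize}");
    }
}
```

All three variants produce the same string; they differ only in how often the pattern is parsed and the Regex object constructed.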

Regex static method versus a hand-written Replace function

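The static Regex method still did not feel all that fast, so I decided to write my own Replace function and compare; maybe it would beat the static Regex method. I spent an evening writing this Replace function. Yes, an entire evening: benchmarking it, unit testing it, and optimizing its memory allocations.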

/// <summary>
/// Replaces a string, ignoring case
/// </summary>
/// <param name="str"></param>
/// <param name="oldValue">The value to be replaced</param>
/// <param name="newValue">The new value</param>
/// <param name="replacedString">The string after replacement</param>
/// <exception cref="ArgumentNullException"></exception>
/// <returns></returns>
public static bool RepaceIgnoreCase(this string str, string oldValue, string? newValue, out string replacedString)
{
    if (string.IsNullOrEmpty(str) == true)
    {
        replacedString = str;
        return false;
    }

    if (string.IsNullOrEmpty(oldValue) == true)
    {
        throw new ArgumentNullException(nameof(oldValue));
    }

    var strSpan = str.AsSpan();
    using var owner = ArrayPool.Rent<char>(strSpan.Length);
    var strLowerSpan = owner.Array.AsSpan();
    var length = strSpan.ToLowerInvariant(strLowerSpan);
    strLowerSpan = strLowerSpan.Slice(0, length);

    var oldValueLowerSpan = oldValue.ToLowerInvariant().AsSpan();
    var newValueSpan = newValue.AsSpan();

    var replaced = false;
    using var writer = new BufferWriter<char>(strSpan.Length);

    while (strLowerSpan.Length > 0)
    {
        var index = strLowerSpan.IndexOf(oldValueLowerSpan);
        if (index > -1)
        {
            // The unreplaced part on the left
            var left = strSpan.Slice(0, index);
            writer.Write(left);

            // The replacement value
            writer.Write(newValueSpan);

            // Length to slice off
            var sliceLength = index + oldValueLowerSpan.Length;

            // Slice the original and the lower-cased span in step
            strSpan = strSpan.Slice(sliceLength);
            strLowerSpan = strLowerSpan.Slice(sliceLength);

            replaced = true;
        }
        else
        {
            // The rest of the original string after earlier replacements
            if (replaced == true)
            {
                writer.Write(strSpan);
            }

            // No more matches to replace, exit
            break;
        }
    }

    replacedString = replaced ? writer.GetWrittenSpan().ToString() : str;
    return replaced;
}

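The code is not long, but I wrote quite a few Buffers-related types for it, so the total effort was large. Still, it got done. Here is a benchmark on a somewhat longer text: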

public class Benchmark : IBenchmark
{
    private readonly string str = "WebApiClientCore.Benchmarks.StringReplaces.WebApiClientCore";
    private readonly string pattern = "core";
    private readonly string replacement = "CORE";

    [Benchmark]
    public void ReplaceByRegexNew()
    {
        new Regex(pattern, RegexOptions.IgnoreCase).Replace(str, replacement);           
    }

    [Benchmark]
    public void ReplaceByRegexStatic()
    {
        Regex.Replace(str, pattern, replacement, RegexOptions.IgnoreCase);
    }

    [Benchmark]
    public void ReplaceByCutomSpan()
    {
        str.RepaceIgnoreCase(pattern, replacement, out var _);
    }
}
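
| Method               |       Mean |     Error |    StdDev |     Median |
|--------------------- |-----------:|----------:|----------:|-----------:|
| ReplaceByRegexNew    | 3,323.7 ns | 115.82 ns | 326.66 ns | 3,223.4 ns |
| ReplaceByRegexStatic |   881.9 ns |  16.79 ns |  43.94 ns |   868.3 ns |
| ReplaceByCutomSpan   |   524.0 ns |   4.78 ns |   4.47 ns |   524.9 ns |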

A whole evening of heavy work for a modest gain (881.9 ns down to 524.0 ns); the return hardly justified the effort.
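On targets that have it (.NET Core 2.0 and later, .NET Standard 2.1), the base class library also offers a string.Replace overload that takes a StringComparison. The sketch below only shows that it yields the same result as the Regex call for the benchmark input; it was not part of the measurements above, so nothing is claimed about its speed. BuiltInReplaceDemo is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BuiltInReplaceDemo
{
    // Case-insensitive replace using the built-in overload; no Regex involved.
    public static string ReplaceIgnoreCase(string text, string oldValue, string newValue) =>
        text.Replace(oldValue, newValue, StringComparison.OrdinalIgnoreCase);

    // The Regex equivalent, with the pattern escaped so it is matched literally.
    public static string ReplaceWithRegex(string text, string oldValue, string newValue) =>
        Regex.Replace(text, Regex.Escape(oldValue), newValue.Replace("$", "$$"), RegexOptions.IgnoreCase);

    public static void Main()
    {
        var text = "WebApiClientCore.Benchmarks.StringReplaces.WebApiClientCore";
        Console.WriteLine(ReplaceIgnoreCase(text, "core", "CORE"));
        Console.WriteLine(ReplaceWithRegex(text, "core", "CORE"));
        Console.WriteLine(ReplaceIgnoreCase("/benchmarks/{ID}", "{id}", "42"));
    }
}
```

Unlike the RepaceIgnoreCase extension above, this overload does not report whether anything was replaced, so a caller that needs the bool result would have to check for the placeholder separately.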

Comparison with Refit

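Comparing against my own older sibling at home is not very interesting, so I wanted to step outside and compare with Refit, which is very similar in functionality. I was quite confident going in. To be fair, both use their default configuration, both are warmed up, and both use the same interface definition: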

Configuration and warm-up

public abstract class BenChmark : IBenchmark
{
    protected IServiceProvider ServiceProvider { get; }

    public BenChmark()
    {
        var services = new ServiceCollection();

        services
            .AddHttpClient(typeof(HttpClient).FullName)
            .AddHttpMessageHandler(() => new MockResponseHandler());

        services
            .AddHttpApi<IWebApiClientCoreApi>()
            .AddHttpMessageHandler(() => new MockResponseHandler())
            .ConfigureHttpClient(c => c.BaseAddress = new Uri("http://webapiclient.com/"));

        services
            .AddRefitClient<IRefitApi>()
            .AddHttpMessageHandler(() => new MockResponseHandler())
            .ConfigureHttpClient(c => c.BaseAddress = new Uri("http://webapiclient.com/"));

        this.ServiceProvider = services.BuildServiceProvider();
        this.PreheatAsync().Wait();
    }

    private async Task PreheatAsync()
    {
        using var scope = this.ServiceProvider.CreateScope();

        var core = scope.ServiceProvider.GetService<IWebApiClientCoreApi>();
        var refit = scope.ServiceProvider.GetService<IRefitApi>();

        await core.GetAsyc("id");
        await core.PostAsync(new Model { });

        await refit.GetAsyc("id");
        await refit.PostAsync(new Model { });
    }
}
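The MockResponseHandler registered above is not shown in this article. A handler of that kind could look roughly like the sketch below, which answers every request with a canned JSON body and never touches the network. SketchMockResponseHandler and its JSON payload are assumptions made for illustration, not the actual class used in the benchmark.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the article's MockResponseHandler.
public sealed class SketchMockResponseHandler : DelegatingHandler
{
    // Assumed payload; the real Model shape is not shown in the article.
    public const string Json = "{\"id\":\"id\"}";

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Short-circuit the pipeline: base.SendAsync is never called,
        // so no inner handler runs and no socket is opened.
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            RequestMessage = request,
            Content = new StringContent(Json, Encoding.UTF8, "application/json")
        };
        return Task.FromResult(response);
    }
}

public static class MockHandlerDemo
{
    public static async Task<string> FetchAsync()
    {
        using var client = new HttpClient(new SketchMockResponseHandler());
        return await client.GetStringAsync("http://webapiclient.com/benchmarks/id");
    }

    public static async Task Main() => Console.WriteLine(await FetchAsync());
}
```

Deriving from DelegatingHandler is what allows the class to be passed to AddHttpMessageHandler; because it never delegates to an inner handler, what remains in the measurement is the client-side overhead of each library.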

Equivalent interface definitions

public interface IRefitApi
{
    [Get("/benchmarks/{id}")]
    Task<Model> GetAsyc(string id);

    [Post("/benchmarks")]
    Task<Model> PostAsync(Model model);
}

public interface IWebApiClientCoreApi
{
    [HttpGet("/benchmarks/{id}")]
    Task<Model> GetAsyc(string id);

    [HttpPost("/benchmarks")]
    Task<Model> PostAsync([JsonContent] Model model);
}

Benchmark methods

/// <summary>
/// Simulated Get requests that skip the real HTTP request step
/// </summary>
public class GetBenchmark : BenChmark
{
    /// <summary>
    /// Request using native HttpClient
    /// </summary>
    /// <returns></returns>
    [Benchmark]
    public async Task<Model> HttpClient_GetAsync()
    {
        using var scope = this.ServiceProvider.CreateScope();
        var httpClient = scope.ServiceProvider.GetRequiredService<IHttpClientFactory>().CreateClient(typeof(HttpClient).FullName);

        var id = "id";
        var request = new HttpRequestMessage(HttpMethod.Get, $"http://webapiclient.com/{id}");
        var response = await httpClient.SendAsync(request);
        var json = await response.Content.ReadAsByteArrayAsync();
        return JsonSerializer.Deserialize<Model>(json);
    }


    /// <summary>
    /// Request using WebApiClientCore
    /// </summary>
    /// <returns></returns>
    [Benchmark]
    public async Task<Model> WebApiClientCore_GetAsync()
    {
        using var scope = this.ServiceProvider.CreateScope();
        var banchmarkApi = scope.ServiceProvider.GetRequiredService<IWebApiClientCoreApi>();
        return await banchmarkApi.GetAsyc(id: "id");
    }


    /// <summary>
    /// Get request using Refit
    /// </summary>
    /// <returns></returns>
    [Benchmark]
    public async Task<Model> Refit_GetAsync()
    {
        using var scope = this.ServiceProvider.CreateScope();
        var banchmarkApi = scope.ServiceProvider.GetRequiredService<IRefitApi>();
        return await banchmarkApi.GetAsyc(id: "id");
    }
}

Results

With the physical network time taken out, WebApiClientCore is about 3 times as fast as Refit. I can finally get a good night's sleep!

Summary

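This article is a bit messy; it is a faithful record of how my performance tuning actually went. The real process had even more detours, large and small, and writing them all down would have made the article unreadable. If you need to tune performance, you might as well run a benchmark; you will learn something.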
