HttpClient in .NET 4.5 is a very powerful class, but while using it in a recent real project I ran into some interesting behavior.
At first I used it like this:
    using (var client = new HttpClient())
    {
    }
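However, this turned out to be noticeably slower than the traditional HttpWebRequest. After some reading I learned that an HttpClient should not be disposed after every use, because the underlying sockets are not released promptly (they linger in TIME_WAIT). Instead, the HttpClient should be kept as a static instance and reused: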
    private static readonly HttpClient Client = new HttpClient();
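Later I needed to send GET requests intensively, but it always felt slow. Watching the traffic in Fiddler, I saw that only 2 requests were ever in flight at once, even though the code used a SemaphoreSlim to cap concurrency at 10. The machine was a Ryzen 5 1600 running Windows 7 x64, so 10 concurrent requests should have been no problem. Suspecting that ServicePointManager.DefaultConnectionLimit was restricting concurrency, I set it to 100, but on the next run the concurrency was still 2. So I went to Stack Overflow and found this question: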
https://stackoverflow.com/questions/16194054/is-async-httpclient-from-net-4-5-a-bad-choice-for-intensive-load-applications
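According to that question, HttpClient does not seem to honor ServicePointManager.DefaultConnectionLimit, and under intensive load it is worse than traditional multithreaded HttpWebRequest in both accuracy and efficiency. But is that really the case? If HttpClient really were less efficient than HttpWebRequest, why would Microsoft have created it at all? Note also that in the linked question the asker's HttpClient code consumes the response body, while the HttpWebRequest code does not. With these doubts in mind, I started testing.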
    // semaphoreSlim, parser and list are defined elsewhere in the program.
    var tasks = Enumerable.Range(1, 511).Select(async i =>
    {
        // Wait for a free slot so that the number of requests in flight stays capped.
        await semaphoreSlim.WaitAsync();
        try
        {
            var html = await Client.GetStringAsync($"http://www.fynas.com/ua/search?d=&b=&k=&page={i}");
            var doc = parser.Parse(html);
            // Fourth column of every table row except the header row.
            var tr = doc.QuerySelectorAll(".table-bordered tr:not(:first-child) td:nth-child(4)").ToList();
            foreach (var element in tr)
            {
                list.Enqueue(element.TextContent.Trim());
            }
            doc.Dispose();
        }
        finally
        {
            // Always free the slot, even if the request or the parsing failed.
            semaphoreSlim.Release();
        }
    });
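To check the throttling pattern in isolation, without any network traffic, here is a minimal sketch. Everything in it (ThrottleDemo, MeasurePeakAsync, the 20 ms delay standing in for the HTTP request) is hypothetical and not part of the original program. It only records how many jobs were running at the same moment.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ThrottleDemo
{
    // Runs `totalJobs` fake jobs with at most `limit` in flight and
    // returns the highest number of jobs observed running at once.
    public static async Task<int> MeasurePeakAsync(int totalJobs, int limit)
    {
        var gate = new SemaphoreSlim(limit);
        var sync = new object();
        int running = 0, peak = 0;

        var tasks = Enumerable.Range(0, totalJobs).Select(async _ =>
        {
            await gate.WaitAsync();
            try
            {
                lock (sync)
                {
                    running++;
                    if (running > peak) peak = running;
                }
                await Task.Delay(20); // assumption: stands in for the HTTP request
            }
            finally
            {
                lock (sync) { running--; }
                gate.Release();
            }
        }).ToList(); // ToList enumerates the query, so every task is started here

        await Task.WhenAll(tasks);
        return peak;
    }
}

static class Program
{
    static async Task Main()
    {
        int peak = await ThrottleDemo.MeasurePeakAsync(totalJobs: 50, limit: 10);
        Console.WriteLine($"peak concurrency: {peak}");
    }
}
```

The reported peak should never exceed the limit passed to the semaphore. If a real run shows fewer requests in flight than the semaphore allows, the bottleneck is somewhere below the semaphore, such as the connection limit.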
That scraping loop crawls a website that catalogs User-Agent strings. My HttpClient and ServicePointManager.DefaultConnectionLimit were defined like this:
    static Program()
    {
        ServicePointManager.DefaultConnectionLimit = 1000;
    }

    // Field initializer: runs before the body of the static constructor above.
    private static readonly HttpClient Client = new HttpClient(new HttpClientHandler(){CookieContainer = new CookieContainer()});
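After many experiments I found that HttpClient does honor ServicePointManager.DefaultConnectionLimit; in my runs the limit was simply still at its default of 2. Look closely at the definitions and the reason is not hard to spot: the HttpClient is created before ServicePointManager.DefaultConnectionLimit is set, because a static field initializer runs before the body of the static constructor. So I changed the code to this: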
    static Program()
    {
        ServicePointManager.DefaultConnectionLimit = 1000;
        Client = new HttpClient(new HttpClientHandler() { CookieContainer = new CookieContainer() });
    }

    private static readonly HttpClient Client;
I ran the program again with Fiddler monitoring, and this time it made its requests with the expected concurrency of 10.
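The ordering involved can be seen without HttpClient at all. C# runs static field initializers in textual order before the body of the static constructor. The sketch below uses hypothetical names (InitOrderDemo, Record) and simply logs which step runs first.

```csharp
using System;
using System.Collections.Generic;

static class InitOrderDemo
{
    // Declared first so that it exists when the next initializer runs.
    public static readonly List<string> Log = new List<string>();

    // Field initializer: runs before the static constructor body.
    private static readonly string Field = Record("field initializer");

    static InitOrderDemo()
    {
        Record("static constructor body");
    }

    private static string Record(string step)
    {
        Log.Add(step);
        return step;
    }
}

static class Program
{
    static void Main()
    {
        foreach (var step in InitOrderDemo.Log)
        {
            Console.WriteLine(step);
        }
        // prints:
        // field initializer
        // static constructor body
    }
}
```

Assigning the field inside the static constructor, after the setting it depends on, makes the order explicit instead of leaving it to declaration order.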
The HttpMessageHandler inside HttpClient is another interesting piece: we can wrap it to suit whatever the situation requires. For example, the following code retries a request when it fails:
    public class MyHttpHandle : DelegatingHandler
    {
        public MyHttpHandle(HttpMessageHandler innerHandler) : base(innerHandler)
        {
        }

        protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
        {
            // First two attempts: return on success, otherwise wait one second and retry.
            for (int i = 0; i < 2; i++)
            {
                var response = await base.SendAsync(request, cancellationToken);
                if (response.IsSuccessStatusCode)
                {
                    return response;
                }
                else
                {
                    // Release the failed response before retrying.
                    response.Dispose();
                    await Task.Delay(1000, cancellationToken);
                }
            }
            // Third and final attempt: its result is returned whatever the status.
            return await base.SendAsync(request, cancellationToken);
        }
    }
When instantiating the HttpClient, pass in the handler we defined:
    private static readonly HttpClient Client = new HttpClient(new MyHttpHandle(Handler));
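This gives up to three attempts in total: as soon as any attempt succeeds, that successful result is returned, and if the second attempt also fails, the result of the third attempt is returned as is.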
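The three-attempt behavior can be checked offline with a stub inner handler. This is only a sketch under assumptions: FlakyHandler, RetryHandler and RetryDemo are hypothetical names, RetryHandler is a compact variant of the retry handler with a configurable delay so the demo runs quickly, and the URL is a placeholder that is never actually contacted.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stub: answers 500 for the first `failures` calls, then 200.
class FlakyHandler : HttpMessageHandler
{
    private int _remainingFailures;
    public int Calls { get; private set; }

    public FlakyHandler(int failures) { _remainingFailures = failures; }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Calls++;
        var status = _remainingFailures-- > 0 ? HttpStatusCode.InternalServerError : HttpStatusCode.OK;
        return Task.FromResult(new HttpResponseMessage(status) { RequestMessage = request });
    }
}

// Compact retry handler: two retried attempts plus one final attempt.
class RetryHandler : DelegatingHandler
{
    private readonly TimeSpan _delay;

    public RetryHandler(HttpMessageHandler inner, TimeSpan delay) : base(inner) { _delay = delay; }

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        for (int i = 0; i < 2; i++)
        {
            var response = await base.SendAsync(request, cancellationToken);
            if (response.IsSuccessStatusCode) return response;
            response.Dispose();
            await Task.Delay(_delay, cancellationToken);
        }
        return await base.SendAsync(request, cancellationToken);
    }
}

static class RetryDemo
{
    // Sends one GET through the retry handler and reports status and attempt count.
    public static async Task<string> RunAsync(int failures)
    {
        var flaky = new FlakyHandler(failures);
        using (var client = new HttpClient(new RetryHandler(flaky, TimeSpan.FromMilliseconds(10))))
        using (var response = await client.GetAsync("http://example.invalid/"))
        {
            return $"{(int)response.StatusCode} after {flaky.Calls} attempts";
        }
    }
}

static class Program
{
    static async Task Main()
    {
        Console.WriteLine(await RetryDemo.RunAsync(2)); // prints "200 after 3 attempts"
        Console.WriteLine(await RetryDemo.RunAsync(5)); // prints "500 after 3 attempts"
    }
}
```

Note that only non-success status codes trigger a retry here. An exception thrown by the inner handler, for example on a connection failure, propagates to the caller on the first attempt.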
Notes on using HttpClient in .NET 4.5
Original article: https://www.cnblogs.com/mldcy/p/8278035.html